Browsing: Business & Startups
# Introduction An LLM engineer is not the same thing as a general machine learning engineer. Where a machine learning engineer might spend months training a neural network from scratch, an LLM engineer’s work centers on adapting, orchestrating, and serving pretrained large language models (LLMs). The job is to take a capable foundation model and turn it into something that does useful work reliably inside a real product. Demand for this role has grown substantially in 2026. LLM features that spent 2023 and 2024 as internal demos are now shipping as production systems, and organizations need engineers who can build…
Autoregressive models are one of the most important ideas in time series forecasting and sequence modeling. The name may sound technical at first, but the concept is surprisingly intuitive. An autoregressive model predicts the next value by looking at previous values. That is the core idea. For example, tomorrow’s temperature may depend on the temperatures from the last few days. Next month’s sales may depend on sales from previous months. The next word in a sentence may depend on the words that came before it — the main idea powering LLMs. In all these cases, the model is using the…
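The core idea in the excerpt above, predicting the next value from previous values, can be illustrated with a first-order autoregressive model. This is a minimal sketch, not the article's own code: the function names `fit_ar1` and `forecast` and the toy temperature series are hypothetical, and it assumes a single lag fitted by ordinary least squares.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1] (one lag only)."""
    prev = series[:-1]   # values at time t-1
    nxt = series[1:]     # values at time t
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_next = sum(nxt) / n
    # Slope = covariance(prev, next) / variance(prev)
    cov = sum((p - mean_prev) * (q - mean_next) for p, q in zip(prev, nxt))
    var = sum((p - mean_prev) ** 2 for p in prev)
    phi = cov / var
    c = mean_next - phi * mean_prev
    return c, phi


def forecast(series, c, phi, steps):
    """Roll the fitted model forward, feeding each prediction back in."""
    predictions = []
    last = series[-1]
    for _ in range(steps):
        last = c + phi * last
        predictions.append(last)
    return predictions


# Hypothetical daily temperatures, built so that x[t] = 11 + 0.5 * x[t-1].
temps = [20.0, 21.0, 21.5, 21.75, 21.875]
c, phi = fit_ar1(temps)
next_days = forecast(temps, c, phi, steps=2)
```

Each forecast step reuses the previous prediction as its input, which is the same feedback loop that lets a language model generate one token after another.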
# Introduction Data cleaning and preparation are estimated to occupy up to 80% of a data scientist’s daily workflow. Because Pandas is the standard data manipulation library in Python, the efficiency of your operations directly dictates how quickly you can move from raw, dirty datasets to model-ready features. And there is good reason to want to reduce your cleaning and preparation time: it translates directly to more time available to spend on modeling, analysis, and communicating insights. However, many developers write Pandas code that mimics standard Python looping structures or uses imperative, state-mutating updates. These approaches suffer from several issues:…
# Introduction If you work with sensor readings, server metrics, or any data that arrives over time, you already know that standard scikit-learn pipelines don’t quite fit. Time series data has structure that tabular models ignore: seasonality, trend, temporal ordering, and the fact that future values depend on past ones. sktime is a Python library built specifically for this. It gives you a scikit-learn-style API — fit, predict, transform — but designed from the ground up for time series. You can do forecasting, classification, regression, and clustering on time series, all with a consistent interface. In this article, you’ll work…
# Introduction The Python scientific computing and machine learning ecosystem relies heavily on NumPy. It acts as the performance engine behind libraries like Pandas, Scikit-Learn, SciPy, and PyTorch. NumPy’s speed comes from its underlying implementation in optimized C, where contiguous blocks of memory are manipulated without the overhead of Python’s object model and dynamic interpreter. Unfortunately, many data scientists and developers write NumPy code that fails to leverage this power. By carrying over standard Python loops or writing naive calculations that force unnecessary memory allocations and array copies, they create performance bottlenecks. When working with large datasets, these inefficiencies lead…
# Introduction Agentic coding sessions are expensive. A single Claude Code session — reading files, writing code, running tests, iterating — can burn 10–50x more tokens than a plain chat conversation. At scale, that adds up fast. Add rate limits that can interrupt a long-running workflow mid-session, and the dependency on a third-party API that can change pricing, enforce stricter policies, or go down at any point, and the case for local inference becomes straightforward. Local models in 2026 are good enough. For the tasks Claude Code handles daily — code completion, refactoring, debugging, codebase explanation — a well-chosen…
Gemini models have always kept up with AI advancements. From text-based chatbots in 2023, Gemini has evolved into a multimodal system capable of understanding and generating text, audio, images… and now videos. AI video generation is no longer a standalone tool. With Gemini Omni, video creation becomes mainstream. Gemini Omni isn’t important because it generates videos. It’s important because video generation is becoming just another capability of an AI assistant. When used correctly, its use cases can be very creative (if you can look past the guardrails). Sentence or Image → Video Yeah, you read that right.…
# Introduction If you are starting a company, that doesn’t mean you have to raise venture capital from day one. There are tons of different funding options out there, and the best one really depends on what type of business you are building, how much traction you have, and how much ownership you want to keep. Some of these funding routes are non-dilutive, which means you don’t have to give away any equity. Others can give you access to capital, mentorship, and investor networks in exchange for some equity. The best funding route can also change depending on what kind…
# Introduction Most teams discover they need a feature store the hard way. A fraud model works in the notebook and quietly breaks in production. A support agent gives a generic answer because it has no idea who the user is. A recommender pipeline duplicates the same “30-day spend” calculation across three jobs, and two of them disagree. A feature store is the piece of infrastructure that fixes those problems. It defines features once, stores them in two shapes (one for training, one for serving), and keeps both in sync. We are going to build a minimal one from scratch…
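The description above (define a feature once, store it in two shapes, keep both in sync) can be sketched in a few lines. This is an illustrative assumption, not the article's implementation: the class `MiniFeatureStore`, its method names, and the `spend_30d` feature are all hypothetical, and everything lives in memory.

```python
class MiniFeatureStore:
    """Toy in-memory feature store (hypothetical names, illustration only)."""

    def __init__(self):
        self._definitions = {}  # feature name -> function(events) -> value
        self._offline = []      # append-only history rows, used for training
        self._online = {}       # (entity_id, feature) -> latest value, used for serving

    def register(self, name, fn):
        # Define the feature logic exactly once.
        self._definitions[name] = fn

    def materialize(self, entity_id, events, as_of):
        # Compute every registered feature and write both shapes in one step,
        # so the training and serving views cannot drift apart.
        for name, fn in self._definitions.items():
            value = fn(events)
            self._offline.append(
                {"entity_id": entity_id, "feature": name, "value": value, "as_of": as_of}
            )
            self._online[(entity_id, name)] = value

    def get_online(self, entity_id, name):
        return self._online.get((entity_id, name))

    def get_training_rows(self, name):
        return [row for row in self._offline if row["feature"] == name]


store = MiniFeatureStore()
store.register("spend_30d", lambda events: sum(e["amount"] for e in events))

events = [{"amount": 10.0}, {"amount": 5.5}]
store.materialize("u1", events, as_of="day-1")
store.materialize("u1", events + [{"amount": 5.0}], as_of="day-2")

latest = store.get_online("u1", "spend_30d")
history = store.get_training_rows("spend_30d")
```

Because both stores are written by the same `materialize` call from the same registered function, the duplicated "30-day spend" calculations mentioned above collapse into one definition.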
Large language models usually generate text one token at a time. While this autoregressive approach delivers strong quality and instruction following, it can be inefficient for local users because GPUs often spend more time moving weights from memory than doing parallel compute. Google DeepMind’s DiffusionGemma takes a different path, generating and refining blocks of tokens in parallel using diffusion-style text generation. In this article, we’ll explore how DiffusionGemma works, how it performs, and how developers can run it locally. What is DiffusionGemma? DiffusionGemma is Google DeepMind’s experimental open-weight model for diffusion-based text generation, built on the Gemma 4 26B A4B…