    New model design could fix high enterprise AI costs

    By gvfx00@gmail.com, November 5, 2025 (4 min read)

    Enterprise leaders grappling with the steep costs of deploying AI models could find a reprieve thanks to a new architecture design.

    While the capabilities of generative AI models are attractive, their immense computational demands for both training and inference result in prohibitive expenses and mounting environmental concerns. At the centre of this inefficiency is the models’ “fundamental bottleneck”: an autoregressive process that generates text sequentially, token by token.

    For enterprises processing vast data streams, from IoT networks to financial markets, this limitation makes generating long-form analysis both slow and economically challenging. However, a new research paper from Tencent AI and Tsinghua University proposes an alternative.

    A new approach to AI efficiency

    The research introduces Continuous Autoregressive Language Models (CALM). This method re-engineers the generation process to predict a continuous vector rather than a discrete token.

    A high-fidelity autoencoder “compress[es] a chunk of K tokens into a single continuous vector,” which holds a much higher semantic bandwidth.

    Instead of processing something like “the”, “cat”, “sat” in three steps, the model compresses them into one. This design directly “reduces the number of generative steps,” attacking the computational load.
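
    The effect on step count can be illustrated with a few lines of Python. This is only a sketch of the arithmetic, not code from the paper; the function name `generative_steps` is a hypothetical helper, and it assumes every generative step emits exactly one chunk of K tokens.

```python
import math


def generative_steps(num_tokens: int, chunk_size: int) -> int:
    """Autoregressive steps needed when each step emits `chunk_size` tokens.

    Hypothetical helper for illustration: chunk_size=1 corresponds to
    ordinary token-by-token generation.
    """
    if num_tokens < 0 or chunk_size < 1:
        raise ValueError("num_tokens must be >= 0 and chunk_size >= 1")
    # The last chunk may be only partly filled, hence the ceiling.
    return math.ceil(num_tokens / chunk_size)


tokens = ["the", "cat", "sat"]
print(generative_steps(len(tokens), chunk_size=1))  # one step per token
print(generative_steps(len(tokens), chunk_size=3))  # all three tokens in one step
print(generative_steps(1000, chunk_size=4))         # a longer output with K = 4
```

    Fewer steps do not automatically mean proportionally fewer FLOPs, because each step also has to run the autoencoder and a generative head for the continuous vector.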

    The experimental results demonstrate a better performance-compute trade-off. A CALM model that groups four tokens per step delivered performance “comparable to strong discrete baselines, but at a significantly lower computational cost”.

    One CALM model, for instance, required 44 percent fewer training FLOPs and 34 percent fewer inference FLOPs than a baseline Transformer of similar capability. This points to a saving on both the initial capital expense of training and the recurring operational expense of inference.
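
    To see what those percentages mean for a budget, the snippet below applies them to a made-up baseline. The baseline FLOP counts are purely hypothetical placeholders, and the 44 and 34 percent figures were reported for one specific model comparison, so they should not be read as a general guarantee.

```python
def flops_after_saving(baseline_flops: float, fraction_saved: float) -> float:
    """Remaining compute after saving `fraction_saved` of the baseline."""
    if not 0.0 <= fraction_saved <= 1.0:
        raise ValueError("fraction_saved must be between 0 and 1")
    return baseline_flops * (1.0 - fraction_saved)


# Hypothetical baseline numbers, chosen only to make the arithmetic visible.
baseline_training_flops = 1.0e21       # one-off training cost
baseline_inference_flops = 2.0e12      # cost per request, paid on every call

calm_training = flops_after_saving(baseline_training_flops, 0.44)
calm_inference = flops_after_saving(baseline_inference_flops, 0.34)

print(f"training FLOPs:  {baseline_training_flops:.2e} -> {calm_training:.2e}")
print(f"inference FLOPs: {baseline_inference_flops:.2e} -> {calm_inference:.2e}")
```

    The training saving is paid once, while the inference saving recurs with every request, which is why the two map onto capital and operational expense respectively.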

    Rebuilding the toolkit for the continuous domain

    Moving from a finite, discrete vocabulary to an infinite, continuous vector space breaks the standard LLM toolkit. The researchers had to develop a “comprehensive likelihood-free framework” to make the new model viable.

    For training, the model cannot use a standard softmax layer or maximum likelihood estimation. To solve this, the team used a “likelihood-free” objective with an Energy Transformer, which rewards the model for accurate predictions without computing explicit probabilities.
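
    One well-known likelihood-free objective is the energy score, which compares samples drawn from the model against the observed target and never evaluates a probability. The NumPy sketch below shows the general idea under that assumption. It is not the paper's implementation; the function name, shapes, and toy data are illustrative only.

```python
import numpy as np


def energy_score(samples: np.ndarray, target: np.ndarray) -> float:
    """Sample-based estimate of the energy score (lower is better).

    samples: array of shape (n, d), n >= 2 draws from the model.
    target:  array of shape (d,), the observed continuous vector.
    Estimates E||X - y|| - 0.5 * E||X - X'|| using only samples.
    """
    samples = np.asarray(samples, dtype=float)
    target = np.asarray(target, dtype=float)
    n = samples.shape[0]
    if n < 2:
        raise ValueError("need at least two samples")
    # Fidelity term: average distance from each sample to the target.
    fidelity = np.linalg.norm(samples - target, axis=1).mean()
    # Diversity term: average distance between distinct pairs of samples.
    pairwise = np.linalg.norm(samples[:, None, :] - samples[None, :, :], axis=2)
    diversity = pairwise.sum() / (n * (n - 1))  # diagonal entries are zero
    return float(fidelity - 0.5 * diversity)


rng = np.random.default_rng(0)
target = np.zeros(8)
well_placed = rng.normal(loc=0.0, scale=1.0, size=(256, 8))  # centred on the target
misplaced = rng.normal(loc=3.0, scale=1.0, size=(256, 8))    # centred far away

print(energy_score(well_placed, target))
print(energy_score(misplaced, target))
```

    The first term rewards samples that land near the target, while the second stops the model from collapsing onto a single point, so the score can serve as a training signal without a softmax.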

    This new training method also required a new evaluation metric. Standard metrics such as perplexity are inapplicable because they rely on the very likelihoods the model no longer computes.

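    The team proposed BrierLM, a novel metric based on the Brier score that can be estimated purely from model samples. Validation confirmed BrierLM as a reliable alternative, showing a “Spearman’s rank correlation of -0.991” with traditional loss metrics. The correlation is negative because a lower loss corresponds to a higher BrierLM score, so the two metrics rank models in almost exactly the same order.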
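
    The reason a Brier-style score can be estimated from samples alone is that its two ingredients, the probability of the correct outcome and the sum of squared probabilities, are both expectations of simple equality checks. The sketch below demonstrates this for the classic Brier score, where lower is better, on a toy three-word distribution. It is not the paper's BrierLM definition, which may be scaled and oriented differently; all names and numbers are illustrative.

```python
import random


def brier_exact(probs: dict, outcome: str) -> float:
    """Classic Brier score: sum over classes of (p_k - 1[k == outcome])^2."""
    return sum((p - (1.0 if k == outcome else 0.0)) ** 2 for k, p in probs.items())


def brier_from_samples(sample, outcome: str, num_pairs: int) -> float:
    """Estimate the Brier score using only draws from `sample()`.

    For two independent draws x1 and x2:
      P(x1 == x2)      = sum_k p_k^2
      P(x1 == outcome) = p_outcome
    so 1[x1 == x2] - 1[x1 == outcome] - 1[x2 == outcome] + 1 is unbiased.
    """
    total = 0.0
    for _ in range(num_pairs):
        x1, x2 = sample(), sample()
        total += (x1 == x2) - (x1 == outcome) - (x2 == outcome) + 1
    return total / num_pairs


# Toy model: the probabilities are used to build the sampler and to check
# the estimate, but the estimator itself never reads them.
probs = {"cat": 0.6, "dog": 0.3, "bird": 0.1}
rng = random.Random(0)
words, weights = list(probs), list(probs.values())


def sample() -> str:
    return rng.choices(words, weights=weights)[0]


print("exact    :", brier_exact(probs, "cat"))
print("estimated:", brier_from_samples(sample, "cat", num_pairs=50_000))
```

    With enough sample pairs the estimate converges on the exact value, which is what makes this family of metrics usable for models that expose a sampler but no probabilities.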

    Finally, the framework restores controlled generation, a key feature for enterprise use. Standard temperature sampling rescales the model's output probabilities, so it cannot be applied when those probabilities are never computed. The paper introduces a new “likelihood-free sampling algorithm,” including a practical batch approximation method, to manage the trade-off between output accuracy and diversity.
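
    A simple way to see that temperature can be controlled with nothing but a sampler: draw n independent samples and keep the result only when all n agree. The accepted outcomes are then distributed in proportion to p(x)^n, which matches temperature T = 1/n. The sketch below illustrates this rejection idea on a toy distribution. It is an assumption-laden illustration, not the paper's algorithm, and every name in it is hypothetical.

```python
import random
from collections import Counter


def sample_low_temperature(sample, n: int, max_tries: int = 100_000):
    """Draw from a distribution proportional to p(x)**n (T = 1/n).

    Uses only the black-box sampler `sample()`: n draws are accepted
    when they are all identical, otherwise the attempt is rejected.
    """
    if n < 1:
        raise ValueError("n must be >= 1")
    for _ in range(max_tries):
        draws = [sample() for _ in range(n)]
        if all(d == draws[0] for d in draws):
            return draws[0]
    raise RuntimeError("no agreeing draws within max_tries")


probs = {"cat": 0.6, "dog": 0.3, "bird": 0.1}
rng = random.Random(0)
words, weights = list(probs), list(probs.values())


def sample() -> str:
    return rng.choices(words, weights=weights)[0]


# Empirical distribution at T = 1/2 versus the sharpened target p^2 / sum(p^2).
counts = Counter(sample_low_temperature(sample, n=2) for _ in range(20_000))
norm = sum(p ** 2 for p in probs.values())
for word, p in probs.items():
    print(word, round(counts[word] / 20_000, 3), "target", round(p ** 2 / norm, 3))
```

    The acceptance rate falls quickly as n grows or as the distribution becomes flatter, which suggests why an approximate batch method is useful in practice.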

    Reducing enterprise AI costs

    This research offers a glimpse into a future where generative AI is not defined purely by ever-larger parameter counts, but by architectural efficiency.

    The current path of scaling models is hitting a wall of diminishing returns and escalating costs. The CALM framework establishes a “new design axis for LLM scaling: increasing the semantic bandwidth of each generative step”.

    While this is a research framework and not an off-the-shelf product, it points to a powerful and scalable pathway towards ultra-efficient language models. When evaluating vendor roadmaps, tech leaders should look beyond model size and begin asking about architectural efficiency.

    The ability to reduce FLOPs per generated token will become a defining competitive advantage, enabling AI to be deployed more economically and sustainably across the enterprise, from the data centre to data-heavy edge applications.
