    Enterprises are rethinking AI infrastructure as inference costs rise

By gvfx00@gmail.com | November 30, 2025


AI spending in Asia Pacific continues to rise, yet many companies still struggle to get value from their AI projects. Much of this comes down to the infrastructure that supports AI: most systems are not built to run inference at the speed or scale real applications need. Industry studies show that many projects miss their ROI goals even after heavy investment in GenAI tools, largely because of this infrastructure gap.

    The gap shows how much AI infrastructure influences performance, cost, and the ability to scale real-world deployments in the region.

    Akamai is trying to address this challenge with Inference Cloud, built with NVIDIA and powered by the latest Blackwell GPUs. The idea is simple: if most AI applications need to make decisions in real time, then those decisions should be made close to users rather than in distant data centres. That shift, Akamai claims, can help companies manage cost, reduce delays, and support AI services that depend on split-second responses.

    Jay Jenkins, CTO of Cloud Computing at Akamai, explained to AI News why this moment is forcing enterprises to rethink how they deploy AI and why inference, not training, has become the real bottleneck.

Table of Contents

      • Why AI projects struggle without the right infrastructure
      • Why inference now demands more attention than training
      • How edge infrastructure improves AI performance and cost
      • Where edge-based AI is gaining traction
      • Why cloud and GPU partnerships matter more now
      • The infrastructure needed to support agentic AI and automation
      • What companies need to prepare for next

    Why AI projects struggle without the right infrastructure

    Jenkins says the gap between experimentation and full-scale deployment is much wider than many organisations expect. “Many AI initiatives fail to deliver on expected business value because enterprises often underestimate the gap between experimentation and production,” he says. Even with strong interest in GenAI, large infrastructure bills, high latency, and the difficulty of running models at scale often block progress.

    Jay Jenkins, CTO of Cloud Computing at Akamai.

    Most companies still rely on centralised clouds and large GPU clusters. But as use grows, these setups become too expensive, especially in regions far from major cloud zones. Latency also becomes a major issue when models have to run multiple steps of inference over long distances. “AI is only as powerful as the infrastructure and architecture it runs on,” Jenkins says, adding that latency often weakens the user experience and the value the business hoped to deliver. He also points to multi-cloud setups, complex data rules, and growing compliance needs as common hurdles that slow the move from pilot projects to production.
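To make the latency point concrete, here is a minimal back-of-the-envelope sketch; the round-trip and compute times are illustrative assumptions, not Akamai measurements. A request that chains several inference calls pays the network round trip on every step, so distance multiplies:

```python
# Illustrative latency model for chained inference calls.
# All timings are assumptions for this sketch, not vendor figures.

def total_latency_ms(steps: int, rtt_ms: float, compute_ms: float) -> float:
    """Network round trip plus model compute, paid once per chained call."""
    return steps * (rtt_ms + compute_ms)

STEPS = 5          # chained inference calls in one user request (assumed)
COMPUTE_MS = 40.0  # assumed model compute time per call

# Assumed round trips: a distant centralised region vs. a nearby edge site.
central = total_latency_ms(STEPS, rtt_ms=180.0, compute_ms=COMPUTE_MS)
edge = total_latency_ms(STEPS, rtt_ms=15.0, compute_ms=COMPUTE_MS)

print(f"centralised: {central:.0f} ms, edge: {edge:.0f} ms")
# centralised: 1100 ms, edge: 275 ms -- the gap is pure network distance.
```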

    Why inference now demands more attention than training

    Across Asia Pacific, AI adoption is shifting from small pilots to real deployments in apps and services. Jenkins notes that as this happens, day-to-day inference – not the occasional training cycle – is what consumes most computing power. With many organisations rolling out language, vision, and multimodal models in multiple markets, the demand for fast and reliable inference is rising faster than expected. This is why inference has become the main constraint in the region. Models now need to operate in different languages, regulations, and data environments, often in real time. That puts enormous pressure on centralised systems that were never designed for this level of responsiveness.

    How edge infrastructure improves AI performance and cost

    Jenkins says moving inference closer to users, devices, or agents can reshape the cost equation. Doing so shortens the distance data must travel and allows models to respond faster. It also avoids the cost of routing huge volumes of data between major cloud hubs.

Physical AI systems – robots, autonomous machines, or smart city tools – depend on decisions made in milliseconds. When inference runs in a distant region, these systems cannot respond in time and fail to work as expected.

The savings from more localised deployments can also be substantial. Jenkins says Akamai analysis shows enterprises in India and Vietnam see large reductions in the cost of running image-generation models when workloads are placed at the edge rather than in centralised clouds. Better GPU utilisation and lower egress fees played a major role in those savings.
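One rough way to reason about those two levers is to model per-request cost as effective GPU time plus egress; every unit price and utilisation figure below is a made-up placeholder, not Akamai's analysis:

```python
# Toy per-request cost model: GPU seconds plus data egress.
# Unit prices and utilisation figures are illustrative assumptions only.

def cost_per_request(gpu_seconds: float, gpu_hourly_usd: float,
                     utilisation: float, egress_gb: float,
                     egress_usd_per_gb: float) -> float:
    # Low utilisation inflates effective GPU cost: idle capacity is still billed.
    gpu_cost = gpu_seconds * (gpu_hourly_usd / 3600.0) / utilisation
    return gpu_cost + egress_gb * egress_usd_per_gb

# Centralised: long-haul egress fees and an assumed 40% GPU utilisation.
central = cost_per_request(2.0, 4.0, 0.40, egress_gb=0.05, egress_usd_per_gb=0.09)
# Edge: responses stay local (cheap egress) and an assumed 70% utilisation.
edge = cost_per_request(2.0, 4.0, 0.70, egress_gb=0.05, egress_usd_per_gb=0.01)

print(f"central: ${central:.4f}/request, edge: ${edge:.4f}/request")
```

Under these invented numbers the edge request costs roughly a third as much, and splitting the total this way shows which lever, utilisation or egress, dominates for a given workload.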

    Where edge-based AI is gaining traction

    Early demand for edge inference is strongest from industries where even small delays can affect revenue, safety, or user engagement. Retail and e-commerce are among the first adopters because shoppers often abandon slow experiences. Personalised recommendations, search, and multimodal shopping tools all perform better when inference is local and fast.

    Finance is another area where latency directly affects value. Jenkins says workloads like fraud checks, payment approval, and transaction scoring rely on chains of AI decisions that should happen in milliseconds. Running inference closer to where data is created helps financial firms move faster and keeps data inside regulatory borders.

    Why cloud and GPU partnerships matter more now

    As AI workloads grow, companies need infrastructure that can keep up. Jenkins says this has pushed cloud providers and GPU makers into closer collaboration. Akamai’s work with NVIDIA is one example, with GPUs, DPUs, and AI software deployed in thousands of edge locations.

    The idea is to build an “AI delivery network” that spreads inference across many sites instead of concentrating everything in a few regions. This helps with performance, but it also supports compliance. Jenkins notes that almost half of large APAC organisations struggle with differing data rules across markets, which makes local processing more important. Emerging partnerships are now shaping the next phase of AI infrastructure in the region, especially for workloads that depend on low-latency responses.

    Security is built into these systems from the start, Jenkins says. Zero-trust controls, data-aware routing, and protections against fraud and bots are becoming standard parts of the technology stacks on offer.

    The infrastructure needed to support agentic AI and automation

    Running agentic systems – which make many decisions in sequence – needs infrastructure that can operate at millisecond speeds. Jenkins believes the region’s diversity makes this harder but not impossible. Countries differ widely in connectivity, rules, and technical readiness, so AI workloads must be flexible enough to run where it makes the most sense. He points to research showing that most enterprises in the region already use public cloud in production, but many expect to rely on edge services by 2027. That shift will require infrastructure that can hold data in-country, route tasks to the closest suitable location, and keep functioning when networks are unstable.
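As a minimal sketch of that placement logic, a scheduler can filter sites by the data's home country first, then pick the lowest-latency healthy site; the site names, latencies, and residency rule here are hypothetical:

```python
# Minimal placement sketch: keep data in-country, prefer the closest
# healthy site, and fall back when the nearest one is unreachable.
# Site names, latencies, and the residency rule are hypothetical.
from dataclasses import dataclass

@dataclass
class Site:
    name: str
    country: str
    rtt_ms: float
    healthy: bool

def pick_site(sites: list[Site], data_country: str) -> Site | None:
    # Residency first: only healthy sites inside the data's home country qualify.
    eligible = [s for s in sites if s.country == data_country and s.healthy]
    # Then latency: choose the closest of the remaining sites.
    return min(eligible, key=lambda s: s.rtt_ms, default=None)

sites = [
    Site("sg-edge-1", "SG", 8.0, healthy=False),   # nearest, but offline
    Site("sg-edge-2", "SG", 14.0, healthy=True),   # in-country fallback
    Site("jp-central", "JP", 70.0, healthy=True),  # fast enough, wrong country
]
chosen = pick_site(sites, data_country="SG")
print(chosen.name if chosen else "no eligible site")  # sg-edge-2
```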

    What companies need to prepare for next

    As inference moves to the edge, companies will need new ways to manage operations. Jenkins says organisations should expect a more distributed AI lifecycle, where models are updated across many sites. This requires better orchestration and strong visibility into performance, cost, and errors in core and edge systems.
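A hedged illustration of that visibility requirement, with site names, fields, and thresholds invented for the example: per-site inference metrics are rolled up centrally, and sites are flagged when their model version or error rate drifts from the fleet's target:

```python
# Sketch of fleet-wide visibility: per-site metrics rolled up centrally,
# flagging model-version drift and error-rate outliers.
# Site names, field names, and thresholds are illustrative assumptions.

SITES = {
    "sg-edge-2": {"model_version": "v12", "error_rate": 0.004, "p99_ms": 120},
    "in-edge-1": {"model_version": "v12", "error_rate": 0.031, "p99_ms": 140},
    "vn-edge-3": {"model_version": "v11", "error_rate": 0.005, "p99_ms": 115},
}

TARGET_VERSION = "v12"   # version the current rollout should converge on
ERROR_BUDGET = 0.01      # maximum tolerated error rate per site

for name, metrics in SITES.items():
    if metrics["model_version"] != TARGET_VERSION:
        print(f"{name}: stale model {metrics['model_version']}, needs rollout")
    if metrics["error_rate"] > ERROR_BUDGET:
        print(f"{name}: error rate {metrics['error_rate']:.1%} over budget")
```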

    Data governance becomes more complex but also more manageable when processing stays local. Half of the region’s large enterprises already struggle with the variance in regulations, so placing inference closer to where data is generated can help.

Security also needs more attention. While spreading inference to the edge can improve resilience, it also means every site must be secured. Firms need to protect APIs and data pipelines, and guard against fraud and bot attacks. Jenkins notes that many financial institutions already rely on Akamai's controls in these areas.

    (Photo by Igor Omilaev)

