    Baidu ERNIE multimodal AI beats GPT and Gemini in benchmarks

    By gvfx00@gmail.com | November 13, 2025 | 4 Mins Read


    Baidu’s latest ERNIE model, an efficiency-focused multimodal AI, outscores GPT and Gemini on some key benchmarks reported by Baidu, and targets enterprise data often ignored by text-focused models.

    For many businesses, valuable insights are locked in engineering schematics, factory-floor video feeds, medical scans, and logistics dashboards. Baidu’s new model, ERNIE-4.5-VL-28B-A3B-Thinking, is designed to fill this gap.

    What’s interesting to enterprise architects is not just its multimodal capability, but its architecture. It’s described as a “lightweight” model, activating only three billion of its 28 billion parameters during operation (the “28B-A3B” in its name). This approach targets the high inference costs that often stall AI-scaling projects. Baidu is betting on efficiency as a path to adoption, training the system as a foundation for “multimodal agents” that can reason and act, not just perceive.


    Complex visual data analysis capabilities supported by AI benchmarks

    Baidu’s multimodal ERNIE AI model excels at handling dense, non-text data. For example, it can interpret a “Peak Time Reminder” chart to find optimal visiting hours, a task that reflects the resource-scheduling challenges in logistics or retail.

    ERNIE 4.5 also shows capability in technical domains, like solving a bridge circuit diagram by applying Ohm’s and Kirchhoff’s laws. For R&D and engineering arms, a future assistant could validate designs or explain complex schematics to new hires.

    This capability is supported by Baidu’s benchmarks, which show ERNIE-4.5-VL-28B-A3B-Thinking outperforming competitors like GPT-5-High and Gemini 2.5 Pro on some key tests:

    • MathVista: ERNIE (82.5) vs Gemini (82.3) and GPT (81.3)
    • ChartQA: ERNIE (87.1) vs Gemini (76.3) and GPT (78.2)
    • VLMs Are Blind: ERNIE (77.3) vs Gemini (76.5) and GPT (69.6)

    It’s worth noting, of course, that AI benchmarks provide a guide but can be flawed. Always perform internal tests for your needs before deploying any AI model for mission-critical applications.
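To put those figures in proportion, the short sketch below computes ERNIE's lead over the stronger of the two rivals on each test. The scores are the ones quoted above; the function and variable names are illustrative only.

```python
# Scores as quoted in the article (Baidu-reported benchmarks).
SCORES = {
    "MathVista": {"ERNIE": 82.5, "Gemini": 82.3, "GPT": 81.3},
    "ChartQA": {"ERNIE": 87.1, "Gemini": 76.3, "GPT": 78.2},
    "VLMs Are Blind": {"ERNIE": 77.3, "Gemini": 76.5, "GPT": 69.6},
}


def lead_over_best_rival(scores: dict[str, float], model: str = "ERNIE") -> float:
    """Return the model's margin over its strongest competitor, in points."""
    best_rival = max(v for name, v in scores.items() if name != model)
    return round(scores[model] - best_rival, 1)


margins = {bench: lead_over_best_rival(s) for bench, s in SCORES.items()}

for bench, margin in margins.items():
    print(f"{bench}: +{margin} points")
```

The lead is under one point on MathVista and VLMs Are Blind and close to nine points on ChartQA, which is one more reason to treat the headline result as a prompt for internal testing rather than a verdict.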

    Baidu shifts from perception to automation with its latest ERNIE AI model

    The primary hurdle for enterprise AI is moving from perception (“what is this?”) to automation (“what now?”). Baidu claims ERNIE 4.5 addresses this by integrating visual grounding with tool use.

    For example, when asked to find all people wearing suits in an image and return their coordinates in JSON format, the model generates the structured data. That function transfers readily to a production line for visual inspection, or to a system auditing site images for safety compliance.
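Downstream systems would still need to validate that JSON before acting on it. The sketch below shows one way to do so. The article does not give the output schema, so the `label` and `bbox` fields (pixel coordinates `[x1, y1, x2, y2]`), the sample response, and all names here are assumptions for illustration.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    label: str
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def area(self) -> float:
        return (self.x2 - self.x1) * (self.y2 - self.y1)


def parse_detections(raw: str, width: int, height: int) -> list[Detection]:
    """Parse model output and drop boxes that are malformed or out of frame."""
    detections = []
    for item in json.loads(raw):
        if not isinstance(item, dict):
            continue  # skip anything that is not a JSON object
        bbox = item.get("bbox", [])
        if len(bbox) != 4:
            continue  # skip entries without four coordinates
        x1, y1, x2, y2 = map(float, bbox)
        in_frame = 0 <= x1 < x2 <= width and 0 <= y1 < y2 <= height
        if in_frame:
            detections.append(Detection(str(item.get("label", "")), x1, y1, x2, y2))
    return detections


# Hypothetical response text; the real schema may differ.
SAMPLE = """[
  {"label": "person in suit", "bbox": [40, 60, 180, 400]},
  {"label": "person in suit", "bbox": [300, 80, 420, 410]},
  {"label": "person in suit", "bbox": [600, 90, 700]}
]"""

people = parse_detections(SAMPLE, width=640, height=480)
print(len(people), "valid boxes")
```

Rejecting malformed or out-of-frame boxes at this boundary keeps a single bad generation from propagating into an inspection or compliance pipeline.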

    The model also manages external tools and can autonomously zoom in on a photograph to read small text. If it faces an unknown object, it can trigger an image search to identify it. This represents a less passive form of AI that could power an agent to not only flag a data centre error, but also zoom in on the code, search the internal knowledge base, and suggest the fix.
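The zoom-then-search behaviour follows the usual tool-calling pattern: the model requests an action, the host runs it, and the result is handed back until the model gives a final answer. The sketch below illustrates that loop only. The tool names, their return values, and the scripted "model responses" are all hypothetical stand-ins, not Baidu's API.

```python
from typing import Callable


# Hypothetical tools; real ones would crop pixels or call a search service.
def zoom(box: list[int]) -> str:
    x1, y1, x2, y2 = box
    return f"cropped region {x2 - x1}x{y2 - y1} at ({x1}, {y1})"


def image_search(query: str) -> str:
    return f"search results for '{query}'"


TOOLS: dict[str, Callable[..., str]] = {"zoom": zoom, "image_search": image_search}


def run_agent(steps: list[dict], max_steps: int = 5) -> list[str]:
    """Execute requested tool calls until a final answer appears.

    `steps` stands in for successive model responses (an assumption made so
    the sketch runs without a model).
    """
    transcript = []
    for step in steps[:max_steps]:
        if step["action"] == "final":
            transcript.append(f"answer: {step['text']}")
            break
        tool = TOOLS.get(step["action"])
        if tool is None:
            transcript.append(f"unknown tool: {step['action']}")
            continue
        transcript.append(tool(**step["args"]))
    return transcript


scripted = [
    {"action": "zoom", "args": {"box": [100, 50, 300, 150]}},
    {"action": "image_search", "args": {"query": "rack-mounted switch"}},
    {"action": "final", "text": "label reads 'PSU fault'"},
]
log = run_agent(scripted)
print("\n".join(log))
```

The `max_steps` cap and the explicit tool registry are the governance hooks: they bound how long an agent can run and which actions it is allowed to take.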

    Unlocking business intelligence with multimodal AI

    Baidu’s latest ERNIE AI model also targets corporate video archives, from training sessions and meetings to security footage. It can extract all on-screen subtitles and map them to their precise timestamps.

    It also demonstrates temporal awareness, finding specific scenes (like those “filmed on a bridge”) by analysing visual cues. The clear end-goal is making vast video libraries searchable, allowing an employee to find the exact moment a specific topic was discussed in a two-hour webinar they may have dozed off a couple of times during.
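Once subtitles are paired with timestamps, making a recording searchable is a small step. The sketch below assumes the extracted cues arrive as (start time in seconds, text) pairs; that shape, the sample cues, and the function names are illustrative assumptions, not the model's actual output format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cue:
    start_s: float  # seconds from the start of the video
    text: str


def to_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS."""
    total = int(seconds)
    return f"{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"


def find_moments(cues: list[Cue], query: str) -> list[str]:
    """Return timestamps of cues whose text contains the query (case-insensitive)."""
    needle = query.lower()
    return [to_timestamp(c.start_s) for c in cues if needle in c.text.lower()]


# Made-up cues standing in for model-extracted subtitles.
cues = [
    Cue(75.0, "Welcome to the quarterly review"),
    Cue(3725.5, "Next, the budget forecast for Q3"),
    Cue(5400.0, "Questions on the budget?"),
]
hits = find_moments(cues, "budget")
print(hits)
```

A production system would likely swap the substring match for a full-text or embedding index, but the data shape stays the same: text keyed by time.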

    Baidu provides deployment guidance for several paths, including transformers, vLLM, and FastDeploy. However, the hardware requirements are a major barrier: a single-card deployment needs 80GB of GPU memory. This is not a tool for casual experimentation, but for organisations with existing, high-performance AI infrastructure.
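A back-of-the-envelope calculation shows why a "lightweight" model can still call for an 80GB card. It assumes 16-bit weights and reads the parameter counts from the model name (28B total, 3B active); neither the precision nor the memory breakdown is stated in the article, so treat the numbers as a rough sketch.

```python
TOTAL_PARAMS = 28e9   # "28B" in the model name
ACTIVE_PARAMS = 3e9   # "A3B": parameters activated during operation
BYTES_PER_PARAM = 2   # assumption: 16-bit (bf16/fp16) weights
CARD_GB = 80          # single-card requirement quoted in the article

weights_gb = TOTAL_PARAMS * BYTES_PER_PARAM / 1e9
headroom_gb = CARD_GB - weights_gb
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

print(f"weights: {weights_gb:.0f} GB, headroom: {headroom_gb:.0f} GB")
print(f"active per token: {active_fraction:.1%}")
```

Under these assumptions the weights alone take about 56GB, leaving roughly 24GB for activations, caches, and image or video inputs. Activating only a fraction of the parameters reduces compute per token, but all of the weights still have to sit in memory.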

    For those with the hardware, Baidu’s ERNIEKit toolkit allows fine-tuning on proprietary data, a necessity for most high-value use cases. Baidu is releasing its latest ERNIE AI model under an Apache 2.0 licence that permits commercial use, which is essential for adoption.

    The market is finally moving toward multimodal AI that can see, read, and act within a specific business context, and the benchmarks suggest it’s doing so with impressive capability. The immediate task is to identify high-value visual reasoning jobs within your own operation and weigh them against the substantial hardware and governance costs.

