Close Menu

    Subscribe to Updates

    Get the latest news from tastytech.

    What's Hot

    PS5 owners are facing massive SSD storage price hikes

    June 18, 2026

    If Patricia From ‘Widow’s Bay’ Officially Won You Over, You Need Her 2-Part Comedy Series

    June 18, 2026

    BMW Hints More Wagons Could Be Coming to America

    June 18, 2026
    Facebook X (Twitter) Instagram
    Facebook X (Twitter) Instagram
    tastytech.intastytech.in
    Subscribe
    • AI News & Trends
    • Tech News
    • AI Tools
    • Business & Startups
    • Guides & Tutorials
    • Tech Reviews
    • Automobiles
    • Gaming
    • movies
    tastytech.intastytech.in
    Home»AI News & Trends»In game theory, generalists sometimes win out over specialists | MIT News
    In game theory, generalists sometimes win out over specialists | MIT News
    AI News & Trends

    In game theory, generalists sometimes win out over specialists | MIT News

    gvfx00@gmail.comBy gvfx00@gmail.comJune 18, 2026No Comments6 Mins Read
    Share
    Facebook Twitter LinkedIn Pinterest Email



    Whether you’re playing poker against a single opponent or find yourself in a bidding war over a home purchase with another prospective buyer, you are operating under conditions of imperfect information. You know what cards you’re holding in the poker game, and you also know how much above the home’s asking price you can afford, but you don’t know your opponent’s hand in the card game or how high the other home buyer is willing to go. 

    A paper co-authored by MIT researchers and presented in April at the International Conference on Learning Representations in Rio De Janeiro won’t tell you what to do in these situations, specifically. But it does offer new insights into so-called imperfect-information games that involve two contestants facing off in a “zero-sum” competition, where one player’s gain means the other player’s loss.

    MIT researchers on the project include Sobhan Mohammadpour, a PhD student in MIT’s Department of Electrical Engineering and Computer Science (EECS) and the Laboratory for Information and Decision Systems (LIDS); and Gabriele Farina, an assistant professor in EECS and a principal investigator at LIDS. Additional co-authors include Max Rudolph of the University of Texas at Austin (UT), Nathan Lichtlé of the University of California at Berkeley (UCB), Alexandre Bayen of UCB, J. Zico Kolter of Carnegie Mellon University (CMU), Amy X. Zhang ’11, MNG ’12 of UT; Eugene Vinitsky of New York University; and Samuel Sokota of CMU. 

    The focus of the new work is on algorithms that could be used to train neural networks to participate in imperfect-information games. The assumption, long-held in the field, was that algorithms grounded in principles of game theory would, in this setting, clearly outcompete a general-purpose variety of algorithms called policy gradient methods, which came into use for decision-making in the 1990s. The term “policy” in this context basically means strategy, whereas “gradient” refers to a path that leads in the direction of greatest change — to the top (or bottom) of a hill, for example. Policy gradient methods are being used to train neural networks to make decisions that move — in small, sequential steps — toward a particular goal (like reaching a summit, metaphorically speaking), with continual adjustments and course corrections made along the way to bring the agent closer to the intended destination.

    Although strategic games were not on the original agenda when policy gradient methods were conceived in the early 1990s, the authors of the new paper still wondered how this class of algorithms might fare in two-player games. These methods become more complicated to analyze in multi-agent settings, according to Farina. “There is still a direction you can move in to improve your circumstances, but, because of the other player’s actions, that direction can constantly change over the course of the game. And those shifts can be rapid.”

    “It had been pretty much taken for granted that specialized game-theoretic algorithms were the right approach for this setting,” says Sokota. “Our study showed that policy gradient methods can work better than these specialized algorithms, and that the specialized algorithms may not work as well as people thought — which raises an interesting sociological question about why this went unnoticed for so long. Part of the answer is that the field hadn’t done the engineering work required to rigorously evaluate the algorithms, so it was hard to tell what worked and what didn’t.”

    Consequently, a major contribution of this work has been to provide an even-handed way of appraising different algorithms that can teach agents — i.e., neural networks — how to compete in imperfect-information games. “We’re taking a different approach,” notes Rudolph. “Unlike many of the papers published in this field, we’re not proposing a new algorithm that can beat out other algorithms. We’re proposing a benchmark that can assess these algorithms.”

    Simply put, a benchmark consists of software designed to rate the performance of algorithms. “What we’re offering is a testing grounds, or playing grounds, where people can take their algorithms, train them for a specific task, and see how well they do,” says Farina.

    The group calculates a player’s performance in terms of a concept called exploitability, which measures how well a player does against the “worst-case adversary,” Sokota explains. “In a game like poker, this opponent wouldn’t know what my hand is, but would know how I would behave for any given hand.” Achieving a zero on this scale implies perfect play, whereas a high exploitability score indicates far-from-optimal play.

    Five games were played in experiments carried out by the team: two versions of Phantom Tic-Tac-Toe, in which players can’t see what their opponent has done, along with two imperfect-information variants of a board game called Hex, and another game of deception called Liar’s Dice.

    The biggest challenge faced by the researchers was getting the exploitability measure to work on games of this size, which may include as many as 30 billion states. A “state” in this case is not just all the possible board positions, but also encompasses the entire history of the game, including every step and misstep along the way. 

    “It’s like looking into a dark room that’s filled with objects you can’t see,” says Mohammadpour. “Somehow, you need to figure out where these objects are and exactly how they got there.” Previous researchers, Mohammadpour adds, have typically used exploitability for games that are 100,000 times smaller than the ones analyzed in their study.

    In the experiments carried out on these five games, neural networks trained with policy gradient algorithms got better (lower) exploitability scores than networks trained on game theory-based algorithms. In head-to-head competitions, which took place in the next round, the policy gradient-trained networks again beat their game theory-trained opponents. “Those results were reassuring,” Rudolph says, “because they give us more confidence in our benchmarking approach.”

    The team has made their benchmarking software freely available and convenient to use. “You don’t need a supercomputer,” Mohammadpour says. “You can run it on an ordinary laptop. And all you have to do is add a single line of code to a commonly used collection of benchmarking software called OpenSpiel.”

    Although their experiments involved some fairly obscure games, Farina would like to put this work into a broader context. “Keep in mind that the term ‘game’ really applies to any multi-agent strategic interaction,” he says. “So the lessons we learn from this research are by no means limited to recreational games.”

    Vinitsky agrees. “Hidden information is a very important property of the world,” he says. “It pervades a range of things — including military operations, trading scenarios, and negotiations — all of which are carried out under conditions of hidden information. The idea that we can improve on these games suggests that we can also do better in these other settings as well.”

    Ian Gemp — a computer scientist and game theory expert at Google DeepMind who was not involved in this study — finds these results encouraging. “This work serves as a compelling reminder,” he says, “that modernizing classical tools [like policy gradient methods] remains a highly productive path for solving complex strategic problems.”

    Table of Contents

    Toggle
      • Related posts:
    • World Leaders Gather in India to Discuss AI’s Future. Competition Hangs in the Air.
    • Gemini är nu en universal translator
    • Simpler models can outperform deep learning at climate prediction | MIT News

    Related posts:

    Ray Kurzweil ’70 reinforces his optimism in tech progress | MIT News

    A “ChatGPT for spreadsheets” helps solve difficult engineering challenges faster | MIT News

    Best Unfiltered AI GF Chatbots: Tested

    Share. Facebook Twitter Pinterest LinkedIn Tumblr Email
    Previous ArticleGoogle Cloud generative AI automates council planning operations
    Next Article BMW Hints More Wagons Could Be Coming to America
    gvfx00@gmail.com
    • Website

    Related Posts

    AI News & Trends

    Could AI tell you where you left your keys? | MIT News

    June 17, 2026
    AI News & Trends

    MIT’s Initiative for New Manufacturing builds momentum | MIT News

    June 17, 2026
    AI News & Trends

    When it comes to predicting people’s preferences, it pays to consider “the power of three” | MIT News

    June 12, 2026
    Add A Comment
    Leave A Reply Cancel Reply

    Top Posts

    Black Swans in Artificial Intelligence — Dan Rose AI

    October 2, 2025198 Views

    Every Clue That Tony Stark Was Always Doctor Doom

    October 20, 2025127 Views

    We let ChatGPT judge impossible superhero debates — here’s how it ruled

    December 31, 202598 Views
    Stay In Touch
    • Facebook
    • YouTube
    • TikTok
    • WhatsApp
    • Twitter
    • Instagram

    Subscribe to Updates

    Get the latest tech news from tastytech.

    About Us
    About Us

    TastyTech.in brings you the latest AI, tech news, cybersecurity tips, and gadget insights all in one place. Stay informed, stay secure, and stay ahead with us!

    Most Popular

    Black Swans in Artificial Intelligence — Dan Rose AI

    October 2, 2025198 Views

    Every Clue That Tony Stark Was Always Doctor Doom

    October 20, 2025127 Views

    We let ChatGPT judge impossible superhero debates — here’s how it ruled

    December 31, 202598 Views

    Subscribe to Updates

    Get the latest news from tastytech.

    Facebook X (Twitter) Instagram Pinterest
    • Homepage
    • About Us
    • Contact Us
    • Privacy Policy
    © 2026 TastyTech. Designed by TastyTech.

    Type above and press Enter to search. Press Esc to cancel.

    Ad Blocker Enabled!
    Ad Blocker Enabled!
    Our website is made possible by displaying online advertisements to our visitors. Please support us by disabling your Ad Blocker.