Lex'Recap - Noam Brown: AI vs Humans in Poker and Games of Strategic Negotiation

Introduction

> Building an AI that approximates the Nash equilibrium without trying to adapt or exploit individual players can outperform even top human experts. Our bot proved this by crushing professional poker players in heads-up no limit Hold'em, despite not engaging in psychological tactics like reading souls.

> Developing an AI that can strategically out-negotiate humans using natural language in complex environments like the board game Diplomacy shows the potential for AI systems to handle intricate social interactions and negotiations, revealing a promising leap in the intersection of AI and human communication.

No Limit Texas Hold 'em

> One of the key differences between No Limit Texas Hold'em poker and other variants is the ability to bet any amount, leading to rapidly escalating stakes. It's not just about strategy, but also about managing the pressure of high-stakes situations. In poker, maximizing expected value is crucial, but in No Limit Hold'em, the potential for large bets can lead players to make risky moves or feel overwhelmed by the stakes, creating both opportunities and challenges in gameplay.

Solving poker

> The beauty of poker lies in the idea that there's an "objectively correct way" to play, which fascinates me deeply: "If you could figure out what that is, then you're... making unlimited money." That promise captures both the thrill of strategy and the challenge of mastering the game.

> The distinction between creating an AI that wins and one that is "fun to play with" is crucial. AI should not only aim to optimize game outcomes, but also to engage players emotionally and create memorable experiences, reminiscent of how people enjoy games like Civilization.

> The potential for AI to revolutionize NPC interactions in gaming is exciting. By utilizing language models, we could see a shift from combat-focused gameplay to more nuanced, drama-rich interactions that allow for "the push and pull of human connection" within the gaming world.

Poker vs Chess

> One of the key insights about poker compared to games like chess or Go is the imperfect information aspect. In Texas Hold'em, this means players see only their own two face-down cards and must reason about what others might hold, using tactics like bluffing to manipulate opponents' beliefs. This complexity, balancing probabilities against hidden information, is where poker diverges most sharply from perfect-information games like chess.

> When we developed the poker bot Libratus, our primary goal was to design an algorithm capable of maintaining a perfect balance between different actions, especially bluffing. In high-level poker play, it’s essential to avoid patterns that opponents can exploit, thus maintaining unpredictability. Expert strategies, in both bots and human play, aim to approximate the Nash equilibrium: a balanced approach that remains theoretically unbeatable in expectation even if an opponent deduces the strategy over numerous hands.
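
As a rough illustration of what approximating a Nash equilibrium through self-play can look like, the sketch below runs regret matching on rock-paper-scissors, a far smaller zero-sum game than poker. It is not Libratus's actual algorithm; the game, the function names (`payoff`, `train`, `exploitability`), and the biased starting point are assumptions chosen for illustration. The time-averaged strategy drifts toward the balanced one-third mix, and `exploitability` measures how much a best-responding opponent could still win against a strategy in expectation.

```python
ACTIONS = 3  # 0 = rock, 1 = paper, 2 = scissors


def payoff(a, b):
    """Payoff to the player choosing `a` against `b`: +1 win, 0 tie, -1 loss."""
    return (0, 1, -1)[(a - b) % 3]


def strategy_from_regrets(regrets):
    """Regret matching: play each action in proportion to its positive regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / ACTIONS] * ACTIONS


def action_values(opponent_strategy):
    """Expected payoff of each pure action against a mixed opponent strategy."""
    return [
        sum(opponent_strategy[b] * payoff(a, b) for b in range(ACTIONS))
        for a in range(ACTIONS)
    ]


def train(iterations):
    """Run simultaneous regret-matching self-play and return average strategies."""
    # Assumption: player 0 starts biased toward rock so the dynamics are not trivial.
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0] * ACTIONS for _ in range(2)]
    for _ in range(iterations):
        strategies = [strategy_from_regrets(r) for r in regrets]
        for p in range(2):
            values = action_values(strategies[1 - p])
            expected = sum(s * v for s, v in zip(strategies[p], values))
            for a in range(ACTIONS):
                regrets[p][a] += values[a] - expected
                strategy_sums[p][a] += strategies[p][a]
    # The average strategy, not the last iterate, is what approaches equilibrium.
    return [[s / iterations for s in sums] for sums in strategy_sums]


def exploitability(strategy):
    """Best expected payoff an opponent can achieve against `strategy`."""
    return max(action_values(strategy))
```

Calling `train(100_000)` and passing either returned strategy to `exploitability` should give a value close to zero, whereas a predictable strategy such as always playing rock is fully exploitable. The balanced strategy never adapts to its opponent, yet it cannot be beaten in expectation, which is the property the paragraph above describes.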

AI playing poker

> "Phil Hellmuth's success lies in being able to take advantage of predictability in other players and guiding them into deviating from optimal play."

> "The shift towards Game Theory Optimal strategies in poker was solidified by AI bots like ours that crushed top players by approximating Nash equilibrium without adapting."

> "Search plays a crucial role in game AI, as seen in landmark achievements like AlphaGo, highlighting the power of planning ahead for optimal decision-making."

> "In the Libratus competition against top poker players, humans coordinated to exploit bot weaknesses like difficulties with over-betting and handling tough all-in scenarios."

> "After years of intense work on Libratus, winning against top human players was a huge sense of accomplishment, fulfilling a dream of finding a successful poker strategy since high school."

Heads-up vs Multi-way poker

> The transition from two-player to six-player poker revealed some fascinating insights for me about game theory. Even though a Nash equilibrium strategy doesn’t carry the same guarantee against losing in six-player poker that it does in two-player zero-sum games, I found that the techniques we developed still work surprisingly well in practice. “It turns out that in six player poker, because it’s such an adversarial game...those techniques end up working really well,” which speaks to the inherent competitiveness of poker itself.

> Another key takeaway was the impact of algorithmic advancements on performance. The shift to depth-limited search was a game changer: “the final training run...cost less than $150 on AWS.” It is often clever algorithms, rather than better hardware or more data, that cut computational demands the most, and innovation in approach can matter more than sheer resource allocation.
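
To make the idea of depth-limited search concrete, here is a minimal sketch in a perfect-information toy game: players alternately take 1 to 3 stones, and whoever takes the last stone wins. Instead of searching to the end of the game, the search stops after a fixed number of plies and substitutes an estimate for the value of what remains. The game, the `leaf_estimate` function, and all other names are assumptions for illustration. Depth-limited search in an imperfect-information game such as poker is considerably more involved, because the value of a position also depends on hidden cards and on how the players will continue.

```python
MOVES = (1, 2, 3)  # a player may take 1, 2, or 3 stones


def leaf_estimate(stones):
    """Hypothetical stand-in for a learned value function.

    Returns an estimated game value for the player to move. It happens to be
    exact for this toy game; a real estimator would only be approximate.
    """
    return -1.0 if stones % 4 == 0 else 1.0


def search(stones, depth, counter):
    """Negamax value for the player to move, looking at most `depth` plies ahead."""
    counter[0] += 1  # count visited nodes so search costs can be compared
    if stones == 0:
        return -1.0  # the opponent took the last stone, so the player to move lost
    if depth == 0:
        return leaf_estimate(stones)  # stop early and trust the estimate
    return max(
        -search(stones - m, depth - 1, counter) for m in MOVES if m <= stones
    )


def best_move(stones, depth):
    """Pick the move that leaves the opponent the worst position (depth >= 1)."""
    legal = [m for m in MOVES if m <= stones]
    return max(legal, key=lambda m: -search(stones - m, depth - 1, [0]))
```

Running `search(12, 2, shallow)` and `search(12, 12, full)` with two fresh counters (`shallow = [0]`, `full = [0]`) returns the same value in both cases, but the two-ply search visits far fewer nodes. The saving comes entirely from trusting the leaf estimate, so the quality of that estimate determines how good the shallow search is.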

Greatest poker player of all time

> Poker has been transformed by AI in recent years, with top players now training with AI to refine their strategies and identify mistakes. This computational approach has significantly advanced the skill level in the game, making modern poker players much more sophisticated compared to those from the height of the poker boom on ESPN.

> It's tough to pinpoint the greatest poker player of all time due to the game's evolution, but I'd say Daniel Negreanu stands out. He's one of the few from the old school who has successfully kept up with the developments in AI and continues to study game theory optimal play, which is a remarkable testament to his adaptability and dedication to the game.

Diplomacy game

> Diplomacy is unique for its strong cooperative element: success depends on forming alliances against other players in a seven-player game set in Europe before World War One. The game combines elements of Risk, poker, and social dynamics, making it more about people than pieces. Its open communication format allows for strategic negotiations and role-playing, creating a rich and dynamic gaming experience.

> The game was created in the 50s to teach about diplomacy, with historical figures like JFK and Henry Kissinger reportedly enjoying it. Ideal gameplay involves preventing one player from winning, emphasizing the futility of war and the power of diplomacy. Despite power imbalances between countries like France and Russia, the game self-balances through player interactions, typically ending in draws and shared victories among remaining players.

AI negotiating with humans

> The complexity of Diplomacy as an AI research challenge lies primarily in its reliance on natural language communication, which goes far beyond simple piece movement: "Your action space is basically all the different sentences that you could communicate to somebody else in this game." This intricacy demands that AI not only comprehend language's nuances but also navigate the underlying social dynamics.

> While previous advancements in game AI focused on adversarial environments, Diplomacy's cooperative elements introduce a new layer of difficulty. Unlike in purely competitive games, "You can't leverage self-play to solve this game," so human behavior has to be integrated into the AI's strategy in a meaningful way.

> Training an AI bot for Diplomacy required collecting human data and enforcing a strategy that mirrored human tendencies rather than merely optimizing for machine-level play. Trained by self-play alone, "The bot can't figure out how to communicate in English... it would figure out some weird robot language," which is why the AI’s communication has to align with human expectations to foster cooperation.

> One of the unexpected findings was that the bot needed to prioritize trust over deception; "Lying would make the bot perform worse in the long run," revealing that fostering genuine relationships in diplomacy leads to better outcomes than resorting to deceit.

> This work not only lays a foundation for improved AI Diplomacy players but also provides insights into human-AI interaction, potentially aiding research into negotiation and trust dynamics. "How can we form trust between intelligent entities?" was a crucial question that emerged, opening a broader landscape for future exploration in AI and social behaviors.

AI in geopolitics

> Diplomacy-playing AI has the potential to change how we approach real-world geopolitical decisions. The game's foundational concept, born from the failures of World War One, supports the idea that war is inherently a negative-sum game. With advancements in AI, there's hope that this technology could guide leaders in making better decisions, ultimately helping to avoid destructive outcomes like warfare. "If people approach diplomacy the right way then war is ultimately unsuccessful."

> Despite the advancements, deploying AI strategies like those used in Diplomacy in real-world scenarios faces challenges. Unlike well-defined domains such as code generation or theorem proving, real-world diplomacy and negotiation lack a well-defined action space and reward function. However, the use of natural language and vast data from the internet makes Diplomacy a significant step closer to practical application compared to AI developments in other games. "That's why I do think that Diplomacy is taking a big step closer to the real world than anything that's come before in terms of game AI breakthroughs."

Human-like AI for games

> One key insight is how techniques developed for Diplomacy can lead to more human-like AI players in games like chess and Go, creating a stronger yet familiar opponent. By striking a balance between leveraging search and planning and staying close to a human policy, AI can play in a human-like style and be tuned to specific Elo ratings.
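
One way to picture the balance between search and staying close to a human policy is a KL-regularized choice rule: choose the policy that maximizes expected value minus a penalty `lam` times its KL divergence from the human policy. That objective has a closed-form solution in which each action's probability is proportional to its human-policy probability times `exp(q / lam)`. The sketch below assumes this formalization; the recap does not spell out the actual method, and the names (`regularized_policy`, `q_values`, `human_policy`, `lam`) and example numbers are hypothetical.

```python
import math


def regularized_policy(q_values, human_policy, lam):
    """Return the policy proportional to human_policy[a] * exp(q_values[a] / lam).

    q_values:     estimated value of each action (e.g. from search); assumed given
    human_policy: probability a human-imitation model assigns to each action
    lam:          regularization strength; larger values stay closer to human play
    """
    logits = [
        (math.log(p) if p > 0 else float("-inf")) + q / lam
        for p, q in zip(human_policy, q_values)
    ]
    top = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(x - top) for x in logits]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical example: search prefers action 0, humans usually play action 1.
q_values = [1.0, 0.0, 0.5]
human_policy = [0.1, 0.7, 0.2]
mostly_human = regularized_policy(q_values, human_policy, lam=10.0)
mostly_search = regularized_policy(q_values, human_policy, lam=0.05)
```

A large `lam` reproduces the human policy almost exactly, while a small `lam` approaches pure value maximization. Under this assumption, intermediate values are where a tunable playing strength, such as a target Elo rating, would come from.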

> Another important point raised is the ethical and cybersecurity challenges that arise as AI systems become more integrated in games. As AI evolves to mimic human behavior and styles, cheat detection becomes harder, raising questions about trust and fairness in human vs. human gameplay. The potential of human-like AI systems offers great opportunities for learning and enhancing gameplay, but also requires better defenses to ensure a fair and enjoyable gaming experience.

Ethics of AI

> The intersection of AI and human ethics is deeply intertwined; developing systems capable of deception raises significant questions about honesty and morality. It's not just about whether an AI should lie, but understanding the nuances of truth versus kindness—“is a white lie a bad lie?” This mirrors our own human relationships and reflects back on us as we grapple with these fundamental dilemmas.

> Engaging with AI in a diplomatic context reveals inherent biases and illuminates the potential for unity amidst division. As we confront robotic entities, it prompts us to reevaluate what it means to be sentient and how we treat systems that can exhibit suffering, pushing us toward a future where ethical considerations regarding AI might mirror civil rights discussions; it's a thrilling time to address age-old philosophical questions through the design of impactful technologies.

AGI

> One of the major challenges for AI today is data inefficiency. Current AI models, like those that play Go, require millions of samples to train effectively, whereas humans achieve proficiency with far fewer examples. Overcoming this issue is particularly crucial for deploying AI in real-world scenarios, such as robotics, where generating numerous samples is impractical.

> A potential solution to data inefficiency could involve leveraging large background models that apply general knowledge across various domains, similar to how humans approach new tasks. This might allow AI systems to use extensive, pre-existing knowledge to adapt more quickly to specific problems, reducing the need for vast amounts of training data.

Advice to beginners

> I think having a diversity of viewpoints and backgrounds is really helpful for tackling challenges in machine learning. When working with teammates, each person brings something unique to the table, making collaboration fruitful.

> Life may be more about defining a reward function that minimizes unintended consequences than about following a specific policy. Figuring out what truly matters to us, our "reward function," can be a challenging yet crucial part of navigating life's complexities.

Lex'Recap AI-generated recaps from the Lex Fridman podcast
