Why Halo Infinite's AI Bots Play More Like Humans
Is it the future of AI for games? Or just some clever game design?
2021's Halo Infinite was the first entry of the long-running shooter franchise to carry AI-controlled bots in its multiplayer modes. But unlike bots in most other games, they received high praise from critics and players alike upon release thanks to their more natural and realistic playstyles: they behave the way humans play Halo, enabling players of all skill levels to engage in all-out Spartan warfare.
For this entry of AI and Games, we return once more to the Halo franchise, to find out how exactly 343 Industries made these bots act human-like in their behaviour. The answer is using good old-fashioned AI techniques, combined with a completely different approach to designing their behaviours.
The Design Challenge
Halo Infinite is the first entry in the series to offer its main multiplayer modes as free-to-play. While it is monetised courtesy of Battle Passes, premium cosmetics, and even the main story campaign as paid DLC, releasing the game under this more contemporary monetisation model presented new challenges for developer 343 Industries. Critically, by allowing anyone to jump into a match of Slayer or Strongholds free of charge, it brought a broader range of players to the franchise than ever before.
It's easy to forget that Halo Infinite came out 20 years after the first entry of the franchise. This means that you now have a range of players coming to this game, all of whom have different experiences of the series up until that point. At one end you have people who have stuck it out since the earliest days of Halo: Combat Evolved and have played every entry of the series since. In the middle are lapsed players who fell off after Halo 3 in 2007. At the other end are people fresh to the fight who have only now picked up a Battle Rifle for the very first time.
This range of experience can be quite intimidating for a new player, and it was 343's job to find a way to improve onboarding, as well as the overall experience for both novice and seasoned players. As detailed by Sara Stern, the Senior Multiplayer Designer for Halo Infinite, in a talk at the 2022 Game Developers Conference, it's not uncommon for even fans of the Halo campaigns to not be particularly skilled at the multiplayer. While you may love each entry of the Master Chief's ongoing adventures, the skills you develop playing the campaign modes don't directly translate to the multiplayer. Conversely, even if you're good at other popular online shooters such as Overwatch, Valorant or Call of Duty, you still have to figure out how those skills must adapt to fit the Halo series.
The Sandbox Team on Halo Infinite describes it as follows:
"Halo's combat is a dynamic, rhythm of engagement… a reactive and cerebral dance that feels like a symphony of combat choices."
So with this in mind, Stern alongside a team of designers and programmers went about redefining how onboarding works in Halo Infinite. This included not just the development of bots for the multiplayer experience, but also what is known as The Academy: a new tutorialisation of the mechanics and gameplay systems of Halo multiplayer, that utilises the bots in service of its goals.
How Do They Work?
So you might be wondering what's the secret to these bots. Are they running on some new shiny machine learning algorithm? Did 343 Industries pull down gigabytes of replay data and feed it to a deep neural network like Google DeepMind did with their AlphaStar bot? Surely in this heyday of deep learning technologies, this was a fantastic opportunity for Microsoft to utilise these techniques on one of their biggest gaming IPs.
As it turns out, no. Halo Infinite's bots are built from the ground up using good old-fashioned game AI, leveraging the years of development that have already gone into AI-controlled characters for modern Halo titles.
The Halo games continue to use Behaviour Trees as their primary mechanism for designing and implementing behaviours for their non-player characters. I'll forgo explaining how a behaviour tree works here, beyond that it’s a tree-like data structure that allows for logic to dictate what actions execute in what order. For more information, there is already a full episode of my AI 101 series that explains it in detail, during which I also cover how the Halo franchise, and notably Halo 2, has been instrumental in the development and refinement of Behaviour Trees as a practice within the games industry.
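For readers who want a concrete picture without watching the episode, here is a minimal Python sketch of the idea. It is purely illustrative: the node helpers (`sequence`, `selector`, `condition`, `action`), the blackboard dictionary and the example tree are all assumptions made for this article, not anything taken from 343 Industries' implementation.

```python
from typing import Any, Callable, Dict

Blackboard = Dict[str, Any]
# A node is any callable that reads the blackboard and reports success or failure.
Node = Callable[[Blackboard], bool]


def sequence(*children: Node) -> Node:
    """Succeeds only if every child succeeds, evaluated left to right."""
    def run(bb: Blackboard) -> bool:
        return all(child(bb) for child in children)
    return run


def selector(*children: Node) -> Node:
    """Tries children left to right and stops at the first one that succeeds."""
    def run(bb: Blackboard) -> bool:
        return any(child(bb) for child in children)
    return run


def condition(key: str) -> Node:
    """Succeeds when the named blackboard entry is truthy."""
    return lambda bb: bool(bb.get(key, False))


def action(name: str) -> Node:
    """Records the action name in the blackboard log and always succeeds."""
    def run(bb: Blackboard) -> bool:
        bb.setdefault("log", []).append(name)
        return True
    return run


# Hypothetical tree: attack if an enemy is visible, otherwise patrol.
tree = selector(
    sequence(condition("enemy_visible"), action("attack")),
    action("patrol"),
)

bb_fight = {"enemy_visible": True}
bb_idle = {"enemy_visible": False}
tree(bb_fight)
tree(bb_idle)
print(bb_fight["log"], bb_idle["log"])
```

The order of the children is what encodes priority here: the selector only falls through to patrolling when the attack branch fails.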
In their 2022 GDC presentation, Brie Chin-Deyerle, a senior gameplay engineering lead on the Halo Infinite multiplayer team, explained that the core behaviour of Halo's bots adheres to many of the common behavioural traits you would expect of a human player:
They engage in combat with enemies that get too close to them or are within an acceptable target range.
They pick up objectives like the skull in Oddball, and also deliver objectives, as in Capture the Flag.
They can interact with switches and other objects like vehicles.
They can hide from enemies if close by, or even hunt down specific targets.
They can contest an item such as a power weapon and try to stop other players from taking it.
They can subsequently collect and use weapons and other special items either from the ground or
They can traverse and patrol the gameplay space looking for enemies.
They can guard objectives, be it a location in a match of Stronghold, or even seek to protect objective carriers in CTF or Oddball.
Implementing these behaviours is naturally a significant amount of work, but it's not what makes the bots appear more human. Everything described would satisfy the core requirements of a functional bot, but it doesn't address how these bots behave like humans. Instead, there are three pillars of the bot’s design that explain why they're so successful in achieving their design goals:
First of all, many of the individual skills that the bots exhibit, be it strafing during combat, aiming a weapon, or using melee and grenades, were modified not only to appear more human (and therefore more fallible) but also to scale across multiple skill levels, so that as players got better at the game, the bots reflected their growth.
Secondly, there is a Utility AI system that runs inside the behaviour trees that conducts a thorough analysis of the current state of the game to allow it to define how exactly any given bot should react to what is happening, with this system being iterated upon heavily to ensure changes in priority reflect human playstyles.
And lastly, the Utility system is enriched even further by having the bots not just reflect on the current gameplay situation, but also have a much more nuanced understanding of the game mode they're currently playing in. And that factors into their decision-making as well.
So let's dig into these three topics a little deeper, and explain how each of these elements reinforces the Halo bot experience.
Bots That Play Like Players
The first of the three pillars that define Halo Infinite's bots is the way in which key skills of the Halo Combat Dance are programmed in the bots. As mentioned already, the bots in Halo were programmed such that specific aspects of their behaviour were built to reflect not just how a human might play Halo, but also how they gradually improve as their time spent with the game increases.
What's critical to understand is that, in the vast majority of cases, bots in games are not built to play the game like a human would, never mind having their skills change in line with player experience. You would often build a bot from a much more clinical perspective: ensuring it can move around the world, complete objectives and kill players. Implementing these behaviours will satisfy the needs of the game’s core design, but they're seldom built to reflect how novice or expert human players would achieve those same tasks. As a result, your typical bot is punishingly effective at its job, and is then handicapped to create different difficulty settings: delays are added to decision-making, specific actions are blocked from use, or random adjustments are made to its aim. In short: the bots are often built to be godlike and then dumbed down to suit the player’s needs, which seldom captures the experience of playing against opponents who are gradually improving their skills.
By contrast, Halo Infinite's bots have five key skills that aim to reflect the Halo combat dance, and these are used by each of the bot's core behaviours:
Their movement during combat: strafing, jumps and even crouching.
Their aiming of weapons is designed to be imperfect but improves as difficulty increases.
Grenade usage as a means to initiate or conclude combat, or control territory.
Melee combat that better reflects human logic.
And lastly, their confidence in a given encounter that will trigger either a fight or flight response.
To figure out how to define each of these skills, as well as how they should evolve over time, the bot team came up with levels of performance that highlight how humans actually play Halo multiplayer. These levels were defined by looking at recorded footage of players, or simply by watching colleagues within 343 Industries, given the staff at the studio ranged broadly from expert players to people who were truly novices in the multiplayer component. Each skill was identified to have several levels of competence, ranging from level 1 to level 4, and those levels were then transposed across the four difficulty levels of the bots: Recruit, Marine, ODST, and Spartan.
So how do these skill levels differ? Well let's look at some of the design goals the bots were expected to reflect and how they were implemented.
If we start with Level 1 aiming, this reflects players who have trouble maintaining their aim on a target. These players potentially overcorrect as a result of thumbing the controller sticks or dragging the mouse, and they struggle to factor in weapon recoil and to keep track of players as they run around them.
Level 2 aiming expects the player to focus a little more on headshots when shields are down, level 3 does a better job of factoring player movement and recoil, and level 4 plays without any real issues with their aim.
To reproduce this in bots, they're programmed to point their reticle within a radius of the target, and then try to move it towards the target before they open fire. Several variables are exposed in the codebase that gradually improve as the bot's difficulty goes up, creating a better-aiming bot:
The maximum distance of the reticle to the target.
How quickly the bot will close the reticle on the target.
How quickly it reacts to changes in the target’s direction.
How quickly it reacts to fast-moving targets.
How well it guesses the player’s overall movement.
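To make the mechanism concrete, the sketch below models just the first two variables from the list: the maximum reticle offset and the closing speed. Everything in it is an assumption for illustration, including the `AimProfile` type, the 2D coordinates, and the numbers attached to each difficulty, none of which come from the talks. The three reaction and prediction variables are left out to keep it short.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class AimProfile:
    max_offset: float      # largest initial distance from reticle to target
    closing_speed: float   # units per second the reticle closes on the target


# Hypothetical tuning values, one profile per bot difficulty.
PROFILES = {
    "Recruit": AimProfile(max_offset=3.0, closing_speed=2.0),
    "Marine": AimProfile(max_offset=2.0, closing_speed=4.0),
    "ODST": AimProfile(max_offset=1.0, closing_speed=8.0),
    "Spartan": AimProfile(max_offset=0.5, closing_speed=16.0),
}


def initial_reticle(target, profile, rng):
    """Place the reticle at a random point within max_offset of the target."""
    angle = rng.uniform(0.0, 2.0 * math.pi)
    radius = rng.uniform(0.0, profile.max_offset)
    return (target[0] + radius * math.cos(angle),
            target[1] + radius * math.sin(angle))


def step_reticle(reticle, target, profile, dt):
    """Move the reticle towards the target by at most closing_speed * dt."""
    dx, dy = target[0] - reticle[0], target[1] - reticle[1]
    dist = math.hypot(dx, dy)
    max_step = profile.closing_speed * dt
    if dist <= max_step:
        return target
    scale = max_step / dist
    return (reticle[0] + dx * scale, reticle[1] + dy * scale)


def time_to_settle(profile, seed=0, dt=1 / 60, tolerance=0.05):
    """Seconds until the reticle is within tolerance of a stationary target."""
    rng = random.Random(seed)
    target = (0.0, 0.0)
    reticle = initial_reticle(target, profile, rng)
    elapsed = 0.0
    while math.dist(reticle, target) > tolerance:
        reticle = step_reticle(reticle, target, profile, dt)
        elapsed += dt
    return elapsed


for name, profile in PROFILES.items():
    print(name, round(time_to_settle(profile), 3))
```

With the same random seed, the lower difficulties start further from the target and close more slowly, so they take longer before they are ready to fire. That is the kind of fallibility the parameters are meant to expose to designers.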
Meanwhile, if we consider strafing and movement, this is a vital skill in Halo for preventing your shields from being broken and leaving you vulnerable to headshots. The skill levels start at level 1, in which a player seldom strafes at all: they rush straight at targets when they see them and, given what we just discussed about player aim, they refrain from strafing because they already struggle to target their opponent. As the levels increase, strafing becomes more frequent, followed by strafe jumping at level 3, and the introduction of crouching, sliding and hopping at level 4. Hence the bots have parameters that dictate how far and how frequently they can use left/right, forward/backwards and jump/crouch actions, all of which ultimately influences the complexity of their movement. The implementation of this at Spartan difficulty, as described by Stern, was nicknamed in the studio 'giving someone the ol' Razzle-Dazzle' and is the source of the callsigns of two of the bots in the game, namely 343-Razzle and 343-Dazzle.
Meanwhile, the behaviours for grenades and melee are built to reflect more strategic use of these skills. With grenades, the bots' aiming accuracy and target prioritisation improve with each level, until at the highest levels the bots seek to control player movement by aiming based on your current trajectory. Melee skills also improve with difficulty, such that bots begin to prioritise melee attacks when their target’s back is turned and when their target’s shields are low. In fact, this skill begins to emphasise the bot’s need to close distance during successful gunfights so that it can finish an enemy off with a cheeky slap upside the head.
The last skill, the bot’s confidence in combat, is a much more nuanced conversation, and it ties into the big technical change to the underlying behaviour tree systems of the bots: they do a much better job of recognising where and when they are in the context of a Halo combat dance.
Better Contextual Awareness
Ensuring a bot can recognise its combat situation is far from a straightforward element of game design. In humans, we'd often think of this as our fight-or-flight instincts. Do we think we can hold out and take down the opponents in front of us, or is it best for us to fall back, recharge shields, grab a power weapon or even just hide out in a corner and hope to get the drop on our enemy? Once again this is a very human, instinctual behaviour, and it is difficult to get AI bots to process a situation like this and behave in a way that aligns with it.
As Stern explained in their talk, the bots in Halo Infinite rely on what is known as their 'confidence' to calculate which action they should take. This was discussed in much more detail in the talk by Chin-Deyerle: the behaviour trees of the bots rely on a Utility AI system to decide which of their behaviours they should execute.
For those not familiar, Utility AI is a process in which different behaviours are ranked with a numeric value, and from that the AI then prioritises the highest-ranking behaviour. A Utility can be computed either from a simple mathematical function or from a more complicated process that layers even more designer insights. If you want to know more about Utility AI and see other examples of it in practice, check out my episode on AI 101 which explains the technique in more detail and looks at high-profile examples ranging from goal selection in FEAR to behaviour priorities for companions in Dragon Age: Inquisition.
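As a minimal sketch of that ranking step, the Python below scores a handful of behaviours and picks the highest. The behaviour names, the state keys and the scoring functions are hypothetical and chosen only to show the pattern.

```python
from typing import Callable, Dict

GameState = Dict[str, float]
UtilityFn = Callable[[GameState], float]

# Hypothetical behaviours, each with a scoring function returning a value in [0, 1].
BEHAVIOURS: Dict[str, UtilityFn] = {
    "attack": lambda s: s["enemy_visible"] * s["shields"],
    "retreat": lambda s: s["enemy_visible"] * (1.0 - s["shields"]),
    "patrol": lambda s: 0.2,
}


def choose_behaviour(state: GameState) -> str:
    """Score every behaviour and return the name of the highest-scoring one."""
    return max(BEHAVIOURS, key=lambda name: BEHAVIOURS[name](state))


print(choose_behaviour({"enemy_visible": 1.0, "shields": 0.9}))  # attack
print(choose_behaviour({"enemy_visible": 1.0, "shields": 0.1}))  # retreat
print(choose_behaviour({"enemy_visible": 0.0, "shields": 1.0}))  # patrol
```

Nothing in the selection step knows what the behaviours mean. All of the design insight lives in the scoring functions, which is why they are the part that gets iterated on.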
In Halo Infinite, the bots use Utility AI to analyse the current game state, and from that decide which specific behaviour in the behaviour tree to prioritise. Each behaviour carries what is known as an ambition, and the ambition stores a utility function. These are all part of the bots' scripting logic in the Slipspace engine, which is written using the Lua programming language. Rather than being just a basic mathematical function, each behaviour's ambition utility is calculated by factoring in game state information and is then normalised into a value between 0 and 1.
So the confidence value, as discussed before, is a utility calculation derived from the state of the conflict. It looks at the bot's own health and shields, the opponent’s health and shields, the distance between the two players, the ammo left in the weapon and the height differential between the two players. This is then combined into a numeric value that, if high, tells the bot to stick it out and stay in the fight using the attack behaviour tree strategies. But if it's low, the bot will flee and prioritise another ambition that scores higher, like trying to find a new objective, or outright hiding to break the line of sight.
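One way to picture such a calculation is a weighted blend of those inputs, clamped to the 0 to 1 range. The Python sketch below is an assumption from start to finish: the `Encounter` fields, the weights, the range constants and the fixed fight-or-flee threshold are invented for illustration. In the game itself, a low confidence simply lets a different ambition score higher, rather than tripping a threshold.

```python
from dataclasses import dataclass


def clamp01(x: float) -> float:
    """Restrict a value to the range [0, 1]."""
    return max(0.0, min(1.0, x))


@dataclass
class Encounter:
    own_vitality: float      # own health and shields combined, 0..1
    enemy_vitality: float    # opponent's health and shields combined, 0..1
    distance: float          # metres between the two players
    ammo_fraction: float     # remaining ammo in the current weapon, 0..1
    height_advantage: float  # metres above (+) or below (-) the opponent


def confidence(e: Encounter, max_range: float = 40.0,
               max_height: float = 5.0) -> float:
    """Blend the inputs into one score in [0, 1]; the weights are made up."""
    vitality_edge = (e.own_vitality - e.enemy_vitality + 1.0) / 2.0
    closeness = 1.0 - clamp01(e.distance / max_range)
    height = clamp01((e.height_advantage / max_height + 1.0) / 2.0)
    score = (0.5 * vitality_edge + 0.2 * clamp01(e.ammo_fraction)
             + 0.15 * closeness + 0.15 * height)
    return clamp01(score)


def react(e: Encounter, threshold: float = 0.5) -> str:
    """Simplified stand-in for comparing confidence against other ambitions."""
    return "fight" if confidence(e) >= threshold else "flee"


strong = Encounter(1.0, 0.2, 10.0, 1.0, 2.0)
weak = Encounter(0.2, 1.0, 30.0, 0.1, -3.0)
print(react(strong), react(weak))
```

Because every term is normalised before it is weighted, the result stays comparable with the utilities of other ambitions, which is the point of scaling everything between 0 and 1.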
This process is also highly customisable, given the calculations will change depending on what's happening in the game at that point in time, and this brings us to the third and final element: how the bots balance between fighting enemy players, and actually winning the match in specific game modes.
Game Mode Expertise
The Utility AI backend is adapted to make the bots even more effective in team games, given they use unique calculations to understand how Game Modes affect their strategy. Playing in different game modes introduces a whole host of new problems for the bot to understand: is it worth sticking it out to win a fight knowing our Strongholds objective is being captured? Is it more important to grab the power item or the oddball, given the team only needs to hold it for 2 more seconds to win? Everything we've discussed so far is perfect for a bot that is prioritising team deathmatch modes like Slayer. But when playing Capture the Flag, Oddball or any other objective mode where scoring isn't achieved by killing opposing players, this alone would make the bots a huge liability.
When a given game mode is active, it introduces a suite of new ambitions for the bots that are specific to that mode. While the ambitions so far related to specific behaviours, Game Mode Ambitions can refer to an object or area in the level that is relevant to the mode being played, and to the actions needed to satisfy them. So for example, the bot will have ambitions to capture and deliver their opponent’s flag in CTF, to hold the skull in Oddball, and in Strongholds it will have a capture or defend ambition for each of the active zones in the map.
And so now in an objective-driven match, the bots can weigh up, on a moment-to-moment basis, whether to prioritise picking up the flag, fighting a nearby enemy player or grabbing the power weapon in a Capture the Flag match. The bot team focussed heavily on managing the utility calculations for the objective ambitions so that sacrifice plays are possible by the bots. So if it's clear that making a rush with the flag to close that final couple of yards could yield a point being scored, even if it means the nearby enemies will shred them into confetti shortly after, then the bot will take the gamble.
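A sacrifice play falls out naturally once an objective ambition can outscore the combat ones. The sketch below shows one hypothetical way to get that effect in Python. The ambition names and the urgency curve are assumptions, not the shipped calculations.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Ambition:
    name: str
    utility: float  # already normalised to 0..1


def ctf_ambitions(confidence: float, carrying_flag: bool,
                  metres_to_score: float) -> List[Ambition]:
    """Build a hypothetical ambition list for one bot in Capture the Flag."""
    ambitions = [
        Ambition("fight_nearby_enemy", confidence),
        Ambition("hide_and_recharge", 1.0 - confidence),
    ]
    if carrying_flag:
        # Assumed curve: urgency approaches 1 as the capture point gets close.
        urgency = 1.0 / (1.0 + metres_to_score / 10.0)
        ambitions.append(Ambition("deliver_flag", urgency))
    return ambitions


def best(ambitions: List[Ambition]) -> str:
    """Return the name of the highest-utility ambition."""
    return max(ambitions, key=lambda a: a.utility).name


# Low confidence, but one metre from scoring: the bot goes for it anyway.
print(best(ctf_ambitions(0.2, True, 1.0)))
# Same low confidence, fifty metres out: the bot backs off instead.
print(best(ctf_ambitions(0.2, True, 50.0)))
```

The bot never reasons about sacrifice explicitly. It simply sees that delivering the flag now scores higher than surviving.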
On top of all of these smart and interesting design choices, there are additional considerations when working with a Utility AI system.
For example, it's common for a utility-driven system to experience what is sometimes known as thrashing, meaning it has two utility values for actions A and B that are very close to each other. It decides to do option A, and then immediately afterwards option B has a better utility value, so it swaps to B. It could then find itself in a situation where it just constantly bounces between A and B until the player puts a sniper rifle round through its skull. So when the Halo bots decide on a particular behaviour, the utility value of that action is temporarily increased. This means the bot effectively doubles down on a particular decision for a short period of time, and the situation needs to change drastically for the bot to realise it should switch to another behaviour, rather than looking indecisive.
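The commitment trick can be sketched in a few lines of Python. The class name, the size of the bonus and the hold duration below are assumptions. The talks describe the idea of a temporary boost, not these specific numbers.

```python
from typing import Dict, Optional


class CommittedSelector:
    """Picks the highest-utility behaviour, temporarily favouring the current one."""

    def __init__(self, bonus: float = 0.2, hold_seconds: float = 1.0) -> None:
        self.bonus = bonus                # hypothetical boost size
        self.hold_seconds = hold_seconds  # hypothetical boost duration
        self.current: Optional[str] = None
        self.chosen_at = 0.0

    def choose(self, utilities: Dict[str, float], now: float) -> str:
        scores = dict(utilities)
        boosted = (self.current in scores
                   and now - self.chosen_at < self.hold_seconds)
        if boosted:
            # Recently chosen behaviours get a head start over the alternatives.
            scores[self.current] += self.bonus
        winner = max(scores, key=scores.get)
        if winner != self.current:
            self.current = winner
            self.chosen_at = now
        return winner


selector = CommittedSelector()
print(selector.choose({"A": 0.51, "B": 0.50}, now=0.0))  # A
print(selector.choose({"A": 0.50, "B": 0.52}, now=0.1))  # A, still committed
print(selector.choose({"A": 0.20, "B": 0.90}, now=0.2))  # B, drastic change
```

A small edge for B is ignored while the bonus is active, but a large swing still wins, which matches the "situation needs to change drastically" behaviour described above.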
On top of this, the bots can share information. On higher difficulties, the bots can conduct map callouts to each other. They share the last known positions of enemy players, and this is factored into their active ambitions. Hence if one AI bot tells the others that a player was last spotted in a given location, they can then factor in whether it's more important to head to that location to find the target than it is to play the objective.
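A shared store of last known positions is enough to illustrate the idea. In this Python sketch the `TeamCallouts` class, the staleness window and the distance-based scoring are all hypothetical.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Position = Tuple[float, float]


@dataclass
class TeamCallouts:
    """Shared store of last known enemy positions for one team of bots."""
    max_age: float = 5.0  # hypothetical: seconds before a callout goes stale
    sightings: Dict[str, Tuple[Position, float]] = field(default_factory=dict)

    def report(self, enemy_id: str, position: Position, now: float) -> None:
        """Record where an enemy was seen and when."""
        self.sightings[enemy_id] = (position, now)

    def last_known(self, enemy_id: str, now: float) -> Optional[Position]:
        """Return the reported position, or None if unknown or stale."""
        entry = self.sightings.get(enemy_id)
        if entry is None or now - entry[1] > self.max_age:
            return None
        return entry[0]


def hunt_utility(bot_pos: Position, callout: Optional[Position],
                 max_range: float = 50.0) -> float:
    """Assumed scoring: nearer callouts are more attractive; no callout scores 0."""
    if callout is None:
        return 0.0
    return max(0.0, 1.0 - math.dist(bot_pos, callout) / max_range)


team = TeamCallouts()
team.report("player_1", (10.0, 0.0), now=0.0)
print(hunt_utility((0.0, 0.0), team.last_known("player_1", now=2.0)))
print(hunt_utility((0.0, 0.0), team.last_known("player_1", now=10.0)))
```

The resulting value can then compete with the objective ambitions like any other utility, so a fresh, nearby callout can pull a bot away from the objective while a stale one cannot.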
But it's not just the bots that share knowledge: the weapon lockers do as well. If a bot has decided it's going to go for a weapon at a locker or other spawn point, it has to ask the weapon locker if it can do so. This is because the weapon lockers have an ambition for interacting with them, but they set a rule on how many players at once can decide they want to grab the weapon. Hence it stops two bots trying to grab the same gun at once. It does not, however, prevent the bot from taking a gun you were wanting to grab, nor does it mean a bot will not race you to the locker if it's clear you're going for it too. In fact, the dev team were happy with keeping that in the game when it shipped, because ultimately it's how human players behave and is authentic to the typical Halo experience.
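The "ask the locker first" rule resembles a simple reservation scheme. The Python below is a sketch under that assumption. The `WeaponLocker` class, its methods and the default limit of one claim are invented here, and human players are deliberately not tracked, which is why a bot can still race you for the gun.

```python
from typing import Set


class WeaponLocker:
    """Grants a limited number of simultaneous claims from bots."""

    def __init__(self, max_claims: int = 1) -> None:
        self.max_claims = max_claims  # hypothetical limit
        self.claimed_by: Set[str] = set()

    def request(self, bot_id: str) -> bool:
        """Return True if the bot may pursue this locker's weapon."""
        if bot_id in self.claimed_by:
            return True
        if len(self.claimed_by) >= self.max_claims:
            return False
        self.claimed_by.add(bot_id)
        return True

    def release(self, bot_id: str) -> None:
        """Free the claim when the bot takes the weapon, dies, or moves on."""
        self.claimed_by.discard(bot_id)


locker = WeaponLocker()
print(locker.request("343-Razzle"))  # True
print(locker.request("343-Dazzle"))  # False, already claimed
locker.release("343-Razzle")
print(locker.request("343-Dazzle"))  # True
```

Putting the rule on the locker rather than on the bots means no bot needs to know what its teammates are planning. It only needs a yes or no from the object it wants.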
There's also one last thing that I thought was an interesting touch. As I mentioned earlier, the logic for utility calculations is all written in the Lua programming language. Now, Lua is perfectly capable of the job, but it sometimes isn't the most performant of languages for specific tasks. Hence there was a concern that the code for calculating the decisions wasn't going to be fast enough. But after stress testing it, the developers realised this actually fits the goals of the bots. Research by neuroscientists estimates that the average human being takes around 200-250ms (that's between 1/5 and 1/4 of a second) to process new information such that they can react to it. Hence for the dev team, the benchmark was to ensure that all of the bots could process their information in the same window. Assuming a full match of 8 bots and taking the upper bound of 250ms, that gives them around 31.25ms each to process their decisions, which is plenty of time for the bots to run the Lua code. Hence it fits that they're a little slower to react to change, because they're designed to behave and react like a regular human.
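The budget is simple division, shown here as a tiny Python sketch. It assumes the bots are processed one after another inside a single reaction window, and the function name is made up for illustration.

```python
def per_bot_budget_ms(reaction_window_ms: float, bot_count: int) -> float:
    """Split one human-like reaction window evenly across all bots."""
    return reaction_window_ms / bot_count


print(per_bot_budget_ms(250, 8))  # 31.25
print(per_bot_budget_ms(200, 8))  # 25.0
```

Even at the faster end of the human range, each of the 8 bots would still have 25ms to make its decision.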
While Halo Infinite's bots aren't the smartest AI critters in the galaxy, the fact that they will more often than not suffer at the hands of a human player as they continue to learn the complexities of the game is the real victory. By looking at the core skills human players develop and finding ways to replicate them in these AI-controlled Spartans, they provide not just a solution to backfilling uneven lobbies, but critically, they help players slowly develop and improve at playing the game. For the seasoned Halo veteran, this may mean little, but for those new to the franchise, it's an opportunity to show them the ropes and help them attain their goals.
Thinking Like Players: How 'Halo Infinite's' Multiplayer Bots Make Decisions, Brie Chin-Deyerle, GDC 2022
Deconstructing the Combat Dance: Designing Multiplayer Bots for 'Halo Infinite', Sara Stern, GDC 2022