Analysing the AI of The Last of Us Part II
How Naughty Dog redefined the prestige single player experience
AI and Games is a YouTube series made possible thanks to crowdfunding on Patreon. Support the show to have your name in video credits, vote for future episode topics, watch content in early access and receive exclusive patron-only videos and merchandise.
On its release in 2020, The Last of Us Part Two helped set the new standard for Sony's flagship franchises of deep single-player experiences. Players join Ellie and Dina on a path of revenge, seeking to find those that wronged them, as the cycles of violence perpetuate and impact the lives of all affected.
It's a game emblematic of Sony's desire to deliver high production value and rich graphical fidelity, and in this instance, the developers Naughty Dog discovered some of the issues that arise when you go down that path.
This examination of how artificial intelligence is employed throughout The Last of Us Part Two is about a lot more than just core gameplay: the supporting characters, rival factions and the ever-dangerous infected. It's about how AI works to support those higher production values, and the issues that arise as motion capture becomes an increasingly prevalent element of AAA gaming.
[SPOILER WARNING for THE LAST OF US and THE LAST OF US PART TWO]
Four years after the closing moments of the original, we rejoin Ellie and Joel in their new life in the compound of Jackson, Wyoming, when tragedy strikes. As Ellie, players ride out with her partner Dina to the war-torn remnants of Seattle, encountering their fair share of enemies from the Washington Liberation Front and the Seraphites, alongside the infected who continue to roam the land in familiar and unexpected forms.
The core of the gameplay is largely unchanged for The Last of Us Part Two. The game plays as a combination of stealth combat and a hard-hitting cover shooter. Players need to sneak around environments, avoid being detected by enemies via their programmed senses, and take them out as quietly as possible. But of course, more often than not the situation goes south, and players need to improvise, relying on a variety of melee and ranged weaponry to cut through their adversaries - be it improvised baseball bats, pistols, molotovs, shotguns or rifles.
Rather than reinvent the core combat, the game introduces a number of small yet meaningful tweaks and changes, with some requiring entirely new AI systems to work as envisaged, while others make subtle adjustments to existing systems that ultimately have a real impact on the experience. The ally system from the first game is expanded upon, with characters like Dina, Jesse and Joel supporting you in combat encounters. Plus there's the introduction of dogs that follow the player's spline of recent movement, and new infected such as the Shamblers, which exude corrosive spores. Enemies are more aggressive, can hunt down the player more effectively, and maintain the pressure in a way the first game simply couldn't achieve. But arguably the biggest change is the expansion of melee combat. Players can now fight against multiple enemies at once, dodge their attacks, and then go in with a killing blow. While this addition at first glance might not seem too advanced compared to the original game, it has a huge impact on AI systems and their processes, as we'll see shortly.
So with that, let's start looking at how the combat AI evolved from the original game to support these new systems. To really understand and appreciate a lot of what I'm about to explore, I'd recommend you watch episode 52 of AI and Games, which explains the inner workings of the original Last of Us game. As we'll see in a moment, every new system explored in this video either builds upon or complements existing AI systems from the previous game.
Don't worry if you gotta pause to go watch it, I'll be right here waiting for you. Plus, it means you get to listen to my voice even more, which... some of you really enjoy apparently.
Improving Combat AI
While there is a myriad of new gameplay systems in The Last of Us Part Two, there is an underlying technological innovation that helps enrich the gameplay experience. The game was built for the PlayStation 4 as that console approached the end of its lifecycle in 2020, while also being playable on the PlayStation 5. Compared to the original game, released on the PlayStation 3 in 2013, developers had more computational resources - more CPU and memory on the console - to play with, and this has an impact in two very big ways.
The first is an increase in the number of active NPCs in a given sequence, which allows the number of enemies on screen at once to move into the double digits. This has an impact on encounters, but it also removes a secret system from the original game that worked around the console's limitations. To preserve CPU and memory performance, the original game limits the number of what are known as 'active' NPCs. If an NPC is deemed inactive, even though it's in the current encounter, much of its sensor processing and intelligent behaviour is disabled and it simply runs its patrol route. The first game would only consider a handful of NPCs in the proximity of the player to be active at any one time. In fact, in a 2022 interview, Matthew Gallant, who was a combat designer on the original game, stated that the limit was 8 active brains, despite there being situations with many more AI characters in the scene. This meant you could conceivably eliminate some enemies quietly without issue, then reposition yourself to attack the remaining numbers with ease, given the more distant enemies would not wake up from their inactive state until much later. In The Last of Us Part Two, however, the number of active NPCs is significantly increased, ensuring more enemies are awake and engaged in combat. This makes each encounter much more intense, given it's harder to break out of combat once you move into it, as enemies are much more alert and actively exploring the environment.
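To give a sense of how such a budgeted 'brain' system might work, here's a minimal sketch in Python. The names and data layout are my own assumptions; only the limit of 8 active brains comes from the interview mentioned above.

```python
import math

ACTIVE_BRAIN_LIMIT = 8  # the cap Matthew Gallant cited for the original game

def update_active_brains(player_pos, npcs, limit=ACTIVE_BRAIN_LIMIT):
    """Mark the NPCs closest to the player as 'active' (full senses and
    behaviour); everyone else only runs their patrol route this frame."""
    by_distance = sorted(npcs, key=lambda n: math.dist(player_pos, n["pos"]))
    for i, npc in enumerate(by_distance):
        npc["active"] = i < limit
    return npcs
```

Run every frame or so, this naturally wakes distant enemies up as the player pushes towards them, which is exactly the loophole the sequel closes by raising the limit.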
Speaking of this exploration, this leads to the second big change: enemies have a heightened state of awareness compared to the original game. As discussed before, hunters and infected use a finite state machine for behaviour design - a topic covered in my AI 101 series - which transitions into search and combat states based on specific conditions, such as seeing a dead body, or the player themselves. But it had limitations: critically, if an enemy saw an ally die in proximity, it would always know exactly where the player was, given the AI didn't distinguish between seeing the player kill an enemy, and an enemy dying without anyone in proximity. Nor did it factor in silenced weaponry such as the bow and arrow - something expanded upon in The Last of Us Part Two, given players can now craft silencers.
The Last of Us Part Two addresses this with an increased level of awareness, critically for the human hunters. Now, in the event an enemy is hit by an arrow or shot with a silenced pistol, rather than the NPC knowing immediately where the player is, it will instead search the area of the map where the shot could have come from.
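A sketch of how that plausible search area might be derived, under my own assumptions: project backwards along the projectile's direction of travel, then blur the result so the searcher investigates a region rather than the player's exact position.

```python
import math
import random

def estimate_search_point(victim_pos, impact_dir, max_range=15.0, spread=5.0, rng=random):
    """Guess where a silenced shot came from: project back along the
    incoming direction, then add noise so the search isn't psychic."""
    dx, dy = impact_dir
    length = math.hypot(dx, dy) or 1.0
    # Point some distance back along the shot's incoming direction.
    guess = (victim_pos[0] - dx / length * max_range,
             victim_pos[1] - dy / length * max_range)
    # Blur the point so the searcher investigates an area, not an exact spot.
    return (guess[0] + rng.uniform(-spread, spread),
            guess[1] + rng.uniform(-spread, spread))
```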
Similarly, enemies in the original game would search an area and, after a while, if they failed to find the player, go idle or return to their original patrol routes. Now, however, once a character has been lifted to a heightened state of awareness - knowing there is an enemy threat in proximity - they never back down from their search behaviour, and instead continue to explore the environment waiting to stumble across you. Plus, when the AI does go into combat against the player, the cover selection system from the original game received an upgrade. The details on this are unclear, but it appears to carry more information on the offensive and defensive capabilities of a given cover point than before, allowing characters to make more informed choices as they push forward.
None of these are big groundbreaking additions, nor are they new to the stealth genre, rather these are small, simple and meaningful design decisions that each help facilitate the experience as envisaged and really sell the reality of the situation the player has found themselves in.
Plus another small yet really impactful change is the expansion of the bark systems. As discussed in my AI 101 video on the topic, a bark is an audio cue delivered by a character or in-game system in response to the player's actions. This already existed in the original game, with enemies yelling if they're under fire, or if they spot the player. However, it is fleshed out in much greater detail in The Last of Us Part Two. While this has a small but meaningful impact on the infected, it really sells the human characters more effectively. For example, ally characters can spot specific enemy types in proximity and give tactical information in combat. Meanwhile, the human enemies have a richer range of barks to react to the violence you're dishing out. One of the most exciting, and haunting, barks occurs when you kill an enemy only for an ally to see it happen, or to stumble across the body moments afterwards: they might yell out the victim's name. It's a small detail - and I'll concede I didn't do a headcount to find out just how many of the enemy AI share the same name - but it humanises your opponents in an interesting and meaningful way. This is really an exercise in effort, providing a myriad of barks for different names that can be played at runtime given the circumstance, but I commend the developers for this attention to detail that adds to the themes of unending violence the game explores.
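As a toy illustration of how runtime name-specific barks could be wired up - the names and lines here are entirely made up, not taken from the game:

```python
import random

BARKS_BY_NAME = {
    # Hypothetical pools; the real game records many named variants.
    "Meg": ["They got Meg!", "Meg? Meg!"],
    "Scott": ["Scott's down!", "No... Scott!"],
}

def witness_bark(victim_name, rng=random):
    """Pick a bark for an ally reacting to a named comrade's death,
    falling back to a generic line for unnamed NPCs."""
    lines = BARKS_BY_NAME.get(victim_name)
    if not lines:
        return "Man down!"
    return rng.choice(lines)
```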
Building Out the Melee Systems
So with all these big additions explored, the real change to the main gameplay loop for The Last of Us Part Two is the melee combat. This is a completely revamped system from that seen in the original game and in Uncharted 4. Previously, players couldn't dodge a melee attack, the duration of an enemy attack was quite short, and it was almost a given that if either the player or an NPC attacked one another, the blow would land. This was expanded to allow for a more engaging and complicated facet of combat.
Each melee attack is built around what are known as units, and are comprised of three key elements:
The animation played during the attack.
The start and end conditions, which dictate whether the attack is possible and how the NPC should be positioned at the end.
The events: the critical moments of the attack, including the frames of the animation where a hit can occur, the tracking of the opponent that needs to happen during the swing, and things like iFrames that dictate when the attacker can't be hurt.
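Those three elements could be modelled as a simple data structure. This is purely illustrative, with field names of my own invention rather than Naughty Dog's actual schema:

```python
from dataclasses import dataclass

@dataclass
class MeleeAttackUnit:
    """One melee attack: an animation, gating conditions, and timed events."""
    animation: str              # clip to play during the attack
    start_condition: callable   # e.g. "target within 1.5m and roughly facing"
    end_position_offset: tuple  # where the NPC should end up, relative to target
    hit_frames: range           # frames of the clip where a hit can land
    iframes: range = range(0)   # frames where the attacker can't be hurt
    tracks_target: bool = True  # keep turning toward the target mid-swing

    def can_start(self, attacker, target) -> bool:
        return self.start_condition(attacker, target)
```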
Each NPC actually stores a collection of melee behaviours, and can only be in one of these behaviours at any one time. These give a lot more detail on how a character should behave before and after a melee attack is executed. So a behaviour may dictate that the enemy sprints up to the player and, once close enough, launches one of several attacks, or that they run up to a fixed distance nearby and wait a certain period of time before striking.
But allowing a character to either run straight up and smack the player, or to embrace the 'Assassin's Creed' or 'Batman Arkham' style of surrounding the player, adds a whole new layer of complexity, given a character needs to know it won't bump into geometry, can line up the attack correctly, and can follow the player around as they move in the space.
There's an entire target tracking system added to The Last of Us Part Two that ensures there are clear lines of attack, and can compensate for a character being around a corner of geometry by reconciling a small turn to line up as planned. In addition, it has to calculate that if a given melee attack is going to be triggered, the character can still line up the attack. Hence it uses a system called 'nav probes', which run navigation and collision checks against the in-game navigation mesh. This, alongside checks against geometry, allows the AI to check whether it has clearance in front of it to do the run-up, whether the proposed motion after the attack is still safe, and whether the NPC or the player would glitch into geometry when executing a proposed action - which is, y'know, a bit of a mood killer for sure. In fact, there is a system not just for ensuring a melee attack from an AI character doesn't undershoot or overshoot, but a similar system employed on the player side so you don't have the same problem. Plus there's a system for calculating whether a melee attack using a weapon will have the AI accidentally hit geometry. Interestingly, if the system does detect that the melee weapon might hit the geometry, it doesn't block the enemy from using it; instead, it merely discourages it. That way it can still result in these slightly erroneous behaviours, given it appears realistic.
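A nav probe, at its simplest, can be thought of as sampling a proposed motion against the walkable space. Here's a hedged sketch: the real system works against a 3D navmesh and collision geometry, while this stand-in just takes an `is_walkable` callback of my own invention.

```python
import math

def nav_probe_clear(is_walkable, start, end, step=0.25):
    """Sample points along a proposed run-up or post-attack motion, and
    reject the move if any sample leaves the walkable navmesh."""
    dx, dy = end[0] - start[0], end[1] - start[1]
    steps = max(1, int(math.hypot(dx, dy) / step))
    for i in range(steps + 1):
        t = i / steps
        if not is_walkable((start[0] + dx * t, start[1] + dy * t)):
            return False
    return True
```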
But all of this leads to two other big issues: handling geometry if the player is backed into a corner, and compensating for other AI characters in proximity. So first up, to keep the infected from bumping into each other, there's a strafe slot system. This creates 'spokes' on a wheel that surrounds the player, and each spoke can be occupied by one NPC. Each spoke has to continually run calculations to determine whether it's a valid attack position as the player moves around, and whether it satisfies the attack conditions of the NPC that wants to use it. Once an enemy is actively in melee combat and using the strafe slot system, it still has to continually check it's facing in the right direction of the player, and whether it needs to move to another slot that better suits its purposes. It's a lot of work to facilitate one small part of the game while ensuring it feels engaging to participate in.
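The spoke idea is easy to sketch: generate evenly spaced slots on a circle around the player and let each approaching NPC claim the nearest free one. The radius, slot count and data layout are my own assumptions, not the game's actual values.

```python
import math

def make_strafe_slots(player_pos, radius=2.0, count=8):
    """Evenly spaced 'spokes' on a wheel around the player, one NPC per slot."""
    slots = []
    for i in range(count):
        angle = 2 * math.pi * i / count
        slots.append({
            "pos": (player_pos[0] + radius * math.cos(angle),
                    player_pos[1] + radius * math.sin(angle)),
            "occupant": None,
        })
    return slots

def claim_nearest_free_slot(npc_id, npc_pos, slots):
    """Give an approaching NPC the closest unoccupied slot, if any."""
    free = [s for s in slots if s["occupant"] is None]
    if not free:
        return None
    best = min(free, key=lambda s: math.dist(npc_pos, s["pos"]))
    best["occupant"] = npc_id
    return best
```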
This target tracking for melee attacks is an ongoing process, but it will automatically disable if the player successfully dodges an attack. This makes the enemy that missed appear more dumbfounded by the dodge, and reinforces that it caught everyone else off guard too.
One fun element in all of this is how the game handles nearby geometry. The original game allowed for melee combat that saw the player or the enemy pressed up against a wall, and this was expanded quite drastically for the sequel. For this to work, a specific set of checks known as wall probes runs in the proximity of the player and AI characters during melee combat. It looks for nearby walls by casting 8 rays in the direction of the attacking player. In fact, it's not simply 8 rays, but 2 sets of 4: 4 high and 4 low, specifically so it can tell whether the wall behind the player is a low wall or a high wall. Low walls can trigger specific animations where the player or the enemy gets pressed up against them, so the system needs to figure out the position and orientation of the wall to see whether that's feasible.
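A rough approximation of the wall probe idea, with the fan angles and the `cast_ray` callback being my own stand-ins for the engine's actual raycasting:

```python
import math

def probe_walls(cast_ray, origin, attack_dir, max_dist=1.5):
    """Cast 2 sets of 4 rays (low and high) behind the defender to find a
    nearby wall and tell whether it's low (waist height) or high.
    `cast_ray(origin, direction, height)` returns a hit distance or None."""
    fan = [-0.4, -0.15, 0.15, 0.4]  # radians of spread around the attack dir (an assumption)

    def hits_at(height):
        base = math.atan2(attack_dir[1], attack_dir[0])
        out = []
        for offset in fan:
            d = cast_ray(origin, (math.cos(base + offset), math.sin(base + offset)), height)
            if d is not None and d <= max_dist:
                out.append(d)
        return out

    low, high = hits_at("low"), hits_at("high")
    if not low:
        return None  # no wall close enough behind the defender
    return "high_wall" if high else "low_wall"
```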
This leads to a small but harmless little trick: when the player is pushed down against a low wall, the animation has to make sure their back lies flush on the surface. That would then mean all low walls would have to be exactly the right height for Ellie to fall back onto them. But this isn't Gears of War, so Naughty Dog came up with a creative solution that allows the low wall height to vary. If the wall is slightly too high, the game makes a small vertical adjustment to the height of the character before the animation plays. This often means that Ellie's feet are off the ground and the animation looks a bit wonky, but critically these animations typically trigger a special camera orientation, meaning that the vast majority of the time, there's no way for you to notice.
A seemingly small but nonetheless critical AI implementation in The Last of Us Part Two is the use of motion matching. This is an entirely new AI component that is being used in an area most players don't think about: the animation system.
Motion matching is the name given to a particular type of animation system designed to find the best possible set of animations for a blend. When a character moves from a walking animation to another - say, because it's about to run, or climb over geometry - you want to find the best point to blend them together such that it looks smooth and realistic, then make all the necessary adjustments so the character can transition from one animation to the other. Traditionally this is done by hand, using blend trees and state machines with a lot of parameters being tweaked. But in recent years, given the ever-increasing number of bespoke animations for different characters and situations, there's an increasing need for automation to take over this workload - not just to improve quality, but also to protect the sanity of animators everywhere. In fact, in The Last of Us Part Two there are over 6000 animation clips handling the movement of AI characters, totalling around six and a half hours in length.
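Stripped of all the production detail, motion matching reduces to a nearest-neighbour search over a database of animation frames, each tagged with a feature vector. This toy version - the features, weights and clips are all invented for illustration - shows the core idea:

```python
def best_match(query, frames, weights=None):
    """Motion matching in miniature: every animation frame carries a feature
    vector (e.g. foot positions, hip velocity, future trajectory samples);
    each tick we jump to the frame closest to what the gameplay code wants."""
    def cost(frame):
        feats = frame["features"]
        w = weights or [1.0] * len(feats)
        return sum(wi * (q - f) ** 2 for wi, q, f in zip(w, query, feats))
    return min(frames, key=cost)

# A toy database: a few frames across clips, with 3-float features
# standing in for (speed, turn rate, foot phase).
database = [
    {"clip": "walk", "frame": 12, "features": [1.2, 0.0, 0.3]},
    {"clip": "walk", "frame": 30, "features": [1.3, 0.1, 0.8]},
    {"clip": "run",  "frame": 5,  "features": [3.5, 0.0, 0.2]},
    {"clip": "turn", "frame": 9,  "features": [1.0, 0.9, 0.5]},
]
```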
This is an increasingly prominent technique in the video games industry, and in fact longtime readers will recall I discussed one of the earliest examples of it in AAA games in my episode on the AI of Hitman.
Hiding the Seams
But perhaps the most interesting element of the AI iteration in The Last of Us Part Two is a new and improved ambient AI system designed to make allies and other relevant characters appear more intelligent and realistic when they're not in combat. When combat is in full force, a lot of concessions can be made on how a character might behave, and indeed some corners might be cut for the purpose of the experience. But when they're standing around or simply following the player, be it as Ellie or Abby, these are the moments in which it becomes much easier for a character to make even the smallest of decisions that wind up looking kinda dumb.
In a 2021 GDC talk, Bryan Collinsworth and Michal Mach discussed the challenge of making NPC allies feel more alive and realistic, and the biggest issue the team dealt with is really the expectations of players. Naughty Dog, like many other AAA production studios, now heavily invests in motion capture performances for many of the cinematics and cutscenes that run in their games. In fact, many games now use these motion-captured animations during gameplay as well. The slow and gradual expansion of the technology over the past 10-15 years - thanks to its pervasive use not just in games but also in visual effects for film and television - has made it much more effective to deliver emotionally charged or complicated action sequences using motion capture that is then cleaned up afterwards, rather than animators having to handcraft every action the characters take.
This then presents a problem: you can have characters like Ellie, Dina, Abby, Joel and Tommy deliver these emotionally rich moments thanks to an actor whose original performance is delivered through motion capture, and then mere seconds later, they're standing around like a big dumb robot, because the AI is in charge now and is using a collection of pre-built animations. Motion matching helps address the blending issues, so the trick is to figure out how to keep these characters active and moving around the game world in a way that feels authentic.
The solution was to create a new ambient ally system, designed to improve the fidelity and performance of AI characters so they appear more realistic, but built specifically for non-combat sequences.
As we know from the first game, there were already several systems in place for AI characters to behave in the world. Pre-built animations known as Cinematic Action Packs, or CAPs, are used in The Last of Us and even Uncharted 4 to have allies run a specific animation (some of which are motion captured) at a particular location and angle. These are typically used when the NPC has to interact with particular parts of the environment. But they were quite limited, given they were often built to only work in specific situations and could only play in a fixed orientation. Meanwhile, allies could follow you around or explore a local area, but often with little local awareness, and their movements could seem oddly robotic.
Hence the focus was on providing new systems that address some of these issues. This even had an impact on level design, given it made sense for environments to be large enough to allow the player and their allies to move around unencumbered - something the previous games didn't need to factor in as strongly.
But for The Last of Us Part Two, the allies now had several improved and new systems to improve their ambient behaviour. This includes...
Positional Action Packs, which are much like the existing CAPs, but this time they don't need to be matched to a specific position or interact with a particular object. A PAP can also set the orientation the character needs to face, and thanks to the motion matching mentioned earlier, the animation team didn't have to worry about setting up animations for entering or exiting the behaviour, given the system handled that for each PAP. So a character could be looking in a particular direction, at a particular item of interest, but with a bit more variation.
A new set of idle and walking animations that allow characters to move and stand around in the world in a slower, more relaxed state. In fact, many of the relaxed walk and idle animations are not only specific per character, but also change in different chapters of the game. Hence a character would have different sets of these animations to reflect how they're feeling at that point in the story.
With all these idle and walking animations in place, they enable what are known as 'Wander Posts': locations in the world that a character can slowly move to and then run a simple idle animation at. In fact, when these behaviours are going to be used, level designers can use a tool that auto-generates wander posts all over the available area.
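Such an auto-generation tool could plausibly be as simple as grid-sampling the walkable area, which is how this illustrative sketch works. The spacing and the `is_on_navmesh` callback are assumptions of mine, not details from the talks.

```python
def generate_wander_posts(is_on_navmesh, bounds, spacing=3.0):
    """Auto-scatter wander posts over a region on a regular grid, keeping
    only points that land on walkable navmesh (a stand-in for the designer tool)."""
    (min_x, min_y), (max_x, max_y) = bounds
    posts = []
    y = min_y
    while y <= max_y:
        x = min_x
        while x <= max_x:
            if is_on_navmesh((x, y)):
                posts.append((x, y))
            x += spacing
        y += spacing
    return posts
```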
Once all of this was in place, it allowed an entirely new Exploration System to be built for these scenarios. Its purpose is to have a character randomly explore the world in a way that feels natural. In short, it figures out a realistic way for a character to move between CAPs and PAPs in the scene: it decides which CAP or PAP to execute, then gradually makes its way over, using Wander Posts and the more casual walk animations along the way. This means the character casually moves towards an area of interest to explore, rather than beelining towards it like they're the freaking Terminator.
The funny thing about the Exploration System is that it's actually based on a piece of AI tech from the first game: the cover point system used in combat by the Hunters in The Last of Us.
As detailed back in my analysis of the original game, during combat the human enemies in The Last of Us run a calculation against the 20 closest points of cover to decide the best direction to go. This is in fact a utility system - another topic I covered in my AI 101 series - which uses a range of criteria to determine whether a piece of cover is a good one to move into during combat.
So the developers made a copy of this system and repurposed it for generating exploration behaviours. It considers all CAPs, PAPs and Wander Posts in a region to be valid posts to visit, and when it finishes interacting with a CAP or PAP, it will prioritise a Wander Post to break up the flow before visiting another one. A selection of design criteria decides which posts it wants to visit: it prioritises posts that are near to and in front of the non-player character, and posts in open space rather than in corners of the map; it avoids posts it recently used (so you don't see the same behaviour on loop); and critically, it aims for posts that are either close to the player, on the player's screen right now, or will result in the NPC passing in proximity of the player. Why is that important? Well, if it didn't take that into consideration, there's no guarantee the player would ever see any of it happen, making the entire exercise an utter waste of time.
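Those criteria map naturally onto a utility scorer. The weights below are illustrative guesses to show the shape of such a system, not Naughty Dog's actual values:

```python
import math

def score_post(post, npc, player, recent, on_screen):
    """Utility-style scoring of a candidate post, echoing the criteria above.
    All weights here are illustrative, not taken from the game."""
    score = 0.0
    score += max(0.0, 10.0 - math.dist(npc["pos"], post["pos"]))    # near the NPC
    if post["pos"] in recent:
        score -= 30.0                                               # no looping behaviour
    if on_screen(post["pos"]):
        score += 8.0                                                # visible to the player
    score += max(0.0, 6.0 - math.dist(player["pos"], post["pos"]))  # near the player
    if post.get("in_open_space"):
        score += 3.0                                                # avoid corners
    return score

def pick_post(posts, npc, player, recent, on_screen):
    """Choose the highest-scoring CAP, PAP or Wander Post to visit next."""
    return max(posts, key=lambda p: score_post(p, npc, player, recent, on_screen))
```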
All that said, it wasn't perfect, as characters still sort of ping-ponged between locations, albeit in a more relaxed manner. So the solution was to visualise the process of selecting posts: after a post is selected, the NPC plays a special idle animation that shows them looking around or examining the local area. This suggests to the player that the NPC is thinking about what it wants to look at next, despite the fact it's already decided what it's going to do.
While this is used in many areas of the game where the player is not in combat, the best place to see it in practice - and in fact the scene Naughty Dog themselves used to test the system - is the Museum flashback with Joel and Ellie at the end of Chapter 2. Similarly, there's additional work in the leader system, where an NPC leads the player through an environment, designed with the Aquarium sequence with Abby and Yara at the end of Chapter 7 in mind. The new leader AI pays closer attention to whether the player can see the leader and vice versa, plus there are bespoke animations for turning to face the player in conversation, turned on and off dynamically based on what the designers intended.
If you're interested in seeing a more detailed breakdown of this new ambient system, check out this walkthrough video on the AI and Games Plus YouTube channel, during which I highlight how this new system is being put to use for Joel's AI behaviour.
The Last of Us Part Two continues Sony's drive towards premier single-player storylines that deliver high-fidelity visuals. As we've seen throughout this video, AI is proving critical in expanding existing mechanics, improving production and visuals, and helping sell the more theatrical elements of the experience for players - bringing you closer to the story and its characters, and following along for the ride.
"Bringing AI to Life in 'The Last of Us Part II'" Bryan Collinsworth, Michal Mach, GDC 2021
"Melee AI in 'The Last of Us Part II'" Ming-Lun Chou, GDC 2021
"Motion Matching in 'The Last of Us Part II'" Michal Mach, Maksym Zhuravlov, GDC 2021
“'The Last of Us Part II': Designing the Museum Flashback" Evan Hill, Level Design Summit, GDC 2022