Creating animation for video games has become reliant on motion capture, but SPINE developer Nekki has gone further than most, having created its own app, called Cascadeur. This adds to a workflow that includes Unreal Engine 5 and Epic Games’ MetaHuman, but one that doesn’t always rely on those ‘off the shelf’ apps for its stylised, anime-like animation.
In our developer series on upcoming gun-fu game SPINE we’ve already discovered how Nekki is creating a new ‘intelligent’ action camera, as well as why the game’s art director loves using Unreal Engine 5. Here I chat to the game’s lead animator, Evgeniy Khapugin, and discover how performance capture is being used for the game’s fluid, athletic combat.
If you’re inspired by Evgeniy’s insights, then read our coverage of the best animation software and the best laptops for animation, and try creating for yourself. If you want to know more about the game, head over to SPINE’s website.
CB: What mocap technology did you use for SPINE, and how did it influence the realism and fluidity of the character animations?
Evgeniy Khapugin: We use every motion capture tool we can get our hands on! Mocap is a convenient tool that allows us to quickly see results. By the final stages, there may be little to none of the original mocap left, but it lays a strong foundation. Mocap also helps us determine whether something needs to be changed or completely redone.
At our studio, we have two Xsens suits, primarily used for recording body mechanics. Typically, the mocap actors are our own animators. These sessions don’t involve complex acrobatics, high-quality combat, or intricate scenes with props and interactions. It’s usually simpler animations focusing on acting or daily life scenarios.
For example, a character showing off how cool they are by singing a song; a bandit bot making a throat-slashing gesture; a market worker arranging fruit on a stall. These types of animations are hard to create by hand but are relatively easy to mocap and refine afterwards.
We also work with external mocap studios using classic optical marker systems, specifically Vicon in our case. There, we capture movements that our team can’t perform – complex falls with rolls, parkour and stunt work, fight choreography, paired combat, and interactions involving props.
Mocap is extremely helpful when creating locomotion animations because it results in highly detailed and realistic movements that would be very difficult to create by hand.
Since SPINE is a realistic project, we don’t have to make many stylistic adjustments to the mocap data.
CB: What were the biggest challenges you encountered while integrating motion capture into the animation process for SPINE, especially with the dynamic and fast-paced combat?
EK: The biggest challenge for us is the leap in animation quality, which demands more complex and polished rigs.
For comparison, in our previous projects, we had strict limits on the number of joints. In our mobile project Shadow Fight 4: Arena, we used 91 joints, 58 of which were facial. Now, developing for PC and consoles, we no longer have such hardware limitations and have increased the number of joints significantly. Our main character currently has 1,217 joints, 839 of which are facial. But these aren’t just joints – there’s also automated functionality that can be adjusted afterwards. For example, when a leg is raised and bent, the knee doesn’t deform, the calf flattens when it touches the thigh, and the stomach doesn’t collapse into the spine.
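Behaviour like the calf flattening Evgeniy describes is typically driven by corrective joints reacting to the pose. As a minimal sketch of that general mechanism – not SPINE’s custom system – here is how a ‘calf flatten’ corrective could be wired up with set-driven keys in Maya Python; the joint names and values are hypothetical.

```python
# Hypothetical Maya Python sketch: drive a corrective 'calf flatten'
# joint from knee rotation using set-driven keys. Joint names are
# illustrative, not SPINE's actual rig.
import maya.cmds as cmds

def setup_calf_corrective(knee='knee_L', corrective='calfFlatten_L'):
    """Flatten the corrective joint as the knee bends fully."""
    # Neutral pose: knee straight, corrective joint at full scale
    cmds.setDrivenKeyframe(corrective + '.scaleY',
                           currentDriver=knee + '.rotateZ',
                           driverValue=0, value=1.0)
    # Fully bent: the calf presses against the thigh, so flatten it
    cmds.setDrivenKeyframe(corrective + '.scaleY',
                           currentDriver=knee + '.rotateZ',
                           driverValue=-130, value=0.6)

setup_calf_corrective()
```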
Our rig is inspired by Unreal’s MetaHuman. However, pure MetaHuman comes with an excessive number of joints and closed nodes, which limits customisation. So we used it as a reference and created a system that suits our needs.
These technical aspects required not only rig improvements in Maya but also the development of custom tools in Cascadeur specifically for SPINE.
CB: Did you have to develop any custom tools or approaches to adapt motion capture data to fit the unique art style and combat mechanics of SPINE?
EK: Nekki developed its own software for physics-based animation called Cascadeur [read our Cascadeur review]. Cascadeur has been in development for about 15 years and has continuously evolved to meet the needs of modern game development. While most of our animators work in Cascadeur, we aren’t strictly tied to it. Sometimes there is a need for some flexibility, so we freely move animations between Cascadeur and Maya.
Cascadeur has evolved into a standalone product, driven by both internal and external user requests for mocap tools. One such tool is ‘animation unbaking’, which simplifies the process of working with motion capture. This tool functions somewhat similarly to the Simplify Curve filter in Maya, but it’s more advanced.
With internal algorithms, Cascadeur identifies keyframes and retains only those, preserving the overall motion while reducing the number of frames. This makes it easier for animators to edit. After ‘unbaking’, we can work on retiming to make movements sharper. Then, we can apply auto-physics to correct any inaccuracies, and finally, refine amplitudes, arcs, action lines, and so on.
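Cascadeur’s actual unbaking algorithm isn’t public, but the general idea – keep only the keys needed to preserve a curve’s shape – can be sketched with a classic simplification scheme (Ramer-Douglas-Peucker). A minimal illustration in plain Python, applied to a single baked animation curve:

```python
# A minimal keyframe-reduction sketch in the spirit of 'unbaking':
# keep only the samples needed to stay within a tolerance of the
# original curve. This is an illustration (Ramer-Douglas-Peucker),
# not Cascadeur's actual algorithm.

def unbake(samples, tolerance=0.05):
    """samples: list of (time, value) pairs, one per baked frame.
    Returns a reduced list of keyframes preserving the curve shape."""
    if len(samples) <= 2:
        return samples
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    # Find the sample furthest from the straight line between endpoints
    worst_i, worst_d = 0, 0.0
    for i, (t, v) in enumerate(samples[1:-1], start=1):
        expected = v0 + (v1 - v0) * (t - t0) / (t1 - t0)
        d = abs(v - expected)
        if d > worst_d:
            worst_i, worst_d = i, d
    if worst_d <= tolerance:
        return [samples[0], samples[-1]]   # segment is 'flat enough'
    # Otherwise split at the worst sample and recurse on both halves
    left = unbake(samples[:worst_i + 1], tolerance)
    right = unbake(samples[worst_i:], tolerance)
    return left[:-1] + right

# A baked 60-frame curve collapses to a handful of editable keys:
curve = [(f, (f / 59.0) ** 2) for f in range(60)]
print(len(unbake(curve)))  # far fewer than 60
```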
Maya is more of a complementary tool for us rather than the primary one. In Maya, we mainly use various scripts rather than a unified workflow. It depends on what each animator is used to. Personally, I follow the animation principles used by Richard Lico and his ‘Space Switching’ technique. I’ve written my own scripts based on his principles and adjust them as needed.
For example, if I need to clean up the curve of a club’s tip, I run one script; if I need to add shaking to the chest after absorbing an impact, I use another; if I need to adjust the arc of an arm while it’s holding another character, I run a third script. It’s a combination of scripts, a skilled animator’s eye, and, of course, Animbot.
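For readers unfamiliar with space switching, the core trick is to bake a control’s worldspace motion, flip its parent space, then bake the motion back so it looks identical but can now be edited relative to the new space. A minimal Maya Python sketch of the idea, assuming a rig with an enum space attribute – an illustration of the technique, not Evgeniy’s actual scripts:

```python
# Hypothetical space-switching helper: bake a control's worldspace
# motion to a temp locator, change the rig's space attribute, then
# bake the motion back onto the control. Names are illustrative.
import maya.cmds as cmds

def switch_space(control, space_attr, new_space, start, end):
    # 1. Record the control's worldspace motion on a temp locator
    loc = cmds.spaceLocator(name=control + '_worldTmp')[0]
    tmp = cmds.parentConstraint(control, loc)[0]
    cmds.bakeResults(loc, time=(start, end), simulation=True)
    cmds.delete(tmp)
    # 2. Clear the old keys and flip the rig's space attribute
    cmds.cutKey(control, attribute=['translate', 'rotate'])
    cmds.setAttr('{}.{}'.format(control, space_attr), new_space)
    # 3. Re-drive the control from the locator and bake back, so the
    #    motion looks identical but now lives in the new space
    con = cmds.parentConstraint(loc, control)[0]
    cmds.bakeResults(control, time=(start, end), simulation=True)
    cmds.delete(con, loc)
```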
Animbot is a plugin for Maya that offers an enormous range of features to simplify an animator’s life – adjusting keys, reducing amplitudes, parenting objects without breaking the rig, and so on. If you’ve ever wished for something in Maya for animation, it’s probably already in Animbot.
All these tools help us work with mocap. But as I mentioned earlier, we mostly don’t use mocap for combat animations. In 90 percent of cases, combat animations are done by hand. But in those 10 percent of cases where we do capture combat animations, we know they’ll need adjustments, even if they’re technically perfect (no jittering, all movements are clean, etc.). Since we’re making a game, all mocap timing needs to be tweaked. Unfortunately or fortunately, real-world movement speed isn’t sufficient to meet gameplay needs.
Most animation adjustments depend on the requirements, but there are some universal changes every animation goes through:
- Speeding up the entry and exit from actions back to idle
- Tightening up timing to make movements snappier than in reality (see the retiming sketch after this list)
- Emphasising striking poses and action lines
- Exaggerating motion arcs to make movements easier to read in-game
- Refining nearly all key poses. Mocap always comes out “a bit off” for gameplay, where the clarity of action lines is crucial for readability
- Slightly extending pauses after strikes, if timing allows
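As referenced in the list, the ‘tightening up timing’ pass can be sketched in a few lines of Maya Python: compress the keys toward the clip’s start, then hold the final pose a little longer. Control names and values here are hypothetical:

```python
# Hypothetical sketch of a timing-tightening pass in Maya Python:
# compress a clip's keys toward its start so the motion reads
# snappier, then hold the final pose to sell the hit.
import maya.cmds as cmds

def tighten(nodes, start, end, factor=0.8, hold=4):
    # Scale key times toward the clip start (factor < 1 speeds it up)
    cmds.scaleKey(nodes, time=(start, end),
                  timeScale=factor, timePivot=start)
    new_end = start + (end - start) * factor
    # Extend the last pose before returning to idle
    cmds.setKeyframe(nodes, time=new_end + hold, insert=True)

tighten(['spine_ctrl', 'arm_L_ctrl'], start=1, end=40)
```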
CB: How did you handle the balance between the raw data from motion capture and the need for exaggeration or stylisation in animation to fit the game’s aesthetic and gameplay requirements?
EK: In SPINE, we have two types of animation tasks: gameplay animations and cutscene/acting animations.
Let’s start with gameplay animations. In gameplay, the most important aspect is how the animation feels and plays in the game, and only then how it looks. Viewed outside the game, these animations may not look as physically accurate as they would in a cutscene.
When the player presses a button, they expect an immediate response, and the animation needs to provide that feedback. Raw motion capture data is useful in the early stages: we insert it roughly, adjusting only the timing and distances, without fine-tuning. The focus here is speed and iteration. Playing these rough animations with the game designer helps us figure out the direction.
After that, we enter the classic animation pipeline – refining key poses, enhancing timing, creating clean action lines, and so on.
The second type of animation is for cutscenes. Cutscenes are essentially mini-movies where the player doesn’t control the character. In these animations, we don’t need to deviate as much from realistic timing. However, in fight scenes, we still make things a bit faster and sharper to enhance the visual impact. Without this adjustment, raw mocap often feels too slow.
CB: How did Unreal Engine 5 aid the capture and animation process? Were you able to get live shots or previs the action?
EK: We don’t use Unreal Engine 5 during the mocap process. When working with Vicon, we use MotionBuilder to view the character immediately, while for Xsens, we rely on their native software without retargeting.
In our animation pipeline, we try to avoid using Unreal for animations. Not because Unreal is bad, but because it’s important for us to have a single source of animation software. If animations are created in different places, there’s a high risk that someone will overwrite another person’s work, which can be very difficult to fix, or that pipeline scripts will work in one package but not in another.
In Unreal Engine 5, we handle procedural animations. For example, our robotic spiders walk procedurally. They have six legs, climb walls and floors, and their legs automatically find where to step and react accordingly.
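For illustration, the stepping rule behind such procedural legs can be reduced to a small sketch: each leg tracks a ‘home’ point under its hip and steps when its planted foothold drifts too far from it. This is generic pseudologic in Python, not Nekki’s Unreal implementation, and the ground query stands in for a real raycast against level geometry:

```python
# Illustrative sketch of a procedural stepping rule: each leg keeps a
# 'home' point relative to the body and steps when its planted foot
# drifts too far from it. Not Nekki's implementation.
from dataclasses import dataclass

def ground_height(x, z):
    """Stand-in for a physics raycast against walls/floors."""
    return 0.0

@dataclass
class Leg:
    home_offset: tuple              # rest position relative to the body
    foot: tuple = (0.0, 0.0, 0.0)   # current planted foothold

    def update(self, body_pos, step_threshold=0.5):
        # Where the foot 'wants' to be as the body moves
        hx = body_pos[0] + self.home_offset[0]
        hz = body_pos[2] + self.home_offset[2]
        target = (hx, ground_height(hx, hz), hz)
        # Step only once the planted foot has drifted far enough;
        # in-game this would blend over a few frames, not snap
        drift = sum((a - b) ** 2 for a, b in zip(self.foot, target)) ** 0.5
        if drift > step_threshold:
            self.foot = target

# Six legs arranged around the body, updated as the spider walks
legs = [Leg(home_offset=(dx, 0.0, dz))
        for dx in (-1.0, 0.0, 1.0) for dz in (-0.7, 0.7)]
body = (0.0, 0.5, 0.0)
for leg in legs:
    leg.update(body)
```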
We also use Unreal Engine 5 for IK (inverse kinematics) for both hands and feet [read our insights on how to animate from mocap data]. Hands need to grab objects from the floor or tables, hold weapons of various sizes, and grip bots of different heights. Feet need to stand on stairs, crates, and, of course, defeated enemies.
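Under the hood, hand and foot placement like this usually rests on a two-bone IK solve. Here is a minimal sketch of the textbook maths (law of cosines, reduced to 2D) rather than Unreal’s actual Two Bone IK node, which also handles pole vectors and full 3D joint axes:

```python
# Minimal two-bone IK in 2D (law of cosines): the textbook maths
# behind placing a hand on a weapon or a foot on a crate. Unreal's
# own Two Bone IK node is more involved; this shows only the core.
import math

def two_bone_ik(target_x, target_y, upper_len, lower_len):
    """Return (shoulder_angle, elbow_bend) in radians that place the
    end of the chain at the target, with the root at the origin."""
    d = math.hypot(target_x, target_y)
    # Clamp so an out-of-reach target fully extends the limb
    d = min(d, upper_len + lower_len - 1e-6)
    # Law of cosines gives the interior angle at the elbow;
    # the bend is its deviation from a straight limb
    cos_int = (upper_len**2 + lower_len**2 - d**2) / (2 * upper_len * lower_len)
    elbow_bend = math.pi - math.acos(max(-1.0, min(1.0, cos_int)))
    # Shoulder: aim at the target, then offset by the triangle's
    # inner angle at the root
    cos_inner = (upper_len**2 + d**2 - lower_len**2) / (2 * upper_len * d)
    shoulder = math.atan2(target_y, target_x) - math.acos(max(-1.0, min(1.0, cos_inner)))
    return shoulder, elbow_bend

# A foot reaching down to a step about 1.2 units away:
print(two_bone_ik(1.0, -0.7, upper_len=0.8, lower_len=0.8))
```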