DNEG’s work on Here, the Robert Zemeckis film starring Tom Hanks, explored some of the most dynamic cinematic techniques for its storytelling, utilising AI to create the environmental transitions seen throughout the film. It’s a subtle effect, and one that required an intensive period of research and development.
This isn’t the first use of AI we’ve reported on for Here’s VFX. The use of AI to de-age Tom Hanks in the film has already impressed, and the approach to covering one scene over millions of years has resulted in some subtle yet spectacular VFX work from DNEG. (Read what VFX artists think of AI.)
Martine Bertrand, who was a Senior Researcher in Artificial Intelligence (AI) at DNEG during the making of Here (she has since moved to ILM), reflects on the use of AI to create the period transitions that occur within the frame throughout the film.
Martine recalls: “My role at DNEG is as a senior researcher in AI. Whilst Here was being shot, and we were doing the VFX work, I was leading a team. When the work started, visual effects supervisor Johnny Gibson reached out to me and said, ‘We have these transitions and they’ll allow you to move from one era to another, and lots of those transitions happen inside the living room. We want to do something new.’”
In researching what might be possible, Johnny Gibson showed Martine a range of examples, sourced online, of “technology being driven by a pre-trained GAN [Generative Adversarial Network] model. They’ve been around for a few years. The most popular version out there right now is StyleGAN.”
“A GAN can generate images from a noise vector and, once you’ve trained the model, the cool thing is you can take two noise vectors, traverse the space in between them and generate intermediate shapes that will look very much like the domain of images that you’re generating.”
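For readers curious about the mechanics she is describing, here is a minimal sketch of that latent traversal in PyTorch. It is purely illustrative: the tiny generator network stands in for a pre-trained GAN such as StyleGAN (it is not anything DNEG used), and the 512-dimensional latent size and 64×64 output are assumptions made for the example.

```python
# Illustrative sketch only: a stand-in "generator" replaces a real pre-trained GAN.
import torch
import torch.nn as nn

LATENT_DIM = 512  # assumed latent size; StyleGAN-family models commonly use 512

# Stand-in for a pre-trained generator. A real GAN trained on living rooms would
# map each latent ("noise") vector to a photorealistic living-room image.
generator = nn.Sequential(
    nn.Linear(LATENT_DIM, 3 * 64 * 64),
    nn.Tanh(),
)

z_a = torch.randn(LATENT_DIM)  # noise vector for living room A
z_b = torch.randn(LATENT_DIM)  # noise vector for living room B

frames = []
for t in torch.linspace(0.0, 1.0, steps=16):
    z_t = torch.lerp(z_a, z_b, t)          # traverse the space between the two vectors
    img = generator(z_t).view(3, 64, 64)   # each intermediate latent decodes to an image
    frames.append(img)

# With a GAN genuinely trained on living rooms, every frame would look like a
# plausible room somewhere between the two endpoints.
```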
“And this is the very important point: I told Johnny that it was possible, but that we would need to train a GAN on living rooms. If you are going to generate living rooms, and transition between those, you need to have a living-room GAN, and to get a good-quality GAN takes a significant number of images of living rooms, and that would be very time-consuming.
“We’d need to take cameras and a crew and shoot a bunch of living rooms, or scrape the internet to find living rooms that we can use. It gets complicated. And then we’d need to train that model, which also requires lots of resources and can become cost-prohibitive. So, I had to ask: ‘What could we do?’”
To arrive at an answer, Martine looked at what Latent Diffusion Models (LDMs) might offer DNEG, recalling: “It was the very early days of so-called Latent Diffusion Models – an example of those was Stable Diffusion. I thought, ‘What if we take one of those pre-trained LDMs, trained on millions of images, that already understand the semantics of the world?’”
She remarks: “What if we can take this? I like to tinker, and so I thought, ‘Maybe I could pull this model apart. I understood the mathematics of those kinds of models. Maybe I can pull it apart and do something: maybe I can exploit that latent space, which is a sort of abstract representation of the image, and then move around it a little bit like with a GAN to generate intermediate shapes that stay semantically relevant to the current transition.’”
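As a rough illustration of that idea (and emphatically not DNEG’s actual tool), the sketch below uses the open-source diffusers library and a publicly available Stable Diffusion VAE to encode two room images into the latent space, walk between the two latents, and decode the in-betweens. The model name, image sizes and random placeholder inputs are assumptions made for the example.

```python
# Illustrative sketch, assuming the Hugging Face diffusers library is installed.
import torch
from diffusers import AutoencoderKL

# A publicly available Stable Diffusion VAE; any compatible LDM autoencoder would do.
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse").eval()

def encode(img: torch.Tensor) -> torch.Tensor:
    # img: (1, 3, H, W) scaled to [-1, 1]; returns the abstract latent representation
    with torch.no_grad():
        return vae.encode(img).latent_dist.mode()

def decode(lat: torch.Tensor) -> torch.Tensor:
    with torch.no_grad():
        return vae.decode(lat).sample

room_a = torch.rand(1, 3, 512, 512) * 2 - 1  # placeholder for a plate from era A
room_b = torch.rand(1, 3, 512, 512) * 2 - 1  # placeholder for a plate from era B

lat_a, lat_b = encode(room_a), encode(room_b)
in_betweens = [
    decode(torch.lerp(lat_a, lat_b, t))      # move around the latent space, GAN-style
    for t in torch.linspace(0.0, 1.0, steps=8)
]
```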
“So I tried it, and it was janky, but now we had different living rooms that existed in between the two, with vastly different setups. I showed it to Johnny and he said, ‘Maybe we’ve got something there.’ And I thought, ‘OK, that’s positive.’”
“So, I continued for quite a few weeks, trying to smooth this entire transition that I was generating, and I wondered: ‘Maybe we can generate key frames? Maybe we can use an AI interpolator?’ In this case, I adopted one that had been proposed by Google, called FILM [Frame Interpolation for Large Motion]. I was generating the key frames with the latent generative model and then trying to string them together with the interpolator, and it was starting to look good.”
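Structurally, the pipeline she describes has two stages: generate a handful of key frames with the generative model, then let a learned interpolator fill in the frames between each consecutive pair. The sketch below captures only that second, stringing-together stage; interpolate_pair is a hypothetical stand-in for a model like Google’s FILM, and the cross-dissolve in the usage example merely takes its place so the code runs on its own.

```python
# Sketch of stringing generated key frames together with a frame interpolator.
from typing import Callable, List

import numpy as np

Frame = np.ndarray  # an H x W x 3 image


def string_together(
    keyframes: List[Frame],
    interpolate_pair: Callable[[Frame, Frame, float], Frame],  # stand-in for FILM
    inbetweens_per_gap: int = 7,
) -> List[Frame]:
    """Fill each gap between consecutive key frames with interpolated frames."""
    sequence: List[Frame] = []
    for a, b in zip(keyframes, keyframes[1:]):
        sequence.append(a)
        for i in range(1, inbetweens_per_gap + 1):
            t = i / (inbetweens_per_gap + 1)
            sequence.append(interpolate_pair(a, b, t))  # interpolate at time t
    sequence.append(keyframes[-1])
    return sequence


# Usage with a trivial cross-dissolve standing in for the learned interpolator:
keys = [np.full((64, 64, 3), v, dtype=np.float32) for v in (0.0, 0.5, 1.0)]
clip = string_together(keys, lambda a, b, t: (1 - t) * a + t * b)
print(len(clip))  # 3 key frames + 2 gaps x 7 in-betweens = 17 frames
```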
As in any creative process, Martine and her team worked through iterations of the idea, eventually bringing a solution to light. “At one point Johnny said that the transition wasn’t ‘melty’ enough, and so he showed me a melting type of transition and I was like, ‘Oh, wow.’ So I thought, ‘Let’s try to make this more melty,’ and that was the third challenge I was encountering,” recalls Martine.
To begin working towards something more ‘melty’, Martine considered “reversing the steps”, and it became a case of discovering new solutions through experimentation. Martine tells me this period was about exploring the tools and how they fit together to create the final effect.
“At one point, I thought, ‘What if I interpolate between the two rooms using Google’s FILM?’, which makes absolutely no sense really, because it’s not motion, but it gave me this twirly, melty, weird kind of in-betweenness,” says Martine, adding: “I thought, ‘OK, that’s not semantically meaningful, but it is melting, and maybe I can now apply a clean-up pass with the generative approach on top of that, and maybe that will make it both melty and semantically meaningful.’ And it did!”
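To make that reversal concrete, here is a small sketch, again with hypothetical stand-ins: the interpolator now runs directly between the two rooms first, producing the ‘melty’ but non-physical in-betweens, and a generative clean-up pass (an img2img-style diffusion step, for instance) is then applied to each frame to pull it back towards a plausible room.

```python
# Sketch of the reversed order: interpolate first, then clean up generatively.
from typing import Callable, List

import numpy as np

Frame = np.ndarray  # an H x W x 3 image


def melty_transition(
    room_a: Frame,
    room_b: Frame,
    interpolate_pair: Callable[[Frame, Frame, float], Frame],  # stand-in for Google's FILM
    generative_cleanup: Callable[[Frame], Frame],              # stand-in for an img2img pass
    num_frames: int = 12,
) -> List[Frame]:
    frames: List[Frame] = []
    for i in range(num_frames):
        t = i / (num_frames - 1)
        raw = interpolate_pair(room_a, room_b, t)  # twirly and melty, but not yet meaningful
        frames.append(generative_cleanup(raw))     # restore semantic plausibility per frame
    return frames


# Trivial placeholders so the sketch runs on its own:
a = np.zeros((64, 64, 3), dtype=np.float32)
b = np.ones((64, 64, 3), dtype=np.float32)
clip = melty_transition(a, b, lambda x, y, t: (1 - t) * x + t * y, lambda f: f)
print(len(clip))  # 12
```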
The creative challenge was not entirely over. Martine explains how they had arrived at “an interesting tool” and how, in turn, a fourth challenge presented itself.
“The fourth challenge was to get this script-like tool into Nuke and into the hands of artists, and that was done by my colleague E.J. Rowe collaborating with Devina, the lead compositor on the film,” she tells me, before explaining: “They worked together for weeks to build that tool and make it do what the artists wanted it to do. We found that in a day you could generate hundreds of different transitions, which was something that was impossible to do before. You could select the ones that you were most interested in, then use this alongside other, more traditional approaches (like dissolves), combining it all together to create a very novel and interesting transition that’s just a few seconds in the movie.”
Of the sizeable effort involved in carrying out research and development in tandem with delivering the required work for the film, Martine notes that the AI-rendered transitions “do contribute to the storytelling”.
She explains: “I was very pleased to see the final result of almost a year-long project. We had to understand how the model works and build from pre-trained components out there. The overall assembly was definitely something novel back then, in 2023; it was on the cutting edge.”
Reflecting on the creative discoveries involved in realising the transitions for the film, Martine notes: “It’s teamwork and collaboration, and I would say that initially nobody had a clear idea of what these transitions would be like, and of course what we did influenced the final look of these effects.”
Bringing our conversation to a close, Martine reflects on the nuance of the subtle work she was involved with, including how effectively the shots ‘morphed’ seamlessly from one era to another.
“In the work that I contributed to, there’s a transition when one of the characters (Olivia) comes in close to the screen: the transition is very clear in the background and you can see the morphing effect of that particular transition,” reflects Martine. “For me, when I saw that one, it really cemented the feeling that I got [with] the morphing. It was how I imagined it would ultimately look. It was really well crafted. It was very brief, but it shows the tool in action quite well.”
Martine adds: “There are other shots that are fantastic, where the tool has been used to good effect: for example, when we are behind the champagne popping and then in front and that’s also very well done.”
Amidst the film’s more overtly spectacular shots and sequences, the work achieved by Martine and her team provides a series of visual grace notes that possess a real sense of elegance and invention.
Inspired by the work of Martine and the team on Here? Then read our interview in which three leading VFX supervisors from DNEG, Framestore and MPC share advice for getting into the industry. Already keen? Then read our guides to the best 3D modelling software to start experimenting for yourself.