At Siggraph 2024, Shutterstock revealed its new text-to-3D generative AI model: an API that can be used in a browser or plugged into 3D software such as Blender to create workable models in minutes. Being able to create 3D models from prompts or image references could open up a complex art form to millions of new creatives.
As Adobe’s report from last year stated, the future of design is 3D. We’ve seen small steps towards text-to-3D over the past year, but the new Shutterstock generative AI model revealed at Siggraph 2024 is one of the best I’ve seen. This generative API is built on NVIDIA’s Edify generative AI architecture and has been trained on Shutterstock’s own content, which includes ‘half a million ethically-sourced 3D models, more than 650 million images’, giving it an ethical foundation.
I saw early results from Shutterstock’s 3D gen AI last year, which, as Dade Orgeron, vice president of innovation at Shutterstock, candidly tells me, amounted to “triangle soup”. The new model is far removed from that and delivers 3D models creatives and artists can use in branding, prototypes, previz and the early stages of game design. “You can create pure quads,” remarks Dade. “This is great for being able to do animation [and] you get a really nice, artist friendly mesh.”
Shutterstock’s AI models accept prompt descriptions of up to 500 words, so “you can get a lot of detail”, and the API even has a ‘prompt enhancer’ that will clean up and refine your prompt text if needed. Alternatively, you can drop in an image and render a 3D model from that reference. Finally, the tool enables you to add textures, building from a simple base colour before layering in PBR materials and roughness.
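Shutterstock hasn’t published full endpoint documentation in the material shown at Siggraph, so the snippet below is only a minimal sketch of what a prompt-driven request might look like; the URL, parameter names and response fields are illustrative assumptions, not the documented API.

```python
# Hypothetical sketch of a text-to-3D request; the endpoint URL,
# parameter names and response fields are assumptions for
# illustration, not Shutterstock's documented API.
import requests

API_URL = "https://api.shutterstock.com/v2/ai/3d/generate"  # assumed endpoint

payload = {
    # Prompts can run to roughly 500 words, per the article.
    "prompt": "A weathered bronze astrolabe on a carved walnut stand, "
              "ornate engraving, soft museum lighting",
    "enhance_prompt": True,  # assumed flag for the 'prompt enhancer'
    "textures": {            # assumed texture options
        "base_color": True,
        "pbr_materials": True,
        "roughness": True,
    },
}

response = requests.post(
    API_URL,
    json=payload,
    headers={"Authorization": "Bearer YOUR_API_TOKEN"},
    timeout=300,
)
response.raise_for_status()
print(response.json())  # e.g. a job ID to poll, or a URL for the finished mesh
```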
“We plan to actually offer a material generator in the near future, so that if you need materials but don’t want to start from scratch, you’ll be able to use generative AI to build those materials,” says Dade.
Once a model is created it can be exported in GLB, USDz and OBJ formats, as triangle or quad meshes, complete with physically based rendering (PBR) materials. This AI is finally outputting usable models. During my Siggraph demo, Dade tells me the idea behind this gen AI platform is not to replace 3D artists, but first to give creatives a starting point and spare them the anxiety of a blank, white screen. It’s also aimed at newcomers to 3D modelling.
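Because the exports are standard GLB, USDz and OBJ files, they drop straight into existing tools. As a quick illustration, this Blender Python snippet pulls a downloaded GLB into the current scene using Blender’s built-in glTF importer; the file path is a placeholder, and nothing here is Shutterstock-specific.

```python
# Import a generated GLB into the current Blender scene using
# Blender's built-in glTF importer (run inside Blender's Python
# console or as a script); the file path is a placeholder.
import bpy

bpy.ops.import_scene.gltf(filepath="/path/to/generated_model.glb")

# Imported objects arrive selected, so the mesh and its PBR
# materials can be inspected straight away.
for obj in bpy.context.selected_objects:
    print(obj.name, obj.type)
```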
“This is a great way for people who don’t know anything about 3D and really want to get into 3D, and really start prototyping and developing worlds and environments and other things like that,” explains Dade. “This means you don’t have to start modelling from scratch anymore. So you can start just building out ideas. It’s a very, very powerful tool for enabling people who may not actually know much about 3D.”
Speed matters too. As we chat, Dade reveals that last year’s Shutterstock gen AI tool took 40 minutes on eight GPUs to create a 3D model from a prompt (so many 3D artists would rightly say they’re quicker); now it’s closer to two minutes for a mesh. “What’s really great is you can get a preview of what you want to generate in about 10 seconds, so you can quickly iterate,” says Dade.
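That preview-then-commit workflow suggests a simple polling pattern. The sketch below assumes a job-based API; the endpoint paths, field names and ‘preview’ route are hypothetical, included only to show how quick iteration might be scripted.

```python
# Hypothetical polling loop for the fast-preview workflow: request a
# quick preview, check it, then commit to the full ~2-minute
# generation. Endpoint paths and fields are assumptions, not the
# documented API.
import time
import requests

API = "https://api.shutterstock.com/v2/ai/3d"  # assumed base URL
HEADERS = {"Authorization": "Bearer YOUR_API_TOKEN"}

def wait_for(job_id: str, poll_seconds: float = 2.0) -> dict:
    """Poll an assumed job-status endpoint until the result is ready."""
    while True:
        status = requests.get(f"{API}/jobs/{job_id}", headers=HEADERS).json()
        if status.get("state") == "done":
            return status
        time.sleep(poll_seconds)

# Quick ~10-second preview first; generate the full mesh only if it looks right.
preview = requests.post(f"{API}/preview", json={"prompt": "stone bridge"},
                        headers=HEADERS).json()
result = wait_for(preview["job_id"])
print(result.get("preview_url"))
```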
Shutterstock already has a Blender plug-in ready to launch in September alongside the commercial release, so you can generate models directly in the viewport and iterate on them in the software. A 3ds Max plug-in is also in development, underlining the commercial focus of Shutterstock’s generative AI push.
Dade says video game studios are an obvious user base, but he highlights how generative AI can open up 3D modelling to more sectors. “We’re working with a large toy manufacturer who was really interested in utilising this technology for prototyping new ideas for brands,” reveals Dade. “So you can see there’s a lot of applications here. And because it doesn’t require that 3D expertise to get started, it opens up prototyping, concept [design] and concepting to more people at these organisations.”
Right now Shutterstock’s 3D AI creates models good enough for concepting rather than final production, but that’s coming, says Dade. The ‘hockey stick curve’ is rising sharply and every new version delivers better results. Dade tells me that in the near future the AI will create sharper, higher-resolution models, text, and even object segmentation, which is a must.
Dade is also keen to point out AI’s failings: it’s not a “silver bullet” for every problem, and getting good results from it takes time and creativity. “It’s really just another tool to enable your creative skills,” he says.
One offshoot of developing Shutterstock’s AI for 3D models was a generative tool for 360-degree 16K HDRI images, which creates rich, detailed natural environments for lighting 3D scenes. These HDRIs can be generated from text prompts, or you can upload a single image and the AI will build a wraparound environment from that reference. While the text-to-3D model generator is grabbing headlines, this new feature could quietly find a lot of fans in the community.
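To show how one of these images would actually be used, here’s a standard Blender Python setup that wires a downloaded .hdr file into the scene’s world as an environment texture; the file path is a placeholder and the node setup is ordinary Blender world lighting, not part of Shutterstock’s tool.

```python
# Wire a generated 360 HDRI into the scene's world lighting using
# Blender's standard environment-texture node setup; the file path
# is a placeholder, and none of this is Shutterstock-specific.
import bpy

world = bpy.context.scene.world
world.use_nodes = True
nodes = world.node_tree.nodes
links = world.node_tree.links

env = nodes.new("ShaderNodeTexEnvironment")
env.image = bpy.data.images.load("/path/to/generated_hdri_16k.hdr")

# Feed the HDRI into the default Background shader so it both
# surrounds and lights the scene.
links.new(env.outputs["Color"], nodes["Background"].inputs["Color"])
```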
It’s a feature that came together “very quickly” for the team as the model began outputting higher and higher resolutions along that steepening ‘hockey stick’ curve. Dade says we can expect more new features like the HDRI image generator as the AI API evolves.
“We’re going to continue to see these different products develop on the back burner for a while, and then you just get this result that’s going to blow everyone’s socks off,” he says, teasing new features like rigging and animation, as well as music and video creation, to eventually make Shutterstock a “very robust, generative AI platform”.