Samsung researchers in Moscow are developing an artificial intelligence system that creates talking avatars from photos without 3D modeling. The research team published a paper detailing how the system works.
Traditionally, creating a talking head model required a large number of images. The new technique needs only a few reference images, or even just one.
One of the engineers, Dmitry Ulyanov, published an example on Twitter. He explains that although it is possible to create the model with only one image, the final result is better with more examples: more images lead to greater realism.
Samsung explains that the process has three steps. First, a network is trained to link together frames showing the same subject's face. Using this data, the system then builds a generator that produces video frames of the face. Finally, the system evaluates the realism and pose of the generated frames.
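To make the three steps more concrete, here is a toy sketch of how data might flow through such a pipeline. It is not Samsung's model: the real system uses trained neural networks, while every function below (`embed_frame`, `embed_identity`, `generate_frame`, `score_frame`) is a hypothetical stand-in built from simple arithmetic, and frames and poses are assumed to be short lists of numbers.

```python
from statistics import fmean


def embed_frame(frame):
    """Step 1 (toy): summarize one reference frame as a small vector.

    Assumption: a frame is a list of pixel intensities. The summary here is
    just mean brightness and contrast, not a learned embedding.
    """
    return [fmean(frame), max(frame) - min(frame)]


def embed_identity(frames):
    """Average the per-frame vectors into one identity vector.

    Averaging over more reference frames gives a steadier estimate, which
    mirrors the claim that more images improve the result.
    """
    vectors = [embed_frame(frame) for frame in frames]
    return [fmean(component) for component in zip(*vectors)]


def generate_frame(identity, pose):
    """Step 2 (toy): produce a frame from the identity vector and a target pose."""
    brightness, contrast = identity
    return [brightness + contrast * value for value in pose]


def score_frame(frame, identity, pose):
    """Step 3 (toy): score in (0, 1], higher when frame fits identity and pose."""
    expected = generate_frame(identity, pose)
    error = fmean((a - b) ** 2 for a, b in zip(frame, expected))
    return 1.0 / (1.0 + error)


if __name__ == "__main__":
    reference_frames = [[0.0, 1.0], [0.2, 0.8]]  # two made-up reference images
    identity = embed_identity(reference_frames)
    new_frame = generate_frame(identity, pose=[0.0, 0.5])
    print(score_frame(new_frame, identity, pose=[0.0, 0.5]))
```

Passing one frame or several to `embed_identity` works the same way, which is the point of the sketch: the identity estimate is available from a single image and simply becomes more stable as reference images are added.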
How it can be used in practice
With this system, it was possible to create animated avatars of Leonardo da Vinci, the Mona Lisa, Salvador Dali and Marilyn Monroe. The system was trained on the VoxCeleb2 database, a large collection of celebrity interview videos.
This technique of capturing and learning faces could be extremely useful for virtual presence. Video conferencing would have a completely different dynamic if participants could send only their voice and let a virtual replica appear on screen.
There are also interesting applications in video games, where images of actors are captured to build characters. Motion capture could eventually be replaced by these avatars. Movies with special effects could also benefit from this artificial intelligence.
However, there is always the potential for misuse, which in this case means deepfakes. Deepfakes are fabricated likenesses of real people, often celebrities, that are inserted into videos such as adult content.