Towards Synthetic Reality: When DeepFakes meet AR/VR
by Walter Pasquarelli
In recent months, there’s been significant buzz around DeepFakes. My Twitter feed, populated predominantly with tech-for-good and privacy advocates, went bananas when a video emerged of Mark Zuckerberg criticising the fact that there could be ‘one man with billions of people’s data’. Shortly before, a video of US Speaker of the House Nancy Pelosi, manipulated to make her sound as if she were drunk, circulated on the internet. The video, which was even shared by President Trump and members of the GOP, was later exposed as a hoax. But with the quality of these videos rapidly improving, DeepFakes have sparked wider epistemological discussions about how we will understand knowledge and truth in future.
These developments in AI and machine learning (ML) are taking place alongside giant leaps in immersive technologies, better known as augmented reality (AR) and virtual reality (VR). We’ve all heard about AR and VR, but their application in our daily lives is (aside from perhaps Pokemon Go and other video games) fairly limited. Yet it appears that AR/VR and DeepFakes could operate symbiotically, in that each reinforces the other’s perceived level of realism. Here, I offer a few thoughts on how the marriage of these technologies might transform our understanding of reality and presence.
DeepFakes: A quick recap
DeepFakes fall under the wider category of AI-generated ‘synthetic media’, that is ‘audiovisual information in digital form […] which is a composite of multiple pieces of information synthesised to produce a substantially new informational artefact’. They are typically created with ‘generative adversarial networks’ (GANs). GANs consist of two neural networks pitted against each other: the first network (‘the generator’) will create drafts or samples of, for example, videos, images, or audio; the second network (‘the discriminator’), in turn, ‘will detect if a sample is created by the generator or is a real sample for an existing sample library’. It is an iterative process by which the two networks ‘bounce’ off each other, meaning the generator learns over time how to fool the discriminator by producing increasingly accurate synthetic media. In this way GANs generate replicas or manipulations of natural objects and people that look highly realistic. FaceApp, for instance, the controversial mobile app that creates face transformations of photographs, works with GANs.
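For readers who like to see the mechanics, the back-and-forth between generator and discriminator can be sketched in a few lines of Python. This is a deliberately tiny toy under my own assumptions, not how any DeepFake tool is actually built: the ‘real samples’ are just numbers drawn from a bell curve centred on 4, the generator is a straight line rather than a neural network, and every name in it (`train_toy_gan`, `sigmoid`, the parameters `a`, `b`, `w`, `c`) is made up for illustration.

```python
import math
import random


def sigmoid(t: float) -> float:
    """Numerically stable logistic function, maps any number into (0, 1)."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def train_toy_gan(steps: int = 2000, lr: float = 0.01, seed: int = 0):
    """Train a one-dimensional toy GAN and return its learned parameters."""
    rng = random.Random(seed)

    # Generator (hypothetical): turns random noise z into a fake sample a*z + b.
    a, b = 1.0, 0.0
    # Discriminator (hypothetical): D(x) = sigmoid(w*x + c) is its estimate
    # of the probability that x came from the real sample library.
    w, c = 0.1, 0.0

    for _ in range(steps):
        x_real = rng.gauss(4.0, 1.0)  # assumed "sample library": N(4, 1)
        z = rng.gauss(0.0, 1.0)
        x_fake = a * z + b

        # Discriminator step: gradient ascent on
        # log D(x_real) + log(1 - D(x_fake)), i.e. get better at telling apart.
        d_real = sigmoid(w * x_real + c)
        d_fake = sigmoid(w * x_fake + c)
        w += lr * ((1.0 - d_real) * x_real - d_fake * x_fake)
        c += lr * ((1.0 - d_real) - d_fake)

        # Generator step: gradient ascent on log D(x_fake),
        # i.e. nudge the fake sample towards what the discriminator calls real.
        d_fake = sigmoid(w * x_fake + c)
        grad_x = (1.0 - d_fake) * w  # derivative of log D(x) w.r.t. x
        a += lr * grad_x * z
        b += lr * grad_x

    return (a, b), (w, c)


if __name__ == "__main__":
    (a, b), (w, c) = train_toy_gan()
    print(f"generator: x = {a:.2f} * z + {b:.2f}")
    print(f"discriminator: D(x) = sigmoid({w:.2f} * x + {c:.2f})")
```

Each pass through the loop is one ‘bounce’: the discriminator updates first, then the generator adjusts to whatever the discriminator has just learned. Real systems replace both the straight line and the single sigmoid with deep networks trained on images, video, or audio, and I make no claim about how well this toy converges.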
Yet, there is a reason why DeepFakes have sparked so much discussion recently. I spoke to Henry Ajder from Deeptrace Labs, who explained to me that:
‘the recent phenomenon of Deepfakes represents a revolution in the creation process of synthetic media. Whereas previously significant time, money, and expertise was required to generate realistic synthetic outputs, Deepfakes or AI-generated synthetic media automate these processes in a fraction of the time, and with increasingly realistic results.’
To date, replicas (such as the ones mentioned above) can still be identified with the naked eye, but soon AI-generated synthetic media is expected to be able to replicate photos, video footage, or even human voices so faithfully that humans will be unable to detect them, at least not without the help of relevant software.
The key here, however, is not to consider AI-generated synthetic media as a standalone technology, but rather as one whose impact increases dramatically in combination with other technologies.
What are AR and VR?
AR technology is applied to improve and complement natural environments through so-called ‘digital components’ - such as the little Pikachu which appears on the screen in Pokemon Go. These digital components are interwoven with the physical (real) world in order to generate experiences that are immersive, interactive, and ultimately feel real. VR in turn goes a step further, in that the environment the user experiences is entirely artificial. In other words, the entire virtual world is a single digital production. The key is to understand AR/VR as a technology that is fundamentally immersive, providing users with a feeling of real ‘presence’ within a semi-natural space and time.
As one might expect, the holy grail in the development of immersive technologies is to create fully realistic digital components which merge with, or substitute, the physical world in a way which we perceive as natural, defying our sensory judgment of what is real and what is not - and this is precisely where AI-generated synthetic media comes in.
Towards synthetic reality (SR)
For developers of immersive technologies, the technology powering synthetic media is a ‘hot area’ because of its potential for creating replicas of real-life persons or objects and turning them into photo-realistic digital components which feel part of the natural environment. Going a step further, applying AI-generated synthetic media in VR will mean the creation of entire ‘synthetic realities’ (SR), transporting the user to faraway places that never were.
What sounds like a Matrix scenario might, however, be just a few years away. Google recently published a breakthrough paper ‘in controlling depth perception in video footage’, and similarly Samsung AI Center Moscow has released a paper on creating ‘full-body avatars’ for video-conferences and games. As a report by Wired argued, these companies could soon be using synthetic media technology to make their avatars fully realistic - by using just a few pictures. Today, immersive technologies are applied primarily in games, but within the next 20 years, it is predicted that they will be as widely used as smartphones are today.
Creating synthetic realities does have enormous potential for providing access to places and people that might otherwise be unreachable. Synthetic versions of prominent professors could provide tutoring to people in less developed countries, or to those who might otherwise not be able to afford it. Business executives would be able to hold remote meetings with international partners, whilst having the feeling that they were actually present in the same room - thereby cutting travel costs and saving time. Similarly, MPs could create realistic avatars of themselves to communicate with individual members of their constituencies.
But would a replica of a real person be able to establish the same connection as a human being? The possibilities are endless, but SR does bring up some serious questions about the future of presence and reality. In fact, it would seem quite redundant to travel somewhere if you were able to experience the same place through SR, without the stress of long-haul flights.
A world without problems
This issue becomes more contentious when SR becomes more enjoyable than natural reality. Picture this: you are now able to replicate or create any environment or person and place them in a world where you are the absolute ruler - what would life be like in Walterville? Others have painted scenarios where we would be able to replicate deceased loved ones, allowing us to interact with them beyond death and never needing to let go of them. All of these scenarios have a clear pattern in common - namely the pursuit of one’s own ideal reality, free from problems and discontent.
Yet, from psychological literature to religious scriptures, conventional wisdom has hitherto argued that life entails a degree of suffering, and thus that our ability to do well rests on our resilience and on learning how to solve problems and overcome obstacles. If we were able to immerse ourselves in an ideal synthetic world with no obstacles or difficulties, it is fair to assume that this would have a tremendous impact on our ability to thrive in the ‘real world’, let alone manage relationships with perfectly imperfect human beings. The experience of social media bubbles has taught us that people have a tendency to create their own ideal spaces, shutting out people they don’t agree with and don’t want to hear. SR will enable us to turn a blind eye to ‘undesirable’ people and realities in a similar fashion, the only difference being that SR bubbles will feel real and natural.
For centuries, philosophers ranging from Plato to George Berkeley and David Hume have tried to answer the question of whether there is an objective natural reality. In some ways SR would settle this question, in that we will be able to immerse ourselves in purely subjective realities. If SR is able to replicate and create the ‘presence’ of objects, people, or entire environments, there is a serious question of how this will affect our understanding and sensory perception of actual natural presence. If one way or another we perceive something or someone to be real and present, triggering emotional responses and attachment, and if it becomes impossible for humans to tell the difference between natural and synthetic reality - then where do we draw the line between what is and what is not?