Coming Soon From Google: A Hologram AI That Behaves Like a Real Person
Four technologies showcased at Google I/O 2021 have the potential to combine and power a humanlike hologram android.
In the Netflix sci-fi series Altered Carbon, there is a charming male character called Poe. He’s a hotel manager who is actually an Artificial Intelligence (AI) powered hologram.
Since he’s in the hospitality business, Poe is programmed to have a warm and empathetic personality. Although he is capable of understanding and displaying human emotions, sometimes he does verge on being too eager to please.
Poe represents the holy grail that AI researchers working on ‘multimodal machine learning’ have been chasing for some time: the ability to fully integrate language, visual and acoustic understanding into one cohesive model, so that they can build an interactive AI that understands information contextually and responds just as a human would.
For example, how does one interpret the sentence “You must be joking” in a text message? It could be a casual remark, humor, or an expression of frustration — all depending on the facial expression and tone of the person saying it.
If an AI could be trained to take audio and visual clues into account to determine the emotional context, in addition to just text and literal meanings, then it can respond far more accurately and carry on the conversation in a logical manner.
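As a toy illustration of that idea, consider a function that combines three modalities to guess the speaker's intent. This is not how any real multimodal model works; the tone and expression labels, and the rules mapping them to intents, are entirely made up for this sketch:

```python
# Toy sketch (not a real model): extra audio/visual signals can
# disambiguate the same sentence. All labels and rules are illustrative.

def interpret(text: str, tone: str, expression: str) -> str:
    """Guess the emotional intent of an utterance from three modalities."""
    if text == "You must be joking":
        if expression == "smiling" and tone == "light":
            return "humor"
        if expression == "frowning" and tone == "sharp":
            return "frustration"
        return "casual remark"
    return "unknown"

print(interpret("You must be joking", "light", "smiling"))   # humor
print(interpret("You must be joking", "sharp", "frowning"))  # frustration
```

A real system would replace the hand-written rules with a model trained jointly on text, audio and video, but the input-output shape is the same: identical words, different answers.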
But how does one get there?
Google held its annual developer conference from 18–20 May 2021. At the event, Google showed off a project called “LaMDA”, a chatbot AI that Google says is in the early stages of research and development.
Conventional chatbots are usually confined to the specific topics they were trained for, and require the programmer to first input the precise phrases, words and dialogue they can draw context and intention from. Once the human goes off on an unrelated topic or says something random, the chatbot is usually unable to find an appropriate response.
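A minimal sketch of such a conventional rule-based chatbot is a lookup over pre-programmed phrases; anything off-script falls through to a canned fallback. The intents and replies below are invented purely for illustration:

```python
# Minimal sketch of a conventional keyword-matching chatbot: it only
# recognizes phrases the programmer entered up front. The hotel-themed
# intents and replies are made up for this example.

INTENTS = {
    "check in": "Sure, may I have your booking reference?",
    "wifi password": "The Wi-Fi password is printed on your key card.",
    "breakfast": "Breakfast is served from 7 to 10 am.",
}

def reply(message: str) -> str:
    lowered = message.lower()
    for phrase, answer in INTENTS.items():
        if phrase in lowered:
            return answer
    # Off-topic or random input falls through to a generic fallback.
    return "Sorry, I didn't understand that."

print(reply("What's the wifi password?"))      # scripted answer
print(reply("Do you think Pluto is lonely?"))  # fallback: off-script
```

The second query is exactly the kind of tangent that breaks this design, which is what makes LaMDA's free-flowing conversation notable.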
LaMDA, on the other hand, can hold a free-flowing conversation with a human even when the person suddenly goes off on a tangent, as Google showed in a demo clip at the event.
At Google I/O 2021, CEO Sundar Pichai also showed how LaMDA could take on an inanimate persona, such as the planet Pluto or a paper aeroplane, and carry on a logical conversation with a human.
If a chatbot can pretend to be a planet, I don’t think it would have much problem as a hotel manager.
Ok, so Google will soon be able to develop a generalist chatbot that can take on any personality, but that’s just words.
Can it see and hear humans as well, and respond to them contextually?
Well, Google is in the early stages of developing AI that can respond to a complex user query the way an intelligent, well-informed human would after doing some research on the internet.
Google calls this technology the Multitask Unified Model, or MUM. At Google I/O, the demonstration involved a query about what preparation was needed to climb Mount Fuji compared to another mountain — something a hotel manager could definitely expect from a guest.
What a coincidence!
BTW: MUM also understands 75 languages, a useful capability any hotel would wish for in a manager.
Pichai also briefly demonstrated how MUM will become truly multimodal in the future, capable of analyzing the content within both video and audio data, so that it can pinpoint answers to search queries down to an exact moment in a video.
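One way to picture "pinpointing an answer to an exact moment" is searching a timestamped transcript of a video. The sketch below is a deliberately simple stand-in, not MUM's actual mechanism, and the transcript and timestamps are invented:

```python
# Toy illustration of jumping to the moment in a video that answers a
# query: scan a timestamped transcript for the first matching segment.
# The transcript content and timestamps are invented for this example.

transcript = [
    (12.0, "Welcome to our Mount Fuji hiking guide."),
    (95.5, "Now, what gear should you pack for the climb?"),
    (180.2, "Finally, here is the best season to go."),
]

def find_moment(query):
    """Return the timestamp (in seconds) of the first segment mentioning the query."""
    for seconds, text in transcript:
        if query.lower() in text.lower():
            return seconds
    return None

print(find_moment("gear"))  # 95.5
```

A multimodal model would go further by understanding the frames and audio themselves rather than relying on an exact word match, but the end result, a query mapped to a timestamp, is the same.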
Once Google has mastered analyzing and interpreting visual and audio information, it would then be able to incorporate those models into computer vision and speech recognition to make it possible for AI to synthesize text, expressions, body language and tones in response to a human conversation.
Ask the stars
Finally you might ask: Okay, so Google is in the midst of developing the software capability. What about the hardware needed to create a real Poe?
The answer lies in two other new technologies revealed at the conference that are already being used by Google — the new Tensor Processing Units (TPU) v4 and Project Starline.
The TPU v4 chips can be connected into pods that contain 4,096 chips each, giving each pod a total computing power of one exaflop and “10x the interconnect bandwidth per chip compared to any other networking technology”.
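A quick back-of-envelope calculation puts those figures in perspective. Assuming (purely for illustration) that the one-exaflop total is split evenly across the pod, each chip contributes on the order of a couple of hundred teraflops:

```python
# Back-of-envelope arithmetic for the quoted TPU v4 pod figures.
# Assumption: the pod's total throughput is split evenly across chips.

chips_per_pod = 4096
pod_flops = 1e18  # one exaflop = 10^18 floating-point operations per second

per_chip_tflops = pod_flops / chips_per_pod / 1e12  # convert to teraflops
print(f"~{per_chip_tflops:.0f} TFLOPS per chip")    # ~244 TFLOPS
```

For comparison, that is thousands of times the throughput of a typical consumer CPU, and there are over four thousand such chips in a single pod.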
Layman version: Google has built a supercomputer that can be used to process machine learning and power AIs over the cloud, bringing it another step closer towards its quantum computing vision.
Project Starline, on the other hand, is quite simply, next generation video conferencing technology.
It uses sensor cameras, 3D imaging and a 65-inch ultra-high-resolution screen with light field display technology that makes the video feed so realistic, it is as if you were looking through a window at the person on the other side.
Now imagine if this was set up as a hotel reception counter. Instead of a real person’s video feed, what if it was an AI-generated hologram reacting to you in real-time with expressions and natural dialogue?
Would you be able to tell?
Of course, bringing all these cutting-edge technologies together to create a lifelike hologram that can see, hear and speak to you probably won’t happen anytime soon. But Google’s capabilities are clearly headed in that direction.
In five or seven years’ time, when you next visit a hotel, perhaps the person behind the desk will still be a real person.
Or just maybe, it is Google’s version of Poe.