Bringing AI To Reality

SPONSORED FEATURE: Generative AI, a tool that creative enterprise IT teams are adopting at a growing pace, has the potential to enable major advances in the way an enterprise conducts its business. Its value can be applied in numerous internal and external-facing applications and services, including sales chatbots, HR processes, and avatar-led training, instruction and marketing videos.

By combining this intriguing AI software with fast new-generation server hardware, in-the-know startups are designing, building and delivering new applications to market – ahead of schedule, in many cases.

One such startup is DeepBrain, a human-focused AI technology specialist. DeepBrain is a member of the Lenovo AI Innovators Program, which provides startups with access to Lenovo’s expertise, resources and server hardware to help them create products and accelerate corporate growth.

DeepBrain’s flagship product is the AI Avatar, which creates ultra-realistic avatar-led videos for purposes including sales, training and marketing. The AI Avatar combines DeepBrain’s proprietary generative AI video synthesis with text-to-speech and large language models (LLMs) to create AI humans that are nearly indistinguishable from real people.

Hyper-realistic avatar videos

“A key differentiator for our avatars is what we call hyper-realistic,” says Joe Murphy, DeepBrain’s business development officer. “If I show you the real person and generate videos side by side, you shouldn’t be able to tell the difference, and we have data to back that up. We have a measurement called the structural similarity index, where 100% equals 100% similar to the real person; we come in at 96.5 on that scale.” DeepBrain actually clones the real person’s voice in each production, Murphy said.
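
To make the metric Murphy mentions more concrete, here is a minimal sketch of the standard structural similarity (SSIM) formula applied to two small sets of grayscale pixel values. It is an illustration only: the article does not say how DeepBrain computes its score, production tools usually average SSIM over many local windows rather than one, and the function name global_ssim and the sample pixel values are made up for this example. A result of 1.0 means the two inputs are identical, so a score of 96.5 on a 100-point scale would correspond to roughly 0.965 here.

```python
def global_ssim(x, y, dynamic_range=255.0):
    """Single-window SSIM between two equal-length lists of pixel intensities.

    Simplified illustration: real pipelines typically compute SSIM over
    sliding local windows and average the results.
    """
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and the same length")

    n = len(x)
    # Stabilizing constants from the standard SSIM definition.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2

    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n

    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Hypothetical pixel values standing in for a real frame and a generated one.
real_frame = [10, 50, 90, 130, 170, 210]
generated_frame = [12, 47, 95, 128, 175, 205]
score = global_ssim(real_frame, generated_frame)
```

Comparing a frame with itself returns 1.0, and any difference between the two inputs pulls the score below that.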

One of DeepBrain’s first customers was MBN, a news network based in Seoul, South Korea. “They launched the first AI news anchor using our technology,” Murphy said. “They identified what they call the ‘franchise face’ of the network – like Anderson Cooper on CNN or Sean Hannity on Fox. Her name was Kim Joo Ha – a trusted brand face with a trusted voice. We brought her into the studio and recorded her talent for about four hours in a green-screen environment. That’s the training data that we use to build our model.”

The Kim Joo Ha avatar is not used in MBN’s news show each day, Murphy said. Instead, it substitutes for the real person in routine appearances such as promotional and advertising spots, so the anchor herself can focus on what she does best – delivering the news. “This obviously turns out to be a much better use of her time,” Murphy said.

Over the years, DeepBrain has tightened up its production process and is constantly improving, Murphy said.

“So now we’re down to about four hours of video, followed by three weeks of machine learning time,” he said. “And then the model is ready. Essentially, it’s a text-in/video-out model. You inject whatever text you want, then you hit Export. The model then generates the video of that person delivering the script that you’ve typed. So it’s simple: Script in, video out.”

Compiling and rendering data-heavy video and audio files often takes hours, which has slowed down pioneering AI companies. DeepBrain has solved that problem.

“The speed of synthesis is our second differentiator,” Murphy said. “That’s what enables us to have conversational AI humans; we can synthesize video as fast as real-time. So that enables interactive video where you can ask a question and the AI human can then respond. That is something that we’ve rolled out in 7-Eleven (convenience stores) in Korea and the Novotel hotel in Korea. So these are situations where customers can walk up to a kiosk, ask questions, and then the AI human answers the question. It’s basically taking a chatbot and putting an avatar as the human face.”

DeepBrain uses “smart caching” in these interactive kiosks, Murphy said, to handle the repetitive questions. “Where people are asking the same questions, it remembers that answer. When that question comes in, it’s all queued up, ready to go,” he said.
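
The idea Murphy describes can be sketched as a lookup table in front of the slow synthesis step. This is a guess at the general pattern, not DeepBrain’s implementation: the class name AnswerCache, the render callback and the way questions are normalized are all hypothetical.

```python
import re


class AnswerCache:
    """Keeps rendered answer clips so repeated questions skip synthesis."""

    def __init__(self, render):
        # `render` is a hypothetical slow function: question text -> video clip.
        self._render = render
        self._clips = {}

    @staticmethod
    def _normalize(question):
        # Lowercase and drop punctuation and extra spaces so that
        # "Where is the ATM?" and " where is the atm " share one cache entry.
        return " ".join(re.findall(r"[a-z0-9]+", question.lower()))

    def answer(self, question):
        key = self._normalize(question)
        if key not in self._clips:
            # Cache miss: pay the synthesis cost once and remember the clip.
            self._clips[key] = self._render(key)
        return self._clips[key]
```

A kiosk would call answer() for every incoming question. Only the first occurrence of a given question triggers the render callback, and later repeats are served from memory.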

More recently, DeepBrain has been connecting to large language models, such as ChatGPT and Llama 2.

“We can’t anticipate what those models are going to say,” Murphy said, “so we’re streaming it out as fast as we can. That takes a one-second start time, then the answer starts coming out. It’s like when you talk to a smart speaker like Siri or Alexa, there’s a bit of a pause. It’s the same thing for us, but now we’re synthesizing video on top of the audio.”
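
One common way to keep the start-up delay short when the answer is not known in advance is to pass the model’s output to the synthesizer sentence by sentence instead of waiting for the full reply. The sketch below shows that pattern under stated assumptions: the article does not describe DeepBrain’s actual pipeline, and the function name sentence_chunks and the idea of chunking at sentence boundaries are illustrative choices.

```python
import re

# A chunk is considered complete when the buffered text ends with
# sentence punctuation, optionally followed by whitespace.
SENTENCE_END = re.compile(r"[.!?]\s*$")


def sentence_chunks(token_stream):
    """Group streamed text fragments into sentences.

    Each sentence is yielded as soon as it is complete, so a downstream
    speech or video synthesizer can start before the full answer arrives.
    """
    buffer = ""
    for token in token_stream:
        buffer += token
        if SENTENCE_END.search(buffer):
            yield buffer.strip()
            buffer = ""
    # Flush whatever is left when the stream ends mid-sentence.
    if buffer.strip():
        yield buffer.strip()
```

Fed the fragments "The ", "pool ", "is ", "on ", "level ", "two. ", "It ", "closes ", "at ", "nine", ".", the generator yields "The pool is on level two." first, so synthesis of the opening sentence can begin while the rest of the answer is still arriving. A simple punctuation rule like this would split on abbreviations, so a production system would need a more careful boundary check.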

DeepBrain AI Avatars are streamed in real time with low latency, ensuring that they can provide timely, conversational responses to questions. This makes them ideal for a wide range of applications, including customer service, education, and marketing.

Other use cases

Other uses for DeepBrain AI Avatars include training videos that are more engaging and interactive than standard videos, and instruction videos that are more fun to watch than traditional ones. For example, an AI Avatar could provide step-by-step instructions on how to use a new software program or how to perform a complex task, and perhaps add a few jokes along the way.

Marketing videos can take a step up by being more personalized and engaging than traditional videos. For example, an AI Avatar could be used to create a personalized video message for each customer or to provide customers with support and assistance in real time.

How the right hardware renders high quality

This super-high quality does not happen without outstanding backend support. The Lenovo AI Innovators Program has helped DeepBrain by providing it with access to ready-to-deploy infrastructure solutions based on Lenovo’s high-end ThinkSystem SR675 V3 servers. This hardware is essential for running DeepBrain’s computationally intensive LLMs.

The ThinkSystem SR675 V3 is equipped with up to two 4th-Gen AMD EPYC processors and as much as 6TB of memory, ideal for handling LLMs. You don’t find that much memory available in just any server.

As a result, this ready-to-deploy server is well-suited for AI applications such as DeepBrain’s. In addition, it features a variety of fast storage options, such as NVMe SSDs, which can provide the performance needed for demanding AI workloads. It also supports high-speed networking, such as InfiniBand and Ethernet. Customers choose which of these options to deploy.

More about the AI Innovators Program

In addition to providing access to the hardware, the Lenovo AI Innovators Program offers startups access to Lenovo AI centers of excellence. These centers provide startups with the subject matter expertise and resources they need to build customized proofs of concept for potential customers.

“The AI Innovators Program was very important to our plan as we started (with the MBN news anchor avatar project),” Murphy said. “It helped us detail our CEO’s strategic vision in the U.S. to land and expand, take what we did in Korea, and then bring it to North America.”

In the meantime, DeepBrain is now working with more than a dozen news stations across China and Korea with AI news anchors, and they’re all using the AI Studios product that’s demonstrated on the company website, Murphy said.

The centers of excellence serve 180 countries and more than 20,000 business partners, helping them daily to build their personalized proofs of concept for potential customers.

Lenovo will be demonstrating many of its next-generation solutions – including the DeepBrain use case – at its Lenovo Tech World conference on Oct. 24.

Sponsored by Lenovo.
