Most of the commonly cited examples of the benefits of artificial intelligence (AI) showcase how the technology can, for example, “read” millions of legal documents in a matter of seconds, detect irregularities in medical screenings that the human eye might overlook, or tabulate millions of lines of data into easy-to-read summaries. AI is viewed as a technology that can speed up both complex and repetitive tasks. It is also being used by consumer electronics giant Samsung to improve sound and picture quality across its television screens.
“Ironically, I don’t recall sound engineers being interested in AI in the beginning, when we started seeing a boom in the technology,” said Sunmin Kim, Head of the Sound Device Lab, Visual Display Business at Samsung Electronics. But with an intuition that AI could improve TV sound, the Sound Device Lab got a head start about six years ago, exploring AI and its applications to sound technology.
Then a shift occurred. With the advent of the Neural Processing Unit (NPU) and its application to Samsung’s flagship TVs, the possibilities of applying AI technology to TV audio were unlocked. The team fast-tracked machine learning for its TVs, which led to significant early successes. Kim is confident that AI has elevated the viewing experience by enhancing TV sound in numerous and profound ways, and will continue to do so.
Today, AI sits at the core of Samsung’s audio strategy, with applications extending across numerous products. This widespread adoption of AI has resulted in features like Q-Symphony and SpaceFit Sound that enrich the audio experience, alongside other audio technologies that breathe life into content through dialogue and movement. These include Active Voice Amplifier, which improves dialogue clarity with the speaker and surrounding noise in mind; Human Tracking Sound, which dynamically reproduces sound based on the position of the on-screen speaker; and OTS Pro (Object Tracking Sound Pro), which creates a dynamic soundscape that follows the movement of objects and speakers on the screen.
Samsung’s advanced AI applications include a deep learning algorithm that distinguishes and separates the primary on-screen voice from other sounds. Previously, an equalizer simply adjusted the high-, mid- and low-pitched bands within the human audible frequency range (approximately 20Hz to 20,000Hz) to improve sound quality. The 2023 TVs and screens go further, delivering deeper and richer detail by analyzing content scene by scene and accentuating individual audio elements, including human voices, background music and sound effects. This optimization happens behind the scenes, so users can enjoy optimal sound effortlessly.
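To make the difference concrete, a classical fixed-band equalizer of the kind described above can be sketched in a few lines of Python. The band edges (250Hz and 4kHz) and gains below are illustrative assumptions, not Samsung’s tuning; the point is that a fixed-band equalizer applies the same gains to every scene, whereas the scene-by-scene approach varies the treatment per audio element.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def three_band_eq(x, sr, gains=(1.0, 1.2, 0.9)):
    """Classic frequency-band equalization: split a mono signal into
    low/mid/high bands and apply a fixed gain to each.

    Band edges (250 Hz, 4 kHz) and gains are illustrative assumptions.
    """
    low  = sosfilt(butter(4, 250,         btype="low",  output="sos", fs=sr), x)
    mid  = sosfilt(butter(4, [250, 4000], btype="band", output="sos", fs=sr), x)
    high = sosfilt(butter(4, 4000,        btype="high", output="sos", fs=sr), x)
    return gains[0] * low + gains[1] * mid + gains[2] * high

# Example: slightly boost the mid band, where most dialogue energy sits
sr = 48_000
signal = np.random.default_rng(0).standard_normal(sr)  # 1 second of noise
out = three_band_eq(signal, sr)
```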
To truly optimize TV sound, the listening environment and its acoustic properties must be considered. Samsung’s latest TVs come with SpaceFit Sound, a feature that leverages AI to assess the surrounding environment and adjust sound accordingly. It uses the TV’s built-in microphone to automatically identify factors present in the room, such as the distance between the TV and the wall and the room’s acoustic properties, by measuring how the TV’s sound reflects back; AI then tunes the output to match.
“We found certain frequency patterns that are more susceptible to change based on a device’s surroundings, potentially impacting the TV’s sound quality,” said Kibeom Kim. “The built-in microphone identifies changes in acoustic properties and sound caused by factors like a carpet or an empty room to optimize sound regardless of the TV’s location.”
“Traditionally, TVs play a set of dedicated test sounds and check them through the mic. SpaceFit Sound, on the other hand, analyzes real content to assess the viewing environment and modifies sound accordingly,” Kim explained. “The technology was designed not only for real scenarios but also for real-time circumstances, as the feature adjusts sound automatically.”
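SpaceFit Sound’s actual algorithm is proprietary, but the underlying idea of comparing what the TV plays with what its microphone hears back can be pictured with a rough sketch. The band layout and gain limits below are illustrative assumptions, and the sketch assumes equal-length playback and capture buffers.

```python
import numpy as np

def room_correction_gains(played, captured, sr, n_bands=8):
    """Estimate per-band correction gains from played vs. mic-captured audio.

    A rough sketch of mic-based room compensation: compare the spectrum of
    the signal sent to the speakers with the spectrum the built-in mic
    picks up, then boost the bands the room attenuates. Band count and
    gain limits are illustrative assumptions. Assumes len(played) ==
    len(captured).
    """
    spec_out = np.abs(np.fft.rfft(played))    # what the TV played
    spec_in = np.abs(np.fft.rfft(captured))   # what the mic heard back
    freqs = np.fft.rfftfreq(len(played), 1.0 / sr)
    edges = np.geomspace(20.0, sr / 2.0, n_bands + 1)  # log-spaced bands
    gains = np.ones(n_bands)
    for i in range(n_bands):
        band = (freqs >= edges[i]) & (freqs < edges[i + 1])
        if band.any() and spec_in[band].mean() > 0:
            # ratio > 1 means the room (carpet, wall distance) absorbed energy
            gains[i] = spec_out[band].mean() / spec_in[band].mean()
    return np.clip(gains, 0.5, 2.0)  # cap boosts and cuts for safety
```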
Q-Symphony is a proprietary Samsung technology that orchestrates an interplay between a TV’s speakers and a connected soundbar, resulting in a richer, more vibrant audio experience. As the name suggests, the feature lets the two audio outputs play in sync: the soundbar handles the primary audio channels, while the TV speakers add background and surround sound to create a dynamic, three-dimensional audio experience.
Though simple in concept, Q-Symphony leverages a wide range of AI technologies to produce and sync sounds with such accuracy. Any gaps in timing and sound level between the TV speakers and the soundbar must be precisely corrected to prevent unwanted echoes and achieve audio harmony. Q-Symphony 1.0, which originally utilized the TV’s top speakers, evolved into Q-Symphony 2.0, which controls all of the speakers with improved channel separation technology for a greater sense of depth and immersion.
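The synchronization problem itself can be made concrete with a generic sketch. This is a textbook cross-correlation alignment, not Samsung’s disclosed method: estimate the lag between the two speaker feeds, then delay the earlier one so the listener hears no echo.

```python
import numpy as np
from scipy.signal import correlate

def align_outputs(tv, bar):
    """Time-align two speaker feeds by cross-correlation.

    If one feed lags the other, the listener hears an echo; estimating
    the lag and padding the earlier feed removes it. A generic sketch,
    not Q-Symphony's actual sync mechanism.
    """
    lag = int(np.argmax(correlate(tv, bar, mode="full"))) - (len(bar) - 1)
    if lag > 0:    # tv arrives `lag` samples late, so delay the soundbar feed
        bar = np.concatenate([np.zeros(lag), bar])[: len(bar)]
    elif lag < 0:  # soundbar arrives late, so delay the TV feed
        tv = np.concatenate([np.zeros(-lag), tv])[: len(tv)]
    return tv, bar
```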
The latest Q-Symphony 3.0 integrates the neural processor with AI-based real-time voice separation. It delivers three-dimensional sound by distinguishing and optimizing individual audio elements, including voices, background music and sound effects, based on the type of content and the user’s volume settings. The result is sound that faithfully reproduces the audio track the creators intended.
Samsung’s AI algorithm can also take the input signals and play them through multiple channels, whether it’s the soundbar or all the TV speakers, customizing each channel for powerful and dynamic sounds. In addition to content featuring Dolby Atmos or 5.1-channel audio, content with regular stereo channels can also be processed to create 20 individual channels. “The sound offered through Q-Symphony is so immersive that viewers feel as though they are physically present on set with so many different background sounds coming alive through the feature,” said Kim.
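As a loose illustration of channel expansion, the classical mid/side decomposition below splits a stereo pair into a centered component for the soundbar and a difference component spread across the TV speakers. Samsung’s 20-channel upmix relies on AI models whose details are not public; this is only a conventional stand-in.

```python
import numpy as np

def midside_upmix(left, right, n_tv_speakers=4):
    """Toy channel expansion via a classical mid/side decomposition.

    The 'mid' (shared) signal, mostly dialogue and centered content, goes
    to the soundbar; the 'side' (difference) signal, mostly ambience, is
    spread across the TV's own speakers. A classical stand-in, not
    Samsung's AI upmix.
    """
    mid = 0.5 * (left + right)
    side = 0.5 * (left - right)
    center = mid                                 # feed for the soundbar
    # equal spread; a real system would also decorrelate these feeds
    surrounds = [side / n_tv_speakers] * n_tv_speakers
    return center, surrounds

# Example: a 1 kHz tone panned slightly left
sr = 48_000
t = np.linspace(0, 1, sr, endpoint=False)
left = 0.8 * np.sin(2 * np.pi * 1000 * t)
right = 0.5 * np.sin(2 * np.pi * 1000 * t)
center, surrounds = midside_upmix(left, right)
```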
These features are the outcome of two symbiotic AI technologies: a content analysis model and a sound separation model. The neural processors work with both auditory and visual cues, among other signals, to create the perfect audio experience that syncs with what is happening on the screen. Despite the complexity of this process, Samsung was able to bring these features to life by forming a cross-departmental team of engineers and pooling resources across picture, sound and other departments.
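A hypothetical skeleton of that cooperation might look like the following, where `classify_scene` and `separate_stems` stand in for the proprietary content-analysis and sound-separation models, and the scene labels and gain profiles are assumptions made up for illustration.

```python
# Hypothetical skeleton of the two-model cooperation described above.
# `classify_scene` and `separate_stems` stand in for the proprietary
# content-analysis and sound-separation models; the scene labels and
# gain profiles are illustrative assumptions, not Samsung's tuning.

PROFILES = {
    "dialogue": {"voice": 1.4, "music": 0.8, "effects": 0.8},
    "action":   {"voice": 1.1, "music": 1.0, "effects": 1.3},
    "music":    {"voice": 1.0, "music": 1.3, "effects": 0.9},
}
NEUTRAL = {"voice": 1.0, "music": 1.0, "effects": 1.0}

def enhance_frame(audio, video, classify_scene, separate_stems):
    """Remix one audio frame with scene-dependent per-stem gains."""
    scene = classify_scene(video)      # visual cue, e.g. "dialogue"
    stems = separate_stems(audio)      # auditory cue: dict of audio stems
    gains = PROFILES.get(scene, NEUTRAL)
    return sum(gains[name] * stem for name, stem in stems.items())
```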
Sunmin Kim, who heads the Sound Device Lab, believes that sound is just one side of the coin: “The focus on sound quality is a given, but all innovations that made breakthroughs came from user-centric design. Sound settings and features need to be straightforward and user friendly.”
Seongsu Park noted that 70% of TV sound comes from the product while the remaining 30% is shaped by the space in which the sound is played: “Our products will continue to leverage the latest measurement systems and AI algorithms to analyze space and sound settings for optimal sound quality.”