Meta Unveils New AI-Powered Speech Translation System for Hokkien Language
Meta announced it has built the first AI-powered translation system for a primarily oral language, Hokkien, as part of its Universal Speech Translator (UST) project.
The breakthrough is particularly noteworthy because until now, machine translation tools were limited to written languages. Previously, the standard technique was to train an AI model on large amounts of written text. Now, however, researchers can leverage a combination of written text and speech data to create a translation system for unwritten languages.
The researchers said that collecting data for the project proved to be a challenge as Hokkien is considered a low-resource language. They also said there are relatively few human English-to-Hokkien translators, which makes it hard to collect and annotate the data. For this reason, they used Mandarin as an intermediary language to serve as a bridge of sorts for the AI between Hokkien and English.
They also mined speech data, aligning Hokkien speech with English speech and text whose semantic embeddings were similar.
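The mining step can be pictured with a minimal sketch. It assumes each utterance has already been mapped to a fixed-size embedding vector, and it pairs every Hokkien item with its closest English item by cosine similarity, keeping only matches above a threshold. The vectors, identifiers, threshold and the `mine_pairs` helper are all hypothetical illustrations, not Meta's actual encoder or pipeline.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mine_pairs(hokkien, english, threshold=0.9):
    """Keep the best English match for each Hokkien item if it clears the threshold.

    The threshold value is an assumption chosen for this toy example.
    """
    pairs = []
    for h_id, h_vec in hokkien.items():
        best_id, best_score = max(
            ((e_id, cosine(h_vec, e_vec)) for e_id, e_vec in english.items()),
            key=lambda item: item[1],
        )
        if best_score >= threshold:
            pairs.append((h_id, best_id, round(best_score, 3)))
    return pairs

# Toy 3-dimensional embeddings (hypothetical; real embeddings are much larger).
hokkien = {
    "hok_001": [0.9, 0.1, 0.0],
    "hok_002": [0.0, 0.2, 0.9],
    "hok_003": [0.5, 0.5, 0.5],  # no close English match, so it is dropped
}
english = {
    "en_a": [1.0, 0.0, 0.0],
    "en_b": [0.0, 0.0, 1.0],
}

mined = mine_pairs(hokkien, english)
for h_id, e_id, score in mined:
    print(h_id, "<->", e_id, score)
```

In this toy setup, items without a sufficiently similar counterpart are simply discarded, which is one simple way to trade quantity of mined pairs for quality.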
Traditionally, translation systems rely on transcriptions or are speech-to-text systems. But that won't work for a primarily oral language like Hokkien, so the researchers chose to adopt a new modelling approach.
"We used speech-to-unit translation (S2UT) to translate input speech to a sequence of acoustic units directly in the path previously pioneered by Meta," the researchers write. "Then, we generated waveforms from the units."
"In addition, UnitY was adopted for a two-pass decoding mechanism, where the first-pass decoder generates text in a related language (Mandarin) and the second-pass decoder creates units."
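The data flow the researchers describe can be sketched as three stages: a first pass that produces text in a related language, a second pass that produces discrete acoustic units, and a final step that turns units into a waveform. The sketch below replaces every trained component with a hypothetical lookup table, so it only illustrates the order of the stages; the function names, tokens and unit IDs are invented for illustration and are not part of UnitY or any Meta library.

```python
from typing import Dict, List

# Hypothetical lookup tables standing in for trained neural components.
TOY_FIRST_PASS: Dict[str, List[str]] = {"utt_1": ["ni3", "hao3"]}
TOY_UNIT_TABLE: Dict[str, List[int]] = {"ni3": [12, 12, 47], "hao3": [47, 3]}

def first_pass_decode(utterance_id: str) -> List[str]:
    """First pass: emit text tokens in the related language (Mandarin)."""
    return TOY_FIRST_PASS[utterance_id]

def second_pass_decode(text_tokens: List[str]) -> List[int]:
    """Second pass: emit discrete acoustic units for the target speech.

    Simplification: this toy conditions only on the first-pass tokens.
    """
    units: List[int] = []
    for token in text_tokens:
        units.extend(TOY_UNIT_TABLE[token])
    return units

def units_to_waveform(units: List[int]) -> List[float]:
    """Stand-in for a unit vocoder: map each unit to one fake audio sample."""
    return [unit / 100.0 for unit in units]

text = first_pass_decode("utt_1")
units = second_pass_decode(text)
waveform = units_to_waveform(units)
print(text, units, waveform)
```

The point of the intermediate text pass in this sketch is that the related language supplies a written representation the primarily oral target language lacks, while the final output is still speech.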
They then evaluated the accuracy of the translation system using a host of metrics and systems. What's more, they created the first Hokkien-English bidirectional speech-to-speech translation benchmark dataset based on a Hokkien speech corpus called Taiwanese Across Taiwan. The dataset is open-sourced, allowing other researchers to build upon it and improve the system's accuracy.
Meta noted, though, that the system is still a work in progress: it can only translate Hokkien to English one full sentence at a time. Regardless, the social media giant believes this is a step toward its goal of translating hundreds of languages.
There are about 49 million Hokkien speakers in the world spread across different countries, such as Singapore, China, Taiwan, Malaysia and the Philippines. Meta hopes that through this new translation system, these people will be able to seamlessly communicate with others around the world in online spaces like the metaverse using their native language.
A demo of the translation system is available online.