- What’s involved in Automated Subtitling?
- What can go wrong with Machine Translation?
- Netflix’s idea for Automated Subtitle Translation
- Is Netflix the only one in the Automated Subtitling game?

When I read about Netflix using AI for creating subtitles, it reminded me of a class I took on Pattern Recognition while I was in grad school. The professor was explaining how a research group at a certain mobile giant (struggling badly now) built ML models using several thousand hours of speech. When they took the trained model for live testing on their actual customers, it performed horribly. The team could not understand what went wrong, until they realized the obvious. Get this: for some odd reason, the R&D group had used their own voice samples to train the model and had overlooked a simple fact. The entire team was made up of researchers from China who had come over to the US to study and work. Their sample set was completely biased because of their accents and pronunciations, and as a result, the model failed in real-world testing.

I see a similar problem when it comes to automated subtitling. You need to have your software trained extremely well to distinguish accents and dialects. I purposely did not say “language”, because I feel it is better to have a different model per language. But accents and dialects will play a massive role when it comes to training and testing such ML models.

What’s involved in Automated Subtitling?

The main thing to consider is that there are two main jobs that need to be accomplished for automated subtitling:

- transcription (English words being spoken into English text to be printed on the screen)
- translation (from English to Spanish or Hindi)

Both present huge technical challenges. The first (transcribing) has challenges in understanding the way different people speak the same language. You could have an awesome algorithm that can transcribe speech really well, and then a person comes along with a speech impairment like a lisp and it confuses the algorithm. Machine Translation is also a huge research topic and is really hard, and we’ll see why.

What can go wrong with Machine Translation?

Suppose a character says to another, “Hey you – beat it!” What does this seemingly innocent four-word sentence mean? Is he telling the other person to go beat/hit something? NO! He’s telling the other person to go away or go do something else. The Cambridge dictionary defines it along the same lines.

But how are you going to teach your AI/ML model this nuance? Transcribing the text is one thing, but translating it while accounting for cultural differences and usage of speech is a completely different ball game, and I am glad that big research teams like the ones at Netflix are taking a stab at it.

Netflix’s idea for Automated Subtitle Translation

Netflix recently published a paper that describes what they do, and it’s very interesting once you get past the initial jargon. Nvidia has also blogged about it here, because Netflix used Nvidia GPUs to crunch through the data. Instead of directly translating a sentence, they first “simplify” the sentence using a corpus of words and phrases and translate that instead. You can read more about Text Simplification here. In our example, Netflix’s algorithm would simplify “beat it” to something like “go away” and then translate it into other languages. Netflix’s simplification model, which they call the automatic pre-processing model (APP), is applied to English-language sources, and the output of this step is sent to the machine translation step. You can read the paper or the Nvidia article for further details. But, essentially, they use a simplification step followed by the translation step.

I haven’t found details on their transcribing process. If you know anything about it, do let me know.

Is Netflix the only one in the Automated Subtitling game?

There are many companies in the automated subtitling space, and some of them are:

- YouTube (of course!); they’ve written about it here.

There is no question that Squid Game has become a global sensation. Since its release, the nine-episode survival drama has topped Netflix’s charts in 90 countries and is poised to become the most-watched show in Netflix history.

As the global popularity of the Korean thriller continues to grow, there have been debates over the quality of the English subtitle translation, particularly on social media. Many people who claim to be English-Korean bilinguals argue the translation does not do justice to the brilliantly written stories, clever dialogue and script. Some even argue that if you have watched the show in English, you haven’t really watched it at all.
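
To make the simplify-then-translate idea from the Netflix discussion a bit more concrete, here is a minimal Python sketch. Everything in it is a made-up stand-in: the `SIMPLIFICATIONS` idiom table and the `TOY_PHRASES_EN_ES` phrase table are hypothetical, and the function names are mine. Netflix’s APP model is a learned model, not a lookup table, so treat this only as an illustration of why simplifying first can help.

```python
import re

# Hypothetical idiom table: idiomatic phrase -> plain-English paraphrase.
# Illustrative only; a real simplification model is learned from data.
SIMPLIFICATIONS = {
    "beat it": "go away",
}

# Toy English -> Spanish phrase table standing in for a real MT system.
# "golpéalo" is the literal "hit it"; "vete" is "go away".
TOY_PHRASES_EN_ES = {
    "hey you": "oye tú",
    "beat it": "golpéalo",
    "go away": "vete",
}


def simplify(sentence: str) -> str:
    """Replace known idioms with plain paraphrases (case-insensitive)."""
    result = sentence
    for idiom, plain in SIMPLIFICATIONS.items():
        pattern = re.compile(r"\b" + re.escape(idiom) + r"\b", re.IGNORECASE)
        result = pattern.sub(plain, result)
    return result


def toy_translate(sentence: str) -> str:
    """Translate by phrase lookup; text not in the table passes through."""
    result = sentence.lower()
    for english, spanish in TOY_PHRASES_EN_ES.items():
        result = re.sub(r"\b" + re.escape(english) + r"\b", spanish, result)
    return result


def subtitle(sentence: str) -> str:
    """Simplification step followed by the translation step."""
    return toy_translate(simplify(sentence))


if __name__ == "__main__":
    line = "Hey you, beat it!"
    print(toy_translate(line))  # literal: oye tú, golpéalo!
    print(subtitle(line))       # simplified first: oye tú, vete!
```

The direct translation tells the other person to hit something, while the simplified version tells them to leave, which is what the speaker actually meant.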