Can AI Help with Subtitling or Segmenting? Curious to Hear from Others!

Hi everyone,

I’ve recently started contributing as a subtitle editor on Viki, and I’m really enjoying the process! At the same time, I’m also exploring how technology…especially AI…might support the work we do here, without taking away the human touch that’s so important to the community.

I came across a generative AI tutorial that shows how AI can help with generating text, summarizing conversations, and even translating short phrases. It made me wonder: could tools like this ever be helpful in subtitling or segmenting…maybe for initial drafts or time-stamp suggestions?

Of course, nothing replaces the accuracy and cultural sensitivity of real contributors, but I’m curious if anyone here has tried using any AI tools to speed up the process or help with difficult translations.

I’m all for responsible and ethical use, especially in a community like this one. Just wanted to open up the conversation and see what others think!

Looking forward to your thoughts,
M Richards

What a great idea for a discussion! I’m curious about other people’s opinions.

I believe Viki prohibits copy-pasting text from translation software (although, from what I understand, they use software to automatically translate some of their shows into different languages, and it doesn’t really work well).

That being said, who of us uses a printed dictionary nowadays or is absolutely fluent in both languages they work with? In my opinion, you should never copy-paste anything you don’t understand, since not everything AI/translation software produces is correct or makes sense. It’s usually fine with major languages like Chinese and English, but with the more complex ones, it often screws up even the simplest things (best example: “you” can be male, female, singular and plural in English, where other languages have 4 different words x up to 8 different cases).

Checking whether it comes up with a better translation idea, or looking up a word/expression you don’t know, should be fine. Using it when you have absolutely no idea what the line says, or without properly understanding the text the software came up with: well, no no no.

1 Like

Well… Viki has. They hire paid subbers to edit machine translations for the presubbed languages (English, Spanish, French, and Portuguese) before the shows become available for us.

But for us, it’s forbidden beyond just looking up a word we forgot.

Some still use printed dictionaries, but even online (or on CD or whatever), there are official dictionaries and other useful language-related sites that are not generated by AI. It’s not like AI is our only option now.

Every qualified subber/editor/moderator.

That AI is less terrible for English (and maybe also Chinese) nowadays is only because AI had far more training for English than for other languages so it’s just more advanced there. That doesn’t mean that it’s perfect for English, though. All languages have their difficulties and AI is never going to beat a qualified human being.

It’s not my idea of helpful and I think most professional translators don’t like it, either. Instead of translating from scratch, you’d have to clean up the mess of a machine translation. Indeed, some AI is not completely terrible, but I still think translating from scratch goes faster.
But also… translators will lose their rich vocabulary… or at least their translations will. The beautiful words and expressions they might have come up with on their own will now be overshadowed by some mediocre standard (or even inaccurate) translation by the machine and when exposed to that time and time again, the translator might lose some of their creativity or feel for the language. Also, they might lose motivation altogether.
And all the standard mediocre translations that follow will have their influence on everyone who grows up with those subtitles and ultimately, on our languages.
As for segmenting, we used to segment from scratch, way before anything was translated. Nowadays, we edit segments while having to preserve the subtitles of 4 presubbed languages in the process and over time, we get more and more restrictions that make our work harder. I doubt any segmenter would say that the introduction of AI has done the segmenting experience any good.

4 Likes

In languages you already know, AI is super helpful with official-technical-scientific-legal translations, with those pertaining to computer science etc. You know, those texts that are boring and difficult at the same time. Especially if it is from another language into English. It will speed up the translation, because it will almost eliminate the need to look up difficult words in the dictionary. It needs careful checking because it messes up genders and other things like that, but you do gain time.
It’s also good if you have no idea of the language, there is no translation whatsoever, and you need to know more or less what is said, without need of perfect accuracy - just the gist, and you don’t plan on using the translation in any official way.
Lastly, it’s good if you want to check individual pieces of an already done translation that you’re doubtful about because it doesn’t make much sense. But it’s helpful to have 2-3 different AI models, not just one.
Now, for subtitles and Viki, it’s a bit less straightforward.
Firstly, because the level of the AI translation drops dramatically for languages other than English and Spanish (and possibly German and French).
And most of all because of something very specific to subtitles: subtitles depend on the video to make sense. The AI model is not watching the video, does not know the story, or what’s happening during the utterances, or between one subtitle and the next, when there are actions without words. There might be a chase, there might be a death, there might be a kiss…
You would also have to explain to the AI model beforehand who the characters are (including who is male and who is female, what speech level they use with one another, what their relationship is, and what has happened between them in the past that they are now referring to).

Even a human translator has the same problem, when s/he sometimes has to work from only the screenplay text without a chance to watch the video.
We sometimes see it on Viki, when we see nonsense translations and we understand it’s a lazy subber who hasn’t watched the episode but only his/her part, or even has translated from Bulk Translations without watching even their own part. (These people won’t ever find a place in my team again)

2 Likes

If AI can “watch and understand” a visual storyline it might be faster than human translators.

For written dialogues in Chinese the he, she, it problem won’t occur because of the written characters. With spoken ta, AI might have the same issues as human translators unless the AI would first watch all episodes and then start translating.

AIs can learn way more background on languages than hobby translators on Viki.

Fun fact:

I had a discussion in offline life with another native speaker of my own language about the definition of a certain term/sentence. That other native speaker had a wrong understanding of their own language.

I asked an LLM about that sentence/term, plus a detailed explanation etc.

The AI understood the meaning 100% and was even able to explain the focus of this specific wording in that sentence.

The native speaker was unable to do so…

Within 2 years the LLM I tried has made so much progress that I think there will be even more progress in the future.

2 Likes

It’s a strange thing. Sometimes they are so accurate and good that you are amazed. And sometimes they make the most stupid mistakes.
As in “State the animals to be found in the Swiss Alps and classify them into carnivores and herbivores”. It lists all the animals, but it puts a couple of (obvious) animals in the wrong category.
Or once I needed all the Oscar winners in the “best film” category in a decade. I told the LLM to give me a list in chronological order. The chronological order was off and one entry was the wrong year.
Or the one I did lately (because of the character limit). I told ChatGPT to translate an English sentence I had prepared into a number of European languages and put next to each sentence, in parentheses, the character count of that sentence including spaces. It got all of them wrong! Someone here in Discussions alerted me to it and I had to check each one manually in MS Word.
It was weird, because it was such an easy task compared to all the other wonderful things they are able to do. I still don’t understand how this could happen, when the prompt was very clear and precise.
What I’m saying is that they are unpredictable. (Yet.) They can make your work much faster, but the result still needs checking afterwards, you cannot trust it blindly.
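
For the character-count part of that task, a few lines of code can do the checking instead of counting by hand. This is only a rough sketch in Python: the function names and the sample sentences are made up for illustration, and it assumes that "character count including spaces" means one count per visible letter, space, or punctuation mark. Accented letters are normalized first so that "é" counts as one character however it was typed.

```python
import unicodedata


def char_count(text: str) -> int:
    """Count characters including spaces.

    NFC normalization merges a base letter and a separate combining
    accent into one character where a combined form exists, so "é"
    counts once regardless of how it was entered.
    """
    return len(unicodedata.normalize("NFC", text))


def annotate(translations: dict[str, str]) -> list[str]:
    """Return one line per language with the count in parentheses."""
    return [
        f"{language}: {sentence} ({char_count(sentence)})"
        for language, sentence in translations.items()
    ]


if __name__ == "__main__":
    # Hypothetical sample sentences, not taken from the thread.
    samples = {
        "German": "Guten Morgen",
        "French": "Bonjour à tous",
    }
    for line in annotate(samples):
        print(line)
```

The translations themselves would still come from a person or be checked by one; the script only takes over the counting, which is the step the model got wrong.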

1 Like