Implementing MT in the Video Game Localization Workflow at Electronic Arts

This week’s contribution is part of an article by Cristina Anselmi and Inés Rubio that was published in Multilingual. If you want to know more about the basic principles of MT applied to video game localization, check out our previous article!

Electronic Arts owns a large portfolio of titles and its localization needs grow constantly. With millions of words requiring translation every year, the localization team strives to provide the best quality in-game translations while being efficient in terms of both speed and cost. EA had been investigating MT for several years before we decided to finally implement it in April 2019. We wanted to be certain we had all the elements and knowledge needed to face this change, as it would be a revolution not only in the workflow but especially in the mindset of the team.

Categorization of the content

We started by categorizing all of our text types in order to understand which ones would be more suitable for MT and, based on that, which level of quality we could expect to achieve for each of them. We identified eight main text types based on the gaming experience they inhabited: player feedback, customer support, back translation (from a non-English source), game content, websites, tutorials/user guides, live chat and translation for information. Then we applied three main criteria to establish the potential for MT implementation: utility of the content, delivery speed, and sentiment, which considers the emotional engagement of the player.
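
A decision like this can be thought of as a simple scoring exercise. The sketch below is purely illustrative: the text types are taken from the article, but the scores, weights and the suitability formula are invented for the example, not EA's actual criteria.

```python
# Hypothetical sketch: ranking text types by MT suitability using the
# three criteria named above. All numeric scores are invented examples.

CRITERIA = ("utility", "delivery_speed", "sentiment")

# 1-5 scores per criterion. High utility and delivery-speed pressure favor
# MT; high sentiment (emotional engagement) counts against it.
text_types = {
    "customer_support": {"utility": 5, "delivery_speed": 5, "sentiment": 2},
    "in_game_content":  {"utility": 4, "delivery_speed": 3, "sentiment": 5},
    "user_guides":      {"utility": 5, "delivery_speed": 4, "sentiment": 1},
}

def mt_suitability(scores):
    # Invert the sentiment contribution: emotionally engaging text
    # needs more human craft, so it lowers the MT-suitability score.
    return scores["utility"] + scores["delivery_speed"] + (6 - scores["sentiment"])

ranked = sorted(text_types, key=lambda t: mt_suitability(text_types[t]), reverse=True)
print(ranked)  # emotionally rich in-game content ranks last
```

The exact formula matters less than the habit it encodes: scoring every text type against the same criteria before deciding where MT goes first.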


Choosing the right MT provider

Once we had identified the text, we started looking into different providers with the following selection criteria in mind:

  • The customizability of the engine. We wanted a provider that enabled us to internally control the customization of the engine. Once we could do this, we would be completely autonomous. Our own deep knowledge of our TMs gave us a great advantage in understanding how to best use them for the type of training needed to build an MT model.
  • Connectivity with computer-assisted translation (CAT) tool. The aim of MT implementation at Electronic Arts was to simplify and automate the workflow further. Thus, one important criterion during the selection process was the seamless connection with our current CAT through an API, in order to avoid the creation of additional steps that would complicate the workflow.
  • Quality of the raw output. Before choosing the provider, we made certain to run enough tests and to benchmark the quality levels we wanted to achieve for each type of text. Our aim was an output quality needing the least number of edits possible, as we don’t publish anything without post-editing.
  • Cost. MT is neither free nor cheap, and its implementation requires a significant investment in resources and time. A potential cost reduction in production, combined with controlled system maintenance costs, opens the door to increasing the scope of localization, for example by adding language pairs.
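
The connectivity criterion above usually comes down to the CAT tool POSTing segments to the provider's REST endpoint and merging the translations back. The sketch below only assembles such a request body; the field names, model ID and schema are hypothetical, since every provider defines its own API.

```python
import json

# Hypothetical sketch of the payload a CAT tool might send to an MT
# provider's API. All field names and the model ID are invented here;
# they are not any specific provider's schema.

def build_mt_request(segments, source_lang, target_lang, model_id):
    """Assemble a JSON body for a batch translation request."""
    return json.dumps({
        "model": model_id,            # e.g. a customized engine ID
        "source": source_lang,
        "target": target_lang,
        "segments": [{"id": i, "text": s} for i, s in enumerate(segments)],
    })

body = build_mt_request(["Press X to jump."], "en", "de", "generic-v1")
print(body)
```

Keeping this exchange inside the CAT tool, rather than exporting and re-importing files, is what avoids the extra workflow steps the article warns about.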

Building Language Models

Once we chose the provider, we created several language models. Our initial approach was to identify one project where MT would be applicable. After careful analysis, we decided not to limit the implementation to a single project but to pick a variety of texts, risky as that was, in order to understand which challenges we would face. We picked about 24 projects, spanning customer support text, marketing text, internal documentation, and in-game text.

Our provider only allowed us to build a language model on top of an existing one, rather than creating one from scratch. Since customer support text, which covers a variety of games, represents the biggest volume of the content we wanted to apply MT to, we decided to first create engines we called “generic,” meaning trained with almost all of our TMs, properly prepared and cleaned according to specific criteria. This can be challenging: such engines can perform very well with generic texts that don’t contain a significant amount of IP-specific terminology, but they do not perform well with games in which terminology plays a very important role. With this in mind, we started testing texts selected from a mix of projects to assess the quality of the output and identify the potential risks of MT for each type of project.

Quality evaluation

We decided to combine two approaches in our output quality evaluation. On top of the already mentioned BLEU evaluation method, which we controlled by carefully selecting the reference to compare the MT output to, we added a human qualitative style and linguistic evaluation that consists of giving a score from one to five for fluency and adequacy. For fluency, the linguist assessed the grammaticality of the output, checking for collocation errors, style pitfalls and unnatural language; for adequacy, how much of the meaning and emotion expressed in the source text was present in each of the target language translations.
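
To make the automatic half of this concrete, here is a minimal sentence-level BLEU in pure Python: modified n-gram precision up to bigrams with a brevity penalty. Production evaluations normally use corpus-level BLEU up to 4-grams (e.g. via sacreBLEU); this stripped-down version only illustrates the mechanics.

```python
import math
from collections import Counter

# Minimal sentence-level BLEU: n-gram precision up to bigrams plus a
# brevity penalty. Illustrative only; not a production-grade scorer.

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=2):
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_ngrams, ref_ngrams = ngrams(cand, n), ngrams(ref, n)
        overlap = sum((cand_ngrams & ref_ngrams).values())  # clipped matches
        total = max(sum(cand_ngrams.values()), 1)
        precisions.append(max(overlap, 1e-9) / total)  # smooth zero counts
    # Brevity penalty: punish candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

print(round(bleu("press X to jump", "press X to jump"), 2))  # identical -> 1.0
```

Note how much the score depends on the chosen reference, which is exactly why the article stresses controlling that choice, and why the human fluency/adequacy pass is kept alongside the metric.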

Test results

We picked a text for each project and language we wanted to apply MT to (24 projects in total and 27 languages) and had it evaluated by both internal and external linguists to obtain an averaged, less biased score. Not all of the languages evaluated reached the output quality we were hoping for, but we decided to implement them anyway during our trials. By doing so, we could gather more data to improve the quality, thanks to the feedback coming from post-editors.

While we risked creating player dissatisfaction, especially for the low-quality languages, we decided to take the first step toward a big change. Thanks to the good relationship we have with both our internal and external partners, we achieved functional quality and gathered valuable feedback that helped us improve our MT processes and output quality in all languages. This experience helped us readjust our MT rollout strategy and define a plan for the following months to improve the quality even further, as this is both a long and delicate process.

Continuous improvement

After the implementation, we analyzed the feedback systematically. We are now able to leverage it not only to improve the quality of specific languages but also to shape our next MT strategy. It has allowed us to categorize mistakes according to what can be fixed by training the engine (mistranslations, grammar issues, terminology and everything else connected to language and style) and what can be fixed by improving the CAT integration (for instance, issues with tags and variables, glossary violations, and spacing). We are also now considering and testing franchise- and game-customized engines, targeting increased stylistic and tonal accuracy in the output text and refining the process of cleaning our data for training.
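
The two fix buckets described above lend themselves to a simple triage step. The sketch below routes post-editor issue labels to an owner; the label names and routing targets are illustrative, not EA's actual taxonomy.

```python
# Hypothetical sketch: routing post-editor feedback into the two fix
# buckets described above. Label names are invented for illustration.

ENGINE_TRAINING_ISSUES = {"mistranslation", "grammar", "terminology", "style"}
CAT_INTEGRATION_ISSUES = {"tags", "variables", "glossary_violation", "spacing"}

def route_issue(issue_type):
    """Decide which team or process should own a reported issue."""
    if issue_type in ENGINE_TRAINING_ISSUES:
        return "retrain engine"
    if issue_type in CAT_INTEGRATION_ISSUES:
        return "improve CAT integration"
    return "triage manually"

feedback = ["terminology", "tags", "spacing", "mistranslation", "latency"]
plan = {issue: route_issue(issue) for issue in feedback}
print(plan)
```

Tagging every piece of feedback this way is what turns scattered post-editor comments into a prioritized backlog for retraining versus tooling work.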

One important concept we kept in mind when we started planning the implementation of MT for our games is that it is not meant in any way to replace human translators, as the current state of MT doesn’t allow the publication of text without human intervention, especially for in-game text. That is why the workflow we have in place now includes a combination of TMs and MT, followed by post-editing and a linguistic quality assurance review by our localization testers.

Recapping and looking forward

The implementation of MT can allow video game publishers to accelerate time-to-market and to increase the pool of languages offered, creating a new competitive edge. MT implementation in games is a long and iterative journey, and it is important to bear this in mind. Game text is very peculiar and can vary a lot from IP to IP, making it complicated to find one perfect engine and provider able to service all genres and all languages. We encourage video game localization teams to start investing in the available testing options now and to iterate until they reach the expected results.

One of the main barriers to the implementation of MT in the gaming industry is the lack of available trained professionals. A good post-editing team is essential to reaching the cost and time margins pursued by this technology. This team must also possess a deep understanding of sensible quality, cost and time expectations, which are all key to managing stakeholders. Solid and unbiased metrics and processes will be your allies in this journey, and fortunately, the localization industry has made great progress here already.

If you are interested in knowing the next steps in the MT implementation journey at EA, don’t miss Cristina’s presentation at Game Global 2020.

About the Authors

Cristina Anselmi
Electronic Arts

Inés Rubio
Sr. Global Translation Manager at National Instruments

Inés Rubio is a localization veteran with experience in ecommerce and marketing and 15 years of gaming industry background. Her interest in process improvement, centered around technology and outsourcing solutions, drove her to research potential machine translation applications to game localization before the Google Neural Machine Translation landmark. Inés currently applies her passion to translation and localization processes at National Instruments, a leader in software-defined automated test and measurement.