Translating DITA Content at Teradata

In product localization, resource files such as Java properties files contain the user interface (UI) elements to be translated. Each UI element is a key-value pair: the product uses the unique key to identify the element and then extracts the text value associated with it. UI localization means translating these text values within the resource file. At Teradata, we store and manage these unique keys along with the text values in our translation memories (TMs). When we translate a new version of the product, we can “migrate” translations from previous versions using those unique keys. Migration is an engineering process, not a translation process performed by human translators.
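The key-based migration described above can be sketched roughly as follows. This is a minimal illustration, not Teradata's implementation: the function name `migrate`, the sample keys, and the rule that a translation is reused only when both the key and the source text are unchanged are assumptions made for the example.

```python
def migrate(prev_source, prev_target, new_source):
    """Split new_source into migrated translations and text still to translate.

    Assumption for this sketch: a previous translation is reused only when
    the key exists in the previous version AND its source text is unchanged.
    """
    migrated, pending = {}, {}
    for key, text in new_source.items():
        if prev_source.get(key) == text and key in prev_target:
            migrated[key] = prev_target[key]   # same key, same source text
        else:
            pending[key] = text                # new or changed: needs translation
    return migrated, pending


# Hypothetical key-value pairs, as they might appear in a properties file.
prev_source = {"login.title": "Sign in", "login.help": "Forgot your password?"}
prev_target = {"login.title": "サインイン", "login.help": "パスワードをお忘れですか?"}
new_source = {
    "login.title": "Sign in",             # unchanged
    "login.help": "Reset your password",  # source text changed
    "login.sso": "Use single sign-on",    # new key
}

migrated, pending = migrate(prev_source, prev_target, new_source)
print(migrated)  # only login.title is migrated
print(pending)   # login.help and login.sso go to translation
```

Only the unchanged entry is carried over; changed and new entries are left for translation.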

We apply the same concept to DITA translation. We parse each DITA source file and break it down into its XML elements. The CMS we use at Teradata supports a localization ID, which is a unique key for each element within a single DITA source file. Because DITA is structured XML, DITA elements differ slightly from UI elements. Like a UI element, however, each DITA element has a key and a text value to translate, so we can apply the same “translation migration” concept to DITA that we use for UI translation. Translation migration is different from the pre-translation performed by translation software such as Trados. Pre-translation looks only at the text: it uses various text-matching techniques to find the same text in the TMs and reuse its translation. Translation migration reuses translations programmatically based on both the element’s unique key and its source text, so it is more precise and reliable than pre-translation.
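To make the difference between migration and pre-translation concrete, here is a small sketch under stated assumptions. The attribute name `locid`, the sample topic, and the TM layout (a dictionary keyed by ID and source text) are hypothetical; the abstract does not name the actual attribute or TM format. Pre-translation is reduced to exact text matching here, whereas real tools also use fuzzy matching, and inline elements are ignored.

```python
import xml.etree.ElementTree as ET

# Hypothetical DITA topic; "locid" stands in for the CMS localization ID.
DITA = """<topic id="t1">
  <title locid="a1">Installing the client</title>
  <body>
    <p locid="a2">Run the installer.</p>
    <p locid="a4">Run the installer.</p>
    <p locid="a3">Restart the system.</p>
  </body>
</topic>"""


def extract_elements(xml_text, id_attr="locid"):
    """Map each localization ID to the element's leading text (no inlines)."""
    root = ET.fromstring(xml_text)
    return {
        el.get(id_attr): (el.text or "").strip()
        for el in root.iter()
        if el.get(id_attr) is not None
    }


def classify(elements, tm):
    """Try migration (ID + text) first, then text-only pre-translation."""
    by_text = {src: tgt for (_, src), tgt in tm.items()}
    result = {}
    for locid, src in elements.items():
        if (locid, src) in tm:
            result[locid] = ("migrated", tm[(locid, src)])
        elif src in by_text:
            result[locid] = ("pre-translated", by_text[src])
        else:
            result[locid] = ("new", None)
    return result


# TM entries from a previous version, keyed by (localization ID, source text).
tm = {
    ("a1", "Installing the client"): "クライアントのインストール",
    ("a2", "Run the installer."): "インストーラを実行します。",
}

elements = extract_elements(DITA)
status = classify(elements, tm)
for locid, (kind, _) in status.items():
    print(locid, kind)
```

In this sample, `a1` and `a2` are migrated because both ID and text match, `a4` is only pre-translated because its text matches but its ID is new, and `a3` has no match at all.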

In the presentation, I will discuss how to identify and define DITA elements for translation, including inline elements, and then discuss translation migration. We have also developed pre-translation for DITA elements. In addition, we customize neural machine translation (NMT) to handle source text that contains inline elements and to apply Teradata terminology and preferred translations, which improves translation quality.
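One common way to pass text with inline elements through machine translation is to replace the tags with placeholders before translation and restore them afterward. The sketch below illustrates that general technique only; it is not necessarily how the Teradata NMT customization works. The helper names and the `{n}` placeholder format are assumptions, and the translated string is hard-coded in place of a real NMT call.

```python
import re

TAG = re.compile(r"<[^>]+>")          # any inline start or end tag
PLACEHOLDER = re.compile(r"\{(\d+)\}")


def mask_inlines(text):
    """Replace each inline tag with a numbered placeholder such as {0}."""
    tags = []

    def repl(match):
        tags.append(match.group(0))
        return "{%d}" % (len(tags) - 1)

    return TAG.sub(repl, text), tags


def unmask_inlines(text, tags):
    """Put the original tags back in place of their placeholders."""
    return PLACEHOLDER.sub(lambda m: tags[int(m.group(1))], text)


source = "Click <uicontrol>Save</uicontrol> to apply."
masked, tags = mask_inlines(source)
print(masked)  # Click {0}Save{1} to apply.

# Stand-in for the NMT output: placeholders survive, word order may change.
translated_masked = "{0}Save{1} をクリックして適用します。"
print(unmask_inlines(translated_masked, tags))
```

Because the placeholders carry an index, the tags can be restored correctly even when the target language moves the tagged phrase to a different position in the sentence.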

Lastly, we have developed and implemented automation of these localization processes in house. We now run DITA translation automation to perform the following translation steps:

  • Element identification
  • Translation migration
  • Pre-translation
  • Customized NMT

As a result, we can translate a new version of a document very quickly, with high quality and at lower cost.

What can the audience expect to learn?

We have a unique localization solution that does not depend on a commercial translation management system (TMS) or on translation software such as Trados. Even with DITA XML files, we can identify and process elements and apply translation migration, a concept that is critical for translation quality, cost savings, and automation. NMT customization is another critical feature for element-based DITA translation. These features should help the audience understand what is possible in DITA translation.

Meet the presenter

Tak Takahashi has been working on internationalization and localization at Teradata San Diego as a Globalization SME for more than 20 years. He holds a degree in Architectural Engineering. He is also an architect and a computer engineer certified by the Japanese government.
