Next: Document Preparation: Authoring Up: Machine Translation in Previous: Introduction

The Scenario

Let us suppose that you are a native English speaker engaged as a professional German-English translator in the Language Centre of a multinational manufacturing company whose product range includes computer products. In this organization the Language Centre is principally responsible for the translation of documents created within the company into a variety of European and Oriental languages. The Language Centre is also charged with exercising control over the content and presentation of company documentation in general. To this end, it attempts to specify standards for the final appearance of documents in distributed form, including style, terminology, and content in general. The overall policy is enshrined in a corporate Document Design and Content Guide, which the Centre periodically updates and revises.

The material for which MT is to be used consists of technical documentation such as User and Repair manuals for software and hardware products manufactured or sourced by the company. Some classes of highly routine internal business correspondence are also submitted for MT. Legal and marketing material, and much external business correspondence, is normally translated by hand, although some translators in the organization prefer to use MT here as well.

All material for translation is available in electronic form on a computer network which supports the company's documentation system. Although most documents will be printed out at some point as standard paper User Manuals and so forth, the system also supports the preparation of multi-media hypertext documents. These are documents which exist primarily in electronic form with a sophisticated cross-reference system; they contain both text and pictures (and perhaps speech and other sounds). These documents are usually distributed to their final users as CD-ROMs, although they can be distributed in other electronic forms, including electronic mail. Printed versions of these documents can also be made.

Everyone in the language department has a workstation (an individual computer). These are linked together by the network. The documentation system which runs on this network allows users to create and modify documents by typing in text; in other words, it provides very sophisticated word processing facilities. It also provides sophisticated means for storing and retrieving electronic documents, and for passing them around the network inside the company or via external networks to external organizations. As is usual with current computer systems, everything is done with the help of a friendly interface based on windows, icons and menus, selections being made with a mouse.

The MT system which you use is called ETRANS and forms part of the overall documentation system. (ETRANS is just a name we have invented for a prototypical MT system.) Parts of an electronic document on the system can be sent to the MT system in the same way that they can be sent to a printer or to another device or facility on the network. ETRANS is simultaneously available from any workstation and, for each person using it, behaves as if it were his or her own personal MT system.

Earlier this morning, one of the technical authors completed (two days after the deadline) a User Manual for a printer the company is about to launch. The text is in German. Although this author works in a building 50 kilometres away, the network ensures that the document is fully accessible from your workstation. What follows is a fragment of the text which you are viewing in a window on the workstation screen and which you are going to translate:

 

As with all the technical documents submitted to ETRANS, all the sentences are relatively short and rather plain. Indeed, the document was written in accordance with the Language Centre document specification and with MT very much in mind. There are no obvious idioms or complicated linguistic constructions. Many or all of the technical terms relating to printers (e.g. Druckdichte 'print density') are in regular use in the company and are stored and defined in paper or electronic dictionaries available to the company's technical authors and translators.

To start up ETRANS, you click on the icon bearing an ETRANS logo, and this pops up a menu giving various translation options. ETRANS handles six languages: English, German, French, Italian, Spanish and Japanese. The printer document needs to be translated into English, so you select English as the target language option. Another menu shows the source language to be used. In this case, there is no need to select German because ETRANS has already had a very quick look at your printer document and decided, given rather superficial criteria such as the presence of umlauts and other characteristics of German orthography, that it is probably German text. If ETRANS had guessed wrongly (as it sometimes does) then you could select the correct source language from the menu yourself. By clicking on an additional menu of ETRANS options, you start it translating in batch or full-text mode; that is, the whole text will be translated automatically without any intervention on your part. The translation starts appearing in a separate screen window more or less immediately. However, since the full source text is quite long, it will take some time to translate it in its entirety. Rather than sit around, you decide to continue with the revision of another translation in another window. You will look at the output as soon as ETRANS has finished translating the first chapter.
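To give a feel for what such "rather superficial criteria" might amount to, here is a minimal sketch in Python. It is not how ETRANS (itself an invented system) works; the function name `guess_source_language`, the cue table `CUES` and the choice of cues are all assumptions made purely for illustration. The idea is simply to count characteristic letters and very common function words for each candidate language and pick the language with the most hits.

```python
import re

# Hypothetical cue lists, chosen only for illustration; a real system
# would rely on much richer evidence than this.
CUES = {
    "German": {
        "chars": "äöüß",
        "words": {"der", "die", "das", "und", "ist", "nicht", "ein", "eine"},
    },
    "French": {
        "chars": "éèêàçùœ",
        "words": {"le", "la", "les", "et", "est", "une", "des", "pas"},
    },
    "English": {
        "chars": "",
        "words": {"the", "and", "is", "of", "to", "not", "a", "an"},
    },
}


def guess_source_language(text: str) -> str:
    """Return the language whose orthographic cues occur most often in text.

    Ties (including a text with no cues at all) go to the language listed
    first in CUES, which is one reason the user must be able to override
    the guess from the menu.
    """
    lowered = text.lower()
    # Split into runs of letters, so accented characters stay inside words.
    words = re.findall(r"[^\W\d_]+", lowered)
    scores = {}
    for language, cues in CUES.items():
        char_hits = sum(lowered.count(ch) for ch in cues["chars"])
        word_hits = sum(1 for word in words if word in cues["words"])
        scores[language] = char_hits + word_hits
    return max(scores, key=scores.get)
```

A call such as `guess_source_language(manual_text)` would return the name of the best-scoring language. Because the evidence is so shallow, short or mixed-language passages can easily be misclassified, which is consistent with the scenario's remark that the guess is sometimes wrong.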

The output of ETRANS can be found on page gif. The quality of this raw output is pretty much as you expect from ETRANS. Most sentences are more or less intelligible even if you don't go back to the German source. (Occasionally a sentence may be completely unintelligible.) The translation is relatively accurate in the sense that it is not misleading: it doesn't lead you to think that the source text says one thing when it really says something quite the opposite. However, the translation is very far from being a good specimen of English. For one thing, ETRANS clearly had difficulties with choosing the correct translation of the German word ein, which has three possible English equivalents: a/an, on and one.

Apart from these details, it has also made quite a mess of a whole phrase:

In order to post-edit such phrases it will be necessary to refer back to the German source text.

 

Leaving ETRANS to continue translating later chapters of the document, you start post-editing the first chapter by opening up a post-edit window, which interleaves a copy of the raw ETRANS output with the corresponding source sentences (e.g. so that each source sentence appears next to its proposed translation). Your workstation screen probably now looks something like the Figure on page gif.
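The interleaving performed by the post-edit window can be pictured with a small Python sketch. The function name `interleave`, the `SRC`/`MT` labels and the sample sentences in the usage note are invented for illustration, and the sketch assumes that source and output have already been split into sentences that correspond one to one.

```python
def interleave(source_sentences, raw_translations):
    """Pair each source sentence with its proposed translation.

    Assumes a one-to-one sentence alignment; anything else is reported
    as an error rather than silently misaligned.
    """
    if len(source_sentences) != len(raw_translations):
        raise ValueError("source and translation differ in sentence count")
    lines = []
    pairs = zip(source_sentences, raw_translations)
    for number, (source, translation) in enumerate(pairs, start=1):
        lines.append(f"[{number}] SRC: {source}")
        lines.append(f"[{number}] MT:  {translation}")
    return lines
```

For instance, `interleave(["Die Druckdichte einstellen."], ["Adjust the print density."])` yields two display lines, the source sentence followed by its raw translation, both tagged with the same sentence number so that the post-editor can see at a glance which belongs to which.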

Icons and menus give access to large scale on-line multilingual dictionaries, either the ones used by ETRANS itself or others specifically intended for human users. You post-edit the raw MT using the range of word-processing functions provided by the document processing system. Using search facilities, you skip through the document looking for all instances of a, an or one, since you know that these are often wrong and may need replacement. (Discussions are in progress with the supplier of ETRANS, who has promised to look into this problem and make improvements.) After two or three other global searches for known problem areas, you start to go through the document making corrections sentence by sentence. The result of this is automatically separated from the source text, and can be displayed in yet another window. Page gif shows what your workstation screen might now look like.
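A global search of this kind is easy to sketch in Python with a regular expression. The names `SUSPECT` and `find_suspect_words` are hypothetical; the only point being made is that the search should match a, an and one as whole words, so that the letter a inside a word such as paper is not flagged.

```python
import re

# Whole-word, case-insensitive match for the three suspect translations.
SUSPECT = re.compile(r"\b(a|an|one)\b", re.IGNORECASE)


def find_suspect_words(sentences):
    """Return (sentence index, character offset, word) for every hit."""
    hits = []
    for index, sentence in enumerate(sentences):
        for match in SUSPECT.finditer(sentence):
            hits.append((index, match.start(), match.group(0)))
    return hits
```

The post-editor would then step through the returned positions one at a time and decide in each case whether the word should stand or be replaced. The search only locates candidates; judging them still requires a look at the German source.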

  
Figure: Translators' Workstation while Post-Editing a Translation

Note that ETRANS has left the document format completely unaltered. It may be that the translation is actually slightly longer (or shorter) than the source text; any necessary adjustment to the pagination of the translation compared to the source is a matter for the document processing system.

After post-editing the remaining text, you have almost completed the entire translation process. Since it is not uncommon for translators to miss some small translation errors introduced by the MT system, you observe company policy by sending your post-edited electronic text to a colleague to have it double-checked. The result is similar to that on page gif.

The only things left to be done are to update the term dictionary, by adding any technical terms that have appeared in the document together with their translations (so that other translators will in future translate them in the same way), and to report any new errors the MT system has committed (with a view to the system being improved in the future).
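The dictionary update can be thought of along the following lines. This is only a sketch under simple assumptions: the term dictionary is modelled as a plain Python mapping from source term to translation, and the function name `update_term_dictionary` is invented. New terms are added, while a term that already has a different translation is left untouched and reported, since consistency across translators is the whole purpose of the dictionary.

```python
def update_term_dictionary(dictionary, new_terms):
    """Add new source-term/translation pairs to dictionary in place.

    Returns a mapping of conflicting terms to (existing, proposed)
    translations; existing entries are never overwritten.
    """
    conflicts = {}
    for source_term, translation in new_terms.items():
        existing = dictionary.get(source_term)
        if existing is None:
            dictionary[source_term] = translation
        elif existing != translation:
            conflicts[source_term] = (existing, translation)
    return conflicts
```

Any conflicts returned would need to be resolved by a person, for example by the Language Centre staff responsible for terminology, rather than settled automatically.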

  
Figure: Translators' Workstation Previewing Output

So that, in outline, is how MT fits into the commercial translation process. Let us review the individuals, entities and processes involved. Proceeding logically, we have as individuals:

In many cases the document management role will be fulfilled by translators or technical authors. For obvious reasons, there will be fairly few individuals who are both technical authors and translators.

 

The important entities in the process are:

Clearly any translation system is likely to be a very complex and sophisticated piece of software; its design at the linguistic level is discussed in detail in other chapters in this book. A detailed discussion of Electronic Documents can be found in Chapter gif.

Finally, the various processes or steps in the whole business are:

The scenario gave a brief flavour of all three steps. We shall now examine each of them in rather more detail.






Arnold D J
Thu Dec 21 10:52:49 GMT 1995