
Popular Conceptions and Misconceptions

 

Some popular misconceptions about MT are discussed in turn below.

``MT is a waste of time because you will never make a machine that can translate Shakespeare''.

The criticism that MT systems cannot, and will never, produce translations of great literature of any great merit is probably correct, but quite beside the point. It certainly does not show that MT is impossible. First, translating literature requires special literary skill --- it is not the kind of thing that the average professional translator normally attempts. So accepting the criticism does not show that automatic translation of non-literary texts is impossible. Second, literary translation is a small proportion of the translation that has to be done, so accepting the criticism does not mean that MT is useless. Finally, one may wonder who would ever want to translate Shakespeare by machine --- it is a job that human translators find challenging and rewarding, and it is not a job that MT systems have been designed for. The criticism that MT systems cannot translate Shakespeare is a bit like criticism of industrial robots for not being able to dance Swan Lake.

``There was/is an MT system which translated The spirit is willing, but the flesh is weak into the Russian equivalent of The vodka is good, but the steak is lousy, and hydraulic ram into the French equivalent of water goat. MT is useless.''

The `spirit is willing' story is amusing, and it really is a pity that it is not true. However, like most MT `howlers', it is a fabrication. In fact, most such stories were in circulation long before any MT system could have produced them. Variants of the `spirit is willing' example can be found in the American press as early as 1956, but sadly there does not seem to have been an MT system in America which could translate from English into Russian until much more recently --- for sound strategic reasons, work in the USA had concentrated on the translation of Russian into English, not the other way round. Of course, there are real MT howlers. Two of the nicest are the translation of French avocat (`advocate', `lawyer' or `barrister') as avocado, and the translation of Les soldats sont dans le café as The soldiers are in the coffee. However, they are not as easy to find as the reader might think, and they certainly do not show that MT is useless.

 

``Generally, the quality of translation you can get from an MT system is very low. This makes them useless in practice.''

Far from being useless, several MT systems are in day-to-day use around the world. Examples include METEO (in daily use since 1977 at the Canadian Meteorological Center in Dorval, Montreal), SYSTRAN (in use at the CEC, and elsewhere), LOGOS, ALPS, ENGSPAN (and SPANAM), METAL, and GLOBALINK. It is true that the number of organizations that use MT on a daily basis is relatively small, but those that do use it benefit considerably. For example, as of 1990, METEO was regularly translating around 45 000 words of weather bulletins every day, from English into French, for transmission to press, radio, and television. In the 1980s, the diesel engine manufacturer Perkins Engines was saving around £4 000 on each diesel engine manual translated (using a PC version of the WEIDNER system). Moreover, overall translation time per manual was more than halved, from around 26 weeks to 9-12 weeks --- this time saving can be very significant commercially, because a product like an engine cannot easily be marketed without user manuals.

Of course, it is true that the quality of many MT systems is low, and probably no existing system can produce really perfect translations. However, this does not make MT useless. First, not every translation has to be perfect. Imagine you have in front of you a Chinese newspaper which you suspect may contain some information of crucial importance to you or your company. Even a very rough translation would help you. Apart from anything else, you would be able to work out which, if any, parts of the paper would be worth getting translated properly. Second, a human translator normally does not immediately produce a perfect translation. It is normal to divide the job of translating a document into two stages. The first stage is to produce a draft translation, i.e. a piece of running text in the target language, which has the most obvious translation problems solved (e.g. choice of terminology), but which is not necessarily perfect. This is then revised --- either by the same translator, or in some large organizations by another translator --- with a view to producing something that is up to standard for the job in hand. This might involve no more than checking, or it might involve quite radical revision aimed at producing something that reads as though written originally in the target language. For the most part, the aim of MT is only to automate the first, draft translation process.

``MT threatens the jobs of translators.''

The quality of translation that is currently possible with MT is one reason why it is wrong to think of MT systems as dehumanizing monsters which will eliminate human translators, or enslave them. MT will not eliminate them, simply because the volume of translation to be performed is so huge, and constantly growing, and because of the limitations of current and foreseeable MT systems. While not an immediate prospect, it could, of course, turn out that MT enslaves human translators, by controlling the translation process, and forcing them to work on the problems it throws up, at its speed. There are no doubt examples of this happening to other professions. However, there are not many such examples, and it is not likely to happen with MT. What is more likely is that the process of producing draft translations, along with the often tedious business of looking up unknown words in dictionaries, and ensuring terminological consistency, will become automated, leaving human translators free to spend time on increasing clarity and improving style, and to translate more important and interesting documents --- editorials rather than weather reports, for example. This idea is borne out in practice: the job satisfaction of the human translators in the Canadian Meteorological Center improved when METEO was installed, and their job became one of checking and trying to find ways to improve the system output, rather than translating the weather bulletins by hand (the concrete effect of this was a greatly reduced turnover in translation staff at the Center).

``The Japanese have developed a system that you can talk to on the phone. It translates what you say into Japanese, and translates the other speaker's replies into English.''

The claim that the Japanese have a speech-to-speech translation system, of the kind described above, is pure science fiction. It is true that speech-to-speech translation is a topic of current research, and there are laboratory prototypes that can deal with a very restricted range of questions. But this research is mainly aimed at investigating how the various technologies involved in speech and language processing can be integrated, and is limited to very restricted domains (hotel bookings, for example), and messages (offering little more than a phrase book in these domains). It will be several years before even this sort of system will be in any sort of real use. This is partly because of the limitations of speech systems, which are currently fine for recognizing isolated words, uttered by a single speaker, for which the system has been specially trained, in quiet conditions, but which do not go far beyond this. However, it is also because of the limitations of the MT system (see later chapters).

``There is an amazing South American Indian language with a structure of such logical perfection that it solves the problem of designing MT systems.''

The South American Indian language story is among the most irritating for MT researchers. First, the point about having a `perfectly logical structure' is almost certainly completely false. Such perfection is mainly in the eye of the beholder --- Diderot was convinced that the word order of French exactly reflected the order of thought, a suggestion that non-French speakers do not find very convincing. What people generally mean by this is that a language is very simple to describe. Now, as far as anyone can tell, all human languages are pretty much as complicated as each other. It's hard to be definite, since the idea of simplicity is difficult to pin down, but the general impression is that if a language has a very simple syntax, for example, it will compensate by having a more complicated morphology (word structure), or phonology (sound structure). However, even if one had a very neat logical language, it is hard to see that this would solve the MT problem, since one would still have to perform automatic translation into, and out of, this language.

``MT systems are machines, and buying an MT system should be very much like buying a car.''

There are really two parts to this misconception. The first relates to the sense in which MT systems are machines. They are, of course, but only in the sense that modern word processors are machines. It is more accurate to think of MT systems as programs that run on computers (which really are machines). Thus, when one talks about buying, modifying, or repairing an MT system, one is talking about buying, modifying or repairing a piece of software. It was not always so --- the earliest MT systems were dedicated machines, and even very recently, there were some MT vendors who tried to sell their systems with specific hardware, but this is becoming a thing of the past. Recent systems can be installed on different types of computers. The second part of the misconception is the idea that one would take an MT system and `drive it away', as one would a car. In fact, this is unlikely to be possible, and a better analogy is with buying a house --- what one buys may be immediately habitable, but there is a considerable amount of work involved in adapting it to one's own special needs. In the case of a house this might involve changes to the decor and plumbing. In the case of an MT system this will involve additions to the dictionaries to deal with the vocabulary of the subject area and possibly the type of text to be translated. There will also be some work involved in integrating the system into the rest of one's document processing environment. More of this in later chapters. The importance of customization, and the fact that changes to the dictionary form a major part of the process, is one reason why we have given a whole chapter to discussion of the dictionary.

Against these misconceptions, we should place the genuine facts about MT.

 

The correct conclusion is that MT, although imperfect, is not only a possibility, but an actuality. But it is important to see the product in a proper perspective, to be aware of its strong points and shortcomings.

Machine Translation started out with the hope and expectation that most of the work of translation could be handled by a system which contained all the information we find in a standard paper bilingual dictionary. Source language words would be replaced with their target language translational equivalents, as determined by the built-in dictionary, and where necessary the order of the words in the input sentences would be rearranged by special rules into something more characteristic of the target language. In effect, correct translations suitable for immediate use would be manufactured in two simple steps. This corresponds to the view that translation is nothing more than word substitution (determined by the dictionary) and reordering (determined by reordering rules).
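The two-step view described above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the four-word English-to-French lexicon, the part-of-speech tags, and the single adjective-noun rule are all invented for the example and are not taken from any actual MT system.

```python
# Toy illustration of the early "dictionary plus reordering" view of MT.
# The lexicon, tag set, and rule below are invented for this sketch.

# Hypothetical English-to-French lexicon: word -> (translation, part of speech)
LEXICON = {
    "the": ("le", "DET"),
    "black": ("noir", "ADJ"),
    "cat": ("chat", "NOUN"),
    "sleeps": ("dort", "VERB"),
}


def substitute(words):
    """Step 1: replace each source word with its dictionary equivalent."""
    # Unknown words are passed through unchanged with an "UNK" tag.
    return [LEXICON.get(w, (w, "UNK")) for w in words]


def reorder(tagged):
    """Step 2: apply one local rule, moving an adjective after its noun."""
    out = list(tagged)
    i = 0
    while i < len(out) - 1:
        if out[i][1] == "ADJ" and out[i + 1][1] == "NOUN":
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2  # skip past the pair just swapped
        else:
            i += 1
    return out


def translate(sentence):
    """Word substitution followed by reordering, and nothing else."""
    tagged = reorder(substitute(sentence.lower().split()))
    return " ".join(word for word, _ in tagged)


print(translate("The black cat sleeps"))  # le chat noir dort
```

The sketch works on this one sentence only because the lexicon was chosen to suit it. It has no way of choosing among several dictionary senses of a word, and no notion of agreement between words.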

Reason and experience show that `good' MT cannot be produced by such delightfully simple means. As all translators know, word-for-word translation doesn't produce a satisfying target language text, not even when some local reordering rules (e.g. for the position of the adjective with regard to the noun which it modifies) have been included in the system. Translating a text requires not only a good knowledge of the vocabulary of both source and target language, but also of their grammar --- the system of rules which specifies which sentences are well-formed in a particular language and which are not. Additionally it requires some element of real world knowledge --- knowledge of the nature of things out in the world and how they work together --- and technical knowledge of the text's subject area. Researchers certainly believe that much can be done to satisfy these requirements, but producing systems which actually do so is far from easy. Most effort in the past 10 years or so has gone into increasing the subtlety, breadth and depth of the linguistic or grammatical knowledge available to systems. We shall take a more detailed look at these developments in due course.

In growing into some sort of maturity, the MT world has also come to realize that the `text in, translation out' assumption --- the assumption that MT is solely a matter of switching on the machine and watching a faultless translation come flying out --- was rather too naive. A translation process starts with providing the MT system with usable input. It is quite common that texts which are submitted for translation need to be adapted (for example, typographically, or in terms of format) before the system can deal with them. And when a text can actually be submitted to an MT system, and the system produces a translation, the output is almost invariably deemed to be grammatically and translationally imperfect. Despite the increased complexity of MT systems they will not --- within the foreseeable future --- be able to handle all types of text reliably and accurately. This normally means that the translation will have to be corrected (post-edited), and usually the person best equipped to do this is a translator.

This means that MT will only be profitable in environments that can exploit the strong points to the full. As a consequence, we see that the main impact of MT in the immediate future will be in large corporate environments where substantial amounts of translation are performed. The implication of this is that MT is not (yet) for the individual self-employed translator working from home, or the untrained lay-person who has the occasional letter to write in French. This is not a matter of cost: MT systems sell at anywhere from a few hundred pounds upwards. It is a matter of effective use. The aim of MT is to achieve faster, and thus cheaper, translation. The lay-person or self-employed translator would probably have to spend so much time on dictionary updating and/or post-editing that MT would not be worthwhile. There is also the problem of getting input texts in machine readable form, since otherwise the effort of typing will outweigh any gains of automation. The real gains come from integrating the MT system into the whole document processing environment (discussed in a later chapter), and they are greatest when several users can share, for example, the effort of updating dictionaries, the efficiencies of avoiding unnecessary retranslation, and the benefits of terminological consistency.
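One way to picture how several users might share the benefit of avoiding unnecessary retranslation is a common store of sentences that have already been translated. The sketch below is a minimal illustration under stated assumptions: the class name `SharedTranslationStore`, the whitespace normalisation, and the stand-in `fake_engine` function are all hypothetical and do not describe how any particular MT product works.

```python
class SharedTranslationStore:
    """Hypothetical shared store: each distinct sentence is translated once."""

    def __init__(self, translate_fn):
        self._translate_fn = translate_fn  # the (assumed) underlying MT engine
        self._memory = {}                  # source sentence -> stored translation
        self.engine_calls = 0              # how often the engine was really used

    def translate(self, sentence):
        # Normalise whitespace so trivially different copies share one entry.
        key = " ".join(sentence.split())
        if key not in self._memory:
            self.engine_calls += 1
            self._memory[key] = self._translate_fn(key)
        return self._memory[key]


def fake_engine(sentence):
    """Stand-in for a real MT system; it only upper-cases the input."""
    return sentence.upper()


store = SharedTranslationStore(fake_engine)
requests = [
    "Check the oil level.",    # first user
    "Check  the oil level.",   # second user, same sentence with extra spacing
    "Replace the filter.",     # third user, new sentence
]
for sentence in requests:
    store.translate(sentence)

print(store.engine_calls)  # 2
```

Because every user draws on the same stored translations, a repeated sentence is also rendered the same way each time, which is one simple source of terminological consistency.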

Most of this book is about MT today, and to some extent tomorrow. But MT is a subject with an interesting and dramatic past, and it is well worth a brief description.






Arnold D J
Thu Dec 21 10:52:49 GMT 1995