Some popular misconceptions about MT are listed on page
.
We will discuss them in turn.
``MT is a waste of time because you will never make a machine that can
translate Shakespeare''.
The criticism that MT systems cannot, and will never, produce translations of great literature of any great merit is probably correct, but quite beside the point. It certainly does not show that MT is impossible. First, translating literature requires special literary skill --- it is not the kind of thing that the average professional translator normally attempts. So accepting the criticism does not show that automatic translation of non-literary texts is impossible. Second, literary translation is a small proportion of the translation that has to be done, so accepting the criticism does not mean that MT is useless. Finally, one may wonder who would ever want to translate Shakespeare by machine --- it is a job that human translators find challenging and rewarding, and it is not a job that MT systems have been designed for. The criticism that MT systems cannot translate Shakespeare is a bit like criticism of industrial robots for not being able to dance Swan Lake.
``There was/is an MT system which translated The spirit
is willing, but the flesh is weak into the Russian equivalent of
The vodka is good, but the steak is lousy, and hydraulic
ram into the French equivalent of water goat. MT is
useless.''
The `spirit is willing' story is amusing, and it really is a pity that it is not true. However, like most MT `howlers' it is a fabrication. In fact, for the most part, they were in circulation long before any MT system could have produced them (variants of the `spirit is willing' example can be found in the American press as early as 1956, but sadly, there does not seem to have been an MT system in America which could translate from English into Russian until much more recently --- for sound strategic reasons, work in the USA had concentrated on the translation of Russian into English, not the other way round). Of course, there are real MT howlers. Two of the nicest are the translation of French avocat (`advocate', `lawyer' or `barrister') as avocado, and the translation of Les soldats sont dans le café as The soldiers are in the coffee. However, they are not as easy to find as the reader might think, and they certainly do not show that MT is useless.
``Generally, the quality of translation you can get from an MT system
is very low. This makes them useless in practice.''
Far from being useless, several MT systems are in day-to-day use around the world. Examples include METEO (in daily use since 1977 at the Canadian Meteorological Center in Dorval, Montreal), SYSTRAN (in use at the CEC, and elsewhere), LOGOS, ALPS, ENGSPAN (and SPANAM), METAL, and GLOBALINK. It is true that the number of organizations that use MT on a daily basis is relatively small, but those that do use it benefit considerably. For example, as of 1990, METEO was regularly translating around 45 000 words of weather bulletins every day, from English into French, for transmission to press, radio, and television. In the 1980s, the diesel engine manufacturer Perkins Engines was saving around £4 000 on each diesel engine manual translated (using a PC version of the WEIDNER system). Moreover, overall translation time per manual was more than halved, from around 26 weeks to 9-12 weeks --- this time saving can be very significant commercially, because a product like an engine cannot easily be marketed without user manuals.
Of course, it is true that the quality of many MT systems is low, and
probably no existing system can produce really perfect
translations.
However, this does not make MT useless. First, not every translation
has to be perfect. Imagine you have in front of you a Chinese
newspaper which you suspect may contain some information of crucial
importance to you or your company. Even a very rough translation would
help you. Apart from anything else, you would be able to work out
which, if any, parts of the paper would be worth getting translated
properly. Second, a human translator normally does not immediately
produce a perfect translation. It is normal to divide the job of
translating a document into two stages. The first stage is to produce
a draft translation, i.e. a piece of running text in the target
language, which has the most obvious translation problems solved (e.g.
choice of terminology, etc.), but which is not necessarily perfect.
This is then revised --- either by the same translator, or in some
large organizations by another translator --- with a view to producing
something that is up to standard for the job in hand. This might
involve no more than checking, or it might involve quite radical
revision aimed at producing something that reads as though written
originally in the target language. For the most part, the aim of MT is
only to automate the first, draft translation process.
``MT threatens the jobs of translators.''
The quality of translation that is currently possible with MT is one reason why it is wrong to think of MT systems as dehumanizing monsters which will eliminate human translators, or enslave them. It will not eliminate them, simply because the volume of translation to be performed is so huge, and constantly growing, and because of the limitations of current and foreseeable MT systems. While not an immediate prospect, it could, of course, turn out that MT enslaves human translators, by controlling the translation process, and forcing them to work on the problems it throws up, at its speed. There are no doubt examples of this happening to other professions. However, there are not many such examples, and it is not likely to happen with MT. What is more likely is that the process of producing draft translations, along with the often tedious business of looking up unknown words in dictionaries, and ensuring terminological consistency, will become automated, leaving human translators free to spend time on increasing clarity and improving style, and to translate more important and interesting documents --- editorials rather than weather reports, for example. This idea is borne out in practice: the job satisfaction of the human translators in the Canadian Meteorological Center improved when METEO was installed, and their job became one of checking and trying to find ways to improve the system output, rather than translating the weather bulletins by hand (the concrete effect of this was a greatly reduced turnover in translation staff at the Center).
``The Japanese have developed a system that you can talk to on the
phone. It translates what you say into Japanese, and translates the
other speaker's replies into English.''
The claim that the Japanese have a speech-to-speech translation system, of the kind described above, is pure science fiction. It is true that speech-to-speech translation is a topic of current research, and there are laboratory prototypes that can deal with a very restricted range of questions. But this research is mainly aimed at investigating how the various technologies involved in speech and language processing can be integrated, and is limited to very restricted domains (hotel bookings, for example), and messages (offering little more than a phrase book in these domains). It will be several years before even this sort of system will be in any sort of real use. This is partly because of the limitations of speech systems, which are currently fine for recognizing isolated words, uttered by a single speaker, for which the system has been specially trained, in quiet conditions, but which do not go far beyond this. However, it is also because of the limitations of the MT systems themselves (see later chapters).
``There is an amazing South American Indian language
with a structure of such logical perfection that it solves the
problem of designing MT systems.''
The South American Indian language story is among the most irritating
for MT researchers. First, the point about having a `perfectly logical
structure' is almost certainly completely false. Such perfection is
mainly in the eye of the beholder --- Diderot was convinced that the
word order of French exactly reflected the order of thought, a
suggestion that non-French speakers do not find very convincing. What
people generally mean by this is that a language is very simple to
describe. Now, as far as anyone can tell, all human languages are
pretty much as complicated as each other. It's hard to be definite,
since the idea of simplicity is difficult to pin down, but the general
impression is that if a language has a very simple syntax, for
example, it will compensate by having a more complicated morphology
(word structure), or phonology (sound structure).
However, even if one had a very
neat logical language, it is hard to see that this would solve the MT
problem, since one would still have to perform automatic translation
into, and out of, this language.
``MT systems are machines, and buying an MT system should
be very much like buying a car.''
There are really two
parts to this misconception. The first relates to the sense in which
MT systems are machines. They are, of course, but only in the sense
that modern word processors are machines. It is more accurate to think
of MT systems as programs that run on computers (which really
are machines). Thus, when one talks about buying, modifying, or
repairing an MT system, one is talking about buying, modifying or
repairing a piece of software. It was not always so --- the
earliest MT systems were dedicated machines, and even very recently,
there were
some MT vendors who tried to sell their systems with specific hardware,
but this is becoming a thing of the past. Recent systems can be
installed on different types of computers. The second part of the
misconception is the idea that one would take an MT system and `drive
it away', as one would a car. In fact, this is unlikely to be
possible, and a better analogy is with buying a house --- what one
buys may be immediately habitable, but there is a considerable amount
of work involved in adapting it to one's own special needs. In the
case of a house this might involve changes to the decor and plumbing.
In the case of an MT system this will involve additions to the
dictionaries to deal with the vocabulary of the subject area and
possibly the type of text to be translated. There will also be some
work involved in integrating the system into the rest of one's
document processing environment. More of this in
Chapters
and
. The importance of
customization, and the fact that changes to
the dictionary form a major part of the process, is one reason why we
have given a whole chapter to discussion of the dictionary
(Chapter
).
Against these misconceptions, we should place the genuine facts about
MT. These are listed on page
.
The correct conclusion is that MT, although imperfect, is not only a possibility, but an actuality. But it is important to see the product in a proper perspective, to be aware of its strong points and shortcomings.
Machine Translation started out with the hope and expectation that most of the work of translation could be handled by a system which contained all the information we find in a standard paper bilingual dictionary. Source language words would be replaced with their target language translational equivalents, as determined by the built-in dictionary, and where necessary the order of the words in the input sentences would be rearranged by special rules into something more characteristic of the target language. In effect, correct translations suitable for immediate use would be manufactured in two simple steps. This corresponds to the view that translation is nothing more than word substitution (determined by the dictionary) and reordering (determined by reordering rules).
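The two-step view just described can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the four-word English-to-French word list, the part-of-speech labels, the function names, and the single adjective-noun reordering rule are all invented for the example, and are not taken from any real MT system.

```python
# Toy illustration of the "substitute and reorder" view of translation.
# The word list and the single reordering rule are invented for this sketch.

# Hypothetical English-to-French dictionary: word -> (translation, part of speech)
LEXICON = {
    "the": ("le", "DET"),
    "black": ("noir", "ADJ"),
    "cat": ("chat", "NOUN"),
    "sleeps": ("dort", "VERB"),
}


def substitute(words):
    """Step 1: replace each source word with its dictionary equivalent.

    Words missing from the dictionary are passed through unchanged.
    """
    return [LEXICON.get(w, (w, "UNK")) for w in words]


def reorder(tagged):
    """Step 2: apply one local rule, moving an adjective after the noun it precedes."""
    out = list(tagged)
    i = 0
    while i < len(out) - 1:
        if out[i][1] == "ADJ" and out[i + 1][1] == "NOUN":
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2  # skip past the pair that was just swapped
        else:
            i += 1
    return out


def translate(sentence):
    """Word substitution followed by local reordering, and nothing else."""
    words = sentence.lower().split()
    return " ".join(word for word, _ in reorder(substitute(words)))


print(translate("The black cat sleeps"))  # prints "le chat noir dort"
```

The sketch handles this one sentence only because the word list was chosen to suit it. It has no notion of agreement, of words with more than one translation, or of anything beyond adjacent words.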
Reason and experience show that `good' MT cannot be produced by such delightfully simple means. As all translators know, word for word translation doesn't produce a satisfying target language text, not even when some local reordering rules (e.g. for the position of the adjective with regard to the noun which it modifies) have been included in the system. Translating a text requires not only a good knowledge of the vocabulary of both source and target language, but also of their grammar --- the system of rules which specifies which sentences are well-formed in a particular language and which are not. Additionally it requires some element of real world knowledge --- knowledge of the nature of things out in the world and how they work together --- and technical knowledge of the text's subject area. Researchers certainly believe that much can be done to satisfy these requirements, but producing systems which actually do so is far from easy. Most effort in the past 10 years or so has gone into increasing the subtlety, breadth and depth of the linguistic or grammatical knowledge available to systems. We shall take a more detailed look at these developments in due course.
In growing into some sort of maturity, the MT world has also come to
realize that the `text in, translation out' assumption
--- the assumption that MT is solely a matter of switching on the
machine and watching a faultless translation come flying out --- was
rather too naive. A translation process starts with providing the MT
system with usable input. It is quite common that texts which
are submitted for translation need to be adapted (for example,
typographically, or in terms of format) before the system can deal
with them. And when a text can actually be submitted to an MT system,
and the system produces a translation, the output is almost invariably
deemed to be grammatically and translationally imperfect. Despite the
increased complexity of MT systems, they will never --- within the
foreseeable future --- be able to handle all types of text
reliably and accurately. This normally means that the translation will
have to be corrected (post-edited) and usually the person best
equipped to do this is a translator.
This means that MT will only be profitable in environments that can
exploit the strong points to the full. As a consequence, we see that
the main impact of MT in the immediate future will be in large
corporate environments where substantial amounts of translation are
performed. The implication of this is that MT is not (yet) for the
individual self-employed translator working from home, or the
untrained lay-person who has the occasional letter to write in French.
This is not a matter of cost: MT systems sell at anywhere between a
few hundred pounds and over . It is a
matter of effective use. The aim of MT is to achieve faster, and thus
cheaper, translation. The lay-person or self-employed translator would
probably have to spend so much time on dictionary updating and/or
post-editing that MT would not be worthwhile. There is also the
problem of getting input texts in machine readable form, otherwise the
effort of typing will outweigh any gains of automation. The real gains
come from integrating the MT system into the whole document processing
environment (see Chapter
), and they are greatest when
several users can share, for example, the effort of updating dictionaries,
efficiencies of avoiding unnecessary retranslation, and the benefits
of terminological consistency.
Most of this book is about MT today, and to some extent tomorrow. But MT is a subject with an interesting and dramatic past, and it is well worth a brief description.