There is some dispute about who first had the idea of translating automatically between human languages, but the actual development of MT can be traced to conversations and correspondence in 1947 between Andrew D. Booth, a British crystallographer, and Warren Weaver of the Rockefeller Foundation, and more specifically to a memorandum Weaver wrote to the Rockefeller Foundation in 1949, which included the following two sentences.
``I have a text in front of me which is written in Russian but I am going to pretend that it is really written in English and that it has been coded in some strange symbols. All I need to do is strip off the code in order to retrieve the information contained in the text.''

The analogy between translation and decoding may strike the sophisticated reader as simplistic (however complicated coding gets, it is still basically a one-for-one substitution process where there is only one right answer --- translation is a far more complex and subtle business), and later in the memorandum Weaver proposed some other, more sophisticated views. But the analogy had the virtue of turning an apparently difficult task into one that could be approached with the emergent computer technology (there had been considerable success in using computers in cryptography during the Second World War). This memorandum sparked a significant amount of interest and research, and by the early 1950s there was a large number of research groups working in Europe and the USA, representing a significant financial investment. But, despite some success, and the fact that many research questions were raised which remain important to this day, there was widespread disappointment on the part of funding authorities at the return on this investment, and doubt about the possibility of automating translation in general, or at least in the then current state of knowledge.
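The difference between the two tasks is easy to see even in a toy setting. The following sketch (our own illustration, using an invented miniature dictionary, not anything from Weaver's memorandum) contrasts decoding, where each symbol has exactly one right answer, with word-for-word substitution between languages, where most words offer several candidates and nothing in the "code" itself says which to choose.

```python
# A minimal sketch of the decoding analogy and its limits.
# The dictionary below is a made-up toy, for illustration only.

import string

def decode_caesar(ciphertext: str, shift: int = 3) -> str:
    """Invert a Caesar cipher: a bijection, so there is one right answer."""
    plain = string.ascii_lowercase
    cipher = plain[shift:] + plain[:shift]
    return ciphertext.translate(str.maketrans(cipher, plain))

print(decode_caesar("khoor"))  # -> "hello": deterministic, no choices to make

# "Translation" by substitution: a toy English-French word list.
# Many words map to several candidates, and the word alone cannot
# tell us which candidate is correct.
toy_dictionary = {
    "pen": ["stylo", "parc"],   # writing instrument vs. play-pen
    "the": ["le", "la"],        # gender must come from elsewhere
    "box": ["boîte"],
}

for word in "the box was in the pen".split():
    print(word, "->", toy_dictionary.get(word, ["?"]))
# Every ambiguous word yields a set of candidates, not an answer:
# the one-for-one substitution model has nothing to choose with.
```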
The theoretical doubts were voiced most clearly by the philosopher Bar-Hillel in a 1959 report, where he argued that fully automatic, high quality MT (FAHQMT) was impossible, not just at present, but in principle. The problem he raised was that of finding the right translation for pen in a context like the following:

Little John was looking for his toy box. Finally he found it. The box was in the pen. John was very happy.
The argument was that (i) here pen could only have the play-pen interpretation, not the alternative writing instrument interpretation, (ii) this could be critical in deciding the correct translation for pen, (iii) discovering this depends on general knowledge about the world, and (iv) there could be no way of building such knowledge into a computer. Some of these points are well taken. Perhaps FAHQMT is impossible. But this does not mean that any form of MT is impossible or useless, and in a later chapter we will look at some of the ways one might go about solving this problem. Nevertheless, historically, this argument was important in suggesting that research should focus on more fundamental issues in the processing and understanding of human languages.
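To see what point (iii) amounts to, consider the following toy sketch (again our own construction, not Bar-Hillel's, with invented size figures): the only way a program can prefer the play-pen reading is by appealing to facts about the world, here crudely reduced to typical object sizes.

```python
# A toy illustration of why disambiguating "pen" needs world knowledge.
# The sizes below are rough, made-up figures, for illustration only.

typical_size_cm = {
    "box":         50,
    "pen/writing": 15,
    "pen/playpen": 120,
}

def plausible_containers(senses, contents):
    """Keep only the senses of the container word that could physically
    contain the contents (the container must be the larger object)."""
    return [s for s in senses if typical_size_cm[s] > typical_size_cm[contents]]

# "The box was in the pen": the pen contains the box.
print(plausible_containers(["pen/writing", "pen/playpen"], "box"))
# -> ['pen/playpen']: only the play-pen reading survives, and only
# because we hard-coded facts about sizes. Bar-Hillel's point was that
# encoding enough such knowledge, for language at large, seemed hopeless.
```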
The doubts of funding authorities were voiced in the report which the
US National Academy of Sciences commissioned in 1964 when it set up
the Automatic Language Processing Advisory Committee
(ALPAC) to report on the state of play with
respect to MT as regards quality, cost, and prospects, as against the existing cost of, and need for, translation. Its report, the so-called ALPAC Report, was damning, concluding that there was no shortage of human translators, and that there was no immediate prospect of MT producing useful translation of general scientific texts. This report led to the virtual end of Government funding in the USA. Worse, it led to a general loss of morale in the field, as early hopes were perceived to have been groundless.
The spectre of the ALPAC Report, with its threat of near complete withdrawal of funding, and the demoralization that followed, still haunts workers in MT. Probably it should not, because the achievements of MT are real, even if they fall short of the ideal of FAHQMT all the time --- useful MT is neither science fiction, nor merely a topic for scientific speculation. It is a daily reality in some places, and for some purposes. However understandable the fear, the conclusions of the report were almost entirely mistaken. First, the idea that there was no need for machine translation should strike the reader as absurd, given what we said earlier. One can only understand it in the Anglo-centric context of Cold War America, where the main reason to translate was to gain intelligence about Soviet activity. Similarly, the suggestion that there was no prospect of successful MT seems to have been based on a narrow view of FAHQMT --- in particular, on the idea that MT which required revision was not `real' MT. But, keeping in mind the considerable time gain that can be achieved by automating the draft translation stage of the process, this view is naive. Moreover, there were, even at the time the report was published, three systems in regular, if not extensive, use (one at the Wright Patterson USAF base, one at the Oak Ridge Laboratory of the US Atomic Energy Commission, and one at the EURATOM Centre at Ispra in Italy).
Nevertheless, the central conclusion that MT did not represent a useful goal for research or development work had taken hold, and the number of groups and individuals involved in MT research shrank dramatically. For the next ten years, MT research became the preserve of groups funded by the Mormon Church, who had an interest in Bible translation (the work that was done at Brigham Young University in Provo, Utah, ultimately led to the WEIDNER and ALPS systems, two notable early commercial systems), and a handful of groups in Canada (notably the TAUM group in Montreal, who developed the METEO system mentioned earlier), the USSR (notably the groups led by Mel'cuk and Apresian), and Europe (notably the GETA group in Grenoble, probably the single most influential group of this period, and the SUSY group in Saarbrücken). A small fraction of the funding and effort that had been devoted to MT was put into more fundamental research on Computational Linguistics and Artificial Intelligence, and some of this work took MT as a long term objective, even in the USA (Wilks' work on AI is notable in this respect). It was not until the late 1970s that MT research underwent something of a renaissance.
There were several signs of this renaissance. The Commission of the European Communities (CEC) purchased the English-French version of the SYSTRAN system, a greatly improved descendant of the earliest systems developed at Georgetown University (in Washington, DC); SYSTRAN had begun as a Russian-English system whose development continued throughout the lean years after ALPAC, and which had been used by both the USAF and NASA. The CEC also commissioned the development of a French-English version and an Italian-English version. At about the same time, there was a rapid expansion of MT activity in Japan, and the CEC also began to set up what was to become the EUROTRA project, building on the work of the GETA and SUSY groups. This was perhaps the largest, and certainly among the most ambitious, research and development projects in Natural Language Processing. The aim was to produce a `pre-industrial' MT system of advanced design (what we call a Linguistic Knowledge system) for the EC languages. Also in the late 1970s, the Pan American Health Organization (PAHO) began development of a Spanish-English MT system (SPANAM), the United States Air Force funded work on the METAL system at the Linguistics Research Center at the University of Texas at Austin, and the results of work at the TAUM group led to the installation of the METEO system. For the most part, the history of the 1980s in MT is the history of these initiatives, and of the exploitation of results in neighbouring disciplines.
As one moves nearer to the present, views of history become less clear and more subjective. A later chapter will describe what we think are the most interesting and important technical innovations. As regards the practical and commercial application of MT, the systems that were on the market in the late 1970s have had their ups and downs, but for commercial and marketing reasons rather than scientific or technical ones, and a number of the research projects started in the 1970s and 1980s have led to working, commercially available systems. This should mean that MT is firmly established, both as an area of legitimate research and as a useful application of technology. But researching and developing MT systems is a difficult task, both technically and in terms of management, organization and infrastructure, and it is an expensive one, in terms of time, personnel, and money. From a technical point of view, there are still fundamental problems to address. However, all of this is the topic of the remainder of this book.