Next: Knowledge-Based MT Up: Rule-Based MT Previous: Rule-Based MT

Flexible or Multi-level MT

Most transfer or interlingual rule-based systems are based on the idea that success in practical MT involves defining a level of representation for texts which is abstract enough to make translation itself straightforward, but which is at the same time superficial enough to permit sentences in the various source and target languages to be successfully mapped into that level of representation. That is, successful MT involves a compromise between depth of analysis or understanding of the source text, and the need to actually compute the abstract representation. In this sense, transfer systems are less ambitious than interlingual systems, because they accept the need for (often quite complex) mapping rules between the most abstract representations of source and target sentences. On this view, as our linguistic knowledge increases, MT systems based on linguistic rules encoding that knowledge should improve too. This position rests on the fundamental assumption that finding a sufficiently abstract level of representation for MT is an attainable goal. However, some researchers have suggested that the deepest level of representation is not necessarily the best level for translation.

This can be illustrated easily by thinking about translation between closely related languages such as Norwegian and Swedish.

 

In the second example here, both languages have exactly the same word order, although the words themselves and their grammatical features differ. In the first example, we see that Swedish (like English) does not allow the use of an article together with a possessive pronoun, which Norwegian (like, say, Italian) does. These are certainly minimal differences, and it would be a serious case of overkill to subject the source language sentences to `in depth' analysis, when essentially all that is required to deal with this structural difference is to express a correspondence between the structures described by the following syntactic rules (here `Poss' stands for `Possessive pronoun').

(Swedish) NP → Poss Adj N
(Norwegian) NP → Det Adj N Poss
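The correspondence between these two rules can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the (category, item) pair encoding, and the `<det>` placeholder for the article that Norwegian requires are all hypothetical, and lexical transfer of the words themselves is left out.

```python
# Category sequences taken from the two NP rules in the text.
SWEDISH_NP = ["Poss", "Adj", "N"]
NORWEGIAN_NP = ["Det", "Adj", "N", "Poss"]


def swedish_to_norwegian_np(constituents):
    """Reorder a Swedish NP (Poss Adj N) into the Norwegian pattern.

    `constituents` is a list of (category, item) pairs. The items are
    passed through untouched; only the structure is changed.
    """
    categories = [cat for cat, _ in constituents]
    if categories != SWEDISH_NP:
        raise ValueError(f"not a Poss Adj N noun phrase: {categories}")
    by_cat = dict(constituents)
    by_cat["Det"] = "<det>"  # placeholder (assumption): the added article
    return [(cat, by_cat[cat]) for cat in NORWEGIAN_NP]


# Abstract items stand in for real words.
result = swedish_to_norwegian_np(
    [("Poss", "poss"), ("Adj", "adj"), ("N", "noun")]
)
```

Nothing here looks below the surface category labels, which is the point of the example: no deeper analysis is needed for this language pair.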

Of course, it would be straightforward to design a special purpose MT system equipped only with the sort of linguistic rules required to perform this type of superficial manipulation of syntactic structures. But a number of considerations, not least economic ones, militate against this. Instead one could conclude that what is required is an approach to rule-based translation which is sufficiently flexible to carry out deep analysis only when required, so that the same MT engine can be used for dealing with pairs of closely related languages and pairs of languages which differ greatly. Such ideas lie behind attempts to design flexible systems which can operate in a variety of modes, according to the depth of analysis required for the language pair, or even the particular examples in hand.

There are other reasons for the current interest in flexible systems. In the example above, we have tried to show that what is the `appropriate level' of analysis for one language pair might be quite inappropriate for another pair. But some researchers have pointed out that a similar situation obtains within one and the same language pair. Though really convincing arguments are hard to find, the idea is that translation seems to depend on information from several linguistic levels at the same time. For example, for most translation purposes, as we have noted previously, a representation in terms of semantic relations (AGENT, PATIENT, etc.) is attractive. However, such a representation will probably not distinguish between (a), (b) and (c). This means they will be translated alike, if this is the representation that is produced by analysis. But in many cases this would not produce a very good translation.

Ideally, what one wants is a semantic account of the differences between these examples. This has to do with the difference between what is presupposed, and what is asserted, or what is treated as `given', and what as new information (e.g. in (b) it is presupposed that Sam broke something, and stated that the thing in question was the printer). Producing such an account is not impossible, and may indeed produce a better MT system in the long run. However, it is by no means easy, and, at least in the short term, it would be nice if one could use information about semantic relations where that is useful, and information about surface syntactic form where that is useful. This would be possible if one had a way of allowing information from a variety of levels to be referred to in transfer. Of course, the difficulty then would be to allow this flexibility while still ensuring that the pieces of information can be correctly combined to give a suitable target translation.

There are various proposals in the MT literature concerning flexible MT. Some researchers working within the paradigm of example-based MT, which we discuss below, have proposed architectures which are flexible with respect to the level at which translation occurs. Another rather radical idea depends on the fact that several contemporary linguistic theories provide a `multidimensional' characterisation of a linguistic string. One can get a flavour of what is involved by looking at the following representation.

  
Figure: A Multidimensional Representation

This representation of the sentence Kim walks is multidimensional, in the sense that it contains information about several levels, or dimensions, of structure at the same time: information about ORTHography, SYNtax, SEMantics, and constituent structure (the DaughTeRs feature). Such multidimensional representations are known as signs. Identity of values is indicated by tags (boxed indices).

If we look first at the DTRS value, we can see that there are two daughters, the first an NP (i.e. whose SYNtax contains an attribute CAT with value NP), and the second a VP. The NP has no daughters, and the VP has one daughter, whose category is V. The ORTHography of the whole S is made up of the ORTHography of the NP, i.e. Kim, and the ORTHography of the VP, which is identical to the ORTHography of the V (the two share a tag). The TNS (TeNSe) of S, VP, and V are identical, and the NP, VP, and V have the same NUMber value.

The semantics of the S indicates that the argument of the predicate is a tagged value, namely the semantics of the NP.

We have seen that this representation carries information about ORTHography, SYNtax, SEMantics and daughters (DTRS) at the same time (a fuller representation would include information about morphology too). Formally, it is just a collection of features (i.e. attributes and values) of the kind we have seen before, with the difference that the value of some of the attributes can be an entire structure (collection of features), and we allow different attributes to have the same value (indicated by means of a tag, a number written in a box). This is sometimes called a re-entrance.

The syntactic information is essentially equivalent to the sorts of category label we have seen before, and the value of the DTRS attribute simply gives the values of the daughters a node would have in a normal constituent structure tree of the kind given in an earlier chapter. One interesting point to note is that there is a value for SEMantics given for the mother sign, and for every one of the daughter signs. (In fact, the SEM value of the S is identical to the SEM value of the VP, and the V, and the SEM value of the AGENT of the S is identical to the SEM value of the NP Kim.)
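One way to get a feel for such a sign is to build it from nested Python dictionaries, modelling a tag (re-entrance) as a single object referenced from several places. This is a rough sketch only: the attribute names follow the text, but the concrete values (`"pres"`, `"sing"`, the shape of the SEM structures) are assumptions, since they are not spelled out here.

```python
# Values that carry a tag in the figure are created once and shared.
kim_sem = {"INDEX": "kim"}                     # SEM of the NP
walk_sem = {"PRED": "walk", "AGENT": kim_sem}  # SEM shared by S, VP and V
tense = "pres"   # assumed value for TNS
number = "sing"  # assumed value for NUM

v_sign = {
    "ORTH": ["walks"],
    "SYN": {"CAT": "V", "TNS": tense, "NUM": number},
    "SEM": walk_sem,
    "DTRS": [],
}
vp_sign = {
    "ORTH": v_sign["ORTH"],  # same object: ORTH of VP is ORTH of V
    "SYN": {"CAT": "VP", "TNS": tense, "NUM": number},
    "SEM": walk_sem,
    "DTRS": [v_sign],
}
np_sign = {
    "ORTH": ["Kim"],
    "SYN": {"CAT": "NP", "NUM": number},
    "SEM": kim_sem,
    "DTRS": [],
}
s_sign = {
    # ORTH of S is the concatenation of the daughters' ORTH values.
    "ORTH": np_sign["ORTH"] + vp_sign["ORTH"],
    "SYN": {"CAT": "S", "TNS": tense},
    "SEM": walk_sem,
    "DTRS": [np_sign, vp_sign],
}

# Re-entrance shows up as object identity, not just equality.
sem_is_shared = s_sign["SEM"] is vp_sign["SEM"] is v_sign["SEM"]
agent_is_np_sem = s_sign["SEM"]["AGENT"] is np_sign["SEM"]
```

Using object identity rather than copies means that a change to the shared SEM value is visible from the S, the VP and the V at once, which is the behaviour the tags are meant to express.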

One way one could use such a structure would be just to take the value of the SEM attribute for the mother sign in the output of analysis, and input this value to transfer (in a transfer system) or synthesis (in an interlingual system). This would involve only adapting the techniques we described in earlier chapters for transfer and synthesis to deal with complex attribute-value structures, rather than trees (this is not very difficult). Of course, this would mean that one was losing any benefit of multidimensionality for translation (though one might be able to exploit it in analysis).

If one is to exploit multidimensionality in transfer or synthesis (which was the aim) the only possible part of the sign to recurse through, applying rules, is the structure of the DTRS attribute. However, as we noted, this is just the surface phrase structure, enhanced with some information about semantics and orthography. If this is so, then one might wonder whether any advantage has been gained at all.

The solution is not to think in terms of applying rules to representations or structures at all, but to focus on the attribute-value structure as simply a convenient graphic representation of the solution to a set of constraints. For example, for the representation in the figure above, one such constraint would be that the CATegory value of the mother sign is S. More precisely, the value of SYN on the mother sign is an attribute-value structure which contains an attribute CAT, with value S. That is, if we give names like X0, X1, X2, etc. to the various attribute-value structures, with X0 the name of the mother sign, then the value of SYN in X0 is a structure X1, and the value of CAT in X1 is S:

X0:SYN = X1
X1:CAT = S

If we name the attribute-value structure of the VP X4, and that of the V X5, we also have the following, indicating that S, VP, and V all have the same SEM values.

X0:SEM = X4:SEM
X4:SEM = X5:SEM

The value of the ORTHography attribute in X0 is the concatenation of the values in the NP (X6) and the VP (X4):

X0:ORTH = concatenation(X6:ORTH, X4:ORTH)
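To make the idea of a structure as the solution to a set of equations concrete, the following sketch writes the equations down as data and checks them against a set of named structures. It assumes the naming used above (X0 for the S, X1 for its SYN value, X4 for the VP, X5 for the V, X6 for the NP); the helper names `value` and `satisfied` and the concrete attribute values are hypothetical. It only checks a given solution and makes no attempt to solve the equations.

```python
# Named attribute-value structures (toy values, assumed for illustration).
sem = {"PRED": "walk"}
X1 = {"CAT": "S"}
X5 = {"ORTH": ["walks"], "SEM": sem}                    # the V
X4 = {"ORTH": X5["ORTH"], "SEM": sem}                   # the VP
X6 = {"ORTH": ["Kim"]}                                  # the NP
X0 = {"SYN": X1, "SEM": sem, "ORTH": ["Kim", "walks"]}  # the S
NAMES = {"X0": X0, "X1": X1, "X4": X4, "X5": X5, "X6": X6}


def value(term):
    """Resolve 'X0:SYN' to an attribute value, 'X1' to a structure,
    and anything else (such as 'S') to itself as an atomic value."""
    if ":" in term:
        name, attr = term.split(":")
        return NAMES[name][attr]
    return NAMES.get(term, term)


# The simple equations from the text, as (left, right) pairs.
EQUATIONS = [
    ("X0:SYN", "X1"),
    ("X1:CAT", "S"),
    ("X0:SEM", "X4:SEM"),
    ("X4:SEM", "X5:SEM"),
]


def satisfied(equations):
    """True if both sides of every equation resolve to equal values."""
    return all(value(left) == value(right) for left, right in equations)


def orth_constraint_holds():
    """X0:ORTH = concatenation(X6:ORTH, X4:ORTH)."""
    return value("X0:ORTH") == value("X6:ORTH") + value("X4:ORTH")


all_hold = satisfied(EQUATIONS) and orth_constraint_holds()
```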

One can think of a representation like the one in the figure above as simply a graphic representation of the solution to a set of such equations, and one can use the equations as the basis for translation, in the following way. First, it is the task of analysis to produce the equation set. This is not, in fact, difficult: we have already seen in an earlier chapter how one can add instructions to grammar rules to create different kinds of representations. Using them to create sets of equations is a simple extension of this idea. This set of constraints describes a source structure. The translation problem is now to produce a set of constraints whose solution will yield a target language structure. Ultimately, of course, one is interested in the ORTH value in such a structure, but in the meantime, one can state constraints such as: ``the SEM of the source structure, and the SEM of the target structure must be identical'' (this assumes that the SEM values are `interlingual'), or ``the SEM of the target structure must be the result of applying some `transfer' function to the SEM of the source structure''. But one can just as easily state constraints in terms of other attributes, for example, ``in the case of proper nouns, the value of ORTH in the source structure and the value of ORTH in the target structure must be the same''. Similarly, if we add attributes and values giving information about grammatical relations such as subject, etc. into the constraints, we can state constraints in terms of these.

Of course, we cannot, in this way, guarantee that we will deal with all of the source structure (we may leave parts untranslated by failing to produce appropriate target language constraints), or that solving the target language constraints will produce a single target structure, or even any structure at all (the constraints may be inconsistent). Nor have we indicated how the constraints are to be solved. Moreover, one will often not want such constraints to be observed absolutely, but only by default. For example, proper names should only keep the same orthographic form if there is no constraint that says otherwise (in translating English into French, one would like to ensure that London translates as Londres). There are a number of serious difficulties and open research questions here. However, one can get a feeling for a partial solution to some of these problems by considering the following rather simple approach.
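The notion of a constraint that holds only by default can be illustrated very simply for the proper-name case. In this sketch the table of specific translations and the function name are hypothetical; it stands in for whatever mechanism a real system would use to let a more specific constraint override the default.

```python
# Hypothetical table of specific English-to-French constraints on names.
SPECIFIC_NAME_TRANSLATIONS = {"London": "Londres"}


def translate_proper_noun(source_orth, overrides=SPECIFIC_NAME_TRANSLATIONS):
    """Default constraint: target ORTH equals source ORTH.

    The default applies only when no more specific constraint
    (an entry in `overrides`) says otherwise.
    """
    return overrides.get(source_orth, source_orth)


london = translate_proper_noun("London")  # specific constraint applies
sam = translate_proper_noun("Sam")        # default applies
```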

Recall that the constraints we gave above made the SEMantics of the S equal to the SEMantics of the VP, and the V. One may immediately think of this as involving the V contributing its SEMantics to the S, but one can also see it the other way round, as putting the semantics of the whole S `into' the V. What this means, of course, is that all the semantic information conveyed by the sentence is represented (somewhat redundantly) in the representations of the words. Now suppose that we have translation constraints which say, for example, that the translation of the word walk must be the word marcher, with the same semantics, and that the translation of Sam must be Sam, again with the same semantics. What we must do now is produce a target structure. The problem we have is interestingly like the problem we have when we try to parse a sentence: then we typically know what the words are, and what order they are in, but not what the sentence as a whole means; here we know what the words are, and what the sentence as a whole means (it is represented `in the words'), but not what the word order should be. One possibility is simply to use the target grammar to parse Sam, and marcher in all possible orders. To take a slightly more interesting case, suppose the source sentence is ( ):

If the target language is French, the target grammar will be asked to parse the strings in ( ):

One can expect the target grammar to reject (a) and (c). It would accept (b), but only with a meaning that is different from that of the source sentence, which we have carried over in the constraints linking see to voir. This leaves only the correct solution (d).
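This generate-and-test idea can be sketched as follows: try the target words in every order, keep the orders the target grammar accepts, and then keep only those whose meaning matches the semantics carried over from the source. Everything concrete here is an assumption made for illustration: the three-word lexicon, the single toy rule S → NP V NP, and the AGENT/PATIENT labels are not the grammar or the examples of the text.

```python
from itertools import permutations

# Toy French lexicon (hypothetical): word -> (category, semantics).
LEXICON = {
    "Sam": ("NP", "sam"),
    "Londres": ("NP", "london"),
    "voit": ("V", "see"),
}


def parse(words):
    """Toy target grammar with one rule, S -> NP V NP.

    Returns the semantics of the string, or None if it is rejected.
    """
    if len(words) != 3:
        return None
    cats = [LEXICON[w][0] for w in words]
    if cats != ["NP", "V", "NP"]:
        return None
    subj, verb, obj = (LEXICON[w][1] for w in words)
    return {"PRED": verb, "AGENT": subj, "PATIENT": obj}


def generate(words, source_sem):
    """Return every ordering that parses with the source semantics."""
    return [
        " ".join(order)
        for order in permutations(words)
        if parse(order) == source_sem
    ]


# Semantics carried over from the source sentence via the word constraints.
source_sem = {"PRED": "see", "AGENT": "sam", "PATIENT": "london"}
candidates = generate(["voit", "Sam", "Londres"], source_sem)
```

Trying all orders grows factorially with sentence length, so this is only workable as an illustration; it is meant to show the logic of the approach, not a practical generation strategy.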





Arnold D J
Thu Dec 21 10:52:49 GMT 1995