Another excellent book on machine translation

Posted by Mats Dannewitz Linder on November 21st, 2020

Review of Jörg Porsiel (ed.): Maschinelle Übersetzung für Übersetzungsprofis. 384 pages. BDÜ Fachverlag 2020. €37. Order here (“Einkaufskorb” means “Shopping basket”).

Three years ago, the German translators’ organisation BDÜ published, via its publishing company BDÜ Fachverlag, the excellent machine translation primer “Machine Translation – What Language Professionals Need to Know” (reviewed by myself here). Its original text was written in German; this follow-up is written in both English and German, and although its title is in German (easily translated into English: Machine Translation for Professional Translators), the fact is that about 57 percent of the text is in English! Still, some of the parts in German are of such importance that I wish they were made available to an even larger audience. (And one is written in a German which seems almost intentionally to confirm the image of German as an unusually difficult language, with long, convoluted sentences – the worst one being an entire paragraph of 9 lines, 83 words. All other contributions, however, are lucidly written.)

This is the book for everyone who wants to (a) get a comprehensive picture of the current situation in the domain of machine translation, and (b) delve deeper into some of the areas which are most important. In particular I would recommend reading the very first of the contributions, Patrick Bessler and Aljoscha Burchardt’s “Gute Qualität zum kleinen Preis? Wandel von Erwartungen und Prozessen im Kontext von Maschineller Übersetzung” (Good Quality at Low Cost? Changed Expectations and Processes in the Context of Machine Translation), because it gives such a complete and knowledgeable picture of the whole process, from client to translator/post-editor, stressing the need for knowledge on the part of the client and going into detail as concerns the new types of problems – in particular with the arrival of neural MT – for both Language Service Providers (LSPs) and translators. (Examples are the new tasks which the LSPs must handle with regard to both clients and translators, the problem of assessing the MT quality in advance, and the new types of errors which must be handled.) Much of this – and more – is also touched upon in the foreword by the editor, Jörg Porsiel, but the in-depth coverage here, in only 12 pages, is admirable.

Today, as everyone knows, it’s the so-called neural variant which dominates MT. This has consequences for the handling of the MT suggestions – consequences which are discussed in many places in this book. But for the reader who is interested in the theories behind neural MT, there is a long presentation here (“Neural Machine Translation”, by Josef van Genabith) – 57 pages – with texts on information theory, mathematical expressions, and neural networks which may tax the reader’s powers of concentration. However, there are also some parts of much more general interest, such as the detailed discussion of the differences between statistical and neural MT (pp. 73-74), and the sections on human parity and research directions, where the discussion of translation for under-resourced languages is particularly important.

The future that MT brings

In particular I think van Genabith’s thoughts about the future are worth noting: “Going out on a limb, (N)MT will fundamentally change the work of human translators to (i) post-editing raw (N)MT translation outputs, (ii) certifying (post-edited or raw) translations and (iii) moving human translators much more into copy-editing and language and content quality control”. About that future there are further discussions. Donald DePalma writes about “Augmented Translation Intelligence”, where more or less “intelligent” functions will make possible a more extensive use of resources on the net (some interesting examples: “disambiguate words and phrases”, “deliver contextual information”, “suggest locale-specific content”) as well as facilitating cooperation between colleagues. And his CSA colleague Arle Lommel makes (in “At human parity?”) some cutting remarks on the claims that NMT is (almost) on a par with “human” translation. He ends with some sensible suggestions as to what MT developers should concentrate on rather than pursuing the elusive target of “human parity”, namely improved quality estimation (see below), integration with speech technologies, connection with human translators, and simpler deployment.

Another look to the future is presented in “Machine Translation of Novels in the Age of Transformer”. Transformer is, according to the authors, “the state-of-the-art architecture in neural MT”. A project is presented where translations of 12 novels using different methodologies – one of which was Transformer – were evaluated. The authors do not claim any particular degree of “success”; only that Transformer is by far the best of the systems. They also suggest that training on segments longer than isolated sentences will lead to further improvements.

Yet another branch of future development is covered in “Neural Interactive Translation Prediction”; i.e. MT where the basis for the MT suggestions is immediately updated. Unsurprisingly, this study indicates that such updating would be preferable to many translators; however, so far I know of only two providers (Lilt and CASMACAT, the one used here) that offer that feature. (But the ModernMT service comes very close, with immediate updates of your uploaded TM, where matches take precedence over MT hits.)

When it comes to the future of post-editing work – i.e. editing of MT suggestions in a CAT tool, segment by segment; so-called PEMT – experienced post-editor Sara Grizzo is sceptical (in “Hat Post-Editing ausgedient?”; “Is Post-Editing a thing of the past?”): this is demanding work which is to a large extent unpopular among translators. She has come to the conclusion that, on the whole, PEMT makes sense above all for light post-editing (gisting). For more demanding translations, one should try to make use of MT in ways which are more palatable to the translator.

So what about post-editing itself?

PEMT is of course an important topic for this book, and it is covered in eight contributions. The aforementioned Sara Grizzo has two more contributions: one (“Post-Editing: ein Praxisleitfaden”) is a brief manual on the practice of post-editing. And in “Bezahlmodelle für Post-Editing” (“Payment models for Post-Editing”) she discusses the three main payment practices which are common today; one point being that none of them is really satisfactory. However, in the last contribution to the book, “Edit-Distance Based Compensation for Machine Translation”, Vincent Asmuth describes a variant of one of the models – EDC (for Edit-Distance Calculation) – which seems to take into account the work actually done by the post-editor, such as research and consideration of the MT suggestions, none of which is reflected in the resulting changes (if any) to the suggested translations. This is an interesting variant which could well be used as a starting point for discussion of this matter.
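For readers who have not met the notion of edit distance, here is a minimal sketch of the underlying measure. It is not the EDC variant described in the book, whose details I do not reproduce here; it only shows a plain character-level Levenshtein distance between an MT suggestion and its post-edited version. The function names and the normalisation by the longer string are my own assumptions for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Character-level Levenshtein distance: the minimum number of
    insertions, deletions and substitutions needed to turn a into b."""
    # Keep the shorter string in the inner loop to save memory.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def edit_ratio(mt_output: str, post_edited: str) -> float:
    """Hypothetical helper: distance normalised by the longer string,
    so 0.0 means 'left untouched' and 1.0 means 'fully rewritten'."""
    longest = max(len(mt_output), len(post_edited))
    if longest == 0:
        return 0.0
    return levenshtein(mt_output, post_edited) / longest
```

A segment accepted without changes gives a ratio of 0.0, even if the post-editor spent time checking terminology first. That is exactly the weakness of a naive edit-distance payment model, and the reason a variant that accounts for such work is interesting.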

Another point raised by Grizzo is the importance of assessing in advance the amount of work needed for a post-editing job, and in particular the quality of the MT output. So far, only Memsource dares argue that they have a reliable function for this so-called Quality Estimation (as opposed to the Quality Control/Quality Assurance done on the final translation result), and it is briefly described, by Sara Szac and Heidi Depraetere, in “Quality Estimation”. They also describe a project called APE-QUEST, funded by the European Commission. They say that QE “should be used”; however, apart from referring to APE-QUEST – which I don’t believe is generally available – they do not provide any solutions.

Yet more on PEMT

Other articles on PEMT are, first, “DIN ISO 18587 in der Praxis” by Ilona Wallberg: an overview of the ISO standard, the title of which is “Translation services — Post-editing of machine translation output — Requirements”. Personally I am not sure of its importance, but since it is quite expensive (ca. EUR 82) it is good to have it described in detail here.

Related to this contribution is “The post-editor’s skill set according to industry, trainers and linguists”, by Clara Ginovart and Antoni Oliver, which lists a number of skills fundamental to PEMT; the most important ones being “decision-making, error identification and respect of PE guidelines”.

“Post-edition – fit für die Praxis” (The Practice of Post-editing), by Uta Seewald-Heeg and Chuan Ding, is in some ways a companion text to Sara Grizzo’s shorter (and more easily read) “Praxisleitfaden”, already mentioned.

And a more psychologically oriented approach to PEMT is taken by Jean Nitzke in “Problemlösungsstrategien beim Post-Editing in Verbindung mit psychologischen Aspekten” (Problem-solving Strategies in Post-Editing in Connection with Psychological Aspects). A question posed: Can a post-editor work with PEMT every day without concentration and motivation suffering? The perhaps obvious answer given here is that one should strive to work with a mixture of different tasks, and thereby develop methods and strategies for post-editing.

François Massion discusses PEMT from the viewpoint of an LSP in “NMT im Einsatz bei einem Dienstleister” (NMT as practised by an LSP). It contains a section on optimization of post-editing (pp. 270-) which is certainly of general interest. And in a report on the use of terminology in training and customization of MT engines (“Terminologie in der neuronalen maschinellen Übersetzung” by Tom Winter and Daniel Zielinski) there is a discussion on the importance of terminology during translation which is well worth reading, in particular the detailed part on terminology errors in machine translations (pp. 216-).

As for the rest…

Other topics covered are the matter of confidentiality (two articles) and the use of controlled language (and while this discussion is certainly worthwhile, it is at least my experience that the LSP – not to mention the end-of-the-line translator/editor – extremely seldom has the opportunity to affect the source text in this manner).

It should also be mentioned that sprinkled in many of the contributions is the view that translation and post-editing are two very different tasks, and while a good post-editor is probably also a good translator, far from all translators find post-editing at all attractive.

If there is one perspective which I miss in this very rich book, however, it is the use of MT outside post-editing work, i.e. where the use of MT is not requested by the client but it is simply used as an additional resource in a “normal” job. This is not the same task as PEMT! For one thing, you can choose among various MT engines; for another, it does not affect your pay. (But I must admit that the work itself is more or less similar to PEMT.)

Finally, I would strongly urge BDÜ to present this book all in English. While I am sure that most German-speaking readers have little problem with the English texts, I doubt that the reverse is true. And the book deserves a very wide readership. May I suggest using the assistance of NMT?

There is also a brief glossary and presentations of the (30) contributors.

The BDÜ web site has a brief presentation of the book in German as well as sample pages – 18 of them, including the contents list.

Note: I had intended to provide a German version of this review as well, but in the end I refrained, since (a) my writing in German leaves a bit to be desired (even though I have no problem reading, and translating from, German), and (b) those German readers who are interested in this tome no doubt will have no problems reading this text in English.
