More about the OpenAI plugin

The wiki is very informative, but a bit sparse with regard to the account business. Here are some more instructions:

Sign up for an OpenAI account by clicking the corresponding link in the Settings window, as shown by the wiki, step 1.

You arrive at a Create account window; just follow the instructions.

Follow the instructions in the wiki, step 2. (Note that these instructions also specify where the API key – among other things – is stored; however, the search path is not correct: it should read %appdata%\Trados AppStore\Roaming\OpenAI Translator\Settings. Also, the Settings.xml file contains the settings for the translations, whereas the API key is found in the Config.xml file. A simpler way to store the key is simply to copy it somewhere else – I sent it in an email message to myself.)

Still, an account is not enough: you also need to fund it with some money. Click the Playground entry at top left.

Then click Settings > Billing. Click Add to credit balance and add a suitable amount – it can be small, since the service is not costly. Furthermore, you can always cancel your payment plan by going here and selecting Cancel plan.

And now you can continue with the wiki, step 3.

Also note about step 4: If you don’t want automatic searching, the way to obtain a suggestion from OpenAI is to select the Search from Source option above the source text.

(More about Using the default prompts in the wiki: Note that when you write your own prompts, they too must contain the same three placeable parameters. Parameter {2}, “the search text”, is used when you select, in the OpenAI Translator dialog, whether to search from source or from target. This is also how parameter {0}, the language of the search text, is determined for the translation process. You could use searching from target, for instance, to post-edit an existing machine translation – sometimes such a process actually produces an improvement. Or you could use it to correct terminology, by giving the “order” to change term X into term Y, term Z into term Q, etc. Note also that you don’t have to use English for a prompt; you can use any language you prefer.)
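To make the placeholder mechanics concrete, here is a minimal sketch in Python of how such a prompt template might be filled in. The template wording and the assumption that {1} holds the target language are mine, not taken from the plugin – check the default prompts for the exact parameter order before writing your own.

```python
# Illustrative sketch of the plugin's three placeholder parameters.
# {0} = language of the search text, {2} = the search text itself;
# that {1} is the target language is my assumption.
prompt_template = (
    "Translate the following {0} text into {1}, "
    "preserving terminology and tone: {2}"
)

def build_prompt(search_language, target_language, search_text):
    # Fill the placeholders exactly as the plugin would at search time.
    return prompt_template.format(search_language, target_language, search_text)

print(build_prompt("German", "English", "Die Katze sitzt auf der Matte."))
```

As the text above notes, the prompt itself could just as well be written in German, Swedish or any other language – only the placeholder numbers matter.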

Furthermore, in your OpenAI account window, it may be useful to familiarize yourself with some of the menu items to the left:

Under API keys, you can not only see a list of “secret keys” and edit or delete them, but also create new ones.

Your costs for using the service are shown under Usage.

Under Settings, you will find:

Organization: your Organization name (to be entered by you) and ID (generated by OpenAI).

Team: Your “team members” (yourself and any you invite) are shown.

Limits: Rate limits and Usage limits; extensive stuff to be explored by yourself.

Billing: See above.

Profile: Your own contact details.

Note on confidentiality: The wiki refers to two sites, “Privacy policy” which applies to your own personal data (which is protected in accordance with US and EU regulations), and “Enterprise privacy at OpenAI” which applies to the translation content. OpenAI says that “We do not train on your data from ChatGPT Enterprise or our API Platform” (it is the API part which concerns you) and “You own your inputs and outputs (where allowed by law)”.
Thus your confidentiality is guaranteed by OpenAI.

Standards for translation

Yes, there are standards for translation – but they are expensive and, at least for freelance translators, perhaps only of moderate interest. However, it may still be useful to be reasonably familiar with them. You should also know that they apply to requirements and responsibilities, not to the results.

Ingemar Strandvik – Quality Manager at the European Commission’s Translation Directorate – is a friend of standardization, even when it comes to something as fundamentally ephemeral as translation. In his presentation, Why standards can benefit translators (2017), he says that standards can contribute the following:

Definition of quality of service provision:

  • Compliance with requirements; meeting needs and expectations → specifications

Distilled wisdom of the profession, best practice:

  • process focus, competencies, workflow steps

Benchmarks, references, check lists:

  • codification of translators’ common sense
  • legal translation with[out] translation training
  • presented in an authoritative way → credibility, assertiveness, improved communication

What is a Standard?

On the definition of the term “standard”, Ingemar Strandvik has found the following quote: “in short, a standard is an agreed way of doing something. This may include the manufacture of a product, the administration of a process, the delivery of a service or of materials – standards can cover a huge range of operations performed by organizations and used by their customers.”

Standards of interest to us translators are (the links are to free sample pages):

  • ISO/TS 11669:2012, Translation projects – General guidance (technical report)
  • ISO 17100:2015, Translation Services – Requirements for translation services
  • ISO 18587:2017, Translation services – Post-editing of machine translation output – Requirements
  • ISO 20771:2020, Legal translation – Requirements

Furthermore, a project to standardize quality assessment, Translation services – Assessment of translation output – General guidance (ISO 50960-#), was recently launched. See below.

Note: In the ATA Chronicle for Sept/Oct 2021, there is an interesting article on ISO standards and information security: Is Applying ISO Standards to Information Security the New Black in Translation?.

To what use?

A standard can basically be used in two ways: as information on processes (“good practice”) and requirements, and for qualification through certification. The latter is an expensive process and is normally only relevant for translation agencies of some size. I have never heard of a company (translation agency or direct customer) that required a freelance translator to be certified. And a quick survey among a dozen Swedish agencies suggests that only one of them is certified (in that case against ISO 17100); most have never experienced a demand for certification, and many do not even know of the existence of any of these standards.

As regards standards as a basis for (further) training, the freelance translator may no doubt find useful information in them. But their main focus is the translation agency, which may well want to inform its subcontractors of the requirements that the standard imposes on them. If a freelancer wants to spend money on this form of training, I would actually recommend the technical report for translation projects, even though it is by far the most expensive one.

In any case, general knowledge may well be useful, so here is a summary of the contents.

ISO/TS 11669

The most comprehensive standard is the general guidance for translation projects, ISO/TS 11669. Formally, it is not a standard but a “technical specification”, which means, among other things, that it cannot be certified against. Furthermore, it is much more detailed than a regular standard: “An organizing principle of this Technical Specification is the importance of structured specifications in translation projects. … A system is described for making decisions about how translation projects are to be carried out. … In practice, requesters do not always provide project specifications … [but] Requesters and TSPs should work together to determine project specifications. … When both requesters and TSPs agree on project specifications, the quality of a translation … can be determined by the degree to which the target content adheres to the predetermined specifications.” Also, these specifications are “the starting point for all assessments, both qualitative and quantitative”.

A major point here is that the specifications are crucial to the assessment of the quality of a translation. Not surprisingly, more than half of the standardʼs instructions relate to the preparation of structured specifications for translation projects. They include comprehensive lists and descriptions of parameters – almost 40 of them, including sub-parameters. Examples are production tasks, which consist of “typical production tasks”: Preparation, Initial translation, In-process quality assurance [consisting of self-checking, revision, review, final formatting, and proofreading], and “additional tasks”.

Note: For the preparation of project specifications (a neglected area), see this excellent (and easy to read) document from the European Commission: Translation Quality Info Sheets for Contractors.

Other parts of the standard include a chapter on Working together – Requesters and translation service providers (TSPs), which includes definitions of the freelance translator as a translation provider, as well as the competencies of translators and professionals – see the box below; a chapter on Translation project management; and, finally, a chapter on Phases of the translation project. The latter is particularly detailed. Especially notable here is a whole page on “post-production”, including feedback from the end user – the same element is covered in the corresponding standard, ISO 17100, in seven lines!

Obviously, this standard is primarily aimed at translation agencies. To me, it is just as obvious that it is applied by them only to a minor extent. One would expect it to be reflected in the instructions to the subcontractors/freelancers, but I have never seen that. I think that most agencies would find quite a lot of useful stuff here; there is probably very little that its authors have missed.

ISO 17100

The step from ISO/TS 11669 to ISO 17100 is not big; the latter is, as I said, the corresponding standard, against which certification can be done, and it concerns translation service requirements. Here we find the “normal ingredient”, the chapter on competencies and qualifications (see box below). Thereafter follow processes and measures prior to production, including quotation and contract as well as preparations. Then there is a chapter on the production process: project management and the translation process (translation, check, revision, review, proofreading, and final verification and approval for delivery). And finally, a short chapter on “processes after delivery”. (“[the agency] should forward feedback from customers to all interested parties” – but how many freelancers hear about the end customer’s reactions?)

There are also six “informative” annexes (informative as opposed to normative, i.e. the annexes are not included in the certification requirements), including Agreements and project specifications, as well as Project registration and project reporting – thus they should be seen as suggestions on what could be included.

ISO 20771

Related to ISO 17100 is ISO 20771, i.e. requirements relating to the translation of legal texts. It applies to translation that is law-related or falls within the legal domain, in terms of both content and context. It is pointed out that “legal translators” may be subject to specific requirements regarding professionalism, confidentiality and ethics, as well as procedures relating to authorization, certification and safety approval. And unlike ISO 17100 – which focuses primarily on agencies – this standard is primarily intended for individual translators. If you are a freelance translator interested in certification, this is probably the standard which would be of primary use for you.

The usual section on competences and qualifications is comparatively detailed and includes requirements for “recognized degrees” in language, translation or law – a requirement which, unlike the ISO 17100 requirements, cannot be replaced by at least five years of professional experience.

Furthermore, the responsibility of the legal translator is described in detail, and this applies even more to the elements Agreement and service specification, Translation, Check, Revision and review, Verification and correction, Signing off and record keeping, Authorized certification, Complaints, individual responsibility and corrective action. Of course there is also a chapter on Confidentiality, security and professional liability insurance – and finally on Professional development and involvement. There is also an Annex with “Information on authorized legal translation used in judicial settings, and for the use of public authorities and commercial purposes”.

ISO 18587

Probably of more general interest is ISO 18587, which applies to requirements for post-editing of machine translation (MT). A preliminary section on the post-editing process contains few details out of the ordinary; it might be noted that the agency/translator is required to determine whether the source text is at all suitable for MT, and that relevant specifications should exist. There are also requirements that the target text must fulfill; it can be noted that these requirements actually apply in equal measure to any translation, but they are not included in ISO 17100. Interestingly enough, the introduction notes that the rapid technological development in the MT field means that the standard is restricted to ‘that part of the process that begins upon the delivery of the MT output and the beginning (my italics) of the human process that is known as post-editing’.

As far as competencies and qualifications are concerned, it is worth noting that they are virtually identical to the corresponding text in ISO 17100 – that is, nothing specific to post-editing. However, there is also a sensible section on professional competence that includes requirements for “a basic understanding of common errors that an MT system makes” and “the knowledge and ability to establish whether editing MT output makes sense, in terms of time and effort estimations”.

Finally, there is a short chapter on the requirements of “full” post-editing, in addition to the general post-editing requirements. “Full” post-editing means that the resulting target text should not be distinguishable from a corresponding target text produced by a human professional translator without the assistance of MT.

Quality Control Standard – ISO 50960-#

Finally, it may be worth mentioning that there is a process of standardizing the quality control of translations, the assessment of translation output – general guidance. Since there are several different systems for this, and since much development is taking place (not least in the European Commission), one might think that it is perhaps a little early. On the other hand, it might be useful to try to manage/summarize this subject. The attempt is complicated by an effort to cover both what is known as quality estimation, i.e. the quality assessment of MT, and quality assessment, that is, the assessment of “normal” translation quality. The problem is that the former is intended to be done by computers, while the latter – even in this standard – is based on human judgment. In any case, it is expressly stated that the standard does not include quality control or correction.

The project is at such an early stage that it makes no sense to give more than a hint of where it is headed. Thus, the chapter Assessment processes includes the sections “Score sheet for translation assessment”, “Basic rules for completing the scoring record” and “scope of the target text to be assessed”. Anyone who has experience of filling out these score sheets will probably agree that it is a task to be avoided.

Competencies and qualifications

Thankfully, ISO 17100 and 18587 set the same requirements for the translatorʼs competencies: Translation competence; linguistic and textual competence in source and target languages; competence in research, information acquisition and processing; cultural competence; technical competence; domain competence – all briefly described.

ISO 20771 (legal translation) places far more numerous and more detailed requirements. Thus, the translator must not only spend at least 5% of their working time on professional development, but also participate in at least one training event a year, preferably be a member of a related professional organization, and document all training and education (how this can be done is described in an appendix).

ISO/TS 11669 deviates by simply specifying requirements for competences in source and target languages, but they are in fact qualifications. The same goes for its requirements on translation competences: they are also about qualifications.

There are three criteria for translation qualifications in ISO 17100 and 18587, of which at least one must be fulfilled: a degree in translation, a degree in any other field plus two years of professional experience, or five years of professional experience. The criteria of the legal standard are again far more far-reaching: six different ones, all of which must be met.

Informative appendix: Terminology

Much of the terminology is common between the standards. I would like to mention a few terms of specific interest here. (Remember that all terminology chapters are included in the free sample pages.) As you will see, the definitions in ISO/TS 11669 are often different from those in the other standards; it can be noted that 11669 was published in 2012, whereas the others are 3-7 years later. Every Note is part of the standard as well.

  • The following terms are found only in ISO/TS 11669:

A-language: native language, or language that is equivalent to a native language, into which the translator typically translates from his or her B-language and/or C-language

Note: The A-language is generally the language of education and daily life for a translator.

B-language: language, other than a translator’s native language, of which the translator has an excellent command and from which the translator typically translates into his or her A-language

C-language: language of which a translator has a complete understanding and from which the translator sometimes translates into his or her A-language

Note: A translator can have several C-languages.

overt translation: type of translation in which aspects of the source language and source culture are intentionally left visible

covert translation: type of translation intended to make the translation product appear as though it had been authored originally in the target language and target culture

requester: person or organization requesting a translation service from a TSP or language service provider [cf. client, customer below]

Note 1: The requester is usually the person or organization that asks for, and receives, the translation product on behalf of the end users, and that usually directly or indirectly determines the TSP’s compensation for rendering the translation service. In the case of government or non-profit organizations, pro-bono transactions, or in-house translation within a company, there is sometimes no monetary compensation for translation services.

Note 2: In the commercial sector, the requester is sometimes called the client or customer. These terms, however, are ambiguous and could refer to the end user. For this reason, requester is the preferred term.

  • The following terms are common to all the standards:

locale: set of characteristics, information or conventions specific to the linguistic, cultural, technical and geographical conventions of a target audience

language register: variety of language used for a particular purpose or in a particular social or industrial domain

ISO 11669 has it slightly differently: register; usage register: set of properties that is characteristic of a particular type of content, and which takes into account the nature of the relationship between the creator and audience, the subject treated and the degree of formality or familiarity of the content

translation service provider; TSP: language service provider that delivers translation services [ISO 11669: person or organization supplying a translation service]

Note: A TSP can be a translation company, a translation agency, a translation organization (profit, non-profit or governmental), a single freelance translator or post-editor, or an in-house translation department.

Note that the expression “language service provider” is used as if taken for granted, but without the common abbreviation LSP. However, in ISO/TS 11669 that term is also defined:

language service provider; LSP: person or organization that provides translation, interpreting and/or other language-related services such as transcription, terminology management or voice-overs

Note: The concepts of language service provider and TSP are connected by a generic relation, with language service provider being the generic concept and TSP the specific concept. TSPs generally provide only translation services, which can include revision or review. In some cases, language service providers provide mainly translation services but in many languages.

client; customer [not in ISO 11669]: person or organization that commissions a service from a TSP by formal agreement

Note: The client can be the person or organization requesting or purchasing the service and can be external or internal to the TSP’s organization.

reviser: person who revises translation output

  • Furthermore:

revision: bilingual examination of target language content against source language content for its suitability for the agreed purpose [ISO 11669: bilingual editing of target content based on a comparison between the source content and the target content]

Note: The term bilingual editing is sometimes used as a synonym for revision.

review: monolingual examination of target language content for its suitability for the agreed purpose [ISO 11669: monolingual editing of target content with respect to the conventions of the subject field(s) to which the target content belongs]

Note: The term monolingual editing is sometimes used as a synonym for review.

proofread: examine the revised target language content and apply corrections before printing

correction: translation service action taken to correct an error in target language content or translation process or a nonconformity to a requirement of this International Standard when conformity has been claimed

Note: Corrections generally arise as a result of errors found when the translator is checking the target language content, when reported by a reviser or reviewer or proofreader or client, or during an internal or external audit of the implementation of this International Standard.


The use of Language Mapping for machine translation

The matter of language mapping is not something for which there is often a need, but just in case, I’ll give a brief description here.

So: You might have a situation where one, or both, of the languages (usually the target language) does not have an “engine” in the MT Cloud, but there is a language (or pair) which is similar enough that using it could be of benefit.

Or you do have a TM for exactly that pair, but it might also be useful to draw on a neighbouring language in an MT engine.

Or you may even have an MT engine from another provider but, again, find that looking at another language would be beneficial.

There are two ways to access the mapping table (I shall get back to the table itself below):

  1. You use the Language Codes Mapping Table, which you can open via Add-Ins > Language Mapping (any changes here will affect all coming projects which use the same – in this case – target language).
  2. Or you can do it in connection with the selection of the plugin SDL Machine Translation Cloud as source for MT/TM – but then only if the project’s languages are included among the available MT engines.

Let’s say – to use an actual case – I have a translation from English to Luxembourgish, which latter language does not figure in the SDL MT Cloud services. However, since Luxembourgish is not too far away from German, using an MT engine for En > De might be useful. But I still don’t want to change the actual project languages. Here is where the language mapping comes in handy.

In this case only the first method can be used, since Luxembourgish is not offered as a target language in SDL’s MT Cloud and therefore the De > Lu pair causes an error message when I try to use SDL Machine Translation Provider for that. Therefore, before I create the project I open the mapping table as described above:

Every category is self-explanatory except MT Code and MT Code (locale). The MT Code is what actually decides the language in question, no matter what language name is specified. So in this case I want German to be mapped onto Luxembourgish, and therefore the MT Code for the latter (ltz) needs to be changed to “ger”. I do that and click OK.

I should add that you can search in all categories at the same time, so the easiest way to find the code I need is to type, in the Search field, “lu” for Luxembourgish and then “de” or “ger” for German. (Instead of scrolling.)

[As for the MT Code (locale), they are for variants of the same language, so that for instance Arabic (U.A.E.), Arabic (Algeria) and Arabic (Egypt) all have the same MT Codes but different locales and thus different MT engines. You cannot do anything with this except if suddenly the MT Cloud provider tells you that they have now introduced e.g. Arabic (Bahrain) with the locale code “arb” – you can then add that to the list even if the plugin provider has not yet updated it.]
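As a conceptual sketch of what the table does (my own illustration in Python, not the plugin’s actual implementation): the MT Code, not the display name, is what selects the engine, so editing the code redirects the lookup.

```python
# Simplified model of the Language Codes Mapping Table: each language
# name carries an MT Code, and the code alone decides which engine
# serves the language. Structure and field names are my own illustration.
mapping = {
    "Luxembourgish": {"mt_code": "ltz"},  # no engine exists for "ltz"
    "German":        {"mt_code": "ger"},
}

def engine_for(language):
    # The provider resolves the engine from the MT Code, regardless of
    # which language name the project itself displays.
    return mapping[language]["mt_code"]

# The edit described above: map Luxembourgish onto the German engine.
mapping["Luxembourgish"]["mt_code"] = "ger"
print(engine_for("Luxembourgish"))  # → "ger"
```

The Danish-to-Norwegian example further down works the same way: changing “dan” to “nor” simply makes the lookup resolve to the Norwegian engine.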

Now when I arrive at the step in the project creation wizard where I select to use TM/MT resources (step 3, Translation Resources), I can select the SDL Machine Translation Provider even though the project’s target language does not have an MT engine. The settings will look like this:

As you see, the target language and flag are Luxembourgish, but the actual MT target is German – exactly as I wanted.

If my project’s target language had had its own MT engine but I wanted to look at another (related) language, I could have done that mapping here. Let’s say I translate to Danish but would be helped by looking at an MT engine for Norwegian as the target. Then I would click the View Language Mapping button in this dialog and get the same mapping table as above. There I would change the MT code “dan” into “nor” and get the desired result.

So – in principle simple, although it takes a bit of text to explain it. My thanks to the ever-patient Paul Filkin for taking the time to clarify all my confusion.

Read more about the plugin here, about language mapping here, and about the MT Cloud Codes here.

Another excellent book on machine translation

Review of Jörg Porsiel (ed.): Maschinelle Übersetzung für Übersetzungsprofis. 384 pages. BDÜ Fachverlag 2020. €37. Order here (“Einkaufskorb” means “Shopping basket”).

Three years ago, the German translators’ organisation BDÜ published, via its publishing company BDÜ Fachverlag, the excellent machine translation primer “Machine Translation – What Language Professionals Need to Know” (reviewed by myself here). Its original text was written in German; this follow-up is written in both English and German, and although its title is in German (easily translated into English: Machine Translation for Professional Translators), the fact is that about 57 percent of the text is in English! Still, some of the parts in German are of such importance that I wish they were made available to an even larger audience. (And one is written in a German which seems almost intentionally to confirm the image of German as an unusually difficult language, with long, convoluted sentences – the worst one being an entire paragraph of 9 lines, 83 words. All other contributions, however, are lucidly written.)

This is the book for everyone who wants to (a) get a comprehensive picture of the current situation in the domain of machine translation, and (b) delve deeper into some of the areas which are most important. In particular I would recommend reading the very first of the contributions, Patrick Bessler’s and Aljoscha Burchardt’s “Gute Qualität zum kleinen Preis? Wandel von Erwartungen und Prozessen im Kontext von Maschineller Übersetzung” (Good Quality at Low Cost? Changed Expectations and Processes in the Context of Machine Translation), because it gives such a complete and knowledgeable picture of the whole process, from client to translator/post-editor, stressing the need for knowledge on the part of the client and going into detail as concerns the new types of problems – in particular with the arrival of the neural MT – for both Language Service Providers (LSPs) and translators. (Examples are the new tasks which the LSPs must handle with regard to both clients and translators, the problem of assessing – in advance – the MT quality; and the new types of errors which must be handled.) Much of this – and more – is also touched upon in the foreword by the editor, Jörg Porsiel, but the in-depth coverage here, in only 12 pages, is admirable.

Today, as everyone knows, it’s the so-called neural variant which dominates MT. This has consequences for the handling of the MT suggestions – consequences which are discussed in many places in this book. But for the reader who is interested in the theories behind neural MT, there is a long presentation here (“Neural Machine Translation”, by Josef van Genabith) – 57 pages – with texts on information theory, mathematical expressions, and neural networks which may tax the reader’s powers of concentration. However, there are also some parts of much more general interest, such as the detailed discussion of the differences between statistical and neural MT (pp. 73-74); also on human parity and research directions, where the discussion of translation for under-resourced languages is particularly important.

The future that MT brings

In particular I think van Genabith’s thoughts about the future are worth noting: “Going out on a limb, (N)MT will fundamentally change the work of human translators to (i) post-editing raw (N)MT translation outputs, (ii) certifying (post-edited or raw) translations and (iii) moving human translators much more into copy-editing and language and content quality control”. About that future there are further discussions. Donald DePalma writes about “Augmented Translation Intelligence”, where more or less “intelligent” functions will make possible a more extensive use of resources on the net (some interesting examples: “disambiguate words and phrases”, “deliver contextual information”, “suggest locale-specific content”) as well as facilitate cooperation between colleagues. And his CSA colleague Arle Lommel makes (in “At human parity?”) some cutting remarks on the claims that NMT is (almost) on a par with “human” translation. He ends with some sensible suggestions as to what MT developers should concentrate on rather than pursuing the elusive target of “human parity”, namely improved quality estimation (see below), integration with speech technologies, connection with human translators, and simpler deployment.

Another look to the future is presented in “Machine Translation of Novels in the Age of Transformer”. Transformer is, according to the authors, “the state-of-the-art architecture in neural MT”. A project is presented where translations of 12 novels using different methodologies – one of which was Transformer – were evaluated. The authors do not claim any particular degree of “success”; only that Transformer is by far the best of the systems. They also suggest that training on segments longer than isolated sentences will lead to further improvements.

Yet another branch of future development is covered in “Neural Interactive Translation Prediction”, i.e. MT where the basis for the MT suggestions is immediately updated. Unsurprisingly, this study indicates that such updating would be preferable to many translators; however, so far I know of only two providers (Lilt and CASMACAT, the one used here) that offer this feature. (But the ModernMT service comes very close, with immediate updates of your uploaded TM, where matches take precedence over MT hits.)

When it comes to the future of post-editing work – i.e. editing of MT suggestions in a CAT tool, segment by segment; so-called PEMT – experienced post-editor Sara Grizzo is sceptical (in “Hat Post-Editing ausgedient?”; “Is Post-Editing a thing of the past?”): this is demanding work which is to a large extent unpopular among translators. She has come to the conclusion that, on the whole, PEMT makes sense above all for light post-editing (gisting). For more demanding translations, one should try to make use of MT in ways which are more palatable to the translator.

So what about post-editing itself?

PEMT is of course an important topic for this book, and it is covered in eight contributions. The aforementioned Sara Grizzo has two more contributions: one (“Post-Editing: ein Praxisleitfaden”) is a brief manual on the practice of post-editing. And in “Bezahlmodelle für Post-Editing” (“Payment models for Post-Editing”) she discusses the three main payment practices which are common today; one point being that none of them is really satisfactory. However, in the last contribution to the book, “Edit-Distance Based Compensation for Machine Translation”, Vincent Asmuth describes a variant of one of the models – EDC (for Edit-Distance Calculation) – which seems to take into account the work actually done by the post-editor, such as research and consideration of the MT suggestions, none of which is reflected in the resulting changes (if any) to the suggested translations. This is an interesting variant which could well be used as a starting point for discussion of this matter.

Another point raised by Grizzo is the importance of assessing in advance the amount of work needed for a post-editing job, and in particular the quality of the MT output. So far, only Memsource dares to claim a reliable function for this so-called Quality Estimation (as opposed to the Quality Control/Quality Assurance done on the final translation result), and it is briefly described, by Sara Szac and Heidi Depraetere, in “Quality Estimation”. They also describe a project called APE-QUEST, funded by the European Commission. They say that QE “should be used”; however, apart from referring to APE-QUEST – which I don’t believe is generally available – they do not provide any solutions.

Yet more on PEMT

Other articles on PEMT are, first, “DIN ISO 18587 in der Praxis” by Ilona Wallberg: an overview of the ISO standard whose title is “Translation services — Post-editing of machine translation output — Requirements”. Personally I am not sure of its importance, but since the standard itself is quite expensive (ca. EUR 82), it is good to have it described in detail here.

Related to this contribution is “The post-editor’s skill set according to industry, trainers and linguists”, by Clara Ginovart and Antoni Oliver, which lists a number of skills fundamental to PEMT; the most important ones being “decision-making, error identification and respect of PE guidelines”.

“Post-edition – fit für die Praxis” (The Practice of Post-editing), by Uta Seewald-Heeg and Chuan Ding, is in some ways a companion text to Sara Grizzo’s shorter (and more easily read) “Praxisleitfaden”, already mentioned.

And a more psychologically oriented approach to PEMT is taken by Jean Nitzke in “Problemlösungsstrategien beim Post-Editing in Verbindung mit psychologischen Aspekten” (Problem-solving Strategies in Post-Editing in Connection with Psychological Aspects). A question posed: Can a post-editor work with PEMT every day without concentration and motivation suffering? The perhaps obvious answer given here is that one should strive to work with a mixture of different tasks, and thereby develop methods and strategies for post-editing.

François Massion discusses PEMT from the viewpoint of an LSP (Language Service Provider) in “NMT im Einsatz bei einem Dienstleister” (NMT practiced by an LSP). It contains a section on the optimization of post-editing (pp. 270-) which is certainly of general interest. And in a report on the use of terminology in the training and customization of MT engines (“Terminologie in der neuronalen maschinellen Übersetzung” by Tom Winter and Daniel Zielinski) there is a discussion of the importance of terminology during translation which is well worth reading, in particular the detailed part on terminology errors in machine translations (pp. 216-).

As for the rest…

Other topics covered are the matter of confidentiality (two articles) and the use of controlled language (and while this discussion is certainly worthwhile, it is at least my experience that the LSP – not to mention the end-of-the-line translator/editor – extremely seldom has the opportunity to influence the source text in this manner).

It should also be mentioned that sprinkled in many of the contributions is the view that translation and post-editing are two very different tasks, and while a good post-editor is probably also a good translator, far from all translators find post-editing at all attractive.

If there is one perspective I miss in this very rich book, however, it is the use of MT outside of commissioned post-editing work – i.e. when MT is not requested by the client but simply used as an additional resource in a “normal” job. This is not the same task as PEMT! For one thing, you can choose among various MT engines; for another, it does not affect your pay. (But I must admit that the work itself is more or less similar to PEMT.)

Finally, I would strongly urge BDÜ to publish this book entirely in English. While I am sure that most German-speaking readers have little problem with the English texts, I doubt that the reverse is true. And the book deserves a very wide readership. May I suggest using the assistance of NMT?

There is also a brief glossary and presentations of the (30) contributors.

The MDÜ web site has a brief presentation of the book in German as well as sample pages – 18 of them, including the contents list.

Note: I had intended to provide a German version of this review as well, but in the end I refrained, since (a) my writing in German leaves a bit to be desired (even though I have no problem reading, and translating from, German), and (b) those German readers who are interested in this tome no doubt will have no problems reading this text in English.

Changing the Studio language settings

In addition to what I have written in the manual about changing the languages used in the Freelance edition of Studio, there is one other quite simple way of doing it. But it involves manipulating the Windows registry, so care should be taken when you do it. (Even before I wrote this, a corresponding instruction had been published as a wiki post at the SDL Community without my being aware of it. Fortunately, we say the same thing, although the wiki has more images.)

To be on the safe side (even though the change involved is very simple), you should first back up your registry. On the site How to backup the entire Registry on Windows 10, you will find detailed instructions on how to back up and restore the registry using system restore. You can also use a manual backup, which is described in How to back up and restore the registry in Windows. (I haven’t tried either, so I leave it to you to decide which is best.)

Anyway, hoping you won’t run into any problems, here is the procedure for using the registry to be able to change your Studio languages (all of them, if needed).

  1. Right-click the Start icon and select Search. Then enter regedit and open the registry (allowing the computer to make changes when asked). You can open it with or without administrator authority; the result will be the same.
  2. To be on the safe side, create a backup copy of the registry by selecting File > Export.
  3. Then navigate manually to the key where the change is to be made: HKEY_CURRENT_USER\Software\Microsoft\LSDRClient15
    (For Studio 2017 it is LSDRClient5.) You can also press Ctrl+F and search for LSDRClient15. And here is the folder:




  4. Right-click the folder name and select Rename.
  5. Name the folder, for instance, LSDRClient15_old.
  6. Close the registry.
  7. Open Studio. During the opening process you will be asked to select your five languages. (It once happened to me that this stage did not appear. I checked and found that, for some reason, my registry change hadn’t “taken”. When I did it again, I got the desired result.)
    That is the only change. Everything else is as before, e.g. all projects and AppStore plugins remain.

AppStore plugin names in different contexts

Once you have started to download and use AppStore plugins (you should!), you may notice that they often have different names in different contexts. Thus the useful appNotifications has an installation file called AppStoreIntegration; it is called appNotifications in the Plugin Management window and AppStoreIntegration in the Plug-ins list on the Add-ins ribbon. Normally this is something that you don’t need to care about, but as soon as you – for instance – want to locate a plugin in one of the lists, or check if you have already downloaded the installation file, you may be in trouble.

This list covers most of the plugins which to my mind are among the more useful ones, and it gives many – but far from all – of the various names. (The reason that the first column has more names than the others is that I plan to fill in the rest of the columns for those, too.)

The bottom list shows which plugins are already on the Plug-ins list in Studio itself before you add anything – yet they are still not called System plug-ins. There is probably a reason for this, but I don’t know what it is.

Of course these things will change from time to time, so I will update the list now and then. This version is dated October 2019. And the highlighted items are ones whose identity I don’t know for certain, so I have made guesses.

The list in pdf format is found here, and I have included the above text.

What are your default QA check settings?

This is a discussion held at the Studio Beta Group Forum. Since you have to be a member of that group to read it, and since it is quite interesting, I have obtained permission from the participants to publish it here.

Daniel Brockmann

Now that CU2 is out the door, I would love to hear from you on a specific topic. We are currently designing QA checks for the Online Editor environment, and “reinventing” them to some extent for that context. One question that came up was what typical default settings for QA checks are. Studio has just the check for forgotten translations enabled and nothing else – which we believe makes it a bit more difficult to use than if some other checks were already enabled by default. So – against that background – can I ask you to reply here with the defaults you typically change? Obviously many of you also have specific RegEx checks etc. – maybe for those you can just say at a high level “I add my own regex checks” or so. A high-level list would be best.

Marco Rognoni 

I usually include the following:

Check for repeated words in target

Check that source and target end with the same punctuation

Check for multiple spaces

Claudio Nasso 

Regarding your QA checks question, these are my custom settings, compared to those already checked or unchecked by default:

  • Segment verification > Source and target are identical
  • Segments to Exclude > Exclude exact matches
  • Segments to Exclude > Exclude repetitions
  • Segments to Exclude > Exclude locked segment
  • Inconsistencies > Check for inconsistent translations
  • Inconsistencies > Check for repeated words in target
  • Inconsistencies > Check for unedited fuzzy matches
  • Punctuation > Check that source and target end with the same punctuation
  • Punctuation > Check for unintentional spaces before (applies to Italian, in my case)
  • Punctuation > Check for multiple spaces
  • Punctuation > Check for multiple dots
  • Punctuation > Check for multiple dots > Ignore ellipsis dots (…)
  • Punctuation > Check for extra space at the end of target segment
  • Punctuation > Check brackets
  • Numbers > Check numbers
  • Numbers > Check times
  • Numbers > Check dates
  • Numbers > Check measurements
  • Trademark check > Check trademarks characters
  • Length limitations > Check length limitation (only when necessary)
  • Tag verifier > Ignore formatting tags (in this case I uncheck it)
  • Verification settings > Ignore locked segments
  • Verification settings > Enable recognition of two-letters terms
  • Number verifier > Number verifier settings > Exclude tag text
  • Number verifier > Number verifier settings > All source thousands separators > Period (applies to Italian, in my case)
  • Number verifier > Number verifier settings > All decimal separators > Comma (applies to Italian, in my case)

Marco Rognoni 

Hi Claudio,

That’s a lot of QA checks! 🙂

Personally, my experience is that with so many checks enabled you always get a lot of errors, and it takes more time to verify each of them in Studio than to check them manually during the review/proofreading stages before delivery.

Of course each of us has a personal way of working, so I totally understand that you may prefer to have all those checks in place.

This shows that Daniel’s question is very interesting, and most likely each reply will be different based on the established method of every single linguist.

Claudio Nasso 

Hi Marco,

You are right: enabling all my proposed verification items may generate a lot of errors/warnings/notes, but this is true only when the review/editing stages of a translated project have been carried out in an inadequate way.

When the reviewing/proofreading stages have been correctly carried out, the number of “errors/warnings/notes” will be much less, and they will further help us to spot those we have forgotten.

Moreover, after having set general custom QA checks and paired them with the proper “signal” (I mean “Error”, “Warning” or “Note”), we have the option to show just the desired signal, or to disable some of them before running the “Verify” function on a particular project.

But, as you have pointed out, the choice of custom QA settings is tied to specific projects/requirements, and I agree with you that Daniel’s question is interesting because it will reveal the various working methods adopted by each colleague.

Tuomas Kostiainen 

Generally, I use the following checks:

  • Segment verification > Check for forgotten and empty translations
  • Segments to Exclude > Exclude PerfectMatch units
  • Segments to Exclude > Exclude locked segment
  • Inconsistencies > Check for inconsistent translations [Ignore tags and case]
  • Inconsistencies > Check for repeated words in target [Ignore numbers and case]
  • Punctuation > Check for unintentional spaces before [:!?;]
  • Punctuation > Check for multiple spaces
  • Punctuation > Check for multiple dots > Ignore ellipsis dots (…)
  • Punctuation > Check for extra space at the end of target segment
  • Regular Expressions > I use my own
  • Trademark check > Check trademark characters
  • (Length limitations > Check length limitation [only when necessary])
  • Tag verifier > All 5 tag checks AND Ignore formatting tags

(Copied and modified from Claudio’s list — thank you!)

Frank Drefs 

We use the following settings:

Segment verification > Source and target are identical

  • Segments to Exclude > All deselected
  • Inconsistencies > Check for inconsistent translations (Ignore tags + Ignore case selected)
  • Inconsistencies > Check for repeated words in target (Ignore case selected)
  • Inconsistencies > Check for unedited fuzzy matches
  • Punctuation > Check for multiple dots
  • Punctuation > Check for multiple dots > Ignore ellipsis dots (…)
  • Tag verifier > All checks selected
  • Tag verifier > Ignore formatting tags

Claudia Alvis 

  • Segment verification > Check for forgotten and empty translations
  • Inconsistencies > Check for inconsistent translations [Ignore tags and case]
  • Inconsistencies > Check for repeated words in target [Ignore numbers]
  • Inconsistencies > Check for unedited fuzzy matches [All segments]
  • Punctuation [All checked]
  • Numbers [None checked]
  • Trademark check > Check trademarks characters
  • Length limitations > Check length limitation (only when necessary)
  • Tag verifier [Tags added, Deleted, Ghost tags]
  • Terminology verifier > Check for possible non-usage of target terms [min. match value 85%]
  • Terminology verifier > Check for terms which may have been set as forbidden
  • Terminology verifier > Ignore locked segments

What you want to know about Intento

Intento is a remarkable Studio plugin which gives you easy access to more than twenty MT providers. Its use is not free (see below), but the charges are very reasonable.


This is what is said in “Exhibit E” of the license agreement:

“We have two types of requests (from Customer to Intento and from Intento to Third-Party Services), and four types of Customer Data: request metadata, input data (request payload), user credentials (to external services, if necessary), and the data processing results (e.g. translated text or tags extracted from an image).” And: “The request metadata is everything contained in the request except input data and credentials, plus metadata derived from the payload.” These data may be deleted by user request.

Input data (source text) is stored only for the milliseconds between reception of the request and its submission to the MT provider. The same applies to the target text, between its reception and its submission to the client.

But of course you also have to check the MT provider’s confidentiality terms to make sure that they fulfil your needs.


To create an account, click Sign in and then follow the simple procedure. Once you have an account, you have access to all the necessary information at the Console Dashboard – in particular your API key for Production. (The Sandbox option – which is free – is not really for translation but for testing of the API integration.) You will need that key every time you make a new project setting using Intento, because unfortunately it is not possible to save it in the project settings. (Maybe in the future?)

Project Settings

The Intento Studio plugin is available both from the AppStore and from the Intento shared folder; the latter may be at a later development stage, since the former needs reviewing by SDL before publishing. (Curiously, neither of them appears on the Add-Ins > Plug-ins list, and only the shared-folder one appears in the SDL Plugin Management window.)

In Studio, open the Project Settings and then the appropriate Translation Memory and Automated Translation settings. Then (as usual) click Use. The Intento MT Hub plugin is shown on the list as Intento MT Translation Provider; the other one is called Intento MT Provider. They open the following settings windows, the former first:

In both cases you must enter the API key I mentioned above and click the Check button. You need to do this every time you open this window; the key is not retained.

Once the key is approved, the Provider list is available. For some of the providers you can use your own credentials, which means that the charge will be made directly by that provider – otherwise Intento will charge on behalf of the provider. The “custom model” is applicable if you have your own customized MT model with the provider in question. “Payload Logging for 30 min” is meant for clients who may need logging (the default mode is “no trace”) in case there are issues to resolve.

Smart routing

Smart routing means that Intento makes the choice of MT provider for you, based on the latest Intento MT Benchmark. The price will be less than USD25 per 1M characters, and “full data protection” is provided, as well as high reliability.


There is ample information on prices in the license agreement (“Exhibits” A, C, and D), but you probably want to know the costs before even starting the procedure of creating an account or setting up a project. Here is a brief overview. You will be charged for both Intento’s and the MT provider’s services.

Intento’s prices are as follows:

Price in USD for routing to the MT provider, by amount of machine translation:

  • up to and including 100M characters: 5.00 per 1M characters
  • >100M – 1B: 4.00 per 1M characters
  • >1B: 3.00 per 1M characters

You pay per month or when the accumulated fee reaches USD 5,000.

For 1M characters – probably at least 100,000 words in a European language, except Finnish – I must say I could easily afford USD 5.
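As a sketch of what the tiered pricing above amounts to, the routing fee could be computed as follows; I am assuming that the rate of the tier your total volume falls into applies to the whole volume, since the agreement does not spell this out:

```python
def routing_fee_usd(characters: int) -> float:
    """Intento routing fee in USD for a given volume of machine translation.

    Assumes the applicable tier's per-1M rate applies to the entire
    volume (the license agreement does not say whether tiers are marginal).
    """
    if characters <= 100_000_000:        # up to and including 100M
        rate = 5.00
    elif characters <= 1_000_000_000:    # >100M - 1B
        rate = 4.00
    else:                                # >1B
        rate = 3.00
    return characters / 1_000_000 * rate
```

So a freelance month of, say, 2M characters would cost USD 10 in routing fees, on top of the MT provider’s own charges.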

MT providers’ prices are “Retail Recommended Prices for Third Party Services”, i.e. what you pay if you use that provider directly. This means that the charges for the MT provider services are the same whether you let Intento charge for them or you use your own credentials in the setup.

So there you have it. Apart from the question marks as concerns user information, I think this is an excellent service.

The AppStore at SDL

The SDL AppStore site is renovated from time to time. Also, not all functions on it are self-evident. So here is a brief orientation which may help you to utilize it without too much experimentation.

This is the start page:










Unless you are interested in web content management, select Language Solution Apps. You will be shown apps in these categories: Trending apps, Recently added / updated, Most popular for Studio 2019, and Apps for terminology.

You can narrow your search by clicking the View all apps button, which leads to these filtering options:


























Your selections are shown above that menu, like in this example:


The apps that are shown can be sorted using these two buttons:



Clicking Top rated (or whatever that position says – i.e. your latest choice), you get a choice of Last updated – Most downloaded – Most recent – Most reviewed and Top rated.

With the arrow at right, you can sort the apps in descending or ascending order (Last to Oldest or the reverse, Most to Least or the reverse, etc.).

Some words of explanation:

  • Last updated and Most recent seem to be the same. A “last updated” (or “recent”) app is one that is new, has newly been revised with a new version number, or has newly been revised even though the version number is unchanged. A bit confusing.
  • Most reviewed is exactly that (although of course the reviews are not always positive; however, they sometimes contain valuable information, so it may pay to take a look).
  • Top rated are apparently the apps with best ratings. However, the rationale for the ordering of the non-reviewed apps is not clear.

As you can see, there are many similarities between this site and the AppStore window in Studio. And despite its advantages there is still room for improvement (apart from such minor matters as clarification of the Last updated/Most recent, Most reviewed and Top rated categories). And you still cannot sort the plugins in alphabetical order – but of course that is now the default presentation in the corresponding Studio window. For myself, I would also like to see such a simple thing as a designated space for the price of the paid apps.

How (un)safe is machine translation?

Note: This is a revised version of a text previously published at the eMpTy Pages blog under the heading “The Data Security Issues Around Public MT – A Translator Perspective”, with an extensive introduction by blog editor Kirti Vashee and some reader comments. This version is slightly updated.

Some time ago there were a couple of posts on this site discussing data security risks with machine translation (MT), notably by Kirti Vashee and by Christine Bruckner. Since they covered a lot of ground and might have created some confusion as to what security options are offered, I believe it may be useful to take a closer look with a more narrow perspective, mainly from the professional translator’s point of view. And although the starting point is the plugin applications for SDL Trados Studio, I know that most of these plugins are available also for other CAT tools.

About half a year ago, there was an uproar about Statoil’s discovery that some confidential material had become publicly available due to the fact that it had been translated with the help of a site called (not to be confused with, the site of the popular MT provider MyMemory). The story was reported in several places; this report gives good coverage.

Does this mean that all, or at least some, machine translation runs the risk of compromising the material being translated? Not necessarily – what happened to Statoil was the result of trying to get something for nothing, i.e. a free translation. The same thing happens when you use the free services of Google Translate and Microsoft’s Bing. Frequently quoted terms of use for those services state, for instance, that “you give Google a worldwide license to use, host, store, reproduce – – – such content”, and (for Bing): “When you share Your Content with other people, you understand that they may be able to, on a worldwide basis, use, save, record, reproduce – – – Your Content without compensating you”. This should indeed be off-putting to professional translators, but should not be cited to scare them away from using services to which those terms do not apply.

The principle is this: If you use a free service, you can be almost certain that your text will be used to “improve the translation services provided”; i.e. parts of it may be shown to other users of the same service if they happen to feed the service with similar source segments. However, the terms of use of Google’s and Microsoft’s paid services – Google Cloud Translate API and Microsoft Text Translator API – are totally different from the free services. Not only can you select not to send back your finalized translations (i.e. update the provider’s data with your own translations); it is in fact not possible – at least not if you use Trados Studio – to do so.

Google and Microsoft are the big providers of MT services, but there are a number of others as well (MyMemory, DeepL, Lilt, Kantan, Systran, SDL Language Cloud…). In essence, the same principle applies to most of them. So let us have a closer look at how the paid services differ from the free.

Google’s and Microsoft’s paid services

Google states, as a reply to the question Will Google share the text I translate with others: “We will not make the content of the text that you translate available to the public, or share it with anyone else, except as necessary to provide the Translation API service. For example, sometimes we may need to use a third-party vendor to help us provide some aspect of our services, such as storage or transmission of data. We won’t share the text that you translate with any other parties, or make it public, for any other purpose.”

And here is the reply to the question after that, Will the text I send for translation, the translation itself, or other information about translation requests be stored on Google servers? If so, how long and where is the information kept?: “When you send Google text for translation, we must store that text for a short period of time in order to perform the translation and return the results to you. The stored text is typically deleted in a few hours, although occasionally we will retain it for longer while we perform debugging and other testing. Google also temporarily logs some metadata about translation requests (such as the time the request was received and the size of the request) to improve our service and combat abuse. For security and reliability, we distribute data storage across many machines in different locations.”

For Microsoft Text Translator API the information is more straightforward, on their “API and Hub: Confidentiality” page: “Microsoft does not share the data you submit for translation with anybody.” And on the “No-Trace” page: “Customer data submitted for translation through the Microsoft Translator Text API and the text translation features in Microsoft Office products are not written to persistent storage. There will be no record of the submitted text, or portion thereof, in any Microsoft data center. The text will not be used for training purposes either. – Note: Known previously as the ‘no trace option’, all traffic using the Microsoft Translator Text API (free or paid tiers) through any Azure subscription is now ‘no trace’ by design. The previous requirement to have a minimum of 250 million characters per month to enable No-Trace is no longer applicable. In addition, the ability for Microsoft technical support to investigate any Translator Text API issues under your subscription is eliminated.”

Other major players

As for DeepL, there is the same difference between free and paid services. For the former, it is stated – on their “Privacy Policy DeepL” page, under Texts and translations – DeepL Translator (free) – that “If you use our translation service, you transfer all texts you would like to transfer to our servers. This is required for us to perform the translation and to provide you with our service. We store your texts and the translation for a limited period of time in order to train and improve our translation algorithm. If you make corrections to our suggested translations, these corrections will also be transferred to our server in order to check the correction for accuracy and, if necessary, to update the translated text in accordance with your changes. We also store your corrections for a limited period of time in order to train and improve our translation algorithm.”

To the paid service, the following applies (stated on the same page but under Texts and translations – DeepL Pro): “When using DeepL Pro, the texts you submit and their translations are never stored, and are used only insofar as it is necessary to create the translation. When using DeepL Pro, we don’t use your texts to improve the quality of our services.” And interestingly enough, DeepL seems to consider their services to fulfil the requirements stipulated – currently as well as in the coming legislation – by the EU Commission (see below).

Lilt is a bit different in that it is free of charge, yet applies strict Data Security principles: “Your work is under your control. Translation suggestions are generated by Lilt using a combination of our parallel text and your personal translation resources. When you upload a translation memory or translate a document, those translations are only associated with your account. Translation memories can be shared across your projects, but they are not shared with other users or third parties.”

MyMemory – a very popular service which in fact is also free of charge, even though it uses the paid services of Google, Microsoft and DeepL (but you cannot select the order in which those are used, nor can you opt out of using them altogether) – also uses its own translation archives and offers the use of the translator’s private TMs. Your own TM material cannot be accessed by any other user, and as for MyMemory’s own archive, this is what they say, under Service Terms and Conditions of Use:

“We will not share, sell or transfer ’Personal Data’ to third parties without users’ express consent. We will not use ’Private Contributions’ to provide translation memory matches to other MyMemory’s users and we will not publish these contributions on MyMemory’s public archives. The contributions to the archive, whether they are ’Public Data’ or ’Private Data’, are collected, processed and used by Translated to create statistics, set up new services and improve existing ones.” One question here is of course what is implied by “improve” existing services. But MyMemory tells me that it means training their machine translation models, and that source segments are never used for this.

And this is what the SDL Language Cloud privacy policy says: “SDL will take reasonable efforts to safeguard your information from unauthorized access. – Source material will not be disclosed to third parties. Your term dictionaries are for your personal use only and are not shared with other users using SDL Language Cloud. – SDL may provide access to your information if SDL plc believes in good faith that disclosure is reasonably necessary to (1) comply with any applicable law, regulation or legal process, (2) detect or prevent fraud, and (3) address security or technical issues.”

Is this the whole truth?

Most of these terms of service are unambiguous, even Microsoft’s. But Google’s leaves room for interpretation – sometimes they “may need to use a third-party vendor to help us provide some aspect of [their] services”, and occasionally they “will retain [the text] for longer while [they] perform debugging and other testing”. MyMemory’s statement about improving existing services also raises questions, but as mentioned above, this means training their machine translation models, and source segments are never used for this. However, since MyMemory also utilizes the Google Cloud Translate API (and you don’t know when), you need to take the same care with MyMemory as with Google.

There is also the problem with companies such as Google and Microsoft that you cannot get them to reply to questions if you want clarifications. And it is very difficult to verify the security provided, so that the “trust but verify” principle is all but impossible to implement (and not only with Google and Microsoft).

Note, however, that there are plugins for at least the major CAT tools that offer the possibility of anonymizing (masking) data in the source text before it is sent to the paid Google and Microsoft services, which provides further security. To some extent, this is also built into the MyMemory service.
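To illustrate the principle behind such anonymization, here is a minimal, hypothetical sketch (not the implementation of any particular plugin): sensitive items such as e-mail addresses and numbers are replaced with neutral placeholders before the text goes to the MT service, and restored in the translated output afterwards.

```python
import re

def mask(text):
    """Replace e-mail addresses and stand-alone numbers with numbered
    placeholders before the text is sent to an online MT service.
    (Illustrative only; a real plugin would cover more data types.)"""
    placeholders = {}
    counter = 0

    def _sub(match):
        nonlocal counter
        token = "__MASK{}__".format(counter)
        placeholders[token] = match.group(0)
        counter += 1
        return token

    # E-mail addresses first, then stand-alone numbers.
    masked = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", _sub, text)
    masked = re.sub(r"\b\d[\d.,]*\b", _sub, masked)
    return masked, placeholders

def unmask(text, placeholders):
    """Put the original items back into the translated text."""
    for token, original in placeholders.items():
        text = text.replace(token, original)
    return text

source = "Contact jane.doe@example.com about invoice 4711."
masked_text, mapping = mask(source)
# The MT service only ever sees the masked text with placeholders;
# here we skip the actual translation step for the sake of the example.
translated = masked_text
print(unmask(translated, mapping))
```

The placeholders survive translation because MT engines generally leave such non-linguistic tokens untouched, so the sensitive data never leaves your machine.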

But even if you never send back your translated target segments, what about the source data that you feed into the paid services? Are they deleted, or are they stored so that another user might hit upon them even if they are not connected to translated (target) text?

Yes and no. They are generally stored, but – also generally – in server logs, inaccessible to users and only kept for analysis purposes, mainly statistical. Cf. the statement from MyMemory.

My conclusion, therefore, is that as long as you do not return your own translations to the MT provider, and you use a paid service (or Lilt), and you anonymize any sensitive data, you should be safe. Of course, your client may forbid you to use such services anyway. If so, you can still use MT but offline; see below.

What about the European Union?

Then there is the particular case of translating for the European Union, and furthermore the provisions of the General Data Protection Regulation (GDPR), due to enter into force on 25 May 2018. As for EU translations, the European Commission uses the following clause in their Tender specifications:

“Contractors intending to use web-based tools or any other web-based service (e.g. cloud computing) to execute the /framework contract/ must ensure full compliance with the terms of this call for tenders when using such services. In particular, the provisions on confidentiality must be respected throughout any web-based process and the Union’s intellectual and industrial property rights must be safeguarded at all times.” The Commission considers the scope of this clause to be very broad, covering also the use of web-based translation tools.

A consequence of this is that translators are instructed not to use “open translation services” (which beggars definition, does it not?) because of the risk of losing control over the contents. Instead, the Commission has its own MT system, e-Translation. On the other hand, it seems possible that DG Translation is not quite up to date as regards the current terms of service – quoted above – of the Google Cloud Translate API and the Microsoft Text Translation API, and if so, they might change their policy with regard to those services. But for now, the rule is that before a contractor uses web-based tools for an EU translation assignment, authorisation to do so must be obtained (and so far, no such requests have been made).

As for the GDPR, it concerns mainly the protection of personal data, which is generally a lesser problem for translators (at least if you don’t handle texts such as medical records, legal cases, etc.). In the words of Kamocki & Stauch on p. 72 of Machine Translation, “The user should generally avoid online MT services where he wishes to have information translated that concerns a third party (or is not sure whether it does or not)”. If you do handle personal data, you should forget about MT, since the new regulation requires you to have a contract with the data processor (i.e. the MT service provider), and I doubt that, for instance, Google or Microsoft will be bothered.

Offline services and beyond

There are a number of MT programs intended for use offline (as plugins in CAT tools), which of course provides the best possible security (apart from the fact that transfer back and forth via email always constitutes a theoretical risk, which some clients try to eliminate by using specialized transfer sites). The drawback – apart from being limited to your own TMs – is that they tend to be pretty expensive to purchase.

The ones that I have found (based on an investigation of the plugins for SDL Trados Studio) are, primarily, Slate Desktop translation provider, Transistent API Connector, and Tayou Machine Translation Plugin. I should add that so far in this article I have only looked at MT providers based on statistical machine translation or its further development, neural machine translation. But it seems that one offline contender which, for some language combinations (involving English), also offers pretty good “services” is the rule-based PROMT Master 18.

However, in conclusion I would say that if we take the privacy statements from the MT providers at face value – and I do believe we can, even when we cannot verify them – then for most purposes the paid translation services mentioned above should be safe to use, particularly if you take care not to pass back your own translations. But I still think both translators and their clients would do well to study the risks described and the advice given by Don DePalma in this article. Its topic is free MT, but any translation service provider who wants to be honest with their clients, while taking advantage of even paid MT, would do well to study it.
