The IATE termbase – some suggestions

IATE (Inter-Active Terminology for Europe) is the European Union’s inter-institutional terminology database. It is so large (more than 8 million term entries) and so comprehensive, and to a great extent also normative, that I feel motivated to include some extra information about it.

A good place to start is IATE’s own “Download IATE” page, which offers a brief description, the number of term entries for each of the 26 languages, and a download link.

Further information, including some links to useful sites, can be found on the Download IATE.TBX page of the Terminology Coordination site.

If you want to learn the basics of using the gigantic .tbx file, and about the practical problems that come with it, Paul Filkin’s blog post What a whopper! is a fine place to start.
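To give an idea of why the file size matters: a file this large should be read as a stream rather than loaded whole. The following is only a minimal Python sketch, not anything taken from Paul’s post. It assumes the TBX layout termEntry > langSet (with an xml:lang attribute) > term; check the element names against your own download. The function name iter_term_pairs is made up for this illustration.

```python
import io
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def iter_term_pairs(source, src_lang, tgt_lang):
    """Yield (source_terms, target_terms) for entries that have both languages.

    Assumes the layout termEntry > langSet[xml:lang] > ... > term.
    `source` may be a file name or a binary file object.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "termEntry":
            continue
        terms = {}
        for lang_set in elem.iter("langSet"):
            lang = lang_set.get(XML_LANG)
            found = [t.text.strip() for t in lang_set.iter("term")
                     if t.text and t.text.strip()]
            terms.setdefault(lang, []).extend(found)
        if terms.get(src_lang) and terms.get(tgt_lang):
            yield terms[src_lang], terms[tgt_lang]
        # Drop the entry's children so memory use stays low.
        elem.clear()

# Tiny hand-made sample in the assumed layout (not real IATE data).
sample = b"""<martif><text><body>
<termEntry id="1">
  <langSet xml:lang="en"><tig><term>fishing quota</term></tig></langSet>
  <langSet xml:lang="sv"><tig><term>fiskekvot</term></tig></langSet>
</termEntry>
<termEntry id="2">
  <langSet xml:lang="en"><tig><term>levy</term></tig></langSet>
</termEntry>
</body></text></martif>"""

pairs = list(iter_term_pairs(io.BytesIO(sample), "en", "sv"))
```

With the real download you would pass the path to the .tbx file instead of the sample bytes. Entries that lack either language are simply skipped.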

If, however, you want to go directly to importing a suitable language pair, you may be lucky: Paul may already have done the work for you. He provides, free of charge, the .tbx files for 11 pairs of English <> [some other EU language], as well as for 7 other language combinations involving French, German, Italian, Polish, Dutch, Spanish, and Portuguese. Go to the blog post A few bilingual TBX resources to find out.

There are glitches in the IATE base material, however, and they are also present in Paul’s files. They have been dealt with in the files provided by Henk Sanderson via his SanTrans site (santrans.net). The problems, specified in detail by Henk, are these:

  • The handling of synonyms varies.
  • Context notes are sometimes inserted in the term itself.
  • There are a lot of HTML tags or formatting strings that mess up the term entry.
  • Subjects are listed as numerical codes which are extremely problematic to interpret.
  • Some terms are in fact whole sentences, more appropriate for a TM than for a termbase.
  • There are numerous non-UTF-8 characters.
  • There are a lot of 1-, 2- and 3-letter words.
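
To make a few of these problems concrete, here is a rough Python sketch of the kind of cleanup involved. It is not Henk’s method, which is not described here; the function names and the thresholds (a minimum of 4 characters, a maximum of 8 words) are assumptions chosen only for illustration.

```python
import html
import re

# Matches anything that looks like an HTML/XML tag, e.g. <b> or </i>.
TAG_RE = re.compile(r"<[^>]+>")

def clean_term(raw):
    """Decode HTML entities, strip tags, and collapse runs of whitespace."""
    text = html.unescape(raw)      # "&amp;" -> "&", "&lt;b&gt;" -> "<b>"
    text = TAG_RE.sub("", text)    # remove the tags themselves
    return " ".join(text.split())

def is_usable(term, min_chars=4, max_words=8):
    """Reject very short terms and sentence-like terms.

    Both thresholds are illustrative assumptions, not values from IATE
    or SanTrans.
    """
    return len(term) >= min_chars and len(term.split()) <= max_words

raw_terms = [
    "<i>research</i> &amp;  development",
    "EU",
    "the Member States shall take all measures necessary to comply with this Directive",
]
cleaned = [clean_term(t) for t in raw_terms]
kept = [t for t in cleaned if is_usable(t)]
```

A filter this crude will also throw away legitimate short terms and abbreviations, so in practice you would want to review what it rejects.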

Henk has taken care of all this in the files he provides. They are not free, but very inexpensive: 10 euros for the first language pair and 7.50 euros for each additional pair. You can order under the Contact & Comments tab at the SanTrans site. Very detailed and easy-to-follow instructions are included.

You will find all this illustrated in a pedagogical way in Paul Filkin’s blog post IATE, the last word… maybe!
