The Term Recognition Module (TRM) enables you to compare terms in a source document with the terminology stored in IATE. You can manually upload one or several documents and retrieve a termbase or other type of file containing the relevant matches from IATE.
On this page, you will find the following sections:
- Create a TRM request
- Use additional filters
- Add exclusion lists
- Retrieve a TRM request
- Use HTML as output format
- Additional information about TRM
Read more on user groups and access rights.
CREATE A TRM REQUEST
To create a TRM request:
- Go to the ‘Term processing’ menu and click on the ‘Term Recognition Module (TRM)’ tab. This will automatically open the ‘Create TRM request’ tab.
- Upload one or multiple monolingual documents by clicking on the grey box or by using drag and drop.
The most common editable formats are accepted (e.g. Word, Excel, PowerPoint, editable PDF, RTF, HTML, XML and CSV).
- Choose whether to apply an exclusion list (i.e. a list of terms that should not be proposed as candidate terms). You can either fill in a template or use the proposed exclusion lists. Upload the file (or drag and drop it) as you did with the other documents. Read more about exclusion lists.
Stopword lists are automatically applied.
- Name your TRM request.
- Use the sliding buttons to include or exclude matches without results in the target language, filter out or include confidential data, and choose the execution time for your request.
Historical entries will be automatically excluded from all retrievals.
- Choose the source and target languages of your request.
The requested termbase can be bilingual or monolingual. The target languages supported are all official EU languages, plus IS, NO, RU and TR. The system automatically detects your source language. However, you can manually select a specific language if your document is very short, contains lists of terms, or uses closely related languages. Please note that non-EU languages are not supported as source languages.
- Choose between four output formats:
- TBX and SDLTB, which can be used with a computer-assisted translation (CAT) tool for quick consultation and/or automatic display while translating. TBX is an open, standard format for termbases, whereas SDLTB is a proprietary format used within Trados Studio. Please note that SDLTB is not fully compliant with monolingual termbases and could contain errors.
- HTML, a user-friendly format that can be used by interpreters, experts, authors or any other users not working with CAT tools; and
- JSON, a technical format.
- If you have inserted several source files and have selected a termbase as the output format, you can choose whether you want to receive one termbase for each document, or one termbase covering all documents.
- For all output formats except HTML, select the analysis type, which can be:
- algorithm-based, using the same algorithms as a standard IATE search; or
- N-gram, which is based on similarity of results and should provide improved results for highly inflected or compounding languages.
- Finally, click on ‘Create’ to submit your TRM request.
USE ADDITIONAL FILTERS
When creating a TRM request, you can apply the following additional filters by clicking on the ‘Show more’ button:
- Primarity: by default, all entries (primary and not primary) are retrieved.
- Entry confidentiality: by default, all entries (confidential and not confidential) are retrieved, but you may need to filter out confidential data in some cases (e.g. when distributing termbases to freelancers).
- Domains: by default, all domains are selected, but you can narrow your results by selecting a limited number of domains. The domains are the EuroVoc and CJEU domains used to classify each IATE entry. In order to change the default domain selection, click on ‘Click to add domains’ and then select the domains you are interested in. You can see all the subdomains by clicking on the ‘Expand all’ button, if needed.
- ‘In collection’ and ‘Not in collection’: by default, no collection is added. You can type a keyword to launch a search by collection name and add the desired collection(s) as a filter, so that only entries included or not included in the selected collection(s) are retrieved.
- Term type (source language): by default, only lookup forms are excluded.
- Term type (target language): by default, only lookup forms are excluded.
- Evaluation (target language): by default, all terms (deprecated, obsolete, admitted, preferred, and proposed) are retrieved.
- Term validation (target language): by default, all terms (validated, not validated, and pre-IATE) are retrieved.
- Minimum reliability (target language): by default, all terms (downgrade prior to deletion, reliability not verified, minimum reliability, reliable, and very reliable) are retrieved.
- LL aggregated completion score (target language): by default, only entries with an ‘Average to high’ score in the target language are retrieved, which hides language sections that are mostly empty (scores 0 to 2). If you only want to find complete entries with several fields filled in, change this setting to ‘High’ (scores 6 to 16). If you want to see all available content regardless of how complete it is, choose ‘All’ (scores 0 to 16).
- Owner (institution) of target TL: by default, terms from all institutions are retrieved except for CoR [CdT], EUMS [Consilium], FL [CdT], FL_SCIC [COM], IATE TMN [CdT], Swiss Data [COM], and TAXEUD [COM].
- Customer (target language): by default, terms from all customers are retrieved.
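The three settings of the ‘LL aggregated completion score’ filter can be read as score ranges. The following is a minimal sketch of that reading, not IATE's implementation: the function and dictionary names are hypothetical, and the lower bound of 3 for ‘Average to high’ is inferred from the statement that this setting hides scores 0 to 2.

```python
# Score ranges for the 'LL aggregated completion score' filter, as described above.
# Assumption: 'Average to high' starts at 3, because it hides scores 0 to 2.
SCORE_RANGES = {
    "All": (0, 16),
    "Average to high": (3, 16),
    "High": (6, 16),
}


def passes_completion_filter(score: int, setting: str = "Average to high") -> bool:
    """Return True if a target-language section with this score would be kept."""
    low, high = SCORE_RANGES[setting]
    return low <= score <= high
```

For example, a language section with a score of 5 is kept under the default setting but dropped when the filter is set to ‘High’.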
ADD EXCLUSION LISTS
A proposed exclusion list containing the EN words which appear most frequently in the DGT corpus and should not be retrieved as part of the results is uploaded by default for any EN TRM request (including requests coming from the internal IATE plug-ins for Trados Studio). This list is also available for consultation at the bottom of the ‘Create TRM request’ screen.
Another exclusion list (‘Proposed exclusion list with most duplicated EN terms in IATE’) contains the most frequent EN duplicate terms in IATE. When applied to a TRM request, these terms are excluded from the results.
Additionally, you can upload your own exclusion file, using the template available for download at the bottom of the screen. To use this option, you have to upload at least two files. You will then be given the option of marking one of them as an exclusion file.
RETRIEVE A TRM REQUEST
The ‘Retrieve TRM requests’ tab shows the status and details of your request. You will need to refresh the page to update the status of your request.
Once you launch a request, processing should normally take a few minutes. Requests that take longer than 90 minutes to be processed are timed out and marked as failed. The recommended alternative is to relaunch the request and select the scheduling option for execution outside core hours, in which case the timeout period is extended to 10 hours. Requests that retrieve no results are also marked as failed.
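The failure rules described above can be summarised in a short sketch. This is only an illustration of the stated rules, not the system's actual logic: the function name, its parameters and the "completed" label are assumptions made for the example.

```python
from datetime import timedelta

# Timeout limits stated in the text above.
CORE_HOURS_TIMEOUT = timedelta(minutes=90)
SCHEDULED_TIMEOUT = timedelta(hours=10)


def request_outcome(elapsed: timedelta, match_count: int, scheduled: bool = False) -> str:
    """Classify a request according to the timeout and empty-result rules."""
    limit = SCHEDULED_TIMEOUT if scheduled else CORE_HOURS_TIMEOUT
    if elapsed > limit:
        return "failed"  # timed out
    if match_count == 0:
        return "failed"  # no results retrieved
    return "completed"  # hypothetical label for a successful request
```

A request that needs two hours would therefore fail when launched immediately, but could succeed if scheduled for execution outside core hours.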
To retrieve your TRM request:
- Go to the ‘Term processing’ menu and click on the ‘Term Recognition Module (TRM)’ tab.
- Switch to the ‘Retrieve TRM requests’ tab.
- Click the ‘Show more’ button to see more details about your request. You can cancel your request at any time by clicking on the red cross icon. Results will be retrieved up to the moment of the cancellation (those already retrieved will appear in green, while the unprocessed ones will appear in red and be marked as failed).
- When the results have been retrieved, download the output files via the dedicated buttons, either one by one or all together.
Multiple termbases belonging to the same project can be downloaded at the same time (parallel individual downloads). Depending on your browser settings, the multiple download might be blocked. In that case, you need to allow pop-ups in your browser: after clicking on the ‘Download all’ button, click on the red icon appearing in the address bar. Depending on your preferences, you might want to make sure that your download settings are not set to ‘Ask every time where to save the file before downloading’.
In case your TRM request fails:
- Check your source file (particularly its length), apply the filters again, and resubmit the request. You can schedule longer documents or divide them into smaller pieces (maximum 50 pages each) to avoid timeouts.
- If your request still fails, you can contact the IATE Team for help. Make sure that you copy the error details by clicking on the failed request and paste them into an email addressed to the IATE Team (iate@cdt.europa.eu).
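If you decide to divide a long document, the page ranges for the smaller pieces can be worked out as follows. This is a minimal sketch that only computes the ranges; the function name is hypothetical, and the actual splitting has to be done in the application that produced the document.

```python
def page_ranges(total_pages: int, max_pages: int = 50) -> list[tuple[int, int]]:
    """Return inclusive (first_page, last_page) ranges of at most max_pages each."""
    if total_pages < 0 or max_pages < 1:
        raise ValueError("total_pages must be >= 0 and max_pages must be >= 1")
    return [
        (start, min(start + max_pages - 1, total_pages))
        for start in range(1, total_pages + 1, max_pages)
    ]


print(page_ranges(120))  # [(1, 50), (51, 100), (101, 120)]
```

A 120-page document would thus be submitted as three files, each within the recommended 50-page maximum.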
USE HTML AS OUTPUT FORMAT
The HTML output file displays the source document with highlights over the matches available in IATE. Two highlight colours, orange and blue, are used in alternation, purely to make the results easier to read and analyse.
- Click on a highlighted term to display the matching IATE entries with the target terms and their metadata.
- Click on the entry ID to open the full entry view in IATE in a new tab.
Search results are sorted just like a standard search. The system gives priority to primary entries, high-reliability target languages, and validated terms. It then lists non-lookup matches, followed by lookup matches (if you selected that option). Once downloaded, the highlighted HTML file can be viewed offline.
ADDITIONAL INFORMATION ABOUT TRM
- All TRM requests are only available to download for 72 hours.
- IATE cached data for TRM are updated every three hours for all settings and outputs, except for retrievals using the N-gram option, for which the copy used is updated weekly.
- The following data are excluded from the retrievals:
- MUL and Latin data,
- two-character words,
- terms which only contain digits or digits with special characters, and
- raw entries.
- Deprecated, obsolete, unvalidated, and pre-IATE terms are included by default in the retrievals. Lookup forms are excluded by default. You can use the additional filters to change any of these parameters if needed.
- The sorting of results is similar to that in the standard search: priority is given to primary entries, maximum reliability across all TLs for the target language, validated target terms and non-lookup matches followed by lookup matches (if selected).
- Remember to filter out confidential data when distributing termbases to freelancers.
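Two of the exclusions listed above depend only on the form of the term (two-character words, and terms made up of digits or of digits with special characters). They can be illustrated with the sketch below. It is an approximation for illustration only: the function name is hypothetical, ‘two-character’ is read as exactly two characters, and the real TRM rules may differ in detail.

```python
def is_excluded_surface_form(term: str) -> bool:
    """Approximate the form-based exclusions described in the list above."""
    text = term.strip()
    # Assumption: 'two-character words' means exactly two characters.
    if len(text) == 2:
        return True
    has_digit = any(ch.isdigit() for ch in text)
    has_letter = any(ch.isalpha() for ch in text)
    # Digits only, or digits combined with special characters (no letters).
    return has_digit and not has_letter
```

Under these assumptions, ‘EU’ and ‘2024/123’ would be skipped, whereas ‘COVID-19’ would still be considered because it contains letters.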
TRM is also available in standalone mode (offline) for processing sensitive documents. Please contact your central terminology service if you need additional information.
(*) USER GROUPS AND ACCESS RIGHTS
Check below to see which IATE user groups can create term recognition requests:
| User group | Create TRM request |
|---|---|
| NON-LOGGED-IN USER | No |
| FREELANCE BASIC USER | Yes |
| INTERNAL LOGGED-IN USER (except LIMITED) | Yes |