How to convert a TBX file into a Translation Memory
Thread poster: Adrián L.

Adrián L.
Spain
English to Spanish
Mar 29

Hello guys.

A while ago, I was looking for free translation memories for Trados and came across a TBX file (IATE, a free file provided by the European Union). I tried to convert it to .sdltb using Glossary Converter, but it gave me all kinds of errors (the file is too big, the format is incorrect, etc.). I know this TM is used by Wordfast Anywhere (it has about 700,000 terms), but I don't know how to make it work.

My questions are:

Is there a way to convert the TBX file without Glossary Converter?

Could I extract the TM from Wordfast Anywhere somehow?

Thank you!


 

Andriy Yasharov  Identity Verified
Ukraine
Local time: 05:22
Member (2008)
English to Russian
Goldpan TMX/TBX Editor Mar 29

Goldpan TMX/TBX Editor can help with your task.

https://logrusglobal.com/goldpan.html


 

Adrián L.
Spain
English to Spanish
TOPIC STARTER
Privacy concerns Mar 29

Andriy Yasharov wrote:

Goldpan TMX/TBX Editor can help with your task.

https://logrusglobal.com/goldpan.html


That program looks good, but it asks for far too many personal details:

"Your LinkedIn or Facebook profile must have information about your professional standing".
"You must be a member of Localization Professional group on LinkedIn or Facebook"

Why do they need that data? It seems like an artificial way to gatekeep the app, if you ask me.

I am a freelance translator who is just starting out and wants to get some experience with CAT tools.

[Edited at 2021-03-29 11:15 GMT]


 

Milan Condak  Identity Verified
Local time: 04:22
English to Czech
Xbench Mar 29

Adrián L. wrote:

My questions are:

Is there a way to convert the TBX file without Glossary Converter?



Hi Adrián,

I made a presentation in Czech. I hope you will understand my pictures.

www.condak.cz/nove/2021-02/27/cs/02.html

Here is machine translation CS > EN:

In 2019 I downloaded a large ZIP file that contained all languages; I selected a few languages and a subject area and extracted them to a TBX file.

In Xbench, I converted the TBX to TMX by importing it and then exporting it.

I imported the TMX into a TMLookup database.

From the TMLookup database, I exported a TXT file, which is a finished glossary for OmegaT.
--
Glossaries for OmegaT are in UTF-8; glossaries for Wordfast are in UTF-16 (LE or BE).
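
For anyone who needs to move a glossary between those two encodings, here is a minimal Python sketch. The function name `reencode_glossary` and the file names are made up for illustration, and it assumes the glossary is a plain tab-delimited text file. Python's `utf-16` codec writes a byte order mark and uses the machine's native byte order (little-endian on most PCs); use `utf-16-le` or `utf-16-be` if a specific order is needed.

```python
def reencode_glossary(src, dst, src_encoding="utf-8-sig", dst_encoding="utf-16"):
    """Copy a text glossary line by line, changing only its encoding.

    "utf-8-sig" reads UTF-8 with or without a BOM.
    newline="" keeps the original line endings untouched.
    """
    with open(src, "r", encoding=src_encoding, newline="") as fin, \
         open(dst, "w", encoding=dst_encoding, newline="") as fout:
        for line in fin:
            fout.write(line)


if __name__ == "__main__":
    import os
    # Hypothetical file names; adjust to your own glossary.
    if os.path.exists("glossary_utf8.txt"):
        reencode_glossary("glossary_utf8.txt", "glossary_utf16.txt")
```

Swapping the two encoding arguments converts in the other direction (for example `src_encoding="utf-16"`, `dst_encoding="utf-8"`).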

Milan

I use Goldpan TBX Editor for creating TBX files.

[Edited at 2021-03-29 14:37 GMT]


 

Stepan Konev  Identity Verified
Russian Federation
Local time: 05:22
English to Russian
Goldpan is a good tool Mar 29

If you want privacy, you should never use any social network, or even the Internet in general. My only regret about Goldpan is that it has no Undo feature. Unlike other converters that do just one thing (convert A to B), Goldpan lets you take several actions on your files: edit, replace, copy, paste, find & replace, and even use regular expressions. One wrong move and you have to start the entire job from scratch. (Other converters do not carry this risk, not because they are smarter, but because they simply do not offer such functionality.) In other respects, Goldpan is a very powerful tool and well worth providing a link to your Facebook page. I never received any sort of spam from them. They just asked me once for feedback. That's it.

[Edited at 2021-03-29 15:36 GMT]


 

Samuel Murray  Identity Verified
Netherlands
Local time: 04:22
Member (2006)
English to Afrikaans
@Adrián Mar 29

Adrián L. wrote:
I came across a TBX file (IATE, a free file given by the European Union). ... Is there a way to convert the TBX file [to TMX]?


I also have a version of that file (downloaded in 2018, 2 GB unzipped). I tried it in various tools just now. Locamotion's tbx2po gives an error message. Xbench 2.9 opens it without complaining but I can't figure out how to export anything from it (and besides, 2.9 may not preserve the Unicode characters correctly anyway).

Goldpan says that the file is too big and that I have to split it into smaller files using Batch Tools > Split. (GoldenDict gives a similar error message.) Well, when I try to do that in Goldpan, I get an error message that is apparently related to a Windows setting, saying that DTD processing is not allowed for some or other security reason. I solved this by opening the TBX file in a text editor (I used Akelpad) and just deleted the DOCTYPE line (line 2 of the file). Then Goldpan split the file without any complaint into 100 MB files, which even the Trados Glossary Converter appeared to accept. I'm not sure what effect deleting the DOCTYPE line would have, but I doubt if the effect would be great.
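
For a file too big to open comfortably in a text editor, the same DOCTYPE removal can be done with a short script. This is only a sketch of the manual step described above, not something any of the tools do themselves. It assumes the DOCTYPE declaration sits on a single line (as it did in this file, on line 2) and that the file is UTF-8 or another ASCII-compatible encoding; the function name `strip_doctype` and the file names are made up.

```python
def strip_doctype(src, dst):
    """Copy src to dst, leaving out the first line that starts with <!DOCTYPE.

    The file is streamed line by line in binary mode, so a 2 GB TBX
    is never loaded into memory. Returns True if a line was removed.
    """
    removed = False
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        for line in fin:
            if not removed and line.lstrip().startswith(b"<!DOCTYPE"):
                removed = True  # skip this one line only
                continue
            fout.write(line)
    return removed


if __name__ == "__main__":
    import os
    # Hypothetical output name; the input is the file discussed in this thread.
    if os.path.exists("IATE_export_29082018.tbx"):
        strip_doctype("IATE_export_29082018.tbx", "IATE_no_doctype.tbx")
```

The original file is left untouched, so if a tool later turns out to need the DOCTYPE line, nothing is lost.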


 

Stepan Konev  Identity Verified
Russian Federation
Local time: 05:22
English to Russian
Heartsome Mar 29

It took 22 minutes for Heartsome to convert IATE_export_26022019.tbx to IATE_export_26022019.tmx. Plus 7 minutes to open.
However, you can download TBX files already split by Paul Filkin of SDL from here.

[Edited at 2021-03-29 19:42 GMT]


 

Milan Condak  Identity Verified
Local time: 04:22
English to Czech
A short presentation Mar 29

Samuel Murray wrote:
Xbench 2.9 opens it without complaining but I can't figure out how to export anything from it (and besides, 2.9 may not preserve the Unicode characters correctly anyway).


For Unicode you need licenced version 3.x. This is not Adrian's case.

The presentation:

Xbench: TBX to TMX

http://www.condak.cz/nove/2021-03/29/en/00.html

Export TXT or TMX. My output format is TMX.

HTH

Milan


 

Samuel Murray  Identity Verified
Netherlands
Local time: 04:22
Member (2006)
English to Afrikaans
@Milan Mar 30

Milan Condak wrote:
Samuel Murray wrote:
Xbench 2.9 opens it without complaining but I can't figure out how to export anything from it.

http://www.condak.cz/nove/2021-03/29/en/00.html


Hmm, it may be that my Xbench 2.9 does not actually open the file appropriately.

When I create a project, I get this screen (which you also get):

[Screenshot 01]

and when I click OK, I get this message:

[Screenshot 02]

and when I choose YES, I get this dialog:

[Screenshot 03]

but there is nothing I can do on that dialog. I can tick or untick the box, but it doesn't change anything. So, based on this, it may be that none of the terms are actually imported by Xbench to begin with. This would explain why it keeps exporting a TMX file with zero segments in it.

I use drag and drop to add the TBX file to the project, but I see you use the "Add..." button. I tried using the "Add..." button. At first, the dialog has only three tabs:

[Screenshot 04]

but after clicking Next, it opens a fourth tab, as in your screenshots, but: no language codes:

[Screenshot 05]

My test TBX file is IATE_export_29082018.tbx (1.92 GB). It is here, zipped (112 MB):
https://wsi.li/dl/FnRz3k7omJvaujYg5/d7475b


[Edited at 2021-03-30 06:59 GMT]


 

Milan Condak  Identity Verified
Local time: 04:22
English to Czech
Second short presentation with animation Mar 30

Samuel,

I do not see which TBX file you are opening.
--
http://www.condak.cz/nove/2021-03/30/en/00.html

TMX NL-CS 2019-10, see an animation.

Notes:

Name the project after importing the file(s).

Name the exported TMX before exporting it.

Milan


 

Clarisa Moraña  Identity Verified
Argentina
Local time: 23:22
Member (2002)
English to Spanish
convert them into a termbase Mar 30

IATE termbases are huge and consist mainly of terms, so I would recommend converting them into termbases rather than translation memories. They are very useful that way. I have exported these termbases by specific field, such as Mechanical engineering, Electronics, Wood, Banks, and so on. I attach them to my translation projects as termbases.

 

Adrián L.
Spain
English to Spanish
TOPIC STARTER
Solved Mar 30


Thank you Milan. This was exactly what I was looking for. You seem pretty knowledgeable about this stuff.

Cheers!


 

Stepan Konev  Identity Verified
Russian Federation
Local time: 05:22
English to Russian
The task has changed from 2GB to 505KB Mar 30

Samuel Murray wrote:
My test TBX file is IATE_export_29082018.tbx (1.92 GB).

You work with the entire glossary file with all languages included, while Milan Condak extracted just 2 languages. Obviously, if we talk now about 2 languages only, or extract by specific fields, any converter can do this task. It took less than a minute for Glossary Converter to convert 2 languages without any error. The same applies to Heartsome. However the original task was a bit different...
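
To illustrate the difference between the two tasks, here is a rough Python sketch that streams through a multilingual TBX and writes only one language pair to a TMX file. It is not how Glossary Converter, Heartsome or Xbench work internally. It assumes the older TBX layout without XML namespaces (`termEntry` > `langSet` with `xml:lang` > `term`), keeps only the first term per language in each entry, and matches language codes exactly in lower case. The function name `tbx_pair_to_tmx` and the header values are made up.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# ElementTree exposes the xml:lang attribute under this expanded name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def tbx_pair_to_tmx(tbx_path, tmx_path, src_lang="en", tgt_lang="es"):
    """Write one <tu> per TBX entry that has a term in both languages.

    Returns the number of translation units written.
    """
    written = 0
    with open(tmx_path, "w", encoding="utf-8") as out:
        out.write('<?xml version="1.0" encoding="UTF-8"?>\n')
        out.write('<tmx version="1.4">\n')
        out.write(
            '<header creationtool="tbx_pair_to_tmx" creationtoolversion="0.1" '
            'segtype="phrase" o-tmf="TBX" adminlang="en" '
            f'srclang="{src_lang}" datatype="plaintext"/>\n'
        )
        out.write("<body>\n")
        # iterparse reads the file incrementally instead of loading it whole.
        for _event, elem in ET.iterparse(tbx_path, events=("end",)):
            if elem.tag != "termEntry":
                continue
            first_term = {}
            for lang_set in elem.iter("langSet"):
                lang = (lang_set.get(XML_LANG) or "").lower()
                term = lang_set.find(".//term")
                if term is None:
                    continue
                text = "".join(term.itertext()).strip()
                if text:
                    first_term.setdefault(lang, text)  # keep first term only
            if src_lang in first_term and tgt_lang in first_term:
                out.write("<tu>\n")
                for lang in (src_lang, tgt_lang):
                    seg = escape(first_term[lang])
                    out.write(f'<tuv xml:lang="{lang}"><seg>{seg}</seg></tuv>\n')
                out.write("</tu>\n")
                written += 1
            # Drop the entry's children; an empty shell stays in the tree.
            elem.clear()
        out.write("</body>\n</tmx>\n")
    return written
```

Synonyms, definitions and subject fields are ignored here, which is part of why a termbase usually preserves more of the IATE data than a TMX does.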

[Edited at 2021-03-30 18:33 GMT]


 

Samuel Murray  Identity Verified
Netherlands
Local time: 04:22
Member (2006)
English to Afrikaans
@Stepan, @Milan Mar 30

Stepan Konev wrote:
Samuel Murray wrote:
My test TBX file is IATE_export_29082018.tbx (1.92 GB).

You work with the entire glossary file with all languages included, while Milan Condak extracted just 2 languages.

It also occurred to me that perhaps Xbench can't handle TBX files with more than two languages in it. Or maybe it is just a simple thing that we need to change in the TBX header... who knows.

Milan Condak wrote:
Samuel, I do not see a TBX file you are opening.

My test TBX file is IATE_export_29082018.tbx (1.92 GB). It is here, zipped (112 MB):
https://wsi.li/dl/FnRz3k7omJvaujYg5/d7475b
https://we.tl/t-3hU9DyCOFH


[Edited at 2021-03-30 22:01 GMT]


 

Milan Condak  Identity Verified
Local time: 04:22
English to Czech
Xbench is for language pair Mar 31


The first step is to extract a TBX language pair from the downloaded ZIP file.
Xbench supports many bilingual file formats.
A TXT export from Xbench contains a lot of unneeded data.
A TXT generated from the TMX contains clean "glossary" data.
If you need a multilingual TMX, you can align one from several TXT files.
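
The TMX-to-TXT step can also be sketched in a few lines of Python for those without TMLookup. This is an illustration only, not what TMLookup does internally. It assumes a TMX with `tu` > `tuv` (with `xml:lang`, or the older `lang` attribute) > `seg`, and it writes a tab-delimited source/target file of the kind OmegaT reads as a glossary. The function name `tmx_to_glossary` is made up.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def tmx_to_glossary(tmx_path, txt_path, src_lang, tgt_lang, encoding="utf-8"):
    """Write "source<TAB>target" lines for every unit that has both languages.

    Returns the number of lines written.
    """
    written = 0
    with open(txt_path, "w", encoding=encoding, newline="\n") as out:
        for _event, elem in ET.iterparse(tmx_path, events=("end",)):
            if elem.tag != "tu":
                continue
            segs = {}
            for tuv in elem.iter("tuv"):
                lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
                seg = tuv.find("seg")
                if seg is None:
                    continue
                # itertext() also collects text inside inline tags;
                # split/join collapses tabs and line breaks to single spaces.
                text = " ".join("".join(seg.itertext()).split())
                if text:
                    segs.setdefault(lang, text)
            if src_lang in segs and tgt_lang in segs:
                out.write(f"{segs[src_lang]}\t{segs[tgt_lang]}\n")
                written += 1
            elem.clear()  # free the unit's children once it is written
    return written
```

Passing `encoding="utf-16"` instead would give the encoding mentioned earlier in the thread for Wordfast glossaries.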

As of 2021, you have to request that the data be generated on demand:

http://www.condak.cz/nove/2021-02/27/cs/03.html

Milan


 