Help! What is the best CAT tool for embedded text (in diagrams, OLE) in Word?
Thread poster: Claudia Alvis
Claudia Alvis
Peru
User
Spanish
Dec 7, 2007

Hello,

I have a large .doc document I need to translate. The document has several embedded diagrams (Document Objects) containing text boxes whose text has to be translated. I know that Trados doesn't handle embedded objects in Word, so I thought of using SDLX to get the text from the embedded diagrams, along with an accurate word count. But I did a test with a 217-word segment plus a 100-word embedded diagram, and both analyses gave me the same word count: 217. I don't want to translate each diagram separately, both because it would take me forever and because I might miss something important.

Is there a CAT tool that handles embedded objects in Word properly? Or is there a way to tell Trados or SDLX to "read" those objects? The kind of object I'm talking about is this: {EMBED Word Document.8\s}. If I right-click on it, I get the Document Object window and it shows up as a separate document.

I was also wondering whether, if a CAT tool can't do the trick, I could batch-save all those objects, work on them, and then update them like a TOC. I tried doing that manually, and Trados got the right word count, but I don't know how to batch-save them.

I'd appreciate ANY help.

Thanks.


[Edited at 2007-12-08 06:17]


 
Antonín Otáhal
Member (2005)
English to Czech
Transit could make it
Dec 8, 2007

I have used Star Transit XV for translating embedded Excel and "editable images" in Word and PowerPoint without problems. From what you say, you have something like "Word embedded in Word" here, which I have never come across, but I suppose it would work as well. Note that you need at least the Smart version to import Word into Transit.

HTH
Antonin


 
Heinrich Pesch
Finland
Member (2003)
Finnish to German
Did you try TagEditor?
Dec 8, 2007

At least text boxes are handled well in TE, but when I tried SDLX, the program got stuck during conversion.
(Fortunately) I have no experience with these embedded objects.
One workaround could be to convert to PDF and scan these back to Word. Or simply copy the contents to another Word document without embedding.


 
Claudia Alvis
Peru
User
Spanish
TOPIC STARTER
All kinds of objects
Dec 8, 2007

Actually, on reviewing some of the documents, I've found that they have all kinds of objects, such as Excel embedded in Word embedded in Word: that is, an Excel table embedded in a Word file that's embedded in the main document. And I've only just started on the file; I don't wanna think about what's coming up next.

Heinrich, TagEditor would definitely not work. I mean, it doesn't even work with simple objects, let alone this kind of 'monster'.


 
Brandis (X)
English to German
multiple choice...
Dec 8, 2007

Hi! Text embedded in pictures or diagrams can itself be part of a diagram, or it can be plain alphabetic / alphanumeric text. In the second case, I would use a picture extraction tool like SnagIt to save the pictures into a folder and process them from there in Paintbrush, Photoshop or Fireworks, without changing the dimensions of the diagrams. Process the rest of the .doc content, clean the bilingual file, and re-embed the processed diagrams using the replace function. Best regards, Brandis

 
Claudia Alvis
Peru
User
Spanish
TOPIC STARTER
Graphics and diagrams
Dec 8, 2007

Hello Brandis,

To be honest, I'm not too concerned about the embedded pictures (I'll worry about them later); the problem right now is extracting the text from the embedded objects (Excel, Word) in the files. I've been thinking maybe there's a tool like SnagIt that works with embedded Office objects, so I could work on the files without having to worry about breaking the tags.

I'm also worried about the word count, because in just a couple of "double-embedded" Excel tables, I've found more than 500 words that neither Trados nor SDLX is counting.
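
For the word count in nested objects, here is a minimal sketch, assuming the document has first been re-saved in the newer .docx format (a zip package) so that embedded Word and Excel objects appear as files under an `embeddings/` folder. The function name `count_words` and the tuple layout are hypothetical, and the count is only approximate: it splits on whitespace per text run and skips legacy OLE `.bin` objects entirely.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"
S_NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"

# Package parts that hold translatable text, with the element carrying it.
TEXT_PARTS = [
    ("word/document.xml", W_NS + "t"),      # Word body text, incl. text boxes
    ("xl/sharedStrings.xml", S_NS + "t"),   # Excel cell strings
]


def count_words(data, label="main document", depth=0):
    """Return (depth, name, words) for a package and everything nested in it.

    `data` is the raw bytes of a .docx or .xlsx file. `words` is None for
    embedded objects that are not zip packages (assumed to be legacy OLE).
    """
    results = []
    with zipfile.ZipFile(io.BytesIO(data)) as pkg:
        names = pkg.namelist()
        words = 0
        for part, tag in TEXT_PARTS:
            if part in names:
                root = ET.fromstring(pkg.read(part))
                words += sum(len((t.text or "").split()) for t in root.iter(tag))
        results.append((depth, label, words))
        for name in names:
            if "/embeddings/" in name and not name.endswith("/"):
                blob = pkg.read(name)
                if zipfile.is_zipfile(io.BytesIO(blob)):
                    # Nested package: recurse one level deeper.
                    results.extend(count_words(blob, name, depth + 1))
                else:
                    results.append((depth + 1, name, None))
    return results
```

Summing the non-`None` counts gives a rough total to compare with what the CAT tool reports; any entry with `None` is an object that still needs to be checked by hand.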


 
Brandis (X)
English to German
It works as it had worked for me...
Dec 8, 2007

Claudia Alvis wrote:

Hello Brandis,

To be honest, I'm not too concerned about the embedded pictures (I'll worry about it later), the problem right now is extracting the text from the embedded objects (Excel, Word) on the files. I've been thinking maybe there's a tool like snagit, but that could work with embedded Office objects. So I could work on the files without having to worry about breaking the tags.

I'm also worried about the word count, because in just a couple of "double-embedded" Excel tables, I've found more than 500 words that neither Trados nor SDLX are counting.
Using SnagIt (this is not an advertisement), one can extract all the pictures from a website, from a book and, similarly, from a document, and save them to a separate folder. Tag the document, clean it up and replace the pictures with the processed graphics; you have to keep the original proportions to retain the document format. SnagIt is free for 30 days, I think, and it is fully functional. I found it to be great in such cases. There is also an ECM plugin, somewhat expensive, but it does a grand job. Best regards, Brandis

[Edited at 2007-12-08 06:25]


 
Heinrich Pesch
Finland
Member (2003)
Finnish to German
seems no tool will do it all
Dec 8, 2007

But my advice about converting to pdf and scanning would allow you at least to count the text.
Probably nobody would be prepared to pay the thousands of Euros a tool would cost that could handle complicated files like in this example. That's why such a tool probably has not been developed.
You could ask Jost Zetzsche, who routinely researches all possible translation tools. Perhaps there is a tool for big translation agencies that could handle these objects.

I would create a new file and copy and paste the objects one by one. After translation they do not have to be embedded, because the content will not change.
Have you tried to open the file in OpenOffice Writer?
There should be a function for "flattening" those objects.

Good luck!


 
Antonín Otáhal
Member (2005)
English to Czech
Transit really can do it smoothly
Dec 8, 2007

Heinrich, your statement "seems no tool will do it all" looks like a bit of an overstatement.

Antonin


 
Samuel Murray
Netherlands
Member (2006)
English to Afrikaans
I know of no such tool
Dec 8, 2007

Claudia Alvis wrote:
I have a large .doc document I need to translate. The document has several embedded diagrams (Document Objects) with text that has to be translated in several text boxes.


If there were only text boxes, then OmegaT could do it. And there is a macro in the files section of the Wordfast Yahoo group for extracting all text box text, translating it and putting it all back in one go. But... you talk about embedded objects, so I gather it ain't just simple text boxes with text in them, right?

I don't want to translate each diagram separately because it will take me forever but also, I might miss something important.


It may come to that. Still, isn't there a way you can select all text except embedded objects, and delete it all, leaving only the embedded stuff?

I was also wondering, if I CAT tool can't do the trick, if I could batch-save all those Objects, work on them and then update like a TOC.


Hmm, that would be interesting to play around with. Tis a pity I don't have access to your document...
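
One way to play around with the batch-save idea, as a minimal sketch: it assumes the document has first been re-saved as .docx, where embedded objects are stored as separate files under `word/embeddings/` inside the zip package. The function name `save_embedded_objects` is hypothetical. Objects that Word stores as `oleObjectN.bin` are still wrapped in an OLE container and would need further unpacking, and putting translated objects back into the document is not covered here.

```python
import zipfile
from pathlib import Path

EMBED_PREFIX = "word/embeddings/"  # folder for embedded objects in a .docx


def save_embedded_objects(docx_path, out_dir):
    """Copy every embedded object out of a .docx package into out_dir.

    Returns the list of paths written.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(docx_path) as package:
        for name in package.namelist():
            if name.startswith(EMBED_PREFIX) and not name.endswith("/"):
                target = out / Path(name).name
                target.write_bytes(package.read(name))
                written.append(target)
    return written
```

Each extracted .docx or .xlsx can then be opened and analysed on its own, which should at least give a usable word count for the embedded parts.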


 
Gillian Scheibelein
Germany
German to English
a suggestion...
Dec 8, 2007

Hi Claudia,

you can send me the file and I'll try a Transit import. We have a new filter that allows imports of all types of Office documents into a single project. I can then copy the extracted text into a Word file and you can count and translate it. It is worth a try. Transit is excellent at extracting text out of ppt and xls files - even hidden text!

Cheers,
Jill


 
Peter Linton (X)
Swedish to English
Create a PDF
Dec 8, 2007

Whatever tools you find, Heinrich Pesch's advice about creating a PDF file is very good. That way you can be reasonably sure of displaying all the text, even if hidden by Word. It thus provides a good way of checking that you have not missed anything.

I recently had a good example of the problem -- a DOC file with several apparently empty pages that appeared only if you right clicked the mouse on each empty page. So the final word count was twice as big as the value Word (and the customer) expected.

In this case, I converted the PDF file into a Word DOC (using OmniPage 15), and all the hidden text suddenly appeared.

In hindsight, I should have checked the settings in Word Tools/Options/Show Picture placeholders and Field codes.


 
Claudia Alvis
Peru
User
Spanish
TOPIC STARTER
Transit XV
Dec 8, 2007

Thanks everyone for your suggestions and offers, you're very generous especially you-know-who.

I also want to thank Antonín and Gillian for leading me in the right direction. Transit XV does recognize and work with embedded objects and 'objects embedded in embedded objects', which is no small task. A generous colleague has let me use his copy of Transit (I know, I know), and even though he's on vacation, we spent all morning in a trial-and-error session and I think we finally have it figured out.

First of all, I have to say that I am fairly impressed with Transit. I'd been looking for an alternative to Trados and I think Transit might be it. I'm posting what I did even though, with Transit, it doesn't seem so complicated anymore; it might help somebody else, because Transit was not the end of the process.


  • I prepared the project with Transit. In File Type I chose MS-Word, then I went to Options and checked 'Process objects' in the 'Embedded Objects' group box.
  • I added a couple of sample files I had previously prepared: one with a 'diagram within a Word object within Word' and the other with a 'table as an Excel object within a Word object within Word'.
  • Transit managed to "read" the text from the embedded objects. I pseudo-translated some of it, then exported the files.
  • But when I opened the exported files, the text in the objects hadn't changed. I mean, the normal text was translated, but the text inside the tables and diagrams was still in the original language.
  • It turned out that I had to 'Convert' them manually. For instance, with the diagram with text boxes embedded as a Word Object, I had to right-click on the code {EMBED Word Document.8\s} and select Document Object > Convert > Convert to > Microsoft Office Word Document. With the table as an Excel Object embedded in a Word Object embedded in the document, I had to do one extra thing. I right-clicked on the table, then selected Document Object > Open. Once I got into the Word Object, I right-clicked on the table, then selected Worksheet Object > Convert > Convert to > Microsoft Office Excel Worksheet.
  • So 'Convert' was the way to update the objects, just as I thought I would have to do manually, but Transit saved me a lot of time.


Since it's such a complex text, it's very likely that I'll come across many more of these jewels.

Brandis, Heinrich, Peter, the reason I didn't want to resort to converting the file to PDF is that I didn't want to modify the code, let alone replace it with just the text. My file has plenty of bookmarks, codes and links, so I was afraid that if I modified something, I would ruin the file. And those codes are a nightmare to fix.

My concern now is how to charge for doing this, as I don't really know how long this whole process will take me and I'm basically learning how to do it as I go. I had never worked on a Word document as heavily coded as this one.

Thanks

Claudia


 
Antonín Otáhal
Member (2005)
English to Czech
Regarding the "intellectual property rights" issue when testing Transit
Dec 9, 2007

When I was considering a purchase of Transit, they gave me a full version for one month as a trial, so perhaps your admitted usage of someone else's licence may not be that bad.

Now that you mention it, I do recall that the embedded objects require a kind of "refresh" step after they are exported from Transit. Sorry I did not forewarn you. Fortunately, that kind of job does not come my way that often, so I had happily forgotten about it. I mainly use this feature for Excel within PowerPoint, and the "after-processing" stage with ppt files (after exporting them from Transit and before sending them out to the customer) is usually quite extensive anyway...

Antonin

[Edited at 2007-12-09 00:05]


 
Heinrich Pesch
Finland
Member (2003)
Finnish to German
Good to know
Dec 9, 2007

But what is the price-tag? Which version of Transit is needed for this?

I do not understand very much of document structures, so I would like to ask, is it really worth the trouble?
I believe embedded objects are pieces of other applications somewhere in a document hierarchy. When person A embeds an Excel file in a Word file and person B makes changes to the Excel file, the next time person A opens his Word file the changes are reflected.
But what if person A gives the Word document to person C for translation? If translator C uses Transit and preserves all links to the original file structure, and person B later changes his Excel file, will those changes affect the translation when person A opens the translated Word file, so that the untranslated version of the Excel file replaces the translated version?

Why is it really necessary to preserve the links when translating?

Regards
Heinrich

PS: If one can convert something to the Transit-format, it should be possible to do the actual translation in Word using PlusToys-macro and convert back to Transit format and from Transit back to the original.
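
On the links question: in OLE terms, an embedded object is a copy stored inside the Word file, while a linked object points at an external file and can be refreshed from it, so the overwrite scenario should only apply to linked objects. A minimal sketch for checking which kind a document contains, assuming it has been re-saved as .docx; the function name `list_object_relationships` is hypothetical, and the check relies on linked objects being recorded with `TargetMode="External"` in the package's relationships file.

```python
import zipfile
import xml.etree.ElementTree as ET

REL_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"
RELS_PART = "word/_rels/document.xml.rels"


def list_object_relationships(docx_path):
    """Return (target, kind) pairs for OLE objects; kind is 'linked' or 'embedded'."""
    with zipfile.ZipFile(docx_path) as package:
        root = ET.fromstring(package.read(RELS_PART))
    found = []
    for rel in root.iter(REL_NS + "Relationship"):
        rel_type = rel.get("Type", "")
        # Object relationships end in /oleObject (OLE) or /package (Office file).
        if rel_type.endswith("/oleObject") or rel_type.endswith("/package"):
            external = rel.get("TargetMode") == "External"
            found.append((rel.get("Target"), "linked" if external else "embedded"))
    return found
```

If everything comes back as 'embedded', later edits to the author's original Excel file should not touch the translated copies inside the document.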


 