Numbering in Word makes character encoding go nuts
Thread poster: Ksenia Sergeeva
Ksenia Sergeeva
Russian Federation
English to Russian
Mar 16, 2016

So, I've been trying to create a proper Deja Vu project (with Russian as the source language) and start translating, but some segments keep looking as if the character encoding is wrong... You know, going all ãòēpù è ùòå and things like that. I spent a lot of time trying to do something about it, and then some more time trying to see where it happens. Well, it turns out that all the segments that come right after automatic numbering (e.g. "1.3. This and that") become unreadable in Deja Vu.
Any ideas about what I can do about it?


 
VIP9N
Russian to English
More info required Mar 17, 2016

Ksenia Sergeeva wrote:
So, I've been trying to create a proper Deja Vu project...

Which file format did you try: *.doc or *.docx, *.ppt or *.pptx, etc.?

Ksenia Sergeeva wrote:
...some segments kept looking like the character encoding is wrong... You know, going all ãòēpù è ùòå and things like that

That would be plausible in the pre-Unicode days, but today it sounds weird. Since you didn't mention the format of the incoming files, one can only guess at the reasons: a Word file created in a Chinese/Japanese version of Office, or made in MS Office 97. Or maybe the font selected on your PC for displaying text in the DéjàVu panes simply does not support Cyrillic characters.
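
For what it's worth, the sample in the first post resembles what Cyrillic text looks like when its Windows-1251 bytes are read with a Western code page. Here is a minimal Python sketch of that effect. It is only an illustration: the function names are made up, and nothing here is confirmed to be what actually happens inside Deja Vu with this file.

```python
# Illustration only: Russian text stored as Windows-1251 bytes
# but decoded with a Western (Latin-1) code page.
# Both helper names are hypothetical, not part of any CAT tool.

def misread_cp1251_as_latin1(text: str) -> str:
    """Encode as Windows-1251, then decode the bytes as Latin-1."""
    return text.encode("cp1251").decode("latin-1")


def repair(garbled: str) -> str:
    """Reverse the misreading: back to bytes, then decode as Windows-1251."""
    return garbled.encode("latin-1").decode("cp1251")


sample = "Привет"
garbled = misread_cp1251_as_latin1(sample)
print(garbled)          # accented Latin letters instead of Cyrillic
print(repair(garbled))  # the original text again
```

If the garbled segments can be repaired this way, the text itself is intact and only the code page (or the font that implies it) is being misapplied.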

Ksenia Sergeeva wrote:
... I've spent a lot of time trying to do something about it...

Like what, for example?


 
Ksenia Sergeeva
Russian Federation
English to Russian
TOPIC STARTER
More info :) Mar 18, 2016

It's a .docx Word file.
I'm sure it wasn't created in Office 97 or in a Japanese version of Word. The font is the same throughout the document, and the font in Deja Vu displays Cyrillic characters just fine... until they have numbering in front of them. Deja Vu also failed to import the table of contents from this file.

VIP9N wrote:
Like what, for example?

Yes, you are right, this is weird. So I took some weird actions. I'm new to Deja Vu X3 and never had any problems with X2, so my actions were quite erratic... I've changed filters a couple of times, that's all I can remember properly. Then I decided to use another CAT tool, and this solved my problem.


 
VIP9N
Russian to English
File cleaning is required Mar 18, 2016

Ksenia Sergeeva wrote:
It's a docx Word file. I'm sure it wasn't created in Office 97 or Japanese version of Word. The font is the same throughout the document, and the font in Deja Vu displays Cyrillic characters just fine... until they have numbering in front of them. Deja Vu also failed to import the table of contents from this file.

Well, I would say something is wrong with the fonts in the original file. Probably the Word file was saved with embedded fonts for the digits, or something like that. I would try the free TransTools: http://www.translatortools.net/about.html

Ksenia Sergeeva wrote:
I'm new to Deja Vu X3 and never had any problems with X2, so my actions were quite erratic...

I have been using this tool for about fifteen years, and I would say that all CAT tools have their advantages and disadvantages, but the only problem similar to yours that I've ever heard of in DéjàVu involved the Armenian language. Never Cyrillic characters.

Ksenia Sergeeva wrote:
Then I decided to use another CAT tool, and this solved my problem.

Well, if I were in your shoes, I would first try to clean the formatting of the original file. Only if that didn't help would I try another CAT tool and see how it behaves.


 
mikhailo
English to Russian
Re Mar 22, 2016

Ksenia Sergeeva wrote:
So, I've been trying to create a proper Deja Vu project (with Russian as source language) and start the translation, but some segments kept looking like the character encoding is wrong... You know, going all ãòēpù è ùòå and things like that. I've spent a lot of time trying to do something about it, and then some more time trying to see where it happened. Well, turns out that all the segments which go right after auto numbering numbers (e.g. 1.3. This and that) became unreadable in Deja Vu.
Any ideas about what I can do about it?

It's the fonts in the original. In documents meant for translation, try to stick to the standard Windows fonts: TNR, Arial, Courier New, plus newer ones like Calibri if the old ones aren't enough.
Look at the styles in the original that are tied to these numbered paragraphs. Maybe the numbers are set in some tricky font, which Deja Vu then applies to the whole sentence.
After all sorts of would-be typesetters have been at a file, you see stranger things than this.
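
To check the numbering fonts without clicking through every style in Word, something like the following rough Python sketch could help. It relies on a .docx being a ZIP archive whose numbering definitions live in `word/numbering.xml`, with fonts declared in `w:rFonts` elements. The function name `numbering_fonts` is made up for this example, and the sketch only reads fonts set directly on numbering levels, not fonts inherited from paragraph or character styles.

```python
import sys
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside .docx parts.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def numbering_fonts(docx_path):
    """Map (abstractNumId, level) to the fonts set on that numbering level.

    Hypothetical helper: levels that do not declare w:rFonts are skipped.
    """
    with zipfile.ZipFile(docx_path) as zf:
        if "word/numbering.xml" not in zf.namelist():
            return {}  # the document defines no numbering
        root = ET.fromstring(zf.read("word/numbering.xml"))

    found = {}
    for abstract in root.findall("w:abstractNum", NS):
        abs_id = abstract.get(f"{{{W}}}abstractNumId")
        for lvl in abstract.findall("w:lvl", NS):
            fonts = lvl.find("w:rPr/w:rFonts", NS)
            if fonts is None:
                continue
            level = lvl.get(f"{{{W}}}ilvl")
            # Strip the namespace prefix from attribute names (ascii, hAnsi, cs, ...).
            found[(abs_id, level)] = {
                name.rsplit("}", 1)[-1]: value
                for name, value in fonts.attrib.items()
            }
    return found


if __name__ == "__main__" and len(sys.argv) > 1:
    for key, fonts in sorted(numbering_fonts(sys.argv[1]).items()):
        print(key, fonts)
```

Run it with the path to the source file as the only argument. Any non-standard font listed for a numbering level (a symbol font, say) is a candidate for the one being spread over the whole segment.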

Ksenia Sergeeva wrote:
It's a docx Word file.
I'm sure it wasn't created in Office 97 or Japanese version of Word. The font is the same throughout the document, and the font in Deja Vu displays Cyrillic characters just fine... until they have numbering in front of them. Deja Vu also failed to import the table of contents from this file.

Convert it to DOCX. Deja Vu works better with that format.
And why do you need the TOC? Translate the document, update the field in the translated document, and you get a ready-made TOC.
That's better than struggling with tabs or spaces to set the indents in tables of contents made by dimwits.
In jobs like that, I find it even easier to apply heading styles following the original, so that I can then generate an automatic TOC.


Ksenia Sergeeva wrote:
Yes, you are right, this is weird. So I took some weird actions. I'm new to Deja Vu X3 and never had any problems with X2, so my actions were quite erratic... I've changed filters a couple of times, that's all I can remember properly. Then I decided to use another CAT tool, and this solved my problem.

God grant that you don't run into far more serious problems in other CAT tools.


 

