Macro to mark repetitions in a document?
Thread poster: Samuel Murray
Samuel Murray
Netherlands
English to Afrikaans
Jul 3, 2014

Hello everyone

I have a text file that contains one segment per line. Some lines are repetitions of previous lines. I would like to know if there is a macro (e.g. in MS Word) that will mark all subsequent repetitions, e.g. in a different colour, so that a translator working on such a file in a non-CAT tool can see when he reaches a segment that he has already translated. The macro should not mark the first occurrence of a repeated segment (or it should mark the first occurrence in a different way), because the translator should still translate that first occurrence.

Do you know of such a tool?

Thanks
Samuel


 
Hugo Rincón
Venezuela
English to Spanish
Will you do this word by word? Jul 3, 2014

I mean, will you go through the document and specifically select the words that are to be found and highlighted? If that's the case, you could use a very useful MS Word feature. I'm talking about "Find and Replace". You search for the word in the "Find" tab and then click on the checkbox that says "Highlight all items found in". This will allow you to "select" all the occurrences of that particular word or phrase and change the format without modifying or altering the rest of the document.


 
Gerard de Noord
France
English to Dutch
Show us your macro first Jul 3, 2014

Dear Samuel,

On this website you have a record of asking questions you know the answer to. Maybe it's more productive to publish the macro you've created, tell us where it goes wrong and ask us for a solution.

Cheers,
Gerard


 
Samuel Murray
Netherlands
English to Afrikaans
TOPIC STARTER
Nice to know I have a "record" Jul 4, 2014

Gerard de Noord wrote:
On this website you have a record of asking questions you know the answer to.


I did not realise that it came across like that. Obviously when I ask a question I continue to search for an answer (including, when I don't find anything, writing something myself), and when I do find the answer (or write something myself) it is only useful to share it. That is my personal opinion.

Maybe it's more productive to publish the macro you've created, tell us where it goes wrong and ask us for a solution.


I haven't created any. I don't have time for that right now. I have an unfortunate situation which turns out to be somewhat worse every time I think that I may have found a solution for it.

I blame Trados, of course.


 
Tony M
France
French to English
Is the document order important? Jul 4, 2014

If not, I'd say create a table with as many columns as necessary, then sort it alphabetically on the column with your words in, so that all the identical segments will at least appear together.

If the order IS important, then you could create a sequentially-numbered indexing column at the extreme left, say, then sort alphabetically on the 'wanted word' column, and finally after translation re-sort on the 'index' column to get the original order back.

Of course, if the text needs to be read in sequence for the translation to make sense, then you're scuppered.


 
Samuel Murray
Netherlands
English to Afrikaans
TOPIC STARTER
@Hugo Jul 4, 2014

Hugo Rincón wrote:
I mean, will you go through the document and specifically select the words that are to be found and highlighted?


No, I don't want to do that. I need something that can be repeated for a number of files with little or no effort on my part. In particular, I don't want to have to make sure I follow every step of a long process when midnight is approaching and there is an obligation but practically no financial gain.

You search for the word in the "Find" tab and then click on the checkbox that says "Highlight all items found in". This will allow you to "select" all the occurrences of that particular word or phrase and change the format without modifying or altering the rest of the document.


Thanks, I'm aware of that feature. However, to "Find" these segments, I first need to know which segments are repeating, and then I have to enter them into the Find dialog one by one. In addition, I would have to perform an ordinary search as well, because I need to unmark the first occurrence of the repetition.

My situation is that I have a translator who doesn't use CAT and who can't (right now) use CAT (I considered the OmegaT route since that is the simplest CAT tool I know of, but OmegaT has grown into a monster download in recent months, and I'm dealing with a slow-expensive-internet here).

If a file has 1000 segments and 250 of them are repetitions, and the repetitions are not marked, then not only will the translator translate the repetitions inconsistently but he'll also spend a lot more time on the translation than the end-client originally envisaged. I'm in the very fortunate situation that there are practically no internal fuzzies for this project, and no client TM.
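To put numbers like these on a real file before handing it over, a quick count is enough. The following is only a Python sketch, not a Word macro; the function name `repetition_stats` and the sample segments are made up for illustration, and only exact, character-for-character repetitions are counted.

```python
from collections import Counter


def repetition_stats(lines):
    """Return (total, unique, repeated) segment counts, ignoring blank lines."""
    segments = [line for line in lines if line.strip()]
    counts = Counter(segments)  # exact-match counting only (assumption)
    total = len(segments)
    unique = len(counts)
    return total, unique, total - unique


# Hypothetical sample data; a real file would be read line by line instead.
sample = ["Save", "Open file", "Save", "Close", "Open file", "Save"]
total, unique, repeated = repetition_stats(sample)
print(f"{total} segments, {unique} unique, {repeated} repetitions")
```

For a real file, the list could come from something like `open(path, encoding="utf-8").read().splitlines()`, with the encoding adjusted to whatever the file actually uses.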

Methods that involve sorting the file (without unsorting it afterwards) are not useful to me because many of the segments need to be translated in context.


 
Samuel Murray
Netherlands
English to Afrikaans
TOPIC STARTER
@Tony Jul 4, 2014

Tony M wrote:
If the order IS important, then you could create a sequentially-numbered indexing column at the extreme left, say, then sort alphabetically on the 'wanted word' column, and finally after translation re-sort on the 'index' column to get the original order back.


That is the one solution that I was thinking of last night. However, I don't know how to create such an indexing column in MS Word. Do you know how to do it? I can record macros but not write them.

I might need something like:
* Count number of ^p
* For every ^p, replace ^p with ^p countnumber ^t
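
The two steps above, combined with the sort-and-restore idea quoted from Tony, could also be done outside Word on the plain text file. This is a minimal Python sketch rather than a recorded macro; the function names `add_index`, `sort_by_text` and `restore_order` are invented for illustration, and it assumes one segment per line with the index and text separated by a tab.

```python
def add_index(lines):
    """Prefix each segment with its 1-based line number and a tab."""
    return [f"{number}\t{text}" for number, text in enumerate(lines, start=1)]


def sort_by_text(indexed_lines):
    """Sort on the text column so identical segments end up next to each other."""
    return sorted(indexed_lines, key=lambda line: line.split("\t", 1)[1])


def restore_order(indexed_lines):
    """Re-sort on the numeric index column to get the original order back."""
    return sorted(indexed_lines, key=lambda line: int(line.split("\t", 1)[0]))


# Hypothetical sample segments, for illustration only.
segments = ["Save", "Open file", "Save", "Close", "Open file"]
indexed = add_index(segments)
grouped = sort_by_text(indexed)
restored = restore_order(grouped)
print("\n".join(grouped))
```

Because the index travels with each line, the text column can be replaced by its translation while the lines are grouped, and `restore_order` still brings the file back into its original sequence.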


 
Tony M
France
French to English
Quick 'n' dirty Jul 4, 2014

It all depends on the maximum document length you have; if there aren't TOO many lines in any one doc, then it might be simplest just to create a sequential column in, say, Excel (you know, each cell = the cell above + 1, then fill down), then paste it across into Word.

Sorry, I don't know how to do macros, so can't suggest a solution in that direction.

Don't know if it might be possible to use Word's own line numbering facility?


 
Orrin Cummins
Japan
Japanese to English
This might get you on the right track Jul 4, 2014

Copy and paste the text into a blank Excel sheet, make sure all cells within that column are selected (there should be only one column), then on the Home tab go to:

Conditional Formatting -> Highlight Cells Rules -> Duplicate Values

From here you can select whether to highlight duplicate values or unique values.

The problem is that if you choose duplicate values, it also highlights the first iteration of that string. But at least this might save you some time until you figure out a better way.
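
The limitation described above (the first occurrence gets highlighted too) goes away if each line is only compared with the lines before it. Here is a minimal Python sketch of that idea, working on the plain text rather than in Excel; the function name `mark_repetitions` and the `[REP] ` prefix are assumptions chosen for illustration, not features of any tool mentioned in this thread.

```python
def mark_repetitions(lines, marker="[REP] "):
    """Prefix every repeated line, except its first occurrence, with `marker`."""
    seen = set()
    marked = []
    for line in lines:
        if line in seen:
            # Already encountered earlier in the file: flag it for the translator.
            marked.append(marker + line)
        else:
            # First occurrence: leave it unmarked so it still gets translated.
            seen.add(line)
            marked.append(line)
    return marked


# Hypothetical sample segments, for illustration only.
segments = ["Save", "Open file", "Save", "Close", "Open file", "Save"]
for line in mark_repetitions(segments):
    print(line)
```

The marked text can then be pasted into Word, where a single Find and Replace on the prefix can turn it into a colour or highlight.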


 
Terry Richards
France
French to English
This will get you started Jul 4, 2014

Sub MarkDup()
'
' MarkDup Macro
' Macro created 04/07/2014 by Terry Richards
'
    ' Move to start of doc
    Selection.HomeKey Unit:=wdStory

    Dim Para As Paragraph
    Dim ParaIx As Long
    Dim PrevIx As Long
    ParaIx = 0

    ' For each paragraph
    For Each Para In ActiveDocument.Paragraphs
        ParaIx = ParaIx + 1

        ' See if this paragraph has a duplicate above
        For PrevIx = ParaIx - 1 To 1 Step -1
            If ActiveDocument.Paragraphs(PrevIx).Range = Para.Range Then
                ActiveDocument.Paragraphs(ParaIx).Range.Font.Animation = wdAnimationMarchingRedAnts
            End If
        Next PrevIx
    Next Para
End Sub

This will mark all duplicate paragraphs with marching red ants. As is, it only finds exact duplicates - including formatting and case - but you can modify that if you want.
It could also be sped up by stopping the backwards search once you get a hit.
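
Both suggestions (looser matching and a faster search) can be illustrated outside Word. The following is a Python sketch on plain text lines, not a modification of the macro above: it replaces the backwards search with a dictionary lookup, ignores case and extra whitespace, skips empty lines, and records where each repetition first appeared. The names `normalise` and `find_repetitions` are made up for illustration.

```python
def normalise(text):
    """Collapse runs of whitespace and ignore case, so near-identical lines match."""
    return " ".join(text.split()).casefold()


def find_repetitions(lines):
    """Return (line_number, first_occurrence_line_number) pairs, 1-based."""
    first_seen = {}  # normalised text -> line number of its first occurrence
    repetitions = []
    for number, line in enumerate(lines, start=1):
        key = normalise(line)
        if not key:
            continue  # skip empty lines instead of treating them as repetitions
        if key in first_seen:
            repetitions.append((number, first_seen[key]))
        else:
            first_seen[key] = number
    return repetitions


# Hypothetical sample segments, for illustration only.
segments = ["Save file", "", "save  file", "Close", "", "SAVE FILE "]
print(find_repetitions(segments))
```

Knowing the line number of the first occurrence lets the translator copy the earlier translation instead of retyping it. Whether case and spacing differences should really count as repetitions depends on the project, so `normalise` is the part to adjust.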


 
Johan Kjallman
English to Swedish
process the files in memoQ Jul 4, 2014

Hi Samuel,

In memoQ it's possible to create a view of all repetitions (excluding the first occurrences). I can therefore think of a few different ways to pre-process the files in memoQ to meet your needs. But before I get into details, would you consider using memoQ for this task? Would it be important that the translator works with the original layout in Word? If not, a quick and dirty method could be to export a two-column RTF file in which the reps have a different status. You can reimport this file into memoQ afterwards to export a clean file with the original layout.

/Johan


 
Samuel Murray
Netherlands
English to Afrikaans
TOPIC STARTER
Okay, how does one do it in MemoQ? Jul 5, 2014

Johan Kjallman wrote:
In memoQ it's possible to create a view of all repetitions (excluding the first occurrences).


I do have access to a MemoQ installation, so that would seem like a good solution. Do you know of a video that explains this procedure?

Would it be important that the translator works with the original layout in Word?


My file is a plain text file with all segments delimited by line breaks, so, no, it's not important.


 
Johan Kjallman
English to Swedish
I'll record the procedure Jul 5, 2014

Hi Samuel,

Great! I don't know if there is any existing video that explains this procedure, but I can make a recording later today showing how I'd do it. You can send me a test file if you want to (if so, send me a private message).

/Johan


 
Johan Kjallman
English to Swedish
Videos Jul 5, 2014

Hi again,

I have now recorded two videos with two different approaches.
These are actually my first videos, so the quality is what it is. Hopefully the message gets through though.

https://dl.dropboxusercontent.com/u/14601215/highlight_reps_part1.swf
https://dl.dropboxusercontent.com/u/14601215/highlight_reps_part2.swf

BR/Johan


 

