Batch removing old translation variants from TM

This Forum is for general issues about MadCap Lingo
Post Reply
okonvick
Propeller Head
Posts: 27
Joined: Mon Jan 12, 2015 6:19 am

Batch removing old translation variants from TM

Post by okonvick »

I have a large TM developed over the course of many years. We have recently done a revision of the translation of one of our larger projects, so some existing segment translations were edited. As a result, we now have many cases when the TM includes two variants of how to translate the same thing. I would like to remove the old ones since Lingo often applies the older (not the updated) translation when autotranslating the project using the TM. Is there any tool for this or do I have to start programming to dig through the SQL database storing the TM?
okonvick
Propeller Head
Posts: 27
Joined: Mon Jan 12, 2015 6:19 am

Re: Batch removing old translation variants from TM

Post by okonvick »

To answer my own question, Lingo does not include a function for this, so I've put together a piece of SQL code that solves this problem.

The script looks for translations the source segments of which are the same - in other words for different translations of the same thing. If more than one translations are found (a duplicity), the translations are sorted from oldest to the newest and everything except the newest is removed. It could also happen that after this step, there is still more than one translation left since it can happen that two translations have the same creation datetime - if that happens, the script only leaves the one with the highest ID, which is safe to take as the newest one.

This solved the problem for me without any exceptions.

You can run this code on the translation memory database - for example in SQL Management Studio.

Code: Select all

delete from Translations where id in (
select 
t.id
from Translations t 
join Segments source on (source.id = t.sourceSegment_id)
join Segments targ on (targ.id = t.targetSegment_id)
join 
(
select 
cast(s_source.RawText as varchar(max)) as src_rawtext,
s_source.language_id,
max(s_targ.ModifiedOn) as targ_mod
from 
Translations s_t
join Segments s_source on (s_source.id = s_t.sourceSegment_id)
join Segments s_targ on (s_targ.id = s_t.targetSegment_id)
group by cast(s_source.RawText  as varchar(max)),s_source.language_id) ondra
on (ondra.src_rawtext =  cast(source.RawText  as varchar(max)) and ondra.language_id = source.language_id
and ondra.targ_mod > targ.ModifiedOn) 
);
go


delete from Translations where not id in (
select max(t.id)
from 
Translations t
join Segments source on (source.id = t.sourceSegment_id)
join Segments targ on (targ.id = t.targetSegment_id)
group by cast(source.RawText as varchar(max)),source.language_id
);
go

delete  from Segments 
where not exists (Select 1 from Translations t where t.sourceSegment_id = Segments.id or t.targetSegment_id= Segments.id);
go
Don't forget to backup your translation memory before trying this ;)
ASTeC
Propeller Head
Posts: 15
Joined: Thu Feb 28, 2008 3:26 am
Location: Belgium
Contact:

Re: Batch removing old translation variants from TM

Post by ASTeC »

Hello,

You can also avoid the creation of these double translations. To do this, update the translated segments instead of adding a new (and additional) translation (as it is done by default)...

For detailed procedure, see https://help.madcapsoftware.com/lingo11 ... Source.htm
Post Reply