
Exporting my BigMama to Trados txt - how long???




 


Thread poster: Wolfgang Jörissen

Wolfgang Jörissen
Poland
Member
Dutch to German
MODERATOR
Aug 28, 2008

Hi there,

I started an attempt to export my BigMama to Trados txt about 15 minutes ago. I can see my hard disk activity indicator flickering, but not a single block under "Exporting..." in the export wizard has shown up yet. I should add that it is a really huge venture, with 500,000+ units and a file size of about 1 GB (all files combined). I know DVX does things like this slowly but steadily, but does anyone have an idea just how long it might take on a dual-core 1.77 GHz laptop with 2 GB of RAM? One day's shopping or a short vacation?

[Edited at 2008-08-28 08:38]


Harry Bornemann
Germany
Member since 2002
English to German
A short vacation Aug 28, 2008

I estimate 10 MB per hour, which would make about 4 days, but it could just as well take 8 days.

Wolfgang Jörissen
Poland
Member
Dutch to German
MODERATOR
1.5 hours later Aug 28, 2008

... and literally 1 block showing up. Well, as I said, slowly but steadily. But at least something _is_ happening.

FarkasAndras
Hungary
English to Hungarian
slow Aug 28, 2008

That's a pretty ridiculous pace.
I take it you're using Déjà Vu, right? How long would it take to import such a file?

Trados handles TMs twice that size in a reasonable time. Importing a million TUs takes well under 10 minutes IIRC, certainly under half an hour.
I expect that the export of 500,000 TUs would take WELL under an hour, possibly under 5 minutes. It's just a couple of hundred MB of text, not HD video encoding.


Wolfgang Jörissen
Poland
Member
Dutch to German
MODERATOR
Drawback indeed Aug 28, 2008


FarkasAndras wrote:

That's a pretty ridiculous pace.
I take it you're using Déjà Vu, right? How long would it take to import such a file?


AFAIK, importing does not take that long, but the export speed is indeed a drawback. Trados might have faster import/export, but I am quite happy with the translation features of DVX, so I do not see any reason to switch.

Any idea whether the export can be speeded up in some way (other than by adding RAM)? A different format, perhaps?


Kevin Lossner
Germany
Member since 2003
German to English
Speeded up? Aug 28, 2008


Not a chance except with more RAM or a faster computer. Slow import/export is one of those crosses we DVX users have to bear for now. Not an issue really for a typical customer TM with 10,000-20,000 TUs, but the big ones are a killer. I've been trying to get up the nerve to import that awful TM from the EU for months, but I'd have to plan to have the system tied up for a few days. Unfortunately, this almost cries out for a service bureau to do this sort of thing.

Trados, in contrast, is gratifyingly fast for this function, which is one of the few good things I can say about that tool. I was amazed how fast the 350,000 or so DE-EN TUs from that EU mess imported!


Wolfgang Jörissen
Poland
Member
Dutch to German
MODERATOR
Test with Access Aug 28, 2008

I did a test export to Access. After interrupting it after 2 minutes, I found that about 4000 TUs had been exported, so let's say just under 1%. Is it realistic to assume that it will go on at that pace? This would mean an export time of 21 hours. Long enough all right, but still a lot less than a week.
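
A linear extrapolation like this is easy to check with a few lines of Python. The sketch below assumes a constant export pace and a total of exactly 500,000 TUs; the function name is made up for illustration. With the figures as quoted (4,000 TUs in 2 minutes) it gives roughly 4 hours, whereas an estimate of about 21 hours would correspond to a sample of roughly 10 minutes, so it is worth re-checking the sample duration before planning around either number.

```python
def estimate_export_hours(total_units: int, sample_units: int, sample_minutes: float) -> float:
    """Linearly extrapolate the total export time (in hours) from a timed sample."""
    if sample_units <= 0 or sample_minutes <= 0:
        raise ValueError("sample_units and sample_minutes must be positive")
    units_per_minute = sample_units / sample_minutes
    return total_units / units_per_minute / 60


# Figures from the post: about 500,000 TUs in total, 4,000 TUs in the sample.
print(f"{estimate_export_hours(500_000, 4_000, 2):.1f} h")   # 2-minute sample, as quoted
print(f"{estimate_export_hours(500_000, 4_000, 10):.1f} h")  # hypothetical 10-minute sample
```

The estimate only holds if the pace stays constant, which is exactly the open question here.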

[Edited at 2008-08-28 11:23]


Harry Bornemann
Germany
Member since 2002
English to German
Yes, go for Access Aug 28, 2008

Any measured value is better than an estimate
(which is still better than no idea at all),
and exporting from Access to a text file is fast, too.


Selcuk Akyuz
Turkey
Member since 2006
English to Turkish
MODERATOR
Access Aug 28, 2008

I have tested it now: exporting a small TM (20,000 segments) took only 2 minutes. But my TM had been compacted (or repaired). Last year the export of a TM with more than 300,000 segments took some 6 hours or more.

But you can always use Access for this purpose. Open (a copy of) your TM with Access. Open the table "Sentences", first move the "Lang" column next to the "Sentence" column, and then copy the "Lang" and "Sentence" columns into a text document (ANSI format, AFAIK).

Then copy the contents of the text document into Word.

Replace the language code of the source segment (in my case 9), followed by a tab, with

</TrU>
<TrU>
<Seg L=EN-GB>


Replace the language code of the target segment (in my case 31), followed by a tab, with
<Seg L=TR>

Move to the top of the document, and replace the first </TrU> with
<RTF Preamble>
<FontTable>
</RTF Preamble>

Then go to the bottom of the document and make sure that the last segment is followed by

</TrU>



Copy all the text in Word and paste it back into the text document (save as ANSI, AFAIK).

The file is now ready for import into Trados.
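
For TMs too large for the clipboard and Word, the same search-and-replace steps could be scripted. The following Python sketch is only an illustration under several assumptions: the input rows are tab-separated "Lang" code and sentence text, each source row is followed directly by its target row, and the codes 9 and 31 map to EN-GB and TR as in the steps above. The function and constant names are hypothetical, and reading and saving the file in an ANSI code page is left out.

```python
from typing import Iterable, List

# Numeric "Lang" codes mapped to Trados language tags.
# 9 and 31 are the codes mentioned in the post; other TMs will need other values.
LANG_TAGS = {"9": "EN-GB", "31": "TR"}
SOURCE_CODE = "9"

HEADER = ["<RTF Preamble>", "<FontTable>", "</RTF Preamble>"]


def rows_to_trados_txt(rows: Iterable[str]) -> List[str]:
    """Turn 'code<TAB>text' rows into the lines of a Trados txt file."""
    out = list(HEADER)
    unit_open = False
    for row in rows:
        code, _, text = row.rstrip("\r\n").partition("\t")
        if code not in LANG_TAGS:
            continue  # row in a language that is not part of this pair
        if code == SOURCE_CODE:
            if unit_open:
                out.append("</TrU>")  # close the previous unit
            out.append("<TrU>")
            unit_open = True
        elif not unit_open:
            continue  # target row without a preceding source row
        out.append(f"<Seg L={LANG_TAGS[code]}>{text}")
    if unit_open:
        out.append("</TrU>")  # the last segment must be followed by </TrU>
    return out


sample = ["9\tHello world", "31\tMerhaba dünya"]
print("\n".join(rows_to_trados_txt(sample)))
```

Like the manual procedure, this does not verify that every source segment actually has a target, so a spot check of the output before importing is advisable.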



[Edited at 2008-08-28 15:13]


Wolfgang Jörissen
Poland
Member
Dutch to German
MODERATOR
Wow, I'm impressed Aug 28, 2008

Selcuk, this looks like a very good solution. Obviously, a part of me is still living in the DV3 age, when MDBs were protected and could not be opened directly in Access.

Well, opening the table and moving the column as you advised is no problem, but the number of records blows the clipboard, of course. Does anybody know of a clipboard enhancement that would eliminate this problem? Right now, the clipboard copied a mere 4,700 entries, that's all.

I also tried to export the whole thing as a txt file, but apart from the fact that Access actually crashed, I would still have the unnecessary columns to get rid of. If I opened the txt file with OpenOffice Calc, would my TUs be truncated?

Another issue: the TM is basically bilingual, but I spilled a couple of texts in other combinations into it. Any ideas how to deal with that when making the txt file Trados-ready?

And... any idea how to get the date in there? Maybe convert to a table in Word? That might take ages again.

[Edited at 2008-08-28 18:41]


Rie Matsuda
United States
Member since 2006
English to Japanese
filter out unnecessary columns, or... Aug 28, 2008

Wolfgang Jörissen wrote:


I also tried to export the whole thing as a txt file, but apart from the fact that Access actually crashed, I would still have the unnecessary columns to get rid of. If I opened the txt file with OpenOffice Calc, would my TUs be truncated?

**************

You may have tried this already, but when you choose, within DV, to export as txt (just plain txt, not Trados txt), the wizard takes you to a blank window where you select the fields you would like to export, and you can even change the column order. Was it still too big to handle?

Then... I'm wondering whether you could take a "divide and conquer" approach. When you choose a DV TM as the export destination, there is an option to define filtering conditions. I was once able to filter the data by date (you need to write an SQL fragment), but I have never done it any other way. If this is possible, you can create subsets of this big TM and do the rest of the TMX export work in any way you like.


Rie


Selcuk Akyuz
Turkey
Member since 2006
English to Turkish
MODERATOR
Export in parts using SQL Aug 28, 2008

Hi Wolfgang,

Another solution, which I hope is correct:
you can use the following SQL filter (it should export the first 100 TUs):


ID IN (SELECT ID FROM Sentences WHERE ID >= 1 AND ID <= 100)


Any SQL gurus out there? I am not sure whether this command is correct.
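
To export a large TM in parts, one such filter is needed per ID range. The Python sketch below generates the fragments and tries them against an in-memory SQLite table. SQLite is only a stand-in here, not the actual DVX/Access database, so this checks the logic of the range filter rather than how DVX itself evaluates it; the helper name, the batch size, and the dummy data are all made up for illustration.

```python
import sqlite3
from typing import Iterator


def id_range_filters(max_id: int, batch_size: int) -> Iterator[str]:
    """Yield one filter fragment per batch of IDs, covering 1..max_id."""
    for start in range(1, max_id + 1, batch_size):
        end = min(start + batch_size - 1, max_id)
        yield f"ID IN (SELECT ID FROM Sentences WHERE ID >= {start} AND ID <= {end})"


# Stand-in table with 250 dummy rows (the real TM lives in an Access MDB).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sentences (ID INTEGER PRIMARY KEY, Sentence TEXT)")
conn.executemany(
    "INSERT INTO Sentences (ID, Sentence) VALUES (?, ?)",
    [(i, f"segment {i}") for i in range(1, 251)],
)

# Count how many rows each fragment selects.
for fragment in id_range_filters(max_id=250, batch_size=100):
    count = conn.execute(
        f"SELECT COUNT(*) FROM Sentences WHERE {fragment}"
    ).fetchone()[0]
    print(count, fragment)
```

Whether the IDs in a real TM are contiguous, and whether one ID corresponds to one TU, is not something this sketch can confirm.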



[Edited at 2008-08-28 20:49]


Wolfgang Jörissen
Poland
Member
Dutch to German
MODERATOR
Thanks for all the suggestions Aug 29, 2008

... that I am dying to try out.
Just off the record: a couple of years ago, I postponed my transition from DV3 to DVX for the very same reason. It took just about a full working day to get it all imported (still on a slower processor and with less RAM), and before that I made several attempts that I broke off along the way. That is why the TM in my main combination STILL goes by the name "BigMama NL-DE another attempt".


FarkasAndras
Hungary
English to Hungarian
best of luck Aug 30, 2008


Wolfgang Jörissen wrote:


FarkasAndras wrote:

That's a pretty ridiculous pace.
I take it you're using Déjà Vu, right? How long would it take to import such a file?


Trados might have a faster import/export, but I am quite happy with the translation features of DVX, so I do not see any reason to switch.



... and I didn't mean that you should switch at all; I was just surprised that it takes so long with DVX compared to Trados. As I said, it's just text and you have reasonably powerful hardware. An inefficient implementation, I guess.


David Turner
France
Member since 2006
French to English
Import first into the lexicon Sep 4, 2008


Kevin Lossner wrote:
I've been trying to get up the nerve to import that awful TM from the EU for months, but I'd have to plan to have the system tied up for a few days.


You could import it first into the Lexicon, which is lightning fast in comparison (taking only a few minutes), and then send the Lexicon to the TM, which again takes only a short time.
Importing big TMs straight into a DVX MDB does indeed take an inordinate amount of time, so there must be a major flaw somewhere in the works. I believe Atril have identified the problem and are working on it.

David Turner


Moderators of this forum: GoodWords