How to convert a (scanned) PDF into Word in Trados 2019
Thread poster: Ulrike Cisar
Ulrike Cisar  Identity Verified
Germany
Local time: 20:08
Member (2015)
French to German
+ ...
Jan 13, 2021

Hi,

I need some help from someone familiar with Trados 2019. I bought the latest update in autumn, mostly for its OCR software for converting (scanned) PDF files into Word, since I receive quite a large number of non-editable PDFs from a certain agency. I hoped the software would save me time and effort, but somehow I just don't see how the conversion of the file in Trados is done. Watching tutorials hasn't helped, either. Is anybody here willing to help and share some useful advice?

Thank you!


 
Sonia Cunha-Goldner
United States
Local time: 15:08
English to Portuguese
Adobe Acrobat DC Jan 13, 2021

Hi, I use Adobe Acrobat DC to perform the OCR, and then import the PDF to Trados (I have 2021, but 2019 works too). Sometimes, it works, but sometimes I have to export from Adobe to Word and fix the errors, before importing to SDL Trados as a Word file.

 
Elena Bailey  Identity Verified
Spain
Local time: 20:08
Member
Spanish to English
+ ...
Add pdf directly to project and double-check converted document in project folder Jan 13, 2021

I upload the PDF files directly to the project (when creating a new project). Trados converts the file when it "prepares" the project, so you can then translate it. Before I start translating, I usually double-check that the conversion has been done correctly by going to the project folder (wherever you have saved it, locally or in the cloud) and opening the source-language folder. In this folder, you should see the original PDF file that you uploaded plus the converted file.
Sometimes Trados struggles with scanned documents or poor-quality files, so in these cases I also use Adobe to convert the PDF and then upload the Word file. But I would say that in general PDF conversion has improved in recent versions of Trados (I use the 2019 version too).
If for some reason your files are not converting at all, maybe double-check the "file types" section under "options", then scroll down to see the options selected for PDF files.
Send me a private message if you still have issues and I can send you some screenshots.
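
The folder check described in this post could be scripted roughly as follows. This is only a sketch under assumptions, not how Trados itself works: it assumes the converted file sits next to the PDF in the source-language folder and that its name starts with the PDF's base name (for example `report.docx` or `report.pdf.docx`). The function name `find_unconverted_pdfs` and the example path are made up for illustration.

```python
from pathlib import Path
from typing import List


def find_unconverted_pdfs(source_dir: Path) -> List[Path]:
    """Return PDFs in source_dir that have no converted counterpart beside them.

    Assumption (not confirmed for any Trados version): the converted file
    lives in the same folder and its name starts with the PDF's base name,
    e.g. report.docx or report.pdf.docx.
    """
    files = [p for p in source_dir.iterdir() if p.is_file()]
    pdfs = [p for p in files if p.suffix.lower() == ".pdf"]
    others = [p for p in files if p.suffix.lower() != ".pdf"]

    missing = []
    for pdf in pdfs:
        prefix = pdf.stem.lower() + "."
        # A PDF counts as converted if any non-PDF file shares its base name.
        if not any(o.name.lower().startswith(prefix) for o in others):
            missing.append(pdf)
    return sorted(missing)
```

Calling `find_unconverted_pdfs(Path("MyProject") / "en-US")` (a hypothetical project path) would list the PDFs that still need attention. It says nothing about the quality of the conversion, so opening the converted file is still necessary.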


 
Roy Oestensen  Identity Verified
Denmark
Local time: 20:08
Member (2010)
English to Norwegian (Bokmal)
+ ...
I would invest in OCR software Jan 14, 2021

I never trust the automatic OCR functions, as I find they mostly work well with editable PDF files, and even then the result isn't always satisfactory.

Personally, I prefer to rely on ABBYY FineReader, which I can recommend. It lets me indicate manually which parts of the document should be OCR-ed, which is especially useful when the PDF is a scanned, and therefore non-editable, document. After that I can export it as, for instance, a Word document, which I can import into Studio.


 
finnword1
United States
Local time: 15:08
English to Finnish
+ ...
Use external OCR software Jan 15, 2021

I agree 100% with Roy. Do the OCR first, using stand-alone software, then clean up the formatting and any OCR errors, and import the result into whichever CAT tool you use.
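
Part of the clean-up step can be automated once the OCR result has been exported as plain text. The sketch below is illustrative only: the function name `clean_ocr_text` and its two rules (rejoining words hyphenated at a line break, and merging hard line breaks inside a paragraph) are assumptions about typical OCR output, not something any OCR or CAT tool is known to do.

```python
import re


def clean_ocr_text(raw: str) -> str:
    """Tidy plain text exported from an OCR tool (illustrative rules only)."""
    text = raw.replace("\r\n", "\n")
    # Rejoin words split by a hyphen at a line break: "conver-\nsion" -> "conversion".
    # Limitation: a genuine compound broken at its hyphen ("well-\nknown")
    # is joined too, so the result still needs a human check.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Blank lines mark paragraph breaks; other line breaks are merged into spaces.
    paragraphs = re.split(r"\n\s*\n", text)
    cleaned = [re.sub(r"\s+", " ", p).strip() for p in paragraphs]
    return "\n\n".join(p for p in cleaned if p)
```

Merging the hard line breaks matters for CAT tools in particular, because a line break in the middle of a sentence usually ends up as a segment break.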

 
Schtroumpf
Local time: 20:08
German to French
+ ...
Trados has never worked well with PDFs Jan 15, 2021

Despite their claims, and AFAIK, Trados has never succeeded in producing anything worthwhile from a PDF, even when the document was not scanned but generated directly from MS Office.
This is not only my humble opinion but also what I have always been told in my professional environment, and some people there have really deep insight into Trados.
I agree 120% with the colleagues' suggestion that you should OCR your PDF before the Trados stage (and even fix the layout and spelling as well).


 
Dan Lucas  Identity Verified
United Kingdom
Local time: 19:08
Member (2014)
Japanese to English
Consider software, but also surcharges Jan 15, 2021

Roy Oestensen wrote:
Personally, I prefer to rely on ABBYY FineReader, which I can recommend.

I initially misunderstood the OP's question and thought she was asking about editable PDFs. In my experience Studio handles those very well, and transparently.

For image-only PDFs, I too would recommend ABBYY FineReader (or Nuance OmniPage). The problem is that OCR processing of documents is seldom straightforward, even with the best OCR software out there.

I charge clients who want me to tackle a non-editable PDF file the equivalent of a couple of hours of my time to get the source file into a readable state. The OCR process is just the start of it. At the very least, you'd want to compare the readable Word document with the image-only PDF source document before you start work, to ensure there are no unfortunate errors or spelling mistakes.
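
The comparison itself has to be done by eye, but a script can shortlist places to look first. As a minimal sketch, assuming the OCR result is available as plain text, the hypothetical helper `flag_suspicious_tokens` below reports tokens that mix letters and digits, a common OCR slip (`0` for `o`, `1` for `l`). It will also flag legitimate items such as paper sizes or part numbers, and it misses every error that happens to produce a real word.

```python
import re
from typing import List, Tuple

# A token that contains at least one letter and at least one digit,
# e.g. "c0nversion" or "fi1e". [^\W\d_] matches any Unicode letter.
MIXED = re.compile(r"\b(?=\w*[^\W\d_])(?=\w*\d)\w+\b")


def flag_suspicious_tokens(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, token) pairs worth checking against the scan."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in MIXED.finditer(line):
            hits.append((number, match.group()))
    return hits
```

For the two-line input `"The c0nversion works.\nInvoice 2021 for fi1e A4."` this reports `c0nversion` on line 1 and both `fi1e` and `A4` on line 2. `A4` is a false positive, which is why the list is only a starting point for the manual check.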

In addition, when I quote them a charge for the OCR, it's surprising how often clients suddenly find themselves able to produce a Word file after all. I prefer it that way, not least because they then have responsibility for the source file.

Regards,
Dan


 
A. & S. Witte
Germany
Local time: 20:08
German to English
+ ...
OK, sounds good. But does it really have to be subscription-based SaaS? Jan 17, 2021

Sonia Cunha-Goldner wrote:

Hi, I use Adobe Acrobat DC to perform the OCR, and then import the PDF to Trados (I have 2021, but 2019 works too). Sometimes, it works, but sometimes I have to export from Adobe to Word and fix the errors, before importing to SDL Trados as a Word file.


Admittedly, that is a success rate that even a skilled operator of the automatic and semi-automatic modes in newer versions of FineReader, such as my wife, cannot report. Those modes mostly yield results that need manual post-OCR formatting before they are fit for a CAT tool, which is why she sometimes uses FineReader's intricate manual mode for poor source documents. So that does it for me, although the comparison between the two tools obviously depends on what sort of source documents you still accept (how poor they can be before you turn them down). However, does it really have to be subscription-based Software as a Service? After all, we are only translators, not Jennifer and Jonathan Hart.

Cheers,

Sebastian


 
Anthony Rudd

Local time: 20:08
German to English
+ ...
PDFs and Trados Jan 19, 2021

I would agree with "Schtroumpf". I recently had to process a non-scanned PDF file. Although Trados recognized the text correctly, a great many segments came out in "random" order, which made merging segments (many were partial sentences) impracticable. MemoQ processed the same file without a problem.

 
Ulrike Cisar  Identity Verified
Germany
Local time: 20:08
Member (2015)
French to German
+ ...
TOPIC STARTER
How to convert a (scanned) PDF into Word in Trados 2019 Jan 19, 2021

Thank you all for your valuable input and for sharing your experiences!

 

