Problems with OCR and small text
Thread poster: James Greenfield
James Greenfield | United Kingdom | Local time: 06:43 | Member (2013) | French to English + ...
Hi,
I am currently translating a dead PDF. I managed to OCR the document, and the results were fine apart from the bibliography section at the end, which is in very small print. The results for this section make no sense. Using ABBYY FineReader, I tried to increase the resolution, but the results were equally bad. Does anyone have any advice? This is the first time I have had this problem. Perhaps someone with ABBYY FineReader could guide me on how to increase the resolution properly. When I try to do this, the image size automatically becomes smaller and it is still unable to recognise the text. Many thanks for any advice.

Sergei Leshchinsky | Ukraine | Local time: 08:43 | Member (2008) | English to Russian + ...
send me the file?
also, if it is raster, then all you have is all you have.

James Greenfield | United Kingdom | Local time: 06:43 | Member (2013) | French to English + ... | TOPIC STARTER
Sergei Leshchinsky wrote:
send me the file?
also, if it is raster, then all you have is all you have.
Thanks, I've just sent you an email.

James Greenfield | United Kingdom | Local time: 06:43 | Member (2013) | French to English + ... | TOPIC STARTER
could anyone help? | Nov 29, 2015
I don't suppose anyone has really powerful OCR software and would be prepared to do me a massive favour? I can't manage to OCR the bibliography, which is in small text, and hand-typing the 64 entries is going to take me a long time. Thanks very much.
Not sure if post-facto solutions will help | Nov 29, 2015 |
Hi James,
I'm not an expert, but I think that if the original document was not scanned at a high enough resolution, then attempts to increase the resolution of the scan won't help, because the "raw material" is inadequate. If I take a blurry photo of something, no amount of fiddling with the sharpness or resolution of the photo will give me a clear photo. I think the only alternative to typing out the text is to get a better scan.
Good luck!
Melissa

James Greenfield | United Kingdom | Local time: 06:43 | Member (2013) | French to English + ... | TOPIC STARTER
Hi Melissa,
Yes, I think that's right. This section is in English anyway, so I have decided not to include it. I thought about including it, as it is the bibliography and the French text refers to these English journals, but as you say there is no way of increasing the resolution, and typing it out by hand would take me an awfully long time.
James

Anton Konashenok | Czech Republic | Local time: 07:43 | French to English + ...
Do you really need to type it? | Nov 30, 2015
If the list of references is already in the target language anyway, it makes sense to ask the client if they'd accept it as a pasted image instead of text. If so, you can just copy it using the Snapshot tool of Adobe Reader, then paste it into your target document.

esperantisto | Local time: 08:43 | Member (2006) | English to Russian + ... | SITE LOCALIZER
Convert to black and white | Nov 30, 2015
In my experience, increasing the resolution above 300 dpi has no noticeable effect on recognition results, even for small print. However, there is one setting (off by default) that can be useful: Tools → Options → General → More options… → Convert color/gray-scale images to black and white (I am translating these menu items from the Russian UI of FR 8.0, so they may differ in your case). Try turning it on.
Also, if the sections in question are French only, do select French only for the language and (re)recognize.
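As a rough illustration of what a "convert to black and white" step does conceptually, here is a minimal Python sketch of global thresholding on a grayscale image held as a list of rows. The function name `binarize` and the fixed threshold of 128 are assumptions made for illustration only; the thread does not say how FineReader implements this option, and a real OCR package very likely uses something more adaptive.

```python
def binarize(gray, threshold=128):
    """Map each 0-255 grayscale value to pure black (0) or pure white (255).

    `gray` is a list of rows, each row a list of ints in 0..255.
    The threshold of 128 is an arbitrary illustrative default.
    """
    return [[255 if px >= threshold else 0 for px in row] for row in gray]


# A tiny 2x4 "scan": dark ink (low values) on a greyish background.
scan = [
    [30, 200, 90, 180],
    [140, 60, 220, 127],
]
bw = binarize(scan)
```

The idea is that faint grey halos around small glyphs are forced to either ink or paper, which can make tiny characters easier to separate. Whether it helps on a given scan is something to test rather than assume.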
Tom in London | United Kingdom | Local time: 06:43 | Member (2008) | Italian to English
James Greenfield wrote:
Hi,
I am currently translating a dead PDF. I managed to OCR the document, and the results were fine apart from the bibliography section at the end, which is in very small print. The results for this section make no sense. Using ABBYY FineReader, I tried to increase the resolution, but the results were equally bad. Does anyone have any advice? This is the first time I have had this problem. Perhaps someone with ABBYY FineReader could guide me on how to increase the resolution properly. When I try to do this, the image size automatically becomes smaller and it is still unable to recognise the text. Many thanks for any advice.
I don't know about you, James, but my ABBYY FineReader for macOS outputs to plain text. The resulting file can then be opened in Word and saved as a .doc file. Then you can alter the text any way you want to. I do this all the time.
[Edited at 2015-11-30 07:51 GMT]

Rolf Keller | Germany | Local time: 07:43 | English to German
Enlarge the picture externally | Dec 1, 2015
esperantisto wrote:
In my experience, increasing the resolution above 300 dpi has no noticeable effect on recognition results even for small print.
Ack.
Convert color/gray-scale images to black and white
Ack.
Plus plan C:
Enlarge the picture beforehand.
If need be, go to a copy shop, make an enlarged copy, try different contrast settings etc., then scan/export the result onto a USB stick. The shop staff will help you with this.
Back in your office, OCR the file on the stick.
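Enlarging can also be tried in software before running OCR. The following is a minimal sketch, assuming nearest-neighbour scaling by a whole-number factor on a grayscale image stored as a list of rows; `enlarge` is a hypothetical helper, not a FineReader function. As pointed out earlier in the thread, this adds no detail that the scan lacks. It only presents the OCR engine with larger glyphs, which may or may not help.

```python
def enlarge(image, factor):
    """Nearest-neighbour upscale: repeat every pixel `factor` times
    horizontally and every row `factor` times vertically.

    `image` is a list of rows of pixel values; `factor` is an int >= 1.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    out = []
    for row in image:
        # Widen the row by repeating each pixel.
        wide = [px for px in row for _ in range(factor)]
        # Emit independent copies of the widened row.
        out.extend(list(wide) for _ in range(factor))
    return out


# A 2x2 checker pattern becomes 4x4 at factor 2.
small = [[0, 255], [255, 0]]
big = enlarge(small, 2)
```

An image editor or imaging library would normally use a smoother interpolation method than this, but the principle is the same: more pixels per character, with no new information.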