Word count having 2 different languages in source document
Thread poster: Ana Lopez

Ana Lopez | Mexico | Local time: 23:43 | Member (2013) | English to Spanish + ...
Hello!!
I'm working on a PDF document that has German and English in two "columns", and I only have to translate the English part. Do you know any way I can count ONLY the English words?
Trados has statistics, but I don't know if there is a tool to count by language.
The only way I can think of is to count them manually. Do you know anything faster?
Thank you.

Jack Doughty | United Kingdom | Local time: 06:43 | Russian to English + ... | In memoriam
Convert to Word | Jun 4, 2014
You can convert it to Word using OCR. ABBYY FineReader and ABBYY PDF Converter come to mind.

Ana Lopez | Mexico | Local time: 23:43 | Member (2013) | English to Spanish + ... | TOPIC STARTER
Can Word count by language? | Jun 4, 2014
Thanks! I already converted it to Word; however, since the columns are mixed with images, I cannot just "select" the English column. That's why I'm asking if there is any other way than marking it page by page. Maybe there isn't, just asking.

Tony M | France | Local time: 07:43 | French to English + ... | SITE LOCALIZER
Are languages set? | Jun 4, 2014
When you did the conversion using OCR, were you able to set the languages of the relevant bits?
If the text DOES have its 'language' attributes correctly set, then you can do an ordinary word count in Word; then search and replace all for 'any character' + language attribute = (say) German, replacing with nothing.
Then do another word count, and this will be the EN words without the German ones; in fact, you don't even need to have done the preliminary word count, I was just thinking of subtracting the EN from the total, since TOTAL – EN = German, of course!
Naturally, if the language attribute was NOT correctly set in the first place, this won't work; but at least you'll know for next time.
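For anyone who would rather script the check than use find and replace, here is a rough sketch of the same idea. It assumes the converted file is a .docx, whose `word/document.xml` part marks each run's proofing language with a `w:lang` element. The function name `count_words_by_language` and the "unset" bucket are made up for this illustration, and the counts are whitespace-separated tokens, so they will not match Word's own statistics exactly (a word split across two runs is counted twice). Runs with no explicit `w:lang` inherit their language from styles, which this sketch does not resolve.

```python
import xml.etree.ElementTree as ET
import zipfile
from collections import Counter

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}


def load_document_xml(path):
    """Read the main document part out of a .docx file (a zip archive)."""
    with zipfile.ZipFile(path) as archive:
        return archive.read("word/document.xml")


def count_words_by_language(document_xml):
    """Count whitespace-separated words per explicit w:lang value.

    Runs without a w:lang element are grouped under "unset"
    (hypothetical label; their real language comes from the styles).
    """
    root = ET.fromstring(document_xml)
    counts = Counter()
    for run in root.iter(f"{{{W_NS}}}r"):
        lang_el = run.find("w:rPr/w:lang", NS)
        lang = lang_el.get(f"{{{W_NS}}}val") if lang_el is not None else None
        text = "".join(t.text or "" for t in run.findall("w:t", NS))
        n_words = len(text.split())
        if n_words:
            counts[lang or "unset"] += n_words
    return counts
```

With the result in hand, `sum(counts.values()) - counts["de-DE"]` gives the "total minus German" figure described above, provided the German runs really are tagged `de-DE` in your file.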
BTW, you say that the images are stopping you from selecting all the EN column, but why? Are they in merged cells or something? You ought to be able to process your table in such a way as to unmerge all the cells, which will probably push all the images into the l/h column or something, but will leave you with two clean columns you can select properly.
You are SURE it is in a proper Word table? OCR conversions have a nasty habit of 'organizing' (well, that's not what I call it...) text into newspaper-style columns, in which case you'll have a harder job on your hands trying to sort it out. It might even be simpler to convert everything to single-column and remove all column breaks from the document, and then see what you have left...

Tony M | France | Local time: 07:43 | French to English + ... | SITE LOCALIZER
Failing that... | Jun 4, 2014
...if the original document really is organized neatly into two columns, why not just do another 'dummy' OCR run on it, selecting ONLY the EN column as you go through, so you'll actually have a document at the end of it that ONLY contains the EN you need to translate; you might even be able to use this for your translation, or at worst, it will be a useful intermediate stage for your word count.
[Edited at 2014-06-04 20:58 GMT]

Ana Lopez | Mexico | Local time: 23:43 | Member (2013) | English to Spanish + ... | TOPIC STARTER
I'll try the option | Jun 4, 2014
I'll try making a dummy OCR conversion, from Abbyy, only identifying English as language, and see how it goes with the find & replace.
Thank you so much, Tony M.!!

Ümit Karahan | Turkey | Local time: 08:43 | English to Turkish + ...
Paste only text | Jun 5, 2014
Hi.
Try copying everything with Ctrl+A, Ctrl+C, and then paste it as text only into a blank Word page. That way you can get rid of the images.
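Once the text is pasted as plain text, the language attributes are gone, so a different trick is needed to split the count. The sketch below is not something suggested in this thread: it is a crude function-word heuristic that guesses each line's language from a handful of very common English and German words. The word lists and the names `guess_language` and `count_words_by_guess` are assumptions for illustration only, and short or mixed lines will often land in "unknown" and need a manual check.

```python
# Tiny hint lists of common function words (illustrative, far from complete).
EN_HINTS = {"the", "and", "of", "to", "is", "that", "for", "with", "this"}
DE_HINTS = {"der", "die", "das", "und", "ist", "nicht", "mit", "für", "ein", "eine"}


def guess_language(paragraph):
    """Return "en", "de" or "unknown" based on which hint list matches more."""
    words = [w.strip(".,;:!?()\"'").lower() for w in paragraph.split()]
    en_hits = sum(w in EN_HINTS for w in words)
    de_hits = sum(w in DE_HINTS for w in words)
    if en_hits == de_hits:
        return "unknown"
    return "en" if en_hits > de_hits else "de"


def count_words_by_guess(text):
    """Add each non-empty line's word count to its guessed language."""
    counts = {"en": 0, "de": 0, "unknown": 0}
    for paragraph in text.splitlines():
        if paragraph.strip():
            counts[guess_language(paragraph)] += len(paragraph.split())
    return counts
```

Treat the "en" figure as an estimate to sanity-check against a few pages counted by hand, not as a billable word count.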
[Edited at 2014-06-05 01:14 GMT]