I would like to report that the problem still persists in Tesseract 4.1.1. Tesseract recognizes and displays Arabic text correctly. However, when the results are exported as PDF/A, the text stored in the PDF/A is reversed.

Yes, if you open the PDF in Acrobat, it gives you reversed words, while it works fine in the Google Chrome PDF reader. However, when I extracted the stored text from the PDF/A using pdfToText, the words were reversed too, which means the text was stored in the wrong order.

In the example, every word in the Tesseract PDF/A text is reversed, although the hOCR file is correct.

Actually, the words are not reversed (you can still read every letter), but the entire line is mirrored. We usually face this problem when rendering Arabic text in HTML by setting "text-align:right".

I think the problem here is that the x-coordinate of each RTL letter is rendered by measuring x from the left rather than from the right, i.e., (x, y) should be (W - x, y), where W is the page width.
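To make the (x, y) to (W - x, y) idea concrete, here is a minimal sketch in Python. It is not Tesseract's PDF renderer; `Box`, `mirror_x`, and `mirror_box` are hypothetical names made up for illustration, and the page width and glyph positions are invented values. The only extra assumption beyond the formula above is that a glyph has a width, so the mirrored box's left edge is taken from the original right edge.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Hypothetical glyph box; x is the left edge, measured from the page's left side."""
    x: float
    y: float
    width: float


def mirror_x(x: float, page_width: float) -> float:
    """Turn a distance measured from the left edge into the same distance from the right edge."""
    return page_width - x


def mirror_box(box: Box, page_width: float) -> Box:
    """Mirror a box horizontally; y and width are unchanged."""
    # The new left edge is the mirror image of the old right edge.
    return Box(mirror_x(box.x + box.width, page_width), box.y, box.width)


# Invented values: three glyphs placed left to right on a 600-unit-wide page.
page_width = 600.0
glyphs = [Box(10.0, 700.0, 20.0), Box(30.0, 700.0, 20.0), Box(50.0, 700.0, 20.0)]
mirrored = [mirror_box(g, page_width) for g in glyphs]

for before, after in zip(glyphs, mirrored):
    print(before, "->", after)
```

After mirroring, the first glyph ends up rightmost and the sequence runs right to left, which is the placement one would expect for an RTL line. Whether this is what actually goes wrong inside the PDF/A output is only my guess from the symptoms.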