Layout in Arabic, Chinese and Russian. Export text from a PDF file

I am laying out long documents in Arabic, Chinese and Russian. The text was provided in PDF format. When I copy and paste it into InDesign, it comes out as question marks, boxes and other characters that have nothing to do with the text I'm trying to lay out. I set the font to Myriad Arabic and the dictionary to Arabic, and still get nothing that resembles Arabic or any other language. Same thing with the Chinese and Russian. Any suggestions on how to get the text out of the PDF file in the right language? I'd appreciate help with this. Thank you.

Thanks for the caption, Ellis

So, KK: you're in for a world of pain. The initials "WP" at the beginning of these font names mean that the text came out of WordPerfect. Doing multilingual layout in WP was annoying, but possible. It was developed in the pre-Unicode world, where every single method of complex-script layout was a dirty hack. If you like dirty nerdy details, I can tell you how it worked, but suffice it to say that trying to harvest non-Latin-script text out of WP and reuse it in InDesign is just pure pain. WordPerfect-specific code pages were never really supported anywhere outside WP.

That being said, I have a script lying around somewhere for converting WP Cyrillic to Unicode. (In fact, I think it handles Windows CP 1251, but that works as well.) But that is only one of the forty-five languages? And the Chinese has been rasterized? And the PDF files were originally generated by Distiller 3? If you have a choice, it is time to walk away. If you do not have a choice, I really hope you are charging by the hour. My experience in this area (which is extensive) is that it will cost three to five times more to extract the text than it would to have a professional translator rekey the text and then have a second professional translator review the rekeyed text for typos.
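For the Cyrillic part, the conversion idea can be sketched in a few lines of Python. This is not the script mentioned in the post; it is a minimal illustration that assumes the bytes really are Windows CP 1251 (WordPerfect's own code pages are not covered), and the function names are made up for the example.

```python
def cp1251_bytes_to_text(raw: bytes) -> str:
    """Decode raw bytes as Windows CP 1251 into a Unicode string."""
    return raw.decode("cp1251")


def repair_cp1251_mojibake(garbled: str) -> str:
    """Repair text whose CP 1251 bytes were wrongly shown as Latin-1.

    Assumption: every garbled character maps back to a single byte in
    Latin-1; anything else raises UnicodeEncodeError instead of guessing.
    """
    return garbled.encode("latin-1").decode("cp1251")


if __name__ == "__main__":
    # "Привет" stored as CP 1251 and then misread as Latin-1
    print(repair_cp1251_mojibake("Ïðèâåò"))
```

If the pasted text contains characters that Latin-1 cannot encode, the text was probably mangled in some other way and this round trip will not help.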

Russian OCR is darn good these days, but Chinese OCR is hit-or-miss. I've never seen good Arabic OCR; that doesn't mean it's not out there, but I couldn't help you find it. But the chances that all 45 languages have reliable OCR available, and that the results of said OCR will not have to be reviewed by someone who knows the language, are virtually nil.
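Before paying for review, a rough first triage of extracted or OCRed text can be automated. The sketch below is only an illustration (the function name and the category labels are invented here): it counts characters by the script prefix of their Unicode names, so a supposedly Russian page that comes back mostly "latin" or "other" is clearly garbage. It says nothing about whether the words are correct; that still needs someone who reads the language.

```python
import unicodedata
from collections import Counter


def script_counts(text: str) -> Counter:
    """Count non-whitespace characters by rough script category."""
    counts = Counter()
    for ch in text:
        if ch.isspace():
            continue
        # Characters without a Unicode name fall through to "other"
        name = unicodedata.name(ch, "")
        if name.startswith("ARABIC"):
            counts["arabic"] += 1
        elif name.startswith("CYRILLIC"):
            counts["cyrillic"] += 1
        elif name.startswith("CJK"):
            counts["cjk"] += 1
        elif name.startswith("LATIN"):
            counts["latin"] += 1
        else:
            counts["other"] += 1  # punctuation, digits, '?', U+FFFD, ...
    return counts


if __name__ == "__main__":
    print(script_counts("Привет ???"))
```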

Tags: InDesign

Similar Questions

  • Import of text Arabic, Chinese and Russian

    Can someone help me out with this please? I export data from a database to a .txt file using MML. It works fine, except for Arabic, Chinese and Russian text, which is not supported in a plain .txt file. If I save the file in Unicode format, the Arabic, Chinese and Russian text is preserved, but when I try to import that into FrameMaker, it does not recognize the Unicode file as MML and therefore will not import it.

    Has anyone else encountered a similar problem and found a clever way around it?

    Like all tools, there is a bit of a learning curve. However, the user guide has a number of samples that show you how things are done.

    There are two paths you can take to process your data: either use a purely XML approach, with XSLT to wrap your tags in the appropriate Miramo commands, or use their macro preprocessor language (mmpp) (a bit like VBScript) with your own tag structure to wrap the commands.

    The workflow is usually that a data reporting tool is used to create the preliminary tagged output (just as you do now to create MML text files). Then you use XSLT or mmpp (though you have to write your own code for the latter) to process the output and wrap the actual Miramo commands around the content. Then you run Miramo on the wrapped content. Miramo takes its formatting information from the FM templates you specify and produces MIF output, which is fed into FM to create the final book (it also creates all the FM generated files, such as the table of contents and index). Miramo also lets you call scripts (FrameScript or ExtendScript) or any other API to process the FM content at specified steps in the book-creation workflow.

    Miramo is an enterprise-level tool designed to produce large volumes of content. The Personal Edition is limited only in the number of simultaneous output channels that can be processed, and it requires a user to start each session rather than sitting on a server and automatically processing content as it appears in the specified locations.

    The latest version also lets you process DITA content directly (using FM templates, you get better-looking output and easier control over it than you would with the DITA-OT) and has a new module that produces PDF directly without even using FM.

  • How can I copy text from a pdf file in firefox?

    I can no longer seem to highlight and copy text from a PDF file in Firefox 33.0, using Adobe to view PDF files.

    How do I activate the selection tool, so I can do this?

    If there is text to be copied (in other words, if the PDF file is not a scanned image of the text), then by default you should see a pointer that changes to an I-beam when you hover over the text.

    If you see a hand tool, click the double arrow toward the top right of the display to get a drop-down list and select "disable the hand tool".

  • How can I remove text from a PDF file?

    How to clear the text from a pdf file

    Hi iggybuskus2,

    If you want to change the text in a PDF document, you will need to use Acrobat. You can download a free Acrobat trial from www.adobe.com/products/acrobat.html if you want to test it.

    Best,

    Sara

  • Import text from a pdf file and edit it

    I need to go "backwards" from the way the normal process would run. I have a PDF (a book) and I need to recreate it in InDesign CS5 (7.0.4) so that I can edit the text. The document that produced the PDF is no longer available to me as a starting point. I should mention that I'm working with Acrobat 8.1.5 on a Win 7 machine. Here is what I thought would be the answer: first, I exported the PDF from Acrobat 8 as a tagged HTML file (Export -> HTML -> HTML 4.01, CSS 1.0). The HTML file seemed to have all the formatting in it. Then I opened the HTML doc in MS Word and saved it as an RTF file. Then I could place the text of the RTF file in my new InDesign document. The bad news is that in all of the many notes, the italicized text is lost. Of course the notes are not linked (to the actual note reference numbers), but I did not expect them to be. Does anyone have another technique, version, software or script to try? It is important that the italics survive the conversion so that I can make corrections to the text in InDesign.

    www.Recosoft.com.

    Search for PDF2ID.

    Bob

  • How can I copy text from a PDF file in the right order?

    I receive texts for translation from customers as PDF files. I need to copy some of the text from these files into Word, but Adobe's selection tool jumps around the document, seemingly at random, so that paragraphs and lines from different locations on a page get selected together. Tables are particularly bad. If I try to copy an entire file, I get an even more random order. For example, all the headings of the document may be gathered in one place. I have tried various PDF conversion programs. They put all the PDF content in text boxes, which is not suitable for my purposes. I just want to copy the PDF text in the exact order in which it appears on the page, as text that I can handle in Word. I'm now up to version 9.3.0 and there has been no improvement. Is there a solution to this problem?

    Hey, Marcoola.

    I agree that copying and pasting from PDF to Word manually is really difficult and tedious work. I have encountered this problem before. Fortunately, the problem can be solved now. I recommend AnyBizSoft PDF to Word Converter, which was offered for free some months ago. It supports three conversion modes (batch, partial and right-click conversion), and it even supports conversion of encrypted PDFs, provided you have the legal rights to them. The conversion quality is superb: all the original text, pages, images and hyperlinks are preserved in an editable Word document. I find it more useful than the free online tools.

  • I am trying to edit scanned sheet music using Adobe Acrobat DC on a Lenovo tablet running Windows 7. Adobe Acrobat DC rejects the scan as uneditable. I export to a Word document and re-export to AADC in PDF format. It will then allow to cul

    Hello. I'm currently editing scanned sheet music PDFs using Adobe Acrobat DC on a Lenovo tablet running Windows 7. Adobe Acrobat rejects the scan as uneditable. I export the scanned PDF to MS Word and re-export to AADC in PDF format. The result can be cropped, but the edit function does not allow me to move the measures (or 'skin measures text') within a staff. I am producing pages with 3 staves per page and 3 measures per staff, which can be enlarged sufficiently for a blind musician. My failure to move the measures so far means there is too much information per page and the page cannot be enlarged properly. Advice please. Thank you. Charles.

    This is well beyond what Acrobat is intended for: not just a little, but a lot. PDF is completely inappropriate for this.

    Work on the scanned pieces, cutting and arranging them before making a PDF. Maybe in Photoshop.

  • I can't find Organize Bookmarks and/or export bookmarks from Safari with Firefox 4

    I can't find Organize Bookmarks and/or export bookmarks from Safari with Firefox 4. Show All Bookmarks does not help.

    Among the small icons in "Show All Bookmarks", the rightmost one is Import/Export. Hover your cursor over each icon and it will say what it is.

  • Is there a way to export / print to a PDF file that is flat? That is, like printing on paper and rescanning?

    Is it possible (if not, there should be, IMO) to export / print to a PDF file that is flat? That is, like printing on paper and rescanning? (I do not want to do the latter.)

    I could list 100 reasons why it would be useful to be able to do this, so please, none of the usual "why would you want to do that?" or "that's not the right way... blah blah blah" that is seen too often here.

    If you are on a Windows machine, you can print the file to the Adobe PDF printer, thus creating a new copy of it.

    If you want to convert the entire file to images, you can export it to image files (PNG, for example) and then create a new PDF from those.

    If you want to flatten the dynamic objects (such as form fields, comments, links, etc.), you can use a simple script to do it.

    There are many ways your request may be interpreted. If you provide information about what you are trying to achieve, it will be easier to help you.

  • How to add images and text from a txt file in Adobe Muse?

    How to add images and text from a txt file in muse

    Hello Tony,

    You can easily open your text file, copy the text, and then paste it into a new text box within Muse, following a normal copy and paste.

    Images, however, cannot be copied and pasted, so you need to save the images first in a normal JPEG or PNG format, and then you can import them into your Muse file.

    Best regards

    _Ankush

  • I try to insert two images in a single document, group them, and then export it to a jpg file but cannot do so can you please help. Thank you

    I am trying to insert two images in a single document, group them, and then export it to a JPG file, but cannot do so.

    Hello Sara,

    For my part, I prefer to do it this way in my German PS (I'll try to translate the commands):

    Use the same height for both images.

    Open the first image in PS.

    Use Bild > Arbeitsfläche (Image > Canvas Size).

    From there:

    Give it the new width, clicking the left or the right arrow to choose which side the canvas grows on; play around with it (trial and error).

    Now you can insert your second image.

    Hans-Günter

  • I'm trying to highlight objects and text in a pdf file converted from a cadd drawing but can't do either?

    I'm trying to highlight objects and text in a pdf file converted from a cadd drawing but can't do either?

    It may be that the PDF file is simply a graphic. The highlight tool works on text, not graphics. You may need to use the marker or another drawing tool.

  • Is it possible to export tables from a PDF file to an Excel spreadsheet?

    I have attached a picture of the table. I know it can be done, but I'm not sure whether it works on tables that look like this.

    If so, which Adobe product should I buy to get this tool?

    PDFCapture.PNG

    This looks like a scanned document, which means that you will have to run text recognition (OCR) on it first.

    After doing this, you might be able to select that part of the page and then export it to an Excel file.

    You will need Acrobat to do this, preferably the Pro version.

  • How can I copy parts of the text of a pdf file in preview?

    How can I copy parts of the text of a PDF file in Preview? For example, the beginning and the end of a passage, but not the few sentences in the middle.

    Select the first part of the text you want to copy. Hold down Command and Option together and select another passage of text. Now press Command+C to copy the two text selections.

    Assumption: the PDF does not restrict content selection.

  • How to extract the text from corrupted pages file

    I would really appreciate it if someone knows a way to extract text from a 9 MB Pages file that contains text and images and will not open:

    Error message: file format not valid.

    I have tried changing the file type and opening it in various programs (Word, Acrobat, Google Docs converter), but nothing I've tried will open it.

    I have a backup at home, but I am away for several weeks and have done a lot of work on this file since the last backup.

    I hope there is a solution! Thank you

    JR

    If it is in fact a Pages v5 document, and it will not open in Preview, or in Pages v5 while holding the SHIFT key, then you're done. The inner content of a Pages v5 document is in a format that nothing except Pages v5 can read.

    If it's a Pages v5.5.2 or later document, then it is by default in a single-file format (a compressed, renamed zip file), and no version of Pages v5 through v5.2.2 on Mavericks was designed to read single-file-format documents other than original Pages '09 ones. If it is a Pages document from Yosemite or El Capitan, you will need Pages v5.5.3 or later to change the file type to the package format, which can be read by Pages v5+ on Mavericks.

    If it is a Pages '09 document, there is a good possibility that your attempts to open/edit/etc. this document with all of the aforementioned applications have damaged the document permanently. Try Preview or the free LibreOffice (v5 or later), which can open very simple Pages '09 documents in the single-file format only (not package files), sometimes with pictures. No guarantees, though.
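    Since the single-file format is described above as a renamed zip file, one cheap diagnostic is to check whether the damaged file is still a readable zip container. The Python sketch below only lists the archive entries; it does not recover any text, and the function name and the example file name are made up for illustration. Run it on a copy of the file, not the original.

```python
import zipfile


def list_zip_members(path: str) -> list:
    """Return the entry names if `path` is a readable zip, else an empty list."""
    if not zipfile.is_zipfile(path):
        return []
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()


if __name__ == "__main__":
    # "copy-of-document.pages" is a placeholder name
    try:
        members = list_zip_members("copy-of-document.pages")
    except OSError:
        # e.g. the path points at a package folder rather than a single file
        members = []
    print(members or "Not a readable zip container")
```

    An empty result means the file is either in the package format or the container itself is damaged, in which case the backup is likely the only way back.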
