Error Copying Of Text From This Document Is Not Allowed
Xpdf
Copying of text from pdftotext document isn't allowed
Forum: Help
Creator: random
Created: 2005-07-07
Updated: 2013-04-24

random - 2005-07-07

Hi, I'm trying to convert a PDF to PDB for my Palm. If I can get it into HTML or text then I should be able to do this. But with pdftohtml and pdftotext I get:

"Error: PDF version 1.5 -- xpdf supports version 1.4 (continuing anyway)" (in pdftohtml only)

and

"Error: Bad annotation action
Error: Bad annotation destination.
Error: Copying of text from this document is not allowed."

Any ideas? Is my PDF corrupt or something? (I can view it in xpdf.)

Thanks,
~random

Anonymous - 2005-08-07

"is my pdf corrupt or something?"

No. It is version 1.5, which is supported by XPDF 3 but not by PDF2HTML 0.36, which is based on XPDF 2 and is unfortunately slightly obsolete.

"Error: Copying of text from this document is not allowed"

PDF supports a sort of [pseudo-]protection against modifying, printing, extraction, etc., and this is activated in your document. For ideological reasons, XPDF & PDF2HTML do NOT extract such documents. :-( If your XPDF does, it is probably cracked.

The authors of XPDF & PDF2HTML accept ADOBE's license to be on the "safe" side when requiring other people to accept their GNU GPL. This is totally silly, since there is a large amount of software like "PDF PASSWORD RECOVERY", "PDF PROTECTION REMOVER", etc. All of it is COMMERCIAL, violates ADOBE's license, and wants people to accept its licenses and pay for it !!! :-( :-( :-(

Trejkaz - 2007-03-28

Modifying the code to allow it to
PDF File?
http://helpdeskgeek.com/help-desk/cant-copy-text-from-a-pdf-file/

So I ran into a problem the other day when I had to copy some text from a PDF file and paste it into a presentation that I was doing. The problem was I could not copy the text! Hmm, I thought, there must be something stupid I am doing since I am pretty sure I have copied text from a PDF file before.

Luckily, I wasn't that stupid, since it turned out that the PDF file had several pages that were scanned bitmap files that had been inserted into the PDF. So it was not actual text in the first place. Secondly, where there was actual text that could normally be copied, this PDF had some sort of security permissions set on it so that content copying was not allowed! Grrrr! I still needed that text and I was going to figure out a way to get it.

In this article, I'll walk through the simple way to copy text, which works if the document is not protected and the text is not a scanned image. I'll also go over what to do in the trickier scenario where you are not allowed to copy the text. It's not an ideal solution, but it's better than nothing, especially if you have to copy a lot of text. Even if you can save yourself from typing 80% of it manually, that's great!

Selecting Text in a PDF

In Adobe Reader, if text is copyable, then all you have to do is select it, right-click, and choose Copy. In other PDF viewer programs like Foxit, you have to click on Tools and then Select Text. Obviously, if you were able to do this, you would not be reading this post! But just in case, that's how you select text. Now on to the tougher issue of copying text from images or secured PDF files.

Use OCR to Copy PDF Text

You can quickly check to see if a PDF file is secured in Adobe Reader by looking up in the title bar and looking for the word SECURED. You can see specific permissions by clicking on Edit and then clicking on Protection and

http://www.foolabs.com/xpdf/cracking.html
these permission settings. Specifically:

- xpdf will not copy/paste from a PDF file which disallows copying text/graphics
- xpdf and pdftops will not print (convert to PostScript) a PDF file which disallows printing
- pdftotext will not convert a PDF file which disallows copying text/graphics
- pdfimages will not extract images from a PDF file which disallows copying text/graphics

I occasionally get email asking if I can explain how to crack a PDF file, or if I can help decrypt a PDF file. I won't help these people because I believe that an author's requests relating to the use of his/her work should be honored.

I distribute source code (for Xpdf) under a particular license (the GPL) which depends entirely on users' goodwill for its effectiveness. If any of my users ever decided to violate the license, I would probably never even know about it, much less be able to do anything about it. The only thing I can do is trust the users. In light of this, it would be very hypocritical of me to, on one hand, ask people to honor my licensing restrictions, and, on the other hand, bypass (or assist others in bypassing) another author's requested restrictions.

In addition to all of this, Adobe requires that implementors of the PDF spec adhere to the document permissions.

"But copyright law allows me to quote parts of a document under the fair use provisions -- and Xpdf is preventing me from doing that."

Not really: you're still free to quote the document the same way you would a newspaper article, i.e., by retyping the text. If I have to choose between honoring the author's request and trying to interpret the law (exactly how much does fair use allow you to extract? should Xpdf allow copying a certain amount of text out of protected documents?), I'll choose to honor the author's request, no matter how misguided.
For those who would argue that important content might get irretrievably locked away in PDF format, I'll remind you that Xpdf is open source, and can be modified by end users (the GPL even allows this). If you think these security protections are a bad idea then write the author of the document. He's the one who set those bits after all.
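
The "bits" mentioned above are permission flags stored in the document's encryption dictionary (the /P entry in the PDF specification). As a rough illustration of what a tool such as pdftotext checks before refusing to extract text, here is a minimal Python sketch that decodes such a value. The function name decode_permissions and the sample value -20 are made up for this example; the bit positions come from the PDF reference, not from Xpdf's source code.

```python
# Permission flags from the /P entry of a PDF encryption dictionary.
# The PDF reference numbers bits starting at 1 for the lowest-order bit.
PERMISSION_BITS = {
    "print": 3,
    "modify": 4,
    "copy": 5,       # copy or extract text and graphics
    "annotate": 6,
}


def decode_permissions(p_value: int) -> dict:
    """Return which operations a /P value allows (hypothetical helper)."""
    # /P is a signed 32-bit integer; mask it so negative values decode cleanly.
    unsigned = p_value & 0xFFFFFFFF
    return {
        name: bool(unsigned & (1 << (bit - 1)))
        for name, bit in PERMISSION_BITS.items()
    }


if __name__ == "__main__":
    # -20 is an assumed sample value: every flag set except "copy".
    for name, allowed in decode_permissions(-20).items():
        print(f"{name}: {'allowed' if allowed else 'not allowed'}")
```

A document whose "copy" flag is cleared is the kind that produces "Error: Copying of text from this document is not allowed." in the forum post above.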