[Dclug] converting pdf to txt

Leo Charre leocharre at gmail.com
Thu Jun 10 03:54:13 EDT 2010


Yeah, but that has limitations. A lot of PDF documents are scans and have no
text layer, so pdftotext has nothing to extract from them.
I wrote something to help with those cases. I maintain PDF::OCR2:
http://search.cpan.org/~leocharre/PDF-OCR2-1.20/lib/PDF/OCR2.pod
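
If it helps, here is a rough sketch of how you might tell the two kinds of
PDF apart before deciding whether OCR is needed. It assumes pdftotext from
poppler is on your PATH. The function names and the 20-character threshold
are my own illustrative choices, not part of poppler or PDF::OCR2, and the
"almost no text means it is a scan" test is only a heuristic.

```python
import subprocess


def extract_text(pdf_path: str) -> str:
    """Run poppler's pdftotext and return whatever text it finds."""
    # Passing "-" as the output file makes pdftotext write to stdout.
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True,
        encoding="utf-8",
        errors="replace",
        check=True,  # raise CalledProcessError if pdftotext fails
    )
    return result.stdout


def looks_like_scan(text: str, min_chars: int = 20) -> bool:
    """Heuristic (an assumption, not a poppler feature): fewer than
    min_chars non-whitespace characters suggests no text layer."""
    # str.split() with no arguments also drops the form feeds that
    # pdftotext emits between pages.
    visible = "".join(text.split())
    return len(visible) < min_chars
```

You would call extract_text("some.pdf") and pass the result to
looks_like_scan(); if that returns True, the file is probably a scan and
needs an OCR pass instead. A PDF that mixes scanned and text pages would
need a per-page check, which this sketch does not attempt.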
===========================
Leo Charre
Developer
http://leocharre.com/resume
http://search.cpan.org/~leocharre/


On Mon, Jun 7, 2010 at 12:28 PM, The Doctor <drwho at virtadpt.net> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Cecil Taylor wrote:
>
> > Does anyone know a Linux program that will convert PDF files to text?
>
> pdftotext from poppler (http://poppler.freedesktop.org) will extract the
> text component from .pdf files.
>
> - --
>
> The Doctor [412/724/301/703]
>
> PGP: 0x807B17C1 / 7960 1CDC 85C9 0B63 8D9F  DD89 3BD8 FF2B 807B 17C1
> WWW: http://drwho.virtadpt.net/
>
> I may not know what I am capable of, but I am well versed in what I am not.
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v2.0.14 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>
> iEYEARECAAYFAkwNHhQACgkQO9j/K4B7F8HnuACfU6kA6qz5WqqX/w/PiFEw8Exe
> +JUAnjBVtCrzPxcuZdC3PzuwSj5JOuFs
> =BoFb
> -----END PGP SIGNATURE-----