[Linux] converting PDF to DOC?

Robert Citek linux@flux.org
Fri, 22 Jun 2007 16:38:52 -0500


On 06/22/2007 02:13 PM, Larry Kagan wrote:
> I replied quite a while ago with precise instructions on how to do this
> using convert or Gimp and gocr.
> ... 
> This is obviously a project and probably more work than it's worth but
> only you can decide that.
> 
> Good Luck

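Yes, with a heavy helping of "good luck".  Here's a full pipeline that
fetches a PDF, rasterizes it with convert, runs gocr on the result, and
prints lines 11 through 20 of the OCR output: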

$ wget 'http://cm.bell-labs.com/cm/cs/who/dmr/pdfs/man11.pdf' -O - |
 convert - pnm:- |
 gocr - |
 head -20 |
 tail

e_DPSlS     __ key arlle n_et , , ,

DESCBTRI0h_   ar matnt_ln8 gro_FL or rll_8 c_bInad tnto a 8Ln_
'_lg archive ftle,  JtL maln use lL to cre_te aid
_             update lljary tlle8 aL uged bY tt.e loader,  Jt
can b8 used, t_ugh, tor any slmll_ __po8e,

kg Lg Dng ch8racter rrn_. the Let gd t  , opti0n-
ally concat_nated wlth v.  ?g1  tL the __cnive
rtle.  The n__eL ar8 co\code(0144)atltuent ftles tn the
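
If output like the above is too noisy, one thing worth trying is a higher
rendering resolution before the OCR step.  A rough sketch, assuming a local
copy of the PDF; the pdf_ocr name and the 300 dpi default are placeholders
of mine, and there is no guarantee the text comes out any cleaner:

```shell
# Hypothetical helper (not part of convert or gocr): OCR a local PDF,
# rasterizing it at a chosen resolution first.
pdf_ocr() {
    pdf=$1
    density=${2:-300}   # DPI used for rasterizing; 300 is an assumed default

    # Fail early if the input file is missing or unreadable.
    [ -r "$pdf" ] || { echo "pdf_ocr: cannot read $pdf" >&2; return 1; }

    # Both tools used in the pipeline must be installed.
    for tool in convert gocr; do
        command -v "$tool" >/dev/null 2>&1 ||
            { echo "pdf_ocr: $tool not found" >&2; return 127; }
    done

    # -density goes before the input file so it applies when the PDF
    # is rasterized; the PNM stream is then handed to gocr on stdin.
    convert -density "$density" "$pdf" pnm:- | gocr -
}
```

Usage would look something like:

$ pdf_ocr man11.pdf 300 | head -20 | tail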

Regards,
- Robert