The Pdf_Sum component summarizes Adobe Acrobat PDF files. It uses a program component to translate each PDF file temporarily into PostScript, then uses the ps2txt-1.0 converter to translate the PostScript to text, and finally summarizes the text. The Pdf_Sum software is provided by Dan Schmitt of the Center for Natural Resource Information Technology and is released under the GNU General Public License.
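The two conversion stages can be pictured with a small shell sketch. This is not the Pdf_Sum source: the names pdf_to_text, PDF2PS_CMD and PS2TXT_CMD are hypothetical, and both converter commands are assumed to read stdin and write stdout, which the real tools may not do.

```shell
#!/bin/sh
# Sketch of a two-stage conversion: PDF -> temporary PostScript -> text.
# pdf_to_text, PDF2PS_CMD and PS2TXT_CMD are hypothetical names; each command
# is ASSUMED to read its input on stdin and write its output on stdout.

pdf_to_text() {
    pdf=$1
    tmp_ps=$(mktemp) || return 1

    # Stage 1: write the intermediate PostScript to a temporary file.
    if $PDF2PS_CMD < "$pdf" > "$tmp_ps"; then
        # Stage 2: turn the PostScript into plain text on stdout.
        $PS2TXT_CMD < "$tmp_ps"
        status=$?
    else
        status=1
    fi

    rm -f "$tmp_ps"   # the PostScript copy is only temporary
    return $status
}
```

The text written to stdout would then be handed to whatever summarizing step follows. The temporary file is removed whether or not the second stage succeeds.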
To use this component, retrieve the distribution (components/Pdf_Sum.tar.gz) from one of the Harvest software distribution sites. Then unpack it and add it to the gatherer:
% gzip -dc Pdf_Sum.tar.gz | (cd harvest-1.x/components/gatherer; tar xvf -)
% ./SetupComponent add gatherer Pdf_Sum
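If the unpacking step is scripted, it may help to check the paths first. The following is a minimal sketch: the function name unpack_pdf_sum and its two arguments are hypothetical, and it covers only the gzip/tar step, so SetupComponent still has to be run by hand.

```shell
#!/bin/sh
# Sketch: unpack Pdf_Sum.tar.gz into a Harvest source tree after checking paths.
# unpack_pdf_sum is a hypothetical helper, not part of the Harvest distribution.

unpack_pdf_sum() {
    archive=$1        # path to Pdf_Sum.tar.gz
    harvest_src=$2    # top of the Harvest source tree, e.g. harvest-1.x
    target="$harvest_src/components/gatherer"

    [ -f "$archive" ] || { echo "missing archive: $archive" >&2; return 1; }
    [ -d "$target" ]  || { echo "missing directory: $target" >&2; return 1; }

    # Same pipeline as the manual step: decompress, then untar inside
    # the gatherer components directory.
    gzip -dc "$archive" | (cd "$target" && tar xf -)
}
```

A call such as `unpack_pdf_sum Pdf_Sum.tar.gz harvest-1.x` mirrors the first manual command. Using `&&` instead of `;` inside the subshell keeps tar from extracting into the wrong directory if the cd fails.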
For more information about Harvest, see the Harvest home page.