Ubuntu – paperless office on a budget

Paper and I have never gotten along well, so I have long dreamed of a paperless office. A while ago I purchased a Fujitsu ScanSnap S1500 scanner for the office, after researching which multipage, duplex Automatic Document Feed (ADF) scanners were both affordable and supported on Linux.

by Terry Freedman

It took me a while to get around to setting all of this up, but the scanner is now connected to a headless Ubuntu VM, and pressing the scanner button will:

- scan the document
- perform OCR to convert it to text
- combine the text with the PDF to create a searchable PDF
- OPTIONAL: send the resulting document to an Alfresco Document Management Server via FTP

Install dependencies

NOTE: The PPA is only required for Fujitsu ScanSnap S1500 support.

    sudo apt-add-repository ppa:rolfbensch/sane-git
    sudo apt-get update
    sudo apt-get install sane sane-utils imagemagick tesseract-ocr pdftk libtiff-tools libsane-extras exactimage wput

Install scanbuttond

Download the "Debian Experimental" package from http://pkgs.org/download/scanbuttond

    sudo dpkg -i scanbuttond_0.2.3.cvs20090713-14_i386.deb

This step is only needed for Fujitsu ScanSnap support. For other scanners you can probably install scanbuttond from the Ubuntu repository.

Scanner config

    vim 40-libsane.rules
    #add this line
    ATTRS{idVendor}=="04c5", ATTRS{idProduct}=="11a2", ENV{libsane_matched}="yes"

Permissions

    sudo adduser saned scanner

Useful command lines for troubleshooting

Since I had some trouble getting this scanner to work properly, I found the following commands highly useful in locating the issue.

    man sane-usb
    sane-find-scanner
    scanimage -L
    dmesg
    tail /var/log/udev

NOTE: If you are using a…
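The scan, OCR and combine steps listed at the top of this post can be sketched roughly as the shell functions below. This is only an illustration under stated assumptions, not the script used in the post: the function names, the page-NNN.tif naming, the scanimage options and the DRY_RUN switch are all made up for this example. It also assumes a Tesseract build that supports the "pdf" output config (3.03 or later), whereas the dependency list above includes exactimage, which suggests the original setup built the searchable PDF differently. The optional FTP upload is left out.

```shell
# Sketch only. Function names, file naming, scanimage options and DRY_RUN
# are assumptions made for illustration, not taken from the original post.

# Print the command when DRY_RUN=1 (the default here), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Scan every sheet in the feeder into numbered TIFF pages inside directory $1.
scan_pages() {
    run scanimage --format=tiff --resolution 300 --batch="$1/page-%03d.tif"
}

# OCR each page in directory $1 into a one-page searchable PDF,
# then merge the pages in order into the output file $2.
build_pdf() {
    dir=$1
    out=$2
    set --                              # reuse "$@" as the list of page PDFs
    for tif in "$dir"/page-*.tif; do
        [ -e "$tif" ] || continue       # glob matched nothing
        base=${tif%.tif}
        run tesseract "$tif" "$base" pdf    # writes $base.pdf
        set -- "$@" "$base.pdf"
    done
    [ "$#" -gt 0 ] || return 1          # no pages found
    run pdftk "$@" cat output "$out"
}
```

With DRY_RUN left at 1 the functions only print the commands they would run, which is a cheap way to check paths before touching the scanner. Setting DRY_RUN=0 and calling scan_pages followed by build_pdf on the same directory runs the real tools.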


Bulk converting Office documents to PDF

When you need to convert multiple documents to PDF for distribution (or from one Office format to another), there are a few utilities around. The most workable one I found is the unoconv utility, which is built on top of LibreOffice / OpenOffice. It uses the OpenOffice conversion facilities rather than a simple PDF print driver.

On Ubuntu it can be installed via Software Center or via apt-get from the core repositories.

    sudo apt-get install unoconv

Combined with the -exec option of the Unix find command, this makes conversion of whole directory structures a breeze.

    #find all Word Documents and convert to PDF
    find . -name "*.doc*" -exec unoconv -f pdf {} \;

    #find all Powerpoint Documents and convert to PDF
    find . -name "*.ppt*" -exec unoconv -f pdf {} \;

To show all the possible conversion formats you can use:

    unoconv --show

    The following list of document formats are currently available:

    bib - BibTeX [.bib]
    doc - Microsoft Word 97/2000/XP [.doc]
    doc6 - Microsoft Word 6.0 [.doc]
    doc95 - Microsoft Word 95 [.doc]
    docbook - DocBook [.xml]
    html - HTML Document (OpenOffice.org Writer) [.html]
    odt - ODF Text Document [.odt]
    ott - Open Document Text [.ott]
    ooxml - Microsoft Office Open XML [.xml]
    pdf - Portable Document Format [.pdf]
    rtf - Rich Text Format [.rtf]
    latex - LaTeX 2e [.ltx]
    sdw - StarWriter 5.0 [.sdw]
    sdw4 - StarWriter 4.0 [.sdw]
    sdw3 - StarWriter 3.0 [.sdw]
    stw - Open Office.org 1.0 Text Document Template [.stw]
    sxw - Open Office.org 1.0 Text…
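When the find commands above are rerun on a large tree, every document is converted again. A possible refinement, sketched here as an assumption rather than something from the original post, is to list only the Word and PowerPoint files that have no PDF next to them yet and convert just those. The function names pending_office_files and convert_pending are hypothetical, and the sketch assumes unoconv writes the PDF alongside the source file and that file names contain no newlines.

```shell
# Sketch only; the function names are made up for this example.

# List Word and PowerPoint files under directory $1 that have no
# same-named .pdf beside them yet. File names must not contain newlines.
pending_office_files() {
    find "$1" -type f \( -name '*.doc*' -o -name '*.ppt*' \) |
        LC_ALL=C sort |
        while IFS= read -r f; do
            [ -e "${f%.*}.pdf" ] || printf '%s\n' "$f"
        done
}

# Convert the pending files one at a time with the same unoconv call
# used above. stdin is redirected so the converter cannot swallow the list.
convert_pending() {
    pending_office_files "$1" | while IFS= read -r f; do
        unoconv -f pdf "$f" < /dev/null
    done
}
```

Running pending_office_files . first gives a preview of what convert_pending . would touch.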
