Ubuntu – paperless office on a budget

Since paper and I have never gotten on well, I have always dreamed of a paperless office. A while ago I purchased a Fujitsu ScanSnap S1500 scanner for the office, after doing some research on which Automatic Document Feed (ADF) multipage and duplex scanners were both affordable and supported on Linux.

by Terry Freedman

It took a while for me to get around to setting all of this up, but the result is that this scanner is now connected to a headless Ubuntu VM, and a press of the scanner button will:

    scan the document
    perform OCR to convert to text
    combine the text with PDF to create a searchable PDF
    OPTIONAL - send the resulting document into Alfresco Document Management Server via FTP

Install dependencies

NOTE: PPA is only required for support of Fujitsu ScanSnap S1500

    sudo apt-add-repository ppa:rolfbensch/sane-git
    sudo apt-get update
    sudo apt-get install sane sane-utils imagemagick tesseract-ocr pdftk libtiff-tools libsane-extras exactimage wput

Install scanbuttond

Download the "Debian Experimental" package from http://pkgs.org/download/scanbuttond

    sudo dpkg -i scanbuttond_0.2.3.cvs20090713-14_i386.deb

This step is only needed for Fujitsu ScanSnap support. For other scanners you can probably install scanbuttond from the Ubuntu repository.

Scanner config

    vim 40-libsane.rules
    #add this line
    ATTRS{idVendor}=="04c5", ATTRS{idProduct}=="11a2", ENV{libsane_matched}="yes"

Permissions

    sudo adduser saned scanner

Useful command lines for troubleshooting

Since I had some trouble getting this scanner to work properly, I found the following commands highly useful in locating the issue.

    man sane-usb
    sane-find-scanner
    scanimage -L
    dmesg
    tail /var/log/udev

NOTE: If you are using a…


Bulk converting Office documents to PDF

When you need to convert multiple documents to PDF for distribution (or from one Office format to another), there are a few utilities around. The most workable one I found is the UNOCONV utility, which is built on top of LibreOffice / OpenOffice. It uses the OpenOffice conversion facilities rather than a simple PDF print driver.

On Ubuntu it can be installed via Software Center or via apt-get from the core repositories.

    sudo apt-get install unoconv

Combined with the -exec option of the Unix find command, this makes conversion of whole directory structures a breeze.

    #find all Word Documents and convert to PDF
    find . -name "*.doc*" -exec unoconv -f pdf {} \;

    #find all Powerpoint Documents and convert to PDF
    find . -name "*.ppt*" -exec unoconv -f pdf {} \;

To show all the possible conversion formats you can use:

    unoconv --show

The following list of document formats are currently available:

    bib - BibTeX [.bib]
    doc - Microsoft Word 97/2000/XP [.doc]
    doc6 - Microsoft Word 6.0 [.doc]
    doc95 - Microsoft Word 95 [.doc]
    docbook - DocBook [.xml]
    html - HTML Document (OpenOffice.org Writer) [.html]
    odt - ODF Text Document [.odt]
    ott - Open Document Text [.ott]
    ooxml - Microsoft Office Open XML [.xml]
    pdf - Portable Document Format [.pdf]
    rtf - Rich Text Format [.rtf]
    latex - LaTeX 2e [.ltx]
    sdw - StarWriter 5.0 [.sdw]
    sdw4 - StarWriter 4.0 [.sdw]
    sdw3 - StarWriter 3.0 [.sdw]
    stw - Open Office.org 1.0 Text Document Template [.stw]
    sxw - Open Office.org 1.0 Text…
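
As a rough sketch of the find-based conversion described in this post, the two find commands can be folded into one small shell function with a dry-run switch, so you can check which files would be touched before converting anything. The function name convert_tree and the DRY_RUN variable are hypothetical names chosen for this illustration; they are not part of unoconv or find.

```shell
#!/bin/sh
# Sketch only: convert_tree and DRY_RUN are illustrative names (assumptions),
# not part of unoconv. Requires find, and unoconv for the real conversion.

convert_tree() {
    # Directory to search; defaults to the current directory.
    dir=${1:-.}

    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Dry run: only print the Word and PowerPoint files that match.
        find "$dir" -type f \( -name '*.doc*' -o -name '*.ppt*' \) -print
    else
        # Real run: one unoconv call per matching file, as in the
        # commands shown in the post.
        find "$dir" -type f \( -name '*.doc*' -o -name '*.ppt*' \) \
            -exec unoconv -f pdf {} \;
    fi
}
```

After sourcing the function, setting DRY_RUN=1 and calling convert_tree on a directory lists the matching files. Calling it with DRY_RUN unset runs unoconv on each of them.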
