Paper and I have never gotten on well, so I have always dreamed of a paperless office. A while ago I purchased a Fujitsu ScanSnap S1500 scanner for the office, after researching which Automatic Document Feeder (ADF) multipage duplex scanners were both affordable and supported on Linux.

It took me a while to get around to setting all of this up, but the scanner is now connected to a headless Ubuntu VM, and a press of the scanner button will:

  1. scan the document
  2. perform OCR to convert to text
  3. combine the text with PDF to create a searchable PDF
  4. OPTIONAL – send the resulting document into Alfresco Document Management Server via FTP

Install dependencies

NOTE: The PPA is only required for Fujitsu ScanSnap S1500 support.
sudo apt-add-repository ppa:rolfbensch/sane-git
sudo apt-get update
sudo apt-get install sane sane-utils imagemagick tesseract-ocr pdftk libtiff-tools libsane-extras exactimage unpaper wput

Install scanbuttond

Download the “Debian Experimental” package from http://pkgs.org/download/scanbuttond
sudo dpkg -i scanbuttond_0.2.3.cvs20090713-14_i386.deb

This manual download is only needed for Fujitsu ScanSnap support. For other scanners you can probably install scanbuttond from the Ubuntu repository.

Scanner config

vim 40-libsane.rules
#add this line
ATTRS{idVendor}=="04c5", ATTRS{idProduct}=="11a2", ENV{libsane_matched}="yes"
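
For a different scanner the vendor and product IDs will differ. As a rough sketch, the rule line can be generated from one line of `lsusb` output. The function name `rule_from_lsusb_line` is made up for this example, and the device description in the sample line is only illustrative.

```shell
#!/bin/bash
# Hypothetical helper: build a libsane udev rule from one line of `lsusb` output.
# lsusb prints lines of the form "Bus 001 Device 004: ID vvvv:pppp Description".
rule_from_lsusb_line() {
    local line="$1" ids
    # Extract the two 4-digit hex IDs as "vvvv pppp".
    ids=$(printf '%s\n' "$line" |
        sed -n 's/.*ID \([0-9a-fA-F]\{4\}\):\([0-9a-fA-F]\{4\}\).*/\1 \2/p')
    # Fail if the line did not contain an ID pair.
    [ -n "$ids" ] || return 1
    printf 'ATTRS{idVendor}=="%s", ATTRS{idProduct}=="%s", ENV{libsane_matched}="yes"\n' \
        "${ids% *}" "${ids#* }"
}

# Example (sample line, description text is illustrative):
#   rule_from_lsusb_line "Bus 001 Device 004: ID 04c5:11a2 Fujitsu, Ltd"
```

After editing the rules file, `sudo udevadm control --reload-rules` followed by `sudo udevadm trigger` (or simply re-plugging the scanner) should make udev pick up the change.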

Permissions

sudo adduser saned scanner

Useful command lines for troubleshooting

Since I had some trouble getting this scanner to work properly, I found the following commands very useful for locating the issue.
man sane-usb
sane-find-scanner
scanimage -L
dmesg
tail /var/log/udev
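
The scan script further down needs the exact SANE device name. A small sketch for pulling it out of `scanimage -L` output follows; `sane_device_names` is a hypothetical helper name, and it assumes the usual "device `NAME' is a ..." line format that scanimage prints.

```shell
#!/bin/bash
# Hypothetical helper: read `scanimage -L` output on stdin and print only
# the device names, one per line. Lines that do not describe a device
# (for example "No scanners were identified.") produce no output.
sane_device_names() {
    sed -n "s/^device \`\(.*\)' is a .*/\1/p"
}

# Usage on a machine with the scanner attached:
#   scanimage -L | sane_device_names
```

The printed name is what goes into the `--device-name` option of scanimage.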

NOTE: If you are using a notebook, be careful: I spent quite a few hours troubleshooting an error when opening the device from saned. It turned out that USB power management on the Toshiba notebook caused havoc with saned (http://askubuntu.com/questions/55140/error-during-device-i-o-when-using-usb-scanner). Switching to the desktop that now houses the scanner fixed the problem. Thank you, VirtualBox (I ended up setting up a dedicated VM for this task)!
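
If you suspect the same problem, one way to check is to look at the kernel's per-device power setting in sysfs. This is only a sketch: `usb_power_report` is a made-up name, and I have not verified that changing this setting cures the saned error described above.

```shell
#!/bin/bash
# Print "<usb-device> <power/control value>" for every USB device.
# The sysfs root defaults to /sys; it is a parameter only so the function
# can be tried against a fake directory tree.
usb_power_report() {
    local root="${1:-/sys}" ctl dev
    for ctl in "$root"/bus/usb/devices/*/power/control; do
        # Skip the unexpanded glob and unreadable entries.
        [ -r "$ctl" ] || continue
        dev=$(basename "$(dirname "$(dirname "$ctl")")")
        printf '%s %s\n' "$dev" "$(cat "$ctl")"
    done
}

# Usage:
#   usb_power_report
```

A value of `auto` means the kernel may autosuspend the device; writing `on` to the same file as root keeps it powered.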

Configure scanbuttond

sudo vim /etc/default/scanbuttond
#change this line from no to yes
RUN=yes
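
On a headless VM it can be handier to make this change without an editor. A minimal sketch, assuming GNU sed (as shipped with Ubuntu) and a single `RUN=` line in the file; `enable_scanbuttond` is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: set RUN=yes in the scanbuttond defaults file.
# The file path is a parameter so the function can be tried on a copy first.
enable_scanbuttond() {
    local file="${1:-/etc/default/scanbuttond}"
    # Rewrite whatever value the RUN= line currently has (GNU sed in-place edit).
    sed -i 's/^RUN=.*/RUN=yes/' "$file"
}

# Usage (the real file needs root, so run from a root shell):
#   enable_scanbuttond
```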

cd /etc/scanbuttond
sudo cp initscanner.sh.example initscanner.sh
sudo vim initscanner.sh

Uncomment or copy any scanner init command(s).

sudo cp buttonpressed.sh.example buttonpressed.sh
sudo vim buttonpressed.sh

Copy the contents of the scan script below. The script is also hosted on GitHub (https://github.com/leogaggl/misc-scripts/blob/master/buttonpressed.sh)

Scan script

#!/bin/bash
OUT_DIR=/output/directory/name
TMP_DIR=`mktemp -d`
FILE_NAME=scan_`date +%Y%m%d-%H%M%S`
cd $TMP_DIR
echo "################## Scanning ###################"
scanimage --resolution 150 --batch=scan_%03d.pnm --format=pnm --mode Gray --device-name "fujitsu:ScanSnap S1500:67953" --source "ADF Duplex" --page-width 210 --page-height 297 --sleeptimer 1 -y 297 -x 210
echo "################## Cleaning ###################"
for f in ./*.pnm; do
unpaper --size "a4" --overwrite "$f" "$f"
done
echo "############## Converting to TIF ##############"
mogrify -format tif *.pnm
echo "################ OCR ################"
for f in ./*.tif; do
tesseract "$f" "$f" -l eng hocr
hocr2pdf -i "$f" -s -o "$f.pdf" < "$f.html"
done
echo "############## Converting to PDF ##############"
# Write the merged PDF under the timestamped name that is copied below
pdftk *.tif.pdf cat output "$FILE_NAME.pdf" && rm *.tif.pdf && rm *.tif.html
echo "############## Copy Output File ##############"
cp "$FILE_NAME.pdf" "$OUT_DIR/"
echo "############## clean up ##############"
cd ..
rm -rf "$TMP_DIR"
echo "############## FTP Output File ##############"
#wput $OUT_DIR/$FILE_NAME.pdf ftp://user:pwd@ftp.alfrescoserver.com.au:21/autoscan/pdf/
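
One weakness of the script is that the temporary directory is left behind if any step fails before the clean-up. A possible variation, not part of the original script, is to run the work in a subshell with an EXIT trap. `with_temp_dir` is a hypothetical name.

```shell
#!/bin/bash
# Sketch: run a command inside a fresh temporary directory that is removed
# when the command finishes, whether it succeeds or fails.
with_temp_dir() {
    (
        tmp=$(mktemp -d) || exit 1
        # The trap fires when this subshell exits, for any reason.
        trap 'rm -rf "$tmp"' EXIT
        cd "$tmp" || exit 1
        "$@"
    )
}

# Usage: wrap the scan/OCR steps in a function and call
#   with_temp_dir my_scan_steps
```

The exit status of the wrapped command is passed through, so the caller can still tell whether the scan succeeded.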

Credits:

A big thank you & hat tip to the following authors of the following pages:


EDIT (2013-09-16): I found this link describing how to remove empty pages: http://philipp.knechtges.com/?p=190 – might have to investigate this when I have some time.
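
I have not checked how the linked page does it, but one common approach to blank-page detection is to measure the mean brightness of each page with ImageMagick and skip pages that are almost entirely white. The sketch below makes that assumption; the function names and the 0.99 threshold are illustrative and would need tuning on real scans.

```shell
#!/bin/bash
# Succeed if a mean brightness (0 = black, 1 = white) is at or above the
# threshold. Default threshold 0.99 is an assumed starting point.
mean_is_blank() {
    awk -v m="$1" -v t="${2:-0.99}" 'BEGIN { exit !(m >= t) }'
}

# Hypothetical wrapper: needs ImageMagick's identify on the PATH.
is_blank_page() {
    mean_is_blank "$(identify -format '%[fx:mean]' "$1")" "${2:-0.99}"
}

# Usage inside the scan loop, before OCR:
#   is_blank_page "$f" && rm "$f"
```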

Leo Gaggl

ict business owner specialising in mobile learning systems. interests: sustainability, internet of things, ict for development, open innovation, agriculture

This Post Has 12 Comments

  1. Ivan

    I’m looking for a portable ADF scanner (e.g. Canon imageFORMULA P-215 Scan-tini Personal Document Scanner, or HP Scanjet Pro 3000 s2 Sheet-feed Scanner) that would be Ubuntu compatible.

    Any recommendations from the research that you did?

    I’m not finding the portable ADF scanners listed in http://www.sane-project.org/sane-mfgs.html

    Thank you!

  2. Leo Gaggl

    @Ivan – sorry – I did not look at portable devices at all. In fact I needed a fairly solid stationary option and the Fujitsu was the most cost-efficient.

  3. Marlon

    Hey there,
    thank you for the nice instructions, but I’ve been stuck for hours on the I/O error you mentioned. Can you help me with that? What do you mean by using VirtualBox now?

  4. Leo Gaggl

    @Marlon: I ended up switching from the notebook (which I used to test all of this) to a desktop (using a VirtualBox VM on the desktop host). So it was the switch from notebook to desktop that fixed the issue (rather than VirtualBox – that was a bit misleading).

  5. It worked like a charm on Ubuntu 14.04 LTS, thanks a lot! I had been looking for a solution for some time! Two things though:
    1) The same “scanbuttond” package file (scanbuttond_0.2.3.cvs20090713-14_i386.deb) is now available in the repositories, probably after adding the cited ppa:rolfbensch/sane-git, so there’s no need to download it from the pkgs.org website. Just type “sudo apt-get install scanbuttond”.
    2) The actual button on the scanner does nothing when pressed, so I’m not sure what the purpose of the “scanbuttond” software is. If its only purpose is to make the physical button work, it doesn’t, at least in my case, so you can probably skip it if you don’t mind missing that functionality. I scanned through Easyscan, Xsane and gscan2pdf and all worked perfectly.

  6. I would recommend doing the “Scanner config” and “Permissions” sections of the article first and checking whether the scanner works. If it doesn’t, go on to “Install dependencies” with the PPA and check again. Finally, I would install scanbuttond and configure it.

  7. Ra

    Thank you, very useful
