Slicing scannings

September 11, 2009 3 minutes read | 427 words by Ruben Berenguel

Last year our department got a nice and shiny Xerox WorkCentre, with scan+mail capabilities. Wonders of technology, now I can scan those chapters I use from books that weigh tons. But… of course, the machine scans two book pages at a time, and thus creates a PDF with a two-page spread on every sheet.

And this doesn’t work well with my way of printing booklets, so a working solution was to use Ghostscript and ImageMagick from the command line. Automation at its finest.

First, scan whatever you have to scan. In my case the scan came out upside down, with a “black scan line” on the left side. Then:

gs -sDEVICE=jpeg -q -dBATCH -dNOPAUSE -r300x300 -sOutputFile=Page%03d.jpg -dFirstPage=1 PdfFile.pdf

This will split your PDF file into individual pages and save them as JPEG files at 300 dpi, named Page001.jpg, Page002.jpg and so on. The following step can be skipped if your pages are not upside down… mine were.

mogrify -rotate 180 Page*.jpg

Now we want to get rid of that black rectangle, which after the rotation sits on the right of every page. This command cuts from the right side, and I didn’t bother rereading ImageMagick’s manual, so if your black scan zone is on the other side, rotate as in the previous step, cut, and rotate back ;) (Using -gravity West instead should chop from the left directly.) Observe that mogrify needs its options before the files to process.

mogrify -gravity East -chop 150x0 Page*.jpg

I suggest you try it first on a single page (with convert, not mogrify, because mogrify overwrites the file in place), and change the 150 pixels to be chopped as you see fit:

convert Page001.jpg -gravity East -chop 150x0 Test.jpg

Now, look at the image dimensions, either by right-clicking and viewing the image properties, or by opening the image in an editor such as GIMP. Divide the width by two and round up. This is the cropping size.
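If you would rather not do the arithmetic by hand, the shell can do it. This is a minimal sketch, assuming you already know the width and height in pixels of a chopped page; crop_geometry is a made-up helper name, not something ImageMagick provides.

```shell
# Hypothetical helper: build a WIDTHxHEIGHT crop geometry whose width is
# half of the page width, rounded up, and whose height is the full page.
crop_geometry() {
    width=$1
    height=$2
    # Integer division truncates, so adding 1 first rounds odd widths up.
    half=$(( (width + 1) / 2 ))
    echo "${half}x${height}"
}

# Example: a page of 2331x3300 pixels
#   crop_geometry 2331 3300    # prints 1166x3300
```

With ImageMagick installed, identify -format '%w %h' Page001.jpg should print the two numbers to feed in, and the result can stand in for CroppingSize in the next command. Rounding up matters: with a width rounded down, an odd-width page would be cut into two halves plus a third sliver one pixel wide.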

mogrify -crop CroppingSize Page*.jpg

This will generate pairs of files named PageN-0.jpg and PageN-1.jpg (if everything worked correctly). Now delete the PageN.jpg files from the previous step, and you are ready to convert all the PageN-M.jpg files into PDF files.
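Deleting by hand gets tedious for a long chapter, and rm Page*.jpg would take the halves with it. Here is a sketch of a narrower pattern, assuming the three-digit numbering produced by -sOutputFile=Page%03d.jpg in the first step; remove_originals is a hypothetical helper name.

```shell
# Hypothetical helper: remove only the un-split originals (PageNNN.jpg),
# leaving the halves (PageNNN-0.jpg and PageNNN-1.jpg) untouched.
# Assumes exactly three digits in the file name, as produced by %03d.
remove_originals() {
    dir=${1:-.}
    for f in "$dir"/Page[0-9][0-9][0-9].jpg; do
        # When nothing matches, the pattern is left as a literal string; skip it.
        [ -e "$f" ] || continue
        rm -- "$f"
    done
}

# Usage, from the directory holding the pages:
#   remove_originals
```

If the originals are left in place, the conversion below turns them into PDF files as well, and they end up in the final document next to their own halves.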

mogrify -format pdf -page A4 Page*.jpg

To finish, merge them all into one (big!) PDF file:

gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -sOutputFile=output.pdf Page*.pdf
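The merge relies on Page*.pdf expanding in page order, and this is where the zero padding from %03d in the first step pays off. The small sketch below only prints and sorts names, so it is safe to run anywhere; sorted_names is a made-up helper, and the C locale is assumed so that the ordering is reproducible.

```shell
# Hypothetical helper: format each number with the given printf pattern,
# then sort the names byte by byte (LC_ALL=C), roughly how a glob lists them.
sorted_names() {
    pattern=$1
    shift
    for n in "$@"; do
        printf "$pattern\n" "$n"
    done | LC_ALL=C sort
}

# Without padding, page 10 sorts before page 2:
#   sorted_names 'Page%d.pdf' 1 2 10      # Page1.pdf Page10.pdf Page2.pdf
# With %03d the names sort in page order:
#   sorted_names 'Page%03d.pdf' 1 2 10    # Page001.pdf Page002.pdf Page010.pdf
```

So if you change the output pattern in the first Ghostscript command, keep enough digits for the number of pages you scan.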

Now you are ready to follow the steps here to turn it into a booklet. Hope you found this useful!