Automate downloading books and PDFs on springerlink.com

Having an electronic version of a book is great. I can skim and search through it very easily. Although I do find hardcopies useful at times, I prefer softcopies 99% of the time due to their accessibility and searchability.

Most universities have deals with publishers where students can access the electronic version of a book at the publisher’s website. This saves me the trip of running to the library when I need a book and solves the “book checked out” issue.

Springer books can be found online at springerlink.com. You have to be on your school’s network (or connected through its VPN) to access them. The crappy thing is that they put up books chapter by chapter, so you have to save each chapter manually if you need the entire book. I’ve used wget before to download the PDF files easily. However, wget doesn’t seem to work anymore because the files are no longer plain HTML links. A quick query on Google (“springerlink download whole book”) yielded the springer_download Python script. It depends on stapler, which in turn depends on pyPdf. To install and use:

 <pre class="src src-sh">git clone git://github.com/milianw/springer_download
git clone http://github.com/hellerbarde/stapler.git
git clone http://github.com/mfenniak/pyPdf
cd pyPdf
sudo python setup.py install
cd ../stapler/
cp ./stapler.sh ~/Documents/bin/  ## or copy it to /usr/local/bin
cd ../springer_download
cp springer_download.py ~/Documents/bin  ## or copy it to /usr/local/bin
## to download
springer_download.py -l http://springerlink.com/content/HASH/STUFF
## output: a concatenated, full pdf file of the book
</pre>
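Calling `springer_download.py` by name only works if the directory it was copied to is on your `PATH` and the script is executable. Here is a minimal sketch of that setup, assuming a POSIX shell. The helper names `add_to_path` and `make_executable` are my own and are not part of springer_download or stapler.

```shell
# Hypothetical helpers (not part of springer_download or stapler).

# Prepend a directory to PATH for the current shell, unless it is already there.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                        # already present, nothing to do
        *) PATH="$1:$PATH"; export PATH ;;
    esac
}

# Mark the named scripts inside a directory as executable, skipping missing ones.
# $1 is the directory; the remaining arguments are file names.
make_executable() {
    dir="$1"
    shift
    for name in "$@"; do
        if [ -f "$dir/$name" ]; then
            chmod +x "$dir/$name"
        fi
    done
}
```

With the install location used above, that would be `add_to_path "$HOME/Documents/bin"` followed by `make_executable "$HOME/Documents/bin" springer_download.py stapler.sh`. The `PATH` change only lasts for the current shell session, so add the call to your shell startup file if you want it to stick.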

Very neat!

About Vinh Nguyen

Statistician

2 comments

  1. Great script! Exactly what I searched for. I installed it on the Mac and the only thing I needed to change was to also copy the staplerlib into the bin folder as subdirectory. Otherwise it can’t find the lib.

    Best, Andreas
