Friday, January 9, 2015

Retrieve a zip file from a website in an automated script


For reference: http://ift.tt/14AjL7H


I am trying to retrieve a zip file from a website that has no dedicated URL for it. I was doing well with Python mechanize and Beautiful Soup, but ran into a problem as I neared the end of the process.


After selecting the option I want in the table (via mechanize/bs4), I tried to get the mechanize browser to submit the form and retrieve my zip file. However, the "submit" button is just a GIF image with an



onclick="javascript:submit()"


call. When you click that button by hand in a browser, it redirects you to a generic ".....testdwn.cfm?RequestTimeout=2000" page and downloads your zip file, no matter which option you selected before clicking the image. That is what I mean by the zip file having no dedicated URL.


From what I've read online over the past few days, mechanize cannot execute JavaScript at all, so it seems I'm out of luck on that avenue. If mechanize could somehow just click that button, all would be well.
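
One avenue that might avoid JavaScript entirely, sketched here under assumptions: if the onclick handler does nothing more than submit the enclosing form, the same request can be built by hand from the form's action and fields. The sketch below uses only the Python 3 standard library. The names FormFieldParser, build_form_request and download are hypothetical helpers, not part of mechanize or bs4, and the real page's field names are unknown to me. It only reads input elements, so a choice made through a select element would have to be passed in through overrides.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin
from urllib.request import Request, urlopen

# Input types a browser does not send when a form is submitted by script.
SKIPPED_TYPES = {"submit", "image", "button", "reset", "file"}


class FormFieldParser(HTMLParser):
    """Collect the action, method and <input> values of the first <form>."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.method = "get"
        self.fields = {}          # one value per field name
        self._in_form = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            if not self._done and not self._in_form:
                self._in_form = True
                self.action = attrs.get("action") or ""
                self.method = (attrs.get("method") or "get").lower()
            return
        if not self._in_form or tag != "input" or not attrs.get("name"):
            return
        kind = (attrs.get("type") or "text").lower()
        if kind in ("radio", "checkbox"):
            # Only checked radio buttons and checkboxes are submitted.
            if "checked" in attrs:
                self.fields[attrs["name"]] = attrs.get("value") or "on"
        elif kind not in SKIPPED_TYPES:
            self.fields[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form" and self._in_form:
            self._in_form = False
            self._done = True


def build_form_request(page_url, html, overrides=None):
    """Build the request a plain form submission would send."""
    parser = FormFieldParser()
    parser.feed(html)
    if parser.action is None:
        raise ValueError("no <form> found in page")
    fields = dict(parser.fields)
    fields.update(overrides or {})    # e.g. the option chosen in the table
    target = urljoin(page_url, parser.action)
    body = urlencode(fields)
    if parser.method == "post":
        return Request(target, data=body.encode("ascii"))
    # For GET, the form data replaces any query string in the action.
    return Request(target.split("?", 1)[0] + "?" + body)


def download(request, dest_path):
    """Send the request and write the response body (the zip) to disk."""
    with urlopen(request, timeout=60) as resp, open(dest_path, "wb") as out:
        out.write(resp.read())
```

Usage might look like download(build_form_request(page_url, html, {"file": "a.zip"}), "data.zip"), where "file" and "a.zip" are placeholders for whatever field name and value the real form uses. If the site depends on session cookies or on extra work done inside the JavaScript handler, this would not be enough on its own.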


What method should I pursue to pull this data? I've read about Selenium, but I was wondering which option would be the simplest and most reliable: JavaScript-based, Python with Selenium, or something else? Python is preferred if it can be managed.


Thanks in advance!




