Get filename of downloadable binary from php url without actually downloading the file

Question

I'm doing some web scraping with Selenium in Python, and I have a link on a page that points to, e.g.:

<a href="/zip.php?zipid=103">Click Here To Download</a>

Now, of course, if I click on it, my browser will immediately start downloading the file, e.g. myinterestingarchive.zip.

What I'm wondering is whether I can inject some JavaScript, say, that will tell me the filename myinterestingarchive.zip WITHOUT my clicking the link. I'd like to record the filename in my program's log, and it's nowhere in the source or outerHTML; all I have is that PHP URL.

There's no filename "myinterestingarchive.zip" in that URL, only its id, which the server resolves to "myinterestingarchive.zip". You could try to get the filename with a HEAD-type AJAX call? — Teemu, Commented Dec 11, 2018 at 19:11

ewwink · Accepted Answer · 2018-12-13 07:10:38Z

If the server supports HEAD requests, which fetch only the HTTP headers and not the file body, you can do:

    import requests
    ......
    # build the request with the cookies from the Selenium session
    cookies = {c['name']: c['value'] for c in driver.get_cookies()}
    response = requests.head('http://....../zip.php?zipid=103', cookies=cookies)
    print(response.headers['Content-Disposition'])
    # attachment; filename=zip/myinterestingarchive.zip
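The Content-Disposition value above still has to be reduced to the bare filename before it goes into a log. Here is a minimal sketch of one way to do that with only the standard library; the helper name `filename_from_content_disposition` is made up for illustration, and dropping the `zip/` directory prefix is an assumption about what you want to record.

```python
from email.message import Message
import posixpath


def filename_from_content_disposition(header_value):
    """Return the bare filename from a Content-Disposition value, or None."""
    # Let the stdlib header parser handle quoting and parameter splitting.
    msg = Message()
    msg['Content-Disposition'] = header_value
    filename = msg.get_filename()
    if not filename:
        return None
    # Assumption: only the last path component is wanted (drops "zip/").
    return posixpath.basename(filename)


print(filename_from_content_disposition(
    'attachment; filename=zip/myinterestingarchive.zip'))
```

You would pass it `response.headers.get('Content-Disposition', '')` rather than indexing the header directly, since a server is not obliged to send that header on a HEAD response.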

And yes, you can do this with injected JavaScript, but it's simpler with requests.
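For completeness, the injected-JavaScript route might look roughly like the sketch below. It is untested against the site in question and assumes the link is same-origin with the page currently loaded in Selenium (cross-origin responses generally do not expose Content-Disposition to scripts) and that the server answers HEAD requests. The function name `content_disposition_via_browser` is hypothetical; `execute_async_script` is Selenium's own method.

```python
# JavaScript run inside the page: send a HEAD request with the browser's
# own cookies and hand the header value back through Selenium's callback.
HEAD_SCRIPT = """
const url = arguments[0];
const done = arguments[arguments.length - 1];
fetch(url, {method: 'HEAD', credentials: 'include'})
  .then(resp => done(resp.headers.get('Content-Disposition')))
  .catch(() => done(null));
"""


def content_disposition_via_browser(driver, url):
    """Return the raw Content-Disposition value seen by the browser, or None."""
    # Selenium appends its "done" callback as the last script argument.
    return driver.execute_async_script(HEAD_SCRIPT, url)
```

It would be called as `content_disposition_via_browser(driver, '/zip.php?zipid=103')`. Because the request is made by the logged-in browser itself, no cookie copying is needed, which is the main reason to consider this over requests.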

The reason I was using Selenium is that the site requires log-in and I couldn't figure out how the cookies worked, but this seems like the perfect time to finally try out requestium. — prooffreader, Commented Dec 12, 2018 at 21:07
