I'm doing some webscraping with Selenium in Python and I have a link on a page that points to, e.g.

<a href="/zip.php?zipid=103">Click Here To Download</a> 

Now, of course, if I click on it, my browser will immediately start downloading the file, e.g. myinterestingarchive.zip

What I'm wondering is whether I can inject some JavaScript, say, that will tell me the filename myinterestingarchive.zip WITHOUT clicking the link. I'd like to record the filename in my program's log, and it's nowhere in the page source or outerHTML, just that PHP URL.

  • There's no filename "myinterestingarchive.zip" in that URL, only its id, which the server resolves to "myinterestingarchive.zip". You could try to get the filename with a HEAD-type AJAX request?
    – Teemu
    Commented Dec 11, 2018 at 19:11

1 Answer

If the server supports HEAD requests, which return only the HTTP headers without downloading the file, you can do:

    import requests
    # ......

    # set up the request with the Selenium session's cookies
    cookies = {c['name']: c['value'] for c in driver.get_cookies()}
    response = requests.head('http://....../zip.php?zipid=103', cookies=cookies)
    print(response.headers['Content-Disposition'])
    # attachment; filename=zip/myinterestingarchive.zip
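The header value printed above still contains the `attachment;` prefix and a `zip/` directory part. As a minimal sketch of pulling out just the bare filename for a log entry, the standard library's header parser can be used. The helper name `filename_from_content_disposition` is hypothetical, and the sketch assumes the server sends a plain `filename=` parameter like the one shown above.

```python
import posixpath
from email.message import Message


def filename_from_content_disposition(header_value):
    """Return the bare filename from a Content-Disposition value, or None."""
    # Let the stdlib header parser handle quoting and parameter splitting.
    msg = Message()
    msg['Content-Disposition'] = header_value
    filename = msg.get_filename()
    if filename is None:
        return None
    # Drop any directory part the server included, e.g. "zip/".
    return posixpath.basename(filename)


print(filename_from_content_disposition(
    'attachment; filename=zip/myinterestingarchive.zip'))
```

In the snippet above this would be called as `filename_from_content_disposition(response.headers['Content-Disposition'])`. If the server omits the header on HEAD responses, that lookup raises `KeyError`, so `response.headers.get(...)` may be the safer choice.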

And yes, you can do this with injected JavaScript, but it's simpler using requests.
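For completeness, here is a rough sketch of the injected-JavaScript route. It runs a HEAD `fetch` inside the logged-in page through Selenium's `execute_async_script`, so the browser's own cookies are used. The function name `content_disposition_via_browser` and the constant `HEAD_SCRIPT` are made up for illustration, and the sketch assumes the download URL is on the same origin as the current page and that the server answers HEAD requests with the same headers as GET.

```python
# JavaScript executed in the page. Selenium passes our arguments first and
# appends a callback as the last argument; calling it returns a value to Python.
HEAD_SCRIPT = """
const url = arguments[0];
const done = arguments[arguments.length - 1];
fetch(url, {method: 'HEAD', credentials: 'include'})
    .then(resp => done(resp.headers.get('Content-Disposition')))
    .catch(() => done(null));
"""


def content_disposition_via_browser(driver, url):
    """Return the Content-Disposition header for url, or None if unavailable."""
    # driver is an existing Selenium WebDriver already on the logged-in site.
    return driver.execute_async_script(HEAD_SCRIPT, url)
```

With the link from the question, the call would look like `content_disposition_via_browser(driver, '/zip.php?zipid=103')`. A relative URL is resolved against the page the browser currently has open.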

  • The reason I was using Selenium is that the site requires a login and I couldn't figure out how the cookies worked, but this seems like the perfect time to finally try out requestium.
    Commented Dec 12, 2018 at 21:07
  • I added code showing how to use the Selenium cookies in requests.
    – ewwink
    Commented Dec 13, 2018 at 7:11
