3

I am working on a screen-scraping tool in Python. Looking through the source of the webpage, I noticed that most of the data is generated by JavaScript.

Any idea how to scrape a JavaScript-based webpage? Is there a tool for this in Python?

Thanks

3
  • 3
    Why not just consume the JavaScript directly?
    Commented Nov 18, 2011 at 14:07
  • 2
    Duplicate of stackoverflow.com/questions/2148493/…
    – hymloth
    Commented Nov 18, 2011 at 14:16
  • How do you consume the JavaScript directly? For instance, how do you call the JS function JS_Function(var1,var2,var3) from Python?
    – Kiran
    Commented Nov 18, 2011 at 21:34
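
One way to read "consume the JavaScript directly" is to pull the data out of the script source itself, without running it. The sketch below assumes the page assigns a strict JSON literal to a variable in an inline script. The variable name `pageData`, the sample markup, and the helper `extract_js_object` are all hypothetical and only serve as illustration. It will not work if the literal uses JavaScript-only syntax such as unquoted keys or single-quoted strings.

```python
import json
import re


def extract_js_object(html, var_name):
    """Return the JSON value assigned to `var_name` in an inline script, or None.

    Assumes the right-hand side of the assignment is valid JSON.
    """
    # Locate "var NAME = ", "let NAME = " or "const NAME = " in the page source.
    pattern = re.compile(
        r"\b(?:var|let|const)\s+" + re.escape(var_name) + r"\s*=\s*"
    )
    match = pattern.search(html)
    if match is None:
        return None
    # raw_decode parses a single JSON value starting at the given index
    # and ignores whatever follows it (here, the trailing ";").
    decoder = json.JSONDecoder()
    try:
        value, _end = decoder.raw_decode(html, match.end())
    except ValueError:
        return None
    return value


# Hypothetical page source, standing in for a downloaded document.
SAMPLE_HTML = """
<html><body>
<script>
var pageData = {"items": [{"name": "widget", "price": 9.5}], "total": 1};
render(pageData);
</script>
</body></html>
"""

data = extract_js_object(SAMPLE_HTML, "pageData")
print(data["items"][0]["name"])
```

If the data is instead loaded by a separate request after the page renders, this approach does not apply and the script has to be executed by a browser engine.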

3 Answers

5

Scraping JavaScript-based webpages is possible with Selenium. In particular, try Selenium WebDriver.
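
A minimal sketch of that approach, assuming the `selenium` package and a Chrome driver are installed: load the page, let the browser run its scripts, and read back the resulting HTML. The function name `fetch_rendered_html` is hypothetical. The `--headless` argument is an assumption about the setup that keeps Chrome from opening a visible window.

```python
def fetch_rendered_html(url, driver=None):
    """Load `url` in a browser and return the HTML after scripts have run.

    If no driver is passed in, a headless Chrome driver is created and
    closed again when the function returns.
    """
    owns_driver = driver is None
    if owns_driver:
        # Imported here so the function can also be used with a
        # caller-supplied driver object.
        from selenium import webdriver

        options = webdriver.ChromeOptions()
        options.add_argument("--headless")  # no visible browser window
        driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # page_source reflects the DOM after the page's scripts have run.
        return driver.page_source
    finally:
        if owns_driver:
            driver.quit()


if __name__ == "__main__":
    # Placeholder URL; replace with the page you want to scrape.
    html = fetch_rendered_html("http://example.com")
    print(len(html))
```

The returned string can then be handed to any HTML parser. WebDriver also provides `driver.execute_script("return ...")`, which runs a JavaScript expression in the page and returns its result to Python.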

2
  • I tried Selenium. I do not want to mimic user actions. From running a sample program, I see that it opens a browser window and mimics the actions. I do not want that. I want to extract the data from the webpage into my code.
    – Kiran
    Commented Nov 19, 2011 at 11:18
  • 1
    You don't have to mimic user actions if you don't need to. Just download the page and parse it. The point of using Selenium is that it processes JavaScript for you.
    – unutbu
    Commented Nov 19, 2011 at 12:28
4

I use WebKit, which is the browser renderer behind Chrome and Safari. There are Python bindings to WebKit through Qt.

And here is a full Python example to execute JavaScript and extract the final HTML.

3

You can use the QtWebKit module of the PyQt4 library.
