I am writing a web scraper for a particular web page using "urllib2.Request(MyURL)" and "BeautifulSoup". The problem is that the page at MyURL is paginated, and the next page loads (at the same MyURL) when a link is clicked. Behind that link is a JavaScript call written as

{ javascript:__doPostBack('rptPagingBottom$ctl01$btnPage','') }. 

Without executing this JavaScript function from Python, I cannot get the complete listing. How can I call this JavaScript function from Python so that I can get all pages of the listing?

I found a related question that suggests using a JavaScript engine (Rhino, V8, SpiderMonkey), but I did not understand it at all. I would appreciate some example code if possible.

    1 Answer

    Try Selenium for this kind of dirty work (inline JS, AJAX page loading). Driven from Python through a browser driver, it can do exactly what a real browser does, including running the page's JavaScript.

    You can find more on using it as a crawler by searching Google for 'selenium crawler'.
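
As a rough illustration of the Selenium approach, here is a minimal sketch rather than a tested solution for your site. It assumes Selenium 4 with Firefox and its driver installed, that all paging links appear on the first page, that their targets start with "rptPagingBottom" (taken from your link), and that each postback reloads the whole page. The function names `postback_targets` and `scrape_all_pages` are made up for this example.

```python
import html
import re

# Matches calls such as __doPostBack('rptPagingBottom$ctl01$btnPage','')
POSTBACK_RE = re.compile(r"__doPostBack\('([^']*)'\s*,\s*'([^']*)'\)")


def postback_targets(page_html, prefix="rptPagingBottom"):
    """Return unique (target, argument) pairs of __doPostBack calls, in page order."""
    found = []
    # Unescape first, because quotes in attributes are often encoded as &#39;
    for target, argument in POSTBACK_RE.findall(html.unescape(page_html)):
        pair = (target, argument)
        if target.startswith(prefix) and pair not in found:
            found.append(pair)
    return found


def scrape_all_pages(url, timeout=10):
    """Open `url` in a browser, fire each paging postback, return the HTML of every page."""
    # Imported here so the parsing helper above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Firefox()  # assumption: Firefox and geckodriver are available
    try:
        driver.get(url)
        pages = [driver.page_source]
        for target, argument in postback_targets(pages[0]):
            old_root = driver.find_element(By.TAG_NAME, "html")
            # Call the page's own JavaScript function, as clicking the link would.
            driver.execute_script(
                "__doPostBack(arguments[0], arguments[1]);", target, argument
            )
            # Assumption: a full reload follows, so the old <html> element goes stale.
            WebDriverWait(driver, timeout).until(EC.staleness_of(old_root))
            pages.append(driver.page_source)
        return pages
    finally:
        driver.quit()
```

Each string returned by `scrape_all_pages(MyURL)` can then be passed to BeautifulSoup as before. If the pager only updates part of the page through AJAX, the staleness wait will time out, and you would need to wait on an element inside the listing instead.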
