All Questions
743 questions
295 votes
17 answers
506k views
How can I scrape a page with dynamic content (created by JavaScript) in Python?
I'm trying to develop a simple web scraper. I want to extract plain text without HTML markup. My code works on plain (static) HTML, but not when content is generated by JavaScript embedded in the page....
46 votes
9 answers
132k views
How do I call a Javascript function from Python?
I am working on a web-scraping project. One of the websites I am working with has its data coming from JavaScript. There was a suggestion on one of my earlier questions that I can directly call the ...
40 votes
7 answers
39k views
How to convert raw javascript object to a dictionary?
When screen-scraping some website, I extract data from <script> tags. The data I get is not in standard JSON format. I cannot use json.loads(). # from js_obj = '{x:1, y:2, z:3}' # to py_obj = {'...
25 votes
2 answers
11k views
How to use CrawlSpider from scrapy to click a link with javascript onclick?
I want Scrapy to crawl pages where going on to the next link looks like this: <a href="#" onclick="return gotoPage('2');"> Next </a> Will Scrapy be able to interpret JavaScript code of ...
16 votes
2 answers
46k views
Fetch data of variables inside script tag in Python or Content added from js
I want to fetch data from another URL, for which I am using urllib and Beautiful Soup. My data is inside a table tag (which I have figured out using the Firefox console). But when I tried to fetch table ...
14 votes
4 answers
20k views
How can I parse Javascript variables using python?
The problem: A website I am trying to gather data from uses JavaScript to produce a graph. I'd like to be able to pull the data that is being used in the graph, but I am not sure where to start. For ...
12 votes
3 answers
23k views
How to call JavaScript function using BeautifulSoup and Python
I am performing web scraping to grab data from a website as part of my project. I can make the request and grab the data that is present in the DOM. However, some data is getting rendered on ...
12 votes
1 answer
15k views
Get data from <script> tag in HTML using Scrapy
I've been trying to extract data from a script tag in Kbb's HTML using Scrapy (XPath). But my main issue is with identifying the correct div and script tags. I'm new to using XPath and would appreciate ...
12 votes
3 answers
11k views
How to scrape HTTPS javascript web pages
I am trying to monitor day-to-day prices from an online catalogue. The site uses HTTPS and generates the catalogue pages with JavaScript. How can I interface with the site and make it generate the ...
10 votes
1 answer
6k views
Crawling through pages with PostBack data javascript Python Scrapy
I'm crawling through some directories with ASP.NET programming via Scrapy. The pages to crawl through are encoded as such: javascript:__doPostBack('ctl00$MainContent$List','Page$X') where X is an ...
9 votes
3 answers
31k views
How can I render JavaScript HTML to HTML in python?
I have looked around and only found solutions that render a URL to HTML. However, I need a way to render a webpage (that I already have, and that has JavaScript) to proper HTML. Want: ...
9 votes
2 answers
10k views
Can I scrape the raw data from highcharts.js?
I want to scrape the data from a page that shows a graph using highcharts.js, and so I finished parsing all the pages leading up to the following page. However, the last page, the one that displays the ...
8 votes
3 answers
30k views
Selenium Python: How to Scroll Down in a Pop Up window
I am working on a LinkedIn web scraping project. I am trying to get the list of companies that interest someone (note that I am not using the API). It is a dynamic website, so I would need to scroll down ...
8 votes
1 answer
10k views
Python web scraping for javascript generated content
I am trying to use Python 3 to return the BibTeX citation generated by http://www.doi2bib.org/. The URLs are predictable, so the script can work out the URL without having to interact with the web page....
8 votes
1 answer
6k views
Execute Javascript method on web page from Python
I am writing a web scraper for a particular webpage, and I am doing this with "urllib2.Request(MyURL)" and "BeautifulSoup", but the problem is that there is paging on the page at MyURL and the next page ...