46

I am working on a web-scraping project. One of the websites I am working with generates its data with JavaScript.

There was a suggestion on one of my earlier questions that I can call the JavaScript directly from Python, but I'm not sure how to accomplish this.

For example: If a JavaScript function is defined as: add_2(var,var2)

How would I call that JavaScript function from Python?

1
  • 1
    If it's something you know and can easily simulate, it may be easiest to parse and interpret it yourself. If not, you could end up needing to tie into a JavaScript engine.
    Commented Nov 27, 2011 at 10:07
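To illustrate the "parse and interpret it yourself" route from the comment above, here is a minimal sketch. It assumes the page embeds its data as a JSON-compatible object literal assigned to a JavaScript variable. The page source, the variable name `chartData`, and the helper `extract_js_object` are all hypothetical; real pages may need a more careful parser.

```python
import json
import re

# Hypothetical page source: the data sits in a JavaScript variable assignment.
PAGE_SOURCE = """
<script type="text/javascript">
  var chartData = {"labels": ["a", "b"], "values": [1, 2]};
</script>
"""


def extract_js_object(page_source, var_name):
    """Return the JSON-compatible object literal assigned to var_name."""
    # Capture everything from the opening brace up to a closing brace
    # that is directly followed by a semicolon.
    pattern = r"var\s+" + re.escape(var_name) + r"\s*=\s*(\{.*?\})\s*;"
    match = re.search(pattern, page_source, re.DOTALL)
    if match is None:
        raise ValueError(f"no assignment to {var_name!r} found")
    return json.loads(match.group(1))


data = extract_js_object(PAGE_SOURCE, "chartData")
print(data["values"])
```

This only works when the literal is valid JSON (double-quoted keys, no trailing commas, no function calls). Anything beyond that is where a real JavaScript engine becomes necessary.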

9 Answers

19

Find a JavaScript interpreter that has Python bindings. (Try Rhino? V8? SpiderMonkey?) When you have found one, it should come with examples of how to use it from Python.

Python itself, however, does not include a JavaScript interpreter.

2
  • 5
    You should definitely check out pyv8, which offers a Python wrapper for Google's V8 engine. This contains information about using Python with SpiderMonkey. Hope this helps.
    – Codahk
    Commented Nov 27, 2011 at 11:07
  • 3
    If you want to support more than one JavaScript engine, you should have a look at PyExecJS
    – Wienczny
    Commented Aug 19, 2013 at 2:36
7

An interesting alternative I discovered recently is the Python bond module, which can be used to communicate with a NodeJs process (v8 engine).

Usage would be very similar to the pyv8 bindings, but you can directly use any NodeJs library without modification, which is a major selling point for me.

Your python code would look like this:

val = js.call('add2', var1, var2) 

or even:

add2 = js.callable('add2')
val = add2(var1, var2)

Calling functions this way is definitely slower than with pyv8, so it greatly depends on your needs. If you need to use an npm package that does a lot of heavy lifting, bond is great. You can even have several Node.js processes running in parallel.

But if you just need to call a bunch of JS functions (for instance, to have the same validation functions between the browser/backend), pyv8 will definitely be a lot faster.

    7

    To interact with JavaScript from Python I use WebKit, which is the browser renderer behind Safari (and behind Chrome until its 2013 switch to Blink). There are Python bindings to WebKit through Qt. In particular, there is a function for executing JavaScript called evaluateJavaScript().

    Here is a full example to execute JavaScript and extract the final HTML.

    0
      6

      You can execute JavaScript code or files from Python using pythonmonkey.

      Install with: $ pip install pythonmonkey

      Example:

      import pythonmonkey as pm

      js_code = """
      function add(a, b) { return a + b; }
      function subtract(a, b) { return a - b; }
      """

      pm.eval(js_code)
      js_add = pm.eval('add')
      js_sub = pm.eval('subtract')
      print(js_add(1, 2))  # 3.0
      print(js_sub(1, 2))  # -1.0

      Or you can require a file or module using CommonJS syntax:

      import pythonmonkey as pm
      CryptoJS = pm.require('crypto-js')
      ...
        2

        You can, in the end, fetch the JavaScript from the page and execute it through an interpreter (such as V8 or Rhino). However, you can get a good result much more easily with a functional testing tool such as Selenium or Splinter. These tools launch a browser and actually load the page. This can be slow, but it ensures that the content the browser would display is available.

        For example, consider the HTML document below:

        <html>
        <head>
          <title>Test</title>
          <script type="text/javascript">
            function addContent(divId) {
              var div = document.getElementById(divId);
              div.innerHTML = '<em>My content!</em>';
            }
          </script>
        </head>
        <body>
          <p>The element below will receive content</p>
          <div id="mydiv"></div>
          <script type="text/javascript">addContent('mydiv')</script>
        </body>
        </html>

        The script below will use Splinter. Splinter will launch Firefox and after the complete load of the page it will get the content added to a div by JavaScript:

        import os.path
        from splinter import Browser

        browser = Browser()  # launches Firefox by default
        browser.visit('file://' + os.path.realpath('test.html'))
        elements = browser.find_by_css("#mydiv")  # the div filled in by JavaScript
        div = elements[0]
        print(div.value)
        browser.quit()

        The content of the div is printed to stdout.

        1
        • Selenium is too heavy. On the other hand, I didn't know about Splinter. I'm used to spynner (built on top of PyQt4 + autopy).
          – c24b
          Commented Apr 1, 2015 at 14:16
        2

        You might call node through Popen.

        My example of how to do it:

        # execute() is the author's helper; its definition is not included in this answer
        print(execute('''function (args) {
            var result = 0;
            args.map(function (i) { result += i; });
            return result;
        }''', args=[[1, 2, 3, 4, 5]]))
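Since the helper used in this answer is not shown, here is one possible sketch of the "call node through Popen" idea using `subprocess.run`. It is not the answer author's code. The names `build_script`, `call_js`, and `JS_SOURCE` are hypothetical, the parameters are called `var1`/`var2` because `var` is a reserved word in JavaScript, and the sketch assumes a `node` executable on the PATH and a function that returns a JSON-serializable value.

```python
import json
import shutil
import subprocess

# Hypothetical JavaScript source; in practice this would come from the scraped page.
JS_SOURCE = "function add_2(var1, var2) { return var1 + var2; }"


def build_script(js_source, func_name, args):
    """Return a Node.js program that defines the functions in js_source,
    calls func_name with args, and prints the result as JSON."""
    return (
        f"{js_source}\n"
        f"const args = {json.dumps(list(args))};\n"
        f"console.log(JSON.stringify({func_name}(...args)));\n"
    )


def call_js(js_source, func_name, *args, node="node"):
    """Run func_name(*args) in a Node.js subprocess and return the decoded result."""
    script = build_script(js_source, func_name, args)
    completed = subprocess.run(
        [node, "-e", script],  # -e evaluates the script passed on the command line
        capture_output=True,
        text=True,
        check=True,   # raise CalledProcessError if node exits with an error
        timeout=10,
    )
    return json.loads(completed.stdout)


if shutil.which("node"):
    print(call_js(JS_SOURCE, "add_2", 1, 2))
else:
    print("node not found on PATH; skipping the demo call")
```

Arguments and the return value travel as JSON, so this only suits plain data. Each call also starts a new Node.js process, which is slow if you need many calls.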
        1
        • 2
          Please include the code snippet as part of your answer.
          Commented Oct 17, 2017 at 15:10
        0

        One possible solution is to use AJAX with Flask to communicate between JavaScript and Python. You run a server with Flask and then open the website in a browser. This way you can run JavaScript functions when the page is created from Python code, or from a button click, as in this example.

        HTML code:

        <html>
        <script src="//ajax.googleapis.com/ajax/libs/jquery/1.9.1/jquery.min.js"></script>
        <script>
        function pycall() {
            $.getJSON('/pycall', {content: "content from js"}, function(data) {
                alert(data.result);
            });
        }
        </script>
        <button type="button" onclick="pycall()">click me</button>
        </html>

        Python Code:

        import webbrowser

        from flask import Flask, jsonify, render_template, request

        app = Flask(__name__)


        def load_file(file_name):
            data = None
            with open(file_name, 'r') as file:
                data = file.read()
            return data


        @app.route('/pycall')
        def pycall():
            content = request.args.get('content', 0, type=str)
            print("call_received", content)
            return jsonify(result="data from python")


        @app.route('/')
        def index():
            return load_file("basic.html")


        print("opening localhost")
        url = "http://127.0.0.1:5000/"
        webbrowser.open(url)
        app.run()

        output in python:

        call_received content from js

        alert in browser:

        data from python
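The round trip in this answer (the page's JavaScript requests `/pycall?content=...`, and Python replies with JSON) can also be tried without Flask or a browser. The sketch below is a standard-library stand-in, not the Flask code above: `PyCallHandler` and `fetch_pycall` are hypothetical names, and the Python client plays the role of the `$.getJSON` call.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlencode, urlparse


class PyCallHandler(BaseHTTPRequestHandler):
    """Answers GET /pycall?content=... with a small JSON document."""

    def do_GET(self):
        url = urlparse(self.path)
        if url.path != "/pycall":
            self.send_error(404)
            return
        content = parse_qs(url.query).get("content", [""])[0]
        body = json.dumps({"result": "data from python", "received": content}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch_pycall(content):
    """Start a throwaway server, send it one request, and return the parsed JSON."""
    server = HTTPServer(("127.0.0.1", 0), PyCallHandler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/pycall?" + urlencode({"content": content}))
        reply = json.loads(conn.getresponse().read())
        conn.close()
        return reply
    finally:
        server.shutdown()
        server.server_close()


reply = fetch_pycall("content from js")
print(reply["result"])
```

Note that this pattern lets JavaScript call Python rather than the other way round, so it fits best when you control the page that is being served.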

          0

          This worked for me for a simple JS file. Source: https://www.geeksforgeeks.org/how-to-run-javascript-from-python/

          pip install js2py
          pip install temp

          file.py

          import js2py

          eval_res, tempfile = js2py.run_file("scripts/dev/test.js")
          tempfile.wish("GeeksforGeeks")

          scripts/dev/test.js

          function wish(name) {
              console.log("Hello, " + name + "!")
          }
            -2

            Did a whole run-down of the different methods recently.

            PyQt4
            node.js/zombie.js
            phantomjs

            Phantomjs was the winner hands down, very straightforward with lots of examples.

            1
            • 3
              You can improve this answer by adding a code snippet that answers the question.
              Commented Apr 3, 2017 at 14:28
