Rendering data as graphs
Learn how to visualize the programming languages from your repository using the D3.js library and Ruby Octokit.

In this guide, we're going to use the API to fetch information about repositories that we own, and the programming languages that make them up. Then, we'll visualize that information in a couple of different ways using the D3.js library. To interact with the {% data variables.product.github %} API, we'll be using the excellent Ruby library, Octokit.

If you haven't already, you should read the Basics of Authentication guide before starting this example. You can find the complete source code for this project in the platform-samples repository.

Let's jump right in!

Setting up an {% data variables.product.prodname_oauth_app %}

First, register a new application on {% data variables.product.github %}. Set the main and callback URLs to http://localhost:4567/. As before, we're going to handle authentication for the API by implementing a Rack middleware using sinatra-auth-github:

require 'sinatra/auth/github'

module Example
  class MyGraphApp < Sinatra::Base
    # !!! DO NOT EVER USE HARD-CODED VALUES IN A REAL APP !!!
    # Instead, set and test environment variables, like below
    # if ENV['GITHUB_CLIENT_ID'] && ENV['GITHUB_CLIENT_SECRET']
    #   CLIENT_ID     = ENV['GITHUB_CLIENT_ID']
    #   CLIENT_SECRET = ENV['GITHUB_CLIENT_SECRET']
    # end

    CLIENT_ID = ENV['GH_GRAPH_CLIENT_ID']
    CLIENT_SECRET = ENV['GH_GRAPH_SECRET_ID']

    enable :sessions

    set :github_options, {
      :scopes       => "repo",
      :secret       => CLIENT_SECRET,
      :client_id    => CLIENT_ID,
      :callback_url => "/"
    }

    register Sinatra::Auth::Github

    get '/' do
      if !authenticated?
        authenticate!
      else
        access_token = github_user["token"]
      end
    end
  end
end
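
The commented-out guard in the listing above hints at checking that both environment variables are set before using them. A minimal sketch of that check follows, assuming a hypothetical helper named `load_github_credentials` (it is not part of sinatra-auth-github) and the same variable names as the listing:

```ruby
# Hypothetical helper: reads the OAuth app credentials from an
# environment-like hash and fails fast if either one is missing.
def load_github_credentials(env = ENV)
  {
    client_id: env.fetch('GH_GRAPH_CLIENT_ID') { raise KeyError, 'GH_GRAPH_CLIENT_ID is not set' },
    secret: env.fetch('GH_GRAPH_SECRET_ID') { raise KeyError, 'GH_GRAPH_SECRET_ID is not set' }
  }
end
```

Called with no arguments, the helper reads from `ENV`. Passing a plain hash instead makes the check easy to exercise without touching the real environment.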

Set up a config.ru file similar to the one in the previous example:

ENV['RACK_ENV'] ||= 'development'
require "rubygems"
require "bundler/setup"

require File.expand_path(File.join(File.dirname(__FILE__), 'server'))

run Example::MyGraphApp

Fetching repository information

This time, to talk to the {% data variables.product.github %} API, we're going to use the Octokit Ruby library. This is much easier than making a bunch of REST calls directly. Octokit was also developed by a GitHubber and is actively maintained.

Authentication with the API via Octokit is easy. Just pass your login and token to the Octokit::Client constructor:

if !authenticated?
  authenticate!
else
  octokit_client = Octokit::Client.new(:login => github_user.login, :oauth_token => github_user.token)
end

Let's do something interesting with the data about our repositories. We're going to see the different programming languages they use, and count which ones are used most often. To do that, we'll first need a list of our repositories from the API. With Octokit, that looks like this:

repos = octokit_client.repositories

Next, we'll iterate over each repository, and count the language that {% data variables.product.github %} associates with it:

language_obj = {}
repos.each do |repo|
  # sometimes language can be nil
  if repo.language
    if !language_obj[repo.language]
      language_obj[repo.language] = 1
    else
      language_obj[repo.language] += 1
    end
  end
end

language_obj.to_s
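
The same count can be written more compactly with a hash whose default value is 0. The sketch below uses a stand-in `Repo` struct instead of real Octokit responses, so the repository names and languages are made up for illustration:

```ruby
# Stand-in for the repository objects Octokit returns (illustration only).
Repo = Struct.new(:name, :language)

# Counts how many repositories report each primary language,
# skipping repositories whose language is nil.
def count_languages(repos)
  repos.each_with_object(Hash.new(0)) do |repo, counts|
    counts[repo.language] += 1 if repo.language
  end
end

sample_repos = [
  Repo.new('site', 'JavaScript'),
  Repo.new('api', 'Ruby'),
  Repo.new('docs', nil),
  Repo.new('widgets', 'JavaScript')
]

puts count_languages(sample_repos).to_s
```

With the default value in place, there is no need for the explicit "first time we see this language" branch.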

When you restart your server, your web page should display something that looks like this:

{"JavaScript"=>13,"PHP"=>1,"Perl"=>1,"CoffeeScript"=>2,"Python"=>1,"Java"=>3,"Ruby"=>3,"Go"=>1,"C++"=>1}

So far, so good, but not very human-friendly. A visualization would be great in helping us understand how these language counts are distributed. Let's feed our counts into D3 to get a neat bar graph representing the popularity of the languages we use.

Visualizing language counts

D3.js, or just D3, is a comprehensive library for creating many kinds of charts, graphs, and interactive visualizations. Using D3 in detail is beyond the scope of this guide, but for a good introductory article, check out D3 for Mortals.

D3 is a JavaScript library, and likes working with data as arrays. So, let's convert our Ruby hash into a JSON array for use by JavaScript in the browser.

languages = []
language_obj.each do |lang, count|
  languages.push :language => lang, :count => count
end

erb :lang_freq, :locals => { :languages => languages.to_json }

We're simply iterating over each key-value pair in our object and pushing them into a new array. We didn't do this earlier because we didn't want to iterate over language_obj while we were still building it.
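
The conversion can also be expressed with `map`, which builds the array in one pass. Here is a minimal sketch using Ruby's standard `json` library and made-up counts; the helper name `to_language_array` is hypothetical:

```ruby
require 'json'

# Turns a { language => count } hash into the array-of-objects shape
# that the D3 bar chart code expects.
def to_language_array(language_obj)
  language_obj.map { |lang, count| { :language => lang, :count => count } }
end

sample_counts = { 'JavaScript' => 13, 'Ruby' => 3 }
puts to_language_array(sample_counts).to_json
# prints [{"language":"JavaScript","count":13},{"language":"Ruby","count":3}]
```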

Now, lang_freq.erb is going to need some JavaScript to support rendering a bar graph. For now, you can just use the code provided here, and refer to the resources linked above if you want to learn more about how D3 works:

<!DOCTYPE html>
<meta charset="utf-8">
<html>
  <head>
    <script src="//cdnjs.cloudflare.com/ajax/libs/d3/3.0.1/d3.v3.min.js"></script>
    <style>
    svg {
      padding: 20px;
    }
    rect {
      fill: #2d578b
    }
    text {
      fill: white;
    }
    text.yAxis {
      font-size: 12px;
      font-family: Helvetica, sans-serif;
      fill: black;
    }
    </style>
  </head>
  <body>
    <p>Check this sweet data out:</p>
    <div id="lang_freq"></div>
  </body>
  <script>
    var data = <%= languages %>;

    var barWidth = 40;
    var width = (barWidth + 10) * data.length;
    var height = 300;

    var x = d3.scale.linear().domain([0, data.length]).range([0, width]);
    var y = d3.scale.linear().domain([0, d3.max(data, function(datum) { return datum.count; })]).
      rangeRound([0, height]);

    // add the canvas to the DOM
    var languageBars = d3.select("#lang_freq").
      append("svg:svg").
      attr("width", width).
      attr("height", height);

    languageBars.selectAll("rect").
      data(data).
      enter().
      append("svg:rect").
      attr("x", function(datum, index) { return x(index); }).
      attr("y", function(datum) { return height - y(datum.count); }).
      attr("height", function(datum) { return y(datum.count); }).
      attr("width", barWidth);

    languageBars.selectAll("text").
      data(data).
      enter().
      append("svg:text").
      attr("x", function(datum, index) { return x(index) + barWidth; }).
      attr("y", function(datum) { return height - y(datum.count); }).
      attr("dx", -barWidth/2).
      attr("dy", "1.2em").
      attr("text-anchor", "middle").
      text(function(datum) { return datum.count; });

    languageBars.selectAll("text.yAxis").
      data(data).
      enter().append("svg:text").
      attr("x", function(datum, index) { return x(index) + barWidth; }).
      attr("y", height).
      attr("dx", -barWidth/2).
      attr("text-anchor", "middle").
      text(function(datum) { return datum.language; }).
      attr("transform", "translate(0, 18)").
      attr("class", "yAxis");
  </script>
</html>

Phew! Again, don't worry about what most of this code is doing. The relevant part is the first line of the script, var data = <%= languages %>;, which shows that we're passing our previously created languages array into the ERB template, where the JavaScript can work with it.
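
To see what that line does outside of Sinatra, here is a small sketch that renders the same kind of template with Ruby's standard `erb` library. The one-line template and the sample data are stand-ins for lang_freq.erb and the real counts, and `render_data_line` is a hypothetical helper name:

```ruby
require 'erb'
require 'json'

# Renders a one-line stand-in for lang_freq.erb, substituting the JSON
# string wherever the template references the `languages` local.
def render_data_line(languages_json)
  template = ERB.new('var data = <%= languages %>;')
  template.result_with_hash(languages: languages_json)
end

sample = [{ :language => 'Ruby', :count => 3 }].to_json
puts render_data_line(sample)
# prints var data = [{"language":"Ruby","count":3}];
```

Because JSON array syntax is also valid JavaScript, the browser sees an ordinary array literal on the right-hand side of the assignment.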

As the "D3 for Mortals" guide suggests, this isn't necessarily the best use of D3. But it does serve to illustrate how you can use the library, along with Octokit, to make some really amazing things.

Combining different API calls

Now it's time for a confession: the language attribute within repositories only identifies the "primary" language defined. That means that if you have a repository that combines several languages, the one with the most bytes of code is considered to be the primary language.
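
As a rough illustration of that rule, the sketch below picks the language with the most bytes from a hash of byte counts per language. The byte counts are invented, `primary_language` is a hypothetical helper, and GitHub's real language detection involves more than this one comparison:

```ruby
# Returns the language with the largest byte count, or nil for an empty hash.
def primary_language(bytes_by_language)
  lang, _bytes = bytes_by_language.max_by { |_lang, bytes| bytes }
  lang
end

puts primary_language({ 'Ruby' => 12_000, 'JavaScript' => 48_500, 'CSS' => 3_100 })
# prints JavaScript
```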

Let's combine a few API calls to get a true representation of which language has the greatest number of bytes written across all our code. A treemap is a good way to visualize the relative sizes of the languages we use, rather than simply the count. We'll need to construct an array of objects that looks something like this:

[ { "name": "language1", "size": 100}, { "name": "language2", "size": 23} ... ]

Since we already have a list of repositories above, let's inspect each one, and call the GET /repos/{owner}/{repo}/languages endpoint:

repos.each do |repo|
  repo_name = repo.name
  repo_langs = octokit_client.languages("#{github_user.login}/#{repo_name}")
end

From there, we'll cumulatively add each language found to a list of languages:

repo_langs.each do |lang, count|
  if !language_obj[lang]
    language_obj[lang] = count
  else
    language_obj[lang] += count
  end
end
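
Put together, the per-repository loop and the accumulation step might look like the sketch below. To keep it runnable without network access, the per-repository responses are plain hashes with invented byte counts rather than real `octokit_client.languages` results, and `total_bytes_by_language` is a hypothetical helper name:

```ruby
# Sums byte counts per language across several repositories.
# Each element of repo_language_lists is a { language => bytes } hash.
def total_bytes_by_language(repo_language_lists)
  repo_language_lists.each_with_object(Hash.new(0)) do |repo_langs, totals|
    repo_langs.each { |lang, count| totals[lang] += count }
  end
end

sample_responses = [
  { 'Ruby' => 1200, 'JavaScript' => 300 },
  { 'JavaScript' => 4500 },
  { 'Ruby' => 800, 'CSS' => 150 }
]

puts total_bytes_by_language(sample_responses).to_s
```

In the real app, the accumulation would run inside the `repos.each` block, once per repository.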

After that, we'll format the contents into a structure that D3 understands:

language_byte_count = []
language_obj.each do |lang, count|
  language_byte_count.push :name => "#{lang} (#{count})", :count => count
end

# some mandatory formatting for D3
language_bytes = [ :name => "language_bytes", :elements => language_byte_count ]
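
For reference, this sketch shows the JSON that the formatting step produces for a small, made-up set of byte counts. The helper name `treemap_data` is hypothetical:

```ruby
require 'json'

# Wraps { language => bytes } totals in the single-root structure used
# by the treemap code: one parent node whose :elements are the languages.
def treemap_data(language_obj)
  language_byte_count = language_obj.map do |lang, count|
    { :name => "#{lang} (#{count})", :count => count }
  end
  [{ :name => 'language_bytes', :elements => language_byte_count }]
end

puts treemap_data({ 'Ruby' => 2000, 'CSS' => 150 }).to_json
# prints [{"name":"language_bytes","elements":[{"name":"Ruby (2000)","count":2000},{"name":"CSS (150)","count":150}]}]
```

The outer array holds a single root node because the treemap layout walks a hierarchy: the root's `elements` become the rectangles, sized by `count`.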

(For more information on D3 tree map magic, check out this simple tutorial.)

To wrap up, we pass this JSON information over to the same ERB template:

erb :lang_freq, :locals => { :languages => languages.to_json, :language_byte_count => language_bytes.to_json }

Like before, here's a bunch of JavaScript that you can drop directly into your template:

<div id="byte_freq"></div>
<script>
  var language_bytes = <%= language_byte_count %>;
  var childrenFunction = function(d) { return d.elements; };
  var sizeFunction = function(d) { return d.count; };
  var colorFunction = function(d) { return Math.floor(Math.random() * 20); };
  var nameFunction = function(d) { return d.name; };

  var color = d3.scale.linear()
              .domain([0, 10, 15, 20])
              .range(["grey", "green", "yellow", "red"]);

  drawTreemap(5000, 2000, '#byte_freq', language_bytes, childrenFunction, nameFunction, sizeFunction, colorFunction, color);

  function drawTreemap(height, width, elementSelector, language_bytes, childrenFunction, nameFunction, sizeFunction, colorFunction, colorScale) {
    var treemap = d3.layout.treemap()
        .children(childrenFunction)
        .size([width, height])
        .value(sizeFunction);

    var div = d3.select(elementSelector)
        .append("div")
        .style("position", "relative")
        .style("width", width + "px")
        .style("height", height + "px");

    div.data(language_bytes).selectAll("div")
        .data(function(d) { return treemap.nodes(d); })
        .enter()
        .append("div")
        .attr("class", "cell")
        .style("background", function(d) { return colorScale(colorFunction(d)); })
        .call(cell)
        .text(nameFunction);
  }

  function cell() {
    this
        .style("left", function(d) { return d.x + "px"; })
        .style("top", function(d) { return d.y + "px"; })
        .style("width", function(d) { return d.dx - 1 + "px"; })
        .style("height", function(d) { return d.dy - 1 + "px"; });
  }
</script>

Et voila! Beautiful rectangles containing your repo languages, with relative proportions that are easy to see at a glance. You might need to tweak the height and width of your treemap, passed as the first two arguments to drawTreemap above, to get all the information to show up properly.
