Hi and welcome to yet another programming blog. This blog is going to be my repository for technical how-tos, kept mainly for my own record. If it helps someone else along the way, great.
This post discusses creating a dashboard for the U.S. economy using OpenShift and Node.js. Business executives have dashboards on the financial performance of their companies. Why not create a simple dashboard for the U.S. public on the overall financial health of the country?
I figure the dashboard could display four key economic metrics for the country. The choice of metrics is a matter of debate, but I started with the overall unemployment rate from the Bureau of Labor Statistics. I’m still thinking about which other indicators to display. If anyone has ideas on that, let me know.
In terms of technology, I wanted to demo Red Hat’s OpenShift platform-as-a-service (PaaS). I’m a big believer in the PaaS concept and hope it’s how most of us will deploy applications in the coming years. I love the idea of spinning up an entire application stack and deployment environment in one command (which I’ll show below).
Also, Node.js was on my target technology list, but it was more out of curiosity than anything else. I doubt Node.js is the best fit for this kind of application right now, but I was pleasantly surprised by the number and capabilities of Node.js libraries out there already, especially Express, Zombie, and the Jade template engine.
Design
The application defines two kinds of HTTP routes. One serves the dashboard itself. The other is a set of data services that supply the JSON that feeds the dashboard. Not all sources of economic data expose it in a convenient format or API, so creating services for those sources is a potentially useful byproduct of this project. If I ever need to create a different kind of client (mobile, for example), I can reuse those services.
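To make the split between the two kinds of routes concrete, here is a minimal sketch of a route table in plain JavaScript. The "/" path for the dashboard and the names routes and findRoute are my own assumptions for illustration; only /unemployment comes from this post, and the real application lets Express do the matching.

```javascript
// Hypothetical route table: one page route plus the data-service routes.
// "GET /" for the dashboard is an assumed path, not one defined in the post.
var routes = {
  "GET /": { kind: "dashboard" },
  "GET /unemployment": { kind: "data" }
};

// Look up a route by HTTP method and path; returns null when nothing matches.
function findRoute(method, path) {
  var key = method.toUpperCase() + " " + path;
  return Object.prototype.hasOwnProperty.call(routes, key) ? routes[key] : null;
}

console.log(findRoute("get", "/unemployment")); // a "data" route
console.log(findRoute("post", "/unemployment")); // null, no such route
```

Keeping the data routes separate from the page route is what would let a second client (mobile, say) reuse them unchanged.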
Creating a New OpenShift Application
Creating a new application on OpenShift is easy. If you haven’t already, sign up for an account on OpenShift and get your SSH keys set up. OpenShift offers various platform types (JBoss AS 7, PHP, Ruby, Python, Perl, Node) and includes a way to build your own cartridges to support other platforms. To create this application on a Node.js stack, either use the web management console or install the OpenShift command line tools and type:
rhc app create -a econstats -t nodejs-0.6 -l [OpenShift username]
OpenShift will create and configure a ready-to-go application stack, named econstats, for Node.js 0.6. It will also create a remote Git repository and local clone for your application code. Jump into the just-created local econstats folder and edit server.js as follows…
Create the JSON Data Service
The first step is to create the JSON data service that feeds the dashboard. I use a headless browser library called Zombie.js to scrape the data from the BLS site:
var express = require("express");
var zombie = require("zombie");

var retrieveUnemployment = function(callback) {
  // Screen scrape BLS web page for latest unemployment information
  zombie.visit("http://data.bls.gov/timeseries/LNS14000000", function(err, browser, status) {
    var unemploymentData = [];

    // Grab the row headers of the unemployment table (one per year)
    var ths = browser.querySelectorAll("table.regular-data tbody th");
    for (var i = 0; i < ths.length; i++) {
      var unemploymentEntry = {};

      // Grab each row header and use it to set the year
      var th = ths.item(i);
      var year = th.innerHTML.trim();

      // Grab each cell in the row and use it to set the month and
      // unemployment rate (only the first 12 cells are months)
      var tds = th.parentNode.getElementsByTagName("td");
      for (var j = 0; j < tds.length && j < 12; j++) {
        var monthData = tds.item(j).innerHTML.trim();
        // Skip months that have no published value yet
        if (monthData && monthData !== "&nbsp;") {
          unemploymentEntry = {
            month : j + 1,
            year : parseFloat(year),
            rate : parseFloat(monthData)
          };
          unemploymentData.push(unemploymentEntry);
        }
      }
    }
    console.log("Retrieved unemployment data from BLS.");
    callback(unemploymentData);
  });
};

var app = express.createServer();

// Route: GET /unemployment -> Unemployment JSON data
app.get("/unemployment", function(req, res) {
  retrieveUnemployment(function(unemploymentData) {
    res.json(unemploymentData);
  });
});

// Get the environment variables we need.
var ipaddr = process.env.OPENSHIFT_INTERNAL_IP || "127.0.0.1";
var port = process.env.OPENSHIFT_INTERNAL_PORT || "3000";

// And start the app on that interface (and port).
app.listen(port, ipaddr, function() {
  console.log('%s: Node server started on %s:%d ...', Date(Date.now()), ipaddr, port);
});
The call to zombie.visit loads the page and passes a browser object to the callback. browser.querySelectorAll retrieves the table's row header cells, each of which holds a year. th.parentNode.getElementsByTagName then retrieves the data cells in that row, and each non-empty cell is pushed as an entry onto an array called unemploymentData.
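The row-to-entry conversion is easier to see in isolation from the DOM. The sketch below assumes the scraped rows have already been reduced to plain objects of the form { year, cells }; that shape, the name toEntries, and the sample values are made up for illustration and are not real BLS figures.

```javascript
// Convert scraped rows into { month, year, rate } entries.
// Each row is assumed to look like { year: "2012", cells: ["8.3", "&nbsp;"] }.
function toEntries(rows) {
  var entries = [];
  rows.forEach(function (row) {
    var year = parseInt(row.year.trim(), 10);
    // Only the first 12 cells are months; anything after them is ignored.
    row.cells.slice(0, 12).forEach(function (cell, index) {
      var text = cell.trim();
      // Skip empty cells and non-breaking-space placeholders.
      if (text && text !== "&nbsp;") {
        entries.push({ month: index + 1, year: year, rate: parseFloat(text) });
      }
    });
  });
  return entries;
}

// Invented sample row: two filled months, one placeholder, one empty cell.
var sample = [{ year: " 2012 ", cells: ["8.3", " 8.2 ", "&nbsp;", ""] }];
console.log(toEntries(sample));
```

Pulling the parsing into a pure function like this would also make it testable without hitting the BLS site.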
Routing is handled by Express.js. Express is a lightweight web application library for Node inspired by the excellent Sinatra library for Ruby. It is a non-MVC way to publish simple web applications and APIs. Define a route (URL pattern and HTTP method) and a response, create and start the server, and you’re up and running.
I simply define an Express server and use app.get to create a URL route for the unemployment data (/unemployment). Inside the route callback, res.json sets the content type and writes the serialized data to the response. Then I bind the server to the IP address and port with a call to app.listen.
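For anyone curious what sending a JSON response boils down to, here is a rough sketch against the plain Node response interface (setHeader and end). This is my own simplification, not Express's actual implementation, and sendJson and fakeRes are hypothetical names used only for this example.

```javascript
// Roughly what a JSON response involves: set the content type,
// then write the serialized body and finish the response.
function sendJson(res, data) {
  res.setHeader("Content-Type", "application/json");
  res.end(JSON.stringify(data));
}

// A stand-in response object so the sketch runs without a server.
var fakeRes = {
  headers: {},
  body: null,
  setHeader: function (name, value) { this.headers[name] = value; },
  end: function (body) { this.body = body; }
};

// Invented sample entry, not a real BLS figure.
sendJson(fakeRes, [{ month: 1, year: 2012, rate: 8.3 }]);
console.log(fakeRes.headers["Content-Type"]); // application/json
console.log(fakeRes.body);
```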
A couple of OpenShift environment caveats: the server must bind to the IP address and port supplied by OpenShift environment variables to work properly in that environment. Also, to make this all work, Zombie.js needs to be installed in the OpenShift environment (Express is already installed). To add Zombie to OpenShift, edit the dependency list file in the project folder, deplist.txt, by adding the following line:
zombie@0.12.13
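Back on the first caveat: the environment-variable lookup can be pulled into a small helper so the same code runs locally and on OpenShift. This is only a sketch; listenConfig is a hypothetical name, and parsing the port into a number is my own choice (the server code above keeps it as a string).

```javascript
// Read the bind address and port from an environment object,
// falling back to local defaults when the OpenShift variables are absent.
function listenConfig(env) {
  return {
    ipaddr: env.OPENSHIFT_INTERNAL_IP || "127.0.0.1",
    port: parseInt(env.OPENSHIFT_INTERNAL_PORT || "3000", 10)
  };
}

// Locally this prints the defaults unless the variables happen to be set.
console.log(listenConfig(process.env));
```

Passing the environment in as an argument, rather than reading process.env inside the function, makes the fallback behavior easy to check with a plain object.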
Commit and push the changes to OpenShift to run it in the cloud:
git commit -a -m "Adding route to provide unemployment data scraped from BLS site"
git push origin master
Visit http://econstats-[your OpenShift domain name].rhcloud.com/unemployment to view the raw JSON unemployment data that will serve as the foundation for the dashboard chart.
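As a taste of how a client might use that JSON, here is a hedged sketch that picks the most recent entry from the array, for example to show a headline figure above the chart. The dashboard client isn't written yet, so latestEntry and the sample values are purely illustrative; only the { month, year, rate } shape comes from the service.

```javascript
// Return the entry with the latest (year, month), or null for an empty array.
function latestEntry(entries) {
  return entries.reduce(function (latest, entry) {
    if (latest === null) {
      return entry;
    }
    var newer = entry.year > latest.year ||
      (entry.year === latest.year && entry.month > latest.month);
    return newer ? entry : latest;
  }, null);
}

// Invented sample data in the same shape the service returns.
var entries = [
  { month: 12, year: 2011, rate: 8.5 },
  { month: 2, year: 2012, rate: 8.3 },
  { month: 1, year: 2012, rate: 8.3 }
];
console.log(latestEntry(entries)); // the February 2012 entry
```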