Thatcham Trains

This is the final article in my brief series on the National Rail API. As usual, the code can be found on github:

https://github.com/DanteLore/national-rail

The Idea

There are a million and one different websites and apps which will tell you the next direct train from London Paddington to Thatcham (or between any other two railway stations) but all those apps are very general. You have to struggle through the crowds on the Circle Line while selecting the stations from drop-downs and clicking “Submit”, for example. Wouldn’t it be good if there was a simple way to see the information you need without any user input? Even better, what if you could get notifications when the direct trains are delayed or cancelled?

Enter stage left, the Twitter API. This article is all about a simple mash-up of the National Rail and twitter APIs to show information on direct trains between London and Thatcham. You can use it for other stations too – it’s all in the command line parameters.

People who live in Thatcham can use my twitter feed @ThatchamTrains or you can set up your own feed and run the python script to populate it with the stations you’re interested in.

The script also sends direct messages if the trains are more than 15 minutes late or cancelled.

Using the script

I host my instance of the script on my raspberry pi, which is small, cheap, quiet and can be left on 24×7 without much hassle. These instructions are therefore specific to setup on the pi, but the script will work on Windows and other version of Linux too.

1. Install the python libraries you need. You may already have these installed.

$ sudo easy_install argparse
$ sudo easy_install requests
$ sudo easy_install xmltodict
$ sudo easy_install flask

2. Get a twitter account and a set of API keys by following the steps on the Twitter developers page. You’ll need four magic strings in total, which you pass to the script as command line parameters.

3. Get a national rail API key from their website. You just need one key for this API, which is nice!

4. Clone the source and run the script using the three commands below… simples!

$ git clone https://github.com/DanteLore/national-rail.git
$ cd national-rail
$ python twitterrail.py --rail-key YOUR_NATIONAL_RAIL_KEY --consumer-key YOUR_CUST_KEY --consumer-secret YOUR_CUST_SECRET --access-token YOUR_ACCESS_TOKEN --access-token-secret YOUR_ACCESS_TOKEN_SECRET --users YourTwitterName --forever

When run with the –forever option, the script will query the NR API and post to twitter every 5 minutes. Note that there are some basic checks to prevent annoying behaviour and duplicate messages. You can specify one or more usernames who you’d like to receive direct messages when there are delays and cancellations; note that only users who follow you can receive DMs on twitter.

You can use other stations by specifying the three character station codes (CRS) for “home” and “work” on the command line. Here are the command line options:

$ python twitterrail.py --help

usage: twitterrail.py [-h] [--home HOME] [--work WORK] [--users USERS]
                      [--forever] --rail-key RAIL_KEY --consumer-key
                      CONSUMER_KEY --consumer-secret CONSUMER_SECRET
                      --access-token ACCESS_TOKEN --access-token-secret
                      ACCESS_TOKEN_SECRET

Tweeting about railways

optional arguments:
  -h, --help            show this help message and exit
  --home HOME           Home station CRS (default "THA")
  --work WORK           Work station CRS (default "PAD")
  --users USERS         Users to DM (comma separated)
  --forever             Use this switch to run the script forever (once ever 5 mins)
  --rail-key RAIL_KEY   API Key for National Rail
  --consumer-key CONSUMER_KEY
                        Consumer Key for Twitter
  --consumer-secret CONSUMER_SECRET
                        Consumer Secret for Twitter
  --access-token ACCESS_TOKEN
                        Access Token for Twitter
  --access-token-secret ACCESS_TOKEN_SECRET
                        Access Token Secret for Twitter

The Code

There’s not much to say about the code, since I’ve covered the National Rail API in graphic detail in a previous article. The only real difference between this script and my previous adventures with the API is that this time I did unit testing.

There’s a fair bit of business logic in the twitter app: rules about when to post and when to be quiet, duplicate message detection and all sorts of time- and data-based rules which can’t be tested using real data. It’s also pretty bad form to test code like this against a live API, so I mocked out the NR query and the Twitter API and wrote a small suite of tests to check that the behaviour is right.

Like I said, all the code is on GitHub, so I won’t bang on about it here.

https://github.com/DanteLore/national-rail

Live Train Route Animation

The code for this article is available on my github, here: https://github.com/DanteLore/national-rail

Building on the Live Departures Board project from the other day, I decided to try out mapping some departure data. The other article shows pretty much all the back-end code, which wasn’t changed much.

route-planner

The AngularJS app takes the routes of imminent departures from various stations and displays them on a Leaflet map as polylines. I used this great Snake library to animate the lines as they appear. Map tiles come from CartoDB, which is free, unlike Mapbox.

route-planner2

Here’s the code-behind for the Angular app:

var mapApp = angular.module('mapApp', ['ngRoute']);

mapApp
    .config(function($routeProvider){
	    $routeProvider
		    .when('/',
		    {
		    	controller: 'MapController',
			    templateUrl: 'map.html'
		    })
		    .otherwise({redirectTo: '/'});
	})
	.controller('MapController', function($scope, $http, $timeout, $routeParams) {

        var mymap = L.map('mapid').fitBounds([ [51.3933180851, -1.24174419711], [51.5154681995, -0.174688620494] ]);
        L.tileLayer('http://{s}.basemaps.cartocdn.com/dark_all/{z}/{x}/{y}.png', {
            attribution: '© <a href="http://www.openstreetmap.org/copyright">OpenStreetMap</a> © <a href="http://cartodb.com/attributions">CartoDB</a>',
            subdomains: 'abcd',
            maxZoom: 19
        }).addTo(mymap);

        $scope.routeLayer = L.featureGroup().addTo(mymap);
        $scope.categoryScale = d3.scale.category10();

        $scope.doStation = function(data) {
            data.forEach(function(route){
                var color = $scope.categoryScale(route[0].crs)
                var path = []

                route.filter(function(x) {return x.latitude && x.longitude}).forEach(function(station) {
                    var location = [station.latitude, station.longitude];
                    path.push(location);
                });

                var line = L.polyline(path, {
                    weight: 4,
                    color: color,
                    opacity: 0.5
                }).addTo($scope.routeLayer);

                line.snakeIn();
            });
        };

        $scope.refresh = function() {
            $scope.routeLayer.clearLayers();

            $scope.crsList.forEach(function(crs) {
                $http.get("/routes/" + crs).success($scope.doStation);
            });

            $timeout(function(){
                $scope.refresh();
            }, 10000)
        };

        $http.get("/loaded-crs").success(function(crsData) {
            $scope.crsList = crsData;

            $scope.refresh();
        })
    });

Quick TeamCity Build Status with AngularJS

So, this isn’t supposed to be the ultimate guide to AngularJS or anything like that – I’m not even using the latest version – this is just some notes on my return to The World of the View Model after a couple of years away from WPF. Yeah, that’s right, I just said WPF while talking about Javascript development. They may be different technologies from different eras: one may be the last hurrah of bloated fat-client development and the other may be the latest and greatest addition to the achingly-cool, tie dyed hemp tool belt of the Single Page App hipster, but under the hood they’re very very similar. Put that in your e-pipe and vape it, designer-bearded UX developers!

BuildStatus

Anyway, when I started, I knew nothing about SPA development. I’d last done JavaScript several years ago and never really used it as a real language. I still contend that JavaScript isn’t a real language (give me Scala or C# any day of the week) but you can’t ignore the fact that this is how user interfaces are developed these days… so, yeah, I started with a tutorial on YouTube.

I decided to do an Information Radiator to show build status from TeamCity on the web. Information Radiators are my passion – at least they’re one of the few passions I’m allowed to pursue at work – and we use Team City for all our continuous integration, release builds, automated tests and so on. Our old radiators are coded in WPF, which looks awesome on the big TVs dotted around the office, but doesn’t translate well for remote workers.

There is no sunshine and there are no rainbows in this article. I found javascript to be a hateful language, filled with boilerplate and confusion. Likewise, though TeamCity is doubtless the best enterprise CI platform on planet earth, the REST APIs are pretty painful to consume. With that in mind, let’s get into the weeds and see how this thing works…

Enable cross-site scripting (CORS) on your Team City server

You can’t hit a server from a web page unless that server is the server that served the web page you’re hitting the server with… unless of course you tell the server you want to hit that the web page you want to hit it with, served from a different server, is allowed to hit it. Got that? Thought so. This is all because of a really logical thing called “Cross Origin Resource Sharing”, which you can enable pretty easily in TeamCity as long as you have admin permissions.

Check out Administration -> Server Administration -> Diagnostics -> Internal Properties. From there you should be able to edit, or at least get the location of the internal.properties file. Weirdly, if the file doesn’t exist, there is no option to edit, so you have to go and create the file. Since my TeamCity server is running on a Windows box, I created the new file here:

C:\ProgramData\JetBrains\TeamCity\config\internal.properties

and added the following:

rest.cors.origins=*

You might want to be a little more selective on who you allow to access the server this way – I guess it depends on how secure your network is, how many clients access the dashboard and so on.

Tool Chain

This article is about AngularJS and it’s about TeamCity. It’s not about NPM or Bower or any of that nonsense. I’m not going to minify my code or use to crazy new-fangled pseudo-cosmic CSS. So setting up the build environment for me was pretty easy: create a folder, add a file called “index.html”, fire up the fantastic Fenix Web Server and configure it to serve up the folder we just created. Awesome.

If you’re already confused, or if you just want to play with the code, you can download the lot from GitHib: https://github.com/DanteLore/teamcity-status-with-angular

I promise to do my best

Hopefully you’ve watched the video I linked above, so you know the basics of an AngularJS app. If not, do so now. Then maybe Google around the subject of promises and http requests in AngularJS. Done that? OK, good.

Web requests take a while to run. In a normal app you might fetch them on another thread but not in JavaScript. JavaScript is all about callbacks. A Promise is basically a callback that promises to get called some time in the future. They are actually pretty cool, and they form the spinal column of the build status app. This is because the TeamCity API is so annoying. Let me explain why. In order to find out the status (OK or broken) and state (running, finished) of each build configuration you need to make roughly six trillion HTTP requests as follows:

  1. Fetch a list of the build configurations in the system. These are called “Build Types” in the API and have properties like “name”, “project” and “id”
  2. For each Build Type, make a REST request to get information on the latest running Build with a matching type ID. This will give you the “name”, “id” and “status” of the last finished build for the given Build Type.
  3. Fetch a list of the currently running builds.
  4. Use the list of finished builds and the list of running builds to create a set of status tiles (more on this later)
  5. Add the tiles to the angular $scope and let the UI render them

Here’s how that looks in code. Hopefully not too much more complicated than above!

buildFactory.getBuilds()
	.then(function(responses) {
		$scope.buildResponses = responses
			.filter(function(r) { return (r.status == 200 && r.data.build.length > 0)})
			.map(function(r){ return r.data.build[0] })
	})
	.then(buildFactory.getRunningBuilds)
	.then(function(data) {
		$scope.runningBuilds = data.data.build.map(function(row) { return row.buildTypeId })
	})
	.then(function() {
		$scope.builds = $scope.buildResponses.map(function(b) { return buildFactory.decodeBuild(b, $scope.runningBuilds); });
	})
	.then(function() {
		$scope.tiles = buildFactory.generateTiles($scope.builds)
	})
	.then(function() {
		$scope.statusVisible = false;
	});

Most of the REST access has been squirrelled away into a factory. And yes, our build server is called “tc” and guest access is allowed to the REST APIs and I have enabled CORS too… because sometimes productivity is more important than security!

angular.module('buildApp').factory('buildFactory', function($http) {
	var factory = {};
	  
	var getBuildTypes = function() {
		return $http.get('http://tc/guestAuth/app/rest/buildTypes?locator=start:0,count:100');
	};
	
	var getBuildStatus = function(id) {
		return $http.get('http://tc/guestAuth/app/rest/builds?locator=buildType:' + id + ',start:0,count:1&fields=build(id,status,state,buildType(name,id,projectName))');
	};
	
	factory.getRunningBuilds = function() {
		return $http.get('http://tc/guestAuth/app/rest/builds?locator=running:true');
	};

// etc

Grouping and Tiles

We have over 100 builds. Good teams have lots of builds. Not too many, just lots. Every product (basically every team) has CI builds, release/packaging builds, continuous deployment builds, continuous test builds, metrics builds… we have a lot of builds. Builds are good.

But a screen with 100+ builds on it means very little. This is an information radiator, not a formal report. So, I use a simple (but messy) algorithm to convert a big list of Builds into a smaller list of Tiles:

  1. Take the broken builds (hopefully not many) and turn each one into a Tile
  2. Take the successful builds and group them by “project” (basically the category, which is basically the team or product name)
  3. Turn each group of successful builds into a Tile, using the “project” as the tile name
  4. Mark any “running” build with a flag so we can give feedback in the UI

BuildStatus2

Displaying It

Not much very exciting here. I used Bootstrap, well, a derivative of Bootstrap to make the UI look nice. I bound some content to the View Model and that’s about it. Download the code and have a look if you like.

Here’s my index.html (which shows all the libraries I used):

<html ng-app="buildApp">
<head>
  <title>Build Status</title>
  
  <link href="https://bootswatch.com/cyborg/bootstrap.min.css" rel="stylesheet">
  <!--link href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/css/bootstrap.min.css" rel="stylesheet"-->
</head>

<body>
  <div ng-view>
  </div>

  <script src="https://ajax.googleapis.com/ajax/libs/angularjs/1.5.5/angular.min.js"></script>
  <script src="https://ajax.googleapis.com/ajax/libs/angularjs/1.5.5/angular-route.js"></script>
  <script src="https://code.jquery.com/jquery-2.2.3.min.js"></script>
  <script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/js/bootstrap.min.js"></script>
  <script src="utils.js"></script>
  <script src="app.js"></script>
  <script src="build-factory.js"></script>
</body>
</html>

Here’s the “view” HTML for the list (in “templates/list.html”). I love the Angular way of specifying Views and Controllers by the way. Note the cool animated CSS for the “in progress” icon.

<div>
  <style>
	.glyphicon-refresh-animate {
		-animation: spin 1s infinite linear;
		-webkit-animation: spin2 1s infinite linear;
	}

	@-webkit-keyframes spin2 {
		from { -webkit-transform: rotate(0deg);}
		to { -webkit-transform: rotate(360deg);}
	}

	@keyframes spin {
		from { transform: scale(1) rotate(0deg);}
		to { transform: scale(1) rotate(360deg);}
	}
  </style>
  
	<div class="page-header">
		<h1>Build Status <small>from TeamCity</small></h1>
	</div>
	  
    <div class="container-fluid">
		<div class="row">
    		<div class="col-md-3" ng-repeat="tile in tiles | orderBy:'status' | filter:nameFilter">
        		<div ng-class="getPanelClass(tile)">
               <h5><span ng-class="getGlyphClass(tile)" aria-hidden="true"></span>   {{ tile.name | limitTo:32 }}{{tile.name.length > 32 ? '...' : ''}}   {{ tile.buildCount > 0 ? '(' + tile.buildCount + ')' : ''}} </h5>
               <p class="panel-body">{{ tile.project }}</p>
              </div>
        	</div>
    	</div>
    </div>
	<br/><br/><br/><br/><br/><br/>
  
  <nav class="navbar navbar-default navbar-fixed-bottom">
  <div class="container-fluid">
    <p class="navbar-text navbar-left">
		<input type="text" ng-model="nameFilter"/>  <span class="glyphicon glyphicon-filter" aria-hidden="true"></span>  
		<span class="glyphicon glyphicon-refresh glyphicon-refresh-animate" ng-hide="!statusVisible"></span>
	</p>
  </div>
</nav>
</div>

That’s about it!

I think I summarized how I feel about this project in the introduction. It looks cool and the MVC MVVM ViewModel vibe is a good one. The data binding is simple and works very well. All my gripes are with JavaScript as a language really. I want Linq-style methods and I want classes and objects with sensible scope. I want less syntactic nonsense, maybe the odd => every now and again. I think some or all of that is possible with libraries and new language specs… but I want it without any effort!

One thing I will say: that whole page is less than 300 lines of code. That’s pretty darned cool.

Feel free to download and use the app however you like – just bung in a link to this page!

BuildStatus

Shape Files and SQL Server

Over the last couple of weeks I have been doing a lot of work importing polygons into an SQL server database, using them for some data processing tasks and then exporting the results as KML for display. I thought it’d be worth a post to record how I did it.

Inserting polygons (or any other geometry type) from a shape file to the database can be done with the ogr2ogr tool which ships with the gdal libraries (and with Mapserver for Windows). I knocked up a little batch file to do it:

SET InputShapeFile="D:\Dropbox\Data\SingleView\Brazillian Polygons\BRA_adm3.shp"

SET SqlConnectionString="MSSQL:Server=tcp:yourserver.database.windows.net;Database=danTest;Uid=usernname@yourserver.database.windows.net;Pwd=yourpassword;"

SET TEMPFILE="D:\Dropbox\Data\Temp.shp"
SET OGR2OGR="C:\ms4w\tools\gdal-ogr\ogr2ogr.exe"
SET TABLENAME="TestPolygons"

%OGR2OGR% -overwrite -simplify 0.01 %TEMPFILE% %InputShapeFile% -progress

%OGR2OGR% -lco "SHPT=POLYGON" -f "MSSQLSpatial" %SqlConnectionString% %TEMPFILE% -nln %TABLENAME% -progress

The first ogr2ogr call is used to simplify the polygons. The value 0.01 is the minimum length of an edge (in degrees in this case) to be stored. Results of this command are pushed to a temporary shape file set. The second call to ogr2ogr pushes the polygons from the temp file up to a database in Windows Azure. The same code would work for a local SQL Server, you just need to tweak the connection string.

You can use SQL Server Management Studio to show the spatial results of your query, which is nice! Here I just did a “select * from testPolygons” to see the first 5000 polygons from my file.

PolygonsInSqlServer

Sql Server contains all sorts of interesting data processing options, which I’ll look at another time. Here I’ll just skip to the final step – exporting the polygon data from the database to a local KML file.

polygonsInKml

SET KmlFile="D:\Dropbox\Data\Brazil.kml"

SET SqlConnectionString="MSSQL:Server=tcp:yourserver.database.windows.net;Database=danTest;Uid=usernname@yourserver.database.windows.net;Pwd=yourpassword;"

SET TEMPFILE="D:\Dropbox\Data\Temp.shp"
SET OGR2OGR="C:\ms4w\tools\gdal-ogr\ogr2ogr.exe"
SET SQL="select * from TestPolygons"

%OGR2OGR% -lco "SHPT=POLYGON" -f "KML" %KmlFile% -sql %SQL% %SqlConnectionString%  -progress

Obviously you can make the SQL in that command as complex as you like.

Polygons here are from this site which allows you to download various polygon datasets for various countries.

Serial on Raspberry Pi Arch Linux

So the new version of Arch Linux doesn’t have runlevels, rc.d or any of that nonsense any more. It just has systemd. Super simple if you know how to use it, but a right pain in the backside if you don’t.

I have a little serial GPS module hooked up to my Raspberry Pi via the hardware serial port (ttyAMA0). My old instructions for getting this to work aren’t much use any more. Here’s the new procedure for getting serial data with the minimum of fuss:

1. Disable serial output during boot

Edit /boot/cmdline.txt using your favourite editor. I like nano these days.

sudo nano /boot/cmdline.txt

Remove all chunks of text that mention ttyAMA0 but leave the rest of the line intact. Bits to remove look like:

console=ttyAMA0,115200 kgdboc=ttyAMA0,115200

2. Disable the console on the serial port

This was the new bit for me. The process used to involve commenting out a line in /etc/innitab but that file is long gone.

Systemd uses links in /etc to decide what to start up, so once you find the right one, removing it is easy. You can find the files associated with consoles by doing:

ls /etc/systemd/system/getty.target.wants/

One of the entries clearly refers to ttyAMA0. It can be removed using the following command:

sudo systemd disable serial-getty@ttyAMA0.service

3. Check you’re getting data

I used minicom for this as it’s very simple to use. First of all, make sure you plug in your device (with the power off, if you’re as clumsy as me!).

sudo pacman -S minicom
minicom -b 4800 -o -D /dev/ttyAMA0

You should see a lovely stream of data. I my case it was a screen full of NMEA sentences. Great stuff!

WordPress: Oh deary deary me!

This evening I was innocently setting up a wireless dongle for my darling wife. I casually typed in the address of this very web page into her browser to check it was working, only to find that all the posts were missing!

404 errors on every page but the front page. Poo! I desperately dived in to the wordpress settings, everything was set up fine. I updated wordpress but it made no difference. In the end, I went back to the post URL settings and clicked “Apply” again. It fixed the problem!

Looking at the stats, Logical Genetics seems to have been off the air since Independence Day. Almost a month. Miserable.

Back now though, and soon to be posting an article on my Build Status Traffic Lights.

Using a BufferBlock to Read and process in Parallel

Wrote an app this week – top secret of course – to load data from a database and process the contents. The reading from the database is the slow part and the processing takes slightly less time. I decided it might help if I could read a batch of results into memory and process it while loading the next batch.

Batching was dead easy, I found an excellent extension method on the internet that batches up an enumerable and yields you a sequence of arrays. The code looks like this, in case you can’t be bothered to click the link:

public static IEnumerable<T[]> Batch<T>(this IEnumerable<T> sequence, int batchSize)
{
    var batch = new List<T>(batchSize);

    foreach (var item in sequence)
    {
        batch.Add(item);

        if (batch.Count >= batchSize)
        {
            yield return batch.ToArray();
            batch.Clear();
        }   
    }  

    if (batch.Count > 0)
    {
        yield return batch.ToArray();
        batch.Clear();
    }  
}

That works really well, but it doesn’t give me the parallel read and process I’m looking for. After a large amount of research, some help from an esteemed colleague and quite a bit of inappropriate language, I ended up with the following. It uses the BufferBlock class which is a new thing from Microsoft’s new Dataflow Pipeline libraries (which provide all sorts of very useful stuff which I may well write an article on at a later date). The BufferBlock marshals data over thread boundaries in a very clean and simple way.

public static IEnumerable<T[]> BatchAsync<T>(this IEnumerable<T> sequence, int batchSize)
{
    BufferBlock<T[]> buffer = new BufferBlock<T[]>();

    var reader = new Thread(() =>
        {
            foreach (var batch in sequence.Batch(batchSize))
            {
                buffer.Post(batch);
            }
            buffer.Post(null);
            buffer.Complete();
        }) { Name = "Batch Reader Async" };
    reader.Start();

    T[] blocktoProcess;
    while ((blocktoProcess = buffer.Receive()) != null)
    {
        yield return blocktoProcess;
    }
}

The database read is done on a new thread and data is pulled back to the calling thread in batches. This makes for nice clean code on the consumer side!