06 Jan 2015, 00:20

Go: too many open files

Recently, while creating a basic HTTP/HTTPS monitoring app, Pingo2, I started seeing a "too many open files" error. The error was thrown after the app had been running for some time, whenever I attempted to open a new network connection.

Of course, in Unix/Linux network sockets are just files, so this error message actually makes sense in that context. The first thing to do was run lsof to see exactly which files the process had open:

lsof | grep 19991
...
myapp    19991 20009 monitoring   38u     IPv4            4685252       0t0        TCP dev.example.com:44449->foobar:https (ESTABLISHED)
myapp    19991 20009 monitoring   39u     IPv4            4685250       0t0        TCP dev.example.com:45459->xxx.xxx.189.184:https (ESTABLISHED)
myapp    19991 20009 monitoring   40u     IPv4            4685251       0t0        TCP dev.example.com:45460->xxx.xxx.189.184:https (ESTABLISHED)
myapp    19991 20009 monitoring   41u     IPv4            4685253       0t0        TCP dev.example.com:44450->foobar:https (ESTABLISHED)
myapp    19991 20009 monitoring   42u     IPv4            4685268       0t0        TCP dev.example.com:44454->foobar:https (ESTABLISHED)
myapp    19991 20009 monitoring   43u     IPv4            4685266       0t0        TCP dev.example.com:45464->xxx.xxx.189.184:https (ESTABLISH
...

So, lots of ESTABLISHED network connections, each one corresponding to an HTTP connection my program had initiated. I was explicitly closing the HTTP response body after each request, and thought that was sufficient for the connection to close down by itself. However, it turns out that the default HTTP transport has HTTP keep-alives enabled (not to be confused with TCP keep-alives): after a response body is closed, the underlying connection is held open for possible reuse. The connections were piling up in the background as a result.

Creating a custom http.Transport for the HTTP client with DisableKeepAlives: true fixed the issue.

26 Mar 2014, 06:33

AngularJS + Martini: html5mode

By default AngularJS displays URL paths prefixed with a # symbol. This enables backwards compatibility with browsers that don’t support the HTML5 history API. The AngularJS guide explains this in detail.

Removing the # symbol and displaying more normal-looking URLs requires the use of html5Mode in AngularJS. This is enabled via $locationProvider, for example:

app.config(['$routeProvider', '$locationProvider',
    function($routeProvider, $locationProvider) {

        $locationProvider.html5Mode(true).hashPrefix('!');

        $routeProvider.
            when('/signin', {
                templateUrl: 'components/signin.html',
                controller: 'SigninCtrl'
            }).
            when('/', {
                templateUrl: 'components/home.html',
                controller: 'HomeCtrl'
            }).
            otherwise({
                redirectTo: '/'
            });
    }
]);

If the client starts browsing at / then this will work fine. The web server sends index.html to the client and AngularJS can load up, ready to handle the other routes itself. A problem occurs if the client browses directly to a path that the web server does not know about, for example /signin. When the web server receives that request it will return an HTTP 404 error. What we need to do instead is tell our server to serve the contents of index.html. Note that we are not performing an HTTP redirect here.

I’m using Go to serve my static content, and using the Martini web framework to make life a bit easier.

To have Martini do the necessary rewrite, perform the following setup:

        // Needs the net/http and path packages, plus the Martini package.
        router := martini.NewRouter()
        router.NotFound(func(w http.ResponseWriter, r *http.Request) {
                // Only rewrite paths *not* containing filenames
                if path.Ext(r.URL.Path) == "" {
                        http.ServeFile(w, r, "public/index.html")
                } else {
                        w.WriteHeader(http.StatusNotFound)
                        w.Write([]byte("404 page not found"))
                }
        })
        m := martini.New()
        m.Use(martini.Logger())
        m.Use(martini.Recovery())
        m.Use(martini.Static("public"))
        m.Action(router.Handle)

Change public to suit whichever folder your web content lives in.
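For comparison, here is a rough sketch of the same fallback using only the standard library. It is not the setup above: shouldServeIndex and spaHandler are hypothetical names, and the existence check with os.Stat stands in for what martini.Static does in the Martini version.

```go
package main

import (
	"fmt"
	"net/http"
	"os"
	"path"
	"path/filepath"
)

// shouldServeIndex reports whether a request path looks like a client-side
// route (no file extension) rather than a missing static asset.
func shouldServeIndex(urlPath string) bool {
	return path.Ext(urlPath) == ""
}

// spaHandler serves files from root and falls back to root/index.html for
// extension-less paths that do not match a file on disk.
func spaHandler(root string) http.Handler {
	files := http.FileServer(http.Dir(root))
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Serve real files (scripts, stylesheets, images) as usual.
		name := filepath.Join(root, filepath.FromSlash(path.Clean("/"+r.URL.Path)))
		if info, err := os.Stat(name); err == nil && !info.IsDir() {
			files.ServeHTTP(w, r)
			return
		}
		// Unknown path without an extension: let AngularJS route it.
		if shouldServeIndex(r.URL.Path) {
			http.ServeFile(w, r, filepath.Join(root, "index.html"))
			return
		}
		// Unknown path with an extension: a genuinely missing asset.
		http.NotFound(w, r)
	})
}

func main() {
	// Show the routing decision for a few sample paths.
	for _, p := range []string{"/", "/signin", "/js/app.js"} {
		fmt.Printf("%-12s serve index.html: %v\n", p, shouldServeIndex(p))
	}
}
```

To serve with it, pass spaHandler("public") to http.ListenAndServe. One limitation shared with the Martini version: a client-side route whose last segment contains a dot (for example /users/john.smith) is treated as a file and gets a 404.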

06 Sep 2013, 07:22

Node.js vs Go

Following on from my previous post on Go, I thought I’d write some thoughts down on why I think Go has an edge over its competitors, in particular Node.js.

I’m not so interested in raw performance comparisons, as for many if not most real-world use cases these metrics are not particularly relevant. What interests me more are aspects such as developer productivity, code maintainability and scalability.

For several years now, Node.js has filled the role of a low-barrier-to-entry tool for rapid development of networking apps. It is well documented and has good cross-platform support. Being built on JavaScript has made it accessible and provided a bridge for front-end devs to explore the world of backend systems programming. It’s been wildly popular and I for one have thoroughly enjoyed the ride; however, I do think Node.js has peaked.

I’ve written several non-trivial apps in Node.js and while I found it fun and interesting, I can’t bring myself to have a great deal of confidence in the finished product. It just feels unsafe. JavaScript is a pretty gnarly language, carrying a lot of baggage from years of being pulled this way and that. Combine that with the async programming model, which has a tendency to get you mentally tied up in knots, and it just seems like unexpected behaviour will be the inevitable outcome. Debugging Node.js can be a bitch. It is also very easy to write code that you come back to in six months and struggle to figure out what it’s meant to do. It may be possible to avoid these pitfalls, but the fact that they exist at all, and bite most people at least a few times, is a sign of a more fundamental problem.

On the topic of scalability, Node.js is tied to a single-threaded event loop. This is fine when you have a lot of IO wait and can hop between concurrent tasks, but as soon as you hit a task that chews CPU, all other tasks effectively stop dead. The solution is to spread the tasks across multiple Node.js instances, and that means forking. Setting up and tearing down a new process carries a not insignificant overhead, so this approach is better suited to a few long-running processes than to many short-lived ones. Load balancing of a shared socket resource (for example, incoming connections on a TCP port) is done by the OS in a round-robin fashion; while probably fine for most scenarios, this is a limitation nonetheless. To summarize: Node.js can be made to scale, but it requires careful planning and potentially quite a bit of extra overhead in terms of lines of code and system resources.

I feel that Go solves these problems while, importantly, remaining accessible and approachable for both newcomers and old hands looking for a fresh approach.
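To make the CPU-bound case concrete, here is a small sketch of how it might look in Go. The prime-counting workload and the function names are purely illustrative assumptions; the point is that each chunk of work runs in its own goroutine, within a single process, with no forking.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// countPrimes is a deliberately CPU-heavy function: it counts the primes
// in [lo, hi) by trial division.
func countPrimes(lo, hi int) int {
	count := 0
	for n := lo; n < hi; n++ {
		if n < 2 {
			continue
		}
		prime := true
		for d := 2; d*d <= n; d++ {
			if n%d == 0 {
				prime = false
				break
			}
		}
		if prime {
			count++
		}
	}
	return count
}

// countPrimesParallel splits [0, limit) into one range per worker and runs
// each range in its own goroutine.
func countPrimesParallel(limit, workers int) int {
	if workers < 1 {
		workers = 1
	}
	results := make([]int, workers)
	chunk := (limit + workers - 1) / workers

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		lo := w * chunk
		hi := lo + chunk
		if hi > limit {
			hi = limit
		}
		wg.Add(1)
		go func(w, lo, hi int) {
			defer wg.Done()
			results[w] = countPrimes(lo, hi) // each goroutine writes only its own slot
		}(w, lo, hi)
	}
	wg.Wait()

	total := 0
	for _, c := range results {
		total += c
	}
	return total
}

func main() {
	fmt.Println(countPrimesParallel(100000, runtime.NumCPU()))
}
```

On current Go releases the runtime schedules goroutines across all available cores by default, so other goroutines (for example, ones handling network IO) keep running while this work is in progress.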

28 Aug 2013, 20:35

And the winner is, Go!

I’ve just started playing around with Google’s language, Go (informally Golang). I’ve quickly become a fan and am already looking for opportunities to use it in a real-world application.

I’d heard about Go when it first came out but never really bothered to look into it until now. Since I’ve been learning more about it, I’ve had several of those aha! moments where I’ve felt like an itch has just been scratched; something that’s bugged me about other languages has been implemented the right way with Go.

Go is very C-like, and when programming in Go it’s easy at first to write C code accidentally without thinking. However, Go reaches far beyond C, for example in its use of methods and interfaces. Another big difference is memory management, which is more akin to Java’s with its garbage collection.

Features of Go that appeal to me in particular are:

  • Simplicity of the ecosystem. Everything is self-contained. You download a single binary tarball for your platform, unpack it somewhere convenient and just start using it. There is one tool, ‘go’, which does practically everything you need - run, build, install etc. Makefiles, you don’t need ’em!
  • Extensive standard library that is designed to accommodate modern systems programming. Knowing that the same, powerful set of tools is available everywhere, straight out of the box, is a big plus.
  • Statically linked binaries. For systems programming this is fantastic. I can code something up on one box and deploy the resulting binary across my entire estate of varying age and flavour of Linux distros without having to worry about library dependencies, something that can be problematic with e.g. Python.
  • Stable, well thought out language spec and library API. Having the direction and focus of full-time Google engineers behind the language really shows in this regard. The Go team have released v1 of the spec and have publicly stated that it will stay pretty much as-is for quite some time. This is good news indeed.
  • Concurrency. Systems programming involves connecting lots of IO pieces together, and concurrency is needed to keep all the balls in the air at once. The Go team have baked this into the language, where it should be, via goroutines and channels.
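As a small taste of the last point, here is a hedged sketch of goroutines and channels working together. The runAll helper, the result type and the host names are all made up for illustration; the work function stands in for any slow IO call.

```go
package main

import "fmt"

// result pairs an input with the output produced for it.
type result struct {
	input  string
	output string
}

// runAll starts one goroutine per input, collects the results over a
// channel, and returns them keyed by input.
func runAll(inputs []string, work func(string) string) map[string]string {
	results := make(chan result)

	for _, in := range inputs {
		go func(in string) {
			results <- result{input: in, output: work(in)}
		}(in)
	}

	// Receive exactly one result per input, in whatever order they finish.
	out := make(map[string]string, len(inputs))
	for range inputs {
		r := <-results
		out[r.input] = r.output
	}
	return out
}

func main() {
	// Hypothetical host names; the work function is a stand-in for a real check.
	hosts := []string{"alpha", "beta", "gamma"}
	statuses := runAll(hosts, func(host string) string {
		return "checked " + host
	})
	for _, h := range hosts {
		fmt.Println(h, "->", statuses[h])
	}
}
```

The channel is the only point of contact between the goroutines and the collector, so no locks are needed around the map.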