Recently, while creating a basic HTTP/HTTPS monitoring app, Pingo2, I started seeing "too many open files" errors. The error was thrown after the app had been running for some time, whenever I attempted to open a new network connection.
Of course, in Unix/Linux network sockets are just files, so this error message actually makes sense in that context. The first thing to do was run lsof to see exactly which files the process had open:
    lsof | grep 19991
    ...
    myapp 19991 20009 monitoring 38u IPv4 4685252 0t0 TCP dev.example.com:44449->foobar:https (ESTABLISHED)
    myapp 19991 20009 monitoring 39u IPv4 4685250 0t0 TCP dev.example.com:45459->xxx.xxx.189.184:https (ESTABLISHED)
    myapp 19991 20009 monitoring 40u IPv4 4685251 0t0 TCP dev.example.com:45460->xxx.xxx.189.184:https (ESTABLISHED)
    myapp 19991 20009 monitoring 41u IPv4 4685253 0t0 TCP dev.example.com:44450->foobar:https (ESTABLISHED)
    myapp 19991 20009 monitoring 42u IPv4 4685268 0t0 TCP dev.example.com:44454->foobar:https (ESTABLISHED)
    myapp 19991 20009 monitoring 43u IPv4 4685266 0t0 TCP dev.example.com:45464->xxx.xxx.189.184:https (ESTABLISH
    ...
So, lots of ESTABLISHED network connections, each one corresponding to an HTTP connection my program had initiated. I was explicitly closing the HTTP response body after each request, and thought that was sufficient for the connection to be closed. However, it turns out that the default HTTP transport has HTTP keep-alives enabled (not to be confused with the similarly named TCP keep-alives), so it holds connections open for reuse instead of closing them. The TCP connections were piling up in the background as a result.
Creating a custom http.Transport for the HTTP client with DisableKeepAlives: true fixed the issue.