I thought about using a clickbait title – “You’ll never believe how this guy captures metrics!” – but decided that 99% of these are not worth the time invested in coming up with the catchy title.
So instead, I’ll simply talk about what I wanted to, and you be the judge of my title.
Application Performance Monitoring, or APM, is a crazily complex landscape, with an enormous amount of tooling, terminology, and providers looking to get some piece of the action.
There are many vendors, and all have their advantages, as well as disadvantages.
The vendor that I am pretty happy with (and I now work there) is Datadog.
One solution that has caught on quite well for surgical application monitoring is the use of the statsd protocol to send metrics from inside your application to a listener which can then store these metrics for querying later on. This is achieved by placing strategic “emitter” callouts in your code so that they can report metrics during runtime.
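To make the “emitter” idea concrete, here is a minimal Python sketch of what such a callout boils down to: a small text datagram sent over UDP to a local listener. The metric name, the tag, and the helper names `format_metric` and `emit` are made up for illustration. The host and port are the conventional statsd defaults, and the `|#tag` suffix is Datadog’s tag extension to the plain statsd format. In a real application you would most likely use an existing client library rather than hand-rolling this.

```python
import socket


def format_metric(name, value, metric_type, tags=None):
    """Build a statsd datagram such as 'example.page.views:1|c'.

    The optional '|#tag1,tag2' suffix is the DogStatsD tag extension.
    """
    payload = f"{name}:{value}|{metric_type}"
    if tags:
        payload += "|#" + ",".join(tags)
    return payload


def emit(payload, host="127.0.0.1", port=8125):
    """Fire-and-forget: send one datagram over UDP to the listener."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload.encode("utf-8"), (host, port))


if __name__ == "__main__":
    # Hypothetical counter ("c") incremented once per page view.
    emit(format_metric("example.page.views", 1, "c", tags=["env:dev"]))
```

Because the transport is UDP, the application does not wait for a reply from the listener, which is a big part of why sprinkling these callouts through your code is so cheap.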
Flickr, and then Etsy, started these projects, which have since been refined, ported to most languages, and adopted by companies where measurement is an important goal.
A blog post on Datadog’s implementation and extension of Statsd was written last year and goes into deeper detail.
One common question has always been “How do I collect metrics from an application running on Heroku with Datadog?”.
And I think we finally have one answer.
The Heroku Dyno container is pretty simple – you wanna run a process? Describe it in a Procfile.
You wanna scale? You tell Heroku to launch more Dynos with the process name, as specified in the Procfile.
However, the actual Dyno is a fairly limited environment by design – the root filesystem is read-only, the only writable area is the application’s root directory, and that disappears when the Dyno is terminated. There’s no sysvinit, upstart or systemd for people to bicker about. Use a Procfile, which is also really simple.
So a challenge to overcome became: “how to install a Datadog Agent package that runs a dogstatsd listener as a second process, inside an environment that is pretty locked down?”
First, we have to install the package. Heroku has a concept of “[buildpacks](https://devcenter.heroku.com/articles/buildpacks)” that can be used to run compilation steps before adding your application code and launching it. Multiple buildpacks can also be used together, chaining steps to achieve the desired outcome.
I read through the heroku-buildpack-apt code, found a bunch of good ideas, and came up with a Datadog-Agent-specific installer buildpack that drops off the package, as well as the environment needed at runtime.
Now how do I run the listener process alongside my application?
Enter foreman. Foreman, not to be confused with “theforeman”, has long been a great way for application developers writing Heroku-targeted applications to run them locally in a manner similar to how they will be run on the remote platform.
Foreman reads the Procfile, and runs the processes based on the directives contained inside.
This feature is the one that we leverage to run multiple processes on a single Dyno.
By using foreman inside the Dyno, we are able to tell foreman to run more than one process type at a time, with another Procfile that specifies the startup process for the actual application as well as the dogstatsd listener.
When deploying any code revision, Heroku will read the base Procfile and run a foreman process inside the Dyno, which will, in turn, start up the app and dogstatsd.
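The Procfile-in-Procfile arrangement can be sketched roughly like this in Python. The parser below is a simplified stand-in for what foreman does when it reads `name: command` lines, not foreman’s actual code. The file name `Procfile.inner` and both commands in the inner Procfile are hypothetical placeholders; the real ones are in the README and example application. The only real piece is foreman’s `-f` option, which points it at a different Procfile.

```python
import re

# One "name: command" entry per line; anything else (blank lines,
# comments) is ignored by this simplified parser.
PROCFILE_LINE = re.compile(r"^([A-Za-z0-9_-]+):\s*(.+)$")


def parse_procfile(text):
    """Return {process_name: command} for each 'name: command' line."""
    processes = {}
    for line in text.splitlines():
        match = PROCFILE_LINE.match(line.strip())
        if match:
            processes[match.group(1)] = match.group(2).strip()
    return processes


# Base Procfile read by Heroku: a single process type that hands
# control to foreman (the inner file name is hypothetical).
BASE_PROCFILE = "web: foreman start -f Procfile.inner\n"

# Inner Procfile read by foreman inside the Dyno: the real app plus
# the dogstatsd listener (both commands are placeholders).
INNER_PROCFILE = """\
web: bundle exec ruby app.rb
dogstatsd: ./start-dogstatsd.sh
"""

if __name__ == "__main__":
    print("Heroku starts:", parse_procfile(BASE_PROCFILE))
    print("foreman starts:", parse_procfile(INNER_PROCFILE))
```

From Heroku’s point of view there is still only one `web` process per Dyno; it just happens to be a process manager that fans out into two children.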
And while foreman is a Ruby gem, your project may be in Python (use honcho) or Go (use forego or goreman), and I’m sure there are others out there. I haven’t found or tested all of them, so tell me if they work out for you.
I did, however, take the time to write up a README with the procedure to follow to use this, as well as a commit-by-commit example application.
Here’s the buildpack code: http://miketheman.github.io/heroku-buildpack-datadog/
Here’s the example application: https://github.com/miketheman/buildpack-example-ruby
Here’s an image of the stats collected by the example application in Datadog, with increasing web load:
Here’s a random dog:
Hope this helps you find deeper insight into how you monitor your applications!
Update (2014-12-15)
A quick addition on this topic.
A couple of days after this was published, I had a short Twitter exchange with Bo Jeanes, after which he submitted a Pull Request to the buildpack (as well as an update to the example app).
This simplifies the end-user’s deployment of the Agent package: the user no longer has to spend any time on Procfile-in-Procfile solutions, and there is no longer any need for foreman and the like inside the container. Instead, the dogstatsd process is started via the profile.d mechanism, which is run on Dyno startup.
This makes the solution even more elegant, so thanks a ton, Bo!