60 Seconds Architecture – Graphite
Overview
Graphite is an end-to-end solution for storing, analyzing and aggregating time-series data. There are many other tools out there; the familiar ones are Cacti (http://www.cacti.net/), RRDtool (http://oss.oetiker.ch/rrdtool/) and others.
Graphite has taken the solution to a new level on the architectural plane. Graphite is not only a database solution but a full application solution, including a web interface, security, clustering and more. For a more in-depth overview of Graphite see https://graphite.readthedocs.org/en/latest/overview.html.
So what type of information do you want to store in this database?
Answer: anything. You can use Graphite to save metrics on anything. Depending on your application, you can monitor your CPU, disk… You can send ticks from your app to report process progress, and then monitor the speed in Graphite.
- You can even send Windows performance counters to Graphite: http://www.hodgkins.net.au/mswindows/using-powershell-to-send-metrics-graphite/
- Want to monitor your Storm cluster? No problem: http://www.michael-noll.com/blog/2013/11/06/sending-metrics-from-storm-to-graphite/
- Are you using Logstash to analyze your logs? Send those to Graphite as well: http://logstash.net/docs/1.2.0/outputs/graphite
- Are you using Sensu to monitor your farm? http://www.joemiller.me/2013/12/07/sensu-and-graphite-part-2/
- As you can see, it is all a matter of your imagination.
Data
So what is and what isn't Graphite? Graphite does not do the actual collection of the data (if you need tools for this, see https://graphite.readthedocs.org/en/latest/tools.html). Graphite supplies the option to store data and to query it. Since the data that you store can grow very large, Graphite has a built-in option for retention. Per metric you can decide what the resolution is and for how long you will keep it. So for example you can define
retentions = 10s:14d
This will save a data point every 10 seconds and keep it for 14 days (for more info see http://graphite.readthedocs.org/en/latest/config-carbon.html#storage-schemas-conf).
This way you also don't have to worry about deleting old data from the database, as is the case in most time-based solutions.
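In practice these retention rules live in storage-schemas.conf and are matched against the metric name; a minimal sketch, where the [app_metrics] section name and the app.* pattern are illustrative:

# storage-schemas.conf (rules are matched top to bottom, first match wins)
[app_metrics]
pattern = ^app\.
# 10-second resolution for 14 days, then 1-minute for 90 days, then 1-hour for a year
retentions = 10s:14d,1m:90d,1h:1y

[default]
pattern = .*
retentions = 60s:30d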
Once you have defined the retention of your data, you can then define an aggregation function for it. This way you can keep your raw data for up to a month, and then keep a daily average for the next year. The basic aggregation functions that are supported are: average, sum, min, max and last.
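The method used when rolling data up into the coarser retentions is chosen per metric-name pattern in storage-aggregation.conf; a minimal sketch with illustrative patterns:

# storage-aggregation.conf (first matching rule wins)
[min_metrics]
pattern = \.min$
xFilesFactor = 0.1
aggregationMethod = min

[default_average]
pattern = .*
xFilesFactor = 0.5
aggregationMethod = average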
Graphite can also combine multiple metrics into a new one via aggregation rules, which saves processing time later on (requests for the combined data will not need to apply the aggregation function at retrieval time).
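Combining metrics this way is done with aggregation rules, processed by the carbon-aggregator daemon sitting in front of the cache; a rough sketch, where the metric names are illustrative:

# aggregation-rules.conf
# output_metric (frequency in seconds) = method input_pattern
app.all.requests (60) = sum app.*.requests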
Graphite Components
The basic components of the graphite server are:
- carbon – daemons that listen for time-series data over the network using multiple protocols.
- whisper – database library for storing time-series data
- graphite-web – an application that renders graphs using a simple URL API
Carbon
The carbon daemon supports two main protocols: plaintext and pickle.
Plaintext is a simple TCP socket protocol that receives data in the format:
<metric path> <metric value> <metric timestamp>
Pickle is Python's serialization format; the pickle receiver accepts a pickled list of tuples of the form:
[(path, (timestamp, value)), ...]
This format allows inserting many data points of the same metric in an efficient way.
Although not documented, the plaintext protocol also supports sending multiple metrics over the same TCP connection, separated by newlines. Many implementations of these protocols can be found on the internet for multiple languages.
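A minimal sketch of both protocols in Python; the host, the default ports 2003/2004 and the metric name are illustrative, and the helper names are my own:

# Send metrics to carbon over the plaintext and pickle protocols.
import pickle
import socket
import struct
import time

CARBON_HOST = "127.0.0.1"

def send_plaintext(path, value, timestamp=None):
    # plaintext: "<metric path> <metric value> <metric timestamp>\n" on port 2003
    timestamp = int(timestamp or time.time())
    with socket.create_connection((CARBON_HOST, 2003)) as sock:
        sock.sendall(f"{path} {value} {timestamp}\n".encode())

def send_pickle(datapoints):
    # pickle: a pickled [(path, (timestamp, value)), ...] list, prefixed with a
    # 4-byte big-endian length header, on port 2004
    payload = pickle.dumps(datapoints, protocol=2)
    with socket.create_connection((CARBON_HOST, 2004)) as sock:
        sock.sendall(struct.pack("!L", len(payload)) + payload)

now = int(time.time())
send_plaintext("app.numUsers", 42)
send_pickle([("app.numUsers", (now, 42)), ("app.numUsers", (now + 10, 45))])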
Whisper
Whisper is not an actual database server, but a library that is optimized for writing time-based files. Each metric is written to its own file, and each file has a fixed size based on the retention rule. This way the writing to the file is optimized (the location of each data point in the file, based on its timestamp, is known in advance). It also means that the file is allocated in full when the first data point for it arrives (a utility to help calculate the file size based on retention can be found at: https://gist.github.com/jjmaestro/5774063).
The folder structure is very convenient. If your metric is a.b.c, then you will have a file named “c.wsp” in a folder “b” inside a folder “a”. If, for whatever reason, you wish to remove the metric data, you just need to delete the file.
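The whisper library can also be used directly, which makes the file-per-metric model concrete; a minimal sketch, where the path and retention are illustrative and the exact signatures may vary between whisper versions:

# Create, write and read a whisper file directly.
import os
import time
import whisper

path = "/opt/graphite/storage/whisper/app/numUsers.wsp"
os.makedirs(os.path.dirname(path), exist_ok=True)

# Allocate the full file up front: 10-second resolution kept for 14 days
# (14d / 10s = 120960 points).
whisper.create(path, [(10, 120960)], xFilesFactor=0.5, aggregationMethod="average")

# Write a single data point, then read the last hour back.
whisper.update(path, 42, int(time.time()))
(start, end, step), values = whisper.fetch(path, int(time.time()) - 3600)
print(step, [v for v in values if v is not None])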
Since the whole architecture of Graphite is built like Lego blocks, any part can be replaced. So if you want to implement your own database library, you can go and do it (see http://graphite.readthedocs.org/en/latest/storage-backends.html). For an example of this (and an in-depth article on whisper) see http://www.inmobi.com/blog/2014/01/24/extending-graphites-mileage.
Carbon-Cache
Since Graphite is designed for high-rate writing, IO is obviously the bottleneck. To solve this, Graphite added the carbon-cache. All writes and reads go through the cache. The cache persists the metrics to disk after a configurable interval, and it holds a queue per whisper file so that each file is written in one block.
In the carbon.conf file you can configure multiple options to fine-tune your Graphite performance. An important entry is the following:
MAX_UPDATES_PER_SECOND = 500
This entry defines the number of updates per second to the disk. Fewer writes to disk mean better performance, but this comes with the risk of losing data in case of a crash.
For fine tuning see the following article: http://mike-kirk.blogspot.co.il/2013_12_01_archive.html.
Configuration example:
[cache]
LINE_RECEIVER_INTERFACE = 127.0.0.1
LINE_RECEIVER_PORT = 2003
PICKLE_RECEIVER_INTERFACE = 127.0.0.1
PICKLE_RECEIVER_PORT = 2004
CACHE_QUERY_INTERFACE = 0.0.0.0
CACHE_QUERY_PORT = 7002
Carbon-Relay
Since the architecture gives each metric its own life cycle, we can store metrics on different machines, or, for performance, we might have more than one cache (see the sections on performance boost and high availability).
Configuration example:
[relay]
LINE_RECEIVER_INTERFACE = 0.0.0.0
LINE_RECEIVER_PORT = 2003
PICKLE_RECEIVER_INTERFACE = 0.0.0.0
PICKLE_RECEIVER_PORT = 2004
RELAY_METHOD = consistent-hashing
DESTINATIONS = 127.0.0.1:2014:1, 127.0.0.1:2024:2
Web API
Graphite's web component is a Python Django application with a REST API that can be queried to generate graphs as images or to return raw data in various formats (csv, json). The main user interface can be used as a work area to compose URLs for metric retrieval.
The web API can read from either the whisper files or the carbon-cache, so that it can access data that has not yet been persisted.
The web API can display a GUI dashboard, or return the data via the REST interface.
Getting data from graphite is as simple as:
http://graphite/render?target=app.numUsers&format=json
There are of course many options, including retrieving multiple metrics with wildcards, defining the time period for the metrics, choosing the format of the reply (json, png, csv, raw), applying functions to metrics before retrieval, and many more. For more information see the documentation at: http://graphite.readthedocs.org/en/latest/render_api.html.
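A minimal sketch of calling the render API from Python with the standard library; the host, target and time range are illustrative:

# Fetch the last hour of a metric as JSON from the render API.
import json
import urllib.parse
import urllib.request

params = urllib.parse.urlencode({
    "target": "app.numUsers",
    "from": "-1h",
    "format": "json",
})
with urllib.request.urlopen(f"http://graphite/render?{params}") as resp:
    series = json.load(resp)

for s in series:
    # Each series has a "target" name and a list of [value, timestamp] pairs.
    print(s["target"], s["datapoints"][:3])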
If you want to enhance your dashboards, have a look at this
open source graph editor: http://grafana.org/.
Performance boost
To boost the performance of Graphite, it is recommended to create one carbon-cache per CPU core. This way the machine can handle more metrics at the same time. You will need to configure a port per carbon-cache (actually two: one for plaintext and one for pickle). This is a problem, since our clients do not want to be aware of this layer in the architecture. To solve this, Graphite uses the carbon-relay: the clients only need to see the carbon-relay, and the relay will distribute the metrics to the different carbon-cache instances.
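A sketch of how this can look in carbon.conf, assuming two cache instances named a and b on the same machine (the ports are illustrative) and a relay distributing to their pickle ports:

# carbon.conf – two cache instances behind one relay
[cache:a]
LINE_RECEIVER_PORT = 2013
PICKLE_RECEIVER_PORT = 2014
CACHE_QUERY_PORT = 7012

[cache:b]
LINE_RECEIVER_PORT = 2023
PICKLE_RECEIVER_PORT = 2024
CACHE_QUERY_PORT = 7022

[relay]
LINE_RECEIVER_PORT = 2003
PICKLE_RECEIVER_PORT = 2004
RELAY_METHOD = consistent-hashing
DESTINATIONS = 127.0.0.1:2014:a, 127.0.0.1:2024:b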
Of course, not only the writing layer has the option for caching; the reading one does too. You can configure the web API layer to use a memcached server to cache the results of REST requests. You should configure all web API servers to use the same cache server, so that requests hitting different web servers are cached as well (http://graphite.readthedocs.org/en/latest/config-local-settings.html).
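In local_settings.py this looks roughly as follows; the memcached address is illustrative:

# local_settings.py – point all graphite-web instances at the same memcached server
MEMCACHE_HOSTS = ['10.0.0.5:11211']
DEFAULT_CACHE_DURATION = 60  # seconds to keep rendered results in the cache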
Clustering Graphite
So how does Graphite scale out? As you can guess from the sections above, we have all the building blocks we need. We will have many backend servers that host the metrics' whisper files, each with one carbon-cache per CPU core. We will put another machine with a carbon-relay in front to route all metric writes to those machines, and we will add another machine for the web API interface.
So how does the system know which metrics are on which
machines?
When configuring the relay we added the option consistent-hashing. This creates a hash of each metric name and uses it to decide to which machine the metric is sent and from which it is retrieved.
For more information see: http://grey-boundary.com/the-architecture-of-clustering-graphite/.
As you can see, the web API does not go through the carbon-relay, but knows by itself which cache to access. This is done so that we don't pay the penalty of another hop.
Both writes and reads need to go via the carbon-cache, and not directly to whisper, since there can be data in the cache that has not been persisted yet.
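On the web API side, a rough local_settings.py sketch for such a cluster, with illustrative addresses and instance names (CLUSTER_SERVERS lists the other graphite-web nodes, CARBONLINK_HOSTS the local carbon-cache query ports):

# local_settings.py on each web node
CLUSTER_SERVERS = ["10.0.0.2:80", "10.0.0.3:80"]
CARBONLINK_HOSTS = ["127.0.0.1:7012:a", "127.0.0.1:7022:b"]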
High Availability
Once you understand the cluster mode, achieving high availability is a matter of making sure that each metric's whisper file is stored on more than one machine. This way, if a machine goes down, we still have our data. To do this we add the configuration parameter REPLICATION_FACTOR = 2 to the relay definition. This tells the relay to persist each metric to two machines. For example:
[relay]
LINE_RECEIVER_INTERFACE = 0.0.0.0
LINE_RECEIVER_PORT = 2003
PICKLE_RECEIVER_INTERFACE = 0.0.0.0
PICKLE_RECEIVER_PORT = 2004
RELAY_METHOD = consistent-hashing
REPLICATION_FACTOR = 2
DESTINATIONS = 127.0.0.1:2014:1, 127.0.0.1:2024:2
So depending on your needs and the number of machines, you can set the replication factor to determine how many copies of your data you want. The relay uses the consistent hashing to know on which machines the data resides.
For a detailed document on the architecture of graphite see:
Clustering Graphite
Open Issues with graphite
Ramp Up
IO is the biggest problem in a system like Graphite. As we saw, to minimize it Graphite created the cache, relay and cluster mechanisms. Still, in the end, once a new metric is sent to the server, Graphite needs to create a new file whose size depends on the retention.
There is a parameter:
MAX_CREATES_PER_MINUTE = 50
This defines how many new files are created per minute. The catch with this parameter is that if you have a lot of new metrics, on the first data point of each metric Graphite creates a file of the full retention length, which means writing a lot of data. What is not mentioned in the documentation is that any metrics whose files have not yet been created are dropped. Meaning that if, on the start of a new system, I send more than 50 new metrics per minute, 50 files will be created and the rest of the data points will be lost. So in order to get all my metrics in, I need a ramp-up period: you need to keep sending all the metrics continually, and the metric files will then be created at a rate of 50 per minute.
Debugging
Eat your own dog food
Graphite saves a lot of information about itself in metrics that are stored in Graphite itself.
For the carbon-cache these internal metrics are reported under carbon.agents.<instance> (for example metricsReceived, cache.size, updateOperations and creates), and for the carbon-relay under carbon.relays.<instance>.
Debugging techniques
The first step: once you send a new metric to Graphite, check whether its whisper file was created.
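A minimal sketch of that check, assuming the default storage root /opt/graphite/storage/whisper and an illustrative metric name:

# Check whether carbon has created the whisper file for a metric.
import os

STORAGE_DIR = "/opt/graphite/storage/whisper"

def whisper_path(metric):
    # a.b.c  ->  <STORAGE_DIR>/a/b/c.wsp
    return os.path.join(STORAGE_DIR, *metric.split(".")) + ".wsp"

print(os.path.exists(whisper_path("app.numUsers")))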
Next step: in carbon.conf you have the following debug flags:
# log every whisper update
LOG_UPDATES = True
# log every cache hit
LOG_CACHE_HITS = True
LOG_CACHE_QUEUE_SORTS = True
To view the logs, go to ../storage/log. Here you should have three log folders: carbon-cache, carbon-relay and webapp. Under each one there are more folders, one per instance of the application.
For example, when debugging the cache, go to the folder /storage/log/carbon-cache/carbon-cache-1.
To debug your application sending metrics to Graphite you can use listener.log. If there are any connection failures you should see them in this file, and if invalid formats are sent to Graphite, you will see the error here as well.
Multi-tenant
If you need to use Graphite for multiple customers, you can easily do this by adding a prefix to the metric name. Just remember that this is a solution at the application layer and not in Graphite itself, so if you give clients a direct connection to Graphite, you cannot restrict the data per client.
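As a minimal sketch, reusing the illustrative send_plaintext helper from the Carbon section, the prefix is simply prepended to the metric path:

# Prefix every metric with the tenant name before sending; names are illustrative.
def tenant_metric(tenant, metric):
    return f"{tenant}.{metric}"

send_plaintext(tenant_metric("customerA", "app.numUsers"), 42)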
Events
Graphite does have a simple mechanism for saving basic events. The basic structure of an event is: when, what, data, tags. There is a dedicated GUI for viewing events, and you can also use the REST API to query them. Events are not the focus of Graphite and therefore do not have all the features that would be expected from an event system, so if you need to do anything more than simple events you should look at a more robust system (like Elasticsearch). For more information see:
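A rough sketch of writing and reading such an event via the REST interface; the host is illustrative and the exact tags format (string vs. list) differs between Graphite versions:

# Post an event and query events back by tag.
import json
import urllib.request

event = {"what": "deploy", "tags": "app release", "data": "version 1.2.3"}
req = urllib.request.Request(
    "http://graphite/events/",
    data=json.dumps(event).encode(),
    headers={"Content-Type": "application/json"},
)
urllib.request.urlopen(req)

# Query events tagged "release".
with urllib.request.urlopen("http://graphite/events/get_data?tags=release") as resp:
    print(json.load(resp))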
Hosting Graphite
If your servers have access to the internet, and you do not want the hassle of setting up Graphite and maintaining it, you can always go the hosted route.