Monday, October 20, 2014

60 Seconds Architecture – Graphite


Overview

Graphite is an end-to-end solution for storing, analyzing and aggregating time-series data. There are many other tools in this space; familiar ones include Cacti (http://www.cacti.net/) and RRDtool (http://oss.oetiker.ch/rrdtool/).
Graphite takes the solution to a new level architecturally. It is not only a database but a full application stack, including a web interface, security, clustering and more. For a more in-depth overview of graphite see https://graphite.readthedocs.org/en/latest/overview.html.
So what type of information do you want to store in this database?
Answer: anything. You can use graphite to save metrics on anything. Depending on your application, you can monitor your CPU, disk and so on. You can also send ticks from your app to report the progress of a process, and then monitor its speed in graphite.


Data

So what is graphite, and what isn’t it? Graphite does not do the actual collection of the data (if you need tools for this, see https://graphite.readthedocs.org/en/latest/tools.html). Graphite lets you store data and query it. Since the amount of data that you store can be very large, graphite has a built-in option for retention. Per metric, you can decide what the resolution is and for how long you will keep it. For example, you can define:
retentions = 10s:14d

This stores one data point every 10 seconds and keeps the data for 14 days (for more info see http://graphite.readthedocs.org/en/latest/config-carbon.html#storage-schemas-conf).
This way you also don’t have to worry about deleting old data from the database, as is the case in most time-based solutions.
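To see what a retention definition means in numbers, here is a minimal sketch that parses a "precision:duration" string and computes how many data points it holds. The helper names are hypothetical, and it only handles the single-letter unit suffixes (s, m, h, d, w, y), so treat it as an illustration and not as graphite's own parser.

```python
import re

# Seconds per unit suffix used in retention strings such as "10s:14d".
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800, "y": 31536000}


def to_seconds(token):
    """Convert a token like '10s' or '14d' to seconds."""
    match = re.fullmatch(r"(\d+)([smhdwy])", token)
    if match is None:
        raise ValueError("unsupported retention token: %r" % token)
    return int(match.group(1)) * UNIT_SECONDS[match.group(2)]


def parse_retention(definition):
    """Return (seconds_per_point, number_of_points) for 'precision:duration'."""
    precision, duration = definition.split(":")
    seconds_per_point = to_seconds(precision)
    return seconds_per_point, to_seconds(duration) // seconds_per_point


print(parse_retention("10s:14d"))  # (10, 120960)
```

So the example above keeps 120960 points per metric.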
Once you have defined the retention of your data, you can then define an aggregation function for it. This way you can keep your raw data for up to a month, and then keep only a daily average for the following year. The basic aggregation functions that are supported are: average, sum, min, max, and last.
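The idea of rolling fine-grained points up into a coarser one can be sketched in a few lines. This is only an illustration of the five aggregation functions named above; the function name is hypothetical, and it ignores details of the real implementation such as how many missing points are tolerated.

```python
def aggregate(values, method="average"):
    """Collapse fine-grained values into a single coarse data point."""
    known = [v for v in values if v is not None]  # None marks a missing point
    if not known:
        return None
    if method == "average":
        return sum(known) / len(known)
    if method == "sum":
        return sum(known)
    if method == "min":
        return min(known)
    if method == "max":
        return max(known)
    if method == "last":
        return known[-1]
    raise ValueError("unknown aggregation method: %s" % method)


# Six 10-second points rolled up into one 1-minute point.
window = [4.0, 6.0, None, 5.0, 9.0, 6.0]
print(aggregate(window, "average"))  # 6.0
print(aggregate(window, "max"))      # 9.0
```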
Graphite aggregation also supports combining multiple metrics into a new one via the aggregation definition. This saves processing time later on, since a request for the data does not need to apply the aggregation function at retrieval time.

Graphite Components

The basic components of the graphite server are:
  • carbon: daemons that listen for time-series data over the network using multiple protocols
  • whisper: database library for storing time-series data
  • graphite web: application that renders graphs using a simple URL API

 

Carbon

The carbon daemons support two main protocols: plaintext and pickle.
The plaintext protocol is a simple TCP socket that receives lines in the format:
<metric path> <metric value> <metric timestamp>.
The pickle protocol uses Python’s pickle serialization to send a list of tuples in the following format:
[(path, (timestamp, value)), ...]

This format allows for inserting many timestamps of the same metric in an efficient way.
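As a minimal sketch of what a client has to build, the snippet below pickles a list in that format and prefixes it with a 4-byte length header, which the pickle receiver expects in front of each payload. The metric name and values are made up, and the function name is hypothetical.

```python
import pickle
import struct


def build_pickle_message(datapoints):
    """Serialize [(path, (timestamp, value)), ...] with a 4-byte length header."""
    payload = pickle.dumps(datapoints, protocol=2)
    header = struct.pack("!L", len(payload))  # big-endian unsigned 32-bit length
    return header + payload


points = [
    ("app.numUsers", (1413763200, 42)),
    ("app.numUsers", (1413763210, 45)),
]
message = build_pickle_message(points)
print(len(message))
```

Sending it is then a matter of opening a TCP connection to the pickle receiver port (2004 in the configuration examples in this post) and writing the message to the socket.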
Although it is not documented, the plaintext protocol also supports sending multiple metrics in the same TCP packet, separated by newlines.
Many implementations of these protocols can be found on the internet for multiple languages.
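If you prefer to roll your own, the plaintext side is only a few lines. The sketch below formats a batch of metrics as newline-terminated lines; the metric names are made up and the function name is hypothetical.

```python
def format_plaintext(metrics):
    """Render (path, value, timestamp) tuples as newline-terminated lines."""
    lines = ["%s %s %d" % (path, value, timestamp) for path, value, timestamp in metrics]
    return ("\n".join(lines) + "\n").encode("ascii")


batch = format_plaintext([
    ("servers.web01.cpu", 0.75, 1413763200),
    ("servers.web01.disk", 81, 1413763200),
])
print(batch.decode("ascii"), end="")
```

The resulting bytes can be written to a TCP socket connected to the line receiver port (2003 in the configuration examples in this post).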

Whisper

Whisper is not an actual database, but a library that is optimized for writing time-based files. Each metric is written to its own file. Each file has a fixed size based on the retention rule. This way writing to the file is optimized (the location of each data point in the file, based on its timestamp, is known in advance). It also means that the whole file is allocated when the first data point for a metric arrives (a utility to help calculate the file size based on retention can be found at: https://gist.github.com/jjmaestro/5774063).
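A rough size estimate can also be done by hand. The sketch below assumes the whisper file layout as I understand it: a 16-byte metadata header, 12 bytes of archive info per retention, and 12 bytes per data point. These sizes are not stated anywhere in this post, so treat them as an assumption and check them against your whisper version.

```python
METADATA_SIZE = 16      # aggregation type, max retention, xFilesFactor, archive count
ARCHIVE_INFO_SIZE = 12  # offset, seconds per point, number of points
POINT_SIZE = 12         # 4-byte timestamp + 8-byte double value


def whisper_file_size(archives):
    """archives: list of (seconds_per_point, points) tuples, one per retention."""
    header = METADATA_SIZE + ARCHIVE_INFO_SIZE * len(archives)
    return header + sum(points * POINT_SIZE for _, points in archives)


# retentions = 10s:14d -> 120960 points
print(whisper_file_size([(10, 120960)]))  # 1451548
```

Under these assumptions a single metric with 10s:14d retention takes roughly 1.4 MB on disk, allocated up front.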
The folder structure is very convenient. If your metric is a.b.c, then you will have a file named “c.wsp” in a folder “b” inside a folder “a”. If, for whatever reason, you wish to remove the metric data, you just need to delete the file.
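The mapping from metric name to file is simple enough to sketch. The root folder used here is an assumption (a common default install location); use whatever your storage directory is.

```python
import os


def whisper_path(metric, root="/opt/graphite/storage/whisper"):
    """Map a dotted metric name such as 'a.b.c' to its .wsp file path."""
    parts = metric.split(".")
    parts[-1] += ".wsp"  # the last name component becomes the file name
    return os.path.join(root, *parts)


print(whisper_path("a.b.c"))
```

With a helper like this you can check whether the file for a metric exists (os.path.exists) or remove it (os.remove).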
Since the whole architecture of graphite is like Lego blocks, any part can be changed. So if you want to implement your own database library, you can go and do it (see http://graphite.readthedocs.org/en/latest/storage-backends.html).
For an example of it (and an in-depth article on whisper) see http://www.inmobi.com/blog/2014/01/24/extending-graphites-mileage.

Carbon-Cache

Since graphite is designed for a high write rate, disk IO will obviously be the bottleneck. To solve this, graphite has the carbon-cache. All writes and reads go through the cache. The cache persists the metrics to disk after a configurable interval. It holds a queue per whisper file, so that the queued points for a metric can be written in one block.
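To make the queue-per-file idea concrete, here is a toy model of such a cache. It is not carbon's code: the class and method names are hypothetical, and draining the largest queue first is just one plausible strategy chosen for the illustration.

```python
from collections import defaultdict


class ToyCache:
    """Toy model: one in-memory queue of datapoints per metric."""

    def __init__(self):
        self.queues = defaultdict(list)

    def store(self, metric, timestamp, value):
        self.queues[metric].append((timestamp, value))

    def fetch(self, metric):
        """Readers can see points that have not been written to disk yet."""
        return list(self.queues.get(metric, []))

    def drain_largest(self):
        """Pop the metric with the most queued points so they go out in one write."""
        if not self.queues:
            return None
        metric = max(self.queues, key=lambda name: len(self.queues[name]))
        return metric, self.queues.pop(metric)


cache = ToyCache()
cache.store("app.numUsers", 1413763200, 42)
cache.store("app.numUsers", 1413763210, 45)
cache.store("app.errors", 1413763200, 1)
print(cache.fetch("app.numUsers"))
print(cache.drain_largest())
```

The point of the model is that two data points for the same metric leave the cache together, which turns two disk writes into one.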
In the carbon.conf file you can configure multiple options to fine tune your graphite performance. An important entry is the following:
MAX_UPDATES_PER_SECOND = 500
This entry limits the number of whisper file updates per second. Fewer writes to the disk means better performance, but more data sits in memory and can be lost in case of a crash.
For fine tuning see the following article: http://mike-kirk.blogspot.co.il/2013_12_01_archive.html.
Configuration example:
[cache]
LINE_RECEIVER_INTERFACE = 127.0.0.1
LINE_RECEIVER_PORT = 2003
PICKLE_RECEIVER_INTERFACE = 127.0.0.1
PICKLE_RECEIVER_PORT = 2004
CACHE_QUERY_INTERFACE = 0.0.0.0
CACHE_QUERY_PORT = 7002

Carbon-Relay

Since each metric has its own life cycle, we can store metrics on different machines, or for performance we might run more than one cache on a machine (see the sections on performance boost and high availability). The carbon-relay receives the metrics and forwards each one to the correct carbon-cache.
Configuration example:
[relay]
LINE_RECEIVER_INTERFACE = 0.0.0.0
LINE_RECEIVER_PORT = 2003
PICKLE_RECEIVER_INTERFACE = 0.0.0.0
PICKLE_RECEIVER_PORT = 2004
RELAY_METHOD = consistent-hashing
DESTINATIONS = 127.0.0.1:2014:1, 127.0.0.1:2024:2

Web API

The graphite web application is a Python Django application with a REST API that can be queried to generate graphs as images, or to return raw data in various formats (csv, json). The main user interface can be used as a work area to compose URLs for metrics retrieval.
The web API can read from both the whisper files and the carbon-cache, so that it can access data that has not yet been persisted.
The web API can display a GUI dashboard, or return the data via the REST interface.
Getting data from graphite is as simple as:
http://graphite/render?target=app.numUsers&format=json

There are of course many options: getting multiple metrics with wildcards, defining the time period for the metrics, choosing the format of the reply (json, png, csv, raw), applying functions to metrics before retrieval, and many more. For more information see the documentation at: http://graphite.readthedocs.org/en/latest/render_api.html.
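From a client, building such a URL and reading the JSON reply could look like the sketch below. The helper names are hypothetical and the sample response is hand-written for illustration; the JSON format returns a list of series, each with a target name and a list of [value, timestamp] pairs.

```python
import json
from urllib.parse import urlencode


def render_url(base, targets, start="-1h", fmt="json"):
    """Build a render API URL; 'from' and 'format' are render parameters."""
    params = [("target", target) for target in targets]
    params += [("from", start), ("format", fmt)]
    return base.rstrip("/") + "/render?" + urlencode(params)


def parse_series(body):
    """Turn a JSON render response into {target: [(timestamp, value), ...]}."""
    return {
        series["target"]: [(ts, value) for value, ts in series["datapoints"]]
        for series in json.loads(body)
    }


print(render_url("http://graphite", ["app.numUsers"]))
# Sample response body, hand-written for illustration only.
sample = '[{"target": "app.numUsers", "datapoints": [[42, 1413763200], [null, 1413763210]]}]'
print(parse_series(sample))
```

A null value in the reply means that no data point was stored for that timestamp.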
If you want to enhance your dashboards, have a look at this open source graph editor: http://grafana.org/.

Performance boost

To boost the performance of graphite, it is recommended to run a carbon-cache per CPU core. This way the machine can handle more metrics at the same time. You will need to configure a port per carbon-cache (actually two: one for plaintext and one for pickle). This is a problem, since we do not want our clients to be aware of this layer in the architecture. To solve this, graphite uses the carbon-relay. The client needs to see only the carbon-relay, and the relay then sends the metrics to the different carbon-cache instances.
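With one cache per core the relay's destination list gets long, so it can help to generate it. The sketch below builds a DESTINATIONS value in the host:port:instance form used in the relay example above. The port numbering (2014, 2024, ...) simply follows that example and is an assumption, not a graphite rule.

```python
def relay_destinations(instances, host="127.0.0.1", first_port=2014, step=10):
    """Build a DESTINATIONS value with one host:port:instance entry per cache."""
    entries = [
        "%s:%d:%d" % (host, first_port + step * index, index + 1)
        for index in range(instances)
    ]
    return ", ".join(entries)


print("DESTINATIONS = " + relay_destinations(2))
```

Passing os.cpu_count() as the number of instances gives one entry per core; each carbon-cache instance must of course be configured to listen on the matching port.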



Of course, not only the writing layer has the option for caching; the reading layer does too. You can configure the web API layer to use a memcached server to cache the results of REST requests. You should configure all web API servers to use the same cache server so that a result cached by one server can be reused by the others (http://graphite.readthedocs.org/en/latest/config-local-settings.html).


Clustering Graphite

So how does graphite scale out? As you can guess from the sections above, we have all the building blocks we need. We will have many backend servers that host the whisper files of the metrics. Each machine will have a carbon-cache per CPU core. We will put another machine in front with a carbon-relay to route all incoming metrics to the backend machines. We will also add another machine for the web API.



So how does the system know which metrics are on which machines?
When configuring the relay we set the option RELAY_METHOD = consistent-hashing. This hashes the metric name, and the hash determines which carbon-cache each metric is sent to and retrieved from.
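The principle can be shown with a small hash ring. This is a simplified sketch of consistent hashing in general and not carbon's exact algorithm; the class name, the md5-based position function and the number of points per node are all choices made for the illustration.

```python
import bisect
import hashlib


class HashRing:
    """Simplified consistent-hash ring; not carbon's exact implementation."""

    def __init__(self, nodes, points_per_node=100):
        # Place several virtual points per node on the ring to even out the load.
        self.ring = sorted(
            (self._position("%s:%d" % (node, i)), node)
            for node in nodes
            for i in range(points_per_node)
        )
        self.positions = [position for position, _ in self.ring]

    @staticmethod
    def _position(key):
        return int(hashlib.md5(key.encode("utf-8")).hexdigest()[:8], 16)

    def get_node(self, metric):
        """Return the first node clockwise from the metric's hash position."""
        index = bisect.bisect(self.positions, self._position(metric)) % len(self.ring)
        return self.ring[index][1]


ring = HashRing(["127.0.0.1:2014:1", "127.0.0.1:2024:2"])
print(ring.get_node("app.numUsers"))
```

Because the result depends only on the metric name and the list of nodes, any component that is configured with the same list arrives at the same answer without asking anyone else.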
Note that the web API does not go through the carbon-relay, but works out by itself which cache to access. This is done so that we don’t pay the penalty of another hop.
Both writes and reads need to go via the carbon-cache and not directly to whisper, since there can be data in the cache that has not been persisted yet.

High Availability

Once you understand the cluster mode, achieving high availability means making sure that the whisper file of each metric is stored on more than one machine. This way, if a machine goes down, we still have our data. To do this, we set the configuration parameter REPLICATION_FACTOR = 2 in the relay definition. This tells the relay to send each metric to two machines. For example:
[relay]
LINE_RECEIVER_INTERFACE = 0.0.0.0
LINE_RECEIVER_PORT = 2003
PICKLE_RECEIVER_INTERFACE = 0.0.0.0
PICKLE_RECEIVER_PORT = 2004
RELAY_METHOD = consistent-hashing
REPLICATION_FACTOR = 2
DESTINATIONS = 127.0.0.1:2014:1, 127.0.0.1:2024:2

So depending on your needs and the number of machines, you can set the replication factor to determine how many copies of your data you want. The relay uses consistent hashing to know on which machines the data resides.
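One way to picture replication on top of consistent hashing is to walk the ring from the metric's position and collect distinct nodes until there are enough copies. The sketch below does exactly that. It is a simplified model and not carbon's code, and the host addresses and function names are hypothetical.

```python
import hashlib


def position(key):
    """Hash a string to a point on the ring (simplified, md5-based)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest()[:8], 16)


def destinations_for(metric, nodes, replication_factor=2, points_per_node=100):
    """Walk the ring clockwise from the metric and collect distinct nodes."""
    ring = sorted(
        (position("%s:%d" % (node, i)), node)
        for node in nodes
        for i in range(points_per_node)
    )
    start = position(metric)
    # Rotate the ring so iteration begins at the first point after the metric.
    ordered = [n for p, n in ring if p > start] + [n for p, n in ring if p <= start]
    wanted = min(replication_factor, len(nodes))
    chosen = []
    for node in ordered:
        if len(chosen) == wanted:
            break
        if node not in chosen:
            chosen.append(node)
    return chosen


backends = ["10.0.0.1:2004:a", "10.0.0.2:2004:a", "10.0.0.3:2004:a"]  # hypothetical hosts
print(destinations_for("app.numUsers", backends, replication_factor=2))
```

Raising the replication factor only adds destinations; the first copy stays where it was, so existing data does not have to move.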
For a detailed document on the architecture of graphite see:
Clustering Graphite


Open Issues with graphite

Ramp Up

IO is the biggest problem in a system like graphite. As we saw, to minimize it graphite has the cache, relay and cluster mechanisms. Still, in the end, once a new metric is sent to the server, graphite needs to create a new file whose size is determined by the retention.
There is a parameter:
MAX_CREATES_PER_MINUTE = 50

This defines how many new files are created per minute. The catch with this parameter is that if you have a lot of new metrics, on the first data point of each metric graphite will create a file at its full size. This means writing a lot of data. What is not mentioned in the documentation is that data points for metrics whose files have not been created yet are lost. Meaning that if, at the start of a new system, I send more than 50 new metrics per minute, 50 files will be created and the rest of the metrics will be dropped. So in order to get all my metrics in, I need a ramp-up time: you need to keep sending all the metrics, and their files will be created at a rate of 50 per minute.
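The ramp-up time is simple arithmetic, sketched here with a hypothetical helper: the number of new metrics divided by the creation rate, rounded up.

```python
import math


def ramp_up_minutes(new_metrics, max_creates_per_minute=50):
    """Minutes of continuous sending before every new metric has a file."""
    return math.ceil(new_metrics / max_creates_per_minute)


print(ramp_up_minutes(1000))  # 20
```

So a new system with 1000 metrics needs about 20 minutes at the setting shown above before all of them are being stored.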

Debugging

Eat your own dog food

Graphite saves a lot of information about itself in metrics that are saved in graphite.
For the carbon cache we have the following metrics:



And for carbon relay we have the following:



Debugging techniques

First, once you send a new metric to graphite, you can check whether the whisper file was created.
Next step: in carbon.conf you have the following flags:
LOG_UPDATES = log every whisper update
LOG_CACHE_HITS = log every cache hit
LOG_CACHE_QUEUE_SORTS = True

To view the logs go to ../storage/log. Here you should have three log folders: carbon-cache, carbon-relay and webapp. Under each one there are more folders, one per instance of the application.
For example, to debug the cache, go to the folder /storage/log/carbon-cache/carbon-cache-1. To debug your application sending metrics to graphite you can use listener.log. If there are any connection failures you should see them in this file. Also, if invalid formats are sent to graphite, you will see the error here.



Multi-tenant

If you need to use graphite for multiple customers, you can easily do this by adding a prefix to the metric name. You just need to remember that this is a solution at the application layer and not in graphite itself. So if you give a client a direct connection to graphite, you cannot restrict the data per client.
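In the application layer this comes down to two small helpers, sketched below with hypothetical names: one that adds the tenant prefix when writing, and one that checks a plain target against the tenant's prefix before the query is forwarded to graphite.

```python
def tenant_metric(tenant, metric):
    """Prefix a metric name with the tenant it belongs to."""
    return "%s.%s" % (tenant, metric)


def is_allowed(tenant, target):
    """Application-level check: a tenant may only query its own prefix."""
    return target.startswith(tenant + ".")


print(tenant_metric("customerA", "app.numUsers"))   # customerA.app.numUsers
print(is_allowed("customerA", "customerB.app.*"))   # False
```

Note that this check only covers plain metric paths. A target that wraps the path in a function would need to be parsed before it can be checked.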

Events

Graphite does have a simple mechanism for saving basic events. The basic structure of an event is: when, what, data, tags. There is a dedicated GUI for viewing the events. You can also use the REST API to query events. Events are not the center of graphite and therefore do not have all the features that would be expected from an events system. So if you need to do anything more than a simple event you should look for a more robust system (like elasticsearch). For more information see:



Hosting Graphite

If your servers have access to the internet, and you do not want the hassle of setting up graphite and maintaining it, you can always use a hosted graphite service.
