
Administering Carbon

Starting Carbon

Carbon can be started with the carbon-cache.py script:

.. code-block:: none

/opt/graphite/bin/carbon-cache.py start

This starts the main Carbon daemon in the background. Now is a good time to check the logs, located in /opt/graphite/storage/log/carbon-cache/, for any errors.
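The same script manages the daemon's lifecycle; it also accepts stop and status:

.. code-block:: none

/opt/graphite/bin/carbon-cache.py status
/opt/graphite/bin/carbon-cache.py stop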

The Carbon Daemons

When we talk about "Carbon" we mean one or more of various daemons that make up the storage backend of a Graphite installation. In simple installations, there is typically only one daemon, carbon-cache.py. This document gives a brief overview of what each daemon does and how you can use them to build a more sophisticated storage backend.

All of the carbon daemons listen for time-series data and can accept it over a common set of :doc:protocols </feeding-carbon>. However, they differ in what they do with the data once they receive it.

carbon-cache.py

carbon-cache.py accepts metrics over various protocols and writes them to disk as efficiently as possible. This requires caching metric values in RAM as they are received, and flushing them to disk on an interval using the underlying whisper library.

carbon-cache.py requires some basic configuration files to run:

:doc:carbon.conf </config-carbon> The [cache] section tells carbon-cache.py what ports (2003/2004/7002), protocols (newline delimited, pickle) and transports (TCP/UDP) to listen on.

:doc:storage-schemas.conf </config-carbon> Defines a retention policy for incoming metrics based on regex patterns. This policy is passed to whisper when the .wsp file is pre-allocated, and dictates how long data is stored for.
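A minimal sketch of these two files, using the default ports named above (the [default] schema name and its retention are illustrative):

.. code-block:: none

# carbon.conf
[cache]
LINE_RECEIVER_INTERFACE = 0.0.0.0
LINE_RECEIVER_PORT = 2003
PICKLE_RECEIVER_PORT = 2004
CACHE_QUERY_PORT = 7002

# storage-schemas.conf
[default]
pattern = .*
retentions = 60s:1d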

As the number of incoming metrics increases, one carbon-cache.py instance may not be enough to handle the I/O load. To scale out, simply run multiple carbon-cache.py instances (on one or more machines) behind a carbon-aggregator.py or carbon-relay.py.
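For example, carbon.conf supports named cache instances, each with its own ports, started via the --instance flag (the ports and instance names here are illustrative):

.. code-block:: none

# carbon.conf
[cache:a]
LINE_RECEIVER_PORT = 2103
PICKLE_RECEIVER_PORT = 2104
CACHE_QUERY_PORT = 7102

[cache:b]
LINE_RECEIVER_PORT = 2203
PICKLE_RECEIVER_PORT = 2204
CACHE_QUERY_PORT = 7202

Each instance is then started separately:

.. code-block:: none

/opt/graphite/bin/carbon-cache.py --instance=a start
/opt/graphite/bin/carbon-cache.py --instance=b start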

.. warning::

If clients connecting to the carbon-cache.py are experiencing errors such as connection refused by the daemon, a common reason is a shortage of file descriptors.

In the console.log file, if you find entries such as:

Could not accept new connection (EMFILE)

or

exceptions.IOError: [Errno 24] Too many open files: '/var/lib/graphite/whisper/systems/somehost/something.wsp'

the number of files carbon-cache.py can open will need to be increased. Many systems default to a max of 1024 file descriptors. A value of 8192 or more may be necessary depending on how many clients are simultaneously connecting to the carbon-cache.py daemon.

In Linux, the system-global file descriptor max can be set via sysctl. Per-process limits are set via ulimit. See documentation for your operating system distribution for details on how to set these values.

carbon-relay.py

carbon-relay.py serves two distinct purposes: replication and sharding.

When running with RELAY_METHOD = rules, a carbon-relay.py instance can run in place of a carbon-cache.py server and relay all incoming metrics to multiple backend carbon-cache.py's running on different ports or hosts.

In RELAY_METHOD = consistent-hashing mode, a DESTINATIONS setting defines a sharding strategy across multiple carbon-cache.py backends. The same consistent hashing list can be provided to the graphite webapp via CARBONLINK_HOSTS to spread reads across the multiple backends.
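A sketch of this mode, assuming the two local cache instances a and b from the example above (ports are illustrative; DESTINATIONS uses the caches' pickle ports, CARBONLINK_HOSTS their cache query ports):

.. code-block:: none

# carbon.conf, [relay] section
RELAY_METHOD = consistent-hashing
DESTINATIONS = 127.0.0.1:2104:a, 127.0.0.1:2204:b

# local_settings.py (Graphite webapp)
CARBONLINK_HOSTS = ['127.0.0.1:7102:a', '127.0.0.1:7202:b']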

carbon-relay.py is configured via:

:doc:carbon.conf </config-carbon> The [relay] section defines listener host/ports and a RELAY_METHOD

:doc:relay-rules.conf </config-carbon> With RELAY_METHOD = rules, pattern/servers tuples define which destination servers receive metrics matching a given regex rule.

carbon-aggregator.py

carbon-aggregator.py can be run in front of carbon-cache.py to buffer metrics over time before reporting them into whisper. This is useful when granular reporting is not required, and can help reduce I/O load and whisper file sizes due to lower retention policies.

carbon-aggregator.py is configured via:

:doc:carbon.conf </config-carbon> The [aggregator] section defines listener and destination host/ports.

:doc:aggregation-rules.conf </config-carbon> Defines a time interval (in seconds) and aggregation function (sum or average) for incoming metrics matching a certain pattern. At the end of each interval, the values received are aggregated and published to carbon-cache.py as a single metric.


Configuring Carbon

Carbon's config files all live in /opt/graphite/conf/. If you've just installed Graphite, none of the .conf files will exist yet, but there will be a .conf.example file for each one. Simply copy the example files, removing the .example extension, and customize your settings.
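For example, the following shell loop copies every example file in one go (adjust the path if your install lives elsewhere):

.. code-block:: none

cd /opt/graphite/conf
for f in *.conf.example; do cp "$f" "${f%.example}"; done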

carbon.conf

This is the main config file, and defines the settings for each Carbon daemon.

Each setting within this file is documented via comments in the config file itself. The settings are broken down into sections for each daemon - carbon-cache is controlled by the [cache] section, carbon-relay is controlled by [relay] and carbon-aggregator by [aggregator]. However, if this is your first time using Graphite, don't worry about anything but the [cache] section for now.

.. TIP:: Carbon-cache and carbon-relay can run on the same host! Try swapping the default ports listed for LINE_RECEIVER_PORT and PICKLE_RECEIVER_PORT between the [cache] and [relay] sections to prevent having to reconfigure your deployed metric senders. When setting DESTINATIONS in the [relay] section, keep in mind your newly-set PICKLE_RECEIVER_PORT in the [cache] section.
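A sketch of that swap (the relay takes over the well-known ports, the cache moves to illustrative alternates, and DESTINATIONS points at the cache's new pickle port):

.. code-block:: none

[cache]
LINE_RECEIVER_PORT = 2103
PICKLE_RECEIVER_PORT = 2104

[relay]
LINE_RECEIVER_PORT = 2003
PICKLE_RECEIVER_PORT = 2004
DESTINATIONS = 127.0.0.1:2104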

storage-schemas.conf

This configuration file details retention rates for storing metrics. It matches metric paths to patterns, and tells whisper what frequency and history of datapoints to store.

Important notes before continuing:

  • There can be many sections in this file.
  • The sections are applied in order from top (first) to bottom (last).
  • The patterns are regular expressions, as opposed to the wildcards used in the URL API.
  • The first pattern that matches the metric name is used.
  • This retention is set at the time the first metric is sent.
  • Changing this file will not affect already-created .wsp files. Use whisper-resize.py to change those.

A given rule is made up of 3 lines:

  • A name, specified inside square brackets.
  • A regex, specified after "pattern="
  • A retention rate line, specified after "retentions="

The retentions line can specify multiple retentions. Each retention of frequency:history is separated by a comma.

Frequencies and histories are specified using the following suffixes:

  • s - second
  • m - minute
  • h - hour
  • d - day
  • y - year

Here's a simple, single retention example:

.. code-block:: none

[garbage_collection]
pattern = garbageCollections$
retentions = 10s:14d

The name [garbage_collection] is mainly for documentation purposes, and will show up in creates.log when metrics matching this section are created.

The regular expression pattern will match any metric that ends with garbageCollections. For example, com.acmeCorp.instance01.jvm.memory.garbageCollections would match, but com.acmeCorp.instance01.jvm.memory.garbageCollections.full would not.

The retention line is saying that each datapoint represents 10 seconds, and we want to keep enough datapoints so that they add up to 14 days of data.

Here's a more complicated example with multiple retention rates:

.. code-block:: none

[apache_busyWorkers]
pattern = ^servers\.www.*\.workers\.busyWorkers$
retentions = 15s:7d,1m:21d,15m:5y

In this example, imagine that your metric scheme is servers.<servername>.<metrics>. The pattern would match server names that start with 'www', followed by anything, that are sending metrics that end in '.workers.busyWorkers' (note the escaped '.' characters).

Additionally, this example uses multiple retentions. The general rule is to specify retentions from most-precise:least-history to least-precise:most-history -- whisper will properly downsample metrics (averaging by default) as thresholds for retention are crossed.

By using multiple retentions, you can store long histories of metrics while saving on disk space and I/O. Because whisper averages (by default) as it downsamples, one is able to determine totals of metrics by reversing the averaging process later on down the road.

Example: You store the number of sales per minute for 1 year, and the sales per hour for 5 years after that. You need to know the total sales for January 1st of the year before. You can query whisper for the raw data, and you'll get 24 datapoints, one for each hour. They will most likely be floating point numbers. You can take each datapoint, multiply by 60 (the ratio of high-precision to low-precision datapoints) and still get the total sales per hour.
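A quick Python sketch of that arithmetic (the hourly averages are made-up values):

.. code-block:: python

# Hourly datapoints fetched from whisper; each is the *average*
# of sixty 1-minute sales counts (values are hypothetical).
hourly_averages = [2.5, 3.0, 1.25]

# Reverse the averaging: 60 high-precision points per low-precision point.
hourly_totals = [avg * 60 for avg in hourly_averages]

print(hourly_totals)       # [150.0, 180.0, 75.0]
print(sum(hourly_totals))  # 405.0 total sales across these hours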

Additionally, whisper supports a legacy retention specification for backwards compatibility reasons - seconds-per-datapoint:count-of-datapoints

.. code-block:: none

retentions = 60:1440

60 represents the number of seconds per datapoint, and 1440 represents the number of datapoints to store. This required some unnecessarily complicated math, so although it's valid, it's not recommended.

storage-aggregation.conf

This file defines how to aggregate data to lower-precision retentions. The format is similar to storage-schemas.conf. Important notes before continuing:

  • This file is optional. If it is not present, defaults will be used.
  • There is no retentions line. Instead, there are xFilesFactor and/or aggregationMethod lines.
  • xFilesFactor should be a floating point number between 0 and 1, and specifies what fraction of the previous retention level's slots must have non-null values in order to aggregate to a non-null value. The default is 0.5.
  • aggregationMethod specifies the function used to aggregate values for the next retention level. Legal methods are average, sum, min, max, and last. The default is average.
  • These are set at the time the first metric is sent.
  • Changing this file will not affect .wsp files already created on disk. Use whisper-set-aggregation-method.py to change those.

Here's an example:

.. code-block:: none

[all_min]
pattern = \.min$
xFilesFactor = 0.1
aggregationMethod = min

The pattern above will match any metric that ends with .min.

The xFilesFactor line is saying that a minimum of 10% of the slots in the previous retention level must have values for the next retention level to contain an aggregate. The aggregationMethod line is saying that the aggregate function to use is min.

If either xFilesFactor or aggregationMethod is left out, the default value will be used.

The aggregation parameters are kept separate from the retention parameters because the former depends on the type of data being collected and the latter depends on volume and importance.

relay-rules.conf

Relay rules are used to send certain metrics to a certain backend. This is handled by the carbon-relay system. It must be running for relaying to work. You can use a regular expression to select the metrics and define the servers to which they should go with the servers line.

Example:

.. code-block:: none

[example]
pattern = ^mydata\.foo\..+
servers = 10.1.2.3, 10.1.2.4:2004, myserver.mydomain.com

You must define at least one section as the default.

aggregation-rules.conf

Aggregation rules allow you to add several metrics together as they come in, reducing the need to sum() many metrics in every URL. Note that unlike some other config files, any time this file is modified it will take effect automatically. This requires the carbon-aggregator service to be running.

The form of each line in this file should be as follows:

.. code-block:: none

output_template (frequency) = method input_pattern

This will capture any received metrics that match 'input_pattern' for calculating an aggregate metric. The calculation will occur every 'frequency' seconds and the 'method' can specify 'sum' or 'avg'. The name of the aggregate metric will be derived from 'output_template' by filling in any captured fields from 'input_pattern'. Any metric arriving at carbon-aggregator proceeds to its output untouched unless it is overridden by some rule.

For example, if your metric naming scheme is:

.. code-block:: none

<env>.applications.<app>.<server>.<metric>

You could configure some aggregations like so:

.. code-block:: none

<env>.applications.<app>.all.requests (60) = sum <env>.applications.<app>.*.requests
<env>.applications.<app>.all.latency (60) = avg <env>.applications.<app>.*.latency

As an example, if the following metrics are received:

.. code-block:: none

prod.applications.apache.www01.requests
prod.applications.apache.www02.requests
prod.applications.apache.www03.requests
prod.applications.apache.www04.requests
prod.applications.apache.www05.requests

They would all go into the same aggregation buffer and after 60 seconds the aggregate metric 'prod.applications.apache.all.requests' would be calculated by summing their values.

Another common use pattern of carbon-aggregator is to aggregate several datapoints of the same metric. This can come in handy when the same metric arrives from several hosts, or when you are bound to send data more frequently than your shortest retention.

whitelist and blacklist

The whitelist functionality allows any of the carbon daemons to only accept metrics that are explicitly whitelisted and/or to reject blacklisted metrics. The functionality can be enabled in carbon.conf with the USE_WHITELIST flag. This can be useful when too many metrics are being sent to a Graphite instance or when there are metric senders sending useless or invalid metrics.

GRAPHITE_CONF_DIR is searched for whitelist.conf and blacklist.conf. Each file contains one regular expression per line to match against metric names. If the whitelist configuration is missing or empty, all metrics will be passed through by default.
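For example, a whitelist.conf accepting only carbon's own metrics and a collectd hierarchy might look like this (the patterns are illustrative):

.. code-block:: none

# whitelist.conf
^carbon\.
^collectd\.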

Graphite-web's local_settings.py

Graphite-web uses the convention of importing a local_settings.py file from the webapp settings.py module. This is where Graphite-web's runtime configuration is loaded from.

Config File Location

local_settings.py is generally located within the main graphite module where the webapp's code lives. In the :ref:default installation layout <default-installation-layout> this is /opt/graphite/webapp/graphite/local_settings.py. Alternative locations can be used by symlinking to this path or by ensuring the module can be found within the Python module search path.

General Settings

TIME_ZONE Default: America/Chicago

Set your local timezone. The timezone is specified using zoneinfo names <http://en.wikipedia.org/wiki/Zoneinfo#Names_of_time_zones>_.

DOCUMENTATION_URL Default: http://graphite.readthedocs.org/

Overrides the Documentation link used in the header of the Graphite Composer

LOG_RENDERING_PERFORMANCE Default: False

Triggers the creation of rendering.log which logs timings for calls to the :doc:render_api

LOG_CACHE_PERFORMANCE Default: False

Triggers the creation of cache.log which logs timings for remote calls to carbon-cache as well as Request Cache (memcached) hits and misses.

LOG_METRIC_ACCESS Default: False

Triggers the creation of metricaccess.log which logs access to Whisper and RRD data files

DEBUG = True Default: False

Enables generation of detailed Django error pages. See Django's documentation <https://docs.djangoproject.com/en/dev/ref/settings/#debug>_ for details

FLUSHRRDCACHED Default: <unset>

If set, executes rrdtool flushcached before fetching data from RRD files. Set to the address or socket of the rrdcached daemon. Ex: unix:/var/run/rrdcached.sock

MEMCACHE_HOSTS Default: []

If set, enables the caching of calculated targets (including applied functions) and rendered images. If running a cluster of Graphite webapps, each webapp should have the exact same values for this setting to prevent unneeded cache misses.

Set this to the list of memcached hosts. Ex: ['10.10.10.10:11211', '10.10.10.11:11211', '10.10.10.12:11211']

DEFAULT_CACHE_DURATION Default: 60

Default expiration of cached data and images.

Filesystem Paths

These settings configure the location of Graphite-web's additional configuration files, static content, and data. These need to be adjusted if Graphite-web is installed outside of the :ref:default installation layout <default-installation-layout>.

GRAPHITE_ROOT Default: /opt/graphite The base directory for the Graphite install. This setting is used to shift the Graphite install from the default base directory while keeping the :ref:default layout <default-installation-layout>. The paths derived from this setting can be individually overridden as well

CONF_DIR Default: GRAPHITE_ROOT/conf The location of additional Graphite-web configuration files

STORAGE_DIR Default: GRAPHITE_ROOT/storage The base directory from which WHISPER_DIR, RRD_DIR, LOG_DIR, and INDEX_FILE default paths are referenced

CONTENT_DIR Default: See below The location of Graphite-web's static content. This defaults to content/ two parent directories up from settings.py. In the :ref:default layout <default-installation-layout> this is /opt/graphite/webapp/content

DASHBOARD_CONF Default: CONF_DIR/dashboard.conf The location of the Graphite-web Dashboard configuration

GRAPHTEMPLATES_CONF Default: CONF_DIR/graphTemplates.conf The location of the Graphite-web Graph Template configuration

WHISPER_DIR Default: /opt/graphite/storage/whisper The location of Whisper data files

RRD_DIR Default: /opt/graphite/storage/rrd The location of RRD data files

STANDARD_DIRS Default: [WHISPER_DIR, RRD_DIR] The list of directories searched for data files. By default, this is the value of WHISPER_DIR and RRD_DIR (if rrd support is detected). If this setting is defined, the WHISPER_DIR and RRD_DIR settings have no effect.

LOG_DIR Default: STORAGE_DIR/log/webapp The directory to write Graphite-web's log files. This directory must be writable by the user running the Graphite-web webapp

INDEX_FILE Default: /opt/graphite/storage/index The location of the search index file. This file is generated by the build-index.sh script and must be writable by the user running the Graphite-web webapp

Email Configuration

These settings configure Django's email functionality which is used for emailing rendered graphs. See the Django documentation <https://docs.djangoproject.com/en/dev/topics/email/>__ for further detail on these settings

EMAIL_BACKEND Default: django.core.mail.backends.smtp.EmailBackend Set to django.core.mail.backends.dummy.EmailBackend to drop emails on the floor and effectively disable email features.

EMAIL_HOST Default: localhost

EMAIL_PORT Default: 25

EMAIL_HOST_USER Default: ''

EMAIL_HOST_PASSWORD Default: ''

EMAIL_USE_TLS Default: False

Authentication Configuration

These settings insert additional backends to the AUTHENTICATION_BACKENDS <https://docs.djangoproject.com/en/dev/ref/settings/#authentication-backends>_ and MIDDLEWARE_CLASSES <https://docs.djangoproject.com/en/dev/ref/settings/#std:setting-MIDDLEWARE_CLASSES>_ settings. Additional authentication schemes are possible by manipulating these lists directly.

LDAP

These settings configure a custom LDAP authentication backend provided by Graphite. Additional settings to the ones below can be configured by setting the LDAP module's global options using ldap.set_option. See the module documentation <http://python-ldap.org/>_ for more details.

.. code-block:: python

# SSL Example
import ldap
ldap.set_option(ldap.OPT_X_TLS_REQUIRE_CERT, ldap.OPT_X_TLS_ALLOW)
ldap.set_option(ldap.OPT_X_TLS_CACERTDIR, "/etc/ssl/ca")
ldap.set_option(ldap.OPT_X_TLS_CERTFILE, "/etc/ssl/mycert.pem")
ldap.set_option(ldap.OPT_X_TLS_KEYFILE, "/etc/ssl/mykey.pem")

USE_LDAP_AUTH Default: False

LDAP_SERVER Default: ''

Set the LDAP server here or alternatively in LDAP_URI

LDAP_PORT Default: 389

Set the LDAP server port here or alternatively in LDAP_URI

LDAP_URI Default: None

Sets the LDAP server URI. E.g. ldaps://ldap.mycompany.com:636

LDAP_SEARCH_BASE Default: ''

Sets the LDAP search base. E.g. OU=users,DC=mycompany,DC=com

LDAP_BASE_USER Default: ''

Sets the base LDAP user to bind to the server with. E.g. CN=some_readonly_account,DC=mycompany,DC=com

LDAP_BASE_PASS Default: ''

Sets the password of the base LDAP user to bind to the server with.

LDAP_USER_QUERY Default: ''

Sets the LDAP query to return a user object, where %s is substituted with the user id. E.g. (username=%s) or (sAMAccountName=%s) (Active Directory)

Other Authentications

USE_REMOTE_USER_AUTHENTICATION Default: False

Enables the use of the Django RemoteUserBackend authentication backend. See the Django documentation <https://docs.djangoproject.com/en/dev/howto/auth-remote-user/>__ for further details

LOGIN_URL Default: /account/login

Modifies the URL linked in the Login link in the Composer interface. This is useful for directing users to an external authentication link such as for Remote User authentication or a backend such as django_openid_auth <https://launchpad.net/django-openid-auth>_

Dashboard Authorization Configuration

These settings control who is allowed to save and delete dashboards. By default anyone can perform these actions, but by setting DASHBOARD_REQUIRE_AUTHENTICATION, users must at least be logged in to do so. The other two settings allow further restriction of who is able to perform these actions. Users who are not suitably authorized will still be able to use and change dashboards, but will not be able to save changes or delete dashboards.

DASHBOARD_REQUIRE_AUTHENTICATION Default: False

If set to True, dashboards can only be saved and deleted by logged in users.

DASHBOARD_REQUIRE_EDIT_GROUP Default: None

If set to the name of a user group, dashboards can only be saved and deleted by logged-in users who are members of this group. Groups can be set in the Django Admin app, or in LDAP.

Note that DASHBOARD_REQUIRE_AUTHENTICATION must be set to true - if not, this setting is ignored.

DASHBOARD_REQUIRE_PERMISSIONS Default: False

If set to True, dashboards can only be saved or deleted by users having the appropriate (change or delete) permission (as set in the Django Admin app). These permissions can be set at the user or group level. Note that Django's 'add' permission is not used.

Note that DASHBOARD_REQUIRE_AUTHENTICATION must be set to true - if not, this setting is ignored.

Database Configuration

The following configures the Django database settings. Graphite uses the database for storing user profiles, dashboards, and for the Events functionality. Graphite uses an SQLite database file located at STORAGE_DIR/graphite.db by default. If running multiple Graphite-web instances, a database such as PostgreSQL or MySQL is required so that all instances may share the same data source.

.. note :: As of Django 1.2, the database configuration is specified by the DATABASES dictionary instead of the old DATABASE_* format. Users must use the new specification to have a working database.

See the Django documentation <https://docs.djangoproject.com/en/dev/ref/settings/#databases>_ for full documentation of the DATABASES setting.
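For example, a PostgreSQL configuration might look like the following sketch (database name, user, and password are placeholders):

.. code-block:: python

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'graphite',
        'USER': 'graphite',
        'PASSWORD': 'secret',
        'HOST': '127.0.0.1',
        'PORT': '5432',
    }
}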

.. note :: Remember, setting up a new database requires running PYTHONPATH=$GRAPHITE_ROOT/webapp django-admin.py syncdb --settings=graphite.settings to create the initial schema.

Cluster Configuration

These settings configure the Graphite webapp for clustered use. When CLUSTER_SERVERS is set, metric browse and render requests will cause the webapp to query other webapps in CLUSTER_SERVERS for matching metrics. Graphite will use only one successfully matching response to render data. This means that metrics may only live on a single server in the cluster unless the data is consistent on both sources (e.g. with shared SAN storage). Duplicate metric data existing in multiple locations will not be combined.

CLUSTER_SERVERS Default: []

The list of IP addresses and ports of remote Graphite webapps in a cluster. Each of these servers should have local access to metric data to serve. The first server to return a match for a query will be used to serve that data. Ex: ["10.0.2.2:80", "10.0.2.3:80"]

REMOTE_STORE_FETCH_TIMEOUT Default: 6

Timeout for remote data fetches in seconds

REMOTE_STORE_FIND_TIMEOUT Default: 2.5

Timeout for remote find requests (metric browsing) in seconds

REMOTE_STORE_RETRY_DELAY Default: 60

Time in seconds to blacklist a webapp after a timed-out request

REMOTE_FIND_CACHE_DURATION Default: 300

Time to cache remote metric find results in seconds

REMOTE_RENDERING Default: False

Enable remote rendering of images and data (JSON, et al.) on remote Graphite webapps. If this is enabled, RENDERING_HOSTS must be configured below

RENDERING_HOSTS Default: []

List of IP addresses and ports of remote Graphite webapps used to perform rendering. Each webapp must have access to the same data as the Graphite webapp which uses this setting either through shared local storage or via CLUSTER_SERVERS. Ex: ["10.0.2.4:80", "10.0.2.5:80"]

REMOTE_RENDER_CONNECT_TIMEOUT Default: 1.0

Connection timeout for remote rendering requests in seconds

CARBONLINK_HOSTS Default: [127.0.0.1:7002]

If multiple carbon-caches are running on this machine, each should be listed here so that the Graphite webapp may query the caches for data that has not yet been persisted. Remote carbon-cache instances in a multi-host clustered setup should not be listed here. Instance names should be listed as applicable. Ex: ['127.0.0.1:7002:a','127.0.0.1:7102:b', '127.0.0.1:7202:c']

CARBONLINK_TIMEOUT Default: 1.0

Timeout for carbon-cache cache queries in seconds

Additional Django Settings

The local_settings.py.example shipped with Graphite-web imports app_settings.py into the namespace to allow further customization of Django. This allows the setting or customization of standard Django settings <https://docs.djangoproject.com/en/dev/ref/settings/>_ and the installation and configuration of additional middleware <https://docs.djangoproject.com/en/dev/topics/http/middleware/>_. To manipulate these settings, ensure app_settings.py is imported as such:

.. code-block:: python

from graphite.app_settings import *

The most common settings to manipulate are INSTALLED_APPS, MIDDLEWARE_CLASSES, and AUTHENTICATION_BACKENDS
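For example, to append an app and a middleware class after the import (myapp and its middleware are hypothetical names):

.. code-block:: python

from graphite.app_settings import *

# Django settings defined as tuples are extended by concatenation.
INSTALLED_APPS = INSTALLED_APPS + ('myapp',)
MIDDLEWARE_CLASSES = MIDDLEWARE_CLASSES + ('myapp.middleware.MyMiddleware',)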


Working on Graphite-web

Graphite-web accepts contributions on GitHub <https://github.com/graphite-project/graphite-web>_, in the form of issues or pull requests. If you're comfortable with Python, here is how to get started.

First, keep in mind that Graphite-web supports Python versions 2.5 to 2.7 and Django versions 1.4 and above.

Setting up a development environment

The recommended workflow is to use virtualenv_ / virtualenvwrapper_ to isolate projects from each other. This document uses virtualenv as the lowest common denominator.

.. _virtualenv: http://www.virtualenv.org/
.. _virtualenvwrapper: http://virtualenvwrapper.readthedocs.org/

Create a virtualenv at the root of your graphite-web repository::

virtualenv env
source env/bin/activate

Install the required dependencies::

pip install -r requirements.txt

Create the default storage directories::

mkdir -p storage/{ceres,whisper,log/webapp}

Then you should be able to run the graphite development server::

cd webapp
./manage.py runserver

Running the tests

To run the tests for the Python and Django versions of your virtualenv::

cd webapp
./manage.py test --settings=tests.settings

If you want to run the tests for all combinations of Python and Django versions, you can use the tox_ tool.

.. _tox: http://tox.readthedocs.org/

::

pip install tox
tox

This will run the tests for all configurations declared in the tox.ini file at the root of the repository.

You can see all the configurations available by running::

tox -l

You can run a single configuration with::

tox -e <configuration>

Note that you need the corresponding Python version on your system. Most systems only provide one or two different Python versions; it is up to you to install the others.

Writing tests

Pull requests for new features or bugfixes should come with tests to demonstrate that your feature or fix actually works. Tests are located in the webapp/tests directory.

When writing a new test, look at the existing files to see if your test would fit in one. Otherwise simply create a new file named test_<whatever>.py with the following content:

.. code-block:: python

from django.test import TestCase

class WhateverTest(TestCase):
    def test_something(self):
        self.assertEqual(1, 2 / 2)

You can read Django's testing docs <https://docs.djangoproject.com/en/stable/topics/testing/>_ for more information on django.test.TestCase and how tests work with Django.

Feeding In Your Data

Getting your data into Graphite is very flexible. There are three main methods for sending data to Graphite: Plaintext, Pickle, and AMQP.

It's worth noting that data sent to Graphite is actually sent to the :doc:Carbon and Carbon-Relay </carbon-daemons>, which then manage the data. The Graphite web interface reads this data back out, either from cache or straight off disk.

Choosing the right transfer method for you is dependent on how you want to build your application or script to send data:

  • There are some tools and APIs which can help you get your data into Carbon.

  • For a singular script, or for test data, the plaintext protocol is the most straightforward method.

  • For sending large amounts of data, you'll want to batch this data up and send it to Carbon's pickle receiver.

  • Finally, Carbon can listen to a message bus, via AMQP.

Existing tools and APIs

  • :doc:client daemons and tools </tools>
  • :doc:client APIs </client-apis>

The plaintext protocol

The plaintext protocol is the most straightforward protocol supported by Carbon.

The data sent must be in the following format: <metric path> <metric value> <metric timestamp>. Carbon will then help translate this line of text into a metric that the web interface and Whisper understand.

On Unix, the nc program can be used to create a socket and send data to Carbon (by default, 'plaintext' runs on port 2003):

.. code-block:: none

PORT=2003
SERVER=graphite.your.org
echo "local.random.diceroll 4 `date +%s`" | nc -q0 ${SERVER} ${PORT}

The -q0 parameter instructs nc to close the socket once data is sent. Without this option, some nc versions would keep the connection open.
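The same send from Python, as a minimal sketch (the server name is a placeholder):

.. code-block:: python

import socket
import time

# One metric per line: <metric path> <metric value> <metric timestamp>
line = 'local.random.diceroll 4 %d\n' % time.time()

sock = socket.create_connection(('graphite.your.org', 2003))
sock.sendall(line)
sock.close()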

The pickle protocol

The pickle protocol is a much more efficient take on the plaintext protocol, and supports sending batches of metrics to Carbon in one go.

The general idea is that the pickled data forms a list of multi-level tuples:

.. code-block:: none

[(path, (timestamp, value)), ...]

Once you've formed a list of sufficient size (don't go too big!), send the data over a socket to Carbon's pickle receiver (by default, port 2004). You'll need to pack your pickled data into a packet containing a simple header:

.. code-block:: python

import pickle
import struct

payload = pickle.dumps(listOfMetricTuples)
header = struct.pack("!L", len(payload))
message = header + payload

You would then send the message object through a network socket.
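Putting the pieces together, a minimal sketch of a pickle sender (the server name and metric values are placeholders):

.. code-block:: python

import pickle
import socket
import struct
import time

now = int(time.time())
listOfMetricTuples = [
    ('local.random.diceroll', (now, 4)),
    ('local.random.diceroll', (now + 60, 2)),
]

payload = pickle.dumps(listOfMetricTuples)
header = struct.pack("!L", len(payload))  # 4-byte big-endian length prefix

sock = socket.create_connection(('graphite.your.org', 2004))
sock.sendall(header + payload)
sock.close()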

Using AMQP

When AMQP_METRIC_NAME_IN_BODY is set to True in your carbon.conf file, the message body should be in the same format as the plaintext protocol, e.g. "local.random.diceroll 4 `date +%s`". When AMQP_METRIC_NAME_IN_BODY is set to False, you should omit the metric name ('local.random.diceroll') from the body.

Installing From Pip

Versioned Graphite releases can be installed via pip <http://pypi.python.org/pypi/pip>_. When installing with pip, installation of dependencies will be attempted automatically.

Installing in the Default Location

To install Graphite in the :ref:default location <default-installation-layout>, /opt/graphite/, simply execute as root:

.. code-block:: none

pip install https://github.com/graphite-project/ceres/tarball/master
pip install whisper
pip install carbon
pip install graphite-web

.. note::

On RedHat-based systems using the python-pip package, the pip executable is named pip-python

Installing Carbon in a Custom Location

Installation of Carbon in a custom location with pip is similar to doing so from a source install. Arguments to the underlying setup.py controlling installation location can be passed through pip with the --install-option option.

See :ref:carbon-custom-location-source for details of locations and available arguments

For example, to install everything in /srv/graphite/:

.. code-block:: none

pip install carbon --install-option="--prefix=/srv/graphite" --install-option="--install-lib=/srv/graphite/lib"

To install Carbon into the system-wide site-packages directory with scripts in /usr/bin and storage and configuration in /var/lib/graphite:

.. code-block:: none

pip install carbon --install-option="--install-scripts=/usr/bin" --install-option="--install-lib=/usr/lib/python2.6/site-packages" --install-option="--install-data=/var/lib/graphite"

Installing Graphite-web in a Custom Location

Installation of Graphite-web in a custom location with pip is similar to doing so from a source install. Arguments to the underlying setup.py controlling installation location can be passed through pip with the --install-option option.

See :ref:graphite-web-custom-location-source for details on default locations and available arguments

For example, to install everything in /srv/graphite/:

.. code-block:: none

pip install graphite-web --install-option="--prefix=/srv/graphite" --install-option="--install-lib=/srv/graphite/webapp"

To install the Graphite-web code into the system-wide site-packages directory with scripts in /usr/bin and storage, configuration, and content in /var/lib/graphite:

.. code-block:: none

pip install graphite-web --install-option="--install-scripts=/usr/bin" --install-option="--install-lib=/usr/lib/python2.6/site-packages" --install-option="--install-data=/var/lib/graphite"

Installing From Source

The latest source tarballs for Graphite-web, Carbon, and Whisper may be fetched from the Graphite project download page_ or the latest development branches may be cloned from the Github project page_:

  • Graphite-web: git clone https://github.com/graphite-project/graphite-web.git
  • Carbon: git clone https://github.com/graphite-project/carbon.git
  • Whisper: git clone https://github.com/graphite-project/whisper.git
  • Ceres: git clone https://github.com/graphite-project/ceres.git

.. note::

There is currently no tarball available for Ceres; it must be cloned from the Github project page_

Installing in the Default Location

To install Graphite in the :ref:default location <default-installation-layout>, /opt/graphite/, simply execute python setup.py install as root in each of the project directories for Graphite-web, Carbon, Whisper, and Ceres.

.. _carbon-custom-location-source:

Installing Carbon in a Custom Location

Carbon's setup.py installer is configured to use a prefix of /opt/graphite and an install-lib of /opt/graphite/lib. Carbon's lifecycle wrapper scripts and utilities are installed in bin, configuration within conf, and stored data in storage all within prefix. These may be overridden by passing parameters to the setup.py install command.

The following parameters influence the install location:

  • --prefix

    Location to place the bin/ and storage/ and conf/ directories (defaults to /opt/graphite/)

  • --install-lib

    Location to install Python modules (default: /opt/graphite/lib)

  • --install-data

    Location to place the storage and conf directories (default: value of prefix)

  • --install-scripts

    Location to place the scripts (default: bin/ inside of prefix)

For example, to install everything in /srv/graphite/:

.. code-block:: none

python setup.py install --prefix=/srv/graphite --install-lib=/srv/graphite/lib

To install Carbon into the system-wide site-packages directory with scripts in /usr/bin and storage and configuration in /var/lib/graphite:

.. code-block:: none

python setup.py install --install-scripts=/usr/bin --install-lib=/usr/lib/python2.6/site-packages --install-data=/var/lib/graphite

.. _graphite-web-custom-location-source:

Installing Graphite-web in a Custom Location

Graphite-web's setup.py installer is configured to use a prefix of /opt/graphite and an install-lib of /opt/graphite/webapp. Utilities are installed in bin, and configuration in conf within the prefix. These may be overridden by passing parameters to setup.py install

The following parameters influence the install location:

  • --prefix

    Location to place the bin/ and conf/ directories (defaults to /opt/graphite/)

  • --install-lib

    Location to install Python modules (default: /opt/graphite/webapp)

  • --install-data

    Location to place the webapp/content and conf directories (default: value of prefix)

  • --install-scripts

    Location to place scripts (default: bin/ inside of prefix)

For example, to install everything in /srv/graphite/:

.. code-block:: none

python setup.py install --prefix=/srv/graphite --install-lib=/srv/graphite/webapp

To install the Graphite-web code into the system-wide site-packages directory with scripts in /usr/bin and storage, configuration, and content in /var/lib/graphite:

.. code-block:: none

python setup.py install --install-scripts=/usr/bin --install-lib=/usr/lib/python2.6/site-packages --install-data=/var/lib/graphite

.. _Github project page: http://github.com/graphite-project
.. _download page: https://launchpad.net/graphite/+download

Installing in Virtualenv

Virtualenv_ provides an isolated Python environment to run Graphite in.

Installing in the Default Location

To install Graphite in the :ref:default location <default-installation-layout>, /opt/graphite/, create a virtualenv in /opt/graphite and activate it:

.. code-block:: none

virtualenv /opt/graphite
source /opt/graphite/bin/activate

Once the virtualenv is activated, Graphite and Carbon can be installed :doc:from source <install-source> or :doc:via pip <install-pip>. Note that dependencies will need to be installed while the virtualenv is activated unless --system-site-packages <http://www.virtualenv.org/en/latest/index.html#the-system-site-packages-option>_ is specified at virtualenv creation time.

Installing in a Custom Location

To install from source activate the virtualenv and see the instructions for :ref:graphite-web <graphite-web-custom-location-source> and :ref:carbon <carbon-custom-location-source>

Running Carbon Within Virtualenv

Carbon may be run within Virtualenv by activating virtualenv_ before Carbon is started

Running Graphite-web Within Virtualenv

Running Django's django-admin.py within a virtualenv requires using the full path of the virtualenv::

/path/to/env/bin/django-admin.py <command> --settings=graphite.settings

The method of running Graphite-web within Virtualenv depends on the WSGI server used:

Apache mod_wsgi

.. note::

The version of Python used to compile mod_wsgi must match the Python installed in the virtualenv (generally the system Python)

To the Apache mod_wsgi_ config, add the root of the virtualenv as WSGIPythonHome, /opt/graphite in this example:

.. code-block:: none

WSGIPythonHome /opt/graphite

and add the virtualenv's python site-packages to the graphite.wsgi file, python 2.6 in /opt/graphite in this example:

.. code-block:: none

site.addsitedir('/opt/graphite/lib/python2.6/site-packages')

See the mod_wsgi documentation on Virtual Environments <http://code.google.com/p/modwsgi/wiki/VirtualEnvironments>_ for more details.

Gunicorn

Ensure Gunicorn_ is installed in the activated virtualenv and execute as normal. If gunicorn is installed system-wide, it may be necessary to execute it from the virtualenv's bin path.

uWSGI

Execute uWSGI_ using the -H option to specify the virtualenv root. See the uWSGI documentation on virtualenv <http://projects.unbit.it/uwsgi/wiki/VirtualEnv>_ for more details.

.. _activating virtualenv: http://www.virtualenv.org/en/latest/index.html#activate-script
.. _Gunicorn: http://gunicorn.org/
.. _mod_wsgi: http://code.google.com/p/modwsgi/
.. _uWSGI: http://projects.unbit.it/uwsgi
.. _Virtualenv: http://virtualenv.org/

Installing Graphite

Dependencies

Graphite renders graphs using the Cairo graphics library. This adds dependencies on several graphics-related libraries not typically found on a server. If you're installing from source, you can use the check-dependencies.py script to verify whether the dependencies have been met.

Basic Graphite requirements:

  • Python 2.5 or greater (2.6+ recommended)
  • Pycairo_
  • Django_ 1.0 or greater
  • django-tagging_ 0.3.1 or greater
  • Twisted_ 8.0 or greater (10.0+ recommended)
  • zope-interface_ (often included in Twisted package dependency)
  • fontconfig_ and at least one font package (a system package usually)
  • A WSGI server and web server. Popular choices are:
    • Apache_ with mod_wsgi_
    • gunicorn_ with nginx_
    • uWSGI_ with nginx_

Python 2.5 has extra requirements:

  • simplejson_
  • python-sqlite2_ or another Django-supported database module

Additionally, the Graphite webapp and Carbon require the whisper database library which is part of the Graphite project.

There are also several other dependencies required for additional features:

  • Render caching: memcached_ and python-memcache_
  • LDAP authentication: python-ldap_ (for LDAP authentication support in the webapp)
  • AMQP support: txamqp_
  • RRD support: python-rrdtool_
  • Dependent modules for additional database support (MySQL, PostgreSQL, etc.). See the Django database install_ instructions and the Django database_ documentation for details

.. seealso:: On some systems it is necessary to install fonts for Cairo to use. If the webapp is running but all graphs return as broken images, this may be why.

         * https://answers.launchpad.net/graphite/+question/38833
         * https://answers.launchpad.net/graphite/+question/133390
         * https://answers.launchpad.net/graphite/+question/127623

Fulfilling Dependencies

Most current Linux distributions have all of the requirements available in their base packages. RHEL-based distributions may require the EPEL_ repository for some requirements. Python module dependencies can be installed with pip_ rather than system packages if desired, or if using a Python version that differs from the system default. Some modules (such as Cairo) may require library development headers to be available.
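For instance, a hedged sketch of installing some of the module dependencies listed above with pip (exact package names and versions are assumptions; match them to your target Graphite release):

.. code-block:: none

# versions follow the webapp requirements listed later in this document
pip install 'Django>=1.4' django-tagging==0.3.1 Twisted zope.interface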

.. _default-installation-layout:

Default Installation Layout

Graphite defaults to an installation layout that puts the entire install in its own directory: /opt/graphite

Whisper
^^^^^^^

Whisper is installed in Python's system-wide site-packages directory, with Whisper's utilities installed in the bin dir of the system's default prefix (generally /usr/bin/).

Carbon and Graphite-web
^^^^^^^^^^^^^^^^^^^^^^^

Carbon and Graphite-web are installed in /opt/graphite/ with the following layout:

  • bin/

  • conf/

  • lib/

    Carbon PYTHONPATH

  • storage/

    • log

      Log directory for Carbon and Graphite-web

    • rrd

      Location for RRD files to be read

    • whisper

      Location for Whisper data files to be stored and read

  • webapp/

    Graphite-web PYTHONPATH

    • graphite/

      Location of local_settings.py

    • content/

      Graphite-web static content directory

Installing Graphite

Several installation options exist:

.. toctree::

install-source
install-pip
install-virtualenv

Initial Configuration

.. toctree::

config-webapp
config-carbon

Help! It didn't work!

If you run into any issues with Graphite, please post a question to our Questions forum on Launchpad <https://answers.launchpad.net/graphite>_ or join us on IRC in #graphite on FreeNode.

Post-Install Tasks

:doc:Configuring Carbon </config-carbon> Once you've installed everything you will need to create some basic configuration. Initially none of the config files are created by the installer but example files are provided. Simply copy the .example files and customize.

:doc:Administering Carbon </admin-carbon> Once Carbon is configured, you need to start it up.

:doc:Feeding In Your Data </feeding-carbon> Once it's up and running, you need to feed it some data.

:doc:Configuring The Webapp </config-webapp> With data getting into carbon, you probably want to look at graphs of it. So now we turn our attention to the webapp.

:doc:Administering The Webapp </admin-webapp> Once it's configured you'll need to get it running.

:doc:Using the Composer </composer> Now that the webapp is running, you probably want to learn how to use it.

.. _Apache: http://projects.apache.org/projects/http_server.html
.. _Django: http://www.djangoproject.com/
.. _django-tagging: http://code.google.com/p/django-tagging/
.. _Django database install: https://docs.djangoproject.com/en/dev/topics/install/#get-your-database-running
.. _Django database: https://docs.djangoproject.com/en/dev/ref/databases/
.. _EPEL: http://fedoraproject.org/wiki/EPEL/
.. _fontconfig: http://www.freedesktop.org/wiki/Software/fontconfig/
.. _gunicorn: http://gunicorn.org/
.. _memcached: http://memcached.org/
.. _mod_wsgi: http://code.google.com/p/modwsgi/
.. _nginx: http://nginx.org/
.. _pip: http://www.pip-installer.org/
.. _Pycairo: http://www.cairographics.org/pycairo/
.. _python-ldap: http://www.python-ldap.org/
.. _python-memcache: http://www.tummy.com/Community/software/python-memcached/
.. _python-rrdtool: http://oss.oetiker.ch/rrdtool/prog/rrdpython.en.html
.. _python-sqlite2: http://code.google.com/p/pysqlite/
.. _simplejson: http://pypi.python.org/pypi/simplejson/
.. _Twisted: http://twistedmatrix.com/
.. _txAMQP: https://launchpad.net/txamqp/
.. _uWSGI: http://projects.unbit.it/uwsgi/
.. _zope-interface: http://pypi.python.org/pypi/zope.interface/

Overview

What Graphite is and is not

Graphite does two things:

  1. Store numeric time-series data
  2. Render graphs of this data on demand

What Graphite does not do is collect data for you; however, there are some :doc:tools </tools> out there that know how to send data to Graphite. Even though it often requires a little code, :doc:sending data </feeding-carbon> to Graphite is very simple.

About the project

Graphite is an enterprise-scale monitoring tool that runs well on cheap hardware. It was originally designed and written by Chris Davis_ at Orbitz_ in 2006 as a side project that ultimately grew to be a foundational monitoring tool. In 2008, Orbitz allowed Graphite to be released under the open source Apache 2.0 license. Since then Chris has continued to work on Graphite and has deployed it at other companies including Sears_, where it serves as a pillar of the e-commerce monitoring system. Today many large :doc:companies </who-is-using> use it.

The architecture in a nutshell

Graphite consists of 3 software components:

  1. carbon - a Twisted_ daemon that listens for time-series data
  2. whisper - a simple database library for storing time-series data (similar in design to RRD_)
  3. graphite webapp - A Django_ webapp that renders graphs on-demand using Cairo_

:doc:Feeding in your data </feeding-carbon> is pretty easy; typically most of the effort is in collecting the data to begin with. As you send datapoints to Carbon, they become immediately available for graphing in the webapp. The webapp offers several ways to create and display graphs, including a simple :doc:URL API </render_api> for rendering that makes it easy to embed graphs in other webpages.

.. _Django: http://www.djangoproject.com/
.. _Twisted: http://www.twistedmatrix.com/
.. _Cairo: http://www.cairographics.org/
.. _RRD: http://oss.oetiker.ch/rrdtool/
.. _Chris Davis: mailto:[email protected]
.. _Orbitz: http://www.orbitz.com/
.. _Sears: http://www.sears.com/

Requirements for documentation

Django>=1.4
django-tagging==0.3.1
sphinx
sphinx_rtd_theme
pytz
git+git://github.com/graphite-project/whisper.git#egg=whisper

Alternative storage finders

Built-in finders
^^^^^^^^^^^^^^^^

The default graphite setup consists of:

  • A Whisper database
  • A carbon daemon writing data to the database
  • Graphite-web reading and graphing data from the database

It is possible to switch the storage layer to something other than Whisper to accommodate specific needs. The setup above would become:

  • An alternative database
  • A carbon daemon or alternative daemon for writing to the database
  • A custom storage finder for reading the data in graphite-web

This section aims at documenting the last item: configuring graphite-web to read data from a custom storage layer.

This can be done via the STORAGE_FINDERS setting. This setting is a list of paths to finder implementations. Its default value is:

.. code-block:: python

STORAGE_FINDERS = (
    'graphite.finders.standard.StandardFinder',
)

The default finder reads data from a Whisper database.

An alternative finder for the experimental Ceres database is available:

.. code-block:: python

STORAGE_FINDERS = (
    'graphite.finders.ceres.CeresFinder',
)

The setting supports multiple values, meaning you can read data from both a Whisper database and a Ceres database:

.. code-block:: python

STORAGE_FINDERS = (
    'graphite.finders.standard.StandardFinder',
    'graphite.finders.ceres.CeresFinder',
)

Custom finders
^^^^^^^^^^^^^^

Since STORAGE_FINDERS is a list of arbitrary Python paths, it is relatively easy to write a custom finder if you want to read data from places other than Whisper and Ceres. A finder is a Python class with a find_nodes() method:

.. code-block:: python

class CustomFinder(object):
    def find_nodes(self, query):
        # ...

query is a FindQuery object. find_nodes() is the entry point when browsing the metrics tree. It must yield leaf or branch nodes matching the query:

.. code-block:: python

from graphite.node import LeafNode, BranchNode

class CustomFinder(object):
    def find_nodes(self, query):
        # find some paths matching the query, then yield them
        # ('matches', 'is_branch' and 'is_leaf' stand in for the backend's
        # own lookup logic)
        for path in matches:
            if is_branch(path):
                yield BranchNode(path)
            if is_leaf(path):
                yield LeafNode(path, CustomReader(path))

LeafNode is created with a reader, which is the class responsible for fetching the datapoints for the given path. It is a simple class with two methods: fetch() and get_intervals():

.. code-block:: python

from graphite.intervals import IntervalSet, Interval

class CustomReader(object):
    __slots__ = ('path',)  # __slots__ is recommended to save memory on readers

    def __init__(self, path):
        self.path = path

    def fetch(self, start_time, end_time):
        # fetch data from the backend for the requested range; from_time,
        # to_time and step are placeholders describing the data actually
        # returned, and series is the list of values
        time_info = (from_time, to_time, step)
        return time_info, series

    def get_intervals(self):
        return IntervalSet([Interval(start, end)])

fetch() must return a list of 2 elements: the time info for the data and the datapoints themselves. The time info is a list of 3 items: the start time of the datapoints (in unix time), the end time and the time step (in seconds) between the datapoints.

The datapoints are a list of points found in the database for the requested interval. There must be (end - start) / step points in the dataset even if the database has gaps: gaps can be filled with None values.
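For example, a minimal sketch of a valid return value for a 10-minute window fetched at 60-second resolution (the timestamps are illustrative):

.. code-block:: python

# (end - start) / step == (1409175600 - 1409175000) / 60 == 10 points
time_info = (1409175000, 1409175600, 60)  # (start, end, step)
series = [1.0, 2.0, None, 4.0, 5.0, None, 7.0, 8.0, 9.0, 10.0]  # gaps as None
# fetch() would then return: (time_info, series)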

get_intervals() is a method that hints graphite-web about the time range available for the given metric in the database. It must return an IntervalSet of one or more Interval objects.

Installing custom finders
^^^^^^^^^^^^^^^^^^^^^^^^^

In order for your custom finder to be importable, you need to package it under a namespace of your choice. Python packaging won't be covered here, but you can look at third-party finders for inspiration:

  • Cyanite finder <https://github.com/brutasse/graphite-cyanite>_
  • KairosDB finder

Tools That Work With Graphite

Backstop

Backstop_ is a simple endpoint for submitting metrics to Graphite. It accepts JSON data via HTTP POST and proxies the data to one or more Carbon/Graphite listeners.

Bucky

Bucky_ is a small service implemented in Python for collecting and translating metrics for Graphite. It can currently collect metric data from CollectD daemons and from StatsD clients.

Cabot

Cabot_ is a self-hosted monitoring and alerting server that watches Graphite metrics and can alert on them by phone, SMS, Hipchat or email. It is designed to be deployed to cloud or physical hardware in minutes and configured via a web interface.

collectd

collectd_ is a daemon which collects system performance statistics periodically and provides mechanisms to store the values in a variety of ways, including RRD. To send collectd metrics into carbon/graphite, use collectd's write-graphite_ plugin (available as of 5.1). Other options include:

  • Jordan Sissel's node collectd-to-graphite_ proxy
  • Joe Miller's perl collectd-graphite_ plugin
  • Gregory Szorc's python collectd-carbon_ plugin
  • Paul J. Davis's Bucky_ service

Graphite can also read directly from collectd_'s RRD files. RRD files can simply be added to STORAGE_DIR/rrd (as long as directory names and files do not contain any . characters). For example, collectd's host.name/load/load.rrd can be symlinked to rrd/collectd/host_name/load/load.rrd to graph collectd.host_name.load.load.{short,mid,long}term.
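A hedged sketch of that symlink, assuming collectd keeps its RRD files under /var/lib/collectd/rrd and Graphite uses the default /opt/graphite/storage layout:

.. code-block:: none

# paths are illustrative; adjust to your collectd and Graphite installs
mkdir -p /opt/graphite/storage/rrd/collectd/host_name/load
ln -s /var/lib/collectd/rrd/host.name/load/load.rrd \
      /opt/graphite/storage/rrd/collectd/host_name/load/load.rrd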

Collectl

Collectl_ is a collection tool for system metrics that can be run both interactively and as a daemon and has support for collecting from a broad set of subsystems. Collectl includes a Graphite interface which allows data to easily be fed to Graphite for storage.

Charcoal

Charcoal_ is a simple Sinatra dashboarding frontend for Graphite or any other system status service which can generate images directly from a URL. Charcoal configuration is driven by a YAML config file.

Descartes

Descartes_ is a Sinatra-based dashboard that allows users to correlate multiple metrics in a single chart, review long-term trends across one or more charts, and to collaborate with other users through a combination of shared dashboards and rich layouts.

Diamond

Diamond_ is a Python daemon that collects system metrics and publishes them to Graphite. It is capable of collecting cpu, memory, network, I/O, load and disk metrics. Additionally, it features an API for implementing custom collectors for gathering metrics from almost any source.

Dusk

Dusk_ is a simple dashboard for isolating "hotspots" across a fleet of systems. It incorporates horizon charts using Cubism.js to maximize data visualization in a constrained space.

Evenflow

Evenflow_ is a simple service for submitting sFlow datagrams to Graphite. It accepts sFlow datagrams from multiple network devices and proxies the data to a Carbon listener. Currently only Generic Interface Counters are supported. All other message types are discarded.

Firefly

Firefly_ is a web application aimed at powerful, flexible time series graphing for web developers.

Ganglia

Ganglia_ is a scalable distributed monitoring system for high-performance computing systems such as clusters and Grids. It collects system performance metrics and stores them in RRD, but now there is an add-on <https://github.com/ganglia/ganglia_contrib/tree/master/graphite_integration/>_ that allows Ganglia to send metrics directly to Graphite. Further integration work is underway.

GDash

Gdash_ is a simple Graphite dashboard built using Twitter's Bootstrap, driven by a small DSL.

Giraffe

Giraffe_ is a Graphite real-time dashboard based on Rickshaw_ that requires no server backend. Inspired by Gdash, Tasseo and Graphene_, it mixes features from all three into a slightly different animal.

Grafana

Grafana_ is a general-purpose Graphite dashboard replacement with a feature-rich graph editing and dashboard creation interface. It contains a unique Graphite target parser that enables easy metric and function editing, and offers fast client-side rendering (even over large time ranges) using Flot, with a multitude of display options (multiple Y-axes, bars, lines, points, smart Y-axis formats and much more). A click-and-drag selection rectangle zooms in on any graph.

Graphitus

graphitus_ is a client-side dashboard for Graphite built using Bootstrap and Underscore.js.

Graph-Explorer

Graph-Explorer_ is a Graphite dashboard which uses plugins to add tags and metadata to metrics, plus a query language that lets you filter through them and compose/manipulate graphs on the fly. It also aims for high interactivity using TimeseriesWidget_ and minimal hassle to set up and get running.

Graph-Index

Graph-Index_ is an index of graphs for Diamond_.

Graphene

Graphene_ is a Graphite dashboard toolkit based on D3.js_ and Backbone.js_ which was made to offer a very aesthetic realtime dashboard. Graphene provides a solution capable of displaying thousands upon thousands of datapoints all updated in realtime.

Graphite-Newrelic

Graphite-Newrelic_ - Get your graphite data into New Relic_ via a New Relic Platform plugin.

Graphite-Observer

Graphite-Observer_ is a real-time monitor dashboard for Graphite.

Graphite PowerShell Functions

Graphite PowerShell Functions <https://github.com/MattHodge/Graphite-PowerShell-Functions>_ are a group of functions that can be used to collect Windows Performance Counters and send them over to the Graphite server. The main function can be run as a Windows service, and everything is configurable via an XML file.

Graphite-relay

Graphite-relay_ is a fast Graphite relay written in Scala with the Netty framework.

Graphite-Tattle

Graphite-Tattle_ is a self-service dashboard frontend for Graphite and Ganglia_.

Graphiti

Graphiti_ is a powerful dashboard front end with a focus on ease of access, ease of recovery and ease of tweaking and manipulation.

Graphitoid

Graphitoid_ is an Android app which allows one to browse and display Graphite graphs on an Android device.

Graphios

Graphios_ is a small Python daemon to send Nagios performance data (perfdata) to Graphite.

Graphsky

Graphsky_ is a flexible and easy-to-configure PHP-based dashboard. It uses JSON template files to build graphs and to specify which graphs need to be displayed when, similar to Ganglia-web. Just like Ganglia, it uses a hierarchical Environment/Cluster/Host/Metric structure to display overview graphs and host-specific metrics. It communicates directly with the Graphite API to determine which Environments, Clusters, Hosts and Metrics are currently stored in Graphite.

Grockets

Grockets_ is a node.js application which provides streaming JSON data over HTTP from Graphite.

HoardD

HoardD_ is a Node.js app written in CoffeeScript to send data from servers to Graphite, much like collectd does, but aimed at being easier to expand and having a smaller footprint. It comes by default with basic collectors plus Redis and MySQL metrics, and can be expanded with JavaScript or CoffeeScript.

Host sFlow

Host sFlow_ is an open source implementation of the sFlow protocol (http://www.sflow.org), exporting a standard set of host cpu, memory, disk and network I/O metrics. The sflow2graphite utility converts sFlow to Graphite's plaintext protocol, allowing Graphite to receive sFlow metrics.

hubot-scripts

Hubot_ is a Campfire bot written in Node.js and CoffeeScript. The related hubot-scripts_ project includes a Graphite script which supports searching and displaying saved graphs from the Composer directory in your Campfire rooms.

jmxtrans

jmxtrans_ is a powerful tool that performs JMX queries to collect metrics from Java applications. It requires very little configuration and is capable of sending metric data to several backend applications, including Graphite.

Ledbetter

Ledbetter_ is a simple script for gathering Nagios problem statistics and submitting them to Graphite. It focuses on summary (overall, servicegroup and hostgroup) statistics and writes them to the nagios.problems metrics namespace within Graphite.

Leonardo

Leonardo_ is a Graphite dashboard inspired by Gdash. It's written in Python using the Flask framework. The interface is built with Bootstrap. The graphs and dashboards are configured through YAML files.

Logster

Logster_ is a utility for reading log files and generating metrics in Graphite or Ganglia. It is ideal for visualizing trends of events that are occurring in your application/system/error logs. For example, you might use Logster to graph the number of occurrences of a particular HTTP response code appearing in your web server logs.

Orion

Orion_ is a powerful tool to create, view and manage dashboards for your Graphite data. It allows easy implementation of custom authentication to manage access to the dashboard.

metrics-sampler

metrics-sampler_ is a Java program which regularly queries metrics from a configured set of inputs, selects and renames them using regular expressions, and sends them to a configured set of outputs. It supports JMX and JDBC as inputs and Graphite as an output out of the box.

Pencil

Pencil_ is a monitoring frontend for Graphite. It runs a webserver that dishes out pretty Graphite URLs in interesting and intuitive layouts.

pipe-to-graphite

pipe-to-graphite_ is a small shell script that makes it easy to report the output of any other cli program to Graphite.

rearview

rearview_ is a real-time monitoring framework that sits on top of Graphite's time series data. This allows users to create monitors that both visualize and alert on data as it streams from Graphite. The monitors themselves are simple Ruby scripts which run in a sandbox to provide additional security. Monitors are also configured with a crontab compatible time specification used by the scheduler. Alerts can be sent via email, pagerduty, or campfire.

Rocksteady

Rocksteady_ is a system that ties together Graphite, RabbitMQ, and Esper. Developed by AdMob (who was then bought by Google), this was released by Google as open source (http://google-opensource.blogspot.com/2010/09/get-ready-to-rocksteady.html).

Sensu

Sensu_ is a monitoring framework that can route metrics to Graphite. Servers subscribe to sets of checks, so getting metrics from a new server to Graphite is as simple as installing the Sensu client and subscribing.

Seyren

Seyren_ is an alerting dashboard for Graphite.

Shinken

Shinken_ is a system monitoring solution compatible with Nagios which emphasizes scalability, flexibility, and ease of setup. Shinken provides complete integration with Graphite for processing and display of performance data.

SqlToGraphite

SqlToGraphite_ is an agent for Windows written in .NET that collects metrics using plugins (WMI, SQL Server, Oracle) by polling an endpoint with a SQL query and pushing the results into Graphite. It uses either a local or a centralised configuration over HTTP.

SSC Serv

SSC Serv_ is a Windows service (agent) which periodically publishes system metrics, for example CPU, memory and disk usage. It can store data in Graphite using a naming schema that's identical to that used by collectd.

statsd

statsd_ is a simple daemon for easy stats aggregation, developed by the folks at Etsy. A list of forks and alternative implementations can be found at http://joemiller.me/2011/09/21/list-of-statsd-server-implementations/

Tasseo

Tasseo_ is a lightweight, easily configurable, real-time dashboard for Graphite metrics.

Therry

Therry_ is a simple web service that caches Graphite metrics and exposes an endpoint for dumping or searching against them by substring.

TimeseriesWidget

TimeseriesWidget_ adds timeseries graphs to your webpages/dashboards using a simple API, and focuses on high interactivity and modern features (realtime zooming, datapoint inspection, annotated events, etc.). It supports Graphite, flot, rickshaw and anthracite.

.. _Backbone.js: http://documentcloud.github.com/backbone/
.. _Backstop: https://github.com/obfuscurity/backstop
.. _Bucky: http://pypi.python.org/pypi/bucky
.. _Cabot: https://github.com/arachnys/cabot
.. _Charcoal: https://github.com/cebailey59/charcoal
.. _collectd: http://collectd.org/
.. _collectd-carbon: https://github.com/indygreg/collectd-carbon
.. _collectd-graphite: https://github.com/joemiller/collectd-graphite
.. _collectd-to-graphite: https://github.com/loggly/collectd-to-graphite
.. _Collectl: http://collectl.sourceforge.net/
.. _D3.js: http://mbostock.github.com/d3/
.. _Descartes: https://github.com/obfuscurity/descartes
.. _Diamond: http://opensource.brightcove.com/project/Diamond/
.. _Dusk: https://github.com/obfuscurity/dusk
.. _Esper: http://esper.codehaus.org/
.. _Evenflow: https://github.com/github/evenflow
.. _Firefly: https://github.com/Yelp/firefly
.. _Ganglia: http://ganglia.info/
.. _Gdash: https://github.com/ripienaar/gdash.git
.. _Giraffe: http://kenhub.github.com/giraffe/
.. _Grafana: http://grafana.org/
.. _Graph-Explorer: http://vimeo.github.io/graph-explorer
.. _Graph-Index: https://github.com/douban/graph-index
.. _Graphene: http://jondot.github.com/graphene/
.. _Graphios: https://github.com/shawn-sterling/graphios
.. _Graphite-Tattle: https://github.com/wayfair/Graphite-Tattle
.. _Graphite-Newrelic: https://github.com/gingerlime/graphite-newrelic
.. _Graphite-Observer: https://github.com/huoxy/graphite-observer
.. _Graphite-relay: https://github.com/markchadwick/graphite-relay
.. _Graphiti: https://github.com/paperlesspost/graphiti
.. _graphitus: https://github.com/erezmazor/graphitus
.. _Graphitoid: https://market.android.com/details?id=com.tnc.android.graphite
.. _Graphsky: https://github.com/hyves-org/graphsky
.. _Grockets: https://github.com/disqus/grockets
.. _HoardD: https://github.com/coredump/hoardd
.. _Host sFlow: http://host-sflow.sourceforge.net/
.. _Hubot: https://github.com/github/hubot
.. _hubot-scripts: https://github.com/github/hubot-scripts
.. _jmxtrans: http://code.google.com/p/jmxtrans/
.. _Ledbetter: https://github.com/github/ledbetter
.. _Leonardo: https://github.com/PrFalken/leonardo
.. _Logster: https://github.com/etsy/logster
.. _Orion: https://github.com/gree/Orion
.. _metrics-sampler: https://github.com/dimovelev/metrics-sampler
.. _New Relic: https://newrelic.com/platform
.. _Pencil: https://github.com/fetep/pencil
.. _pipe-to-graphite: https://github.com/iFixit/pipe-to-graphite
.. _RabbitMQ: http://www.rabbitmq.com/
.. _Rickshaw: http://code.shutterstock.com/rickshaw/
.. _rearview: http://github.com/livingsocial/rearview/
.. _Rocksteady: http://code.google.com/p/rocksteady/
.. _Seyren: https://github.com/scobal/seyren
.. _Sensu: http://sensuapp.org/
.. _Shinken: http://www.shinken-monitoring.org/
.. _SqlToGraphite: https://github.com/perryofpeek/SqlToGraphite/
.. _SSC Serv: https://ssc-serv.com/
.. _statsd: https://github.com/etsy/statsd
.. _Tasseo: https://github.com/obfuscurity/tasseo
.. _Therry: https://github.com/obfuscurity/therry
.. _TimeseriesWidget: https://github.com/Dieterbe/timeserieswidget
.. _write-graphite: http://collectd.org/wiki/index.php/Plugin:Write_Graphite

The Whisper Database

Whisper is a fixed-size database, similar in design and purpose to RRD (round-robin-database). It provides fast, reliable storage of numeric data over time. Whisper allows for higher resolution (seconds per point) of recent data to degrade into lower resolutions for long-term retention of historical data.

Data Points

Data points in Whisper are stored on-disk as big-endian double-precision floats. Each value is paired with a timestamp in seconds since the UNIX Epoch (01-01-1970). The data value is parsed by the Python float() <http://docs.python.org/library/functions.html#float>_ function and as such behaves in the same way for special strings such as 'inf'. Maximum and minimum values are determined by the Python interpreter's allowable range for float values which can be found by executing::

python -c 'import sys; print sys.float_info'

Archives: Retention and Precision

Whisper databases contain one or more archives, each with a specific data resolution and retention (defined in number of points or max timestamp age). Archives are ordered from the highest-resolution and shortest retention archive to the lowest-resolution and longest retention period archive.

To support accurate aggregation from higher- to lower-resolution archives, the precision of a lower-resolution archive must be evenly divisible by the precision of the next higher-resolution archive. For example, an archive storing 1 data point every 60 seconds can be followed by a lower-resolution archive storing 1 data point every 300 seconds, because 60 cleanly divides 300. In contrast, a 180-second precision (3 minutes) could not be followed by a 600-second precision (10 minutes), because the ratio of points to be propagated from the first archive to the next would be 3 1/3, and Whisper will not do partial point interpolation.
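As a concrete sketch in :doc:storage-schemas.conf </config-carbon> terms (the section name and pattern are illustrative), the pair below is valid because 60 cleanly divides 300; swapping in 180s followed by 600s would be rejected:

.. code-block:: none

# 60s resolution for 1 day, then 300s resolution for 30 days
[server_metrics]
pattern = ^servers\.
retentions = 60s:1d,300s:30d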

The total retention time of the database is determined by the archive with the longest retention, as the time periods covered by the archives overlap (see Multi-Archive Storage and Retrieval Behavior_). That is, a pair of archives with retentions of 1 month and 1 year will not provide 13 months of data storage, as might be guessed. Instead, it will provide 1 year of storage: the length of its longest archive.

Rollup Aggregation

Whisper databases with more than a single archive need a strategy to collapse multiple data points when data rolls up into a lower-precision archive. By default, an average function is used. The available aggregation methods, one of which can be selected per metric as sketched after this list, are:

  • average
  • sum
  • last
  • max
  • min
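For example, a minimal storage-aggregation.conf sketch (the section name and pattern are illustrative) that rolls up metrics whose names end in .max using the max method:

.. code-block:: none

# keep peaks intact when rolling up into lower-precision archives
[max_values]
pattern = \.max$
xFilesFactor = 0.1
aggregationMethod = max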

Multi-Archive Storage and Retrieval Behavior

When Whisper writes to a database with multiple archives, the incoming data point is written to all archives at once. The data point is written to the highest-resolution archive as-is, and is aggregated by the configured aggregation method (see Rollup Aggregation_) before being placed into each of the higher-retention archives. If you need aggregation of the highest-resolution points, consider using :doc:carbon-aggregator </carbon-daemons> for that purpose.

When data is retrieved (scoped by a time range), the first archive which can satisfy the entire time period is used. If the time period overlaps an archive boundary, the lower-resolution archive will be used. This allows for a simpler behavior while retrieving data as the data's resolution is consistent through an entire returned series.

Disk Space Efficiency

Whisper is somewhat inefficient in its usage of disk space because of certain design choices:

Each data point is stored with its timestamp
    Rather than a timestamp being inferred from its position in the archive, timestamps are stored with each point. The timestamps are used during data retrieval to check the validity of the data point. If a timestamp does not match the expected value for its position relative to the beginning of the requested series, it is known to be out of date and a null value is returned.

Archives overlap time periods
    During the write of a data point, Whisper stores the same data in all archives at once (see Multi-Archive Storage and Retrieval Behavior_). Implied by this behavior is that all archives store from now until each of their retention times. Because of this, lower-resolution archives should be configured with significantly lower resolution and higher retention than their higher-resolution counterparts so as to reduce the overlap.

All time-slots within an archive take up space whether or not a value is stored
    While Whisper allows for reliable storage of irregular updates, it is most space efficient when data points are stored at every update interval. This behavior is a consequence of the fixed-size design of the database and allows the reading and writing of series data to be performed in a single contiguous disk operation (for each archive in a database).
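To make the fixed-size cost concrete: each point occupies 12 bytes (a 4-byte timestamp plus an 8-byte double, per the Point format under Database Format_ below), so an archive's data size is simply its slot count times 12. A minimal sketch of the arithmetic:

.. code-block:: python

# every slot is pre-allocated, whether or not a value is ever stored in it
POINT_SIZE = 12  # 4-byte timestamp (!L) plus 8-byte double (!d)

def archive_data_size(seconds_per_point, retention_seconds):
    """Bytes pre-allocated for one archive's points (header overhead excluded)."""
    points = retention_seconds // seconds_per_point
    return points * POINT_SIZE

print(archive_data_size(60, 86400))     # 60s:1d   -> 1440 points -> 17280 bytes
print(archive_data_size(300, 2592000))  # 300s:30d -> 8640 points -> 103680 bytes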

Differences Between Whisper and RRD

RRD can not take updates to a time-slot prior to its most recent update
    This means that there is no way to back-fill data in an RRD series. Whisper does not have this limitation, which makes importing historical data into Graphite much simpler.

RRD was not designed with irregular updates in mind
    In many cases (depending on configuration), if an update is made to an RRD series but is not followed up by another update soon, the original update will be lost. This makes it less suitable for recording data such as operational metrics (e.g. code pushes).

Whisper requires that metric updates occur at the same interval as the finest resolution storage archive
    This pushes the onus of aggregating values to fit into the finest-precision archive onto the user rather than the database. It also means that updates are written immediately into the finest-precision archive rather than being staged first for aggregation and written later (during a subsequent write operation) as they are in RRD.

Performance

Whisper is fast enough for most purposes. It is slower than RRDtool primarily as a consequence of Whisper being written in Python, while RRDtool is written in C. The speed difference between the two in practice is quite small as much effort was spent to optimize Whisper to be as close to RRDtool's speed as possible. Testing has shown that update operations take anywhere from 2 to 3 times as long as RRDtool, and fetch operations take anywhere from 2 to 5 times as long. In practice the actual difference is measured in hundreds of microseconds (10^-4) which means less than a millisecond difference for simple cases.

Database Format

.. csv-table::
   :delim: |
   :widths: 10, 10, 15, 30, 45

   WhisperFile|Header,Data
    |Header|Metadata,ArchiveInfo+
    | |Metadata|aggregationType,maxRetention,xFilesFactor,archiveCount
    | |ArchiveInfo|Offset,SecondsPerPoint,Points
    |Data|Archive+
    | |Archive|Point+
    | | |Point|timestamp,value

Data types in Python's struct format <http://docs.python.org/library/struct.html#format-strings>_:

.. csv-table::
   :delim: |

   Metadata|!2LfL
   ArchiveInfo|!3L
   Point|!Ld
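For example, a minimal sketch (not part of the Whisper API) that uses these formats to read the header of an existing .wsp file; the file path is an assumption:

.. code-block:: python

import struct

WSP_PATH = '/opt/graphite/storage/whisper/example.wsp'  # hypothetical file

with open(WSP_PATH, 'rb') as fh:
    # Metadata: aggregationType, maxRetention, xFilesFactor, archiveCount
    meta_fmt = '!2LfL'
    agg_type, max_retention, xff, archive_count = struct.unpack(
        meta_fmt, fh.read(struct.calcsize(meta_fmt)))
    print('aggregationType=%d maxRetention=%d xFilesFactor=%g archives=%d'
          % (agg_type, max_retention, xff, archive_count))

    # ArchiveInfo: Offset, SecondsPerPoint, Points (one record per archive)
    info_fmt = '!3L'
    for _ in range(archive_count):
        offset, spp, points = struct.unpack(
            info_fmt, fh.read(struct.calcsize(info_fmt)))
        print('offset=%d secondsPerPoint=%d points=%d' % (offset, spp, points))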
