
StatsD and CollectD adapter for Graphite

Project description

Bucky

info:

Bucky StatsD and CollectD server for Graphite


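Bucky is a small server for collecting and translating metrics for Graphite. It can currently collect metric data from CollectD daemons and from StatsD clients.

Installation

You can install Bucky in the usual way with easy_install or pip.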

$ easy_install bucky
# or
$ pip install bucky

After installing, you can run Bucky like this:

$ bucky

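Bucky will try to install PyCrypto, which requires the python-dev package to be installed.

By default, Bucky opens a CollectD UDP socket on 127.0.0.1:25826 and a StatsD socket on 127.0.0.1:8125, and attempts to connect to a local Graphite (Carbon) daemon on 127.0.0.1:2003.

All of these are optional. You can also disable the CollectD or StatsD server completely if you wish.

Process Names

If the py-setproctitle module is installed, Bucky uses it to set human-readable process names. This makes Bucky's child processes easier to identify. Note that this is completely optional.

To install py-setproctitle, run: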

$ easy_install setproctitle
# or
$ pip install setproctitle

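Running Bucky For Real

The astute observer will notice that Bucky has no flag for daemonizing. This is on purpose. The recommended way to run Bucky in production is via runit. There is an example service directory in Bucky's source repository.

Python 3 Support

Bucky supports Python 3. However, this support is still very young. If you are running Bucky on Python 3, we would like to hear from you so that you can help us improve the support under real production conditions.

Sentry Support

Bucky can log error messages to Sentry through the Python Raven client.

To install raven, run: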

$ pip install raven
# or
$ easy_install raven

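Next, enable Sentry in Bucky's configuration file.

Command Line Options

The command line options are limited to controlling the network parameters. If you want to configure any of the more intricate settings, you need to use a config file. Here is the output of bucky -h: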

Usage: main.py [OPTIONS] [CONFIG_FILE]

Options:
  --debug               Put server into debug mode. [False]
  --metricsd-ip=IP      IP address to bind for the MetricsD UDP socket
                        [127.0.0.1]
  --metricsd-port=INT   Port to bind for the MetricsD UDP socket [23632]
  --disable-metricsd    Disable the MetricsD UDP server
  --collectd-ip=IP      IP address to bind for the CollectD UDP socket
                        [127.0.0.1]
  --collectd-port=INT   Port to bind for the CollectD UDP socket [25826]
  --collectd-types=FILE
                        Path to the collectd types.db file, can be specified
                        multiple times
  --disable-collectd    Disable the CollectD UDP server
  --statsd-ip=IP        IP address to bind for the StatsD UDP socket
                        [127.0.0.1]
  --statsd-port=INT     Port to bind for the StatsD UDP socket [8125]
  --disable-statsd      Disable the StatsD server
  --graphite-ip=IP      IP address of the Graphite/Carbon server [127.0.0.1]
  --graphite-port=INT   Port of the Graphite/Carbon server [2003]
  --full-trace          Display full error if config file fails to load
  --log-level=NAME      Logging output verbosity [INFO]
  --version             show program's version number and exit
  -h, --help            show this help message and exit

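Config File Options

The configuration file is a normal Python file that defines a number of variables. Most command line options can also be specified in this file (remove the "--" prefix and replace "-" with "_"), but if an option is specified in both places, the command line takes priority. The defaults, written as a config file: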

# Standard debug and log level
debug = False
log_level = "INFO"

# Whether to print the entire stack trace for errors encountered
# when loading the config file
full_trace = False

# Basic metricsd configuration
metricsd_ip = "127.0.0.1"
metricsd_port = 23632
metricsd_enabled = True

# The default interval between flushes of metric data to Graphite
metricsd_default_interval = 10.0

# You can specify the frequency of flushes to Graphite based on
# the metric name used for each metric. These are specified as
# regular expressions. An entry in this list should be a 3-tuple
# that is: (regexp, frequency, priority)
#
# The regexp is applied with the match method. Frequency should be
# in seconds. Priority is used to break ties when a metric name
# matches more than one handler. (The largest priority wins)
metricsd_handlers = []

# Basic collectd configuration
collectd_ip = "127.0.0.1"
collectd_port = 25826
collectd_enabled = True

# A list of file names for collectd types.db
# files.
collectd_types = []

# A mapping of plugin names to converter callables. These are
# explained in more detail in the README.
collectd_converters = {}

# Whether to load converters from entry points. The entry point
# used to define converters is 'bucky.collectd.converters'.
collectd_use_entry_points = True

# If a collectd metric is received with a value of type counter when
# our types.db define it as derive, or vice versa, don't raise an
# exception and assume the server's types.db is correct.
# Types counter and derive are very similar. Also, it's common
# for different versions/installations of collectd in 'clients'
# to have a bit different definitions for the same metrics
# (counter/derive conflict).
collectd_counter_eq_derive = False

# CollectD server can also run using multiple worker subprocesses.
# Incoming packets are routed to workers based on source IP.
collectd_workers = 1

# Cryptographic settings for collectd. Security level 1 requires
# signed packets, level 2 requires encrypted communication.
# Auth file should contain lines in the form 'user: password'
collectd_security_level = 0
collectd_auth_file = None

# Basic statsd configuration
statsd_ip = "127.0.0.1"
statsd_port = 8125
statsd_enabled = True

# How often stats should be flushed to Graphite.
statsd_flush_time = 10.0

# If the legacy namespace is enabled, the statsd backend uses the
# default prefixes except for counters, which are stored directly
# in stats.NAME for the rate and stats_counts.NAME for the
# absolute count.  If legacy names are disabled, the prefixes are
# configurable, and counters are stored under
# stats.counters.{rate,count} by default.  Any prefix can be set
# to None to skip it.
statsd_legacy_namespace = True
statsd_global_prefix = "stats"
statsd_prefix_counter = "counters"
statsd_prefix_timer = "timers"
statsd_prefix_gauge = "gauges"

# Basic Graphite configuration
graphite_ip = "127.0.0.1"
graphite_port = 2003

# If the Graphite connection fails these numbers define how it
# will reconnect. The max reconnects applies each time a
# disconnect is encountered and the reconnect delay is the time
# in seconds between connection attempts. Setting max reconnects
# to a negative number removes the limit. The backoff factor
# determines how much the reconnect delay will be multiplied with
# each reconnect round. It can be limited with a maximum after which
# the delay will not be multiplied anymore.
graphite_max_reconnects = 3
graphite_reconnect_delay = 5
graphite_backoff_factor = 1.5
graphite_backoff_max = 60

# Configuration for sending metrics to Graphite via the pickle
# interface. Be sure to edit graphite_port to match the settings
# on your Graphite cache/relay.
graphite_pickle_enabled = False
graphite_pickle_buffer_size = 500

# Bucky provides these settings to allow the system wide
# configuration of how metric names are processed before
# sending to Graphite.
#
# Prefix and postfix let you tag every metric name with a fixed value.
name_prefix = None
name_postfix = None

# The replacement character is used to munge any '.' characters
# in name components because it is special to Graphite. Setting
# this to None will prevent this step.
name_replace_char = '_'

# Optionally strip duplicates in path components. For instance
# a.a.b.c.c.b would be rewritten as a.b.c.b
name_strip_duplicates = True

# Bucky reverses hostname components to improve the locality
# of metric values in Graphite. For instance, "node.company.tld"
# would be rewritten as "tld.company.node". This setting allows
# for the specification of hostname components that should
# be stripped from hostnames. For instance, if "company.tld"
# were specified, the previous example would end up as "node".
name_host_trim = []

# processor is a callable that takes a (host, name, val, time)
# tuple as input and is expected to return a tuple of the same
# structure to forward the sample to the clients, or None to
# drop it. processor_drop_on_error specifies if the sample is
# dropped or forwarded to clients in case an exception is
# raised by the processor callable.
processor = None
processor_drop_on_error = False
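
Because the config file is plain Python, a custom file only needs to set the variables that differ from the defaults above. The following is a minimal sketch, not a recommended production setup: the Graphite address, the types.db path, and the "prod" prefix are hypothetical values chosen for illustration, while the variable names are the ones listed in the defaults.

```python
# Hypothetical Bucky config file; every value here is illustrative.
log_level = "WARNING"

# Accept CollectD packets on all interfaces instead of loopback only.
collectd_ip = "0.0.0.0"
collectd_types = ["/usr/share/collectd/types.db"]  # assumed path

# Flush StatsD data every 30 seconds and use the non-legacy namespace.
statsd_flush_time = 30.0
statsd_legacy_namespace = False

# This sketch does not use MetricsD, so switch that server off.
metricsd_enabled = False

# Send metrics to a Graphite host on the network (hypothetical address).
graphite_ip = "10.0.0.5"
graphite_port = 2003

# Tag every metric name and shorten hostnames such as "node.company.tld".
name_prefix = "prod"
name_host_trim = ["company.tld"]
```

The file is passed as the CONFIG_FILE argument shown in the usage line, for example `bucky /path/to/bucky.conf` (the path is an assumption). Any option also given on the command line overrides the value in the file.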

Configuring CollectD

You should only need to add something like the following to your collectd.conf:

LoadPlugin "network"

<Plugin "network">
  Server "127.0.0.1" "25826"
</Plugin>

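Make sure the IP address and port match Bucky's CollectD settings, and that your firewall is configured to let the UDP packets through.

Configuring StatsD

Point your StatsD clients at Bucky's IP address and port and you should be good to go.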
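
For a quick check without a full client library, a StatsD line can be sent by hand over UDP. The sketch below assumes the common StatsD line format `name:value|type` and Bucky's default StatsD address of 127.0.0.1:8125. The helper names and metric names are made up for this example and are not part of Bucky.

```python
import socket


def format_statsd_metric(name, value, metric_type):
    """Build one StatsD line, e.g. 'page.views:1|c' for a counter."""
    return "{}:{}|{}".format(name, value, metric_type)


def send_statsd_metric(line, host="127.0.0.1", port=8125):
    """Send a single StatsD line over UDP (fire and forget)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(line.encode("utf-8"), (host, port))
    finally:
        sock.close()


if __name__ == "__main__":
    # Hypothetical metric names: a counter and a timer in milliseconds.
    send_statsd_metric(format_statsd_metric("page.views", 1, "c"))
    send_statsd_metric(format_statsd_metric("request.time", 320, "ms"))
```

UDP gives no delivery confirmation, so the script succeeds even if nothing is listening. Whether the values arrived has to be checked in Graphite after the next StatsD flush.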

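Configuring MetricsD

TODO

A note on CollectD converters

CollectD metrics do not translate directly into Graphite metric names. The default converter makes a best guess, but the result can be a less than pretty Graphite tree.

For this reason, Bucky has configurable converters. Converters are keyed by CollectD plugin name. The input to a converter is a representation of the CollectD metric that looks like this: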

{
  'host': 'toroid.local',
  'interval': 10.0,
  'plugin': 'memory',
  'plugin_instance': '',
  'time': 1320970329.175534,
  'type': 'memory',
  'type_instance': 'inactive',
  'value': 823009280.0,
  'value_name': 'value',
  'value_type': 1
}

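The converter should return a list of strings that form part of the Graphite metric name, or None to drop the sample entirely. For instance, if a converter returns ["foo", "bar"], the final metric name ends up as: $prefix.$hostname.foo.bar.$postfix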
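
To make the naming rules concrete, here is a rough illustration of how the pieces could be assembled, based only on the behaviour described in the config file comments (hostname trimming and reversal, replacement of '.' inside name components, removal of adjacent duplicates). This is not Bucky's actual code: the function name, its parameters, and the exact order of the steps are assumptions.

```python
def build_metric_name(host, parts, prefix=None, postfix=None,
                      host_trim=(), replace_char="_", strip_duplicates=True):
    """Illustrative only: join prefix, reversed host, parts, and postfix."""
    host_parts = host.split(".")

    # Strip the first configured suffix that matches, e.g. "company.tld".
    for suffix in host_trim:
        suffix_parts = suffix.split(".")
        if host_parts[-len(suffix_parts):] == suffix_parts:
            host_parts = host_parts[:-len(suffix_parts)]
            break

    # "node.company.tld" becomes ["tld", "company", "node"].
    host_parts.reverse()

    # '.' is the Graphite path separator, so replace it inside components.
    if replace_char is not None:
        parts = [p.replace(".", replace_char) for p in parts]

    pieces = [prefix] + host_parts + list(parts) + [postfix]
    pieces = [p for p in pieces if p]  # skip None and empty components

    if strip_duplicates:
        deduped = []
        for piece in pieces:
            # Only collapse repeats that sit next to each other.
            if not deduped or deduped[-1] != piece:
                deduped.append(piece)
        pieces = deduped

    return ".".join(pieces)


if __name__ == "__main__":
    print(build_metric_name("node.company.tld", ["foo", "bar"]))
    # tld.company.node.foo.bar
    print(build_metric_name("node.company.tld", ["foo", "bar"],
                            host_trim=["company.tld"]))
    # node.foo.bar
```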

Here is an example of a built-in converter:

# This might be how you define a converter in
# your config file

class MemoryConverter(object):
    PRIORITY = 0
    def __call__(self, sample):
        return ["memory", sample["type_instance"]]

collectd_converters = {"memory": MemoryConverter()}

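Converters can be declared and/or imported in the optional config file, or they can be autodiscovered via entry points. The entry point that is searched is "bucky.collectd.converters". The entry point name should be the CollectD plugin name.

collectd_converters in the config file should map CollectD plugin names to converter instances. The default catch-all converter (used when no special converter is defined for a plugin) can be overridden by specifying _default as the plugin name.

Converters also have a notion of priority to resolve conflicts. This is simply an attribute named "PRIORITY" on the callable, and larger priorities are preferred. I don't imagine this will need to be used very often, but it is there just in case.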
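
As a sketch of how these pieces could fit together in a config file, the example below defines one plugin-specific converter that drops some samples and a replacement for the _default catch-all. Both classes, the "loop" device rule, and the choice of name parts are hypothetical. Only the sample keys, PRIORITY, collectd_converters, and the _default key come from the description above.

```python
class DiskConverter(object):
    # A larger PRIORITY is preferred when converters conflict.
    PRIORITY = 10

    def __call__(self, sample):
        # Illustrative rule: ignore loop devices by returning None.
        if sample["plugin_instance"].startswith("loop"):
            return None
        return ["disk", sample["plugin_instance"],
                sample["type"], sample["value_name"]]


class FallbackConverter(object):
    PRIORITY = -1

    def __call__(self, sample):
        # Use whichever identifying fields are non-empty, in a fixed order.
        parts = [sample["plugin"], sample["plugin_instance"],
                 sample["type"], sample["type_instance"]]
        return [p for p in parts if p]


collectd_converters = {
    "disk": DiskConverter(),
    "_default": FallbackConverter(),
}
```

With the memory sample shown earlier, the fallback converter would return ["memory", "memory", "inactive"].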

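Configuring the Processor

A Processor is a process that receives samples as they are parsed by the servers and performs actions on them before passing them on to the clients.

If a callable is defined in the processor config variable, the processor process applies this callable to each received sample (host, name, val, time). The callable is expected to return a tuple of the same structure to forward the sample to the clients, or None to drop it.

This makes it easy to add all kinds of custom filtering and modification of samples.

This is how you might define a processor in your config file: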

import time

def timediff(host, name, val, timestamp):
    """Drop samples with large time offset

    Drop samples that are more than 2 minutes in the future
    or more than 5 minutes in the past.

    """

    future = 120  # 2 minutes
    past = 300  # 5 minutes
    now = time.time()
    if timestamp > now + future or timestamp < now - past:
        return None
    return host, name, val, timestamp

processor = timediff
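
Since processor holds a single callable, several small steps can be combined by wrapping them in one function. The sketch below shows one possible way to do that. The chain helper and both example steps are hypothetical, and it assumes that name arrives as a dotted string.

```python
def chain(*steps):
    """Combine several processor callables into one.

    Each step receives (host, name, val, timestamp) and returns a tuple
    of the same shape, or None to drop the sample.
    """
    def run(host, name, val, timestamp):
        sample = (host, name, val, timestamp)
        for step in steps:
            sample = step(*sample)
            if sample is None:
                return None  # a step dropped the sample, stop here
        return sample
    return run


def drop_debug(host, name, val, timestamp):
    # Assumption: name is a dotted string such as "debug.heap.size".
    if name.startswith("debug."):
        return None
    return host, name, val, timestamp


def lowercase_host(host, name, val, timestamp):
    return host.lower(), name, val, timestamp


processor = chain(drop_debug, lowercase_host)
```

If a step raises an exception, the processor_drop_on_error setting described earlier decides whether the sample is dropped or forwarded to the clients.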
