Note: This project is a work-in-progress. We're currently aiming to ship a completed service, with integration hooks, as part of DC/OS 1.10. Community help is welcome and appreciated!
- Overview
- How this repo is organized
- Getting Started
- Documentation
- Community
- Contributing
- License
- Acknowledgements
I want to...
- emit metrics from a Mesos container: You should check for `STATSD_UDP_HOST` and `STATSD_UDP_PORT` in your application environment, then send StatsD-formatted metrics to that endpoint when it's available. You may emit your own tags using the dogstatsd tag format, and they'll automatically be translated into Avro-formatted tags! (see also: example code, and the sketch after this list)
- emit metrics from a system process on the agents: You should send Avro-formatted metrics to the Collector process at `127.0.0.1:8124`. (see also: avro schema, example code)
- collect and process emitted metrics: See the Quick Start above. Take a look at the available Kafka Consumers and see whether your format already exists; if it doesn't, a new Consumer is very easy to write. (see also: avro schema)
- develop parts of the metrics stack: You can run the whole stack on your local system, no Mesos Agent required! To get started, take a look at the local stack launcher scripts.
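As a concrete sketch of the container case above, here's minimal Go that checks the advertised endpoint and emits one dogstatsd-tagged counter. The metric name and tag names are hypothetical, made up for this example:

```go
package main

import (
	"fmt"
	"net"
	"os"
)

func main() {
	// The endpoint is only advertised when the metrics module is active
	// on the agent, so degrade gracefully when the variables are unset.
	host := os.Getenv("STATSD_UDP_HOST")
	port := os.Getenv("STATSD_UDP_PORT")
	if host == "" || port == "" {
		fmt.Println("metrics endpoint not advertised; skipping")
		return
	}

	conn, err := net.Dial("udp", net.JoinHostPort(host, port))
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	// A StatsD counter with dogstatsd-style tags ("|#key:value,...");
	// the agent module translates these into Avro-formatted tags.
	fmt.Fprint(conn, "myapp.requests:1|c|#endpoint:api,status:ok")
}
```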
- module: C++ code for the mesos-agent module. This module is installed by default on DC/OS EE 1.7+, with updated output support added as of EE 1.8+.
  - Input: Accepts data produced by Mesos containers on the agent. All Mesos containers are given a unique StatsD endpoint, advertised via the `STATSD_UDP_HOST`/`STATSD_UDP_PORT` environment variables. The module then tags and forwards upstream any metrics sent to that endpoint. (EE 1.7+)
  - Output formats:
    - Avro metrics sent to a local Collector process on TCP port `8124` (EE 1.8+)
    - StatsD to `metrics.marathon.mesos` with tags added via key prefixes or datadog tags (EE 1.7 only, disabled in EE 1.8)
- collector: A Marathon process which runs on every agent node.
  - Inputs:
    - Listens on TCP port `8124` for Avro-formatted metrics from the mesos-agent module, as well as any other processes on the system.
    - Polls the local Mesos agent for additional information: `/containers` is polled to retrieve per-container resource usage stats (this was briefly done in the Mesos module via the Oversubscription module interface). Similarly, `/metrics/snapshot` is polled for system-level information, and `/state` is polled to determine the local `agent_id` and to get a mapping of `framework_id` to `framework_name`. These are then used to populate `agent_id` on all outgoing metrics, and `framework_name` for metrics that have a `framework_id` (i.e. all metrics emitted by containers). (A sketch of this polling appears after this list.)
  - Output: Data is collated into topics and forwarded to a configured Kafka instance (default `kafka`).
- consumer: Kafka Consumer implementations which fetch Avro-formatted metrics and do something with them (print to `stdout`, write to a database, etc.). By default the Consumers consume from all topics matching the regex pattern `metrics-.*`. This expression can be customized, or alternately a single specific topic can be specified for consumption. (A pattern-subscription sketch appears after the Getting Started steps below.)
- examples: Reference implementations of programs which integrate with the metrics stack:
  - collector-emitter: A reference for DC/OS system processes which emit metrics. Sends some Avro metrics data to a local Collector process.
  - local-stack: Helper scripts for running a full metrics stack on a dev machine. Feeds stats into itself and prints them at the end. Requires a running copy of ZooKeeper (required by Kafka).
  - statsd-emitter: A reference for Mesos tasks which emit metrics. Sends some StatsD metrics to the `STATSD_UDP_HOST`/`STATSD_UDP_PORT` endpoint advertised by the mesos-agent module.
- schema: Avro schemas shared by nearly every component that processes metrics (agent module, collector, collector clients, Kafka consumers). The exception is containerized processes, which only need to know how to emit StatsD data.
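To make the collector's `/state` polling concrete, here is a minimal Go sketch under a few assumptions: the agent serves HTTP on the default Mesos agent port (5051), and only the fields needed for the `agent_id`/`framework_name` mapping are decoded. The struct and names are illustrative, not the collector's actual code:

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// agentState models just the subset of the Mesos agent /state
// response that the mapping described above requires.
type agentState struct {
	ID         string `json:"id"`
	Frameworks []struct {
		ID   string `json:"id"`
		Name string `json:"name"`
	} `json:"frameworks"`
}

func main() {
	// 5051 is the default Mesos agent HTTP port.
	resp, err := http.Get("http://localhost:5051/state")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var state agentState
	if err := json.NewDecoder(resp.Body).Decode(&state); err != nil {
		panic(err)
	}

	// agent_id is stamped onto all outgoing metrics; the framework map
	// resolves framework_id -> framework_name for container metrics.
	fmt.Println("agent_id:", state.ID)
	for _, fw := range state.Frameworks {
		fmt.Printf("framework %s -> %s\n", fw.ID, fw.Name)
	}
}
```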
First, get a 1.8 EE cluster with at least 3 private nodes (the minimum for the default Kafka configuration), then install the following:
- Install Kafka: `dcos package install kafka`, or install via the Universe UI.
  - Note: stock settings are plenty to start with, but for production use consider increasing the default number of partitions (`num.partitions`) and the replication factor (`default.replication.factor`).
- Run a Metrics Collector on every node: use the provided Marathon JSONs.
- Run one or more Metrics Consumers: see the example Marathon JSONs for each consumer type, and edit the output settings as needed before launching.
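As a rough sketch of the `metrics-.*` pattern subscription described in the consumer section, here is a minimal Go consumer using the confluent-kafka-go client. The client library is an assumption for illustration (the bundled Consumers may use a different one), and the broker address and group id are hypothetical:

```go
package main

import (
	"fmt"

	"github.com/confluentinc/confluent-kafka-go/kafka"
)

func main() {
	c, err := kafka.NewConsumer(&kafka.ConfigMap{
		"bootstrap.servers": "broker-0.kafka.mesos:9092", // hypothetical broker address
		"group.id":          "metrics-printer",           // hypothetical consumer group
		"auto.offset.reset": "earliest",
	})
	if err != nil {
		panic(err)
	}
	defer c.Close()

	// A leading '^' makes the subscription a regex, matching the
	// collector's "metrics-.*" topic naming.
	if err := c.SubscribeTopics([]string{"^metrics-.*"}, nil); err != nil {
		panic(err)
	}

	for {
		msg, err := c.ReadMessage(-1) // block until a message arrives
		if err != nil {
			fmt.Println("consume error:", err)
			continue
		}
		// Payloads are Avro-formatted metrics; decoding them requires
		// the schemas from the schema/ directory.
		fmt.Printf("topic %s: %d bytes\n", *msg.TopicPartition.Topic, len(msg.Value))
	}
}
```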
- Launching demo processes
- Launching the Collector
- Launching Consumers
- Installing custom module builds (for module dev)
- Slides from MesosCon EU (Aug 2016)
This project is one component of the larger DC/OS community.
- DC/OS JIRA (issue tracker) (please use the `dcos-metrics` component)
- DC/OS mailing list
- DC/OS Community Slack team
We love contributions! There's more than one way to give back, from code to documentation and examples. To ensure we have a chance to keep up with community contributions, please follow the guidelines in CONTRIBUTING.md.
Both DC/OS and this project are open source software released under the Apache License, Version 2.0.
- Maintainer(s): Jeff Malnick, Roger Ignazio
- Author(s): Nicholas Parker