Kepler Configuration Guide
Kepler supports configuration through both command-line flags and a configuration file. This guide describes the available options for each.
Configuration Methods
Kepler supports two primary methods for configuration:
- Command-line flags: For quick adjustments and one-time settings
- Configuration file: For persistent and comprehensive configuration
Tip: Command-line flags take precedence over configuration file settings when both are specified.
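As a minimal sketch of the precedence rule, the snippet below writes a configuration file that sets the log level to info and then assembles a command line that also passes --log.level=debug. With both present, the flag's value (debug) is the one that applies. The temporary file location is arbitrary, and the command is printed rather than executed so the snippet has no side effects.

```shell
# Write a throwaway config file that sets the log level to info.
cfg="$(mktemp)"
cat > "$cfg" <<'EOF'
log:
  level: info
EOF

# Pass the same setting as a flag; the flag (debug) takes precedence
# over the value in the file (info).
cmd="kepler --config.file=$cfg --log.level=debug"

# Print the command instead of running it.
echo "$cmd"
```

Running the printed command starts Kepler with every other setting taken from the file.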
Command-line Flags
You can configure Kepler by passing flags when starting the service. The following flags are available:
| Flag | Description | Default | Values |
|---|---|---|---|
| --config.file | Path to YAML configuration file | | Any valid file path |
| --log.level | Logging level | info | debug, info, warn, error |
| --log.format | Output format for logs | text | text, json |
| --host.sysfs | Path to sysfs filesystem | /sys | Any valid directory path |
| --host.procfs | Path to procfs filesystem | /proc | Any valid directory path |
| --monitor.interval | Monitor refresh interval | 5s | Any valid duration |
| --monitor.max-terminated | Maximum number of terminated workloads to keep in memory until exported | 500 | A negative number means unlimited; 0 disables the feature |
| --web.config-file | Path to TLS server config file | "" | Any valid file path |
| --web.listen-address | Web server listen addresses (can be specified multiple times) | :28282 | Any valid host:port or :port format |
| --debug.pprof | Enable pprof debugging endpoints | false | true, false |
| --exporter.stdout | Enable stdout exporter | false | true, false |
| --exporter.prometheus | Enable Prometheus exporter | true | true, false |
| --metrics | Metrics levels to export (can be specified multiple times) | node,process,container,vm,pod | node, process, container, vm, pod |
| --kube.enable | Monitor Kubernetes | false | true, false |
| --kube.config | Path to a kubeconfig file | "" | Any valid file path |
| --kube.node-name | Name of the Kubernetes node on which Kepler is running | "" | Any valid node name |
Examples
# Run with debug logging
kepler --log.level=debug
# Use a different procfs path and JSON logging
kepler --host.procfs=/custom/proc --log.format=json
# Load configuration from file
kepler --config.file=/path/to/config.yaml
# Use custom listen addresses
kepler --web.listen-address=:8080 --web.listen-address=localhost:9090
# Enable stdout exporter and disable Prometheus exporter
kepler --exporter.stdout=true --exporter.prometheus=false
# Enable Kubernetes monitoring with specific kubeconfig and node name
kepler --kube.enable=true --kube.config=/path/to/kubeconfig --kube.node-name=my-node
# Export only node and container level metrics
kepler --metrics=node --metrics=container
# Export only process level metrics
kepler --metrics=process
# Set maximum terminated workloads to 1000
kepler --monitor.max-terminated=1000
# Disable terminated workload tracking
kepler --monitor.max-terminated=0
# Unlimited terminated workload tracking
kepler --monitor.max-terminated=-1
Configuration File
Kepler can load configuration from a YAML file. The configuration file exposes more options than the command-line flags; for example, monitor.staleness and rapl.zones have no corresponding flag in the table above.
Sample Configuration File
log:
  level: debug # debug, info, warn, error (default: info)
  format: text # text or json (default: text)
monitor:
  interval: 5s # Monitor refresh interval (default: 5s)
  staleness: 1000ms # Duration after which data is considered stale (default: 1000ms)
  maxTerminated: 500 # Maximum number of terminated workloads to keep in memory (default: 500)
  minTerminatedEnergyThreshold: 10 # Minimum energy threshold for terminated workloads (default: 10)
host:
  sysfs: /sys # Path to sysfs filesystem (default: /sys)
  procfs: /proc # Path to procfs filesystem (default: /proc)
rapl:
  zones: [] # RAPL zones to be enabled, empty enables all default zones
exporter:
  stdout: # stdout exporter related config
    enabled: false # disabled by default
  prometheus: # prometheus exporter related config
    enabled: true
    debugCollectors:
      - go
      - process
    metricsLevel:
      - node
      - process
      - container
      - vm
      - pod
debug: # debug related config
  pprof: # pprof related config
    enabled: true
web:
  configFile: "" # Path to TLS server config file
  listenAddresses: # Web server listen addresses
    - ":28282"
kube: # kubernetes related config
  enabled: false # Enable kubernetes monitoring (default: false)
  config: "" # Path to kubeconfig file (optional if running in-cluster)
  nodeName: "" # Name of the kubernetes node (required when enabled)
# WARN: DO NOT ENABLE THIS IN PRODUCTION - for development/testing only
dev:
  fake-cpu-meter:
    enabled: false
    zones: [] # Zones to be enabled, empty enables all default zones
Configuration Options in Detail
Logging Configuration
log:
  level: info # Logging level
  format: text # Output format
- level: Controls the verbosity of logging
  - debug: Very verbose, includes detailed operational information
  - info: Standard operational information
  - warn: Only warnings and errors
  - error: Only errors
- format: Controls the output format of logs
  - text: Human-readable format
  - json: JSON format, suitable for log processing systems
Monitor Configuration
monitor:
  interval: 5s
  staleness: 1000ms
  maxTerminated: 500
  minTerminatedEnergyThreshold: 10
- interval: The monitor's refresh interval. Processes whose lifetime is shorter than this interval are ignored. Setting it to 0s disables monitor refreshes.
- staleness: Duration after which data computed by the monitor is considered stale and is recomputed when requested again. This is especially useful when multiple Prometheus instances scrape Kepler, because they all receive the same data within the staleness window. It should be shorter than the monitor interval.
- maxTerminated: Maximum number of terminated workloads (processes, containers, VMs, pods) to keep in memory until the data is exported. This prevents unbounded memory growth in high-churn environments. Set it to 0 to disable tracking; a negative value means unlimited. When the limit is reached, the terminated workloads that consumed the least power are removed first.
- minTerminatedEnergyThreshold: Minimum energy consumption (in joules) a terminated workload must exceed to be tracked. This filters out short-lived processes that consume minimal energy. The default is 10 joules.
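The retention rules for terminated workloads can be pictured with a small shell sketch. It illustrates the policy described above and is not Kepler's implementation: the input format (one "name joules" pair per line), the retain_terminated helper, and the sample values are all invented for the example, and only a positive limit is handled.

```shell
# retain_terminated MAX THRESHOLD
# Reads "name joules" lines on stdin, drops entries at or below THRESHOLD,
# and keeps the MAX highest-energy entries (the lowest consumers go first).
retain_terminated() {
  max="$1"
  threshold="$2"
  awk -v t="$threshold" '$2 > t' | sort -k2,2nr | head -n "$max"
}

# Hypothetical terminated workloads with their energy in joules.
kept="$(printf '%s\n' 'build-job 120' 'cron-task 4' 'web-app 300' 'sidecar 45' |
  retain_terminated 2 10)"
echo "$kept"
```

With a limit of 2 and a threshold of 10 joules, cron-task is filtered out by the threshold and sidecar is evicted as the lowest remaining consumer, leaving web-app and build-job.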
Host Configuration
host:
  sysfs: /sys # Path to sysfs
  procfs: /proc # Path to procfs
These settings specify where Kepler should look for system information. In containerized environments, you might need to adjust these paths.
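For instance, when the host's /sys and /proc are bind-mounted into a container under a different prefix, both flags can be derived from that prefix. The sketch below assumes a hypothetical mount point /host (not a Kepler default) and a helper named kepler_host_flags invented for illustration; it prints the resulting command rather than running it.

```shell
# Hypothetical prefix under which the host's /sys and /proc are mounted
# inside the container (an assumption, not a Kepler default).
HOST_ROOT="${HOST_ROOT:-/host}"

# kepler_host_flags PREFIX
# Prints the --host.sysfs and --host.procfs flags for the given prefix.
kepler_host_flags() {
  printf '%s=%s/sys %s=%s/proc\n' --host.sysfs "$1" --host.procfs "$1"
}

flags="$(kepler_host_flags "$HOST_ROOT")"
echo "kepler $flags"
```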
RAPL Zones Configuration
rapl:
  zones: [] # RAPL zones to be enabled
Running Average Power Limit (RAPL) is Intel's power capping mechanism. By default, Kepler enables all available zones. To restrict monitoring to specific zones, list them.
Example with specific zones:
rapl:
  zones: ["package", "core", "uncore"]
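To see which zones a host exposes before restricting the list, one option is to read the Linux powercap entries under sysfs. The sketch below assumes the usual /sys/class/powercap/intel-rapl:* layout and strips the per-socket suffix (for example, "package-0" becomes "package"). The list_rapl_zones helper is made up for this example, and whether the printed names match the names Kepler accepts in rapl.zones is an assumption worth checking on your system.

```shell
# list_rapl_zones [SYSFS_ROOT]
# Prints the unique zone names exposed by the Linux powercap interface,
# with the per-socket suffix removed (e.g. "package-0" -> "package").
list_rapl_zones() {
  sysfs="${1:-/sys}"
  for f in "$sysfs"/class/powercap/intel-rapl:*/name; do
    # Skip unreadable entries and the unexpanded glob when nothing matches.
    [ -r "$f" ] && cat "$f"
  done | sed 's/-[0-9]*$//' | sort -u
}

list_rapl_zones /sys
```

Passing a different root, such as the value used for host.sysfs, reads the zones from that location instead.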
Exporter Configuration
exporter:
  stdout: # stdout exporter related config
    enabled: false # disabled by default
  prometheus: # prometheus exporter related config
    enabled: true
    debugCollectors:
      - go
      - process
    metricsLevel:
      - node
      - process
      - container
      - vm
      - pod
- stdout: Configuration for the stdout exporter
  - enabled: Enable or disable the stdout exporter (default: false)
- prometheus: Configuration for the Prometheus exporter
  - enabled: Enable or disable the Prometheus exporter (default: true)
  - debugCollectors: List of debug collectors to enable (available: "go", "process")
  - metricsLevel: List of metric levels to expose. Controls the granularity of metrics exported:
    - node: Node-level metrics (system-wide power consumption)
    - process: Process-level metrics (per-process power consumption)
    - container: Container-level metrics (per-container power consumption)
    - vm: Virtual machine-level metrics (per-VM power consumption)
    - pod: Pod-level metrics (per-pod power consumption in Kubernetes)
Debug Configuration
debug:
  pprof:
    enabled: true
- pprof: Configuration for pprof debugging
  - enabled: When enabled, this exposes pprof debug endpoints that can be used for profiling Kepler (default: true)
Web Configuration
web:
  configFile: "" # Path to TLS server config file
  listenAddresses: # Web server listen addresses
    - ":28282"
- configFile: Path to a TLS server configuration file for securing Kepler's web endpoints
- listenAddresses: List of addresses where the web server should listen (default: [":28282"])
  - Supports both host:port format (e.g., "localhost:8080", "0.0.0.0:9090") and port-only format (e.g., ":8080")
  - Multiple addresses can be specified for listening on different interfaces or ports
  - IPv6 addresses are supported using bracket notation (e.g., "[::1]:8080")
Example TLS server configuration file content:
# TLS server configuration
tls_server_config:
  cert_file: /path/to/cert.pem # Path to the certificate file
  key_file: /path/to/key.pem # Path to the key file
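For local testing, a self-signed certificate is enough to try out the TLS settings. The sketch below assumes openssl is installed and uses an arbitrary directory (/tmp/kepler-tls) and file names chosen for the example. It generates a key pair, writes a TLS server config file with the two keys shown above, and prints the command that would load it.

```shell
# Directory for the generated files; /tmp/kepler-tls is an arbitrary choice.
tls_dir="${TLS_DIR:-/tmp/kepler-tls}"
mkdir -p "$tls_dir"

# Create a self-signed certificate and key (for local testing only).
openssl req -x509 -newkey rsa:2048 -nodes -days 30 \
  -subj "/CN=localhost" \
  -keyout "$tls_dir/key.pem" -out "$tls_dir/cert.pem" ||
  echo "openssl failed; supply your own cert.pem and key.pem" >&2

# Write a TLS server config file that points at the generated files.
web_cfg="$tls_dir/web-config.yaml"
cat > "$web_cfg" <<EOF
tls_server_config:
  cert_file: $tls_dir/cert.pem
  key_file: $tls_dir/key.pem
EOF

# Print the command instead of running it.
echo "kepler --web.config-file=$web_cfg"
```

A self-signed certificate is not trusted by clients by default, so production deployments would normally use a certificate issued by a trusted authority.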
Kubernetes Configuration
kube:
  enabled: false # Enable kubernetes monitoring
  config: "" # Path to kubeconfig file
  nodeName: "" # Name of the kubernetes node
- enabled: Enable or disable Kubernetes monitoring (default: false)
  - When enabled, Kepler will monitor Kubernetes resources and expose pod level information
- config: Path to a kubeconfig file (optional)
  - Required when running Kepler outside of a Kubernetes cluster
  - When running inside a cluster, Kepler can use the in-cluster configuration
  - Must be a valid and readable kubeconfig file
- nodeName: Name of the Kubernetes node on which Kepler is running (required when enabled)
  - This helps Kepler identify which node it's monitoring
  - Must match the actual node name in the Kubernetes cluster
  - Required when enabled is set to true
Development Configuration
dev:
  fake-cpu-meter:
    enabled: false
    zones: []
WARNING: This section is for development and testing only. Do not enable in production.
- fake-cpu-meter: When enabled, uses a fake CPU meter instead of real hardware metrics
  - enabled: Set to true to enable fake CPU meter
  - zones: Specific zones to enable, empty enables all
Further Reading
For more details, see the config file example in the main Kepler repository at hack/config.yaml.
Happy configuring!