Core Concepts
Before installing Joulie, understand the control model.
Problem Joulie addresses
Clusters running AI/scientific workloads need better power control:
- reduce energy use and power spikes,
- keep workload performance predictable,
- provide a path to greener operation (power envelope and carbon-aware strategies).
Joulie is currently a proof of concept (PoC) focused on Kubernetes-native control loops and simulation.
Main components
- Operator (cmd/operator): cluster-level policy brain
  - decides desired node power profile/cap assignments
  - writes desired state as NodePowerProfile
- Agent (cmd/agent): node-level actuator
  - reads desired state and telemetry configuration
  - enforces power controls (CPU now, GPU path planned)
  - exports metrics/status
- Simulator (simulator/): digital-twin execution environment
  - keeps scheduling real, simulates telemetry/control behavior
  - enables repeatable experiments without requiring real hardware writes
Key CRDs
- NodePowerProfile (joulie.io/v1alpha1)
  - desired node policy state (performance/eco, optional power cap)
- TelemetryProfile (joulie.io/v1alpha1)
  - where telemetry/control inputs come from (host, http, …), and how controls are sent
Policy states and intent
Node supply is represented through joulie.io/power-profile:
- performance
- draining-performance (temporary transition label)
- eco
Workload demand is inferred from pod scheduling constraints:
- workload constrained to performance nodes
- workload constrained to eco nodes
- unconstrained workload (can run on either)
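The three demand classes above can be illustrated with a minimal Python sketch. It assumes the constraint is expressed as a plain nodeSelector on the joulie.io/power-profile label; the names classify_demand and Demand are hypothetical and not part of Joulie, and node affinity rules are deliberately ignored here.

```python
from enum import Enum

# Label key taken from the text above.
POWER_PROFILE_LABEL = "joulie.io/power-profile"


class Demand(Enum):
    PERFORMANCE = "performance"
    ECO = "eco"
    UNCONSTRAINED = "unconstrained"


def classify_demand(pod_spec: dict) -> Demand:
    """Classify a pod by the power profile its nodeSelector asks for.

    Assumption: only spec.nodeSelector is inspected; a real classifier
    would probably also need to look at node affinity terms.
    """
    selector = pod_spec.get("nodeSelector") or {}
    wanted = selector.get(POWER_PROFILE_LABEL)
    if wanted == "performance":
        return Demand.PERFORMANCE
    if wanted == "eco":
        return Demand.ECO
    # No selector on the power-profile label: the pod can run on either.
    return Demand.UNCONSTRAINED
```

For example, a pod spec with nodeSelector {"joulie.io/power-profile": "eco"} is classified as Demand.ECO, while a pod spec without any nodeSelector is classified as Demand.UNCONSTRAINED.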
Energy policy in one paragraph
An energy policy decides how many nodes should stay in performance or move to eco, based on current demand and configured rules.
Today Joulie ships deterministic policies (static and queue-aware), plus a debug swap policy.
Policy algorithms are detailed in Policy Algorithms.
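To make the idea of a deterministic policy concrete, here is a small Python sketch of a demand-driven rule. It is not one of Joulie's shipped policies: the function names, the pods_per_node capacity, and the min_performance floor are all illustrative assumptions.

```python
import math
from typing import Dict, Iterable


def plan_performance_nodes(total_nodes: int, performance_pods: int,
                           pods_per_node: int, min_performance: int = 1) -> int:
    """Return how many nodes should stay in the performance profile.

    Hypothetical rule: cover all performance-constrained pods, never go
    below a configured floor, never exceed the cluster size.
    """
    if total_nodes < 0 or performance_pods < 0:
        raise ValueError("counts must be non-negative")
    if pods_per_node <= 0:
        raise ValueError("pods_per_node must be positive")
    needed = math.ceil(performance_pods / pods_per_node)
    return min(total_nodes, max(min_performance, needed))


def assign_profiles(node_names: Iterable[str], performance_count: int) -> Dict[str, str]:
    """Deterministically map nodes to profiles (sorted by name)."""
    ordered = sorted(node_names)
    return {
        name: ("performance" if index < performance_count else "eco")
        for index, name in enumerate(ordered)
    }
```

With 4 nodes, 5 performance-constrained pods, and an assumed capacity of 2 pods per node, plan_performance_nodes(4, 5, 2) returns 3, so assign_profiles would leave one node in eco. Sorting by node name keeps the assignment repeatable across runs, which is the property "deterministic" refers to in this sketch.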
Control loop in one minute
- Operator observes cluster context and picks desired node states.
- Operator writes/updates NodePowerProfile.
- Agent reads desired state + telemetry/control profile.
- Agent applies controls and reports status/metrics.
- Operator reconciles again.
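The loop above can be mimicked in a few lines of Python. This is an in-memory toy, not Joulie code: NodeState stands in for a NodePowerProfile object plus agent status, and operator_step, agent_step, and reconcile_once are hypothetical names. No Kubernetes API or hardware control is touched.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class NodeState:
    desired_profile: str = "performance"  # stand-in for NodePowerProfile
    applied_profile: str = "performance"  # what the agent last enforced


def operator_step(nodes: Dict[str, NodeState], performance_count: int) -> None:
    """Operator: pick desired node states and write them."""
    for index, name in enumerate(sorted(nodes)):
        wanted = "performance" if index < performance_count else "eco"
        nodes[name].desired_profile = wanted


def agent_step(name: str, nodes: Dict[str, NodeState]) -> str:
    """Agent: read desired state, apply it, and report status."""
    node = nodes[name]
    # A real agent would write CPU power controls here; the toy only
    # records the profile as applied.
    node.applied_profile = node.desired_profile
    return f"{name}: applied {node.applied_profile}"


def reconcile_once(nodes: Dict[str, NodeState], performance_count: int) -> List[str]:
    """One pass of the loop: operator decides, every agent enforces."""
    operator_step(nodes, performance_count)
    return [agent_step(name, nodes) for name in sorted(nodes)]
```

Calling reconcile_once({"a": NodeState(), "b": NodeState()}, 1) returns ["a: applied performance", "b: applied eco"]. Calling it again with the same inputs changes nothing, which is the behavior expected of a reconciliation loop.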
Next step
Proceed to Quickstart, then see the Architecture pages for deeper details.