Documentation
Joulie is a Kubernetes-native energy management system that uses per-node digital twins to optimize data center power consumption. It ingests real-time telemetry from every node (CPU/GPU power draw, thermal state, per-pod utilization) to maintain a continuously updated model of the cluster’s energy state. That model drives two things: power cap enforcement (via RAPL and NVML) and scheduling decisions that steer workloads toward the most energy-efficient nodes.
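To make the CPU power telemetry step concrete, here is a minimal sketch of how a power reading can be derived from RAPL. On Linux, the powercap sysfs interface exposes RAPL as a cumulative energy counter (`energy_uj`, in microjoules) with a wrap limit (`max_energy_range_uj`), so power has to be computed from two successive readings. Whether Joulie's agent reads these files directly is an assumption; `EnergySample` and `average_power_watts` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnergySample:
    """One reading of a cumulative energy counter (hypothetical type)."""
    energy_uj: int      # cumulative energy in microjoules
    timestamp_s: float  # time of the reading, in seconds


def average_power_watts(prev: EnergySample, curr: EnergySample,
                        max_range_uj: int) -> float:
    """Average power between two samples, tolerating one counter wraparound."""
    dt = curr.timestamp_s - prev.timestamp_s
    if dt <= 0:
        raise ValueError("samples must be strictly ordered in time")
    delta_uj = curr.energy_uj - prev.energy_uj
    if delta_uj < 0:
        # Assumption: the counter wrapped once at max_range_uj and restarted
        # near zero, so the true delta is offset by the counter range.
        delta_uj += max_range_uj
    # Microjoules -> joules, then joules per second = watts.
    return delta_uj / 1e6 / dt
```

A sampling interval short enough that the counter wraps at most once between readings keeps this correction valid.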
If you are completely new, the smoothest path is:
Core mental model:
- telemetry feeds the digital twin,
- the twin drives operator decisions (power caps, migration triggers),
- the scheduler extender reads twin state to steer new pod placement,
- feedback from new placements updates telemetry, closing the loop.
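The loop above can be sketched in a few lines. This is an illustrative toy, not Joulie's implementation: `TwinState`, `decide_cap`, and `pick_node` are hypothetical names, the twin is reduced to an exponential moving average of measured power, and power headroom stands in for whatever efficiency signal the real twin exposes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TwinState:
    """Hypothetical per-node twin: a smoothed view of recent power draw."""
    node: str
    max_power_w: float              # assumed rated node power
    est_power_w: float = 0.0        # smoothed estimate of current draw
    cap_w: Optional[float] = None   # cap currently requested, if any

    def ingest(self, measured_w: float, alpha: float = 0.5) -> None:
        """Telemetry feeds the twin (exponential moving average)."""
        self.est_power_w = alpha * measured_w + (1 - alpha) * self.est_power_w

    def headroom_w(self) -> float:
        """Power still available under the cap (or the rated maximum)."""
        limit = self.cap_w if self.cap_w is not None else self.max_power_w
        return max(0.0, limit - self.est_power_w)


def decide_cap(twin: TwinState, budget_w: float) -> None:
    """The twin drives an operator decision: clamp the node to a budget."""
    twin.cap_w = min(twin.max_power_w, budget_w)


def pick_node(twins: List[TwinState], pod_power_w: float) -> Optional[str]:
    """The scheduler reads twin state: prefer the fitting node with most headroom."""
    fitting = [t for t in twins if t.headroom_w() >= pod_power_w]
    if not fitting:
        return None
    return max(fitting, key=lambda t: t.headroom_w()).node


# One pass around the loop with two made-up nodes.
twins = [TwinState("node-a", max_power_w=400.0),
         TwinState("node-b", max_power_w=400.0)]
twins[0].ingest(300.0)
twins[1].ingest(100.0)
for t in twins:
    decide_cap(t, budget_w=350.0)
chosen = pick_node(twins, pod_power_w=80.0)
# Feedback: the next telemetry sample from the chosen node includes the new pod.
next(t for t in twins if t.node == chosen).ingest(100.0 + 80.0)
```

The point of the sketch is the data flow: placement changes what the next telemetry sample reports, which in turn changes the twin state that later cap and placement decisions read.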
Section guide
- Getting Started
  - concepts, install, workload compatibility, WorkloadProfile classification, configuration reference
- Architecture
  - operator/agent/twin/scheduler roles, CRDs, policy algorithms, telemetry/control interfaces, kubectl plugin
- Hardware
  - CPU and GPU support model, heterogeneity strategy, runtime caveats
- Simulator
  - trace-driven workload simulation, power modeling, facility stress
- Experiments
  - benchmark design and measured outcomes
What to expect
- Per-node digital twins: telemetry → twin state → cap decisions and scheduling.
- Kubernetes-native contracts: 2 user-facing CRDs (`NodeHardware`, `NodeTwin`) + scheduling constraints as intent/supply language.
- Workload classification: automatic profiling via Kepler/cAdvisor metrics with transparent classification reasons.
- Observability tooling: `kubectl joulie` plugin, Grafana dashboard, Prometheus metrics.
- Practical path to adoption: quickstart first, then progressive deep dives.