CPU-Only Benchmark

This page reports results from the CPU-only cluster benchmark experiment.

Scope

The benchmark compares three baselines on a pure CPU cluster:

  • A: simulator only (Joulie-free)
  • B: Joulie with static partition policy
  • C: Joulie with queue-aware policy

It evaluates energy and throughput under real Kubernetes scheduling with KWOK nodes and simulated power control.

Experimental setup

Cluster and nodes

  • kind control-plane + worker (real control plane)
  • 8 managed KWOK nodes - CPU only, no GPUs
  • Workload pods target KWOK nodes via selector + toleration

Node inventory

Node prefix         Count  CPU model                CPU cores    RAM
kwok-cpu-highcore   2      AMD EPYC 9965 192-Core   384 (2×192)  1536 GiB
kwok-cpu-highfreq   2      AMD EPYC 9375F 32-Core   64 (2×32)    770 GiB
kwok-cpu-intensive  4      AMD EPYC 9655 96-Core    192 (2×96)   1536 GiB

Total: 8 nodes, 2304 CPU cores, 0 GPUs.

Hardware models in simulator

CPU power per node:

P(u, f) = IdleW + (PeakW - IdleW) * u^AlphaUtil * f^BetaFreq

CPU family               IdleW (W)  PeakW (W)  AlphaUtil  BetaFreq
AMD EPYC 9965 192-Core   120        960        1.15       1.30
AMD EPYC 9375F 32-Core   60         480        1.10       1.25
AMD EPYC 9655 96-Core    95         760        1.12       1.28

Full power-model details: Power Simulator
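
The per-node power formula can be sketched in a few lines of Python. This is a minimal illustration rather than the simulator's actual code: the class and dictionary names are hypothetical, and it assumes that u (utilization) and f (frequency scale) are both normalized to the range 0 to 1. The parameter values are taken from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CpuPowerModel:
    """Parameters of P(u, f) for one CPU family (hypothetical container)."""
    idle_w: float
    peak_w: float
    alpha_util: float
    beta_freq: float

    def power(self, u: float, f: float) -> float:
        """Node power in watts; u and f are assumed to be normalized to [0, 1]."""
        if not (0.0 <= u <= 1.0 and 0.0 <= f <= 1.0):
            raise ValueError("u and f are expected in [0, 1]")
        dynamic = (self.peak_w - self.idle_w) * u ** self.alpha_util * f ** self.beta_freq
        return self.idle_w + dynamic


# Values copied from the hardware-model table.
MODELS = {
    "AMD EPYC 9965 192-Core": CpuPowerModel(120, 960, 1.15, 1.30),
    "AMD EPYC 9375F 32-Core": CpuPowerModel(60, 480, 1.10, 1.25),
    "AMD EPYC 9655 96-Core": CpuPowerModel(95, 760, 1.12, 1.28),
}

if __name__ == "__main__":
    for name, model in MODELS.items():
        full = model.power(u=1.0, f=1.0)
        capped = model.power(u=1.0, f=0.8)  # illustrative frequency scale of 0.8
        print(f"{name}: {full:.0f} W at f=1.0, {capped:.0f} W at f=0.8")
```

At u = 1 and f = 1 the model returns PeakW, and at u = 0 it returns IdleW regardless of f, which is a quick sanity check on any reimplementation.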

Run configuration

  • Seeds: 3
  • Mean inter-arrival: 0.12 s
  • Time scale: 60×
  • Timeout: 14400 s
  • Perf ratio: 15%, eco ratio: 0%, GPU ratio: 0%
  • Workload types: cpu_preprocess, cpu_analytics
  • Policy caps: CPU eco at 80% of peak

Algorithms used

Controller policies

  • static_partition:
    • hpCount = round(N * 0.45) → 4 performance nodes, 4 eco nodes
  • queue_aware_v1:
    • baseCount = round(N * 0.50), dynamic from live perf-pod count
    • hpCount = clamp(max(baseCount, queueNeed), 2, 8, N)
  • Downgrade guard: a performance → eco transition is deferred while performance-sensitive pods are still running on the node
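
The node-count rules above can be sketched as follows. This is an illustration under stated assumptions, not the controller's implementation: the function names are hypothetical, the clamp is read as bounding the result to the range 2 to min(8, N), and how queueNeed is derived from the live perf-pod count is not specified here, so it is passed in as an input.

```python
def clamp(value: int, lo: int, hi: int) -> int:
    """Bound value to the inclusive range [lo, hi]."""
    return max(lo, min(value, hi))


def static_partition_hp_count(n_nodes: int, hp_fraction: float = 0.45) -> int:
    """hpCount = round(N * 0.45).

    Note: Python's round() rounds exact halves to the nearest even integer;
    the controller's rounding mode may differ for such edge cases.
    """
    return round(n_nodes * hp_fraction)


def queue_aware_hp_count(
    n_nodes: int,
    queue_need: int,
    base_fraction: float = 0.50,
    hp_min: int = 2,
    hp_max: int = 8,
) -> int:
    """hpCount = clamp(max(baseCount, queueNeed), ...).

    Assumption: the bounds are [hp_min, min(hp_max, N)].
    """
    base_count = round(n_nodes * base_fraction)
    upper = min(hp_max, n_nodes)
    return clamp(max(base_count, queue_need), hp_min, upper)


def may_downgrade_to_eco(perf_pods_on_node: int) -> bool:
    """Downgrade guard: only allow performance -> eco once no
    performance-sensitive pods remain on the node."""
    return perf_pods_on_node == 0


if __name__ == "__main__":
    n = 8
    hp = static_partition_hp_count(n)
    print(f"static_partition: {hp} performance, {n - hp} eco")
    for need in (0, 6, 20):
        print(f"queue_aware_v1 (queueNeed={need}): {queue_aware_hp_count(n, need)} performance")
```

For N = 8 the static policy yields the 4/4 split stated above, while the queue-aware policy starts from the same base count and grows the performance pool only when demand exceeds it.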

Results summary

Primary metrics: summary.csv

Per-seed results

Baseline  Seed  Wall (s)  Throughput (jobs/sim-hr)  Energy (kWh sim)  Avg power (W)
A         1     1048.97   285.99                    20.65             1181.1
A         2     1073.38   279.49                    23.85             1333.2
A         3     1164.21   257.69                    23.24             1197.6
B         1     1068.72   280.71                    18.57             1042.8
B         2     1072.93   279.61                    23.46             1312.1
B         3     1142.27   262.63                    20.32             1067.5
C         1     1064.50   281.82                    19.42             1094.6
C         2     1073.04   279.58                    22.82             1276.2
C         3     1144.72   262.07                    22.59             1183.8
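
The per-seed columns can be cross-checked against each other. The sketch below assumes that simulated energy equals average power multiplied by simulated duration, with simulated duration obtained by multiplying wall time by the 60× time scale from the run configuration. That relationship is an inference from the reported numbers rather than a documented formula, and the function name is hypothetical.

```python
TIME_SCALE = 60  # from the run configuration: 1 wall second = 60 simulated seconds


def simulated_energy_kwh(wall_s: float, avg_power_w: float, time_scale: int = TIME_SCALE) -> float:
    """Energy in simulated kWh, assuming energy = average power x simulated hours."""
    sim_hours = wall_s * time_scale / 3600.0
    return avg_power_w * sim_hours / 1000.0


# (baseline, seed, wall_s, reported_energy_kwh, avg_power_w) from the per-seed table.
PER_SEED = [
    ("A", 1, 1048.97, 20.65, 1181.1),
    ("A", 2, 1073.38, 23.85, 1333.2),
    ("A", 3, 1164.21, 23.24, 1197.6),
    ("B", 1, 1068.72, 18.57, 1042.8),
    ("B", 2, 1072.93, 23.46, 1312.1),
    ("B", 3, 1142.27, 20.32, 1067.5),
    ("C", 1, 1064.50, 19.42, 1094.6),
    ("C", 2, 1073.04, 22.82, 1276.2),
    ("C", 3, 1144.72, 22.59, 1183.8),
]

if __name__ == "__main__":
    for baseline, seed, wall_s, reported, power_w in PER_SEED:
        derived = simulated_energy_kwh(wall_s, power_w)
        print(f"{baseline}{seed}: reported {reported:.2f} kWh, derived {derived:.2f} kWh")
```

Under this assumption the derived values agree with the reported energy column to within rounding, which suggests the energy figures are expressed in simulated rather than wall-clock time.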

Baseline means (3 seeds, all completed)

Baseline  Mean wall (s)  Mean throughput (jobs/sim-hr)  Mean energy (kWh sim)  Mean cluster power (W)
A         1095.5         274.39                         22.58                  1237.3
B         1094.6         274.32                         20.79                  1140.8
C         1094.1         274.49                         21.61                  1184.9

Relative to A:

  • B: energy −7.9%, throughput ≈ 0% (negligible)
  • C: energy −4.3%, throughput ≈ 0% (negligible)
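
The relative figures follow directly from the baseline means. A minimal sketch of the arithmetic, using the mean values reported above (the dictionary layout and function name are illustrative only):

```python
# Mean throughput (jobs/sim-hr) and mean energy (kWh sim) from the baseline means table.
MEANS = {
    "A": {"throughput": 274.39, "energy_kwh": 22.58},
    "B": {"throughput": 274.32, "energy_kwh": 20.79},
    "C": {"throughput": 274.49, "energy_kwh": 21.61},
}


def relative_change_pct(value: float, reference: float) -> float:
    """Percentage change of value relative to reference."""
    return (value - reference) / reference * 100.0


if __name__ == "__main__":
    ref = MEANS["A"]
    for name in ("B", "C"):
        energy = relative_change_pct(MEANS[name]["energy_kwh"], ref["energy_kwh"])
        throughput = relative_change_pct(MEANS[name]["throughput"], ref["throughput"])
        print(f"{name}: energy {energy:+.1f}%, throughput {throughput:+.2f}%")
```

The throughput deltas come out at a few hundredths of a percent in either direction, which is why they are reported as negligible.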

Plot commentary

Runtime distribution

[Figure: Runtime Distribution by Baseline]
  • All three baselines complete in nearly identical wall-time windows.
  • Run-to-run seed jitter is larger than any inter-baseline difference.

Energy vs makespan

[Figure: Energy vs Makespan]
  • B is consistently lower-energy than A with near-identical makespan across all 3 seeds.
  • C shows slightly more variance; one seed lands close to A energy.

Baseline means

[Figure: Baseline Mean Metrics]
  • Energy is the main differentiator; throughput and wall-time bars are indistinguishable.

Completion summary

[Figure: Completion Summary]
  • All 3 seeds completed for all baselines; no timeouts or gang-scheduling issues.

Interpretation

Joulie reduces energy without throughput penalty on a CPU-only cluster because:

  1. The cluster is over-provisioned (2304 cores, lightweight jobs) - eco nodes have spare CPU cores to compensate for throttled frequency.
  2. CPU sensitivity for cpu_preprocess/cpu_analytics is moderate (0.7–0.9): a 20% frequency reduction causes a 14–18% per-job slowdown, but overall makespan stays flat because the scheduler redistributes load.
  3. Eco nodes draw proportionally less power for the same simulated duration → energy falls without extending makespan.
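
The 14–18% slowdown range in point 2 is consistent with a simple linear relationship in which per-job slowdown equals CPU sensitivity multiplied by the fractional frequency reduction. The sketch below works through that arithmetic; the linear form and the function name are assumptions for illustration, and the simulator's actual slowdown model may differ.

```python
def per_job_slowdown(sensitivity: float, freq_reduction: float) -> float:
    """Fractional slowdown under an assumed linear model:
    slowdown = sensitivity * frequency reduction."""
    return sensitivity * freq_reduction


if __name__ == "__main__":
    freq_reduction = 0.20  # eco nodes run at 80% of peak
    for sensitivity in (0.7, 0.9):
        slowdown = per_job_slowdown(sensitivity, freq_reduction)
        print(f"sensitivity {sensitivity}: {slowdown:.0%} per-job slowdown")
```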

Best-fit use case

The strongest observed benefit is:

  • energy reduction (−7.9% static, −4.3% queue-aware) with negligible throughput penalty in CPU-only mixed workload clusters.

static_partition is the most robust policy for this regime - predictable savings with no visible scheduling-performance impact.

Implementation details and scripts