Recurring operations work

Your weekly cluster right-sizing pass, prepared before you sit down

Hyground reviews 30 days of Prometheus data, identifies over- and under-provisioned workloads, and hands the platform team a prioritised resize list with the projected monthly cost delta. Runs on a cron schedule. Read-only kubectl and PromQL. Inside your cluster.

The artefact

A resize plan ready for sign-off, every Tuesday morning

Cluster right-sizing is one of the operations chores senior engineers repeat every week. Hyground encodes that pass as a deterministic workflow that runs on your schedule and returns the same evidence-backed answer your most senior engineer would have produced.

Evidence-backed

30 days of usage per workload, request and limit ratios, HPA state, replica counts.

Prioritised output

A ranked resize list with projected monthly cost delta per change.

What the agent reads

The data the agent already has access to

Nothing new to install. Hyground reads the same data sources your engineers already query manually.

Prometheus

CPU and memory usage per workload over 30 days, p95 and p99, by namespace and label.
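
As a rough illustration of the kind of read-only query involved, the sketch below builds a PromQL expression for a CPU usage quantile over a trailing window. It assumes the cluster exposes the standard cAdvisor metric container_cpu_usage_seconds_total and groups by pod rather than by workload for brevity. The function name and defaults are hypothetical and do not describe Hyground's actual queries.

```python
def cpu_quantile_query(
    namespace: str,
    quantile: float = 0.95,
    window: str = "30d",
    step: str = "5m",
) -> str:
    """Build a PromQL subquery: usage quantile per pod over the trailing window.

    Assumption: cAdvisor metrics are scraped with `namespace`, `pod` and
    `container` labels. Nothing here talks to Prometheus; it only builds text.
    """
    if not 0.0 < quantile < 1.0:
        raise ValueError("quantile must be strictly between 0 and 1")
    selector = f'namespace="{namespace}", container!=""'
    # Per-pod CPU usage in cores, sampled every `step`.
    usage = (
        "sum by (namespace, pod) "
        f"(rate(container_cpu_usage_seconds_total{{{selector}}}[{step}]))"
    )
    # Subquery syntax: (<expr>)[<range>:<resolution>]
    return f"quantile_over_time({quantile}, ({usage})[{window}:{step}])"


query = cpu_quantile_query("payments")
print(query)
```

The same shape works for memory by swapping in container_memory_working_set_bytes and dropping the rate() call, since that metric is a gauge.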

Kubernetes

Pod specs, requests and limits, HPA state, replica counts, node taints and tolerations.

Cloud pricing

Instance-type catalogue and pricing for AWS, Azure and GCP, used only to compute the cost delta.

What you get back

The structured answer, ready to share with FinOps

The same shape every week, so your platform team can scan it in a minute and your FinOps team can sign off in another.

01

Over-provisioned workloads

Workloads consistently below 30% of their requested CPU or memory for the trailing 30 days, ranked by waste.
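
A minimal sketch of how such a check could look, assuming p95 usage over the window stands in for "consistently" and that waste is scored as unused CPU cores plus unused memory converted at a hypothetical 4 GiB per core. The data class, the scoring weight, and the sample numbers are illustrative assumptions, not Hyground's actual method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadUsage:
    name: str
    cpu_request_cores: float
    cpu_p95_cores: float
    mem_request_gib: float
    mem_p95_gib: float


def utilisation(used: float, requested: float) -> float:
    """Fraction of the request actually used; no request counts as fully used."""
    return used / requested if requested > 0 else 1.0


def over_provisioned(workloads, threshold=0.30, gib_per_core=4.0):
    """Return (name, waste) pairs for flagged workloads, largest waste first."""
    flagged = []
    for w in workloads:
        cpu_ratio = utilisation(w.cpu_p95_cores, w.cpu_request_cores)
        mem_ratio = utilisation(w.mem_p95_gib, w.mem_request_gib)
        if cpu_ratio < threshold or mem_ratio < threshold:
            unused_cpu = max(w.cpu_request_cores - w.cpu_p95_cores, 0.0)
            unused_mem = max(w.mem_request_gib - w.mem_p95_gib, 0.0)
            # Hypothetical score: memory converted to core-equivalents.
            flagged.append((w.name, unused_cpu + unused_mem / gib_per_core))
    return sorted(flagged, key=lambda item: item[1], reverse=True)


sample = [
    WorkloadUsage("api", 4.0, 0.8, 8.0, 6.0),      # CPU at 20% of request
    WorkloadUsage("worker", 2.0, 1.5, 4.0, 3.0),   # healthy on both axes
    WorkloadUsage("cache", 1.0, 0.9, 16.0, 4.0),   # memory at 25% of request
]
ranked = over_provisioned(sample)
```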

02

Under-provisioned workloads

Workloads regularly throttled or OOM-killed, with the evidence and the suggested new request.
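
One way a suggested request could be derived, sketched under stated assumptions: a workload counts as throttled when more than 5% of its CFS periods were throttled (for example, from the cAdvisor counters container_cpu_cfs_throttled_periods_total and container_cpu_cfs_periods_total), and the new request is p99 usage plus 20% headroom, rounded up to 50 millicores. The thresholds and the function name are hypothetical.

```python
import math
from typing import Optional


def suggest_cpu_request_millicores(
    p99_cores: float,
    throttled_periods: int,
    total_periods: int,
    throttle_limit: float = 0.05,
    headroom: float = 1.2,
    step_millicores: int = 50,
) -> Optional[int]:
    """Return a suggested CPU request in millicores, or None if not throttled."""
    throttle_ratio = throttled_periods / total_periods if total_periods else 0.0
    if throttle_ratio <= throttle_limit:
        return None  # throttling is within the assumed tolerance
    raw_millicores = p99_cores * headroom * 1000
    # Round up to the next multiple of step_millicores.
    return math.ceil(raw_millicores / step_millicores) * step_millicores


suggestion = suggest_cpu_request_millicores(1.3, throttled_periods=300, total_periods=1000)
```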

03

HPA misconfigurations

HorizontalPodAutoscalers whose min and max bounds do not match observed load, with the suggested correction.
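
The idea can be sketched as a comparison between the configured bounds and the replica counts actually observed over the window. The sketch assumes two simple heuristics: an autoscaler pinned at maxReplicas for more than 10% of samples probably needs a higher ceiling, and one that never rose above minReplicas may not need autoscaling headroom at all. Both rules and the 10% cut-off are illustrative assumptions.

```python
def check_hpa_bounds(min_replicas, max_replicas, observed, pinned_fraction=0.10):
    """Compare HPA bounds with observed replica counts; return finding strings."""
    findings = []
    if not observed:
        return findings
    # Share of samples where the autoscaler had no room left to scale up.
    at_max = sum(1 for r in observed if r >= max_replicas) / len(observed)
    if at_max > pinned_fraction:
        findings.append(
            f"maxReplicas={max_replicas} reached in {at_max:.0%} of samples"
        )
    if max(observed) <= min_replicas:
        findings.append(
            f"never scaled above minReplicas={min_replicas}"
        )
    return findings


pinned = check_hpa_bounds(2, 5, [5, 5, 5, 4, 3, 5])
idle = check_hpa_bounds(3, 10, [3, 3, 3])
healthy = check_hpa_bounds(2, 10, [2, 4, 6, 3])
```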

04

Projected cost delta

The net monthly cost impact if every recommendation is applied, broken down by namespace.
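
The arithmetic behind a per-namespace delta can be sketched as follows. The prices, the 730-hour month, and the record fields are hypothetical placeholders; real figures would come from the cloud pricing catalogue, and negative numbers mean savings.

```python
from collections import defaultdict


def monthly_delta_by_namespace(
    recommendations,
    price_per_core_hour,
    price_per_gib_hour,
    hours_per_month=730,
):
    """Sum the projected monthly cost change of each recommendation per namespace."""
    totals = defaultdict(float)
    for rec in recommendations:
        hourly = (
            rec["cpu_delta_cores"] * price_per_core_hour
            + rec["mem_delta_gib"] * price_per_gib_hour
        )
        # The change applies to every replica of the workload.
        totals[rec["namespace"]] += hourly * hours_per_month * rec["replicas"]
    return dict(totals)


recs = [
    {"namespace": "payments", "cpu_delta_cores": -2.0, "mem_delta_gib": -4.0, "replicas": 3},
    {"namespace": "search", "cpu_delta_cores": 1.0, "mem_delta_gib": 0.0, "replicas": 2},
]
# Hypothetical unit prices, for illustration only.
deltas = monthly_delta_by_namespace(recs, price_per_core_hour=0.04, price_per_gib_hour=0.005)
net = sum(deltas.values())
```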

Sovereign AI SRE Agent in your perimeter

Hyground is not SaaS. It runs inside your perimeter on a bring-your-own-chart, bring-your-own-model basis and sends no data back to us, so it can meet strict security and data-compliance requirements. It speeds up both incident resolution, through automatic root-cause analysis (RCA), and your day-to-day operations work. Trusted by industry giants.

Related use cases

Other recurring operations work

Triage weekly cloud cost movers

Top deltas across AWS, Azure and GCP with the likely cause and the owning team, ready every Monday morning.

Sweep TLS certificates expiring this month

Every cert-manager Certificate object expiring within 30 days, grouped by owner, with the renewal state per cert.

Audit dangerous RBAC bindings

Every binding granting cluster-admin or wildcard verbs across every cluster, with subject attribution and change history.

Run the same pass against your cluster

Book a 30-minute technical deep dive. We will run the right-sizing workflow against your environment and walk through the resize list with your platform engineers.