Underutilization and segregation are the classic strategies for ensuring resources are available when work absolutely must get done. Keep a database on its own server so when the load spikes another VM or high priority thread can't interfere with RAM, power, disk, or CPU access. And when you really need fast and reliable networking you can't rely on QOS, you keep a dedicated line.
Google flips the script in Heracles: Improving Resource Efficiency at Scale, shooting for high resource utilization while combining different load profiles.
I'm assuming the project name Heracles was chosen not simply for his legendary strength, but because when strength failed, Heracles could always depend on his wits. Who can ever forget when Heracles tricked Atlas into taking the sky back onto his shoulders? Good times.
The problem: better utilization of compute resources while complying service level objectives (SLOs) for latency-critical (LC) and best effort batch (BE) tasks: