Context switching isn’t really involved here. The issue is not that the CPU is frequently switching between processes, but that when a container exceeds its I/O limits (whether IOPS or bandwidth), the kernel actively throttles its I/O, forcing the processes in that container to wait for their I/O operations to complete.
When throttling kicks in, those processes drive up I/O wait. Even though they are blocked, they remain resident in memory and continue to consume RAM, which can become a problem if memory is also tight. The key point, though, is that the high I/O wait is confined to the container (website) that hit its limit. Since each container is managed by its own cgroup, the high I/O wait and any associated load increase stay isolated to that container.
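For reference, here is a minimal sketch of what such a limit looks like at the cgroup v2 level. The cgroup path and the 8:0 (major:minor) device number are placeholders I made up for illustration; whatever container tooling you use normally manages the equivalent files for you.

```python
# Minimal sketch, assuming cgroup v2 and that each container has its own cgroup.
# The cgroup path and the "8:0" device number are hypothetical placeholders.
from pathlib import Path

CGROUP = Path("/sys/fs/cgroup/sites/example-site")  # hypothetical per-container cgroup

def set_io_limit(riops: int, wiops: int, device: str = "8:0") -> None:
    # Once this rule is written, the kernel throttles the cgroup's reads/writes
    # above the limit, and its processes sit in I/O wait until requests complete.
    (CGROUP / "io.max").write_text(f"{device} riops={riops} wiops={wiops}\n")

def clear_io_limit(device: str = "8:0") -> None:
    # "max" removes the cap, so the container is no longer throttled.
    (CGROUP / "io.max").write_text(f"{device} riops=max wiops=max\n")
```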
I believe that if you inspect resource usage inside the container, say by running top or checking cgroup-specific metrics, you’ll see a spike in I/O wait. On the host, top aggregates data across all processes, so you might only notice a slight increase overall, even though heavy throttling is happening in a single container.
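If you want to check this directly, here is a rough sketch that compares the container’s own I/O pressure (PSI) with the host-wide figure. It assumes cgroup v2 with pressure-stall information enabled, and the cgroup path is the same placeholder as above.

```python
# Minimal sketch, assuming cgroup v2 with pressure-stall information (PSI) available.
# The cgroup path is a hypothetical placeholder.
from pathlib import Path

CGROUP = Path("/sys/fs/cgroup/sites/example-site")

def some_io_pressure(psi_text: str) -> str:
    # The "some" line reports the share of time at least one task was stalled on I/O.
    for line in psi_text.splitlines():
        if line.startswith("some"):
            return line
    return ""

# Per-container stall time vs. the host-wide figure that top effectively averages away.
print("container:", some_io_pressure((CGROUP / "io.pressure").read_text()))
print("host     :", some_io_pressure(Path("/proc/pressure/io").read_text()))
```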
Unless your server is very I/O constrained, it’s often best to leave I/O limits (IOPS and bandwidth) disabled. On enterprise-grade drives, setting these limits can lead to unwanted throttling that causes performance problems (high iowait and increased load) for the container that hits the limit, without providing much real isolation benefit, since the drives are already fast enough to absorb the load.
I recommend focusing on CPU, RAM, and nproc limits instead, as they tend to be more effective. In many cases, limiting only RAM and nproc may be sufficient, since nearly every process uses memory and runaway process creation can be harmful.
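As a rough illustration of those limits at the cgroup v2 level (again, the cgroup path is a placeholder; LXC, Docker, systemd and similar front-ends expose the same knobs under their own names):

```python
# Minimal sketch, assuming cgroup v2. The cgroup path is a hypothetical placeholder.
from pathlib import Path

CGROUP = Path("/sys/fs/cgroup/sites/example-site")

def apply_limits(mem_bytes: int, max_procs: int, cpu_pct: int) -> None:
    (CGROUP / "memory.max").write_text(f"{mem_bytes}\n")   # hard RAM cap
    (CGROUP / "pids.max").write_text(f"{max_procs}\n")     # nproc-style process cap
    period = 100_000                                       # scheduler period in microseconds
    quota = period * cpu_pct // 100                        # CPU time allowed per period
    (CGROUP / "cpu.max").write_text(f"{quota} {period}\n")
    # io.max is deliberately left untouched, i.e. unlimited.

apply_limits(mem_bytes=512 * 1024 * 1024, max_procs=100, cpu_pct=50)
```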