Adam, I will have to disagree on this, but as I mentioned, we are still testing and will bring forward the results, whatever they are.
One such case: updating plugins in the middle of the night with no visitors on the website. I hit the NPROC and RAM limits extremely fast, which is not something I can replicate anywhere else. I am comparing the behaviour of OLS/LSE with Enhance to OLS with RunCloud (so anyone can test it for themselves).
Note: the NPROC limit is set to 24 with 3 GB RAM. Since each process can be ~200 MB in size, you can see how fast you can hit the limit. That would not be an issue if processes were removed swiftly, but they are not, which is currently our main point of investigation.
Note 2: If NPROC is set to 24 with 6 GB RAM, the 503 errors are less frequent, since the theoretical maximum of 250 MB per process multiplied by 24 is 6 GB. But we've had issues with websites hitting 2 GB RAM with 24 NPROC in the middle of the night with almost zero visitors (at that moment I believe only CRON was running and doing nothing).
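To put rough numbers on the two notes above, here is a quick back-of-the-envelope sketch. The function names are made up for illustration, the per-process sizes are the approximate figures quoted above, and it assumes 1 GB = 1000 MB so that it matches the 6 GB figure.

```python
# Rough capacity arithmetic for an NPROC limit versus available RAM.
# Assumption: every process grows to the same size, which is a worst case.

MB_PER_GB = 1000  # decimal GB, to match "250 MB x 24 = 6 GB"


def worst_case_ram_mb(nproc, per_process_mb):
    """RAM needed if every allowed process reaches per_process_mb."""
    return nproc * per_process_mb


def max_processes_within(ram_mb, per_process_mb):
    """How many full-size processes fit before RAM runs out."""
    return int(ram_mb // per_process_mb)


if __name__ == "__main__":
    # 24 processes at ~200 MB each already exceed a 3 GB plan.
    print(worst_case_ram_mb(24, 200))                 # 4800 MB
    # 24 processes at 250 MB each fill a 6 GB plan exactly.
    print(worst_case_ram_mb(24, 250))                 # 6000 MB
    # With 3 GB and ~200 MB per process, RAM runs out at 15 processes,
    # well before the NPROC limit of 24 is reached.
    print(max_processes_within(3 * MB_PER_GB, 200))   # 15
```

So on the 3 GB setup RAM is the first limit to be hit, which is why lingering processes hurt so quickly.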
We've tested lowering values such as RetryTimeout and others, with clear results: doing so successfully mitigated the 503 errors. Processes hanging around for long periods instead of exiting seem to be the root cause of most of our problems.