prasad0889

cgroups CPU limits can't be bypassed as they're enforced at the kernel level, but high server load can still occur if users are allocated too much CPU, leading to waiting tasks and contention. Other resources like disk I/O or RAM can also become bottlenecks. So, it's important to set ALL limits wisely to keep things under control.

CloudLinux simplifies this by automatically managing resource limits, but when relying solely on cgroups, you need to carefully tweak settings and optimize limits to ensure balanced performance.
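
As a rough illustration of what a CPU limit looks like at the kernel level: cgroup v2 stores it in a file named cpu.max as a quota and a period in microseconds, so a 2-core limit reads "200000 100000" and no limit reads "max 100000". The sketch below converts those two values into a core count. The function names are made up for this example, and the path in the commented usage is only a placeholder for whatever cgroup directory the panel created for the site.

```shell
# Convert the two fields of a cgroup v2 cpu.max file into a core count.
# A quota of "max" means no CPU limit is set.
cpu_max_to_cores() {
    # $1 = quota in microseconds or "max", $2 = period in microseconds
    if [ "$1" = "max" ]; then
        echo "unlimited"
    else
        awk -v q="$1" -v p="$2" 'BEGIN { printf "%.2f\n", q / p }'
    fi
}

# Read cpu.max from a cgroup directory and print the equivalent core count.
cgroup_cpu_cores() {
    # $1 = cgroup directory
    read -r quota period < "$1/cpu.max" && cpu_max_to_cores "$quota" "$period"
}

# Example (placeholder path, adjust to your own layout):
#   cgroup_cpu_cores /sys/fs/cgroup/websites/add_website_id_here
```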

    cPFence

    If the PHP process is not spawned within the user's cgroup, the limits won't affect it, correct?
    You keep mentioning the right limits, but honestly I haven't found anything that stops those load peaks without impacting users negatively.
    The test I did is simple:

    I cloned a website onto both a DA + CL server and an Enhance server, with the same hardware and the same resources.
    I sent load with loader.io. The load was not big enough to cause problems for the server, but big enough to be clearly visible in top.

    Difference in behaviour: on the DA + CL server, the website's user gets 200% CPU (2 cores), and if nproc is exceeded the user gets a 503 (the 503 is a different behaviour altogether, because of the extra EP limit on CL).

    On Enhance I don't see high CPU usage under the user. Instead I see it under an anonymous user, as explained for the native setup, and that user is using 6 CPU cores.

    So to me this test shows that, for inbound traffic or processes, the limits aren't enforced on the Enhance setup. Apart from the CPU, not running the user under suexec in my opinion allows users to spawn a lot of processes, which results in reaching the nproc limit and a 503. Those processes would otherwise be limited by suexec and queued before nproc is reached.

    On the CL server the suexec limit is set to 40 by default and entry processes to 20.

      gmakhs On Enhance I don't see high CPU usage under the user. Instead I see it under an anonymous user, as explained for the native setup, and that user is using 6 CPU cores.

      This isn’t the expected behavior. In our tests and daily live scenarios on many servers with the cPFence Owl module enabled, we get notified about high load immediately and haven’t encountered this issue. I recommend opening a ticket with the Enhance team to investigate further.

      gmakhs You keep mentioning the right limits, but honestly I haven't found anything that stops those load peaks without impacting users negatively.

      You won’t find a solution as long as you insist on giving the user 2 full cores. That’s why I’ve recommended several times before either lowering it or moving that client to a VM. Based on our tests with Enhance and other panels using cgroups without CloudLinux, this approach just doesn’t work for a busy shared hosting server.

        cPFence
        Tha is for your reply. I am also considering the Owl module, but before I purchase your product I need to be sure I will continue with Enhance in the long term for shared hosting. For cloud hosting on VMs it works well.

        The problem in my scenario is not the CPU limit; the server can easily handle that. The problem is that the processes aren't started within the user, so the limit is not respected.
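
One way to test that claim is to ask the kernel which cgroup a process is actually in. On a cgroup v2 host, /proc/<pid>/cgroup contains a line of the form 0::/path. A minimal sketch, with made-up function names; the process name used in the commented usage is an assumption about the setup, not something confirmed in this thread.

```shell
# Print the cgroup v2 path found in /proc/<pid>/cgroup-style content on stdin.
# The cgroup v2 entry is the line that starts with "0::".
cgroup_v2_path() {
    sed -n 's/^0:://p'
}

# Print the cgroup that a running process belongs to.
pid_cgroup() {
    # $1 = process ID
    cgroup_v2_path < "/proc/$1/cgroup"
}

# Example (assumes the workers show up under a name matching "php"):
#   for pid in $(pgrep php); do echo "$pid $(pid_cgroup "$pid")"; done
```

If a worker that serves the site reports a path outside that site's cgroup, the limits set on the site's cgroup do not apply to it.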

          gmakhs The problem in my scenario is not the CPU limit; the server can easily handle that. The problem is that the processes aren't started within the user, so the limit is not respected.

          I mentioned cPFence Owl just to share that we often receive notifications about high load from the Owl, but we’ve never encountered an issue where a user bypassed CPU limits or processes ran outside their assigned limits. It might be worth reaching out to Adam about this issue to pinpoint what’s wrong with your setup, and let us know the outcome.

          gmakhs I am also considering the owl module

          Many Enhance users manage perfectly fine without the Owl module.

          As long as the server has enough resources and you avoid bad plugins or themes, you should be fine without it.

            cPFence I wanted to write "thanks for your reply", not "that is your reply"; my autocorrect messed it up. I do appreciate the time you spent trying to help and explain.

            cPFence

            We narrowed it down to high I/O wait, but couldn't explain why it happens. The servers I tested are:

            1) DA + CL server with ~120 websites running, load 12
            2) Enhance server identical to server 1, with 10-20 websites running (low traffic, owned by me), load 120
            3) Enhance server on different hardware, with 10-20 low-traffic websites, load 70-80

            I performed the tests using loader.io on the same cloned WordPress installation. Packages have the same limits on the DA and Enhance servers.

            loader.io results on the DA server:
            Success 597610, Timeout 1667

            loader.io results on Enhance 1 (similar results on Enhance 2):
            Success 3879

            Websites on Enhance gave 503 errors after a while; the DA + CL website gave no errors.

            Other observations: workers consume more CPU on the Enhance servers and run under user 99, whereas on DA they consume less, run under the apache user and run in suexec.

            Cache files are in a user directory on DA, whereas on Enhance they are not. The cache on the Enhance server reached 30 GB during the test on one page.

            From my observation, the issue comes from the missing suexec and/or something not working correctly with the cache.

            Others in this thread had a similar experience, so it seems to be an ongoing issue.

              gmakhs

              We have lots of busy servers running the most abusing users you can imagine, and they all run like clockwork for us. Other users in this thread mentioned hacked sites abusing CPU, but I don’t think that’s related to your issue.

              Why not open a ticket with Enhance? You’ve been dealing with this for quite a while now, and I think it’s time to get help from Adam. He knows all the ins and outs of the panel and can help you sort this out.

                cPFence I did; we couldn't find the reason behind it. He believes it is not caused by Enhance, but the fact that it happens on 2 different dedicated servers makes me believe the opposite.

                In any case, the main issue isn't that the server gets load, but that this load is caused by a single user on the server, which shouldn't happen, and it continues even after the user starts getting 503 errors.

                I am running a few last tests today and then I will give up.

                cPFence I will update this thread now in case anyone else is experiencing the same thing: the problem is caused by the website user reaching the memory limit and going into swap, which causes high I/O wait. There is currently no way to limit that apart from disabling swap on the server completely, which seems to be a temporary solution.

                I hope for a solution from Enhance after v12 is out.
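
For anyone checking the same thing: cgroup v2 tracks swap per cgroup in memory.swap.current and caps it with memory.swap.max (max means no cap, 0 means that cgroup cannot swap out). Whether Enhance sets or exposes memory.swap.max is not something this thread establishes, so treat the sketch below only as a way to look at the values. The function name is made up, and these files may be absent on some kernels or configurations.

```shell
# Print swap usage and the swap cap for one cgroup v2 directory.
# memory.swap.current is the number of bytes of swap in use by the cgroup.
# memory.swap.max is the cap in bytes, or "max" when no cap is set.
cgroup_swap_status() {
    # $1 = cgroup directory
    printf 'swap.current=%s swap.max=%s\n' \
        "$(cat "$1/memory.swap.current")" \
        "$(cat "$1/memory.swap.max")"
}

# Example (placeholder path, adjust to your own layout):
#   cgroup_swap_status /sys/fs/cgroup/websites/add_website_id_here
```

A swap.current that keeps growing while swap.max reads max would be consistent with the behaviour described above.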

                  gmakhs

                  That’s why I’ve recommended several times to use 1024 IOPS and 8 MB I/O limits instead of unlimited IOPS like you’re doing. This will help you keep things under control. We even apply stricter limits for users causing high load issues (using overrides), and despite having lots of abusing users, this setup works very well for us with swap on.
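
To make those numbers concrete: in cgroup v2 an I/O limit is one line per block device in io.max, in the form MAJ:MIN rbps=... wbps=... riops=... wiops=.... The sketch below builds such a line. It assumes that 8 MB means 8 MiB/s applied to both reads and writes, and that 1024 IOPS applies to both as well. The device 8:0 and the function name are placeholders, and how the panel maps its own fields onto this file is not confirmed here.

```shell
# Build a cgroup v2 io.max line for one block device.
# The same bandwidth and IOPS caps are applied to reads and writes.
io_max_line() {
    # $1 = device as MAJ:MIN, $2 = IOPS cap, $3 = bandwidth cap in MiB/s
    bps=$(( $3 * 1024 * 1024 ))
    printf '%s rbps=%s wbps=%s riops=%s wiops=%s\n' "$1" "$bps" "$bps" "$2" "$2"
}

# Example (8:0 is a placeholder device number):
#   io_max_line 8:0 1024 8
```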

                    cPFence Unlimited IOPS and bandwidth was recommended by Enhance. The issue occurs in both scenarios, with unlimited and without. I don't believe the limits apply during swapping.

                      gmakhs

                      We never use unlimited in any cgroup limits (they're called limits for a reason), and it works flawlessly for us. We deal with abusing users daily, blacklist them in our cpf settings, and apply stricter limits when needed.

                      Not sure about your setup and testing, but I’ve shared what works for us and our clients’ servers. What works for us might not work for you. That’s all I can add.

                        cPFence Indeed, I am grateful for the insights you shared.
                        Though I was right from the beginning that the limits are not properly enforced, and I hope it will be fixed in coming releases.

                          gmakhs

                          You still don't quite understand. Let me provide further clarification. The limits are enforced by cgroups, which are part of the Linux kernel. Enhance itself doesn’t enforce these limits - it just provides an interface to help you set them up with ease, avoiding the hassle of manually configuring them using Linux commands. This is Enhance’s primary role in managing resource limits.

                          You can inspect the current cgroup hierarchy and settings for each website with:

                          systemd-cgls
                          
                          ls /sys/fs/cgroup/websites/add_website_id_here/

                          You want to verify the settings for cpu.max, memory.max, io.max, pids.max, etc. If the configurations you found for each website are accurate, then Enhance CP has done its part of the job. Moving forward, you should focus on troubleshooting the cgroups themselves rather than attributing any issues to Enhance. Do you understand?
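
A small sketch for that check, printing the four files side by side for one cgroup directory. The function name is made up, and the path in the commented usage is the same placeholder as above.

```shell
# Print the main cgroup v2 limit files for one cgroup directory.
# A file that is missing or unreadable is reported as "n/a".
show_cgroup_limits() {
    # $1 = cgroup directory
    for f in cpu.max memory.max io.max pids.max; do
        if [ -r "$1/$f" ]; then
            # io.max can hold one line per device, so join lines with spaces.
            printf '%s: %s\n' "$f" "$(tr '\n' ' ' < "$1/$f" | sed 's/ *$//')"
        else
            printf '%s: n/a\n' "$f"
        fi
    done
}

# Example:
#   show_cgroup_limits /sys/fs/cgroup/websites/add_website_id_here
```

A value of max in any of these files means that particular limit is not set.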

                          And you can even monitor resource usage for each website in real time with:

                          systemd-cgtop

                          If you believe there’s a bug, consider reporting it to the appropriate Linux kernel maintainers or the distribution’s support team.

                            cPFence This is where we disagree. I partially agree with you about reporting to the right devs, but once Enhance advertises resource and tenant isolation and it's a paid product, it is their job to make sure it works as advertised.

                            Again, thank you for providing a helpful reply, and I believe you are doing an amazing job supporting this forum.
