I kept my swap intact, but set swappiness to 0. That fixed roughly 75% of my load spikes. With swappiness at 0, the swap space stays available, but the kernel will only use it in dire situations to avoid OOM. On my dedicated servers with 250 GB of RAM this is ideal, as RAM usage never surpasses 60 GB and the rest is cache; if I ever needed to use swap for any reason, I would have MUCH bigger issues to deal with.

On a small VPS, swap may be worth keeping for the special cases where swapping is preferable to an OOM kill, which can creep up more easily due to limited resources. I would still keep swappiness really low, because (especially on a VPS) you're going to have I/O limits that will bottleneck and cause load spikes, so I might run a swappiness of only 5.
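As a sketch of how the low-swappiness setup above can be persisted, a sysctl drop-in could look like this (the filename is arbitrary, and 5 is just the value discussed here):

```
# /etc/sysctl.d/99-swappiness.conf -- read by systemd-sysctl at boot
# Keeps swap available but strongly prefers dropping cache over
# swapping out anonymous pages.
vm.swappiness = 5
```

It can be applied immediately with `sudo sysctl vm.swappiness=5` for a one-off change, or `sudo sysctl --system` to reload all drop-ins.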

Removing the I/O limitations, and capping backups at 1 concurrent job, also greatly reduced random load spikes. I still keep limitations for CPU/RAM/nproc, though they are pretty generous. When the backup system runs, I believe it consumes one CPU at 100% against the website/account, so a package limited to 1 CPU is going to have problems while the backup runs. I would recommend no package lower than 2 CPUs, to keep backups from causing issues/load spikes/503 errors on sites. Of course, that's not too bad a problem if it's only 1 backup per day, done at a time of night when most of your customers don't have traffic or work to do - but we run backups every 4 hours all day, so it could be a big issue for customers if the resources weren't available.

    twest It's impossible to run backups only at night at the moment, since you can't specify different schedules for different servers; if the servers are on different continents, one of them will back up during high-traffic hours no matter what you do. Also, I didn't know backups count against the user's resource limits - are you certain about that?
    I personally use limits similar to cPFence's, and had big load spikes due to a user swapping (the server had available RAM). Also, in some of the tests a user could use 30 GB of RAM when the limit was 1024.

    And I can share more examples. Unfortunately, the answer most of the time is either "my servers work like clockwork, it can't be something wrong", or, in the case of Enhance, even after many tests we came to no conclusion as to why the high load occurs.

    The same tests performed on a DA + CL setup, on the same hardware, produced no load spikes.

    MySQL performance was not covered in any of the tests, due to the fact that Enhance doesn't limit anything there.

      gmakhs And I can share more examples. Unfortunately, the answer most of the time is either "my servers work like clockwork, it can't be something wrong"

      Complicated reproduction instructions (like the ones you mentioned in the other thread) can take hours to work through. Whenever you're sharing issues, it's always a good idea to include a simple, clear, step-by-step guide to reproducing the problem (using both your current limits and the exact limits we recommended above). This will hopefully encourage someone from the community to step in and help. Otherwise, you'll keep getting responses like "works good for us, not sure about yours", which I agree isn't really helpful for anyone.

      Can you even change:

      vm.dirty_ratio = 8
      vm.dirty_background_ratio = 4

      on Ubuntu 24? The file /etc/sysctl.conf doesn't exist on this version.
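      Yes - on Ubuntu 24.04, persistent sysctl settings normally go in a drop-in under /etc/sysctl.d/ instead of /etc/sysctl.conf. A minimal sketch (the filename below is arbitrary):

      ```
      # /etc/sysctl.d/99-dirty.conf
      vm.dirty_ratio = 8
      vm.dirty_background_ratio = 4
      ```

      Load it with `sudo sysctl --system` (or reboot). `sysctl --system` also still reads /etc/sysctl.conf if you create that file yourself.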

        We always use this configuration on our small VPSes, i.e. 4 CPU, 16 GB RAM.

        Recently we started capping IOPS and I/O at 2048 and 8 MB/s respectively (we're yet to see how that goes). We weren't capping those before, but the following config we've been using throughout the year, I guess; of course we also constantly tune the database thanks to Releem. P.S. We set nproc to 150 on every package, because with IOPS and I/O at 1024 and 8 MB/s we found unusual queuing choking the server while nproc was 50, and even after we raised it to 100...

        cPFence any suggestions?

        vm.overcommit_memory = 1
        vm.swappiness = 10
        vm.vfs_cache_pressure = 100
        vm.dirty_background_ratio = 5
        vm.dirty_ratio = 20

          pratik_asabe

          We prefer to keep it at the defaults:

          vm.overcommit_memory = 0
          vm.swappiness = 60
          vm.vfs_cache_pressure = 100
          vm.dirty_background_ratio = 10
          vm.dirty_ratio = 20

          Swap size is set to 4 GB (never larger than this, no matter how much RAM is installed).
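          For reference, a fixed 4 GB swap file like the one described above could be set up roughly like this (path and size as mentioned in the post; this is a sketch, not an exact recipe from the thread):

          ```shell
          # Create and enable a 4 GB swap file (needs root).
          sudo fallocate -l 4G /swapfile     # reserve the space
          sudo chmod 600 /swapfile           # swap files must not be world-readable
          sudo mkswap /swapfile              # write the swap signature
          sudo swapon /swapfile              # enable it now
          # Persist it across reboots:
          echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
          ```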

          We also apply the cgroups limits mentioned above and set max_user_connections=25 in MariaDB. If a user causes trouble, we simply add them to the Owl blacklist and move on. This setup works well for us.
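          A sketch of applying that per-user cap (the config path is a typical Debian/Ubuntu MariaDB location, not something from the post; adjust for your distro):

          ```shell
          # Apply at runtime (lost on restart):
          mysql -e "SET GLOBAL max_user_connections = 25;"
          # Persist it in a config drop-in so it survives restarts:
          printf '[mysqld]\nmax_user_connections = 25\n' \
            | sudo tee /etc/mysql/mariadb.conf.d/99-user-limits.cnf
          ```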

            Kosta We did extensive testing in the past, and these are the most stable settings we ever found. When you use 50 for cache pressure, the kernel becomes more aggressive in terms of memory usage (correct me if I'm saying something wrong). In my experience as a decent sysadmin who has set up many servers, I think these are the sweet-spot values - and yes, they may not work for everyone!

            cPFence Interesting! I'll definitely test your suggested config and share my experience. I agree, sometimes the defaults work like a charm compared to over-tweaking like we did; we spent almost 3-4 months in R&D to find these stable values... lol.

              cPFence set max_user_connections=25 in MariaDB

              btw, do you use max_connections paired with max_user_connections? like,

              max_connections = 151
              max_user_connections = 25

                pratik_asabe

                Yes. For max_connections, ensure the value is high enough to handle expected peak traffic but not so high that it overwhelms server resources.

                Monitor the Max_used_connections status variable in MariaDB to see the highest number of simultaneous connections used on your server. (Note: This value resets after a MariaDB restart, so make sure to check it after peak hours to get a clear picture.) Adjust max_connections if you frequently approach this limit. For example, if your max used connections consistently stays around 100, leaving the default of 151 is perfectly fine.
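                As a toy sketch of that sizing logic (the 50% headroom factor is my own assumption, not something stated above):

                ```shell
                # Read the observed peak, e.g. from:
                #   mysql -e "SHOW GLOBAL STATUS LIKE 'Max_used_connections';"
                peak=100                          # substitute your observed value
                suggested=$(( peak + peak / 2 ))  # add ~50% headroom
                echo "peak=$peak suggested_max_connections=$suggested"
                ```

                With a peak of 100 this suggests 150, i.e. the default of 151 is indeed fine, matching the example above.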

                Here’s a script we use to display the values on our servers:
                https://gist.github.com/cPFence/98c359cfade030fd62adb6681312a97a
                It provides a quick overview of the most important metrics.

                  Kosta Seems like a good system admin, great work.

                  When you're in IT you're constantly learning, no matter how far you've come...

                  Kosta The Enhance team should be the asset to this community, not cPFence

                  cPFence is just doing what a community member should be doing: adding value and helping other members. You should try it sometime 😉

                    Kosta No, cPFence is just promoting his services lol

                    And is that a crime in any way?! His product is based on Enhance itself; he obviously won't promote it on cPanel or DA forums lol. It's a win-win when you add value to the community and help them - you also get promoted naturally. It's an old but proven strategy, and above all, strategy or not, promoting or not, helping the community is always good; there's no harm in it!
