Hi all,

I run a busy site, and every now and then it completely freezes for about 1 min.
I can see this notice in the OLS logs:

No request delivery notification has been received from LSAPI application, possible dead lock.

Has anyone had a similar issue?

I just raised PHP_LSAPI_CHILDREN from 10 to 30 and also Max Connections for lsphp to 30.

It's frustrating that these changes are not persistent on Enhance. Will they stay if I don't do anything in the panel?


    wav3front Updates within the panel would more likely change the vhost configs. It's updating the Enhance panel version that will cause the Docker OLS to update, thereby undoing your webconsole settings.

    I've not seen this issue across a variety of servers, from VPSs to dedis. What's your setup? Maybe there's an IO wait issue somewhere.

    And what are the server specs?

    It's a 4464P server with 128GB of RAM and NVMe drives.

    It's not the hardware. The load is not high.


      wav3front OK, at least you've tried to tune it for more throughput. What about DB slow queries? Is it a WP site? A 1 minute pause is pretty significant, bad plugin maybe?

      Honestly I am not sure what else to suggest.

      wav3front

      "LSAPI_AVOID_FORK=1 will only keep the child processes alive if there is enough available memory. By default "enough" is set to 1GB, so if your server has less than 1 GB available, setting LSAPI_AVOID_FORK=1 will not work. Instead you can set a limit, as in LSAPI_AVOID_FORK=100M. This will allow the LSAPI_AVOID_FORK variable to work as expected."

      I'd say it's fine, it's telling it to respect 200MB instead of 1GB.

      You could up that to 500M, or even 1 if the server is dedicated to a big website. 1 means zombie processes might exist, which with the RAM you've got isn't likely to be an issue.

      Are you using any hard limits on a package that the website is a part of?

        Rich

        Thanks for the info.

        No, I'm not using any limits. It's a dedi server.

        This 1 min freeze continues to happen even with increased OLS settings. I believe it's a PHP/database issue, so I need to dig deeper.

          wav3front I am personally thinking DB, so maybe have a look here. Although I wouldn't rule out php workers entirely.

          15 days later

          Hi,

          I cannot find a solution to this.

          1. There's nothing else in the OLS log. The only error is "No request delivery notification has been received from LSAPI application, possible dead lock."
          2. Raising limits like PHP_LSAPI_CHILDREN or Max Connections makes no difference.
          3. Nothing in the PHP logs that could at least give me a hint.
          4. Nothing in the MariaDB logs that could give me a hint.
          5. The MariaDB slow query log does contain some entries, but they seem to be uncorrelated with the issue. They are too few and the timing does not match.

          I'm totally out of ideas. I have started doing random things like changing PHP versions... I don't know what to do.
          It's not WP websites. One website is Xenforo, the other one is an old classified application called Oxy Classifieds,
          and Revive Ad Server also runs on the same server.

          Any ideas?


            wav3front

            Do you notice any 500 or 503 errors when seeing these deadlock messages?

            Do you have any package restrictions for resources at all? If so, what about setting up a separate package with no restrictions and seeing if one of these sites still has issues? Possibly an nproc or IO issue? I used to see CPU spikes with some restrictions on.

            As for OLS, have you looked at initTimeout and pcKeepAliveTimeout?

            Have you tried Xdebug or deep PHP logging? Could it be a poorly written PHP script causing infinite loops, or some really long-running threads? Have you tried raising the PHP execution timeout to something ungodly to see if you can catch something in htop?

            Is it even possible to put these sites, or even one site, on a separate server, so that you can try some wild settings and see? Try alternative web servers too, just in case.

            Do you have this file, and is there anything in it? /usr/local/lsws/logs/stderr.log

            Hi @Rich, thanks for getting back to me.

            I haven't personally seen a 500 when viewing the websites; all I get is a complete freeze for 1-2 minutes.
            The log does contain some though:

            2024-12-18 17:00:31.027846 NOTICE [440899] [***.**.194.55:38960:HTTP2-3#*****.gr] No request delivery notification has been received from LSAPI application, possible dead lock.
            2024-12-18 17:00:30.096404 NOTICE [440900] [***.***.251.176:63184:HTTP2-303#*****..gr] ExtConn timed out while connecting.
            2024-12-18 17:00:30.096490 NOTICE [440900] [***.***.251.176:63184:HTTP2-303#*****..gr] oops! 503 Service Unavailable
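To line these notices up against the MariaDB slow query log, it can help to reduce the OLS log to a per-minute count of stall-related lines. A minimal Python sketch, assuming the log keeps the timestamp layout shown in the excerpt above; the function name `stall_minutes` and the marker list are illustrative choices, not anything OLS provides:

```python
import re
from collections import Counter
from datetime import datetime

# Timestamp layout as it appears in the excerpt: "2024-12-18 17:00:31.027846 ..."
TIMESTAMP_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+ ")

# Messages taken from the excerpt; extend the tuple if other symptoms show up.
STALL_MARKERS = (
    "possible dead lock",
    "ExtConn timed out",
    "503 Service Unavailable",
)


def stall_minutes(lines):
    """Count stall-related log lines per minute (seconds truncated)."""
    counts = Counter()
    for line in lines:
        match = TIMESTAMP_RE.match(line.strip())
        if match is None:
            continue  # not a timestamped log line
        if any(marker in line for marker in STALL_MARKERS):
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            counts[stamp.replace(second=0)] += 1
    return counts
```

Feeding it the OLS log line by line gives the minutes in which freezes occurred, which can then be compared by hand with the timestamps in the slow query log.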

            Do you have any package restrictions for resources at all? If so, what about setting up a separate package with no restrictions and seeing if one of these sites still has issues? Possibly an nproc or IO issue? I used to see CPU spikes with some restrictions on.

            No, there are no limits. Those websites run on a dedi server with high resources.

            As for OLS, have you looked at initTimeout and pcKeepAliveTimeout?

            I'm trying to find those in OLS admin. Where are they?

            Have you tried Xdebug or deep PHP logging? Could it be a poorly written PHP script causing infinite loops, or some really long-running threads? Have you tried raising the PHP execution timeout to something ungodly to see if you can catch something in htop? Is it even possible to put these sites, or even one site, on a separate server, so that you can try some wild settings and see? Try alternative web servers too, just in case. Do you have this file, and is there anything in it? /usr/local/lsws/logs/stderr.log

            stderr contains nothing related, only:

            2024-12-02 10:01:18.800 [STDERR] sh: 1: /usr/sbin/sendmail: not found

            I suspect this is an issue with Oxy Classifieds. It's a badly written application. It has run for decades without issues, but it previously ran on MySQL 5 and now runs on MariaDB 11, so maybe MariaDB handles bad queries differently. But there is nothing worth noting in the slow query log or the MariaDB error log.

            The only thing I can do now is start separating the applications onto different servers, so that I can at least "make sure" that this is indeed coming from Oxy, as my hunch is telling me.


              wav3front If it's so old and came from MySQL 5.x, what's the table type? MyISAM? InnoDB took over as the default table type of choice; surely MariaDB would have complained a bit. What collation is it using? Is there any way of checking, and of spinning off a clone to play with these, if they're even applicable?

              initTimeout is under: External App > SAPI > lsphp: Initial Request Timeout (secs)

              In the same area there is Connection Keep-Alive Timeout.

              He had his under this wsgiDefaults config setting, but that's not as relevant within webconfig, as that's for Rails/Python/Node and you only need PHP.

              I was reading through this post: here

              There was also mention of a zlib setting; I don't think it's that, but you might as well try. What version of the server are you on? My current default is 1.7.19 and 1.8.2 is available. There's another post on here about updating the OLS webserver. It's possible there have been fixes since then, as people were complaining about this as late as 1.7.12 in that post.

                The OLS version is: OpenLiteSpeed 1.8.2

                I doubt if this is actually an OLS issue.

                Rich If it's so old and came from MySQL 5.x, what's the table type? MyISAM? InnoDB took over as the default table type of choice; surely MariaDB would have complained a bit. What collation is it using? Is there any way of checking, and of spinning off a clone to play with these, if they're even applicable?

                Yes, some tables are actually still MyISAM.

                I will switch to InnoDB and report results.
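Which tables are still MyISAM, and which collation each one uses, can be read from information_schema.TABLES. A rough Python sketch of that check, assuming the rows are fetched with whatever MariaDB client is at hand; the helper names and the example schema and table names are hypothetical, and the generated statements are best tried on a clone first:

```python
# Query to run in the MariaDB client. ENGINE and TABLE_COLLATION are
# standard columns of information_schema.TABLES.
FIND_NON_INNODB = """
SELECT TABLE_SCHEMA, TABLE_NAME, ENGINE, TABLE_COLLATION
FROM information_schema.TABLES
WHERE TABLE_TYPE = 'BASE TABLE'
  AND ENGINE <> 'InnoDB'
  AND TABLE_SCHEMA NOT IN
      ('mysql', 'information_schema', 'performance_schema', 'sys');
"""


def quote_ident(name):
    """Backtick-quote an identifier, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def conversion_statements(rows):
    """Build ALTER TABLE statements from (schema, table, engine, collation) rows.

    Rows already on InnoDB are skipped, so feeding in a full table list is safe.
    """
    return [
        f"ALTER TABLE {quote_ident(schema)}.{quote_ident(table)} ENGINE=InnoDB;"
        for schema, table, engine, _collation in rows
        if engine and engine.lower() != "innodb"
    ]


if __name__ == "__main__":
    # Hypothetical rows, shaped like the result of FIND_NON_INNODB.
    example = [("oxy_db", "ads", "MyISAM", "latin1_swedish_ci")]
    print("\n".join(conversion_statements(example)))
```

Generating the statements rather than executing them keeps the review step manual, which seems sensible for a decades-old schema.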

                Rich initTimeout is under: External App > SAPI > lsphp: Initial Request Timeout (secs)

                In the same area there is Connection Keep-Alive Timeout.

                Raised those, no change. I just had a freeze for 1 min.

                It might be worth moving that site onto a server where you can switch to Apache. If the issue goes away, it's OLS; if it doesn't, it's something else, most likely PHP or the DB.

                Another option might be to temporarily use Netdata, since you can look through the metrics, correlate them with the pauses, and try to drill down to any IO waits or CPU spikes. That said, some are concerned that removing it doesn't clean up properly, so again it might be something to do on a different server. I had some horrible CPU spikes and didn't know what was causing them until I saw with Netdata which client's PHP process it was. You already know which client is the issue, but it might give more detail across the board to review.

                Does this only affect this site, or do the pauses affect other sites at the same time?

                Also, have you given this a try? enhance php strace. It talks about looking for hangs.

                Beyond this, all I can think of is something like New Relic or PHP X-Ray (I've not used it, but it might help you get visibility into what's happening). I've set up Xdebug in the past to look into why LSWS was failing on long executions when I configured the server to allow them. It helped, but it's a pain to set up, and it was my own code I was working on.

                If you've got PHP set up pretty much as it was prior to moving the site, then it really could be something with MariaDB.

                I know you said you'd used the slow query log, but what about this:
                Configure Long Query Time: define the time threshold for a query to be considered “slow” by setting the long_query_time system variable to a value in seconds (e.g., 5 seconds). In MariaDB 10.11 and later, log_slow_query_time is available as an alias for the same variable.
                SET GLOBAL long_query_time = 5; -- or: SET GLOBAL log_slow_query_time = 5;

                Then set it to something like 45 seconds, so you can isolate those long pauses and see whether they're caused by the DB; if not, it must be a PHP loop or poor execution. Likewise, what if you set the PHP execution time down to, say, 20 seconds? Can you make the site break instead of hang?

                I think you're at the stage of trying to isolate the what and where rather than the how. Fingers crossed for you!

                  Hi, just a quick update: I switched to InnoDB for ALL tables, nothing changed.

                  @Rich many thanks for the information. I will go through that and report results.

                  I have noticed something though:
                  Monitoring htop, some CPU cores go up to 100% and stay there for some time, even getting into "red" territory.
                  mariadb is at the top of the list.

                  Here's my my.cnf file:

                  [mysqld]
                  skip-log-bin

                  ssl-ca=/etc/certs/mysql/ca.pem
                  ssl-cert=/etc/mysql/ssl/cert.pem
                  ssl-key=/etc/mysql/ssl/key.pem

                  skip-host-cache
                  skip-name-resolve

                  default_authentication_plugin = mysql_native_password

                  innodb_flush_log_at_trx_commit=2
                  innodb_buffer_pool_size=6G
                  max_allowed_packet=512M
                  query-cache-type=1
                  query_cache_size=52428800
                  max_connections=500
                  innodb_flush_neighbors=0
                  innodb_flush_method=O_DIRECT_NO_FSYNC
                  innodb_io_capacity=450
                  innodb_random_read_ahead=ON
                  table_open_cache=16013

                  log_output=FILE
                  slow_query_log
                  slow_query_log_file=slow-queries.log
                  long_query_time=5.0

                  The reason I have enabled the query cache is that Oxy Classifieds has some very badly written queries, and it's a way to save it. What surprised me, though, is that before moving this website to Enhance, the database ran on a much slower Windows 2022 VM running MySQL 5.7 (for the query cache) without any issues. If anything, the CPU load was much lower.
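Whether the query cache is earning its keep can be estimated from the server's own counters. A hedged Python sketch, assuming the values are copied from SHOW GLOBAL STATUS (Qcache_hits, Com_select, Qcache_lowmem_prunes); the function name is made up for illustration, and the ratio is only a rough indicator, not a verdict:

```python
def query_cache_summary(status):
    """Summarise query cache effectiveness from SHOW GLOBAL STATUS values.

    `status` maps variable names to values (strings or ints), e.g. the rows
    returned by the client turned into a dict. Queries answered from the cache
    increment Qcache_hits rather than Com_select, so the two together
    approximate the total number of SELECTs.
    """
    hits = int(status.get("Qcache_hits", 0))
    selects = int(status.get("Com_select", 0))
    prunes = int(status.get("Qcache_lowmem_prunes", 0))
    total = hits + selects
    hit_ratio = hits / total if total else 0.0
    return {"hit_ratio": round(hit_ratio, 3), "lowmem_prunes": prunes}


if __name__ == "__main__":
    # Made-up numbers, purely to show the shape of the input.
    print(query_cache_summary({"Qcache_hits": "750", "Com_select": "250"}))
```

A low hit ratio or a fast-growing prune count would suggest the cache is costing more than it saves. Threads sitting in the "Waiting for query cache lock" state in SHOW PROCESSLIST during a freeze would point the same way.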

                  Riddle.


                    Rich Also, have you given this a try? enhance php strace. It talks about looking for hangs.

                    Btw, this does not work. I followed the instructions and I'm getting:

                    **_com01@*-*:~$ strace -p 22
                    strace: attach: ptrace(PTRACE_SEIZE, 22): Operation not permitted

                    @Adam, has something changed since the documentation was written?
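One possible cause of "Operation not permitted" is the kernel's Yama ptrace_scope setting, though a container without the CAP_SYS_PTRACE capability, or a target process owned by a different user, would produce the same error. A small Python sketch for checking the first possibility, assuming a Linux host with Yama enabled; the function names are illustrative:

```python
from pathlib import Path

# Standard location of the Yama setting on Linux; absent if Yama is not built in.
PTRACE_SCOPE_FILE = Path("/proc/sys/kernel/yama/ptrace_scope")

# Meanings as described in the kernel's Yama documentation.
MEANINGS = {
    0: "classic: a process may attach to any other process with the same uid",
    1: "restricted: only descendants (or processes that opted in) can be traced",
    2: "admin-only: attaching requires CAP_SYS_PTRACE",
    3: "no attach: ptrace attach is disabled entirely",
}


def describe_ptrace_scope(value):
    """Translate a ptrace_scope number into a short explanation."""
    return MEANINGS.get(value, "unknown value")


def read_ptrace_scope(path=PTRACE_SCOPE_FILE):
    """Return the current setting as an int, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (FileNotFoundError, PermissionError, ValueError):
        return None


if __name__ == "__main__":
    scope = read_ptrace_scope()
    if scope is None:
        print("ptrace_scope not readable; Yama may be unavailable here")
    else:
        print(scope, describe_ptrace_scope(scope))
```

With a value of 1 or higher, an unprivileged user running `strace -p` against a process it did not start would be refused, which matches the symptom, but that is only one of the candidate explanations.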


                      wav3front You might want to check on these ones... I saw notices of redundancy in the MySQL logs when I was doing some tweaking recently:

                      skip-host-cache
                      skip-name-resolve
                      
                      default_authentication_plugin = mysql_native_password

                      Likewise, I've commented out my certs, as when I looked there weren't any certs there anyway.

                      Your buffer pool size could be much bigger given all the RAM you've got. Mine's at 32GB on a 64GB server. Here's mine to compare with (note it's for MySQL, not MariaDB):

                      [mysqld]
                      # Binary logging
                      skip-log-bin                      # Disable binary logging for performance
                      
                      # SSL settings
                      # ssl-ca=/etc/certs/mysql/ca.pem  # Commented out for SSL CA
                      # ssl-cert=/etc/mysql/ssl/cert.pem # Commented out for SSL
                      # ssl-key=/etc/mysql/ssl/key.pem   # Commented out for SSL
                      
                      # Character set settings
                      collation-server=utf8mb4_unicode_ci  # Use utf8mb4 for better Unicode support
                      character-set-server=utf8mb4          # Use utf8mb4 for better Unicode support
                      
                      # Basic settings
                      max_connections=300                   # Increased to allow more concurrent connections
                      thread_cache_size=16                  # Increased to reduce thread creation overhead
                      
                      # InnoDB settings
                      innodb_file_per_table=1               # Keep for better space management
                      innodb_buffer_pool_size=32G           # Set to 32GB for caching
                      innodb_buffer_pool_instances=32        # Increased for better concurrency
                      innodb_log_file_size=512M             # Increased for larger transactions
                      innodb_log_buffer_size=64M            # Increased for larger transactions
                      innodb_flush_log_at_trx_commit=2      # Set to 2 for performance with some durability
                      innodb_flush_method=O_DIRECT           # Keep for performance
                      tmp_table_size=256M                   # Increased for larger temporary tables
                      max_heap_table_size=256M              # Increased for larger heap tables
                      innodb_thread_concurrency=0            # Allow InnoDB to manage concurrency
                      innodb_read_io_threads=4               # Increased for better read performance
                      innodb_write_io_threads=4              # Increased for better write performance
                      innodb_io_capacity=4000                # Increased for better I/O performance
                      innodb_io_capacity_max=10000           # Increased for better I/O performance
                      innodb_checksum_algorithm=crc32
                      innodb_log_compressed_pages=OFF
                      innodb_change_buffering=all
                      innodb_redo_log_capacity=8G            # Set to 8G for better recovery performance
                      
                      # Additional settings
                      activate_all_roles_on_login=ON         # Keep for role management
                      host_cache_size=0
                      performance_schema=ON                   # Enable performance schema
                      sql-mode="NO_ENGINE_SUBSTITUTION"
                      
                      # Additional tuning based on MySQLTuner recommendations
                      join_buffer_size=512K                  # Increase join buffer size
                      table_definition_cache=6000             # Increase table definition cache

                      Your server is better specced than mine. It's possible your DB is a little restrictive on table sizes and how many it can hold in working memory. Mine isn't perfectly tuned, but it's performing a lot better now than it was before its last tuning.

                      You have a few rules I'll look into myself like innodb_random_read_ahead=ON and innodb_flush_neighbors=0. Thanks for sharing.
