Database Slowdowns with Preemptive Processes

4D Server 17R6, Windows Server 2016, 28 processor cores, 180 GB RAM, 80 GB cache

I have a large (170GB), busy (200+ users) database, and I’m struggling with major slowdowns.

To help fix the slowdowns, I’ve recently migrated a bunch of task processing from cooperative to preemptive threads. In some ways the slowdowns have become more profound.

I have a [DB_Tasks] table in which I create records for work that needs to be done. A task-processing routine processes those tasks and logs in the completed record how much processing time each one took (based on the procTime value from PROCESS PROPERTIES). I can see that some tasks are taking too much time, and since moving from cooperative to preemptive, the amount of processing time seems to have greatly increased.

The reason that I moved the tasks to a preemptive thread: was to lighten the load on the main core of the server: I don’t really care if the processes take “more processing time” as long as it doesn’t impact the users.

However: from what I’m seeing: even though the processes are now in preemptive: the database performance becomes very bad when I have multiple preemptive processes that get busy. Today, it became so bad at one point that I had to restart the 4D server: it became uselessly un-responsive. (at the time, there were probably 10+ preemptive tasks fighting to do work: if they can’t complete in a timely manner: they spawn more)

(My tasks aren’t doing anything that I think should take that much time: scanning records that should already be in the database cache: comparing values to other records: and adding/modifying/deleting a nominal amount of records based on the results. Tested on a local machine: the same processing might take a few seconds: but my server tasks (when performing badly) are taking 30+ seconds. )

I would dearly like to add Debug Logging to my preemptive threads to help diagnose the issue: but that does not seem to be available in Preemptive. (major bummer)

WHY would preemptive threads compete with my primary Server core performance – and cause server speed degradation?

I’m entering that terrible state of desperation.

I’m sorry, but I fear that without deep analysis any answer will be wrong, or at least unhelpful.

Only in general:
when you switch to preemptive processes, a well written application seems to drastically increase the CPU load. I know this looks wrong upfront, but before the CPU was idling, now it is fully used. Of course as result the number of operations processed per second should increase, or in other words, the jobs should be done faster (or more jobs at the same time).

Short: higher CPU load is not bad by default.

An example for this is here:
https://blog.4d.com/improved-performance-up-to-8xs-faster-no-thats-not-a-typo/

Take a look at the screen shots, the new feature seems to increase load, while in fact, the result is faster processing.

Also in General:
We have seen applications that moved to preemptive run into a new bottleneck. These applications had one large table used by almost all users, a kind of main table.
If hundreds of users (or processes) are always touching the same table, micro-locking (which we need to do internally even to read a record) can slow down others. See the blog post above for details.
But as you are already on R6, you should not see that anymore. This is just an example.

Analysing what happens on your side requires checking network logs or cache logs, not necessarily debug logs. The debug logs would be a first step if your own code were not responding, but you wrote that the server hangs, so I would start by checking the cache usage. But as I said, contact support for deeper analysis.

Thanks for the reply Thomas,
I’ve been picking through code with a fine-tooth comb. I might be on to something but won’t be sure until I’ve tried it…
In some of my triggers, I was adding some new code to discern a user ID ( my system’s user ID). To do this: I’m using the new “Get process activity”. I was not passing a selector to Get process activity.

I did a little testing now: and found that with 50 users logged in: Get process activity was taking about 0.8 seconds to run. I’m wondering if by chance - if 4D is internally performing any kind of a semaphore on this command - such that two processes - especially preemptive processes might not be able to hit this at the same time?

Anyways: I also found that if I passed a selector: (again with about 50 users in)
Get process activity(Sessions only): was taking about 0.4 seconds,
Get process activity(Processes only): was taking about 0.1 seconds.
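
A rough way to repeat this kind of measurement is sketched below. It assumes the code runs in a method executed on the server; the variable names are only illustrative, and the elapsed time will vary with the number of connected users.

<code 4D>
  // Time one call to Get process activity with the Processes only selector
C_LONGINT($start_l;$elapsed_l)
C_OBJECT($activity_o)
$start_l:=Milliseconds
$activity_o:=Get process activity(Processes only)
$elapsed_l:=Milliseconds-$start_l  // elapsed time in milliseconds
  // $activity_o.processes is the collection of process descriptions
</code 4D>

Swapping the selector for Sessions only, or omitting it, gives the other two figures.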

… and, I really only needed the Processes. So, I’ve cut it down to the Processes only, with a provision for me to disable the feature in the live database.

During peak times, with 200 users, I’m guessing this command would take 4x as long, and possibly bring my database to its knees.

I only need information about the current process: but what I need is the sessionID. I wish I could request only a specific process, or a specific session from Get process activity: and it be very fast. If my live client improves substantially today: I’ll probably make that feature request.

Ah, Get process activity was not built for that.

This command is meant for building your own Server Admin dialog.
And as you surely noticed, if you run 4D Server with the user or process page displayed, this dialog takes 2-5% of your CPU time. We do not recommend keeping this page open except for administrative/debug purposes, as it takes too much time.

And the command that builds this list is not designed to be called more than once every 5 seconds; in my own dialog I call it only every 30 seconds.

In no case call it inside a trigger or any fast-running loop, and especially not in performance-critical code.

It seems to me that what you really want is a kind of user / session storage object, and you are trying to rebuild that yourself? In that case, it is better to make a feature request describing what you want to do, rather than asking for a detail of a command you are using to try to build it yourself…

ok: Thanks for the info.
So, before I make a feature request: Is there any way in a Trigger on server-side, to ask “what is my remote-side sessionID?”

what is the sessionID?

For me, the only common “thing” we have is the “Current User”.
This string is available on client in all processes and in the trigger on the server.

You can set this string either by using 4D’s built-in user/password system or by using SET USER ALIAS. Then you can use this string to look up your own session variables, either via Storage or in a user record.
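
As a minimal sketch of that lookup, assuming the per-user data is kept in Storage on the server: the method name user_register, the users property and the id field are hypothetical. The registration code would have to run on the server (for example through an Execute on Server method), because a trigger reads the server's Storage, not the client's. It also assumes that Current user returns the same string there as in the trigger, and that each user name is unique.

<code 4D>
  // user_register (hypothetical), run on the server once per login
C_LONGINT($1)  // application-level user ID
C_TEXT($name_t)
$name_t:=Current user
Use (Storage)
	If (Storage.users=Null)
		Storage.users:=New shared object
	End if 
End use 
Use (Storage.users)
	Storage.users[$name_t]:=New shared object("id";$1)
End use 
</code 4D>
<code 4D>
  // In the trigger, or a method it calls: a cheap read, no Use block needed
C_OBJECT($info_o)
C_TEXT($name_t)
$name_t:=Current user
If (Storage.users#Null)
	$info_o:=Storage.users[$name_t]
End if 
If ($info_o#Null)
	  // $info_o.id holds the application-level user ID
End if 
</code 4D>

The read path touches only a shared object, so it avoids rebuilding the whole process list on every trigger call.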

What kind of drive do you use in your server ? It can be the bottleneck :idea:

what is the sessionID?
This is one of the properties of the Get Process Activity: that links it back to the user session.

I was not aware that current user, in a trigger, would retrieve the client-side user. I’ll move to that.
Thanks!

Manual,
Thanks for the tip; but we have a high-speed NVME drive on the machine. (this is a high-end type of SSD)

: Tony RINGSMUTH

a high-speed NVME drive (this is a high-end type of SSD)

Ok so no problem here :wink:

: Tony RINGSMUTH

I was not aware that current user, in a trigger, would retrieve the
client-side user. I’ll move to that.
As I needed more, I start my processes with an execute on server method to set a variable in the twin. When the twin executes the trigger, this variable is available. And you can put anything you need on server side in it (user first name, last name, his IP address, etc.)
Schematically:
<code 4D>
  // On the client side: collect the user details to send to the server
$userInfo_o:=New object
$userInfo_o.machine:=Current machine
$userInfo_o.possesseur:=Current system user  // "possesseur" = owner
$userInfo_o.currentUser:=Current user
user_setInTwin ($userInfo_o)  // write the variable in the twin process
</code 4D>
<code 4D>
  // user_setInTwin
  // Execute on Server (EoS) method: it runs in the twin process
C_OBJECT($1)
C_OBJECT(userInfo_o)  // must be a process variable so the trigger can read it
userInfo_o:=$1
</code 4D>

Arnaud
Yes: good idea: I also do that.
(I’ll explain my situation in my next post)

MY SITUATION IS FIXED!
I’ve identified that it was the use of Get Process Activity in a trigger (and going rogue) that was the problem. Here’s the longer story (if you want to read it).

Back in August, we began coding for a major feature set. At that time, I also had to make my triggers preemptive-safe. A sub-method, called by my triggers, did some work to create change-log records, one of the fields being a user ID. My previous code used an inter-process array, which I needed to get rid of. Since Get Process Activity was new, I found that it could give me a sessionID, which I could link to a user session. As a quick way of getting the job done, I used it (not realizing how heavy it is). In testing, it worked great!
It was November before we began to roll out our new version, and to small clients first (it still worked great).
By the time we finally hit our last, and biggest clients: the original changes were far from my mind.

Most of the time my triggers got my custom user ID from a client side push to server of the user ID: and once in the server: I store it in a process variable. So real users presented no problem.
However I have “batch workstation” type clients that just run tasks. These apps don’t have a user ID in my system. Because of that: in the triggers: my code says:

  • what’s the user id: Oh! we don’t know it yet: so use Get Process Activity - to find it: and then store it.
    But because no user ID could be found, every trigger kept hitting the code again, and before you know it, if a lot of “task records” needed processing, the system would be brought to its knees!
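
The failure mode described here is a lookup whose "not found" result is never remembered, so it is repeated on every trigger call. One hedged way to guard against it is to record that the lookup was attempted, whatever it returned. In this sketch the method names changeLog_getUserID and changeLog_lookupUserID and both process variables are hypothetical; it is an illustration, not the actual code from this application.

<code 4D>
  // changeLog_getUserID (hypothetical), called from the triggers
C_LONGINT($0)
C_LONGINT(userID_l)  // process variable: cached user ID, 0 if none
C_BOOLEAN(userIDResolved_b)  // process variable: True once the lookup has run
If (Not(userIDResolved_b))
	userID_l:=changeLog_lookupUserID  // hypothetical expensive lookup
	userIDResolved_b:=True  // remember the attempt even when nothing was found
End if 
$0:=userID_l
</code 4D>

With the flag in place, a process without a user ID (such as a batch workstation) pays for the expensive lookup once instead of on every trigger execution.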

In trying to figure out the problem: I migrated more code from cooperative to preemptive - trying to figure out what was slowing it down: and trying to remedy by sharing across more cores. However, as I see now: if I hit “Get Process activity” on more and more cores: they compete with each other and eventually consume a 28 core machine.

When I wrote my new code to process tasks on server side - in preemptive: I made it with an “OFF” switch: where I could turn the feature off: and shift the processing back to the batch workstations. I’ve enabled that OFF switch for today: until I regain stability on this system: which I now have.

Once the dust settles again: I’ll try turning my preemptive task processing on server back ON: and then see how well it really does (or does not) work. My hope is that I can de-commission some of my batch-workstation stuff. (remove a little complexity)

: Tony RINGSMUTH

However I have “batch workstation” type clients that just run tasks.
These apps don’t have a user ID in my system.
Hi Tony,
shouldn’t these have a userID? After all, even a batch should leave traces of what it does, no?

: Tony RINGSMUTH

…create change-log records…

What information are you capturing in the log record?

Hi Tony,

Well done for fixing this !!!

Arnaud,

shouldn’t these have a userID? After all, even a batch should leave traces of what it does, no?
Yes: it’s kind of on my list of things to do.

Jeremy,

What information are you capturing in the log record?
It depends: but in most cases: I capture a complete before and after copy of the record - into what I call a “recycle-bin” record.
It’s saved our tails too many times to count.

I’ve created a feature request: for the Get Process Activity change that would help me:

https://forums.4d.com/Post//33928309/1/