Make RM_Yield thread-safe (#12905)

## Issues and solutions from #12817
1. Touch ProcessingEventsWhileBlocked and calling moduleCount() without
GIL in afterSleep()
    - Introduced: 
       Version: 7.0.0
       PR: #9963

   - Harm Level: Very High
If the module thread calls `RM_Yield()` before the main thread enters
afterSleep(),
and modifies `ProcessingEventsWhileBlocked`(+1), it will cause the main
thread to not wait for GIL,
which can lead to all kinds of unforeseen problems, including memory
data corruption.

   - Initial / Abandoned Solution:
      * Added `__thread` specifier for ProcessingEventsWhileBlocked.
`ProcessingEventsWhileBlocked` is used to protect against nested event
processing, but event processing
in the main thread and module threads should be completely independent
and unaffected, so it is safer
         to use TLS.
* Adding a cached module count to keep track of the current number of
modules, to avoid having to use `dictSize()`.
    
    - Related Warnings:
```
WARNING: ThreadSanitizer: data race (pid=1136)
  Write of size 4 at 0x0001045990c0 by thread T4 (mutexes: write M0):
    #0 processEventsWhileBlocked networking.c:4135 (redis-server:arm64+0x10006d124)
    #1 RM_Yield module.c:2410 (redis-server:arm64+0x10018b66c)
    #2 bg_call_worker <null>:83232836 (blockedclient.so:arm64+0x16a8)

  Previous read of size 4 at 0x0001045990c0 by main thread:
    #0 afterSleep server.c:1861 (redis-server:arm64+0x100024f98)
    #1 aeProcessEvents ae.c:408 (redis-server:arm64+0x10000fd64)
    #2 aeMain ae.c:496 (redis-server:arm64+0x100010f0c)
    #3 main server.c:7220 (redis-server:arm64+0x10003f38c)
```

2. aeApiPoll() is not thread-safe
When using RM_Yield to handle events in a module thread, if the main
thread has not yet
entered `afterSleep()`, both the module thread and the main thread may
touch `server.el` at the same time.

    - Introduced: 
       Version: 7.0.0
       PR: #9963

   - Old / Abandoned Solution:
Adding a new mutex to protect timing between after beforeSleep() and
before afterSleep().
Defect: If the main thread enters the ae loop without any IO events, it
will wait until
the next timeout or until there is any event again, and the module
thread will
always hang until the main thread leaves the event loop.

    - Related Warnings:
```
SUMMARY: ThreadSanitizer: data race ae_kqueue.c:55 in addEventMask
==================
==================
WARNING: ThreadSanitizer: data race (pid=14682)
  Write of size 4 at 0x000100b54000 by thread T9 (mutexes: write M0):
    #0 aeApiPoll ae_kqueue.c:175 (redis-server:arm64+0x100010588)
    #1 aeProcessEvents ae.c:399 (redis-server:arm64+0x10000fb84)
    #2 processEventsWhileBlocked networking.c:4138 (redis-server:arm64+0x10006d3c4)
    #3 RM_Yield module.c:2410 (redis-server:arm64+0x10018b66c)
    #4 bg_call_worker <null>:16042052 (blockedclient.so:arm64+0x169c)

  Previous write of size 4 at 0x000100b54000 by main thread:
    #0 aeApiPoll ae_kqueue.c:175 (redis-server:arm64+0x100010588)
    #1 aeProcessEvents ae.c:399 (redis-server:arm64+0x10000fb84)
    #2 aeMain ae.c:496 (redis-server:arm64+0x100010da8)
    #3 main server.c:7238 (redis-server:arm64+0x10003f51c)
```

## The final fix as the comments:
https://github.com/redis/redis/pull/12817#discussion_r1436427232
Optimized solution based on the above comment:

First, we add `module_gil_acquring` to indicate whether the main thread
is currently in the acquiring GIL state.

When the module thread starts to yield, there are two possibilities(we
assume the caller keeps the GIL):
1. The main thread is in the mid of beforeSleep() and afterSleep(), that
is, `module_gil_acquring` is not 1 now.
At this point, the module thread will wake up the main thread through
the pipe and leave the yield,
waiting for the next yield when the main thread may already in the
acquiring GIL state.
    
2. The main thread is in the acquiring GIL state.
The module thread release the GIL, yielding CPU to give the main thread
an opportunity to start
event processing, and then acquire the GIL again until the main thread
releases it.
This is what
https://github.com/redis/redis/pull/12817#discussion_r1436427232
mentioned direction.

---------

Co-authored-by: Oran Agra <oran@redislabs.com>
This commit is contained in:
debing.sun 2024-01-07 18:10:29 +08:00 committed by GitHub
parent 4cae66f5e8
commit ca1f67af80
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
3 changed files with 31 additions and 1 deletions

View File

@ -2407,7 +2407,33 @@ void RM_Yield(RedisModuleCtx *ctx, int flags, const char *busy_reply) {
server.busy_module_yield_flags |= BUSY_MODULE_YIELD_CLIENTS;
/* Let redis process events */
processEventsWhileBlocked();
if (!pthread_equal(server.main_thread_id, pthread_self())) {
/* If we are not in the main thread, we defer event loop processing to the main thread
* after the main thread enters acquiring GIL state in order to protect the event
* loop (ae.c) and avoid potential race conditions. */
int acquiring;
atomicGet(server.module_gil_acquring, acquiring);
if (!acquiring) {
/* If the main thread has not yet entered the acquiring GIL state,
* we attempt to wake it up and exit without waiting for it to
* acquire the GIL. This avoids blocking the caller, allowing them to
* continue with unfinished tasks before the next yield.
* We assume the caller keeps the GIL locked. */
if (write(server.module_pipe[1],"A",1) != 1) {
/* Ignore the error, this is best-effort. */
}
} else {
/* Release the GIL, yielding CPU to give the main thread an opportunity to start
* event processing, and then acquire the GIL again until the main thread releases it. */
moduleReleaseGIL();
sched_yield();
moduleAcquireGIL();
}
} else {
/* If we are in the main thread, we can safely process events. */
processEventsWhileBlocked();
}
server.busy_module_yield_reply = prev_busy_module_yield_reply;
/* Possibly restore the previous flags in case of two nested contexts
@ -11888,6 +11914,7 @@ void moduleInitModulesSystem(void) {
moduleUnblockedClients = listCreate();
server.loadmodule_queue = listCreate();
server.module_configs_queue = dictCreate(&sdsKeyValueHashDictType);
server.module_gil_acquring = 0;
modules = dictCreate(&modulesDictType);
moduleAuthCallbacks = listCreate();

View File

@ -1875,7 +1875,9 @@ void afterSleep(struct aeEventLoop *eventLoop) {
mstime_t latency;
latencyStartMonitor(latency);
atomicSet(server.module_gil_acquring, 1);
moduleAcquireGIL();
atomicSet(server.module_gil_acquring, 0);
moduleFireServerEvent(REDISMODULE_EVENT_EVENTLOOP,
REDISMODULE_SUBEVENT_EVENTLOOP_AFTER_SLEEP,
NULL);

View File

@ -1608,6 +1608,7 @@ struct redisServer {
int module_pipe[2]; /* Pipe used to awake the event loop by module threads. */
pid_t child_pid; /* PID of current child */
int child_type; /* Type of current child */
redisAtomic int module_gil_acquring; /* Indicates whether the GIL is being acquiring by the main thread. */
/* Networking */
int port; /* TCP listening port */
int tls_port; /* TLS listening port */