redis

Commit Graph

Author	SHA1	Message	Date
debing.sun	f95031c473	Fix CI failure caused by PR #13231 (#13233 ) By mistake, in the last revert commit in #13231 I meant to revert the last fix but reverted the penultimate one instead. Now that another potential memory read issue has been fixed in `743f1dde79` (#13231), this change mainly serves to avoid confusion. I will verify later whether it has any impact; if so, we will add this PR to the backport list. Failed CI: https://github.com/sundb/redis/actions/runs/8826731960	2024-04-25 14:11:45 +08:00
debing.sun	772564fc9e	Fix forget to update the dict's node in the kvstore's rehashing list after defragment (#13231 ) Introduced by #13013. After defragmenting a dictionary in the kvstore, if the dict is reallocated, the value of its node in the kvstore rehashing list must be updated.	2024-04-24 16:15:42 +08:00
Binbin	804110a487	Allocate Lua VM code with jemalloc instead of libc, and count it used memory (#13133 ) ## Background 1. Currently Lua memory control does not pass through Redis's zmalloc.c. Redis maxmemory cannot limit memory problems caused by users abusing lua since these lua VM memory is not part of used_memory. 2. Since jemalloc is much better (fragmentation and speed), and also we know it and trust it. we are going to use jemalloc instead of libc to allocate the Lua VM code and count it used memory. ## Process: In this PR, we will use jemalloc in lua. 1. Create an arena for all lua vm (script and function), which is shared, in order to avoid blocking defragger. 2. Create a bound tcache for the lua VM, since the lua VM and the main thread are by default in the same tcache, and if there is no isolated tcache, lua may request memory from the tcache which has just been freed by main thread, and vice versa On the other hand, since lua vm might be release in bio thread, but tcache is not thread-safe, we need to recreate the tcache every time we recreate the lua vm. 3. Remove lua memory statistics from memory fragmentation statistics to avoid the effects of lua memory fragmentation ## Other Add the following new fields to `INFO DEBUG` (we may promote them to INFO MEMORY some day) 1. allocator_allocated_lua: total number of bytes allocated of lua arena 2. allocator_active_lua: total number of bytes in active pages allocated in lua arena 3. allocator_resident_lua: maximum number of bytes in physically resident data pages mapped in lua arena 4. allocator_frag_bytes_lua: fragment bytes in lua arena This is oranagra's idea, and i got some help from sundb. This solves the third point in #13102. --------- Co-authored-by: debing.sun <debing.sun@redis.com> Co-authored-by: Oran Agra <oran@redislabs.com>	2024-04-16 12:43:33 +03:00
Yves LeBras	e3550f01dd	redis-cli - sendReadOnly() to work with Redis Cloud (#13195 ) When using Redis Cloud, sendReadOnly() exits with `Error: ERR unknown command 'READONLY'`. This affects `--memkeys`, `--bigkeys`, `--hotkeys`, and will affect `--keystats`. Added one line to ignore this error. Issue introduced in #12735 (not yet released).	2024-04-08 11:12:57 +03:00
debing.sun	f4481e657f	Use usleep() instead of sched_yield() to yield cpu (#13183 ) When the main thread and the module thread run on the same CPU, sched_yield() works well. When they are bound to different CPUs, sched_yield() looks for the thread with the highest priority, and if the module thread is always the highest-priority thread on its CPU, it can take a long time for the main thread to reacquire the GIL. ref https://man7.org/linux/man-pages/man2/sched_yield.2.html ``` If the calling thread is the only thread in the highest priority list at that time, it will continue to run after a call to sched_yield(). ```	2024-04-07 20:59:36 +08:00
debing.sun	4581d43230	Fix daylight race condition and some thread leaks (#13191 ) fix some issues that come from sanitizer thread report. 1. when the main thread is updating daylight_active, other threads (bio, module thread) may be writing logs at the same time. ``` WARNING: ThreadSanitizer: data race (pid=661064) Read of size 4 at 0x55c9a4d11c70 by thread T2: #0 serverLogRaw /home/sundb/data/redis_fork/src/server.c:116 (redis-server+0x8d797) (BuildId: dca0b1945ba30010e36129bdb296e488dd2b32d0) #1 _serverLog.constprop.2 /home/sundb/data/redis_fork/src/server.c:146 (redis-server+0x2a3b14) (BuildId: dca0b1945ba30010e36129bdb296e488dd2b32d0) #2 bioProcessBackgroundJobs /home/sundb/data/redis_fork/src/bio.c:329 (redis-server+0x1c24ca) (BuildId: dca0b1945ba30010e36129bdb296e488dd2b32d0) Previous write of size 4 at 0x55c9a4d11c70 by main thread (mutexes: write M0, write M1, write M2, write M3): #0 updateCachedTimeWithUs /home/sundb/data/redis_fork/src/server.c:1102 (redis-server+0x925e7) (BuildId: dca0b1945ba30010e36129bdb296e488dd2b32d0) #1 updateCachedTimeWithUs /home/sundb/data/redis_fork/src/server.c:1087 (redis-server+0x925e7) #2 updateCachedTime /home/sundb/data/redis_fork/src/server.c:1118 (redis-server+0x925e7) #3 afterSleep /home/sundb/data/redis_fork/src/server.c:1811 (redis-server+0x925e7) #4 aeProcessEvents /home/sundb/data/redis_fork/src/ae.c:389 (redis-server+0x85ae0) (BuildId: dca0b1945ba30010e36129bdb296e488dd2b32d0) #5 aeProcessEvents /home/sundb/data/redis_fork/src/ae.c:342 (redis-server+0x85ae0) #6 aeMain /home/sundb/data/redis_fork/src/ae.c:477 (redis-server+0x85ae0) #7 main /home/sundb/data/redis_fork/src/server.c:7211 (redis-server+0x7168c) (BuildId: dca0b1945ba30010e36129bdb296e488dd2b32d0) ``` 2. 
thread leaks in module tests ``` WARNING: ThreadSanitizer: thread leak (pid=668683) Thread T13 (tid=670041, finished) created by main thread at: #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:1036 (libtsan.so.2+0x3d179) (BuildId: 28a9f70061dbb2dfa2cef661d3b23aff4ea13536) #1 HelloBlockNoTracking_RedisCommand /home/sundb/data/redis_fork/tests/modules/blockonbackground.c:200 (blockonbackground.so+0x97fd) (BuildId: 9cd187906c57e88cdf896d121d1d96448b37a136) #2 HelloBlockNoTracking_RedisCommand /home/sundb/data/redis_fork/tests/modules/blockonbackground.c:169 (blockonbackground.so+0x97fd) #3 call /home/sundb/data/redis_fork/src/server.c:3546 (redis-server+0x9b7fb) (BuildId: dca0b1945ba30010e36129bdb296e488dd2b32d0) #4 processCommand /home/sundb/data/redis_fork/src/server.c:4176 (redis-server+0xa091c) (BuildId: dca0b1945ba30010e36129bdb296e488dd2b32d0) #5 processCommandAndResetClient /home/sundb/data/redis_fork/src/networking.c:2468 (redis-server+0xd2b8e) (BuildId: dca0b1945ba30010e36129bdb296e488dd2b32d0) #6 processInputBuffer /home/sundb/data/redis_fork/src/networking.c:2576 (redis-server+0xd2b8e) #7 readQueryFromClient /home/sundb/data/redis_fork/src/networking.c:2722 (redis-server+0xd358f) (BuildId: dca0b1945ba30010e36129bdb296e488dd2b32d0) #8 callHandler /home/sundb/data/redis_fork/src/connhelpers.h:58 (redis-server+0x288a7b) (BuildId: dca0b1945ba30010e36129bdb296e488dd2b32d0) #9 connSocketEventHandler /home/sundb/data/redis_fork/src/socket.c:277 (redis-server+0x288a7b) #10 aeProcessEvents /home/sundb/data/redis_fork/src/ae.c:417 (redis-server+0x85b45) (BuildId: dca0b1945ba30010e36129bdb296e488dd2b32d0) #11 aeProcessEvents /home/sundb/data/redis_fork/src/ae.c:342 (redis-server+0x85b45) #12 aeMain /home/sundb/data/redis_fork/src/ae.c:477 (redis-server+0x85b45) #13 main /home/sundb/data/redis_fork/src/server.c:7211 (redis-server+0x7168c) (BuildId: dca0b1945ba30010e36129bdb296e488dd2b32d0) ```	2024-04-04 13:49:51 +03:00
Moti Cohen	4df037962d	Change FLUSHALL/FLUSHDB SYNC to run as blocking ASYNC (#13167 ) # Overview Users utilize the `FLUSHDB SYNC` and `FLUSHALL SYNC` commands for a variety of reasons. The main issue with this command is that if the database becomes substantial in size, the server will be unresponsive for an extended period. Other than freezing application traffic, this may also lead some clients making incorrect judgments about the server's availability. For instance, a watchdog may erroneously decide to terminate the process, resulting in potential adverse outcomes. While a `FLUSH* ASYNC` can address these issues, it might not be used for two reasons: firstly, it's not the default, and secondly, in some cases, the client issuing the flush wants to wait for its completion before repopulating the database. Between the option of triggering FLUSH* asynchronously in the background without indication for completion versus running it synchronously in the foreground by the main thread, there is another more appealing option. We can block the client that requested the flush, execute the flush command in the background, and once done, unblock the client and return notification for completion. This approach ensures the server remains responsive to other clients, and the blocked client receives the expected response only after the flush operation has been successfully carried out. # Implementation details Instead of defining yet another flavor to the flush command, we can modify `FLUSHALL SYNC` and `FLUSHDB SYNC` always run in this new mode. ## Extending BIO Threads capabilities Today jobs that are carried out by BIO threads don't have the capability to indicate completion to the main thread. We can add this infrastructure by having an additional dummy job, coined as completion-job, that eventually will be written by BIO threads to a response-queue. 
The main thread will take care to consume items from the response-queue and call the provided callback function of each completion-job. ## FLUSH* SYNC to run as blocking ASYNC Command `FLUSH* SYNC` will be modified to create one or more async jobs to flush DB(s) and afterward will push additional completion-job request. By sending the completion job request only at the end, the main thread will be called back only after all the preceding jobs completed their task in the background. During that time, the client of the command is suspended and marked as `BLOCKED_LAZYFREE` whereas any other client will be able to communicate with the server without any issue.	2024-04-02 15:09:52 +03:00
Moti Cohen	ce47834309	kvstoreIteratorNext() wrongly reset iterator twice (#13178 ) It calls kvstoreIteratorNextDict() which eventually calls dictResumeRehashing() And then, on return, it calls dictResetIterator(iter) which calls dictResumeRehashing(). We end up with pauserehash value decremented twice instead of once.	2024-04-01 18:08:55 +03:00
Pieter Cailliau	0b34396924	Change license from BSD-3 to dual RSALv2+SSPLv1 (#13157 ) [Read more about the license change here](https://redis.com/blog/redis-adopts-dual-source-available-licensing/) Live long and prosper 🖖	2024-03-20 22:38:24 +00:00
Yanqi Lv	e64d91c371	Fix dict use-after-free problem in kvs->rehashing (#13154 ) In ASAN CI, we find server may crash because of NULL ptr in `kvstoreIncrementallyRehash`. the reason is that we use two phase unlink in `dbGenericDelete`. After `kvstoreDictTwoPhaseUnlinkFind`, the dict may be in rehashing and only have one element in ht[0] of `db->keys`. When we delete the last element in `db->keys` meanwhile `db->keys` is in rehashing, we may free the dict in `kvstoreDictTwoPhaseUnlinkFree` without deleting the node in `kvs->rehashing`. Then we may use this freed ptr in `kvstoreIncrementallyRehash` in the `serverCron` and cause the crash. This is indeed a use-after-free problem. The fix is to call rehashingCompleted in dictRelease and dictEmpty, so that every call for rehashingStarted is always matched with a rehashingCompleted. Adding a test in the unit test to catch it consistently --------- Co-authored-by: Oran Agra <oran@redislabs.com> Co-authored-by: debing.sun <debing.sun@redis.com>	2024-03-20 22:44:28 +02:00
Yanqi Lv	bad33f8738	fix wrong data type conversion in zrangeResultBeginStore (#13148 ) In `beginResultEmission`, -1 means the result length is not known in advance. But after #12185, if we pass -1 to `zrangeResultBeginStore`, it is converted to SIZE_MAX in `zsetTypeCreate`, which then tries to `dictExpand`. Although `dictExpand` won't succeed because the size overflows, it is better to avoid this wrong conversion. This bug can be triggered when the source of `zrangestore` doesn't exist or when the `zrangestore` command is used with `byscore` or `bylex`. The impact is that dst keys will be converted to use skiplist instead of listpack.	2024-03-19 08:52:55 +02:00
Binbin	e04d41d78d	Prevent lua error_reply abuse from causing errorstats to become larger (#13141 ) Users who abuse lua error_reply will generate a new error object on each error call, which can make server.errors get bigger and bigger. This will cause the server to block when calling INFO (we also return errorstats by default). To prevent the damage it can cause, when a misuse is detected, we will print a warning log and disable the errorstats to avoid adding more new errors. It can be re-enabled via CONFIG RESETSTAT. Because server.errors may be very large (it may be better now since we have the limit), config resetstat may block for a while. So in resetErrorTableStats, we will try to lazyfree server.errors. See the related discussion at the end of #8217.	2024-03-19 08:18:22 +02:00
Chen Tianjie	aeada20140	Avoid unnecessary dict shrink in zremrangeGenericCommand (#13143 ) If the skiplist is emptied, there is no need to shrink the dict in skiplist, it can be deleted directly.	2024-03-19 10:14:19 +08:00
Binbin	7b070423b8	Fix dictionary use-after-free in active expire and make kvstore iter to respect EMPTY flag (#13135 ) After #13072, there is a use-after-free error. In expireScanCallback we delete the dict, and then dictScan continues to use it, e.g. by calling `dictResumeRehashing(d)` at the end, which caused an error. In this PR, in freeDictIfNeeded, if the dict's pauserehash is set, we don't delete the dict yet; when the scan returns we try to delete it again. At the same time, we noticed that the iterator has similar problems: elements may be deleted during iteration, causing the dict to be deleted, so the iterator-related code in this PR has also been modified. dictResetIterator was also missing from the previous kvstoreIteratorNextDict; we currently have no scenario in which elements are deleted during a kvstoreIterator pass, but we deal with it together to avoid future problems. Added some simple tests to verify the changes. In addition, the modification in #13072 omitted initTempDb and emptyDbAsync, and they were also added. This PR also removes the slow flag from the expire test (it consumes 1.3s) so that problems can be found in CI in the future.	2024-03-18 17:41:54 +02:00
Alexander Mahone	98a6e55d4e	Add missing REDIS_STATIC in quicklist (#13147 ) The compiler complained when I tried to compile only quicklist.c, since the static keyword is needed when a static function declaration is placed before its implementation. ``` #ifndef REDIS_STATIC #define REDIS_STATIC static #endif ``` [How to solve static declaration follows non-static declaration in GCC C code?](https://stackoverflow.com/questions/3148244/how-to-solve-static-declaration-follows-non-static-declaration-in-gcc-c-code)	2024-03-18 08:22:19 +02:00
Madelyn Olson	3f725b8619	Change mmap rand bits as a temporary mitigation to resolve asan bug (#13150 ) There is a new change in linux kernel 6.6.6 that uses randomization of address space to harden security, but it interferes with the way ASAN works. Folks are working on a fix, but this is a temporary mitigation for us to get our CI to be green again. See https://github.com/google/sanitizers/issues/1716 for more information See https://github.com/redis/redis/actions/runs/8305126288/job/22731614306?pr=13149 for a recent failure	2024-03-17 09:06:51 +02:00
Viktor Söderqvist	1d77a8e2c5	Makefile respect user's REDIS_CFLAGS and OPT (#13073 ) This change to the Makefile makes it possible to opt out of `-fno-omit-frame-pointer` added in #12973 and `-flto` (#11350). Those features were implemented by conditionally modifying the `REDIS_CFLAGS` and `REDIS_LDFLAGS` variables. Historically, those variables provided a way for users to pass options to the compiler and linker unchanged. Instead of conditionally appending optimization flags to REDIS_CFLAGS and REDIS_LDFLAGS, I want to append them to the OPTIMIZATION variable. Later in the Makefile, we have `OPT=$(OPTIMIZATION)` (meaning OPTIMIZATION is only a default for OPT, but OPT can be overridden by the user), and later the flags are combined like this: FINAL_CFLAGS=$(STD) $(WARN) $(OPT) $(DEBUG) $(CFLAGS) $(REDIS_CFLAGS) FINAL_LDFLAGS=$(LDFLAGS) $(OPT) $(REDIS_LDFLAGS) $(DEBUG) This makes it possible for the user to override all optimization flags with e.g. `make OPT=-O1` or just `make OPT=`. For some reason `-O3` was also already added to REDIS_LDFLAGS by default in #12339, so I added OPT to FINAL_LDFLAGS to avoid more complex logic (such as introducing a separate LD_OPT variable).	2024-03-13 17:02:00 +02:00
Binbin	3b3d16f748	Add KVSTORE_FREE_EMPTY_DICTS to cluster mode keys / expires kvstore (#13072 ) Currently (following #11695 and #12822), the keys kvstore and the expires kvstore are both flagged with ON_DEMAND, which means that a cluster node will only allocate a dict when the slot is assigned to it and populated; on the other hand, when the slot is unassigned, the dict remains allocated. We considered releasing the dict when the slot is unassigned, but that causes complications on replicas. From benchmarks we conducted, it looks like the performance impact of releasing the dict when it becomes empty and re-allocating it when a key is added again isn't huge. This PR adds KVSTORE_FREE_EMPTY_DICTS to the cluster mode keys / expires kvstore. The impact is about a 2% performance drop for this hopefully uncommon scenario. --------- Co-authored-by: Oran Agra <oran@redislabs.com>	2024-03-13 08:30:20 +02:00
Binbin	ad28d222ed	Lua eval scripts first in first out LRU eviction (#13108 ) In some cases, users will abuse lua eval. Each EVAL call generates a new lua script, which is added to the lua interpreter and cached to redis-server, consuming a large amount of memory over time. Since EVAL is mostly the one that abuses the lua cache, and these won't have pipeline issues (i.e. the script won't disappear unexpectedly, and cause errors like it would with SCRIPT LOAD and EVALSHA), we implement a plain FIFO LRU eviction only for these (not for scripts loaded with SCRIPT LOAD). ### Implementation notes: When not abused we'll probably have less than 100 scripts, and when abused we'll have many thousands. So we use a hard coded value of 500 scripts. And considering that we don't have many scripts, then unlike keys, we don't need to worry about the memory usage of keeping a true sorted LRU linked list. We compute the SHA of each script anyway, and put the script in a dict, we can store a listNode there, and use it for quick removal and re-insertion into an LRU list each time the script is used. ### New interfaces: At the same time, a new `evicted_scripts` field is added to INFO, which represents the number of evicted eval scripts. Users can check it to see if they are abusing EVAL. ### benchmark: `./src/redis-benchmark -P 10 -n 1000000 -r 10000000000 eval "return __rand_int__" 0` The simple abuse of eval benchmark test that will create 1 million EVAL scripts. The performance has been improved by 50%, and the max latency has dropped from 500ms to 13ms (this may be caused by table expansion inside Lua when the number of scripts is large). And in the INFO memory, it used to consume 120MB (server cache) + 310MB (lua engine), but now it only consumes 70KB (server cache) + 210KB (lua_engine) because of the scripts eviction. For non-abusive case of about 100 EVAL scripts, there's no noticeable change in performance or memory usage. 
### unlikely potentially breaking change: in theory, a user could load a script with EVAL and then use EVALSHA to call it (by calculating the SHA1 value on the client side). If we read the docs carefully we may realize that this is a valid scenario, but we suppose it's extremely rare. So it may happen that EVALSHA acts on a script created by EVAL, the script is evicted, and EVALSHA returns a NOSCRIPT error. That is, if you have more than 500 scripts being used in the same transaction / pipeline. This solves the second point in #13102.	2024-03-13 08:27:41 +02:00
Ronen Kalish	a8e745117f	Xread last entry in stream (#7388 ) (#13117 ) Allow using `+` as a special ID for last item in stream on XREAD command. This would allow to iterate on a stream with XREAD starting with the last available message instead of the next one which `$` is used for. I.e. the caller can use `BLOCK` and `+` on the first call, and change to `$` on the next call. Closes #7388 --------- Co-authored-by: Felipe Machado <462154+felipou@users.noreply.github.com>	2024-03-13 08:23:32 +02:00
Viktor Söderqvist	9efc6ad6a6	Add API RedisModule_ClusterKeySlot and RedisModule_ClusterCanonicalKeyNameInSlot (#13069 ) Sometimes it's useful to compute a key's cluster slot in a module. This API function is just like the command CLUSTER KEYSLOT (but faster). A "reverse" API is also added: `RedisModule_ClusterCanonicalKeyNameInSlot`. Given a slot, it returns a short string that we can call a canonical key for the slot.	2024-03-12 09:26:12 -07:00
Andy Pan	9c065c417d	Enable accept4() on BSD (#13104 ) Redis enabled `accept4` on Linux after #9177, reducing extra system calls for sockets. `accept4` system call is also widely supported on BSD and Solaris in addition to Linux. This PR enables `accept4` on all platforms that support it. ### References - [accept4 on FreeBSD](https://man.freebsd.org/cgi/man.cgi?query=accept4&sektion=2&n=1) - [accept4 on DragonFly](https://man.dragonflybsd.org/?command=accept&section=2) - [accept4 on NetBSD](https://man.netbsd.org/accept.2) - [accept4 on OpenBSD](https://man.openbsd.org/accept4.2) - [accept4 on Solaris](https://docs.oracle.com/cd/E88353_01/html/E37843/accept4-3c.html)	2024-03-12 16:35:52 +02:00
Binbin	da727ad445	Fix redis-check-aof incorrectly considering data in manifest format as MP-AOF (#12958 ) The check in fileIsManifest misjudged the manifest file. For example, if a RESP AOF contains "file", it will be considered a manifest file and the check will fail: ``` *3 $3 set $4 file $4 file ``` In #12951, if the preamble AOF also contains it, it will also fail. Fixes #12951. The bug happened if the word "file" was mentioned in the first 1024 lines of the AOF. Now the check breaks as soon as it finds a non-comment line (whether or not it contains "file").	2024-03-12 08:47:43 +02:00
Harkrishn Patro	3c8d15f8c3	Pick random slot for a node to distribute operation across slots in redis-benchmark (#12986 ) Distribute operations via `redis-benchmark` on different slots owned by node. `current_slot_index` is never updated hence the value is always `0` and the tag used is always the first slot owned by the node. Hence any read/write operation via `redis-benchmark` in cluster mode always happens on a particular slot. This is inconvenient to load data uniformly via `redis-benchmark`.	2024-03-11 11:19:30 -07:00
Matthew Douglass	5fdaa53d20	Fix conversion of numbers in lua args to redis args (#13115 ) Since lua_Number is not explicitly an integer or a double, we need to make an effort to convert it to an integer when that's possible, since the string could later be used in a context that doesn't support scientific notation (e.g. 1e9 instead of 1000000000). Since fpconv_dtoa converts numbers with the equivalent of `%f` or `%e`, whichever is shorter, this would break if we try to pass a long integer number to a command that takes an integer: we'll get an implicit conversion to string in Lua, and then the parsing in getLongLongFromObjectOrReply will fail. ``` > eval "redis.call('hincrby', 'key', 'field', '1000000000')" 0 (nil) > eval "redis.call('hincrby', 'key', 'field', tonumber('1000000000'))" 0 (error) ERR value is not an integer or out of range script: ac99c32e4daf7e300d593085b611de261954a946, on @user_script:1. ``` Switch to using ll2string if the number can be safely represented as a long long. The problem was introduced in #10587 (Redis 7.2). closes #13113. --------- Co-authored-by: Binbin <binloveplay1314@qq.com> Co-authored-by: debing.sun <debing.sun@redis.com> Co-authored-by: Oran Agra <oran@redislabs.com>	2024-03-10 08:46:49 +02:00
Madelyn Olson	4979cf02ff	Change crc16 slot table to be fixed size character array instead of pointer to strings (#13112 ) Update the crc16 hash lookup table to use fixed size character arrays instead of pointers to static string addresses. Since the actual values are so short, we can just store them in a uniform array instead. This saves about 128kb of memory and should improve the performance as well since we should have much better memory locality.	2024-03-08 15:50:36 -08:00
debing.sun	9738ba9841	Check user's oom_score_adj write permission for oom-score-adj test (#13111 ) `CONFIG SET oom-score-adj handles configuration failures` test failed in some CI jobs today. Failed CI: https://github.com/redis/redis/actions/runs/8152519326 Not sure why the permissions of the GitHub Actions docker image have changed, but the issue is similar to #12887, where we can't assume the range of oom_score_adj that a user can change. ## Solution: Change how we determine whether the current user lacks privileges, instead of relying on whether the user id is 0.	2024-03-05 14:42:28 +02:00
Ping Xie	28976a9003	Fix PONG message processing for primary-ship tracking during failovers (#13055 ) This commit updates the processing of PONG gossip messages in the cluster. When a node (B) becomes a replica due to a failover, its PONG messages include its new primary node's (A) information and B's configuration epoch is aligned with A's. This allows observer nodes to identify changes in primary-ship, addressing issues of intermediate states and enhancing cluster state consistency during topology changes. Fix #13018	2024-03-04 17:32:25 -08:00
debing.sun	ad12730333	Implement defragmentation for pubsub kvstore (#13058 ) After #13013 ### This PR make effort to defrag the pubsub kvstore in the following ways: 1. Till now server.pubsub(shard)_channels only share channel name obj with the first subscribed client, now change it so that the clients and the pubsub kvstore share the channel name robj. This would save a lot of memory when there are many subscribers to the same channel. It also means that we only need to defrag the channel name robj in the pubsub kvstore, and then update all client references for the current channel, avoiding the need to iterate through all the clients to do the same things. 2. Refactor the code to defragment pubsub(shard) in the same way as defragment of keys and EXPIRES, with the exception that we only defragment pubsub(without shard) when slot is zero. ### Other Fix an overlook in #11695, if defragment doesn't reach the end time, we should wait for the current db's keys and expires, pubsub and pubsubshard to finish before leaving, now it's possible to exit early when the keys are defragmented. --------- Co-authored-by: oranagra <oran@redislabs.com>	2024-03-04 16:56:50 +02:00
Binbin	33ea432585	Call finalizerProc when free the aeTimeEvent in ae (#13101 ) Supplement to #6189, we also need to call finalizerProc. This is a minor cleanup, no one currently uses this finalizerProc feature.	2024-03-03 09:20:18 +02:00
Binbin	df75153d79	Fix reply schemas validator build issue due to new regular expression (#13103 ) The new regular expression breaks the validator: ``` In file included from commands.c:10: commands_with_reply_schema.def:14528:72: error: stray ‘\’ in program 14528 | struct jsonObjectElement MEMORY_STATS_ReplySchema_patternProperties__db\_\d+__properties_overhead_hashtable_main_elements[] = { ``` The reason is that to_c_name does not handle these special characters, so they appear in the structure name and cause compilation of the C file to fail. Broken by #12913	2024-03-02 21:26:05 +02:00
YaacovHazan	a50bbcb656	redis-cli fixes around help hints version filtering (#13097 ) - In removeUnsupportedArgs, trying to access the next item after the last one and causing an out of bounds read. - In versionIsSupported, when the 'version' is equal to 'since', the return value is 0 (not supported). Also, change the function to return `not supported` in case they have different numbers of digits Both issues were found by `Non-interactive non-TTY CLI: Test command-line hinting - old server` under `test-sanitizer-address` (When changing the `src/version.h` locally to `8.0.0`) The new `MAXAGE` argument inside `client-kill` triggered the issue (new argument at the end of the list) --------- Co-authored-by: YaacovHazan <yaacov.hazan@redislabs.com>	2024-03-02 11:48:36 +02:00
Chen Tianjie	4cae99e785	Add overhead of all DBs and rehashing dict count to info. (#12913 ) Sometimes we need to make fast judgement about why Redis is suddenly taking more memory. One of the reasons is main DB's dicts doing rehashing. We may use `MEMORY STATS` to monitor the overhead memory of each DB, but there still lacks a total sum to show an overall trend. So this PR adds the total overhead of all DBs to `INFO MEMORY` section, together with the total count of rehashing DB dicts, providing some intuitive metrics about main dicts rehashing. This PR adds the following metrics to INFO MEMORY * `mem_overhead_db_hashtable_rehashing` - only size of ht[0] in dictionaries we're rehashing (i.e. the memory that's gonna get released soon) and a similar ones to MEMORY STATS: * `overhead.db.hashtable.lut` (complements the existing `overhead.hashtable.main` and `overhead.hashtable.expires` which also counts the `dictEntry` structs too) * `overhead.db.hashtable.rehashing` - temporary rehashing overhead. * `db.dict.rehashing.count` - number of top level dictionaries being rehashed. --------- Co-authored-by: zhaozhao.zz <zhaozhao.zz@alibaba-inc.com> Co-authored-by: Oran Agra <oran@redislabs.com>	2024-03-01 13:41:24 +08:00
Binbin	f17381a38d	Fix propagation of entries_read by calling streamPropagateGroupID unconditionally (#12898 ) In XREADGROUP ACK, because streamPropagateXCLAIM does not propagate entries-read, entries-read will be inconsistent between master and replicas. I.e. if no entries were claimed, it would have propagated correctly, but if some were claimed, then the entries-read field would be inconsistent on the replica. The fix was suggested by guybe7, call streamPropagateGroupID unconditionally, so that we will normalize entries_read on the replicas. In the past, we would only set propagate_last_id when NOACK was specified. And in #9127, XCLAIM did not propagate entries_read in ACK, which would cause entries_read to be inconsistent between master and replicas. Another approach is add another arg to XCLAIM and let it propagate entries_read, but we decided not to use it. Because we want minimal damage in case there's an old target and new source (in the worst case scenario, the new source doesn't recognize XGROUP SETID ... ENTRIES READ and the lag is lost. If we change XCLAIM, the damage is much more severe). In this patch, now if the user uses XREADGROUP .. COUNT 1 there will be an additional overhead of MULTI, EXEC and XGROUPSETID. We assume the extra command in case of COUNT 1 (4x factor, changing from one XCLAIM to MULTI+XCLAIM+XSETID+EXEC), is probably ok since reading just one entry is in any case very inefficient (a client round trip per record), so we're hoping it's not a common case. Issue was introduced in #9127.	2024-02-29 09:48:20 +02:00
zhaozhao.zz	cc9fbd270e	freeDictIfNeeded when kvstoreEmpty (#13098 ) Just like in `kvstoreDictDelete`, we need to call `freeDictIfNeeded` in `kvstoreEmpty`.	2024-02-29 08:16:41 +02:00
Binbin	a7abc2f067	SCRIPT FLUSH run truly async, close lua interpreter in bio (#13087 ) Even if we have SCRIPT FLUSH ASYNC now, when there are a lot of lua scripts, SCRIPT FLUSH ASYNC will still block the main thread. This is because lua_close is executed in the main thread, and lua heap needs to release a lot of memory. In this PR, we take the current lua instance on lctx.lua and call lua_close on it in a background thread, to close it in async way. This is MeirShpilraien's idea.	2024-02-28 17:57:29 +02:00
LiiNen	763827c981	Fix redis-cli --count (for --scan, --bigkeys, etc) was ignored unless --pattern was also used (#13092 ) The --count option for redis-cli was released in redis 7.2. https://github.com/redis/redis/pull/12042 But I found in the code that some logic was missing for using this 'count' option. ``` static redisReply *sendScan(unsigned long long it) { redisReply *reply; if (config.pattern) reply = redisCommand(context, "SCAN %llu MATCH %b COUNT %d", it, config.pattern, sdslen(config.pattern), config.count); else reply = redisCommand(context,"SCAN %llu",it); ``` The intention was to be able to use scan count. But in this case, --count is only applied when 'pattern' is declared. So I fixed it so that it works properly even if the --pattern option is not used. I tested it with the time command several times, and I could see it works as intended with this commit. The examples of test results are below: ``` # unstable build time(./redis-cli -a $AUTH -p $PORT -h $HOST --scan >/dev/null 2>/dev/null) real 0m1.287s user 0m0.011s sys 0m0.022s # count is not applied time(./redis-cli -a $AUTH -p $PORT -h $HOST --scan --count 1000 >/dev/null 2>/dev/null) real 0m1.117s user 0m0.011s sys 0m0.020s # count is applied with --pattern time(./redis-cli -a $AUTH -p $PORT -h $HOST --scan --count 1000 --pattern "hash:" >/dev/null 2>/dev/null) real 0m0.045s user 0m0.002s sys 0m0.002s ``` ``` # fix-redis-cli-scan-count build time(./redis-cli -a $AUTH -p $PORT -h $HOST --scan >/dev/null 2>/dev/null) real 0m1.084s user 0m0.008s sys 0m0.024s # count is applied even if --pattern is not declared time(./redis-cli -a $AUTH -p $PORT -h $HOST --scan --count 1000 >/dev/null 2>/dev/null) real 0m0.043s user 0m0.000s sys 0m0.004s # of course this also applied time(./redis-cli -a $AUTH -p $PORT -h $HOST --scan --count 1000 --pattern "hash:*" >/dev/null 2>/dev/null) real 0m0.031s user 0m0.002s sys 0m0.002s ``` Thanks a lot.	2024-02-28 09:44:30 +02:00
Yanqi Lv	0a12f380e8	Optimize DEL on expired keys (#13080 ) If we call `DEL` on expired keys, keys may be deleted in `expireIfNeeded` and we don't need to call `dbSyncDelete` or `dbAsyncDelete` afterwards, which would repeat the deletion process (i.e. find keys in main db). In this PR, I refine the return values of `expireIfNeeded` to indicate whether we have deleted the expired key to avoid the potential redundant deletion logic in `delGenericCommand`. Besides, because both KEY_EXPIRED and KEY_DELETED are non-zero, this PR won't affect other functions calling `expireIfNeeded`. I also ran a performance test. I first disabled active expiration with `debug set-active-expire 0` and wrote 1 million keys with 1ms TTL. Then I repeatedly deleted 100 expired keys in one `DEL`. The results are as follows, and show that this PR can improve performance by about 10% in this situation. unstable ``` Summary: throughput summary: 10080.65 requests per second latency summary (msec): avg min p50 p95 p99 max 0.953 0.136 0.959 1.215 1.335 2.247 ``` This PR ``` Summary: throughput summary: 11074.20 requests per second latency summary (msec): avg min p50 p95 p99 max 0.865 0.128 0.879 1.055 1.175 2.159 ``` --------- Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech> Co-authored-by: Oran Agra <oran@redislabs.com>	2024-02-26 12:50:04 +02:00
Binbin	104b207602	Fix size stat in malloc(0) and cleanups around zmalloc file (#13068 ) In #8554, we added a MALLOC_MIN_SIZE to use a minimum allocation size when using malloc(0). However, we did not update the size when malloc_size is missing. When malloc_size exists, we record the size that was allocated instead of the size that was requested. This would work with both jemalloc, and libc malloc (the change in #8554, doesn't break this). When malloc_size is missing, we allocate extra size_t bytes and store the requested size in it. In that case, the requested size is probably different than the allocated size anyway (the change in #8554 doesn't conceptually change that). So we have room for improvement since in this case we are aware of the extra bytes we asked for, just as we are aware of the extra size_t bytes we asked for. In addition, some cleaning was done: 1. fix some outdated comments. 2. test cleanups	2024-02-26 12:07:06 +02:00
Binbin	bfcaa7db0a	Fix minor memory leak in rewriteSetObject (#13086 ) It seems to be a leak caused by code refactoring in #11290. it's a small leak, that only happens if there's an IO error.	2024-02-22 14:46:56 +02:00
debing.sun	4a265554ae	Expose lua os.clock() api (#12971 ) Implement #12699 This PR exposes the Lua os.clock() api for getting the elapsed time of Lua code execution. Usage: ```lua local start = os.clock() ... do something ... local elapsed = os.clock() - start ``` --------- Co-authored-by: Meir Shpilraien (Spielrein) <meir@redis.com> Co-authored-by: Madelyn Olson <34459052+madolson@users.noreply.github.com>	2024-02-22 11:29:52 +02:00
debing.sun	165afc5f2a	Determine the large limit of the quicklist node based on fill (#12659 ) Following #12568 In issue #9357, when inserting an element larger than 1GB, we currently store it in a plain node instead of a listpack. Presently, when we insert an element that exceeds the maximum size of a packed node, it cannot be accommodated in any other nodes, thus ending up isolated like a large element. I.e. it's a node with only one element, but it's listpack encoded rather than a plain buffer. This PR lowers the threshold for considering an element as 'large' from 1GB to the maximum size of a node. While this change doesn't completely resolve the bug mentioned in the previous PR, it does mitigate its potential impact. As a result of this change, we can now only use LSET to replace an element with another element that falls below the maximum size threshold. In the worst-case scenario, with a fill of -5, the largest packed node we can create is 2GB (32k * 64k): * 32k: The smallest element in a listpack is 2 bytes, which allows us to store up to 32k elements. * 64k: This is the maximum size for a single quicklist node. ## Others To fully fix #9357, we need more work, as discussed in #12568, when we insert an element into a quicklistNode, it may be created in a new node, put into another node, or merged, and we can't correctly delete the node that was supposed to be deleted. I'm not sure it's worth it, since it involves a lot of modifications.	2024-02-22 10:02:38 +02:00
guybe7	820a4e45f1	Edit the history field of xinfo-consumers (#13078 ) Now it matches the information in xinfo-stream.json	2024-02-22 09:44:29 +02:00
Binbin	5b9fc46523	Add new allocator.muzzy field to memory-stats reply schema (#13076 ) This field was added in #12996 but we forgot to add it to the json file. This also causes reply-schemas-validator to fail.	2024-02-21 08:35:10 +02:00
debing.sun	f6785df663	Defragger improvements around large bins (#12996 ) Implement #12963 ## Changes 1. large bins don't have external fragmentation or are at least non-defraggable, so we should ignore the effect of large bins when measuring fragmentation, and only measure fragmentation of small bins. this affects both the allocator_frag* metrics and also the active-defrag trigger 2. Adding INFO metrics for `muzzy` memory, which is memory returned to the OS but still shows as RSS until the OS reclaims it. --------- Co-authored-by: Oran Agra <oran@redislabs.com>	2024-02-20 18:11:09 +02:00
Binbin	ca5cac998e	xinfo-stream add minimum to seen-time, skip logreqres in fuzzer (#13056 ) Recently I saw in CI that reply-schemas-validator fails here: ``` Failed validating 'minimum' in schema[1]['properties']['groups']['items']['properties']['consumers']['items']['properties']['active-time']: {'description': 'Last time this consumer was active (successful ' 'reading/claiming).', 'minimum': 0, 'type': 'integer'} On instance['groups'][0]['consumers'][0]['active-time']: -1729380548878722639 ``` The reason is that in fuzzer, we may restore corrupted active-time, which will cause the reply schema CI to fail. The fuzzer can corrupt the state in many places, which will cause bugs that mess up the reply, so we decided to skip logreqres. Also, seen-time is the same type as active-time, so we add the minimum to it as well. --------- Co-authored-by: Oran Agra <oran@redislabs.com>	2024-02-20 12:21:10 +02:00
Binbin	3c2ea1ea95	Fix watched client test timing issue caused by late close (#13062 ) There is a timing issue in the test, close may arrive late, or in freeClientAsync we will free the client in async way, which will lead to errors in watching_clients statistics, since we will only unwatch all keys when we truly freeClient. Add a wait here to avoid this problem. Also fixed some outdated comments I saw. The test was introduced in #12966.	2024-02-20 11:12:19 +02:00
Binbin	4e3be944fc	Fix timing issue in blockedclient test (#13071 ) We can see that the elapsed time here happens to equal busy_time_limit, causing the test to fail: ``` [err]: RM_Call from blocked client in tests/unit/moduleapi/blockedclient.tcl Expected '50' to be more than '50' (context: type eval line 26 cmd {assert_morethan [expr [clock clicks -milliseconds]-$start] $busy_time_limit} proc ::test) ``` It is reasonable for them to be equal, so the assertion now allows equality. It should be noted that the previous `Busy module command` test already used assert_morethan_equal, so this one was probably missed at the time.	2024-02-20 08:43:13 +02:00
judeng	fc3a68d8fb	add -fno-omit-frame-pointer to default compilation flags (#12973 ) Currently redis uses O3 level optimization, which removes the frame pointer in the target binary. For a long time, gcc has omitted the frame pointer by default at O1 and above to improve performance. This frees up the RBP register and reduces the pop/push instructions. But it makes it difficult for us to observe the running status of the program. For example, the perf tool cannot be used effectively, and neither can modern eBPF tools such as bcc/memleak.	2024-02-19 11:47:02 -08:00
guybe7	6df42df291	Adds a README to the command JSON files (#13066 ) Add readme about the command json folder, what it does, and who should (not) use it. see discussion https://github.com/redis/redis/issues/9359#issuecomment-1936420698 --------- Co-authored-by: Oran Agra <oran@redislabs.com> Co-authored-by: Binbin <binloveplay1314@qq.com>	2024-02-19 18:49:31 +02:00
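
For reference, the redis-cli `--count` fix described in 763827c981 (#13092) can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `build_scan_command` is a hypothetical helper that formats the command text with `snprintf`, whereas redis-cli sends the command through hiredis `redisCommand`. The point it shows is that `COUNT` is appended on both branches, whether or not a pattern is given.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not part of redis-cli): writes the SCAN command
 * text for the given cursor into buf. pattern may be NULL. COUNT is
 * appended in both branches, which is the behavior the fix is after. */
int build_scan_command(char *buf, size_t len, unsigned long long cursor,
                       const char *pattern, int count) {
    if (pattern != NULL) {
        /* Pattern given: MATCH and COUNT are both sent. */
        return snprintf(buf, len, "SCAN %llu MATCH %s COUNT %d",
                        cursor, pattern, count);
    }
    /* No pattern: COUNT is still sent instead of being dropped. */
    return snprintf(buf, len, "SCAN %llu COUNT %d", cursor, count);
}
```

Calling the helper with a NULL pattern, cursor 0 and count 1000 produces `SCAN 0 COUNT 1000`, where the pre-fix code shown in the commit message would have sent a bare `SCAN 0`.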
