redis

Commit Graph

Author	SHA1	Message	Date
Oran Agra	f35f36a265	Redis 7.0.15	2024-01-09 13:51:41 +02:00
Binbin	3e720bbcf5	Use CLZ in _dictNextExp to get the next power of two (#12815 ) In the past, we did not call _dictNextExp frequently. It was only called when the dictionary was expanded. Later, dictTypeExpandAllowed was introduced in #7954, which is 6.2. For the data dict and the expire dict, we can check maxmemory before actually expanding the dict. This is a good optimization to avoid maxmemory being exceeded due to the dict expansion. And in #11692, we moved the dictTypeExpandAllowed check before the threshold check. This caused a bit of performance degradation: every time a key is added to the dict, dictTypeExpandAllowed is called to check. The main reason for degradation is that in a large dict, we need to call _dictNextExp frequently, that is, every time we add a key, we need to call _dictNextExp once. Then the threshold is checked to see if the dict needs to be expanded. We can see that the order of checks here can be optimized. So we moved the dictTypeExpandAllowed check back to after the threshold check in #12789. In this way, before the dict is actually expanded (that is, before the threshold is reached), we will not do anything extra compared to before, that is, we will not call _dictNextExp frequently. But note we'll still hit the degradation when we are over the thresholds. When the threshold is reached, because of #7954, we may delay the dict expansion due to maxmemory limitations. In this case, we will call _dictNextExp every time we add a key during this period. This PR uses CLZ in _dictNextExp to get the next power of two. CLZ (count leading zeros) can easily give you the next power of two. It should be noted that we have actually introduced the use of __builtin_clzl in #8687, which is 7.0. So I suppose all the platforms we use have it (even if the CPU doesn't have an instruction). We build 67108864 (2^26) keys through DEBUG POPULATE, which will use approximately 5.49G memory (used_memory:5898522936). If expansion is triggered, the additional hash table will consume approximately 1G memory (2^27 * 8). So we set maxmemory to 6871947673 (that is, 6.4G), which will be less than 5.49G + 1G, so we will delay the dict rehash while adding the keys. After that, each time an element is added to the dict, an allow check will be performed, that is, we can frequently call _dictNextExp to test the comparison before and after the optimization. Using DEBUG HTSTATS 0 to check and make sure that our dict expansion is delayed. Using `./src/redis-server redis.conf --save "" --maxmemory 6871947673`. Using `./src/redis-benchmark -P 100 -r 1000000000 -t set -n 5000000`. After ten rounds of testing: ``` unstable: this PR: 769585.94 816860.00 771724.00 818196.69 775674.81 822368.44 781983.12 822503.69 783576.25 828088.75 784190.75 828637.75 791389.69 829875.50 794659.94 835660.69 798212.00 830013.25 801153.62 833934.56 ``` We can see there is about 4-5% performance improvement in this case. (cherry picked from commit `22cc9b5122`)	2024-01-09 13:51:41 +02:00
Binbin	8263146227	Optimize dict expand check, move allow check after the thresholds check (#12789 ) dictExpandAllowed (for the main db dict and the expire dict) seems to involve a few function calls and memory accesses, and we can do it only after the thresholds checks and can get some performance improvements. A simple benchmark test: there are 11032768 fixed keys in the database, start a redis-server with `--maxmemory big_number --save ""`, start a redis-benchmark with `-P 100 -r 1000000000 -t set -n 5000000`, collect `throughput summary: n requests per second` result. After five rounds of testing: ``` unstable this PR 848032.56 897988.56 854408.69 913408.88 858663.94 914076.81 871839.56 916758.31 882612.56 920640.75 ``` We can see a 5% performance improvement in the general case. But note we'll still hit the degradation when we are over the thresholds. (cherry picked from commit `463476933c`)	2024-01-09 13:51:41 +02:00
Oran Agra	e351099e11	Fix possible corruption in sdsResize (CVE-2023-41056) #11766 introduced a bug in sdsResize where it could forget to update the sds type in the sds header and then cause an overflow in sdsalloc. it looks like the only implication of that is a possible assertion in HLL, but it's hard to rule out possible heap corruption issues with clientsCronResizeQueryBuffer	2024-01-09 13:51:41 +02:00
Oran Agra	c1d92a69c6	Redis 7.0.14	2023-10-18 10:43:44 +03:00
Binbin	6573acbd73	Support NO ONE block in REPLICAOF command json (#12633 ) The current commands.json doesn't mention the special NO ONE arguments. This change is also applied to SLAVEOF (cherry picked from commit `8d92f7f2b7`)	2023-10-18 10:43:44 +03:00
Jachin	8ada737f0a	Fix compile on macOS 13 (#12611 ) Use the __MAC_OS_X_VERSION_MIN_REQUIRED macro to detect the macOS system version instead of using MAC_OS_X_VERSION_10_6. From MacOSX14.0.sdk, the default definitions of MAC_OS_X_VERSION_xxx have been removed in usr/include/AvailabilityMacros.h. It includes AvailabilityVersions.h, where the following condition must be met: `#if (!defined(_POSIX_C_SOURCE) && !defined(_XOPEN_SOURCE)) \|\| defined(_DARWIN_C_SOURCE)` Only then will MAC_OS_X_VERSION_xxx be defined. However, in the project, _DARWIN_C_SOURCE is not defined, which leads to the loss of the definition for MAC_OS_X_VERSION_10_6. (cherry picked from commit `a2b0701d2c`)	2023-10-18 10:43:44 +03:00
Yossi Gottlieb	7f486ea6ee	Fix issue of listen before chmod on Unix sockets (CVE-2023-45145) Before this commit, Unix socket setup performed chmod(2) on the socket file after calling listen(2). Depending on what umask is used, this could leave the file with the wrong permissions for a short period of time. As a result, another process could exploit this race condition and establish a connection that would otherwise not be possible. We now make sure the socket permissions are set up prior to calling listen(2). (cherry picked from commit `a11b3bc34a`)	2023-10-18 10:43:44 +03:00
Oran Agra	49dbedb1d5	Redis 7.0.13	2023-09-06 20:55:58 +03:00
bodong.ybd	0f14d32792	Fix sort_ro get-keys function return wrong key number (#12522 ) Before: ``` 127.0.0.1:6379> command getkeys sort_ro key (empty array) 127.0.0.1:6379> ``` After: ``` 127.0.0.1:6379> command getkeys sort_ro key 1) "key" 127.0.0.1:6379> ``` (cherry picked from commit `b59f53efb3`)	2023-09-06 20:55:58 +03:00
zhaozhao.zz	4d67bb6afa	do not call handleClientsBlockedOnKeys inside yielding command (#12459 ) Fix the assertion triggered when a busy (timed-out) script signals ready keys (like LPUSH), and then an arbitrary client's `allow-busy` command steps into `handleClientsBlockedOnKeys` and tries to wake up clients blocked on keys (like BLPOP). Reproduction process: 1. start a redis with aof `./redis-server --appendonly yes` 2. exec blpop `127.0.0.1:6379> blpop a 0` 3. use another client to call a busy script that pushes to the blocked key `127.0.0.1:6379> eval "redis.call('lpush','a','b') while(1) do end" 0` 4. use a new client to call an allow-busy command like auth `127.0.0.1:6379> auth a` BTW, this issue also breaks the atomicity of scripts. This bug has been around for many years; the old versions only have the atomicity problem, and only 7.0/7.2 have the assertion problem. Co-authored-by: Oran Agra <oran@redislabs.com> (cherry picked from commit `8226f39fb2`)	2023-09-06 20:55:58 +03:00
Meir Shpilraien (Spielrein)	37599fe75a	Ensure that the function load timeout is disabled during loading from RDB/AOF and on replicas. (#12451 ) When loading a function from either RDB/AOF or a replica, it is essential not to fail on timeout errors. The loading time may vary due to various factors, such as hardware specifications or the system's workload during the loading process. Once a function has been successfully loaded, it should be allowed to load from persistence or on replicas without encountering a timeout failure. To maintain a clear separation between the engine and Redis internals, the implementation refrains from directly checking the state of Redis within the engine itself. Instead, the engine receives the desired timeout as part of the library creation and duly respects this timeout value. If Redis wishes to disable any timeout, it can simply send a value of 0. (cherry picked from commit `2ee1bbb53b`)	2023-09-06 20:55:58 +03:00
Sankar	ea1bc6f623	Process loss of slot ownership in cluster bus (#12344 ) Process loss of slot ownership in cluster bus When a node no longer owns a slot, it clears the bit corresponding to the slot in the cluster bus messages. The receiving nodes currently don't record the fact that the sender stopped claiming a slot until some other node in the cluster starts claiming the slot. This can cause a slot to go missing during slot migration when subjected to inopportune race with addition of new shards or a failover. This fix forces the receiving nodes to process the loss of ownership to avoid spreading wrong information. (cherry picked from commit `1190f25ca7`)	2023-09-06 20:55:58 +03:00
sundb	646069a900	Skip test for sdsRemoveFreeSpace when mem_allocator is not jemalloc (#11878 ) Test `trim on SET with big value` (introduced from #11817) fails under mac m1 with libc mem_allocator. The reason is that malloc(33000) will allocate 65536 bytes(>42000). This test still passes under ubuntu with libc mem_allocator. ``` *** [err]: trim on SET with big value in tests/unit/type/string.tcl Expected [r memory usage key] < 42000 (context: type source line 471 file /Users/iospack/data/redis_fork/tests/unit/type/string.tcl cmd {assert {[r memory usage key] < 42000}} proc ::test) ``` simple test under mac m1 with libc mem_allocator: ```c void *p = zmalloc(33000); printf("malloc size: %zu\n", zmalloc_size(p)); # output malloc size: 65536 ``` (cherry picked from commit `3fba3ccd96`)	2023-09-06 20:55:58 +03:00
Oran Agra	8e73f9d348	Redis 7.0.12	2023-07-10 14:39:42 +03:00
sundb	f90ecfb1f7	Fix compile errors when building with gcc-12 or clang (partial #12035 ) This is a partial cherry-pick from Redis 7.2 ## Fix various compilation warnings and errors 5) server.c COMPILER: gcc-13 with FORTIFY_SOURCE WARNING: ``` In function 'lookupCommandLogic', inlined from 'lookupCommandBySdsLogic' at server.c:3139:32: server.c:3102:66: error: '*(robj **)argv' may be used uninitialized [-Werror=maybe-uninitialized] 3102 \| struct redisCommand *base_cmd = dictFetchValue(commands, argv[0]->ptr); \| ~~~~^~~ ``` REASON: The compiler thinks that the `argc` returned by `sdssplitlen()` could be 0, resulting in an empty array of size 0 being passed to lookupCommandLogic. This should be a false positive, `argc` can't be 0 when strings are not NULL. SOLUTION: add an assert to let the compiler know that `argc` is positive. ## Other changes 1) Fixed `ps -p [pid]` not outputting `<defunct>` when using procps 4.x, which caused the `replication child dies when parent is killed - diskless` test to fail. (cherry picked from commit `42c8c61813`)	2023-07-10 14:39:42 +03:00
Lior Lahav	bd1dac0c6e	Fix possible crash in command getkeys (#12380 ) When getKeysUsingKeySpecs processes a command with more than one key-spec, and called with a total of more than 256 keys, it'll call getKeysPrepareResult again, but since numkeys isn't updated, getKeysPrepareResult will not bother to copy key names from the old result (leaving these slots uninitialized). Furthermore, it did not consider the keys it already found when allocating more space. Co-authored-by: Oran Agra <oran@redislabs.com> (cherry picked from commit `b7559d9f3b`)	2023-07-10 14:39:42 +03:00
sundb	25f610fc29	Use Reservoir Sampling for random sampling of dict, and fix hang during fork (#12276 ) ## Issue: When a dict has a long chain or the length of the chain is longer than the number of samples, we will never be able to sample the elements at the end of the chain using dictGetSomeKeys(). This could mean that SRANDMEMBER can hang in an endless loop. The most severe case is the pathological case where someone uses SCAN+DEL or SSCAN+SREM, creating an unevenly distributed dict. This was amplified by the recent change in #11692 which prevented a down-sizing rehashing while there is a fork. ## Solution 1. Before, we would stop sampling when we reached the maximum number of samples, even if there was more data after the current chain. Now when we reach the maximum we use the Reservoir Sampling algorithm to fairly sample the end of the chain that cannot be sampled 2. Fix the rehashing code, so that, just as it allows rehashing for up-sizing during fork when the ratio is extreme, it will allow it for down-sizing as well. Issue was introduced (or became more severe) by #11692 Co-authored-by: Oran Agra <oran@redislabs.com> (cherry picked from commit `b00a235186`)	2023-07-10 14:39:42 +03:00
Binbin	eb64a97d33	Add missing return on -UNKILLABLE sent by master case (#12277 ) We no longer propagate scripts (starting from 7.0), so this is a very rare issue in nearly dead code. This is an oversight in #9780 (cherry picked from commit `e4d183afd3`)	2023-07-10 14:39:42 +03:00
Oran Agra	2ba8de9d5e	Fix WAIT for clients being blocked in a module command (#12220 ) So far clients being blocked and unblocked by a module command would update the c->woff variable and so WAIT was ineffective and got released without waiting for the command actions to propagate. This seems to have existed since forever, but not for RM_BlockClientOnKeys. It is problematic though to know if the module did or didn't propagate anything in that command, so for now, instead of adding an API, we'll just update the woff to the latest offset when unblocking, this will cause the client to possibly wait excessively, but that's not that bad. (cherry picked from commit `6117f28822`)	2023-07-10 14:39:42 +03:00
Shaya Potter	1d2839a830	Fix memory leak when RM_Call's RUN_AS_USER fails (#12158 ) previously the argv wasn't freed so would leak. not a common case, but should be handled. Solution: move RUN_AS_USER setup and error exit to the right place. this way, when we do `goto cleanup` (instead of return) it'll automatically do the right thing (including autoMemoryAdd) Removed the user argument from moduleAllocTempClient (reverted to the state before `6e993a5`) Co-authored-by: Oran Agra <oran@redislabs.com> (cherry picked from commit `71e6abe423`)	2023-07-10 14:39:42 +03:00
Brennan	c340fd5a39	Prevent repetitive backlog trimming (#12155 ) When `replicationFeedSlaves()` serializes a command, it repeatedly calls `feedReplicationBuffer()` to feed it to the replication backlog piece by piece. It is unnecessary to call `incrementalTrimReplicationBacklog()` for every small amount of data added with `feedReplicationBuffer()` as the chance of the conditions being met for trimming are very low and these frequent calls add up to a notable performance cost. Instead, we will only attempt trimming when a new block is added to the replication backlog. Using redis-benchmark to saturate a local redis server indicated a performance improvement of around 3-3.5% for 100 byte SET commands with this change. (cherry picked from commit `40e6131ba5`)	2023-07-10 14:39:42 +03:00
zhaozhao.zz	88682ca305	Free backlog only if rsi is invalid when master reboot (#12088 ) When master reboot from RDB, if rsi in RDB is valid we should not free replication backlog, even if master_repl_offset or repl-offset is 0. Since if master doesn't send any data to replicas master_repl_offset is 0, it's a valid number. A clear example: 1. start a master and apply some write commands, the master's master_repl_offset is 0 since it has no replicas. 2. stop write commands on master, and start another instance and replicaof the master, trigger an FULLRESYNC 3. the master's master_repl_offset is still 0 (set a large number for repl-ping-replica-period), do BGSAVE and restart the master 4. master load master_repl_offset from RDB's rsi and it's still 0, and we should make sure replica can partially resync with master. (cherry picked from commit `b0dd7b3245`)	2023-07-10 14:39:42 +03:00
Oran Agra	f6a7c9f9ec	Lua cjson and cmsgpack integer overflow issues (CVE-2022-24834) * Fix integer overflows due to using wrong integer size. * Add assertions / panic when overflow still happens. * Deletion of dead code to avoid need to maintain it * Some changes are not because of bugs, but rather paranoia. * Improve cmsgpack and cjson test coverage. Co-authored-by: Yossi Gottlieb <yossigo@gmail.com>	2023-07-10 14:39:42 +03:00
Oran Agra	391aa4079c	Redis 7.0.11	2023-04-17 15:54:03 +03:00
Oran Agra	6b17d824c5	fix false valgrind error on new hash test (#11200 ) New test fails on valgrind because strtold("+inf") with valgrind returns a non-inf result same thing is done in incr.tcl. (cherry picked from commit `c3b7bde914`)	2023-04-17 15:54:03 +03:00
Oran Agra	5656cc82a6	Avoid valgrind fishy value warning on corrupt restore payloads (#10937 ) The corrupt dump fuzzer uncovered a valgrind warning saying: ``` ==76370== Argument 'size' of function malloc has a fishy (possibly negative) value: -3744781444216323815 ``` This allocation would have failed (returning NULL) and being handled properly by redis (even before this change), but we also want to silence the valgrind warnings (which are checking that casting to ssize_t produces a non-negative value). The solution i opted for is to explicitly fail these allocations (returning NULL), before even reaching `malloc` (which would have failed and return NULL too). The implication is that we will not be able to support a single allocation of more than 2GB on a 32bit system (which i don't think is a realistic scenario). i.e. i do think we could be facing cases were redis consumes more than 2gb on a 32bit system, but not in a single allocation. The byproduct of this, is that i dropped the overflow assertions, since these will now lead to the same OOM panic we have for failed allocations. (cherry picked from commit `599e59ebc5`)	2023-04-17 15:54:03 +03:00
sundb	863fcfbf52	Use dummy allocator to make accesses defined as per standard (#11982 ) NOTE: for 7.0 backport we don't declare malloc_size attributes in zmalloc.h so that we don't take the risk of inducing any crashes in a bugfix release, so will only have effect if LTO was enforced from outside. ## Issue When we use GCC-12 later or clang 9.0 later to build with `-D_FORTIFY_SOURCE=3`, we can see the following buffer overflow: ``` === REDIS BUG REPORT START: Cut & paste starting from here === 6263:M 06 Apr 2023 08:59:12.915 # Redis 255.255.255 crashed by signal: 6, si_code: -6 6263:M 06 Apr 2023 08:59:12.915 # Crashed running the instruction at: 0x7f03d59efa7c ------ STACK TRACE ------ EIP: /lib/x86_64-linux-gnu/libc.so.6(pthread_kill+0x12c)[0x7f03d59efa7c] Backtrace: /lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7f03d599b520] /lib/x86_64-linux-gnu/libc.so.6(pthread_kill+0x12c)[0x7f03d59efa7c] /lib/x86_64-linux-gnu/libc.so.6(raise+0x16)[0x7f03d599b476] /lib/x86_64-linux-gnu/libc.so.6(abort+0xd3)[0x7f03d59817f3] /lib/x86_64-linux-gnu/libc.so.6(+0x896f6)[0x7f03d59e26f6] /lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x2a)[0x7f03d5a8f76a] /lib/x86_64-linux-gnu/libc.so.6(+0x1350c6)[0x7f03d5a8e0c6] src/redis-server 127.0.0.1:25111(+0xd5e80)[0x557cddd3be80] src/redis-server 127.0.0.1:25111(feedReplicationBufferWithObject+0x78)[0x557cddd3c768] src/redis-server 127.0.0.1:25111(replicationFeedSlaves+0x1a4)[0x557cddd3cbc4] src/redis-server 127.0.0.1:25111(+0x8721a)[0x557cddced21a] src/redis-server 127.0.0.1:25111(call+0x47a)[0x557cddcf38ea] src/redis-server 127.0.0.1:25111(processCommand+0xbf4)[0x557cddcf4aa4] src/redis-server 127.0.0.1:25111(processInputBuffer+0xe6)[0x557cddd22216] src/redis-server 127.0.0.1:25111(readQueryFromClient+0x3a8)[0x557cddd22898] src/redis-server 127.0.0.1:25111(+0x1b9134)[0x557cdde1f134] src/redis-server 127.0.0.1:25111(aeMain+0x119)[0x557cddce5349] src/redis-server 127.0.0.1:25111(main+0x466)[0x557cddcd6716] 
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7f03d5982d90] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7f03d5982e40] src/redis-server 127.0.0.1:25111(_start+0x25)[0x557cddcd7025] ``` The main reason is that when FORTIFY_SOURCE is enabled, GCC or clang will enhance some common functions, such as `strcpy`, `memcpy`, `fgets`, etc, so that they can detect buffer overflow errors and stop program execution, thus improving the safety of the program. We use `zmalloc_usable_size()` everywhere to use memory blocks, but that is an abuse since the malloc_usable_size() isn't meant for this kind of use, it is for diagnostics only. That is also why the behavior is flaky when built with _FORTIFY_SOURCE: the compiler can sense that we reach outside the allocated block and raises SIGABRT. ### Solution If we need to use the additional memory we got, we need to use a dummy realloc with `alloc_size` attribute and no inlining, (see `extend_to_usable`) to let the compiler see the larger amount of memory we need to use. This can either be an implicit call inside `zusable` that returns the size, so that the caller doesn't have any other worry, or it can be a normal zmalloc call which means that if the caller wants to use zmalloc_usable_size it must also use extend_to_usable. ### Changes This PR does the following: 1) rename the current z[try]malloc_usable family to z[try]malloc_internal and don't expose them to users outside zmalloc.c, 2) expose a new set of `z[]_usable` family that use z[]_internal and `extend_to_usable()` implicitly, the caller gets the size of the allocation and it is safe to use. 3) go over all the users of `zmalloc_usable_size` and convert them to use the `z[]_usable` family if possible. 
4) in the places where the caller can't use `z[]_usable` and store the real size, and must still rely on zmalloc_usable_size, we still make sure that the allocation used `z[]_usable` (which has a call to `extend_to_usable()`) and ignores the returning size, this way a later call to `zmalloc_usable_size` is still safe. [4] was done for module.c and listpack.c, all the others places (sds, reply proto list, replication backlog, client->buf) are using [3]. Co-authored-by: Oran Agra <oran@redislabs.com> (cherry picked from commit `e0b378d22b`)	2023-04-17 15:54:03 +03:00
Slava Koyfman	90f489b0a4	Disconnect pub-sub subscribers when revoking `allchannels` permission (#11992 ) The existing logic for killing pub-sub clients did not handle the `allchannels` permission correctly. For example, if you: ACL SETUSER foo allchannels Have a client authenticate as the user `foo` and subscribe to a channel, and then: ACL SETUSER foo resetchannels The subscribed client would not be disconnected, though new clients under that user would be blocked from subscribing to any channels. This was caused by an incomplete optimization in `ACLKillPubsubClientsIfNeeded` checking whether the new channel permissions were a strict superset of the old ones. (cherry picked from commit `f38aa6bfb7`)	2023-04-17 15:54:03 +03:00
Binbin	1788568425	Fix fork done handler wrongly update fsync metrics and enhance AOF_ FSYNC_ALWAYS (#11973 ) This PR fix several unrelated bugs that were discovered by the same set of tests (WAITAOF tests in #11713), could make the `WAITAOF` test hang. The change in `backgroundRewriteDoneHandler` is about MP-AOF. That leftover / old code assumes that we started a new AOF file just now (when we have a new base into which we're gonna incrementally write), but the fact is that with MP-AOF, the fork done handler doesn't really affect the incremental file being maintained by the parent process, there's no reason to re-issue `SELECT`, and no reason to update any of the fsync variables in that flow. This should have been deleted with MP-AOF (introduced in #9788, 7.0). The damage is that the update to `aof_fsync_offset` will cause us to miss an fsync in `flushAppendOnlyFile`, that happens if we stop write commands in `AOF_FSYNC_EVERYSEC` while an AOFRW is in progress. This caused a new `WAITAOF` test to sometime hang forever. Also because of MP-AOF, we needed to change `aof_fsync_offset` to `aof_last_incr_fsync_offset` and match it to `aof_last_incr_size` in `flushAppendOnlyFile`. This is because in the past we compared `aof_fsync_offset` and `aof_current_size`, but with MP-AOF it could be the total AOF file will be smaller after AOFRW, and the (already existing) incr file still has data that needs to be fsynced. The change in `flushAppendOnlyFile`, about the `AOF_FSYNC_ALWAYS`, it is follow #6053 (the details is in #5985), we also check `AOF_FSYNC_ALWAYS` to handle a case where appendfsync is changed from everysec to always while there is data that's written but not yet fsynced. (cherry picked from commit `cb17178658`)	2023-04-17 15:54:03 +03:00
chendianqiang	1c1bd618c9	fix hincrbyfloat not to create a key if the new value is invalid (#11149 ) Check the validity of the value before performing the create operation, prevents new data from being generated even if the request fails to execute. Co-authored-by: Oran Agra <oran@redislabs.com> Co-authored-by: chendianqiang <chendianqiang@meituan.com> Co-authored-by: Binbin <binloveplay1314@qq.com> (cherry picked from commit `bc7fe41e58`)	2023-04-17 15:54:03 +03:00
Oran Agra	f651708a19	Redis 7.0.10	2023-03-20 19:16:50 +02:00
Oran Agra	6956d15b77	Avoid assertion when MSETNX is used with the same key twice (CVE-2023-28425) Using the same key twice in MSETNX command would trigger an assertion. This reverts #11594 (introduced in Redis 7.0.8)	2023-03-20 19:16:50 +02:00
Binbin	66ff5e6974	Fix tail->repl_offset update in feedReplicationBuffer (#11905 ) In #11666, we added a while loop and will split a big reply node to multiple nodes. The update of tail->repl_offset may be wrong. Like before #11666, we would have created at most one new reply node, and now we will create multiple nodes if it is a big reply node. Now we are creating more than one node, and the tail->repl_offset of all the nodes except the last one are incorrect. Because we update master_repl_offset at the beginning, and then use it to update the tail->repl_offset. This would have lead to an assertion during PSYNC, a test was added to validate that case. Besides that, the calculation of size was adjusted to fix tests that failed due to a combination of a very low backlog size, and some thresholds of that get violated because of the relatively high overhead of replBufBlock. So now if the backlog size / 16 is too small, we'll take PROTO_REPLY_CHUNK_BYTES instead. Co-authored-by: Oran Agra <oran@redislabs.com> (cherry picked from commit `7997874f4d`)	2023-03-20 19:16:50 +02:00
xbasel	8869589430	Large blocks of replica client output buffer could lead to psync loops and unnecessary memory usage (#11666 ) This can happen when a key almost equal to or larger than the client output buffer limit of the replica is written. Example: 1. DB is empty 2. Backlog size is 1 MB 3. Client output buffer limit is 2 MB 4. Client writes a 3 MB key 5. The shared replication buffer will have a single node which contains the key written above, and it exceeds the backlog size. At this point the client output buffer usage calculation will report the replica buffer to be 3 MB (or more) even after sending all the data to the replica. The primary drops the replica connection for exceeding the limits, the replica reconnects and successfully executes partial sync but the primary will drop the connection again because the buffer usage is still 3 MB. This happens over and over. To mitigate the problem, this fix limits the maximum size of a single backlog node to be (repl_backlog_size/16). This way a single node can't exceed the limits of the COB (the COB has to be larger than the backlog). It also means that if the backlog has some excessive data it can't trim, it would be at most about 6% overuse. Other notes: 1. a loop was added in feedReplicationBuffer which caused a massive LOC change due to indentation, the actual changes are just the `min(max` and the loop. 3. an unrelated change in an existing test to speed up a server termination which took 10 seconds. Co-authored-by: Oran Agra <oran@redislabs.com> (cherry picked from commit `7be7834e65`)	2023-03-20 19:16:50 +02:00
Binbin	f8ae7a414c	Fix the bug that CLIENT REPLY OFF\|SKIP cannot receive push notifications (#11875 ) This bug seems to be there forever, CLIENT REPLY OFF\|SKIP will mark the client with CLIENT_REPLY_OFF or CLIENT_REPLY_SKIP flags. With these flags, prepareClientToWrite called by addReply* will return C_ERR directly. So the client can't receive the Pub/Sub messages and any other push notifications, e.g client side tracking. In this PR, we adding a CLIENT_PUSHING flag, disables the reply silencing flags. When adding push replies, set the flag, after the reply, clear the flag. Then add the flag check in prepareClientToWrite. Fixes #11874 Note, the SUBSCRIBE command response is a bit awkward, see https://github.com/redis/redis-doc/pull/2327 Co-authored-by: Oran Agra <oran@redislabs.com> (cherry picked from commit `416842e6c0`)	2023-03-20 19:16:50 +02:00
Madelyn Olson	17181517ec	Always compact nodes in stream listpacks after creating new nodes (#11885 ) This change attempts to alleviate a minor memory usage degradation for Redis 6.2 and onwards when using rather large objects (~2k) in streams. Introduced in #6281, we pre-allocate the head nodes of a stream to be 4kb, to limit the amount of unnecessary initial reallocations that are done. However, if we only ever allocate one object because 2 objects exceeds the max_stream_entry_size, we never actually shrink it to fit the single item. This can lead to a lot of excessive memory usage. For smaller item sizes this becomes less of an issue, as the overhead decreases as the items become smaller in size. This commit also changes the MEMORY USAGE of streams, since it was reporting the lpBytes instead of the allocated size. This introduced an observability issue when diagnosing the memory issue, since Redis reported the same amount of used bytes pre and post change, even though the new implementation allocated more memory. (cherry picked from commit `2bb29e4aa3`)	2023-03-20 19:16:50 +02:00
Ozan Tezcan	a39032214d	Ignore RM_Call deny-oom flag if maxmemory is zero (#11319 ) If a command gets an OOM response and then if we set maxmemory to zero to disable the limit, server.pre_command_oom_state never gets updated and it stays true. As RM_Call() calls with "respect deny-oom" flag checks server.pre_command_oom_state, all calls will fail with OOM. Added server.maxmemory check in RM_Call() to process deny-oom flag only if maxmemory is configured. (cherry picked from commit `18920813a9`)	2023-03-20 19:16:50 +02:00
Oran Agra	86920532f7	Redis 7.0.9	2023-02-28 18:32:34 +02:00
Oran Agra	2a2a582e7c	Integer Overflow in RAND commands can lead to assertion (CVE-2023-25155) The issue happens when passing a negative long value whose magnitude is greater than the maximum positive value that a long can store.	2023-02-28 18:32:34 +02:00
Tom Levy	0825552565	String pattern matching had exponential time complexity on pathological patterns (CVE-2022-36021) Authenticated users can use string matching commands with a specially crafted pattern to trigger a denial-of-service attack on Redis, causing it to hang and consume 100% CPU time.	2023-02-28 18:32:34 +02:00
ranshid	7091b495a0	Fix possible memory corruption in FLUSHALL when a client watches more than one key (#11854 ) Avoid calling unwatchAllKeys when running touchAllWatchedKeysInDb (which was unnecessary). This can potentially lead to use-after-free and memory corruption when the next entry pointer held by the watched keys iterator is freed when unwatching all keys of a specific client. Found with address sanitizer; added a test which will not always fail (depending on the random dict hashing seed). Problem introduced in #9829 (Redis 7.0) Co-authored-by: Oran Agra <oran@redislabs.com> (cherry picked from commit `18017df7c1`)	2023-02-28 18:32:34 +02:00
Madelyn Olson	ddf984d058	Prevent Redis from crashing from key tracking invalidations (#11814 )	2023-02-28 18:32:34 +02:00
judeng	feb796d312	add test case and comments for active expiry in the writeable replica (#11789) This test case covers an edge scenario: when a writable replica also has AOF enabled, actively expired keys which were created in the writable replica should propagate to the AOF file, and some versions might crash (fixed by #11615). For details, please refer to #11778 (cherry picked from commit `40659c3424`)	2023-02-28 18:32:34 +02:00
zhaozhao.zz	c2bedf2dd5	correct cluster inbound link keepalive time (#11785) (cherry picked from commit `a35e08370a`)	2023-02-28 18:32:34 +02:00
guybe7	2db720591b	SCAN/RANDOMKEY and lazy-expire (#11788) Starting from Redis 7.0 (#9890) we started wrapping everything a command propagates with MULTI/EXEC. The problem is that both SCAN and RANDOMKEY can lazy-expire arbitrary keys (similar behavior to active-expire), and put DELs in a transaction. Fix: When these commands are called without a parent exec-unit (e.g. not in EVAL or MULTI) we avoid wrapping their DELs in a transaction (for the same reasons active-expire and eviction avoid a transaction). This PR adds a per-command flag that indicates that the command may touch arbitrary keys (not the ones in the arguments), and uses that flag to avoid the MULTI-EXEC. For now, this flag is internal, since we're considering other solutions for the future. Note for cluster mode: if SCAN/RANDOMKEY is inside EVAL/MULTI it can still cause the same situation (as it always did), but it won't cause a CROSSSLOT because replicas and AOF do not perform slot checks. The problem with the above is mainly for 3rd party ecosystem tools that propagate commands from master to master, or feed an AOF file with redis-cli into a master. This PR aims to fix the regression in redis 7.0, and we opened #11792 to try to handle the bigger problem with lazy expire better for another release. (cherry picked from commit `fd82bccd0e`)	2023-02-28 18:32:34 +02:00
Ran Shidlansik	ab05b28b22	fix cluster propagation in case of disconnected cluster node, see #11752 The mentioned PR which was fixed before 7.2 needed these adjustments in order to fix the problem in redis 7.0.	2023-02-28 18:32:34 +02:00
Harkrishn Patro	ca0b6caeed	Propagate message to a node only if the cluster link is healthy. (#11752) Currently, when a sharded pubsub publish tries to propagate the message across the cluster, a NULL check is missing for clusterLink. clusterLink could be NULL if the link's memory usage exceeds the threshold set by cluster-link-sendbuf-limit and the server terminates the link. This change introduces two things: (1) it avoids an engine crash on the publishing node when a message is sent to a node whose link is NULL; (2) it adds a debugging tool, CLUSTERLINK KILL, to terminate the clusterLink between two nodes. (cherry picked from commit `fd3975684a`)	2023-02-28 18:32:34 +02:00
Binbin	5aaa1a271c	Document some fields history of CLIENT LIST command (#11729) Change history: - `user` added in 6.0.0, `0f42447a0e` - `argv-mem` and `tot-mem` added in 6.2.0, `bea40e6a41` - `redir` added in 6.2.0, `dd1f20edc5` - `resp` added in 7.0.0, `7c376398b1` - `multi-mem` added in 7.0.0, `2753429c99` - `rbs` and `rbp` added in 7.0.0, `47c51d0c78` - `ssub` added in 7.0.3, `35c2ee8716` (cherry picked from commit `e7f35edb13`)	2023-02-28 18:32:34 +02:00
uriyage	af80a4a554	Optimization: sdsRemoveFreeSpace to avoid realloc on noop (#11766) In #7875 (Redis 6.2), we changed the sds alloc to be the usable allocation size in order to "reduce the need for realloc calls by making the sds implicitly take over the internal fragmentation". This change was applied to most sds functions, excluding `sdsRemoveFreeSpace` and `sdsResize`. The reason is that in some places (e.g. clientsCronResizeQueryBuffer) we call sdsRemoveFreeSpace when we see excessive free space and want to trim it, so if we don't trim it exactly to size, the caller may still see excessive free space and call it again and again. However, this resulted in some excessive calls to realloc, even when there's no need and it's going to be a no-op (e.g. when reducing a 15 byte allocation to 13). It turns out that a call to realloc with jemalloc can be expensive even if it ends up doing nothing, so this PR adds a check using `je_nallocx`, which is cheap, to avoid the call to realloc. In addition, this PR unifies sdsResize and sdsRemoveFreeSpace into common code. The difference between them was that sdsResize would avoid using SDS_TYPE_5, since it wants to keep the string ready to be resized again, while sdsRemoveFreeSpace would permit using SDS_TYPE_5 and get optimal memory consumption. Now both methods take a `would_regrow` argument that makes this more explicit. The only actual impact is that in clientsCronResizeQueryBuffer we call both sdsResize and sdsRemoveFreeSpace in different cases, and we now prevent the use of SDS_TYPE_5 in both. The new test that was added to cover this concern used to pass before this PR as well; this PR is just a performance optimization and cleanup. Benchmark: `redis-benchmark -c 100 -t set -d 512 -P 10 -n 100000000` on i7-9850H with jemalloc shows an improvement from 1021k ops/sec to 1067k (average of 3 runs), roughly a 4.5% improvement. Co-authored-by: Oran Agra <oran@redislabs.com> (cherry picked from commit `46393f9819`)	2023-02-28 18:32:34 +02:00

11432 Commits