Commit Graph

10326 Commits

Author SHA1 Message Date
Oran Agra 4930d19e70 Redis 6.2.6 2021-10-04 13:59:40 +03:00
Oran Agra aba9517542 corrupt-dump-fuzzer test, avoid creating junk keys (#9302)
The execution of the RPOPLPUSH command by the fuzzer created junk keys,
that were later being selected by RANDOMKEY and modified.
This also meant that lists were statistically tested more than other
files.

Fix the fuzzer not to pass junk key names to RPOPLPUSH, and add a check
that detects that new keys are not added by the fuzzer to detect future
similar issues.

(cherry picked from commit 3f3f678a47)
2021-10-04 13:59:40 +03:00
sundb 7540708a61 Fix missing check for sanitize_dump in corrupt-dump-fuzzer test (#9285)
this means the assertion that checks that when deep sanitization is enabled,
there are no crashes, was missing.

(cherry picked from commit 3db0f1a284)
2021-10-04 13:59:40 +03:00
Yunier Pérez 664c7b36a5 Allow to override OPENSSL_PREFIX (#9567)
While the original issue was on Linux, this should work for other
platforms as well.
2021-10-04 13:59:40 +03:00
Yossi Gottlieb c390972c76 Propagate OPENSSL_PREFIX to hiredis. (#9345) 2021-10-04 13:59:40 +03:00
Oran Agra 73d286d523 Fix stream sanitization for non-int first value (#9553)
This was recently broken in #9321 when we validated stream IDs to be
integers but did that after to the stepping next record instead of before.

(cherry picked from commit 5a4ab7c7d2)
2021-10-04 13:59:40 +03:00
David CARLIER 4cbccab2bd TLS build fix on OpenBSD when built with LibreSSL. (#9486)
(cherry picked from commit 418c2e7931)
2021-10-04 13:59:40 +03:00
yvette903 23ed37dd87 Fix: client pause uses an old timeout (#9477)
A write request may be paused unexpectedly because `server.client_pause_end_time` is old.

**Recreate this:**
redis-cli -p 6379
127.0.0.1:6379> client pause 500000000 write
OK
127.0.0.1:6379> client unpause
OK
127.0.0.1:6379> client pause 10000 write
OK
127.0.0.1:6379> set key value

The write request `set key value` is paused util  the timeout of 500000000 milliseconds was reached.

**Fix:**
reset `server.client_pause_end_time` = 0 in `unpauseClients`

(cherry picked from commit f560531d5b)
2021-10-04 13:59:40 +03:00
zhaozhao.zz 9b25484a13 Fix wrong offset when replica pause (#9448)
When a replica paused, it would not apply any commands event the command comes from master, if we feed the non-applied command to replication stream, the replication offset would be wrong, and data would be lost after failover(since replica's `master_repl_offset` grows but command is not applied).

To fix it, here are the changes:
* Don't update replica's replication offset or propagate commands to sub-replicas when it's paused in `commandProcessed`.
* Show `slave_read_repl_offset` in info reply.
* Add an assert to make sure master client should never be blocked unless pause or module (some modules may use block way to do background (parallel) processing and forward original block module command to the replica, it's not a good way but it can work, so the assert excludes module now, but someday in future all modules should rewrite block command to propagate like what `BLPOP` does).

(cherry picked from commit 1b83353dc3)
2021-10-04 13:59:40 +03:00
Madelyn Olson 49f8f43890 Add test verifying PUBSUB NUMPAT behavior (#9209)
(cherry picked from commit 8b8f05c86c)
2021-10-04 13:59:40 +03:00
guybe7 c936f801c0 Fix two minor bugs (MIGRATE key args and getKeysUsingCommandTable) (#9455)
1. MIGRATE has a potnetial key arg in argv[3]. It should be reflected in the command table.
2. getKeysUsingCommandTable should never free getKeysResult, it is always freed by the caller)
   The reason we never encountered this double-free bug is that almost always getKeysResult
   uses the statis buffer and doesn't allocate a new one.

(cherry picked from commit 6aa2285e32)
2021-10-04 13:59:40 +03:00
sundb 694a869e78 Fix the timing of read and write events under kqueue (#9416)
Normally we execute the read event first and then the write event.
When the barrier is set, we will do it reverse.
However, under `kqueue`, if an `fd` has both read and write events,
reading the event using `kevent` will generate two events, which will
result in uncontrolled read and write timing.

This also means that the guarantees of AOF `appendfsync` = `always` are
not met on MacOS without this fix.

The main change to this pr is to cache the events already obtained when reading
them, so that if the same `fd` occurs again, only the mask in the cache is updated,
rather than a new event is generated.

This was exposed by the following test failure on MacOS:
```
*** [err]: AOF fsync always barrier issue in tests/integration/aof.tcl
Expected 544 != 544 (context: type eval line 26 cmd {assert {$size1 != $size2}} proc ::test)
```

(cherry picked from commit 306a5ccd2d)
2021-10-04 13:59:40 +03:00
sundb a2e8a3a241 Sanitize dump payload: fix double free after insert dup nodekey to stream rax and returns 0 (#9399)
(cherry picked from commit 492d8d0961)
2021-10-04 13:59:40 +03:00
Wang Yuan 2f8ec0e024 Fix the wrong detection of sync_file_range system call (#9371)
If we want to check `defined(SYNC_FILE_RANGE_WAIT_BEFORE)`, we should include fcntl.h.
otherwise, SYNC_FILE_RANGE_WAIT_BEFORE is not defined, and there is alway not `sync_file_range` system call.
Introduced by #8532

(cherry picked from commit 8edc3cd62c)
2021-10-04 13:59:40 +03:00
sundb 09c63c45dd Sanitize dump payload: handle remaining empty key when RDB loading and restore command (#9349)
This commit mainly fixes empty keys due to RDB loading and restore command,
which was omitted in #9297.

1) When loading quicklsit, if all the ziplists in the quicklist are empty, NULL will be returned.
    If only some of the ziplists are empty, then we will skip the empty ziplists silently.
2) When loading hash zipmap, if zipmap is empty, sanitization check will fail.
3) When loading hash ziplist, if ziplist is empty, NULL will be returned.
4) Add RDB loading test with sanitize.

(cherry picked from commit cbda492909)
2021-10-04 13:59:40 +03:00
DarrenJiang13 1ed0f049fe [BUGFIX] Add some missed error statistics (#9328)
add error counting for some missed behaviors.

(cherry picked from commit 43eb0ce3bf)
2021-10-04 13:59:40 +03:00
Oran Agra 4b04ca0b18 Improvements to corrupt payload sanitization (#9321)
Recently we found two issues in the fuzzer tester: #9302 #9285
After fixing them, more problems surfaced and this PR (as well as #9297) aims to fix them.

Here's a list of the fixes
- Prevent an overflow when allocating a dict hashtable
- Prevent OOM when attempting to allocate a huge string
- Prevent a few invalid accesses in listpack
- Improve sanitization of listpack first entry
- Validate integrity of stream consumer groups PEL
- Validate integrity of stream listpack entry IDs
- Validate ziplist tail followed by extra data which start with 0xff

Co-authored-by: sundb <sundbcn@gmail.com>
(cherry picked from commit 0c90370e6d)
2021-10-04 13:59:40 +03:00
sundb 2f54107289 Sanitize dump payload: fix empty keys when RDB loading and restore command (#9297)
When we load rdb or restore command, if we encounter a length of 0, it will result in the creation of an empty key.
This could either be a corrupt payload, or a result of a bug (see #8453 )

This PR mainly fixes the following:
1) When restore command will return `Bad data format` error.
2) When loading RDB, we will silently discard the key.

Co-authored-by: Oran Agra <oran@redislabs.com>
(cherry picked from commit 8ea777a6a0)
2021-10-04 13:59:40 +03:00
menwen e34f06ae5d Add latency monitor sample when key is deleted via lazy expire (#9317)
Fix that there is no sample latency after the key expires via expireIfNeeded().
Some refactoring for shared code.

(cherry picked from commit ca559819f7)
2021-10-04 13:59:40 +03:00
Viktor Söderqvist 77386ae011 redis-cli ASK redirect test: Add retry loop to fix timing issue (#9315)
(cherry picked from commit 1c59567a7f)
2021-10-04 13:59:40 +03:00
Oran Agra 0c959294a8 Skip new redis-cli ASK test in TLS mode (#9312)
(cherry picked from commit 52df350fe5)
2021-10-04 13:59:40 +03:00
Huang Zhw 8892b5cf9e When redis-cli received ASK, it didn't handle it (#8930)
When redis-cli received ASK, it used string matching wrong and didn't
handle it.

When we access a slot which is in migrating state, it maybe
return ASK. After redirect to the new node, we need send ASKING
command before retry the command.  In this PR after redis-cli receives
ASK, we send a ASKING command before send the origin command
after reconnecting.

Other changes:
* Make redis-cli -u and -c (unix socket and cluster mode) incompatible
  with one another.
* When send command fails, we avoid the 2nd reconnect retry and just
  print the error info. Users will decide how to do next.
  See #9277.
* Add a test faking two redis nodes in TCL to just send ASK and OK in
  redis protocol to test ASK behavior.

Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Co-authored-by: Oran Agra <oran@redislabs.com>
(cherry picked from commit cf61ad14cc)
2021-10-04 13:59:40 +03:00
Binbin 9da232c05a redis-cli: Sleep for a while in each cliConnect when we got connect error in cluster mode. (#8884)
There's an infinite loop when redis-cli fails to connect in cluster mode.
This commit adds a 1 second sleep to prevent flooding the console with errors.
It also adds a specific error print in a few places that could have error without printing anything.

Co-authored-by: Oran Agra <oran@redislabs.com>
(cherry picked from commit 8351a10b95)
2021-10-04 13:59:40 +03:00
Huang Zhw 1ce06604de redis-cli when SELECT fails, we should reset dbnum to 0 (#8898)
when SELECT fails, we should reset dbnum to 0, so the prompt will not
display incorrectly.

Additionally when SELECT and HELLO fail, we output message to inform
it.

Add config.input_dbnum which means the dbnum about to select.
And config.dbnum means currently selected dbnum. When users succeed to
select db, config.dbnum and config.input_dbnum will be the same. When
users select db failed, config.input_dbnum will be kept. Next time if users
auth success, config.input_dbnum will be automatically selected.
When reconnect, we should select the origin dbnum.

Co-authored-by: Oran Agra <oran@redislabs.com>
(cherry picked from commit 6b47598998)
2021-10-04 13:59:40 +03:00
Binbin 530c70b0a9 GEO* STORE with empty src key delete the dest key and return 0, not empty array (#9271)
With an empty src key, we need to deal with two situations:
1. non-STORE: We should return emptyarray.
2. STORE: Try to delete the store key and return 0.

This applies to both GEOSEARCHSTORE (new to v6.2), and
also GEORADIUS STORE (which was broken since forever)

This pr try to fix #9261. i.e. both STORE variants would have behaved
like the non-STORE variants when the source key was missing,
returning an empty array and not deleting the destination key,
instead of returning 0, and deleting the destination key.

Also add more tests for some commands.
- GEORADIUS: wrong type src key, non existing src key, empty search,
  store with non existing src key, store with empty search
- GEORADIUSBYMEMBER: wrong type src key, non existing src key,
  non existing member, store with non existing src key
- GEOSEARCH: wrong type src key, non existing src key, empty search,
  frommember with non existing member
- GEOSEARCHSTORE: wrong type key, non existing src key,
  fromlonlat with empty search, frommember with non existing member

Co-authored-by: Oran Agra <oran@redislabs.com>
(cherry picked from commit 86555ae0f7)
2021-10-04 13:59:40 +03:00
YiyuanGUO dadc67a92e Fix integer overflow in _sdsMakeRoomFor (CVE-2021-41099) 2021-10-04 13:59:40 +03:00
Oran Agra 2775a3526e Fix ziplist and listpack overflows and truncations (CVE-2021-32627, CVE-2021-32628)
- fix possible heap corruption in ziplist and listpack resulting by trying to
  allocate more than the maximum size of 4GB.
- prevent ziplist (hash and zset) from reaching size of above 1GB, will be
  converted to HT encoding, that's not a useful size.
- prevent listpack (stream) from reaching size of above 1GB.
- XADD will start a new listpack if the new record may cause the previous
  listpack to grow over 1GB.
- XADD will respond with an error if a single stream record is over 1GB
- List type (ziplist in quicklist) was truncating strings that were over 4GB,
  now it'll respond with an error.
2021-10-04 13:59:40 +03:00
meir@redislabs.com 8f241ab3b8 Fix invalid memory write on lua stack overflow {CVE-2021-32626}
When LUA call our C code, by default, the LUA stack has room for 20
elements. In most cases, this is more than enough but sometimes it's not
and the caller must verify the LUA stack size before he pushes elements.

On 3 places in the code, there was no verification of the LUA stack size.
On specific inputs this missing verification could have lead to invalid
memory write:
1. On 'luaReplyToRedisReply', one might return a nested reply that will
   explode the LUA stack.
2. On 'redisProtocolToLuaType', the Redis reply might be deep enough
   to explode the LUA stack (notice that currently there is no such
   command in Redis that returns such a nested reply, but modules might
   do it)
3. On 'ldbRedis', one might give a command with enough arguments to
   explode the LUA stack (all the arguments will be pushed to the LUA
   stack)

This commit is solving all those 3 issues by calling 'lua_checkstack' and
verify that there is enough room in the LUA stack to push elements. In
case 'lua_checkstack' returns an error (there is not enough room in the
LUA stack and it's not possible to increase the stack), we will do the
following:
1. On 'luaReplyToRedisReply', we will return an error to the user.
2. On 'redisProtocolToLuaType' we will exit with panic (we assume this
   scenario is rare because it can only happen with a module).
3. On 'ldbRedis', we return an error.
2021-10-04 13:59:40 +03:00
meir@redislabs.com 3e09be56a8 Fix protocol parsing on 'ldbReplParseCommand' (CVE-2021-32672)
The protocol parsing on 'ldbReplParseCommand' (LUA debugging)
Assumed protocol correctness. This means that if the following
is given:
*1
$100
test
The parser will try to read additional 94 unallocated bytes after
the client buffer.
This commit fixes this issue by validating that there are actually enough
bytes to read. It also limits the amount of data that can be sent by
the debugger client to 1M so the client will not be able to explode
the memory.
2021-10-04 13:59:40 +03:00
Oran Agra 757f8f771e Prevent unauthenticated client from easily consuming lots of memory (CVE-2021-32675)
This change sets a low limit for multibulk and bulk length in the
protocol for unauthenticated connections, so that they can't easily
cause redis to allocate massive amounts of memory by sending just a few
characters on the network.
The new limits are 10 arguments of 16kb each (instead of 1m of 512mb)
2021-10-04 13:59:40 +03:00
Oran Agra 04ba485042 Fix redis-cli / redis-sential overflow on some platforms (CVE-2021-32762)
The redis-cli command line tool and redis-sentinel service may be vulnerable
to integer overflow when parsing specially crafted large multi-bulk network
replies. This is a result of a vulnerability in the underlying hiredis
library which does not perform an overflow check before calling the calloc()
heap allocation function.

This issue only impacts systems with heap allocators that do not perform their
own overflow checks. Most modern systems do and are therefore not likely to
be affected. Furthermore, by default redis-sentinel uses the jemalloc allocator
which is also not vulnerable.
2021-10-04 13:59:40 +03:00
Oran Agra b1149a49b2 Fix Integer overflow issue with intsets (CVE-2021-32687)
The vulnerability involves changing the default set-max-intset-entries
configuration parameter to a very large value and constructing specially
crafted commands to manipulate sets
2021-10-04 13:59:40 +03:00
Oran Agra db09f6eb2e Redis 6.2.5 2021-07-21 21:06:49 +03:00
Huang Zhw 835d15b536 On 32 bit platform, the bit position of GETBIT/SETBIT/BITFIELD/BITCOUNT,BITPOS may overflow (see CVE-2021-32761) (#9191)
GETBIT, SETBIT may access wrong address because of wrap.
BITCOUNT and BITPOS may return wrapped results.
BITFIELD may access the wrong address but also allocate insufficient memory and segfault (see CVE-2021-32761).

This commit uses `uint64_t` or `long long` instead of `size_t`.
related https://github.com/redis/redis/pull/8096

At 32bit platform:
> setbit bit 4294967295 1
(integer) 0
> config set proto-max-bulk-len 536870913
OK
> append bit "\xFF"
(integer) 536870913
> getbit bit 4294967296
(integer) 0

When the bit index is larger than 4294967295, size_t can't hold bit index. In the past,  `proto-max-bulk-len` is limit to 536870912, so there is no problem.

After this commit, bit position is stored in `uint64_t` or `long long`. So when `proto-max-bulk-len > 536870912`, 32bit platforms can still be correct.

For 64bit platform, this problem still exists. The major reason is bit pos 8 times of byte pos. When proto-max-bulk-len is very larger, bit pos may overflow.
But at 64bit platform, we don't have so long string. So this bug may never happen.

Additionally this commit add a test cost `512MB` memory which is tag as `large-memory`. Make freebsd ci and valgrind ci ignore this test.

(cherry picked from commit 71d452876e)
2021-07-21 21:06:49 +03:00
Oran Agra bae0512c8a longer timeout in replication test (#8963)
the test normally passes. but we saw one failure in a valgrind run in github actions

(cherry picked from commit 8458baf6a9)
2021-07-21 21:06:49 +03:00
Huang Zhw 8d6134952a Remove testmodule in src/modules/Makefile. (#9250)
src/modules make failed. As in #3718 testmodule.c was removed. But the makefile was not updated

(cherry picked from commit d54c9086c2)
2021-07-21 21:06:49 +03:00
Oran Agra 1d7c0e5949 Fix failing basics moduleapi test on 32bit CI (#9140)
(cherry picked from commit 5ffdbae1f6)
2021-07-21 21:06:49 +03:00
Oran Agra 37b0f3617d Adjustments to recent RM_StringTruncate fix (#3718) (#9125)
- Introduce a new sdssubstr api as a building block for sdsrange.
  The API of sdsrange is many times hard to work with and also has
  corner case that cause bugs. sdsrange is easy to work with and also
  simplifies the implementation of sdsrange.
- Revert the fix to RM_StringTruncate and just use sdssubstr instead of
  sdsrange.
- Solve valgrind warnings from the new tests introduced by the previous
  PR.

(cherry picked from commit ae418eca24)
2021-07-21 21:06:49 +03:00
Huang Zhw 6866117194 Fix missing separator in module info line (usedby and using lists) (#9241)
Fix module info genModulesInfoStringRenderModulesList lack separator when there's more than one module in the list.

Co-authored-by: Oran Agra <oran@redislabs.com>
(cherry picked from commit 1895e134a7)
2021-07-21 21:06:49 +03:00
Binbin b622537199 SMOVE only notify dstset when the addition is successful. (#9244)
in case dest key already contains the member, the dest key isn't modified, so the command shouldn't invalidate watch.

(cherry picked from commit 11dc4e59b3)
2021-07-21 21:06:49 +03:00
qetu3790 355b1b6a57 Set TCP keepalive on inbound clusterbus connections (#9230)
Set TCP keepalive on inbound clusterbus connections to prevent memory leak

(cherry picked from commit f03af47a34)
2021-07-21 21:06:49 +03:00
Yossi Gottlieb d6f4273241 Fix compatibility with OpenSSL 1.1.0. (#9233)
(cherry picked from commit 277e4dc203)
2021-07-21 21:06:49 +03:00
Oran Agra de1b19ea88 fix valgrind issues with recently added test in modules/blockonbackground (#9192)
fixes test issue introduced in #9167

1. invalid reads due to accessing non-retained string (passed as unblock context).
2. leaking module blocked client context, see #6922 for info.

(cherry picked from commit a8518cce95)
2021-07-21 21:06:49 +03:00
Yossi Gottlieb 79fa5618f1 Fix CLIENT UNBLOCK crashing modules. (#9167)
Modules that use background threads with thread safe contexts are likely
to use RM_BlockClient() without a timeout function, because they do not
set up a timeout.

Before this commit, `CLIENT UNBLOCK` would result with a crash as the
`NULL` timeout callback is called. Beyond just crashing, this is also
logically wrong as it may throw the module into an unexpected client
state.

This commits makes `CLIENT UNBLOCK` on such clients behave the same as
any other client that is not in a blocked state and therefore cannot be
unblocked.

(cherry picked from commit aa139e2f02)
2021-07-21 21:06:49 +03:00
Huang Zhw 91bf2ab86d redis-cli cluster import command may issue wrong MIGRATE command. (#8945)
In clusterManagerCommandImport strcat was used to concat COPY and
REPLACE, the space maybe not enough.
If we use --cluster-replace but not --cluster-copy, the MIGRATE
command contained COPY instead of REPLACE.

(cherry picked from commit a049f6295a)
2021-07-21 21:06:49 +03:00
Binbin 88655019cc Fix accidental deletion of sinterstore command when we meet wrong type error. (#9032)
SINTERSTORE would have deleted the dest key right away,
even when later on it is bound to fail on an (WRONGTYPE) error.

With this change it first picks up all the input keys, and only later
delete the dest key if one is empty.

Also add more tests for some commands.
Mainly focus on
- `wrong type error`:
	expand test case (base on sinter bug) in non-store variant
	add tests for store variant (although it exists in non-store variant, i think it would be better to have same tests)
- the dstkey result when we meet `non-exist key (empty set)` in *store

sdiff:
- improve test case about wrong type error (the one we found in sinter, although it is safe in sdiff)
- add test about using non-exist key (treat it like an empty set)
sdiffstore:
- according to sdiff test case, also add some tests about `wrong type error` and `non-exist key`
- the different is that in sdiffstore, we will consider the `dstkey` result

sunion/sunionstore add more tests (same as above)

sinter/sinterstore also same as above ...

(cherry picked from commit b8a5da80c4)
2021-07-21 21:06:49 +03:00
Jason Elbaum fad44611dc Change return value type for ZPOPMAX/MIN in RESP3 (#8981)
When using RESP3, ZPOPMAX/ZPOPMIN should return nested arrays for consistency
with other commands (e.g. ZRANGE).

We do that only when COUNT argument is present (similarly to how LPOP behaves).
for reasoning see https://github.com/redis/redis/issues/8824#issuecomment-855427955

This is a breaking change only when RESP3 is used, and COUNT argument is present!

(cherry picked from commit 7f342020dc)
2021-07-21 21:06:49 +03:00
Mikhail Fesenko 8884971223 Direct redis-cli repl prints to stderr, because --rdb can print to stdout. fflush stdout after responses (#9136)
1. redis-cli can output --rdb data to stdout
   but redis-cli also write some messages to stdout which will mess up the rdb.

2. Make redis-cli flush stdout when printing a reply
  This was needed in order to fix a hung in redis-cli test that uses
  --replica.
   Note that printf does flush when there's a newline, but fwrite does not.

3. fix the redis-cli --replica test which used to pass previously
   because it didn't really care what it read, and because redis-cli
   used printf to print these other things to stdout.

4. improve redis-cli --replica test to run with both diskless and disk-based.

Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: Viktor Söderqvist <viktor@zuiderkwast.se>
(cherry picked from commit 1eb4baa5b8)
2021-07-21 21:06:49 +03:00
Rob Snyder 07e1248686 Fix ziplist length updates on bigendian platforms (#2080)
Adds call to intrev16ifbe to ensure ZIPLIST_LENGTH is compared correctly

(cherry picked from commit eaa52719a3)
2021-07-21 21:06:49 +03:00
Oran Agra 6cd84b64f0 Test infra, handle RESP3 attributes and big-numbers and bools (#9235)
- promote the code in DEBUG PROTOCOL to addReplyBigNum
- DEBUG PROTOCOL ATTRIB skips the attribute when client is RESP2
- networking.c addReply for push and attributes generate assertion when
  called on a RESP2 client, anything else would produce a broken
  protocol that clients can't handle.

(cherry picked from commit 6a5bac309e)
2021-07-21 21:06:49 +03:00