Commit Graph

16 Commits

Author SHA1 Message Date
Pieter Cailliau 0b34396924
Change license from BSD-3 to dual RSALv2+SSPLv1 (#13157)
[Read more about the license change
here](https://redis.com/blog/redis-adopts-dual-source-available-licensing/)
Live long and prosper 🖖
2024-03-20 22:38:24 +00:00
Oran Agra 4c54528f0f
fixes for fork child exit and test: #11463 (#11499)
Fix a few issues with the recent #11463
* use exitFromChild instead of exit
* test should ignore defunct process since that's what we expect to
  happen for thees child processes when the parent dies.
* fix typo

Co-authored-by: Binbin <binloveplay1314@qq.com>
2022-11-12 20:35:34 +02:00
Oran Agra ccaef5c923
diskless master, avoid bgsave child hung when fork parent crashes (#11463)
During a diskless sync, if the master main process crashes, the child would
have hung in `write`. This fix closes the read fd on the child side, so that if the
parent crashes, the child will get a write error and exit.

This change also fixes disk-based replication, BGSAVE and AOFRW.
In that case the child wouldn't have been hang, it would have just kept
running until done which may be pointless.

There is a certain degree of risk here. in case there's a BGSAVE child that could
maybe succeed and the parent dies for some reason, the old code would have let
the child keep running and maybe succeed and avoid data loss.
On the other hand, if the parent is restarted, it would have loaded an old rdb file
(or none), and then the child could reach the end and rename the rdb file (data
conflicting with what the parent has), or also have a race with another BGSAVE
child that the new parent started.

Note that i removed a comment saying a write error will be ignored in the child
and handled by the parent (this comment was very old and i don't think relevant).
2022-11-09 10:02:18 +02:00
Andy Pan 2391aefd82
Implement anetPipe() to combine creating pipe and setting flags (#9511)
Implement createPipe() to combine creating pipe and setting flags, also reduce
system calls by prioritizing pipe2() over pipe().

Without createPipe(), we have to call pipe() to create a pipe and then call some
functions (like anetCloexec() and anetNonBlock()) of anet.c to set flags respectively,
which leads to some extra system calls, now we can leverage pipe2() to combine
them and make the process of creating pipe more convergent in createPipe().

Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Co-authored-by: Oran Agra <oran@redislabs.com>
2021-10-06 16:08:13 +03:00
Wang Yuan d4bca53cd9
Use madvise(MADV_DONTNEED) to release memory to reduce COW (#8974)
## Backgroud
As we know, after `fork`, one process will copy pages when writing data to these
pages(CoW), and another process still keep old pages, they totally cost more memory.
For redis, we suffered that redis consumed much memory when the fork child is serializing
key/values, even that maybe cause OOM.

But actually we find, in redis fork child process, the child process don't need to keep some
memory and parent process may write or update that, for example, child process will never
access the key-value that is serialized but users may update it in parent process.
So we think it may reduce COW if the child process release memory that it is not needed.

## Implementation
For releasing key value in child process, we may think we call `decrRefCount` to free memory,
but i find the fork child process still use much memory when we don't write any data to redis,
and it costs much more time that slows down bgsave. Maybe because memory allocator doesn't
really release memory to OS, and it may modify some inner data for this free operation, especially
when we free small objects.

Moreover, CoW is based on  pages, so it is a easy way that we only free the memory bulk that is
not less than kernel page size. madvise(MADV_DONTNEED) can quickly release specified region
pages to OS bypassing memory allocator, and allocator still consider that this memory still is used
and don't change its inner data.

There are some buffers we can release in the fork child process:
- **Serialized key-values**
  the fork child process never access serialized key-values, so we try to free them.
  Because we only can release big bulk memory, and it is time consumed to iterate all
  items/members/fields/entries of complex data type. So we decide to iterate them and
  try to release them only when their average size of item/member/field/entry is more
  than page size of OS.
- **Replication backlog**
  Because replication backlog is a cycle buffer, it will be changed quickly if redis has heavy
  write traffic, but in fork child process, we don't need to access that.
- **Client buffers**
  If clients have requests during having the fork child process, clients' buffer also be changed
  frequently. The memory includes client query buffer, output buffer, and client struct used memory.

To get child process peak private dirty memory, we need to count peak memory instead
of last used memory, because the child process may continue to release memory (since
COW used to only grow till now, the last was equivalent to the peak).
Also we're adding a new `current_cow_peak` info variable (to complement the existing
`current_cow_size`)

Co-authored-by: Oran Agra <oran@redislabs.com>
2021-08-04 23:01:46 +03:00
Wang Yuan 81e2d7272b
Fix wrong COW memory in log (#8917)
Always 0 MB of memory used by copy-on-write, introduced in #8645.
2021-05-06 10:52:11 +03:00
Yossi Gottlieb c3df27d1ea
Fix slowdown due to child reporting CoW. (#8645)
Reading CoW from /proc/<pid>/smaps can be slow with large processes on
some platforms.

This measures the time it takes to read CoW info and limits the duty
cycle of future updates to roughly 1/100.

As current_cow_size no longer represnets a current, fixed interval value
there is also a new current_cow_size_age field that provides information
about the age of the size value, in seconds.
2021-03-22 13:25:58 +02:00
Oran Agra 71ab81ec69
solve valgrind warning in child_info (#8505)
Valgrind warns about `write` accessing uninitialized memory, which was the struct padding.
2021-02-17 12:30:29 +02:00
uriyage fd052d2a86
Adds INFO fields to track fork child progress (#8414)
* Adding current_save_keys_total and current_save_keys_processed info fields.
  Present in replication, BGSAVE and AOFRW.
* Changing RM_SendChildCOWInfo() to RM_SendChildHeartbeat(double progress)
* Adding new info field current_fork_perc. Present in Replication, BGSAVE, AOFRW,
  and module forks.
2021-02-16 16:06:51 +02:00
Oran Agra 8dd16caec8
Fix last COW INFO report, Skip test on non-linux platforms (#8301)
- the last COW report wasn't always read from the pipe
  (receiveLastChildInfo wasn't used)
- but in fact, there's no reason we won't always try to drain that pipe
  so i'm unifying receiveLastChildInfo with receiveChildInfo
- adjust threshold of the COW test when run in accurate mode
- add some prints in case this test fails again
- fix indentation, page size, and PID! in MacOS proc info

p.s. it seems that pri_pages_dirtied is always 0
2021-01-08 23:35:30 +02:00
YaacovHazan ea930a352c Report child copy-on-write info continuously
Add INFO field, rdb_active_cow_size, to report COW of a live fork child while
it's active.
- once in 1024 keys check the time, and if there's more than one second since
  the last report send a report to the parent via the pipe.
- refactor the child_info_data struct, it's an implementation detail that
  shouldn't be in the server struct, and not used to communicate data between
  caller and callee
- remove the magic value from that struct (not sure what it was good for), and
  instead add handling of short reads.
- add another value to the structure, cow_type, to indicate if the report is
  for the new rdb_active_cow_size field, or it's the last report of a
  successful operation
- add new Module API to report the active COW
- add more asserts variants to test.tcl
2021-01-07 16:14:29 +02:00
sundb 6987176059
docs: Fix some typos in comments and log messge (#7975) 2020-10-28 08:51:35 +02:00
Oran Agra 2458e54814
RM_GetContextFlags provides indication that we're in a fork child (#7783) 2020-09-20 13:43:28 +03:00
Oran Agra 56258c6b7d Module API for Forking
* create module API for forking child processes.
* refactor duplicate code around creating and tracking forks by AOF and RDB.
* child processes listen to SIGUSR1 and dies exitFromChild in order to
  eliminate a valgrind warning of unhandled signal.
* note that BGSAVE error reply has changed.

valgrind error is:
  Process terminating with default action of signal 10 (SIGUSR1)
2019-07-17 16:40:24 +03:00
antirez e9d861ec69 Clear child data when opening the pipes.
This is important both to reset the magic to 0, so that it will not
match if the structure is not explicitly set, and to initialize other
things we may add like counters and such.
2016-09-19 14:11:17 +02:00
antirez e565632e59 Child -> Parent pipe for COW info transferring. 2016-09-19 13:45:20 +02:00