Implement Multi-Part AOF mechanism to avoid AOFRW overheads. (#9788)

Implement a Multi-Part AOF mechanism to avoid overheads during AOF rewrite (AOFRW).
It introduces a directory holding multiple AOF files that are tracked by a manifest file.
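
To make the layout concrete, here is a minimal sketch of how a part file name and a manifest entry might be formatted. The name pattern follows the examples given in this commit message and in `redis.conf` (`appendonly.aof.1.base.rdb`, `appendonly.aof.2.incr.aof`), and the manifest line follows the entry written by the updated tests (`file appendonly.aof.1.incr.aof seq 1 type i`). The helper names `aof_part_name` and `manifest_line` are hypothetical and are not functions from `src/aof.c`.

```c
#include <stdio.h>

/* Hypothetical helper (not from src/aof.c): build
 * "<appendfilename>.<seq>.<type>.<format>", e.g. "appendonly.aof.1.base.rdb".
 * Returns the snprintf() result. */
static int aof_part_name(char *buf, size_t len, const char *base,
                         long long seq, const char *type, const char *format) {
    return snprintf(buf, len, "%s.%lld.%s.%s", base, seq, type, format);
}

/* Hypothetical helper: build one manifest entry in the form used by the
 * tests in this commit: "file <name> seq <seq> type <b|h|i>\n". */
static int manifest_line(char *buf, size_t len, const char *name,
                         long long seq, char type) {
    return snprintf(buf, len, "file %s seq %lld type %c\n", name, seq, type);
}
```

The type characters `b`, `h` and `i` correspond to the `aof_file_type` enum added in `server.h` (BASE, HISTORY and INCR).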

The main issues with the original AOFRW mechanism are:
* buffering of commands that are processed during the rewrite (consuming a lot of RAM)
* freezes of the main process when the AOFRW completes, while it drains the remaining part of the buffer and fsyncs it
* double disk IO for the data that arrives during AOFRW (it had to be written to both the old and new AOF files)

The main modifications of this PR:
1. Remove the AOF rewrite buffer and related code.
2. Divide the AOF into multiple files of two types. The `BASE` type represents the full
  dataset (in either AOF or RDB format) as of the last AOFRW; there is at most one `BASE` file.
  The `INCR` type holds the incremental commands written since the last AOFRW; there may be
  more than one `INCR` file.
3. Use an AOF manifest file to record and manage the AOF files mentioned above.
4. The original configuration of `appendfilename` will be the base part of the new file name, for example:
  `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof`
5. Add manifest-related TCL tests, and modify some existing tests that depend on `appendfilename`.
6. Remove the `aof_rewrite_buffer_length` field from INFO.
7. Add the `aof-disable-auto-gc` configuration. By default, HISTORY type AOFs are deleted automatically;
  this option lets users preserve them. It is intended only for testing for now.
8. Add an AOFRW limiting measure. When the number of AOFRW failures reaches the threshold (currently 3),
  the next AOFRW is delayed by 1 minute. If that AOFRW also fails, the following one is
  delayed by 2 minutes, then 4, 8, 16, and so on, up to a maximum delay of 60 minutes (1 hour). During the limit
  period, the `bgrewriteaof` command can still be used to execute an AOFRW immediately.
9. Support upgrading (loading) data from older versions of Redis.
10. Add the `appenddirname` configuration, which names the directory of the append-only files. All AOF files and
  the manifest file are placed in this directory.
11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if
  `aof-load-truncated` is enabled.
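
The delay schedule for the AOFRW limiting measure can be sketched as below. This is an illustration of the numbers stated in the commit message (threshold of 3 failures, then 1, 2, 4, 8, 16 minutes, capped at 60), not the code from `src/aof.c`, whose diff is suppressed on this page. The function name, the constants, and the assumption that doubling continues through 32 before hitting the cap are all hypothetical.

```c
/* Hypothetical constants taken from the numbers in the commit message. */
#define REWRITE_FAILURE_THRESHOLD 3   /* failures before limiting starts */
#define REWRITE_MAX_DELAY_MINUTES 60  /* cap: one hour */

/* Illustrative only: return how many minutes to delay the next automatic
 * AOFRW after `failures` failed attempts. Below the threshold there is no
 * delay; at the threshold the delay is 1 minute, and it doubles with each
 * further failure until it reaches the cap. */
static int rewrite_delay_minutes(int failures) {
    if (failures < REWRITE_FAILURE_THRESHOLD) return 0;
    int delay = 1;
    for (int i = REWRITE_FAILURE_THRESHOLD; i < failures; i++) {
        delay *= 2;
        if (delay >= REWRITE_MAX_DELAY_MINUTES) return REWRITE_MAX_DELAY_MINUTES;
    }
    return delay;
}
```

In the diff itself, `serverCron` only consults `aofRewriteLimited()` for scheduled and automatic rewrites, which is consistent with a manual `bgrewriteaof` bypassing the delay.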

Co-authored-by: Oran Agra <oran@redislabs.com>
chenyang8094 2022-01-04 01:14:13 +08:00 committed by GitHub
parent 78a62c0124
commit 87789fae0b
25 changed files with 2581 additions and 657 deletions

3
.gitignore vendored

@ -16,7 +16,8 @@ doc-tools
release
misc/*
src/release.h
appendonly.aof
appendonly.aof.*
appendonlydir
SHORT_TERM_TODO
release.h
src/transfer.sh


@ -1330,10 +1330,39 @@ disable-thp yes
appendonly no
# The name of the append only file (default: "appendonly.aof")
# The base name of the append only file.
#
# Redis 7 and newer use a set of append-only files to persist the dataset
# and changes applied to it. There are two basic types of files in use:
#
# - Base files, which are a snapshot representing the complete state of the
# dataset at the time the file was created. Base files can be either in
# the form of RDB (binary serialized) or AOF (textual commands).
# - Incremental files, which contain additional commands that were applied
# to the dataset following the previous file.
#
# In addition, manifest files are used to track the files and the order in
# which they were created and should be applied.
#
# Append-only file names are created by Redis following a specific pattern.
# The file name's prefix is based on the 'appendfilename' configuration
# parameter, followed by additional information about the sequence and type.
#
# For example, if appendfilename is set to appendonly.aof, the following file
# names could be derived:
#
# - appendonly.aof.1.base.rdb as a base file.
# - appendonly.aof.1.incr.aof, appendonly.aof.2.incr.aof as incremental files.
# - appendonly.aof.manifest as a manifest file.
appendfilename "appendonly.aof"
# For convenience, Redis stores all persistent append-only files in a dedicated
# directory. The name of the directory is determined by the appenddirname
# configuration parameter.
appenddirname "appendonlydir"
# The fsync() call tells the Operating System to actually write data on disk
# instead of waiting for more data in the output buffer. Some OS will really flush
# data on disk, some other OS will just try to do it ASAP.
@ -1426,15 +1455,9 @@ auto-aof-rewrite-min-size 64mb
# will be found.
aof-load-truncated yes
# When rewriting the AOF file, Redis is able to use an RDB preamble in the
# AOF file for faster rewrites and recoveries. When this option is turned
# on the rewritten AOF file is composed of two different stanzas:
#
# [RDB file][AOF tail]
#
# When loading, Redis recognizes that the AOF file starts with the "REDIS"
# string and loads the prefixed RDB file, then continues loading the AOF
# tail.
# Redis can create append-only base files in either RDB or AOF formats. Using
# the RDB format is always faster and more efficient, and disabling it is only
# supported for backward compatibility purposes.
aof-use-rdb-preamble yes
# Redis supports recording timestamp annotations in the AOF to support restoring

1520
src/aof.c

File diff suppressed because it is too large


@ -2150,6 +2150,10 @@ static int isValidDBfilename(char *val, const char **err) {
}
static int isValidAOFfilename(char *val, const char **err) {
if (!strcmp(val, "")) {
*err = "appendfilename can't be empty";
return 0;
}
if (!pathIsBaseName(val)) {
*err = "appendfilename can't be a path, just a filename";
return 0;
@ -2157,6 +2161,18 @@ static int isValidAOFfilename(char *val, const char **err) {
return 1;
}
static int isValidAOFdirname(char *val, const char **err) {
if (!strcmp(val, "")) {
*err = "appenddirname can't be empty";
return 0;
}
if (!pathIsBaseName(val)) {
*err = "appenddirname can't be a path, just a dirname";
return 0;
}
return 1;
}
static int isValidAnnouncedHostname(char *val, const char **err) {
if (strlen(val) >= NET_HOST_STR_LEN) {
*err = "Hostnames must be less than "
@ -2265,6 +2281,15 @@ static int updateAppendonly(const char **err) {
return 1;
}
static int updateAofAutoGCEnabled(const char **err) {
UNUSED(err);
if (!server.aof_disable_auto_gc) {
aofDelHistoryFiles();
}
return 1;
}
static int updateSighandlerEnabled(const char **err) {
UNUSED(err);
if (server.crashlog_enabled)
@ -2680,7 +2705,8 @@ standardConfig configs[] = {
createBoolConfig("disable-thp", NULL, MODIFIABLE_CONFIG, server.disable_thp, 1, NULL, NULL),
createBoolConfig("cluster-allow-replica-migration", NULL, MODIFIABLE_CONFIG, server.cluster_allow_replica_migration, 1, NULL, NULL),
createBoolConfig("replica-announced", NULL, MODIFIABLE_CONFIG, server.replica_announced, 1, NULL, NULL),
createBoolConfig("aof-disable-auto-gc", NULL, MODIFIABLE_CONFIG, server.aof_disable_auto_gc, 0, NULL, updateAofAutoGCEnabled),
/* String Configs */
createStringConfig("aclfile", NULL, IMMUTABLE_CONFIG, ALLOW_EMPTY_STRING, server.acl_filename, "", NULL, NULL),
createStringConfig("unixsocket", NULL, IMMUTABLE_CONFIG, EMPTY_STRING_IS_NULL, server.unixsocket, NULL, NULL, NULL),
@ -2693,6 +2719,7 @@ standardConfig configs[] = {
createStringConfig("syslog-ident", NULL, IMMUTABLE_CONFIG, ALLOW_EMPTY_STRING, server.syslog_ident, "redis", NULL, NULL),
createStringConfig("dbfilename", NULL, MODIFIABLE_CONFIG | PROTECTED_CONFIG, ALLOW_EMPTY_STRING, server.rdb_filename, "dump.rdb", isValidDBfilename, NULL),
createStringConfig("appendfilename", NULL, IMMUTABLE_CONFIG, ALLOW_EMPTY_STRING, server.aof_filename, "appendonly.aof", isValidAOFfilename, NULL),
createStringConfig("appenddirname", NULL, IMMUTABLE_CONFIG, ALLOW_EMPTY_STRING, server.aof_dirname, "appendonlydir", isValidAOFdirname, NULL),
createStringConfig("server_cpulist", NULL, IMMUTABLE_CONFIG, EMPTY_STRING_IS_NULL, server.server_cpulist, NULL, NULL, NULL),
createStringConfig("bio_cpulist", NULL, IMMUTABLE_CONFIG, EMPTY_STRING_IS_NULL, server.bio_cpulist, NULL, NULL, NULL),
createStringConfig("aof_rewrite_cpulist", NULL, IMMUTABLE_CONFIG, EMPTY_STRING_IS_NULL, server.aof_rewrite_cpulist, NULL, NULL, NULL),


@ -570,7 +570,10 @@ NULL
if (server.aof_state != AOF_OFF) flushAppendOnlyFile(1);
emptyData(-1,EMPTYDB_NO_FLAGS,NULL);
protectClient(c);
int ret = loadAppendOnlyFile(server.aof_filename);
if (server.aof_manifest) aofManifestFree(server.aof_manifest);
aofLoadManifestFromDisk();
aofDelHistoryFiles();
int ret = loadAppendOnlyFiles(server.aof_manifest);
if (ret != AOF_OK && ret != AOF_EMPTY)
exit(1);
unprotectClient(c);


@ -365,7 +365,7 @@ size_t freeMemoryGetNotCountedMemory(void) {
}
if (server.aof_state != AOF_OFF) {
overhead += sdsAllocSize(server.aof_buf)+aofRewriteBufferMemoryUsage();
overhead += sdsAllocSize(server.aof_buf);
}
return overhead;
}


@ -445,7 +445,7 @@ sds createLatencyReport(void) {
}
if (advise_ssd) {
report = sdscat(report,"- SSD disks are able to reduce fsync latency, and total time needed for snapshotting and AOF log rewriting (resulting in smaller memory usage and smaller final AOF rewrite buffer flushes). With extremely high write load SSD disks can be a good option. However Redis should perform reasonably with high load using normal disks. Use this advice as a last resort.\n");
report = sdscat(report,"- SSD disks are able to reduce fsync latency, and total time needed for snapshotting and AOF log rewriting (resulting in smaller memory usage). With extremely high write load SSD disks can be a good option. However Redis should perform reasonably with high load using normal disks. Use this advice as a last resort.\n");
}
if (advise_data_writeback) {


@ -1202,7 +1202,6 @@ struct redisMemOverhead *getMemoryOverheadData(void) {
mem = 0;
if (server.aof_state != AOF_OFF) {
mem += sdsZmallocSize(server.aof_buf);
mem += aofRewriteBufferMemoryUsage();
}
mh->aof_buffer = mem;
mem_total+=mem;


@ -1256,7 +1256,6 @@ ssize_t rdbSaveDb(rio *rdb, int dbid, int rdbflags, long *key_counter) {
ssize_t written = 0;
ssize_t res;
static long long info_updated_time = 0;
size_t processed = 0;
char *pname = (rdbflags & RDBFLAGS_AOF_PREAMBLE) ? "AOF rewrite" : "RDB";
redisDb *db = server.db + dbid;
@ -1299,16 +1298,6 @@ ssize_t rdbSaveDb(rio *rdb, int dbid, int rdbflags, long *key_counter) {
size_t dump_size = rdb->processed_bytes - rdb_bytes_before_key;
if (server.in_fork_child) dismissObject(o, dump_size);
/* When this RDB is produced as part of an AOF rewrite, move
* accumulated diff from parent to child while rewriting in
* order to have a smaller final write. */
if (rdbflags & RDBFLAGS_AOF_PREAMBLE &&
rdb->processed_bytes > processed+AOF_READ_DIFF_INTERVAL_BYTES)
{
processed = rdb->processed_bytes;
aofReadDiffFromParent();
}
/* Update child info every 1 second (approximately).
* in order to avoid calling mstime() on each iteration, we will
* check the diff every 1024 keys */
@ -2677,21 +2666,30 @@ void startLoading(size_t size, int rdbflags, int async) {
/* Mark that we are loading in the global state and setup the fields
* needed to provide loading stats.
* 'filename' is optional and used for rdb-check on error */
void startLoadingFile(FILE *fp, char* filename, int rdbflags) {
struct stat sb;
if (fstat(fileno(fp), &sb) == -1)
sb.st_size = 0;
void startLoadingFile(size_t size, char* filename, int rdbflags) {
rdbFileBeingLoaded = filename;
startLoading(sb.st_size, rdbflags, 0);
startLoading(size, rdbflags, 0);
}
/* Refresh the loading progress info */
void loadingProgress(off_t pos) {
/* Refresh the absolute loading progress info */
void loadingAbsProgress(off_t pos) {
server.loading_loaded_bytes = pos;
if (server.stat_peak_memory < zmalloc_used_memory())
server.stat_peak_memory = zmalloc_used_memory();
}
/* Refresh the incremental loading progress info */
void loadingIncrProgress(off_t size) {
server.loading_loaded_bytes += size;
if (server.stat_peak_memory < zmalloc_used_memory())
server.stat_peak_memory = zmalloc_used_memory();
}
/* Update the file name currently being loaded */
void updateLoadingFileName(char* filename) {
rdbFileBeingLoaded = filename;
}
/* Loading finished */
void stopLoading(int success) {
server.loading = 0;
@ -2738,7 +2736,7 @@ void rdbLoadProgressCallback(rio *r, const void *buf, size_t len) {
{
if (server.masterhost && server.repl_state == REPL_STATE_TRANSFER)
replicationSendNewlineToMaster();
loadingProgress(r->processed_bytes);
loadingAbsProgress(r->processed_bytes);
processEventsWhileBlocked();
processModuleLoadingProgressEvent(0);
}
@ -3176,9 +3174,14 @@ int rdbLoad(char *filename, rdbSaveInfo *rsi, int rdbflags) {
FILE *fp;
rio rdb;
int retval;
struct stat sb;
if ((fp = fopen(filename,"r")) == NULL) return C_ERR;
startLoadingFile(fp, filename,rdbflags);
if (fstat(fileno(fp), &sb) == -1)
sb.st_size = 0;
startLoadingFile(sb.st_size, filename, rdbflags);
rioInitWithFile(&rdb,fp);
retval = rdbLoadRio(&rdb,rdbflags,rsi);


@ -34,6 +34,7 @@
#include <stdarg.h>
#include <sys/time.h>
#include <unistd.h>
#include <sys/stat.h>
void createSharedObjects(void);
void rdbLoadProgressCallback(rio *r, const void *buf, size_t len);
@ -192,11 +193,15 @@ int redis_check_rdb(char *rdbfilename, FILE *fp) {
char buf[1024];
long long expiretime, now = mstime();
static rio rdb; /* Pointed by global struct riostate. */
struct stat sb;
int closefile = (fp == NULL);
if (fp == NULL && (fp = fopen(rdbfilename,"r")) == NULL) return 1;
startLoadingFile(fp, rdbfilename, RDBFLAGS_NONE);
if (fstat(fileno(fp), &sb) == -1)
sb.st_size = 0;
startLoadingFile(sb.st_size, rdbfilename, RDBFLAGS_NONE);
rioInitWithFile(&rdb,fp);
rdbstate.rio = &rdb;
rdb.update_cksum = rdbLoadProgressCallback;


@ -1191,7 +1191,8 @@ int serverCron(struct aeEventLoop *eventLoop, long long id, void *clientData) {
/* Start a scheduled AOF rewrite if this was requested by the user while
* a BGSAVE was in progress. */
if (!hasActiveChildProcess() &&
server.aof_rewrite_scheduled)
server.aof_rewrite_scheduled &&
!aofRewriteLimited())
{
rewriteAppendOnlyFileBackground();
}
@ -1230,7 +1231,8 @@ int serverCron(struct aeEventLoop *eventLoop, long long id, void *clientData) {
if (server.aof_state == AOF_ON &&
!hasActiveChildProcess() &&
server.aof_rewrite_perc &&
server.aof_current_size > server.aof_rewrite_min_size)
server.aof_current_size > server.aof_rewrite_min_size &&
!aofRewriteLimited())
{
long long base = server.aof_rewrite_base_size ?
server.aof_rewrite_base_size : 1;
@ -1248,8 +1250,11 @@ int serverCron(struct aeEventLoop *eventLoop, long long id, void *clientData) {
/* AOF postponed flush: Try at every cron cycle if the slow fsync
* completed. */
if (server.aof_state == AOF_ON && server.aof_flush_postponed_start)
if ((server.aof_state == AOF_ON || server.aof_state == AOF_WAIT_REWRITE) &&
server.aof_flush_postponed_start)
{
flushAppendOnlyFile(0);
}
/* AOF write errors: in this case we have a buffer to flush as well and
* clear the AOF error in case of success to make the DB writable again,
@ -1506,7 +1511,7 @@ void beforeSleep(struct aeEventLoop *eventLoop) {
trackingBroadcastInvalidationMessages();
/* Write the AOF buffer on disk */
if (server.aof_state == AOF_ON)
if (server.aof_state == AOF_ON || server.aof_state == AOF_WAIT_REWRITE)
flushAppendOnlyFile(0);
/* Try to process blocked clients every once in while. Example: A module
@ -1761,6 +1766,7 @@ void initServerConfig(void) {
server.aof_fd = -1;
server.aof_selected_db = -1; /* Make sure the first time will not match */
server.aof_flush_postponed_start = 0;
server.aof_last_incr_size = 0;
server.active_defrag_running = 0;
server.notify_keyspace_events = 0;
server.blocked_clients = 0;
@ -2390,7 +2396,6 @@ void initServer(void) {
server.child_info_pipe[0] = -1;
server.child_info_pipe[1] = -1;
server.child_info_nread = 0;
aofRewriteBufferReset();
server.aof_buf = sdsempty();
server.lastsave = time(NULL); /* At startup we consider the DB saved. */
server.lastbgsave_try = 0; /* At startup we never tried to BGSAVE. */
@ -3859,6 +3864,9 @@ int finishShutdown(void) {
}
}
/* Free the AOF manifest. */
if (server.aof_manifest) aofManifestFree(server.aof_manifest);
/* Fire the shutdown modules event. */
moduleFireServerEvent(REDISMODULE_EVENT_SHUTDOWN,0,NULL);
@ -4978,14 +4986,12 @@ sds genRedisInfoString(const char *section) {
"aof_base_size:%lld\r\n"
"aof_pending_rewrite:%d\r\n"
"aof_buffer_length:%zu\r\n"
"aof_rewrite_buffer_length:%lu\r\n"
"aof_pending_bio_fsync:%llu\r\n"
"aof_delayed_fsync:%lu\r\n",
(long long) server.aof_current_size,
(long long) server.aof_rewrite_base_size,
server.aof_rewrite_scheduled,
sdslen(server.aof_buf),
aofRewriteBufferSize(),
bioPendingJobsOfType(BIO_AOF_FSYNC),
server.aof_delayed_fsync);
}
@ -6017,12 +6023,9 @@ int checkForSentinelMode(int argc, char **argv, char *exec_name) {
void loadDataFromDisk(void) {
long long start = ustime();
if (server.aof_state == AOF_ON) {
/* It's not a failure if the file is empty or doesn't exist (later we will create it) */
int ret = loadAppendOnlyFile(server.aof_filename);
int ret = loadAppendOnlyFiles(server.aof_manifest);
if (ret == AOF_FAILED || ret == AOF_OPEN_ERR)
exit(1);
if (ret == AOF_OK)
serverLog(LL_NOTICE,"DB loaded from append only file: %.3f seconds",(float)(ustime()-start)/1000000);
} else {
rdbSaveInfo rsi = RDB_SAVE_INFO_INIT;
errno = 0; /* Prevent a stale value from affecting error checking */
@ -6492,17 +6495,10 @@ int main(int argc, char **argv) {
moduleLoadFromQueue();
ACLLoadUsersAtStartup();
InitServerLast();
aofLoadManifestFromDisk();
loadDataFromDisk();
/* Open the AOF file if needed. */
if (server.aof_state == AOF_ON) {
server.aof_fd = open(server.aof_filename,
O_WRONLY|O_APPEND|O_CREAT,0644);
if (server.aof_fd == -1) {
serverLog(LL_WARNING, "Can't open the append-only file: %s",
strerror(errno));
exit(1);
}
}
aofOpenIfNeededOnServerStart();
aofDelHistoryFiles();
if (server.cluster_enabled) {
if (verifyClusterConfigWithData() == C_ERR) {
serverLog(LL_WARNING,


@ -105,7 +105,6 @@ typedef long long ustime_t; /* microsecond time type. */
#define OBJ_SHARED_BULKHDR_LEN 32
#define LOG_MAX_LEN 1024 /* Default maximum length of syslog messages.*/
#define AOF_REWRITE_ITEMS_PER_CMD 64
#define AOF_READ_DIFF_INTERVAL_BYTES (1024*10)
#define AOF_ANNOTATION_LINE_MAX_LEN 1024
#define CONFIG_AUTHPASS_MAX_LEN 512
#define CONFIG_RUN_ID_SIZE 40
@ -251,6 +250,7 @@ extern int configOOMScoreAdjValuesDefaults[CONFIG_OOM_COUNT];
#define AOF_EMPTY 2
#define AOF_OPEN_ERR 3
#define AOF_FAILED 4
#define AOF_TRUNCATED 5
/* Command doc flags */
#define CMD_DOC_NONE 0
@ -1339,6 +1339,33 @@ typedef struct redisTLSContextConfig {
int session_cache_timeout;
} redisTLSContextConfig;
/*-----------------------------------------------------------------------------
* AOF manifest definition
*----------------------------------------------------------------------------*/
typedef enum {
AOF_FILE_TYPE_BASE = 'b', /* BASE file */
AOF_FILE_TYPE_HIST = 'h', /* HISTORY file */
AOF_FILE_TYPE_INCR = 'i', /* INCR file */
} aof_file_type;
typedef struct {
sds file_name; /* file name */
long long file_seq; /* file sequence */
aof_file_type file_type; /* file type */
} aofInfo;
typedef struct {
aofInfo *base_aof_info; /* BASE file information. NULL if there is no BASE file. */
list *incr_aof_list; /* INCR AOFs list. We may have multiple INCR AOF when rewrite fails. */
list *history_aof_list; /* HISTORY AOF list. When the AOFRW succeeds, the aofInfo contained in
`base_aof_info` and `incr_aof_list` will be moved to this list. We
will delete these AOF files when the AOFRW finishes. */
long long curr_base_file_seq; /* The sequence number used by the current BASE file. */
long long curr_incr_file_seq; /* The sequence number used by the current INCR file. */
int dirty; /* 1 Indicates that the aofManifest in the memory is inconsistent with
disk, we need to persist it immediately. */
} aofManifest;
/*-----------------------------------------------------------------------------
* Global server state
*----------------------------------------------------------------------------*/
@ -1556,12 +1583,14 @@ struct redisServer {
int aof_enabled; /* AOF configuration */
int aof_state; /* AOF_(ON|OFF|WAIT_REWRITE) */
int aof_fsync; /* Kind of fsync() policy */
char *aof_filename; /* Name of the AOF file */
char *aof_filename; /* Basename of the AOF file and manifest file */
char *aof_dirname; /* Name of the AOF directory */
int aof_no_fsync_on_rewrite; /* Don't fsync if a rewrite is in prog. */
int aof_rewrite_perc; /* Rewrite AOF if % growth is > M and... */
off_t aof_rewrite_min_size; /* the AOF file is at least N bytes. */
off_t aof_rewrite_base_size; /* AOF size on latest startup or rewrite. */
off_t aof_current_size; /* AOF current size. */
off_t aof_current_size; /* AOF current size (Including BASE + INCRs). */
off_t aof_last_incr_size; /* The size of the latest incr AOF. */
off_t aof_fsync_offset; /* AOF offset which is already synced to disk. */
int aof_flush_sleep; /* Micros to sleep before flush. (used by tests) */
int aof_rewrite_scheduled; /* Rewrite once BGSAVE terminates. */
@ -1585,16 +1614,10 @@ struct redisServer {
int aof_use_rdb_preamble; /* Use RDB preamble on AOF rewrites. */
redisAtomic int aof_bio_fsync_status; /* Status of AOF fsync in bio job. */
redisAtomic int aof_bio_fsync_errno; /* Errno of AOF fsync in bio job. */
/* AOF pipes used to communicate between parent and child during rewrite. */
int aof_pipe_write_data_to_child;
int aof_pipe_read_data_from_parent;
int aof_pipe_write_ack_to_parent;
int aof_pipe_read_ack_from_child;
int aof_pipe_write_ack_to_child;
int aof_pipe_read_ack_from_parent;
int aof_stop_sending_diff; /* If true stop sending accumulated diffs
to child process. */
sds aof_child_diff; /* AOF diff accumulator child side. */
aofManifest *aof_manifest; /* Used to track AOFs. */
int aof_disable_auto_gc; /* If true, disable automatic deletion of HISTORY type AOFs.
Default no (for testing). */
/* RDB persistence */
long long dirty; /* Changes to DB from the last save */
long long dirty_before_bgsave; /* Used to restore dirty on failed BGSAVE */
@ -2571,10 +2594,12 @@ void abortFailover(const char *err);
const char *getFailoverStateString();
/* Generic persistence functions */
void startLoadingFile(FILE* fp, char* filename, int rdbflags);
void startLoadingFile(size_t size, char* filename, int rdbflags);
void startLoading(size_t size, int rdbflags, int async);
void loadingProgress(off_t pos);
void loadingAbsProgress(off_t pos);
void loadingIncrProgress(off_t size);
void stopLoading(int success);
void updateLoadingFileName(char* filename);
void startSaving(int rdbflags);
void stopSaving(int success);
int allPersistenceDisabled(void);
@ -2594,16 +2619,18 @@ void flushAppendOnlyFile(int force);
void feedAppendOnlyFile(int dictid, robj **argv, int argc);
void aofRemoveTempFile(pid_t childpid);
int rewriteAppendOnlyFileBackground(void);
int loadAppendOnlyFile(char *filename);
int loadAppendOnlyFiles(aofManifest *am);
void stopAppendOnly(void);
int startAppendOnly(void);
void backgroundRewriteDoneHandler(int exitcode, int bysignal);
void aofRewriteBufferReset(void);
unsigned long aofRewriteBufferSize(void);
unsigned long aofRewriteBufferMemoryUsage(void);
ssize_t aofReadDiffFromParent(void);
void killAppendOnlyChild(void);
void restartAOFAfterSYNC();
void aofLoadManifestFromDisk(void);
void aofOpenIfNeededOnServerStart(void);
void aofManifestFree(aofManifest *am);
int aofDelHistoryFiles(void);
int aofRewriteLimited(void);
/* Child info */
void openChildInfoPipe(void);


@ -40,9 +40,13 @@
#include <stdint.h>
#include <errno.h>
#include <time.h>
#include <sys/stat.h>
#include <dirent.h>
#include <fcntl.h>
#include "util.h"
#include "sha256.h"
#include "config.h"
/* Glob-style pattern matching. */
int stringmatchlen(const char *pattern, int patternLen,
@ -810,6 +814,82 @@ int pathIsBaseName(char *path) {
return strchr(path,'/') == NULL && strchr(path,'\\') == NULL;
}
int fileExist(char *filename) {
struct stat statbuf;
return stat(filename, &statbuf) == 0 && S_ISREG(statbuf.st_mode);
}
int dirExists(char *dname) {
struct stat statbuf;
return stat(dname, &statbuf) == 0 && S_ISDIR(statbuf.st_mode);
}
int dirCreateIfMissing(char *dname) {
if (mkdir(dname, 0755) != 0) {
if (errno != EEXIST) {
return -1;
} else if (!dirExists(dname)) {
errno = ENOTDIR;
return -1;
}
}
return 0;
}
int dirRemove(char *dname) {
DIR *dir;
struct stat stat_entry;
struct dirent *entry;
char full_path[PATH_MAX + 1];
if ((dir = opendir(dname)) == NULL) {
return -1;
}
while ((entry = readdir(dir)) != NULL) {
if (!strcmp(entry->d_name, ".") || !strcmp(entry->d_name, "..")) continue;
snprintf(full_path, sizeof(full_path), "%s/%s", dname, entry->d_name);
int fd = open(full_path, O_RDONLY|O_NONBLOCK);
if (fd == -1) {
closedir(dir);
return -1;
}
if (fstat(fd, &stat_entry) == -1) {
close(fd);
closedir(dir);
return -1;
}
close(fd);
if (S_ISDIR(stat_entry.st_mode) != 0) {
if (dirRemove(full_path) == -1) {
closedir(dir); /* avoid leaking the DIR handle on the error path */
return -1;
}
continue;
}
if (unlink(full_path) != 0) {
closedir(dir);
return -1;
}
}
if (rmdir(dname) != 0) {
closedir(dir);
return -1;
}
closedir(dir);
return 0;
}
sds makePath(char *path, char *filename) {
return sdscatfmt(sdsempty(), "%s/%s", path, filename);
}
#ifdef REDIS_TEST
#include <assert.h>


@ -65,6 +65,11 @@ int ld2string(char *buf, size_t len, long double value, ld2string_mode mode);
sds getAbsolutePath(char *filename);
long getTimeZone(void);
int pathIsBaseName(char *path);
int dirCreateIfMissing(char *dname);
int dirExists(char *dname);
int dirRemove(char *dname);
int fileExist(char *filename);
sds makePath(char *path, char *filename);
#ifdef REDIS_TEST
int utilTest(int argc, char **argv, int flags);

Binary file not shown.


@ -12,6 +12,7 @@ package require Tcl 8.5
set tcl_precision 17
source ../support/redis.tcl
source ../support/util.tcl
source ../support/aofmanifest.tcl
source ../support/server.tcl
source ../support/test.tcl

File diff suppressed because it is too large


@ -1,6 +1,5 @@
set defaults { appendonly {yes} appendfilename {appendonly.aof} aof-use-rdb-preamble {no} }
set defaults { appendonly {yes} appendfilename {appendonly.aof} appenddirname {appendonlydir} aof-use-rdb-preamble {no} }
set server_path [tmpdir server.aof]
set aof_path "$server_path/appendonly.aof"
proc start_server_aof {overrides code} {
upvar defaults defaults srv srv server_path server_path


@ -1,36 +1,25 @@
set defaults { appendonly {yes} appendfilename {appendonly.aof} }
source tests/support/aofmanifest.tcl
set defaults { appendonly {yes} appendfilename {appendonly.aof} appenddirname {appendonlydir} auto-aof-rewrite-percentage {0}}
set server_path [tmpdir server.aof]
set aof_path "$server_path/appendonly.aof"
proc append_to_aof {str} {
upvar fp fp
puts -nonewline $fp $str
}
proc create_aof {code} {
upvar fp fp aof_path aof_path
set fp [open $aof_path w+]
uplevel 1 $code
close $fp
}
proc start_server_aof {overrides code} {
upvar defaults defaults srv srv server_path server_path
set config [concat $defaults $overrides]
set srv [start_server [list overrides $config]]
uplevel 1 $code
kill_server $srv
}
set aof_dirname "appendonlydir"
set aof_basename "appendonly.aof"
set aof_dirpath "$server_path/$aof_dirname"
set aof_file "$server_path/$aof_dirname/${aof_basename}.1$::incr_aof_sufix$::aof_format_suffix"
set aof_manifest_file "$server_path/$aof_dirname/$aof_basename$::manifest_suffix"
tags {"aof external:skip"} {
## Server can start when aof-load-truncated is set to yes and AOF
## is truncated, with an incomplete MULTI block.
create_aof {
# Server can start when aof-load-truncated is set to yes and AOF
# is truncated, with an incomplete MULTI block.
create_aof $aof_dirpath $aof_file {
append_to_aof [formatCommand set foo hello]
append_to_aof [formatCommand multi]
append_to_aof [formatCommand set bar world]
}
create_aof_manifest $aof_dirpath $aof_manifest_file {
append_to_manifest "file appendonly.aof.1.incr.aof seq 1 type i\n"
}
start_server_aof [list dir $server_path aof-load-truncated yes] {
test "Unfinished MULTI: Server should start if load-truncated is yes" {
assert_equal 1 [is_alive $srv]
@ -38,7 +27,7 @@ tags {"aof external:skip"} {
}
## Should also start with truncated AOF without incomplete MULTI block.
create_aof {
create_aof $aof_dirpath $aof_file {
append_to_aof [formatCommand incr foo]
append_to_aof [formatCommand incr foo]
append_to_aof [formatCommand incr foo]
@ -77,7 +66,7 @@ tags {"aof external:skip"} {
}
## Test that the server exits when the AOF contains a format error
create_aof {
create_aof $aof_dirpath $aof_file {
append_to_aof [formatCommand set foo hello]
append_to_aof "!!!"
append_to_aof [formatCommand set foo hello]
@ -102,7 +91,7 @@ tags {"aof external:skip"} {
}
## Test the server doesn't start when the AOF contains an unfinished MULTI
create_aof {
create_aof $aof_dirpath $aof_file {
append_to_aof [formatCommand set foo hello]
append_to_aof [formatCommand multi]
append_to_aof [formatCommand set bar world]
@ -127,7 +116,7 @@ tags {"aof external:skip"} {
}
## Test that the server exits when the AOF contains a short read
create_aof {
create_aof $aof_dirpath $aof_file {
append_to_aof [formatCommand set foo hello]
append_to_aof [string range [formatCommand set bar world] 0 end-1]
}
@ -153,25 +142,25 @@ tags {"aof external:skip"} {
## Test that redis-check-aof indeed sees this AOF is not valid
test "Short read: Utility should confirm the AOF is not valid" {
catch {
exec src/redis-check-aof $aof_path
exec src/redis-check-aof $aof_file
} result
assert_match "*not valid*" $result
}
test "Short read: Utility should show the abnormal line num in AOF" {
create_aof {
create_aof $aof_dirpath $aof_file {
append_to_aof [formatCommand set foo hello]
append_to_aof "!!!"
}
catch {
exec src/redis-check-aof $aof_path
exec src/redis-check-aof $aof_file
} result
assert_match "*ok_up_to_line=8*" $result
}
test "Short read: Utility should be able to fix the AOF" {
set result [exec src/redis-check-aof --fix $aof_path << "y\n"]
set result [exec src/redis-check-aof --fix $aof_file << "y\n"]
assert_match "*Successfully truncated AOF*" $result
}
@ -190,7 +179,7 @@ tags {"aof external:skip"} {
}
## Test that SPOP (that modifies the client's argc/argv) is correctly free'd
create_aof {
create_aof $aof_dirpath $aof_file {
append_to_aof [formatCommand sadd set foo]
append_to_aof [formatCommand sadd set bar]
append_to_aof [formatCommand spop set]
@ -209,7 +198,7 @@ tags {"aof external:skip"} {
}
## Uses the alsoPropagate() API.
create_aof {
create_aof $aof_dirpath $aof_file {
append_to_aof [formatCommand sadd set foo]
append_to_aof [formatCommand sadd set bar]
append_to_aof [formatCommand sadd set gah]
@ -229,7 +218,7 @@ tags {"aof external:skip"} {
}
## Test that PEXPIREAT is loaded correctly
create_aof {
create_aof $aof_dirpath $aof_file {
append_to_aof [formatCommand rpush list foo]
append_to_aof [formatCommand pexpireat list 1000]
append_to_aof [formatCommand rpush list bar]
@ -247,14 +236,14 @@ tags {"aof external:skip"} {
}
}
start_server {overrides {appendonly {yes} appendfilename {appendonly.aof}}} {
start_server {overrides {appendonly {yes}}} {
test {Redis should not try to convert DEL into EXPIREAT for EXPIRE -1} {
r set x 10
r expire x -1
}
}
start_server {overrides {appendonly {yes} appendfilename {appendonly.aof} appendfsync always}} {
start_server {overrides {appendonly {yes} appendfsync always}} {
test {AOF fsync always barrier issue} {
set rd [redis_deferring_client]
# Set a sleep when aof is flushed, so that we have a chance to look
@ -272,7 +261,7 @@ tags {"aof external:skip"} {
r del x
r setrange x [expr {int(rand()*5000000)+10000000}] x
r debug aof-flush-sleep 500000
set aof [file join [lindex [r config get dir] 1] appendonly.aof]
set aof [get_last_incr_aof_path r]
set size1 [file size $aof]
$rd get x
after [expr {int(rand()*30)}]
@@ -285,9 +274,9 @@ tags {"aof external:skip"} {
}
}
start_server {overrides {appendonly {yes} appendfilename {appendonly.aof}}} {
start_server {overrides {appendonly {yes}}} {
test {GETEX should not append to AOF} {
set aof [file join [lindex [r config get dir] 1] appendonly.aof]
set aof [get_last_incr_aof_path r]
r set foo bar
set before [file size $aof]
r getex foo
@@ -297,7 +286,7 @@ tags {"aof external:skip"} {
}
## Test that the server exits when the AOF contains a unknown command
create_aof {
create_aof $aof_dirpath $aof_file {
append_to_aof [formatCommand set foo hello]
append_to_aof [formatCommand bla foo hello]
append_to_aof [formatCommand set foo hello]
@@ -322,7 +311,7 @@ tags {"aof external:skip"} {
}
# Test that LMPOP/BLMPOP work fine with AOF.
create_aof {
create_aof $aof_dirpath $aof_file {
append_to_aof [formatCommand lpush mylist a b c]
append_to_aof [formatCommand rpush mylist2 1 2 3]
append_to_aof [formatCommand lpush mylist3 a b c d e]
@@ -369,7 +358,7 @@ tags {"aof external:skip"} {
}
# Test that ZMPOP/BZMPOP work fine with AOF.
create_aof {
create_aof $aof_dirpath $aof_file {
append_to_aof [formatCommand zadd myzset 1 one 2 two 3 three]
append_to_aof [formatCommand zadd myzset2 4 four 5 five 6 six]
append_to_aof [formatCommand zadd myzset3 1 one 2 two 3 three 4 four 5 five]
@@ -416,22 +405,24 @@ tags {"aof external:skip"} {
}
test {Generate timestamp annotations in AOF} {
start_server {overrides {appendonly {yes} appendfilename {appendonly.aof}}} {
start_server {overrides {appendonly {yes}}} {
r config set aof-timestamp-enabled yes
r config set aof-use-rdb-preamble no
set aof [file join [lindex [r config get dir] 1] appendonly.aof]
set aof [get_last_incr_aof_path r]
r set foo bar
assert_match "#TS:*" [exec head -n 1 $aof]
r bgrewriteaof
waitForBgrewriteaof r
set aof [get_base_aof_path r]
assert_match "#TS:*" [exec head -n 1 $aof]
}
}
# redis could load AOF which has timestamp annotations inside
create_aof {
create_aof $aof_dirpath $aof_file {
append_to_aof "#TS:1628217470\r\n"
append_to_aof [formatCommand set foo1 bar1]
append_to_aof "#TS:1628217471\r\n"
@@ -453,7 +444,7 @@ tags {"aof external:skip"} {
test {Truncate AOF to specific timestamp} {
# truncate to timestamp 1628217473
exec src/redis-check-aof --truncate-to-timestamp 1628217473 $aof_path
exec src/redis-check-aof --truncate-to-timestamp 1628217473 $aof_file
start_server_aof [list dir $server_path] {
set c [redis [dict get $srv host] [dict get $srv port] 0 $::tls]
wait_done_loading $c
@@ -463,7 +454,7 @@ tags {"aof external:skip"} {
}
# truncate to timestamp 1628217471
exec src/redis-check-aof --truncate-to-timestamp 1628217471 $aof_path
exec src/redis-check-aof --truncate-to-timestamp 1628217471 $aof_file
start_server_aof [list dir $server_path] {
set c [redis [dict get $srv host] [dict get $srv port] 0 $::tls]
wait_done_loading $c
@@ -473,7 +464,7 @@ tags {"aof external:skip"} {
}
# truncate to timestamp 1628217470
exec src/redis-check-aof --truncate-to-timestamp 1628217470 $aof_path
exec src/redis-check-aof --truncate-to-timestamp 1628217470 $aof_file
start_server_aof [list dir $server_path] {
set c [redis [dict get $srv host] [dict get $srv port] 0 $::tls]
wait_done_loading $c
@@ -482,12 +473,12 @@ tags {"aof external:skip"} {
}
# truncate to timestamp 1628217469
catch {exec src/redis-check-aof --truncate-to-timestamp 1628217469 $aof_path} e
catch {exec src/redis-check-aof --truncate-to-timestamp 1628217469 $aof_file} e
assert_match {*aborting*} $e
}
test {EVAL timeout with slow verbatim Lua script from AOF} {
create_aof {
create_aof $aof_dirpath $aof_file {
append_to_aof [formatCommand select 9]
append_to_aof [formatCommand eval {redis.call('set',KEYS[1],'y'); for i=1,1500000 do redis.call('ping') end return 'ok'} 1 x]
}
@@ -512,7 +503,10 @@ tags {"aof external:skip"} {
}
test {EVAL can process writes from AOF in read-only replicas} {
create_aof {
create_aof_manifest $aof_dirpath $aof_manifest_file {
append_to_manifest "file appendonly.aof.1.incr.aof seq 1 type i\n"
}
create_aof $aof_dirpath $aof_file {
append_to_aof [formatCommand select 9]
append_to_aof [formatCommand eval {redis.call("set",KEYS[1],"100")} 1 foo]
append_to_aof [formatCommand eval {redis.call("incr",KEYS[1])} 1 foo]

@@ -0,0 +1,169 @@
set ::base_aof_sufix ".base"
set ::incr_aof_sufix ".incr"
set ::manifest_suffix ".manifest"
set ::aof_format_suffix ".aof"
set ::rdb_format_suffix ".rdb"
proc get_full_path {dir filename} {
set _ [format "%s/%s" $dir $filename]
}
proc join_path {dir1 dir2} {
return [format "%s/%s" $dir1 $dir2]
}
proc get_redis_dir {} {
set config [srv config]
set _ [dict get $config "dir"]
}
proc check_file_exist {dir filename} {
set file_path [get_full_path $dir $filename]
return [file exists $file_path]
}
proc del_file {dir filename} {
set file_path [get_full_path $dir $filename]
catch {exec rm -rf $file_path}
}
proc get_cur_base_aof_name {manifest_filepath} {
set fp [open $manifest_filepath r+]
set lines {}
while {1} {
set line [gets $fp]
if {[eof $fp]} {
close $fp
break;
}
lappend lines $line
}
if {[llength $lines] == 0} {
return ""
}
set first_line [lindex $lines 0]
set aofname [lindex [split $first_line " "] 1]
set aoftype [lindex [split $first_line " "] 5]
if { $aoftype eq "b" } {
return $aofname
}
return ""
}
proc get_last_incr_aof_name {manifest_filepath} {
set fp [open $manifest_filepath r+]
set lines {}
while {1} {
set line [gets $fp]
if {[eof $fp]} {
close $fp
break;
}
lappend lines $line
}
if {[llength $lines] == 0} {
return ""
}
set len [llength $lines]
set last_line [lindex $lines [expr $len - 1]]
set aofname [lindex [split $last_line " "] 1]
set aoftype [lindex [split $last_line " "] 5]
if { $aoftype eq "i" } {
return $aofname
}
return ""
}
proc get_last_incr_aof_path {r} {
set dir [lindex [$r config get dir] 1]
set appenddirname [lindex [$r config get appenddirname] 1]
set appendfilename [lindex [$r config get appendfilename] 1]
set manifest_filepath [file join $dir $appenddirname $appendfilename$::manifest_suffix]
set last_incr_aof_name [get_last_incr_aof_name $manifest_filepath]
if {$last_incr_aof_name == ""} {
return ""
}
return [file join $dir $appenddirname $last_incr_aof_name]
}
proc get_base_aof_path {r} {
set dir [lindex [$r config get dir] 1]
set appenddirname [lindex [$r config get appenddirname] 1]
set appendfilename [lindex [$r config get appendfilename] 1]
set manifest_filepath [file join $dir $appenddirname $appendfilename$::manifest_suffix]
set cur_base_aof_name [get_cur_base_aof_name $manifest_filepath]
if {$cur_base_aof_name == ""} {
return ""
}
return [file join $dir $appenddirname $cur_base_aof_name]
}
proc assert_aof_manifest_content {manifest_path content} {
set fp [open $manifest_path r+]
set lines {}
while {1} {
set line [gets $fp]
if {[eof $fp]} {
close $fp
break;
}
lappend lines $line
}
assert_equal [llength $lines] [llength $content]
for { set i 0 } { $i < [llength $lines] } {incr i} {
assert_equal [lindex $lines $i] [lindex $content $i]
}
}
proc clean_aof_persistence {aof_dirpath} {
catch {eval exec rm -rf [glob $aof_dirpath]}
}
proc append_to_manifest {str} {
upvar fp fp
puts -nonewline $fp $str
}
proc create_aof_manifest {dir aof_manifest_file code} {
create_aof_dir $dir
upvar fp fp
set fp [open $aof_manifest_file w+]
uplevel 1 $code
close $fp
}
proc append_to_aof {str} {
upvar fp fp
puts -nonewline $fp $str
}
proc create_aof {dir aof_file code} {
create_aof_dir $dir
upvar fp fp
set fp [open $aof_file w+]
uplevel 1 $code
close $fp
}
proc create_aof_dir {dir_path} {
file mkdir $dir_path
}
proc start_server_aof {overrides code} {
upvar defaults defaults srv srv server_path server_path
set config [concat $defaults $overrides]
set srv [start_server [list overrides $config]]
uplevel 1 $code
kill_server $srv
}

@@ -30,10 +30,16 @@ proc clean_persistence config {
# we may wanna keep the logs for later, but let's clean the persistence
# files right away, since they can accumulate and take up a lot of space
set config [dict get $config "config"]
set rdb [format "%s/%s" [dict get $config "dir"] "dump.rdb"]
set aof [format "%s/%s" [dict get $config "dir"] "appendonly.aof"]
set dir [dict get $config "dir"]
set rdb [format "%s/%s" $dir "dump.rdb"]
if {[dict exists $config "appenddirname"]} {
set aofdir [dict get $config "appenddirname"]
} else {
set aofdir "appendonlydir"
}
set aof_dirpath [format "%s/%s" $dir $aofdir]
clean_aof_persistence $aof_dirpath
catch {exec rm -rf $rdb}
catch {exec rm -rf $aof}
}
proc kill_server config {

@@ -84,7 +84,7 @@ proc status {r property} {
proc waitForBgsave r {
while 1 {
if {[status r rdb_bgsave_in_progress] eq 1} {
if {[status $r rdb_bgsave_in_progress] eq 1} {
if {$::verbose} {
puts -nonewline "\nWaiting for background save to finish... "
flush stdout
@@ -98,7 +98,7 @@ proc waitForBgsave r {
proc waitForBgrewriteaof r {
while 1 {
if {[status r aof_rewrite_in_progress] eq 1} {
if {[status $r aof_rewrite_in_progress] eq 1} {
if {$::verbose} {
puts -nonewline "\nWaiting for background AOF rewrite to finish... "
flush stdout

@@ -6,6 +6,7 @@ package require Tcl 8.5
set tcl_precision 17
source tests/support/redis.tcl
source tests/support/aofmanifest.tcl
source tests/support/server.tcl
source tests/support/tmpfile.tcl
source tests/support/test.tcl
@@ -46,6 +47,7 @@ set ::all_tests {
integration/replication-buffer
integration/shutdown
integration/aof
integration/aof-multi-part
integration/rdb
integration/corrupt-dump
integration/corrupt-dump-fuzzer

@@ -308,14 +308,14 @@ start_server {tags {"expire"}} {
} {-2}
# Start a new server with empty data and AOF file.
start_server {overrides {appendonly {yes} appendfilename {appendonly.aof} appendfsync always} tags {external:skip}} {
start_server {overrides {appendonly {yes} appendfsync always} tags {external:skip}} {
test {All time-to-live(TTL) in commands are propagated as absolute timestamp in milliseconds in AOF} {
# This test makes sure that expire times are propagated as absolute
# times to the AOF file and not as relative time, so that when the AOF
# is reloaded the TTLs are not being shifted forward to the future.
# We want the time to logically pass when the server is restarted!
set aof [file join [lindex [r config get dir] 1] [lindex [r config get appendfilename] 1]]
set aof [get_last_incr_aof_path r]
# Apply each TTL-related command to a unique key
# SET commands

@@ -183,6 +183,7 @@ start_server {tags {"introspection"}} {
pidfile
syslog-ident
appendfilename
appenddirname
supervised
syslog-facility
databases