From 100cd7b21efa965637c5e95cd3bf72f211c00122 Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 10 Feb 2014 15:38:54 +0100 Subject: [PATCH 0001/1648] Added a release notes file good for "unstable". --- 00-RELEASENOTES | 91 ++++++------------------------------------------- 1 file changed, 11 insertions(+), 80 deletions(-) diff --git a/00-RELEASENOTES b/00-RELEASENOTES index 36317ca3502..81ff184feba 100644 --- a/00-RELEASENOTES +++ b/00-RELEASENOTES @@ -1,85 +1,16 @@ -Redis 2.6 release notes +Hello! This file is just a placeholder, since this is the "unstable" branch +of Redis, the place where all the development happens. -Migrating from 2.4 to 2.6 -========================= +There is no release notes for this branch, it gets forked into another branch +every time there is a partial feature freeze in order to eventually create +a new stable release. -Redis 2.4 is mostly a strict subset of 2.6. However there are a few things -that you should be aware of: +Usually "unstable" is stable enough for you to use it in development enviromnets +however you should never use it in production environments. It is possible +to download the latest stable release here: -* You can't use .rdb and AOF files generated with 2.6 into a 2.4 instance. -* 2.6 slaves can be attached to 2.4 masters, but not the contrary, and only - for the time needed to perform the version upgrade. + http://download.redis.io/releases/redis-stable.tar.gz -There are also a few API differences, that are unlikely to cause problems, -but it is better to keep them in mind: +More information is available at http://redis.io -* SORT now will refuse to sort in numerical mode elements that can't be parsed - as numbers. -* EXPIREs now all have millisecond resolution (but this is very unlikely to - break code that was not conceived exploting the previous resolution error - in some way.) -* INFO output is a bit different now, and contains empty lines and comments - starting with '#'. All the major clients should be already fixed to work - with the new INFO format. - -Also the following redis.conf and CONFIG GET / SET parameters changed name: - - * hash-max-zipmap-entries, now replaced by hash-max-ziplist-entries - * hash-max-zipmap-value, now replaced by hash-max-ziplist-value - * glueoutputbuf was now completely removed as it does not make sense - ---------- -CHANGELOG ---------- - -What's new in Redis 2.6.0 -========================= - -UPGRADE URGENCY: We suggest new users to start with 2.6.0, and old users to - upgrade after some testing of the application with the new - Redis version. - -* Server side Lua scripting, see http://redis.io/commands/eval -* Virtual Memory removed (was deprecated in 2.4) -* Hardcoded limits about max number of clients removed. -* AOF low level semantics is generally more sane, and especially when used - in slaves. -* Milliseconds resolution expires, also added new commands with milliseconds - precision (PEXPIRE, PTTL, ...). -* Clients max output buffer soft and hard limits. You can specifiy different - limits for different classes of clients (normal,pubsub,slave). -* AOF is now able to rewrite aggregate data types using variadic commands, - often producing an AOF that is faster to save, load, and is smaller in size. -* Every redis.conf directive is now accepted as a command line option for the - redis-server binary, with the same name and number of arguments. -* Hash table seed randomization for protection against collisions attacks. -* Performances improved when writing large objects to Redis. -* Significant parts of the core refactored or rewritten. New internal APIs - and core changes allowed to develop Redis Cluster on top of the new code, - however for 2.6 all the cluster code was removed, and will be released with - Redis 3.0 when it is more complete and stable. -* Redis ASCII art logo added at startup. -* Crash report on memory violation or failed asserts improved significantly - to make debugging of hard to catch bugs simpler. -* redis-benchmark improvements: ability to run selected tests, - CSV output, faster, better help. -* redis-cli improvements: --eval for comfortable development of Lua scripts. -* SHUTDOWN now supports two optional arguments: "SAVE" and "NOSAVE". -* INFO output split into sections, the command is now able to just show - pecific sections. -* New statistics about how many time a command was called, and how much - execution time it used (INFO commandstats). -* More predictable SORT behavior in edge cases. -* INCRBYFLOAT and HINCRBYFLOAT commands. - --------------------------------------------------------------------------------- - -Credits: Where not specified the implementation and design are done by -Salvatore Sanfilippo and Pieter Noordhuis. Thanks to VMware for making all -this possible. Also many thanks to all the other contributors and the amazing -community we have. - -See commit messages for more credits. - -Cheers, -Salvatore +Happy hacking! From 007e1c7cb27045f8c6693ba62680beda3eb86e75 Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 10 Feb 2014 15:53:09 +0100 Subject: [PATCH 0002/1648] Cluster: added signature + version in bus packets. --- src/cluster.c | 8 +++++++- src/cluster.h | 3 +++ 2 files changed, 10 insertions(+), 1 deletion(-) diff --git a/src/cluster.c b/src/cluster.c index 5510c5e52ab..44cb47651b1 100644 --- a/src/cluster.c +++ b/src/cluster.c @@ -1166,7 +1166,9 @@ int clusterProcessPacket(clusterLink *link) { type, (unsigned long) totlen); /* Perform sanity checks */ - if (totlen < 8) return 1; + if (totlen < 16) return 1; /* At least signature, totlen, count. */ + if (hdr->sig[0] != 'R' || hdr->sig[1] != 'C' || + hdr->sig[2] != 'i' || hdr->sig[3] != 'b') return 1; /* Bad signature. */ if (totlen > sdslen(link->rcvbuf)) return 1; if (type == CLUSTERMSG_TYPE_PING || type == CLUSTERMSG_TYPE_PONG || type == CLUSTERMSG_TYPE_MEET) @@ -1677,6 +1679,10 @@ void clusterBuildMessageHdr(clusterMsg *hdr, int type) { myself->slaveof : myself; memset(hdr,0,sizeof(*hdr)); + hdr->sig[0] = 'R'; + hdr->sig[1] = 'C'; + hdr->sig[2] = 'i'; + hdr->sig[3] = 'b'; hdr->type = htons(type); memcpy(hdr->sender,myself->name,REDIS_CLUSTER_NAMELEN); diff --git a/src/cluster.h b/src/cluster.h index cbda08987d7..20889b6e2a7 100644 --- a/src/cluster.h +++ b/src/cluster.h @@ -195,6 +195,9 @@ union clusterMsgData { typedef struct { + char sig[4]; /* Siganture "RCib" (Redis Cluster internal bus). */ + uint16_t ver; /* Protocol version, currently set to 0. */ + uint16_t notused0; /* 2 bytes not used. */ uint32_t totlen; /* Total length of this message */ uint16_t type; /* Message type */ uint16_t count; /* Only used for some kind of messages. */ From dced9c06198d6b80caf7aa6fe8cb865a562c12db Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 10 Feb 2014 15:54:19 +0100 Subject: [PATCH 0003/1648] Cluster: discard bus messages with version != 0. --- src/cluster.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/cluster.c b/src/cluster.c index 44cb47651b1..8535da2f539 100644 --- a/src/cluster.c +++ b/src/cluster.c @@ -1166,9 +1166,10 @@ int clusterProcessPacket(clusterLink *link) { type, (unsigned long) totlen); /* Perform sanity checks */ - if (totlen < 16) return 1; /* At least signature, totlen, count. */ + if (totlen < 16) return 1; /* At least signature, version, totlen, count. */ if (hdr->sig[0] != 'R' || hdr->sig[1] != 'C' || hdr->sig[2] != 'i' || hdr->sig[3] != 'b') return 1; /* Bad signature. */ + if (ntohs(hdr->ver) != 0) return 1; /* Can't handle versions other than 0. */ if (totlen > sdslen(link->rcvbuf)) return 1; if (type == CLUSTERMSG_TYPE_PING || type == CLUSTERMSG_TYPE_PONG || type == CLUSTERMSG_TYPE_MEET) From 7bf7b7350cb16d2bc5084d41469f96c9565bf089 Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 10 Feb 2014 15:55:21 +0100 Subject: [PATCH 0004/1648] Cluster: signature changed to "RCmb" (Redis Cluster message bus). Sounds better after all. --- src/cluster.c | 4 ++-- src/cluster.h | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/src/cluster.c b/src/cluster.c index 8535da2f539..ce604aeed73 100644 --- a/src/cluster.c +++ b/src/cluster.c @@ -1168,7 +1168,7 @@ int clusterProcessPacket(clusterLink *link) { /* Perform sanity checks */ if (totlen < 16) return 1; /* At least signature, version, totlen, count. */ if (hdr->sig[0] != 'R' || hdr->sig[1] != 'C' || - hdr->sig[2] != 'i' || hdr->sig[3] != 'b') return 1; /* Bad signature. */ + hdr->sig[2] != 'm' || hdr->sig[3] != 'b') return 1; /* Bad signature. */ if (ntohs(hdr->ver) != 0) return 1; /* Can't handle versions other than 0. */ if (totlen > sdslen(link->rcvbuf)) return 1; if (type == CLUSTERMSG_TYPE_PING || type == CLUSTERMSG_TYPE_PONG || @@ -1682,7 +1682,7 @@ void clusterBuildMessageHdr(clusterMsg *hdr, int type) { memset(hdr,0,sizeof(*hdr)); hdr->sig[0] = 'R'; hdr->sig[1] = 'C'; - hdr->sig[2] = 'i'; + hdr->sig[2] = 'm'; hdr->sig[3] = 'b'; hdr->type = htons(type); memcpy(hdr->sender,myself->name,REDIS_CLUSTER_NAMELEN); diff --git a/src/cluster.h b/src/cluster.h index 20889b6e2a7..ef466c820bb 100644 --- a/src/cluster.h +++ b/src/cluster.h @@ -195,7 +195,7 @@ union clusterMsgData { typedef struct { - char sig[4]; /* Siganture "RCib" (Redis Cluster internal bus). */ + char sig[4]; /* Siganture "RCmb" (Redis Cluster message bus). */ uint16_t ver; /* Protocol version, currently set to 0. */ uint16_t notused0; /* 2 bytes not used. */ uint32_t totlen; /* Total length of this message */ From 344a065d51aa79a96c58c490cf202d5950cb60da Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 10 Feb 2014 16:00:27 +0100 Subject: [PATCH 0005/1648] Cluster: don't propagate PUBLISH two times. PUBLISH both published messages via Cluster bus and replication when cluster was enabled, resulting in duplicated message in the slave. --- src/pubsub.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/src/pubsub.c b/src/pubsub.c index a596dfc9631..af075bfe925 100644 --- a/src/pubsub.c +++ b/src/pubsub.c @@ -306,8 +306,10 @@ void punsubscribeCommand(redisClient *c) { void publishCommand(redisClient *c) { int receivers = pubsubPublishMessage(c->argv[1],c->argv[2]); - if (server.cluster_enabled) clusterPropagatePublish(c->argv[1],c->argv[2]); - forceCommandPropagation(c,REDIS_PROPAGATE_REPL); + if (server.cluster_enabled) + clusterPropagatePublish(c->argv[1],c->argv[2]); + else + forceCommandPropagation(c,REDIS_PROPAGATE_REPL); addReplyLongLong(c,receivers); } From f885fa8bac6e476e4ca9ad8be80707ac74a320af Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 10 Feb 2014 16:27:33 +0100 Subject: [PATCH 0006/1648] Cluster: clusterReadHandler() fixed to work with new message header. --- src/cluster.c | 22 ++++++++++++---------- src/cluster.h | 2 +- 2 files changed, 13 insertions(+), 11 deletions(-) diff --git a/src/cluster.c b/src/cluster.c index ce604aeed73..d59b63423fe 100644 --- a/src/cluster.c +++ b/src/cluster.c @@ -1167,8 +1167,6 @@ int clusterProcessPacket(clusterLink *link) { /* Perform sanity checks */ if (totlen < 16) return 1; /* At least signature, version, totlen, count. */ - if (hdr->sig[0] != 'R' || hdr->sig[1] != 'C' || - hdr->sig[2] != 'm' || hdr->sig[3] != 'b') return 1; /* Bad signature. */ if (ntohs(hdr->ver) != 0) return 1; /* Can't handle versions other than 0. */ if (totlen > sdslen(link->rcvbuf)) return 1; if (type == CLUSTERMSG_TYPE_PING || type == CLUSTERMSG_TYPE_PONG || @@ -1582,18 +1580,22 @@ void clusterReadHandler(aeEventLoop *el, int fd, void *privdata, int mask) { while(1) { /* Read as long as there is data to read. */ rcvbuflen = sdslen(link->rcvbuf); - if (rcvbuflen < 4) { - /* First, obtain the first four bytes to get the full message + if (rcvbuflen < 8) { + /* First, obtain the first 8 bytes to get the full message * length. */ - readlen = 4 - rcvbuflen; + readlen = 8 - rcvbuflen; } else { /* Finally read the full message. */ hdr = (clusterMsg*) link->rcvbuf; - if (rcvbuflen == 4) { - /* Perform some sanity check on the message length. */ - if (ntohl(hdr->totlen) < CLUSTERMSG_MIN_LEN) { + if (rcvbuflen == 8) { + /* Perform some sanity check on the message signature + * and length. */ + if (memcmp(hdr->sig,"RCmb",4) != 0 || + ntohl(hdr->totlen) < CLUSTERMSG_MIN_LEN) + { redisLog(REDIS_WARNING, - "Bad message length received from Cluster bus."); + "Bad message length or signature received " + "from Cluster bus."); handleLinkIOError(link); return; } @@ -1619,7 +1621,7 @@ void clusterReadHandler(aeEventLoop *el, int fd, void *privdata, int mask) { } /* Total length obtained? Process this packet. */ - if (rcvbuflen >= 4 && rcvbuflen == ntohl(hdr->totlen)) { + if (rcvbuflen >= 8 && rcvbuflen == ntohl(hdr->totlen)) { if (clusterProcessPacket(link)) { sdsfree(link->rcvbuf); link->rcvbuf = sdsempty(); diff --git a/src/cluster.h b/src/cluster.h index ef466c820bb..aae13cd4071 100644 --- a/src/cluster.h +++ b/src/cluster.h @@ -196,9 +196,9 @@ union clusterMsgData { typedef struct { char sig[4]; /* Siganture "RCmb" (Redis Cluster message bus). */ + uint32_t totlen; /* Total length of this message */ uint16_t ver; /* Protocol version, currently set to 0. */ uint16_t notused0; /* 2 bytes not used. */ - uint32_t totlen; /* Total length of this message */ uint16_t type; /* Message type */ uint16_t count; /* Only used for some kind of messages. */ uint64_t currentEpoch; /* The epoch accordingly to the sending node. */ From f106a79309c540aa49989e53ab8cba66c94b9349 Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 10 Feb 2014 16:59:09 +0100 Subject: [PATCH 0007/1648] Cluster: redis-trib del-node variable typo fixed. --- src/redis-trib.rb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/redis-trib.rb b/src/redis-trib.rb index 1354c35addb..7aa345ac255 100755 --- a/src/redis-trib.rb +++ b/src/redis-trib.rb @@ -898,7 +898,7 @@ def delnode_cluster_cmd(argv,opt) xputs ">>> Sending CLUSTER FORGET messages to the cluster..." @nodes.each{|n| next if n == node - if n.info[:replicate] && n.info[:replicate].downcase == node_id + if n.info[:replicate] && n.info[:replicate].downcase == id # Reconfigure the slave to replicate with some other node xputs ">>> #{n} as replica of #{master}" master = get_master_with_least_replicas From 5b2082ead3489523c336e3047550bad5cad25d98 Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 10 Feb 2014 17:08:37 +0100 Subject: [PATCH 0008/1648] Cluster: replica migration should only work for masters serving slots. --- src/cluster.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/cluster.c b/src/cluster.c index d59b63423fe..167468dc5ca 100644 --- a/src/cluster.c +++ b/src/cluster.c @@ -2264,7 +2264,7 @@ void clusterHandleSlaveMigration(int max_slaves) { if (nodeIsSlave(node) || nodeFailed(node)) continue; okslaves = clusterCountNonFailingSlaves(node); - if (okslaves == 0 && target == NULL) target = node; + if (okslaves == 0 && target == NULL && node->numslots > 0) target = node; if (okslaves == max_slaves) { for (j = 0; j < node->numslaves; j++) { if (memcmp(node->slaves[j]->name, @@ -2487,7 +2487,7 @@ void clusterCron(void) { if (nodeIsSlave(myself) && nodeIsMaster(node) && !nodeFailed(node)) { int okslaves = clusterCountNonFailingSlaves(node); - if (okslaves == 0) orphaned_masters++; + if (okslaves == 0 && node->numslots > 0) orphaned_masters++; if (okslaves > max_slaves) max_slaves = okslaves; if (nodeIsSlave(myself) && myself->slaveof == node) this_slaves = okslaves; From 21648473aad5c3d09c6e9a9ed440564c8135ef6f Mon Sep 17 00:00:00 2001 From: Matt Stancliff Date: Mon, 10 Feb 2014 11:06:13 -0500 Subject: [PATCH 0009/1648] Auto-enter slaveMode when SYNC from redis-cli If someone asks for SYNC or PSYNC from redis-cli, automatically enter slaveMode (as if they ran redis-cli --slave) and continue printing the replication stream until either they Ctrl-C or the master gets disconnected. --- src/redis-cli.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/src/redis-cli.c b/src/redis-cli.c index fa9dc54768f..7d411f8aef1 100644 --- a/src/redis-cli.c +++ b/src/redis-cli.c @@ -94,6 +94,7 @@ static struct config { } config; static void usage(); +static void slaveMode(void); char *redisGitSHA1(void); char *redisGitDirty(void); @@ -600,6 +601,8 @@ static int cliSendCommand(int argc, char **argv, int repeat) { if (!strcasecmp(command,"monitor")) config.monitor_mode = 1; if (!strcasecmp(command,"subscribe") || !strcasecmp(command,"psubscribe")) config.pubsub_mode = 1; + if (!strcasecmp(command,"sync") || + !strcasecmp(command,"psync")) config.slave_mode = 1; /* Setup argument length */ argvlen = malloc(argc*sizeof(size_t)); @@ -621,6 +624,13 @@ static int cliSendCommand(int argc, char **argv, int repeat) { } } + if (config.slave_mode) { + printf("Entering slave output mode... (press Ctrl-C to quit)\n"); + slaveMode(); + config.slave_mode = 0; + return REDIS_ERR; /* Error = slaveMode lost connection to master */ + } + if (cliReadReply(output_raw) != REDIS_OK) { free(argvlen); return REDIS_ERR; @@ -1061,6 +1071,7 @@ static void slaveMode(void) { int fd = context->fd; unsigned long long payload = sendSync(fd); char buf[1024]; + int original_output = config.output; fprintf(stderr,"SYNC with master, discarding %llu " "bytes of bulk transfer...\n", payload); @@ -1081,6 +1092,7 @@ static void slaveMode(void) { /* Now we can use hiredis to read the incoming protocol. */ config.output = OUTPUT_CSV; while (cliReadReply(0) == REDIS_OK); + config.output = original_output; } /* This function implements --rdb, so it uses the replication protocol in order From 32563b4a5f44bee87da2bbe431b6bf3a375dd939 Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 10 Feb 2014 17:18:16 +0100 Subject: [PATCH 0010/1648] Cluster: clear the FAIL status for masters without slots. Masters without slots don't participate to the cluster but just do redirections, no need to take them in FAIL state if they are back reachable. --- src/cluster.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/src/cluster.c b/src/cluster.c index 167468dc5ca..848068af262 100644 --- a/src/cluster.c +++ b/src/cluster.c @@ -830,9 +830,10 @@ void clearNodeFailureIfNeeded(clusterNode *node) { /* For slaves we always clear the FAIL flag if we can contact the * node again. */ - if (nodeIsSlave(node)) { + if (nodeIsSlave(node) || node->numslots == 0) { redisLog(REDIS_NOTICE, - "Clear FAIL state for node %.40s: slave is reachable again.", + "Clear FAIL state for node %.40s: %s is reachable again.", + nodeIsSlave(node) ? "slave" : "master without slots", node->name); node->flags &= ~REDIS_NODE_FAIL; clusterDoBeforeSleep(CLUSTER_TODO_UPDATE_STATE|CLUSTER_TODO_SAVE_CONFIG); From 1a73c992a3aeedab3c7fd57ab538722327f3fc99 Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 10 Feb 2014 17:21:10 +0100 Subject: [PATCH 0011/1648] Cluster: fixed inverted arguments in logging function call. --- src/cluster.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/cluster.c b/src/cluster.c index 848068af262..395899f9869 100644 --- a/src/cluster.c +++ b/src/cluster.c @@ -833,8 +833,8 @@ void clearNodeFailureIfNeeded(clusterNode *node) { if (nodeIsSlave(node) || node->numslots == 0) { redisLog(REDIS_NOTICE, "Clear FAIL state for node %.40s: %s is reachable again.", - nodeIsSlave(node) ? "slave" : "master without slots", - node->name); + node->name, + nodeIsSlave(node) ? "slave" : "master without slots"); node->flags &= ~REDIS_NODE_FAIL; clusterDoBeforeSleep(CLUSTER_TODO_UPDATE_STATE|CLUSTER_TODO_SAVE_CONFIG); } From be0bb19fd3f7159a63a889ecc8db4a06bda2cf15 Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 10 Feb 2014 17:44:16 +0100 Subject: [PATCH 0012/1648] Cluster: redis-trib, more info about open slots error. --- src/redis-trib.rb | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/redis-trib.rb b/src/redis-trib.rb index 7aa345ac255..ab497f190ae 100755 --- a/src/redis-trib.rb +++ b/src/redis-trib.rb @@ -359,11 +359,11 @@ def check_open_slots @nodes.each{|n| if n.info[:migrating].size > 0 cluster_error \ - "[WARNING] Node #{n} has slots in migrating state." + "[WARNING] Node #{n} has slots in migrating state (#{n.info[:migrating].keys.join(",")})." open_slots += n.info[:migrating].keys elsif n.info[:importing].size > 0 cluster_error \ - "[WARNING] Node #{n} has slots in importing state." + "[WARNING] Node #{n} has slots in importing state (#{n.info[:importing].keys.join(",")})." open_slots += n.info[:importing].keys end } From 6fc53e16adba5a3f096bf4f242f883c95d2b12de Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 10 Feb 2014 18:01:58 +0100 Subject: [PATCH 0013/1648] Cluster: update configEpoch after manually messing with slots. --- src/cluster.c | 17 ++++++++++++++--- 1 file changed, 14 insertions(+), 3 deletions(-) diff --git a/src/cluster.c b/src/cluster.c index 395899f9869..651063d2114 100644 --- a/src/cluster.c +++ b/src/cluster.c @@ -1115,8 +1115,7 @@ void clusterUpdateSlotsConfigWith(clusterNode *sender, uint64_t senderConfigEpoc * 2) The new node claims it with a greater configEpoch. */ if (server.cluster->slots[j] == sender) continue; if (server.cluster->slots[j] == NULL || - server.cluster->slots[j]->configEpoch < - senderConfigEpoch) + server.cluster->slots[j]->configEpoch < senderConfigEpoch) { if (server.cluster->slots[j] == curmaster) newmaster = sender; @@ -3164,14 +3163,26 @@ void clusterCommand(redisClient *c) { * itself also clears the importing status. */ if (n == myself && server.cluster->importing_slots_from[slot]) + { + clusterNode *old_owner = + server.cluster->importing_slots_from[slot]; + /* This slot was manually migrated, set this node configEpoch + * at least to the value of the configEpoch of the old owner + * so that its old replicas, or some of its old message pending + * on the cluster bus, can't claim our slot. */ + if (old_owner->configEpoch > myself->configEpoch) + myself->configEpoch = old_owner->configEpoch; server.cluster->importing_slots_from[slot] = NULL; + } clusterDelSlot(slot); clusterAddSlot(n,slot); } else { addReplyError(c,"Invalid CLUSTER SETSLOT action or number of arguments"); return; } - clusterDoBeforeSleep(CLUSTER_TODO_UPDATE_STATE|CLUSTER_TODO_SAVE_CONFIG); + clusterDoBeforeSleep(CLUSTER_TODO_SAVE_CONFIG| + CLUSTER_TODO_UPDATE_STATE| + CLUSTER_TODO_FSYNC_CONFIG); addReply(c,shared.ok); } else if (!strcasecmp(c->argv[1]->ptr,"info") && c->argc == 2) { /* CLUSTER INFO */ From a3755ae9ee6391b0ebb6fa0a26e0f38f00574654 Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 10 Feb 2014 18:04:43 +0100 Subject: [PATCH 0014/1648] Cluster: ignore slot config changes if we are importing it. --- src/cluster.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/src/cluster.c b/src/cluster.c index 651063d2114..88c927d5c39 100644 --- a/src/cluster.c +++ b/src/cluster.c @@ -1112,8 +1112,10 @@ void clusterUpdateSlotsConfigWith(clusterNode *sender, uint64_t senderConfigEpoc if (bitmapTestBit(slots,j)) { /* We rebind the slot to the new node claiming it if: * 1) The slot was unassigned. - * 2) The new node claims it with a greater configEpoch. */ - if (server.cluster->slots[j] == sender) continue; + * 2) The new node claims it with a greater configEpoch. + * 3) We are not currently importing the slot. */ + if (server.cluster->slots[j] == sender || + server.cluster->importing_slots_from[j]) continue; if (server.cluster->slots[j] == NULL || server.cluster->slots[j]->configEpoch < senderConfigEpoch) { From bf670e0745100e1214e5d1ff74ff2e0d869a31ad Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 10 Feb 2014 18:33:34 +0100 Subject: [PATCH 0015/1648] Cluster: don't update slave's master if we don't know it. There is no way we can update the slave's node->slaveof pointer if we don't know the master (no node with such an ID in our tables). --- src/cluster.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/cluster.c b/src/cluster.c index 88c927d5c39..f9899fae530 100644 --- a/src/cluster.c +++ b/src/cluster.c @@ -1363,7 +1363,7 @@ int clusterProcessPacket(clusterLink *link) { } /* Master node changed for this slave? */ - if (sender->slaveof != master) { + if (master && sender->slaveof != master) { if (sender->slaveof) clusterNodeRemoveSlave(sender->slaveof,sender); clusterNodeAddSlave(master,sender); From 59e03a8f3521dddfab5dcbe99f20cf68e69791ad Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 10 Feb 2014 18:48:36 +0100 Subject: [PATCH 0016/1648] redis-trib: log event after we have reference to 'master'. --- src/redis-trib.rb | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/redis-trib.rb b/src/redis-trib.rb index ab497f190ae..e1956e4bc10 100755 --- a/src/redis-trib.rb +++ b/src/redis-trib.rb @@ -116,6 +116,7 @@ def assert_empty def load_info(o={}) self.connect nodes = @r.cluster("nodes").split("\n") + puts @r.cluster("nodes") nodes.each{|n| # name addr flags role ping_sent ping_recv link_status slots split = n.split @@ -900,8 +901,8 @@ def delnode_cluster_cmd(argv,opt) next if n == node if n.info[:replicate] && n.info[:replicate].downcase == id # Reconfigure the slave to replicate with some other node - xputs ">>> #{n} as replica of #{master}" master = get_master_with_least_replicas + xputs ">>> #{n} as replica of #{master}" n.r.cluster("replicate",master.info[:name]) end n.r.cluster("forget",argv[1]) From 1ae50a9b1dd0ac9cebb694a184b7da67543d1d60 Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 10 Feb 2014 19:10:21 +0100 Subject: [PATCH 0017/1648] Cluster: redis-trib fix: cover new case of open slot. The case is the trivial one a single node claiming the slot as migrating, without nodes claiming it as importing. --- src/redis-trib.rb | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/redis-trib.rb b/src/redis-trib.rb index e1956e4bc10..147d3132a3f 100755 --- a/src/redis-trib.rb +++ b/src/redis-trib.rb @@ -470,6 +470,9 @@ def fix_open_slot(slot) # importing state in 1 slot. That's trivial to address. if migrating.length == 1 && importing.length == 1 move_slot(migrating[0],importing[0],slot,:verbose=>true) + elsif migrating.length == 1 && importing.length == 0 + xputs ">>> Setting #{slot} as STABLE" + migrating[0].r.cluster("setslot",slot,"stable") else xputs "[ERR] Sorry, Redis-trib can't fix this slot yet (work in progress)" end From 3107e7ca60a126ba096e785b575dd0980e9ae29f Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 10 Feb 2014 19:14:05 +0100 Subject: [PATCH 0018/1648] Cluster: remove debugging xputs from redis-trib. --- src/redis-trib.rb | 1 - 1 file changed, 1 deletion(-) diff --git a/src/redis-trib.rb b/src/redis-trib.rb index 147d3132a3f..36bd05d977e 100755 --- a/src/redis-trib.rb +++ b/src/redis-trib.rb @@ -116,7 +116,6 @@ def assert_empty def load_info(o={}) self.connect nodes = @r.cluster("nodes").split("\n") - puts @r.cluster("nodes") nodes.each{|n| # name addr flags role ping_sent ping_recv link_status slots split = n.split From 218358bbbd6ed9b717db50868ca069a3731e08b2 Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 10 Feb 2014 23:48:42 +0100 Subject: [PATCH 0019/1648] Cluster: conditions to clear "migrating" on slot for SETSLOT ... NODE changed. If the slot is manually assigned to another node, clear the migrating status regardless of the fact it was previously assigned to us or not, as long as we no longer have keys for this slot. This avoid a race during slots migration that may leave the slot in migrating status in the source node, since it received an update message from the destination node that is already claiming the slot. This way we are sure that redis-trib at the end of the slot migration is always able to close the slot correctly. --- src/cluster.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/cluster.c b/src/cluster.c index f9899fae530..8296e1c388c 100644 --- a/src/cluster.c +++ b/src/cluster.c @@ -3154,10 +3154,10 @@ void clusterCommand(redisClient *c) { return; } } - /* If this node was the slot owner and the slot was marked as - * migrating, assigning the slot to another node will clear + /* If this slot is in migrating status but we have no keys + * for it assigning the slot to another node will clear * the migratig status. */ - if (server.cluster->slots[slot] == myself && + if (countKeysInSlot(slot) == 0 && server.cluster->migrating_slots_to[slot]) server.cluster->migrating_slots_to[slot] = NULL; From 6dc26795aa949d6cacad4f7c6ab2df0f18f670e4 Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 10 Feb 2014 23:54:08 +0100 Subject: [PATCH 0020/1648] Cluster: fsync at every SETSLOT command puts too pressure on disks. During slots migration redis-trib can send a number of SETSLOT commands. Fsyncing every time is a bit too much in production as verified empirically. To make sure configs are fsynced on all nodes after a resharding redis-trib may send something like CLUSTER CONFSYNC. In this case fsyncs were not providing too much value since anyway processes can crash in the middle of the resharding of an hash slot, and redis-trib should be able to recover from this condition anyway. --- src/cluster.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/cluster.c b/src/cluster.c index 8296e1c388c..57f502c7dbd 100644 --- a/src/cluster.c +++ b/src/cluster.c @@ -3172,8 +3172,10 @@ void clusterCommand(redisClient *c) { * at least to the value of the configEpoch of the old owner * so that its old replicas, or some of its old message pending * on the cluster bus, can't claim our slot. */ - if (old_owner->configEpoch > myself->configEpoch) + if (old_owner->configEpoch > myself->configEpoch) { myself->configEpoch = old_owner->configEpoch; + clusterDoBeforeSleep(CLUSTER_TODO_FSYNC_CONFIG); + } server.cluster->importing_slots_from[slot] = NULL; } clusterDelSlot(slot); @@ -3182,9 +3184,7 @@ void clusterCommand(redisClient *c) { addReplyError(c,"Invalid CLUSTER SETSLOT action or number of arguments"); return; } - clusterDoBeforeSleep(CLUSTER_TODO_SAVE_CONFIG| - CLUSTER_TODO_UPDATE_STATE| - CLUSTER_TODO_FSYNC_CONFIG); + clusterDoBeforeSleep(CLUSTER_TODO_SAVE_CONFIG|CLUSTER_TODO_UPDATE_STATE); addReply(c,shared.ok); } else if (!strcasecmp(c->argv[1]->ptr,"info") && c->argc == 2) { /* CLUSTER INFO */ From a1349728ea1281977e1dd2b82d65465165137390 Mon Sep 17 00:00:00 2001 From: antirez Date: Tue, 11 Feb 2014 00:32:39 +0100 Subject: [PATCH 0021/1648] Cluster: on resharding upgrade version of receiving node. The node receiving the hash slot needs to have a version that wins over the other versions in order to force the ownership of the slot. However the current code is far from perfect since a failover can happen during the manual resharding. The fix is a work in progress but the bottom line is that the new version must either be voted as usually, set by redis-trib manually after it makes sure can't be used by other nodes, or reserved configEpochs could be used for manual operations (for example odd versions could be never used by slaves and are always used by CLUSTER SETSLOT NODE). --- src/cluster.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/src/cluster.c b/src/cluster.c index 57f502c7dbd..452ddf57939 100644 --- a/src/cluster.c +++ b/src/cluster.c @@ -3169,11 +3169,15 @@ void clusterCommand(redisClient *c) { clusterNode *old_owner = server.cluster->importing_slots_from[slot]; /* This slot was manually migrated, set this node configEpoch - * at least to the value of the configEpoch of the old owner - * so that its old replicas, or some of its old message pending - * on the cluster bus, can't claim our slot. */ + * to a new epoch so that the new version can be propagated + * by the cluster. + * + * FIXME: the new version should be agreed otherwise a race + * is possible if while a manual resharding is in progress + * the master is failed over by a slave. */ if (old_owner->configEpoch > myself->configEpoch) { - myself->configEpoch = old_owner->configEpoch; + server.cluster->currentEpoch++; + myself->configEpoch = server.cluster->currentEpoch; clusterDoBeforeSleep(CLUSTER_TODO_FSYNC_CONFIG); } server.cluster->importing_slots_from[slot] = NULL; From 44f7afe28a6483c048155098d0606b968264b511 Mon Sep 17 00:00:00 2001 From: antirez Date: Tue, 11 Feb 2014 09:48:53 +0100 Subject: [PATCH 0022/1648] Cluster: always increment the configEpoch in SETNODE after import. Removed a stale conditional preventing the configEpoch from incrementing after the import in certain conditions. Since the master got a new slot it should always claim a new configuration. --- src/cluster.c | 10 +++------- 1 file changed, 3 insertions(+), 7 deletions(-) diff --git a/src/cluster.c b/src/cluster.c index 452ddf57939..febd1888d8f 100644 --- a/src/cluster.c +++ b/src/cluster.c @@ -3166,8 +3166,6 @@ void clusterCommand(redisClient *c) { if (n == myself && server.cluster->importing_slots_from[slot]) { - clusterNode *old_owner = - server.cluster->importing_slots_from[slot]; /* This slot was manually migrated, set this node configEpoch * to a new epoch so that the new version can be propagated * by the cluster. @@ -3175,11 +3173,9 @@ void clusterCommand(redisClient *c) { * FIXME: the new version should be agreed otherwise a race * is possible if while a manual resharding is in progress * the master is failed over by a slave. */ - if (old_owner->configEpoch > myself->configEpoch) { - server.cluster->currentEpoch++; - myself->configEpoch = server.cluster->currentEpoch; - clusterDoBeforeSleep(CLUSTER_TODO_FSYNC_CONFIG); - } + server.cluster->currentEpoch++; + myself->configEpoch = server.cluster->currentEpoch; + clusterDoBeforeSleep(CLUSTER_TODO_FSYNC_CONFIG); server.cluster->importing_slots_from[slot] = NULL; } clusterDelSlot(slot); From 72f7abf6a2e9a6fc2b6e710e049b48ba6c3cdfa6 Mon Sep 17 00:00:00 2001 From: antirez Date: Tue, 11 Feb 2014 10:00:11 +0100 Subject: [PATCH 0023/1648] Cluster: clusterSetStartupEpoch() made more generally useful. The actual goal of the function was to get the max configEpoch found in the cluster, so make it general by removing the assignment of the max epoch to currentEpoch that is useful only at startup. --- src/cluster.c | 17 +++++++++-------- 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/src/cluster.c b/src/cluster.c index febd1888d8f..aeedc17745b 100644 --- a/src/cluster.c +++ b/src/cluster.c @@ -73,21 +73,19 @@ void resetManualFailover(void); * Initialization * -------------------------------------------------------------------------- */ -/* This function is called at startup in order to set the currentEpoch - * (which is not saved on permanent storage) to the greatest configEpoch found - * in the loaded nodes (configEpoch is stored on permanent storage as soon as - * it changes for some node). */ -void clusterSetStartupEpoch() { +/* Return the greatest configEpoch found in the cluster. */ +uint64_t clusterGetMaxEpoch(void) { + uint64_t max = 0; dictIterator *di; dictEntry *de; di = dictGetSafeIterator(server.cluster->nodes); while((de = dictNext(di)) != NULL) { clusterNode *node = dictGetVal(de); - if (node->configEpoch > server.cluster->currentEpoch) - server.cluster->currentEpoch = node->configEpoch; + if (node->configEpoch > max) max = node->configEpoch; } dictReleaseIterator(di); + return max; } int clusterLoadConfig(char *filename) { @@ -227,7 +225,10 @@ int clusterLoadConfig(char *filename) { /* Config sanity check */ redisAssert(server.cluster->myself != NULL); redisLog(REDIS_NOTICE,"Node configuration loaded, I'm %.40s", myself->name); - clusterSetStartupEpoch(); + /* Set the currentEpoch to the max epoch found in the master. + * FIXME: this should actually be part of the persistent state, as + * documented in the Github issue #1479. */ + server.cluster->currentEpoch = clusterGetMaxEpoch(); return REDIS_OK; fmterr: From 4a64286c363267acff39b1f2ec9d4b47640871c0 Mon Sep 17 00:00:00 2001 From: antirez Date: Tue, 11 Feb 2014 10:06:10 +0100 Subject: [PATCH 0024/1648] Cluster: configEpoch assignment in SETNODE improved. Avoid to trash a configEpoch for every slot migrated if this node has already the max configEpoch across the cluster. Still work to do in this area but this avoids both ending with a very high configEpoch without any reason and to flood the system with fsyncs. --- src/cluster.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/src/cluster.c b/src/cluster.c index aeedc17745b..b6562341b3d 100644 --- a/src/cluster.c +++ b/src/cluster.c @@ -3174,9 +3174,13 @@ void clusterCommand(redisClient *c) { * FIXME: the new version should be agreed otherwise a race * is possible if while a manual resharding is in progress * the master is failed over by a slave. */ - server.cluster->currentEpoch++; - myself->configEpoch = server.cluster->currentEpoch; - clusterDoBeforeSleep(CLUSTER_TODO_FSYNC_CONFIG); + uint64_t maxEpoch = clusterGetMaxEpoch(); + + if (myself->configEpoch != maxEpoch) { + server.cluster->currentEpoch++; + myself->configEpoch = server.cluster->currentEpoch; + clusterDoBeforeSleep(CLUSTER_TODO_FSYNC_CONFIG); + } server.cluster->importing_slots_from[slot] = NULL; } clusterDelSlot(slot); From 8251d2d150e454b2783d394eb3f131423b5b89d0 Mon Sep 17 00:00:00 2001 From: antirez Date: Tue, 11 Feb 2014 10:13:18 +0100 Subject: [PATCH 0025/1648] Cluster: redis-trib fix: handling of another trivial case. --- src/redis-trib.rb | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/redis-trib.rb b/src/redis-trib.rb index 36bd05d977e..ec3c416bbc8 100755 --- a/src/redis-trib.rb +++ b/src/redis-trib.rb @@ -472,6 +472,9 @@ def fix_open_slot(slot) elsif migrating.length == 1 && importing.length == 0 xputs ">>> Setting #{slot} as STABLE" migrating[0].r.cluster("setslot",slot,"stable") + elsif migrating.length == 0 && importing.length == 1 + xputs ">>> Setting #{slot} as STABLE" + importing[0].r.cluster("setslot",slot,"stable") else xputs "[ERR] Sorry, Redis-trib can't fix this slot yet (work in progress)" end From 5e0e03be41e7b07455f9f502fbaaa3dbd9e2daaf Mon Sep 17 00:00:00 2001 From: antirez Date: Tue, 11 Feb 2014 10:18:24 +0100 Subject: [PATCH 0026/1648] Cluster: UPDATE messages are the norm and verbose. Logging them at WARNING level was of little utility and of sure disturb. --- src/cluster.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/cluster.c b/src/cluster.c index b6562341b3d..de632c64d82 100644 --- a/src/cluster.c +++ b/src/cluster.c @@ -1430,7 +1430,7 @@ int clusterProcessPacket(clusterLink *link) { if (server.cluster->slots[j]->configEpoch > senderConfigEpoch) { - redisLog(REDIS_WARNING, + redisLog(REDIS_VERBOSE, "Node %.40s has old slots configuration, sending " "an UPDATE message about %.40s", sender->name, server.cluster->slots[j]->name); From db6d628c3ee9f39a196026b5bb5ac47cb8551aef Mon Sep 17 00:00:00 2001 From: antirez Date: Tue, 11 Feb 2014 10:34:14 +0100 Subject: [PATCH 0027/1648] Cluster: clusterDelNode(): remove node from master's slaves. --- src/cluster.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/src/cluster.c b/src/cluster.c index de632c64d82..f47799aca8a 100644 --- a/src/cluster.c +++ b/src/cluster.c @@ -664,7 +664,11 @@ void clusterDelNode(clusterNode *delnode) { } dictReleaseIterator(di); - /* 3) Free the node, unlinking it from the cluster. */ + /* 3) Remove this node from its master's slaves if needed. */ + if (nodeIsSlave(delnode) && delnode->slaveof) + clusterNodeRemoveSlave(delnode->slaveof,delnode); + + /* 4) Free the node, unlinking it from the cluster. */ freeClusterNode(delnode); } From fe8352540fa5d6157648427b0651de9d5574e48d Mon Sep 17 00:00:00 2001 From: antirez Date: Wed, 12 Feb 2014 12:47:10 +0100 Subject: [PATCH 0028/1648] AOF: don't abort on write errors unless fsync is 'always'. A system similar to the RDB write error handling is used, in which when we can't write to the AOF file, writes are no longer accepted until we are able to write again. For fsync == always we still abort on errors since there is currently no easy way to avoid replying with success to the user otherwise, and this would violate the contract with the user of only acknowledging data already secured on disk. --- src/aof.c | 80 +++++++++++++++++++++++++++++++++++++++++++---------- src/redis.c | 33 ++++++++++++++++------ src/redis.h | 2 ++ 3 files changed, 91 insertions(+), 24 deletions(-) diff --git a/src/aof.c b/src/aof.c index d59e4061f98..dbe7bfa6d03 100644 --- a/src/aof.c +++ b/src/aof.c @@ -226,6 +226,7 @@ int startAppendOnly(void) { * * However if force is set to 1 we'll write regardless of the background * fsync. */ +#define AOF_WRITE_LOG_ERROR_RATE 30 /* Seconds between errors logging. */ void flushAppendOnlyFile(int force) { ssize_t nwritten; int sync_in_progress = 0; @@ -267,27 +268,76 @@ void flushAppendOnlyFile(int force) { * or alike */ nwritten = write(server.aof_fd,server.aof_buf,sdslen(server.aof_buf)); if (nwritten != (signed)sdslen(server.aof_buf)) { - /* Ooops, we are in troubles. The best thing to do for now is - * aborting instead of giving the illusion that everything is - * working as expected. */ + static time_t last_write_error_log = 0; + int can_log = 0; + + /* Limit logging rate to 1 line per AOF_WRITE_LOG_ERROR_RATE seconds. */ + if ((server.unixtime - last_write_error_log) > AOF_WRITE_LOG_ERROR_RATE) { + can_log = 1; + last_write_error_log = server.unixtime; + } + + /* Lof the AOF write error and record the error code. */ if (nwritten == -1) { - redisLog(REDIS_WARNING,"Exiting on error writing to the append-only file: %s",strerror(errno)); + if (can_log) { + redisLog(REDIS_WARNING,"Error writing to the AOF file: %s", + strerror(errno)); + server.aof_last_write_errno = errno; + } } else { - redisLog(REDIS_WARNING,"Exiting on short write while writing to " - "the append-only file: %s (nwritten=%ld, " - "expected=%ld)", - strerror(errno), - (long)nwritten, - (long)sdslen(server.aof_buf)); + if (can_log) { + redisLog(REDIS_WARNING,"Short write while writing to " + "the AOF file: (nwritten=%lld, " + "expected=%lld)", + (long long)nwritten, + (long long)sdslen(server.aof_buf)); + } if (ftruncate(server.aof_fd, server.aof_current_size) == -1) { - redisLog(REDIS_WARNING, "Could not remove short write " - "from the append-only file. Redis may refuse " - "to load the AOF the next time it starts. " - "ftruncate: %s", strerror(errno)); + if (can_log) { + redisLog(REDIS_WARNING, "Could not remove short write " + "from the append-only file. Redis may refuse " + "to load the AOF the next time it starts. " + "ftruncate: %s", strerror(errno)); + } + } else { + /* If the ftrunacate() succeeded we can set nwritten to + * -1 since there is no longer partial data into the AOF. */ + nwritten = -1; } + server.aof_last_write_errno = ENOSPC; + } + + /* Handle the AOF write error. */ + if (server.aof_fsync == AOF_FSYNC_ALWAYS) { + /* We can't recover when the fsync policy is ALWAYS since the + * reply for the client is already in the output buffers, and we + * have the contract with the user that on acknowledged write data + * is synched on disk. */ + redisLog(REDIS_WARNING,"Can't recover from AOF write error when the AOF fsync policy is 'always'. Exiting..."); + exit(1); + } else { + /* Recover from failed write leaving data into the buffer. However + * set an error to stop accepting writes as long as the error + * condition is not cleared. */ + server.aof_last_write_status = REDIS_ERR; + + /* Trim the sds buffer if there was a partial write, and there + * was no way to undo it with ftruncate(2). */ + if (nwritten > 0) { + server.aof_current_size += nwritten; + sdsrange(server.aof_buf,nwritten,-1); + } + return; /* We'll try again on the next call... */ + } + } else { + /* Successful write(2). If AOF was in error state, restore the + * OK state and log the event. */ + if (server.aof_last_write_status == REDIS_ERR) { + redisLog(REDIS_WARNING, + "AOF write error looks solved, Redis can write again."); + server.aof_last_write_status = REDIS_OK; } - exit(1); } server.aof_current_size += nwritten; diff --git a/src/redis.c b/src/redis.c index 059c43d47df..5299c69d09f 100644 --- a/src/redis.c +++ b/src/redis.c @@ -1167,9 +1167,13 @@ int serverCron(struct aeEventLoop *eventLoop, long long id, void *clientData) { } - /* If we postponed an AOF buffer flush, let's try to do it every time the - * cron function is called. */ - if (server.aof_flush_postponed_start) flushAppendOnlyFile(0); + /* AOF: we may have postponed buffer flush, or were not able to + * write our buffer because of write(2) error. Try again here. */ + if (server.aof_flush_postponed_start || + server.aof_last_write_status == REDIS_ERR) + { + flushAppendOnlyFile(0); + } /* Close clients that need to be closed asynchronous */ freeClientsInAsyncFreeQueue(); @@ -1670,6 +1674,8 @@ void initServer() { server.unixtime = time(NULL); server.mstime = mstime(); server.lastbgsave_status = REDIS_OK; + server.aof_last_write_status = REDIS_OK; + server.aof_last_write_errno = 0; server.repl_good_slaves_count = 0; /* Create the serverCron() time event, that's our main way to process @@ -2035,15 +2041,22 @@ int processCommand(redisClient *c) { /* Don't accept write commands if there are problems persisting on disk * and if this is a master instance. */ - if (server.stop_writes_on_bgsave_err && - server.saveparamslen > 0 - && server.lastbgsave_status == REDIS_ERR && + if (((server.stop_writes_on_bgsave_err && + server.saveparamslen > 0 && + server.lastbgsave_status == REDIS_ERR) || + server.aof_last_write_status == REDIS_ERR) && server.masterhost == NULL && (c->cmd->flags & REDIS_CMD_WRITE || c->cmd->proc == pingCommand)) { flagTransaction(c); - addReply(c, shared.bgsaveerr); + if (server.aof_last_write_status == REDIS_OK) + addReply(c, shared.bgsaveerr); + else + addReplySds(c, + sdscatprintf(sdsempty(), + "-MISCONF Errors writing to the AOF file: %s\r\n", + strerror(server.aof_last_write_errno))); return REDIS_OK; } @@ -2426,7 +2439,8 @@ sds genRedisInfoString(char *section) { "aof_rewrite_scheduled:%d\r\n" "aof_last_rewrite_time_sec:%jd\r\n" "aof_current_rewrite_time_sec:%jd\r\n" - "aof_last_bgrewrite_status:%s\r\n", + "aof_last_bgrewrite_status:%s\r\n" + "aof_last_write_status:%s\r\n", server.loading, server.dirty, server.rdb_child_pid != -1, @@ -2441,7 +2455,8 @@ sds genRedisInfoString(char *section) { (intmax_t)server.aof_rewrite_time_last, (intmax_t)((server.aof_child_pid == -1) ? -1 : time(NULL)-server.aof_rewrite_time_start), - (server.aof_lastbgrewrite_status == REDIS_OK) ? "ok" : "err"); + (server.aof_lastbgrewrite_status == REDIS_OK) ? "ok" : "err", + (server.aof_last_write_status == REDIS_OK) ? "ok" : "err"); if (server.aof_state != REDIS_AOF_OFF) { info = sdscatprintf(info, diff --git a/src/redis.h b/src/redis.h index 504d7947f66..d7a9eea0a54 100644 --- a/src/redis.h +++ b/src/redis.h @@ -691,6 +691,8 @@ struct redisServer { int aof_lastbgrewrite_status; /* REDIS_OK or REDIS_ERR */ unsigned long aof_delayed_fsync; /* delayed AOF fsync() counter */ int aof_rewrite_incremental_fsync;/* fsync incrementally while rewriting? */ + int aof_last_write_status; /* REDIS_OK or REDIS_ERR */ + int aof_last_write_errno; /* Valid if aof_last_write_status is ERR */ /* RDB persistence */ long long dirty; /* Changes to DB from the last save */ long long dirty_before_bgsave; /* Used to restore dirty on failed BGSAVE */ From fc08c8599f36761530b3ba40c31d4e8efb2e7113 Mon Sep 17 00:00:00 2001 From: antirez Date: Wed, 12 Feb 2014 16:27:59 +0100 Subject: [PATCH 0029/1648] AOF write error: retry with a frequency of 1 hz. --- src/redis.c | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-) diff --git a/src/redis.c b/src/redis.c index 5299c69d09f..1b61946e78b 100644 --- a/src/redis.c +++ b/src/redis.c @@ -1167,12 +1167,17 @@ int serverCron(struct aeEventLoop *eventLoop, long long id, void *clientData) { } - /* AOF: we may have postponed buffer flush, or were not able to - * write our buffer because of write(2) error. Try again here. */ - if (server.aof_flush_postponed_start || - server.aof_last_write_status == REDIS_ERR) - { - flushAppendOnlyFile(0); + /* AOF postponed flush: Try at every cron cycle if the slow fsync + * completed. */ + if (server.aof_flush_postponed_start) flushAppendOnlyFile(0); + + /* AOF write errors: in this case we have a buffer to flush as well and + * clear the AOF error in case of success to make the DB writable again, + * however to try every second is enough in case of 'hz' is set to + * an higher frequency. */ + run_with_period(1000) { + if (server.aof_last_write_status == REDIS_ERR) + flushAppendOnlyFile(0); } /* Close clients that need to be closed asynchronous */ From 21e6b0fbe9810a37844bbdf86e5aba655cfa5fe9 Mon Sep 17 00:00:00 2001 From: antirez Date: Thu, 13 Feb 2014 12:10:43 +0100 Subject: [PATCH 0030/1648] Fix script cache bug in the scripting engine. This commit fixes a serious Lua scripting replication issue, described by Github issue #1549. The root cause of the problem is that scripts were put inside the script cache, assuming that slaves and AOF already contained it, even if the scripts sometimes produced no changes in the data set, and were not actaully propagated to AOF/slaves. Example: eval "if tonumber(KEYS[1]) > 0 then redis.call('incr', 'x') end" 1 0 Then: evalsha 1 0 At this step sha1 of the script is added to the replication script cache (the script is marked as known to the slaves) and EVALSHA command is transformed to EVAL. However it is not dirty (there is no changes to db), so it is not propagated to the slaves. Then the script is called again: evalsha 1 1 At this step master checks that the script already exists in the replication script cache and doesn't transform it to EVAL command. It is dirty and propagated to the slaves, but they fail to evaluate the script as they don't have it in the script cache. The fix is trivial and just uses the new API to force the propagation of the executed command regardless of the dirty state of the data set. Thank you to @minus-infinity on Github for finding the issue, understanding the root cause, and fixing it. --- src/scripting.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/scripting.c b/src/scripting.c index d88078967bf..5fe0b60a319 100644 --- a/src/scripting.c +++ b/src/scripting.c @@ -958,6 +958,7 @@ void evalGenericCommand(redisClient *c, int evalsha) { rewriteClientCommandArgument(c,0, resetRefCount(createStringObject("EVAL",4))); rewriteClientCommandArgument(c,1,script); + forceCommandPropagation(c,REDIS_PROPAGATE_REPL|REDIS_PROPAGATE_AOF); } } } From f2bdf601bee92972bad7493950a72391b9278606 Mon Sep 17 00:00:00 2001 From: antirez Date: Thu, 13 Feb 2014 12:25:44 +0100 Subject: [PATCH 0031/1648] Test: regression for issue #1549. It was verified that reverting the commit that fixes the bug, the test no longer passes. --- tests/unit/scripting.tcl | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/tests/unit/scripting.tcl b/tests/unit/scripting.tcl index ec5230bfe3e..a17892d3013 100644 --- a/tests/unit/scripting.tcl +++ b/tests/unit/scripting.tcl @@ -417,5 +417,17 @@ start_server {tags {"scripting repl"}} { } set res } {a 1} + + test {EVALSHA replication when first call is readonly} { + r del x + r eval {if tonumber(KEYS[1]) > 0 then redis.call('incr', 'x') end} 1 0 + r evalsha 38fe3ddf5284a1d48f37f824b4c4e826879f3cb9 1 0 + r evalsha 38fe3ddf5284a1d48f37f824b4c4e826879f3cb9 1 1 + wait_for_condition 50 100 { + [r -1 get x] eq {1} + } else { + fail "Expected 1 in x, but value is '[r -1 get x]'" + } + } } } From 7e8abcf6932120b5f9c8a7dadafbd5a8045c2bb6 Mon Sep 17 00:00:00 2001 From: antirez Date: Thu, 13 Feb 2014 14:32:44 +0100 Subject: [PATCH 0032/1648] Log when CONFIG REWRITE goes bad. --- src/config.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/config.c b/src/config.c index a68e4f3e7eb..3e6b1561e7b 100644 --- a/src/config.c +++ b/src/config.c @@ -1824,6 +1824,7 @@ void configCommand(redisClient *c) { return; } if (rewriteConfig(server.configfile) == -1) { + redisLog(REDIS_WARNING,"CONFIG REWRITE failed: %s", strerror(errno)); addReplyErrorFormat(c,"Rewriting config file: %s", strerror(errno)); } else { addReply(c,shared.ok); From 51bd9da1fdf3d4bb0d4f81234f33003c16db9a51 Mon Sep 17 00:00:00 2001 From: antirez Date: Thu, 13 Feb 2014 15:09:41 +0100 Subject: [PATCH 0033/1648] Update cached time in rdbLoad() callback. server.unixtime and server.mstime are cached less precise timestamps that we use every time we don't need an accurate time representation and a syscall would be too slow for the number of calls we require. Such an example is the initialization and update process of the last interaction time with the client, that is used for timeouts. However rdbLoad() can take some time to load the DB, but at the same time it did not updated the time during DB loading. This resulted in the bug described in issue #1535, where in the replication process the slave loads the DB, creates the redisClient representation of its master, but the timestamp is so old that the master, under certain conditions, is sensed as already "timed out". Thanks to @yoav-steinberg and Redis Labs Inc for the bug report and analysis. --- src/rdb.c | 4 ++++ src/redis.c | 20 ++++++++++++-------- src/redis.h | 1 + 3 files changed, 17 insertions(+), 8 deletions(-) diff --git a/src/rdb.c b/src/rdb.c index 60dd7113e83..1a472dbc7cf 100644 --- a/src/rdb.c +++ b/src/rdb.c @@ -1065,6 +1065,10 @@ void rdbLoadProgressCallback(rio *r, const void *buf, size_t len) { if (server.loading_process_events_interval_bytes && (r->processed_bytes + len)/server.loading_process_events_interval_bytes > r->processed_bytes/server.loading_process_events_interval_bytes) { + /* The DB can take some non trivial amount of time to load. Update + * our cached time since it is used to create and update the last + * interaction time with clients and for other important things. */ + updateCachedTime(); if (server.masterhost && server.repl_state == REDIS_REPL_TRANSFER) replicationSendNewlineToMaster(); loadingProgress(r->processed_bytes); diff --git a/src/redis.c b/src/redis.c index 1b61946e78b..e50dc607268 100644 --- a/src/redis.c +++ b/src/redis.c @@ -1001,6 +1001,15 @@ void databasesCron(void) { } } +/* We take a cached value of the unix time in the global state because with + * virtual memory and aging there is to store the current time in objects at + * every object access, and accuracy is not needed. To access a global var is + * a lot faster than calling time(NULL) */ +void updateCachedTime(void) { + server.unixtime = time(NULL); + server.mstime = mstime(); +} + /* This is our timer interrupt, called server.hz times per second. * Here is where we do a number of things that need to be done asynchronously. * For instance: @@ -1030,12 +1039,8 @@ int serverCron(struct aeEventLoop *eventLoop, long long id, void *clientData) { * handler if we don't return here fast enough. */ if (server.watchdog_period) watchdogScheduleSignal(server.watchdog_period); - /* We take a cached value of the unix time in the global state because - * with virtual memory and aging there is to store the current time - * in objects at every object access, and accuracy is not needed. - * To access a global var is faster than calling time(NULL) */ - server.unixtime = time(NULL); - server.mstime = mstime(); + /* Update the time cache. */ + updateCachedTime(); run_with_period(100) trackOperationsPerSecond(); @@ -1676,12 +1681,11 @@ void initServer() { server.ops_sec_idx = 0; server.ops_sec_last_sample_time = mstime(); server.ops_sec_last_sample_ops = 0; - server.unixtime = time(NULL); - server.mstime = mstime(); server.lastbgsave_status = REDIS_OK; server.aof_last_write_status = REDIS_OK; server.aof_last_write_errno = 0; server.repl_good_slaves_count = 0; + updateCachedTime(); /* Create the serverCron() time event, that's our main way to process * background operations. */ diff --git a/src/redis.h b/src/redis.h index d7a9eea0a54..618dfac439b 100644 --- a/src/redis.h +++ b/src/redis.h @@ -1154,6 +1154,7 @@ void populateCommandTable(void); void resetCommandTableStats(void); void adjustOpenFilesLimit(void); void closeListeningSockets(int unlink_unix_socket); +void updateCachedTime(); /* Set data type */ robj *setTypeCreate(robj *value); From e1b77b61f320077dd41d8ea2a7fd74561084d75f Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 17 Feb 2014 12:10:12 +0100 Subject: [PATCH 0034/1648] Sentinel: better specify startup errors due to config file. Now it logs the file name if it is not accessible. Also there is a different error for the missing config file case, and for the non writable file case. --- src/sentinel.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/src/sentinel.c b/src/sentinel.c index fb8b220a8b8..c25e01b18c5 100644 --- a/src/sentinel.c +++ b/src/sentinel.c @@ -417,8 +417,13 @@ void initSentinel(void) { void sentinelIsRunning(void) { redisLog(REDIS_WARNING,"Sentinel runid is %s", server.runid); - if (server.configfile == NULL || access(server.configfile,W_OK) == -1) { - redisLog(REDIS_WARNING,"Sentinel started without a config file, or config file not writable. Exiting..."); + if (server.configfile == NULL) { + redisLog(REDIS_WARNING, + "Sentinel started without a config file. Exiting..."); + } else if (access(server.configfile,W_OK) == -1) { + redisLog(REDIS_WARNING, + "Sentinel config file %s is not writable: %s. Exiting...", + server.configfile,strerror(errno)); exit(1); } } From ede33fb91259000950db667b1e92dd6e398a01f7 Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 17 Feb 2014 12:14:19 +0100 Subject: [PATCH 0035/1648] Get absoulte config file path before processig 'dir'. The code tried to obtain the configuration file absolute path after processing the configuration file. However if config file was a relative path and a "dir" statement was processed reading the config, the absolute path obtained was wrong. With this fix the absolute path is obtained before processing the configuration while the server is still in the original directory where it was executed. --- src/redis.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/redis.c b/src/redis.c index e50dc607268..0d1686a02a6 100644 --- a/src/redis.c +++ b/src/redis.c @@ -3153,10 +3153,10 @@ int main(int argc, char **argv) { } j++; } + if (configfile) server.configfile = getAbsolutePath(configfile); resetServerSaveParams(); loadServerConfig(configfile,options); sdsfree(options); - if (configfile) server.configfile = getAbsolutePath(configfile); } else { redisLog(REDIS_WARNING, "Warning: no config file specified, using the default config. In order to specify a config file use %s /path/to/%s.conf", argv[0], server.sentinel_mode ? "sentinel" : "redis"); } From a1dca2efab3e5b18054072e487f6f54814fd8992 Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 17 Feb 2014 12:29:54 +0100 Subject: [PATCH 0036/1648] Test: code to test server availability refactored. Some inline test moved into server_is_up procedure. Also find_available_port was moved into util since it is going to be used for the Sentinel test as well. --- tests/support/server.tcl | 30 ++++++++++++++++++------------ tests/support/util.tcl | 15 +++++++++++++++ tests/test_helper.tcl | 15 --------------- 3 files changed, 33 insertions(+), 27 deletions(-) diff --git a/tests/support/server.tcl b/tests/support/server.tcl index e10c350ff69..b53abb2ef18 100644 --- a/tests/support/server.tcl +++ b/tests/support/server.tcl @@ -79,7 +79,7 @@ proc is_alive config { proc ping_server {host port} { set retval 0 if {[catch { - set fd [socket $::host $::port] + set fd [socket $host $port] fconfigure $fd -translation binary puts $fd "PING\r\n" flush $fd @@ -101,6 +101,22 @@ proc ping_server {host port} { return $retval } +# Return 1 if the server at the specified addr is reachable by PING, otherwise +# returns 0. Performs a try every 50 milliseconds for the specified number +# of retries. +proc server_is_up {host port retrynum} { + after 10 ;# Use a small delay to make likely a first-try success. + set retval 0 + while {[incr retrynum -1]} { + if {[catch {ping_server $host $port} ping]} { + set ping 0 + } + if {$ping} {return 1} + after 50 + } + return 0 +} + # doesn't really belong here, but highly coupled to code in start_server proc tags {tags code} { set ::tags [concat $::tags $tags] @@ -191,23 +207,13 @@ proc start_server {options {code undefined}} { # check that the server actually started # ugly but tries to be as fast as possible... if {$::valgrind} {set retrynum 1000} else {set retrynum 100} - set serverisup 0 if {$::verbose} { puts -nonewline "=== ($tags) Starting server ${::host}:${::port} " } - after 10 if {$code ne "undefined"} { - while {[incr retrynum -1]} { - catch { - if {[ping_server $::host $::port]} { - set serverisup 1 - } - } - if {$serverisup} break - after 50 - } + set serverisup [server_is_up $::host $::port $retrynum] } else { set serverisup 1 } diff --git a/tests/support/util.tcl b/tests/support/util.tcl index c5a6853b39b..3804f253b2d 100644 --- a/tests/support/util.tcl +++ b/tests/support/util.tcl @@ -312,3 +312,18 @@ proc csvstring s { proc roundFloat f { format "%.10g" $f } + +proc find_available_port start { + for {set j $start} {$j < $start+1024} {incr j} { + if {[catch { + set fd [socket 127.0.0.1 $j] + }]} { + return $j + } else { + close $fd + } + } + if {$j == $start+1024} { + error "Can't find a non busy port in the $start-[expr {$start+1023}] range." + } +} diff --git a/tests/test_helper.tcl b/tests/test_helper.tcl index a0badb87904..062c318e528 100644 --- a/tests/test_helper.tcl +++ b/tests/test_helper.tcl @@ -164,21 +164,6 @@ proc cleanup {} { if {!$::quiet} {puts "OK"} } -proc find_available_port start { - for {set j $start} {$j < $start+1024} {incr j} { - if {[catch { - set fd [socket 127.0.0.1 $j] - }]} { - return $j - } else { - close $fd - } - } - if {$j == $start+1024} { - error "Can't find a non busy port in the $start-[expr {$start+1023}] range." - } -} - proc test_server_main {} { cleanup set tclsh [info nameofexecutable] From 34c404e069644336325fb568860cb8248d054089 Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 17 Feb 2014 17:36:50 +0100 Subject: [PATCH 0037/1648] Test: colorstr moved to util.tcl. --- tests/support/test.tcl | 30 ------------------------------ tests/support/util.tcl | 30 ++++++++++++++++++++++++++++++ 2 files changed, 30 insertions(+), 30 deletions(-) diff --git a/tests/support/test.tcl b/tests/support/test.tcl index 480c674e05b..96b529d7a5d 100644 --- a/tests/support/test.tcl +++ b/tests/support/test.tcl @@ -61,36 +61,6 @@ proc wait_for_condition {maxtries delay e _else_ elsescript} { } } -# Test if TERM looks like to support colors -proc color_term {} { - expr {[info exists ::env(TERM)] && [string match *xterm* $::env(TERM)]} -} - -proc colorstr {color str} { - if {[color_term]} { - set b 0 - if {[string range $color 0 4] eq {bold-}} { - set b 1 - set color [string range $color 5 end] - } - switch $color { - red {set colorcode {31}} - green {set colorcode {32}} - yellow {set colorcode {33}} - blue {set colorcode {34}} - magenta {set colorcode {35}} - cyan {set colorcode {36}} - white {set colorcode {37}} - default {set colorcode {37}} - } - if {$colorcode ne {}} { - return "\033\[$b;${colorcode};40m$str\033\[0m" - } - } else { - return $str - } -} - proc test {name code {okpattern undefined}} { # abort if tagged with a tag to deny foreach tag $::denytags { diff --git a/tests/support/util.tcl b/tests/support/util.tcl index 3804f253b2d..e49ea229cf6 100644 --- a/tests/support/util.tcl +++ b/tests/support/util.tcl @@ -327,3 +327,33 @@ proc find_available_port start { error "Can't find a non busy port in the $start-[expr {$start+1023}] range." } } + +# Test if TERM looks like to support colors +proc color_term {} { + expr {[info exists ::env(TERM)] && [string match *xterm* $::env(TERM)]} +} + +proc colorstr {color str} { + if {[color_term]} { + set b 0 + if {[string range $color 0 4] eq {bold-}} { + set b 1 + set color [string range $color 5 end] + } + switch $color { + red {set colorcode {31}} + green {set colorcode {32}} + yellow {set colorcode {33}} + blue {set colorcode {34}} + magenta {set colorcode {35}} + cyan {set colorcode {36}} + white {set colorcode {37}} + default {set colorcode {37}} + } + if {$colorcode ne {}} { + return "\033\[$b;${colorcode};40m$str\033\[0m" + } + } else { + return $str + } +} From af788b585268deeb0f58767e8832add87d12fec1 Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 17 Feb 2014 17:37:56 +0100 Subject: [PATCH 0038/1648] Sentinel: initial testing framework. Nothing tested at all so far... Just the infrastructure spawning N Sentinels and N Redis instances that the test will use again and again. --- tests/sentinel-tests/base.tcl | 6 ++ tests/sentinel.tcl | 121 ++++++++++++++++++++++++++++++++++ 2 files changed, 127 insertions(+) create mode 100644 tests/sentinel-tests/base.tcl create mode 100644 tests/sentinel.tcl diff --git a/tests/sentinel-tests/base.tcl b/tests/sentinel-tests/base.tcl new file mode 100644 index 00000000000..95626b8b04f --- /dev/null +++ b/tests/sentinel-tests/base.tcl @@ -0,0 +1,6 @@ +test "test OK" { +} + +test "test failed" { + assert {"foo" eq "bar"} +} diff --git a/tests/sentinel.tcl b/tests/sentinel.tcl new file mode 100644 index 00000000000..61cb282873d --- /dev/null +++ b/tests/sentinel.tcl @@ -0,0 +1,121 @@ +# Sentinel test suite. Copyright (C) 2014 Salvatore Sanfilippo antirez@gmail.com +# This softare is released under the BSD License. See the COPYING file for +# more information. + +package require Tcl 8.5 + +set tcl_precision 17 +source tests/support/redis.tcl +source tests/support/util.tcl +source tests/support/server.tcl +source tests/support/test.tcl + +set ::verbose 0 +set ::sentinel_instances {} +set ::redis_instances {} +set ::sentinel_base_port 20000 +set ::redis_base_port 30000 +set ::instances_count 5 ; # How many Sentinels / Instances we use at max +set ::pids {} ; # We kill everything at exit +set ::dirs {} ; # We remove all the temp dirs at exit + +if {[catch {cd tests/sentinel-tmp}]} { + puts "tests/sentinel-tmp directory not found." + puts "Please run this test from the Redis source root." + exit 1 +} + +# Spawn a redis or sentinel instance, depending on 'type'. +proc spawn_instance {type base_port count} { + for {set j 0} {$j < $count} {incr j} { + set port [find_available_port $base_port] + incr base_port + puts "Starting $type #$j at port $port" + + # Create a directory for this Sentinel. + set dirname "${type}_${j}" + lappend ::dirs $dirname + catch {exec rm -rf $dirname} + file mkdir $dirname + + # Write the Sentinel config file. + set cfgfile [file join $dirname $type.conf] + set cfg [open $cfgfile w] + puts $cfg "port $port" + puts $cfg "dir ./$dirname" + puts $cfg "logfile log.txt" + close $cfg + + # Finally exec it and remember the pid for later cleanup. + set sentinel_pid [exec ../../src/redis-sentinel $cfgfile &] + lappend ::pids $sentinel_pid + + # Check availability + if {[server_is_up 127.0.0.1 $port 100] == 0} { + abort_sentinel_test "Problems starting $type #$j: ping timeout" + } + + # Push the instance into the right list + lappend ${type}_instances [list \ + host 127.0.0.1 \ + port $port \ + [redis 127.0.0.1 $port] \ + ] + } +} + +proc cleanup {} { + puts "Cleaning up..." + foreach pid $::pids { + catch {exec kill -9 $pid} + } + foreach dir $::dirs { + catch {exec rm -rf $dir} + } +} + +proc abort_sentinel_test msg { + puts "WARNING: Aborting the test." + puts ">>>>>>>> $msg" + cleanup + exit 1 +} + +proc main {} { + spawn_instance sentinel $::sentinel_base_port $::instances_count + spawn_instance redis $::redis_base_port $::instances_count + run_tests + cleanup +} + +# We redefine 'test' as for Sentinel we don't use the server-client +# architecture for the test, everything is sequential. +proc test {descr code} { + puts -nonewline "> $descr: " + flush stdout + + if {[catch {set retval [uplevel 1 $code]} error]} { + if {[string match "assertion:*" $error]} { + set msg [string range $error 10 end] + puts [colorstr red $msg] + } else { + # Re-raise, let handler up the stack take care of this. + error $error $::errorInfo + } + } else { + puts [colorstr green OK] + } +} + +proc run_tests {} { + set tests [lsort [glob ../sentinel-tests/*]] + foreach test $tests { + puts [colorstr green "### [lindex [file split $test] end]"] + source $test + } +} + +if {[catch main e]} { + puts $::errorInfo + cleanup +} From 18b8bad53cf380471ec28ec097fd38960a246077 Mon Sep 17 00:00:00 2001 From: antirez Date: Tue, 18 Feb 2014 08:50:57 +0100 Subject: [PATCH 0039/1648] Sentinel: fix slave promotion timeout. If we can't reconfigure a slave in time during failover, go forward as anyway the slave will be fixed by Sentinels in the future, once they detect it is misconfigured. Otherwise a failover in progress may never terminate if for some reason the slave is uncapable to sync with the master while at the same time it is not disconnected. --- src/sentinel.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/src/sentinel.c b/src/sentinel.c index c25e01b18c5..7713cce7596 100644 --- a/src/sentinel.c +++ b/src/sentinel.c @@ -3364,14 +3364,17 @@ void sentinelFailoverReconfNextSlave(sentinelRedisInstance *master) { /* Skip the promoted slave, and already configured slaves. */ if (slave->flags & (SRI_PROMOTED|SRI_RECONF_DONE)) continue; - /* Clear the SRI_RECONF_SENT flag if too much time elapsed without - * the slave moving forward to the next state. */ + /* If too much time elapsed without the slave moving forward to + * the next state, consider it reconfigured even if it is not. + * Sentinels will detect the slave as misconfigured and fix its + * configuration later. */ if ((slave->flags & SRI_RECONF_SENT) && (mstime() - slave->slave_reconf_sent_time) > SENTINEL_SLAVE_RECONF_RETRY_PERIOD) { sentinelEvent(REDIS_NOTICE,"-slave-reconf-sent-timeout",slave,"%@"); slave->flags &= ~SRI_RECONF_SENT; + slave->flags |= SRI_RECONF_DONE; } /* Nothing to do for instances that are disconnected or already From 7cec9e48cea8caf3d03685f017ca51510ead7f7c Mon Sep 17 00:00:00 2001 From: antirez Date: Tue, 18 Feb 2014 10:27:38 +0100 Subject: [PATCH 0040/1648] Sentinel: SENTINEL_SLAVE_RECONF_RETRY_PERIOD -> RECONF_TIMEOUT Rename define to match the new meaning. --- src/sentinel.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/sentinel.c b/src/sentinel.c index 7713cce7596..9fbd43ded04 100644 --- a/src/sentinel.c +++ b/src/sentinel.c @@ -78,7 +78,7 @@ typedef struct sentinelAddr { #define SENTINEL_TILT_TRIGGER 2000 #define SENTINEL_TILT_PERIOD (SENTINEL_PING_PERIOD*30) #define SENTINEL_DEFAULT_SLAVE_PRIORITY 100 -#define SENTINEL_SLAVE_RECONF_RETRY_PERIOD 10000 +#define SENTINEL_SLAVE_RECONF_TIMEOUT 10000 #define SENTINEL_DEFAULT_PARALLEL_SYNCS 1 #define SENTINEL_MIN_LINK_RECONNECT_PERIOD 15000 #define SENTINEL_DEFAULT_FAILOVER_TIMEOUT (60*3*1000) @@ -3370,7 +3370,7 @@ void sentinelFailoverReconfNextSlave(sentinelRedisInstance *master) { * configuration later. */ if ((slave->flags & SRI_RECONF_SENT) && (mstime() - slave->slave_reconf_sent_time) > - SENTINEL_SLAVE_RECONF_RETRY_PERIOD) + SENTINEL_SLAVE_RECONF_TIMEOUT) { sentinelEvent(REDIS_NOTICE,"-slave-reconf-sent-timeout",slave,"%@"); slave->flags &= ~SRI_RECONF_SENT; From 141bac4c79917675c8b1dd0692413f520ed3b474 Mon Sep 17 00:00:00 2001 From: antirez Date: Tue, 18 Feb 2014 11:04:01 +0100 Subject: [PATCH 0041/1648] Sentinel test: provide basic commands to access instances. --- tests/sentinel-tests/base.tcl | 9 ++++----- tests/sentinel.tcl | 37 +++++++++++++++++++++++++++++++++-- 2 files changed, 39 insertions(+), 7 deletions(-) diff --git a/tests/sentinel-tests/base.tcl b/tests/sentinel-tests/base.tcl index 95626b8b04f..c148e416fc6 100644 --- a/tests/sentinel-tests/base.tcl +++ b/tests/sentinel-tests/base.tcl @@ -1,6 +1,5 @@ -test "test OK" { -} - -test "test failed" { - assert {"foo" eq "bar"} +test "Sentinels aren't monitoring any master" { + foreach_sentinel_id id { + assert {[S $id sentinel masters] eq {}} + } } diff --git a/tests/sentinel.tcl b/tests/sentinel.tcl index 61cb282873d..d8efff6b799 100644 --- a/tests/sentinel.tcl +++ b/tests/sentinel.tcl @@ -56,10 +56,11 @@ proc spawn_instance {type base_port count} { } # Push the instance into the right list - lappend ${type}_instances [list \ + lappend ::${type}_instances [list \ + pid $sentinel_pid \ host 127.0.0.1 \ port $port \ - [redis 127.0.0.1 $port] \ + link [redis 127.0.0.1 $port] \ ] } } @@ -115,6 +116,38 @@ proc run_tests {} { } } +# The "S" command is used to interact with the N-th Sentinel. +# The general form is: +# +# S command arg arg arg ... +# +# Example to ping the Sentinel 0 (first instance): S 0 PING +proc S {n args} { + set s [lindex $::sentinel_instances $n] + [dict get $s link] {*}$args +} + +# Like R but to chat with Redis instances. +proc R {n args} { + set r [lindex $::redis_instances $n] + [dict get $r link] {*}$args +} + +# Iterate over IDs of sentinel or redis instances. +proc foreach_sentinel_id {idvar code} { + upvar 1 $idvar id + for {set id 0} {$id < [llength $::sentinel_instances]} {incr id} { + uplevel 1 $code + } +} + +proc foreach_redis_id {idvar code} { + upvar 1 $idvar id + for {set id 0} {$id < [llength $::redis_instances]} {incr id} { + uplevel 1 $code + } +} + if {[catch main e]} { puts $::errorInfo cleanup From 19b863c7fadbfd9c0c11340a75337c86234ea104 Mon Sep 17 00:00:00 2001 From: antirez Date: Tue, 18 Feb 2014 11:07:25 +0100 Subject: [PATCH 0042/1648] Prefix test file names with numbers to force exec order. --- tests/sentinel-tests/{base.tcl => 00-base.tcl} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename tests/sentinel-tests/{base.tcl => 00-base.tcl} (100%) diff --git a/tests/sentinel-tests/base.tcl b/tests/sentinel-tests/00-base.tcl similarity index 100% rename from tests/sentinel-tests/base.tcl rename to tests/sentinel-tests/00-base.tcl From c4fbc1d3368734838fd3004dc0b38a8c862e080d Mon Sep 17 00:00:00 2001 From: antirez Date: Tue, 18 Feb 2014 11:38:49 +0100 Subject: [PATCH 0043/1648] Sentinel test: info fields, master-slave setup, fixes. --- tests/sentinel-tests/00-base.tcl | 4 +++ tests/sentinel.tcl | 57 +++++++++++++++++++++++++++++++- 2 files changed, 60 insertions(+), 1 deletion(-) diff --git a/tests/sentinel-tests/00-base.tcl b/tests/sentinel-tests/00-base.tcl index c148e416fc6..2eb5de34b35 100644 --- a/tests/sentinel-tests/00-base.tcl +++ b/tests/sentinel-tests/00-base.tcl @@ -3,3 +3,7 @@ test "Sentinels aren't monitoring any master" { assert {[S $id sentinel masters] eq {}} } } + +test "Sentinels can start monitoring a master" { + create_redis_master_slave_cluster 3 +} diff --git a/tests/sentinel.tcl b/tests/sentinel.tcl index d8efff6b799..38243abde15 100644 --- a/tests/sentinel.tcl +++ b/tests/sentinel.tcl @@ -47,7 +47,12 @@ proc spawn_instance {type base_port count} { close $cfg # Finally exec it and remember the pid for later cleanup. - set sentinel_pid [exec ../../src/redis-sentinel $cfgfile &] + if {$type eq "redis"} { + set prgname redis-server + } else { + set prgname redis-sentinel + } + set sentinel_pid [exec ../../src/${prgname} $cfgfile &] lappend ::pids $sentinel_pid # Check availability @@ -133,6 +138,26 @@ proc R {n args} { [dict get $r link] {*}$args } +proc get_info_field {info field} { + set fl [string length $field] + append field : + foreach line [split $info "\n"] { + set line [string trim $line "\r\n "] + if {[string range $line 0 $fl] eq $field} { + return [string range $line [expr {$fl+1}] end] + } + } + return {} +} + +proc SI {n field} { + get_info_field [S $n info] $field +} + +proc RI {n field} { + get_info_field [R $n info] $field +} + # Iterate over IDs of sentinel or redis instances. proc foreach_sentinel_id {idvar code} { upvar 1 $idvar id @@ -148,6 +173,36 @@ proc foreach_redis_id {idvar code} { } } +# Get the specific attribute of the specified instance type, id. +proc get_instance_attrib {type id attrib} { + dict get [lindex [set ::${type}_instances] $id] $attrib +} + +# Create a master-slave cluster of the given number of total instances. +# The first instance "0" is the master, all others are configured as +# slaves. +proc create_redis_master_slave_cluster n { + foreach_redis_id id { + if {$id == 0} { + # Our master. + R $id flushall + R $id slaveof no one + } elseif {$id < $n} { + R $id slaveof [get_instance_attrib redis 0 host] \ + [get_instance_attrib redis 0 port] + } else { + # Instances not part of the cluster. + R $id slaveof no one + } + } + # Wait for all the slaves to sync. + wait_for_condition 100 50 { + [RI 0 connected_slaves] == ($n-1) + } else { + fail "Unable to create a master-slaves cluster." + } +} + if {[catch main e]} { puts $::errorInfo cleanup From c7b7439528a9eaf693664fc5c543b19438870e01 Mon Sep 17 00:00:00 2001 From: antirez Date: Tue, 18 Feb 2014 11:53:54 +0100 Subject: [PATCH 0044/1648] Sentinel test: basic tests for MONITOR and auto-discovery. --- tests/sentinel-tests/00-base.tcl | 25 ++++++++++++++++++++++++- 1 file changed, 24 insertions(+), 1 deletion(-) diff --git a/tests/sentinel-tests/00-base.tcl b/tests/sentinel-tests/00-base.tcl index 2eb5de34b35..2bb80e75334 100644 --- a/tests/sentinel-tests/00-base.tcl +++ b/tests/sentinel-tests/00-base.tcl @@ -4,6 +4,29 @@ test "Sentinels aren't monitoring any master" { } } -test "Sentinels can start monitoring a master" { +test "Create a master-slaves cluster of 3 instances" { create_redis_master_slave_cluster 3 } + +test "Sentinels can start monitoring a master" { + set sentinels [llength $::sentinel_instances] + set quorum [expr {$sentinels/2+1}] + foreach_sentinel_id id { + S $id SENTINEL MONITOR mymaster [get_instance_attrib redis 0 host] \ + [get_instance_attrib redis 0 port] $quorum + } + foreach_sentinel_id id { + assert {[S $id sentinel master mymaster] ne {}} + } +} + +test "Sentinels are able to auto-discover other sentinels" { + set sentinels [llength $::sentinel_instances] + foreach_sentinel_id id { + wait_for_condition 100 50 { + [dict get [S $id SENTINEL MASTER mymaster] num-other-sentinels] == ($sentinels-1) + } else { + fail "At least some sentinel can't detect some other sentinel" + } + } +} From 8e553ec67cdc1d2720b2c37d5ccbacf1924bd3a9 Mon Sep 17 00:00:00 2001 From: antirez Date: Tue, 18 Feb 2014 16:31:52 +0100 Subject: [PATCH 0045/1648] Sentinel test: basic failover tested. Framework improvements. --- tests/sentinel-tests/00-base.tcl | 50 +++++++++++++++++++++++++++++--- tests/sentinel.tcl | 33 ++++++++++++++++----- 2 files changed, 71 insertions(+), 12 deletions(-) diff --git a/tests/sentinel-tests/00-base.tcl b/tests/sentinel-tests/00-base.tcl index 2bb80e75334..0b70cc26eb4 100644 --- a/tests/sentinel-tests/00-base.tcl +++ b/tests/sentinel-tests/00-base.tcl @@ -4,16 +4,19 @@ test "Sentinels aren't monitoring any master" { } } -test "Create a master-slaves cluster of 3 instances" { - create_redis_master_slave_cluster 3 +set redis_slaves 4 +test "Create a master-slaves cluster of [expr $redis_slaves+1] instances" { + create_redis_master_slave_cluster [expr {$redis_slaves+1}] } +set master_id 0 test "Sentinels can start monitoring a master" { set sentinels [llength $::sentinel_instances] set quorum [expr {$sentinels/2+1}] foreach_sentinel_id id { - S $id SENTINEL MONITOR mymaster [get_instance_attrib redis 0 host] \ - [get_instance_attrib redis 0 port] $quorum + S $id SENTINEL MONITOR mymaster \ + [get_instance_attrib redis $master_id host] \ + [get_instance_attrib redis $master_id port] $quorum } foreach_sentinel_id id { assert {[S $id sentinel master mymaster] ne {}} @@ -30,3 +33,42 @@ test "Sentinels are able to auto-discover other sentinels" { } } } + +test "Sentinels are able to auto-discover slaves" { + foreach_sentinel_id id { + wait_for_condition 100 50 { + [dict get [S $id SENTINEL MASTER mymaster] num-slaves] == $redis_slaves + } else { + fail "At least some sentinel can't detect some slave" + } + } +} + +test "Can change master parameters via SENTINEL SET" { + foreach_sentinel_id id { + S $id SENTINEL SET mymaster down-after-milliseconds 2000 + } + foreach_sentinel_id id { + assert {[dict get [S $id sentinel master mymaster] down-after-milliseconds] == 2000} + } +} + +test "Basic failover works if the master is down" { + set old_port [RI $master_id tcp_port] + set addr [S 0 SENTINEL GET-MASTER-ADDR-BY-NAME mymaster] + assert {[lindex $addr 1] == $old_port} + R $master_id debug sleep 5 + foreach_sentinel_id id { + wait_for_condition 100 50 { + [lindex [S $id SENTINEL GET-MASTER-ADDR-BY-NAME mymaster] 1] != $old_port + } else { + fail "At least one Sentinel did not received failover info" + } + } + set addr [S 0 SENTINEL GET-MASTER-ADDR-BY-NAME mymaster] + set master_id [get_instance_id_by_port redis [lindex $addr 1]] +} + +test "New master [join $addr {:}] role matches" { + assert {[RI $master_id role] eq {master}} +} diff --git a/tests/sentinel.tcl b/tests/sentinel.tcl index 38243abde15..0001a15fe4d 100644 --- a/tests/sentinel.tcl +++ b/tests/sentinel.tcl @@ -116,7 +116,7 @@ proc test {descr code} { proc run_tests {} { set tests [lsort [glob ../sentinel-tests/*]] foreach test $tests { - puts [colorstr green "### [lindex [file split $test] end]"] + puts [colorstr yellow "Testing unit: [lindex [file split $test] end]"] source $test } } @@ -159,18 +159,26 @@ proc RI {n field} { } # Iterate over IDs of sentinel or redis instances. -proc foreach_sentinel_id {idvar code} { +proc foreach_instance_id {instances idvar code} { upvar 1 $idvar id - for {set id 0} {$id < [llength $::sentinel_instances]} {incr id} { - uplevel 1 $code + for {set id 0} {$id < [llength $instances]} {incr id} { + set errcode [catch {uplevel 1 $code} result] + if {$errcode == 1} { + error $result $::errorInfo $::errorCode + } elseif {$errcode != 0} { + return -code $errcode $result + } } } +proc foreach_sentinel_id {idvar code} { + set errcode [catch {uplevel 1 [list foreach_instance_id $::sentinel_instances $idvar $code]} result] + return -code $errcode $result +} + proc foreach_redis_id {idvar code} { - upvar 1 $idvar id - for {set id 0} {$id < [llength $::redis_instances]} {incr id} { - uplevel 1 $code - } + set errcode [catch {uplevel 1 [list foreach_instance_id $::redis_instances $idvar $code]} result] + return -code $errcode $result } # Get the specific attribute of the specified instance type, id. @@ -203,6 +211,15 @@ proc create_redis_master_slave_cluster n { } } +proc get_instance_id_by_port {type port} { + foreach_${type}_id id { + if {[get_instance_attrib $type $id port] == $port} { + return $id + } + } + fail "Instance $type port $port not found." +} + if {[catch main e]} { puts $::errorInfo cleanup From 136537dcb06ec19fe62c4c7c13e17cbf7fad867a Mon Sep 17 00:00:00 2001 From: antirez Date: Tue, 18 Feb 2014 17:03:56 +0100 Subject: [PATCH 0046/1648] Sentinel test: check reconfig of slaves and old master. --- tests/sentinel-tests/00-base.tcl | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/tests/sentinel-tests/00-base.tcl b/tests/sentinel-tests/00-base.tcl index 0b70cc26eb4..aec344b1dd5 100644 --- a/tests/sentinel-tests/00-base.tcl +++ b/tests/sentinel-tests/00-base.tcl @@ -72,3 +72,23 @@ test "Basic failover works if the master is down" { test "New master [join $addr {:}] role matches" { assert {[RI $master_id role] eq {master}} } + +test "All the other slaves now point to the new master" { + foreach_redis_id id { + if {$id != $master_id && $id != 0} { + wait_for_condition 1000 50 { + [RI $id master_port] == [lindex $addr 1] + } else { + fail "Redis ID $id not configured to replicate with new master" + } + } + } +} + +test "The old master eventually gets reconfigured as a slave" { + wait_for_condition 1000 50 { + [RI 0 master_port] == [lindex $addr 1] + } else { + fail "Old master not reconfigured as slave of new master" + } +} From 2a08c7e5ac86b9f0c38e844af5ecb35091df74e5 Mon Sep 17 00:00:00 2001 From: antirez Date: Wed, 19 Feb 2014 09:44:38 +0100 Subject: [PATCH 0047/1648] Sentinel test: ODOWN and agreement. --- tests/sentinel-tests/00-base.tcl | 69 ++++++++++++++++++++++++++++++++ 1 file changed, 69 insertions(+) diff --git a/tests/sentinel-tests/00-base.tcl b/tests/sentinel-tests/00-base.tcl index aec344b1dd5..a2725f8ba9a 100644 --- a/tests/sentinel-tests/00-base.tcl +++ b/tests/sentinel-tests/00-base.tcl @@ -92,3 +92,72 @@ test "The old master eventually gets reconfigured as a slave" { fail "Old master not reconfigured as slave of new master" } } + +test "ODOWN is not possible without enough Sentinels reports" { + foreach_sentinel_id id { + S $id SENTINEL SET mymaster quorum [expr $sentinels+1] + } + set old_port [RI $master_id tcp_port] + set addr [S 0 SENTINEL GET-MASTER-ADDR-BY-NAME mymaster] + assert {[lindex $addr 1] == $old_port} + R $master_id debug sleep 5 + + # Make sure failover did not happened. + set addr [S 0 SENTINEL GET-MASTER-ADDR-BY-NAME mymaster] + assert {[lindex $addr 1] == $old_port} +} + +test "Failover is not possible without majority agreement" { + foreach_sentinel_id id { + S $id SENTINEL SET mymaster quorum $quorum + } + + # Make majority of sentinels stop monitoring the master + for {set id 0} {$id < $quorum} {incr id} { + S $id SENTINEL REMOVE mymaster + } + R $master_id debug sleep 5 + + # Make sure failover did not happened. + set addr [S $quorum SENTINEL GET-MASTER-ADDR-BY-NAME mymaster] + assert {[lindex $addr 1] == $old_port} + + # Cleanup: reconfigure the Sentinels to monitor the master. + for {set id 0} {$id < $quorum} {incr id} { + S $id SENTINEL MONITOR mymaster \ + [get_instance_attrib redis $master_id host] \ + [get_instance_attrib redis $master_id port] $quorum + } +} + +test "Failover works if we configure for absolute agreement" { + foreach_sentinel_id id { + S $id SENTINEL SET mymaster quorum $sentinels + } + + # Wait for Sentinels to monitor the master again + foreach_sentinel_id id { + wait_for_condition 1000 50 { + [dict get [S $id SENTINEL MASTER mymaster] info-refresh] < 100000 + } else { + fail "At least one Sentinel is not monitoring the master" + } + } + + R $master_id debug sleep 5 + foreach_sentinel_id id { + wait_for_condition 100 50 { + [lindex [S $id SENTINEL GET-MASTER-ADDR-BY-NAME mymaster] 1] != $old_port + } else { + fail "At least one Sentinel did not received failover info" + } + } + set addr [S 0 SENTINEL GET-MASTER-ADDR-BY-NAME mymaster] + set master_id [get_instance_id_by_port redis [lindex $addr 1]] + + # Set the min ODOWN agreement back to strict majority. + foreach_sentinel_id id { + S $id SENTINEL SET mymaster quorum $quorum + } +} + From a88a057a1faff1ffe6296b305cca4f0936c29ed8 Mon Sep 17 00:00:00 2001 From: antirez Date: Wed, 19 Feb 2014 10:08:49 +0100 Subject: [PATCH 0048/1648] Sentinel test: check that role matches at end of 00-base. --- tests/sentinel-tests/00-base.tcl | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/tests/sentinel-tests/00-base.tcl b/tests/sentinel-tests/00-base.tcl index a2725f8ba9a..4526d565497 100644 --- a/tests/sentinel-tests/00-base.tcl +++ b/tests/sentinel-tests/00-base.tcl @@ -161,3 +161,7 @@ test "Failover works if we configure for absolute agreement" { } } +test "New master [join $addr {:}] role matches" { + assert {[RI $master_id role] eq {master}} +} + From e087d8a20df0867c611a85361257bbed758f7448 Mon Sep 17 00:00:00 2001 From: antirez Date: Wed, 19 Feb 2014 10:26:23 +0100 Subject: [PATCH 0049/1648] Sentinel test: some reliability fixes to 00-base tests. --- tests/sentinel-tests/00-base.tcl | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/tests/sentinel-tests/00-base.tcl b/tests/sentinel-tests/00-base.tcl index 4526d565497..8fc20775fdf 100644 --- a/tests/sentinel-tests/00-base.tcl +++ b/tests/sentinel-tests/00-base.tcl @@ -1,3 +1,5 @@ +# Check the basic monitoring and failover capabilities. + test "Sentinels aren't monitoring any master" { foreach_sentinel_id id { assert {[S $id sentinel masters] eq {}} @@ -127,6 +129,7 @@ test "Failover is not possible without majority agreement" { S $id SENTINEL MONITOR mymaster \ [get_instance_attrib redis $master_id host] \ [get_instance_attrib redis $master_id port] $quorum + S $id SENTINEL SET mymaster down-after-milliseconds 2000 } } @@ -146,7 +149,7 @@ test "Failover works if we configure for absolute agreement" { R $master_id debug sleep 5 foreach_sentinel_id id { - wait_for_condition 100 50 { + wait_for_condition 1000 50 { [lindex [S $id SENTINEL GET-MASTER-ADDR-BY-NAME mymaster] 1] != $old_port } else { fail "At least one Sentinel did not received failover info" @@ -164,4 +167,3 @@ test "Failover works if we configure for absolute agreement" { test "New master [join $addr {:}] role matches" { assert {[RI $master_id role] eq {master}} } - From b20ae393f1129f97eea4d9b70f44bf2309cfbd65 Mon Sep 17 00:00:00 2001 From: Matt Stancliff Date: Wed, 19 Feb 2014 17:26:33 -0500 Subject: [PATCH 0050/1648] Fix "can't bind to address" error reporting. Report the actual port used for the listening attempt instead of server.port. Originally, Redis would just listen on server.port. But, with clustering, Redis uses a Cluster Port too, so we can't say server.port is always where we are listening. If you tried to launch Redis with a too-high port number (any port where Port+10000 > 65535), Redis would refuse to start, but only print an error saying it can't connect to the Redis port. This patch fixes much confusions. --- src/redis.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/redis.c b/src/redis.c index 0d1686a02a6..817943e3051 100644 --- a/src/redis.c +++ b/src/redis.c @@ -1581,7 +1581,7 @@ int listenToPort(int port, int *fds, int *count) { redisLog(REDIS_WARNING, "Creating Server TCP listening socket %s:%d: %s", server.bindaddr[j] ? server.bindaddr[j] : "*", - server.port, server.neterr); + port, server.neterr); return REDIS_ERR; } (*count)++; From ce68caea371cc75378c2261927c953fbd27147c7 Mon Sep 17 00:00:00 2001 From: Matt Stancliff Date: Wed, 19 Feb 2014 17:30:07 -0500 Subject: [PATCH 0051/1648] Cluster: error out quicker if port is unusable The default cluster control port is 10,000 ports higher than the base Redis port. If Redis is started on a too-high port, Cluster can't start and everything will exit later anyway. --- src/cluster.c | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/src/cluster.c b/src/cluster.c index f47799aca8a..52ace9ebf1f 100644 --- a/src/cluster.c +++ b/src/cluster.c @@ -329,6 +329,19 @@ void clusterInit(void) { /* We need a listening TCP port for our cluster messaging needs. */ server.cfd_count = 0; + + /* Port sanity check II + The other handshake port check is triggered too late to stop + us from trying to use a too-high cluster port number. + */ + if (server.port > (65535-REDIS_CLUSTER_PORT_INCR)) { + redisLog(REDIS_WARNING, "Redis port number too high. " + "Cluster communication port is 10,000 port " + "numbers higher than your Redis port. " + "Your Redis port number must be " + "lower than 55535."); + } + if (listenToPort(server.port+REDIS_CLUSTER_PORT_INCR, server.cfd,&server.cfd_count) == REDIS_ERR) { From 7d7b3810e7080cceeb58d39a193f06ac4585c8e7 Mon Sep 17 00:00:00 2001 From: antirez Date: Thu, 20 Feb 2014 12:13:52 +0100 Subject: [PATCH 0052/1648] Sentinel: report instances role switch events. This is useful mostly for debugging of issues. --- src/sentinel.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/src/sentinel.c b/src/sentinel.c index 9fbd43ded04..5de6afea612 100644 --- a/src/sentinel.c +++ b/src/sentinel.c @@ -1770,6 +1770,14 @@ void sentinelRefreshInstanceInfo(sentinelRedisInstance *ri, const char *info) { ri->role_reported_time = mstime(); ri->role_reported = role; if (role == SRI_SLAVE) ri->slave_conf_change_time = mstime(); + /* Log the event with +role-change if the new role is coherent or + * with -role-change if there is a mismatch with the current config. */ + sentinelEvent(REDIS_VERBOSE, + ((ri->flags & (SRI_MASTER|SRI_SLAVE)) == role) ? + "+role-change" : "-role-change", + ri, "%@ new reported role is %s", + role == SRI_MASTER ? "master" : "slave", + ri->flags & SRI_MASTER ? "master" : "slave"); } /* Handle master -> slave role switch. */ From 57654444542b54e178b07fa904aee8b019f3a9c9 Mon Sep 17 00:00:00 2001 From: antirez Date: Thu, 20 Feb 2014 16:28:38 +0100 Subject: [PATCH 0053/1648] Sentinel test: ability to run just a subset of test files. --- tests/sentinel-tests/00-base.tcl | 1 + tests/sentinel.tcl | 25 +++++++++++++++++++++++++ 2 files changed, 26 insertions(+) diff --git a/tests/sentinel-tests/00-base.tcl b/tests/sentinel-tests/00-base.tcl index 8fc20775fdf..8788cfdbcac 100644 --- a/tests/sentinel-tests/00-base.tcl +++ b/tests/sentinel-tests/00-base.tcl @@ -16,6 +16,7 @@ test "Sentinels can start monitoring a master" { set sentinels [llength $::sentinel_instances] set quorum [expr {$sentinels/2+1}] foreach_sentinel_id id { + catch {S $id SENTINEL REMOVE mymaster} S $id SENTINEL MONITOR mymaster \ [get_instance_attrib redis $master_id host] \ [get_instance_attrib redis $master_id port] $quorum diff --git a/tests/sentinel.tcl b/tests/sentinel.tcl index 0001a15fe4d..54464355f02 100644 --- a/tests/sentinel.tcl +++ b/tests/sentinel.tcl @@ -18,6 +18,7 @@ set ::redis_base_port 30000 set ::instances_count 5 ; # How many Sentinels / Instances we use at max set ::pids {} ; # We kill everything at exit set ::dirs {} ; # We remove all the temp dirs at exit +set ::run_matching {} ; # If non empty, only tests matching pattern are run. if {[catch {cd tests/sentinel-tmp}]} { puts "tests/sentinel-tmp directory not found." @@ -87,7 +88,28 @@ proc abort_sentinel_test msg { exit 1 } +proc parse_options {} { + for {set j 0} {$j < [llength $::argv]} {incr j} { + set opt [lindex $::argv $j] + set val [lindex $::argv [expr $j+1]] + if {$opt eq "--single"} { + incr j + set ::run_matching "*${val}*" + } elseif {$opt eq "--help"} { + puts "Hello, I'm sentinel.tcl and I run Sentinel unit tests." + puts "\nOptions:" + puts "--single Only runs tests specified by pattern." + puts "--help Shows this help." + exit 0 + } else { + puts "Unknown option $opt" + exit 1 + } + } +} + proc main {} { + parse_options spawn_instance sentinel $::sentinel_base_port $::instances_count spawn_instance redis $::redis_base_port $::instances_count run_tests @@ -116,6 +138,9 @@ proc test {descr code} { proc run_tests {} { set tests [lsort [glob ../sentinel-tests/*]] foreach test $tests { + if {$::run_matching ne {} && [string match $::run_matching $test] == 0} { + continue + } puts [colorstr yellow "Testing unit: [lindex [file split $test] end]"] source $test } From d7da507683028e6bb7b53220f85b574e4a51dce9 Mon Sep 17 00:00:00 2001 From: antirez Date: Thu, 20 Feb 2014 16:57:51 +0100 Subject: [PATCH 0054/1648] Sentinel test: move init tests as includes. Most units will start with these two basic tests to create an environment where the real tests are ran. --- tests/sentinel-tests/00-base.tcl | 26 +------------------- tests/sentinel-tests/includes/init-tests.tcl | 23 +++++++++++++++++ tests/sentinel.tcl | 1 + 3 files changed, 25 insertions(+), 25 deletions(-) create mode 100644 tests/sentinel-tests/includes/init-tests.tcl diff --git a/tests/sentinel-tests/00-base.tcl b/tests/sentinel-tests/00-base.tcl index 8788cfdbcac..0587c625c33 100644 --- a/tests/sentinel-tests/00-base.tcl +++ b/tests/sentinel-tests/00-base.tcl @@ -1,30 +1,6 @@ # Check the basic monitoring and failover capabilities. -test "Sentinels aren't monitoring any master" { - foreach_sentinel_id id { - assert {[S $id sentinel masters] eq {}} - } -} - -set redis_slaves 4 -test "Create a master-slaves cluster of [expr $redis_slaves+1] instances" { - create_redis_master_slave_cluster [expr {$redis_slaves+1}] -} -set master_id 0 - -test "Sentinels can start monitoring a master" { - set sentinels [llength $::sentinel_instances] - set quorum [expr {$sentinels/2+1}] - foreach_sentinel_id id { - catch {S $id SENTINEL REMOVE mymaster} - S $id SENTINEL MONITOR mymaster \ - [get_instance_attrib redis $master_id host] \ - [get_instance_attrib redis $master_id port] $quorum - } - foreach_sentinel_id id { - assert {[S $id sentinel master mymaster] ne {}} - } -} +source "../sentinel-tests/includes/init-tests.tcl" test "Sentinels are able to auto-discover other sentinels" { set sentinels [llength $::sentinel_instances] diff --git a/tests/sentinel-tests/includes/init-tests.tcl b/tests/sentinel-tests/includes/init-tests.tcl new file mode 100644 index 00000000000..302f64b6540 --- /dev/null +++ b/tests/sentinel-tests/includes/init-tests.tcl @@ -0,0 +1,23 @@ +# Initialization tests -- most units will start including this. + +set redis_slaves 4 +test "Create a master-slaves cluster of [expr $redis_slaves+1] instances" { + create_redis_master_slave_cluster [expr {$redis_slaves+1}] +} +set master_id 0 + +test "Sentinels can start monitoring a master" { + set sentinels [llength $::sentinel_instances] + set quorum [expr {$sentinels/2+1}] + foreach_sentinel_id id { + catch {S $id SENTINEL REMOVE mymaster} + S $id SENTINEL MONITOR mymaster \ + [get_instance_attrib redis $master_id host] \ + [get_instance_attrib redis $master_id port] $quorum + } + foreach_sentinel_id id { + assert {[S $id sentinel master mymaster] ne {}} + } +} + + diff --git a/tests/sentinel.tcl b/tests/sentinel.tcl index 54464355f02..4b49ed4c7a0 100644 --- a/tests/sentinel.tcl +++ b/tests/sentinel.tcl @@ -141,6 +141,7 @@ proc run_tests {} { if {$::run_matching ne {} && [string match $::run_matching $test] == 0} { continue } + if {[file isdirectory $test]} continue puts [colorstr yellow "Testing unit: [lindex [file split $test] end]"] source $test } From 2c273e359130f6546aa471ffd0f1855d415a05c5 Mon Sep 17 00:00:00 2001 From: Matt Stancliff Date: Thu, 20 Feb 2014 23:45:56 -0500 Subject: [PATCH 0055/1648] Add cluster or sentinel to proc title MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit If you launch redis with `redis-server --sentinel` then in a ps, your output only says "redis-server IP:Port" — this patch changes the proc title to include [sentinel] or [cluster] depending on the current server mode: e.g. "redis-server IP:Port [sentinel]" "redis-server IP:Port [cluster]" --- src/redis.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/src/redis.c b/src/redis.c index 0d1686a02a6..c8562c85d88 100644 --- a/src/redis.c +++ b/src/redis.c @@ -3079,10 +3079,15 @@ void redisOutOfMemoryHandler(size_t allocation_size) { void redisSetProcTitle(char *title) { #ifdef USE_SETPROCTITLE - setproctitle("%s %s:%d", + char *server_mode = ""; + if (server.cluster_enabled) server_mode = " [cluster]"; + else if (server.sentinel_mode) server_mode = " [sentinel]"; + + setproctitle("%s %s:%d%s", title, server.bindaddr_count ? server.bindaddr[0] : "*", - server.port); + server.port, + server_mode); #else REDIS_NOTUSED(title); #endif From 8c254415f7caa392b750131e3f37ccf123e03cd3 Mon Sep 17 00:00:00 2001 From: antirez Date: Sat, 22 Feb 2014 17:26:30 +0100 Subject: [PATCH 0056/1648] Sentinel test: framework improved and conf-update unit added. It is now possible to kill and restart sentinel or redis instances for more real-world testing. The 01 unit tests the capability of Sentinel to update the configuration of Sentinels rejoining the cluster, however the test is pretty trivial and more tests should be added. --- tests/sentinel-tests/00-base.tcl | 17 ++---- tests/sentinel-tests/01-conf-update.tcl | 38 +++++++++++++ tests/sentinel-tests/includes/init-tests.tcl | 11 +++- tests/sentinel.tcl | 56 ++++++++++++++++++-- tests/support/test.tcl | 10 +++- 5 files changed, 112 insertions(+), 20 deletions(-) create mode 100644 tests/sentinel-tests/01-conf-update.tcl diff --git a/tests/sentinel-tests/00-base.tcl b/tests/sentinel-tests/00-base.tcl index 0587c625c33..b8dfa70ca97 100644 --- a/tests/sentinel-tests/00-base.tcl +++ b/tests/sentinel-tests/00-base.tcl @@ -23,20 +23,11 @@ test "Sentinels are able to auto-discover slaves" { } } -test "Can change master parameters via SENTINEL SET" { - foreach_sentinel_id id { - S $id SENTINEL SET mymaster down-after-milliseconds 2000 - } - foreach_sentinel_id id { - assert {[dict get [S $id sentinel master mymaster] down-after-milliseconds] == 2000} - } -} - test "Basic failover works if the master is down" { set old_port [RI $master_id tcp_port] set addr [S 0 SENTINEL GET-MASTER-ADDR-BY-NAME mymaster] assert {[lindex $addr 1] == $old_port} - R $master_id debug sleep 5 + R $master_id debug sleep 10 foreach_sentinel_id id { wait_for_condition 100 50 { [lindex [S $id SENTINEL GET-MASTER-ADDR-BY-NAME mymaster] 1] != $old_port @@ -79,7 +70,7 @@ test "ODOWN is not possible without enough Sentinels reports" { set old_port [RI $master_id tcp_port] set addr [S 0 SENTINEL GET-MASTER-ADDR-BY-NAME mymaster] assert {[lindex $addr 1] == $old_port} - R $master_id debug sleep 5 + R $master_id debug sleep 10 # Make sure failover did not happened. set addr [S 0 SENTINEL GET-MASTER-ADDR-BY-NAME mymaster] @@ -95,7 +86,7 @@ test "Failover is not possible without majority agreement" { for {set id 0} {$id < $quorum} {incr id} { S $id SENTINEL REMOVE mymaster } - R $master_id debug sleep 5 + R $master_id debug sleep 10 # Make sure failover did not happened. set addr [S $quorum SENTINEL GET-MASTER-ADDR-BY-NAME mymaster] @@ -124,7 +115,7 @@ test "Failover works if we configure for absolute agreement" { } } - R $master_id debug sleep 5 + R $master_id debug sleep 10 foreach_sentinel_id id { wait_for_condition 1000 50 { [lindex [S $id SENTINEL GET-MASTER-ADDR-BY-NAME mymaster] 1] != $old_port diff --git a/tests/sentinel-tests/01-conf-update.tcl b/tests/sentinel-tests/01-conf-update.tcl new file mode 100644 index 00000000000..49323707513 --- /dev/null +++ b/tests/sentinel-tests/01-conf-update.tcl @@ -0,0 +1,38 @@ +# Test Sentinel configuration consistency after partitions heal. + +source "../sentinel-tests/includes/init-tests.tcl" + +test "We can failover with Sentinel 1 crashed" { + foreach_sentinel_id id { + S $id SENTINEL SET mymaster down-after-milliseconds 2000 + } + + set old_port [RI $master_id tcp_port] + set addr [S 0 SENTINEL GET-MASTER-ADDR-BY-NAME mymaster] + assert {[lindex $addr 1] == $old_port} + + # Crash Sentinel 1 + kill_instance sentinel 1 + + R $master_id debug sleep 10 + foreach_sentinel_id id { + if {$id != 1} { + wait_for_condition 1000 50 { + [lindex [S $id SENTINEL GET-MASTER-ADDR-BY-NAME mymaster] 1] != $old_port + } else { + fail "Sentinel $id did not received failover info" + } + } + } + set addr [S 0 SENTINEL GET-MASTER-ADDR-BY-NAME mymaster] + set master_id [get_instance_id_by_port redis [lindex $addr 1]] +} + +test "After Sentinel 1 is restarted, its config gets updated" { + restart_instance sentinel 1 + wait_for_condition 1000 50 { + [lindex [S 1 SENTINEL GET-MASTER-ADDR-BY-NAME mymaster] 1] != $old_port + } else { + fail "Restarted Sentinel did not received failover info" + } +} diff --git a/tests/sentinel-tests/includes/init-tests.tcl b/tests/sentinel-tests/includes/init-tests.tcl index 302f64b6540..82beeea4fb7 100644 --- a/tests/sentinel-tests/includes/init-tests.tcl +++ b/tests/sentinel-tests/includes/init-tests.tcl @@ -17,7 +17,16 @@ test "Sentinels can start monitoring a master" { } foreach_sentinel_id id { assert {[S $id sentinel master mymaster] ne {}} + S $id SENTINEL SET mymaster down-after-milliseconds 2000 } } - +test "Sentinels can talk with the master" { + foreach_sentinel_id id { + wait_for_condition 100 50 { + [catch {S $id SENTINEL GET-MASTER-ADDR-BY-NAME mymaster}] == 0 + } else { + fail "Sentinel $id can't talk with the master." + } + } +} diff --git a/tests/sentinel.tcl b/tests/sentinel.tcl index 4b49ed4c7a0..f1c1669ac4d 100644 --- a/tests/sentinel.tcl +++ b/tests/sentinel.tcl @@ -53,8 +53,8 @@ proc spawn_instance {type base_port count} { } else { set prgname redis-sentinel } - set sentinel_pid [exec ../../src/${prgname} $cfgfile &] - lappend ::pids $sentinel_pid + set pid [exec ../../src/${prgname} $cfgfile &] + lappend ::pids $pid # Check availability if {[server_is_up 127.0.0.1 $port 100] == 0} { @@ -63,7 +63,7 @@ proc spawn_instance {type base_port count} { # Push the instance into the right list lappend ::${type}_instances [list \ - pid $sentinel_pid \ + pid $pid \ host 127.0.0.1 \ port $port \ link [redis 127.0.0.1 $port] \ @@ -212,6 +212,13 @@ proc get_instance_attrib {type id attrib} { dict get [lindex [set ::${type}_instances] $id] $attrib } +# Set the specific attribute of the specified instance type, id. +proc set_instance_attrib {type id attrib newval} { + set d [lindex [set ::${type}_instances] $id] + dict set d $attrib $newval + lset ::${type}_instances $id $d +} + # Create a master-slave cluster of the given number of total instances. # The first instance "0" is the master, all others are configured as # slaves. @@ -219,8 +226,8 @@ proc create_redis_master_slave_cluster n { foreach_redis_id id { if {$id == 0} { # Our master. - R $id flushall R $id slaveof no one + R $id flushall } elseif {$id < $n} { R $id slaveof [get_instance_attrib redis 0 host] \ [get_instance_attrib redis 0 port] @@ -246,6 +253,47 @@ proc get_instance_id_by_port {type port} { fail "Instance $type port $port not found." } +# Kill an instance of the specified type/id with SIGKILL. +# This function will mark the instance PID as -1 to remember that this instance +# is no longer running and will remove its PID from the list of pids that +# we kill at cleanup. +# +# The instance can be restarted with restart-instance. +proc kill_instance {type id} { + set pid [get_instance_attrib $type $id pid] + exec kill -9 $pid + set_instance_attrib $type $id pid -1 + set_instance_attrib $type $id link you_tried_to_talk_with_killed_instance + + # Remove the PID from the list of pids to kill at exit. + set ::pids [lsearch -all -inline -not -exact $::pids $pid] +} + +# Restart an instance previously killed by kill_instance +proc restart_instance {type id} { + set dirname "${type}_${id}" + set cfgfile [file join $dirname $type.conf] + set port [get_instance_attrib $type $id port] + + # Execute the instance with its old setup and append the new pid + # file for cleanup. + if {$type eq "redis"} { + set prgname redis-server + } else { + set prgname redis-sentinel + } + set pid [exec ../../src/${prgname} $cfgfile &] + lappend ::pids $pid + + # Check that the instance is running + if {[server_is_up 127.0.0.1 $port 100] == 0} { + abort_sentinel_test "Problems starting $type #$j: ping timeout" + } + + # Connect with it with a fresh link + set_instance_attrib $type $id link [redis 127.0.0.1 $port] +} + if {[catch main e]} { puts $::errorInfo cleanup diff --git a/tests/support/test.tcl b/tests/support/test.tcl index 96b529d7a5d..bf2cb0e2f87 100644 --- a/tests/support/test.tcl +++ b/tests/support/test.tcl @@ -53,11 +53,17 @@ proc assert_type {type key} { # executed. proc wait_for_condition {maxtries delay e _else_ elsescript} { while {[incr maxtries -1] >= 0} { - if {[uplevel 1 [list expr $e]]} break + set errcode [catch {uplevel 1 [list expr $e]} result] + if {$errcode == 0} { + if {$result} break + } else { + return -code $errcode $result + } after $delay } if {$maxtries == -1} { - uplevel 1 $elsescript + set errcode [catch [uplevel 1 $elsescript] result] + return -code $errcode $result } } From b1c1386374aabd085a6d2c9659cc8d50bc0b866a Mon Sep 17 00:00:00 2001 From: antirez Date: Sat, 22 Feb 2014 17:34:46 +0100 Subject: [PATCH 0057/1648] Sentinel: IDONTKNOW error removed. This error was conceived for the older version of Sentinel that worked via master redirection and that was not able to get configuration updates from other Sentinels via the Pub/Sub channel of masters or slaves. This reply does not make sense today, every Sentinel should reply with the best information it has currently. The error will make even more sense in the future since the plan is to allow Sentinels to update the configuration of other Sentinels via gossip with a direct chat without the prerequisite that they have at least a monitored instance in common. --- src/sentinel.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/src/sentinel.c b/src/sentinel.c index 5de6afea612..17d56a1d319 100644 --- a/src/sentinel.c +++ b/src/sentinel.c @@ -2468,8 +2468,6 @@ void sentinelCommand(redisClient *c) { ri = sentinelGetMasterByName(c->argv[2]->ptr); if (ri == NULL) { addReply(c,shared.nullmultibulk); - } else if (ri->info_refresh == 0) { - addReplySds(c,sdsnew("-IDONTKNOW I have not enough information to reply. Please ask another Sentinel.\r\n")); } else { sentinelAddr *addr = sentinelGetCurrentMasterAddress(ri); From a929867cca5c31c5923ff97ab4ddbc35a094bbc6 Mon Sep 17 00:00:00 2001 From: antirez Date: Sun, 23 Feb 2014 17:50:59 +0100 Subject: [PATCH 0058/1648] Sentinel test: added empty units to fill later. --- tests/sentinel-tests/02-slaves-reconf.tcl | 1 + tests/sentinel-tests/03-runtime-reconf.tcl | 1 + 2 files changed, 2 insertions(+) create mode 100644 tests/sentinel-tests/02-slaves-reconf.tcl create mode 100644 tests/sentinel-tests/03-runtime-reconf.tcl diff --git a/tests/sentinel-tests/02-slaves-reconf.tcl b/tests/sentinel-tests/02-slaves-reconf.tcl new file mode 100644 index 00000000000..49ecb1ab918 --- /dev/null +++ b/tests/sentinel-tests/02-slaves-reconf.tcl @@ -0,0 +1 @@ +# Check that slaves are reconfigured at a latter time if they are partitioned. diff --git a/tests/sentinel-tests/03-runtime-reconf.tcl b/tests/sentinel-tests/03-runtime-reconf.tcl new file mode 100644 index 00000000000..50bca55478f --- /dev/null +++ b/tests/sentinel-tests/03-runtime-reconf.tcl @@ -0,0 +1 @@ +# Test runting reconfiguration command SENTINEL SET. From afd3db17a01fd2fa8b2d1b85edcae389b3e7f647 Mon Sep 17 00:00:00 2001 From: antirez Date: Sun, 23 Feb 2014 17:57:53 +0100 Subject: [PATCH 0059/1648] Sentinel test: --pause-on-error option added. Pause the test with running instances available for state inspection on error. --- tests/sentinel.tcl | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/tests/sentinel.tcl b/tests/sentinel.tcl index f1c1669ac4d..f06ade07e53 100644 --- a/tests/sentinel.tcl +++ b/tests/sentinel.tcl @@ -11,6 +11,7 @@ source tests/support/server.tcl source tests/support/test.tcl set ::verbose 0 +set ::pause_on_error 0 set ::sentinel_instances {} set ::redis_instances {} set ::sentinel_base_port 20000 @@ -95,10 +96,13 @@ proc parse_options {} { if {$opt eq "--single"} { incr j set ::run_matching "*${val}*" + } elseif {$opt eq "--pause-on-error"} { + set ::pause_on_error 1 } elseif {$opt eq "--help"} { puts "Hello, I'm sentinel.tcl and I run Sentinel unit tests." puts "\nOptions:" puts "--single Only runs tests specified by pattern." + puts "--pause-on-error Pause for manual inspection on error." puts "--help Shows this help." exit 0 } else { @@ -116,6 +120,18 @@ proc main {} { cleanup } +# If --pause-on-error option was passed at startup this function is called +# on error in order to give the developer a chance to understand more about +# the error condition while the instances are still running. +proc pause_on_error {} { + puts [colorstr yellow "*** Please inspect the error now ***"] + puts "\nType \"continue\" to resume the test." + while {[gets stdin] ne {continue}} { + puts "> " + flush stdout + } +} + # We redefine 'test' as for Sentinel we don't use the server-client # architecture for the test, everything is sequential. proc test {descr code} { @@ -126,6 +142,7 @@ proc test {descr code} { if {[string match "assertion:*" $error]} { set msg [string range $error 10 end] puts [colorstr red $msg] + if {$::pause_on_error} pause_on_error } else { # Re-raise, let handler up the stack take care of this. error $error $::errorInfo From 09dec3613e7af60a084704eb2e80c5cf4ecd1b09 Mon Sep 17 00:00:00 2001 From: antirez Date: Sun, 23 Feb 2014 18:01:30 +0100 Subject: [PATCH 0060/1648] Sentinel test: minor fixes to --pause-on-error. --- tests/sentinel.tcl | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/tests/sentinel.tcl b/tests/sentinel.tcl index f06ade07e53..669bc830765 100644 --- a/tests/sentinel.tcl +++ b/tests/sentinel.tcl @@ -124,11 +124,13 @@ proc main {} { # on error in order to give the developer a chance to understand more about # the error condition while the instances are still running. proc pause_on_error {} { + puts "" puts [colorstr yellow "*** Please inspect the error now ***"] - puts "\nType \"continue\" to resume the test." - while {[gets stdin] ne {continue}} { - puts "> " + puts "\nType \"continue\" to resume the test.\n" + while 1 { + puts -nonewline "> " flush stdout + if {[gets stdin] eq {continue}} break } } From 540536c05519cce1174b00d1ff25a125b9c7c4a1 Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 24 Feb 2014 11:51:31 +0100 Subject: [PATCH 0061/1648] Sentinel test: tmp dir and gitignore added. --- tests/sentinel-tmp/.gitignore | 2 ++ 1 file changed, 2 insertions(+) create mode 100644 tests/sentinel-tmp/.gitignore diff --git a/tests/sentinel-tmp/.gitignore b/tests/sentinel-tmp/.gitignore new file mode 100644 index 00000000000..f581f73e2d0 --- /dev/null +++ b/tests/sentinel-tmp/.gitignore @@ -0,0 +1,2 @@ +redis_* +sentinel_* From 25cebf72853ccf0e8d7a042217b4f39f2ad9d793 Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 24 Feb 2014 16:22:52 +0100 Subject: [PATCH 0062/1648] Sentinel: added missing exit(1) after checking for config file. --- src/sentinel.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/sentinel.c b/src/sentinel.c index 17d56a1d319..06f54b52123 100644 --- a/src/sentinel.c +++ b/src/sentinel.c @@ -420,6 +420,7 @@ void sentinelIsRunning(void) { if (server.configfile == NULL) { redisLog(REDIS_WARNING, "Sentinel started without a config file. Exiting..."); + exit(1); } else if (access(server.configfile,W_OK) == -1) { redisLog(REDIS_WARNING, "Sentinel config file %s is not writable: %s. Exiting...", From 3b7a7574685f184b9c88e8aed5f97e07265f21ec Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 24 Feb 2014 16:25:34 +0100 Subject: [PATCH 0063/1648] Sentinel: log +monitor and +set events. Now that we have a runtime configuration system, it is very important to be able to log how the Sentinel configuration changes over time because of API calls. --- src/sentinel.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/src/sentinel.c b/src/sentinel.c index 06f54b52123..22fe7cafc39 100644 --- a/src/sentinel.c +++ b/src/sentinel.c @@ -2503,6 +2503,7 @@ void sentinelCommand(redisClient *c) { sentinelPendingScriptsCommand(c); } else if (!strcasecmp(c->argv[1]->ptr,"monitor")) { /* SENTINEL MONITOR */ + sentinelRedisInstance *ri; long quorum, port; char buf[32]; @@ -2518,9 +2519,11 @@ void sentinelCommand(redisClient *c) { addReplyError(c,"Invalid IP address specified"); return; } - if (createSentinelRedisInstance(c->argv[2]->ptr,SRI_MASTER, - c->argv[3]->ptr,port,quorum,NULL) == NULL) - { + + /* Parameters are valid. Try to create the master instance. */ + ri = createSentinelRedisInstance(c->argv[2]->ptr,SRI_MASTER, + c->argv[3]->ptr,port,quorum,NULL); + if (ri == NULL) { switch(errno) { case EBUSY: addReplyError(c,"Duplicated master name"); @@ -2534,6 +2537,7 @@ void sentinelCommand(redisClient *c) { } } else { sentinelFlushConfig(); + sentinelEvent(REDIS_WARNING,"+monitor",ri,"%@"); addReply(c,shared.ok); } } else if (!strcasecmp(c->argv[1]->ptr,"remove")) { @@ -2542,6 +2546,7 @@ void sentinelCommand(redisClient *c) { if ((ri = sentinelGetMasterByNameOrReplyError(c,c->argv[2])) == NULL) return; + sentinelEvent(REDIS_WARNING,"-monitor",ri,"%@"); dictDelete(sentinel.masters,c->argv[2]->ptr); sentinelFlushConfig(); addReply(c,shared.ok); @@ -2693,6 +2698,7 @@ void sentinelSetCommand(redisClient *c) { if (changes) sentinelFlushConfig(); return; } + sentinelEvent(REDIS_WARNING,"+set",ri,"%@ %s %s",option,value); } if (changes) sentinelFlushConfig(); From 6b373edb773787e67cec502ab7e42234bdf365f2 Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 24 Feb 2014 16:32:09 +0100 Subject: [PATCH 0064/1648] Sentinel: generate +monitor events at startup. --- src/sentinel.c | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/src/sentinel.c b/src/sentinel.c index 22fe7cafc39..a3285434457 100644 --- a/src/sentinel.c +++ b/src/sentinel.c @@ -327,6 +327,7 @@ void sentinelDiscardReplyCallback(redisAsyncContext *c, void *reply, void *privd int sentinelSendSlaveOf(sentinelRedisInstance *ri, char *host, int port); char *sentinelVoteLeader(sentinelRedisInstance *master, uint64_t req_epoch, char *req_runid, uint64_t *leader_epoch); void sentinelFlushConfig(void); +void sentinelGenerateInitialMonitorEvents(void); /* ========================= Dictionary types =============================== */ @@ -427,6 +428,10 @@ void sentinelIsRunning(void) { server.configfile,strerror(errno)); exit(1); } + + /* We want to generate a +monitor event for every configured master + * at startup. */ + sentinelGenerateInitialMonitorEvents(); } /* ============================== sentinelAddr ============================== */ @@ -558,6 +563,22 @@ void sentinelEvent(int level, char *type, sentinelRedisInstance *ri, } } +/* This function is called only at startup and is used to generate a + * +monitor event for every configured master. The same events are also + * generated when a master to monitor is added at runtime via the + * SENTINEL MONITOR command. */ +void sentinelGenerateInitialMonitorEvents(void) { + dictIterator *di; + dictEntry *de; + + di = dictGetIterator(sentinel.masters); + while((de = dictNext(di)) != NULL) { + sentinelRedisInstance *ri = dictGetVal(de); + sentinelEvent(REDIS_WARNING,"+monitor",ri,"%@"); + } + dictReleaseIterator(di); +} + /* ============================ script execution ============================ */ /* Release a script job structure and all the associated data. */ From 29f4df801815712d70154a43103d815b006e5546 Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 24 Feb 2014 16:57:52 +0100 Subject: [PATCH 0065/1648] Sentinel test: removed useless code to set SDOWN timeout. The new common initialization code used to start a new unit already set the timeout to 2000 milliseconds. --- tests/sentinel-tests/01-conf-update.tcl | 4 ---- 1 file changed, 4 deletions(-) diff --git a/tests/sentinel-tests/01-conf-update.tcl b/tests/sentinel-tests/01-conf-update.tcl index 49323707513..7f3d1c3661e 100644 --- a/tests/sentinel-tests/01-conf-update.tcl +++ b/tests/sentinel-tests/01-conf-update.tcl @@ -3,10 +3,6 @@ source "../sentinel-tests/includes/init-tests.tcl" test "We can failover with Sentinel 1 crashed" { - foreach_sentinel_id id { - S $id SENTINEL SET mymaster down-after-milliseconds 2000 - } - set old_port [RI $master_id tcp_port] set addr [S 0 SENTINEL GET-MASTER-ADDR-BY-NAME mymaster] assert {[lindex $addr 1] == $old_port} From b15411df98fbd1eea941e729c457aff5dee3bfa3 Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 24 Feb 2014 17:10:20 +0100 Subject: [PATCH 0066/1648] Sentinel: log quorum with +monitor event. --- src/sentinel.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/sentinel.c b/src/sentinel.c index a3285434457..a96375a7959 100644 --- a/src/sentinel.c +++ b/src/sentinel.c @@ -574,7 +574,7 @@ void sentinelGenerateInitialMonitorEvents(void) { di = dictGetIterator(sentinel.masters); while((de = dictNext(di)) != NULL) { sentinelRedisInstance *ri = dictGetVal(de); - sentinelEvent(REDIS_WARNING,"+monitor",ri,"%@"); + sentinelEvent(REDIS_WARNING,"+monitor",ri,"%@ quorum %d",ri->quorum); } dictReleaseIterator(di); } @@ -2558,7 +2558,7 @@ void sentinelCommand(redisClient *c) { } } else { sentinelFlushConfig(); - sentinelEvent(REDIS_WARNING,"+monitor",ri,"%@"); + sentinelEvent(REDIS_WARNING,"+monitor",ri,"%@ quorum %d",ri->quorum); addReply(c,shared.ok); } } else if (!strcasecmp(c->argv[1]->ptr,"remove")) { From d3a3ef0bc1f993661de83bca07256902aff8ecab Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 24 Feb 2014 17:21:50 +0100 Subject: [PATCH 0067/1648] Sentinel test: more stuff mored from 00-base to init. The area a number of mandatory tests to craete a stable setup for testing that is not too sensitive to timing issues. All those tests moved to includes/init-tests, and marked as (init). --- tests/sentinel-tests/00-base.tcl | 21 --------------- tests/sentinel-tests/includes/init-tests.tcl | 27 +++++++++++++++++--- 2 files changed, 24 insertions(+), 24 deletions(-) diff --git a/tests/sentinel-tests/00-base.tcl b/tests/sentinel-tests/00-base.tcl index b8dfa70ca97..d036922c901 100644 --- a/tests/sentinel-tests/00-base.tcl +++ b/tests/sentinel-tests/00-base.tcl @@ -2,27 +2,6 @@ source "../sentinel-tests/includes/init-tests.tcl" -test "Sentinels are able to auto-discover other sentinels" { - set sentinels [llength $::sentinel_instances] - foreach_sentinel_id id { - wait_for_condition 100 50 { - [dict get [S $id SENTINEL MASTER mymaster] num-other-sentinels] == ($sentinels-1) - } else { - fail "At least some sentinel can't detect some other sentinel" - } - } -} - -test "Sentinels are able to auto-discover slaves" { - foreach_sentinel_id id { - wait_for_condition 100 50 { - [dict get [S $id SENTINEL MASTER mymaster] num-slaves] == $redis_slaves - } else { - fail "At least some sentinel can't detect some slave" - } - } -} - test "Basic failover works if the master is down" { set old_port [RI $master_id tcp_port] set addr [S 0 SENTINEL GET-MASTER-ADDR-BY-NAME mymaster] diff --git a/tests/sentinel-tests/includes/init-tests.tcl b/tests/sentinel-tests/includes/init-tests.tcl index 82beeea4fb7..cea437a83f7 100644 --- a/tests/sentinel-tests/includes/init-tests.tcl +++ b/tests/sentinel-tests/includes/init-tests.tcl @@ -1,12 +1,12 @@ # Initialization tests -- most units will start including this. set redis_slaves 4 -test "Create a master-slaves cluster of [expr $redis_slaves+1] instances" { +test "(init) Create a master-slaves cluster of [expr $redis_slaves+1] instances" { create_redis_master_slave_cluster [expr {$redis_slaves+1}] } set master_id 0 -test "Sentinels can start monitoring a master" { +test "(init) Sentinels can start monitoring a master" { set sentinels [llength $::sentinel_instances] set quorum [expr {$sentinels/2+1}] foreach_sentinel_id id { @@ -21,7 +21,7 @@ test "Sentinels can start monitoring a master" { } } -test "Sentinels can talk with the master" { +test "(init) Sentinels can talk with the master" { foreach_sentinel_id id { wait_for_condition 100 50 { [catch {S $id SENTINEL GET-MASTER-ADDR-BY-NAME mymaster}] == 0 @@ -30,3 +30,24 @@ test "Sentinels can talk with the master" { } } } + +test "(init) Sentinels are able to auto-discover other sentinels" { + set sentinels [llength $::sentinel_instances] + foreach_sentinel_id id { + wait_for_condition 100 50 { + [dict get [S $id SENTINEL MASTER mymaster] num-other-sentinels] == ($sentinels-1) + } else { + fail "At least some sentinel can't detect some other sentinel" + } + } +} + +test "(init) Sentinels are able to auto-discover slaves" { + foreach_sentinel_id id { + wait_for_condition 100 50 { + [dict get [S $id SENTINEL MASTER mymaster] num-slaves] == $redis_slaves + } else { + fail "At least some sentinel can't detect some slave" + } + } +} From 630fb3539f402b07b4a27a65d8ab0fe21df2feb1 Mon Sep 17 00:00:00 2001 From: antirez Date: Tue, 25 Feb 2014 08:23:48 +0100 Subject: [PATCH 0068/1648] Sentinel test: restart_instance should refresh pid attrib. Also kill_instance was modified to warn when a test will try to kill the same instance multiple times for error. --- tests/sentinel.tcl | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/tests/sentinel.tcl b/tests/sentinel.tcl index 669bc830765..e278aea70fa 100644 --- a/tests/sentinel.tcl +++ b/tests/sentinel.tcl @@ -280,6 +280,9 @@ proc get_instance_id_by_port {type port} { # The instance can be restarted with restart-instance. proc kill_instance {type id} { set pid [get_instance_attrib $type $id pid] + if {$pid == -1} { + error "You tried to kill $type $id twice." + } exec kill -9 $pid set_instance_attrib $type $id pid -1 set_instance_attrib $type $id link you_tried_to_talk_with_killed_instance @@ -302,6 +305,7 @@ proc restart_instance {type id} { set prgname redis-sentinel } set pid [exec ../../src/${prgname} $cfgfile &] + set_instance_attrib $type $id pid $pid lappend ::pids $pid # Check that the instance is running From 044b62754959d8cbba3fb24a37a3e6b29b4e773c Mon Sep 17 00:00:00 2001 From: antirez Date: Tue, 25 Feb 2014 08:29:12 +0100 Subject: [PATCH 0069/1648] Sentinel test: test majority crashing Sentinels. The test was previously performed by removing the master from the Sentinel monitored masters. The test with the Sentinels crashed is more similar to real-world partitions / failures. --- tests/sentinel-tests/00-base.tcl | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-) diff --git a/tests/sentinel-tests/00-base.tcl b/tests/sentinel-tests/00-base.tcl index d036922c901..dd87714666b 100644 --- a/tests/sentinel-tests/00-base.tcl +++ b/tests/sentinel-tests/00-base.tcl @@ -42,7 +42,7 @@ test "The old master eventually gets reconfigured as a slave" { } } -test "ODOWN is not possible without enough Sentinels reports" { +test "ODOWN is not possible without N (quorum) Sentinels reports" { foreach_sentinel_id id { S $id SENTINEL SET mymaster quorum [expr $sentinels+1] } @@ -61,22 +61,20 @@ test "Failover is not possible without majority agreement" { S $id SENTINEL SET mymaster quorum $quorum } - # Make majority of sentinels stop monitoring the master + # Crash majority of sentinels for {set id 0} {$id < $quorum} {incr id} { - S $id SENTINEL REMOVE mymaster + kill_instance sentinel $id } + R $master_id debug sleep 10 # Make sure failover did not happened. set addr [S $quorum SENTINEL GET-MASTER-ADDR-BY-NAME mymaster] assert {[lindex $addr 1] == $old_port} - # Cleanup: reconfigure the Sentinels to monitor the master. + # Cleanup: restart Sentinels to monitor the master. for {set id 0} {$id < $quorum} {incr id} { - S $id SENTINEL MONITOR mymaster \ - [get_instance_attrib redis $master_id host] \ - [get_instance_attrib redis $master_id port] $quorum - S $id SENTINEL SET mymaster down-after-milliseconds 2000 + restart_instance sentinel $id } } From a9360c62e8b9f0d612e3cb7054d9fb7a1a251f52 Mon Sep 17 00:00:00 2001 From: antirez Date: Tue, 25 Feb 2014 08:33:41 +0100 Subject: [PATCH 0070/1648] Sentinel test: jump to next unit on test failure. Sentinel tests are designed to be dependent on the previous tests in the same unit, so usually we can't continue with the next test in the same unit if a previous test failed. --- tests/sentinel.tcl | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tests/sentinel.tcl b/tests/sentinel.tcl index e278aea70fa..b15bce65634 100644 --- a/tests/sentinel.tcl +++ b/tests/sentinel.tcl @@ -145,6 +145,8 @@ proc test {descr code} { set msg [string range $error 10 end] puts [colorstr red $msg] if {$::pause_on_error} pause_on_error + puts "(Jumping to next unit after error)" + return -code continue } else { # Re-raise, let handler up the stack take care of this. error $error $::errorInfo From 386467acfb53e855369e9ffdad316462462802e5 Mon Sep 17 00:00:00 2001 From: antirez Date: Tue, 25 Feb 2014 08:48:46 +0100 Subject: [PATCH 0071/1648] Sentinel test: restart instances left killed by previous unit. An unit can abort in the middle for an error. The next unit should not assume that the instances are in a clean state, and must restart what was left killed. --- tests/sentinel-tests/includes/init-tests.tcl | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/tests/sentinel-tests/includes/init-tests.tcl b/tests/sentinel-tests/includes/init-tests.tcl index cea437a83f7..91b179b4d35 100644 --- a/tests/sentinel-tests/includes/init-tests.tcl +++ b/tests/sentinel-tests/includes/init-tests.tcl @@ -1,5 +1,17 @@ # Initialization tests -- most units will start including this. +test "(init) Restart killed instances" { + foreach type {redis sentinel} { + foreach_${type}_id id { + if {[get_instance_attrib $type $id pid] == -1} { + puts -nonewline "$type/$id " + flush stdout + restart_instance $type $id + } + } + } +} + set redis_slaves 4 test "(init) Create a master-slaves cluster of [expr $redis_slaves+1] instances" { create_redis_master_slave_cluster [expr {$redis_slaves+1}] From dcac007b814d36205f317349152db116842cd326 Mon Sep 17 00:00:00 2001 From: antirez Date: Tue, 25 Feb 2014 12:24:45 +0100 Subject: [PATCH 0072/1648] redis-cli: added comments to split program in parts. --- src/redis-cli.c | 46 +++++++++++++++++++++++++++++++++++++++++----- 1 file changed, 41 insertions(+), 5 deletions(-) diff --git a/src/redis-cli.c b/src/redis-cli.c index 7d411f8aef1..a40d09ffb14 100644 --- a/src/redis-cli.c +++ b/src/redis-cli.c @@ -102,14 +102,18 @@ char *redisGitDirty(void); * Utility functions *--------------------------------------------------------------------------- */ -static long long mstime(void) { +static long long ustime(void) { struct timeval tv; - long long mst; + long long ust; gettimeofday(&tv, NULL); - mst = ((long long)tv.tv_sec)*1000; - mst += tv.tv_usec/1000; - return mst; + ust = ((long long)tv.tv_sec)*1000000; + ust += tv.tv_usec; + return ust; +} + +static long long mstime(void) { + return ustime()/1000; } static void cliRefreshPrompt(void) { @@ -950,6 +954,10 @@ static int noninteractive(int argc, char **argv) { return retval; } +/*------------------------------------------------------------------------------ + * Eval mode + *--------------------------------------------------------------------------- */ + static int evalMode(int argc, char **argv) { sds script = sdsempty(); FILE *fp; @@ -988,6 +996,10 @@ static int evalMode(int argc, char **argv) { return cliSendCommand(argc+3-got_comma, argv2, config.repeat); } +/*------------------------------------------------------------------------------ + * Latency and latency history modes + *--------------------------------------------------------------------------- */ + #define LATENCY_SAMPLE_RATE 10 /* milliseconds. */ #define LATENCY_HISTORY_DEFAULT_INTERVAL 15000 /* milliseconds. */ static void latencyMode(void) { @@ -1032,6 +1044,10 @@ static void latencyMode(void) { } } +/*------------------------------------------------------------------------------ + * Slave mode + *--------------------------------------------------------------------------- */ + /* Sends SYNC and reads the number of bytes in the payload. Used both by * slaveMode() and getRDB(). */ unsigned long long sendSync(int fd) { @@ -1095,6 +1111,10 @@ static void slaveMode(void) { config.output = original_output; } +/*------------------------------------------------------------------------------ + * RDB transfer mode + *--------------------------------------------------------------------------- */ + /* This function implements --rdb, so it uses the replication protocol in order * to fetch the RDB file from a remote server. */ static void getRDB(void) { @@ -1140,6 +1160,10 @@ static void getRDB(void) { exit(0); } +/*------------------------------------------------------------------------------ + * Bulk import (pipe) mode + *--------------------------------------------------------------------------- */ + static void pipeMode(void) { int fd = context->fd; long long errors = 0, replies = 0, obuf_len = 0, obuf_pos = 0; @@ -1291,6 +1315,10 @@ static void pipeMode(void) { exit(0); } +/*------------------------------------------------------------------------------ + * Find big keys + *--------------------------------------------------------------------------- */ + #define TYPE_STRING 0 #define TYPE_LIST 1 #define TYPE_SET 2 @@ -1377,6 +1405,10 @@ static void findBigKeys(void) { } } +/*------------------------------------------------------------------------------ + * Stats mode + *--------------------------------------------------------------------------- */ + /* Return the specified INFO field from the INFO command output "info". * A new buffer is allocated for the result, that needs to be free'd. * If the field is not found NULL is returned. */ @@ -1516,6 +1548,10 @@ static void statMode() { } } +/*------------------------------------------------------------------------------ + * Scan mode + *--------------------------------------------------------------------------- */ + static void scanMode() { redisReply *reply; unsigned long long cur = 0; From c1d67ea9b46f88098e88450adb968d8d7c140cda Mon Sep 17 00:00:00 2001 From: antirez Date: Tue, 25 Feb 2014 12:37:52 +0100 Subject: [PATCH 0073/1648] redis-cli: --intrinsic-latency run mode added. --- src/redis-cli.c | 84 +++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 81 insertions(+), 3 deletions(-) diff --git a/src/redis-cli.c b/src/redis-cli.c index a40d09ffb14..c6de505175f 100644 --- a/src/redis-cli.c +++ b/src/redis-cli.c @@ -58,7 +58,7 @@ #define OUTPUT_RAW 1 #define OUTPUT_CSV 2 #define REDIS_CLI_KEEPALIVE_INTERVAL 15 /* seconds */ -#define REDIS_DEFAULT_PIPE_TIMEOUT 30 /* seconds */ +#define REDIS_CLI_DEFAULT_PIPE_TIMEOUT 30 /* seconds */ static redisContext *context; static struct config { @@ -82,6 +82,8 @@ static struct config { int getrdb_mode; int stat_mode; int scan_mode; + int intrinsic_latency_mode; + int intrinsic_latency_duration; char *pattern; char *rdb_filename; int bigkeys; @@ -732,6 +734,9 @@ static int parseOptions(int argc, char **argv) { config.scan_mode = 1; } else if (!strcmp(argv[i],"--pattern")) { config.pattern = argv[++i]; + } else if (!strcmp(argv[i],"--intrinsic-latency") && !lastarg) { + config.intrinsic_latency_mode = 1; + config.intrinsic_latency_duration = atoi(argv[++i]); } else if (!strcmp(argv[i],"--rdb") && !lastarg) { config.getrdb_mode = 1; config.rdb_filename = argv[++i]; @@ -817,6 +822,8 @@ static void usage() { " --bigkeys Sample Redis keys looking for big keys.\n" " --scan List all keys using the SCAN command.\n" " --pattern Useful with --scan to specify a SCAN pattern.\n" +" --intrinsic-latency Run a test to measure intrinsic system latency.\n" +" The test will run for the specified amount of seconds.\n" " --eval Send an EVAL command using the Lua script at .\n" " --help Output this help and exit.\n" " --version Output version and exit.\n" @@ -834,7 +841,7 @@ static void usage() { "When no command is given, redis-cli starts in interactive mode.\n" "Type \"help\" in interactive mode for information on available commands.\n" "\n", - version, REDIS_DEFAULT_PIPE_TIMEOUT); + version, REDIS_CLI_DEFAULT_PIPE_TIMEOUT); sdsfree(version); exit(1); } @@ -1581,6 +1588,73 @@ static void scanMode() { exit(0); } +/*------------------------------------------------------------------------------ + * Intrisic latency mode. + * + * Measure max latency of a running process that does not result from + * syscalls. Basically this software should provide an hint about how much + * time the kernel leaves the process without a chance to run. + *--------------------------------------------------------------------------- */ + +/* This is just some computation the compiler can't optimize out. + * Should run in less than 100-200 microseconds even using very + * slow hardware. Runs in less than 10 microseconds in modern HW. */ +uint64_t compute_something_fast(void) { + uint8_t s[256], i, j, t; + int count = 1000, k; + uint64_t output = 0; + + for (k = 0; k < 256; k++) s[k] = k; + + i = 0; + j = 0; + while(count--) { + i++; + j = j + s[i]; + t = s[i]; + s[i] = s[j]; + s[j] = t; + output += s[(s[i]+s[j])&255]; + } + return output; +} + +static void intrinsicLatencyMode(void) { + long long test_end, run_time, max_latency = 0, runs = 0; + + run_time = config.intrinsic_latency_duration*1000000; + test_end = ustime() + run_time; + + while(1) { + long long start, end, latency; + + start = ustime(); + compute_something_fast(); + end = ustime(); + latency = end-start; + runs++; + if (latency <= 0) continue; + + /* Reporting */ + if (latency > max_latency) { + max_latency = latency; + printf("Max latency so far: %lld microseconds.\n", max_latency); + } + + if (end > test_end) { + printf("\n%lld total runs (avg %lld microseconds per run).\n", + runs, run_time/runs); + printf("Worst run took %.02fx times the avarege.\n", + (double) max_latency / (run_time/runs)); + exit(0); + } + } +} + +/*------------------------------------------------------------------------------ + * Program main() + *--------------------------------------------------------------------------- */ + int main(int argc, char **argv) { int firstarg; @@ -1601,10 +1675,11 @@ int main(int argc, char **argv) { config.getrdb_mode = 0; config.stat_mode = 0; config.scan_mode = 0; + config.intrinsic_latency_mode = 0; config.pattern = NULL; config.rdb_filename = NULL; config.pipe_mode = 0; - config.pipe_timeout = REDIS_DEFAULT_PIPE_TIMEOUT; + config.pipe_timeout = REDIS_CLI_DEFAULT_PIPE_TIMEOUT; config.bigkeys = 0; config.stdinarg = 0; config.auth = NULL; @@ -1663,6 +1738,9 @@ int main(int argc, char **argv) { scanMode(); } + /* Intrinsic latency mode */ + if (config.intrinsic_latency_mode) intrinsicLatencyMode(); + /* Start interactive mode when no command is provided */ if (argc == 0 && !config.eval) { /* Note that in repl mode we don't abort on connection error. From 5580350a7b179bffc96b277dca2958c64ac773a8 Mon Sep 17 00:00:00 2001 From: antirez Date: Tue, 25 Feb 2014 12:38:24 +0100 Subject: [PATCH 0074/1648] redis-cli: check argument existence for --pattern. --- src/redis-cli.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/redis-cli.c b/src/redis-cli.c index c6de505175f..b2bd7b79687 100644 --- a/src/redis-cli.c +++ b/src/redis-cli.c @@ -732,7 +732,7 @@ static int parseOptions(int argc, char **argv) { config.stat_mode = 1; } else if (!strcmp(argv[i],"--scan")) { config.scan_mode = 1; - } else if (!strcmp(argv[i],"--pattern")) { + } else if (!strcmp(argv[i],"--pattern") && !lastarg) { config.pattern = argv[++i]; } else if (!strcmp(argv[i],"--intrinsic-latency") && !lastarg) { config.intrinsic_latency_mode = 1; From ba993cc685811d5e75acc9ff562c53c505368f0f Mon Sep 17 00:00:00 2001 From: antirez Date: Tue, 25 Feb 2014 13:43:47 +0100 Subject: [PATCH 0075/1648] redis-cli: don't use uint64_t where actually not needed. The computation is just something to take the CPU busy, no need to use a specific type. Since stdint.h was not included this prevented compilation on certain systems. --- src/redis-cli.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/redis-cli.c b/src/redis-cli.c index b2bd7b79687..8d34a5c4127 100644 --- a/src/redis-cli.c +++ b/src/redis-cli.c @@ -1599,10 +1599,10 @@ static void scanMode() { /* This is just some computation the compiler can't optimize out. * Should run in less than 100-200 microseconds even using very * slow hardware. Runs in less than 10 microseconds in modern HW. */ -uint64_t compute_something_fast(void) { +unsigned long compute_something_fast(void) { uint8_t s[256], i, j, t; int count = 1000, k; - uint64_t output = 0; + unsigned long output = 0; for (k = 0; k < 256; k++) s[k] = k; From a2c76ffb1c307e0d77a46d8e9e104778d4b8c626 Mon Sep 17 00:00:00 2001 From: antirez Date: Tue, 25 Feb 2014 13:47:37 +0100 Subject: [PATCH 0076/1648] redis-cli: also remove useless uint8_t. --- src/redis-cli.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/redis-cli.c b/src/redis-cli.c index 8d34a5c4127..c82720dd65a 100644 --- a/src/redis-cli.c +++ b/src/redis-cli.c @@ -1600,7 +1600,7 @@ static void scanMode() { * Should run in less than 100-200 microseconds even using very * slow hardware. Runs in less than 10 microseconds in modern HW. */ unsigned long compute_something_fast(void) { - uint8_t s[256], i, j, t; + unsigned char s[256], i, j, t; int count = 1000, k; unsigned long output = 0; From fef88681f287ddbd54123d581d02d59771e57311 Mon Sep 17 00:00:00 2001 From: antirez Date: Tue, 25 Feb 2014 14:33:44 +0100 Subject: [PATCH 0077/1648] Sentinel test: kill masters instead of using DEBUG SLEEP in all tests. --- tests/sentinel-tests/00-base.tcl | 14 ++++++++++---- tests/sentinel-tests/01-conf-update.tcl | 3 ++- 2 files changed, 12 insertions(+), 5 deletions(-) diff --git a/tests/sentinel-tests/00-base.tcl b/tests/sentinel-tests/00-base.tcl index dd87714666b..74308b7752c 100644 --- a/tests/sentinel-tests/00-base.tcl +++ b/tests/sentinel-tests/00-base.tcl @@ -6,7 +6,7 @@ test "Basic failover works if the master is down" { set old_port [RI $master_id tcp_port] set addr [S 0 SENTINEL GET-MASTER-ADDR-BY-NAME mymaster] assert {[lindex $addr 1] == $old_port} - R $master_id debug sleep 10 + kill_instance redis $master_id foreach_sentinel_id id { wait_for_condition 100 50 { [lindex [S $id SENTINEL GET-MASTER-ADDR-BY-NAME mymaster] 1] != $old_port @@ -14,6 +14,7 @@ test "Basic failover works if the master is down" { fail "At least one Sentinel did not received failover info" } } + restart_instance redis $master_id set addr [S 0 SENTINEL GET-MASTER-ADDR-BY-NAME mymaster] set master_id [get_instance_id_by_port redis [lindex $addr 1]] } @@ -49,11 +50,12 @@ test "ODOWN is not possible without N (quorum) Sentinels reports" { set old_port [RI $master_id tcp_port] set addr [S 0 SENTINEL GET-MASTER-ADDR-BY-NAME mymaster] assert {[lindex $addr 1] == $old_port} - R $master_id debug sleep 10 + kill_instance redis $master_id # Make sure failover did not happened. set addr [S 0 SENTINEL GET-MASTER-ADDR-BY-NAME mymaster] assert {[lindex $addr 1] == $old_port} + restart_instance redis $master_id } test "Failover is not possible without majority agreement" { @@ -66,11 +68,13 @@ test "Failover is not possible without majority agreement" { kill_instance sentinel $id } - R $master_id debug sleep 10 + # Kill the current master + kill_instance redis $master_id # Make sure failover did not happened. set addr [S $quorum SENTINEL GET-MASTER-ADDR-BY-NAME mymaster] assert {[lindex $addr 1] == $old_port} + restart_instance redis $master_id # Cleanup: restart Sentinels to monitor the master. for {set id 0} {$id < $quorum} {incr id} { @@ -92,7 +96,8 @@ test "Failover works if we configure for absolute agreement" { } } - R $master_id debug sleep 10 + kill_instance redis $master_id + foreach_sentinel_id id { wait_for_condition 1000 50 { [lindex [S $id SENTINEL GET-MASTER-ADDR-BY-NAME mymaster] 1] != $old_port @@ -100,6 +105,7 @@ test "Failover works if we configure for absolute agreement" { fail "At least one Sentinel did not received failover info" } } + restart_instance redis $master_id set addr [S 0 SENTINEL GET-MASTER-ADDR-BY-NAME mymaster] set master_id [get_instance_id_by_port redis [lindex $addr 1]] diff --git a/tests/sentinel-tests/01-conf-update.tcl b/tests/sentinel-tests/01-conf-update.tcl index 7f3d1c3661e..3cf26926e6e 100644 --- a/tests/sentinel-tests/01-conf-update.tcl +++ b/tests/sentinel-tests/01-conf-update.tcl @@ -10,7 +10,7 @@ test "We can failover with Sentinel 1 crashed" { # Crash Sentinel 1 kill_instance sentinel 1 - R $master_id debug sleep 10 + kill_instance redis $master_id foreach_sentinel_id id { if {$id != 1} { wait_for_condition 1000 50 { @@ -20,6 +20,7 @@ test "We can failover with Sentinel 1 crashed" { } } } + restart_instance redis $master_id set addr [S 0 SENTINEL GET-MASTER-ADDR-BY-NAME mymaster] set master_id [get_instance_id_by_port redis [lindex $addr 1]] } From 013a4ce242eb4c13c4eee68b54a68f3e7b23f0f3 Mon Sep 17 00:00:00 2001 From: michael-grunder Date: Tue, 25 Feb 2014 05:41:30 -0800 Subject: [PATCH 0078/1648] Update --bigkeys to use SCAN This commit changes the findBigKeys() function in redis-cli.c to use the new SCAN command for iterating the keyspace, rather than RANDOMKEY. Because we can know when we're done using SCAN, it will exit after exhausting the keyspace. --- src/redis-cli.c | 140 ++++++++++++++++++++++++++++-------------------- 1 file changed, 82 insertions(+), 58 deletions(-) diff --git a/src/redis-cli.c b/src/redis-cli.c index 7d411f8aef1..e703cf4c567 100644 --- a/src/redis-cli.c +++ b/src/redis-cli.c @@ -1300,81 +1300,105 @@ static void pipeMode(void) { static void findBigKeys(void) { unsigned long long biggest[5] = {0,0,0,0,0}; unsigned long long samples = 0; - redisReply *reply1, *reply2, *reply3 = NULL; - char *sizecmd, *typename[] = {"string","list","set","hash","zset"}; + redisReply *reply1, *reply2, *reply3 = NULL, *keys; + char *key, *sizecmd, *typename[] = {"string","list","set","hash","zset"}; char *typeunit[] = {"bytes","items","members","fields","members"}; - int type; + int type, it=0, i; printf("\n# Press ctrl+c when you have had enough of it... :)\n"); - printf("# You can use -i 0.1 to sleep 0.1 sec every 100 sampled keys\n"); + printf("# You can use -i 0.1 to sleep 0.1 sec per 100 SCANS\n"); printf("# in order to reduce server load (usually not needed).\n\n"); - while(1) { - /* Sample with RANDOMKEY */ - reply1 = redisCommand(context,"RANDOMKEY"); - if (reply1 == NULL) { - fprintf(stderr,"\nI/O error\n"); + + do { + /* Grab some keys with SCAN */ + reply1 = redisCommand(context, "SCAN %d", it); + if(reply1 == NULL) { + fprintf(stderr, "\nI/O error\n"); exit(1); - } else if (reply1->type == REDIS_REPLY_ERROR) { - fprintf(stderr, "RANDOMKEY error: %s\n", - reply1->str); + } else if(reply1->type == REDIS_REPLY_ERROR) { + fprintf(stderr, "SCAN error: %s\n", reply1->str); exit(1); - } else if (reply1->type == REDIS_REPLY_NIL) { - fprintf(stderr, "It looks like the database is empty!\n"); + } else if(reply1->type != REDIS_REPLY_ARRAY) { + fprintf(stderr, "Non ARRAY response from SCAN!\n"); exit(1); - } - - /* Get the key type */ - reply2 = redisCommand(context,"TYPE %s",reply1->str); - assert(reply2 && reply2->type == REDIS_REPLY_STATUS); - samples++; - - /* Get the key "size" */ - if (!strcmp(reply2->str,"string")) { - sizecmd = "STRLEN"; - type = TYPE_STRING; - } else if (!strcmp(reply2->str,"list")) { - sizecmd = "LLEN"; - type = TYPE_LIST; - } else if (!strcmp(reply2->str,"set")) { - sizecmd = "SCARD"; - type = TYPE_SET; - } else if (!strcmp(reply2->str,"hash")) { - sizecmd = "HLEN"; - type = TYPE_HASH; - } else if (!strcmp(reply2->str,"zset")) { - sizecmd = "ZCARD"; - type = TYPE_ZSET; - } else if (!strcmp(reply2->str,"none")) { - freeReplyObject(reply1); - freeReplyObject(reply2); - continue; - } else { - fprintf(stderr, "Unknown key type '%s' for key '%s'\n", - reply2->str, reply1->str); + } else if(reply1->elements!=2) { + fprintf(stderr, "Invalid SCAN result!\n"); exit(1); } - - reply3 = redisCommand(context,"%s %s", sizecmd, reply1->str); - if (reply3 && reply3->type == REDIS_REPLY_INTEGER) { - if (biggest[type] < reply3->integer) { - printf("Biggest %-6s found so far '%s' with %llu %s.\n", - typename[type], reply1->str, - (unsigned long long) reply3->integer, - typeunit[type]); - biggest[type] = reply3->integer; + + /* Validate the SCAN response */ + assert(reply1->element[0]->type == REDIS_REPLY_STRING); + assert(reply1->element[1]->type == REDIS_REPLY_ARRAY); + + /* Update iterator and grab pointer to keys */ + it = atoi(reply1->element[0]->str); + keys = reply1->element[1]; + + /* Iterate keys that SCAN returned */ + for(i=0;ielements;i++) { + /* Make sure we've got a string, grab it, and increment samples */ + assert(keys->element[i]->type == REDIS_REPLY_STRING); + key = keys->element[i]->str; + samples++; + + /* Get the key type */ + reply2 = redisCommand(context, "TYPE %s", key); + assert(reply2 && reply2->type == REDIS_REPLY_STATUS); + + /* Get the key "size" */ + if (!strcmp(reply2->str,"string")) { + sizecmd = "STRLEN"; + type = TYPE_STRING; + } else if (!strcmp(reply2->str,"list")) { + sizecmd = "LLEN"; + type = TYPE_LIST; + } else if (!strcmp(reply2->str,"set")) { + sizecmd = "SCARD"; + type = TYPE_SET; + } else if (!strcmp(reply2->str,"hash")) { + sizecmd = "HLEN"; + type = TYPE_HASH; + } else if (!strcmp(reply2->str,"zset")) { + sizecmd = "ZCARD"; + type = TYPE_ZSET; + } else if (!strcmp(reply2->str,"none")) { + freeReplyObject(reply1); + freeReplyObject(reply2); + continue; + } else { + fprintf(stderr, "Unknown key type '%s' for key '%s'\n", + reply2->str, key); + exit(1); + } + + /* The size command */ + reply3 = redisCommand(context,"%s %s", sizecmd, key); + if (reply3 && reply3->type == REDIS_REPLY_INTEGER) { + if (biggest[type] < reply3->integer) { + printf("Biggest %-6s found so far '%s' with %llu %s.\n", + typename[type], key, + (unsigned long long) reply3->integer, + typeunit[type]); + biggest[type] = reply3->integer; + } } + + freeReplyObject(reply2); + if(reply3) freeReplyObject(reply3); } - if ((samples % 1000000) == 0) + if (samples && (samples % 1000000) == 0) printf("(%llu keys sampled)\n", samples); - if ((samples % 100) == 0 && config.interval) + if (samples && (samples % 100) == 0 && config.interval) usleep(config.interval); freeReplyObject(reply1); - freeReplyObject(reply2); - if (reply3) freeReplyObject(reply3); - } + } while(it != 0); + + /* We've finished scanning the keyspace */ + printf("\n# Scanned all %llu keys in the keyspace!\n", samples); + exit(0); } /* Return the specified INFO field from the INFO command output "info". From 339322fe58f711ee5c5ebaad114bcb3017059188 Mon Sep 17 00:00:00 2001 From: antirez Date: Tue, 25 Feb 2014 14:59:36 +0100 Subject: [PATCH 0079/1648] Sentinel test: check role at end of unit 01. --- tests/sentinel-tests/01-conf-update.tcl | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/tests/sentinel-tests/01-conf-update.tcl b/tests/sentinel-tests/01-conf-update.tcl index 3cf26926e6e..4625ebd4c43 100644 --- a/tests/sentinel-tests/01-conf-update.tcl +++ b/tests/sentinel-tests/01-conf-update.tcl @@ -33,3 +33,7 @@ test "After Sentinel 1 is restarted, its config gets updated" { fail "Restarted Sentinel did not received failover info" } } + +test "New master [join $addr {:}] role matches" { + assert {[RI $master_id role] eq {master}} +} From fe9a489b29ac6f9f138ab1f94bc1c862fe0af2f3 Mon Sep 17 00:00:00 2001 From: antirez Date: Tue, 25 Feb 2014 15:21:53 +0100 Subject: [PATCH 0080/1648] Sentinel test: added TODO items in 02 unit. --- tests/sentinel-tests/02-slaves-reconf.tcl | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/tests/sentinel-tests/02-slaves-reconf.tcl b/tests/sentinel-tests/02-slaves-reconf.tcl index 49ecb1ab918..753580a0ec6 100644 --- a/tests/sentinel-tests/02-slaves-reconf.tcl +++ b/tests/sentinel-tests/02-slaves-reconf.tcl @@ -1 +1,6 @@ # Check that slaves are reconfigured at a latter time if they are partitioned. +# +# Here we should test: +# 1) That slaves point to the new master after failover. +# 2) That partitioned slaves point to new master when they are partitioned +# away during failover and return at a latter time. From 55c059e27078e86cd097707be67faf6280f5321b Mon Sep 17 00:00:00 2001 From: antirez Date: Tue, 25 Feb 2014 15:36:51 +0100 Subject: [PATCH 0081/1648] Sentinel test: add stub for unit 04. --- tests/sentinel-tests/04-slave-selection.tcl | 5 +++++ 1 file changed, 5 insertions(+) create mode 100644 tests/sentinel-tests/04-slave-selection.tcl diff --git a/tests/sentinel-tests/04-slave-selection.tcl b/tests/sentinel-tests/04-slave-selection.tcl new file mode 100644 index 00000000000..3d2ca648458 --- /dev/null +++ b/tests/sentinel-tests/04-slave-selection.tcl @@ -0,0 +1,5 @@ +# Test slave selection algorithm. +# +# This unit should test: +# 1) That when there are no suitable slaves no failover is performed. +# 2) That among the available slaves, the one with better offset is picked. From d769cad4bf37ac913bf41d3b555b2b3d765dc829 Mon Sep 17 00:00:00 2001 From: Matt Stancliff Date: Tue, 25 Feb 2014 16:02:28 -0500 Subject: [PATCH 0082/1648] Fix IP representation in clusterMsgDataGossip --- src/cluster.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/cluster.h b/src/cluster.h index aae13cd4071..654274b9bb4 100644 --- a/src/cluster.h +++ b/src/cluster.h @@ -148,7 +148,7 @@ typedef struct { char nodename[REDIS_CLUSTER_NAMELEN]; uint32_t ping_sent; uint32_t pong_received; - char ip[16]; /* IP address last time it was seen */ + char ip[REDIS_IP_STR_LEN]; /* IP address last time it was seen */ uint16_t port; /* port last time it was seen */ uint16_t flags; uint32_t notused; /* for 64 bit alignment */ From 746ce35f5f7296fbf829cfde031820ea8dffdc89 Mon Sep 17 00:00:00 2001 From: antirez Date: Thu, 27 Feb 2014 09:46:20 +0100 Subject: [PATCH 0083/1648] Fix misaligned word access in redisPopcount(). --- src/bitops.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/src/bitops.c b/src/bitops.c index 599d4dd8ee0..0de5abc0aae 100644 --- a/src/bitops.c +++ b/src/bitops.c @@ -60,11 +60,18 @@ static int getBitOffsetFromArgument(redisClient *c, robj *o, size_t *offset) { * work with a input string length up to 512 MB. */ size_t redisPopcount(void *s, long count) { size_t bits = 0; - unsigned char *p; - uint32_t *p4 = s; + unsigned char *p = s; + uint32_t *p4; static const unsigned char bitsinbyte[256] = {0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4,1,2,2,3,2,3,3,4,2,3,3,4,3,4,4,5,1,2,2,3,2,3,3,4,2,3,3,4,3,4,4,5,2,3,3,4,3,4,4,5,3,4,4,5,4,5,5,6,1,2,2,3,2,3,3,4,2,3,3,4,3,4,4,5,2,3,3,4,3,4,4,5,3,4,4,5,4,5,5,6,2,3,3,4,3,4,4,5,3,4,4,5,4,5,5,6,3,4,4,5,4,5,5,6,4,5,5,6,5,6,6,7,1,2,2,3,2,3,3,4,2,3,3,4,3,4,4,5,2,3,3,4,3,4,4,5,3,4,4,5,4,5,5,6,2,3,3,4,3,4,4,5,3,4,4,5,4,5,5,6,3,4,4,5,4,5,5,6,4,5,5,6,5,6,6,7,2,3,3,4,3,4,4,5,3,4,4,5,4,5,5,6,3,4,4,5,4,5,5,6,4,5,5,6,5,6,6,7,3,4,4,5,4,5,5,6,4,5,5,6,5,6,6,7,4,5,5,6,5,6,6,7,5,6,6,7,6,7,7,8}; + /* Count initial bytes not aligned to 32 bit. */ + while((unsigned long)p & 3 && count) { + bits += bitsinbyte[*p++]; + count--; + } + /* Count bits 16 bytes at a time */ + p4 = (uint32_t*)p; while(count>=16) { uint32_t aux1, aux2, aux3, aux4; From 2a7847a3b5732667f1b63a12f6cdbd25285dccf8 Mon Sep 17 00:00:00 2001 From: antirez Date: Thu, 27 Feb 2014 10:00:17 +0100 Subject: [PATCH 0084/1648] BITCOUNT fuzzy test with random start/end added. It was verified in practice that this test is able to stress much more the implementation by introducing errors that were only trivially to detect with different offsets but impossible to detect starting always at zero and counting bits the full length of the string. --- tests/unit/bitops.tcl | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-) diff --git a/tests/unit/bitops.tcl b/tests/unit/bitops.tcl index dade8923ee6..a145199a3e9 100644 --- a/tests/unit/bitops.tcl +++ b/tests/unit/bitops.tcl @@ -52,7 +52,7 @@ start_server {tags {"bitops"}} { } } - test {BITCOUNT fuzzing} { + test {BITCOUNT fuzzing without start/end} { for {set j 0} {$j < 100} {incr j} { set str [randstring 0 3000] r set str $str @@ -60,6 +60,20 @@ start_server {tags {"bitops"}} { } } + test {BITCOUNT fuzzing with start/end} { + for {set j 0} {$j < 100} {incr j} { + set str [randstring 0 3000] + r set str $str + set l [string length $str] + set start [randomInt $l] + set end [randomInt $l] + if {$start > $end} { + lassign [list $end $start] start end + } + assert {[r bitcount str $start $end] == [count_bits [string range $str $start $end]]} + } + } + test {BITCOUNT with start, end} { r set s "foobar" assert_equal [r bitcount s 0 -1] [count_bits "foobar"] From 8d95a47408e5b1accc8e51d182684e8c83cd133d Mon Sep 17 00:00:00 2001 From: antirez Date: Thu, 27 Feb 2014 10:07:29 +0100 Subject: [PATCH 0085/1648] Added two more BITCOUNT tests stressing misaligned access. --- tests/unit/bitops.tcl | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/tests/unit/bitops.tcl b/tests/unit/bitops.tcl index a145199a3e9..e7f289f9480 100644 --- a/tests/unit/bitops.tcl +++ b/tests/unit/bitops.tcl @@ -98,6 +98,18 @@ start_server {tags {"bitops"}} { } } {1} + test {BITCOUNT misaligned prefix} { + r del str + r set str ab + r bitcount str 1 -1 + } {3} + + test {BITCOUNT misaligned prefix + full words + remainder} { + r del str + r set str __PPxxxxxxxxxxxxxxxxRR__ + r bitcount str 2 -3 + } {74} + test {BITOP NOT (empty string)} { r set s "" r bitop not dest s From 38c620b3b5676281463ab155e2447889f984b44f Mon Sep 17 00:00:00 2001 From: antirez Date: Thu, 27 Feb 2014 12:40:58 +0100 Subject: [PATCH 0086/1648] Initial implementation of BITPOS. It appears to work but more stress testing, and both unit tests and fuzzy testing, is needed in order to ensure the implementation is sane. --- src/bitops.c | 170 ++++++++++++++++++++++++++++++++++++++++++++++++++- src/redis.c | 1 + src/redis.h | 1 + 3 files changed, 171 insertions(+), 1 deletion(-) diff --git a/src/bitops.c b/src/bitops.c index 0de5abc0aae..7715a1ff64c 100644 --- a/src/bitops.c +++ b/src/bitops.c @@ -94,12 +94,99 @@ size_t redisPopcount(void *s, long count) { ((((aux3 + (aux3 >> 4)) & 0x0F0F0F0F) * 0x01010101) >> 24) + ((((aux4 + (aux4 >> 4)) & 0x0F0F0F0F) * 0x01010101) >> 24); } - /* Count the remaining bytes */ + /* Count the remaining bytes. */ p = (unsigned char*)p4; while(count--) bits += bitsinbyte[*p++]; return bits; } +/* Return the position of the first bit set to one (if 'bit' is 1) or + * zero (if 'bit' is 0) in the bitmap starting at 's' and long 'count' bytes. + * + * The function is guaranteed to return a value >= 0 if 'bit' is 0 since if + * no zero bit is found, it returns count*8 assuming the string is zero + * padded on the right. However if 'bit' is 1 it is possible that there is + * not a single set bit in the bitmap. In this special case -1 is returned. */ +long redisBitpos(void *s, long count, int bit) { + unsigned long *l; + unsigned char *c; + unsigned long skipval, word = 0, one; + long pos = 0; /* Position of bit, to return to the caller. */ + int j; + + /* Process whole words first, seeking for first word that is not + * all ones or all zeros respectively if we are lookig for zeros + * or ones. This is much faster with large strings having contiguous + * blocks of 1 or 0 bits compared to the vanilla bit per bit processing. + * + * Note that if we start from an address that is not aligned + * to sizeof(unsigned long) we consume it byte by byte until it is + * aligned. */ + + /* Skip initial bits not aligned to sizeof(unsigned long) byte by byte. */ + skipval = bit ? 0 : UCHAR_MAX; + c = (unsigned char*) s; + while((unsigned long)c & (sizeof(*l)-1) && count) { + if (*c != skipval) break; + c++; + count--; + pos += 8; + } + + /* Skip bits with full word step. */ + skipval = bit ? 0 : ULONG_MAX; + l = (unsigned long*) c; + while (count >= sizeof(*l)) { + if (*l != skipval) break; + l++; + count -= sizeof(*l); + pos += sizeof(*l)*8; + } + + /* Load bytes into "word" considering the first byte as the most significant + * (we basically consider it as written in big endian, since we consider the + * string as a set of bits from left to right, with the first bit at position + * zero. + * + * Note that the loading is designed to work even when the bytes left + * (count) are less than a full word. We pad it with zero on the right. */ + c = (unsigned char*)l; + for (j = 0; j < sizeof(*l); j++) { + word <<= 8; + if (count) { + word |= *c; + c++; + count--; + } + } + + /* Special case: + * If bits in the string are all zero and we are looking for one, + * return -1 to signal that there is not a single "1" in the whole + * string. This can't happen when we are looking for "0" as we assume + * that the right of the string is zero padded. */ + if (bit == 1 && word == 0) return -1; + + /* Last word left, scan bit by bit. The first thing we need is to + * have a single "1" set in the most significant position in an + * unsigned long. We don't know the size of the long so we use a + * simple trick. */ + one = ULONG_MAX; /* All bits set to 1.*/ + one >>= 1; /* All bits set to 1 but the MSB. */ + one = ~one; /* All bits set to 0 but the MSB. */ + + while(one) { + if (((one & word) != 0) == bit) return pos; + pos++; + one >>= 1; + } + + /* If we reached this point, there is a bug in the algorithm, since + * the case of no match is handled as a special case before. */ + redisPanic("End of redisBitpos() reached."); + return 0; /* Just to avoid warnigns. */ +} + /* ----------------------------------------------------------------------------- * Bits related string commands: GETBIT, SETBIT, BITCOUNT, BITOP. * -------------------------------------------------------------------------- */ @@ -417,3 +504,84 @@ void bitcountCommand(redisClient *c) { addReplyLongLong(c,redisPopcount(p+start,bytes)); } } + +/* BITPOS key bit [start end] */ +void bitposCommand(redisClient *c) { + robj *o; + long bit, start, end, strlen; + unsigned char *p; + char llbuf[32]; + + /* Parse the bit argument to understand what we are looking for, set + * or clear bits. */ + if (getLongFromObjectOrReply(c,c->argv[2],&bit,NULL) != REDIS_OK) + return; + if (bit != 0 && bit != 1) { + addReplyError(c, "The bit argument must be 1 or 0."); + return; + } + + /* If the key does not exist, from our point of view it is an infinite + * array of 0 bits. If the user is looking for the fist clear bit return 0, + * If the user is looking for the first set bit, return -1. */ + if ((o = lookupKeyRead(c->db,c->argv[1])) == NULL) { + addReplyLongLong(c, bit ? -1 : 0); + return; + } + if (checkType(c,o,REDIS_STRING)) return; + + /* Set the 'p' pointer to the string, that can be just a stack allocated + * array if our string was integer encoded. */ + if (o->encoding == REDIS_ENCODING_INT) { + p = (unsigned char*) llbuf; + strlen = ll2string(llbuf,sizeof(llbuf),(long)o->ptr); + } else { + p = (unsigned char*) o->ptr; + strlen = sdslen(o->ptr); + } + + /* Parse start/end range if any. */ + if (c->argc == 5) { + if (getLongFromObjectOrReply(c,c->argv[3],&start,NULL) != REDIS_OK) + return; + if (getLongFromObjectOrReply(c,c->argv[4],&end,NULL) != REDIS_OK) + return; + /* Convert negative indexes */ + if (start < 0) start = strlen+start; + if (end < 0) end = strlen+end; + if (start < 0) start = 0; + if (end < 0) end = 0; + if (end >= strlen) end = strlen-1; + } else if (c->argc == 3) { + /* The whole string. */ + start = 0; + end = strlen-1; + } else { + /* Syntax error. */ + addReply(c,shared.syntaxerr); + return; + } + + /* For empty ranges (start > end) we return -1 as an empty range does + * not contain a 0 nor a 1. */ + if (start > end) { + addReplyLongLong(c, -1); + } else { + long bytes = end-start+1; + long pos = redisBitpos(p+start,bytes,bit); + + /* If we are looking for clear bits, and our range does not includes + * the end of the string, but terminates before, we can't consider the + * right of the range as zero padded. + * + * so if redisBitpos() returns the first bit outside the string, + * we return -1 to the caller, to mean, in the specified range there + * is not a single "0" bit. */ + if (end != strlen-1 && bit == 0 && pos == bytes*8) { + addReplyLongLong(c,-1); + return; + } + if (pos != -1) pos += start*8; /* Adjust for the bytes we skipped. */ + addReplyLongLong(c,pos); + } +} diff --git a/src/redis.c b/src/redis.c index c8562c85d88..f132bb563d6 100644 --- a/src/redis.c +++ b/src/redis.c @@ -266,6 +266,7 @@ struct redisCommand redisCommandTable[] = { {"time",timeCommand,1,"rR",0,NULL,0,0,0,0,0}, {"bitop",bitopCommand,-4,"wm",0,NULL,2,-1,1,0,0}, {"bitcount",bitcountCommand,-2,"r",0,NULL,1,1,1,0,0}, + {"bitpos",bitposCommand,-3,"r",0,NULL,1,1,1,0,0}, {"wait",waitCommand,3,"rs",0,NULL,0,0,0,0,0} }; diff --git a/src/redis.h b/src/redis.h index 618dfac439b..c09d410e656 100644 --- a/src/redis.h +++ b/src/redis.h @@ -1420,6 +1420,7 @@ void scriptCommand(redisClient *c); void timeCommand(redisClient *c); void bitopCommand(redisClient *c); void bitcountCommand(redisClient *c); +void bitposCommand(redisClient *c); void replconfCommand(redisClient *c); void waitCommand(redisClient *c); From 0e31eaa27f995fead02b861a7df403771ce772d2 Mon Sep 17 00:00:00 2001 From: antirez Date: Thu, 27 Feb 2014 12:53:03 +0100 Subject: [PATCH 0087/1648] More consistent BITPOS behavior with bit=0 and ranges. With the new behavior it is possible to specify just the start in the range (the end will be assumed to be the first byte), or it is possible to specify both start and end. This is useful to change the behavior of the command when looking for zeros inside a string. 1) If the user specifies both start and end, and no 0 is found inside the range, the command returns -1. 2) If instead no range is specified, or just the start is given, even if in the actual string no 0 bit is found, the command returns the first bit on the right after the end of the string. So for example if the string stored at key foo is "\xff\xff": BITPOS foo (returns 16) BITPOS foo 0 -1 (returns -1) BITPOS foo 0 (returns 16) The idea is that when no end is given the user is just looking for the first bit that is zero and can be set to 1 with SETBIT, as it is "available". Instead when a specific range is given, we just look for a zero within the boundaries of the range. --- src/bitops.c | 24 +++++++++++++++--------- 1 file changed, 15 insertions(+), 9 deletions(-) diff --git a/src/bitops.c b/src/bitops.c index 7715a1ff64c..a87ed041177 100644 --- a/src/bitops.c +++ b/src/bitops.c @@ -505,12 +505,13 @@ void bitcountCommand(redisClient *c) { } } -/* BITPOS key bit [start end] */ +/* BITPOS key bit [start [end]] */ void bitposCommand(redisClient *c) { robj *o; long bit, start, end, strlen; unsigned char *p; char llbuf[32]; + int end_given = 0; /* Parse the bit argument to understand what we are looking for, set * or clear bits. */ @@ -541,11 +542,16 @@ void bitposCommand(redisClient *c) { } /* Parse start/end range if any. */ - if (c->argc == 5) { + if (c->argc == 4 || c->argc == 5) { if (getLongFromObjectOrReply(c,c->argv[3],&start,NULL) != REDIS_OK) return; - if (getLongFromObjectOrReply(c,c->argv[4],&end,NULL) != REDIS_OK) - return; + if (c->argc == 5) { + if (getLongFromObjectOrReply(c,c->argv[4],&end,NULL) != REDIS_OK) + return; + end_given = 1; + } else { + end = strlen-1; + } /* Convert negative indexes */ if (start < 0) start = strlen+start; if (end < 0) end = strlen+end; @@ -570,14 +576,14 @@ void bitposCommand(redisClient *c) { long bytes = end-start+1; long pos = redisBitpos(p+start,bytes,bit); - /* If we are looking for clear bits, and our range does not includes - * the end of the string, but terminates before, we can't consider the - * right of the range as zero padded. + /* If we are looking for clear bits, and the user specified an exact + * range with start-end, we can't consider the right of the range as + * zero padded (as we do when no explicit end is given). * - * so if redisBitpos() returns the first bit outside the string, + * So if redisBitpos() returns the first bit outside the range, * we return -1 to the caller, to mean, in the specified range there * is not a single "0" bit. */ - if (end != strlen-1 && bit == 0 && pos == bytes*8) { + if (end_given && bit == 0 && pos == bytes*8) { addReplyLongLong(c,-1); return; } From 76a6e82d8926be859a272fa60612e610334f93ad Mon Sep 17 00:00:00 2001 From: antirez Date: Thu, 27 Feb 2014 13:17:23 +0100 Subject: [PATCH 0088/1648] warnigns -> warnings in redisBitpos(). --- src/bitops.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/bitops.c b/src/bitops.c index a87ed041177..6cf02dc135c 100644 --- a/src/bitops.c +++ b/src/bitops.c @@ -184,7 +184,7 @@ long redisBitpos(void *s, long count, int bit) { /* If we reached this point, there is a bug in the algorithm, since * the case of no match is handled as a special case before. */ redisPanic("End of redisBitpos() reached."); - return 0; /* Just to avoid warnigns. */ + return 0; /* Just to avoid warnings. */ } /* ----------------------------------------------------------------------------- From b21f4d63de4db35900f7f9e999f606fb87ab3770 Mon Sep 17 00:00:00 2001 From: antirez Date: Thu, 27 Feb 2014 15:01:45 +0100 Subject: [PATCH 0089/1648] Basic BITPOS tests. --- tests/unit/bitops.tcl | 102 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 102 insertions(+) diff --git a/tests/unit/bitops.tcl b/tests/unit/bitops.tcl index e7f289f9480..a32ce26cfac 100644 --- a/tests/unit/bitops.tcl +++ b/tests/unit/bitops.tcl @@ -203,4 +203,106 @@ start_server {tags {"bitops"}} { r set a "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00" r bitop or x a b } {32} + + test {BITPOS bit=0 with empty key returns 0} { + r del str + r bitpos str 0 + } {0} + + test {BITPOS bit=1 with empty key returns -1} { + r del str + r bitpos str 1 + } {-1} + + test {BITPOS bit=0 with string less than 1 word works} { + r set str "\xff\xf0\x00" + r bitpos str 0 + } {12} + + test {BITPOS bit=1 with string less than 1 word works} { + r set str "\x00\x0f\x00" + r bitpos str 1 + } {12} + + test {BITPOS bit=0 starting at unaligned address} { + r set str "\xff\xf0\x00" + r bitpos str 0 1 + } {12} + + test {BITPOS bit=1 starting at unaligned address} { + r set str "\x00\x0f\xff" + r bitpos str 1 1 + } {12} + + test {BITPOS bit=0 unaligned+full word+reminder} { + r del str + r set str "\xff\xff\xff" ; # Prefix + # Followed by two (or four in 32 bit systems) full words + r append str "\xff\xff\xff\xff\xff\xff\xff\xff" + r append str "\xff\xff\xff\xff\xff\xff\xff\xff" + r append str "\xff\xff\xff\xff\xff\xff\xff\xff" + # First zero bit. + r append str "\x0f" + assert {[r bitpos str 0] == 216} + assert {[r bitpos str 0 1] == 216} + assert {[r bitpos str 0 2] == 216} + assert {[r bitpos str 0 3] == 216} + assert {[r bitpos str 0 4] == 216} + assert {[r bitpos str 0 5] == 216} + assert {[r bitpos str 0 6] == 216} + assert {[r bitpos str 0 7] == 216} + assert {[r bitpos str 0 8] == 216} + } + + test {BITPOS bit=1 unaligned+full word+reminder} { + r del str + r set str "\x00\x00\x00" ; # Prefix + # Followed by two (or four in 32 bit systems) full words + r append str "\x00\x00\x00\x00\x00\x00\x00\x00" + r append str "\x00\x00\x00\x00\x00\x00\x00\x00" + r append str "\x00\x00\x00\x00\x00\x00\x00\x00" + # First zero bit. + r append str "\xf0" + assert {[r bitpos str 1] == 216} + assert {[r bitpos str 1 1] == 216} + assert {[r bitpos str 1 2] == 216} + assert {[r bitpos str 1 3] == 216} + assert {[r bitpos str 1 4] == 216} + assert {[r bitpos str 1 5] == 216} + assert {[r bitpos str 1 6] == 216} + assert {[r bitpos str 1 7] == 216} + assert {[r bitpos str 1 8] == 216} + } + + test {BITPOS bit=1 returns -1 if string is all 0 bits} { + r set str "" + for {set j 0} {$j < 20} {incr j} { + assert {[r bitpos str 1] == -1} + r append str "\x00" + } + } + + test {BITPOS bit=0 works with intervals} { + r set str "\x00\xff\x00" + assert {[r bitpos str 0 0 -1] == 0} + assert {[r bitpos str 0 1 -1] == 16} + assert {[r bitpos str 0 2 -1] == 16} + assert {[r bitpos str 0 2 200] == 16} + assert {[r bitpos str 0 1 1] == -1} + } + + test {BITPOS bit=1 works with intervals} { + r set str "\x00\xff\x00" + assert {[r bitpos str 1 0 -1] == 8} + assert {[r bitpos str 1 1 -1] == 8} + assert {[r bitpos str 1 2 -1] == -1} + assert {[r bitpos str 1 2 200] == -1} + assert {[r bitpos str 1 1 1] == 8} + } + + test {BITPOS bit=0 changes behavior if end is given} { + r set str "\xff\xff\xff" + assert {[r bitpos str 0] == 24} + assert {[r bitpos str 0 0 -1] == -1} + } } From 8b8c1cd4c2a40a4a11dce3d072e3e53d8d89a1b1 Mon Sep 17 00:00:00 2001 From: antirez Date: Thu, 27 Feb 2014 15:27:05 +0100 Subject: [PATCH 0090/1648] BITPOS fuzzy testing. --- tests/unit/bitops.tcl | 33 +++++++++++++++++++++++++++++++++ 1 file changed, 33 insertions(+) diff --git a/tests/unit/bitops.tcl b/tests/unit/bitops.tcl index a32ce26cfac..8963109805f 100644 --- a/tests/unit/bitops.tcl +++ b/tests/unit/bitops.tcl @@ -303,6 +303,39 @@ start_server {tags {"bitops"}} { test {BITPOS bit=0 changes behavior if end is given} { r set str "\xff\xff\xff" assert {[r bitpos str 0] == 24} + assert {[r bitpos str 0 0] == 24} assert {[r bitpos str 0 0 -1] == -1} } + + test {BITPOS bit=1 fuzzy testing using SETBIT} { + r del str + set max 524288; # 64k + set first_one_pos -1 + for {set j 0} {$j < 1000} {incr j} { + assert {[r bitpos str 1] == $first_one_pos} + set pos [randomInt $max] + r setbit str $pos 1 + if {$first_one_pos == -1 || $first_one_pos > $pos} { + # Update the position of the first 1 bit in the array + # if the bit we set is on the left of the previous one. + set first_one_pos $pos + } + } + } + + test {BITPOS bit=0 fuzzy testing using SETBIT} { + set max 524288; # 64k + set first_zero_pos $max + r set str [string repeat "\xff" [expr $max/8]] + for {set j 0} {$j < 1000} {incr j} { + assert {[r bitpos str 0] == $first_zero_pos} + set pos [randomInt $max] + r setbit str $pos 0 + if {$first_zero_pos > $pos} { + # Update the position of the first 0 bit in the array + # if the bit we clear is on the left of the previous one. + set first_zero_pos $pos + } + } + } } From 806788d009f3d95b3f48c624253c80165b6ed04d Mon Sep 17 00:00:00 2001 From: michael-grunder Date: Thu, 27 Feb 2014 12:01:57 -0800 Subject: [PATCH 0091/1648] Improved bigkeys with progress, pipelining and summary This commit reworks the redis-cli --bigkeys command to provide more information about our progress as well as output summary information when we're done. - We now show an approximate percentage completion as we go - Hiredis pipelining is used for TYPE and SIZE retreival - A summary of keyspace distribution and overall breakout at the end --- src/redis-cli.c | 325 ++++++++++++++++++++++++++++++++++++------------ 1 file changed, 244 insertions(+), 81 deletions(-) diff --git a/src/redis-cli.c b/src/redis-cli.c index 11cefdf47f8..b643933af1a 100644 --- a/src/redis-cli.c +++ b/src/redis-cli.c @@ -1331,108 +1331,271 @@ static void pipeMode(void) { #define TYPE_SET 2 #define TYPE_HASH 3 #define TYPE_ZSET 4 +#define TYPE_NONE 5 -static void findBigKeys(void) { - unsigned long long biggest[5] = {0,0,0,0,0}; - unsigned long long samples = 0; - redisReply *reply1, *reply2, *reply3 = NULL, *keys; - char *key, *sizecmd, *typename[] = {"string","list","set","hash","zset"}; - char *typeunit[] = {"bytes","items","members","fields","members"}; - int type, it=0, i; +static redisReply *sendScan(unsigned long long *it) { + redisReply *reply = redisCommand(context, "SCAN %llu", *it); - printf("\n# Press ctrl+c when you have had enough of it... :)\n"); - printf("# You can use -i 0.1 to sleep 0.1 sec per 100 SCANS\n"); - printf("# in order to reduce server load (usually not needed).\n\n"); + /* Handle any error conditions */ + if(reply == NULL) { + fprintf(stderr, "\nI/O error\n"); + exit(1); + } else if(reply->type == REDIS_REPLY_ERROR) { + fprintf(stderr, "SCAN error: %s\n", reply->str); + exit(1); + } else if(reply->type != REDIS_REPLY_ARRAY) { + fprintf(stderr, "Non ARRAY response from SCAN!\n"); + exit(1); + } else if(reply->elements != 2) { + fprintf(stderr, "Invalid element count from SCAN!\n"); + exit(1); + } - do { - /* Grab some keys with SCAN */ - reply1 = redisCommand(context, "SCAN %d", it); - if(reply1 == NULL) { - fprintf(stderr, "\nI/O error\n"); + /* Validate our types are correct */ + assert(reply->element[0]->type == REDIS_REPLY_STRING); + assert(reply->element[1]->type == REDIS_REPLY_ARRAY); + + /* Update iterator */ + *it = atoi(reply->element[0]->str); + + return reply; +} + +static int getDbSize(void) { + redisReply *reply; + int size; + + reply = redisCommand(context, "DBSIZE"); + + if(reply == NULL || reply->type != REDIS_REPLY_INTEGER) { + fprintf(stderr, "Couldn't determine DBSIZE!\n"); + exit(1); + } + + /* Grab the number of keys and free our reply */ + size = reply->integer; + freeReplyObject(reply); + + return size; +} + +static int toIntType(char *key, char *type) { + if(!strcmp(type, "string")) { + return TYPE_STRING; + } else if(!strcmp(type, "list")) { + return TYPE_LIST; + } else if(!strcmp(type, "set")) { + return TYPE_SET; + } else if(!strcmp(type, "hash")) { + return TYPE_HASH; + } else if(!strcmp(type, "zset")) { + return TYPE_ZSET; + } else if(!strcmp(type, "none")) { + return TYPE_NONE; + } else { + fprintf(stderr, "Unknown type '%s' for key '%s'\n", type, key); + exit(1); + } +} + +static void getKeyTypes(redisReply *keys, int *types) { + redisReply *reply; + int i; + + /* Pipeline TYPE commands */ + for(i=0;ielements;i++) { + redisAppendCommand(context, "TYPE %s", keys->element[i]->str); + } + + /* Retrieve types */ + for(i=0;ielements;i++) { + if(redisGetReply(context, (void**)&reply)!=REDIS_OK) { + fprintf(stderr, "Error getting type for key '%s' (%d: %s)\n", + keys->element[i]->str, context->err, context->errstr); exit(1); - } else if(reply1->type == REDIS_REPLY_ERROR) { - fprintf(stderr, "SCAN error: %s\n", reply1->str); + } else if(reply->type != REDIS_REPLY_STATUS) { + fprintf(stderr, "Invalid reply type (%d) for TYPE on key '%s'!\n", + reply->type, keys->element[i]->str); exit(1); - } else if(reply1->type != REDIS_REPLY_ARRAY) { - fprintf(stderr, "Non ARRAY response from SCAN!\n"); + } + + types[i] = toIntType(keys->element[i]->str, reply->str); + freeReplyObject(reply); + } +} + +static void getKeySizes(redisReply *keys, int *types, + unsigned long long *sizes) +{ + redisReply *reply; + char *sizecmds[] = {"STRLEN","LLEN","SCARD","HLEN","ZCARD"}; + int i; + + /* Pipeline size commands */ + for(i=0;ielements;i++) { + /* Skip keys that were deleted */ + if(types[i]==TYPE_NONE) + continue; + + redisAppendCommand(context, "%s %s", sizecmds[types[i]], + keys->element[i]->str); + } + + /* Retreive sizes */ + for(i=0;ielements;i++) { + /* Skip keys that dissapeared between SCAN and TYPE */ + if(types[i] == TYPE_NONE) { + sizes[i] = 0; + continue; + } + + /* Retreive size */ + if(redisGetReply(context, (void**)&reply)!=REDIS_OK) { + fprintf(stderr, "Error getting size for key '%s' (%d: %s)\n", + keys->element[i]->str, context->err, context->errstr); exit(1); - } else if(reply1->elements!=2) { - fprintf(stderr, "Invalid SCAN result!\n"); + } else if(reply->type != REDIS_REPLY_INTEGER) { + /* Theoretically the key could have been removed and + * added as a different type between TYPE and SIZE */ + fprintf(stderr, + "Warning: %s on '%s' failed (may have changed type)\n", + sizecmds[types[i]], keys->element[i]->str); + sizes[i] = 0; + } else { + sizes[i] = reply->integer; + } + + freeReplyObject(reply); + } +} + +static void findBigKeys(void) { + unsigned long long biggest[5] = {0}, counts[5] = {0}, totalsize[5] = {0}; + unsigned long long sampled = 0, total_keys, totlen=0, *sizes=NULL, it=0; + sds maxkeys[5] = {0}; + char *typename[] = {"string","list","set","hash","zset"}; + char *typeunit[] = {"bytes","items","members","fields","members"}; + redisReply *reply, *keys; + int type, *types=NULL, arrsize=0, i; + double pct; + + /* Total keys pre scanning */ + total_keys = getDbSize(); + + /* Status message */ + printf("\n# Scanning the entire keyspace to find biggest keys as well as\n"); + printf("# average sizes per key type. You can use -i 0.1 to sleep 0.1 sec\n"); + printf("# per 100 SCAN commands (not usually needed).\n\n"); + + /* New up sds strings to keep track of overall biggest per type */ + for(i=0;ielement[0]->type == REDIS_REPLY_STRING); - assert(reply1->element[1]->type == REDIS_REPLY_ARRAY); + } - /* Update iterator and grab pointer to keys */ - it = atoi(reply1->element[0]->str); - keys = reply1->element[1]; + /* SCAN loop */ + do { + /* Calculate approximate percentage completion */ + pct = 100 * (double)sampled/total_keys; - /* Iterate keys that SCAN returned */ - for(i=0;ielements;i++) { - /* Make sure we've got a string, grab it, and increment samples */ - assert(keys->element[i]->type == REDIS_REPLY_STRING); - key = keys->element[i]->str; - samples++; - - /* Get the key type */ - reply2 = redisCommand(context, "TYPE %s", key); - assert(reply2 && reply2->type == REDIS_REPLY_STATUS); - - /* Get the key "size" */ - if (!strcmp(reply2->str,"string")) { - sizecmd = "STRLEN"; - type = TYPE_STRING; - } else if (!strcmp(reply2->str,"list")) { - sizecmd = "LLEN"; - type = TYPE_LIST; - } else if (!strcmp(reply2->str,"set")) { - sizecmd = "SCARD"; - type = TYPE_SET; - } else if (!strcmp(reply2->str,"hash")) { - sizecmd = "HLEN"; - type = TYPE_HASH; - } else if (!strcmp(reply2->str,"zset")) { - sizecmd = "ZCARD"; - type = TYPE_ZSET; - } else if (!strcmp(reply2->str,"none")) { - freeReplyObject(reply1); - freeReplyObject(reply2); - continue; - } else { - fprintf(stderr, "Unknown key type '%s' for key '%s'\n", - reply2->str, key); + /* Grab some keys and point to the keys array */ + reply = sendScan(&it); + keys = reply->element[1]; + + /* Reallocate our type and size array if we need to */ + if(keys->elements > arrsize) { + types = zrealloc(types, sizeof(int)*keys->elements); + sizes = zrealloc(sizes, sizeof(unsigned long long)*keys->elements); + + if(!types || !sizes) { + fprintf(stderr, "Failed to allocate storage for keys!\n"); exit(1); } + + arrsize = keys->elements; + } + + /* Retreive types and then sizes */ + getKeyTypes(keys, types); + getKeySizes(keys, types, sizes); + + /* Now update our stats */ + for(i=0;ielements;i++) { + if((type = types[i]) == TYPE_NONE) + continue; - /* The size command */ - reply3 = redisCommand(context,"%s %s", sizecmd, key); - if (reply3 && reply3->type == REDIS_REPLY_INTEGER) { - if (biggest[type] < reply3->integer) { - printf("Biggest %-6s found so far '%s' with %llu %s.\n", - typename[type], key, - (unsigned long long) reply3->integer, - typeunit[type]); - biggest[type] = reply3->integer; + totalsize[type] += sizes[i]; + counts[type]++; + totlen += keys->element[i]->len; + sampled++; + + if(biggest[type]element[i]->str, sizes[i], + typeunit[type]); + + /* Keep track of biggest key name for this type */ + maxkeys[type] = sdscpy(maxkeys[type], keys->element[i]->str); + if(!maxkeys[type]) { + fprintf(stderr, "Failed to allocate memory for key!\n"); + exit(1); } + + /* Keep track of the biggest size for this type */ + biggest[type] = sizes[i]; } - freeReplyObject(reply2); - if(reply3) freeReplyObject(reply3); + /* Update overall progress */ + if(sampled % 1000000 == 0) { + printf("[%05.2f%%] Sampled %llu keys so far\n", pct, sampled); + } } - if (samples && (samples % 1000000) == 0) - printf("(%llu keys sampled)\n", samples); - - if (samples && (samples % 100) == 0 && config.interval) + /* Sleep if we've been directed to do so */ + if(sampled && (sampled %100) == 0 && config.interval) { usleep(config.interval); - - freeReplyObject(reply1); + } + + freeReplyObject(reply); } while(it != 0); - /* We've finished scanning the keyspace */ - printf("\n# Scanned all %llu keys in the keyspace!\n", samples); + if(types) zfree(types); + if(sizes) zfree(sizes); + + /* We're done */ + printf("\n-------- summary -------\n\n"); + + printf("Sampled %llu keys in the keyspace!\n", sampled); + printf("Total key length in bytes is %llu (avg len %.2f)\n\n", + totlen, totlen ? (double)totlen/sampled : 0); + + /* Output the biggest keys we found, for types we did find */ + for(i=0;i0) { + printf("Biggest %6s found '%s' has %llu %s\n", typename[i], maxkeys[i], + biggest[i], typeunit[i]); + } + } + + printf("\n"); + + for(i=0;i Date: Fri, 28 Feb 2014 16:00:00 +0100 Subject: [PATCH 0092/1648] Sentinel test: Makefile target added. --- runtest-sentinel | 14 ++++++++++++++ src/Makefile | 3 +++ 2 files changed, 17 insertions(+) create mode 100755 runtest-sentinel diff --git a/runtest-sentinel b/runtest-sentinel new file mode 100755 index 00000000000..1650eea7374 --- /dev/null +++ b/runtest-sentinel @@ -0,0 +1,14 @@ +#!/bin/sh +TCL_VERSIONS="8.5 8.6" +TCLSH="" + +for VERSION in $TCL_VERSIONS; do + TCL=`which tclsh$VERSION 2>/dev/null` && TCLSH=$TCL +done + +if [ -z $TCLSH ] +then + echo "You need tcl 8.5 or newer in order to run the Redis Sentinel test" + exit 1 +fi +$TCLSH tests/sentinel.tcl $* diff --git a/src/Makefile b/src/Makefile index 0b4cff7a1c9..c7fdf211ef6 100644 --- a/src/Makefile +++ b/src/Makefile @@ -204,6 +204,9 @@ distclean: clean test: $(REDIS_SERVER_NAME) $(REDIS_CHECK_AOF_NAME) @(cd ..; ./runtest) +test-sentinel: $(REDIS_SENTINEL_NAME) + @(cd ..; ./runtest-sentinel) + check: test lcov: From f1c9a203b2eda46c2980582d3bdb12b6e66f4aaf Mon Sep 17 00:00:00 2001 From: Matt Stancliff Date: Fri, 28 Feb 2014 17:47:41 -0500 Subject: [PATCH 0093/1648] Force INFO used_memory_peak to match peak memory used_memory_peak only updates in serverCron every server.hz, but Redis can use more memory and a user can request memory INFO before used_memory_peak gets updated in the next cron run. This patch updates used_memory_peak to the current memory usage if the current memory usage is higher than the recorded used_memory_peak value. (And it only calls zmalloc_used_memory() once instead of twice as it was doing before.) --- src/redis.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/src/redis.c b/src/redis.c index f132bb563d6..fe32c8defb4 100644 --- a/src/redis.c +++ b/src/redis.c @@ -2407,8 +2407,13 @@ sds genRedisInfoString(char *section) { if (allsections || defsections || !strcasecmp(section,"memory")) { char hmem[64]; char peak_hmem[64]; + size_t zmalloc_used = zmalloc_used_memory(); - bytesToHuman(hmem,zmalloc_used_memory()); + if (zmalloc_used > server.stat_peak_memory) { + server.stat_peak_memory = zmalloc_used; + } + + bytesToHuman(hmem,zmalloc_used); bytesToHuman(peak_hmem,server.stat_peak_memory); if (sections++) info = sdscat(info,"\r\n"); info = sdscatprintf(info, @@ -2421,7 +2426,7 @@ sds genRedisInfoString(char *section) { "used_memory_lua:%lld\r\n" "mem_fragmentation_ratio:%.2f\r\n" "mem_allocator:%s\r\n", - zmalloc_used_memory(), + zmalloc_used, hmem, zmalloc_get_rss(), server.stat_peak_memory, From 8dea2029a4ec3382b7d80de82d793ef2ba2773e5 Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 3 Mar 2014 11:12:11 +0100 Subject: [PATCH 0094/1648] Fix configEpoch assignment when a cluster slot gets "closed". This is still code to rework in order to use agreement to obtain a new configEpoch when a slot is migrated, however this commit handles the special case that happens when the nodes are just started and everybody has a configEpoch of 0. In this special condition to have the maximum configEpoch is not enough as the special epoch 0 is not unique (all the others are). This does not fixes the intrinsic race condition of a failover happening while we are resharding, that will be addressed later. --- src/cluster.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/cluster.c b/src/cluster.c index f47799aca8a..afea589ec21 100644 --- a/src/cluster.c +++ b/src/cluster.c @@ -3180,7 +3180,9 @@ void clusterCommand(redisClient *c) { * the master is failed over by a slave. */ uint64_t maxEpoch = clusterGetMaxEpoch(); - if (myself->configEpoch != maxEpoch) { + if (myself->configEpoch == 0 || + myself->configEpoch != maxEpoch) + { server.cluster->currentEpoch++; myself->configEpoch = server.cluster->currentEpoch; clusterDoBeforeSleep(CLUSTER_TODO_FSYNC_CONFIG); From 12a88d575d2cdd1fb36c8eef043f1f7e6fdd5246 Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 3 Mar 2014 11:19:54 +0100 Subject: [PATCH 0095/1648] Document why we update peak memory in INFO. --- src/redis.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/src/redis.c b/src/redis.c index fe32c8defb4..eb47616dc47 100644 --- a/src/redis.c +++ b/src/redis.c @@ -2409,9 +2409,12 @@ sds genRedisInfoString(char *section) { char peak_hmem[64]; size_t zmalloc_used = zmalloc_used_memory(); - if (zmalloc_used > server.stat_peak_memory) { + /* Peak memory is updated from time to time by serverCron() so it + * may happen that the instantaneous value is slightly bigger than + * the peak value. This may confuse users, so we update the peak + * if found smaller than the current memory usage. */ + if (zmalloc_used > server.stat_peak_memory) server.stat_peak_memory = zmalloc_used; - } bytesToHuman(hmem,zmalloc_used); bytesToHuman(peak_hmem,server.stat_peak_memory); From 69cc3b692ffc543e9d46064438ac380ef33513a7 Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 3 Mar 2014 13:01:11 +0100 Subject: [PATCH 0096/1648] Sentienl test: fixed typo in unit 03 top comment. --- tests/sentinel-tests/03-runtime-reconf.tcl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/sentinel-tests/03-runtime-reconf.tcl b/tests/sentinel-tests/03-runtime-reconf.tcl index 50bca55478f..426596c37e1 100644 --- a/tests/sentinel-tests/03-runtime-reconf.tcl +++ b/tests/sentinel-tests/03-runtime-reconf.tcl @@ -1 +1 @@ -# Test runting reconfiguration command SENTINEL SET. +# Test runtime reconfiguration command SENTINEL SET. From db9e718c8a004b5fbd012f106a2e6d670dfc80c2 Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 3 Mar 2014 13:23:32 +0100 Subject: [PATCH 0097/1648] Sentinel test: foreach_instance_id now supports 'continue'. --- tests/sentinel.tcl | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tests/sentinel.tcl b/tests/sentinel.tcl index b15bce65634..80f9f90b0ab 100644 --- a/tests/sentinel.tcl +++ b/tests/sentinel.tcl @@ -212,6 +212,8 @@ proc foreach_instance_id {instances idvar code} { set errcode [catch {uplevel 1 $code} result] if {$errcode == 1} { error $result $::errorInfo $::errorCode + } elseif {$errcode == 4} { + continue } elseif {$errcode != 0} { return -code $errcode $result } From 1734f7aedb105974e401acfbabc9139fcdac449d Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 3 Mar 2014 13:25:14 +0100 Subject: [PATCH 0098/1648] Sentinel test: initial tests in 03 unit. --- tests/sentinel-tests/02-slaves-reconf.tcl | 39 +++++++++++++++++++++++ 1 file changed, 39 insertions(+) diff --git a/tests/sentinel-tests/02-slaves-reconf.tcl b/tests/sentinel-tests/02-slaves-reconf.tcl index 753580a0ec6..f046830e3b6 100644 --- a/tests/sentinel-tests/02-slaves-reconf.tcl +++ b/tests/sentinel-tests/02-slaves-reconf.tcl @@ -4,3 +4,42 @@ # 1) That slaves point to the new master after failover. # 2) That partitioned slaves point to new master when they are partitioned # away during failover and return at a latter time. + +source "../sentinel-tests/includes/init-tests.tcl" + +proc 03_test_slaves_replication {} { + uplevel 1 { + test "Check that slaves replicate from current master" { + set master_port [RI $master_id tcp_port] + foreach_redis_id id { + if {$id == $master_id} continue + wait_for_condition 1000 50 { + [RI $id master_port] == $master_port + } else { + fail "Redis slave $id is replicating from wrong master" + } + } + } + } +} + +03_test_slaves_replication + +test "Crash the master and force a failover" { + set old_port [RI $master_id tcp_port] + set addr [S 0 SENTINEL GET-MASTER-ADDR-BY-NAME mymaster] + assert {[lindex $addr 1] == $old_port} + kill_instance redis $master_id + foreach_sentinel_id id { + wait_for_condition 1000 50 { + [lindex [S $id SENTINEL GET-MASTER-ADDR-BY-NAME mymaster] 1] != $old_port + } else { + fail "At least one Sentinel did not received failover info" + } + } + restart_instance redis $master_id + set addr [S 0 SENTINEL GET-MASTER-ADDR-BY-NAME mymaster] + set master_id [get_instance_id_by_port redis [lindex $addr 1]] +} + +03_test_slaves_replication From 35e8bc305dd800e44b528f71a581b23c777340e5 Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 3 Mar 2014 13:26:18 +0100 Subject: [PATCH 0099/1648] Sentinel test: use 1000 as retry in initial 00 unit test. --- tests/sentinel-tests/00-base.tcl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/sentinel-tests/00-base.tcl b/tests/sentinel-tests/00-base.tcl index 74308b7752c..e0c5c37d91a 100644 --- a/tests/sentinel-tests/00-base.tcl +++ b/tests/sentinel-tests/00-base.tcl @@ -8,7 +8,7 @@ test "Basic failover works if the master is down" { assert {[lindex $addr 1] == $old_port} kill_instance redis $master_id foreach_sentinel_id id { - wait_for_condition 100 50 { + wait_for_condition 1000 50 { [lindex [S $id SENTINEL GET-MASTER-ADDR-BY-NAME mymaster] 1] != $old_port } else { fail "At least one Sentinel did not received failover info" From c5edd91716486b1134aa9318657e38e43445a75d Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 3 Mar 2014 17:11:51 +0100 Subject: [PATCH 0100/1648] Cluster: invalidate current transaction on redirections. --- src/redis.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/redis.c b/src/redis.c index eb47616dc47..1219fdbd8e9 100644 --- a/src/redis.c +++ b/src/redis.c @@ -2027,6 +2027,7 @@ int processCommand(redisClient *c) { addReplyError(c,"Multi keys request invalid in cluster"); return REDIS_OK; } else if (n != server.cluster->myself) { + flagTransaction(c); addReplySds(c,sdscatprintf(sdsempty(), "-%s %d %s:%d\r\n", ask ? "ASK" : "MOVED", hashslot,n->ip,n->port)); From 4b9ac6edd0ce5382e9f2c07e0c68b3230226df72 Mon Sep 17 00:00:00 2001 From: zhanghailei Date: Mon, 23 Dec 2013 12:32:57 +0800 Subject: [PATCH 0101/1648] According to context,the size should be 16 rather than 64 --- src/dict.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/dict.c b/src/dict.c index 17a32287000..bcf592e43d6 100644 --- a/src/dict.c +++ b/src/dict.c @@ -695,7 +695,7 @@ static unsigned long rev(unsigned long v) { * (where SIZE-1 is always the mask that is equivalent to taking the rest * of the division between the Hash of the key and SIZE). * - * For example if the current hash table size is 64, the mask is + * For example if the current hash table size is 16, the mask is * (in binary) 1111. The position of a key in the hash table will be always * the last four bits of the hash output, and so forth. * From c0f86654142b47e2275fc84eacc08a5e680d7929 Mon Sep 17 00:00:00 2001 From: zhanghailei Date: Thu, 26 Dec 2013 11:28:34 +0800 Subject: [PATCH 0102/1648] FIXED a typo more thank should be more than --- src/dict.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/dict.c b/src/dict.c index bcf592e43d6..10e42eabf89 100644 --- a/src/dict.c +++ b/src/dict.c @@ -239,7 +239,7 @@ int dictExpand(dict *d, unsigned long size) /* Performs N steps of incremental rehashing. Returns 1 if there are still * keys to move from the old to the new hash table, otherwise 0 is returned. * Note that a rehashing step consists in moving a bucket (that may have more - * thank one key as we use chaining) from the old to the new hash table. */ + * than one key as we use chaining) from the old to the new hash table. */ int dictRehash(dict *d, int n) { if (!dictIsRehashing(d)) return 0; From 138695d9908bf9e35875a6e29f24c87a202b1fe4 Mon Sep 17 00:00:00 2001 From: zhanghailei Date: Tue, 4 Mar 2014 12:20:31 +0800 Subject: [PATCH 0103/1648] refer to updateLRUClock's comment REDIS_LRU_CLOCK_MAX is 22 bits,but #define REDIS_LRU_CLOCK_MAX ((1<<21)-1) only 21 bits --- src/redis.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/redis.h b/src/redis.h index c09d410e656..0348421fa63 100644 --- a/src/redis.h +++ b/src/redis.h @@ -382,7 +382,7 @@ typedef long long mstime_t; /* millisecond time type. */ /* A redis object, that is a type able to hold a string / list / set */ /* The actual Redis Object */ -#define REDIS_LRU_CLOCK_MAX ((1<<21)-1) /* Max value of obj->lru */ +#define REDIS_LRU_CLOCK_MAX ((1<<22)-1) /* Max value of obj->lru */ #define REDIS_LRU_CLOCK_RESOLUTION 10 /* LRU clock resolution in seconds */ typedef struct redisObject { unsigned type:4; From dd8d883c9c81517a851df7f13f38a4fca801e4c9 Mon Sep 17 00:00:00 2001 From: antirez Date: Tue, 4 Mar 2014 11:17:27 +0100 Subject: [PATCH 0104/1648] Sentiel test: add test start time in output. --- tests/sentinel.tcl | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/tests/sentinel.tcl b/tests/sentinel.tcl index 80f9f90b0ab..1adab0a5884 100644 --- a/tests/sentinel.tcl +++ b/tests/sentinel.tcl @@ -101,9 +101,9 @@ proc parse_options {} { } elseif {$opt eq "--help"} { puts "Hello, I'm sentinel.tcl and I run Sentinel unit tests." puts "\nOptions:" - puts "--single Only runs tests specified by pattern." - puts "--pause-on-error Pause for manual inspection on error." - puts "--help Shows this help." + puts "--single Only runs tests specified by pattern." + puts "--pause-on-error Pause for manual inspection on error." + puts "--help Shows this help." exit 0 } else { puts "Unknown option $opt" @@ -137,7 +137,8 @@ proc pause_on_error {} { # We redefine 'test' as for Sentinel we don't use the server-client # architecture for the test, everything is sequential. proc test {descr code} { - puts -nonewline "> $descr: " + set ts [clock format [clock seconds] -format %H:%M:%S] + puts -nonewline "$ts> $descr: " flush stdout if {[catch {set retval [uplevel 1 $code]} error]} { From 3072a1e7813f57cf70f401caa928840a1afbe5f7 Mon Sep 17 00:00:00 2001 From: antirez Date: Tue, 4 Mar 2014 11:20:53 +0100 Subject: [PATCH 0105/1648] Sentinel test: be more patient in create_redis_master_slave_cluster. --- tests/sentinel.tcl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/sentinel.tcl b/tests/sentinel.tcl index 1adab0a5884..8954fe54d15 100644 --- a/tests/sentinel.tcl +++ b/tests/sentinel.tcl @@ -261,7 +261,7 @@ proc create_redis_master_slave_cluster n { } } # Wait for all the slaves to sync. - wait_for_condition 100 50 { + wait_for_condition 1000 50 { [RI 0 connected_slaves] == ($n-1) } else { fail "Unable to create a master-slaves cluster." From 7d97a4c99b6564902c16be24ed352101a63e813d Mon Sep 17 00:00:00 2001 From: antirez Date: Tue, 4 Mar 2014 12:05:49 +0100 Subject: [PATCH 0106/1648] Sentinel test: initial debugging console. --- tests/sentinel-tests/00-base.tcl | 6 ++++++ tests/sentinel.tcl | 22 +++++++++++++++++++++- 2 files changed, 27 insertions(+), 1 deletion(-) diff --git a/tests/sentinel-tests/00-base.tcl b/tests/sentinel-tests/00-base.tcl index e0c5c37d91a..26758de0926 100644 --- a/tests/sentinel-tests/00-base.tcl +++ b/tests/sentinel-tests/00-base.tcl @@ -2,6 +2,12 @@ source "../sentinel-tests/includes/init-tests.tcl" +if {$::simulate_error} { + test "This test will fail" { + fail "Simulated error" + } +} + test "Basic failover works if the master is down" { set old_port [RI $master_id tcp_port] set addr [S 0 SENTINEL GET-MASTER-ADDR-BY-NAME mymaster] diff --git a/tests/sentinel.tcl b/tests/sentinel.tcl index 8954fe54d15..7fdf2d756c4 100644 --- a/tests/sentinel.tcl +++ b/tests/sentinel.tcl @@ -12,6 +12,7 @@ source tests/support/test.tcl set ::verbose 0 set ::pause_on_error 0 +set ::simulate_error 0 set ::sentinel_instances {} set ::redis_instances {} set ::sentinel_base_port 20000 @@ -98,11 +99,14 @@ proc parse_options {} { set ::run_matching "*${val}*" } elseif {$opt eq "--pause-on-error"} { set ::pause_on_error 1 + } elseif {$opt eq "--fail"} { + set ::simulate_error 1 } elseif {$opt eq "--help"} { puts "Hello, I'm sentinel.tcl and I run Sentinel unit tests." puts "\nOptions:" puts "--single Only runs tests specified by pattern." puts "--pause-on-error Pause for manual inspection on error." + puts "--fail Simulate a test failure." puts "--help Shows this help." exit 0 } else { @@ -130,7 +134,23 @@ proc pause_on_error {} { while 1 { puts -nonewline "> " flush stdout - if {[gets stdin] eq {continue}} break + set line [gets stdin] + set argv [split $line " "] + set cmd [lindex $argv 0] + if {$cmd eq {continue}} { + break + } elseif {$cmd eq {show-sentinel-logs}} { + set count 10 + if {[lindex $argv 1] ne {}} {set count [lindex $argv 1]} + foreach_sentinel_id id { + puts "=== SENTINEL $id ====" + puts [exec tail -$count sentinel_$id/log.txt] + puts "---------------------\n" + } + } else { + set errcode [catch {eval $line} retval] + puts "$retval" + } } } From efb092baa6541c4541f944ce1b0c2ed9f72da019 Mon Sep 17 00:00:00 2001 From: antirez Date: Tue, 4 Mar 2014 15:55:36 +0100 Subject: [PATCH 0107/1648] Sentinel test: debugging console improved. --- tests/sentinel.tcl | 44 ++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 42 insertions(+), 2 deletions(-) diff --git a/tests/sentinel.tcl b/tests/sentinel.tcl index 7fdf2d756c4..faa3935464c 100644 --- a/tests/sentinel.tcl +++ b/tests/sentinel.tcl @@ -130,7 +130,7 @@ proc main {} { proc pause_on_error {} { puts "" puts [colorstr yellow "*** Please inspect the error now ***"] - puts "\nType \"continue\" to resume the test.\n" + puts "\nType \"continue\" to resume the test, \"help\" for help screen.\n" while 1 { puts -nonewline "> " flush stdout @@ -147,9 +147,49 @@ proc pause_on_error {} { puts [exec tail -$count sentinel_$id/log.txt] puts "---------------------\n" } + } elseif {$cmd eq {ls}} { + foreach_redis_id id { + puts -nonewline "Redis $id" + set errcode [catch { + set str {} + append str "@[RI $id tcp_port]: " + append str "[RI $id role] " + if {[RI $id role] eq {slave}} { + append str "[RI $id master_host]:[RI $id master_port]" + } + set str + } retval] + if {$errcode} { + puts " -- $retval" + } else { + puts $retval + } + } + foreach_sentinel_id id { + puts -nonewline "Sentinel $id" + set errcode [catch { + set str {} + append str "@[SI $id tcp_port]: " + append str "[join [S $id sentinel get-master-addr-by-name mymaster]]" + set str + } retval] + if {$errcode} { + puts " -- $retval" + } else { + puts $retval + } + } + } elseif {$cmd eq {help}} { + puts "ls List Sentinel and Redis instances." + puts "show-sentinel-logs \[N\] Show latest N lines of logs." + puts "S cmd ... arg Call command in Sentinel ." + puts "R cmd ... arg Call command in Redis ." + puts "SI Show Sentinel INFO ." + puts "RI Show Sentinel INFO ." + puts "continue Resume test." } else { set errcode [catch {eval $line} retval] - puts "$retval" + if {$retval ne {}} {puts "$retval"} } } } From 08da025f56b2b99523fc9fa9a8976ceafa69fe57 Mon Sep 17 00:00:00 2001 From: antirez Date: Tue, 4 Mar 2014 16:39:44 +0100 Subject: [PATCH 0108/1648] CONFIG REWRITE should be logged at WARNING level. --- src/config.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/config.c b/src/config.c index 3e6b1561e7b..d8a886eae69 100644 --- a/src/config.c +++ b/src/config.c @@ -1827,6 +1827,7 @@ void configCommand(redisClient *c) { redisLog(REDIS_WARNING,"CONFIG REWRITE failed: %s", strerror(errno)); addReplyErrorFormat(c,"Rewriting config file: %s", strerror(errno)); } else { + redisLog(REDIS_WARNING,"CONFIG REWRITE executed with success."); addReply(c,shared.ok); } } else { From 47750998a61c4ba88be542292fb438ae651f8de3 Mon Sep 17 00:00:00 2001 From: antirez Date: Tue, 4 Mar 2014 16:54:40 +0100 Subject: [PATCH 0109/1648] Sentinel: more aggressive failover start desynchronization. Sentinel needs to avoid split brain conditions due to multiple sentinels trying to get voted at the exact same time. So far some desynchronization was provided by fluctuating server.hz, that is the frequency of the timer function call. However the desynchonization provided in this way was not enough when using many Sentinel instances, especially when a large quorum value is used in order to force a greater degree of agreement (more than N/2+1). It was verified that it was likely to trigger a split brain condition, forcing the system to try again after a timeout. Usually the system will succeed after a few retries, but this is not optimal. This commit desynchronizes instances in a more effective way to make it likely that the first attempt will be successful. --- src/sentinel.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/src/sentinel.c b/src/sentinel.c index a96375a7959..0fdaf63075c 100644 --- a/src/sentinel.c +++ b/src/sentinel.c @@ -84,6 +84,7 @@ typedef struct sentinelAddr { #define SENTINEL_DEFAULT_FAILOVER_TIMEOUT (60*3*1000) #define SENTINEL_MAX_PENDING_COMMANDS 100 #define SENTINEL_ELECTION_TIMEOUT 10000 +#define SENTINEL_MAX_DESYNC 1000 /* Failover machine different states. */ #define SENTINEL_FAILOVER_STATE_NONE 0 /* No failover in progress. */ @@ -2943,7 +2944,7 @@ char *sentinelVoteLeader(sentinelRedisInstance *master, uint64_t req_epoch, char * time to now, in order to force a delay before we can start a * failover for the same master. */ if (strcasecmp(master->leader,server.runid)) - master->failover_start_time = mstime(); + master->failover_start_time = mstime()+rand()%SENTINEL_MAX_DESYNC; } *leader_epoch = master->leader_epoch; @@ -3088,7 +3089,7 @@ void sentinelStartFailover(sentinelRedisInstance *master) { sentinelEvent(REDIS_WARNING,"+new-epoch",master,"%llu", (unsigned long long) sentinel.current_epoch); sentinelEvent(REDIS_WARNING,"+try-failover",master,"%@"); - master->failover_start_time = mstime(); + master->failover_start_time = mstime()+rand()%SENTINEL_MAX_DESYNC; master->failover_state_change_time = mstime(); } From 8d011492a0802c45114bab63c41b7311648632be Mon Sep 17 00:00:00 2001 From: antirez Date: Tue, 4 Mar 2014 17:10:29 +0100 Subject: [PATCH 0110/1648] Sentinel test: set less time sensitive defaults. This commit sets the failover timeout to 30 seconds instead of the 180 seconds default, and allows to reconfigure multiple slaves at the same time. This makes tests less sensible to timing, with the result that there are less false positives due to normal behaviors that require time to succeed or to be retried. However the long term solution is probably some way in order to detect when a test failed because of timing issues (for example split brain during leader election) and retry it. --- tests/sentinel-tests/includes/init-tests.tcl | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tests/sentinel-tests/includes/init-tests.tcl b/tests/sentinel-tests/includes/init-tests.tcl index 91b179b4d35..cb359ea1b3e 100644 --- a/tests/sentinel-tests/includes/init-tests.tcl +++ b/tests/sentinel-tests/includes/init-tests.tcl @@ -30,6 +30,8 @@ test "(init) Sentinels can start monitoring a master" { foreach_sentinel_id id { assert {[S $id sentinel master mymaster] ne {}} S $id SENTINEL SET mymaster down-after-milliseconds 2000 + S $id SENTINEL SET mymaster failover-timeout 20000 + S $id SENTINEL SET mymaster parallel-syncs 10 } } From e5b1e7be64d82b0360af3bed0eaa22ca942be4f4 Mon Sep 17 00:00:00 2001 From: Matt Stancliff Date: Mon, 3 Mar 2014 10:57:27 -0500 Subject: [PATCH 0111/1648] Bind source address for cluster communication The first address specified as a bind parameter (server.bindaddr[0]) gets used as the source IP for cluster communication. If no bind address is specified by the user, the behavior is unchanged. This patch allows multiple Redis Cluster instances to communicate when running on the same interface of the same host. --- src/anet.c | 32 ++++++++++++++++++++++++++++---- src/anet.h | 1 + src/cluster.c | 11 ++++++++--- 3 files changed, 37 insertions(+), 7 deletions(-) diff --git a/src/anet.c b/src/anet.c index 2851318e23a..91545a0d5d2 100644 --- a/src/anet.c +++ b/src/anet.c @@ -234,11 +234,12 @@ static int anetCreateSocket(char *err, int domain) { #define ANET_CONNECT_NONE 0 #define ANET_CONNECT_NONBLOCK 1 -static int anetTcpGenericConnect(char *err, char *addr, int port, int flags) +static int anetTcpGenericConnect(char *err, char *addr, int port, + char *source_addr, int flags) { int s = ANET_ERR, rv; char portstr[6]; /* strlen("65535") + 1; */ - struct addrinfo hints, *servinfo, *p; + struct addrinfo hints, *servinfo, *bservinfo, *p, *b; snprintf(portstr,sizeof(portstr),"%d",port); memset(&hints,0,sizeof(hints)); @@ -258,6 +259,24 @@ static int anetTcpGenericConnect(char *err, char *addr, int port, int flags) if (anetSetReuseAddr(err,s) == ANET_ERR) goto error; if (flags & ANET_CONNECT_NONBLOCK && anetNonBlock(err,s) != ANET_OK) goto error; + if (source_addr) { + int bound = 0; + /* Using getaddrinfo saves us from self-determining IPv4 vs IPv6 */ + if ((rv = getaddrinfo(source_addr, NULL, &hints, &bservinfo)) != 0) { + anetSetError(err, "%s", gai_strerror(rv)); + goto end; + } + for (b = bservinfo; b != NULL; b = b->ai_next) { + if (bind(s,b->ai_addr,b->ai_addrlen) != -1) { + bound = 1; + break; + } + } + if (!bound) { + anetSetError(err, "bind: %s", strerror(errno)); + goto end; + } + } if (connect(s,p->ai_addr,p->ai_addrlen) == -1) { /* If the socket is non-blocking, it is ok for connect() to * return an EINPROGRESS error here. */ @@ -287,12 +306,17 @@ static int anetTcpGenericConnect(char *err, char *addr, int port, int flags) int anetTcpConnect(char *err, char *addr, int port) { - return anetTcpGenericConnect(err,addr,port,ANET_CONNECT_NONE); + return anetTcpGenericConnect(err,addr,port,NULL,ANET_CONNECT_NONE); } int anetTcpNonBlockConnect(char *err, char *addr, int port) { - return anetTcpGenericConnect(err,addr,port,ANET_CONNECT_NONBLOCK); + return anetTcpGenericConnect(err,addr,port,NULL,ANET_CONNECT_NONBLOCK); +} + +int anetTcpNonBlockBindConnect(char *err, char *addr, int port, char *source_addr) +{ + return anetTcpGenericConnect(err,addr,port,source_addr,ANET_CONNECT_NONBLOCK); } int anetUnixGenericConnect(char *err, char *path, int flags) diff --git a/src/anet.h b/src/anet.h index 3f893be2dd3..c4659cd3595 100644 --- a/src/anet.h +++ b/src/anet.h @@ -45,6 +45,7 @@ int anetTcpConnect(char *err, char *addr, int port); int anetTcpNonBlockConnect(char *err, char *addr, int port); +int anetTcpNonBlockBindConnect(char *err, char *addr, int port, char *source_addr); int anetUnixConnect(char *err, char *path); int anetUnixNonBlockConnect(char *err, char *path); int anetRead(int fd, char *buf, int count); diff --git a/src/cluster.c b/src/cluster.c index afea589ec21..abe684539bb 100644 --- a/src/cluster.c +++ b/src/cluster.c @@ -2411,9 +2411,14 @@ void clusterCron(void) { mstime_t old_ping_sent; clusterLink *link; - fd = anetTcpNonBlockConnect(server.neterr, node->ip, - node->port+REDIS_CLUSTER_PORT_INCR); - if (fd == -1) continue; + fd = anetTcpNonBlockBindConnect(server.neterr, node->ip, + node->port+REDIS_CLUSTER_PORT_INCR, server.bindaddr[0]); + if (fd == -1) { + redisLog(REDIS_DEBUG, "Unable to connect to " + "Cluster Client [%s]:%d", node->ip, + node->port+REDIS_CLUSTER_PORT_INCR); + continue; + } link = createClusterLink(node); link->fd = fd; node->link = link; From 5f5118bdad709c5eef10140130ba1329b6a5e661 Mon Sep 17 00:00:00 2001 From: Jan-Erik Rediger Date: Wed, 5 Mar 2014 00:41:02 +0100 Subject: [PATCH 0112/1648] Small typo fixed --- src/sentinel.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/sentinel.c b/src/sentinel.c index 0fdaf63075c..4ce48de5e76 100644 --- a/src/sentinel.c +++ b/src/sentinel.c @@ -1514,7 +1514,7 @@ void sentinelFlushConfig(void) { close(fd); } } else { - redisLog(REDIS_WARNING,"WARNING: Senitnel was not able to save the new configuration on disk!!!: %s", strerror(errno)); + redisLog(REDIS_WARNING,"WARNING: Sentinel was not able to save the new configuration on disk!!!: %s", strerror(errno)); } server.hz = saved_hz; return; From 9b401819c009c85f285c5b4890fb91a6a2a91f7f Mon Sep 17 00:00:00 2001 From: antirez Date: Wed, 5 Mar 2014 11:26:14 +0100 Subject: [PATCH 0113/1648] Cast saveparams[].seconds to long for %ld format specifier. --- src/config.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/config.c b/src/config.c index d8a886eae69..968d5a36579 100644 --- a/src/config.c +++ b/src/config.c @@ -1461,7 +1461,7 @@ void rewriteConfigSaveOption(struct rewriteConfigState *state) { * resulting into no RDB persistence as expected. */ for (j = 0; j < server.saveparamslen; j++) { line = sdscatprintf(sdsempty(),"save %ld %d", - server.saveparams[j].seconds, server.saveparams[j].changes); + (long) server.saveparams[j].seconds, server.saveparams[j].changes); rewriteConfigRewriteLine(state,"save",line,1); } /* Mark "save" as processed in case server.saveparamslen is zero. */ From 59cf0b1902f20436e01f2a7523609a597e688c5b Mon Sep 17 00:00:00 2001 From: Matt Stancliff Date: Wed, 26 Feb 2014 17:58:46 -0500 Subject: [PATCH 0114/1648] Fix return value check for anetTcpAccept anetTcpAccept returns ANET_ERR, not AE_ERR. This isn't a physical error since both ANET_ERR and AE_ERR are -1, but better to be consistent. --- src/cluster.c | 2 +- src/networking.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/src/cluster.c b/src/cluster.c index afea589ec21..be7e5916a7d 100644 --- a/src/cluster.c +++ b/src/cluster.c @@ -388,7 +388,7 @@ void clusterAcceptHandler(aeEventLoop *el, int fd, void *privdata, int mask) { REDIS_NOTUSED(privdata); cfd = anetTcpAccept(server.neterr, fd, cip, sizeof(cip), &cport); - if (cfd == AE_ERR) { + if (cfd == ANET_ERR) { redisLog(REDIS_VERBOSE,"Accepting cluster node: %s", server.neterr); return; } diff --git a/src/networking.c b/src/networking.c index ba9258cf5d1..9e3e4a21ff8 100644 --- a/src/networking.c +++ b/src/networking.c @@ -587,7 +587,7 @@ void acceptTcpHandler(aeEventLoop *el, int fd, void *privdata, int mask) { REDIS_NOTUSED(privdata); cfd = anetTcpAccept(server.neterr, fd, cip, sizeof(cip), &cport); - if (cfd == AE_ERR) { + if (cfd == ANET_ERR) { redisLog(REDIS_WARNING,"Accepting client connection: %s", server.neterr); return; } From d2040ab9b13bc4edd07825e2ecee55b3679e0651 Mon Sep 17 00:00:00 2001 From: Matt Stancliff Date: Wed, 26 Feb 2014 18:02:55 -0500 Subject: [PATCH 0115/1648] Remove some redundant code Function nodeIp2String in cluster.c is exactly anetPeerToString with a pre-extracted fd. --- src/cluster.c | 15 +-------------- 1 file changed, 1 insertion(+), 14 deletions(-) diff --git a/src/cluster.c b/src/cluster.c index be7e5916a7d..e6da3a66fc3 100644 --- a/src/cluster.c +++ b/src/cluster.c @@ -1021,22 +1021,9 @@ void clusterProcessGossipSection(clusterMsg *hdr, clusterLink *link) { /* IP -> string conversion. 'buf' is supposed to at least be 46 bytes. */ void nodeIp2String(char *buf, clusterLink *link) { - struct sockaddr_storage sa; - socklen_t salen = sizeof(sa); - - if (getpeername(link->fd, (struct sockaddr*) &sa, &salen) == -1) - redisPanic("getpeername() failed."); - - if (sa.ss_family == AF_INET) { - struct sockaddr_in *s = (struct sockaddr_in *)&sa; - inet_ntop(AF_INET,(void*)&(s->sin_addr),buf,REDIS_CLUSTER_IPLEN); - } else { - struct sockaddr_in6 *s = (struct sockaddr_in6 *)&sa; - inet_ntop(AF_INET6,(void*)&(s->sin6_addr),buf,REDIS_CLUSTER_IPLEN); - } + anetPeerToString(link->fd, buf, REDIS_CLUSTER_IPLEN, NULL); } - /* Update the node address to the IP address that can be extracted * from link->fd, and at the specified port. * Also disconnect the node link so that we'll connect again to the new From 385c25f70f97903df64d19b8e54cbba13ec66f74 Mon Sep 17 00:00:00 2001 From: Matt Stancliff Date: Wed, 26 Feb 2014 18:05:12 -0500 Subject: [PATCH 0116/1648] Remove redundant IP length definition REDIS_CLUSTER_IPLEN had the same value as REDIS_IP_STR_LEN. They were both #define'd to the same INET6_ADDRSTRLEN. --- src/cluster.c | 6 +++--- src/cluster.h | 1 - 2 files changed, 3 insertions(+), 4 deletions(-) diff --git a/src/cluster.c b/src/cluster.c index e6da3a66fc3..f4ddeef4e0a 100644 --- a/src/cluster.c +++ b/src/cluster.c @@ -915,11 +915,11 @@ int clusterStartHandshake(char *ip, int port) { if (sa.ss_family == AF_INET) inet_ntop(AF_INET, (void*)&(((struct sockaddr_in *)&sa)->sin_addr), - norm_ip,REDIS_CLUSTER_IPLEN); + norm_ip,REDIS_IP_STR_LEN); else inet_ntop(AF_INET6, (void*)&(((struct sockaddr_in6 *)&sa)->sin6_addr), - norm_ip,REDIS_CLUSTER_IPLEN); + norm_ip,REDIS_IP_STR_LEN); if (clusterHandshakeInProgress(norm_ip,port)) { errno = EAGAIN; @@ -1021,7 +1021,7 @@ void clusterProcessGossipSection(clusterMsg *hdr, clusterLink *link) { /* IP -> string conversion. 'buf' is supposed to at least be 46 bytes. */ void nodeIp2String(char *buf, clusterLink *link) { - anetPeerToString(link->fd, buf, REDIS_CLUSTER_IPLEN, NULL); + anetPeerToString(link->fd, buf, REDIS_IP_STR_LEN, NULL); } /* Update the node address to the IP address that can be extracted diff --git a/src/cluster.h b/src/cluster.h index 654274b9bb4..8c440b9f889 100644 --- a/src/cluster.h +++ b/src/cluster.h @@ -10,7 +10,6 @@ #define REDIS_CLUSTER_FAIL 1 /* The cluster can't work */ #define REDIS_CLUSTER_NAMELEN 40 /* sha1 hex length */ #define REDIS_CLUSTER_PORT_INCR 10000 /* Cluster port = baseport + PORT_INCR */ -#define REDIS_CLUSTER_IPLEN INET6_ADDRSTRLEN /* IPv6 address string length */ /* The following defines are amunt of time, sometimes expressed as * multiplicators of the node timeout value (when ending with MULT). */ From e8bae92e5479bc64a82d92c73e9360438cf9f6f5 Mon Sep 17 00:00:00 2001 From: Matt Stancliff Date: Tue, 4 Mar 2014 17:31:34 -0500 Subject: [PATCH 0117/1648] Reset op_sec_last_sample_ops when reset requested This value needs to be set to zero (in addition to stat_numcommands) or else people may see a negative operations per second count after they run CONFIG RESETSTAT. Fixes antirez/redis#1577 --- src/config.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/config.c b/src/config.c index 968d5a36579..c64c4a9fcc2 100644 --- a/src/config.c +++ b/src/config.c @@ -1810,6 +1810,7 @@ void configCommand(redisClient *c) { server.stat_keyspace_hits = 0; server.stat_keyspace_misses = 0; server.stat_numcommands = 0; + server.ops_sec_last_sample_ops = 0; server.stat_numconnections = 0; server.stat_expiredkeys = 0; server.stat_rejected_conn = 0; From 36676c23186617ff096d534c6faa353829c4e437 Mon Sep 17 00:00:00 2001 From: antirez Date: Fri, 7 Mar 2014 13:19:09 +0100 Subject: [PATCH 0118/1648] Redis Cluster: support for multi-key operations. --- src/cluster.c | 133 ++++++++++++++++++++++++++++++++++++-------------- src/cluster.h | 7 +++ src/redis.c | 18 +++++-- 3 files changed, 118 insertions(+), 40 deletions(-) diff --git a/src/cluster.c b/src/cluster.c index f4ddeef4e0a..0edfe0e27ac 100644 --- a/src/cluster.c +++ b/src/cluster.c @@ -3804,21 +3804,39 @@ void readwriteCommand(redisClient *c) { } /* Return the pointer to the cluster node that is able to serve the command. - * For the function to succeed the command should only target a single - * key (or the same key multiple times). + * For the function to succeed the command should only target either: * - * If the returned node should be used only for this request, the *ask - * integer is set to '1', otherwise to '0'. This is used in order to - * let the caller know if we should reply with -MOVED or with -ASK. + * 1) A single key (even multiple times like LPOPRPUSH mylist mylist). + * 2) Multiple keys in the same hash slot, while the slot is stable (no + * resharding in progress). * - * If the command contains multiple keys, and as a consequence it is not - * possible to handle the request in Redis Cluster, NULL is returned. */ -clusterNode *getNodeByQuery(redisClient *c, struct redisCommand *cmd, robj **argv, int argc, int *hashslot, int *ask) { + * On success the function returns the node that is able to serve the request. + * If the node is not 'myself' a redirection must be perfomed. The kind of + * redirection is specified setting the integer passed by reference + * 'error_code', which will be set to REDIS_CLUSTER_REDIR_ASK or + * REDIS_CLUSTER_REDIR_MOVED. + * + * When the node is 'myself' 'error_code' is set to REDIS_CLUSTER_REDIR_NONE. + * + * If the command fails NULL is returned, and the reason of the failure is + * provided via 'error_code', which will be set to: + * + * REDIS_CLUSTER_REDIR_CROSS_SLOT if the request contains multiple keys that + * don't belong to the same hash slot. + * + * REDIS_CLUSTER_REDIR_UNSTABLE if the request contains mutliple keys + * belonging to the same slot, but the slot is not stable (in migration or + * importing state, likely because a resharding is in progress). */ +clusterNode *getNodeByQuery(redisClient *c, struct redisCommand *cmd, robj **argv, int argc, int *hashslot, int *error_code) { clusterNode *n = NULL; robj *firstkey = NULL; + int multiple_keys = 0; multiState *ms, _ms; multiCmd mc; - int i, slot = 0; + int i, slot = 0, migrating_slot = 0, importing_slot = 0, missing_keys = 0; + + /* Set error code optimistically for the base case. */ + if (error_code) *error_code = REDIS_CLUSTER_REDIR_NONE; /* We handle all the cases as if they were EXEC commands, so we have * a common code path for everything */ @@ -3839,8 +3857,8 @@ clusterNode *getNodeByQuery(redisClient *c, struct redisCommand *cmd, robj **arg mc.cmd = cmd; } - /* Check that all the keys are the same key, and get the slot and - * node for this key. */ + /* Check that all the keys are in the same hash slot, and obtain this + * slot and the node associated. */ for (i = 0; i < ms->count; i++) { struct redisCommand *mcmd; robj **margv; @@ -3853,48 +3871,88 @@ clusterNode *getNodeByQuery(redisClient *c, struct redisCommand *cmd, robj **arg keyindex = getKeysFromCommand(mcmd,margv,margc,&numkeys, REDIS_GETKEYS_ALL); for (j = 0; j < numkeys; j++) { + robj *thiskey = margv[keyindex[j]]; + int thisslot = keyHashSlot((char*)thiskey->ptr, + sdslen(thiskey->ptr)); + if (firstkey == NULL) { /* This is the first key we see. Check what is the slot * and node. */ - firstkey = margv[keyindex[j]]; - - slot = keyHashSlot((char*)firstkey->ptr, sdslen(firstkey->ptr)); + firstkey = thiskey; + slot = thisslot; n = server.cluster->slots[slot]; redisAssertWithInfo(c,firstkey,n != NULL); + /* If we are migrating or importing this slot, we need to check + * if we have all the keys in the request (the only way we + * can safely serve the request, otherwise we return a TRYAGAIN + * error). To do so we set the importing/migrating state and + * increment a counter for every missing key. */ + if (n == myself && + server.cluster->migrating_slots_to[slot] != NULL) + { + migrating_slot = 1; + } else if (server.cluster->importing_slots_from[slot] != NULL) { + importing_slot = 1; + } } else { /* If it is not the first key, make sure it is exactly * the same key as the first we saw. */ - if (!equalStringObjects(firstkey,margv[keyindex[j]])) { - getKeysFreeResult(keyindex); - return NULL; + if (!equalStringObjects(firstkey,thiskey)) { + if (slot != thisslot) { + /* Error: multiple keys from different slots. */ + getKeysFreeResult(keyindex); + if (error_code) + *error_code = REDIS_CLUSTER_REDIR_CROSS_SLOT; + return NULL; + } else { + /* Flag this request as one with multiple different + * keys. */ + multiple_keys = 1; + } } } + + /* Migarting / Improrting slot? Count keys we don't have. */ + if ((migrating_slot || importing_slot) && + lookupKeyRead(&server.db[0],thiskey) == NULL) + { + missing_keys++; + } } getKeysFreeResult(keyindex); } - if (ask) *ask = 0; /* This is the default. Set to 1 if needed later. */ + /* No key at all in command? then we can serve the request - * without redirections. */ + * without redirections or errors. */ if (n == NULL) return myself; + + /* Return the hashslot by reference. */ if (hashslot) *hashslot = slot; + /* This request is about a slot we are migrating into another instance? - * Then we need to check if we have the key. If we have it we can reply. - * If instead is a new key, we pass the request to the node that is - * receiving the slot. */ - if (n == myself && server.cluster->migrating_slots_to[slot] != NULL) { - if (lookupKeyRead(&server.db[0],firstkey) == NULL) { - if (ask) *ask = 1; - return server.cluster->migrating_slots_to[slot]; - } - } - /* Handle the case in which we are receiving this hash slot from - * another instance, so we'll accept the query even if in the table - * it is assigned to a different node, but only if the client - * issued an ASKING command before. */ - if (server.cluster->importing_slots_from[slot] != NULL && - (c->flags & REDIS_ASKING || cmd->flags & REDIS_CMD_ASKING)) { - return myself; + * Then if we have all the keys. */ + + /* If we don't have all the keys and we are migrating the slot, send + * an ASK redirection. */ + if (migrating_slot && missing_keys) { + if (error_code) *error_code = REDIS_CLUSTER_REDIR_ASK; + return server.cluster->migrating_slots_to[slot]; + } + + /* If we are receiving the slot, we have all the keys, and the client + * correctly flagged the request as "ASKING", we can serve + * the request, otherwise the only option is to send a TRYAGAIN error. */ + if (importing_slot && + (c->flags & REDIS_ASKING || cmd->flags & REDIS_CMD_ASKING)) + { + if (missing_keys) { + if (error_code) *error_code = REDIS_CLUSTER_REDIR_UNSTABLE; + return NULL; + } else { + return myself; + } } + /* Handle the read-only client case reading from a slave: if this * node is a slave and the request is about an hash slot our master * is serving, we can reply without redirection. */ @@ -3905,6 +3963,9 @@ clusterNode *getNodeByQuery(redisClient *c, struct redisCommand *cmd, robj **arg { return myself; } - /* It's not a -ASK case. Base case: just return the right node. */ + + /* Base case: just return the right node. However if this node is not + * myself, set error_code to MOVED since we need to issue a rediretion. */ + if (n != myself && error_code) *error_code = REDIS_CLUSTER_REDIR_MOVED; return n; } diff --git a/src/cluster.h b/src/cluster.h index 8c440b9f889..9581b575e3d 100644 --- a/src/cluster.h +++ b/src/cluster.h @@ -24,6 +24,13 @@ #define REDIS_CLUSTER_MF_TIMEOUT 5000 /* Milliseconds to do a manual failover. */ #define REDIS_CLUSTER_MF_PAUSE_MULT 2 /* Master pause manual failover mult. */ +/* Redirection errors returned by getNodeByQuery(). */ +#define REDIS_CLUSTER_REDIR_NONE 0 /* Node can serve the request. */ +#define REDIS_CLUSTER_REDIR_CROSS_SLOT 1 /* Keys in different slots. */ +#define REDIS_CLUSTER_REDIR_UNSTABLE 2 /* Keys in slot resharding. */ +#define REDIS_CLUSTER_REDIR_ASK 3 /* -ASK redirection required. */ +#define REDIS_CLUSTER_REDIR_MOVED 4 /* -MOVED redirection required. */ + struct clusterNode; /* clusterLink encapsulates everything needed to talk with a remote node. */ diff --git a/src/redis.c b/src/redis.c index 1219fdbd8e9..72a907af439 100644 --- a/src/redis.c +++ b/src/redis.c @@ -2021,15 +2021,25 @@ int processCommand(redisClient *c) { addReplySds(c,sdsnew("-CLUSTERDOWN The cluster is down. Use CLUSTER INFO for more information\r\n")); return REDIS_OK; } else { - int ask; - clusterNode *n = getNodeByQuery(c,c->cmd,c->argv,c->argc,&hashslot,&ask); + int error_code; + clusterNode *n = getNodeByQuery(c,c->cmd,c->argv,c->argc,&hashslot,&error_code); if (n == NULL) { - addReplyError(c,"Multi keys request invalid in cluster"); + if (error_code == REDIS_CLUSTER_REDIR_CROSS_SLOT) { + addReplySds(c,sdsnew("-CROSSSLOT Keys in request don't hash to the same slot\r\n")); + } else if (error_code == REDIS_CLUSTER_REDIR_UNSTABLE) { + /* The request spawns mutliple keys in the same slot, + * but the slot is not "stable" currently as there is + * a migration or import in progress. */ + addReplySds(c,sdsnew("-TRYAGAIN Multiple keys request during rehashing of slot\r\n")); + } else { + redisPanic("getNodeByQuery() unknown error."); + } return REDIS_OK; } else if (n != server.cluster->myself) { flagTransaction(c); addReplySds(c,sdscatprintf(sdsempty(), - "-%s %d %s:%d\r\n", ask ? "ASK" : "MOVED", + "-%s %d %s:%d\r\n", + (error_code == REDIS_CLUSTER_REDIR_ASK) ? "ASK" : "MOVED", hashslot,n->ip,n->port)); return REDIS_OK; } From 6984692060106de666ee89d092163075282c6498 Mon Sep 17 00:00:00 2001 From: antirez Date: Fri, 7 Mar 2014 16:18:00 +0100 Subject: [PATCH 0119/1648] Cluster: fix conditional generating TRYAGAIN error. --- src/cluster.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/src/cluster.c b/src/cluster.c index 0edfe0e27ac..c99f1fa8e39 100644 --- a/src/cluster.c +++ b/src/cluster.c @@ -3939,13 +3939,14 @@ clusterNode *getNodeByQuery(redisClient *c, struct redisCommand *cmd, robj **arg return server.cluster->migrating_slots_to[slot]; } - /* If we are receiving the slot, we have all the keys, and the client - * correctly flagged the request as "ASKING", we can serve - * the request, otherwise the only option is to send a TRYAGAIN error. */ + /* If we are receiving the slot, and the client correctly flagged the + * request as "ASKING", we can serve the request. However if the request + * involves multiple keys and we don't have them all, the only option is + * to send a TRYAGAIN error. */ if (importing_slot && (c->flags & REDIS_ASKING || cmd->flags & REDIS_CMD_ASKING)) { - if (missing_keys) { + if (multiple_keys && missing_keys) { if (error_code) *error_code = REDIS_CLUSTER_REDIR_UNSTABLE; return NULL; } else { From 0f2597092fdcd17547282bd434e46d3dc8412e23 Mon Sep 17 00:00:00 2001 From: antirez Date: Fri, 7 Mar 2014 18:03:51 +0100 Subject: [PATCH 0120/1648] Typo in sentinel.conf, exists -> exits. --- sentinel.conf | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/sentinel.conf b/sentinel.conf index e4434222135..ff255f87f06 100644 --- a/sentinel.conf +++ b/sentinel.conf @@ -86,10 +86,10 @@ sentinel failover-timeout mymaster 180000 # or to reconfigure clients after a failover. The scripts are executed # with the following rules for error handling: # -# If script exists with "1" the execution is retried later (up to a maximum +# If script exits with "1" the execution is retried later (up to a maximum # number of times currently set to 10). # -# If script exists with "2" (or an higher value) the script execution is +# If script exits with "2" (or an higher value) the script execution is # not retried. # # If script terminates because it receives a signal the behavior is the same From f0782a6e8633988a935097c195899773da2c2ad1 Mon Sep 17 00:00:00 2001 From: Matt Stancliff Date: Fri, 7 Mar 2014 16:32:04 -0500 Subject: [PATCH 0121/1648] Fix key extraction for z{union,inter}store The previous implementation wasn't taking into account the storage key in position 1 being a requirement (it was only counting the source keys in positions 3 to N). Fixes antirez/redis#1581 --- src/db.c | 18 ++++++++++++++++-- 1 file changed, 16 insertions(+), 2 deletions(-) diff --git a/src/db.c b/src/db.c index 41e95b6ba5e..5170c6ed30a 100644 --- a/src/db.c +++ b/src/db.c @@ -993,9 +993,23 @@ int *zunionInterGetKeys(struct redisCommand *cmd,robj **argv, int argc, int *num *numkeys = 0; return NULL; } - keys = zmalloc(sizeof(int)*num); + + /* Keys in z{union,inter}store come from two places: + argv[1] = storage key, + argv[3...n] = keys to intersect */ + + /* (num+1) is (argv[2] + 1) to account for argv[1] too */ + keys = zmalloc(sizeof(int)*(num+1)); + + /* Add all key positions for argv[3...n] to keys[] */ for (i = 0; i < num; i++) keys[i] = 3+i; - *numkeys = num; + + /* Now add the argv[1] key position (the storage key target) + to our list of command positions containing keys. + num is the number of source keys, but we initialized + keys[] to size num+1, so keys[num] is safe and valid and okay. */ + keys[num] = 1; + *numkeys = num+1; /* Total keys = {union,inter} keys + storage key */ return keys; } From 0f1f25784f6440f3476609065edfb524e3bbbcf4 Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 10 Mar 2014 09:57:52 +0100 Subject: [PATCH 0122/1648] Cluster: better timeout and retry time for failover. When node-timeout is too small, in the order of a few milliseconds, there is no way the voting process can terminate during that time, so we set a lower limit for the failover timeout of two seconds. The retry time is set to two times the failover timeout time, so it is at least 4 seconds. --- src/cluster.c | 23 ++++++++++++++++------- src/cluster.h | 1 - 2 files changed, 16 insertions(+), 8 deletions(-) diff --git a/src/cluster.c b/src/cluster.c index c99f1fa8e39..031b61bc1d2 100644 --- a/src/cluster.c +++ b/src/cluster.c @@ -2050,6 +2050,18 @@ void clusterHandleSlaveFailover(void) { int manual_failover = server.cluster->mf_end != 0 && server.cluster->mf_can_start; int j; + mstime_t auth_timeout, auth_retry_time; + + /* Compute the failover timeout (the max time we have to send votes + * and wait for replies), and the failover retry time (the time to wait + * before waiting again. + * + * Timeout is MIN(NODE_TIMEOUT*2,2000) milliseconds. + * Retry is two times the Timeout. + */ + auth_timeout = server.cluster_node_timeout*2; + if (auth_timeout < 2000) auth_timeout = 2000; + auth_retry_time = auth_timeout*2; /* Pre conditions to run the function: * 1) We are a slave. @@ -2060,8 +2072,6 @@ void clusterHandleSlaveFailover(void) { (!nodeFailed(myself->slaveof) && !manual_failover) || myself->slaveof->numslots == 0) return; - /* If this is a manual failover, are we ready to start? */ - /* Set data_age to the number of seconds we are disconnected from * the master. */ if (server.repl_state == REDIS_REPL_CONNECTED) { @@ -2084,10 +2094,9 @@ void clusterHandleSlaveFailover(void) { (server.cluster_node_timeout * REDIS_CLUSTER_SLAVE_VALIDITY_MULT)) return; - /* Compute the time at which we can start an election. */ - if (auth_age > - server.cluster_node_timeout * REDIS_CLUSTER_FAILOVER_AUTH_RETRY_MULT) - { + /* If the previous failover attempt timedout and the retry time has + * elapsed, we can setup a new one. */ + if (auth_age > auth_retry_time) { server.cluster->failover_auth_time = mstime() + 500 + /* Fixed delay of 500 milliseconds, let FAIL msg propagate. */ random() % 500; /* Random delay between 0 and 500 milliseconds. */ @@ -2139,7 +2148,7 @@ void clusterHandleSlaveFailover(void) { if (mstime() < server.cluster->failover_auth_time) return; /* Return ASAP if the election is too old to be valid. */ - if (auth_age > server.cluster_node_timeout) return; + if (auth_age > auth_timeout) return; /* Ask for votes if needed. */ if (server.cluster->failover_auth_sent == 0) { diff --git a/src/cluster.h b/src/cluster.h index 9581b575e3d..4fb1cfe8d71 100644 --- a/src/cluster.h +++ b/src/cluster.h @@ -18,7 +18,6 @@ #define REDIS_CLUSTER_FAIL_UNDO_TIME_MULT 2 /* Undo fail if master is back. */ #define REDIS_CLUSTER_FAIL_UNDO_TIME_ADD 10 /* Some additional time. */ #define REDIS_CLUSTER_SLAVE_VALIDITY_MULT 10 /* Slave data validity. */ -#define REDIS_CLUSTER_FAILOVER_AUTH_RETRY_MULT 4 /* Auth request retry time. */ #define REDIS_CLUSTER_FAILOVER_DELAY 5 /* Seconds */ #define REDIS_CLUSTER_DEFAULT_MIGRATION_BARRIER 1 #define REDIS_CLUSTER_MF_TIMEOUT 5000 /* Milliseconds to do a manual failover. */ From 3e8a92ef8d93e50ac57e688ebd68838437921f27 Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 10 Mar 2014 10:32:28 +0100 Subject: [PATCH 0123/1648] Cluster: log error when anetTcpNonBlockBindConnect() fails. --- src/cluster.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/src/cluster.c b/src/cluster.c index dd0aa08d5a4..ed3f870a785 100644 --- a/src/cluster.c +++ b/src/cluster.c @@ -2411,8 +2411,9 @@ void clusterCron(void) { node->port+REDIS_CLUSTER_PORT_INCR, server.bindaddr[0]); if (fd == -1) { redisLog(REDIS_DEBUG, "Unable to connect to " - "Cluster Client [%s]:%d", node->ip, - node->port+REDIS_CLUSTER_PORT_INCR); + "Cluster Node [%s]:%d -> %s", node->ip, + node->port+REDIS_CLUSTER_PORT_INCR, + server.neterr); continue; } link = createClusterLink(node); From ed8c55237b4f2c1c5c5fbac0c0b4d6b12ca7695a Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 10 Mar 2014 10:33:53 +0100 Subject: [PATCH 0124/1648] Cluster: be explicit about passing NULL as bind addr for connect. The code was already correct but it was using that bindaddr[0] is set to NULL as a side effect of current implementation if no bind address is configured. This is not guarnteed to hold true in the future. --- src/cluster.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/cluster.c b/src/cluster.c index ed3f870a785..217ffee7426 100644 --- a/src/cluster.c +++ b/src/cluster.c @@ -2408,7 +2408,8 @@ void clusterCron(void) { clusterLink *link; fd = anetTcpNonBlockBindConnect(server.neterr, node->ip, - node->port+REDIS_CLUSTER_PORT_INCR, server.bindaddr[0]); + node->port+REDIS_CLUSTER_PORT_INCR, + server.bindaddr_count ? server.bindaddr[0] : NULL); if (fd == -1) { redisLog(REDIS_DEBUG, "Unable to connect to " "Cluster Node [%s]:%d -> %s", node->ip, From c1a7d3e61f84344393f47924777c89d9cd78b775 Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 10 Mar 2014 10:41:27 +0100 Subject: [PATCH 0125/1648] Cluster: abort on port too high error. It also fixes multi-line comment style to be consistent with the rest of the code base. Related to #1555. --- src/cluster.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/cluster.c b/src/cluster.c index 3ce8bc513cf..622b90a18a8 100644 --- a/src/cluster.c +++ b/src/cluster.c @@ -331,15 +331,15 @@ void clusterInit(void) { server.cfd_count = 0; /* Port sanity check II - The other handshake port check is triggered too late to stop - us from trying to use a too-high cluster port number. - */ + * The other handshake port check is triggered too late to stop + * us from trying to use a too-high cluster port number. */ if (server.port > (65535-REDIS_CLUSTER_PORT_INCR)) { redisLog(REDIS_WARNING, "Redis port number too high. " "Cluster communication port is 10,000 port " "numbers higher than your Redis port. " "Your Redis port number must be " "lower than 55535."); + exit(1); } if (listenToPort(server.port+REDIS_CLUSTER_PORT_INCR, From 55b88e004414359fe8e33d7e0cf4e8454367cf6d Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 10 Mar 2014 11:43:56 +0100 Subject: [PATCH 0126/1648] Cluster: some zunionInterGetKeys() comment trimmed. Everything was pretty clear again from the initial statements. --- src/db.c | 11 +++-------- 1 file changed, 3 insertions(+), 8 deletions(-) diff --git a/src/db.c b/src/db.c index 5170c6ed30a..d6ef1307aca 100644 --- a/src/db.c +++ b/src/db.c @@ -995,19 +995,14 @@ int *zunionInterGetKeys(struct redisCommand *cmd,robj **argv, int argc, int *num } /* Keys in z{union,inter}store come from two places: - argv[1] = storage key, - argv[3...n] = keys to intersect */ - - /* (num+1) is (argv[2] + 1) to account for argv[1] too */ + * argv[1] = storage key, + * argv[3...n] = keys to intersect */ keys = zmalloc(sizeof(int)*(num+1)); /* Add all key positions for argv[3...n] to keys[] */ for (i = 0; i < num; i++) keys[i] = 3+i; - /* Now add the argv[1] key position (the storage key target) - to our list of command positions containing keys. - num is the number of source keys, but we initialized - keys[] to size num+1, so keys[num] is safe and valid and okay. */ + /* Finally add the argv[1] key position (the storage key target). */ keys[num] = 1; *numkeys = num+1; /* Total keys = {union,inter} keys + storage key */ return keys; From 787b297046b5695c5dbca87060566cf1d93762de Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 10 Mar 2014 13:18:41 +0100 Subject: [PATCH 0127/1648] Cluster: getKeysFromCommand() API cleaned up. This API originated from the "diskstore" experiment, not for Redis Cluster itself, so there were legacy/useless things trying to differentiate between keys that are going to be overwritten and keys that need to be fetched from disk (preloaded). All useless with Cluster, so removed with the result of code simplification. --- src/cluster.c | 3 +-- src/db.c | 27 +++------------------------ src/redis.c | 16 ++++++++-------- src/redis.h | 10 +++------- 4 files changed, 15 insertions(+), 41 deletions(-) diff --git a/src/cluster.c b/src/cluster.c index 622b90a18a8..59a711806ac 100644 --- a/src/cluster.c +++ b/src/cluster.c @@ -3897,8 +3897,7 @@ clusterNode *getNodeByQuery(redisClient *c, struct redisCommand *cmd, robj **arg margc = ms->commands[i].argc; margv = ms->commands[i].argv; - keyindex = getKeysFromCommand(mcmd,margv,margc,&numkeys, - REDIS_GETKEYS_ALL); + keyindex = getKeysFromCommand(mcmd,margv,margc,&numkeys); for (j = 0; j < numkeys; j++) { robj *thiskey = margv[keyindex[j]]; int thisslot = keyHashSlot((char*)thiskey->ptr, diff --git a/src/db.c b/src/db.c index d6ef1307aca..c0641db5096 100644 --- a/src/db.c +++ b/src/db.c @@ -949,9 +949,9 @@ int *getKeysUsingCommandTable(struct redisCommand *cmd,robj **argv, int argc, in return keys; } -int *getKeysFromCommand(struct redisCommand *cmd,robj **argv, int argc, int *numkeys, int flags) { +int *getKeysFromCommand(struct redisCommand *cmd,robj **argv, int argc, int *numkeys) { if (cmd->getkeys_proc) { - return cmd->getkeys_proc(cmd,argv,argc,numkeys,flags); + return cmd->getkeys_proc(cmd,argv,argc,numkeys); } else { return getKeysUsingCommandTable(cmd,argv,argc,numkeys); } @@ -961,30 +961,9 @@ void getKeysFreeResult(int *result) { zfree(result); } -int *noPreloadGetKeys(struct redisCommand *cmd,robj **argv, int argc, int *numkeys, int flags) { - if (flags & REDIS_GETKEYS_PRELOAD) { - *numkeys = 0; - return NULL; - } else { - return getKeysUsingCommandTable(cmd,argv,argc,numkeys); - } -} - -int *renameGetKeys(struct redisCommand *cmd,robj **argv, int argc, int *numkeys, int flags) { - if (flags & REDIS_GETKEYS_PRELOAD) { - int *keys = zmalloc(sizeof(int)); - *numkeys = 1; - keys[0] = 1; - return keys; - } else { - return getKeysUsingCommandTable(cmd,argv,argc,numkeys); - } -} - -int *zunionInterGetKeys(struct redisCommand *cmd,robj **argv, int argc, int *numkeys, int flags) { +int *zunionInterGetKeys(struct redisCommand *cmd,robj **argv, int argc, int *numkeys) { int i, num, *keys; REDIS_NOTUSED(cmd); - REDIS_NOTUSED(flags); num = atoi(argv[2]->ptr); /* Sanity check. Don't return any key if the command is going to diff --git a/src/redis.c b/src/redis.c index bd875cb70e7..e018d28090b 100644 --- a/src/redis.c +++ b/src/redis.c @@ -118,13 +118,13 @@ struct redisCommand *commandTable; */ struct redisCommand redisCommandTable[] = { {"get",getCommand,2,"r",0,NULL,1,1,1,0,0}, - {"set",setCommand,-3,"wm",0,noPreloadGetKeys,1,1,1,0,0}, - {"setnx",setnxCommand,3,"wm",0,noPreloadGetKeys,1,1,1,0,0}, - {"setex",setexCommand,4,"wm",0,noPreloadGetKeys,1,1,1,0,0}, - {"psetex",psetexCommand,4,"wm",0,noPreloadGetKeys,1,1,1,0,0}, + {"set",setCommand,-3,"wm",0,NULL,1,1,1,0,0}, + {"setnx",setnxCommand,3,"wm",0,NULL,1,1,1,0,0}, + {"setex",setexCommand,4,"wm",0,NULL,1,1,1,0,0}, + {"psetex",psetexCommand,4,"wm",0,NULL,1,1,1,0,0}, {"append",appendCommand,3,"wm",0,NULL,1,1,1,0,0}, {"strlen",strlenCommand,2,"r",0,NULL,1,1,1,0,0}, - {"del",delCommand,-2,"w",0,noPreloadGetKeys,1,-1,1,0,0}, + {"del",delCommand,-2,"w",0,NULL,1,-1,1,0,0}, {"exists",existsCommand,2,"r",0,NULL,1,1,1,0,0}, {"setbit",setbitCommand,4,"wm",0,NULL,1,1,1,0,0}, {"getbit",getbitCommand,3,"r",0,NULL,1,1,1,0,0}, @@ -206,8 +206,8 @@ struct redisCommand redisCommandTable[] = { {"randomkey",randomkeyCommand,1,"rR",0,NULL,0,0,0,0,0}, {"select",selectCommand,2,"rl",0,NULL,0,0,0,0,0}, {"move",moveCommand,3,"w",0,NULL,1,1,1,0,0}, - {"rename",renameCommand,3,"w",0,renameGetKeys,1,2,1,0,0}, - {"renamenx",renamenxCommand,3,"w",0,renameGetKeys,1,2,1,0,0}, + {"rename",renameCommand,3,"w",0,NULL,1,2,1,0,0}, + {"renamenx",renamenxCommand,3,"w",0,NULL,1,2,1,0,0}, {"expire",expireCommand,3,"w",0,NULL,1,1,1,0,0}, {"expireat",expireatCommand,3,"w",0,NULL,1,1,1,0,0}, {"pexpire",pexpireCommand,3,"w",0,NULL,1,1,1,0,0}, @@ -247,7 +247,7 @@ struct redisCommand redisCommandTable[] = { {"punsubscribe",punsubscribeCommand,-1,"rpslt",0,NULL,0,0,0,0,0}, {"publish",publishCommand,3,"pltr",0,NULL,0,0,0,0,0}, {"pubsub",pubsubCommand,-2,"pltrR",0,NULL,0,0,0,0,0}, - {"watch",watchCommand,-2,"rs",0,noPreloadGetKeys,1,-1,1,0,0}, + {"watch",watchCommand,-2,"rs",0,NULL,1,-1,1,0,0}, {"unwatch",unwatchCommand,1,"rs",0,NULL,0,0,0,0,0}, {"cluster",clusterCommand,-2,"ar",0,NULL,0,0,0,0,0}, {"restore",restoreCommand,-4,"awm",0,NULL,1,1,1,0,0}, diff --git a/src/redis.h b/src/redis.h index 0348421fa63..6ea0c22c62e 100644 --- a/src/redis.h +++ b/src/redis.h @@ -826,7 +826,7 @@ typedef struct pubsubPattern { } pubsubPattern; typedef void redisCommandProc(redisClient *c); -typedef int *redisGetKeysProc(struct redisCommand *cmd, robj **argv, int argc, int *numkeys, int flags); +typedef int *redisGetKeysProc(struct redisCommand *cmd, robj **argv, int argc, int *numkeys); struct redisCommand { char *name; redisCommandProc *proc; @@ -1237,13 +1237,9 @@ void scanGenericCommand(redisClient *c, robj *o, unsigned long cursor); int parseScanCursorOrReply(redisClient *c, robj *o, unsigned long *cursor); /* API to get key arguments from commands */ -#define REDIS_GETKEYS_ALL 0 -#define REDIS_GETKEYS_PRELOAD 1 -int *getKeysFromCommand(struct redisCommand *cmd, robj **argv, int argc, int *numkeys, int flags); +int *getKeysFromCommand(struct redisCommand *cmd, robj **argv, int argc, int *numkeys); void getKeysFreeResult(int *result); -int *noPreloadGetKeys(struct redisCommand *cmd,robj **argv, int argc, int *numkeys, int flags); -int *renameGetKeys(struct redisCommand *cmd,robj **argv, int argc, int *numkeys, int flags); -int *zunionInterGetKeys(struct redisCommand *cmd,robj **argv, int argc, int *numkeys, int flags); +int *zunionInterGetKeys(struct redisCommand *cmd,robj **argv, int argc, int *numkeys); /* Cluster */ void clusterInit(void); From caf7b9b425807bd72577ae22e75d42e29fb675fa Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 10 Mar 2014 15:24:38 +0100 Subject: [PATCH 0128/1648] Cluster: getKeysFromCommand() and related: top-comments added. --- src/db.c | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/src/db.c b/src/db.c index c0641db5096..243ed195fdd 100644 --- a/src/db.c +++ b/src/db.c @@ -930,6 +930,8 @@ void persistCommand(redisClient *c) { * API to get key arguments from commands * ---------------------------------------------------------------------------*/ +/* The base case is to use the keys position as given in the command table + * (firstkey, lastkey, step). */ int *getKeysUsingCommandTable(struct redisCommand *cmd,robj **argv, int argc, int *numkeys) { int j, i = 0, last, *keys; REDIS_NOTUSED(argv); @@ -949,6 +951,11 @@ int *getKeysUsingCommandTable(struct redisCommand *cmd,robj **argv, int argc, in return keys; } +/* Return keys as an heap allocated array of integers. The length of the array + * is returned by reference into *numkeys. + * + * This function uses the command table if a command-specific helper function + * is not required, otherwise it calls the command-specific function. */ int *getKeysFromCommand(struct redisCommand *cmd,robj **argv, int argc, int *numkeys) { if (cmd->getkeys_proc) { return cmd->getkeys_proc(cmd,argv,argc,numkeys); @@ -957,11 +964,15 @@ int *getKeysFromCommand(struct redisCommand *cmd,robj **argv, int argc, int *num } } +/* Free the result of getKeysFromCommand. */ void getKeysFreeResult(int *result) { zfree(result); } -int *zunionInterGetKeys(struct redisCommand *cmd,robj **argv, int argc, int *numkeys) { +/* Helper function to extract keys from following commands: + * ZUNIONSTORE ... + * ZINTERSTORE ... */ +int *zunionInterGetKeys(struct redisCommand *cmd, robj **argv, int argc, int *numkeys) { int i, num, *keys; REDIS_NOTUSED(cmd); From c0e818ab080f8c07eb1885945eea4a42f3ca3d29 Mon Sep 17 00:00:00 2001 From: antirez Date: Mon, 10 Mar 2014 15:26:10 +0100 Subject: [PATCH 0129/1648] Cluster: evalGetKey() added for EVAL/EVALSHA. Previously we used zunionInterGetKeys(), however after this function was fixed to account for the destination key (not needed when the API was designed for "diskstore") the two set of commands can no longer be served by an unique keys-extraction function. --- src/db.c | 23 +++++++++++++++++++++++ src/redis.c | 4 ++-- src/redis.h | 1 + 3 files changed, 26 insertions(+), 2 deletions(-) diff --git a/src/db.c b/src/db.c index 243ed195fdd..913e03a0024 100644 --- a/src/db.c +++ b/src/db.c @@ -998,6 +998,29 @@ int *zunionInterGetKeys(struct redisCommand *cmd, robj **argv, int argc, int *nu return keys; } +/* Helper function to extract keys from the following commands: + * EVAL