voldemort.utils
Class ClusterForkLiftTool

java.lang.Object
  extended by voldemort.utils.ClusterForkLiftTool
All Implemented Interfaces:
java.lang.Runnable

public class ClusterForkLiftTool
extends java.lang.Object
implements java.lang.Runnable

Tool to forklift data from a source cluster to a destination cluster. When used in conjunction with a client that "double writes" to both clusters, it can serve as a store migration tool for moving an existing store to a new cluster.

There are two modes for consolidating the divergent versions of a key from the source cluster:

1) Primary-only resolution (ClusterForkLiftTool#SinglePartitionForkLiftTask): The entries on the primary partition are moved over to the destination cluster with empty vector clocks. If any key has multiple versions on the primary, they are resolved. This approach is fast and is best suited to cases where you deem the replicas to be very much in sync with each other. This is the DEFAULT mode.

2) Global resolution (ClusterForkLiftTool#SinglePartitionGloballyResolvingForkLiftTask): The keys belonging to a partition are fetched from the primary replica, and for each such key, the corresponding values are obtained from all other replicas using get(..) operations. These versions are then resolved and written to the destination cluster as before. This approach is slow, since it involves several round trips to the server for each key (some potentially cross-colo), and hence should be used when thorough version resolution is necessary or the admin deems the replicas to be fairly out of sync.

In both modes, the default chained resolver (VectorClockInconsistencyResolver + TimeBasedInconsistencyResolver) is used to determine the final resolved version.

NOTES:

1) If the tool fails partway through, the admin can restart it for the failed partitions alone. Keys that were already written in the failed partitions will all experience ObsoleteVersionException, and the keys not yet inserted will be inserted.

2) Since the forklift writes are issued with empty vector clocks, they will always yield to online writes happening on the same key before or during the forklift window. Of course, after the forklift window, the destination cluster resumes normal operation.

3) For now, the tool falls back to fetching the key from the primary replica, fetching the values manually, resolving them, and writing the result back. Pitfall: the primary somehow does not have the key. There are two scenarios:
   a) Key active after double writes: the situation is the result of slop not propagating to the primary. Double writes would write the key back to the destination cluster anyway, so this case is fine.
   b) Key inactive after double writes: this indicates a problem elsewhere. This is a base guarantee Voldemort should offer.

4) Zoned <-> Non-Zoned forklift implications. When forklifting data from a non-zoned to a zoned cluster, both destination zones will be populated with data by simply running the tool once with the respective bootstrap URLs. If you need to forklift data from zoned to non-zoned clusters (i.e. your replication between datacenters is not handled by Voldemort), then you need to run the tool twice for each destination non-zoned cluster. Zoned -> Zoned and Non-Zoned -> Non-Zoned forklifts are trivial.
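Notes 1 and 2 both follow from how a store treats a write whose vector clock is not newer than the one it already holds. The sketch below is a simplified, self-contained model of that rule, not Voldemort's implementation: SketchClock and SketchStore are hypothetical stand-ins, the store returns false where a real store would throw ObsoleteVersionException, and it is assumed that an equal clock is treated as obsolete.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for a vector clock (not Voldemort's VectorClock). */
final class SketchClock {
    enum Occurred { BEFORE, AFTER, CONCURRENTLY }

    private final Map<Integer, Long> counters = new HashMap<>();

    /** Returns a copy of this clock with the counter for nodeId bumped by one. */
    SketchClock incremented(int nodeId) {
        SketchClock copy = new SketchClock();
        copy.counters.putAll(counters);
        copy.counters.merge(nodeId, 1L, Long::sum);
        return copy;
    }

    /** Relates this clock to other. Assumption: equal clocks report BEFORE. */
    Occurred compare(SketchClock other) {
        boolean thisBigger = false;
        boolean otherBigger = false;
        Set<Integer> nodes = new HashSet<>(counters.keySet());
        nodes.addAll(other.counters.keySet());
        for (int node : nodes) {
            long mine = counters.getOrDefault(node, 0L);
            long theirs = other.counters.getOrDefault(node, 0L);
            if (mine > theirs) thisBigger = true;
            if (mine < theirs) otherBigger = true;
        }
        if (thisBigger && otherBigger) return Occurred.CONCURRENTLY;
        return thisBigger ? Occurred.AFTER : Occurred.BEFORE;
    }
}

/** Hypothetical toy store that rejects obsolete writes. */
final class SketchStore {
    private final Map<String, SketchClock> clocks = new HashMap<>();
    private final Map<String, String> values = new HashMap<>();

    /**
     * Returns true if the write is accepted and false if it is obsolete.
     * Concurrent versions simply overwrite here; a real store would keep both.
     */
    boolean put(String key, String value, SketchClock clock) {
        SketchClock existing = clocks.get(key);
        if (existing != null && clock.compare(existing) == SketchClock.Occurred.BEFORE) {
            return false; // stands in for ObsoleteVersionException
        }
        clocks.put(key, clock);
        values.put(key, value);
        return true;
    }

    String get(String key) {
        return values.get(key);
    }
}
```

In this model, a forklift write with an empty clock is accepted for a key the destination has never seen, rejected when replayed after a restart (same empty clock), and rejected when an online write with a non-empty clock arrived first.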


Constructor Summary
ClusterForkLiftTool(java.lang.String srcBootstrapUrl, java.lang.String dstBootstrapUrl, int maxPutsPerSecond, int partitionParallelism, int progressOps, java.util.List<java.lang.String> storesList, java.util.List<java.lang.Integer> partitions, voldemort.utils.ClusterForkLiftTool.ForkLiftTaskMode mode)
           
 
Method Summary
static void main(java.lang.String[] args)
           
 void run()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ClusterForkLiftTool

public ClusterForkLiftTool(java.lang.String srcBootstrapUrl,
                           java.lang.String dstBootstrapUrl,
                           int maxPutsPerSecond,
                           int partitionParallelism,
                           int progressOps,
                           java.util.List<java.lang.String> storesList,
                           java.util.List<java.lang.Integer> partitions,
                           voldemort.utils.ClusterForkLiftTool.ForkLiftTaskMode mode)
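
The constructor parameters are not described here. Going by the names alone, one plausible reading is that partitionParallelism bounds how many entries of partitions are forklifted at the same time. The sketch below illustrates that reading with a fixed-size thread pool. PartitionPoolSketch and forkLiftAll are hypothetical names, the per-partition work is a placeholder, and nothing here is taken from the tool's source.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration of bounded per-partition parallelism. */
final class PartitionPoolSketch {

    /**
     * Runs one placeholder task per partition on a pool of
     * partitionParallelism threads, records finished partitions in done
     * (which must be thread-safe), and returns the peak number of tasks
     * observed running at once.
     */
    static int forkLiftAll(List<Integer> partitions,
                           int partitionParallelism,
                           Set<Integer> done) {
        ExecutorService pool = Executors.newFixedThreadPool(partitionParallelism);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (final int partition : partitions) {
            pool.execute(() -> {
                int now = running.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    // Placeholder: a real task would copy this partition's entries.
                    done.add(partition);
                } finally {
                    running.decrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("partition tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for partition tasks", e);
        }
        return peak.get();
    }
}
```

Because the pool has a fixed size, the returned peak never exceeds partitionParallelism, regardless of how many partitions are submitted.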
Method Detail

run

public void run()
Specified by:
run in interface java.lang.Runnable

main

public static void main(java.lang.String[] args)
                 throws java.lang.Exception
Parameters:
args - command-line arguments
Throws:
java.lang.Exception


Jay Kreps, Roshan Sumbaly, Alex Feinberg, Bhupesh Bansal, Lei Gao, Chinmay Soman, Vinoth Chandar, Zhongjie Wu