voldemort.server.rebalance
Class Rebalancer

java.lang.Object
  extended by voldemort.server.rebalance.Rebalancer
All Implemented Interfaces:
java.lang.Runnable

public class Rebalancer
extends java.lang.Object
implements java.lang.Runnable

Service responsible for rebalancing.
Handles two scenarios: (a) a new request comes in; (b) a rebalancing was shut down and the box was restarted.


Constructor Summary
Rebalancer(StoreRepository storeRepository, MetadataStore metadataStore, VoldemortConfig voldemortConfig, AsyncOperationService asyncService)
           
 
Method Summary
 boolean acquireRebalancingPermit(int nodeId)
          Acquire a permit for a particular node id so as to allow rebalancing
 AsyncOperationService getAsyncOperationService()
           
 int rebalanceNode(RebalanceTaskInfo stealInfo)
          This function is responsible for starting the actual async rebalance operation.
 void rebalanceStateChange(Cluster cluster, java.util.List<StoreDefinition> storeDefs, java.util.List<RebalanceTaskInfo> rebalanceTaskInfo, boolean swapRO, boolean changeClusterAndStoresMetadata, boolean changeRebalanceState, boolean rollback)
          Supports four different stages of a rebalance state change; see the method detail for the flag combinations.
 void releaseRebalancingPermit(int nodeId)
          Release the rebalancing permit for a particular node id
 void run()
          This is called only once at startup
 void start()
           
 void stop()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Rebalancer

public Rebalancer(StoreRepository storeRepository,
                  MetadataStore metadataStore,
                  VoldemortConfig voldemortConfig,
                  AsyncOperationService asyncService)
Method Detail

getAsyncOperationService

public AsyncOperationService getAsyncOperationService()

start

public void start()

stop

public void stop()

run

public void run()
This is called only once at startup

Specified by:
run in interface java.lang.Runnable

acquireRebalancingPermit

public boolean acquireRebalancingPermit(int nodeId)
Acquire a permit for a particular node id so as to allow rebalancing

Parameters:
nodeId - The id of the node for which we are acquiring a permit
Returns:
Returns true if the permit was acquired, false if the permit is already held by someone

releaseRebalancingPermit

public void releaseRebalancingPermit(int nodeId)
Release the rebalancing permit for a particular node id

Parameters:
nodeId - The node id whose permit we want to release
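
The acquire/release contract described above (acquire returns false when the permit for that node id is already held) can be illustrated with a small stand-in. This is a minimal sketch under stated assumptions: PermitSketch and runWithPermit are hypothetical names, the set-based bookkeeping is an assumption, and none of it is taken from Voldemort's actual Rebalancer implementation.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for the permit methods; not Voldemort's implementation. */
final class PermitSketch {

    // Node ids that currently hold a rebalancing permit (assumed bookkeeping).
    private final Set<Integer> held = ConcurrentHashMap.newKeySet();

    /** Returns true if the permit was acquired, false if it is already held. */
    boolean acquireRebalancingPermit(int nodeId) {
        // add() on a concurrent key set is atomic and returns false for an existing element.
        return held.add(nodeId);
    }

    /** Releases the permit for the given node id, if held. */
    void releaseRebalancingPermit(int nodeId) {
        held.remove(nodeId);
    }

    /** Hypothetical helper: runs the work only if the permit is free, and always releases it. */
    boolean runWithPermit(int nodeId, Runnable work) {
        if (!acquireRebalancingPermit(nodeId)) {
            return false;
        }
        try {
            work.run();
            return true;
        } finally {
            releaseRebalancingPermit(nodeId);
        }
    }

    public static void main(String[] args) {
        PermitSketch permits = new PermitSketch();
        System.out.println(permits.acquireRebalancingPermit(2)); // true
        System.out.println(permits.acquireRebalancingPermit(2)); // false
        permits.releaseRebalancingPermit(2);
        System.out.println(permits.acquireRebalancingPermit(2)); // true
    }
}
```

Pairing the release with a finally block, as in the hypothetical runWithPermit, keeps a failed operation from leaving the permit held.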

rebalanceStateChange

public void rebalanceStateChange(Cluster cluster,
                                 java.util.List<StoreDefinition> storeDefs,
                                 java.util.List<RebalanceTaskInfo> rebalanceTaskInfo,
                                 boolean swapRO,
                                 boolean changeClusterAndStoresMetadata,
                                 boolean changeRebalanceState,
                                 boolean rollback)
Supports four different stages.
For normal operation:
 | swapRO | changeClusterAndStoresMetadata | changeRebalanceState | Order                        |
 | f      | t                              | t                    | rebalance -> cluster         |
 | f      | f                              | t                    | rebalance                    |
 | t      | t                              | f                    | cluster -> swap              |
 | t      | t                              | t                    | rebalance -> cluster -> swap |
 
In general we need to do [ cluster change -> swap -> rebalance state change ].

NOTE: The update of the cluster metadata and the rebalancer state is not "atomic". There could theoretically be a race in which a client picks up the new cluster metadata and sends a request based on it before the proxy bridges have been set up, so that we either miss a proxy put or return null for a get/getAll.

TODO (refactor): The rollback logic here is too convoluted. Specifically, the independent updates to each key could be split up into their own methods.

Parameters:
cluster - Cluster metadata to change
storeDefs - List of store definitions
rebalanceTaskInfo - List of rebalance partitions info
swapRO - Boolean to indicate swapping of read-only (RO) stores
changeClusterAndStoresMetadata - Boolean to indicate a change of cluster and stores metadata
changeRebalanceState - Boolean to indicate a change in rebalance state
rollback - Boolean to indicate whether we are rolling back
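
The flag table for normal operation can be read as a fixed step order in which each flag switches one step on or off. The sketch below only mirrors the Order column of that table. StateChangeOrderSketch and order are hypothetical names, the rollback flag is not modeled, and flag combinations outside the four documented rows are not covered by the documentation, so their output here is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that reproduces the documented Order column; not Voldemort code. */
final class StateChangeOrderSketch {

    /** Returns the documented step order for a combination of flags (normal operation only). */
    static List<String> order(boolean swapRO,
                              boolean changeClusterAndStoresMetadata,
                              boolean changeRebalanceState) {
        List<String> steps = new ArrayList<>();
        if (changeRebalanceState) {
            steps.add("rebalance"); // rebalance state change
        }
        if (changeClusterAndStoresMetadata) {
            steps.add("cluster");   // cluster (and stores) metadata change
        }
        if (swapRO) {
            steps.add("swap");      // swap of read-only stores
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(order(false, true, true));  // [rebalance, cluster]
        System.out.println(order(false, false, true)); // [rebalance]
        System.out.println(order(true, true, false));  // [cluster, swap]
        System.out.println(order(true, true, true));   // [rebalance, cluster, swap]
    }
}
```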

rebalanceNode

public int rebalanceNode(RebalanceTaskInfo stealInfo)
This function is responsible for starting the actual async rebalance operation. It is run if this node is the stealer node.
We also assume that the check that this server is in rebalancing state has been done at a higher level.

Parameters:
stealInfo - Partition info to steal
Returns:
Returns an id identifying the async operation


Jay Kreps, Roshan Sumbaly, Alex Feinberg, Bhupesh Bansal, Lei Gao, Chinmay Soman, Vinoth Chandar, Zhongjie Wu