voldemort.client.protocol.admin
Class AdminClient

java.lang.Object
  extended by voldemort.client.protocol.admin.AdminClient

public class AdminClient
extends java.lang.Object

AdminClient is intended for administrative functionality that is useful and often needed, but should be used sparingly (if at all) at the application level.

Some of the uses of AdminClient include


Field Summary
static java.util.List<java.lang.String> restoreStoreEngineBlackList
           
 
Constructor Summary
AdminClient(Cluster cluster, AdminClientConfig adminClientConfig)
          Create an instance of AdminClient given a Cluster object.
AdminClient(java.lang.String bootstrapURL, AdminClientConfig adminClientConfig)
          Create an instance of AdminClient given a URL of a node in the cluster.
 
Method Summary
 void addStore(StoreDefinition def)
          Add a new store definition to all active nodes in the cluster.
 void addStore(StoreDefinition def, int nodeId)
          Add a new store definition to a particular node
 long deletePartitions(int nodeId, java.lang.String storeName, java.util.HashMap<java.lang.Integer,java.util.List<java.lang.Integer>> replicaToPartitionList, Cluster initialCluster, VoldemortFilter filter)
          Delete all entries belonging to all the partitions passed as a map of replica_type to partition list.
 long deletePartitions(int nodeId, java.lang.String storeName, java.util.List<java.lang.Integer> partitionList, VoldemortFilter filter)
          Delete all entries belonging to a list of partitions
 void deleteStore(java.lang.String storeName)
          Delete a store from all active nodes in the cluster
 void deleteStore(java.lang.String storeName, int nodeId)
          Delete a store from a particular node
 void failedFetchStore(int nodeId, java.lang.String storeName, java.lang.String storeDir)
          When a fetch store fails, we don't need to keep the pushed data around.
 java.util.Iterator<Pair<ByteArray,Versioned<byte[]>>> fetchEntries(int nodeId, java.lang.String storeName, java.util.HashMap<java.lang.Integer,java.util.List<java.lang.Integer>> replicaToPartitionList, VoldemortFilter filter, boolean fetchMasterEntries, Cluster initialCluster, long skipRecords)
          Fetch key/value tuples belonging to this map of replica type to partition list
 java.util.Iterator<Pair<ByteArray,Versioned<byte[]>>> fetchEntries(int nodeId, java.lang.String storeName, java.util.List<java.lang.Integer> partitionList, VoldemortFilter filter, boolean fetchMasterEntries)
          Legacy interface for fetching entries.
 java.util.Iterator<Pair<ByteArray,Versioned<byte[]>>> fetchEntries(int nodeId, java.lang.String storeName, java.util.List<java.lang.Integer> partitionList, VoldemortFilter filter, boolean fetchMasterEntries, long skipRecords)
          Legacy interface for fetching entries.
 java.util.Iterator<ByteArray> fetchKeys(int nodeId, java.lang.String storeName, java.util.HashMap<java.lang.Integer,java.util.List<java.lang.Integer>> replicaToPartitionList, VoldemortFilter filter, boolean fetchMasterEntries, Cluster initialCluster, long skipRecords)
          Fetch all keys belonging to the map of replica type to partition list.
 java.util.Iterator<ByteArray> fetchKeys(int nodeId, java.lang.String storeName, java.util.List<java.lang.Integer> partitionList, VoldemortFilter filter, boolean fetchMasterEntries)
          Legacy interface for fetching entries.
 java.util.Iterator<ByteArray> fetchKeys(int nodeId, java.lang.String storeName, java.util.List<java.lang.Integer> partitionList, VoldemortFilter filter, boolean fetchMasterEntries, long skipRecords)
          Legacy interface for fetching entries.
 void fetchPartitionFiles(int nodeId, java.lang.String storeName, java.util.HashMap<java.lang.Integer,java.util.List<java.lang.Integer>> replicaToPartitionList, java.lang.String destinationDirPath, java.util.Set<java.lang.Object> notAcceptedBuckets, java.util.concurrent.atomic.AtomicBoolean running)
          Fetch read-only store files to a specified directory.
 java.lang.String fetchStore(int nodeId, java.lang.String storeName, java.lang.String storeDir, long pushVersion, long timeoutMs)
          Fetch data from directory 'storeDir' on node id
 Cluster getAdminClientCluster()
          Get the cluster info AdminClient is using.
 java.util.List<java.lang.Integer> getAsyncRequestList(int nodeId)
          Retrieves a list of asynchronous request ids on the server.
 java.util.List<java.lang.Integer> getAsyncRequestList(int nodeId, boolean showComplete)
          Retrieves a list of asynchronous request ids on the server.
 AsyncOperationStatus getAsyncRequestStatus(int nodeId, int requestId)
          Get the status of an Async Operation running at (remote) node.
 Versioned<Cluster> getRemoteCluster(int nodeId)
          Get the cluster information from a remote node.
 Versioned<java.lang.String> getRemoteMetadata(int remoteNodeId, java.lang.String key)
          Get the metadata on a remote node.
 Versioned<MetadataStore.VoldemortState> getRemoteServerState(int nodeId)
          Retrieve the server state from a remote node.
 Versioned<java.util.List<StoreDefinition>> getRemoteStoreDefList(int nodeId)
          Retrieve the store definitions from a remote node.
 java.util.Map<java.lang.Integer,java.util.HashMap<java.lang.Integer,java.util.List<java.lang.Integer>>> getReplicationMapping(int restoringNode, Cluster cluster, StoreDefinition storeDef)
          For a particular node, finds out all the [replica, partition] tuples it needs to steal in order to be brought back to normal state
 java.util.Map<java.lang.String,java.lang.Long> getROCurrentVersion(int nodeId, java.util.List<java.lang.String> storeNames)
          Returns the 'current' version of RO store
 java.util.Map<java.lang.String,java.lang.String> getROCurrentVersionDir(int nodeId, java.util.List<java.lang.String> storeNames)
          Returns the 'current' versions of all RO stores provided
 java.util.Map<java.lang.String,java.lang.Long> getROMaxVersion(int nodeId, java.util.List<java.lang.String> storeNames)
          Returns the max version of push currently being used by read-only store.
 java.util.Map<java.lang.String,java.lang.Long> getROMaxVersion(java.util.List<java.lang.String> storeNames)
          This is a wrapper around getROMaxVersion(int, List) where-in we find the max versions on each machine and then return the max of all of them
 java.util.Map<java.lang.String,java.lang.String> getROMaxVersionDir(int nodeId, java.util.List<java.lang.String> storeNames)
          Returns the max version of push currently being used by read-only store.
 java.util.Map<java.lang.String,java.lang.String> getROStorageFormat(int nodeId, java.util.List<java.lang.String> storeNames)
          Returns the read-only storage format - ReadOnlyStorageFormat for a list of stores
 int migratePartitions(int donorNodeId, int stealerNodeId, java.lang.String storeName, java.util.HashMap<java.lang.Integer,java.util.List<java.lang.Integer>> replicaToPartitionList, VoldemortFilter filter, Cluster initialCluster, boolean optimize)
          Migrate keys/values belonging to a map of replica type to partition list from donor node to stealer node.
 int migratePartitions(int donorNodeId, int stealerNodeId, java.lang.String storeName, java.util.List<java.lang.Integer> stealPartitionList, VoldemortFilter filter)
          Migrate keys/values belonging to stealPartitionList ( can be primary or replica ) from donor node to stealer node.
 int rebalanceNode(RebalancePartitionsInfo stealInfo)
          Rebalance a stealer-donor node pair for a set of stores
 void rebalanceStateChange(Cluster existingCluster, Cluster transitionCluster, java.util.List<RebalancePartitionsInfo> rebalancePartitionPlanList, boolean swapRO, boolean changeClusterMetadata, boolean changeRebalanceState, boolean rollback, boolean failEarly)
          Used in rebalancing to indicate change in states.
 void restoreDataFromReplications(int nodeId, int parallelTransfers)
          RestoreData from copies on other machines for the given nodeId
 void rollbackStore(int nodeId, java.lang.String storeName, long pushVersion)
          Rollback RO store to most recent backup of the current store
 void setAdminClientCluster(Cluster cluster)
          Set cluster info for AdminClient to use.
 void stop()
          Stop the AdminClient cleanly freeing all resources.
 void stopAsyncRequest(int nodeId, int requestId)
          To stop an asynchronous request on the particular node
 java.lang.String swapStore(int nodeId, java.lang.String storeName, java.lang.String storeDir)
          Swap store data atomically on a single node
 void throwException(VProto.Error error)
           
 void truncate(int nodeId, java.lang.String storeName)
          Delete the store completely (Deletes all data) from the remote node.
 void updateEntries(int nodeId, java.lang.String storeName, java.util.Iterator<Pair<ByteArray,Versioned<byte[]>>> entryIterator, VoldemortFilter filter)
          Update a stream of key/value entries at the given node.
 void updateRemoteCluster(int nodeId, Cluster cluster, Version clock)
          Update the cluster information MetadataStore.CLUSTER_KEY on a remote node.
 void updateRemoteMetadata(int remoteNodeId, java.lang.String key, Versioned<java.lang.String> value)
          Update metadata at the given remoteNodeId.
 void updateRemoteServerState(int nodeId, MetadataStore.VoldemortState state, Version clock)
          Update the server state ( MetadataStore.VoldemortState) on a remote node.
 void updateRemoteStoreDefList(int nodeId, java.util.List<StoreDefinition> storesList)
          Update the store definitions on a remote node.
 void updateSlopEntries(int nodeId, java.util.Iterator<Versioned<Slop>> entryIterator)
          Update slops which may be meant for multiple stores
 java.lang.String waitForCompletion(int nodeId, int requestId, long maxWait, java.util.concurrent.TimeUnit timeUnit)
          Wait for async task at (remote) nodeId to finish completion, using exponential backoff to poll the task completion status.
 java.lang.String waitForCompletion(int nodeId, int requestId, long maxWait, java.util.concurrent.TimeUnit timeUnit, AsyncOperationStatus higherStatus)
          Wait for async task at (remote) nodeId to finish completion, using exponential backoff to poll the task completion status.
 void waitForCompletion(int nodeId, java.lang.String key, java.lang.String value, long maxWait, java.util.concurrent.TimeUnit timeUnit)
          Wait till the passed value matches with the metadata value returned by the remote node for the passed key.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

restoreStoreEngineBlackList

public static final java.util.List<java.lang.String> restoreStoreEngineBlackList
Constructor Detail

AdminClient

public AdminClient(java.lang.String bootstrapURL,
                   AdminClientConfig adminClientConfig)
Create an instance of AdminClient given a URL of a node in the cluster. The bootstrap URL is used to get the cluster metadata.

Parameters:
bootstrapURL - URL pointing to the bootstrap node
adminClientConfig - Configuration for AdminClient specifying client parameters eg.
  • number of threads
  • number of sockets per node
  • socket buffer size

AdminClient

public AdminClient(Cluster cluster,
                   AdminClientConfig adminClientConfig)
Create an instance of AdminClient given a Cluster object.

Parameters:
cluster - Initialized cluster object, describing the nodes we wish to contact
adminClientConfig - Configuration for AdminClient specifying client parameters eg.
  • number of threads
  • number of sockets per node
  • socket buffer size
Method Detail

updateEntries

public void updateEntries(int nodeId,
                          java.lang.String storeName,
                          java.util.Iterator<Pair<ByteArray,Versioned<byte[]>>> entryIterator,
                          VoldemortFilter filter)
Update a stream of key/value entries at the given node. The iterator entries are streamed from the client to the server:
  1. Client performs a handshake with the server (sending in the update entries request with a store name and a VoldemortFilter instance.
  2. While entryIterator has entries, the client will keep sending the updates one after another to the server, buffering the data, without waiting for a response from the server.
  3. After iteration is complete, send an end of stream message, force a flush of the buffer, check the response on the server to check if a VoldemortException has occured.

Parameters:
nodeId - Id of the remote node (where we wish to update the entries)
storeName - Store name for the entries
entryIterator - Iterator of key-value pairs for the entries
filter - Custom filter implementation to filter out entries which should not be updated.
Throws:
VoldemortException

fetchEntries

public java.util.Iterator<Pair<ByteArray,Versioned<byte[]>>> fetchEntries(int nodeId,
                                                                          java.lang.String storeName,
                                                                          java.util.List<java.lang.Integer> partitionList,
                                                                          VoldemortFilter filter,
                                                                          boolean fetchMasterEntries,
                                                                          long skipRecords)
Legacy interface for fetching entries. See fetchEntries(int, String, HashMap, VoldemortFilter, boolean, Cluster, long) for more information.

Parameters:
nodeId - Id of the node to fetch from
storeName - Name of the store
partitionList - List of the partitions
filter - Custom filter implementation to filter out entries which should not be fetched.
fetchMasterEntries - Fetch an entry only if master replica
skipRecords - Number of records to skip
Returns:
An iterator which allows entries to be streamed as they're being iterated over.

fetchEntries

public java.util.Iterator<Pair<ByteArray,Versioned<byte[]>>> fetchEntries(int nodeId,
                                                                          java.lang.String storeName,
                                                                          java.util.List<java.lang.Integer> partitionList,
                                                                          VoldemortFilter filter,
                                                                          boolean fetchMasterEntries)
Legacy interface for fetching entries. See fetchEntries(int, String, HashMap, VoldemortFilter, boolean, Cluster, long) for more information.

Parameters:
nodeId - Id of the node to fetch from
storeName - Name of the store
partitionList - List of the partitions
filter - Custom filter implementation to filter out entries which should not be fetched.
fetchMasterEntries - Fetch an entry only if master replica
Returns:
An iterator which allows entries to be streamed as they're being iterated over.

fetchEntries

public java.util.Iterator<Pair<ByteArray,Versioned<byte[]>>> fetchEntries(int nodeId,
                                                                          java.lang.String storeName,
                                                                          java.util.HashMap<java.lang.Integer,java.util.List<java.lang.Integer>> replicaToPartitionList,
                                                                          VoldemortFilter filter,
                                                                          boolean fetchMasterEntries,
                                                                          Cluster initialCluster,
                                                                          long skipRecords)
Fetch key/value tuples belonging to this map of replica type to partition list

Streaming API - The server keeps sending the messages as it's iterating over the data. Once iteration has finished, the server sends an "end of stream" marker and flushes its buffer. A response indicating a VoldemortException may be sent at any time during the process.

Entries are being streamed as the iteration happens i.e. the whole result set is not buffered in memory.

Parameters:
nodeId - Id of the node to fetch from
storeName - Name of the store
replicaToPartitionList - Mapping of replica type to partition list
filter - Custom filter implementation to filter out entries which should not be fetched.
fetchMasterEntries - Fetch an entry only if master replica
initialCluster - The cluster metadata to use while making the decision to fetch entries. This is important during rebalancing where-in we want to fetch keys using an older metadata compared to the new one.
skipRecords - Number of records to skip
Returns:
An iterator which allows entries to be streamed as they're being iterated over.

fetchKeys

public java.util.Iterator<ByteArray> fetchKeys(int nodeId,
                                               java.lang.String storeName,
                                               java.util.List<java.lang.Integer> partitionList,
                                               VoldemortFilter filter,
                                               boolean fetchMasterEntries,
                                               long skipRecords)
Legacy interface for fetching entries. See fetchKeys(int, String, HashMap, VoldemortFilter, boolean, Cluster, long) for more information.

Parameters:
nodeId - Id of the node to fetch from
storeName - Name of the store
partitionList - List of the partitions to retrieve
filter - Custom filter implementation to filter out entries which should not be fetched.
fetchMasterEntries - Fetch a key only if master replica
skipRecords - Number of keys to skip
Returns:
An iterator which allows keys to be streamed as they're being iterated over.

fetchKeys

public java.util.Iterator<ByteArray> fetchKeys(int nodeId,
                                               java.lang.String storeName,
                                               java.util.List<java.lang.Integer> partitionList,
                                               VoldemortFilter filter,
                                               boolean fetchMasterEntries)
Legacy interface for fetching entries. See fetchKeys(int, String, HashMap, VoldemortFilter, boolean, Cluster, long) for more information.

Parameters:
nodeId - Id of the node to fetch from
storeName - Name of the store
partitionList - List of the partitions to retrieve
filter - Custom filter implementation to filter out entries which should not be fetched.
fetchMasterEntries - Fetch a key only if master replica
Returns:
An iterator which allows keys to be streamed as they're being iterated over.

fetchKeys

public java.util.Iterator<ByteArray> fetchKeys(int nodeId,
                                               java.lang.String storeName,
                                               java.util.HashMap<java.lang.Integer,java.util.List<java.lang.Integer>> replicaToPartitionList,
                                               VoldemortFilter filter,
                                               boolean fetchMasterEntries,
                                               Cluster initialCluster,
                                               long skipRecords)
Fetch all keys belonging to the map of replica type to partition list. Identical to fetchEntries(int, java.lang.String, java.util.List, voldemort.client.protocol.VoldemortFilter, boolean, long) but only fetches the keys

Parameters:
nodeId - The node id from where to fetch the keys
storeName - The store name whose keys we want to retrieve
replicaToPartitionList - Map of replica type to corresponding partition list
filter - Custom filter
initialCluster - Cluster to use for selecting a key. If null, use the default metadata from the metadata store
skipRecords - Number of records to skip [ Used for sampling ]
Returns:
Returns an iterator of the keys

restoreDataFromReplications

public void restoreDataFromReplications(int nodeId,
                                        int parallelTransfers)
RestoreData from copies on other machines for the given nodeId

Recovery mechanism to recover and restore data actively from replicated copies in the cluster.

Parameters:
nodeId - Id of the node to restoreData
parallelTransfers - number of transfers
Throws:
java.lang.InterruptedException

getReplicationMapping

public java.util.Map<java.lang.Integer,java.util.HashMap<java.lang.Integer,java.util.List<java.lang.Integer>>> getReplicationMapping(int restoringNode,
                                                                                                                                     Cluster cluster,
                                                                                                                                     StoreDefinition storeDef)
For a particular node, finds out all the [replica, partition] tuples it needs to steal in order to be brought back to normal state

Parameters:
restoringNode - The id of the node which needs to be restored
cluster - The cluster definition
storeDef - The store definition to use
Returns:
Map of node id to map of replica type and corresponding partition list

rebalanceNode

public int rebalanceNode(RebalancePartitionsInfo stealInfo)
Rebalance a stealer-donor node pair for a set of stores

Parameters:
stealInfo - Partition steal information
Returns:
The request id of the async operation

migratePartitions

public int migratePartitions(int donorNodeId,
                             int stealerNodeId,
                             java.lang.String storeName,
                             java.util.List<java.lang.Integer> stealPartitionList,
                             VoldemortFilter filter)
Migrate keys/values belonging to stealPartitionList ( can be primary or replica ) from donor node to stealer node. Does not delete the partitions from donorNode, merely copies them.

See migratePartitions(int, int, String, HashMap, VoldemortFilter, Cluster, boolean) for more details.

Parameters:
donorNodeId - Node from which the partitions are to be streamed.
stealerNodeId - Node to which the partitions are to be streamed.
storeName - Name of the store to stream.
stealPartitionList - List of partitions to stream.
filter - Custom filter implementation to filter out entries which should not be deleted.
Returns:
The value of the AsyncOperation created on stealerNodeId which is performing the operation.

migratePartitions

public int migratePartitions(int donorNodeId,
                             int stealerNodeId,
                             java.lang.String storeName,
                             java.util.HashMap<java.lang.Integer,java.util.List<java.lang.Integer>> replicaToPartitionList,
                             VoldemortFilter filter,
                             Cluster initialCluster,
                             boolean optimize)
Migrate keys/values belonging to a map of replica type to partition list from donor node to stealer node. Does not delete the partitions from donorNode, merely copies them.

This is a background operation (see AsyncOperation that runs on the stealer node where updates are performed.

Parameters:
donorNodeId - Node from which the partitions are to be streamed.
stealerNodeId - Node to which the partitions are to be streamed.
storeName - Name of the store to stream.
replicaToPartitionList - Mapping from replica type to partition to be stolen
filter - Voldemort post-filter
initialCluster - The cluster metadata to use for making the decision if the key belongs to these partitions. If not specified, falls back to the metadata stored on the box
optimize - We can run an optimization at this level where-in we try avoid copying of data which already exists ( in the form of a replica ). We do need to disable this when we're trying to recover a node which was completely damaged ( restore from replica ).
Returns:
The value of the AsyncOperation created on stealer node which is performing the operation.

truncate

public void truncate(int nodeId,
                     java.lang.String storeName)
Delete the store completely (Deletes all data) from the remote node.

Parameters:
nodeId - The node id on which the store is present
storeName - The name of the store

getAsyncRequestStatus

public AsyncOperationStatus getAsyncRequestStatus(int nodeId,
                                                  int requestId)
Get the status of an Async Operation running at (remote) node. If The operation is complete, then the operation will be removed from a list of currently running operations.

Parameters:
nodeId - Id on which the operation is running
requestId - Id of the operation itself
Returns:
The status of the operation

getAsyncRequestList

public java.util.List<java.lang.Integer> getAsyncRequestList(int nodeId)
Retrieves a list of asynchronous request ids on the server. Does not include the completed requests

Parameters:
nodeId - The id of the node whose request ids we want
Returns:
List of async request ids

getAsyncRequestList

public java.util.List<java.lang.Integer> getAsyncRequestList(int nodeId,
                                                             boolean showComplete)
Retrieves a list of asynchronous request ids on the server. Depending on the boolean passed also retrieves the completed requests

Parameters:
nodeId - The id of the node whose request ids we want
showComplete - Boolean to indicate if we want to include the completed requests as well
Returns:
List of async request ids

stopAsyncRequest

public void stopAsyncRequest(int nodeId,
                             int requestId)
To stop an asynchronous request on the particular node

Parameters:
nodeId - The id of the node on which the request is running
requestId - The id of the request to terminate

deletePartitions

public long deletePartitions(int nodeId,
                             java.lang.String storeName,
                             java.util.List<java.lang.Integer> partitionList,
                             VoldemortFilter filter)
Delete all entries belonging to a list of partitions

Parameters:
nodeId - Node on which the entries to be deleted
storeName - Name of the store holding the entries
partitionList - List of partitions to delete.
filter - Custom filter implementation to filter out entries which should not be deleted.
Returns:
Number of entries deleted

deletePartitions

public long deletePartitions(int nodeId,
                             java.lang.String storeName,
                             java.util.HashMap<java.lang.Integer,java.util.List<java.lang.Integer>> replicaToPartitionList,
                             Cluster initialCluster,
                             VoldemortFilter filter)
Delete all entries belonging to all the partitions passed as a map of replica_type to partition list. Works only for RW stores.

Parameters:
nodeId - Node on which the entries to be deleted
storeName - Name of the store holding the entries
replicaToPartitionList - Map of replica type to partition list
filter - Custom filter implementation to filter out entries which should not be deleted.
Returns:
Number of entries deleted

throwException

public void throwException(VProto.Error error)

stop

public void stop()
Stop the AdminClient cleanly freeing all resources.


waitForCompletion

public java.lang.String waitForCompletion(int nodeId,
                                          int requestId,
                                          long maxWait,
                                          java.util.concurrent.TimeUnit timeUnit,
                                          AsyncOperationStatus higherStatus)
Wait for async task at (remote) nodeId to finish completion, using exponential backoff to poll the task completion status.

Logs the status at each status check if debug is enabled.

Parameters:
nodeId - Id of the node to poll
requestId - Id of the request to check
maxWait - Maximum time we'll keep checking a request until we give up
timeUnit - Unit in which maxWait is expressed.
higherStatus - A higher level async operation object. If this waiting is being run another async operation this helps us propagate the status all the way up.
Returns:
description The final description attached with the response
Throws:
VoldemortException - if task failed to finish in specified maxWait time.

waitForCompletion

public java.lang.String waitForCompletion(int nodeId,
                                          int requestId,
                                          long maxWait,
                                          java.util.concurrent.TimeUnit timeUnit)
Wait for async task at (remote) nodeId to finish completion, using exponential backoff to poll the task completion status.

Logs the status at each status check if debug is enabled.

Parameters:
nodeId - Id of the node to poll
requestId - Id of the request to check
maxWait - Maximum time we'll keep checking a request until we give up
timeUnit - Unit in which maxWait is expressed.
Returns:
description The final description attached with the response
Throws:
VoldemortException - if task failed to finish in specified maxWait time.

waitForCompletion

public void waitForCompletion(int nodeId,
                              java.lang.String key,
                              java.lang.String value,
                              long maxWait,
                              java.util.concurrent.TimeUnit timeUnit)
Wait till the passed value matches with the metadata value returned by the remote node for the passed key.

Logs the status at each status check if debug is enabled.

Parameters:
nodeId - Id of the node to poll
key - metadata key to keep checking for current value
value - metadata value should match for exit criteria.
maxWait - Maximum time we'll keep checking a request until we give up
timeUnit - Unit in which maxWait is expressed.

updateRemoteMetadata

public void updateRemoteMetadata(int remoteNodeId,
                                 java.lang.String key,
                                 Versioned<java.lang.String> value)
Update metadata at the given remoteNodeId.

Metadata keys can be one of MetadataStore.METADATA_KEYS
eg.

  • cluster metadata (cluster.xml as string)
  • stores definitions (stores.xml as string)
  • Server states
    See MetadataStore for more information.

    Parameters:
    remoteNodeId - Id of the node
    key - Metadata key to update
    value - Value for the metadata key

  • getRemoteMetadata

    public Versioned<java.lang.String> getRemoteMetadata(int remoteNodeId,
                                                         java.lang.String key)
    Get the metadata on a remote node.

    Metadata keys can be one of MetadataStore.METADATA_KEYS
    eg.

  • cluster metadata (cluster.xml as string)
  • stores definitions (stores.xml as string)
  • Server states
    See MetadataStore for more information.

    Parameters:
    remoteNodeId - Id of the node
    key - Metadata key to update
    Returns:
    Metadata with its associated Version

  • updateRemoteCluster

    public void updateRemoteCluster(int nodeId,
                                    Cluster cluster,
                                    Version clock)
                             throws VoldemortException
    Update the cluster information MetadataStore.CLUSTER_KEY on a remote node.

    Parameters:
    nodeId - Id of the remote node
    cluster - The new cluster object
    Throws:
    VoldemortException

    getRemoteCluster

    public Versioned<Cluster> getRemoteCluster(int nodeId)
                                        throws VoldemortException
    Get the cluster information from a remote node.

    Parameters:
    nodeId - Node to retrieve information from
    Returns:
    A cluster object with its Version
    Throws:
    VoldemortException

    updateRemoteStoreDefList

    public void updateRemoteStoreDefList(int nodeId,
                                         java.util.List<StoreDefinition> storesList)
                                  throws VoldemortException
    Update the store definitions on a remote node.

    Parameters:
    nodeId - The node id of the machine
    storesList - The new store list
    Throws:
    VoldemortException

    getRemoteStoreDefList

    public Versioned<java.util.List<StoreDefinition>> getRemoteStoreDefList(int nodeId)
                                                                     throws VoldemortException
    Retrieve the store definitions from a remote node.

    Parameters:
    nodeId - The node id from which we can to remote the store definition
    Returns:
    The list of store definitions from the remote machine
    Throws:
    VoldemortException

    updateRemoteServerState

    public void updateRemoteServerState(int nodeId,
                                        MetadataStore.VoldemortState state,
                                        Version clock)
    Update the server state ( MetadataStore.VoldemortState) on a remote node.


    getRemoteServerState

    public Versioned<MetadataStore.VoldemortState> getRemoteServerState(int nodeId)
    Retrieve the server state from a remote node.


    addStore

    public void addStore(StoreDefinition def)
    Add a new store definition to all active nodes in the cluster.

    Parameters:
    def - the definition of the store to add

    addStore

    public void addStore(StoreDefinition def,
                         int nodeId)
    Add a new store definition to a particular node

    Parameters:
    def - the definition of the store to add
    nodeId - Node on which to add the store

    deleteStore

    public void deleteStore(java.lang.String storeName)
    Delete a store from all active nodes in the cluster

    Parameters:
    storeName - name of the store to delete

    deleteStore

    public void deleteStore(java.lang.String storeName,
                            int nodeId)
    Delete a store from a particular node

    Parameters:
    storeName - name of the store to delete
    nodeId - Node on which we want to delete a store

    setAdminClientCluster

    public void setAdminClientCluster(Cluster cluster)
    Set cluster info for AdminClient to use.

    Parameters:
    cluster - Set the current cluster

    getAdminClientCluster

    public Cluster getAdminClientCluster()
    Get the cluster info AdminClient is using.

    Returns:
    Returns the current cluster being used by the admin client

    rollbackStore

    public void rollbackStore(int nodeId,
                              java.lang.String storeName,
                              long pushVersion)
    Rollback RO store to most recent backup of the current store

    Parameters:
    nodeId - The node id on which to rollback
    storeName - The name of the RO Store to rollback
    pushVersion - The version of the push to revert back to

    fetchStore

    public java.lang.String fetchStore(int nodeId,
                                       java.lang.String storeName,
                                       java.lang.String storeDir,
                                       long pushVersion,
                                       long timeoutMs)
    Fetch data from directory 'storeDir' on node id

    Parameters:
    nodeId - The id of the node on which to fetch the data
    storeName - The name of the store
    storeDir - The directory from where to read the data
    pushVersion - The version of the push
    timeoutMs - Time timeout in milliseconds
    Returns:
    The path of the directory where the data is stored finally

    failedFetchStore

    public void failedFetchStore(int nodeId,
                                 java.lang.String storeName,
                                 java.lang.String storeDir)
    When a fetch store fails, we don't need to keep the pushed data around. This function deletes its...

    Parameters:
    nodeId - The node id on which to delete the data
    storeName - The name of the store
    storeDir - The directory to delete

    swapStore

    public java.lang.String swapStore(int nodeId,
                                      java.lang.String storeName,
                                      java.lang.String storeDir)
    Swap store data atomically on a single node

    Parameters:
    nodeId - The node id where we would want to swap the data
    storeName - Name of the store
    storeDir - The directory where the data is present
    Returns:
    Returns the location of the previous directory

    getROStorageFormat

    public java.util.Map<java.lang.String,java.lang.String> getROStorageFormat(int nodeId,
                                                                               java.util.List<java.lang.String> storeNames)
    Returns the read-only storage format - ReadOnlyStorageFormat for a list of stores

    Parameters:
    nodeId - The id of the node on which the stores are present
    storeNames - List of all the store names
    Returns:
    Returns a map of store name to its corresponding RO storage format

    getROMaxVersionDir

    public java.util.Map<java.lang.String,java.lang.String> getROMaxVersionDir(int nodeId,
                                                                               java.util.List<java.lang.String> storeNames)
    Returns the max version of push currently being used by read-only store. Important to remember that this may not be the 'current' version since multiple pushes (with greater version numbers) may be in progress currently

    Parameters:
    nodeId - The id of the node on which the store is present
    storeNames - List of all the stores
    Returns:
    Returns a map of store name to the respective store directory

    getROCurrentVersionDir

    public java.util.Map<java.lang.String,java.lang.String> getROCurrentVersionDir(int nodeId,
                                                                                   java.util.List<java.lang.String> storeNames)
    Returns the 'current' versions of all RO stores provided

    Parameters:
    nodeId - The id of the node on which the store is present
    storeNames - List of all the RO stores
    Returns:
    Returns a map of store name to the respective max version directory

    getROCurrentVersion

    public java.util.Map<java.lang.String,java.lang.Long> getROCurrentVersion(int nodeId,
                                                                              java.util.List<java.lang.String> storeNames)
    Returns the 'current' version of RO store

    Parameters:
    nodeId - The id of the node on which the store is present
    storeNames - List of all the stores
    Returns:
    Returns a map of store name to the respective max version number

    getROMaxVersion

    public java.util.Map<java.lang.String,java.lang.Long> getROMaxVersion(int nodeId,
                                                                          java.util.List<java.lang.String> storeNames)
    Returns the max version of push currently being used by read-only store. Important to remember that this may not be the 'current' version since multiple pushes (with greater version numbers) may be in progress currently

    Parameters:
    nodeId - The id of the node on which the store is present
    storeNames - List of all the stores
    Returns:
    Returns a map of store name to the respective max version number

    getROMaxVersion

    public java.util.Map<java.lang.String,java.lang.Long> getROMaxVersion(java.util.List<java.lang.String> storeNames)
    This is a wrapper around getROMaxVersion(int, List) where-in we find the max versions on each machine and then return the max of all of them

    Parameters:
    storeNames - List of all read-only stores
    Returns:
    A map of store-name to their corresponding max version id

    updateSlopEntries

    public void updateSlopEntries(int nodeId,
                                  java.util.Iterator<Versioned<Slop>> entryIterator)
    Update slops which may be meant for multiple stores

    Parameters:
    nodeId - The id of the node
    entryIterator - An iterator over all the slops for this particular node

    fetchPartitionFiles

    public void fetchPartitionFiles(int nodeId,
                                    java.lang.String storeName,
                                    java.util.HashMap<java.lang.Integer,java.util.List<java.lang.Integer>> replicaToPartitionList,
                                    java.lang.String destinationDirPath,
                                    java.util.Set<java.lang.Object> notAcceptedBuckets,
                                    java.util.concurrent.atomic.AtomicBoolean running)
    Fetch read-only store files to a specified directory. This is run on the stealer node side

    Parameters:
    nodeId - The node id from where to copy
    storeName - The name of the read-only store
    replicaToPartitionList - Map of replica type to partition list
    destinationDirPath - The destination path
    notAcceptedBuckets - These are Pair< partition, replica > which we cannot copy AT all. This is because these are current mmap-ed and are serving traffic.
    running - A boolean which will control when we want to stop the copying of files. As long this is true, we will continue copying. Once this is changed to false we'll disable the copying

    rebalanceStateChange

    public void rebalanceStateChange(Cluster existingCluster,
                                     Cluster transitionCluster,
                                     java.util.List<RebalancePartitionsInfo> rebalancePartitionPlanList,
                                     boolean swapRO,
                                     boolean changeClusterMetadata,
                                     boolean changeRebalanceState,
                                     boolean rollback,
                                     boolean failEarly)
    Used in rebalancing to indicate change in states. Groups the partition plans on the basis of stealer nodes and sends them over. The various combinations and their order of execution is given below
     | swapRO | changeClusterMetadata | changeRebalanceState | Order |
     | f | t | t | cluster -> rebalance | 
     | f | f | t | rebalance |
     | t | t | f | cluster -> swap |
     | t | t | t | cluster -> swap -> rebalance |
     
    Similarly for rollback:
     | swapRO | changeClusterMetadata | changeRebalanceState | Order |
     | f | t | t | remove from rebalance -> cluster  | 
     | f | f | t | remove from rebalance |
     | t | t | f | cluster -> swap |
     | t | t | t | remove from rebalance -> cluster -> swap  |
     

    Parameters:
    existingCluster - Current cluster
    transitionCluster - Transition cluster
    rebalancePartitionPlanList - The list of rebalance partition info plans
    swapRO - Boolean indicating if we need to swap RO stores
    changeClusterMetadata - Boolean indicating if we need to change cluster metadata
    changeRebalanceState - Boolean indicating if we need to change rebalancing state
    rollback - Do we want to do a rollback step in case of failures?
    failEarly - Do we want to fail early while doing state change?


    Jay Kreps, Roshan Sumbaly, Alex Feinberg, Bhupesh Bansal, Lei Gao