voldemort.cluster.failuredetector
Class AsyncRecoveryFailureDetector
java.lang.Object
voldemort.cluster.failuredetector.AbstractFailureDetector
voldemort.cluster.failuredetector.AsyncRecoveryFailureDetector
- All Implemented Interfaces:
- java.lang.Runnable, FailureDetector
- Direct Known Subclasses:
- ThresholdFailureDetector
public class AsyncRecoveryFailureDetector
- extends AbstractFailureDetector
- implements java.lang.Runnable
AsyncRecoveryFailureDetector detects failures and then attempts to contact
the failing node's Store to determine availability.
When a node goes down, attempts to access the remote Store for that node
may take several seconds. Rather than block the calling thread, this class
performs the availability check in a background thread.
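The background-check idea described above can be sketched as follows. This is a minimal, self-contained illustration and not the Voldemort implementation: the class name AsyncRecoverySketch, the use of String node names, the probe predicate, and the checkOnce helper are all assumptions made for the example.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

// Hypothetical stand-in for an async-recovery detector; not Voldemort code.
public class AsyncRecoverySketch implements Runnable {
    // Nodes currently believed to be down.
    private final Set<String> unavailable = ConcurrentHashMap.newKeySet();
    // Assumed probe: returns true if the node's store answers a request.
    private final Predicate<String> probe;
    private final long intervalMs;
    private volatile boolean running = true;

    public AsyncRecoverySketch(Predicate<String> probe, long intervalMs) {
        this.probe = probe;
        this.intervalMs = intervalMs;
    }

    // Cheap lookup for callers; never contacts the remote node.
    public boolean isAvailable(String node) {
        return !unavailable.contains(node);
    }

    // In this sketch a reported error marks the node unavailable immediately.
    public void recordException(String node) {
        unavailable.add(node);
    }

    // One pass over the down nodes; any slow probe runs here,
    // on the background thread, rather than on a caller's thread.
    public void checkOnce() {
        unavailable.removeIf(probe);
    }

    @Override
    public void run() {
        while (running) {
            checkOnce();
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    // Stops the background loop after its current pass.
    public void destroy() {
        running = false;
    }
}
```

A caller would typically start the loop with new Thread(detector).start() and call destroy() at shutdown. Keeping isAvailable a plain set lookup is what lets request threads avoid the multi-second wait.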
Method Summary
- void destroy(): Cleans up any open resources in preparation for shutdown.
- boolean isAvailable(Node node): Determines if the node is available or offline.
- protected void nodeRecovered(Node node)
- void recordException(Node node, long requestTime, UnreachableStoreException e): Allows external callers to provide input to the FailureDetector that an error occurred when trying to access the node.
- void recordSuccess(Node node, long requestTime): Allows external callers to provide input to the FailureDetector that an access to the node succeeded.
- void run()
Methods inherited from class voldemort.cluster.failuredetector.AbstractFailureDetector
addFailureDetectorListener, checkArgs, checkNodeArg, getAvailableNodeCount, getAvailableNodes, getConfig, getLastChecked, getNodeCount, getNodeStatus, getUnavailableNodes, removeFailureDetectorListener, setAvailable, setUnavailable, waitForAvailability
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
AsyncRecoveryFailureDetector
public AsyncRecoveryFailureDetector(FailureDetectorConfig failureDetectorConfig)
isAvailable
public boolean isAvailable(Node node)
- Description copied from interface:
FailureDetector
- Determines if the node is available or offline.
The isAvailable method is a simple boolean operation to determine if the
node in question is available. As expected, the result of this call is an
approximation given race conditions. However, the FailureDetector should
do its best to determine the then-current state of the cluster to produce
a minimum of false negatives and false positives.
Note: this determination is approximate and differs based upon the
algorithm used by the implementation.
- Specified by:
isAvailable
in interface FailureDetector
- Parameters:
node
- Node to check
- Returns:
- True if available, false otherwise
recordException
public void recordException(Node node,
long requestTime,
UnreachableStoreException e)
- Description copied from interface:
FailureDetector
- Allows external callers to provide input to the FailureDetector that an
error occurred when trying to access the node. The implementation is free
to use or ignore this input. It can be considered a "hint" to the
FailureDetector rather than an absolute truth. For example, it is
possible to call recordException for a given node and have an immediate
call to isAvailable return true, depending on the implementation.
- Specified by:
recordException
in interface FailureDetector
- Parameters:
node
- Node to check
requestTime
- Length of time (in milliseconds) to perform request
e
- Exception that occurred when trying to access the node
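To illustrate how recordException can act as a hint rather than an absolute truth, here is a small sketch of a detector that tolerates a few consecutive failures before reporting a node as unavailable. The class HintDetectorSketch, its String node names, and the consecutive-failure rule are hypothetical choices for this example; the documentation above does not say which policy any Voldemort implementation uses.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical detector that treats reported errors as hints; not Voldemort code.
public class HintDetectorSketch {
    private final int maxConsecutiveFailures;
    // Per-node count of failures since the last recorded success.
    private final Map<String, AtomicInteger> failures = new ConcurrentHashMap<>();

    public HintDetectorSketch(int maxConsecutiveFailures) {
        this.maxConsecutiveFailures = maxConsecutiveFailures;
    }

    // Records the hint; does not by itself decide availability.
    public void recordException(String node) {
        failures.computeIfAbsent(node, k -> new AtomicInteger()).incrementAndGet();
    }

    // A success resets the failure streak for that node.
    public void recordSuccess(String node) {
        AtomicInteger count = failures.get(node);
        if (count != null) {
            count.set(0);
        }
    }

    // Available until the failure streak reaches the configured limit.
    public boolean isAvailable(String node) {
        AtomicInteger count = failures.get(node);
        return count == null || count.get() < maxConsecutiveFailures;
    }
}
```

With a limit of 3, a single recordException call followed immediately by isAvailable still returns true, which is the situation the description above allows for.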
recordSuccess
public void recordSuccess(Node node,
long requestTime)
- Description copied from interface:
FailureDetector
- Allows external callers to provide input to the FailureDetector that an
access to the node succeeded. As with recordException, the implementation
is free to use or ignore this input. It can be considered a "hint" to the
FailureDetector rather than gospel truth.
Note for implementors: because of threading issues it's possible
for multiple threads to attempt access to a node and some fail and some
succeed. In a classic last-one-in-wins scenario, it's possible for the
failures to be recorded first and then the successes. It would be prudent
for implementations not to immediately assume that the node is then
available.
- Specified by:
recordSuccess
in interface FailureDetector
- Parameters:
node
- Node to check
requestTime
- Length of time (in milliseconds) to perform request
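The note for implementors above warns against treating one success as proof of recovery. One possible way to honor it, sketched here under assumptions, is to require several consecutive successes before a node that was marked down is reported as available again. The class CautiousRecoverySketch tracks a single node for brevity, and both its name and its streak rule are hypothetical, not taken from Voldemort.

```java
// Hypothetical single-node state tracker; not Voldemort code.
public class CautiousRecoverySketch {
    private final int requiredSuccesses;
    private int successStreak = 0;
    private boolean available = true;

    public CautiousRecoverySketch(int requiredSuccesses) {
        this.requiredSuccesses = requiredSuccesses;
    }

    // Any failure marks the node down and discards earlier successes.
    public synchronized void recordException() {
        available = false;
        successStreak = 0;
    }

    // A late success from another thread only counts toward the streak;
    // the node comes back once enough successes arrive in a row.
    public synchronized void recordSuccess() {
        if (!available && ++successStreak >= requiredSuccesses) {
            available = true;
            successStreak = 0;
        }
    }

    public synchronized boolean isAvailable() {
        return available;
    }
}
```

If failures are recorded first and a straggling success arrives afterwards, the node stays unavailable until the streak is complete, so a single out-of-order report cannot flip the state.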
destroy
public void destroy()
- Description copied from interface:
FailureDetector
- Cleans up any open resources in preparation for shutdown.
Note for implementors: After this method is called it is assumed
that attempts to call the other methods will either silently fail, throw
errors, or return stale information.
- Specified by:
destroy
in interface FailureDetector
- Overrides:
destroy
in class AbstractFailureDetector
run
public void run()
- Specified by:
run
in interface java.lang.Runnable
nodeRecovered
protected void nodeRecovered(Node node)
Jay Kreps, Roshan Sumbaly, Alex Feinberg, Bhupesh Bansal, Lei Gao, Chinmay Soman, Vinoth Chandar, Zhongjie Wu