Total Recall

Total Recall is an effort to address reliability and availability in
systems built from highly unavailable components. Explicit availability
management is especially important when components undergo frequent,
transient failures, and more so when the characteristics of those
components change over time.

Currently, Total Recall focuses on automating availability management
in peer-to-peer systems. Transient failures are frequent in such
systems: hosts leave the network periodically and return at a later
time. Host behavior in peer-to-peer systems needs to be understood, and
system support has to take these characteristics into account.

To understand host characteristics, we have performed a measurement
study and evaluated several means of providing redundancy in
peer-to-peer systems. We are now applying our findings to the design
and implementation of a highly available mutable file system.
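
As a rough illustration of why redundancy matters when hosts are often
offline, the sketch below estimates how many whole-file replicas are
needed to reach a target availability. It is not Total Recall's method:
it assumes hosts fail independently and each is online with the same
probability p, so k replicas are all unavailable with probability
(1 - p)^k. The function names are hypothetical and chosen only for this
example.

```python
def replicated_availability(p: float, k: int) -> float:
    """Probability that at least one of k replicas is reachable.

    Assumption: each host is online independently with probability p.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if k < 1:
        raise ValueError("k must be at least 1")
    return 1.0 - (1.0 - p) ** k


def replicas_needed(p: float, target: float) -> int:
    """Smallest k for which replicated_availability(p, k) >= target."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    k = 1
    all_down = 1.0 - p  # probability that every replica so far is offline
    while 1.0 - all_down < target:
        k += 1
        all_down *= 1.0 - p
    return k


if __name__ == "__main__":
    # Hosts online half the time, target of 99.9% availability.
    print(replicas_needed(0.5, 0.999))
```

Under these assumptions, hosts that are online only half the time need
ten replicas to reach 99.9% availability, which suggests why the choice
of redundancy scheme deserves careful evaluation.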