**Êtes-vous un étudiant de l'EPFL à la recherche d'un projet de semestre?**

Travaillez avec nous sur des projets en science des données et en visualisation, et déployez votre projet sous forme d'application sur GraphSearch.

Personne# Vinodh Venkatesan

Cette page est générée automatiquement et peut contenir des informations qui ne sont pas correctes, complètes, à jour ou pertinentes par rapport à votre recherche. Il en va de même pour toutes les autres pages de ce site. Veillez à vérifier les informations auprès des sources officielles de l'EPFL.

Unités associées

Chargement

Cours enseignés par cette personne

Chargement

Domaines de recherche associés

Chargement

Publications associées

Chargement

Personnes menant des recherches similaires

Chargement

Cours enseignés par cette personne

Aucun résultat

Unités associées (2)

Personnes menant des recherches similaires

Domaines de recherche associés

Aucun résultat

Aucun résultat

Publications associées (2)

Chargement

Chargement

Modern data storage systems are extremely large and consist of several tens or hundreds of nodes. In such systems, node failures are daily events, and safeguarding data from them poses a serious design challenge. The focus of this thesis is on the data reliability analysis of storage systems and, in particular, on the effect of different design choices and parameters on the system reliability. Data redundancy, in the form of replication or advanced erasure codes, is used to protect data from node failures. By storing redundant data across several nodes, the surviving redundant data on surviving nodes can be used to rebuild the data lost by the failed nodes if node failures occur. As these rebuild processes take a finite amount of time to complete, there exists a nonzero probability of additional node failures during rebuild, which eventually may lead to a situation in which some of the data have lost so much redundancy that they become irrecoverably lost from the system. The average time taken by the system to suffer an irrecoverable data loss, also known as the mean time to data loss (MTTDL), is a measure of data reliability that is commonly used to compare different redundancy schemes and to study the effect of various design parameters. The theoretical analysis of MTTDL, however, is a challenging problem for non-exponential real-world failure and rebuild time distributions and for general data placement schemes. To address this issue, a methodology for reliability analysis is developed in this thesis that is based on the probability of direct path to data loss during rebuild. The reliability analysis is detailed in the sense that it accounts for the rebuild times involved, the amounts of partially rebuilt data when additional nodes fail during rebuild, and the fact that modern systems use an intelligent rebuild process that will first rebuild the data having the least amount of redundancy left. Through rigorous arguments and simulations it is established that the methodology developed is well-suited for the reliability analysis of real-world data storage systems. Applying this methodology to data storage systems with different types of redundancy, various data placement schemes, and rebuild constraints, the effect of the design parameters on the system reliability is studied. When sufficient network bandwidth is available for rebuild processes, it is shown that spreading the redundant data corresponding to the data on each node across a higher number of other nodes and using a distributed and intelligent rebuild process will improve the system MTTDL. In particular, declustered placement, which corresponds to spreading the redundant data corresponding to each node equally across all other nodes of the system, is found to potentially have significantly higher MTTDL values than other placement schemes, especially for large storage systems. This implies that more reliable data storage systems can be designed merely by changing the data placement without compromising on the storage efficiency or performance. The effect of a limited network rebuild bandwidth on the system reliability is also analyzed, and it is shown that, for certain redundancy schemes, spreading redundant data across more number of nodes can actually have a detrimental effect on reliability. It is also shown that the MTTDL values are invariant in a large class of node failure time distributions with the same mean. This class includes the exponential distribution as well as the real-world distributions, such as Weibull or gamma. This result implies that the system MTTDL will not be affected if the failure distribution is changed to a corresponding exponential one with the same mean. This observation is also of great importance because it suggests that the MTTDL results obtained in the literature by assuming exponential node failure distributions may still be valid for real-world storage systems despite the fact that real-world failure distributions are non-exponential. In contrast, it is shown that the MTTDL is sensitive to the node rebuild time distribution. A storage system reliability simulator is built to verify the theoretical results mentioned above. The simulator is sufficiently complex to perform all required failure events and rebuild tasks in a storage system, to use real-world failure and rebuild time distributions for scheduling failures and rebuilds, to take into account partial rebuilds when additional node failures occur, and to simulate different data placement schemes and compare their reliability. The simulation results are found to match the theoretical predictions with high confidence for a wide range of system parameters, thereby validating the methodology of reliability analysis developed.

Christina Fragouli, Robert Haas, Xiao-Yu Hu, Vinodh Venkatesan

Replication is a widely used method to protect large- scale data storage systems from data loss when storage nodes fail. It is well known that the placement of replicas of the different data blocks across the nodes affects the time to rebuild. Several systems described in the literature are designed based on the premise that minimizing the rebuild times maximizes the system reliability. Our results however indicate that the reliability is essentially unaffected by the replica placement scheme. We show that, for a replication factor of two, all possible placement schemes have mean times to data loss (MTTDLs) within a factor of two for practical values of the failure rate, storage capacity, and rebuild bandwidth of a storage node. The theoretical results are confirmed by means of event-driven simulation. For higher replication factors, an analytical derivation of MTTDL becomes intractable for a general placement scheme. We therefore use one of the alternate measures of reliability that have been proposed in the literature, namely, the probability of data loss during rebuild in the critical mode of the system. Whereas for a replication factor of two this measure can be directly translated into MTTDL, it is only speculative of the MTTDL behavior for higher replication factors. This measure of reliability is shown to lie within a factor of two for all possible placement schemes and any replication factor. We also show that for any replication factor, the clustered placement scheme has the lowest probability of data loss during rebuild in critical mode among all possible placement schemes, whereas the declustered placement scheme has the highest probability. Simulation results reveal however that these properties do not hold for the corresponding MTTDLs for a replication factor greater than two. This indicates that some alternate measures of reliability may not be appropriate for comparing the MTTDL of different placement schemes.

2010