A single point of failure (SPOF) is a part of a system that, if it fails, will stop the entire system from working. SPOFs are undesirable in any system with a goal of high availability or reliability, be it a business practice, software application, or other industrial system.
Systems can be made robust by adding redundancy at every potential SPOF. Redundancy can be achieved at various levels.
Assessing a potential SPOF means identifying the critical components of a complex system whose malfunction would cause a total system failure. Highly reliable systems should not rely on any such individual component.
For instance, the owner of a small tree care company may own only one wood chipper. If the chipper breaks, he may be unable to complete his current job and may have to cancel future jobs until he can obtain a replacement. At the lowest level, the owner may keep spare parts ready to repair the wood chipper if it fails. At a higher level, he may have a second wood chipper that he can bring to the job site. Finally, at the highest level, he may have enough equipment available to completely replace everything at the work site in the case of multiple failures.
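One way to make the assessment described above concrete is to model a system as a graph of components and connections and flag its articulation points, i.e., nodes whose removal disconnects the graph. The sketch below does this with a standard depth-first-search algorithm; the topology and component names are purely illustrative.

```python
from collections import defaultdict

def articulation_points(graph):
    """Return nodes whose removal disconnects the graph (Tarjan's algorithm)."""
    visited, disc, low, parent = set(), {}, {}, {}
    points, timer = set(), [0]

    def dfs(u):
        visited.add(u)
        disc[u] = low[u] = timer[0]
        timer[0] += 1
        children = 0
        for v in graph[u]:
            if v not in visited:
                parent[v] = u
                children += 1
                dfs(v)
                low[u] = min(low[u], low[v])
                # A non-root u is an articulation point if some subtree below it
                # cannot reach an ancestor of u without passing through u.
                if parent.get(u) is not None and low[v] >= disc[u]:
                    points.add(u)
            elif v != parent.get(u):
                low[u] = min(low[u], disc[v])
        # The DFS root is an articulation point if it has more than one child.
        if parent.get(u) is None and children > 1:
            points.add(u)

    for node in graph:
        if node not in visited:
            parent[node] = None
            dfs(node)
    return points

# Hypothetical topology: one router connects all servers to the gateway.
topology = defaultdict(set)
for a, b in [("gateway", "router"), ("router", "server1"),
             ("router", "server2"), ("server1", "server2")]:
    topology[a].add(b)
    topology[b].add(a)

print(articulation_points(topology))  # {'router'} -- the single point of failure
```

In this toy topology only the router is flagged: everything behind it depends on it, while the two servers back each other up.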
[Figure: Possible SPOFs in a simple setup]
[Figure: Using redundancy to avoid some SPOFs]
[Figure: A completely redundant system without SPOFs (note: assumes the generator and grid sources are each rated at N, each UPS is rated at N, and the "A/C" and "Electrical" subsystems are themselves completely fault tolerant)]
A fault-tolerant computer system can be achieved at the internal component level, at the system level (multiple machines), or at the site level (replication).
At the system level, high availability for a server cluster is normally achieved by deploying a load balancer. Within such a cluster, each individual server may attain internal component redundancy by having multiple power supplies, hard drives, and other components.
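As a minimal sketch of this idea (hypothetical backend addresses, no particular load-balancer product implied), the snippet below rotates requests across redundant backends and skips any backend that fails a health check, so the loss of one server does not take the service down:

```python
import itertools
import urllib.request

# Hypothetical redundant backends behind one logical service address.
BACKENDS = ["http://10.0.0.11:8080", "http://10.0.0.12:8080", "http://10.0.0.13:8080"]
_rotation = itertools.cycle(BACKENDS)

def healthy(backend, timeout=0.5):
    """Crude health check: the backend must answer on a /health endpoint."""
    try:
        with urllib.request.urlopen(backend + "/health", timeout=timeout):
            return True
    except OSError:
        return False

def forward(path):
    """Round-robin over the backends, skipping any that fail the health check."""
    for _ in range(len(BACKENDS)):
        backend = next(_rotation)
        if healthy(backend):
            with urllib.request.urlopen(backend + path, timeout=2) as resp:
                return resp.read()
    raise RuntimeError("all backends are down: the cluster itself has failed")
```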
Computing is nowadays distributed over several machines, in a local IP-like network, a cloud, or a P2P network. Failures are common, and computations need to proceed despite partial failures of machines.
Fault tolerance is the property that enables a system to continue operating properly in the event of the failure of, or one or more faults within, some of its components. If its operating quality decreases at all, the decrease is proportional to the severity of the failure, in contrast to a naively designed system, in which even a small failure can cause total breakdown. Fault tolerance is particularly sought after in high-availability, mission-critical, or even life-critical systems.
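A small illustration of such graceful degradation, using toy replica objects rather than any real storage API: a read survives individual replica failures (at the cost of an extra hop) and fails outright only when every replica is down.

```python
import random

class FlakyReplica:
    """Toy stand-in for a storage replica that is sometimes unreachable."""
    def __init__(self, data, failure_rate):
        self.data, self.failure_rate = data, failure_rate

    def get(self, key):
        if random.random() < self.failure_rate:
            raise ConnectionError("replica unreachable")
        return self.data[key]

def read_with_degradation(replicas, key):
    """Try each replica in turn: one failure only costs an extra attempt,
    and the read fails only when every replica is down."""
    last_error = None
    for replica in replicas:
        try:
            return replica.get(key)
        except ConnectionError as err:
            last_error = err        # tolerate this fault and move on
    raise RuntimeError("all replicas failed") from last_error

replicas = [FlakyReplica({"x": 42}, failure_rate=0.3) for _ in range(3)]
print(read_with_degradation(replicas, "x"))   # 42, unless all three replicas fail
```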
In computing, load balancing is the process of distributing a set of tasks over a set of resources (computing units), with the aim of making their overall processing more efficient. Load balancing can optimize the response time and avoid unevenly overloading some compute nodes while other compute nodes are left idle. Load balancing is the subject of research in the field of parallel computers.
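A common way to avoid overloading some nodes while others sit idle is a greedy "least-loaded" assignment; the sketch below (task costs and node count are made up) always hands the next task to the currently lightest node.

```python
import heapq

def assign_tasks(task_costs, n_nodes):
    """Greedy least-loaded assignment: each task goes to the node with the
    smallest current load (a common heuristic, not an optimal schedule)."""
    heap = [(0.0, node) for node in range(n_nodes)]   # (current load, node id)
    heapq.heapify(heap)
    assignment = {node: [] for node in range(n_nodes)}
    for task, cost in enumerate(task_costs):
        load, node = heapq.heappop(heap)
        assignment[node].append(task)
        heapq.heappush(heap, (load + cost, node))
    return assignment

# Example: 8 tasks of uneven cost spread over 3 nodes.
print(assign_tasks([5, 3, 8, 2, 7, 4, 1, 6], n_nodes=3))
```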
A distributed system is a system whose components are located on different networked computers, which communicate and coordinate their actions by passing messages to one another. Distributed computing is a field of computer science that studies distributed systems. The components of a distributed system interact with one another in order to achieve a common goal. Three significant challenges of distributed systems are: maintaining concurrency of components, overcoming the lack of a global clock, and managing the independent failure of components.
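To illustrate how a system can make progress despite the independent failure of components, the sketch below simulates message passing to several nodes and accepts a result once a quorum of replies arrives; slow or unreachable nodes are simply ignored. The node behaviour is simulated, not a real RPC library.

```python
import concurrent.futures
import random
import time

def ask_node(node_id):
    """Hypothetical remote call: some nodes are slow or down, independently."""
    time.sleep(random.uniform(0.0, 0.3))
    if random.random() < 0.2:
        raise ConnectionError(f"node {node_id} unreachable")
    return f"value-from-node-{node_id}"

def quorum_read(node_ids, quorum, timeout=1.0):
    """Message every node and return as soon as `quorum` replies arrive."""
    replies = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=len(node_ids)) as pool:
        futures = [pool.submit(ask_node, n) for n in node_ids]
        try:
            for fut in concurrent.futures.as_completed(futures, timeout=timeout):
                try:
                    replies.append(fut.result())
                except ConnectionError:
                    continue  # one component failed independently; keep going
                if len(replies) >= quorum:
                    return replies
        except concurrent.futures.TimeoutError:
            pass  # remaining nodes too slow; fall through to the quorum check
    raise RuntimeError(f"quorum not reached: only {len(replies)} replies")

print(quorum_read(node_ids=range(5), quorum=3))
```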
Explores dependable architectures, error detection, fault-tolerant structures, and software reliability through examples like the Patriot Missile failure and ABB dual controller.
Explores dependability in industrial automation, covering reliability, safety, fault characteristics, and examples of failure sources in various industries.
We propose a novel, minimally intrusive approach to adding fault tolerance to existing complex scientific simulation codes, used for addressing a broad range of time-dependent problems on the next generation of supercomputers. Exascale systems have the potential ...
Distributed skyline computation is important for a wide range of domains, from distributed and web-based systems to ISP-network monitoring and distributed databases. The problem is particularly challenging in dynamic distributed settings, where the goal is ...
Current online applications, such as search engines, social networks, or file sharing services, execute across a distributed network of machines. They provide non-stop services to their users despite failures in the underlying network. To achieve such a high ...