Fault Tolerance

Related Articles from SNS

FTHP-MPI: Towards Providing Replication-based Fault Tolerance in a Fault-Intolerant Native MPI Library

Announce Type: replace Abstract: Faults in high-performance systems are expected to be very frequent in the current exascale computing era. To compensate for a higher failure rate, the standard checkpoint/restart technique would need to create checkpoints at a much higher frequency, resulting in an excessive amount of overhead, which would not be sustainable for many scientific applications. To improve application efficiency in such high-failure environments, the mechanism of replication of...

arXiv CS 8d ago
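
The checkpoint/restart trade-off described in the abstract above can be illustrated with a rough back-of-the-envelope sketch. It uses Young's classical first-order approximation for the checkpoint interval, which is a textbook model and not something taken from the FTHP-MPI paper. The function names, the 60-second checkpoint cost, and the MTBF values are hypothetical, chosen only for illustration, and the approximation is only reasonable when the checkpoint cost is much smaller than the MTBF.

```python
import math


def young_interval(checkpoint_cost_s: float, mtbf_s: float) -> float:
    """Young's first-order estimate of the optimal time between checkpoints."""
    return math.sqrt(2.0 * checkpoint_cost_s * mtbf_s)


def overhead_fraction(checkpoint_cost_s: float, mtbf_s: float) -> float:
    """Approximate fraction of runtime lost at the interval above.

    First-order model: time spent writing checkpoints (C / tau) plus the
    expected rework after a failure (tau / (2 * MTBF)).
    """
    tau = young_interval(checkpoint_cost_s, mtbf_s)
    return checkpoint_cost_s / tau + tau / (2.0 * mtbf_s)


if __name__ == "__main__":
    checkpoint_cost_s = 60.0  # hypothetical cost of writing one checkpoint
    for mtbf_hours in (24.0, 6.0, 1.0):  # hypothetical system MTBF values
        mtbf_s = mtbf_hours * 3600.0
        tau = young_interval(checkpoint_cost_s, mtbf_s)
        waste = overhead_fraction(checkpoint_cost_s, mtbf_s)
        print(f"MTBF {mtbf_hours:>4.0f} h: checkpoint every {tau:7.0f} s, "
              f"overhead ~{waste:.1%}")
```

Under this simplified model the lost fraction grows as the square root of the failure rate, which is consistent with the abstract's point that checkpointing alone becomes costly as failures become more frequent.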

Hardware-aware Low-latency Quantum Compilation with Data-driven Lightweight Error Detection for Early Fault-Tolerant Systems

arXiv:2606.07666v1 Announce Type: cross Abstract: Noisy intermediate-scale quantum (NISQ) processors are entering an early fault-tolerance regime where full quantum error correction carries prohibitive resource costs, yet lightweight error detection can meaningfully improve algorithmic success rates. Existing compilation and error-detection toolchains treat these concerns in isolation, with no principled way to balance detection overhead against success probability under latency constraints....

arXiv CS 1d ago

Fault tolerance estimation in digital circuits with visualised generative networks

Announce Type: replace Abstract: We propose a new numerical method to estimate the fault tolerance of failure modes in digital circuit structures with a generative network sampling technique. From a random input of generated bitwise configurations of ideally digitalised analog currents in the digital circuit design with classical logical gates, expected output currents are compared to the realistic signals of a numerical experiment at the discriminator part of the Generative Adversarial...

arXiv CS 5d ago

Preserving Full 6-DOF Actuation Under Abrupt Total Rotor Failures: Passive Fault-Tolerant Flight Control Using a Biaxial-Tilt Hexacopter

arXiv:2606.05663v1 Announce Type: new Abstract: Conventional multirotors suffer from a rapid collapse of attainable wrench space (AWS) under abrupt total rotor failures, rendering full 6-DOF recovery physically impossible. This paper addresses passive fault-tolerant flight of a biaxial-tilt overactuated hexacopter (BTO) under abrupt total rotor failures that are a priori unknown to the controller. The control design and analysis focus on representative abrupt rotor-failure cases for which...

arXiv CS 5d ago

PartRePer-MPI: Combining Fault Tolerance and Performance for MPI Applications

arXiv:2310.16370v2 Announce Type: replace Abstract: As we have entered Exascale computing, the faults in high-performance systems are expected to increase considerably. To compensate for a higher failure rate, the standard checkpoint/restart technique would need to create checkpoints at a much higher frequency resulting in an excessive amount of overhead which would not be sustainable for many scientific applications. Replication allows for fast recovery from failures by simply dropping the...

arXiv CS 7d ago

EES-CND: Collaborative Neural Decision-Making for Drift-Aware Fault-Tolerant Edge-Cloud Service Placement

arXiv:2606.02259v1 Announce Type: new Abstract: The edge-cloud paradigm improves service delivery by orchestrating resources across edge nodes and cloud data centres. These environments consist of heterogeneous, interconnected computing nodes that cooperate to deliver continuous services. However, their scale and complexity increase vulnerability to failures from hardware malfunctions, software defects, and dynamic operating conditions.

arXiv CS 8d ago

Improved quantum processor logical error rates via correction and detection

Abstract: Performing quantum algorithms for critical problems in physics and chemistry requires substantially lower error rates than the physical error rates of present quantum computers. Achieving such low logical error rates requires quantum error correction [1,2] and physical error rates below a critical threshold value [3,4,5,6,7,8]. We experimentally demonstrate on a trapped-ion quantum charge-coupled device (QCCD) [9,10] improvements in logical error rates ranging from 11× to 800× compared with...

Nature 17h ago

Proactive-reactive detection and mitigation of intermittent faults in robot swarms

arXiv:2509.19246v2 Announce Type: replace Abstract: Intermittent faults are transient errors that sporadically appear and disappear. Although intermittent faults pose substantial challenges to reliability and coordination, existing studies of fault tolerance in robot swarms focus instead on permanent faults. One reason for this is that intermittent faults are prohibitively difficult to detect in the fully self-organized ad-hoc networks typical of robot swarms, as their network topologies are...

arXiv CS 8d ago

A Dual Metastable-State Encoding Architecture for Quantum Processing with $^{171}\mathrm{Yb}$ Atom Arrays

arXiv:2606.08453v1 Announce Type: cross Abstract: Neutral-atom arrays combine scalable qubit registers, long coherence times, flexible optical control, and strong Rydberg-mediated entangling interactions, making them a promising platform for quantum information processing. However, physical error rates remain a challenge, and fault-tolerant quantum error correction (QEC) requires repeated mid-circuit measurement and reset of ancilla qubits without disturbing nearby data qubits. This...

arXiv Physics 1d ago

Discovering autonomous quantum error correction via deep reinforcement learning

Announce Type: replace-cross Abstract: Quantum error correction is essential for fault-tolerant quantum computing. However, standard methods relying on active measurements may introduce additional errors. Autonomous quantum error correction (AQEC) circumvents this by utilizing engineered dissipation and drives in bosonic systems, but identifying practical encoding remains challenging due to stringent Knill-Laflamme conditions.

arXiv CS 7d ago