Fault Tolerance
No mentions found
This entity hasn't been tracked yet, or Iris is still building its knowledge base.
Related Articles from SNS
FTHP-MPI: Towards Providing Replication-based Fault Tolerance in a Fault-Intolerant Native MPI Library
Announce Type: replace Abstract: Faults in high-performance systems are expected to be very frequent in the current exascale computing era. To compensate for a higher failure rate, the standard checkpoint/restart technique would need to create checkpoints at a much higher frequency, resulting in an excessive amount of overhead, which would not be sustainable for many scientific applications. To improve application efficiency in such high-failure environments, the mechanism of replication of...
Hardware-aware Low-latency Quantum Compilation with Data-driven Lightweight Error Detection for Early Fault-Tolerant Systems
arXiv:2606.07666v1 Announce Type: cross Abstract: Noisy intermediate-scale quantum (NISQ) processors are entering an early fault-tolerance regime where full quantum error correction carries prohibitive resource costs, yet lightweight error detection can meaningfully improve algorithmic success rates. Existing compilation and error-detection toolchains treat these concerns in isolation, with no principled way to balance detection overhead against success probability under latency constraints....
Fault tolerance estimation in digital circuits with visualised generative networks
Announce Type: replace Abstract: We propose a new numerical method to estimate the fault tolerance of failure modes in digital circuit structures with a generative network sampling technique. From a random input of generated bitwise configurations of ideally digitalised analog currents in the digital circuit design with classical logical gates, expected output currents are compared to the realistic signals of a numerical experiment at the discriminator part of the Generative Adversarial...
Preserving Full 6-DOF Actuation Under Abrupt Total Rotor Failures: Passive Fault-Tolerant Flight Control Using a Biaxial-Tilt Hexacopter
arXiv:2606.05663v1 Announce Type: new Abstract: Conventional multirotors suffer from a rapid collapse of attainable wrench space (AWS) under abrupt total rotor failures, rendering full 6-DOF recovery physically impossible. This paper addresses passive fault-tolerant flight of a biaxial-tilt overactuated hexacopter (BTO) under abrupt total rotor failures that are a priori unknown to the controller. The control design and analysis focus on representative abrupt rotor-failure cases for which...
PartRePer-MPI: Combining Fault Tolerance and Performance for MPI Applications
arXiv:2310.16370v2 Announce Type: replace Abstract: As we have entered Exascale computing, the faults in high-performance systems are expected to increase considerably. To compensate for a higher failure rate, the standard checkpoint/restart technique would need to create checkpoints at a much higher frequency resulting in an excessive amount of overhead which would not be sustainable for many scientific applications. Replication allows for fast recovery from failures by simply dropping the...
EES-CND: Collaborative Neural Decision-Making for Drift-Aware Fault-Tolerant Edge-Cloud Service Placement
arXiv:2606.02259v1 Announce Type: new Abstract: The edge-cloud paradigm improves service delivery by orchestrating resources across edge nodes and cloud data centres. These environments consist of heterogeneous, interconnected computing nodes that cooperate to deliver continuous services. However, their scale and complexity increase vulnerability to failures from hardware malfunctions, software defects, and dynamic operating conditions.
Improved quantum processor logical error rates via correction and detection
Abstract Performing quantum algorithms for critical problems in physics and chemistry requires substantially lower error rates than the physical error rates of present quantum computers. Achieving such low logical error rates requires quantum error correction1,2 and physical error rates below a critical threshold value3,4,5,6,7,8. We experimentally demonstrate on a trapped-ion quantum charge-coupled device (QCCD)9,10 improvements in logical error rates ranging from 11× to 800× compared with...
Proactive-reactive detection and mitigation of intermittent faults in robot swarms
arXiv:2509.19246v2 Announce Type: replace Abstract: Intermittent faults are transient errors that sporadically appear and disappear. Although intermittent faults pose substantial challenges to reliability and coordination, existing studies of fault tolerance in robot swarms focus instead on permanent faults. One reason for this is that intermittent faults are prohibitively difficult to detect in the fully self-organized ad-hoc networks typical of robot swarms, as their network topologies are...
A Dual Metastable-State Encoding Architecture for Quantum Processing with $^{171}\mathrm{Yb}$ Atom Arrays
arXiv:2606.08453v1 Announce Type: cross Abstract: Neutral-atom arrays combine scalable qubit registers, long coherence times, flexible optical control, and strong Rydberg-mediated entangling interactions, making them a promising platform for quantum information processing. However, physical error rates remain a challenge, and fault-tolerant quantum error correction (QEC) requires repeated mid-circuit measurement and reset of ancilla qubits without disturbing nearby data qubits. This...
Discovering autonomous quantum error correction via deep reinforcement learning
Announce Type: replace-cross Abstract: Quantum error correction is essential for fault-tolerant quantum computing. However, standard methods relying on active measurements may introduce additional errors. Autonomous quantum error correction (AQEC) circumvents this by utilizing engineered dissipation and drives in bosonic systems, but identifying practical encoding remains challenging due to stringent Knill-Laflamme conditions.