Resilient Architectures for Fault-Tolerant Distributed and Embedded Systems: Theory, Methods, and Practical Pathways

Authors

  • Dr. Emilia Carter, University of Edinburgh

Keywords:

Dependability, Fault Tolerance, Adaptive Systems, Mixed Criticality

Abstract

This article synthesizes foundational theory, contemporary methodologies, and applied strategies for designing fault-tolerant, dependable, and resilient computing systems across distributed, embedded, and cloud environments. It integrates classical dependability concepts with adaptive and mixed-criticality approaches, examines fault tolerance in high-reliability industrial and avionics systems, and extends the discussion to modern cloud platforms and GPU manufacturing test infrastructures. The article outlines the problem context, the methodological approach, the principal findings, and the implications for future research and practice.

References

Avizienis, A.; Laprie, J.C.; Randell, B. Fundamental Concepts of Dependability. UCLA CSD Report no. 010028, LAAS Report no. 01-145, Newcastle University Report no. CS-TR-739, 2001.

Burns, A. System Mode Changes—General and Criticality-Based. In Proceedings of the 2nd Workshop on Mixed Criticality Systems (WMC), RTSS, Rome, Italy, 2 December 2014.

Kim, K.H.K.; Lawrence, T.F. Adaptive fault-tolerance in complex real-time distributed computer system applications. Comput. Commun. 1992, 15, 243–251.

Årzén, K.E. Preface to special issue on adaptive embedded systems. Real-Time Syst. 2013, 49, 337–338.

Laprie, J.C. From dependability to resilience. In Proceedings of the 38th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, Anchorage, AK, USA, 24–27 June 2008.

Knight, J.; Strunk, E.; Sullivan, K. Towards a rigorous definition of information system survivability. In Proceedings of the DARPA Information Survivability Conference and Exposition, Washington, DC, USA, 22–24 April 2003; pp. 78–89.

Proenza, J.; Barranco, M.; Ballesteros, A.; Álvarez, I.; Gessner, D.; Derasevic, S.; Rodríguez-Navas, G. DFT4FTT Project. Available online: http://srv.uib.es/dft4ftt/ (accessed on 1 September 2022).

Álvarez, I.; Ballesteros, A.; Barranco, M.; Gessner, D.; Djerasevic, S.; Proenza, J. Fault Tolerance in Highly Reliable Ethernet-Based Industrial Systems. Proc. IEEE 2019, 107, 977–1010.

Wensley, J.; Lamport, L.; Shostak, R.; Weinstock, C.; Goldberg, J.; Green, M.; Levitt, K.; Melliar-Smith, P. SIFT: Design and analysis of a fault-tolerant computer for aircraft control. Proc. IEEE 1978, 66, 1240–1255.

Abd Elfattah, E.; Elkawkagy, M.; El Sisi, A. A Reactive Fault Tolerance Approach for Cloud Computing. In Proceedings of the 13th International IEEE Computer Engineering Conference (ICENCO’17), 2017; pp. 190–194.

Hasan, M.; Goraya, M.S. Priority Based Cooperative Computing in Cloud Using Task Backfilling. Lect. Notes Software Eng. 2016, 4, 229–233.

Kochhar, D.; Hilda, A.K.J. An Approach for Fault Tolerance in Cloud Computing Using Machine Learning Technique. Int. J. Pure Appl. Math. 2017, 117(22), 345–351.

Gupta, S.; Gupta, B.B. XSS-Secure as a Service for the Platforms of Online Social Network-Based Multimedia Web Applications in the Cloud. Multimedia Tools Appl. 2018, 77(4), 4829–4861.

Tebaa, M.; El Hajji, S. From Single to Multi-Clouds Computing Privacy and Fault Tolerance. In Proceedings of the International Conference on Future Information Engineering, Elsevier B.V., 2014; pp. 112–118.

Abid, A.; Khemakhem, M.T.; Marzouk, S.; Ben Jemaa, M.; Monteil, T.; Drira, K. Toward Antifragile Cloud Computing Infrastructures. Procedia Computer Science 2014, 32, 850–855.

Lin, X.; Mamat, A.; Lu, Y.; Deogun, J.; Goddard, S. Real-Time Scheduling of Divisible Loads in Cluster Computing Environments. Journal of Parallel and Distributed Computing 2010, 70, 296–308.

Jhawar, R.; Piuri, V. Fault Tolerance and Resilience in Cloud Computing Environments. In Computer and Information Security Handbook; 2013; pp. 1–29.

Sun, D.; Chang, G.; Miao, C.; Wang, X. Modelling and Evaluating a High Serviceability Fault Tolerance Strategy in Cloud Computing Environments. International Journal of Security and Networks 2012, 7, 196–210.

Designing Fault-Tolerant Test Infrastructure for Large-Scale GPU Manufacturing. International Journal of Signal Processing, Embedded Systems and VLSI Design, 2025, 5(01), 35–61.

Published

2025-09-30

How to Cite

Resilient Architectures for Fault-Tolerant Distributed and Embedded Systems: Theory, Methods, and Practical Pathways. (2025). EuroLexis Research Index of International Multidisciplinary Journal for Research & Development, 12(09), 450-458. https://researchcitations.org/index.php/elriijmrd/article/view/47
