Self-Healing Memory Systems in AI Fabrics: Machine Learning-Driven Predictive Detection and Autonomous Mitigation of Memory Leaks in High Performance Network Switches
Keywords:
Self-Healing Networks, Memory Leak Prediction, LSTM Forecasting, Reinforcement Learning, AI Fabric Reliability
Abstract
The explosive growth of AI workloads is driving the shift towards ultra-high-speed networks, 800G today and 1.6T in future, where control-plane processes in network switches increasingly suffer from hidden memory leaks and non
References
Ellie Lipe, "Energy Efficient Scheduling of AI/ML Workloads on Multi-Instance GPUs with Dynamic Repartitioning," in 2025 IEEE 25th International Symposium on Cluster, Cloud and Internet Computing (CCGrid), 30 June 2025. https://ieeexplore.ieee.org/document/11044810


