Toward Self-Healing Networks-on-Chip: RL-Driven Routing in 2D Torus Architectures

We investigate adaptive minimal routing in 2D torus networks-on-chip (NoCs) under node-fault conditions, comparing a reinforcement learning (RL) based strategy to an adaptive routing baseline. A torus topology is used for its low diameter and high connectivity. The RL approach models each router as an agent that learns to forward packets based on network state, while the adaptive scheme uses fixed minimal paths with simple rerouting around faults. We implement both methods in simulation, injecting up to 50 node faults uniformly at random. Three key metrics are measured: (1) throughput vs. offered load at fault density 0.2, (2) packet delivery ratio (PDR) vs. fault density, and (3) a fault-adaptive score (FT) vs. fault density. Experimental results show that the RL method achieves significantly higher throughput at high load (approximately 20-30% gain) and maintains higher reliability as faults increase. The RL router delivers more packets per cycle and adapts to faults by exploiting path diversity, whereas the adaptive scheme degrades sharply as faults accumulate. In particular, the RL approach preserves end-to-end connectivity longer: its PDR remains above 90% until approximately 30-40 faults, while the adaptive scheme's PDR drops to approximately 70% at the same point. The fault-adaptive score likewise favors RL routing. Thus, RL-based adaptive routing demonstrates clear advantages in throughput and fault resilience for torus NoCs.
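
To make the notion of minimal routing on a torus with faulty nodes concrete, the following is a minimal sketch, not the implementation evaluated in this work. It assumes a square k x k torus, nodes addressed as (x, y) tuples, and a set of faulty node coordinates; the names `ring_offset`, `torus_distance`, and `minimal_next_hops` are hypothetical and chosen only for illustration. It shows how wraparound links shorten paths and how a router might filter its minimal next hops against known faults.

```python
from typing import List, Set, Tuple

Node = Tuple[int, int]  # (x, y) coordinates; an assumed addressing scheme


def ring_offset(src: int, dst: int, k: int) -> int:
    """Signed shortest offset from src to dst on a ring of k nodes.

    Positive means travel in the +1 direction. When both directions are
    equally short (k even, distance k/2), this sketch picks the positive one.
    """
    d = (dst - src) % k
    return d if d <= k // 2 else d - k


def torus_distance(a: Node, b: Node, k: int) -> int:
    """Minimal hop count between a and b on a k x k torus."""
    return abs(ring_offset(a[0], b[0], k)) + abs(ring_offset(a[1], b[1], k))


def minimal_next_hops(cur: Node, dst: Node, k: int, faulty: Set[Node]) -> List[Node]:
    """Neighbors of cur that lie on a minimal path to dst and are not faulty."""
    hops: List[Node] = []
    dx = ring_offset(cur[0], dst[0], k)
    dy = ring_offset(cur[1], dst[1], k)
    if dx != 0:
        step = 1 if dx > 0 else -1
        hops.append(((cur[0] + step) % k, cur[1]))
    if dy != 0:
        step = 1 if dy > 0 else -1
        hops.append((cur[0], (cur[1] + step) % k))
    # Drop candidates that are known to be faulty.
    return [h for h in hops if h not in faulty]


if __name__ == "__main__":
    k = 8
    print(torus_distance((0, 0), (7, 7), k))
    print(minimal_next_hops((0, 0), (7, 7), k, faulty={(7, 0)}))
```

When the returned list is empty for a packet that has not yet arrived, every minimal neighbor is faulty; a router would then have to fall back to some non-minimal detour or drop the packet, and that decision is the part a learned policy or a fixed rerouting rule would have to supply.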