SARA: A Stall-Aware Memory Allocation Strategy for Mixed-Criticality Systems

Memory capacity in edge devices is often limited by constraints on cost, size, and power. In memory-constrained mixed-criticality edge devices, competition for memory therefore makes page swapping inevitable, and the resulting slow storage I/O degrades performance. In such scenarios, inefficient memory allocation upsets the performance balance among applications: soft real-time (soft RT) tasks miss deadlines, or non-real-time (non-RT) applications cannot reach their best throughput. We also observe unpredictable, long system-level stalls (which we call long stalls) under high memory and I/O pressure, which degrade performance further. In this work, we propose the Stall-Aware Real-Time Memory Allocator (SARA), which balances performance by allocating just enough memory to soft RT tasks to meet their deadlines while optimizing the use of the remaining memory for non-RT applications. A soft RT task consists of multiple periods, and one job runs per period. To minimize the memory usage of soft RT tasks while still meeting real-time requirements, SARA leverages our insight into how latency caused by insufficient memory, measured by our proposed PSI-based metric, affects the execution time of each soft RT job. Moreover, SARA detects long stalls according to our definition and proactively drops the affected jobs, minimizing stalls in task execution. Experiments show that SARA achieves an average deadline hit ratio of 97.13% for soft RT tasks and improves non-RT application throughput by up to 22.32x over existing approaches, even with memory capacity limited to 60% of peak demand.
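The abstract does not spell out SARA's metric, controller, or long-stall definition, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes that "PSI" refers to Linux Pressure Stall Information and parses text in the format of `/proc/pressure/memory`. The names `parse_psi_total`, `JobSample`, `next_memory_limit`, and `should_drop`, as well as the step size, slack factor, memory floor, and the 50% stall threshold, are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass


def parse_psi_total(text: str, kind: str = "some") -> int:
    """Return the cumulative stall time in microseconds from PSI-formatted text.

    The expected format matches Linux's /proc/pressure/memory, for example:
        some avg10=0.00 avg60=0.00 avg300=0.00 total=0
    """
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == kind:
            for field in fields[1:]:
                key, _, value = field.partition("=")
                if key == "total":
                    return int(value)
    raise ValueError(f"no '{kind}' line with a total= field")


@dataclass
class JobSample:
    """Measurements for one job (one period) of a soft RT task."""
    exec_time_ms: float  # observed execution time of the job
    stall_ms: float      # memory stall time attributed to the job (hypothetical)


def next_memory_limit(limit_mb: int, sample: JobSample, deadline_ms: float,
                      step_mb: int = 16, slack: float = 0.8,
                      floor_mb: int = 64) -> int:
    """Hypothetical feedback rule: give the task just enough memory.

    Grow the limit when a deadline miss coincides with memory stalls,
    shrink it when the job finishes comfortably early, otherwise hold.
    """
    if sample.exec_time_ms > deadline_ms:
        # More memory can only help if memory stalls contributed to the miss.
        return limit_mb + step_mb if sample.stall_ms > 0 else limit_mb
    if sample.exec_time_ms < slack * deadline_ms:
        # Release memory for non-RT applications, but never below the floor.
        return max(floor_mb, limit_mb - step_mb)
    return limit_mb


def should_drop(stall_ms: float, period_ms: float, threshold: float = 0.5) -> bool:
    """Placeholder long-stall test: drop the job if stalls consume at least
    `threshold` of its period. The paper's actual definition is not given here."""
    return stall_ms >= threshold * period_ms
```

In a sketch like this, the per-job stall time would be obtained by differencing two `parse_psi_total` readings taken at the start and end of a job, and the returned limit would be applied through whatever memory-limiting mechanism the platform provides. Both of those steps are left out because the abstract does not describe them.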
