In 1987, Jim Gray and Gianfranco Putzolu introduced the five-minute rule, a simple heuristic, grounded in storage and memory economics, for deciding when data should live in DRAM rather than on storage. Subsequent revisits to the rule largely retained that economics-only view, leaving host costs, feasibility limits, and workload behavior out of scope. This paper revisits the rule from first principles, integrating host costs, DRAM bandwidth and capacity, and physics-grounded models of SSD performance and cost, and then embedding these elements in a constraint- and workload-aware framework that yields actionable provisioning guidance. We show that, for modern AI platforms, especially GPU-centric hosts paired with ultra-high-IOPS SSDs engineered for fine-grained random access, the DRAM-to-flash caching threshold collapses from minutes to a few seconds. This shift reframes NAND flash memory as an active data tier and exposes a broad research space across the hardware-software stack. We further introduce MQSim-Next, a calibrated SSD simulator that supports validation and sensitivity analysis and facilitates future architectural and system research. Finally, we present two concrete case studies that showcase the software system design space opened by such a memory hierarchy paradigm shift. Overall, we turn a classical heuristic into an actionable, feasibility-aware analysis and provisioning framework and set the stage for further research on the AI-era memory hierarchy.
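To make the heuristic concrete, the classic Gray-Putzolu break-even interval can be sketched as follows. The interval is the access period below which caching a page in DRAM is cheaper than re-reading it from the storage device. The numeric inputs below (page size, device price, IOPS, DRAM price) are purely illustrative assumptions, not figures from this paper; they merely show how ultra-high-IOPS SSDs push the threshold from minutes toward seconds.

```python
def break_even_interval_s(pages_per_mb_dram: float,
                          iops_per_device: float,
                          price_per_device: float,
                          price_per_mb_dram: float) -> float:
    """Classic five-minute-rule break-even interval in seconds:
    (pages per MB of DRAM / device IOPS) * (device price / DRAM price per MB).
    A page accessed more often than once per interval is cheaper to cache in DRAM.
    """
    return (pages_per_mb_dram / iops_per_device) * (price_per_device / price_per_mb_dram)


if __name__ == "__main__":
    # Illustrative (assumed) modern values: 4 KiB pages -> 256 pages/MB,
    # a ~$400 NVMe SSD sustaining ~1M random-read IOPS, DRAM at ~$0.003/MB.
    interval = break_even_interval_s(256, 1_000_000, 400.0, 0.003)
    print(f"break-even interval: {interval:.1f} s")  # tens of seconds, not minutes
```

Under these assumed numbers the interval lands in the tens of seconds; with multi-million-IOPS devices it drops further, which is the regime the abstract describes.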