Reproducibility in simulation-based computer architecture research requires coordinating artifacts such as disk images, kernels, and benchmarks, but existing workflows are inconsistent. We improve gem5, an open-source simulator with over 1600 forks, and gem5 Resources, a centralized repository of over 2000 pre-packaged artifacts, to address these issues. Although gem5 Resources enables artifact sharing, researchers still face three challenges. First, creating custom disk images is complex and time-consuming, and no standardized process exists across ISAs, which makes images difficult to extend and share. Second, gem5 offers only limited guest-host communication through a set of predefined exit events, which restricts researchers' ability to dynamically control and monitor simulations. Third, running simulations with multiple workloads requires researchers to write custom external scripts that coordinate multiple gem5 simulations, which creates error-prone and hard-to-reproduce workflows. To overcome these challenges, we introduce several features in gem5 and gem5 Resources. We standardize disk-image creation across x86, ARM, and RISC-V using Packer, and provide validated base images with pre-annotated benchmark suites (NPB, GAPBS). We provide 12 new disk images, 6 new kernels, and over 200 workloads across the three ISAs. We refactor the exit event system into a class-based model and introduce hypercalls for enhanced guest-host communication, allowing researchers to define custom behavior for their exit events. We also provide a utility for remotely monitoring simulations and the gem5-bridge driver for user-space m5 operations. Additionally, we implement Suites and MultiSim to enable parallel full-system simulations from gem5 configuration scripts, eliminating the need for external scripting. These features reduce setup complexity and provide extensible, validated resources that improve reproducibility and standardization.
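As a rough illustration of what a class-based exit event model with hypercall identifiers could look like, the sketch below dispatches guest-raised events to user-defined handler classes. It is not gem5's API: the names `ExitHandler`, `CheckpointHandler`, `WorkEndHandler`, `ExitDispatcher`, and `run_demo`, the numeric hypercall IDs, and the payload fields are all hypothetical, chosen only to show how custom per-event behavior might be expressed.

```python
class ExitHandler:
    """Hypothetical base class: one subclass per kind of exit event."""

    def handle(self, payload):
        """Return True to resume the simulation, False to stop it."""
        raise NotImplementedError


class CheckpointHandler(ExitHandler):
    """Records the tick of every checkpoint request, then resumes."""

    def __init__(self):
        self.taken = []

    def handle(self, payload):
        self.taken.append(payload.get("tick", 0))
        return True


class WorkEndHandler(ExitHandler):
    """Stops the simulation when the guest reports the end of its work."""

    def handle(self, payload):
        return False


class ExitDispatcher:
    """Maps a hypercall ID (an assumed integer) to a handler object."""

    def __init__(self):
        self._handlers = {}

    def register(self, hypercall_id, handler):
        self._handlers[hypercall_id] = handler

    def dispatch(self, hypercall_id, payload):
        handler = self._handlers.get(hypercall_id)
        if handler is None:
            # Unknown events stop the run rather than being silently ignored.
            return False
        return handler.handle(payload)


def run_demo():
    """Feeds a fixed list of (hypercall_id, payload) events to the dispatcher."""
    dispatcher = ExitDispatcher()
    checkpoints = CheckpointHandler()
    dispatcher.register(1, checkpoints)       # ID 1: checkpoint (assumed)
    dispatcher.register(2, WorkEndHandler())  # ID 2: work end (assumed)

    events = [(1, {"tick": 100}), (1, {"tick": 250}), (2, {}), (1, {"tick": 900})]
    processed = 0
    for hypercall_id, payload in events:
        processed += 1
        if not dispatcher.dispatch(hypercall_id, payload):
            break  # a handler asked to stop the simulation
    return processed, checkpoints.taken
```

In this sketch each handler decides whether the simulation continues, so new behavior is added by writing a subclass and registering it rather than by editing a fixed list of predefined events.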