In complex systems with many compute nodes containing multiple CPUs that are coherent within each node, a key challenge is maintaining efficient and correct coherence between nodes. The Unimem system addresses this by proposing a virtualized global address space that enables such coherence, relying on the I/O Memory Management Unit (IOMMU) in each node. The goal of this thesis is to support this approach by successfully testing and using the IOMMU of a single node. For this purpose, we used ARM's IOMMU, known as the System Memory Management Unit (SMMU), which translates virtual addresses to physical addresses. Because Linux documentation for the SMMU is limited and unclear, we implemented custom kernel modules to test and use its functionality. First, we tested the SMMU in the Processing System (PS) of the Xilinx Zynq UltraScale+ MPSoC by developing a module that inserted virtual-to-physical address mappings into the SMMU. We then triggered a DMA transfer to a virtual address and observed that the request passed through the SMMU for address translation. We repeated this experiment by initiating DMA transactions from the Programmable Logic (PL) and similarly confirmed that the transactions were translated by the SMMU. Finally, we developed a module that enables transactions from the PL without requiring explicit pre-mapping of virtual and physical address pairs. This was achieved by configuring the SMMU with the page table pointer of a user process, allowing it to translate all relevant virtual addresses dynamically. Overall, we successfully demonstrated the correct operation of the SMMU across all tested scenarios. Due to time constraints, further exploration of advanced SMMU features is left for future work.
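The two translation modes described above (explicitly inserted mappings, and translation through a user process's page table) can be illustrated with a toy model. This is only a sketch under stated assumptions: it is not kernel code and does not reflect the modules developed in this work. The names `ToySmmu`, `TranslationFault`, and `PAGE_SIZE` are hypothetical, a flat dictionary stands in for the real multi-level page tables, and a 4 KiB page size is assumed.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages; a real SMMU supports several granules


class TranslationFault(Exception):
    """Raised when a device access hits an unmapped virtual address."""


class ToySmmu:
    """Toy stand-in for an IOMMU: virtual page number -> physical page number."""

    def __init__(self, page_table=None):
        # Passing an existing table models pointing the SMMU at a process's
        # page table; otherwise the SMMU starts with no mappings of its own.
        self.page_table = {} if page_table is None else page_table

    def map(self, va, pa, size):
        """Insert an explicit virtual-to-physical mapping, page by page."""
        if va % PAGE_SIZE or pa % PAGE_SIZE or size % PAGE_SIZE:
            raise ValueError("addresses and size must be page-aligned")
        for off in range(0, size, PAGE_SIZE):
            self.page_table[(va + off) // PAGE_SIZE] = (pa + off) // PAGE_SIZE

    def translate(self, va):
        """Translate one device-issued virtual address, or fault."""
        vpn, offset = divmod(va, PAGE_SIZE)
        if vpn not in self.page_table:
            raise TranslationFault(hex(va))
        return self.page_table[vpn] * PAGE_SIZE + offset


# Mode 1: explicit pre-mapping before the DMA transfer is triggered.
smmu = ToySmmu()
smmu.map(0x10000, 0x80000000, 2 * PAGE_SIZE)
dma_target = smmu.translate(0x10004)

# Mode 2: the SMMU shares the process's table, so pages the process
# maps later become visible to the device without an explicit map() call.
process_table = {0x400: 0x7F000}
shared = ToySmmu(process_table)
process_table[0x401] = 0x7F123
late_target = shared.translate(0x401000)
```

In the second mode the device sees whatever the process has mapped at the moment of the access, which is why no per-buffer mapping step is needed before a transaction is issued.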