The \emph{Partial Cache-Coherence (PCC)} model maintains hardware cache coherence only within subsets of cores, enabling large-scale memory sharing over emerging memory interconnects such as Compute Express Link (CXL). However, by relaxing global cache coherence, PCC compromises the correctness of existing single-machine software. This paper focuses on building consistent and efficient indexes on PCC platforms. We show that existing indexes designed for cache-coherent platforms can be made consistent on PCC platforms by following the SP guidelines: we identify \emph{sync-data} and \emph{protected-data} according to the index's concurrency control mechanisms and synchronize them accordingly. However, conversion under the SP guidelines introduces performance overhead. To mitigate this overhead, we identify several performance bottlenecks unique to PCC platforms and propose the P$^3$ guidelines (Out-of-\underline{P}lace update, Re\underline{P}licated shared variable, and S\underline{P}eculative Reading) to improve the efficiency of converted indexes. Using the SP and P$^3$ guidelines, we convert and optimize two indexes (CLevelHash and BwTree) for PCC platforms. Our evaluation shows that following the P$^3$ guidelines improves the converted indexes' throughput by up to 16$\times$, and that the optimized indexes outperform their message-passing-based and disaggregated-memory-based counterparts by up to 16$\times$ and 19$\times$, respectively.