This paper presents Bancroft, a computational genomics acceleration platform that provides the illusion of practically infinite on-device memory capacity by compressing genomic data moved over PCIe. Bancroft introduces novel optimizations to reference-based genome compression that make it efficient to implement on an accelerator, including fixed-stride matching using cuckoo hashes and grouped header encoding, and exposes them through a familiar interface that supports random access. We evaluate a prototype implementation of Bancroft on an affordable Alveo U50 FPGA equipped with 8 GB of HBM. Thanks to orders-of-magnitude improvements in the performance and resource efficiency of genomic compression, our prototype provides access to terabytes of host-side genomic data at memory-class performance, measuring over 30% of the on-device HBM bandwidth, an order of magnitude higher than conventional PCIe-limited architectures. On a real-world pre-alignment filtering application, Bancroft achieves over 6x improvement over the conventional PCIe-attached architecture, reaching 30% of the peak internal throughput of an accelerator with HBM and 90% of that of one with DDR4. Bancroft delivers memory-class performance over practically infinite data capacity using a small, fixed amount of HBM, making it an attractive solution for the continued scalability of computational genomics.
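To make the idea of fixed-stride matching against a cuckoo-hashed reference concrete, the following is a minimal software sketch, not Bancroft's actual hardware design. It assumes the target sequence is cut into non-overlapping chunks of `K` bases, that each chunk is looked up in a two-table cuckoo hash built over every `K`-mer of the reference, and that a chunk is emitted either as a match token (reference position) or as a literal. The chunk length `K`, the token format, the table sizing, and all names (`CuckooIndex`, `build_index`, `compress`, `decompress`) are illustrative assumptions; grouped header encoding is not modeled.

```python
import hashlib

K = 8  # chunk length in bases (hypothetical; the abstract gives no value)


def _hash(key: str, seed: int, size: int) -> int:
    """Seeded hash of a K-mer into [0, size); one seed per cuckoo table."""
    digest = hashlib.blake2b(key.encode(), digest_size=8, salt=bytes([seed])).digest()
    return int.from_bytes(digest, "little") % size


class CuckooIndex:
    """Two-table cuckoo hash mapping a K-mer to one reference position."""

    def __init__(self, size: int, max_kicks: int = 64):
        self.size = size
        self.max_kicks = max_kicks
        self.tables = [[None] * size, [None] * size]

    def lookup(self, key: str):
        # A key can only live in one slot per table, so a lookup is two probes.
        for which in (0, 1):
            entry = self.tables[which][_hash(key, which, self.size)]
            if entry is not None and entry[0] == key:
                return entry[1]
        return None

    def insert(self, key: str, pos: int) -> bool:
        if self.lookup(key) is not None:
            return True  # keep the first occurrence of a repeated K-mer
        entry, which = (key, pos), 0
        for _ in range(self.max_kicks):
            slot = _hash(entry[0], which, self.size)
            # Place the entry and pick up whatever occupied the slot before.
            entry, self.tables[which][slot] = self.tables[which][slot], entry
            if entry is None:
                return True
            which ^= 1  # the evicted entry tries its slot in the other table
        # Gave up: the last evicted entry is dropped. This only costs a
        # potential match later; it never affects decompression correctness.
        return False


def build_index(reference: str) -> CuckooIndex:
    index = CuckooIndex(size=2 * len(reference) + 1)
    for pos in range(len(reference) - K + 1):
        index.insert(reference[pos:pos + K], pos)
    return index


def compress(target: str, index: CuckooIndex):
    """Encode the target as ("M", ref_pos) or ("L", bases) tokens, one per chunk."""
    tokens = []
    for i in range(0, len(target), K):  # fixed stride: no sliding-window search
        chunk = target[i:i + K]
        pos = index.lookup(chunk) if len(chunk) == K else None
        tokens.append(("M", pos) if pos is not None else ("L", chunk))
    return tokens


def decompress(tokens, reference: str) -> str:
    return "".join(reference[v:v + K] if t == "M" else v for t, v in tokens)
```

A round trip would look like `decompress(compress(target, build_index(reference)), reference)`, which returns `target` for any input because every match is verified by key comparison during lookup. Because chunks sit at fixed offsets in this sketch, the chunk holding a given offset of the original data can be located directly, which is one plausible way a fixed stride could support random access.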