简约即高效：基于空间填充曲线排序的实用距离报告方法 (Simpler is Faster: Practical Distance Reporting by Sorting Along a Space-Filling Curve)

Range reporting is a classical problem in computational geometry. A (rectangular) reporting data structure stores a point set $P$, such that, given a (rectangular) query region $Δ$, it returns all points in $P \cap Δ$. A variety of data structures support such queries with differing asymptotic guarantees such as k-d trees, range trees, R-trees, and quadtrees. A common variant of range queries are distance reporting queries, where the input is a query point $q$ and a radius $δ$, and the goal is to report all points in $P$ within distance $δ$ of $q$. Such queries frequently arise as subroutines in geometric data structures. Practical implementations typically answer distance queries through rectangular range queries using the data structures listed before. This paper revisits a simple and practical heuristic for distance reporting, originally proposed in TCS'97: sort the input point set~$P$ along a space-filling curve. Queries then reduce to scanning at most four contiguous ranges along the sorted curve. The fact that sorting along a space-filling curve is beneficial for range reporting is well-known. Many implementations use this technique to speed up their query and construction times. The point that this paper makes is subtle, but interesting: we argue that often, it is the space-filling curve rather than the overall data structure that provides the performance benefits. Thus, we offer a simple but effective alternative: only sort $P$ along a space-filling curve instead. We compare this approach to eight range searching implementations, across an elaborate test suite of real-world and synthetic data. Our experiments confirm this simple 200-line code approach out-performs all high-end implementations in terms of space usage and construction time. It presents almost always the best query times. In a dynamic setting, our approach dominates in performance.

翻译：范围报告是计算几何中的一个经典问题。一个（矩形）报告数据结构存储点集$P$，使得给定一个（矩形）查询区域$Δ$时，能够返回$P \cap Δ$中的所有点。多种数据结构支持此类查询，并具有不同的渐近性能保证，例如k-d树、范围树、R树和四叉树。范围查询的一个常见变体是距离报告查询，其输入为查询点$q$和半径$δ$，目标是报告$P$中距离$q$在$δ$范围内的所有点。此类查询常作为几何数据结构中的子程序出现。实际实现通常通过使用前述数据结构进行矩形范围查询来回答距离查询。本文重新审视了一种简单而实用的距离报告启发式方法，最初在TCS'97中提出：沿空间填充曲线对输入点集$P$进行排序。查询随后简化为沿排序后的曲线扫描至多四个连续范围。沿空间填充曲线排序对范围报告有益是众所周知的。许多实现使用此技术来加速查询和构建时间。本文提出的观点微妙但有趣：我们认为，通常是空间填充曲线而非整体数据结构提供了性能优势。因此，我们提供了一种简单而有效的替代方案：仅沿空间填充曲线对$P$进行排序。我们将此方法与八种范围搜索实现进行了比较，测试集涵盖了真实世界和合成数据的详尽测试套件。实验证实，这种仅需200行代码的简单方法在空间使用和构建时间方面优于所有高端实现，并且在几乎所有情况下都表现出最佳的查询时间。在动态设置中，我们的方法在性能上占据主导地位。