The Viterbi algorithm is a key operator for structured sequence inference in modern data systems, with applications in trajectory analysis, online recommendation, and speech recognition. As these workloads increasingly migrate to resource-constrained edge platforms, standard Viterbi decoding remains memory-intensive and computationally inflexible. Existing methods typically trade decoding time for space efficiency, but often incur significant runtime overhead and lack adaptability to various system constraints. This paper presents FLASH Viterbi, a Fast, Lightweight, Adaptive, and Hardware-Friendly Viterbi decoding operator that enhances adaptability and resource efficiency. FLASH Viterbi combines a non-recursive divide-and-conquer strategy with pruning and parallelization techniques to enhance both time and memory efficiency, making it well-suited for resource-constrained data systems. To further decouple space complexity from the hidden state space size, we present FLASH-BS Viterbi, a dynamic beam search variant built on a memory-efficient data structure. Both proposed algorithms exhibit strong adaptivity to diverse deployment scenarios by dynamically tuning internal parameters. To ensure practical deployment on edge devices, we also develop FPGA-based hardware accelerators for both algorithms, demonstrating high throughput and low resource usage. Extensive experiments show that our algorithms consistently outperform existing baselines in both decoding time and memory efficiency, while preserving adaptability and hardware-friendly characteristics essential for modern data systems. All codes are publicly available at https://github.com/Dzh-16/FLASH-Viterbi.
 翻译:暂无翻译