Accurate AS-to-organization mapping underpins Internet measurement and security, yet registries are fragmented, PeeringDB is narrow, and routing views reflect connectivity rather than ownership. We take a pragmatic step: ASINT integrates curated web evidence with retrieval-guided LLM techniques and strict, evidence-cited validation to infer two relations (aliases and directed parent-child) and then revalidates them conservatively. To keep the dataset sustainable, we operate a public dashboard and API where operators can inspect per-ASN evidence and submit feedback that seeds refreshes. At scale, ASINT maps 112,172 ASNs into 82,840 organization families and, on overlapping AS sets, yields fewer, larger families with 21-24% more multi-AS groups than prior datasets (i.e., CAIDA AS2Org [11], AS2ORG+ [4], AS-Sibling [10], and Borges [28]). Quality is high in practice: ASINT achieves a precision of 0.9608, a recall of 0.9915 and an accuracy of 0.9752 under manual validation. Public deployment further drew operator-submitted reports for 595 ASNs across 106 organizations, with only 6 errors (99.0% observed clustering accuracy), with feedback coming from network operators across all RIR regions. Better organization context improves downstream analyses: +27.5% intra-organization RPKI misconfiguration detections, -9.4% benign hijack alerts, and -5.9% corrections to cases mislabeled as IP leasing. We release code, datasets, and the operator platform with APIs; given persistent ambiguity in organizational names and the continual evolution of corporate structures, an operator-in-the-loop process is essential; the platform records per ASN feedback with provenance and incorporates it into periodic refreshes and retraining. The methodology is model-agnostic and stands to improve further as base LLMs advance.
翻译:暂无翻译