Phishing websites pose a major cybersecurity threat, exploiting unsuspecting users and causing significant financial and organisational harm. Traditional machine learning approaches to phishing detection often require extensive feature engineering, continuous retraining, and costly infrastructure maintenance. Proprietary large language models (LLMs) have demonstrated strong performance in phishing-related classification tasks, but their operational costs and reliance on external providers limit their practical adoption in many business environments. This paper investigates the feasibility of using small language models (SLMs) to detect phishing websites from the raw HTML code of the pages alone. A key advantage of these models is that they can be deployed on local infrastructure, giving organisations greater control over data and operations. We systematically evaluate 15 commonly used SLMs, ranging from 1 billion to 70 billion parameters, benchmarking their classification accuracy, computational requirements, and cost-efficiency. Our results highlight the trade-offs between detection performance and resource consumption: although SLMs underperform state-of-the-art proprietary LLMs, they can still provide a viable and scalable alternative to external LLM services. By presenting a comparative analysis of costs and benefits, this work lays the foundation for future research on the adaptation, fine-tuning, and deployment of SLMs in phishing detection systems, with the aim of balancing security effectiveness and economic practicality.