Application of AI Models for Name Squatting and Scam Detection

VEGO SCOCCO, THOMAS
2024/2025

Abstract

Rapid technological development has led to platforms for sharing advances across many fields of research. This growth has also given cybercriminals new ways to deceive victims and distribute potentially harmful content. One such technique is name squatting, and more specifically brand impersonation. Although it has been studied in contexts such as social media and domain names, the use of AI models to identify it remains largely unexplored. This thesis investigates the use of AI models to identify potential cases of name squatting on the HuggingFace platform and evaluates how well such models perform on this difficult problem. Two pipelines are proposed that address the problem from different perspectives: the first uses embedding models and clustering algorithms, and the second uses string distance metrics and instruction-based language models. The thesis also includes a complementary study on email scam detection that builds on previously conducted and documented work. Its purpose is to assess how a deep learning model performs against simpler models and algorithms, and thus whether the latter remain competitive in cybersecurity, and more precisely in anomaly detection.
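
The abstract does not describe how either pipeline is implemented. Purely as an illustration of the string-distance idea behind the second pipeline, the sketch below flags names that lie within a small edit distance of a trusted name. The trusted list, the observed names, the threshold of 2, and the function names are all hypothetical. The instruction-based language model stage is not shown.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def flag_candidates(names, trusted, max_distance=2):
    """Return (name, trusted_name, distance) for near-matches of trusted names.

    Exact matches are skipped; comparison is case-insensitive, so a name that
    differs only in letter case is reported with distance 0.
    The threshold is an assumption, not a value taken from the thesis.
    """
    flagged = []
    for name in names:
        for ref in trusted:
            if name == ref:
                continue
            distance = levenshtein(name.lower(), ref.lower())
            if distance <= max_distance:
                flagged.append((name, ref, distance))
    return flagged


# Hypothetical organisation names, for illustration only.
trusted = ["example-labs", "acme-research"]
observed = ["example-labs", "examp1e-labs", "acme-reserch", "unrelated-org"]
candidates = flag_candidates(observed, trusted)
```

A filter like this only proposes candidates. A low edit distance does not by itself indicate impersonation, which is why a second judgment stage is needed.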
Files in this item:
Thomas_Vego_Scocco___Thesis (1).pdf (open access, Adobe PDF, 2.77 MB)

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14247/26983