Application of AI Models for Name Squatting and Scam Detection
VEGO SCOCCO, THOMAS
2024/2025
Abstract
The rapid development of technology has brought with it platforms for sharing advances across many fields of research. This growth has also given cybercriminals new ways to deceive victims and distribute potentially harmful content. One such technique is name squatting, and more specifically brand impersonation. Although name squatting has been studied in contexts such as social media and domain names, the use of AI models to identify it remains largely unexplored. This thesis investigates the use of AI models to identify potential cases of name squatting on the HuggingFace platform and evaluates how well such models perform on this difficult problem. Two pipelines are proposed that approach the problem from different perspectives: the first uses embedding models and clustering algorithms, and the second uses string distance metrics and instruction-based language models. The thesis also includes a complementary study on email scam detection, based on previously conducted and documented work. Its purpose is to assess how a deep learning model performs against simpler models and algorithms, and consequently whether the latter remain competitive in cybersecurity, and more precisely in anomaly detection.

| File | Size | Format |
|---|---|---|
| Thomas_Vego_Scocco___Thesis (1).pdf (open access) | 2.77 MB | Adobe PDF |
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14247/26983