Emotion and Empathy in Dialogue: Comparing Generative AI Tools and Humans

DALLAI, NADA
2024/2025

Abstract

This thesis investigates the ability of Generative AI to recognize and interpret emotions in conversational text, comparing its performance to human annotations. Using the MELD dataset, we benchmark the emotion and sentiment classifications produced by tools such as Gemma3 against human-labeled ground truth. The study evaluates agreement through statistical metrics and analyzes where AI aligns with or diverges from human judgment. The findings highlight both the potential and the limitations of Generative AI in affective computing and suggest directions for future work, including multimodal integration and human-in-the-loop systems.
2024
Files in this item:
Master's Thesis Nada Dallai.pdf (open access, 2.01 MB, Adobe PDF)

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14247/27078