vivekk16/pun-detection-with-age-appropriateness

Pun detection with age appropriateness

This project explores methods for detecting, locating, and interpreting puns in English sentences, combining machine learning techniques with lexical resources. It has three objectives: determine whether a given sentence contains a pun, identify the word responsible for the pun, and interpret that word's dual meanings. The project draws on the SemEval-2017 Task 7 dataset, focusing on homographic puns, and uses transformer-based models such as DistilBERT for pun detection. To locate the pun word, we employed several techniques, including word ambiguity, positional strategies, and semantic similarity. We also used WordNet for word sense disambiguation (WSD) and large language models (LLMs) such as GPT-4 to refine sense identification.
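As a rough illustration of how word ambiguity and a positional strategy might be combined to locate a pun word, here is a minimal sketch. It is not the repository's implementation: the function name `locate_pun_word`, the `position_weight` parameter, the scoring formula, and the toy sense counts are all assumptions made for this example. In practice the sense counts could come from a WordNet lookup, and the bias toward later words is only one possible positional heuristic.

```python
import re

# Hypothetical sense counts standing in for a lexical-resource lookup
# (for example, the number of WordNet synsets per word). Illustrative only.
TOY_SENSE_COUNTS = {"used": 3, "banker": 1, "then": 1, "lost": 6, "interest": 7}


def locate_pun_word(sentence, sense_counts, position_weight=0.5):
    """Return the token with the highest ambiguity-times-position score.

    Assumptions of this sketch: words with fewer than two senses cannot
    carry a pun, and later positions receive a linear bonus.
    """
    tokens = re.findall(r"[a-z']+", sentence.lower())
    if not tokens:
        return None
    last_index = max(len(tokens) - 1, 1)  # avoid division by zero
    best_word, best_score = None, float("-inf")
    for i, token in enumerate(tokens):
        ambiguity = sense_counts.get(token, 0)
        if ambiguity < 2:
            continue  # unambiguous or unknown word: skip
        # Relative position runs from 0.0 (first token) to 1.0 (last token).
        score = ambiguity * (1.0 + position_weight * i / last_index)
        if score > best_score:
            best_word, best_score = token, score
    return best_word


candidate = locate_pun_word(
    "I used to be a banker, but then I lost interest.", TOY_SENSE_COUNTS
)
```

With the toy counts above, `candidate` is `"interest"`, since it is both the most ambiguous word and the last one in the sentence. A semantic similarity signal could be added as a further term in the score.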

We also explored age appropriateness by determining the age at which the pun word is typically acquired, using the Kuperman dataset and synthetic datasets. Chain-of-thought prompting was used with LLMs to improve pun interpretation, with particular attention to reasoning and contextual relevance. The approach blends statistical, semantic, and contextual methods and achieves notable accuracy in identifying puns and their meanings. Future work is planned on word sense disambiguation and reasoning methodologies.
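The age-appropriateness check can be pictured as a lookup of the pun word's age-of-acquisition (AoA) rating followed by a comparison with the reader's age. The sketch below assumes this simple threshold rule. The function name `is_age_appropriate` and the rating in `PLACEHOLDER_AOA` are hypothetical: the value is a placeholder, not a figure taken from the Kuperman dataset.

```python
# Placeholder rating for illustration only; NOT a value from the Kuperman dataset.
PLACEHOLDER_AOA = {"interest": 8.0}


def is_age_appropriate(pun_word, reader_age, aoa_ratings):
    """Return True if the reader is at least as old as the word's AoA rating.

    Returns None when the word has no rating, so callers can tell
    "unknown" apart from "not appropriate".
    """
    aoa = aoa_ratings.get(pun_word.lower())
    if aoa is None:
        return None
    return reader_age >= aoa


suitable_for_ten = is_age_appropriate("Interest", 10, PLACEHOLDER_AOA)
suitable_for_six = is_age_appropriate("interest", 6, PLACEHOLDER_AOA)
```

Returning `None` for unrated words leaves room for a fallback, such as a rating from a synthetic dataset.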
