Comparison of Pun Detection Methods Using Japanese Pun Corpus
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
Abstract
A sampling survey of the typology and component ratios of Japanese puns revealed that the most common type is a pun in which the sentence contains two sound sequences whose consonants are phonetically close to each other. Based on this finding, we constructed rules that detect pairs of phonetically similar sequences and used the detected pairs as features for a supervised machine learning classifier. An evaluation experiment that combined these features with Bag-of-Words features confirmed the effectiveness of adding the rule-based features to the baseline.
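As an illustration only, the idea of pairing rule-based similarity features with Bag-of-Words features might be sketched as follows. This is a minimal sketch and not the authors' rule set: the consonant groups in `CLOSE_CONSONANTS`, the representation of a sentence as romanized `(consonant, vowel)` morae, the fixed sequence length, and the function and feature names (`find_similar_pairs`, `extract_features`, `has_similar_pair`, `num_similar_pairs`) are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical groups of consonants treated as "phonetically close"
# (voiced/unvoiced counterparts). Not the paper's actual rule set.
CLOSE_CONSONANTS = [{"k", "g"}, {"s", "z"}, {"t", "d"}, {"h", "b", "p"}]


def consonants_close(c1, c2):
    """Return True if two consonants are identical or share a group."""
    if c1 == c2:
        return True
    return any(c1 in group and c2 in group for group in CLOSE_CONSONANTS)


def morae_similar(m1, m2):
    """Morae are (consonant, vowel) pairs; vowels must match exactly."""
    return m1[1] == m2[1] and consonants_close(m1[0], m2[0])


def find_similar_pairs(morae, length=2):
    """Return (i, j) start indices of non-overlapping similar sequences."""
    pairs = []
    last_start = len(morae) - length
    for i in range(last_start + 1):
        # j starts at i + length so the two sequences never overlap.
        for j in range(i + length, last_start + 1):
            if all(morae_similar(morae[i + k], morae[j + k]) for k in range(length)):
                pairs.append((i, j))
    return pairs


def extract_features(tokens, morae, length=2):
    """Combine Bag-of-Words counts with rule-based pair features."""
    features = Counter(f"bow={t}" for t in tokens)
    pairs = find_similar_pairs(morae, length)
    features["has_similar_pair"] = int(bool(pairs))
    features["num_similar_pairs"] = len(pairs)
    return dict(features)
```

For a toy input such as the morae `ka, ta, no, ga, da`, `find_similar_pairs` would pair `ka-ta` with `ga-da`, because k/g and t/d fall in the same assumed groups and the vowels match. The resulting feature dictionary could then be passed to any supervised classifier alongside the word counts.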