XXIX Open Conference of Philology Students at St Petersburg State University (SPbGU)

Can AI crack a joke? Exploring the frontiers of AI-generated humour

Maria Vitalevna Ivanova
Speaker
4th-year undergraduate student
Moscow City Pedagogical University
Daria Nikolaevna Pervakova
Speaker
1st-year master's student
HSE University

Keywords, abstract

The paper examines the ability of ChatGPT-5 and Yandex GPT-5 to generate humour across three joke formats (knock-knock jokes, bar jokes and lightbulb jokes), comparing their outputs with human-made examples, and also considers AI-generated memes. Both models replicate formal structures and produce basic wordplay, yet fail to land effective punchlines or the unexpected twist that makes humour work. The findings suggest that current language models imitate the shape of humour rather than its substance.


Extended abstract

Keywords: humour generation; joke cycles; internet memes; language models; verbal humour theory

The question of whether AI can produce genuinely funny content has attracted growing attention as language models become increasingly integrated into creative processes. The shift from GPT-3 to GPT-5 significantly improved models' ability to understand context and produce more nuanced responses. Humour, however, remains a particularly difficult area, since it relies on cultural knowledge, timing and the kind of unexpected twist that makes a joke land.
The study compares jokes produced by ChatGPT-5 and Yandex GPT-5 with human-generated examples across three classic formats: knock-knock jokes, bar jokes and lightbulb jokes. The analysis is grounded in the General Theory of Verbal Humour, which holds that jokes function by activating two conflicting interpretations, with the punchline forcing a sudden switch between them [Attardo, Raskin, 1991].
In knock-knock jokes, both models use phonetic wordplay (cargo/car go and woody/would you), but the puns feel disconnected from the overall exchange. The human example (lettuce/let us) works because the pun is tied directly to the situation (someone standing outside in the cold), so both meanings feel equally present. Neither model delivers an ending that recontextualises the setup, so there is no moment where everything clicks into place. The human bar joke (three lettering styles walk into a bar and are told: "We don't serve your type here") works because type carries two meanings at once: a printing term for a style of lettering and a word for a kind of person. The lightbulb section follows the same pattern: ChatGPT's cat answer is coherent but too predictable, while Yandex GPT loses the thread at the verb watch. The human version, by contrast, catches the reader off guard: although the question sounds like a standard lightbulb joke, the answer reframes it entirely by making an unexpected reference to psychotherapy.
The meme comparison yields similar conclusions. The LOLcat generated by ChatGPT ("I can haz deadline? No time 4 naps") follows genre conventions well enough, using intentional misspellings, number substitutions and a relatable scenario, and can therefore be seen as a reasonable imitation. The Pepe Silvia meme, however, is too wordy and literal, missing the conciseness and layered meaning that make memes effective [Stryker, 2011]. Tsakona maintains that memes draw heavily on shared cultural references, and this is precisely where the AI-generated examples fall short [Tsakona, 2020].
The results indicate that ChatGPT-5 and Yandex GPT-5 function as structural imitators of humour. While reproducing formal patterns and executing basic wordplay, they tend to fall short of generating the semantic collisions that make humour genuinely effective. Future progress will likely depend on advances in models' capacity for cultural reasoning and the kind of interpretive surprise that lies at the heart of the comic effect.

References:
Attardo S., Raskin V. Script theory revis(it)ed: Joke similarity and joke representation model // Humor: International Journal of Humor Research. 1991. 4 (3–4). 293–347.
Stryker C. Epic win for Anonymous: how 4chan's army conquered the Web. New York, 2011.
Tsakona V. Tracing the trajectories of contemporary online joking // Media Linguistics. 2020. 169–183.