AI, Narratives and Underrepresented Languages
- Karine Megerdoomian
At the recent North American Conference on Iranian Linguistics (NACIL4), I had the opportunity to present our paper entitled AI-Enabled Narrative Analytics for Persian and Kurdish. This project explores how large language models (LLMs) can be used to automatically extract and interpret complex event structures, timelines, and narrative logic from texts in languages that have often been neglected in the AI space.
Here's a summary of our talk:
Overview
We used prompt engineering to guide LLMs in identifying the structural elements of a narrative; in other words, the system automatically extracts the information needed to answer who did what to whom, where, when, and why. We found that LLMs perform quite well on this task for Persian and Sorani Kurdish, especially in inferring implicit information and handling discontinuous elements, without requiring the integration of NLP pipeline components or structured resources such as a WordNet or a treebank. However, these systems are inconsistent (especially for Kurdish) and do not perform as well on complex analyses such as coreference resolution.
AI and LRLs
The advent of large language models (LLMs) has transformed the discipline of Natural Language Processing and AI. LLMs pretrained mostly on high-resource languages (for GPT-3, approximately 92.65% of the training data was in English) have been shown to produce skewed outputs for low-resource languages (LRLs). Data scarcity is the biggest hurdle, leading to suboptimal performance, especially for languages with complex structures or little digital presence, and the lack of LLM support for these languages risks digital exclusion for millions of speakers. We wanted to explore the feasibility of using multilingual and instruction-tuned LLMs for a medium-resource language (Persian) and a low-resource language (Sorani Kurdish).
Why narrative analytics?
Narratives are foundational to human expression across cultures. They appear in the stories we tell, in folktales, news reports, memoirs, podcasts, and visual media. The linguist Bill Labov describes a narrative as "a recounting of things that have happened, involving a sequence of events meaningfully connected in a temporal and often causal relation, typically structured with a beginning, middle, and end". Narrative understanding isn't just about identifying verbs or entities. It's about tracking who did what to whom, when, where, and why, often across paragraphs, embedded quotes, shifting tenses, omitted agents, and implicit temporal information.
In conventional NLP approaches, building a system for such analysis in a low-resource language means starting almost entirely from scratch: tokenization, part-of-speech tagging, entity detection, event extraction or coreference modules, and a temporal analysis and reasoning component. And this has to be repeated for each new language. So, instead of building a full NLP pipeline for Persian and Kurdish narrative analysis, we decided to leverage LLMs and evaluate their performance in these languages, with the possibility of extending the system to other low-resource languages of the Iranian family.
Our approach: Prompt-driven narrative analysis with LLMs
We experimented with prompt-based LLM workflows using multilingual foundation models (e.g., GPT-4, DeepSeek, LLaMA). We implemented:
Definition-driven in-context learning (ICL) prompts, where the language model performs the task by conditioning on a small set of instructions and task definitions and returns its output in a specified format.
Prompt chaining, where each prompt feeds its structured output into the next, to avoid error propagation down the analytics pipeline. For example, the output of the entity and event detection prompt is fed into the following entity and event relation identification prompt.
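As a rough illustration, the two-step chain described above might look like the sketch below. This is our illustration, not the paper's actual code: `call_llm` is a hypothetical placeholder for whatever model API is used (GPT-4, DeepSeek, etc.), and the prompt wording and JSON schema are assumptions.

```python
import json

def call_llm(prompt: str) -> str:
    """Hypothetical placeholder for a real model API call.
    Here it just returns an empty structured result so the sketch runs."""
    return json.dumps({"entities": [], "events": []})

def detect_entities_events(text: str) -> dict:
    """Chain step 1: extract entities and events as structured JSON."""
    prompt = (
        "Identify every entity and event in the text below. "
        "Return JSON with keys 'entities' and 'events'.\n\n" + text
    )
    return json.loads(call_llm(prompt))

def relate_entities_events(text: str, step1_output: dict) -> str:
    """Chain step 2: the structured output of step 1 is embedded verbatim
    in the next prompt, so relation identification works from fixed,
    machine-readable input instead of re-extracting (and possibly
    contradicting) step 1's analysis."""
    prompt = (
        "Given these entities and events:\n"
        + json.dumps(step1_output, ensure_ascii=False)
        + "\n\nLink each event to its participants (who did what to whom) "
          "in the text below.\n\n" + text
    )
    return call_llm(prompt)
```

The design choice worth noting is that step 2 conditions on step 1's structured output rather than on a free-text summary, which keeps the interface between chain stages machine-checkable.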
We evaluated model outputs using a small gold-standard dataset annotated for events and their temporal anchoring. For Persian we created 20 documents with a total of 160 sentences.
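Scoring against such a gold standard typically reduces to precision, recall, and F1 over the extracted event mentions. The sketch below is our illustration of that kind of scoring (not the paper's exact evaluation code), using exact match between predicted and gold mentions:

```python
def prf(gold: set, predicted: set) -> tuple:
    """Precision, recall, and F1 for extracted event mentions,
    scored by exact match against the gold annotations."""
    tp = len(gold & predicted)                       # true positives
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Toy example: two of three predicted event mentions match the gold set.
gold = {"announced", "signed", "departed"}
pred = {"announced", "signed", "arrived"}
precision, recall, f1 = prf(gold, pred)   # each evaluates to 2/3 here
```

In practice, evaluation of temporal anchoring would also need partial-match or interval-based credit, but exact match is the simplest baseline.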

After evaluating the results, we tuned the GPT model with
Few-shot prompting: provided 2-5 examples in the prompt for specific challenging tasks (e.g., date conversion from the Persian to the Gregorian calendar).
Chain-of-Thought (CoT) reasoning: encouraged the LLM to generate intermediate reasoning steps before arriving at the final answer, improving performance on complex tasks that require multi-step logic or inference (e.g., causal inference, implicit temporal resolution).
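Combining the two techniques, a few-shot prompt with a CoT instruction for the date-conversion task might be assembled as in the sketch below. The prompt wording and the helper names are our assumptions, not the paper's exact prompts; the two example date pairs are standard Solar Hijri/Gregorian correspondences included only as illustration.

```python
# Illustrative few-shot pairs for Solar Hijri -> Gregorian date conversion
# (1 Farvardin 1402 fell on 21 March 2023; 11 Dey 1402 on 1 January 2024).
FEW_SHOT_EXAMPLES = [
    ("1402/01/01", "2023-03-21"),
    ("1402/10/11", "2024-01-01"),
]

# Chain-of-thought instruction: ask for intermediate steps before the answer.
COT_INSTRUCTION = (
    "Think step by step: first determine the Gregorian date of 1 Farvardin "
    "for the given year, then add the offset of the Persian month and day, "
    "and only then state the final Gregorian date."
)

def build_prompt(query_date: str) -> str:
    """Assemble a few-shot, chain-of-thought prompt for date conversion."""
    lines = [
        "Convert the Persian (Solar Hijri) date to the Gregorian calendar.",
        COT_INSTRUCTION,
        "",
    ]
    for persian, gregorian in FEW_SHOT_EXAMPLES:
        lines.append(f"Persian: {persian} -> Gregorian: {gregorian}")
    lines.append(f"Persian: {query_date} -> Gregorian:")
    return "\n".join(lines)
```

The in-prompt examples anchor the expected output format, while the CoT instruction steers the model toward showing the conversion steps rather than guessing the final date directly.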


Linguistic areas where the system performed well:
Event detection, including implicit events that require pragmatic understanding, and nested events.
Identifying compound verbs (very common in Persian and Kurdish), even when their components are separated in the sentence.
Conditional events (if x, then y).
Inferring implicit information, such as the subject of a passive, from context and from external historical and cultural knowledge not explicitly stated in the text.
The challenges? Disambiguating references in embedded and subordinate clauses and other complex constructions, and inconsistent results with counterfactuals (e.g., an intended action that did not actually occur).
What surprised us
Here’s the twist: LLMs performed remarkably well out of the box, even on Sorani Kurdish, a truly low-resource language with very little labeled data. Without any fine-tuning, the model was able to:
Identify key events and actors in complex narratives
Infer missing information based on context
Produce timelines that generally aligned with human judgment
This is a dramatic shift from the traditional NLP paradigm. While the models weren’t flawless, their ability to generalize across unseen syntax and pragmatics was a clear sign that LLMs—when guided carefully—can be leveraged for real work in LRLs without having to build a full NLP stack from scratch. For researchers working on endangered or minority languages, this opens exciting doors.
Why it matters
The implications extend beyond technical novelty. Narrative analytics can support:
Human rights monitoring (e.g., tracking patterns of abuse in local reporting)
Cross-linguistic education (e.g., teaching students how stories encode culture)
AI equity (e.g., reducing the gap between high- and low-resource languages)
It also challenges the field to rethink what “low-resource” really means in the age of LLMs. Maybe it’s not just about data availability, but about whether we're asking the right questions with the right tools.
Key takeaways
We found that:
• LLMs enable holistic narrative analysis, capturing causality, temporality, and implicit elements—especially valuable for under-resourced languages like Persian and Kurdish.
• Traditional NLP offers precision and schema alignment but struggles with flexibility, implicit reasoning, and multilingual generalization.
• LLMs perform surprisingly well out-of-the-box for Sorani and Persian, but results are inconsistent (esp. for Kurdish) and lack explainability.
We think there is potential for a hybrid approach:
• Combine LLM strengths (inference, multilingualism, context) with traditional NLP precision (structured extraction, schema mapping).
• Apply LLMs for causal inference, temporal ordering, and character motivation analysis.
Final Thoughts
We are continuing this research by building an annotated dataset for Sorani Kurdish for evaluation purposes, exploring schema-constrained prompting to bridge LLM outputs with formal evaluation frameworks (e.g., TimeML), and investigating ways to improve challenging areas by integrating knowledge-based modules with LLMs.
This project is part of a larger research program at the Zoorna Institute for Language, AI and Society for building datasets and AI tools for Middle Eastern languages, to help develop an AI ecosystem that represents our global linguistic and cultural diversity.
If you’re working on anything similar—or just curious about how stories and AI intersect—let’s talk.
🧠 Sign up to the Zoorna Institute newsletter to keep up with progress on this project and related research.