AI, Narratives and Underrepresented Languages
- Karine Megerdoomian
At the recent North American Conference on Iranian Linguistics (NACIL4), I had the opportunity to present our paper entitled AI-Enabled Narrative Analytics for Persian and Kurdish. This project explores how large language models (LLMs) can be used to automatically extract and interpret complex event structures, timelines, and narrative logic from texts in languages that have often been neglected in the AI space.
Here's a summary of our talk:
Overview
We used prompt engineering to guide LLMs in identifying the structural elements of a narrative; in other words, the system automatically extracts the information needed to answer who did what to whom, where, when, and why. We found that LLMs perform quite well on this task for Persian and Sorani Kurdish, especially in inferring implicit information and handling discontinuous elements, without requiring the integration of NLP pipeline components or structured resources such as a WordNet or a treebank. However, these systems are inconsistent (especially for Kurdish) and do not perform as well on complex analyses such as coreference resolution.
AI and LRLs
The advent of large language models (LLMs) has transformed the discipline of Natural Language Processing and AI. LLMs pretrained mostly on high-resource languages (for GPT-3, approximately 92.65% of the training data was in English) have been shown to produce skewed outputs for low-resource languages (LRLs). Data scarcity is the biggest hurdle, leading to suboptimal performance, especially for languages with complex structures or little digital presence, and the lack of LLM support for these languages risks digital exclusion for millions of speakers. We wanted to explore the feasibility of using multilingual and instruction-tuned LLMs for a medium-resource language (Persian) and a low-resource language (Sorani Kurdish).
Why narrative analytics?
Narratives are foundational to human expression across cultures. They appear in the stories we tell, in folktales, news reports, memoirs, podcasts, and visual media. The linguist Bill Labov describes a narrative as "a recounting of things that have happened, involving a sequence of events meaningfully connected in a temporal and often causal relation, typically structured with a beginning, middle, and end". Narrative understanding isn't just about identifying verbs or entities. It's about tracking who did what to whom, when, where, and why, often across paragraphs, embedded quotes, shifting tenses, omitted agents, and implicit temporal information.
In conventional NLP approaches, building a system for such analysis in a low-resource language means starting almost entirely from scratch: tokenization, part-of-speech tagging, entity detection, event extraction or coreference modules, and a temporal analysis and reasoning component. And this has to be repeated for each new language. So, instead of building a full NLP pipeline for Persian and Kurdish narrative analysis, we decided to leverage LLMs and evaluate their performance in these languages, with the possibility of extending the system to other low-resource languages of the Iranian family.
Our approach: Prompt-driven narrative analysis with LLMs
We experimented with prompt-based LLM workflows using multilingual foundation models (e.g., GPT-4, DeepSeek, LLaMA). We implemented:
Definition-driven in-context learning (ICL) prompts, where the language model performs the task by conditioning on a small set of instructions and task definitions and returns its output in a specified format.
Prompt chaining, where each prompt feeds its structured output into the next, to avoid error propagation down the analytics pipeline. For example, the output of the entity and event detection prompt is fed into the following entity and event relation identification prompt.
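As a rough illustration, the two-step chain described above might look like the sketch below. This is our illustration, not the paper's actual code: `call_llm` is a hypothetical placeholder for whatever model API is used (GPT-4, DeepSeek, etc.), and the prompt wording and JSON schema are assumptions.

```python
import json

def call_llm(prompt: str) -> str:
    """Hypothetical placeholder for a real model API call.
    Here it just returns an empty structured result so the sketch runs."""
    return json.dumps({"entities": [], "events": []})

def detect_entities_events(text: str) -> dict:
    """Chain step 1: extract entities and events as structured JSON."""
    prompt = (
        "Identify every entity and event in the text below. "
        "Return JSON with keys 'entities' and 'events'.\n\n" + text
    )
    return json.loads(call_llm(prompt))

def relate_entities_events(text: str, step1_output: dict) -> str:
    """Chain step 2: the structured output of step 1 is embedded verbatim
    in the next prompt, so relation identification works from fixed,
    machine-readable input instead of re-extracting (and possibly
    contradicting) step 1's analysis."""
    prompt = (
        "Given these entities and events:\n"
        + json.dumps(step1_output, ensure_ascii=False)
        + "\n\nLink each event to its participants (who did what to whom) "
          "in the text below.\n\n" + text
    )
    return call_llm(prompt)
```

The design choice worth noting is that step 2 conditions on step 1's structured output rather than on a free-text summary, which keeps the interface between chain stages machine-checkable.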
We evaluated model outputs using a small gold-standard dataset annotated for events and their temporal anchoring. For Persian we created 20 documents with a total of 160 sentences.
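Scoring against such a gold standard typically reduces to precision, recall, and F1 over the extracted event mentions. The sketch below is our illustration of that kind of scoring (not the paper's exact evaluation code), using exact match between predicted and gold mentions:

```python
def prf(gold: set, predicted: set) -> tuple:
    """Precision, recall, and F1 for extracted event mentions,
    scored by exact match against the gold annotations."""
    tp = len(gold & predicted)                       # true positives
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Toy example: two of three predicted event mentions match the gold set.
gold = {"announced", "signed", "departed"}
pred = {"announced", "signed", "arrived"}
precision, recall, f1 = prf(gold, pred)   # each evaluates to 2/3 here
```

In practice, evaluation of temporal anchoring would also need partial-match or interval-based credit, but exact match is the simplest baseline.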

After evaluating the results, we tuned the GPT model with
Few-shot prompting: provided 2-5 examples in the prompt for specific challenging tasks (e.g., date conversion from the Persian to the Gregorian calendar).
Chain-of-Thought (CoT) reasoning: encouraged the LLM to generate intermediate reasoning steps before arriving at the final answer, improving performance on complex tasks that require multi-step logic or inference (e.g., causal inference, implicit temporal resolution).
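Combining the two techniques, a few-shot prompt with a CoT instruction for the date-conversion task might be assembled as in the sketch below. The prompt wording and the helper names are our assumptions, not the paper's exact prompts; the two example date pairs are standard Solar Hijri/Gregorian correspondences included only as illustration.

```python
# Illustrative few-shot pairs for Solar Hijri -> Gregorian date conversion
# (1 Farvardin 1402 fell on 21 March 2023; 11 Dey 1402 on 1 January 2024).
FEW_SHOT_EXAMPLES = [
    ("1402/01/01", "2023-03-21"),
    ("1402/10/11", "2024-01-01"),
]

# Chain-of-thought instruction: ask for intermediate steps before the answer.
COT_INSTRUCTION = (
    "Think step by step: first determine the Gregorian date of 1 Farvardin "
    "for the given year, then add the offset of the Persian month and day, "
    "and only then state the final Gregorian date."
)

def build_prompt(query_date: str) -> str:
    """Assemble a few-shot, chain-of-thought prompt for date conversion."""
    lines = [
        "Convert the Persian (Solar Hijri) date to the Gregorian calendar.",
        COT_INSTRUCTION,
        "",
    ]
    for persian, gregorian in FEW_SHOT_EXAMPLES:
        lines.append(f"Persian: {persian} -> Gregorian: {gregorian}")
    lines.append(f"Persian: {query_date} -> Gregorian:")
    return "\n".join(lines)
```

The in-prompt examples anchor the expected output format, while the CoT instruction steers the model toward showing the conversion steps rather than guessing the final date directly.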


Linguistic areas where the system performed well:
Event detection, including implicit events that require pragmatic understanding, and nested events.
Identifying compound verbs (very common in Persian and Kurdish), even when their components are separated in the sentence.
Conditional events (if x, then y).
Inferring implicit information, such as the subject of a passive, from context and from external historical and cultural knowledge not explicitly stated in the text.
The challenges? Disambiguating references in embedded and subordinate clauses and other complex constructions, and inconsistent results with counterfactuals (e.g., an intended action that did not actually occur).
What surprised us
Here’s the twist: LLMs performed remarkably well out of the box, even on Sorani Kurdish, a truly low-resource language with very little labeled data. Without any fine-tuning, the model was able to:
Identify key events and actors in complex narratives
Infer missing information based on context
Produce timelines that generally aligned with human judgment
This is a dramatic shift from the traditional NLP paradigm. While the models weren’t flawless, their ability to generalize across unseen syntax and pragmatics was a clear sign that LLMs—when guided carefully—can be leveraged for real work in LRLs without having to build a full NLP stack from scratch. For researchers working on endangered or minority languages, this opens exciting doors.
Why it matters
The implications extend beyond technical novelty. Narrative analytics can support:
Human rights monitoring (e.g., tracking patterns of abuse in local reporting)
Cross-linguistic education (e.g., teaching students how stories encode culture)
AI equity (e.g., reducing the gap between high- and low-resource languages)
It also challenges the field to rethink what “low-resource” really means in the age of LLMs. Maybe it’s not just about data availability, but about whether we're asking the right questions with the right tools.
Key takeaways
We found that:
• LLMs enable holistic narrative analysis, capturing causality, temporality, and implicit elements—especially valuable for under-resourced languages like Persian and Kurdish.
• Traditional NLP offers precision and schema alignment but struggles with flexibility, implicit reasoning, and multilingual generalization.
• LLMs perform surprisingly well out-of-the-box for Sorani and Persian, but results are inconsistent (esp. for Kurdish) and lack explainability.
We think there is potential for a hybrid approach:
• Combine LLM strengths (inference, multilingualism, context) with traditional NLP precision (structured extraction, schema mapping).
• Apply LLMs for causal inference, temporal ordering, and character motivation analysis.
Final Thoughts
We are continuing this research by building an annotated dataset for Sorani Kurdish for evaluation purposes, exploring schema-constrained prompting to bridge LLM outputs with formal evaluation frameworks (e.g., TimeML), and investigating ways to improve challenging areas by integrating knowledge-based modules with LLMs.
This project is part of a larger research program at the Zoorna Institute for Language, AI and Society for building datasets and AI tools for Middle Eastern languages, to help develop an AI ecosystem that represents our global linguistic and cultural diversity.
If you’re working on anything similar—or just curious about how stories and AI intersect—let’s talk.
🧠 Sign up to the Zoorna Institute newsletter to keep up with progress on this project and related research.