Extending a Persian Morphological Analyzer to Blogs

Abstract

This paper describes a two-level morphological analyzer for Persian, built on the Xerox finite-state tools. Persian presents several challenges to computational analysis: a complex verbal conjugation paradigm that includes long-distance morphological dependencies; phonological alternations that apply at morpheme boundaries; and word boundaries that are difficult to define, since morphemes may be detached from their stems and distinct words can appear without an intervening space. In this work, we describe these problems and provide solutions within a finite-state morphology system. The paper also presents an overview of new issues that have arisen since the advent of blogs and the spread of informal Persian text on the web. This new mode of writing poses further challenges for the computational system. The paper proposes approaches for extending the current morphological system to analyze the material found in Persian blogs.


@2025 website by Karine Megerdoomian.
