Advanced Narrative Analytics System Infrastructure (ANAnSI)
Abstract
This paper describes in depth the Advanced Narrative Analytics System Infrastructure (ANAnSI), which performs content extraction and detailed narrative analytics for knowledge discovery within a distributed high-performance system infrastructure. ANAnSI is a hybrid system: it combines linguistic resources, including substance use and mental condition lexicons, with probabilistic parsing and knowledge-based analytics to extract rich event-based analyses at the sentence level (i.e., who did what to whom, where, and when). The system also uses linguistic knowledge in its algorithms to perform reasoning tasks (e.g., temporal reasoning) and integrates machine-learning-based components to make data-driven predictions (e.g., treatment outcome analysis). ANAnSI processes each sentence in the data collection and produces a detailed event-based analysis. Additional domain-specific components are then applied to discover properties relevant to substance use, mental illness, and child pornography risk assessment. The system has been applied successfully to large datasets in health, news reports, and law enforcement.
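As a rough illustration of what a sentence-level "who did what to whom, where, and when" record might look like, the sketch below defines a small event frame and a lexicon lookup. It is not ANAnSI's implementation: the names `EventFrame`, `tag_substances`, and `SUBSTANCE_LEXICON`, the lexicon entries, and the hand-filled frame are all assumptions made for illustration, and a plain token match stands in for the probabilistic parsing the abstract mentions.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical mini-lexicon; the actual ANAnSI lexicons are not given in the abstract.
SUBSTANCE_LEXICON = {"heroin", "alcohol", "cocaine"}


@dataclass
class EventFrame:
    """One sentence-level event: who did what to whom, where, and when."""
    agent: Optional[str] = None      # who
    action: Optional[str] = None     # did what
    patient: Optional[str] = None    # to whom / what
    location: Optional[str] = None   # where
    time: Optional[str] = None       # when
    substance_terms: List[str] = field(default_factory=list)


def tag_substances(sentence: str) -> List[str]:
    """Return lexicon terms found in the sentence, in order of appearance."""
    tokens = [tok.strip(".,;:!?").lower() for tok in sentence.split()]
    return [tok for tok in tokens if tok in SUBSTANCE_LEXICON]


sentence = "The patient used heroin at home in March."

# The slots are filled by hand here; a real system would derive them from a parse.
frame = EventFrame(
    agent="patient",
    action="used",
    patient="heroin",
    location="home",
    time="March",
    substance_terms=tag_substances(sentence),
)
```

Keeping the lexicon hits on the same record as the event slots is one simple way to let a later domain-specific step read both from one place; how the actual system organizes its output is not stated in the abstract.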
[This document has not been publicly released and is not shareable. Please contact me if interested.]
Publicly released: no