Language ID for Short Texts: Evaluation Report
Abstract
This document reports on a large-scale evaluation of language identification tools applied to short texts (tweets) across 30 distinct language and script combinations. The project investigated 11 open-source, COTS, and GOTS tools, measuring language and script coverage, accuracy, precision and recall, and performance (speed, scalability, and robustness).
[The report was not publicly released, so we cannot communicate the evaluation results.]
Publicly released
no