
Language ID for Short Texts: Evaluation Report

Abstract

This document reports on a large-scale evaluation of language identification tools applied to short texts (tweets) across 30 distinct language and script combinations. The project investigated 11 open-source, COTS, and GOTS tools, measuring language and script coverage, accuracy, precision and recall, and performance (speed, scalability, and robustness).

[The report was not publicly released, so we cannot communicate the evaluation results.]

Publicly released

no

External link

Not all documents are available for download

@2025 website by Karine Megerdoomian. 
