Language ID for Short Texts: Evaluation Report

Abstract

This report describes a large-scale evaluation of language identification tools applied to short texts (tweets) covering 30 distinct language and script combinations. The project assessed 11 open-source, commercial off-the-shelf (COTS), and government off-the-shelf (GOTS) tools, measuring language and script coverage, accuracy, precision and recall, and performance (speed, scalability, and robustness).

[The report was not publicly released, so the evaluation results cannot be shared.]

Publicly released: no


@2025 website by Karine Megerdoomian. Powered by Wix.
