Spanish Full-Form Lexicon


Extremely comprehensive coverage

Simplifies morphological analysis

Monolingual or bilingual format

Overview

CJKI maintains full-form lexicons for Arabic, Japanese, and Spanish whose coverage will soon exceed 150 million entries. The monolingual edition of CJKI’s Spanish Full-Form Lexicon (SFULEX) covers approximately one million entries and includes part-of-speech codes and other grammatical attributes, whereas the bilingual edition contains about 26 million entries.

A full-form lexicon is a comprehensive lexical database that contains all inflected, declined, and conjugated forms of a language. Unlike an ordinary dictionary, which lists only canonical forms (base lexemes) such as eat, a full-form lexicon also includes inflected forms such as eating, eaten, and ate. In English, each word has only a handful of inflected forms, but languages such as Arabic, Japanese, and Spanish can have thousands of inflected forms for each verb. For example, the Spanish verb hablar ‘to speak’ has hundreds of inflected forms, such as hablaré ‘I will speak’ and hablaría ‘I would speak’.
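The idea of a full-form lexicon can be illustrated with a minimal Python sketch in which every word form is stored explicitly and analysis reduces to a table lookup. The entries, the tuple layout, and the function names `build_index` and `analyze` are assumptions made for this illustration only; they do not reflect the actual SFULEX data or format.

```python
from collections import defaultdict

# Hypothetical toy entries: (word form, lemma, part of speech, English gloss).
# These rows are illustrative only and are not taken from SFULEX.
ENTRIES = [
    ("hablar", "hablar", "verb", "to speak"),
    ("hablo", "hablar", "verb", "I speak"),
    ("hablaré", "hablar", "verb", "I will speak"),
    ("hablaría", "hablar", "verb", "I would speak"),
    ("como", "comer", "verb", "I eat"),
    ("como", "como", "conjunction", "like, as"),
]


def build_index(entries):
    """Map each lowercased word form to every analysis that lists it."""
    index = defaultdict(list)
    for form, lemma, pos, gloss in entries:
        index[form.lower()].append({"lemma": lemma, "pos": pos, "gloss": gloss})
    return index


def analyze(index, token):
    """Return all analyses for a token, or an empty list if it is unknown."""
    return index.get(token.lower(), [])


INDEX = build_index(ENTRIES)

# A conjugated form resolves to its canonical form without any stemming rules.
print(analyze(INDEX, "Hablaré"))

# An ambiguous form returns one analysis per reading.
print([a["lemma"] for a in analyze(INDEX, "como")])
```

Because each form is listed explicitly, irregular and ambiguous forms need no special handling in code; the cost is a much larger table than a lemma-only dictionary.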

SFULEX provides full coverage for all inflected and conjugated forms (word forms) in Spanish. The sample shows a subset of the conjugation paradigm for hablar ‘to speak’.

Main Features

Comprehensive coverage

Of over 26 million entries

Rich set of useful attributes

Conjugation patterns and orthographic variants

Detailed part-of-speech

And other grammatical codes

Includes all inflected and declined word forms

Mapped to their canonical forms

Dozens of data fields

Such as part-of-speech codes

Fully bilingual

Mapped to one or more English equivalents
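To make the features above more concrete, the following Python sketch models a bilingual full-form entry as a record with a few grammatical fields and filters entries by attribute. The `Entry` class, its field names, and the helpers `forms_of` and `english_for` are hypothetical; the actual SFULEX schema and field inventory are not described here and may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One hypothetical lexicon row; field names are assumptions, not SFULEX's."""
    form: str        # inflected word form
    lemma: str       # canonical form the word form maps to
    pos: str         # part-of-speech code
    mood: str
    tense: str
    person: int
    number: str
    english: tuple   # one or more English equivalents


# Illustrative rows only, not taken from SFULEX.
ENTRIES = [
    Entry("hablo", "hablar", "verb", "indicative", "present", 1, "singular",
          ("I speak", "I talk")),
    Entry("hablas", "hablar", "verb", "indicative", "present", 2, "singular",
          ("you speak", "you talk")),
    Entry("hablamos", "hablar", "verb", "indicative", "present", 1, "plural",
          ("we speak", "we talk")),
    Entry("hablaré", "hablar", "verb", "indicative", "future", 1, "singular",
          ("I will speak", "I will talk")),
]


def forms_of(entries, lemma, **attrs):
    """Return word forms of `lemma` whose fields match all given attributes."""
    return [
        e.form
        for e in entries
        if e.lemma == lemma
        and all(getattr(e, key) == value for key, value in attrs.items())
    ]


def english_for(entries, form):
    """Collect the English equivalents of every entry with this word form."""
    return [gloss for e in entries if e.form == form for gloss in e.english]


print(forms_of(ENTRIES, "hablar", tense="present", number="singular"))
print(english_for(ENTRIES, "hablaré"))
```

Filtering by lemma alone returns the whole stored paradigm, which is the kind of lookup an automatic conjugation tool or a query-expansion step could build on.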

Sample: Spanish Full-Form Lexicon: Hablar (Pers. = Person)

Practical Applications

CJKI’s full-form lexicons can bring the following benefits to various NLP applications:

Machine translation

Greatly enhanced translation quality

Morphological analysis

Significantly simplified algorithms

Pedagogical applications

Automatic conjugation systems

Information retrieval applications

Support for query processing

Named-entity recognition (NER)

Dramatically improved recognition

Part-of-speech (POS) analysis and tagging