ArabLEX
Arabic Full-Form Lexicon

Over 530 million entries

Exhaustive coverage of inflected forms

Ideal for NLP, AI and cybersecurity

Overview

CJKI’s Arabic Full-Form Lexicon, or ArabLEX, is the most comprehensive Arabic computational lexicon ever created, with over 530 million entries.

As a full-form lexicon, ArabLEX includes all inflected forms for Modern Standard Arabic and covers not only general vocabulary but also, for the first time, fully inflected proper nouns (personal and place names).

ArabLEX is an essential resource for any development project involving Arabic NLP or AI. It is suited for such applications as machine translation, speech technology, deep learning, and cybersecurity.

For example, ArabLEX has contributed significantly to the development of Amazon’s advanced Arabic speech technology for both speech synthesis (TTS) and speech recognition (ASR). Because it is comprehensive and covers morphology and phonology in depth, it has helped Amazon build a robust data-driven model that reduces error rates for both recognition and generation. In practice, it enables Amazon’s Alexa technology to recognize Arabic queries, including place and personal names, and to generate answers with greater precision.

Distinctive Features


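| ARABIC | ROMAN | LEMMA | GEN | NUM | CASE |
|---|---|---|---|---|---|
| كَاتِبٌ | kā́tibun | كَاتِبٌ | M | S | NOM |
| كَاتِبُ | kā́tibu | كَاتِبٌ | M | S | NOM |
| كَاتِبِي | kā́tibi̱ | كَاتِبٌ | M | S | NOM |
| كَاتِبُكَ | kātíbuka | كَاتِبٌ | M | S | NOM |
| كَاتِبُكِ | kātíbuki | كَاتِبٌ | M | S | NOM |
| كَاتِبُهُ | kātíbuhu | كَاتِبٌ | M | S | NOM |
| كَاتِبُهَا | kātíbuha̱ | كَاتِبٌ | M | S | NOM |
| كَاتِبُنَا | kātíbuna̱ | كَاتِبٌ | M | S | NOM |
| كَاتِبُكُمْ | kātíbukum | كَاتِبٌ | M | S | NOM |
| كَاتِبُكُنَّ | kātibukúnna | كَاتِبٌ | M | S | NOM |
| كَاتِبُكُمَا | kātibúkuma̱ | كَاتِبٌ | M | S | NOM |
| كَاتِبُهُمْ | kātíbuhum | كَاتِبٌ | M | S | NOM |
| كَاتِبُهُنَّ | kātibuhúnna | كَاتِبٌ | M | S | NOM |
| كَاتِبُهُمَا | kātibúhuma̱ | كَاتِبٌ | M | S | NOM |
| كَاتِبُهُمَا | kātibúhuma̱ | كَاتِبٌ | M | S | NOM |
| كَاتِبٍ | kā́tibin | كَاتِبٌ | M | S | GEN |
| كَاتِبِ | kā́tibi | كَاتِبٌ | M | S | GEN |
| كَاتِبِي | kā́tibi̱ | كَاتِبٌ | M | S | GEN |
| كَاتِبِكَ | kātíbika | كَاتِبٌ | M | S | GEN |
| كَاتِبِكِ | kātíbiki | كَاتِبٌ | M | S | GEN |
| كَاتِبِهِ | kātíbihi | كَاتِبٌ | M | S | GEN |
| كَاتِبِهَا | kātíbiha̱ | كَاتِبٌ | M | S | GEN |
| كَاتِبِنَا | kātíbina̱ | كَاتِبٌ | M | S | GEN |
| كَاتِبِكُمْ | kātíbikum | كَاتِبٌ | M | S | GEN |
| كَاتِبِكُنَّ | kātibikúnna | كَاتِبٌ | M | S | GEN |
| كَاتِبِكُمَا | kātibíkuma̱ | كَاتِبٌ | M | S | GEN |
| كَاتِبِهِمْ | kātíbihim | كَاتِبٌ | M | S | GEN |
| كَاتِبِهِنَّ | kātibihínna | كَاتِبٌ | M | S | GEN |
| كَاتِبِهِمَا | kātibíhima̱ | كَاتِبٌ | M | S | GEN |
| كَاتِبِهِمَا | kātibíhima̱ | كَاتِبٌ | M | S | GEN |
| كَاتِبًا | kā́tiban | كَاتِبٌ | M | S | ACU |
| كَاتِبَ | kā́tiba | كَاتِبٌ | M | S | ACU |
| كَاتِبِي | kā́tibi̱ | كَاتِبٌ | M | S | ACU |
| كَاتِبَكَ | kātíbaka | كَاتِبٌ | M | S | ACU |
| كَاتِبَكِ | kātíbaki | كَاتِبٌ | M | S | ACU |
| كَاتِبَهُ | kātíbahu | كَاتِبٌ | M | S | ACU |
| كَاتِبَهَا | kātíbaha̱ | كَاتِبٌ | M | S | ACU |
| كَاتِبَنَا | kātíbana̱ | كَاتِبٌ | M | S | ACU |
| كَاتِبَكُمْ | kātíbakum | كَاتِبٌ | M | S | ACU |
| كَاتِبَكُنَّ | kātibakúnna | كَاتِبٌ | M | S | ACU |
| كَاتِبَكُمَا | kātibákuma̱ | كَاتِبٌ | M | S | ACU |
| كَاتِبَهُمْ | kātíbahum | كَاتِبٌ | M | S | ACU |
| كَاتِبَهُنَّ | kātibahúnna | كَاتِبٌ | M | S | ACU |
| كَاتِبَهُمَا | kātibáhuma̱ | كَاتِبٌ | M | S | ACU |
| كَاتِبَهُمَا | kātibáhuma̱ | كَاتِبٌ | M | S | ACU |
| وَكَاتِبٌ | wakā́tibun | كَاتِبٌ | M | S | NOM |
| وَكَاتِبُ | wakā́tibu | كَاتِبٌ | M | S | NOM |
| وَكَاتِبِي | wakā́tibi̱ | كَاتِبٌ | M | S | NOM |
| وَكَاتِبُكَ | wakātíbuka | كَاتِبٌ | M | S | NOM |
| وَكَاتِبُكِ | wakātíbuki | كَاتِبٌ | M | S | NOM |
| وَكَاتِبُهُ | wakātíbuhu | كَاتِبٌ | M | S | NOM |
| وَكَاتِبُهَا | wakātíbuha̱ | كَاتِبٌ | M | S | NOM |
| وَكَاتِبُنَا | wakātíbuna̱ | كَاتِبٌ | M | S | NOM |
| وَكَاتِبُكُمْ | wakātíbukum | كَاتِبٌ | M | S | NOM |
| وَكَاتِبُكُنَّ | wakātibukúnna | كَاتِبٌ | M | S | NOM |
| وَكَاتِبُكُمَا | wakātibúkuma̱ | كَاتِبٌ | M | S | NOM |
| وَكَاتِبُهُمْ | wakātíbuhum | كَاتِبٌ | M | S | NOM |
| وَكَاتِبُهُنَّ | wakātibuhúnna | كَاتِبٌ | M | S | NOM |
| وَكَاتِبُهُمَا | wakātibúhuma̱ | كَاتِبٌ | M | S | NOM |
| وَكَاتِبُهُمَا | wakātibúhuma̱ | كَاتِبٌ | M | S | NOM |
| وَكَاتِبٍ | wakā́tibin | كَاتِبٌ | M | S | GEN |
| وَكَاتِبِ | wakā́tibi | كَاتِبٌ | M | S | GEN |
| وَكَاتِبِي | wakā́tibi̱ | كَاتِبٌ | M | S | GEN |
| وَكَاتِبِكَ | wakātíbika | كَاتِبٌ | M | S | GEN |
| وَكَاتِبِكِ | wakātíbiki | كَاتِبٌ | M | S | GEN |
| وَكَاتِبِهِ | wakātíbihi | كَاتِبٌ | M | S | GEN |
| وَكَاتِبِهَا | wakātíbiha̱ | كَاتِبٌ | M | S | GEN |
| وَكَاتِبِنَا | wakātíbina̱ | كَاتِبٌ | M | S | GEN |
| وَكَاتِبِكُمْ | wakātíbikum | كَاتِبٌ | M | S | GEN |
| وَكَاتِبِكُنَّ | wakātibikúnna | كَاتِبٌ | M | S | GEN |
| وَكَاتِبِكُمَا | wakātibíkuma̱ | كَاتِبٌ | M | S | GEN |
| وَكَاتِبِهِمْ | wakātíbihim | كَاتِبٌ | M | S | GEN |
| وَكَاتِبِهِنَّ | wakātibihínna | كَاتِبٌ | M | S | GEN |
| وَكَاتِبِهِمَا | wakātibíhima̱ | كَاتِبٌ | M | S | GEN |
| وَكَاتِبِهِمَا | wakātibíhima̱ | كَاتِبٌ | M | S | GEN |
| وَكَاتِبًا | wakā́tiban | كَاتِبٌ | M | S | ACU |
| وَكَاتِبَ | wakā́tiba | كَاتِبٌ | M | S | ACU |
| وَكَاتِبِي | wakā́tibi̱ | كَاتِبٌ | M | S | ACU |
| وَكَاتِبَكَ | wakātíbaka | كَاتِبٌ | M | S | ACU |
| وَكَاتِبَكِ | wakātíbaki | كَاتِبٌ | M | S | ACU |
| وَكَاتِبَهُ | wakātíbahu | كَاتِبٌ | M | S | ACU |
| وَكَاتِبَهَا | wakātíbaha̱ | كَاتِبٌ | M | S | ACU |
| وَكَاتِبَنَا | wakātíbana̱ | كَاتِبٌ | M | S | ACU |
| وَكَاتِبَكُمْ | wakātíbakum | كَاتِبٌ | M | S | ACU |
| وَكَاتِبَكُنَّ | wakātibakúnna | كَاتِبٌ | M | S | ACU |
| وَكَاتِبَكُمَا | wakātibákuma̱ | كَاتِبٌ | M | S | ACU |
| وَكَاتِبَهُمْ | wakātíbahum | كَاتِبٌ | M | S | ACU |
| وَكَاتِبَهُنَّ | wakātibahúnna | كَاتِبٌ | M | S | ACU |
| وَكَاتِبَهُمَا | wakātibáhuma̱ | كَاتِبٌ | M | S | ACU |
| وَكَاتِبَهُمَا | wakātibáhuma̱ | كَاتِبٌ | M | S | ACU |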
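To illustrate how rows like those in the sample above might be consumed, here is a minimal Python sketch. It is not ArabLEX's actual distribution format or API: the `Entry` type, the in-memory `ENTRIES` list, and the `analyze` and `forms_for` helpers are all hypothetical names chosen for this example. Only the five rows themselves are taken from the sample.

```python
from collections import defaultdict
from typing import NamedTuple


class Entry(NamedTuple):
    """One full-form record, using the six columns shown in the sample."""
    arabic: str
    roman: str
    lemma: str
    gender: str
    number: str
    case: str


# Five rows copied from the sample; a real lexicon would be loaded from a file.
ENTRIES = [
    Entry("كَاتِبٌ", "kā́tibun", "كَاتِبٌ", "M", "S", "NOM"),
    Entry("كَاتِبُكَ", "kātíbuka", "كَاتِبٌ", "M", "S", "NOM"),
    Entry("كَاتِبٍ", "kā́tibin", "كَاتِبٌ", "M", "S", "GEN"),
    Entry("كَاتِبًا", "kā́tiban", "كَاتِبٌ", "M", "S", "ACU"),
    Entry("وَكَاتِبٌ", "wakā́tibun", "كَاتِبٌ", "M", "S", "NOM"),
]

# Index by surface form (for analysis) and by lemma + case (for generation).
BY_FORM = defaultdict(list)
BY_LEMMA_CASE = defaultdict(list)
for entry in ENTRIES:
    BY_FORM[entry.arabic].append(entry)
    BY_LEMMA_CASE[(entry.lemma, entry.case)].append(entry.arabic)


def analyze(form: str) -> list:
    """Return every record whose surface form matches exactly."""
    return BY_FORM.get(form, [])


def forms_for(lemma: str, case: str) -> list:
    """Return all listed surface forms of a lemma in the given case."""
    return BY_LEMMA_CASE.get((lemma, case), [])
```

Because every inflected form is stored explicitly, both directions (surface form to lemma, and lemma plus features to surface forms) reduce to dictionary lookups in this sketch.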

Practical Applications

CJKI’s full-form lexicons can bring the following benefits to various NLP applications:

Machine translation

Greatly enhanced translation quality

Morphological analysis

Significantly simplified algorithms

Pedagogical applications

Automatic conjugation systems

Named-entity recognition (NER)

Dramatically improved entity recognition
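The claim of simplified morphological analysis can be pictured as replacing rule-based stemming with a table lookup. The Python sketch below rests on assumptions that are not stated in the source: that input text is unvocalized (as most everyday Arabic text is), and that matching is done by stripping the diacritics in the Unicode range U+064B to U+0652 from the lexicon's vocalized forms. The names `strip_diacritics`, `FORMS`, and `UNVOCALIZED_INDEX` are hypothetical; the three forms are taken from the sample table.

```python
import re
from collections import defaultdict

# Tanwin, short vowels, shadda and sukun occupy U+064B..U+0652.
DIACRITICS = re.compile("[\u064B-\u0652]")


def strip_diacritics(text: str) -> str:
    """Remove Arabic vowel marks so vocalized and plain spellings compare equal."""
    return DIACRITICS.sub("", text)


# (vocalized form, romanization, case) triples from the sample table.
FORMS = [
    ("كَاتِبُكَ", "kātíbuka", "NOM"),
    ("كَاتِبِكَ", "kātíbika", "GEN"),
    ("كَاتِبَكَ", "kātíbaka", "ACU"),
]

# Map each unvocalized spelling to every vocalized analysis it could stand for.
UNVOCALIZED_INDEX = defaultdict(list)
for arabic, roman, case in FORMS:
    UNVOCALIZED_INDEX[strip_diacritics(arabic)].append((arabic, roman, case))


def candidates(token: str) -> list:
    """Look up a token, vocalized or not, and return all candidate analyses."""
    return UNVOCALIZED_INDEX.get(strip_diacritics(token), [])
```

In this sketch all three vocalized forms collapse to the same unvocalized spelling, so a lookup returns three candidate analyses. Choosing among them would need context and is outside the scope of the example.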

Related Resources

DiaLex

Arabic Dialects Full-Form Lexicon

Full-form lexicon for all major Arabic dialects

JFULEX

Japanese Full-Form Lexicon

Includes all inflected, declined and conjugated forms

AWL

Arabic Wordlist

General vocabulary, proper nouns and technical terms