Arabic Wordlist

Over 200,000 entries

Covers proper nouns and technical terms

Ideal for speech technology

Overview

CJKI maintains comprehensive monolingual wordlists for Chinese, Japanese, Korean (CJK), Arabic, and Spanish, covering nearly 15 million entries in total.

Our Arabic Wordlist (AWL) covers over 200,000 canonical forms for general vocabulary and proper nouns and includes romanized readings, part-of-speech codes, and semantic classification type codes.

Used by some of the world’s leading IT companies, this database is suitable for a variety of NLP applications, including information retrieval (such as search engines), morphological analysis (such as tokenizers), and speech technology (such as text-to-speech synthesis). The related ArabLEX (Arabic Full-Form Lexicon), currently under construction, is expected to exceed 530 million entries.

Arabic Wordlist

Type	POS	Unvocalized	Vocalized	Transcription
P	NP	مشاضي	مَشَاضِي	mashā́ḍi̱
P	NP	مشادي	مَشَادِي	mashā́di̱
P	NP	مشاق	مِشَاق	mishā́q
G	A	مشارك	مُشَارِك	mushā́rik
P	NP	مشايطة	مَشَايْطَة	mashā́yṭa
G	NC	مشعل	مِشْعَل	míshɛal
G	A	مشهر	مُشَهَّر	musháhhar
P	NP	مشهور	مَشْهُور	mashhū́r
P	NP	مشناء	مِشْنَاء	mishnā́ʾ
G	A	مشرح	مُشَرَّح	mushárraḥ
G	NC	مشترك	مُشْتَرِك	múshtarik
G	A	مؤات	مُؤَاتٍ	muʾā́tin
G	A	مؤسل	مُؤَسَّل	muʾással
P	NP	مذبان	مُذْبَان	mudhbā́n
G	NC	مذبح	مَذْبَح	mádhbaḥ
P	NP	مذي	مِذِي	mídhi̱
G	NC	مذيع	مُذِيع	mudhī́ɛ
P	NP	ماصي	مَاصِّيّ	ma̱ṣṣíy̱
P	NP	ماله الله	مَالِه اَللّٰه	mā́lih ʾallā́h
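
In the sample rows above, each unvocalized form appears to be the vocalized form with its diacritics removed. The following Python sketch illustrates that relationship, which matters for applications such as search, where queries are usually typed without vowel marks. It assumes that dropping Unicode combining marks (category Mn) is enough to recover the unvocalized form; this holds for the sample shown but is not a documented property of the full database. The function names `strip_diacritics` and `is_consistent` are hypothetical and not part of any CJKI tool.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Drop combining marks (Unicode category Mn).

    For Arabic this removes the short-vowel signs, shadda, sukun,
    tanwin, and the superscript alef, and leaves base letters intact.
    """
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


def is_consistent(unvocalized: str, vocalized: str) -> bool:
    """Check that an entry's two Arabic fields agree once marks are removed."""
    return strip_diacritics(vocalized) == unvocalized


# A few (unvocalized, vocalized) pairs copied from the sample table.
SAMPLE = [
    ("مشارك", "مُشَارِك"),
    ("مشهر", "مُشَهَّر"),
    ("ماله الله", "مَالِه اَللّٰه"),
]

if __name__ == "__main__":
    for unvocalized, vocalized in SAMPLE:
        print(unvocalized, is_consistent(unvocalized, vocalized))
```

The text is deliberately not normalized to NFD before filtering: decomposition would split precomposed letters such as ؤ into a base letter plus a combining hamza, and the filter would then remove the hamza as well.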

Practical Applications

CJKI’s comprehensive wordlists are used by some of the world’s leading IT companies for a variety of natural language processing applications, including:

Information retrieval

Word segmentation

Morphological analysis

Related Resources

CWL

Chinese Wordlist

General vocabulary, proper nouns, and technical terms

JWL

Japanese Wordlist

General vocabulary, proper nouns, and technical terms

KWL

Korean Wordlist

General vocabulary, proper nouns, and technical terms