Chinese Lexical Database

Covers over 500,000 entries

Simplified and Traditional Chinese

Optimized for NLP applications

Overview

The CJKI Chinese Lexical Database (CLD) is a comprehensive monolingual lexical database specifically designed for NLP applications. It consists of two modules, Simplified Chinese (SC) and Traditional Chinese (TC), with about 250,000 entries in each module covering general vocabulary, technical terms, and important proper nouns.

Main Features

Phonological information, such as pinyin, zhuyin, and IPA

Semantic classification codes, such as type of proper noun

Grammatical information, such as POS and adjacency attributes

Simplified Chinese Lexical Database
POS | SC | Pinyin
NC | 东家之子 | dōngjiāzhīzǐ
E | 东家效颦 | dōngjiāxiàopín
NP | 东架松 | dōngjiàsōng
NP | 东河 | dōnghé
NP | 东河 | dōnghé
NP | 东河镇 | dōnghézhèn
NP | 东河沿 | dōnghéyán
NP | 东河区 | dōnghéqū
NP | 东河漕胡同 | dōnghécáo hútóng
NP | 东河道 | dōnghédào
NP | 东花 | dōnghuā
NP | 东花厅胡同 | dōnghuātīng hútóng
NP | 东花枝胡同 | dōnghuāzhī hútóng
NP | 东霞 | dōngxiá
NP | 东会村 | dōnghuìcūn
NC | 东海 | dōnghǎi
NP | 东海 | dōnghǎi
NP | 东海 | dōnghǎi
NP | 东海县 | dōnghǎixiàn
E | 东海扬尘 | dōnghǎiyángchén
E | 东海捞针 | dōnghǎilāozhēn
U | 东海舰队 | dōnghǎijiànduì
E | 东海桑田 | dōnghǎisāngtián
NP | 东海大学 | dōnghǎidàxué
NP | 东外大街 | dōngwàidàjiē
NC | 东郭 | dōngguō
NP | 东郭 | dōngguō
E | 东郭先生 | dōngguōxiānshēng
NC | 东郭履 | dōngguōlǚ
NP | 东革新里 | dōnggéxīnlǐ
NC | 东岳 | dōngyuè
NP | 东岳 | dōngyuè
NP | 东冠英胡同 | dōngguānyīng hútóng
NP | 东官房胡同 | dōngguānfáng hútóng
NC | 东干 | dōnggān
NP | 东管头 | dōngguǎntóu
NP | 东管头前街 | dōngguǎntóuqiánjiē
NP | 东莞 | dōngguān
NP | 东莞市 | dōngguānshì
NC | 东岸 | dōngàn
NP | 东岩 | dōngyán
NP | 东喜 | dōngxǐ
NP | 东旗 | dōngqí
NP | 东起 | dōngqǐ
NP | 东吉 | dōngjí
NP | 东吉祥胡同 | dōngjíxiáng hútóng
NP | 东弓匠胡同 | dōnggōngjiàng hútóng
NP | 东旧帘子胡同 | dōngjiùliánzǐ hútóng
NP | 东牛角胡同 | dōngniújiǎo hútóng
NP | 东京 | dōngjīng
NP | 东京影展 | dōngjīngyǐngzhǎn
NP | 东京畿道 | dōngjīngjīdào
NC | 东京股市 | dōngjīnggǔshì
NP | 东京大学 | dōngjīngdàxué
NP | 东京都 | dōngjīngdū
NP | 东京湾 | dōngjīngwān
NP | 东教场胡同 | dōngjiāocháng hútóng
NP | 东教胡同 | dōngjiāo hútóng
NP | 东局村 | dōngjúcūn
NP | 东玉 | dōngyù
NP | 东玉河 | dōngyùhé
NP | 东琴 | dōngqín
NP | 东琴科 | dōngqínkē
NP | 东区 | dōngqū
NC | 东隅 | dōngyú
NC | 东君 | dōngjūn
NP | 东慧 | dōnghuì
NP | 东月 | dōngyuè
NP | 东健 | dōngjiàn
NP | 东源 | dōngyuán
NP | 东源县 | dōngyuánxiàn
NP | 东湖 | dōnghú
NP | 东湖渠 | dōnghúqú
NP | 东湖区 | dōnghúqū
NC | 东胡 | dōnghú
N | 东胡史 | dōnghúshǐ
NP | 东交民巷 | dōngjiāomínxiàng
NP | 东光 | dōngguāng
NP | 东光 | dōngguāng
NP | 东光县 | dōngguāngxiàn
NP | 东光镇 | dōngguāngzhèn
NP | 东光胡同 | dōngguāng hútóng
NP | 东公街 | dōnggōngjiē
NP | 东公文 | dōnggōngwén
NP | 东厚 | dōnghòu
NP | 东口袋胡同 | dōngkǒudài hútóng
NC | 东向 | dōngxiàng
NP | 东向 | dōngxiàng
NP | 东后河沿 | dōnghòuhéyán
NP | 东幸福街 | dōngxìngfújiē
NP | 东康 | dōngkāng
NP | 东江 | dōngjiāng
NP | 东浩 | dōnghào
NP | 东港 | dōnggǎng
NP | 东港区 | dōnggǎngqū
NP | 东港市 | dōnggǎngshì
NC | 东皇 | dōnghuáng
NP | 东皇城根南街 | dōnghuángchénggēnnánjiē
NP | 东皇城根北街 | dōnghuángchénggēnběijiē
NA | 东航 | dōngháng
NP | 东航 | dōngháng
NP | 东航 | dōngháng
U | 东行航程 | dōngxínghángchéng
NC | 东郊 | dōngjiāo
NP | 东香 | dōngxiāng
NP | 东香河园 | dōngxiānghéyuán
NP | 东高地 | dōnggāodì
NP | 东高房胡同 | dōnggāofáng hútóng
NP | 东合 | dōnghé
NP | 东合盛 | dōnghéchéng
NP | 东克尔 | dōngkèěr
NP | 东克尔曼 | dōngkèěrmàn
NP | 东国 | dōngguó
NP | 东根 | dōnggēn
NP | 东佐夫 | dōngzuǒfū
E | 东差西误 | dōngchāxīwù
NP | 东沙岛 | dōngshādǎo
NP | 东沙群岛 | dōngshāqúndǎo
NP | 东塞尔 | dōngsāiěr
NP | 东才 | dōngcái
NC | 东作 | dōngzuò
NP | 东三亲家坟 | dōngsānqīnjiāfén
NP | 东三环中路 | dōngsānhuánzhōnglù
NP | 东三环北路 | dōngsānhuánběilù
NP | 东三巷 | dōngsānxiàng
NC | 东三省 | dōngsānshěng
NP | 东三省事宜条约 | dōngsānshěngshìyítiáoyuē
NP | 东三条 | dōngsāntiáo
NP | 东三道街 | dōngsāndàojiē
NP | 东山 | dōngshān
NP | 东山 | dōngshān
NP | 东山县 | dōngshānxiàn
NP | 东山镇 | dōngshānzhèn
NP | 东山区 | dōngshānqū
E | 东山高卧 | dōngshāngāowò
E | 东山再起 | dōngshānzàiqǐ
E | 东山之志 | dōngshānzhīzhì
NC | 东山法门 | dōngshānfǎmén
NP | 东山坡一里 | dōngshānpōyīlǐ
NP | 东山坡三里 | dōngshānpōsānlǐ
NP | 东山坡二里 | dōngshānpōèrlǐ
NC | 东司 | dōngsī
NP | 东四块玉南街 | dōngsìkuàiyùnánjiē
NP | 东四块玉北街 | dōngsìkuàiyùběijiē
NP | 东四头条 | dōngsìtóutiáo
NP | 东四九条 | dōngsìjiǔtiáo
NP | 东四西大街 | dōngsìxīdàjiē
NP | 东四道街 | dōngsìdàojiē
NP | 东四道口 | dōngsìdàokǒu
NP | 东四南大街 | dōngsìnándàjiē
NP | 东四北大街 | dōngsìběidàjiē
NP | 东子 | dōngzǐ
NC | 东市 | dōngshì
NP | 东市 | dōngshì
NP | 东市场五巷 | dōngshìchángwǔxiàng
NP | 东市区 | dōngshìqū
E | 东市朝衣 | dōngshìcháoyī
NP | 东志远 | dōngzhìyuǎn
NC | 东指 | dōngzhǐ
E | 东支西吾 | dōngzhīxīwú
NP | 东斯 | dōngsī
NP | 东斯科伊 | dōngsīkēyī
E | 东施效颦 | dōngshīxiàopín
NP | 东枝 | dōngzhī
NP | 东至县 | dōngzhìxiàn
NP | 东耳 | dōngěr
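
Each sample row above pairs a POS code, a headword, and a pinyin reading, and some headwords (such as 东海) appear under more than one POS code. The following is a minimal sketch of how such rows might be indexed by headword for lookup. The tuple layout and the `build_index` helper are assumptions made for illustration; they do not describe the format in which CLD is actually delivered.

```python
from collections import defaultdict

# A few rows copied from the sample above, as (POS, headword, pinyin) tuples.
# This in-memory layout is an assumption, not the CLD delivery format.
ROWS = [
    ("NC", "东海", "dōnghǎi"),
    ("NP", "东海", "dōnghǎi"),
    ("NP", "东海县", "dōnghǎixiàn"),
    ("E", "东山再起", "dōngshānzàiqǐ"),
    ("NC", "东郭", "dōngguō"),
    ("NP", "东郭", "dōngguō"),
]


def build_index(rows):
    """Map each headword to the POS codes and pinyin readings listed for it."""
    index = defaultdict(lambda: {"pos": set(), "pinyin": set()})
    for pos, headword, pinyin in rows:
        entry = index[headword]
        # A POS field may hold several comma-separated codes, e.g. "NC,V".
        entry["pos"].update(pos.split(","))
        entry["pinyin"].add(pinyin)
    return dict(index)


index = build_index(ROWS)
print(sorted(index["东海"]["pos"]))  # ['NC', 'NP']
```

Collecting POS codes in a set keeps homographs such as 东海 under a single key while preserving every code attached to them.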

Traditional Chinese Lexical Database
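POS | TC | Zhuyin
A,NC | 博學 | ㄅㄛˊㄒㄩㄝˊ
V | 搏戰 | ㄅㄛˊㄓㄢˋ
NC | 伯仲 | ㄅㄛˊㄓㄨㄥˋ
V | 駁斥 | ㄅㄛˊㄔˋ
V | 泊車 | ㄅㄛˊㄔㄜ
NC | 薄產 | ㄅㄛˊㄔㄢˇ
NC | 駁船 | ㄅㄛˊㄔㄨㄢˊ
NC | 博士 | ㄅㄛˊㄕˋ
U | 博識 | ㄅㄛˊㄕˋ
NC | 博士班 | ㄅㄛˊㄕˋㄅㄢ
NC | 博士論文 | ㄅㄛˊㄕˋㄌㄨㄣˋㄨㄣˊ
NC | 博士後 | ㄅㄛˊㄕˋㄏㄡˋ
NC | 博士學位 | ㄅㄛˊㄕˋㄒㄩㄝˊㄨㄟˋ
NC | 博士生 | ㄅㄛˊㄕˋㄕㄥ
V | 搏殺 | ㄅㄛˊㄕㄚ
U | 薄紗 | ㄅㄛˊㄕㄚ
NC | 帛書 | ㄅㄛˊㄕㄨ
D | 勃然 | ㄅㄛˊㄖㄢˊ
A | 薄弱 | ㄅㄛˊㄖㄨㄛˋ
NC | 薄弱環節 | ㄅㄛˊㄖㄨㄛˋㄏㄨㄢˊㄐㄧㄝˊ
NC | 脖子 | ㄅㄛˊㄗ˙
A | 駁雜 | ㄅㄛˊㄗㄚˊ
NC,V | 薄葬 | ㄅㄛˊㄗㄤˋ
NC,NP | 伯祖 | ㄅㄛˊㄗㄨˇ
NC | 伯祖母 | ㄅㄛˊㄗㄨˇㄇㄨˇ
V | 博采 | ㄅㄛˊㄘㄞˇ
NC | 博彩 | ㄅㄛˊㄘㄞˇ
A | 薄脆 | ㄅㄛˊㄘㄨㄟˋ
NP | 伯斯特 | ㄅㄛˊㄙㄊㄜˋ
NC,NP | 博愛 | ㄅㄛˊㄞˋ
NC | 駁岸 | ㄅㄛˊㄢˋ
NP | 伯恩 | ㄅㄛˊㄣ
NC | 博弈 | ㄅㄛˊㄧˋ
A,NP | 博雅 | ㄅㄛˊㄧㄚˇ
NC | 柏油 | ㄅㄛˊㄧㄡˊ
V | 博引 | ㄅㄛˊㄧㄣˇ
NC | 博物 | ㄅㄛˊㄨˋ
NC | 博物館 | ㄅㄛˊㄨˋㄍㄨㄢˇ
NC | 博物院 | ㄅㄛˊㄨˋㄩㄢˋ
NC | 泊位 | ㄅㄛˊㄨㄟˋ
U | 柏原 | ㄅㄛˊㄩㄢˊ
V | 駁運 | ㄅㄛˊㄩㄣˋ
V |  | ㄅㄛˋ
V | 播報 | ㄅㄛˋㄅㄠˋ
V | 播發 | ㄅㄛˋㄈㄚ
V | 播放 | ㄅㄛˋㄈㄤˋ
V | 播弄 | ㄅㄛˋㄋㄨㄥˋ
NC | 簸籮 | ㄅㄛˋㄌㄨㄛˊ
NC | 薄荷 | ㄅㄛˋㄏㄜˊ
NC | 薄荷 | ㄅㄛˋㄏㄜ˙
V | 擘劃 | ㄅㄛˋㄏㄨㄚˋ
V | 擘畫 | ㄅㄛˋㄏㄨㄚˋ
NC | 簸箕 | ㄅㄛˋㄐㄧ
NC | 簸箕 | ㄅㄛˋㄐㄧ˙
V | 播種 | ㄅㄛˋㄓㄨㄥˇ
V | 播種 | ㄅㄛˋㄓㄨㄥˋ
V | 播送 | ㄅㄛˋㄙㄨㄥˋ
V | 播音 | ㄅㄛˋㄧㄣ
NC | 播音員 | ㄅㄛˋㄧㄣㄩㄢˊ
V | 播映 | ㄅㄛˋㄧㄥˋ
NC | 餑餑 | ㄅㄛㄅㄛ˙
NC | 波譜 | ㄅㄛㄆㄨˇ
V | 撥髮 | ㄅㄛㄈㄚ
NC | 波峰 | ㄅㄛㄈㄥ
NC | 波幅 | ㄅㄛㄈㄨˊ
V | 撥付 | ㄅㄛㄈㄨˋ
U | 撥打 | ㄅㄛㄉㄚˇ
NC | 波導 | ㄅㄛㄉㄠˇ
U | 波導管 | ㄅㄛㄉㄠˇㄍㄨㄢˇ
V | 波蕩 | ㄅㄛㄉㄤˋ
V | 撥電話 | ㄅㄛㄉㄧㄢˋㄏㄨㄚˋ
V | 剝奪 | ㄅㄛㄉㄨㄛˊ
NP | 波多黎各 | ㄅㄛㄉㄨㄛㄌㄧˊㄍㄜˋ
NC | 波段 | ㄅㄛㄉㄨㄢˋ
NC,V | 波動 | ㄅㄛㄉㄨㄥˋ
NP | 波特 | ㄅㄛㄊㄜˋ
NC | 波濤 | ㄅㄛㄊㄠˊ
NC | 缽頭 | ㄅㄛㄊㄡˊ
V | 撥通 | ㄅㄛㄊㄨㄥ
V | 撥弄 | ㄅㄛㄋㄨㄥˋ
V | 撥拉 | ㄅㄛㄌㄚ
NP | 波蘭 | ㄅㄛㄌㄢˊ
NC | 波瀾 | ㄅㄛㄌㄢˊ
NC | 波浪 | ㄅㄛㄌㄤˋ
NC | 撥浪鼓 | ㄅㄛㄌㄤˋㄍㄨˇ
NC | 撥浪鼓 | ㄅㄛㄌㄤ˙ㄍㄨˇ
U | 波浪鼓 | ㄅㄛㄌㄤ˙ㄍㄨˇ
V | 剝離 | ㄅㄛㄌㄧˊ
NC | 玻璃 | ㄅㄛㄌㄧˊ
NC | 玻璃紙 | ㄅㄛㄌㄧˊㄓˇ
NC | 玻璃磚 | ㄅㄛㄌㄧˊㄓㄨㄢ
NC | 玻璃絲 | ㄅㄛㄌㄧˊㄙ
NP | 玻利維亞 | ㄅㄛㄌㄧˋㄨㄟˊㄧㄚˋ
NC | 玻璃 | ㄅㄛㄌㄧ˙
N | 玻璃體 | ㄅㄛㄌㄧ˙ㄊㄧˇ
NC | 玻璃鋼 | ㄅㄛㄌㄧ˙ㄍㄤ
NC | 玻璃纖維 | ㄅㄛㄌㄧ˙ㄒㄧㄢㄨㄟˊ
NC | 玻璃紙 | ㄅㄛㄌㄧ˙ㄓˇ
NC | 玻璃磚 | ㄅㄛㄌㄧ˙ㄓㄨㄢ
NC | 玻璃絲 | ㄅㄛㄌㄧ˙ㄙ
NC | 波羅 | ㄅㄛㄌㄨㄛˊ
NC | 菠蘿蜜 | ㄅㄛㄌㄨㄛˊㄇㄧˋ
NC | 波羅蜜 | ㄅㄛㄌㄨㄛˊㄇㄧˋ
V | 剝落 | ㄅㄛㄌㄨㄛˋ
NP | 波哥大 | ㄅㄛㄍㄜㄉㄚˋ
V | 撥給 | ㄅㄛㄍㄟˇ
NC | 波谷 | ㄅㄛㄍㄨˇ
V | 撥開 | ㄅㄛㄎㄞ
NC,V | 撥款 | ㄅㄛㄎㄨㄢˇ
V | 撥號 | ㄅㄛㄏㄠˋ
V | 波及 | ㄅㄛㄐㄧˊ
V | 播講 | ㄅㄛㄐㄧㄤˇ
NC | 波形 | ㄅㄛㄒㄧㄥˊ
V | 剝削 | ㄅㄛㄒㄩㄝˋ
NC | 剝削階級 | ㄅㄛㄒㄩㄝˋㄐㄧㄝㄐㄧˊ
NC | 波折 | ㄅㄛㄓㄜˊ
M | 剝啄 | ㄅㄛㄓㄨㄛˊ
V | 播種 | ㄅㄛㄓㄨㄥˇ
NC | 波長 | ㄅㄛㄔㄤˊ
V | 播出 | ㄅㄛㄔㄨ
V | 撥出 | ㄅㄛㄔㄨ
V | 剝蝕 | ㄅㄛㄕˊ
NP | 波士頓 | ㄅㄛㄕˋㄉㄨㄣˋ
NC | 波束 | ㄅㄛㄕㄨˋ
D,V | 撥冗 | ㄅㄛㄖㄨㄥˇ
NC | 缽子 | ㄅㄛㄗ˙
NC | 撥子 | ㄅㄛㄗ˙
NC | 菠菜 | ㄅㄛㄘㄞˋ
V | 播撒 | ㄅㄛㄙㄚˇ
NC | 波爾卡 | ㄅㄛㄦˇㄎㄚˇ
NC,NP | 波音 | ㄅㄛㄧㄣ
NC | 波紋 | ㄅㄛㄨㄣˊ
NC | 缽盂 | ㄅㄛㄩˊ
NC | 波源 | ㄅㄛㄩㄢˊ
V |  | ㄅㄞ
A | 百倍 | ㄅㄞˇㄅㄟˋ
NC | 百寶箱 | ㄅㄞˇㄅㄠˇㄒㄧㄤ
D | 百般 | ㄅㄞˇㄅㄢ
NC | 百病 | ㄅㄞˇㄅㄧㄥˋ
V | 擺佈 | ㄅㄞˇㄅㄨˋ
V | 擺平 | ㄅㄞˇㄆㄧㄥˊ
U | 百米 | ㄅㄞˇㄇㄧˇ
NP | 百慕達 | ㄅㄞˇㄇㄨˋㄉㄚˊ
NC,NN,OC | 百分 | ㄅㄞˇㄈㄣ
NC | 百分比 | ㄅㄞˇㄈㄣㄅㄧˇ
NC | 百分表 | ㄅㄞˇㄈㄣㄅㄧㄠˇ
N | 百分點 | ㄅㄞˇㄈㄣㄉㄧㄢˇ
NC | 百分率 | ㄅㄞˇㄈㄣㄌㄩˋ
NC | 百分號 | ㄅㄞˇㄈㄣㄏㄠˊ
NC | 百分號 | ㄅㄞˇㄈㄣㄏㄠˋ
NC | 百分制 | ㄅㄞˇㄈㄣㄓˋ
D | 百分之百 | ㄅㄞˇㄈㄣㄓㄅㄞˇ
U | 百分尺 | ㄅㄞˇㄈㄣㄔˇ
V | 擺放 | ㄅㄞˇㄈㄤˋ
NC | 百代 | ㄅㄞˇㄉㄞˋ
V | 擺地攤 | ㄅㄞˇㄉㄧˋㄊㄢ
V | 擺渡 | ㄅㄞˇㄉㄨˋ
V | 擺動 | ㄅㄞˇㄉㄨㄥˋ
V | 擺攤子 | ㄅㄞˇㄊㄢㄗ˙
V | 擺脫 | ㄅㄞˇㄊㄨㄛ
NC | 百衲本 | ㄅㄞˇㄋㄚˋㄅㄣˇ
NC | 百衲衣 | ㄅㄞˇㄋㄚˋㄧ
NC,NN | 百年 | ㄅㄞˇㄋㄧㄢˊ
NC | 百年大計 | ㄅㄞˇㄋㄧㄢˊㄉㄚˋㄐㄧˋ
V | 擺弄 | ㄅㄞˇㄋㄨㄥ
A | 百樂 | ㄅㄞˇㄌㄜˋ
V | 擺擂台 | ㄅㄞˇㄌㄟˋㄊㄞˊ
NC | 百里 | ㄅㄞˇㄌㄧˇ
NC | 百事 | ㄅㄞˇㄕˋ
NC | 百事通 | ㄅㄞˇㄕˋㄊㄨㄥ
V | 擺設 | ㄅㄞˇㄕㄜˋ
V | 擺手 | ㄅㄞˇㄕㄡˇ
NC | 百日 | ㄅㄞˇㄖˋ
NC | 百日咳 | ㄅㄞˇㄖˋㄎㄜˊ
NC | 百日維新 | ㄅㄞˇㄖˋㄨㄟˊㄒㄧㄣ
NC | 擺子 | ㄅㄞˇㄗ˙
U | 百足之蟲死而不僵 | ㄅㄞˇㄗㄨˊㄓㄔㄨㄥˊㄙˇㄦˊㄅㄨˋㄐㄧㄤ
NC | 百草 | ㄅㄞˇㄘㄠˇ
NC | 百歲 | ㄅㄞˇㄙㄨㄟˋ
NC | 百葉 | ㄅㄞˇㄧㄝˋ
U | 百業 | ㄅㄞˇㄧㄝˋ
NC | 百葉箱 | ㄅㄞˇㄧㄝˋㄒㄧㄤ
NC | 百葉窗 | ㄅㄞˇㄧㄝˋㄔㄨㄤ
NC | 百物 | ㄅㄞˇㄨˋ
U | 百位 | ㄅㄞˇㄨㄟˋ
NN | 百萬 | ㄅㄞˇㄨㄢˋ
NC | 百萬富翁 | ㄅㄞˇㄨㄢˋㄈㄨˋㄨㄥ
D | 白白 | ㄅㄞˊㄅㄞˊ
NC | 白報紙 | ㄅㄞˊㄅㄠˋㄓˇ
NC | 白班 | ㄅㄞˊㄅㄢ
NC | 白斑 | ㄅㄞˊㄅㄢ
NC | 白板 | ㄅㄞˊㄅㄢˇ
U | 白版 | ㄅㄞˊㄅㄢˇ
U | 白榜 | ㄅㄞˊㄅㄤˇ
NC | 白皮書 | ㄅㄞˊㄆㄧˊㄕㄨ
U | 白票 | ㄅㄞˊㄆㄧㄠˋ
NC,NP | 白馬 | ㄅㄞˊㄇㄚˇ
NC | 白馬王子 | ㄅㄞˊㄇㄚˇㄨㄤˊㄗˇ
NC | 白煤 | ㄅㄞˊㄇㄟˊ
NC | 白茅 | ㄅㄞˊㄇㄠˊ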

Practical Applications

Major IT companies use CLD to enhance their Chinese morphological analysis technology. It is especially suitable for natural language processing (NLP) applications such as:

Segmentation and tokenization

Named-entity recognition

Input method editors

Morphological analysis

Information retrieval

Part-of-speech tagging
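
To illustrate how a lexicon of this kind can support segmentation, here is a minimal sketch of forward maximum matching, a common dictionary-based baseline. It is not presented as the method used by CJKI or its customers; the `segment` function and the small `LEXICON` set (headwords taken from the Simplified Chinese sample) are assumptions for illustration only.

```python
def segment(text, lexicon, max_len=None):
    """Greedy forward maximum matching: at each position, take the longest
    lexicon entry that matches, falling back to a single character."""
    if max_len is None:
        max_len = max((len(word) for word in lexicon), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical mini-lexicon built from headwords in the sample data.
LEXICON = {"东京", "东京大学", "东海", "东海大学", "东山再起"}

print(segment("东京大学和东海大学", LEXICON))
# ['东京大学', '和', '东海大学']
```

Because the longest match wins, 东京大学 is kept as one token rather than being split into 东京 and the remaining characters. Broad coverage of proper nouns in the lexicon is what makes this kind of matching useful in practice.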

Related Resources

To derive the full benefit of CLD, use it in conjunction with the following resources:

Japanese Lexical Database: Monolingual general vocabulary for NLP applications

Korean Lexical Database: Monolingual general vocabulary for NLP applications

Chinese Hanyu Pinyin Database: Accurate hanyu pinyin data including technical terms and proper nouns