Principal CJK Lexical Resources


©2001-2008 The CJK Dictionary Institute, Inc.


The CJK Dictionary Institute (CJKI), which specializes in CJK computational lexicography, continuously expands a comprehensive CJK lexical database called DESK. DESK currently has over two million Japanese entries and roughly 1.3 million entries each for Simplified Chinese and Traditional Chinese, and it includes a rich set of grammatical and semantic attributes required for developing information retrieval applications, input method editors, and electronic dictionaries.



Chinese Lexical Database
Description   Simplified Chinese   Traditional Chinese
   General vocabulary 250,000   250,000  
   Companies and organizations 50,000   50,000  
   Personal names 650,000   650,000  
   Place names 170,000   170,000  
   Famous people's names 60,000   -
   Computer terminology 45,000   45,000  
   Single characters 18,000   14,000  
   Others 120,000   120,000  
Total 1,363,000   1,299,000  



Japanese Lexical Database
Description   Entries
   General vocabulary 390,000  
   Katakana loanwords 50,000  
   Companies and organizations 600,000  
   Personal names 570,000  
   Place names 90,000  
   Famous people's names 20,000  
   Computer terminology 50,000  
   Technical terminology 250,000  
   Single characters 17,000  
   Orthographical variants 80,000  
Total 2,117,000
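
As a quick consistency check, the category counts in the tables above can be summed and compared with the stated totals. The following is a minimal Python sketch; the dictionary and function names are illustrative assumptions and are not part of DESK or any CJKI interface.

```python
# Category counts copied from the tables above.
# All names here are hypothetical and chosen only for illustration.
simplified_chinese = {
    "General vocabulary": 250_000,
    "Companies and organizations": 50_000,
    "Personal names": 650_000,
    "Place names": 170_000,
    "Famous people's names": 60_000,
    "Computer terminology": 45_000,
    "Single characters": 18_000,
    "Others": 120_000,
}

# The Traditional Chinese column has no figure for famous people's names,
# so that category is simply left out here.
traditional_chinese = {
    "General vocabulary": 250_000,
    "Companies and organizations": 50_000,
    "Personal names": 650_000,
    "Place names": 170_000,
    "Computer terminology": 45_000,
    "Single characters": 14_000,
    "Others": 120_000,
}

japanese = {
    "General vocabulary": 390_000,
    "Katakana loanwords": 50_000,
    "Companies and organizations": 600_000,
    "Personal names": 570_000,
    "Place names": 90_000,
    "Famous people's names": 20_000,
    "Computer terminology": 50_000,
    "Technical terminology": 250_000,
    "Single characters": 17_000,
    "Orthographical variants": 80_000,
}


def total_entries(counts):
    """Return the sum of all category counts in one table column."""
    return sum(counts.values())


if __name__ == "__main__":
    for label, counts in [
        ("Simplified Chinese", simplified_chinese),
        ("Traditional Chinese", traditional_chinese),
        ("Japanese", japanese),
    ]:
        print(f"{label}: {total_entries(counts):,}")
```

Under these figures, the sums agree with the totals printed in the tables.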