| Overview | We maintain the world's largest Chinese-English bilingual bidirectional databases of proper nouns, of which the following data sample is a subset focusing on Chinese and non-Chinese place names. Our proper noun databases are used by some of the world's major IT companies for a wide variety of applications, such as:
See also our Comprehensive Database of Chinese Name Variants, which contains hundreds of thousands of name variants. |
|---|---|
| Coverage and Fields | Our Chinese - English - Chinese database of Chinese and non-Chinese place names place names is very comprehensive. Each entry includes various attributes such as readings in pinyin, zhuyin fuhao, Cantonese and several romanization systems, semantic classification codes and frequency rankings, locale codes, and other useful information such as frequency statistics, described in chinfreq.htm (but not shown here). Please contact Jack Halpern ( |
| Format and Encoding |
The data in any desired format, such as plain text files with fields delimited by tabs, Excel, Access, html, etc. in any encoding, such as UTF-8, UCS-2, EUC, GB-2312 and Big Five (the sample below is encoded in UTF-8). |
| Field Number |
Field Name |
Description |
|---|---|---|
| 1 | ID | The prefix "N" stands for "Name" |
| 2 | ENGLISH | |
| 3 | SC | Name in Simplified Chinese |
| 4 | L/O | An example of orthographic conversion, marked "O" in the table below, is 头发 converted to 頭髮, and 出发 to 出發. An example of lexemic conversion, marked "L" in the table below, is 'laser', translated to 激光 in SC but to 雷射 in TC. For details, see Jack Halpern's paper The Pitfalls and Complexities of Chinese to Chinese Conversion, presented at several international conferences. |
| 5 | TC | Name in Traditional Chinese |
| 6 | SC_PIN | Pinyin plus tone with hyphen separating syllables |
| 7 | ZHUYIN | Name rendered in Zhuyin fuhao transcription system |
| ID | ENGLISH | SC | L/O | TC | SC_PIN | ZHUYIN |
|---|---|---|---|---|---|---|
| N002657 | Aruba | 阿鲁巴 | L | 阿盧巴 | a1-lu3-ba1 | ㄚㄌㄨˊㄅㄚ |
| N001635 | Azerbaijan | 阿塞拜疆 | L | 亞塞拜然 | a1-sai1-bai4-jiang1 | ㄧㄚˋㄙㄜˋㄅㄞˋㄖㄢˊ |
| N081006 | Brasilia | 巴西利亚 | O | 巴西利亞 | ba1-xi1-li4-ya4 | ㄅㄚㄒㄧㄌㄧˋㄧㄚˋ |
| N016658 | Caracas | 加拉加斯 | L | 卡拉卡斯 | jia1-la1-jia1-si1 | ㄎㄚˇㄌㄚㄎㄚˇㄙ |
| N014214 | Cairo | 开罗 | O | 開羅 | kai1-luo2 | ㄎㄞㄌㄨㄛˊ |
| N058842 | Chad | 乍得 | L | 查德 | zha4-de2 | ㄔㄚˊㄉㄜˊ |
| N087916 | Dongyang City | 东阳市 | O | 東陽市 | dong1-yang2-shi4 | ㄉㄨㄥㄧㄤˊㄕˋ |
| N078960 | Fukuoka | 福冈 | O | 福岡 | fu2-gang1 | ㄈㄨˊㄍㄤ |
| N047517 | Georgia | 乔治亚 | O | 喬治亞 | qiao2-zhi4-ya4 | ㄑㄧㄠˊㄓˋㄧㄚˋ |
| N023778 | Guinea | 几内亚 | O | 幾內亞 | ji3-nei4-ya4 | ㄐㄧˇㄋㄟˋㄧㄚˋ |
| N031561 | Haiyan | 海盐 | O | 海鹽 | hai3-yan2 | ㄏㄞˇㄧㄢˊ |
| N036150 | Hanyang | 汉阳 | O | 漢陽 | han4-yang2 | ㄏㄢˋㄧㄤˊ |
| N032756 | Heshan | 鹤山 | O | 鶴山 | he4-shan1 | ㄏㄜˋㄕㄢ |
| N032307 | Huailai | 怀来 | O | 懷來 | huai2-lai2 | ㄏㄨㄞˊㄌㄞˊ |
| N000617 | Ireland | 爱尔兰 | O | 愛爾蘭 | ai4-er3-lan2 | ㄞˋㄌㄧㄣˊ |
| N052916 | Jiangning County | 江宁县 | O | 江寧縣 | jiang1-ning2-xian4 | ㄐㄧㄤㄋㄧㄥˊㄒㄧㄢˋ |
| N125824 | Longjing | 龙井 | O | 龍井 | long2-jing3 | ㄌㄨㄥˊㄐㄧㄥˇ |
| N068134 | New Zealand | 新西兰 | L | 紐西蘭 | xin1-xi1-lan2 | ㄋㄧㄡˇㄒㄧㄌㄢˊ |
| N057306 | Sanya City | 三亚市 | O | 三亞市 | san1-ya4-shi4 | ㄙㄢㄧㄚˋㄕˋ |
| N36301 | Seoul | 首尔 | O | 首爾 | shou3-er3 | ㄕㄡˇㄦˇ |
| N054474 | Seoul | 汉城 | O | 漢城 | han4-cheng2 | ㄏㄢˋㄔㄥˊ |
| N077920 | Taierzhuang District | 台儿庄区 | O | 台兒莊區 | tai2-er2-zhuang1-qu1 | ㄊㄞˊㄦˊㄓㄨㄤㄑㄩ |
| N062125 | Tel Aviv | 特拉维夫 | O | 特拉維夫 | te4-la1-wei2-fu1 | ㄊㄜˋㄌㄚㄨㄟˊㄈㄨ |
| N061921 | Xiaoshan City | 萧山市 | O | 蕭山市 | xiao1-shan1-shi4 | ㄒㄧㄠㄕㄢㄕˋ |
| N004005 | Yemen | 也门 | L | 葉門 | ye3-men2 | ㄧㄝˋㄇㄣˊ |
| N042437 | Yibin | 宜宾 | O | 宜賓 | yi2-bin1 | ㄧˊㄅㄧㄣ |