Japanese Personal Name Variants

Approximately 3.5 million entries

Covers many romanization systems

Gender classification codes

Overview

CJKI’s Database of Japanese Personal Name Variants (JNV) contains approximately 3.5 million entries covering Japanese given names and surnames and their variants. It covers all major and some minor systems for romanizing Japanese, and also includes various popular and “hybrid” variants.

Japanese given names and surnames have so many variants for numerous reasons, including the presence or absence of apostrophes (Kenichi, Ken’ichi), multiple ways of representing long vowels and certain consonants (tou, too, to, toh for とう), and the mixing of different Japanese romanization systems. When several of these factors occur in the same name, the number of possible spellings multiplies and can increase significantly.
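The way these factors multiply can be illustrated with a minimal sketch. It assumes a name has already been split by hand into segments, each with a list of alternative spellings; the segmentation of しんいちろう below and the function name `expand_variants` are hypothetical illustrations, not part of JNV or its data format.

```python
from itertools import product

def expand_variants(segments):
    """Return every spelling formed by picking one alternative per segment."""
    return ["".join(choice) for choice in product(*segments)]

# Hypothetical hand-made segmentation of しんいちろう (illustration only).
segments = [
    ["Shin"],                     # fixed part
    ["", "'"],                    # apostrophe absent or present
    ["ichi"],                     # fixed part
    ["rou", "roo", "ro", "roh"],  # long-vowel spellings, as with とう
]

variants = expand_variants(segments)
print(len(variants))  # 2 apostrophe choices x 4 long-vowel choices = 8
print(variants)
```

Even with only two points of variation, one short name yields eight spellings; mixing romanization systems for other syllables would add further factors to the product.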

This database of variants of Japanese names is especially useful in English-Japanese machine translation, since the original English text can contain many unpredictable variants.

Practical Applications

JNV is used for identifying Japanese personal name variants and is useful in such applications as:

Query processing by search engines

Immigration control systems

Segmentation and morphological analysis

Anti-money laundering (AML)

Fraud detection by financial institutions

Entity and information extraction

Improving accuracy of machine translation

Database cleansing and normalization

Security applications

Identifying suspected name variants of criminals
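As a rough illustration of how a variant table might support applications such as query processing or database normalization, the sketch below builds a lookup index from romanized variants to a canonical form. The table contents, the `normalize` rule, and the function names are assumptions made for this example; they do not reflect the actual structure or contents of JNV.

```python
from collections import defaultdict

def normalize(name):
    """Fold case and drop straight or curly apostrophes (an assumed, simplistic rule)."""
    return name.replace("'", "").replace("\u2019", "").casefold()

def build_index(variant_table):
    """Map each normalized variant to the set of canonical names it may represent."""
    index = defaultdict(set)
    for canonical, variants in variant_table.items():
        for variant in variants:
            index[normalize(variant)].add(canonical)
    return index

# Toy table made up for illustration; not taken from JNV.
toy_table = {
    "健一": ["Kenichi", "Ken'ichi"],
    "太郎": ["Tarou", "Taroo", "Taro", "Taroh"],
}

index = build_index(toy_table)

def lookup(query):
    """Return the canonical names whose variants match the query, if any."""
    return index.get(normalize(query), set())

print(lookup("KEN\u2019ICHI"))
print(lookup("taroh"))
print(lookup("Hanako"))
```

A set is used as the index value because, in real data, one romanized spelling can correspond to more than one name in Japanese script.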

Related Resources

JEN

Japanese-English Personal Names

Japanese-English database of CJK and Western personal names

KJN

Korean-Japanese Personal Names

Korean-Japanese database of CJK and Western personal names

CNV

Chinese Personal Name Variants

Chinese personal names and their romanized variants