Japanese Multiword Expression Lexicon

Overview

JMWEL (Japanese Multiword Expression Lexicon) is the first manually compiled, full-scale database of Japanese multiword expressions (MWEs). It carries a rich set of grammatical attributes tuned for phrase-based NLP applications such as machine translation (MT), information retrieval (IR), and morphosyntactic analysis. It contains about 160,000 headwords covering almost every kind of Japanese MWE.

JMWEL was first designed for Japanese MT in the late 1960s. After decades of meticulous development by Dr. Kosho Shudo, professor emeritus of Fukuoka University, Japan, and a computational linguistics researcher, it is now available for licensing.

Please refer to 研究工房ことばの森 for details.

Distinctive Features

  1. JMWEL provides extensive coverage of diverse types of orthographic variants, including alternative kanji, such as 憧れの的 vs. 憬れの的 (‘object of longing’), and alternative okurigana such as 漁火 vs. 漁り火 (‘fish-attracting torch’).

  2. JMWEL describes syntactic structures into which modifying clauses can be inserted. Idioms such as 「油を売る」 (which means ‘waste time’, not the literal ‘sell oil’) are provided with a syntactic structure description, [[油を]*売る], so that inserting 「喫茶店で」 yields the discontinuous idiom 「油を喫茶店で売る」 (‘waste time in a coffee shop’).

  3. JMWEL is enriched by 4,700 sentence-final MWEs that provide speaker judgment modalities, e.g., 「~べきだった」 (‘~should have Vpp~.’), speaker demand information, e.g., 「~てくださいませんか」 (‘Would you please V~?’), aspectual information, e.g., 「~たばかりだ」 (‘~have just Vpp~.’), and so on.

  4. JMWEL covers 1,900 discourse-marking and sentence-connective MWEs, which indicate semantic relations between a paragraph or sentence and the immediately following segment, such as 「話変わりますが」 (‘To change the subject,~.’) or 「上に述べたように」 (‘As mentioned above,~.’).
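
The insertion behavior described in feature 2 can be illustrated with a minimal Python sketch. It assumes that `*` in a structure description marks a slot where a modifier may appear and that square brackets only group constituents; the function name `structure_to_regex` is hypothetical and is not part of JMWEL. The slot is matched with a permissive wildcard, so a real system would need to restrict it to well-formed modifying phrases.

```python
import re

def structure_to_regex(structure: str) -> "re.Pattern[str]":
    """Turn a bracketed description such as '[[油を]*売る]' into a regex.

    Assumptions (not confirmed by the JMWEL documentation):
    '*' marks a slot where a modifier may be inserted, and
    square brackets only group constituents, so they are dropped.
    """
    parts = []
    for ch in structure:
        if ch in "[]":
            continue                 # grouping only; no effect on the surface string
        if ch == "*":
            parts.append("(.*?)")    # insertion slot, possibly empty
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))

pattern = structure_to_regex("[[油を]*売る]")

contiguous = pattern.search("油を売る")
discontinuous = pattern.search("彼は油を喫茶店で売る")

print(repr(contiguous.group(1)))     # text captured in the slot for the plain idiom
print(repr(discontinuous.group(1)))  # text captured in the slot for the split idiom
```

Both the contiguous and the discontinuous form match the same pattern, which is the point of storing the slot position in the lexicon entry rather than listing each surface string separately.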

Orthographic Variants of Japanese MWEs

ID       Variant ID  Part of Speech  Lemma              Hiragana
NP00411  001         nominal         憧れ--             あこがれのまと
NP00411  002         nominal         憧れ--まと         あこがれのまと
NP00411  003         nominal         憬れ--             あこがれのまと
NP00411  004         nominal         憬れ--まと         あこがれのまと
NP00411  005         nominal         あこがれ--         あこがれのまと
NP00411  006         nominal         あこがれ--まと     あこがれのまと
NP03509  001         nominal         --_                おににかなぼう
NP03509  002         nominal         --_ぼう            おににかなぼう
NP03509  003         nominal         --かな_            おににかなぼう
NP03509  004         nominal         --かな_ぼう        おににかなぼう
NP03509  005         nominal         おに--_            おににかなぼう
NP03509  006         nominal         おに--_ぼう        おににかなぼう
NP03509  007         nominal         おに--かな_        おににかなぼう
NP03509  008         nominal         おに--かな_ぼう    おににかなぼう
VP37342  001         v-class2        ----付ける         あしをちにつける
VP37342  002         v-class2        ----つける         あしをちにつける
VP37342  003         v-class2        ----付ける         あしをちにつける
VP37342  004         v-class2        ----つける         あしをちにつける
VP37342  005         v-class2        ----付ける         あしをちにつける
VP37342  006         v-class2        ----つける         あしをちにつける
VP37342  007         v-class2        ----付ける         あしをちにつける
VP37342  008         v-class2        ----つける         あしをちにつける
VP37342  009         v-class2        あし----付ける     あしをちにつける
VP37342  010         v-class2        あし----つける     あしをちにつける
VP37342  011         v-class2        あし----付ける     あしをちにつける
VP37342  012         v-class2        あし----つける     あしをちにつける
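
As a rough illustration of how variant rows like these might be used, the Python sketch below maps several surface forms to one entry ID and locates them in running text. The record layout and the names `VARIANTS`, `build_index`, and `find_entries` are hypothetical; the two kanji spellings come from the feature list above, the hiragana form is the reading shown in the table, and the actual JMWEL file format is not assumed.

```python
from typing import Dict, List, Tuple

# Illustrative records only: (entry ID, surface form).
# The layout is an assumption, not the JMWEL distribution format.
VARIANTS: List[Tuple[str, str]] = [
    ("NP00411", "憧れの的"),
    ("NP00411", "憬れの的"),
    ("NP00411", "あこがれのまと"),
]

def build_index(records: List[Tuple[str, str]]) -> Dict[str, str]:
    """Map every surface form to the ID of the entry it belongs to."""
    return {surface: entry_id for entry_id, surface in records}

def find_entries(text: str, index: Dict[str, str]) -> List[Tuple[int, str, str]]:
    """Return (offset, surface form, entry ID) for each variant found in text."""
    hits = []
    for surface, entry_id in index.items():
        start = text.find(surface)
        while start != -1:
            hits.append((start, surface, entry_id))
            start = text.find(surface, start + 1)
    return sorted(hits)

index = build_index(VARIANTS)
for hit in find_entries("彼女はクラスの憬れの的だ", index):
    print(hit)
```

Because every variant resolves to the same ID, downstream components such as an MT or IR system can treat differently spelled occurrences as one expression.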

Morphosyntactic Attributes of Japanese MWEs

Lemma                           Part of Speech 2  Structure                          Dependency
憧れ-の-的                      NP                [*V22no]*N                         (diagram)
鬼-に-金_棒                     S/incomplete      [Nni][[N$]$]                       (diagram)
足-を-地-に-付ける              VP_d3             [Nwo][[Nni]*V30]                   (diagram)
合わせる-顔-が-無い             AdjP_c2           [[*V40N]ga]*nai                    (diagram)
火-を-見る-より-明らか          AdjVP             [[[Nwo]V30]yori]K00                (diagram)
感ずる-所-有っ-て               AdvP_Vte          [[[*V40N](ga)]at]te                (diagram)
欠く-可から-ざる                AdnP_Vzaru        [V30bekara]zaru                    (diagram)
物-は-相談-だ-が                DM/SA_Ndaga       [[Nha(ga)][Nda]]ga                 (diagram)
て-頂ける-と-良い-の-です-が    CEP_p             [[[[[[$te]V30]to]A40]no]tesu]ga    (diagram)
を-目標-に                      CPP               [[$wo][[Nni](si)]](te)             (diagram)
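
The bracketed strings in the Structure column are nested, so one plausible first processing step is to read them into a tree. The Python sketch below does only that: it treats square brackets as grouping and keeps everything else (`*`, `$`, parentheses, category codes such as `V30`) as uninterpreted leaf text, since their meaning is not spelled out here. The function name `parse_structure` is hypothetical.

```python
from typing import List, Union

Node = Union[str, List["Node"]]

def parse_structure(s: str) -> List[Node]:
    """Parse square-bracket nesting into nested Python lists.

    Only '[' and ']' are interpreted; all other characters are
    collected into leaf strings without further analysis.
    """
    root: List[Node] = []
    stack = [root]
    buf: List[str] = []

    def flush() -> None:
        # Emit any pending leaf text into the innermost open group.
        if buf:
            stack[-1].append("".join(buf))
            buf.clear()

    for ch in s:
        if ch == "[":
            flush()
            node: List[Node] = []
            stack[-1].append(node)
            stack.append(node)
        elif ch == "]":
            flush()
            if len(stack) == 1:
                raise ValueError("unbalanced ']' in structure: " + s)
            stack.pop()
        else:
            buf.append(ch)
    flush()
    if len(stack) != 1:
        raise ValueError("unbalanced '[' in structure: " + s)
    return root

print(parse_structure("[Nwo][[Nni]*V30]"))
print(parse_structure("[*V22no]*N"))
```

Keeping the leaves as raw strings avoids guessing at the category inventory; a fuller reader would map them to part-of-speech codes once those are documented.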

JMWEL Related Resources

JLD (Japanese Lexical Database): Monolingual general vocabulary for NLP applications

JPD (Japanese Phonetic Database): IPA phonetic and phonemic transcriptions for core Japanese vocabulary

JWL (Comprehensive Japanese Wordlist): General vocabulary, proper nouns, and technical terms

Reference Documents