2006-04-20

Who you gonna call? And how you gonna look up their phone number?

Chris writes about yuureimoji:

[T]here are apparently certain characters called 幽霊文字 ("ghost characters") that have no readings, meanings, or examples of use. Even if you look them up in a dictionary you get definitions like 意義未詳 (reading and meaning unknown). Examples of these ghost characters are 暃 and 碵.
They all come from the JIS set, which is a set of characters that are standard for computer terminals to display. Apparently during the compilation of the JIS set, some characters that weren't actually characters got onto the list accidentally -- either because they were miswritten versions of actual characters or the compilers misread certain kanji.
So why do they appear in dictionaries? ...

This seems like an eminently sensible development to me. People are always complaining about kanji having multiple readings, saying that one reading per kanji would be preferable. What, then, could be better than kanji that have no readings? (Borges would have loved it, too.)

But are they all really mistakes and nothing more? Japan has a long history of literacy, and only a very small portion of that had anything to do with computers, and snap-together writing systems are wide open to abuse innovation. It doesn't matter whether a new character is created intentionally or not -- if enough people use it, the Blue Fairy of descriptivism makes it a Real Character.

Here's Chris's list, for reference:

粫 挧 橸 膤 袮 閠 妛 暃 椦 軅 鵈 恷 碵 駲 墸 壥 彁 蟐
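For anyone who wants to poke at the list on a computer, here is a minimal sketch that prints each character's Unicode code point and its Shift_JIS bytes. It assumes Python 3 and leans on the built-in `shift_jis` codec, which as far as I know covers the JIS X 0208 repertoire; the helper names `sjis_bytes` and `report` are made up for illustration. A character the codec cannot encode comes back as `None` rather than raising an error.

```python
# The 18 characters from the list above, copied verbatim.
GHOSTS = "粫 挧 橸 膤 袮 閠 妛 暃 椦 軅 鵈 恷 碵 駲 墸 壥 彁 蟐".split()


def sjis_bytes(ch):
    """Return the Shift_JIS bytes for ch, or None if the codec cannot encode it."""
    try:
        return ch.encode("shift_jis")
    except UnicodeEncodeError:
        return None


def report(chars=GHOSTS):
    """Build (character, Unicode code point, Shift_JIS hex or None) rows."""
    rows = []
    for ch in chars:
        encoded = sjis_bytes(ch)
        sjis_hex = encoded.hex().upper() if encoded else None
        rows.append((ch, f"U+{ord(ch):04X}", sjis_hex))
    return rows


if __name__ == "__main__":
    for ch, codepoint, sjis_hex in report():
        print(ch, codepoint, sjis_hex or "not encodable")
```

Being encodable only shows that a character is in the set, of course. It says nothing about whether anyone ever actually wrote it.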

A couple of these are about as real as you can get: 粫 was apparently in use as part of a place name; 挧 is a Chinese surname (pronounced /yŭ/, it seems); 膤 is, according to my dictionary, part of a placename in Kumamoto prefecture: 膤割 (Yukiwari).

袮 is allegedly a simplified version of 禰, meaning "one's father's mausoleum", and this is quite believable because after all 尓 is a simplified version of 爾. And I don't see any real reason to doubt 軅's validity as a Japan-made variant on 軈 -- all it takes is one influential writer to omit that 心 at the bottom and the dot on top, and a new character is born.

On the other hand, even the JIS bigwigs admit that 妛 and 椦 are indeed just mistakes.

This is the kind of topic that Google can't really do justice to, sadly. Which is why I propose we launch an all-Wikipedia investigation! What could possibly go wrong?

Update! A helpful follow-up from Chris.

Popularity factor: 19

amida:

Matt: In Chinese, 袮 is sometimes used by Christians as a second-person ("second-deity"?) pronoun to refer to God. Standard 你 has a person-radical and that just wouldn't do, would it? Better replace it with that old religion-friendly 示. Both are read "ni3." There's also 祂 for the um, third-entity pronoun, taking the place of 他, 她, 牠, and 它.


Matt:

Interesting... is that used for Jesus too, by, say, the disciples?

Does Satan just get 他?


amida:

To tell you the truth, I am not sure if that is used for Jesus or just God. I don't know much about the Bible in any language, but Taiwanonymous does. Check out this post: http://taiwanonymous.blogspot.com/2006/04/de-honorific-chinese-character-for.html


Anonymous:

[Posted by Jimmy Ho, via Language Hat]

I second amida (quite the appropriate name in this instance) about the modern use of 禰, which I've only witnessed in Taiwanese documents. This is parallel to other gender-specific characters like the "feminine" ni3: 妳, a convenient way to make things clear when you sing karaoke. However, the classical meaning for 禰 is indeed what you wrote (父廟), like in the Zhouli 周禮. Now that I check it in the 漢語大辭典, it can also be a (presumably rare) surname 姓 with the reading mi2. The one (and perhaps only) occurrence given is Mi Heng 禰衡, who has a biographical notice (禰衡傳) in the Hou Han shu 後漢書.

-- Jimmy Ho


language:

Hey, Matt, check my thread on this -- you know anything about "silent kanji"?


Anonymous:

Now that LH involuntarily took me out of my retreat, here is one more thing: many of those forms look like handwritten, cursive forms reformatted for print. I guess this is what the original post means by "miswriting" and "misreading". For instance, 妛 could be a reinterpretation of 妄 or even 安, and 暃 could be a "variant" for, say, 昇 or 罪. It makes sense when you are looking at such forms in a manuscript, because the context directs you to the original character.


Anonymous:

Sorry, the previous comment was by me, Jimmy Ho.


Matt:

Thanks, Jimmy! Re your second post in particular -- Good point! The fact that a single type of curved line could be any one of several different radicals is one of the things that makes reading old-school Japanese cursive so hard, in my opinion, especially if you're shaky on the context. But you'd hope that the guys designing a standard character set would at least write carefully in the design documents...


Anonymous:

Yes, the manuscript-to-print transition is a constant problem for Chinese paleographic and epigraphic publications. I just read Chris Kern's update and it does clarify some points. I wish he'd mention his sources, though, in particular for the 妛 (the "photocopy incident" looks a bit like the starting point of a modern Korean ghost movie). My initial hypothesis was a variant of 奾 (xian1), now used only in female personal names, but presumably a gendered form of the homophonous 仙. However, that still didn't explain the horizontal line between the "woman" and the "mountain". Then I looked at the reading provided by the "Taiwan" Chinese editor for Windows XP (those are generally useless for rare characters: they give a default transcription based on the reading of the graphically closest standard form). It gave me chi1, a good hint, since I hadn't considered 媸 (chi1). Even the most skilled experts need some context or background and comparison points.

-- Jimmy Ho


Anonymous:

膤 is, according to my dictionary, part of a placename in Kumamoto prefecture: 膤割 (Yukiwari).

My dictionary has this too, but the name is suspiciously absent from Internet searches and some Japanese pages I've looked at say that this place does not actually exist. The JIS committee claimed it came from a source called the 国土行政区画総覧 but I've read that much research has been done on this and that the source does not actually contain the name. I'm not 100% sure about this, though.

軅 is also supposedly used in a placename in Fukuoka-ken called 軅飛

Someone else requested sources; here's one: http://homepage3.nifty.com/shikeda/kokudo31.html

Jim Breen told me that the 1997 revision of the JIS standard confirms the story about 妛, and I've seen it on a number of Japanese pages.


Anonymous:

One more comment: 袮 is allegedly a simplified version of 禰

Note that the former has an extra stroke in the koromo-hen radical. Kanjigen says 祢(禰の異体字)の誤字 but this is unclear on who actually made the mistake.

Here's another page: http://www.tim.hi-ho.ne.jp/hebiguchi/KanjiCode/yurei.htm


Anonymous:

As embarrassing as it is, I had not noticed that it was the 衣 radical and not 示; I blame it on my myopia (terrible for on-screen Chinese reading). Therefore, 誤字 is right. This is very common in manuscripts, like the 方 part of 於, often written as a 扌 radical.

Anonymous/Chris, I assume that you are the author of the original post. Thank you for the links.


amida:

I made the same mistake as Jimmy. Kanji are killing my eyes.


Matt:

I think everyone except Chris (including me) made that mistake. D'oh! Thanks for the correction + additional info, Chris.


Anonymous:

One other comment:

粫 was apparently in use as part of a place name;

Reading the entire section of that page you linked to, I think the existence of that place name could not actually be confirmed. The conclusion the writer seemed to draw is that it was a mistake for 糯, although that seems like a rather severe mistake to make.

But on that page they say that even though the JIS specification made mention of a place name called 字粫田 (uruchida), people did a good amount of research, including looking at maps back to Meiji at the actual city office, and the existence of the place name still could not be confirmed.


Matt:

To summarize for those who can't read Japanese, the page says that it was in the 国土行政区画総覧, but they can't find any other supporting evidence of the town (including locally) using that kanji from the Meiji period or after.

This means that either the change occurred pre-Meiji (and was quickly forgotten), or the character is a mistake in the 国土行政区画総覧 itself. Personally, without an explanation of why the original work would make a strange mistake like that, I am more inclined to err on the side of "it probably existed".


Anonymous:

This means that either the change occurred pre-Meiji (and was quickly forgotten)

The problem with this theory is that there exists a very large (50-volume) dictionary that is devoted to old place names; the volume on Kumamoto-ken is 1000+ pages alone; I looked in there and there was no entry for yukiwari (under any spelling).

I suppose it's still possible that it exists, but the question then becomes why that character was chosen for inclusion in the JIS set when the place name is so obscure that its very existence can only be posited. Surely there are other place names that didn't make it onto the JIS set that have confirmed usages... or maybe not.


Anonymous:

Er, sorry, I got my kanji confused there; we weren't talking about yukiwari. I guess I should check that dictionary for this other name.


Matt:

Yeah, if you have access to it, that would be awesome!

You're right on the second point, though. One can't discount the possibility that those characters were indeed written down by someone in 1750, but were added to the JIS list because someone misread a different character entirely.
