2005-11-26

Too little too late

For those who want to learn more about classical Japanese (and speak the modern variety), here's the traditional textbook approach to the subject, all on one page with internal hyperlinks. This is basically what high school students work with, although obviously they get it in a more readable form with proper tables, illustrations, diagrams, etc.

Popularity factor: 14

amida:

Nice...I keep hoping for an automated CJ parser. Babelfish should get on it!


IbaDaiRon:

Then again, somebody closer at hand might be working on one...(wink wink nudge nudge knowharramean?)

Besides, given the wonderful job they've done with MJ, do you think they're up to CJ?


amida:

Come on, come on--bring on the jidou hinshi bunkai machine! If I could write code for anything but the Apple ][e I would write one myself. I love babelfish--it uses a great dictionary (for Chinese to English, at least), handles phrases decently, and makes great Dadaist poetry out of sentences.


IbaDaiRon:

Patience, patience! Heian Kyo wasn't built in a day!


IbaDaiRon:

Post in haste...

What sort of output would you, as an end-user, like to see?

Also, do you know of any online lists of verbs, etc.?


amida:

I was thinking a simple hinshi bunkai aid would be nice. You put in your text and help it identify taigen, etc., then it keeps track of what is an auxiliary verb, etc. by tracing their conjugations. It could find a verb and ask "Is this a yodan verb?" If yes, then it knows that what follows could possibly be (whatever) because the verb is (whatever) form, and so on.

That would be pretty simple and not require any dictionaries etc. But if there is some database of CJ vocab somewhere, you could go whole hog fairly easily I guess.

(Again, I know there is stuff like that for Chinese, but seem to have trouble finding good electronic resources for Japanese.)
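
[A rough sketch of the dictionary-free aid described in this comment, assuming the user has already answered "yes" to "Is this a yodan verb?" and supplied the dictionary form. The names YODAN_ROWS, FOLLOWERS, conjugate_yodan, and what_may_follow are invented for illustration, and the follower lists are abridged from the usual textbook tables, not a complete inventory.]

```python
# Kana rows used by yodan verbs, keyed by the final kana of the
# dictionary form, each in a-i-u-e order.
YODAN_ROWS = {
    "く": "かきくけ", "ぐ": "がぎぐげ", "す": "さしすせ", "つ": "たちつて",
    "ふ": "はひふへ", "ぶ": "ばびぶべ", "む": "まみむめ", "る": "らりるれ",
}

# Abridged list of what can attach to each form of a yodan verb.
FOLLOWERS = {
    "mizenkei": ["ず", "む", "る", "す", "じ", "まし", "ば"],
    "renyoukei": ["き", "けり", "つ", "ぬ", "たり", "けむ", "たし", "て"],
    "shuushikei": ["べし", "らむ", "めり", "まじ", "らし"],
    "rentaikei": ["なり", "(taigen)"],
    "izenkei": ["ば", "ど", "ども", "り"],
    "meireikei": [],
}


def conjugate_yodan(dictionary_form):
    """Return the six forms of a verb the user has confirmed as yodan."""
    ending = dictionary_form[-1]
    if ending not in YODAN_ROWS:
        raise ValueError(f"{dictionary_form} does not end in a yodan kana")
    stem = dictionary_form[:-1]
    a, i, u, e = YODAN_ROWS[ending]
    return {
        "mizenkei": stem + a,
        "renyoukei": stem + i,
        "shuushikei": stem + u,
        "rentaikei": stem + u,
        "izenkei": stem + e,
        "meireikei": stem + e,
    }


def what_may_follow(dictionary_form, surface):
    """Map each form that matches the surface string to its possible followers."""
    forms = conjugate_yodan(dictionary_form)
    return {name: FOLLOWERS[name] for name, form in forms.items() if form == surface}


if __name__ == "__main__":
    # 書か can only be the mizenkei, so ず, む, etc. may follow.
    print(what_may_follow("書く", "書か"))
    # 書け is ambiguous between izenkei and meireikei.
    print(what_may_follow("書く", "書け"))
```

[Since several forms share a surface string, the sketch returns every match and leaves the final call to the user, which fits the "help it identify" workflow.]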


Matt:

I haven't been able to find any useful lists of verbs online, sadly. Maybe it could be a two-stage process where the user is asked to identify verb type once the program has identified verbitude. ("XXX looks like a 2-dan verb -- does its renyoukei end in e or i?" or whatever.)
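
[A minimal sketch of that second stage, assuming the program has already guessed "2-dan" from the shuushikei. The names NIDAN_ROWS and classify_nidan are hypothetical, the question is passed in as an `ask` callback so it can be swapped for `input`, and the kana table covers only a few rows.]

```python
# For the final kana of the shuushikei, the matching i-row and e-row kana.
# Only some rows are listed here.
NIDAN_ROWS = {
    "く": ("き", "け"), "ぐ": ("ぎ", "げ"), "つ": ("ち", "て"),
    "ふ": ("ひ", "へ"), "ぶ": ("び", "べ"), "む": ("み", "め"),
    "ゆ": ("い", "え"), "る": ("り", "れ"),
}


def classify_nidan(shuushikei, ask):
    """Ask the user for the renyoukei vowel, then build the paradigm."""
    ending = shuushikei[-1]
    if ending not in NIDAN_ROWS:
        raise ValueError(f"no table entry for verbs ending in {ending}")
    answer = ask(
        f"{shuushikei} looks like a 2-dan verb -- "
        "does its renyoukei end in e or i? "
    ).strip().lower()
    if answer not in ("i", "e"):
        raise ValueError("please answer 'i' or 'e'")
    i_kana, e_kana = NIDAN_ROWS[ending]
    base = shuushikei[:-1] + (i_kana if answer == "i" else e_kana)
    return {
        "class": "kami-nidan" if answer == "i" else "shimo-nidan",
        "mizenkei": base,
        "renyoukei": base,
        "shuushikei": shuushikei,
        "rentaikei": shuushikei + "る",
        "izenkei": shuushikei + "れ",
        "meireikei": base + "よ",
    }


if __name__ == "__main__":
    # Interactive use: the user types i or e at the prompt.
    print(classify_nidan("起く", input))
```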


IbaDaiRon:

Egad, ye wish for the moon!

I was thinking something modest to begin with, analysis of verb forms only, initially; adding ADJs later. Too modest to be of interest?


amida:

It would be simple, simpler even than what you are talking about, IDR, to make something that identifies conjugations of auxiliary verbs. You'd only need a little database and you wouldn't have to woory about verbs. But are you really serious about such a project or just putting up "bait"?
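
[One possible shape for that "little database", sketched as a plain lookup table. The names AUXILIARIES and identify are made up for illustration, only five auxiliaries are included, and their paradigms are abridged from the usual textbook tables.]

```python
# Auxiliary verb -> form name -> surface strings.
# Forms an auxiliary lacks are simply left out.
AUXILIARIES = {
    "けり": {"mizenkei": ["けら"], "shuushikei": ["けり"],
             "rentaikei": ["ける"], "izenkei": ["けれ"]},
    "き": {"mizenkei": ["せ"], "shuushikei": ["き"],
           "rentaikei": ["し"], "izenkei": ["しか"]},
    "ぬ": {"mizenkei": ["な"], "renyoukei": ["に"], "shuushikei": ["ぬ"],
           "rentaikei": ["ぬる"], "izenkei": ["ぬれ"], "meireikei": ["ね"]},
    "つ": {"mizenkei": ["て"], "renyoukei": ["て"], "shuushikei": ["つ"],
           "rentaikei": ["つる"], "izenkei": ["つれ"], "meireikei": ["てよ"]},
    "ず": {"mizenkei": ["ざら"], "renyoukei": ["ず", "ざり"],
           "shuushikei": ["ず"], "rentaikei": ["ぬ", "ざる"],
           "izenkei": ["ね", "ざれ"], "meireikei": ["ざれ"]},
}


def identify(surface):
    """Return every (auxiliary, form) pair whose surface string matches."""
    matches = []
    for aux, paradigm in AUXILIARIES.items():
        for form_name, surfaces in paradigm.items():
            if surface in surfaces:
                matches.append((aux, form_name))
    return matches


if __name__ == "__main__":
    # ぬ is ambiguous: shuushikei of perfective ぬ or rentaikei of negative ず.
    print(identify("ぬ"))
    print(identify("ける"))
```

[Telling the two readings of ぬ apart still needs the form of the preceding verb, which is where the verb analysis discussed above would come back in.]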


IbaDaiRon:

Bait?! That's a bit hurtful, what?

I'm as serious about this as I can be about anything.

(Which is a mite woorying in itself!)

: )


amida:

IDR: You refer to the "CJ Material bait" on your blog. I thought you were starting an open project there.


IbaDaiRon:

IbaDaiRon said...Post in haste...

One of these days maybe I'll actually practice what I preach?

Mea culpa, mea culpa!

Amida: Sorry, that wasn't a case of "It's OK if I say it, but if you say it, it's rude." I'd honestly forgotten that I'd used the phrase "take the bait" in my later "WWW-privacy" post...hopefully because simply posting "hit bait" was the furthest thing from my (conscious) mind. (But considering my mention of hit counts and access stats in that latter post, I've just been engaging in a bit of serious psycholinguistic introspection in front of the mirror!)

For the record: I am starting an open project. (If you follow my trail (in another guise) about the web, you'll find I've been fairly (obnoxiously?) vocal in my support of open vs. closed (software) projects.) I would be honored if you and everyone else reading would lend a hand in any capacity (programming, data preparation, DB design, etc.).

宜しくお願いします!

(I've been playing around with the interface on IDR already, but haven't mentioned it because I'm still trying to get a handle on the Perl installation on my site...it's a Yahoo hosting and a bit squirrely. You get what you pay for, I guess.)


Matt:

Hey! You know the rules! No hugging, no learning! (Except about East Asian languages.)


amida:

My apologies to IDR--one posting on his blog led me to believe that another posting there (on the topic at hand) was some sort of ill-conceived joke. He's assured me it's not. Sorry for doubting it.

And apologies to Matt for posting all this, but public accusations call for public apologies so here it is.

Comment season is closed.