2005-11-23

CJ verbs, part 2

A refinement of the theory based on Ronald@IDR's comments:

A CJ verb consists of stem + ending. There are three main types of verbs:

  • C-type verbs are consonant-stem (sC) (e.g. omoh.u).
  • V-type verbs are vowel-stem (sV) (e.g. mi.ru, ke.ru).
  • D-type verbs have a consonant stem (sC) and a vowel stem (sV) (e.g. sug.u/sugi, at.u/ate). The vowel stem is used whenever "available" (usually for MZ, RY and MR) and the consonant stem otherwise, but for some reason the RT and IZ consonant stems always take the post-vowel allomorphs.

The endings are as follows. I've thrown the D-type verb structure on the left-hand side, since it gets a bit complex.

D-type stem + ending   Verb form   Ending (post-C allomorph)   Ending (post-V allomorph)   Consonant cluster handling (if necessary)
sV + e(V)              MZ          a                           0                           -
sV + e(V)              RY          i                           0                           -
sC + e(C)              SS          u                           ru                          -
sC + e(V)              RT          u                           ru                          z.r → n; C.rV → C.u.rV
sC + e(V)              IZ          e                           re                          z.r → n; C.rV → C.u.rV
sV + e(V)              MR          e                           yo                          -

Notice that I've added "z.r → n" to go with "C.rV → C.u.rV". This lets me claim the extremely common negative auxiliary verb ず as a 100% regular D-type verb with the stems z/zu. (This is also why I'm still dubious that the .u. in C.u.rV is the SS form u -- "z.r → n" makes sense to me as a sound change, but "zu.r → n" doesn't, and nor is any sound change even required since it's not a consonant cluster in the first place.)
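A minimal Python sketch of the two cluster rules may make the ordering concrete. The function name `join` and the plain-ASCII romanization are assumptions made for illustration only; the sketch tries "z.r → n" first and falls back to "C.rV → C.u.rV" for any other consonant + r cluster.

```python
VOWELS = "aeiou"

def join(stem: str, ending: str) -> str:
    """Attach an ending to a stem, repairing any consonant + r cluster."""
    if stem and ending.startswith("r") and stem[-1] not in VOWELS:
        if stem[-1] == "z":
            # z.r -> n: the z and the r are replaced by a single n
            return stem[:-1] + "n" + ending[1:]
        # C.rV -> C.u.rV: break up the cluster with u
        return stem + "u" + ending
    # No cluster (vowel-final stem or vowel-initial ending): plain concatenation
    return stem + ending

print(join("sug", "ru"))   # RT of sug.u
print(join("at", "re"))    # IZ of at.u
print(join("z", "ru"))     # RT of the negative auxiliary
print(join("sugi", "yo"))  # MR built on the vowel stem, no cluster
```

Under these assumptions the four calls give suguru, ature, nu and sugiyo.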

How we handle exceptions:

  • aru-type verbs = C-type verbs, except the SS form is replaced (in its entirety) with the RY form. (I call this variant "C(r)-type".)
  • su and ku = D-type verbs, except that sV is not available for the RY form. (D(s/k)-type.)
  • sinu and its ilk = D-type verbs, except that sV is not available anywhere. (Post-vowel ending allomorphs are still used for IZ and RT, though.) (D(n)-type.)

So, the three types of D-stem work like this:

Form ↓ / Verb type →   Regular D-type   D(s/k)-type (su, ku)   D(n)-type (sinu et al.)
MZ                     sV + e(V)        sV + e(V)              sC + e(C)
RY                     sV + e(V)        sC + e(C)              sC + e(C)
SS                     sC + e(C)        sC + e(C)              sC + e(C)
RT                     sC + e(V)        sC + e(V)              sC + e(V)
IZ                     sC + e(V)        sC + e(V)              sC + e(V)
MR                     sV + e(V)        sV + e(V)              sC + e(C)

(Remembering, of course, to apply "z.r → n" and "C.rV → C.u.rV" to the RT and IZ rows.)
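The table can be turned into data more or less directly. What follows is a hedged sketch, not a definitive implementation: `PATTERNS`, `conjugate` and `join` are hypothetical names, the romanization is plain ASCII, and each two-letter code means "stem kind, then ending allomorph" (so "CV" is sC + e(V)).

```python
VOWELS = "aeiou"
FORMS = ["MZ", "RY", "SS", "RT", "IZ", "MR"]

# form -> (post-C ending, post-V ending), as in the endings table
ENDINGS = {"MZ": ("a", ""), "RY": ("i", ""), "SS": ("u", "ru"),
           "RT": ("u", "ru"), "IZ": ("e", "re"), "MR": ("e", "yo")}

# verb type -> one "stem kind + ending allomorph" code per form, in FORMS order
PATTERNS = {
    "D":      ["VV", "VV", "CC", "CV", "CV", "VV"],
    "D(s/k)": ["VV", "CC", "CC", "CV", "CV", "VV"],
    "D(n)":   ["CC", "CC", "CC", "CV", "CV", "CC"],
}

def join(stem, ending):
    """Attach an ending, applying z.r -> n and then C.rV -> C.u.rV."""
    if stem and ending.startswith("r") and stem[-1] not in VOWELS:
        if stem[-1] == "z":
            return stem[:-1] + "n" + ending[1:]
        return stem + "u" + ending
    return stem + ending

def conjugate(verb_type, sC, sV=None):
    """Return {form: romanized form} for one D-type variant."""
    forms = {}
    for form, (stem_kind, ending_kind) in zip(FORMS, PATTERNS[verb_type]):
        stem = sC if stem_kind == "C" else sV
        post_c, post_v = ENDINGS[form]
        forms[form] = join(stem, post_c if ending_kind == "C" else post_v)
    return forms

print(conjugate("D", "sug", "sugi"))
print(conjugate("D(s/k)", "s", "se"))
print(conjugate("D(n)", "sin"))
print(conjugate("D", "z", "zu"))
```

The sketch fills every cell mechanically, so it also produces an MR form for zu; it says nothing about which forms are actually in use.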

Armed with C-type, C(r)-type, D-type, D(s/k)-type, D(n)-type, we can now classify the vast majority of CJ's auxiliary verb system. In fact, I believe we can classify all of it. (Interestingly, none of them are V-type.)

First, the most normal ones. These behave like regular verbs and attach to entire verb forms (stem + ending).

Traditional name   Type     Stem/s       Attaches to...   Notes
む                 C        m            MZ               -
むず               D(s/k)   mu.z / 0     MZ               or "nz". む+す with rendaku. Has no sV since there is no place it would be used
らむ               C        ra.m         SS               same m?
けむ               C        ke.m         RY               same m? same k(e) as keri below??
けり               C(r)     k.er         RY               k + り (see next table)?
しむ               D        sim / sime   MZ               -
ず                 D        z / zu       MZ               has partly merged with a parallel form derived from z.ar(u)
つ(完了)           D        t / te       RY               -
たり(完了)         C(r)     t.ar         RY               same t as つ(完了)
ぬ(完了)           D(n)     n            RY               -
めり               C(r)     mer          SS               from "見あり", I hear
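For the C and C(r) rows, attachment is just "host form + auxiliary form". Here is a small sketch under stated assumptions: the romanized keys in `AUXES`, and the names `conjugate_c` and `attach`, are invented for illustration, and only a handful of rows from the table are encoded.

```python
# Post-C endings only: C-type verbs never use the post-V allomorphs
POST_C = {"MZ": "a", "RY": "i", "SS": "u", "RT": "u", "IZ": "e", "MR": "e"}

def conjugate_c(stem, r_type=False):
    """C-type paradigm; for C(r)-type the SS form is replaced by the RY form."""
    forms = {form: stem + ending for form, ending in POST_C.items()}
    if r_type:
        forms["SS"] = forms["RY"]
    return forms

# romanized name -> (is C(r)-type?, stem, host form it attaches to)
AUXES = {
    "mu":   (False, "m",   "MZ"),
    "ramu": (False, "ram", "SS"),
    "kemu": (False, "kem", "RY"),
    "keri": (True,  "ker", "RY"),
    "tari": (True,  "tar", "RY"),
}

def attach(host_forms, aux, aux_form="SS"):
    """Glue the requested form of an auxiliary onto the host form it selects."""
    r_type, stem, attaches_to = AUXES[aux]
    return host_forms[attaches_to] + conjugate_c(stem, r_type)[aux_form]

omohu = conjugate_c("omoh")            # the C-type example verb omoh.u
print(attach(omohu, "mu"))             # MZ + mu
print(attach(omohu, "keri"))           # RY + keri (SS identical to RY)
print(attach(omohu, "ramu", "RT"))     # SS + ramu
print(attach(omohu, "tari", "RT"))     # RY + taru
```

With these inputs the calls give omohamu, omohikeri, omohuramu and omohitaru. As with any mechanical paradigm, `conjugate_c` also generates forms that an auxiliary may never actually show.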

Next, the ones that attach to stems alone, without an intervening ending. These all prefer to attach to an sV, but will settle for an sC if nothing else is available.

Traditional name   Type   Post-sC stem/s   Post-sV stem/s
る/らる            D      ar / are         rar / rare
す/さす            D      as / ase         sas / sase
り(完了)           C(r)   er               r
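The "prefer sV, settle for sC" choice can be sketched in a few lines. This is an illustrative sketch only: `STEM_AUXES` and `attach_to_stem` are hypothetical names, a missing stem is represented as `None`, and the result is the new pair of stems, which would then be conjugated according to the Type column.

```python
# aux -> (post-sC stems, post-sV stems); each entry is the (sC, sV) pair of the auxiliary
STEM_AUXES = {
    "ru/raru": (("ar", "are"), ("rar", "rare")),
    "su/sasu": (("as", "ase"), ("sas", "sase")),
    "ri":      (("er", None),  ("r", None)),   # C(r)-type, so no sV of its own
}

def attach_to_stem(host_sC, host_sV, aux):
    """Return the (sC, sV) stems of host + auxiliary."""
    post_c, post_v = STEM_AUXES[aux]
    if host_sV is not None:
        # An sV is available, so the post-sV allomorph attaches to it
        base, (aux_sC, aux_sV) = host_sV, post_v
    else:
        # No sV anywhere: settle for the sC and the post-sC allomorph
        base, (aux_sC, aux_sV) = host_sC, post_c
    new_sV = base + aux_sV if aux_sV is not None else None
    return base + aux_sC, new_sV

print(attach_to_stem("omoh", None, "ru/raru"))  # C-type host
print(attach_to_stem(None, "mi", "ru/raru"))    # V-type host
print(attach_to_stem("s", "se", "su/sasu"))     # D(s/k)-type host su
print(attach_to_stem("omoh", None, "ri"))       # ri on a C-type host
```

Under these assumptions the results are (omohar, omohare), (mirar, mirare), (sesas, sesase) and (omoher, None).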

Then there are some that conjugate like adjectives (BECAUSE THEY ARE. YEAH, I WENT THERE): たし、べし、ごとし (short-stem), まじ、まほし (long-stem).

Finally, there are a few weird ones belonging to a special group of conjugations I like to call X. The main feature of X is a lack of variation.


Form ↓ / Aux →   らし、じ (XC-type)   まし (XD-type)    き(過去) (XD(k)-type)
MZ               -                    mase or mas.ika   se
RY               -                    -                 -
SS               ras.i                mas.i             k.i
RT               ras.i                mas.i             s.i
IZ               ras.i                mas.ika           s.ika
MR               -                    -                 -

If we succumb to the temptation of recategorizing らし as a particle or something else non-conjugating, we get a regular-looking conjugation with the sole irregularity of that one k-stem, which I assume leaked over from けり.

I'm not even going to bother dealing with the various naris and taris. They're all C(r) with optional particle-usage instead of RY forms, because they're made of those particles + ari, basically. Not difficult.

New issues:

  1. Can we simplify all those auxes made of k, m, r and t?
  2. Does it matter that this setup doesn't let us predict where certain forms will be missing? Can we let semantics handle that? (E.g. there is no imperative form of zu, but there is a z.are. Do we need to know this?)
  3. What's up with those X forms? Are they perhaps hideously deformed adjectives? How can they not have a RY form when RY is the stablest, most enduring form in the entire system?


Azuma:

Wow, o-tsukare. I want to think about what you have here before commenting in too much detail, but two thoughts:

One, as for the missing RY (which, I emphatically agree with you, is the center of the verb system, not the SS/dictionary form): I think it's a matter of usage. Because masi almost always comes at the end of its verbal phrase, we never get a RYK. The existence of an IZK is probably due to kakari-josi, similarly the SSK. I think we only have a MZK because of the frequency of "maseba, masikaba". I bet there aren't any "mashikazu", for example. I think they'd use "ji" instead.

As for rashi and ki, I'm not as sure. We know rashi proved haler than some of the big kids in the end (*ahem* keri *ahem*) for all its defectiveness, so why not at least a MZK for a negative, or a RYK for a past tense? My best guess is that the fact that it sticks to the SSK shows it mostly occurred at the end of its verb phrase, allowing nothing further. nari (hearsay), meri, and ramu, of similar meaning, are all the same, and even the kemu that takes the RYK is missing its MZK and RYK. Actually, now that I look at it, all the auxiliaries that are a rainbow of variation on だろう are missing the same two forms, mu and ji as well. It makes sense. だろう comes after everything else in modern Japanese, too, negative, positive, what have you.

I just don't know why ki is screwed up. I think the technical reason is the same--it comes last except in a few select circumstances (ba, do). But why it should be so different than the other past-tense auxiliaries is a mystery to me. Perhaps its strange conjugation hints at an older origin, and somewhere in its history lies the answer.

Second (long comment!): On a more general level, I don't feel so comfortable with your sequencing rules and exceptions. They don't seem wrong to me, but somehow they don't seem neat. But all the time you spent on it for the three or four of us who care (thanks!) deserves a thoughtful and detailed response, so more later.


Matt:

The Iwanami Kogo actually uses RY as the lookup form for verbs, which is much more logical in my opinion. (Yes, I LOVE the Iwanami Kogo and I'm going to marry it.)

Thanks! Those are helpful thoughts (especially about maseba... duh! you're absolutely right -- and the missing MZ/RY/MR forms for darou-type auxes. I hadn't really done much thinking about meaning, and it shows in my analysis...) I look forward to a longer response at some point!

Re your point about neatness, yeah. There's no point proposing a new CJ verb system if it isn't less complicated than the traditional Big Chart one. I'm hoping I can rework this to be neater and more logical rather than... well, a Big Chart.

Another issue, I suppose, is that all this would be more useful if I spun it around 180 degrees so that it was about how to decode CJ verbs, rather than build them...


IbaDaiRon:

Definitely, o-tsukare-sama!

Re the handling of -zu, I've been trying to think of another example in any language where z+r = n. Why not just handle the /z/ ~ /n/ alternation with suppletion?

...all this would be more useful if ... it was about how to decode CJ verbs...

Want to build a web app? PHP, Perl or Javascript...name yer poison!


Matt:

Yeah, it is a bit 前代未聞 (unheard-of), isn't it... I suppose the basic answer is that I don't like suppletion. I suppose it would be best to reframe that rule as "wherever you would have /zr/, you get /n/ instead" and leave the issue of whether it's a phonemic or suppletive thing open for some PhD student to investigate. (Although I'm sure they already have.)

Re the web app... can I call Ruby?


IbaDaiRon:

Only if Ruby Begonia be the name on the bell that's ringing...or something like that.

You sure do have your linguistic prejudices, don't you! Don't like suppletion? Must make using be a "be-atch" every day! ;)

Seriously, have you got some Ruby code in mind already? Gee, this old-timer can't keep up!

(That was me deleted the comment; mouse ran amok!)


Matt:

Hey, it bith no problem..

As for Ruby code, nah, nothing in mind. OK, then, I'll settle for Perl!


Matt:

!

The scales have fallen from my eyes (on the train, as usual). I'm going to put aside my anti-suppletion prejudice and go with this:

Which "stem" to add the regular ending (appropriate allomorph) to, for D-type verbs:

MZ   sV
RY   sV (sC for su/ku)
SS   sC
RT   SS
IZ   SS
MR   sV

Then we end up with no need for consonant-cluster reducing rules and only one exception, zu, which we handle with suppletion. Perfect.
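A sketch of the revised scheme, under a few assumptions: RT and IZ take the finished SS form as their base, the ending allomorph is picked purely by whether the base ends in a vowel, and zu's n-forms are passed in as a suppletive override. The names `add_ending`, `conjugate_d`, `ry_on_sC` and `suppletive` are hypothetical.

```python
VOWELS = "aeiou"
FORMS = ["MZ", "RY", "SS", "RT", "IZ", "MR"]

# form -> (post-C ending, post-V ending)
ENDINGS = {"MZ": ("a", ""), "RY": ("i", ""), "SS": ("u", "ru"),
           "RT": ("u", "ru"), "IZ": ("e", "re"), "MR": ("e", "yo")}

def add_ending(base, form):
    """Add the regular ending, choosing the allomorph by the base's last sound."""
    post_c, post_v = ENDINGS[form]
    return base + (post_v if base[-1] in VOWELS else post_c)

def conjugate_d(sC, sV, ry_on_sC=False, suppletive=None):
    """D-type paradigm with RT and IZ built on the SS form; no cluster rules."""
    ss = add_ending(sC, "SS")
    bases = {"MZ": sV, "RY": sC if ry_on_sC else sV, "SS": sC,
             "RT": ss, "IZ": ss, "MR": sV}
    forms = {form: add_ending(bases[form], form) for form in FORMS}
    forms.update(suppletive or {})   # e.g. zu's RT and IZ forms
    return forms

print(conjugate_d("sug", "sugi"))
print(conjugate_d("s", "se", ry_on_sC=True))
print(conjugate_d("z", "zu", suppletive={"RT": "nu", "IZ": "ne"}))
```

Because the SS form always ends in u, RT and IZ automatically pick up the post-vowel endings, which is the behavior the earlier version had to stipulate.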

Further note to self: 過去の「き」 attaches to the stem! (X: i/s with i/k for SS.) The "se" MZ form (only used in se...masi, I believe) is indeed a misidentification of su's MZ form, as some suggest.
