I just found these two great papers by Shigeto KAWAHARA on rhyming in Japanese hip-hop:
- Aspects of Japanese Hip-Hop Rhymes: What They Reveal about the Structure of Japanese (2002)
- Half rhymes in Japanese rap lyrics and knowledge of similarity (2007)
The first of these is a five-page introduction to the topic that proposes the following rules:
- Rhymes "consist of the agreement of at least two moraic elements". A "moraic element" is an element that could form a mora on its own: a vowel (even if part of a CV mora in the actual word), a syllable-final /N/, or the first half of a geminate consonant. For example, satsutaba rhymes with aru nara on the basis of [a-u-a-a]; panchira ni/han-biraki is [a-N-i-a-i], and naku naru tte/wakatte is a [Q-e] rhyme.
- Extrametricality is allowed. nurui kaze/furui tate(ru) is a valid rhyme. This also "helps to explain the fact that a long vowel can (and in fact often does) rhyme with its short counterpart": shigusa/furita(a).
- Rhymes are "always computed in one-to-one fashion (BINARITY) in a successive way (CYCLICITY)".
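To make the first rule concrete, here is a rough Python sketch of pulling the moraic elements out of a romanized phrase and counting how many agree at the end of two phrases. Everything in it is my own assumption rather than anything from Kawahara's paper: the function names (`moraic_elements`, `shared_suffix`, `rhymes`), the Hepburn-style romanization, and the shortcut of treating any "n" not followed by a vowel or "y" as moraic /N/. It does not model extrametricality at all.

```python
VOWELS = set("aiueo")
# Assumption: a macron vowel counts as two moraic elements (e.g. "ō" -> o-o).
LONG = {"ā": "aa", "ī": "ii", "ū": "uu", "ē": "ee", "ō": "oo"}


def moraic_elements(romaji):
    """Return the moraic skeleton of a Hepburn-style romanized phrase."""
    text = romaji.lower()
    for long_vowel, doubled in LONG.items():
        text = text.replace(long_vowel, doubled)
    elements = []
    for i, ch in enumerate(text):
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch in VOWELS:
            elements.append(ch)
        elif ch == "n":
            # "n" before a vowel or "y" is an onset; otherwise it is moraic /N/.
            if nxt not in VOWELS and nxt != "y":
                elements.append("N")
        elif ch.isalpha() and (ch == nxt or (ch == "t" and text[i + 1:i + 3] == "ch")):
            # First half of a geminate (written as a doubled consonant, or "tch").
            elements.append("Q")
    return elements


def shared_suffix(a, b):
    """Count how many elements agree, working backwards from the end."""
    count = 0
    for x, y in zip(reversed(a), reversed(b)):
        if x != y:
            break
        count += 1
    return count


def rhymes(a, b, minimum=2):
    """Rule 1 only: at least `minimum` moraic elements agree at the end."""
    return shared_suffix(moraic_elements(a), moraic_elements(b)) >= minimum


print(moraic_elements("panchira ni"))       # a-N-i-a-i
print(rhymes("naku naru tte", "wakatte"))   # the [Q-e] rhyme
```

Spaces and hyphens are deliberately left in the input so that a word-final "n" (as in han-biraki) is still read as /N/ instead of being glued onto the next word as an onset.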
That last one feels shakiest to me. It's an oversimplification that doesn't seem necessary. Take a look at Kawahara's example (from "MASTERMIND", by DJ HASEBE featuring ZEEBRA and Mummy-D):
It's true that a strict non-cyclic view requires us to discard certain obvious rhyme components in adjacent lines (e.g. the shared /N/ in main(do)/bōdarain). But a cyclic view equally requires us to discard them in non-adjacent lines (e.g. the initial [o-o-a] in the distant [o-o-a-a-i] words bōdarai(n)/ōganai(zu)/tōkanai), and topparai's [o-Q-a-a-i] starts to look suspicious in that context too. It really looks more like an ad-hoc grouping system incorporating both cyclical and binary organization is at work.
Which is to say: A complex Japanese hip-hop verse is, just like a complex English hip-hop verse, liable to be threaded with all kinds of irregularly located intra- and interlinear tiers of rhyme and assonance. A simple example of this would be the sekai-jū/ōganaizu/massai-chū triad towards the end of the above example, or Crystal Kay's "gomen asobase/ABASISO asa made" (ABASISO = approximated-kanafied pronunciation of "I bust it, so"), which I have always considered the best part of "I Like It"*.
The second paper is a more sophisticated analysis working the angle that although the moraic elements may be necessary-and-sufficient for the rhyme, there is also a strong tendency to use similar non-moraic elements. For example, the pair kettobase/gettomane(e) is a five-element [e-Q-o-a-e] rhyme, but it should also be noted that each pair of non-matching consonants (k/g, b/m, s/n) shares the same place of articulation. The satisfying conclusion is that similarity of sound correlates with rhymeability (frequency of use in rhymes):
To summarize, the multiple regression analysis has revealed the degree to which the agreement of each feature contributes to similarity: (i) [pal] has a fairly large effect, (ii) [cont], [voi], [nas] have a medium effect, and (iii) [son], [cons] and [place] have too weak an effect to be detected by multiple regression.
Wait—if place is too weak to be detected by multiple regression, doesn't that kind of undermine the utility of the kettobase/gettomane(e) example? Well, sort of:
The lack of a significant effect of major features and [place] appears to conflict with what we observed in §3.2 and §3.3; [son], [cons], and [place] do cause overrepresentation when two consonants agree in these features. Presumably, the variability due to these features overlaps too much with the variability due to the four manner features, and as a result, the effect of [place] and the major class features per se may not have been large enough to have been detected by multiple regression.
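For what it's worth, the place agreement in the kettobase/gettomane(e) example is easy to check mechanically. The sketch below is only an illustration under my own assumptions: the `PLACE` table is a simplified labial/coronal/dorsal grouping that I supplied (not Kawahara's feature matrix), the function names are made up, and consonants are lined up naively by position, which only works when both words have the same number of single-letter consonants.

```python
# Assumption: a coarse three-way place grouping for a handful of consonants.
PLACE = {
    "p": "labial", "b": "labial", "m": "labial",
    "t": "coronal", "d": "coronal", "s": "coronal",
    "z": "coronal", "n": "coronal", "r": "coronal",
    "k": "dorsal", "g": "dorsal",
}


def consonants(romaji):
    """Keep only the consonant letters that appear in the PLACE table."""
    return [ch for ch in romaji.lower() if ch in PLACE]


def mismatched_pairs(a, b):
    """Pair consonants by position and keep the pairs that differ."""
    return [(x, y) for x, y in zip(consonants(a), consonants(b)) if x != y]


def all_share_place(a, b):
    """True if every non-identical consonant pair agrees in place."""
    return all(PLACE[x] == PLACE[y] for x, y in mismatched_pairs(a, b))


print(mismatched_pairs("kettobase", "gettomane"))  # k/g, b/m, s/n
print(all_share_place("kettobase", "gettomane"))
```

This says nothing about how much place contributes statistically, which is the point of the regression result quoted above; it only confirms that the example pair agrees in place wherever the consonants differ.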
Anywho, the second part of the paper is an argument for the idea that these findings of rhymeability are based on acoustics; that is, sounds rhyme not simply because they are produced in the same place or in the same way, but because they sound the same. Evidence for this includes the highly rhymeable {kj-tʃ} pair (e.g. chōshi/kyōmi), which shares acoustic features but not method of production, and the fact that "[m]inimal pairs of oral consonants differing in place are less common than the {m-n} pair", which Kawahara argues is because the {m-n} pair, being nasal, is the least acoustically affected by place. (This neatly ties in with Kawahara's argument that the weakness of [place] derives from its relatively minor influence on consonantal acoustics.)
Other interesting new claims in this paper:
- "The boundary tone [dividing the non-rhyming and rhyming parts of a line] is usually a high tone (H) followed by a low tone (L) (with subsequent spreading of L), and this boundary tone can replace lexical tones."
- "[I]t is usually high vowels that can be extrametrical. My informal survey has found many instances of high extrametrical vowels, but no mid or low extrametrical vowels." (I guess this means that Kawahara is abandoning the argument that shigusa/furita(a) rhymes because of an extrametrical /a/ (low central vowel) in favor of an implied argument that long vowels can substitute for short ones?)
Supplementary: At the 15th Japanese/Korean Linguistics Conference at the University of Wisconsin - Madison in 2005, Natsuko TSUJIMURA, Kyoko OKAMURA, and Stuart DAVIS appear to have presented a paper likening all of this to Zwicky's "rock rhymes", and pointing out another crucial component in the system, which is performers altering pronunciations of native and loan words slightly to help the rhyme along. The paper doesn't seem to be freely available online, but the abstract is.