2015-07-06

Invented traditions and OCR

I wrote another article for the Japan Times! This one is nominally about a proposed new culinary tradition called "nagoshi gohan," but my goal was to contextualize this proposal and show that it really isn't that odd.

Meanwhile, premodern cursive Japanese is now susceptible to OCR. At the risk of being quoted in Buzzfeed 2030's 1028 Times a Human Amusingly Embodied the Hubris that Led to Their Current Situation, I'll point out that what's being OCRed here is, by Japanese cursive standards, pretty tame. The hand is regular, and while it's not a perfect grid of squares, it mostly breaks down into well-defined rectangular cells. (You can get a closer look at the page shown in the example here.) It'll be interesting to see how far down Crazy Road this technology can pursue Japanese writing.

Popularity factor: 2

leoboiko:

Cool, more news to make "The fifth generation fallacy" and "Asia's orthography dilemma" sound even more obsolete (both have argued that kanji would be too hard to process in computers).

Little typo: "Premodern cursive Japanese OCR". I'm picturing a karakuri-ningyō-based system...


Matt:

Whoops! I'll fix that, thanks.

(It's amusing reading those old books about how Chinese characters simply are not susceptible to computerization because you can't fit them in the maximum reasonable amount of memory, i.e. about 64 KB or perhaps 128 KB for enormous hardware mainframes; you can't display enough of them in 320x200 resolution, which will surely be the limit of consumer hardware going forward; etc. It doesn't invalidate the other arguments against hanzi [wasted time in school, etc.] but it doesn't help them much either.)

