some problems with encoding parts of akṣaras in manuscript transcriptions
Hi everyone,
I have two questions about encoding manuscript transcriptions that I wanted to submit to the collective experience of this group. Both relate to the problem of akṣaras having "parts" that canonically occur in a certain sequence but may be changed in a manuscript.
First, the cancellation of vowel mātrās. Does anyone have a good way to encode this in TEI? In one manuscript (see this link https://goo.gl/uYxV3R) the scribe has written "yoṁdoṁdaṟoḷoḷa" (the last letter being mostly obliterated by a worm hole), and cancelled out the last "o" (which is written with the sideways "3") with a small cross-mark on top. The problem is that by cancelling out the mātrā, the scribe has changed the vowel. Since we are transcribing in Roman transliteration, we would have to do something like yoṁdoṁdaṟoḷ<del rend="cross">o</del><add type="implicit">a</add>ḷa, i.e., marking the addition of the vowel as "implicit" (or something similar) in order to make clear that it's not a new mark on the leaf. (If we were transcribing in Kannada script, we could do ಯೊಂದೊಂದಱೊಳ<del>ೊ</del>ಳ, but that will surely cause rendering problems.)
Second, the "canonical" order of code-points and of transliteration for conjuncts with initial r has the r first. In Kannada script, however, a "flying r" is used, which occurs to the right of the other consonants in the conjunct. Sometimes there's a feature that we want to encode *between* the members of the conjunct, as in this example https://goo.gl/BqNVg9, where a string-hole intervenes between the "gg" and the "r" of "mārggaṁ". How should we encode this? I know that some of you have used "akṣarapart" to identify mātrās and other components of akṣaras, but I can't seem to get around the problem of the reversed sequence of the phonological representation and the graphic representation.
Grateful for any help!
Andrew
Dear Andrew et al.,
I'm interested in how others have handled this situation. The way we handled it when cataloguing the Brown, Penn, and Harvard collections of Sanskrit manuscripts was as follows:
1. We transcribe in SLP1, which is a romanization and so allows splitting conjuncts and separating vowels from the consonants on which they depend.
2. We delete a whole vowel or consonant whenever any part of it is indicated as deleted in the ms., and add the whole replacement without comment, thus <del>o</del><add>a</add>.
3. For rendering in Devanagari or another Indic script, we judged that it would not be difficult to transpose the finer tagging of phones in a romanization to whole akzaras in an Indic script; transposed, one would get <del>ro</del><add>ra</add>. [Since ṟ (r with underscore) does not represent a Sanskrit sound, SLP1 does not encode it, so I am altering the example.]
4. For features between the members of a conjunct:
<s part='I'>...g</s>
<gap reason='design'>
<desc>string-hole</desc>
</gap>
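The transposition described in point 3 could be sketched roughly as follows. This is only an illustration, not the procedure actually used for the catalogue: the function name `widen_to_akshara`, the regular expression, and the simplified SLP1 character classes are all assumptions, and the sketch handles only a single vowel substitution that directly follows a consonant cluster.

```python
import re

# Simplified SLP1 inventories (an assumption for this sketch; anusvara M,
# visarga H, accents, and other signs are deliberately left out).
SLP1_CONSONANTS = "kKgGNcCjJYwWqQRtTdDnpPbBmyrlvSzshL"
SLP1_VOWELS = "aAiIuUfFxXeEoO"

# A consonant cluster immediately followed by a phone-level vowel substitution:
#   C+ <del>V</del><add>V'</add>
PHONE_LEVEL_SUBSTITUTION = re.compile(
    rf"([{SLP1_CONSONANTS}]+)"
    rf"<del>([{SLP1_VOWELS}])</del>"
    rf"<add>([{SLP1_VOWELS}])</add>"
)


def widen_to_akshara(text: str) -> str:
    """Move the preceding consonant cluster inside both <del> and <add>,
    so that each element wraps a whole akshara rather than a bare vowel."""
    return PHONE_LEVEL_SUBSTITUTION.sub(r"<del>\1\2</del><add>\1\3</add>", text)


if __name__ == "__main__":
    # Phone-level tagging in the romanization ...
    phone_level = "dar<del>o</del><add>a</add>"
    # ... widened so that an Indic-script renderer sees whole aksharas.
    print(widen_to_akshara(phone_level))  # da<del>ro</del><add>ra</add>
```

Because the pattern is greedy over consonants, a conjunct such as "rgg" is carried into both elements as a unit. Attributes on <del> or <add> (for example rend="cross") are not matched by this pattern and would need a real XML parser rather than a regular expression.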
indic-texts mailing list indic-texts@lists.tei-c.org http://lists.lists.tei-c.org/mailman/listinfo/indic-texts
participants (2): Andrew Ollett, Peter Scharf