Dear all,

I was delighted, though not entirely surprised, to see that many of you had grappled with similar issues. I have briefly sketched out the problems and written up some of the solutions you suggested here:

https://wiki.tei-c.org/index.php/SIG:IndicTexts

(I found Paddy's example from early Bengali script extremely useful, but I didn't presume to add it to the wiki without asking.)

Transliteration promises to be a persistent issue: many encoding strategies just don't make sense if we are either inputting our texts in Indic scripts or providing for output in Indic scripts. It's not clear to me at this point how worried I should be about this: in our project, the only reason for offering a Kannada-script version of the manuscript transcriptions is "why not?". But if we did need a principled approach, the use of wrappers like <c> (or <g>?) might help.

I also tried looking at the ENRICH guidelines after Camillo mentioned them. I couldn't locate a schema file, and in the online documentation (there are lots of broken links) I didn't see any specification of attribute values for @type (in <del>) or @place (in <add>). But I tried to stick to what Dániel and Camillo recommended.

Thanks, everyone, and please feel free to suggest additions or modifications to the TEI Wiki page (anyone with a TEI account can also edit it directly).

Andrew

2018-04-04 4:06 GMT-05:00 Balogh Dániel <danbalogh@gmail.com>:
Hello all, and let me add my thanks to Andrew for starting this thread. I've read your opinions with interest and have little new to add, but here are my thoughts anyway.

My approach is based on encoding texts in Romanised form (whether IAST or SLP doesn't matter). My basic feeling is that marking up every little feature may not be worth the trouble. I believe this is what Camillo has been suggesting and what Peter has shown in his example. So reflecting changes on the level of phonemes should be fine, and a change of one vowel to another can be marked up simply as:

    <subst><del>o</del><add>ā</add></subst>

The deletion and the addition could be qualified with attributes as in Andrew's original post. I believe the generic solution (recommended in EpiDoc) would serve well:

    <subst><del rend="corrected">o</del><add place="overstrike">ā</add></subst>

Exactly how this change is implemented graphically in the written specimen is, as Camillo says, something that can be left to the reader's knowledge of the writing system in question; or, in the rare cases where we as editors think it will not be obvious to anyone who cares, described in a comment for human readers only.

The same would work for the deletion of vowel mātrās, i.e. correction of another vowel to "a". Or, if the a is described as implicit for the sake of precision, I would still suggest using the @place attribute with that value, not @type.
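As a rough illustration of why phoneme-level markup is enough for downstream processing, here is a minimal Python sketch that recovers both the original and the corrected reading from a <subst>. It is not taken from anyone's actual workflow: the <w> wrapper, the sample word and the function name `reading` are assumptions made for the example, and the TEI namespace is omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: "devo" corrected by the scribe to "devā".
# The <w> wrapper and the missing TEI namespace are simplifications.
FRAGMENT = (
    '<w>dev<subst><del rend="corrected">o</del>'
    '<add place="overstrike">ā</add></subst></w>'
)


def reading(elem, drop):
    """Return the text of elem, skipping every descendant whose tag is `drop`."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag != drop:
            parts.append(reading(child, drop))
        # Text following a child belongs to the parent, so it is always kept.
        parts.append(child.tail or "")
    return "".join(parts)


word = ET.fromstring(FRAGMENT)
before = reading(word, drop="add")  # the text as first written
after = reading(word, drop="del")   # the text after the correction
print(before, "->", after)
```

Nothing here depends on how the correction looks on the page, which is the point: the attributes on <del> and <add> carry that information for whoever needs it.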

If it is desired that the encoded text can be rendered in an Indic script, we must keep in mind that this is mainly for the sake of modern readers who are more familiar with those scripts than Romanisation. In most cases, rendering in a Devanagari or Kannada or whatever font will not be a 100% accurate representation of the way complex akṣaras are constituted in the MS. So, in my mind, this is a display issue that needs to be dealt with in the XSLT that produces human-readable output from your markup. Wrapping the entire akṣara in a <c> element may make the transformation a lot easier. This is similar to what Charles is doing with <subst> but, I believe, entirely canonical. Using <c> to wrap akṣaras may also be an idea to consider for Andrew's second problem, though of course it doesn't solve the problem of the floating r.
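To make the display step concrete, here is a minimal sketch of how <c>-wrapped akṣaras could be turned into Devanagari. It is written in Python rather than XSLT purely for illustration, and everything in it is an assumption: the mapping tables cover only a handful of characters, the function names and sample words are hypothetical, and it assumes single-letter consonants and precomposed (NFC) IAST vowels. Independent vowels and digraphs such as "dh" are not handled.

```python
import xml.etree.ElementTree as ET

# Deliberately tiny, illustrative tables (not a full IAST mapping).
CONSONANTS = {
    "k": "\u0915",  # ka
    "d": "\u0926",  # da
    "m": "\u092e",  # ma
    "r": "\u0930",  # ra
    "v": "\u0935",  # va
}
VOWEL_SIGNS = {
    "a": "",        # inherent vowel: no sign written
    "ā": "\u093e",  # vowel sign aa
    "i": "\u093f",  # vowel sign i
    "e": "\u0947",  # vowel sign e
    "o": "\u094b",  # vowel sign o
}
VIRAMA = "\u094d"


def aksara_to_devanagari(aksara):
    """Convert one Romanised akṣara (consonant cluster plus optional vowel)."""
    if not aksara:
        return ""
    if aksara[-1] in VOWEL_SIGNS:
        cluster, sign = aksara[:-1], VOWEL_SIGNS[aksara[-1]]
    else:
        # No vowel at the end: close the cluster with a virama.
        cluster, sign = aksara, VIRAMA
    # Consonants inside a cluster are joined with virama so that the font
    # can form the conjunct.
    return VIRAMA.join(CONSONANTS[ch] for ch in cluster) + sign


def render_word(xml_fragment):
    """Render every <c> in the fragment, in document order."""
    word = ET.fromstring(xml_fragment)
    return "".join(aksara_to_devanagari(c.text or "") for c in word.iter("c"))


karma = render_word("<w><c>ka</c><c>rma</c></w>")
print(karma)
```

Because each <c> already delimits one akṣara, the conversion never has to guess where a cluster begins or ends; without the wrappers that segmentation would have to be reconstructed from the Romanised string.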

All the best,
Dan

_______________________________________________
indic-texts mailing list
indic-texts@lists.tei-c.org
http://lists.lists.tei-c.org/mailman/listinfo/indic-texts