Hi,
It is nice to read your questions, Andrew, and the reflections on them! It seems to me that these problems have a deeper issue in the background: even though "we all know how an abugida works", are we reflecting how Devanagari and similar scripts function when we encode them? Or are we rather using a way of thinking about writing that is familiar to us from our Roman type when we speak of "delete" and "add", of virama as a sign that deletes a, of vowels as distinct letters, etc.?
The smallest unit of transcription is for me (as for Patrick) the akshara. I try to think of what a copyist did in terms of "changing x to y by doing z": he changed "ro" to "ra" by crossing out the o-element.
As regards flying signs on top of aksharas or elsewhere, that is, signs displaced from their expected position: the cases I am familiar with are due to the flow of writing or to aesthetic reasons, and reflect the range of freedom the copyist had in reproducing that kind of akshara. In other words, they represent a context-based way of writing that akshara; so, as they do not produce ambiguities, I would only illustrate the phenomenon in the description of the ms.
Handwriting can be quite free, but there are cases in print too. In early prints in Devanagari, there are cases of "hn." printed as "n.h", which suggests a practical solution to a problematic conjunct when the types were made. In Tibetan prints, I have repeatedly seen an "e" placed on top of the next letter when there is not enough space for it on top of the letter to which it belongs.
Best wishes,
cristina
On 02.04.2018 at 13:47, Camillo Formigatti wrote:
(Sorry, I sent the first one by mistake!)
Dear Andrew, Peter, and Charlie,
Finally somebody started a thread!
If you don't mind my being very direct in what I write (I'm smiling, I assure you, because I totally share your doubts about how to mark up such cases), similar questions will always come up if we don't start from the basic problem of how we look at manuscripts. We ought first to agree about the way we describe manuscripts, and only then can we start to ask ourselves how to mark them up. I believe two questions ought to be asked first (Peter already partly pointed out the first one in his reply): why mark up such phenomena? And, I would add: to what degree of exactness?
As to the first question, there are obvious answers, such as if I'm preparing a diplomatic transcription or a critical edition, I have to do it. Then how? All solutions proposed entail the use of the elements <del></del> and <add></add>, as well as <subst></subst> (as in Charlie's example, who I guess is partly adopting our Cambridge standards), thus with the basic structure <subst><del></del><add></add></subst>. I totally agree with this approach, but...
Now let me answer the first possible objection: in Andrew's example, is the scribe really adding something? Sure he is (let's not get politically correct, we know it was almost certainly a man, even if there is no colophon in the manuscript). He is not materially adding anything on the folio, sure, but what are we marking up? Let's say he wanted to substitute o with ā: then he would have added a mātrā, right? As we all know, the functioning of an abugida writing system rests on the principle of an inherent vowel. The point here is "as we all know." We are marking up transcriptions of manuscripts in scripts whose functioning we know, so no need to get more catholic than the pope. Also, to a certain extent the scribe was substituting something with something else, by deleting an o and adding an a (or in other cases, a mātrā for any other vowel). I think that this is an elegant way of solving the "implicit" problem without using any further element or attribute.
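As a minimal sketch of the structure described above (the consonant "k" and the wrapping <ab> are hypothetical placeholders, not taken from any manuscript in this thread), the two cases could look like this:

```xml
<!-- Hypothetical akṣara "ko" corrected to "kā":
     the o-mātrā is cancelled and an ā-mātrā is written. -->
<ab>k<subst><del>o</del><add>ā</add></subst></ab>

<!-- Hypothetical akṣara "ko" corrected to "ka":
     only the o-mātrā is cancelled; the inherent vowel is
     recorded as a plain <add>, with no extra attribute. -->
<ab>k<subst><del>o</del><add>a</add></subst></ab>
```

In the second case nothing new is written on the folio; the <add> simply records the reading that results from the cancellation.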
The answer to my second question might also provide an answer to Andrew's second conundrum. In our catalogue we adopted two attributes for deletions and additions: for <add></add> we used @place to mark where the addition was made (using the standard values provided in the ENRICH schema), and for <del></del> we used @type (values: yellow_paste, expuncted, erased, palimpsest, cancelled). I don't know if we can agree about the number or typology of attributes to be used, but this is not so important, as we will always have slightly different approaches, for as Peter pointed out, we usually have different aims when describing manuscripts.
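A sketch of how these two attributes might combine in one substitution follows. The akṣara "ko" and the <ab> wrapper are hypothetical, and place="above" is assumed to be among the standard values in the ENRICH schema (it is one of the values suggested by the TEI Guidelines); the @type value comes from the list above.

```xml
<ab>
  <!-- @type on <del> says how the o-mātrā was removed;
       @place on <add> says where the new mātrā was written
       ("above" is an assumed value, see the note above). -->
  k<subst><del type="cancelled">o</del><add place="above">ā</add></subst>
</ab>
```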
Thinking of the approach I have described above, the "we all know how an abugida works" argument might also solve the conundrum of marking up a whole akṣara or only a part. With this approach, there is no need to mark up only parts of an akṣara, as it is clear that only the mātrā was changed. (Also, akṣaras divided by string holes are no problem: we can always nest the elements, if I get the problem, though I'm not really sure I have understood it.)
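To illustrate the whole-akṣara option, here is a minimal sketch that takes a change of "ro" to "ra" as its example (the <ab> wrapper and the choice of type="cancelled" are assumptions for illustration only):

```xml
<!-- The whole akṣara is the unit of deletion and addition;
     a reader who knows the script infers that only the
     o-mātrā was struck. -->
<ab><subst><del type="cancelled">ro</del><add>ra</add></subst></ab>
```

The trade-off is that the markup no longer states which stroke was cancelled; that information rests on the reader's knowledge of the script or on the manuscript description.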
A belated Happy Easter to you all!
Camillo
________________________________________ From: indic-texts-bounces@lists.tei-c.org [indic-texts-bounces@lists.tei-c.org] on behalf of indic-texts-request@lists.tei-c.org [indic-texts-request@lists.tei-c.org] Sent: Sunday, April 01, 2018 11:00 AM To: indic-texts@lists.tei-c.org Subject: indic-texts Digest, Vol 3, Issue 1
Today's Topics:
1. some problems with encoding parts of akṣaras in manuscript transcriptions (Andrew Ollett)
2. Re: some problems with encoding parts of akṣaras in manuscript transcriptions (Peter Scharf)
----------------------------------------------------------------------
Message: 1 Date: Sat, 31 Mar 2018 22:50:16 -0500 From: Andrew Ollett
To: indic-texts@lists.tei-c.org Subject: [Indic-texts] some problems with encoding parts of akṣaras in manuscript transcriptions Message-ID: Content-Type: text/plain; charset="utf-8"
Hi everyone,
I have two questions about encoding manuscript transcriptions that I wanted to submit to the collective experience of this group. Both relate to the problem of akṣaras having "parts" that canonically occur in a certain sequence but may be changed in a manuscript.
First, the cancellation of vowel mātrās. Does anyone have a good way to encode this in TEI? In one manuscript (see this link https://goo.gl/uYxV3R) the scribe has written "yo?do?da?o?o?a" (the last letter being mostly obliterated by a worm hole), and cancelled out the last "o" (which is written with the sideways "3") with a small cross-mark on top. The problem is that by cancelling out the mātrā, the scribe has changed the vowel. Since we are transcribing in Roman transliteration, we would have to do something like yo?do?da?o?<del rend="cross">o</del><add type="implicit">a</add>?a, i.e., marking the addition of the vowel as "implicit" (or something similar) in order to make clear that it's not a new mark on the leaf. (If we were transcribing in Kannada script, we could do ??????????<del>?</del>?, but that will surely cause rendering problems.)
Second, the "canonical" order of code-points and of transliteration for conjuncts with initial r has the r first. In Kannada script, however, a "flying r" is used, which occurs to the right of the other consonants in the conjunct. Sometimes there's a feature that we want to encode *between* the members of the conjunct, as in this example https://goo.gl/BqNVg9, where a string-hole intervenes between the "gg" and the "r" of "m?rgga?". How should we encode this? I know that some of you have used "ak?arapart" to identify mātrās and other components of akṣaras, but I can't seem to get around the problem of the reversed sequence of the phonological representation and the graphic representation.
Grateful for any help!
Andrew