Re: [Indic-texts] some problems with encoding parts of akṣaras in manuscript transcriptions

2 Apr 2018

Hi,

It is nice to read your questions, Andrew, and the reflections on them!

It seems to me that these problems have a deeper issue in the
background, namely (even though "we all know how an abugida works"): are
we reflecting how Devanagari and similar scripts function when we encode
them?
Or do we rather use a way of thinking about writing that is quite
familiar to us (our Roman types) when we speak of "delete", "add", of
virama as a sign that deletes a, or of vowels as distinct letters, etc.?

The smallest unit of transcription is for me (as for Patrick) the
akshara. I try to think of what a copyist did in terms of "changing x to
y by doing z": he changed "ro" to "ra" by crossing out the o-element.

As regards flying signs on top of aksaras, or elsewhere, that is,
signs displaced from their expected position: the cases I am familiar
with are due to the flow of writing or other aesthetic reasons, and
reflect the range of freedom the copyist had in reproducing that kind of
akshara. In other words, they represent a context-based way of writing
that akshara; so, as they do not produce ambiguities, I would only
illustrate the phenomenon in the description of the ms.
Handwriting can be quite free, but there are cases in print too. In 
early prints in Devanagari, there are cases of "hn." printed as "n.h", 
which suggests a practical workaround for a problematic conjunct when
the types were made.
In Tibetan prints, I have repeatedly seen an "e" on top of the next 
letter when there is not enough space for such an "e" on top of the 
letter to which it should be assigned.

Best wishes, cristina

Am 02.04.2018 um 13:47 schrieb Camillo Formigatti:
...
(Sorry, I sent the first one by mistake!)
Dear Andrew, Peter, and Charlie,
Finally somebody started a thread!
If you don't mind my being very direct (I'm smiling, I assure you, because I totally share your doubts about how to mark up such cases), similar questions will keep coming up unless we start from the basic problem of how we look at manuscripts. We ought first to agree on the way we describe manuscripts, and only then ask ourselves how to mark them up. I believe two questions ought to be asked first (Peter already partly raised the first one in his reply): why mark up such phenomena? And, I would add: to what degree of exactness?
As to the first question, there are obvious answers, such as: if I'm preparing a diplomatic transcription or a critical edition, I have to do it. Then how? All the solutions proposed entail the use of the elements <del></del> and <add></add>, as well as <subst></subst> (as in Charlie's example; I guess he is partly adopting our Cambridge standards), thus with the basic structure <subst><del></del><add></add></subst>. I totally agree with this approach, but...
Now let me answer the first possible objection: in Andrew's example, is the scribe really adding something? Sure he is (let's not get politically correct, we know it was almost certainly a man, even if there is no colophon in the manuscript). He is not materially adding anything on the folio, sure, but what are we marking up? Let's say he wanted to substitute o with ā, then he would have added a mātrā, right? As we all know, the functioning of an abugida writing system rests on the principle of an inherent vowel. The point here is "as we all know." We are marking up transcriptions of manuscripts in scripts whose functioning we know, so there is no need to be more catholic than the pope. Also, to a certain extent the scribe was substituting something with something else, by deleting an o and adding an a (or in other cases, a mātrā for any other vowel). I think this is an elegant way of solving the "implicit" problem without using any further element or attribute.
The answer to my second question might also provide an answer to Andrew's second conundrum. In our catalogue we adopted two attributes for deletions and additions: for <add></add> we used @place to mark where the addition was made (using the standard values provided in the ENRICH schema), and for <del></del> we used @type (values: yellow_paste, expuncted, erased, palimpsest, cancelled). I don't know if we can agree about the number or typology of attributes to be used, but this is not so important, as we will always have slightly different approaches, for, as Peter pointed out, we usually have different aims when describing manuscripts.
Thinking of the approach I have described above, the "we all know how an abugida works" argument might also solve the conundrum of marking up a whole akṣara or only a part. With this approach, there is no need to mark up only parts of an akṣara, as it is clear that only the mātrā was changed. (Also, there is no problem for cases of akṣaras divided by string holes: we can always nest the elements, if I understand the problem, though I'm not really sure I have.)
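A minimal sketch of what this whole-akṣara approach could look like in markup, assuming the "ro" to "ra" case mentioned elsewhere in this thread rather than Andrew's actual reading. The syllables are illustrative only; `cancelled` is taken from the @type values listed above, and leaving @place off <add> because nothing was physically written on the folio is an assumption, not a documented rule of the Cambridge catalogue:

```xml
<!-- Hypothetical example: the scribe wrote "ro" and cancelled the o-mātrā,
     so the akṣara now reads "ra" with its inherent vowel. -->
<subst>
  <del type="cancelled">ro</del>
  <!-- no @place: the "addition" is the inherent a, not a new mark on the leaf -->
  <add>ra</add>
</subst>
```

Both <del> and <add> wrap the complete akṣara here, so the reader infers from knowledge of the script that only the mātrā changed, and no extra element or attribute is needed to flag the addition as implicit.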
A belated Happy Easter to you all!
Camillo
________________________________________
From: indic-texts-bounces@lists.tei-c.org [indic-texts-bounces@lists.tei-c.org] on behalf of indic-texts-request@lists.tei-c.org [indic-texts-request@lists.tei-c.org]
Sent: Sunday, April 01, 2018 11:00 AM
To: indic-texts@lists.tei-c.org
Subject: indic-texts Digest, Vol 3, Issue 1
Today's Topics:
    1. some problems with encoding parts of akṣaras in manuscript
       transcriptions (Andrew Ollett)
    2. Re: some problems with encoding parts of akṣaras in
       manuscript transcriptions (Peter Scharf)
----------------------------------------------------------------------
Message: 1
Date: Sat, 31 Mar 2018 22:50:16 -0500
From: Andrew Ollett <andrew.ollett@gmail.com>
To: indic-texts@lists.tei-c.org
Subject: [Indic-texts] some problems with encoding parts of akṣaras in
         manuscript transcriptions
Message-ID:
         <CAANHO15y_K1BRZLqj+PauXoxcvdgusDS1Df99juUp+W7xkV-Lg@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Hi everyone,
I have two questions about encoding manuscript transcriptions that I wanted
to submit to the collective experience of this group. Both relate to the
problem of akṣaras having "parts" that canonically occur in a certain
sequence but may be changed in a manuscript.
First, the cancellation of vowel mātrās. Does anyone have a good way to
encode this in TEI? In one manuscript (see this link <https://goo.gl/uYxV3R>)
the scribe has written "yo?do?da?o?o?a" (the last letter being mostly
obliterated by a worm hole), and cancelled out the last "o" (which is
written with the sideways "3") with a small cross-mark on top. The problem
is that by cancelling out the mātrā, the scribe has changed the vowel.
Since we are transcribing in Roman transliteration, we would have to do
something like yo?do?da?o?<del rend="cross">o</del><add
type="implicit">a</add>?a, i.e., marking the addition of the vowel as
"implicit" (or something similar) in order to make clear that it's not a
new mark on the leaf. (If we were transcribing in Kannada script, we could
do ??????????<del>?</del>?, but that will surely cause rendering problems.)
Second, the "canonical" order of code-points and of transliteration for
conjuncts with initial r has the r first. In Kannada script, however, a
"flying r" is used, which occurs to the right of the other consonants in
the conjunct. Sometimes there's a feature that we want to encode *between*
the members of the conjunct, as in this example <https://goo.gl/BqNVg9>,
where a string-hole intervenes between the "gg" and the "r" of "m?rgga?".
How should we encode this? I know that some of you have used "akṣarapart"
to identify mātrās and other components of akṣaras, but I can't seem to get
around the problem of the reversed sequence of the phonological
representation and the graphic representation.
Grateful for any help!
Andrew

cristina pecchia