
On 23 Sept. 2018, at 10:46, Arlo Griffiths <arlo.griffiths@efeo.net> wrote:
Dear colleagues,
I am trying to encode a text from Bali built up around Sanskrit stanzas, which are followed by Old Javanese glosses. The glosses themselves are interspersed with Sanskrit elements, generally (but not always) chunks from the root text. Here’s an example from an e-text that is going to be TEI-encoded.
<KMN21s-ab> na te 'tra vimatiḥ kāryā nirviśaṅkena cetasā
<KMN21s-cd> prakāśaya mahātulaṁ mantracaryānayam param ||21||
c. Speijer notes that the reading mahātulaṁ is unmetrical.
<KMN21j> ka: hayva kita vicikitsa, NIRVIŚAṄKENA CETASĀ, ikaṅ nissandeha atah ambĕka[ka]nta, PRAKĀŚAYA MAHĀTULAṀ [msA-a17] MANTRACĀRYYANAYAM PARAṀ, at pintonakna ike, saṅ hyaṅ mantranaya mahāyāna.
How would you propose to encode a chunk like NIRVIŚAṄKENA CETASĀ? Options I have thought of (with Andrew Ollett) are:
(1) <quote type="pratīka">nirviśaṅkena cetasā</quote>
(2) <term>nirviśaṅkena cetasā</term>
In the second solution, I could also add <gloss> to the string ikaṅ nissandeha atah ambĕka[ka]nta. But I am hoping that this is not mandatory. Neither of the two solutions immediately helps if I want a mechanism that links the pratīka with the corresponding words in the mūla.
I’d suggest something like this:

<div>
  na te 'tra vimatiḥ kāryā <seg xml:id="KMN21s-b-end">nirviśaṅkena cetasā</seg>
  <mentioned sameAs="#KMN21s-b-end">nirviśaṅkena cetasā</mentioned>
</div>

(This would assume there is no construction corresponding to Skt /iti/ around the phrase. If there were, tei:quote would be better than tei:mentioned.)

On Sun, Sep 23 2018, Andrea Acri wrote:
Dear Arlo
for the DhPāt, after consultation with Andrew during the TEI workshop in Paris, I have used the following commands:
<lg xml:lang="san-Latn" n="1" met="anuṣṭubh">
  <l>acintyo niṣkalaḥ śāntaḥ,</l>
  <l>dhruva-m abyaya-m īśvaraḥ,</l>
  <l>asau sūkṣmaḥ paraḥ śāntaḥ,</l>
  <l>śivaḥ sakalaniṣkalaḥ ◆</l>
</lg>
<p>apan sira sinaṅguh <quote type="lemma">acintya</quote>, apa tar kavәnaṅ inaṅәnaṅәn, <quote type="lemma">niṣkala</quote> sira tar pā- vak, tan pavarṇa, ta(1v)n baṅ, tan aputih, tar kuniṅ, tan hirәṅ, kapila dvivarṇādi, tan hana ikā kabeh ri saṅ hyaṅ paramārtha,</p>
<quote type="pratīka"> should be fine, but will it not be unintelligible to non-Indologists?
The solution with ~tei:quote~ and ~@type="lemma"~ is what we use in some of the current SARIT texts. I think the main reason we used it is that "pratīka" is not only unclear to non-Indologists, but probably not the correct term in many cases (it’s a tricky issue, see below for my current idea). This solution was fine for our problem there: encoding existing editions, which would usually mark all actual pratīka-s and all other types of quotes from the base-text in the same way. For the conversion from printed to digital text, it would have introduced a new source for errors if we had differentiated these things further. But for a new edition that is being created digitally, a more differentiated kind of mark-up might be useful: you could then easily query your document for the different types of “quotes” that commentaries typically use. At the TEI conference in Tokyo 2 weeks ago, I presented a few cases of such quotations. Please take into account that this was meant for an audience of non-Indologists, hoping to let them understand the different types of quotes that one commonly finds in Sanskrit commentaries. 
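To illustrate the kind of query meant here, the following is only a minimal sketch, assuming Python's standard `xml.etree.ElementTree` module. The function name `quotes_by_type`, the sample snippet, and the `@type` value "pratika" are hypothetical and chosen for illustration; they are not taken from SARIT or from any existing edition.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical sample assembled from phrases quoted in this thread.
# The type value "pratika" is an assumption, not an established convention.
SAMPLE = """<div xmlns="http://www.tei-c.org/ns/1.0">
  <p>apan sira sinaṅguh <quote type="lemma">acintya</quote>,
     <quote type="lemma">niṣkala</quote> sira</p>
  <p>yac coktam <quote type="pratika">indriyair</quote> ityādi</p>
</div>"""


def quotes_by_type(xml_text):
    """Group the text of every tei:quote element by its @type value."""
    root = ET.fromstring(xml_text)
    groups = defaultdict(list)
    for quote in root.iter(f"{{{TEI_NS}}}quote"):
        # Quotes without @type are collected under the key "untyped".
        groups[quote.get("type", "untyped")].append("".join(quote.itertext()))
    return dict(groups)


if __name__ == "__main__":
    print(quotes_by_type(SAMPLE))
```

With differentiated `@type` values, a filter for one kind of quote is then a simple dictionary lookup on the result.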
The following is from my presentation notes, typos included (slides are currently here: http://rdorte.org/pma/tei2018.html):

<<< BEGIN QUOTE >>>

1.2 Types of quotations
───────────────────────

1.2.1 Simple quotes (and their context)
╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌

The first example is the simplest, and there can be no doubt that this is a quote in the fullest sense:

┌────
│ <div xmlns="http://www.tei-c.org/ns/1.0">
│   <p><gap/>etena yad uktam <persName role="opponent">udyotakareṇa</persName> <cit><quote>avācakatve śabdānāṃ pratijñāhetvorvyāghāta</quote></cit> iti tadapi pratyuktaṃ bhavati.<gap/></p>
│
│   <p>Through this [argument in the base text] also what <persName>Uddyotakara</persName> said has
│   been rejected, namely, <cit><quote>If words do not denote anything, both
│   your proposition and your reason are inconsistent.</quote></cit>.</p>
│ </div>
└────

This example is from the ca. eighth century Tattvasaṃgrahapañjikā ([TSP1]), the commentary by Kamalaśīla on the extensive work in verse by Śāntarakṣita. There are three voices that must be distinguished in this passage:

1) First, it is spoken by Kamalaśīla, the commentator.
2) Second, he is connecting an argument from the base text, the text he is commenting upon, to an objection by a non-Buddhist opponent. The base text is here not quoted but only pointed at, by saying “Through this [argument in the base text]”.
3) Third, this opponent’s text is quoted verbatim, as a sub-phrase in a statement that rejects it. In this case, we can actually verify that it is a faithful quote, since the sentence is found in Uddyotakara’s work (\cite[312.21--22]{uddyotakara97:_NV}).
The commentary here performs various functions:

1) /etena … bhavati/: connects the text commented upon and the provided quote
2) /yad uktam/: introduces quote (lit., “said”)
3) /udyotakareṇa/: original speaker of what was said
4) `<quote />': what was said
5) Its embedding in a `cit' element, here abused because there is no explicit reference to the source, unless we include the name of the speaker, makes it clear that it is not an imaginary quote (or attribution), but a claim actually upheld by someone.

All of these functions can be tied to parts or segments of the sentences under consideration. As we shall see, it is desirable that all these functions that the commentary performs should, ideally, be available for general queries run on the group of texts.

1.2.2 Quotes as references
╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌

Later in the same text, Kamalaśīla introduces verse 1061 of the text he is commenting on like this, \cite[ad 1061,1062]{TSP1}:

┌────
│ <div xmlns="http://www.tei-c.org/ns/1.0-pause-valid" n="1061">
│   <lg xml:id="ts__1061">
│     <l><seg xml:id="ts__1061__a">agobhinnaṃ</seg>
│     <seg xml:id="ts__1061__b">ca</seg>
│     yadvastu tadakṣairvyavasīyate /</l>
│     <l>pratibimbaṃ tadadhyastaṃ svasaṃvittyā'vagamyate // 1061 //</l>
│   </lg>
│
│
│   <p>yac coktam <hi rend="bold">indriyair</hi> ityādi,
│   tad asiddham iti darśayann āha—<hi rend="bold">agobhinnaṃ
│   ce</hi>tyādi. <hi rend="bold">ca</hi>kāro 'nuktārthasamuccaye.
│   </p>
│
│   <p xml:lang="en">Further, what was said with the words <quote>“By
│   sense perceptions”</quote> and so on, that is unestablished. In
│   order to show this, Śāntarakṣita said <quote>“And a thing
│   differentiated from non-cow”</quote>. ... The uninflected word
│   <quote>“and”</quote> is used in order to include the meanings of
│   exclusion not mentioned in the verse.</p>
│ </div>
└────
Listing 1: Quotes as references

This passage shows some other types of quotes than we saw before.
The commentator’s introduction of the passage from the base text, the text the commentary is written on, contains two elements (here still marked graphically by `hi') that should be categorized as `quote' elements of some sort:

1) The first is “indriyair ityādi”. This refers back to verse 939, which itself is a quote, in the base text, of a passage by an opponent.
2) The second is “agobhinnaṃ cetyādi”, which points to the beginning of the verse in which the base text answers the opponent’s claim made in verse 939.

Both of these quotations are what is properly called a “pratīka”, literally, “that which turns towards something”, and signifying “the front of something.” In this context, it refers to the beginning of the passage or verse that the commentator is referring to. What is essential about this type of quotation is that its content is usually irrelevant: it is a reference to a particular string of characters, or sequence of sounds. This is supported also by the liberties that the Tibetan translators of such texts took with these markers: they would not literally translate the content of the reference, but insert whatever words ended up at the beginning of the Tibetan translation of the passage that is referred to.

To make this function explicit, one could encode the text in several ways. Here I will discuss solutions that I have at some point considered useful and then revised with further experience. These attempts are therefore probably not ideal.
┌────
│ <div xmlns="http://www.tei-c.org/ns/1.0-pause-valid" n="1061">
│   <lg xml:id="ts__1061">
│     <l><seg xml:id="ts__1061__a">agobhinnaṃ</seg>
│     <seg xml:id="ts__1061__b">ca</seg>
│     yadvastu tadakṣairvyavasīyate /</l>
│     <l>pratibimbaṃ tadadhyastaṃ svasaṃvittyā'vagamyate // 1061 //</l>
│   </lg>
│
│   <p ana="#alt1">yac coktam <quote type="lemma"
│   corresp="#ts__939">indriyair</quote> ityādi, tad asiddham iti
│   darśayann āha—<quote type="lemma"
│   corresp="#ts__1061__a">agobhinnaṃ ce</quote>tyādi.</p>
│ </div>
└────
Listing 2: Quotes as references, first attempt

The first solution is to use the `quote' tag for these, with `@type' set to `lemma' (in the sense used by text criticism, not to be confused with the linguistic value as in the `@lemma' attribute). With this markup we can separate the `quote[@type="gloss"]' and other variants quite well. The drawback is a possible abuse of the `@corresp' attribute, which does not seem to contain the semantics of `@target'.

The next suggested solution uses `ref' elements.

┌────
│ <div xmlns="http://www.tei-c.org/ns/1.0-pause-valid" n="1061">
│   <lg xml:id="ts__1061">
│     <l><seg xml:id="ts__1061__a">agobhinnaṃ</seg>
│     <seg xml:id="ts__1061__b">ca</seg>
│     yadvastu tadakṣairvyavasīyate /</l>
│     <l>pratibimbaṃ tadadhyastaṃ svasaṃvittyā'vagamyate // 1061 //</l>
│   </lg>
│
│   <p ana="#alt2">yac coktam <ref target="#ts__939"
│   type="lemma">indriyair</ref> ityādi, tad asiddham iti darśayann
│   āha—<ref target="#ts__1061__a" type="lemma">agobhinnaṃ
│   ce</ref>tyādi.</p>
│ </div>
└────
Listing 3: Quotes as references, second attempt

However, this creates problems. Semantically it is problematic, because what refers in this case is not the content of the quote. It also has the significant drawback that any query intended to catch all quote-like elements will now have to include certain `ref' elements. This would increase the complexity of queries quite significantly.
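The query problem can be made concrete with a small sketch, assuming Python's standard `xml.etree.ElementTree` module. The two snippets are shortened versions of the commentary sentence in Listings 2 and 3 (the words between the two pratīkas are left out, and the regular TEI namespace is assumed instead of the `-pause-valid` one); the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# Shortened commentary sentence, encoded as in Listing 2 (quote + @corresp).
LISTING_2 = """<p xmlns="http://www.tei-c.org/ns/1.0">yac coktam
<quote type="lemma" corresp="#ts__939">indriyair</quote> ityādi
<quote type="lemma" corresp="#ts__1061__a">agobhinnaṃ ce</quote>tyādi.</p>"""

# The same sentence, encoded as in Listing 3 (ref + @target).
LISTING_3 = """<p xmlns="http://www.tei-c.org/ns/1.0">yac coktam
<ref type="lemma" target="#ts__939">indriyair</ref> ityādi
<ref type="lemma" target="#ts__1061__a">agobhinnaṃ ce</ref>tyādi.</p>"""


def quotes_only(p):
    """A query that only knows about tei:quote."""
    return ["".join(e.itertext()) for e in p.iter(TEI + "quote")]


def quotes_and_lemma_refs(p):
    """A query that must additionally accept tei:ref[@type='lemma']."""
    hits = []
    for e in p.iter():
        is_quote = e.tag == TEI + "quote"
        is_lemma_ref = e.tag == TEI + "ref" and e.get("type") == "lemma"
        if is_quote or is_lemma_ref:
            hits.append("".join(e.itertext()))
    return hits
```

Under these assumptions, `quotes_only` finds both pratīkas in the Listing 2 encoding but nothing in the Listing 3 encoding; only the extended query, with its extra condition on `ref`, recovers them there.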
One could however combine these solutions, and treat the quote *as a whole* as a referring string:

┌────
│ <div xmlns="http://www.tei-c.org/ns/1.0-pause-valid" n="1061">
│   <lg xml:id="ts__1061">
│     <l><seg xml:id="ts__1061__a">agobhinnaṃ</seg>
│     <seg xml:id="ts__1061__b">ca</seg>
│     yadvastu tadakṣairvyavasīyate /</l>
│     <l>pratibimbaṃ tadadhyastaṃ svasaṃvittyā'vagamyate // 1061 //</l>
│   </lg>
│
│   <p ana="#alt2">yac coktam <ref target="#ts__939"><quote
│   type="lemma">indriyair</quote> ityādi</ref>, tad asiddham iti
│   darśayann āha—<ref target="#ts__1061__a"><quote
│   type="lemma">agobhinnaṃ ce</quote>tyādi</ref>.</p>
│ </div>
└────
Listing 4: Quotes as references, proposed solution

This seems useful: we could pick out the `quote' elements along with all others, and still easily differentiate them by examining whether they are embedded in a reference to the base-text.

1.2.3 Quotes of individual words/phrases for elucidation
╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌

In the second example sentence that follows later in the commentary, Kamalaśīla takes up one word from the verse, the word “and”, Skt. /ca/, and says for which purpose it was used in the verse. This is not the same kind of referring quotation that we saw above, i.e., the purpose is to comment on the significance or meaning of the term, and not to indicate a particular point in the base text.

┌────
│ <div xmlns="http://www.tei-c.org/ns/1.0" n="1061">
│   <lg xml:id="ts__1061">
│     <l><seg xml:id="ts__1061__a">agobhinnaṃ</seg>
│     <seg xml:id="ts__1061__b">ca</seg>
│     yadvastu tadakṣairvyavasīyate /</l>
│     <l>pratibimbaṃ tadadhyastaṃ svasaṃvittyā'vagamyate // 1061 //</l>
│   </lg>
│
│   <p ana="#alt2"><mentioned sameAs="#ts__1061__b">ca</mentioned>kāro
│   <gloss target="#ts__1061__b">'nuktārthasamuccaye</gloss>.
│   </p>
│ </div>
└────
Listing 5: Quotes for explanation

To a certain extent, this has similar problems as the simple `ref' solution just discussed, in terms of query complexity: any query for quotes in general will have to take the variation introduced by `mentioned' into account, and become more complicated because of that. However, `mentioned' is a `quoteLike' element, and as such should actually be considered in well-constructed queries for quotes.

The primary function of repeating the word that is to be explained, /ca/, is not to refer to the text, but to say something about the meaning or content of that term. Semantically, this solution seems to fit quite well. Both elements, `mentioned' and `gloss', are, in any case, here loosely tied to the base text, and not to each other, because there are variant forms where there is either no clearly identifiable `gloss' or the term which could be `mentioned' is not repeated in the text.

<<< END QUOTE >>>

The upshot of the whole thing is that it would be easy when editing the text to differentiate these basic types of quotes/mentionings/references. And (for my work at least) it would be helpful if one could filter the different kinds of quotes, e.g., for cross-checking the edition of the verses, or for discovering patterns in the commentator’s quotation habits.

I’d be happy to have some feedback on these suggestions, especially on whether @corresp or @sameAs seems better for linking. I use both above, because I couldn’t make up my mind, but currently think that @sameAs is more appropriate: unlike @corresp, it can only contain one target, and the explanations that you want to link would always be keyed to exactly one word/phrase of the base-text.

With best wishes,

--
Patrick McAllister
long-term email: pma@rdorte.org