on pādas (again)

7 Jun 2018

Dear colleagues,

I am in the midst of a workshop in which we are attempting to encode texts
in Old Javanese in TEI format, and the issue of encoding pādas has (once
again) reared its head. We discussed this issue at length in the context of
the SARIT project, and we came to the conclusion that <l> should be used
for a pair of pādas, and the boundary between even and odd pādas should be
represented by the <caesura/> element. Hence the following vasantatilaka
verse:

        <lg n="3" met="vasantatilaka">
          <l>vvantən mañumbana puḍak ginuritnya pārtha<caesura/>
             ndān susvasusvani kinolnya hanan liniṅliṅ</l>
          <l>rakryan vədinta tan akun ləvu paṅhavista<caesura/>
             heman kitābapa niragraha māsku liṅnya</l>
        </lg>

This is somewhat contrary to what many people would expect, namely, that
each pāda should correspond to a single <l> element, as follows:

        <lg n="3" met="vasantatilaka">
          <l>vvantən mañumbana puḍak ginuritnya pārtha</l>
          <l>ndān susvasusvani kinolnya hanan liniṅliṅ</l>
          <l>rakryan vədinta tan akun ləvu paṅhavista</l>
          <l>heman kitābapa niragraha māsku liṅnya</l>
        </lg>

My arguments for the use of <caesura/> involved (a) the practical necessity
of encoding texts from printed editions, where the pādas are not separated
typographically in all cases, especially in shorter verse forms, and thus
(b) the requirement that <l> should mean the same thing for an anuṣṭubh
verse as for (e.g.) śakvarī verse, i.e., it should not refer to a pādayuga
in the first case, and a pāda in the second case; and (c) the frequent
occurrence, in Sanskrit, of words that span the boundaries between odd and
even pādas, and the undesirability of having structural elements like <l>
overlap with the grammatical structure of the text (at least at the level
of the word). The use of <caesura/> would be optional: it's not required
(and often isn't marked typographically in shorter verse forms), but if it
is present, the stylesheets will insert a space.
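To make the rendering behaviour concrete, here is a minimal sketch in Python of what such a stylesheet step might do. It is not the SARIT stylesheet itself: the function name render_line is hypothetical, and the sample omits the TEI namespace for brevity, as in the snippet above.

```python
import xml.etree.ElementTree as ET

# Sample verse in the <caesura/> encoding (no TEI namespace; an assumption
# made to keep the sketch short).
SAMPLE = """<lg n="3" met="vasantatilaka">
  <l>vvantən mañumbana puḍak ginuritnya pārtha<caesura/>
     ndān susvasusvani kinolnya hanan liniṅliṅ</l>
  <l>rakryan vədinta tan akun ləvu paṅhavista<caesura/>
     heman kitābapa niragraha māsku liṅnya</l>
</lg>"""


def render_line(l):
    """Return the text of one <l>, with one space at each <caesura/>."""
    out = l.text or ""
    for child in l:
        if child.tag == "caesura":
            # Drop the source line break and indentation around the
            # caesura and put a single space in its place.
            out = out.rstrip() + " " + (child.tail or "").lstrip()
        else:
            # Any other inline element contributes its text unchanged.
            out += "".join(child.itertext()) + (child.tail or "")
    # Collapse any remaining runs of whitespace.
    return " ".join(out.split())


root = ET.fromstring(SAMPLE)
for l in root.findall("l"):
    print(render_line(l))
```

An <l> without <caesura/> passes through unchanged apart from whitespace normalization, which matches the idea that the element is optional.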

But I can now think of counterarguments for all of these points, and in
some ways, it might be easier if <l> always meant a "pāda." (<caesura/> also
doesn't have a @type attribute in standard TEI, so it might be more
difficult than I expected to differentiate this "pāda-boundary caesura"
from the pāda-internal yati.) So I am asking everyone whether there are
compelling reasons you've discovered for preferring one encoding solution
over another. (Or if you have other suggestions altogether, including the
use of <seg> or other such elements.) I know that there are some features
that vary across Indic languages, such as the coincidence of metrical and
grammatical (esp. lexical) boundaries: these structures always coincide in
Old Javanese, and almost never in Kannada, so I am hoping to avoid the
problem of overlapping hierarchies completely.
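One practical consideration: where pāda boundaries do coincide with word boundaries, the <caesura/> encoding can be converted mechanically into one <l> per pāda, so a choice made now need not be final for such texts. The following Python sketch illustrates this under stated assumptions: split_at_caesura and _rstrip_end are hypothetical helper names, the sample carries no TEI namespace, and every <caesura/> directly inside <l> is treated as a pāda boundary (which would be wrong if the same element were also used for the pāda-internal yati).

```python
import copy
import xml.etree.ElementTree as ET

# Sample verse in the <caesura/> encoding (namespace omitted; an assumption).
SAMPLE = """<lg n="3" met="vasantatilaka">
  <l>vvantən mañumbana puḍak ginuritnya pārtha<caesura/>
     ndān susvasusvani kinolnya hanan liniṅliṅ</l>
  <l>rakryan vədinta tan akun ləvu paṅhavista<caesura/>
     heman kitābapa niragraha māsku liṅnya</l>
</lg>"""


def _rstrip_end(line, last):
    """Trim trailing whitespace from the final text node of a new <l>."""
    if last is not None:
        last.tail = (last.tail or "").rstrip()
    else:
        line.text = (line.text or "").rstrip()


def split_at_caesura(lg):
    """Return a new <lg> in which every <l> is cut into one <l> per pāda."""
    new_lg = ET.Element(lg.tag, dict(lg.attrib))
    for l in lg.findall("l"):
        current = ET.SubElement(new_lg, "l")
        current.text = (l.text or "").lstrip()
        last = None  # last inline child copied into `current`
        for child in l:
            if child.tag == "caesura":
                # Close the current pāda and start a new <l>.
                _rstrip_end(current, last)
                current = ET.SubElement(new_lg, "l")
                current.text = (child.tail or "").lstrip()
                last = None
            else:
                # Copy other inline markup (with its tail) as it stands.
                last = copy.deepcopy(child)
                current.append(last)
        _rstrip_end(current, last)
    return new_lg


converted = split_at_caesura(ET.fromstring(SAMPLE))
print(ET.tostring(converted, encoding="unicode"))
```

The sketch does not address the harder case: if a word spans the boundary, splitting at <caesura/> cuts that word across two <l> elements, which is exactly the overlap problem raised in point (c).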

Andrew

Andrew Ollett
