Re: [Indic-texts] Sanskrit names for division types

3 Nov 2018


The encoding of divs shown by Patrick in his first example is exactly what I use and what I had in mind, i.e.
<div n="1" type="Ahnika">
As for how to describe levels in a machine readable way, I had in mind the tagsDecl element in the teiHeader:
http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-tagsDecl.html
http://www.tei-c.org/release/doc/tei-p5-doc/en/html/HD.html#HD57
Note that the latter link includes the following description:
"The tagsDecl element is used to record the following information about the tagging used within a particular document:
…
any comment relating to the usage of particular elements not specified elsewhere in the header."
While the description of this element mentions specifying the usage of elements, I’m not sure whether there is an equivalent for describing the usage of attributes.
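
As a rough sketch of what such a header entry might look like, the Python below (standard library only) counts the div elements in a small sample and builds a tagsDecl with one tagUsage entry for them. The sample markup, the function name build_tags_decl, and the idea of listing the @type values in the prose content of tagUsage are assumptions for illustration. The prose content is only a comment for human readers, so this does not by itself make the attribute usage machine-readable.

```python
import xml.etree.ElementTree as ET
from collections import Counter

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical sample; a real TEI file would declare the TEI namespace.
SAMPLE = """
<body>
  <div n="1" type="Ahnika"/>
  <div n="2" type="Ahnika"/>
</body>
"""

def build_tags_decl(root):
    """Build a tagsDecl element summarising how div is used under root."""
    # Count how often each @type value occurs on div elements.
    types = Counter(div.get("type", "untyped") for div in root.iter("div"))
    tags_decl = ET.Element("tagsDecl")
    namespace = ET.SubElement(tags_decl, "namespace", {"name": TEI_NS})
    usage = ET.SubElement(
        namespace,
        "tagUsage",
        {"gi": "div", "occurs": str(sum(types.values()))},
    )
    # Free-text note on the attribute values (an assumption, not a TEI rule).
    usage.text = "Values of @type: " + ", ".join(
        f"{name} ({count})" for name, count in sorted(types.items())
    )
    return tags_decl

tags_decl = build_tags_decl(ET.fromstring(SAMPLE))
print(ET.tostring(tags_decl, encoding="unicode"))
```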
Yours,
Peter

******************************
Peter M. Scharf, President
The Sanskrit Library
scharf@sanskritlibrary.org
http://sanskritlibrary.org
******************************
...
On 2 Nov 2018, at 8:15 PM, Patrick McAllister <patrick.mcallister@oeaw.ac.at> wrote:
On Fri, Nov 02 2018, Tyler Neill wrote:
...
Agreed, Peter. To rephrase: I would be happy with whatever makes it
possible for various mid-level users (whether of XML/XSLT or of the
particular work or literary genre) to understand and manipulate the
material within a few minutes. The @n and @subtype attributes seem to be
doing the majority of that user-friendliness work here; perhaps the @type
attribute goes too far, as you say. But whether a few more or a few less, I
think the basic feedback to Patrick's question remains the same: Such
attributes need not be required by the guidelines, but they can still
serve to conveniently store annotations concerning traditional structural
information.
Thank you for your answers so far: I’m convinced now that it’s useful to
have these attributes.  What I’m not yet sure about is whether their
proposed mark up is satisfactory.  I have doubts in two areas:
1) Ease of use/accessibility
Something like this was suggested:
<div n="1" type="level1" subtype="volume">
    <div n="1" type="level2" subtype="āhnika">
    </div>
    <div n="1" type="level2" subtype="āhnika">
    </div>
</div>
Does the value of the @type represent something that’s in the book (as a
printed, material thing) or the text (as an abstract thing)?  If not,
and the attribute values “level1” or “level2” only indicate how many
div elements down the current element is, then it seems redundant. (I
think Peter also tends towards this opinion.)
Granted, with their presence it’s obvious at first sight where you are
in the document, but probably, after some exposure to xml technologies,
you’ll find it easier and more reliable to query the document itself for
this information, rather than rely on these indicators.
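
As a minimal sketch of querying the document for this information, the Python below uses only the standard library's xml.etree.ElementTree to compute each div's nesting depth. The sample markup and the function name div_depths are assumptions for illustration, and the TEI namespace is left out for brevity; a real TEI file would need namespace-qualified tag names.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample without "level1"/"level2" markers.
SAMPLE = """
<body>
  <div n="1" type="volume">
    <div n="1" type="āhnika"/>
    <div n="2" type="āhnika"/>
  </div>
</body>
"""

def div_depths(root):
    """Return (depth, type, n) for every div below root, in document order."""
    found = []

    def walk(element, depth):
        for child in element:
            if child.tag == "div":
                # A div nested in `depth` other divs sits at level depth + 1.
                found.append((depth + 1, child.get("type"), child.get("n")))
                walk(child, depth + 1)
            else:
                # Other elements do not add a division level.
                walk(child, depth)

    walk(root, 0)
    return found

for depth, div_type, n in div_depths(ET.fromstring(SAMPLE)):
    print(depth, div_type, n)
```

In XSLT or XQuery the same number is available as count(ancestor-or-self::tei:div), assuming the tei prefix is bound to the TEI namespace.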
2) @type/@subtype
More problematical, IMO, is the relation of @type/@subtype.  @subtype is
described like this:
“subtype: provides a sub-categorization of the element, if needed” (at
http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-att.typed.html)
I think the relation of these two attributes is quite literal.  The
Guidelines have examples like “sentence/declarative”,
“phrase/preposition”, “word/noun”, etc.
So, with this expectation, what the @type/@subtype we’re discussing
would tell us is that a “volume” div is a subtype of a “level1” div, and
that an “āhnika” div is a subtype of a “level2” div.  This seems to be
mixing things of different categories.  Additionally, the value of the
@type would change depending on whether the book has volumes, parts, or
only chapters.  But I suppose that “āhnika” could be used in all three,
and would thus clearly not be a subtype of a “levelX” div.
Something like this should really be sufficient:
<div n="1" type="āhnika">
</div>
(And it would solve my two problems.)
...
On Fri, Nov 2, 2018 at 7:17 AM Peter Scharf <scharf@sanskritlibrary.org>
wrote:
...
It is not really necessary to include level1, level2 in every div.  One
could describe this structure, namely, that level1 is volume, level 2 is
Ahnika, etc., in the header in machine readable form.
This is a good suggestion, too.  I don’t know how to make these things
machine-readable in TEI, but I’ve found myself using the @ana attribute
frequently to link interpretations with elements, e.g.:
<div n="1" ana="#āhnika">
</div>
And somewhere else (in the document or separately):
<interp xml:id="āhnika">Section of text manageable in a day.</interp>
This would also allow you to attach multiple interpretations:
<div n="1" ana="#āhnika #pariccheda-as-defined-by-dominik">
</div>
(For @ana, see
http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-att.global.analytic.....)
But it is not machine-readable (unless we introduce some restrictions on
the content of the tei:interp elements).
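
The pointer part of this scheme can at least be checked mechanically, even if the prose inside tei:interp cannot. The Python sketch below (standard library only) resolves every @ana pointer on a div against the xml:id values of the interp elements and reports pointers that match nothing. The sample markup, the function name resolve_ana, and the "undefined-category" pointer are assumptions for illustration, and the TEI namespace is again left out for brevity.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical sample: one defined interpretation, one dangling pointer.
SAMPLE = """
<text>
  <interp xml:id="āhnika">Section of text manageable in a day.</interp>
  <div n="1" ana="#āhnika #undefined-category"/>
</text>
"""

def resolve_ana(root):
    """Return (div n, pointer target, interp text or None) for each pointer."""
    interps = {
        el.get(XML_ID): (el.text or "").strip() for el in root.iter("interp")
    }
    report = []
    for div in root.iter("div"):
        # @ana holds whitespace-separated pointers such as "#āhnika".
        for pointer in (div.get("ana") or "").split():
            target = pointer.lstrip("#")
            report.append((div.get("n"), target, interps.get(target)))
    return report

for n, target, text in resolve_ana(ET.fromstring(SAMPLE)):
    print(n, target, text if text is not None else "UNRESOLVED")
```

This only handles pointers to interp elements in the same document; pointers into a separate, shared file would need that file loaded as well.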
About tei:interp, the TEI Guidelines say
(http://www.tei-c.org/release/doc/tei-p5-doc/en/html/AI.html#AISP):
“The same analysis may be expressed with the interp element instead of
the span element; this element provides attributes for recording an
interpretive category and its value, as well as the identity of the
interpreter, but does not itself indicate which passage of text is being
interpreted; the same interpretive structures can thus be associated
with many passages of the text.”
For SARIT, I think it would be very useful to have a central and visible
list of such tei:interp elements that one could link the @ana against,
in various texts.  Any volunteers?  Or other ideas of how to coordinate
the types/analysis across multiple texts?
--
Patrick
_______________________________________________
Indic-texts mailing list
Indic-texts@lists.tei-c.org
http://lists.lists.tei-c.org/mailman/listinfo/indic-texts