Re: [tei-council] TEI conformance issues

21 Mar 2017

I have no excuse: I read Lou's first email, thought: "hmm, interesting" and
marked it unread to go back to it at a later point, and now it's been three
days. So with sufficient guilt I now start reading from the top.

Generally, I've always been thinking of TEI conformance as Martin has:
validity against TEI all. I think I understand what the "customized subset"
bubble in the diagram means: it's not just a subset, but a customization of
TEI (by means of extension and restriction) that is still conformant (valid
against TEI all) - this, to my mind, is the ideal use of TEI. So I think it
is most useful to define TEI conformance as validity against TEI all
(well-formedness can be implicit), and it provides an easily accessible
benchmark for any given customization.

It's also true that there are some things that the TEI requires or
recommends but cannot be checked by a machine... This is not a problem
unique to the TEI: I can use <h1> in HTML to tag measurements and it would
technically be not HTML any longer. Let's forget for a minute that not many
people care about semantic HTML, but basically using <h1> for marking up
measurements is just *bad* HTML, not invalid HTML. I think it's reasonable
to just have bad TEI. Perhaps, given our format's complexity, it is too
easy to create bad TEI, or at least there are many more pitfalls. Not sure
how to address this besides training and user support.

Finally, let's consider extensions to the TEI. If extensions are in their
own namespace, should we consider whatever is in the TEI namespace to be
conformant, if it validates against TEI all? Probably. I guess TEI + SVG
fits in this category, so why not TEI + MyCrazyNewElement? I think Lou also
brings up the issue of translating element names and creating new elements
that map to existing ones (such as <opus>Op. 9</opus> for <seg
type="opus">Op. 9</seg>). These extensions, which are basically semantic
sugar, could be called TEI conformable? That is, they can be made conformant
"after application of any canonicalization algorithm specified by its
associated documentation", as Lou said.
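
To make the "conformable" idea concrete, here is a minimal sketch of what
such a canonicalization step might look like. It assumes the sugar elements
live in their own namespace; the namespace URI and the SUGAR mapping table
are made up for illustration and are not anything the Guidelines specify.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
# Hypothetical extension namespace, invented for this example.
EXT_NS = "http://example.org/ns/my-extension"

# Hypothetical mapping: sugar element name -> (TEI element name, @type value).
SUGAR = {"opus": ("seg", "opus")}


def canonicalize(root):
    """Rewrite known sugar elements into their TEI equivalents, in place."""
    prefix = "{" + EXT_NS + "}"
    for el in root.iter():
        if not isinstance(el.tag, str) or not el.tag.startswith(prefix):
            continue
        local = el.tag[len(prefix):]
        if local in SUGAR:
            tei_name, type_value = SUGAR[local]
            el.tag = "{%s}%s" % (TEI_NS, tei_name)
            el.set("type", type_value)
    return root


doc = ET.fromstring(
    '<p xmlns="http://www.tei-c.org/ns/1.0" '
    'xmlns:my="http://example.org/ns/my-extension">'
    "Nocturne, <my:opus>Op. 9</my:opus></p>"
)
canonicalize(doc)
seg = doc.find("{%s}seg" % TEI_NS)
```

After this step the document contains only TEI-namespace elements, so it
could then be checked against TEI all like any other file. Elements in the
extension namespace that are not in the mapping are left untouched.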

To conclude, a note on Elisa's quote of Drucker's piece: "that to apply
markup at all was imposing an authority of our flawed/questionable
interpretation as if it were part of the text." In my (little) teaching, I
always make it clear that marking up text is a form of modelling (a la
McCarty) and therefore interpretational and instrumental by definition. It
is a means to an end; it formalizes someone's understanding of a text's
components that that someone wants to make explicit to a machine. I think
this is obvious to most TEI users and I doubt anyone thinks they're making
their interpretation part of the text (or that their interpretation is
questionable/flawed, but that's another rant).

The Buzzetti quote, which claims that "inserting tags into the plane of
texts generates confusion and conflict (unresolvable) between the text 'as
a representation at the level of discourse...and at the level of
reference'", again misunderstands that encoding is a modelling tool, but
more worryingly implies that there can be a markup-free digital
representation of text, probably plain text. That is a fallacy as there is
no electronic text without markup (markup does not need angle brackets).
For what I think is a good rebuttal of this, see Pierazzo's work from the
past... boh, 10 years? :)  In my latest class I had students try and
transcribe a manuscript page in plain text independently and all of them
came up with all sorts of different conventions to deal with deletions,
things they couldn't read off the page, etc. When I introduced TEI, doing
all of that suddenly seemed easy and they were all speaking the same
language.

Raff

On Mon, Mar 20, 2017 at 6:28 PM, Elisa Beshero-Bondar <ebb8@pitt.edu> wrote:
...
A quick correction: The piece by Drucker was not that chapter (Graphesis
and Code), but it was the one where she addresses a problem of "Mathesis"
in the emphasis on quantitative analysis and hierarchical structures. The
piece by her that my literary theory capstone students read a year ago was
an excerpt titled "Digital Humanities and Electronic Texts" from her book
on SpecLab, and reprinted in Richard Lane's Global Literary Theory. It's
not alone, of course.  A little context: She was giving a good overview of
the appeal of "humanities computing" from the era of Web 1.0, and I liked
her explanation of the allure of "mathesis" in humanities--and the problems
of applying quantitative analysis to markup that of course is a matter of
our interpretation to apply. But I really bristled at a passage where she
claimed that the use of XML over texts was basically now understood to be
flawed in its fundamental application to texts--that to apply markup at all
was imposing an authority of our flawed/questionable interpretation as if
it were part of the text. And furthermore in a long footnote, she said the
discussion of this was now pretty much closed after Dino Buzzetti's paper
"Text Representation and Textual Models" (a conference proceedings paper
published on iath.virginia.edu), which asserted that inserting tags into the
plane of texts generates confusion and conflict (unresolvable) between the
text "as a representation at the level of discourse...and at the level of
reference." She wrote, "Putting content markers into the plane of discourse
(a tag that identifies the semantic value of a text relies on reference,
even though it is put directly into the character string) _as if_ they are
marking the plane of reference is a flawed practice. Markup, in its very
basis, embodies a contradiction. It collapses two distinct orders of
linguistic operation in a confused and messy way."  Well, there in a
nutshell seems to be a pretty sharp dismissal of XML as a flawed model.
(Let alone the TEI!)
Is this relevant? I think so, given the way humanities scholars respond to
notions of authority and hierarchy, and the *language* of conformance.
Can we reclaim the concepts of conformance and compliance, to amplify their
constructive connotations in the post-post-post cultures in which we work?
Elisa
On Mon, Mar 20, 2017 at 6:12 PM, Elisa Beshero-Bondar <ebbondar@gmail.com>
wrote:
...
I protest—I am very interested in anything semantic to do with how we
teach “good”/“expected”/“sustainable”/“compliant”/“conformant”
applications of the TEI! I am also up to my neck in university semester
Everything, and want time to think about it, after a good night’s sleep.
A few weeks ago, overwhelmed with my burgeoning e-mail inbox, I ventured a
post here to say the issues being discussed here seem momentous enough to
call for some kind of Symposium or dedicated Think Tank. These in
particular (what *are* the three terms, or do we not know them yet and are
they just “areas”?) related to the ranges of meaning of “conformance” seem
especially important for serious thinking and debate. I think we should get
the debate started here and try to define its terms, but I would also urge
that we open this up to a broader venue for discussion: find a good venue to
invite more members of the community to participate.
That said, here is my take on the words “conformance” and “compliance”:
They connote pressure to conform, of course, and perhaps they support the
unpopular notion of “The TEI as thought police” imposing particular
conceptions of hierarchy. Most of us here, of course, don’t subscribe to
that view, and conformance has a positive connotation associated with
*compatibility* and (perhaps even?) interoperability (though I heard a
plenary speaker in Vienna say she hadn’t seen much evidence of
interoperable TEI projects because as she sees it, we encourage too much
adaptation). I don’t agree because, well, I’ve seen the work that Georg
Vogeler has been doing with associating TEI with RDF, and I’ve also seen
lots of evidence from within the TEI eXist-db community of interest in
compatibility with standard TEI elements…hey, there’s another term:
“standard.”
I feel as if we may need to do some cultural(!) work to reclaim these
terms, especially as I am, in the course of teaching a literary theory
course, encountering negative characterizations of the text and information
modeling we do right and left from those who critique XML markup on the
grounds of authoritarian hierarchy. See Johanna Drucker’s chapter,
“Graphesis and Code” from her book on the SpecLab project, or a quick
summary here:
http://oxfordindex.oup.com/view/10.7208/chicago/9780226165097.003.0008 .
I’ve heard quite a lot of such theoretically hyped critiques of markup and
would like to find a way to answer them (although I believe the best answer
is the steady and quiet one: build projects that matter and learn from and
with the TEI as an interactive consortium that thinks together and sets
standards of practice).
Alas, I may not be coherent right now as I’m operating on about 4 hours of
sleep. But I did want to chime in because Lou seemed to think Council
members aren’t participating enough…I am sorry for my lack of
participation, but this is what I am thinking right now on the subject.
Elisa
--
Elisa Beshero-Bondar, PhD
Director, Center for the Digital Text | Associate Professor of English
University of Pittsburgh at Greensburg | Humanities Division
150 Finoli Drive
Greensburg, PA  15601  USA
E-mail: ebb8@pitt.edu <ebb8@pitt.edu>
Development site: http://newtfire.org
On Mar 20, 2017, at 4:58 PM, Lou Burnard <lou.burnard@retired.ox.ac.uk>
wrote:
Apologies for being obscure. I had in mind the two categories of which
Martin speaks, and a third category of "everything which doesn't come under
the other two categories".
As to your other question, at the start of this thread I did say "It
seemed to me it might be helpful to establish whether the Council agrees
about what the notion of conformance *ought to mean* before trying to make
sure that the text of the Guidelines express it."
In the background however, I have been reading the text of chapter 23 and
identifying where it needs (in my view) to be brought up to date or made
less vague. I was hoping to get a consensus on the rather fundamental issue
of whether "conformance" is something we all understand in the same way
before banging on about those issues though. But so far I am a bit
disappointed in Council's apparent lack of interest in either topic. Given
that at least three current members of Council regularly teach courses on
how to use ODD and how to modify the Guidelines, this seems, um, a bit odd.
On 20/03/17 19:56, C. M. Sperberg-McQueen wrote:
You are always so good at making me feel slow-witted.
What are the three categories of which you speak?  The two that Martin
identifies and … ? I appreciate that the labeling is a bit tricky, but if
you can’t identify them, I don’t know whether I agree or not.
(Actually, that’s not true.  If you *don’t* identify them, I don’t know
whether I agree or not.  If you *can’t* identify them, I definitely don’t
agree.)
Michael
p.s. And a question of my own:  several of the issues I raised last month on
GitHub amount to asking what various terms in chapters 22 and/or 23 of P5
mean, and what various passages which use those terms mean.
In this email thread so far, the discussion seems to be focusing not on how
to interpret the text of P5, but on how the TEI and/or the Guidelines should
define concepts like conformance.  Should one infer from that that those who
have contributed to the discussion so far don’t think the current text of P5
is worth expounding and have moved on from “How shall we interpret these
passages in P5?” to “What shall we put in their place?”
On Mar 20, 2017, at 10:58 AM, Lou Burnard <lou.burnard@retired.ox.ac.uk>
wrote:
So, are we all agreed that there are not two but three usefully distinct
categories related to the vague concept of "using TEI correctly", whatever
they may be called?
On 17/03/17 18:59, Hugh Cayless wrote:
I've been chewed out for it recently, but I agree with Martin 🙂.
Sent from my phone.
On Mar 17, 2017, at 14:36, Martin Holmes <mholmes@uvic.ca> wrote:
Hi Lou,
Yes, my position is that conformance is a useful concept only if it can be
assessed programmatically (presumably through validation); but there is
another aspect of "using TEI correctly" which needs another name.
We could have "syntactic conformance" (validity) versus "semantic
conformance".
We could have "conformance" versus "compliance".
We could have "TEI-valid" versus "TEI-conformant".
Cheers,
Martin
On 2017-03-17 10:59 AM, Lou Burnard wrote:
Hi Martin and thanks for your feedback.
You're right about the customized subset blob being in the wrong place.
The problem is that a customized subset may or may not overlap a TEI
subset. Maybe it should float above it. Or something.
Thanks also for reminding me to say something about the @source: that's
a real omission.
Would it be fair to summarize your position as saying that TEI
conformance can only be assessed automatically? And that any
modification which results in something other than a pure TEI subset is
ipso facto non-conformant? It's a reasonable (ish) position: I just
wanted to be sure.
Anyone else want to put their head above the parapet?
On 17/03/17 15:40, Martin Holmes wrote:
I'm still puzzled by this diagram. Your description of a "customized
subset" suggests that it may validate documents which are not valid
against tei_all, whereas the blob for it falls squarely within the TEI
subset box.
As far as I'm concerned, I would make it even simpler: any
customization that validates files which are not valid against tei_all
is an extended subset. It extends the TEI by providing options
(perhaps new elements or attributes, perhaps just new content models
for existing elements) which were not available before.
I think the definition of TEI conformance should be that all files
valid against a schema generated from the customization also validate
against a tei_all schema generated from the same P5 subset used to
create the first schema (so the TEI versions used must be the same).
Anything else is an extension.
This is a purely mechanical test, of course. It doesn't check whether
you're using <title> to tag measurements or <name> to tag bold text. I
think adherence to the spirit of the prose definitions and
descriptions needs a different word to describe it (and a human to
judge it).
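
A rough sketch of the mechanical test described above, in Python. The
"schemas" here are toy content models (element name mapped to allowed child
names) invented for illustration; they are not the real tei_all, and
is_valid is a stand-in for a real RELAX NG validator. A finite set of
samples can only turn up counterexamples, never prove that a customization
is a pure subset.

```python
import xml.etree.ElementTree as ET

# Toy stand-ins for real schemas: element name -> allowed child element names.
# These content models are invented and are NOT the real tei_all.
TOY_TEI_ALL = {
    "p": {"seg", "title", "name"},
    "seg": set(),
    "title": set(),
    "name": set(),
}
TOY_SUBSET = {"p": {"seg"}, "seg": set()}
TOY_EXTENDED = {"p": {"seg", "opus"}, "seg": set(), "opus": set()}


def is_valid(xml_text, schema):
    """True if every element is declared and has only allowed children."""
    root = ET.fromstring(xml_text)
    for el in root.iter():
        if el.tag not in schema:
            return False
        if any(child.tag not in schema[el.tag] for child in el):
            return False
    return True


def counterexamples(samples, custom, tei_all):
    """Samples valid against the customization but not against tei_all."""
    return [
        s for s in samples
        if is_valid(s, custom) and not is_valid(s, tei_all)
    ]


SAMPLES = [
    "<p><seg>Op. 9</seg></p>",
    "<p><opus>Op. 9</opus></p>",
    "<p><title>Nocturnes</title></p>",
]

subset_failures = counterexamples(SAMPLES, TOY_SUBSET, TOY_TEI_ALL)
extended_failures = counterexamples(SAMPLES, TOY_EXTENDED, TOY_TEI_ALL)
```

With these toy models the restricted customization yields no
counterexamples, while the one that adds <opus> yields the second sample,
which is what marks it as an extension. Nothing here looks at whether an
element is being used in the spirit of its prose definition.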
Cheers,
Martin
On 2017-03-17 06:24 AM, Lou Burnard wrote:
I've been thinking about what the Guidelines  say about conformance in
chapter 23, following Michael's spate of tickets and the subsequent
debate last month. It seemed to me it might be helpful to establish
whether the Council agrees about what the notion of conformance *ought
to mean* before trying to make sure that the text of the Guidelines
express it. So I have prepared a little (really little!) document for
you to read and disagree or (hopefully) not with. All comments welcomed.
The document is at http://lb42.github.io/W/conformance.html (there was
an earlier version on my foxglove blog, but now that I've got my
ceteicean foo back I'll be maintaining this document on github instead)
--
tei-council mailing list
tei-council@lists.tei-c.org
http://lists.lists.tei-c.org/mailman/listinfo/tei-council
PLEASE NOTE: postings to this list are publicly archived
********************************************
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
cmsmcq@blackmesatech.com
http://www.blackmesatech.com
********************************************

Raffaele Viglianti