Re: [tei-council] procmod discuss

26 Feb 2016

Hello, here are some thoughts about the processing model that I hope will
be helpful to your discussion tomorrow. Please keep in mind that I'm
playing devil's advocate here a bit.

*Too long; didn't read:* my recommendation would be to keep the processing
model in its own namespace as a TEI extension.

Documents and tools considered:
Guidelines draft
<http://htmlpreview.github.io/?https://github.com/TEIC/TEI-Simple/blob/master/tei-pm.html>
Toolbox documentation and demo
<http://showcases.exist-db.org/exist/apps/tei-simple/doc/documentation.xml?odd=documentation.odd>
TEI Simple guidelines
<http://htmlpreview.github.io/?https://github.com/TEIC/TEI-Simple/blob/master/teisimple.odd.html>
(the part relevant to the processing model)
Lyon presentation slides
<http://tei.it.ox.ac.uk/Talks/2015-08-maynooth/lyonPM.html>

Now for the long part of the email.

Some overall thoughts:

"Using the processing model definitely simplifies the life of the
developer, who often has to write a few thousand lines of code just to
render a particular TEI document into HTML, only to repeat the same tedious
process for PDF output."  This is from the TEI Processing Model Toolbox
Documentation, but the idea is rehearsed in other documents too. As a
developer, I fear this may not be true; rather I worry about the processing
model getting in the way of communication between the "XML expert" editor
and the "programmer" as described here (more on roles below). You'll tell
me the Toolbox is a demonstration of the contrary, but my opinion stands
when I try to think about my current day-to-day job.

The processing model is meant to be documentation, but it's in fact
prescriptive and not that human-readable. I also have an issue with
expressing a desired output form while encouraging people to disregard it
at the same time. This is reflected in the slightly awkwardly phrased sentence
that concludes the opening paragraph of the processing model guidelines: "This
enables the creator of a TEI customization to specify how they intend
particular elements could be processed."

The terminology at work here (block, inline, list) is based on
typographical conventions. I see ODD as being applicable to domains that go
beyond: 1) text as a sequence of letters and characters and 2) digital text
as replicating print conventions. Making these procedural instructions
officially part of ODD ties it more strongly to the domains
above, despite the mechanism to extend the model. I recognize that this is
a very personal perspective rooted in my application of ODD beyond what we
here call "text". Still, considering facts outside of one's immediate
domain may sometimes help avoid missteps: think of the whole OHCO business,
which might have gone differently if they had only looked beyond the book.

The Rahtz rationale (every workflow has three distinct roles of editor,
programmer, designer) is a useful model. Yet, even though it accommodates
the roles co-existing in, or being divided between, one or more persons, it
doesn't account for the cyclic interchange between the roles; rather it
models a fairly rigid hierarchy that is not that strict in real life. For
example, an editor may not know that certain display options are available
or indeed desirable, but a programmer/designer might. Do these new
requirements need to be re-presented in the processing model even
though the act of communication between editor and (the seemingly
subordinate) programmer has happened elsewhere? Isn't that cumbersome, and
doesn't it simply add a new thing to maintain? What I'm saying is that this approach
is a simplification that may cover a large set of cases, but not all by far
- so why make it an official recommendation? To me, this is something that
belongs in the community of practice, not at the prescriptive level of the
Guidelines and TEI schema.

The Turska tenet (ODD stores as much information as possible) is more
convincing to me since ODD has the power of defining an application profile
as well as a schema for a TEI project. But the goal of the processing model
can be reached without extending the TEI vocabulary and dealing with the
consequences of doing so.

So my recommendation would be to keep the processing model in its own
namespace as a TEI extension. Mostly, this means that the council doesn't
have to maintain the elements and a reference implementation. This wouldn't
stop tools like the Toolbox from doing all their fantastic things anyway. Also, the
people who care about the processing model would be able to develop and
improve it faster, without having to go through a political body (the
council) with constantly changing members, opinions, and skill sets.

One could make the argument that having to use additional namespaces would
complicate the authoring of the ODD document and it would make adoption
less likely. But if TEI Simple and other invested parties distribute a
ready-made ODD and schema, using namespaced elements will be as
straightforward as using TEI-only elements.
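
To illustrate the point about namespaces, here is a minimal sketch of how a
tool might read processing model declarations kept in their own namespace
inside an ODD. The namespace URI (http://example.org/ns/tei-pm), the "pm"
prefix, and the list_models helper are assumptions made up for this example;
they are not taken from the draft or from any existing implementation. The
element and attribute names (model, behaviour, predicate) follow the draft.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
# Hypothetical namespace URI for the processing model extension (an assumption).
PM_NS = "http://example.org/ns/tei-pm"

# A small ODD fragment: a TEI elementSpec carrying namespaced model elements.
ODD_FRAGMENT = f"""
<elementSpec xmlns="{TEI_NS}" xmlns:pm="{PM_NS}" ident="head" mode="change">
  <pm:model predicate="parent::div" behaviour="heading"/>
  <pm:model behaviour="block"/>
</elementSpec>
"""


def list_models(odd_xml):
    """Return (element ident, behaviour, predicate) for each namespaced model."""
    root = ET.fromstring(odd_xml)
    ident = root.get("ident")
    return [
        (ident, model.get("behaviour"), model.get("predicate"))
        for model in root.iter(f"{{{PM_NS}}}model")
    ]


models = list_models(ODD_FRAGMENT)
for ident, behaviour, predicate in models:
    print(ident, behaviour, predicate or "(no predicate)")
```

The only difference from a TEI-only version is the namespace used when
looking up the model elements; the rest of the ODD is read the same way.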

Speaking to more technical details, the model seems solid and easy to
understand. The existing implementations are a clear evaluation /
demonstration of its applicability. I worry a bit about the prominent role
of CSS and how the vocabulary needs to be limited for certain
implementations (such as LaTeX). It's not immediately transparent what's
going to work in which scenario. I suspect that eventually
technology-specific mechanisms like @cssClass will need to be created for
other technologies. I also worry that more procedural statements will be
requested if the idea catches on (like a loop, or grouping statements). In
the guidelines, some more complex cases with predicates get awkwardly close
to XSLT. But feel free to accuse me of the slippery slope fallacy: the
model *is* useful as is.

The proposed guidelines are currently dense and quite hard to read. The
Documentation pages of the TEI Processing Model Toolbox do a better job for
now. I took the liberty of adding a few comments to the current draft
(from May 2015?) in a Google Doc:
https://docs.google.com/document/d/1ILTG41Y6N2r0kNCg6DX9pyWEwHOoEt6ddp4_-QS5...
I made some corrections that made sense to me; but a native speaker should
have the final word.

I think that the Guidelines are meant for the "editor" to understand how to
use the processing model, but at the moment they mix in instructions for
the "programmer" implementing them, too. This is confusing, and the whole
thing is not easy to decode for a programmer looking for clear
implementation instructions. The "Implementing a TEI Processing Model"
part helps clear things up a bit, but I would recommend creating more formal
documentation that defines a clear API using terms like MUST, SHOULD, etc.
Here is a good example in the DH realm: http://iiif.io/api/image/2.0/
Although a document of that kind would look odd in the Guidelines (pun not
intended).

I hope this helps a bit. Even if the council decides to go ahead and add
these elements to the TEI vocabulary, there is a fair bit of work to do
around documentation (for clarity and transparency - see LaTeX issue above)
and prose.

Raff

On Thu, Feb 25, 2016 at 10:03 AM, Magdalena Turska <tuurma@gmail.com> wrote:
...
And if you fall into the category 'I always wanted to know what the simple
buzz is all about but never quite got there', consider having a peek at
http://showcases.exist-db.org/exist/apps/tei-simple/doc/documentation.xml?od...
and apps using it like
1. Early English Books Online
http://showcases.exist-db.org/exist/apps/eebo/works/
2. Shakespeare's Plays
http://showcases.exist-db.org/exist/apps/shakespeare-pm/works/
3. Foreign relations of United States https://history.state.gov/beta/
On 25 February 2016 at 14:45, Lou Burnard <lou.burnard@retired.ox.ac.uk>
wrote:
...
Magdalena, James, and I are meeting Saturday morning to review the
processing model state of affairs. Any last minute
thoughts/comments/protests gratefully received before then.
--
tei-council mailing list
tei-council@lists.tei-c.org
http://lists.lists.tei-c.org/mailman/listinfo/tei-council
PLEASE NOTE: postings to this list are publicly archived

Raffaele Viglianti