procmod discuss

Lou Burnard

25 Feb 2016

3:45 p.m.

Magdalena, James, and I are meeting Saturday morning to review the processing model state of affairs. Any last minute thoughts/comments/protests gratefully received before then.

Magdalena Turska

25 Feb 2016

4:03 p.m.

And if you fall into the category 'I always wanted to know what the simple buzz is all about but never quite got there', consider having a peek at http://showcases.exist-db.org/exist/apps/tei-simple/doc/documentation.xml?od... and apps using it like

1. Early English Books Online http://showcases.exist-db.org/exist/apps/eebo/works/
2. Shakespeare's Plays http://showcases.exist-db.org/exist/apps/shakespeare-pm/works/
3. Foreign Relations of the United States https://history.state.gov/beta/

-- tei-council mailing list tei-council@lists.tei-c.org http://lists.lists.tei-c.org/mailman/listinfo/tei-council

PLEASE NOTE: postings to this list are publicly archived

Raffaele Viglianti

26 Feb 2016

9:57 p.m.

Hello, here are some thoughts about the processing model that I hope will be helpful to your discussion tomorrow. Please keep in mind that I'm playing devil's advocate here a bit.

*Too long; didn't read:* my recommendation would be to keep the processing model in its own namespace as a TEI extension.

Documents and tools considered:
Guidelines draft <http://htmlpreview.github.io/?https://github.com/TEIC/TEI-Simple/blob/master/tei-pm.html>
Toolbox documentation and demo <http://showcases.exist-db.org/exist/apps/tei-simple/doc/documentation.xml?odd=documentation.odd>
TEI Simple guidelines <http://htmlpreview.github.io/?https://github.com/TEIC/TEI-Simple/blob/master/teisimple.odd.html> (the part relevant to the processing model)
Lyon presentation slides <http://tei.it.ox.ac.uk/Talks/2015-08-maynooth/lyonPM.html>

Now for the long part of the email.

Some overall thoughts:

"Using the processing model definitely simplifies the life of the developer, who often has to write a few thousand lines of code just to render a particular TEI document into HTML, only to repeat the same tedious process for PDF output." This is from the TEI Processing Model Toolbox Documentation, but the idea is rehearsed in other documents too. As a developer, I fear this may not be true; rather I worry about the processing model getting in the way of communication between the "XML expert" editor and the "programmer" as described here (more on roles below). You'll tell me the Toolbox is a demonstration of the contrary, but my opinion stands when I try to think about my current day-to-day job.

The processing model is meant to be documentation, but it is in fact prescriptive and not that human-readable. I also have an issue with expressing a desired output form while at the same time encouraging people to disregard it. This is reflected in the slightly awkwardly phrased sentence that concludes the opening paragraph of the processing model guidelines: "This enables the creator of a TEI customization to specify how they intend particular elements could be processed."

The terminology at work here (block, inline, list) is based on typographical conventions. I see ODD as being applicable to domains that go beyond: 1) text as a sequence of letters and characters and 2) digital text as replicating print conventions. Making these procedural instructions officially part of ODD ties it more strongly to the domains above, despite the mechanism to extend the model. I recognize that this is a very personal perspective rooted in my application of ODD beyond what we here call "text". Still, considering facts outside of one's immediate domain may sometimes help avoid missteps; like the whole OCHO business if they only had looked beyond the book.

The Rahtz rationale (every workflow has three distinct roles of editor, programmer, designer) is a useful model. Yet, even though it accommodates the roles being co-existent in or divided between one or more persons, it doesn't account for the cyclic interchange between the roles; rather it models a fairly rigid hierarchy that is not that strict in real life. For example, an editor may not know that certain display options are available or indeed desirable, but a programmer/designer might. Do these new requirements need to be re-presented in the processing model even though the act of communication between editor and (the seemingly subordinate) programmer has happened elsewhere? Isn't that cumbersome, simply adding a new thing to maintain? What I'm saying is that this approach is a simplification that may cover a large set of cases, but not all by far - so why make it an official recommendation? To me, this is something that belongs in the community of practice, not at the prescriptive level of the Guidelines and TEI schema.

The Turska tenet (ODD stores as much information as possible) is more convincing to me, since ODD has the power of defining an application profile as well as a schema for a TEI project. But the goal of the processing model can be reached without extending the TEI vocabulary and dealing with the consequences of doing so.

So my recommendation would be to keep the processing model in its own namespace as a TEI extension. Mostly, this means that the council doesn't have to maintain the elements and a reference implementation. This wouldn't stop tools like the toolbox from doing all their fantastic things anyway. Also, the people who care about the processing model would be able to develop and better it faster, without having to go through a political body (the council) with constantly changing members, opinions, and skill sets.

One could make the argument that having to use additional namespaces would complicate the authoring of the ODD document and make adoption less likely. But if TEI Simple and other invested parties distribute a ready-made ODD and schema, using namespaced elements will be as straightforward as using TEI-only elements.

Speaking to more technical details, the model seems solid and easy to understand. The existing implementations are a clear evaluation / demonstration of its applicability. I worry a bit about the prominent role of CSS and how the vocabulary needs to be limited for certain implementations (such as LaTeX). It's not immediately transparent what's going to work in which scenario. I suspect that eventually technology-specific mechanisms like @cssClass will need to be created for other technologies. I also worry that more procedural statements will be requested if the idea catches on (like a loop, or grouping statements). In the guidelines, some more complex cases with predicates get awkwardly close to XSLT. But feel free to accuse me of the slippery slope fallacy: the model *is* useful as is.

The proposed guidelines at the moment are dense and quite hard to read. The Documentation pages of the TEI Processing Model Toolbox do a better job at the moment. I took the liberty of adding a few comments to the current draft (from May 2015?) on a Google Doc: https://docs.google.com/document/d/1ILTG41Y6N2r0kNCg6DX9pyWEwHOoEt6ddp4_-QS5... I made some corrections that made sense to me, but a native speaker should have the final word.

I think that the Guidelines are meant for the "editor" to understand how to use the processing model, but at the moment they mix in instructions for the "programmer" implementing them, too. This is confusing, and the whole thing is not easy to decode for a programmer looking for clear implementation instructions. The "Implementing a TEI Processing Model" helps clear the waters a bit, but I would recommend creating more rigid documentation that defines a clear API using terms like MUST, SHOULD, etc. Here is a good example in the DH realm: http://iiif.io/api/image/2.0/ (although a document of that kind would look odd in the Guidelines, pun not intended).

I hope this helps a bit. Even if the council decides to go ahead and add these elements to the TEI vocabulary, there is a fair bit of work to do around documentation (for clarity and transparency - see LaTeX issue above) and prose.

Raff

James Cummings

10:44 p.m.

Hi Raff.

All this sounds like you were reading the TEI Simple material, not the processing model stuff that is already in a branch of the TEI repo?

We're not debating TEI Simple atm or these old drafts, but the chapter prose that Lou has already modified. Apologies, I would have sent a link to it or somehow made it appear on teic.github.io, but I am not sure how to get things there (and am on my phone atm). It is the additions to the TD chapter you should read. None of that Rahtz rationale or Turska tenet is in the guidelines prose. But some good thoughts nonetheless.

James

-- Dr James Cummings, Academic IT, University of Oxford

Raffaele Viglianti

10:51 p.m.

Hi James, I considered a few things. I read the Rahtz rationale and Turska tenets from the Lyon material, not the guidelines. I haven't read any parts concerned with Simple alone; I focused only on the processing model.

From what you say, it is likely that the draft of the processing model guidelines that I read is old (May 2015) http://htmlpreview.github.io/?https://github.com/TEIC/TEI-Simple/blob/master...

I'm not sure where to look for Lou's corrected version - can you send me a link? Plenty of time to read through today on this side of the globe.

Raff

Hugh Cayless

10:52 p.m.

Should be here: http://teic.github.io/TEI/TD.html#TDPMPM Sent from my phone.

...

On Feb 26, 2016, at 16:44, James Cummings <james.cummings@it.ox.ac.uk> wrote:

Hi Raff.

All this sounds like you were reading the the TEI Simple material not the processing model stuff that is already in a branch of the TEI repo?

We're not debating TEI Simple atm or these old drafts. But the chapter prose that Lou has already modified. Apologies I would have sent a link to it or somehow made it appear on teic.github.io but I am not sure how to get things there (and am on my phone atm). It is the additions to the TD chapter you shoud read. None of that Rahtz rationale or Turska tenet are in the guidelines prose. But some good thoughts nonetheless.

James

-- Dr James Cummings, Academic IT, University of Oxford

-----Original Message----- From: Raffaele Viglianti [raffaeleviglianti@gmail.com] Received: Friday, 26 Feb 2016, 20:58 To: tei-council@lists.tei-c.org [tei-council@lists.tei-c.org] Subject: Re: [tei-council] procmod discuss

Hello, here are some thoughts about the processing model that I hope will be helpful to your discussion tomorrow. Please keep in mind that I'm playing devil's advocate here a bit.

*Too long; didn't read: *my recommendation would be to keep the processing model in its own namespace as a TEI extension.

Documents and tools considered: Guidelines draft <http://htmlpreview.github.io/?https://github.com/TEIC/TEI-Simple/blob/master/tei-pm.html> Toolbox documentation and demo <http://showcases.exist-db.org/exist/apps/tei-simple/doc/documentation.xml?odd=documentation.odd> TEI Simple guidelines <http://htmlpreview.github.io/?https://github.com/TEIC/TEI-Simple/blob/master/teisimple.odd.html> (the part relevant to the processing model) Lyon presentation slides <http://tei.it.ox.ac.uk/Talks/2015-08-maynooth/lyonPM.html>

Now for the long part of the email.

Some overall thoughts:

"Using the processing model definitely simplifies the life of the developer, who often has to write a few thousand lines of code just to render a particular TEI document into HTML, only to repeat the same tedious process for PDF output." This is from the TEI Processing Model Toolbox Documentation, but the idea is rehearsed in other documents too. As a developer, I fear this may not be true; rather I worry about the processing model getting in the way of communication between the "XML expert" editor and the "programmer" as described here (more on roles below). You'll tell me the Toolbox is a demonstration of the contrary, but my opinion stands when I try to think about my current day-to-day job.

The processing model is meant to be documentation, but it's in fact prescriptive and not that human readable. I also have an issue with expressing a desired output form while encouraging to disregard it at the same time. This is reflected in the slightly awkwardly phrased sentence that concludes opening paragraph of the processing model guidelines: "This enables the creator of a TEI customization to specify how they intend particular elements could be processed."

The terminology at work here (block, inline, list) is based on typographical conventions. I see ODD as being applicable to domains that go beyond: 1) text as a sequence of letters and characters and 2) digital text as replicating print conventions. Making these procedural instructions officially part of ODD ties it more strongly with the domains above, despite the mechanism to extend the model. I recognize that this is a very personal perspective rooted in my application of ODD beyond what we here call "text". Though considering facts outside of one's immediate domain may sometimes help avoid missteps; like the whole OCHO business if they only had looked beyond the book.

The Rahtz rationale (every workflow has three distinct roles of editor, programmer, designer) is a useful model. Yet, even though it accommodates the roles being co-existent in or divided between one or more persons, it doesn't account for the cyclic interchange between the roles; rather it models a fairly rigid hierarchy that is not that strict in real life. For example, an editor may not know that certain display options are available or indeed desirable, but a programmer/designer might. Do these new requirements need to be re-presented in the processing model even though the act of communication between editor and (the seemingly subordinate) programmer has happened elsewhere? Isn't that cumbersome, and doesn't it simply add a new thing to maintain? What I'm saying is that this approach is a simplification that may cover a large set of cases, but not all by far - so why make it an official recommendation? To me, this is something that belongs in the community of practice, not at the prescriptive level of the Guidelines and TEI schema.

The Turska tenet (ODD stores as much information as possible) is more convincing to me since ODD has the power of defining an application profile as well as a schema for a TEI project. But the goal of the processing model can be reached without extending the TEI vocabulary and dealing with the consequences of doing so.

So my recommendation would be to keep the processing model in its own namespace as a TEI extension. Mostly, this means that the council doesn't have to maintain the elements and a reference implementation. This wouldn't stop tools like the Toolbox from doing all their fantastic things anyway. Also, the people who care about the processing model would be able to develop and improve it faster, without having to go through a political body (the council) with constantly changing members, opinions, and skill sets.

One could make the argument that having to use additional namespaces would complicate the authoring of the ODD document and it would make adoption less likely. But if TEI Simple and other invested parties distribute a ready-made ODD and schema, using namespaced elements will be as straightforward as using TEI-only elements.
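To make the namespace point concrete, here is a minimal sketch in Python of how a tool might pick up processing model declarations that live outside the TEI namespace. The namespace URI `http://example.org/ns/processing-model` and the `pm:` prefix are purely hypothetical placeholders (no such namespace has been proposed), and the ODD fragment is invented for illustration.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
# Hypothetical extension namespace, used here only for illustration.
PM_NS = "http://example.org/ns/processing-model"

# An invented ODD fragment: a TEI elementSpec carrying namespaced model elements.
ODD_FRAGMENT = f"""
<elementSpec xmlns="{TEI_NS}" xmlns:pm="{PM_NS}" ident="head" mode="change">
  <pm:model predicate="parent::figure" behaviour="block"/>
  <pm:model behaviour="heading"/>
</elementSpec>
"""


def extension_models(odd_xml):
    """Return (predicate, behaviour) pairs for every model in the extension namespace."""
    root = ET.fromstring(odd_xml)
    return [
        (model.get("predicate"), model.get("behaviour"))
        for model in root.iter(f"{{{PM_NS}}}model")
    ]


if __name__ == "__main__":
    for predicate, behaviour in extension_models(ODD_FRAGMENT):
        print(predicate, behaviour)
```

From the point of view of a processor, the only difference from TEI-only elements is the namespace it looks in; the surrounding elementSpec stays ordinary TEI.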

Speaking to more technical details, the model seems solid and easy to understand. The existing implementations are a clear evaluation / demonstration of its applicability. I worry a bit about the prominent role of CSS and how the vocabulary needs to be limited for certain implementations (such as LaTeX). It's not immediately transparent what's going to work in which scenario. I suspect that eventually technology-specific mechanisms like @cssClass will need to be created for other technologies. I also worry that more procedural statements will be requested if the idea catches on (like a loop, or grouping statements). In the guidelines, some more complex cases with predicates get awkwardly close to XSLT. But feel free to accuse me of the slippery slope fallacy: the model *is* useful as is.

The proposed guidelines at the moment are dense and quite hard to read. The Documentation pages of the TEI Processing Model Toolbox do a better job at the moment. I took the liberty of adding a few comments to the current draft (from May 2015?) on a google doc: https://docs.google.com/document/d/1ILTG41Y6N2r0kNCg6DX9pyWEwHOoEt6ddp4_-QS5... I made some corrections that made sense to me, but a native speaker should have the final word.

I think that the Guidelines are meant for the "editor" to understand how to use the processing model, but at the moment they mix in instructions for the "programmer" implementing them, too. This is confusing and the whole thing is not easy to decode for a programmer looking for clear implementation instructions. The "Implementing a TEI Processing Model" part helps clear the waters a bit, but I would recommend creating more rigid documentation that defines a clear API using terms like MUST, SHOULD, etc. Here is a good example in the DH realm: http://iiif.io/api/image/2.0/ (although a document of that kind would look odd in the Guidelines, pun not intended).

I hope this helps a bit. Even if the council decides to go ahead and add these elements to the TEI vocabulary, there is a fair bit of work to do around documentation (for clarity and transparency - see LaTeX issue above) and prose.

Raff


Raffaele Viglianti

27 Feb 27 Feb

12:08 a.m.

New subject: [tei-council] procmod discuss

Ok, the current version of the guidelines is very readable and clear. I'm glad I learned about the Rahtz rationale and Turska tenet anyway, because they clarified the motivation for the processing model and potentially revealed some inherent issues, particularly concerning the definition of roles.

My comments and overall recommendation stand. The processing model is a useful tool, but it doesn't belong in the TEI Guidelines in my opinion. One difference from my earlier assessment is that if we do go ahead and include the elements in the TEI, the documentation is already in a sufficiently clear state.

One thing that I noticed in the old draft that survived in the current one is this text in § 22.5.5.8 "Implemention"

"When deciding which of the model elements within an elementSpec should be acted upon, *the models are processed in sequence, until one is found* with no value for its predicate attribute (meaning that it matches any occurrence of the element), or with a value which is true of the element being processed."

Is this really a requisite for the implementation? There are many methods for locating a desired element in a set that are not just processing in sequence. Even if sequence mattered in case of two models matching the same predicate, they only need to be processed in sequence once located, not to locate them. So I don't think that's a valid recommendation.

Raff

On Fri, Feb 26, 2016 at 4:52 PM, Hugh Cayless <philomousos@gmail.com> wrote:

...

Should be here: http://teic.github.io/TEI/TD.html#TDPMPM

Sent from my phone.

...
On Feb 26, 2016, at 16:44, James Cummings <james.cummings@it.ox.ac.uk> wrote:

Hi Raff.

All this sounds like you were reading the TEI Simple material, not the processing model stuff that is already in a branch of the TEI repo?

We're not debating TEI Simple atm or these old drafts, but the chapter prose that Lou has already modified. Apologies, I would have sent a link to it or somehow made it appear on teic.github.io but I am not sure how to get things there (and am on my phone atm). It is the additions to the TD chapter you should read. None of that Rahtz rationale or Turska tenet is in the guidelines prose. But some good thoughts nonetheless.

James

-- Dr James Cummings, Academic IT, University of Oxford


James Cummings

12:14 a.m.

New subject: [tei-council] procmod discuss

That is indeed a good point. I think that section was trying to indicate how precedence should work and might be able to be clearer.

James

-- Dr James Cummings, Academic IT, University of Oxford
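The precedence question raised about § 22.5.5.8 can be illustrated with a minimal Python sketch. It is not taken from any existing implementation: the `Model` class, the dictionary representation of an element, and the use of Python callables in place of XPath predicates are all assumptions made for illustration. The first function follows the quoted wording literally (scan in sequence, stop at the first match); the second locates matches in an arbitrary order and only then applies document order, and it selects the same model.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Model:
    """Illustrative stand-in for a model declaration (not a real API)."""
    behaviour: str
    # None plays the role of a missing predicate attribute: it matches anything.
    predicate: Optional[Callable[[dict], bool]] = None

    def matches(self, element: dict) -> bool:
        return self.predicate is None or self.predicate(element)


def select_in_sequence(models: Sequence[Model], element: dict) -> Optional[Model]:
    """Literal reading of the quoted text: scan in order, stop at the first match."""
    for model in models:
        if model.matches(element):
            return model
    return None


def select_any_order(models: Sequence[Model], element: dict) -> Optional[Model]:
    """Find matches in any order (here: reversed), then pick the earliest declared."""
    matching = [i for i in reversed(range(len(models))) if models[i].matches(element)]
    return models[min(matching)] if matching else None


# Hypothetical models for a <head> element.
HEAD_MODELS = [
    Model("block", lambda e: e["parent"] == "figure"),
    Model("heading", lambda e: e["parent"] == "div"),
    Model("inline"),  # no predicate: the fallback
]

if __name__ == "__main__":
    for parent in ("figure", "div", "list"):
        element = {"name": "head", "parent": parent}
        first = select_in_sequence(HEAD_MODELS, element)
        other = select_any_order(HEAD_MODELS, element)
        print(parent, first.behaviour, other.behaviour)
```

Under these assumptions, what matters is the outcome (the earliest declared model that matches wins), not the search strategy used to find it, which is one way the section's wording on precedence could be made clearer.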

Lou Burnard

12:17 a.m.

New subject: [tei-council] procmod discuss

I'm not sure that we actually need to keep section 22.5.5.8 in its present form at all. It says both too little and too much about specifics of an implementation.

On 26/02/16 23:14, James Cummings wrote:

...

That is indeed a good point. I think that section was trying to indicate how precedence should work and might be able to be clearer.

James

-- Dr James Cummings, Academic IT, University of Oxford

-----Original Message----- From: Raffaele Viglianti [raffaeleviglianti@gmail.com] Received: Friday, 26 Feb 2016, 23:08 To: tei-council@lists.tei-c.org [tei-council@lists.tei-c.org] Subject: Re: [tei-council] procmod discuss

Ok, the current version of the guidelines is very readable and clear. I'm glad I learned about the Rahtz rationale and Turska tenet anyway, because they clarified the motivation for the processing model and potentially revealed some inherent issues, particularly concerning the definition of roles.

My comments and overall recommendation stand. The processing model is a useful tool, but it doesn't belong in the TEI Guidelines in my opinion. One difference from my earlier assessment is that if we do go ahead and include the elements in the TEI, the documentation is already in a sufficiently clear state.

One thing that I noticed in the old draft that survived in the current one is this text in § 22.5.5.8 "Implemention"

"When deciding which of the model elements within an elementSpec should be acted upon, *the models are processed in sequence, until one is found* with no value for its predicate attribute (meaning that it matches any occurrence of the element), or with a value which is true of the element being processed."

Is this really a requisite for the implementation? There are many methods for locating a desired element in a set that are not just processing in sequence. Even if sequence mattered in case of two models matching the same predicate, they only need to be processed in sequence once located, not to locate them. So I don't think that's a valid recommendation.

Raff

On Fri, Feb 26, 2016 at 4:52 PM, Hugh Cayless <philomousos@gmail.com> wrote:

...
Should be here: http://teic.github.io/TEI/TD.html#TDPMPM

Sent from my phone.

...
On Feb 26, 2016, at 16:44, James Cummings <james.cummings@it.ox.ac.uk> wrote: Hi Raff.

All this sounds like you were reading the the TEI Simple material not the processing model stuff that is already in a branch of the TEI repo? We're not debating TEI Simple atm or these old drafts. But the chapter prose that Lou has already modified. Apologies I would have sent a link to it or somehow made it appear on teic.github.io but I am not sure how to get things there (and am on my phone atm). It is the additions to the TD chapter you shoud read. None of that Rahtz rationale or Turska tenet are in the guidelines prose. But some good thoughts nonetheless.

James

-- Dr James Cummings, Academic IT, University of Oxford

-----Original Message----- From: Raffaele Viglianti [raffaeleviglianti@gmail.com] Received: Friday, 26 Feb 2016, 20:58 To: tei-council@lists.tei-c.org [tei-council@lists.tei-c.org] Subject: Re: [tei-council] procmod discuss

Hello, here are some thoughts about the processing model that I hope will be helpful to your discussion tomorrow. Please keep in mind that I'm playing devil's advocate here a bit.

*Too long; didn't read: *my recommendation would be to keep the processing model in its own namespace as a TEI extension.

Documents and tools considered: Guidelines draft < http://htmlpreview.github.io/?https://github.com/TEIC/TEI-Simple/blob/master... Toolbox documentation and demo < http://showcases.exist-db.org/exist/apps/tei-simple/doc/documentation.xml?od... TEI Simple guidelines < http://htmlpreview.github.io/?https://github.com/TEIC/TEI-Simple/blob/master... (the part relevant to the processing model) Lyon presentation slides <http://tei.it.ox.ac.uk/Talks/2015-08-maynooth/lyonPM.html>

Now for the long part of the email.

Some overall thoughts:

"Using the processing model definitely simplifies the life of the developer, who often has to write a few thousand lines of code just to render a particular TEI document into HTML, only to repeat the same tedious process for PDF output." This is from the TEI Processing Model Toolbox Documentation, but the idea is rehearsed in other documents too. As a developer, I fear this may not be true; rather I worry about the processing model getting in the way of communication between the "XML expert" editor and the "programmer" as described here (more on roles below). You'll tell me the Toolbox is a demonstration of the contrary, but my opinion stands when I try to think about my current day-to-day job.

The processing model is meant to be documentation, but it is in fact prescriptive and not that human-readable. I also have an issue with expressing a desired output form while at the same time encouraging people to disregard it. This is reflected in the slightly awkwardly phrased sentence that concludes the opening paragraph of the processing model guidelines: "This enables the creator of a TEI customization to specify how they intend particular elements could be processed."

The terminology at work here (block, inline, list) is based on typographical conventions. I see ODD as being applicable to domains that go beyond: 1) text as a sequence of letters and characters and 2) digital text as replicating print conventions. Making these procedural instructions officially part of ODD ties it more strongly with the domains above, despite the mechanism to extend the model. I recognize that this is a very personal perspective rooted in my application of ODD beyond what we here call "text". Though considering facts outside of one's immediate domain may sometimes help avoid missteps; like the whole OCHO business if they only had looked beyond the book.

The Rahtz rationale (every workflow has three distinct roles of editor, programmer, designer) is a useful model. Yet, even though it accommodates the roles being co-existent in or divided between one or more persons, it doesn't account for the cyclic interchange between the roles; rather it models a fairly rigid hierarchy that is not that strict in real life. For example, an editor may not know that certain display options are available or indeed desirable, but a programmer/designer might. Do these new requirements need to be re-presented in the processing model even though the act of communication between editor and (the seemingly subordinate) programmer has happened elsewhere? Isn't that cumbersome, and doesn't it simply add a new thing to maintain? What I'm saying is that this approach is a simplification that may cover a large set of cases, but not all by far - so why make it an official recommendation? To me, this is something that belongs in the community of practice, not at the prescriptive level of the Guidelines and TEI schema.

The Turska tenet (ODD stores as much information as possible) is more convincing to me since ODD has the power of defining an application profile as well as a schema for a TEI project. But the goal of the processing model can be reached without extending the TEI vocabulary and dealing with the consequences of doing so.

So my recommendation would be to keep the processing model in its own namespace as a TEI extension. Mostly, this means that the council doesn't have to maintain the elements and a reference implementation. This wouldn't stop tools like the toolbox from doing all their fantastic things anyway. Also, the people who care about the processing model would be able to develop and improve it faster, without having to go through a political body (the council) with constantly changing members, opinions, and skill sets.

One could make the argument that having to use additional namespaces would complicate the authoring of the ODD document and it would make adoption less likely. But if TEI Simple and other invested parties distribute a ready-made ODD and schema, using namespaced elements will be as straightforward as using TEI-only elements.

Speaking to more technical details, the model seems solid and easy to understand. The existing implementations are a clear evaluation / demonstration of its applicability. I worry a bit about the prominent role of CSS and how the vocabulary needs to be limited for certain implementations (such as LaTeX). It's not immediately transparent what's going to work in which scenario. I suspect that eventually technology-specific mechanisms like @cssClass will need to be created for other technologies. I also worry that more procedural statements will be requested if the idea catches on (like a loop, or grouping statements). In the guidelines, some more complex cases with predicates get awkwardly close to XSLT. But feel free to accuse me of the slippery slope fallacy: the model *is* useful as is.

The proposed guidelines at the moment are dense and quite hard to read. The Documentation pages of the TEI Processing Model Toolbox do a better job at the moment. I took the liberty of adding a few comments to the current draft (from May 2015?) in a Google doc: https://docs.google.com/document/d/1ILTG41Y6N2r0kNCg6DX9pyWEwHOoEt6ddp4_-QS5... I made some corrections that made sense to me, but a native speaker should have the final word.

I think that the Guidelines are meant for the "editor" to understand how to use the processing model, but at the moment they mix in instructions for the "programmer" implementing them, too. This is confusing, and the whole thing is not easy to decode for a programmer looking for clear implementation instructions. "Implementing a TEI Processing Model" helps clear the waters a bit, but I would recommend creating more rigid documentation that defines a clear API using terms like MUST, SHOULD, etc. Here is a good example in the DH realm: http://iiif.io/api/image/2.0/ (although a document of that kind would look odd in the Guidelines, pun not intended).

I hope this helps a bit. Even if the council decides to go ahead and add these elements to the TEI vocabulary, there is a fair bit of work to do around documentation (for clarity and transparency - see LaTeX issue above) and prose.

Raff

...
On Thu, Feb 25, 2016 at 10:03 AM, Magdalena Turska <tuurma@gmail.com> wrote:

And if you fall into the category 'I always wanted to know what the simple buzz is all about but never quite got there', consider having a peek at http://showcases.exist-db.org/exist/apps/tei-simple/doc/documentation.xml?od... and apps using it like

1. Early English Books Online http://showcases.exist-db.org/exist/apps/eebo/works/
2. Shakespeare's Plays http://showcases.exist-db.org/exist/apps/shakespeare-pm/works/
3. Foreign relations of United States https://history.state.gov/beta/

On 25 February 2016 at 14:45, Lou Burnard <lou.burnard@retired.ox.ac.uk> wrote:

...
Magdalena, James, and I are meeting Saturday morning to review the processing model state of affairs. Any last minute thoughts/comments/protests gratefully received before then.

James Cummings

12:14 a.m.

New subject: [tei-council] procmod discuss

Thanks Hugh!

James

-- Dr James Cummings, Academic IT, University of Oxford

-----Original Message----- From: Hugh Cayless [philomousos@gmail.com] Received: Friday, 26 Feb 2016, 21:53 To: tei-council@lists.tei-c.org [tei-council@lists.tei-c.org] Subject: Re: [tei-council] procmod discuss

Should be here: http://teic.github.io/TEI/TD.html#TDPMPM

Sent from my phone.

