standOff problem #1: content model of <teiCorpus>
The current content model of <teiCorpus> is:

  teiHeader, ( ( model.resourceLike+, ( TEI | teiCorpus )* ) | ( TEI | teiCorpus )+ )

That is, you may *either* have a series of one or more <TEI> and <teiCorpus> elements, intermingled; OR you can have one or more <facsimile>, <fsdDecl>, <sourceDoc>, and <text> elements, intermingled, followed by zero or more <TEI> and <teiCorpus> elements, intermingled. This is exactly the same as saying

  teiHeader, model.resourceLike*, ( TEI | teiCorpus )*

except that it requires at least one child <TEI>, <teiCorpus>, <facsimile>, <fsdDecl>, <sourceDoc>, or <text>.

In the new <standOff> world, we have voted to make <TEI> a member of model.resourceLike. Thus we have to alter the content model of <teiCorpus>, because as written it would be ambiguous (if a <TEI> is your first child after <teiHeader>, you don't know which branch of the content model you are in).

It is not too difficult to solve the ambiguity problem:

  teiHeader, ( ( model.resourceLike+, ( teiCorpus, ( TEI | teiCorpus )* )? ) | ( teiCorpus, ( TEI | teiCorpus )* ) )

I am pretty confident this content model validates the same set of documents (and, perhaps as importantly, rejects the same set of documents) as the original content model (with <TEI> in model.resourceLike). I am not against using this content model. But it occurs to me that

  a) it is a bit cumbersome;
  b) the set of documents permitted is already screwy; and
  c) it looks very different from the content model of <TEI>, which (in the new world order) <teiCorpus> is very similar to.

One possible solution is to 1) add <teiCorpus> to model.resourceLike, too; and then 2) change the content model of <teiCorpus> to match that of <TEI> (in the new standOff world):

  teiHeader, model.resourceLike+

This has the advantage of giving <teiCorpus> a very clean, understandable content model (and no ambiguity). It has the disadvantage that it allows for even more screwy things. E.g.

  <teiCorpus>
    <teiHeader>
    <standOff>
    <TEI>
    <facsimile>
    <TEI>
    <teiCorpus>
    <TEI>
    <facsimile>
    <TEI>
    <teiCorpus>
    <facsimile>
  </teiCorpus>

It has the feature (which some will consider an advantage, others a disadvantage, I'm sure) that <TEI> and <teiCorpus> become the same thing. Just as an XSLT stylesheet can have an outermost element of xsl:stylesheet or xsl:transform, no difference, a TEI document will be able to have <TEI> or <teiCorpus> as an outermost element, no difference. The same goes at every level of nesting: where one is allowed, so is the other, and it can have the same content.

Personally, I still think putting <TEI> into model.resourceLike was probably a mistake, but once we've done that, I'm inclined to say throw <teiCorpus> in there, too.
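The equivalence claimed above can be sanity-checked by brute force. The following is a minimal sketch, not part of any TEI tooling: it encodes each child of <teiCorpus> after <teiHeader> as one letter (an encoding assumed here purely for illustration), writes both content models as regular expressions with <TEI> counted as a member of model.resourceLike, and compares them on every child sequence up to a fixed length. The names ORIGINAL, REVISED, accepts, and disagreements are hypothetical.

```python
import re
from itertools import product

# One letter per kind of child of <teiCorpus> (after <teiHeader>):
#   T = <TEI>, C = <teiCorpus>, R = any other member of model.resourceLike.
# With <TEI> voted into model.resourceLike, that class is written [RT].
ORIGINAL = re.compile(r"(?:[RT]+[TC]*|[TC]+)")          # current model
REVISED = re.compile(r"(?:[RT]+(?:C[TC]*)?|C[TC]*)")    # disambiguated model


def accepts(pattern, children):
    """True if the whole child sequence is allowed by the content model."""
    return pattern.fullmatch(children) is not None


def disagreements(max_len=8):
    """List every child sequence up to max_len on which the two models differ."""
    found = []
    for length in range(max_len + 1):
        for combo in product("TCR", repeat=length):
            sequence = "".join(combo)
            if accepts(ORIGINAL, sequence) != accepts(REVISED, sequence):
                found.append(sequence)
    return found


if __name__ == "__main__":
    print(disagreements())
```

An empty list means the two models agree on every sequence of up to eight children. That is evidence for the equivalence, not a proof of it.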
I agree with Syd on this: teiCorpus should be a member of model.resourceLike. The fact that as a result TEI and teiCorpus become essentially the same thing is actually a plus for me. I think any line drawn between the two is terminological rather than structural, and while it might be clearly delineated in the case of particular projects, in the broader view it's very fuzzy indeed.

Cheers,
Martin

On 2019-12-09 3:16 p.m., Syd Bauman wrote:
Personally, I still think putting <TEI> into model.resourceLike was probably a mistake, but once we've done that, I'm inclined to say throw <teiCorpus> in there, too.

_______________________________________________
Tei-council mailing list
Tei-council@lists.tei-c.org
http://lists.lists.tei-c.org/mailman/listinfo/tei-council
--
------------------------------------------
Martin Holmes
UVic Humanities Computing and Media Centre
I’m really sorry that I won’t be able to join you later (to bring this forward myself), but please have a look at ticket https://github.com/TEIC/TEI/issues/1823 and PR https://github.com/TEIC/TEI/pull/1922 where I already tried to tackle the content model of teiCorpus. The problem we faced (and which still seems evident with Syd’s proposal) is that teiCorpus would allow a structure like

  <teiCorpus xmlns="http://www.tei-c.org/ns/1.0">
    <teiHeader> ... </teiHeader>
    <text> ... </text>
  </teiCorpus>

which we thought was not intended.

Best
Peter
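The structure above can be checked against both content models with a small script. This is a minimal sketch, assuming a regular-expression encoding of child element names; the member lists for model.resourceLike are taken from this thread (with <standOff>, <TEI>, and <teiCorpus> added for the proposed model), and all Python names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

SAMPLE = """<teiCorpus xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader> ... </teiHeader>
  <text> ... </text>
</teiCorpus>"""

# Each child is encoded as its local name followed by one space.
DOC = r"(?:TEI|teiCorpus) "
# Assumed membership of model.resourceLike today, as listed in this thread.
RESOURCE_NOW = r"(?:facsimile|fsdDecl|sourceDoc|text) "
# Assumed membership if <standOff>, <TEI>, and <teiCorpus> all join the class.
RESOURCE_PROPOSED = r"(?:facsimile|fsdDecl|sourceDoc|standOff|text|TEI|teiCorpus) "

CURRENT = re.compile(rf"teiHeader (?:(?:{RESOURCE_NOW})+(?:{DOC})*|(?:{DOC})+)")
PROPOSED = re.compile(rf"teiHeader (?:{RESOURCE_PROPOSED})+")


def child_names(xml_text):
    """Return the local names of the root's children, each followed by a space."""
    root = ET.fromstring(xml_text)
    return "".join(child.tag.split("}")[-1] + " " for child in root)


if __name__ == "__main__":
    children = child_names(SAMPLE)
    print("current model accepts:", CURRENT.fullmatch(children) is not None)
    print("proposed model accepts:", PROPOSED.fullmatch(children) is not None)
```

Under these assumptions both models accept a <teiCorpus> whose only content child is <text>, so the concern applies to the current model as well as to the proposed one.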
On 10.12.2019 at 00:50, Martin Holmes wrote:
I agree with Syd on this: teiCorpus should be a member of model.resourceLike.
In reading and rethinking this, I am trying to remember our line of reasoning in wanting TEI to be a member of model.resourceLike, and not including teiCorpus there. I think we did want to preserve a semantic distinction between the two, didn’t we?

Our current explanation of model.resourceLike is that it “groups separate elements which constitute the content of a digital resource, as opposed to its metadata.” In the context of a TEI document that is a child of a teiCorpus, does this mean we consider TEI the “content” of the digital resource, as opposed to the “metadata” of the corpus as a collection? And yet in a context without a corpus collection, we would consider the TEI element to contain both metadata and content.

Making teiCorpus a member of model.resourceLike seems to say that it, too, can be considered a content-level element as opposed to a metadata element. I find this worrisome, because it seems like we are just opening these two elements up to meaning two contradictory things: either they are content-only, or they contain metadata and content. Somehow I would rather they both could be seen as always metadata-bearing on principle. Were we thinking of changing the explanation of model.resourceLike to accommodate the bearing of metadata in the TEI element?

I don’t think this is just a fuzzy terminological matter, and I find myself unexpectedly disagreeing with Martin here. I think the distinction between content and metadata is something we ought to consider: can we no longer tell the difference between metadata and content in a standoff context?

Pretty unsure of the content just now,
Elisa
On Dec 10, 2019, at 2:10 AM, Peter Stadler wrote:
I’m really sorry that I won’t be able to join you later (to bring this forward myself), but please have a look at ticket https://github.com/TEIC/TEI/issues/1823 and PR https://github.com/TEIC/TEI/pull/1922 where I already tried to tackle the content model of teiCorpus.
I think Peter's objection is a killer for putting teiCorpus into model.resourceLike if that is still the case with Syd's formulation of it.
What if instead we simplified the current content model by putting TEI and teiCorpus into a special class all of their own, and then having an alternation between that and model.resourceLike? Or would that result in the same ambiguous situation? I don't have a problem with TEI and teiCorpus being treated the same as long as we don't end up with things like <text> being a child of teiCorpus.
Many thanks,
James
--
Dr James Cummings, James.Cummings@newcastle.ac.uk
Senior Lecturer in Late-Medieval Literature and Digital Humanities
School of English, Newcastle University
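On the question of whether such an alternation would be ambiguous: for a repeated choice between two element classes, a parser can always tell which branch it is in only if no element belongs to both classes. A minimal sketch of that check follows, assuming hypothetical class memberships; the name MODEL_TEI_LIKE is invented here for the special class and is not an existing TEI class, and the inclusion of <standOff> in the later model.resourceLike is likewise an assumption based on this thread.

```python
def shared_members(class_a, class_b):
    """Return the element names that belong to both classes, sorted."""
    return sorted(set(class_a) & set(class_b))


# Hypothetical special class holding only <TEI> and <teiCorpus>.
MODEL_TEI_LIKE = {"TEI", "teiCorpus"}

# model.resourceLike as described in this thread, before and after the vote
# that adds <TEI> to it (membership lists are assumptions for illustration).
RESOURCE_LIKE_BEFORE = {"facsimile", "fsdDecl", "sourceDoc", "text"}
RESOURCE_LIKE_AFTER = RESOURCE_LIKE_BEFORE | {"standOff", "TEI"}

if __name__ == "__main__":
    print(shared_members(MODEL_TEI_LIKE, RESOURCE_LIKE_BEFORE))
    print(shared_members(MODEL_TEI_LIKE, RESOURCE_LIKE_AFTER))
```

On these assumptions, the alternation is unambiguous only while <TEI> is kept out of model.resourceLike; once <TEI> is a member of both classes, a <TEI> child matches either branch.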
You can already have <text> as a child of <teiCorpus>, I'm afraid...

On Tue, Dec 10, 2019 at 7:10 AM James Cummings <James.Cummings@newcastle.ac.uk> wrote:
I don't have a problem with TEI and teiCorpus being treated the same as long as we don't end up with things like <text> being a child of teiCorpus.
This seems to be a disastrous error introduced in TEI P5 3.0.0. I despair.
Many thanks,
James
--
Dr James Cummings, James.Cummings@newcastle.ac.uk
Senior Lecturer in Late-Medieval Literature and Digital Humanities
School of English, Newcastle University
Yes, and we already decided to fix this during our face-to-face meeting in Graz:
https://github.com/TEIC/TEI/issues/1823#issuecomment-531465879

Peter
On 10.12.2019 at 15:19, James Cummings wrote:
This seems to be a disastrous error introduced in TEI P5 3.0.0. I despair.

From: Tei-council on behalf of Hugh Cayless
Sent: 10 December 2019 14:11
To: tei-council@lists.tei-c.org
Subject: Re: [Tei-council] standOff problem #1: content model of <teiCorpus>

You can already have <text> as a child of <teiCorpus> I'm afraid...
On Tue, Dec 10, 2019 at 7:10 AM James Cummings
wrote: I think Peter's objection is a killer for putting teiCorpus into model.resourceLike if that is still the case with Syd's formulation of it.
What if instead we simplified the current content model by putting TEI and teiCorpus into a special class all of their own, and then having an alternation between that and model.resourceLike? Or would that result in the same ambiguous situation? I don't have a problem with TEI and teiCorpus being treated the same as long as we don't end up with things like <text> being a child of teiCorpus.
Many thanks, James
-- Dr James Cummings, James.Cummings@newcastle.ac.uk Senior Lecturer in Late-Medieval Literature and Digital Humanities School of English, Newcastle University From: Tei-council
on behalf of Peter Stadler Sent: 10 December 2019 07:10 To: Martin Holmes Cc: tei-council@lists.tei-c.org Subject: Re: [Tei-council] standOff problem #1: content model of <teiCorpus> I’m really sorry not being able to join you later (to bring it forward myself) but please have a look at ticket https://github.com/TEIC/TEI/issues/1823 and PR https://github.com/TEIC/TEI/pull/1922where I already tried to tackle the content model of teiCorpus. The problem we faced (and still seems evident with Syd’s proposal) that teiCorpus would allow a structure like <teiCorpus xmlns="http://www.tei-c.org/ns/1.0"> <teiHeader> ... </teiHeader> <text> ... </text> </teiCorpus> which we thought was not intended.
Best Peter
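The structure Peter describes can be checked mechanically against the content model quoted at the start of this thread, teiHeader, ( ( model.resourceLike+, ( TEI | teiCorpus )* ) | ( TEI | teiCorpus )+ ). The following is a minimal Python sketch, not a real validator: it assumes the pre-standOff membership of model.resourceLike listed in that message (<facsimile>, <fsdDecl>, <sourceDoc>, <text>), looks only at the names of the direct children, and uses an illustrative helper called allowed.

```python
import re

# One child name followed by a space, drawn from each group (memberships as listed in the thread).
RESOURCE = r"(?:(?:facsimile|fsdDecl|sourceDoc|text) )"
NESTED = r"(?:(?:TEI|teiCorpus) )"

# teiHeader, ( ( model.resourceLike+, ( TEI | teiCorpus )* ) | ( TEI | teiCorpus )+ )
CURRENT_TEICORPUS = re.compile(rf"teiHeader (?:{RESOURCE}+{NESTED}*|{NESTED}+)")


def allowed(children):
    """True if this sequence of child element names fits the model above."""
    seq = "".join(name + " " for name in children)
    return CURRENT_TEICORPUS.fullmatch(seq) is not None


print(allowed(["teiHeader", "text"]))         # the structure from the message above
print(allowed(["teiHeader", "TEI", "TEI"]))   # a conventional corpus
print(allowed(["teiHeader", "TEI", "text"]))  # resource after a nested <TEI>
```

On this reading the first two sequences fit the model and the third does not, which agrees with the observation that a bare <text> child is accepted by the model as quoted.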
On 10.12.2019 at 00:50, Martin Holmes wrote:
I agree with Syd on this: teiCorpus should be a member of model.resourceLike. The fact that as a result TEI and teiCorpus become essentially the same thing is actually a plus for me. I think any line drawn between the two is terminological rather than structural, and while it might be clearly delineated in the case of particular projects, in the broader view it's very fuzzy indeed.
Cheers, Martin
Good. But how is it going to be fixed?
On 10/12/2019 18:54, Peter Stadler wrote:
Yes, and we already decided to fix this during our f2f at Graz: https://github.com/TEIC/TEI/issues/1823#issuecomment-531465879
Peter
Maybe it's time to reconsider adding TEI to model.resourceLike. This is not "just a matter of terminology" -- or rather, no more so than everything in the TEI is: a classic "TEI" thing was meant to be a combination of metadata with one or more non-metadata resources. (Yes, the distinction between metadata and not-metadata is arbitrary and conventional. So is the distinction between text and div. Next.) A classic "teiCorpus" thing was meant to be a combination of metadata and one or more "TEI" things.
So, if the issue we are trying to fix is that in Standoffland there is a need to consider a "TEI" thing as being itself a unitary resource, I humbly suggest the way is to call it something else: an encapsulated TEI, for example, and add that new thingie to model.resourceLike, keeping the original definitions more or less intact. Otherwise everyone is going to get horribly confused, and I don't think it's possible to create a model which will not permit nonsense.
On 11/12/2019 16:24, Lou Burnard wrote:
Good. But how is it going to be fixed?
LB> Good. But how is it going to be fixed?
PS> Yes, and we already decided to fix this during our f2f at Graz:
I have now taken a first crack at this as a starting point for discussion, per Council's discussions yesterday. See https://jenkins3.tei-c.org/job/TEIP5-sydb-standOff/lastSuccessfulBuild/artif... and https://jenkins3.tei-c.org/job/TEIP5-sydb-standOff/lastSuccessfulBuild/artif...
Main changes:
* "model.resourceLike" renamed to "model.resource"
* new class, "model.describedResource", for those things that group a <teiHeader> with one or more resources, i.e. that have metadata describing the resource(s)
* <TEI> no longer part of model.resource, now a member of model.describedResource along with <teiCorpus>
* Content model of <TEI> changed to: ( teiHeader, ( ( model.resource+, TEI* ) | TEI+ ) )
* Content model of <teiCorpus> changed to: ( teiHeader, model.resource*, model.describedResource+ )
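To see what the two revised content models above would accept, they can be transcribed as regular expressions over the names of an element's direct children. This is only a sketch under stated assumptions: model.resource is taken to contain <facsimile>, <fsdDecl>, <sourceDoc>, <text> and <standOff> (the last one is an assumption, not confirmed in this message), the helper names group and allowed are illustrative, and nothing below the direct children is checked.

```python
import re

# Assumed class memberships; <standOff> in model.resource is a guess based on the thread.
MODEL_RESOURCE = ["facsimile", "fsdDecl", "sourceDoc", "text", "standOff"]
MODEL_DESCRIBED_RESOURCE = ["TEI", "teiCorpus"]


def group(names):
    """Regex for one child element name (plus a trailing space) drawn from a class."""
    return "(?:(?:" + "|".join(names) + ") )"


RES = group(MODEL_RESOURCE)
DESC = group(MODEL_DESCRIBED_RESOURCE)
TEI_ONLY = group(["TEI"])

CONTENT_MODELS = {
    # ( teiHeader, ( ( model.resource+, TEI* ) | TEI+ ) )
    "TEI": re.compile(f"teiHeader (?:{RES}+{TEI_ONLY}*|{TEI_ONLY}+)"),
    # ( teiHeader, model.resource*, model.describedResource+ )
    "teiCorpus": re.compile(f"teiHeader {RES}*{DESC}+"),
}


def allowed(parent, children):
    """True if the child-name sequence fits the revised model for the given parent."""
    seq = "".join(name + " " for name in children)
    return CONTENT_MODELS[parent].fullmatch(seq) is not None


print(allowed("teiCorpus", ["teiHeader", "text"]))         # <text> as the only child
print(allowed("teiCorpus", ["teiHeader", "text", "TEI"]))  # <text> followed by a <TEI>
print(allowed("TEI", ["teiHeader", "teiCorpus"]))          # <teiCorpus> inside <TEI>
```

Read this way, a <teiCorpus> whose only content is a <text> no longer fits, but a <text> followed by at least one <TEI> or <teiCorpus> still does, so whether <text> should be a permitted child of <teiCorpus> at all remains a question for discussion.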
participants (7)
- Elisa Beshero-Bondar
- Hugh Cayless
- James Cummings
- Lou Burnard
- Martin Holmes
- Peter Stadler
- Syd Bauman