Question for Syd in particular, but anyone else who understands this please chip in. https://github.com/TEIC/TEI/commit/a55fa633ddb8a4a1a749222b6120bce90fa316cd causes a Stylesheets test (test30.odd) to fail. Syd’s commit message led me to the (well, a) solution, which is to add a flag to Jing validation. But is this in fact leading to a risk of generating invalid Relax schemas? Can that be helped?
On 13/10/15 16:20, Hugh Cayless wrote:
> Question for Syd in particular, but anyone else who understands this please chip in. https://github.com/TEIC/TEI/commit/a55fa633ddb8a4a1a749222b6120bce90fa316cd causes a Stylesheets test (test30.odd) to fail. Syd’s commit message led me to the (well, a) solution, which is to add a flag to Jing validation. But is this in fact leading to a risk of generating invalid Relax schemas? Can that be helped?
On the face of it, globally removing the check for ID/IDREF compatibility seems like a bad idea. Maybe Syd could expand a little on what is causing 'the dreaded "conflicting ID-types" error'?
For the record, I think probably the right thing to do is to use the -i (aka checkid=false) on calls to `jing`, and use some other mechanism to check ID/IDREF. (As mentioned in [1].)
I think Eric van der Vlist says this well: "Basically, what's happening here is that DTD compatibility mode emulates even the restrictions of a DTD." [paraphrased from [2]]
Again, paraphrasing Eric van der Vlist: The requirement is that if an element <foo> is defined with a @bar attribute of type ID, all the other definitions of a foo/@bar must also be of type ID. But hidden in the definition of macro.anyXML there can be a <foo> having an attribute @bar of type text.[3]
So the normal definition of foo/@bar as ID and the macro.anyXML definition of foo/@bar as text (or IDREF or whatever) conflict.
If I understand correctly[4], there are 3 tests that we would in theory need to perform to duplicate the ID/IDREF tests. They are:
1) After performing normalize-space() on it,
   a. the value of each attribute of type ID has one NCName
   b. the value of each attribute of type IDREF has one token
   c. the value of each attribute of type IDREFS has 1+ token
2) No two attributes of type ID have the same values.
3) For each token in an IDREF or IDREFS attribute, there is a token in an ID attribute with the same value.
Note that testing (1a) is done by RELAX NG whether ID/IDREF checking is on or not. So we don't have to worry about that.
Note also that, because we only have one attribute name in the entire TEI scheme that is of type ID (namely @xml:id), testing 1a (if it were needed) and (2) is pretty easy. See [5] for methods of testing for (2) with ISO Schematron.
BUT the IMPORTANT bit is that (afaik) we don't use any IDREF or IDREFS attributes in the TEI schema *at all*. (The string "IDREF" does not occur in .../TEI/P5/Exemplars/tei_all.rnc, nor in any of .../TEI/P5/Source/Specs/*.xml. However, it does occur nearly a dozen times in .../TEI/P5/Source/Guidelines/*/*.xml; I haven't looked at what the prose says yet.)
So I am reasonably confident that all we need to do to work around this problem is disable ID/IDREF checking, and then use one of the snippets of Schematron at [5] to test for (2).
Adding a <constraintSpec> for (2) to the Guidelines would also mean users could turn off ID/IDREF checking in oXygen without losing any validation for TEI files. (Although they would have to use the Schematron, and they might want ID/IDREF checking for their non-TEI files, of course.)
Notes
-----
[1] http://lists.lists.tei-c.org/pipermail/tei-council/2015/021830.html
[2] http://www.relaxng.org/pipermail/relaxng-user/2003-September/000029.html
[3] http://books.xmlschemata.org/relaxng/relax-CHP-11-SECT-4.html
[4] And there's a good chance I may not. Take a look at http://relaxng.org/compatibility.html#id to understand why.
[5] http://wiki.tei-c.org/index.php/Xmlid_uniqueness.sch
Lou Burnard writes:
> On the face of it, globally removing the check for ID/IDREF compatibility seems like a bad idea. Maybe Syd could expand a little on what is causing 'the dreaded "conflicting ID-types" error'?
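For anyone who wants to see test (2) from Syd's list outside a validator, here is a minimal sketch in Python. It is not the Schematron referred to at [5], only an illustration of the same check, and it assumes that @xml:id is the only ID-typed attribute, as stated in the thread. The function name `duplicate_xml_ids` and the sample document are made up for this example.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# ElementTree expands the predefined xml: prefix to this namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def duplicate_xml_ids(xml_text):
    """Return the sorted xml:id values that occur more than once (test 2)."""
    root = ET.fromstring(xml_text)
    counts = Counter(
        " ".join(el.get(XML_ID).split())  # rough stand-in for normalize-space()
        for el in root.iter()
        if el.get(XML_ID) is not None
    )
    return sorted(value for value, n in counts.items() if n > 1)


# Hypothetical sample: two elements share the identifier "duck".
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <p xml:id="duck">one</p>
  <p xml:id="goose">two</p>
  <note xml:id="duck">three</note>
</TEI>"""

print(duplicate_xml_ids(SAMPLE))
```

Tests (1b), (1c) and (3) are left out on purpose, since the thread reports that the TEI schema declares no IDREF or IDREFS attributes.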
Thanks for the explication, Syd. What worries me though is why we're having this problem now. We've had a macro.anyXML in the Guidelines forever, or at least since P5 1.0, so why is this "conflicting ID-types" problem only surfacing now?
As to the absence of IDREF(S) -- the reason is simple enough: ages ago we decided to turn everything like that into a URI. During the build process there is XSLT which checks that locally defined URIs are satisfied (i.e. that if you say somewhere target="#foo", there is somewhere something with @xml:id="foo").
I suspect that if you set checkid=false, we will no longer detect the presence of two things with @xml:id="foo", which seems to me a distinctly retrograde step. (I speak as one who has spent hours tweaking xml:id values in examples in the Guidelines.)
On 15/10/15 03:49, Syd Bauman wrote:
> For the record, I think probably the right thing to do is to use the -i (aka checkid=false) on calls to `jing`, and use some other mechanism to check ID/IDREF.
> [...]
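The build-time check Lou describes (every local pointer such as target="#foo" must have a matching @xml:id="foo") is done in XSLT, which is not shown in this thread. As a rough stand-in, the same idea can be sketched in Python. The function name `dangling_local_pointers`, the `pointer_attrs` parameter and the sample document are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def dangling_local_pointers(xml_text, pointer_attrs=("target",)):
    """Return local pointers ("#name") that match no xml:id in the document."""
    root = ET.fromstring(xml_text)
    ids = {el.get(XML_ID) for el in root.iter() if el.get(XML_ID) is not None}
    missing = []
    for el in root.iter():
        for attr in pointer_attrs:
            # A pointer attribute may hold several whitespace-separated URIs.
            for token in el.get(attr, "").split():
                # Only same-document pointers are checked; full URIs are skipped.
                if token.startswith("#") and token[1:] not in ids:
                    missing.append(token)
    return missing


# Hypothetical sample: "#foo" resolves, "#bar" does not.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <p xml:id="foo">text</p>
  <ref target="#foo">ok</ref>
  <ref target="#bar http://example.org/#elsewhere">broken</ref>
</TEI>"""

print(dangling_local_pointers(SAMPLE))
```

Note that this covers pointer resolution only. It says nothing about duplicate @xml:id values, which is the part Lou fears would be lost with checkid=false.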
* I think I can figure out why having macro.anyXML in tei module, instead of TD causes this problem to rear its ugly head, but I can't do that now. (I understood it at one point a few weeks ago, so I can reconstruct it, but I'm buried right now.)
* Yes, you are absolutely right about IDREF(S). The reason I was searching was to make sure we really had none.
* You are absolutely correct, that with checkid=false we will no longer detect the presence of two things with @xml:id="duck" *with the RELAX NG validation*. We absolutely need to check for this condition, but I don't see any reason that test has to be done in RELAX NG rather than something else. In particular, it's quite easy to do in Schematron.
On 15/10/15 15:55, Syd Bauman wrote:
* I think I can figure out why having macro.anyXML in tei module, instead of TD causes this problem to rear its ugly head, but I can't do that now. (I understood it at one point a few weeks ago, so I can reconstruct it, but I'm buried right now.)
Standing by for your explanation. I looked back through the version history, and I see that Sebastian defined the current content model back on 22 May 2010 with the comment "get anyXML working again, to avoid ID errors" (https://github.com/TEIC/TEI/commit/809cba3dbaf05ce88d42776afb83a54ffa0085e2), so I assume it must have worked then.
There is a later modification (16 Jun 2011), "limit macro.anyXML macro.schemaPattern to tagdocs module, so as not to upset others", which moves it from tei to tagdocs, so clearly it did once work when it was a part of the tagdocs module. https://github.com/TEIC/TEI/commit/c6b6f0648a924ad910c986d3224bff94d0f760f0
On 15/10/15 18:54, Lou Burnard wrote:
clearly it did once work when it was a part of the tagdocs module. https://github.com/TEIC/TEI/commit/c6b6f0648a924ad910c986d3224bff94d0f760f0
Sorry, make that "BEFORE it was a part of the tagdocs module".
Archaeology also unearthed a (moderately) illuminating article by J Clark on why this is happening (see http://blog.jclark.com/2009/01/relax-ng-and-xmlid.html), but I cannot find any support for Sebastian's claim to have fixed this in 2010. Except (obviously) that we haven't seen the problem till now.
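The rule behind the "conflicting ID-types" error, as paraphrased from Eric van der Vlist earlier in the thread, can be modelled with a toy sketch in Python. This is a deliberate simplification and not how Jing implements it: each attribute declaration is reduced to an (element, attribute, ID-type) triple, and `ANY` is a made-up wildcard standing in for a pattern that matches any name, such as the one the thread says is hidden in macro.anyXML.

```python
from itertools import combinations

ANY = "*"  # hypothetical wildcard: matches any element or attribute name


def names_overlap(a, b):
    """True if two (possibly wildcard) names could match the same name."""
    return a == ANY or b == ANY or a == b


def conflicting_id_types(declarations):
    """Return pairs of declarations that can describe the same
    element/attribute combination but disagree on its ID-type.

    Each declaration is (element, attribute, id_type), where id_type is
    "ID", "IDREF", "IDREFS", or None for plain text.
    """
    conflicts = []
    for d1, d2 in combinations(declarations, 2):
        (e1, a1, t1), (e2, a2, t2) = d1, d2
        if names_overlap(e1, e2) and names_overlap(a1, a2) and t1 != t2:
            conflicts.append((d1, d2))
    return conflicts


# foo/@bar is declared as ID, but a wildcard pattern also allows foo/@bar as text.
with_wildcard = [("foo", "bar", "ID"), (ANY, ANY, None)]
# No wildcard: the two declarations can never match the same element.
without_wildcard = [("foo", "bar", "ID"), ("other", "bar", None)]

print(conflicting_id_types(with_wildcard))
print(conflicting_id_types(without_wildcard))
```

In this model the first list yields one conflicting pair and the second yields none, which is the shape of the problem described in the thread: a catch-all pattern collides with the ID-typed declaration unless it is prevented from matching the same elements.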
macro.anyXML was previously in the "tagdocs" module, now it’s in "tei". Probably the Stylesheets test we ran afoul of wasn’t importing tagdocs at all. Isn’t this going to cause problems anyway though? For example, Oxygen complains about the test RNG. Presumably it will let me turn this check off somehow, but unless I do that, I’d have trouble validating documents with it, right?
On Oct 15, 2015, at 6:37, Lou Burnard wrote:
> Thanks for the explication Syd. What worries me though is why we're having this problem now. We've had a macro.anyXML in the Guidelines forever, or at least since P5 1.0, so why is this "conflicting ID-types" problem only surfacing now?
> [...]
-- tei-council mailing list tei-council@lists.tei-c.org http://lists.lists.tei-c.org/mailman/listinfo/tei-council
PLEASE NOTE: postings to this list are publicly archived
See my posting at 19h17. Before it was moved to the tagdocs module it was in the tei module *and it still worked*.
On 15/10/15 20:35, Hugh Cayless wrote:
> macro.anyXML was previously in the "tagdocs" module, now it’s in "tei". Probably the Stylesheets test we ran afoul of wasn’t importing tagdocs at all. Isn’t this going to cause problems anyway though?
> [...]
Yeah, not saying we can’t make it work, rather that the current state is a bit of a problem even if we can route our own processes around it: we can still end up with people building (apparently) broken RNG schemas from reasonable-looking ODDs. No?
On Oct 15, 2015, at 15:59, Lou Burnard wrote:
> See my posting at 19h17. Before it was moved to the tagdocs module it was in the tei module *and it still worked*.
Yes, this is why I wasn't very happy with the suggestion of turning off ID/IDREF validation. Replacing that with Schematron rules seems like a suboptimal solution for most users. But I dunno what to do about it, because I still don't understand why it used to work, and now doesn't.
On 15/10/15 21:05, Hugh Cayless wrote:
> [...] we can still end up with people building (apparently) broken RNG schemas from reasonable-looking ODDs. No?
participants (3)
- Hugh Cayless
- Lou Burnard
- Syd Bauman