
I started trying to test this, but didn't get very far.

1. What happens to a content model like this:

<alternate maxOccurs="unbounded">
  <textNode/>
  <elementRef key="hi" />
</alternate>

in RNC (using Syd's latest version of the stylesheets -- I think) this gives us
(text|hi)+
[krekt]

in DTD I get
(#PCDATA|hi)*
which is not quite krekt but at least is valid

in XSD I get
<xs:complexType mixed="true">
  <xs:sequence>
    <xs:element minOccurs="0" maxOccurs="unbounded" ref="tei:hi"/>
  </xs:sequence>
  <!-- ... -->
i.e. same as DTD. good.

2. Let's try using minOccurs

<alternate>
  <textNode/>
  <elementRef key="name" minOccurs="2"/>
</alternate>

generates

in RNC : (name+ | name+ | text) (?WTF?)

in DTD : (#PCDATA|hi)* again [I think the presence of a textnode always generates a mixed content like this. Which is fine]

in XSD
<xs:complexType mixed="true">
  <xs:choice minOccurs="0">
    <xs:element maxOccurs="unbounded" ref="tei:name"/>
    <xs:element maxOccurs="unbounded" ref="tei:name"/>
  </xs:choice>
  ...
</xs:complexType>
[which won't do because it is invalid XSD, cos of the non-ambiguity rule]

3. Maybe the textnode is special. Let's try something faintly plausible

<sequence>
  <elementRef key="title" minOccurs="0" maxOccurs="1"/>
  <elementRef key="name" minOccurs="2" maxOccurs="4"/>
</sequence>

in RNC : (title?, name, name)

in DTD : <!ELEMENT ab (((title)?,name,name,name,name))>

in XSD :
<xs:sequence>
  <xs:element minOccurs="0" ref="tei:title"/>
  <xs:element ref="tei:name"/>
  <xs:element ref="tei:name"/>
</xs:sequence>

None of these of course will validate something with 3 names, even though the ODD content model suggests they should.

I conclude we're all doomed. But you knew that already.
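The point made in case 3 can be checked mechanically. The following is a minimal sketch, not part of the original message: it assumes each generated content model can be approximated by a regular expression over a space-terminated list of child element names. The labels and the `accepts` helper are illustrative names only.

```python
import re

# Each content model is written as a regex over a space-terminated list of
# child element names, e.g. "title name name name ".
MODELS = {
    "ODD intent (name 2..4)": r"(title )?(name ){2,4}",
    "RNC and XSD as generated": r"(title )?(name ){2}",
    "DTD as generated": r"(title )?(name ){4}",
}


def accepts(model: str, children: list) -> bool:
    """Return True if the child sequence matches the whole content model."""
    return re.fullmatch(model, "".join(c + " " for c in children)) is not None


# A <title> followed by three <name> elements, which the ODD should allow.
three_names = ["title", "name", "name", "name"]
for label, model in MODELS.items():
    print(f"{label}: {accepts(model, three_names)}")
```

Only the first model accepts the three-name sequence; the generated RNC/XSD model insists on exactly two names and the generated DTD model on exactly four.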

Thank you, Lou! Really appreciate that someone else took the time to have fun with min and max. (Great name for a pair of dogs, especially if one is diminutive and one is large.:-)

Hmmm ... First, I had to look up what "krekt" stands for. Thank goodness for Wiktionary.

But I get different results, possibly because I haven't pushed up a few changes, but I'm not sure (I didn't think any of those changes would affect this, but hey, if I really understood this it wouldn't take > 3 days!)

With the snippet of ODD provided in [1], I get the following RNC outputs:

1) ( text | hi )+
Correct! (Note that because 'text' matches the empty string, it makes no practical difference whether the <rng:choice> is allowed oneOrMore or zeroOrMore times.)

2) ( text | ( name, name ) )
Seems completely correct to me, but nothing like what Lou got.

3) ( title?, ( name, name, (name, name? )? ) )
Correct! (And deterministic, to boot.) But again, nothing like what Lou got.

And I have never looked at DTD or XSD. The latter is generated by `trang`, so in theory should be correct if handed correct, deterministic, RNG. The former is generated by odd2dtd.xsl, which I hope to have a look at tomorrow.

P.S. I've already pushed up a new P5/Test/testerrmav.odd that includes the snippet below. I will push up changes to teiodds.xsl to sydb-occurs2 branch of Stylesheets in a few mins, I hope.

below
-----
[1]
<tmp:ODD_snippet xmlns="http://www.tei-c.org/ns/1.0">
  <elementSpec ident="loub1" mode="add">
    <classes mode="replace">
      <memberOf key="att.global"/>
      <memberOf key="model.hiLike"/>
    </classes>
    <content>
      <alternate maxOccurs="unbounded">
        <textNode/>
        <elementRef key="hi" />
      </alternate>
    </content>
  </elementSpec>
  <elementSpec ident="loub2" mode="add">
    <classes mode="replace">
      <memberOf key="att.global"/>
      <memberOf key="model.hiLike"/>
    </classes>
    <content>
      <alternate>
        <textNode/>
        <elementRef key="name" minOccurs="2"/>
      </alternate>
    </content>
  </elementSpec>
  <elementSpec ident="loub3" mode="add">
    <classes mode="replace">
      <memberOf key="att.global"/>
      <memberOf key="model.hiLike"/>
    </classes>
    <content>
      <sequence>
        <elementRef key="title" minOccurs="0" maxOccurs="1"/>
        <elementRef key="name" minOccurs="2" maxOccurs="4"/>
      </sequence>
    </content>
  </elementSpec>
</tmp:ODD_snippet>
-- tei-council mailing list tei-council@lists.tei-c.org http://lists.lists.tei-c.org/mailman/listinfo/tei-council
PLEASE NOTE: postings to this list are publicly archived
-- Syd Bauman, EMT-Paramedic Senior XML Programmer/Analyst Northeastern University Women Writers Project s.bauman@northeastern.edu or Syd_Bauman@alumni.Brown.edu
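The shape of the deterministic expansion reported in 3) can be illustrated as follows. This is a minimal sketch only: `expand` is a hypothetical helper written for this illustration, and it is not the algorithm used in teiodds.xsl, which the thread does not show. It assumes that an occurrence range is expanded into the required copies followed by nested optional groups, and that `None` stands for maxOccurs="unbounded".

```python
from typing import Optional


def expand(name: str, min_occurs: int = 1, max_occurs: Optional[int] = 1) -> str:
    """Expand a min/max occurrence range into an RNC-style pattern.

    max_occurs=None stands for maxOccurs="unbounded" (an assumption of
    this sketch, not a convention taken from the Stylesheets).
    """
    if max_occurs is not None and max_occurs < min_occurs:
        raise ValueError("maxOccurs must not be smaller than minOccurs")
    # The required occurrences are simply repeated.
    parts = [name] * min_occurs
    if max_occurs is None:
        parts.append(f"{name}*")
    else:
        # Nest the optional occurrences so that each extra copy is only
        # reachable after the previous one; this keeps the model deterministic.
        optional = ""
        for _ in range(max_occurs - min_occurs):
            optional = f"( {name}, {optional} )?" if optional else f"{name}?"
        if optional:
            parts.append(optional)
    if not parts:
        return "empty"
    if len(parts) == 1:
        return parts[0]
    return "( " + ", ".join(parts) + " )"


# The <sequence> from the loub3 elementSpec.
print("( " + expand("title", 0, 1) + ", " + expand("name", 2, 4) + " )")
```

Apart from whitespace, the printed pattern has the same shape as the RNC output given for 3): two required names followed by a nested optional group allowing a third and a fourth.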

So it seems I was mistaken in thinking I was using your stylesheets branch? I will take another look tomorrow. Past my bedtime now.

Sent from my Honor Mobile

Nope, I definitely have installed Syd's branch: I cloned his branch, did a make install in it, and then ran teitorelaxng (etc.). Still not getting the good stuff you are. I checked that what's in my /usr/share/xml/tei/stylesheet is indeed the same as what's in Syd's branch, and different from what's in the current dev version of Stylesheets. And it is. I am forced to conclude either you haven't updated your branch or I am doing something reeely stupid.

On 27/11/16 22:19, Lou Burnard wrote:
So it seems I was mistaken in thinking I was using your stylesheets branch? I will take another look tomorrow. Past my bedtime now.

Or I am doing something reeely stupid.

You have checked out 'sydb-occurs2', yes? (As opposed to 'sydb-occurs', which is most certainly not working right.)

Another difference: I was using old commandline `roma`, not current `teitorelaxng`. I will give the latter a try after my meeting which starts in 3 mins.

Are you using the P5/Test/testerrmav.odd as an input file?

On 28/11/16 14:57, Syd Bauman wrote:
Another difference: I was using old commandline `roma`, not current `teitorelaxng`. I will give the latter a try after my meeting which starts in 3 mins.
You really should stop using that, IMHO. If you don't want to change names, then make some aliases that really point to the new scripts. ;-)

I'm not saying that is the problem... but it is one of the first things I'd check!

-James

--
Dr James Cummings, James.Cummings@it.ox.ac.uk
Academic IT Services, University of Oxford

Well, perhaps I should. But in the meantime, I just checked the output of my `roma` front-end with the output of

$ teitorelaxng --odd --localsource=/path/to/p5.xml

and the resulting output XML trees are for our purposes exactly the same. (There may be some differences in the serialization, as I converted both to canonical XML before comparing; and some details are different, e.g., the former generates a prefix pattern for each RELAX NG construct, the latter does not, and obviously the timestamps were different).

So, Lou, we still don't know why you & I are getting different results.
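A comparison of this kind can be sketched with the Python standard library. This is only an illustration under stated assumptions: the thread does not say which tool was used for canonicalization, `same_tree` is a hypothetical helper, and the two RELAX NG fragments are made up for the example. It relies on `xml.etree.ElementTree.canonicalize` (C14N 2.0, Python 3.8 or later).

```python
from xml.etree.ElementTree import canonicalize


def same_tree(xml_a: str, xml_b: str) -> bool:
    """Compare two XML documents after C14N 2.0 canonicalization."""
    # strip_text drops leading/trailing whitespace in text and tail content,
    # so pure indentation differences do not count as differences.
    options = {"strip_text": True}
    return canonicalize(xml_a, **options) == canonicalize(xml_b, **options)


# Two serializations of the same (made-up) RELAX NG fragment: attribute
# order, quoting, indentation and empty-element syntax all differ.
a = ('<grammar xmlns="http://relaxng.org/ns/structure/1.0">'
     '<define name="x" combine="choice"><text/></define></grammar>')
b = """<grammar xmlns='http://relaxng.org/ns/structure/1.0'>
  <define combine='choice' name='x'>
    <text></text>
  </define>
</grammar>"""
# A genuinely different tree: the define has another name.
c = a.replace('name="x"', 'name="y"')

print(same_tree(a, b))
print(same_tree(a, c))
```

Differences that are real but uninteresting, such as timestamps in comments or generated prefix patterns, would still have to be filtered out before comparing.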

I think you should also check what DTD and XSD outputs look like. As I said on the original ticket, we ought not to generate DTD or XSD which is actually illegal, even if the legal version matches what the ODD says less closely than the RNG output does.

On 28/11/16 16:55, Syd Bauman wrote:
Well, perhaps I should. But in the meantime, I just checked the output of my `roma` front-end with the output of $ teitorelaxng --odd --localsource=/path/to/p5.xml and the resulting output XML trees are for our purposes exactly the same.
(There may be some differences in the serialization, as I converted both to canonical XML before comparing; and some details are different, e.g., the former generates a prefix pattern for each RELAX NG construct, the latter does not, and obviously the timestamps were different).
So, Lou, we still don't know why you & I are getting different results.

I think you should also check what DTD and XSD outputs look like. As I said on the original ticket, we ought not to generate dtd or xsd which is actually illegal, even if it matches what the ODD says it should less well than the RNG output.
True enough. IIRC, current DTD and XSD generation produces schemas which are valid, but match incorrect # of things?

I checked DTD output on Test/testerrmav.odd (using the Stylesheets in the sydb-occurs2 branch).[1] The result is DTDs that are invalid because the declarations for most, but not all, of the elements defined by <elementSpec> in the customization ODD need parens around the content model.

Isn't that what Lou's code for p5subsetDoctored was supposed to do? Or is that wishful thinking on my part? If so, where was it incorporated into, and why isn't it working here?

Notes
-----
[1] Using both

$ cd /path/to/Stylesheets/ # sydb-occurs2 branch
$ ./bin/teitodtd /path/to/TEI/P5/Test/testerrmav.odd

and

$ cd /path/to/TEI/ # dev branch
$ ANT_OPTS="-Xss2m -Xmx752m -Djava.awt.headless=true" ant -lib ../Utilities/lib/saxon9he.jar:../Utilities/lib/jing.jar -Dtrang=../Utilities/lib/trang.jar -DdefaultSource=`pwd`/../p5subsetDoctored.xml -DXSL=/path/to/Stylesheets -f antruntest.xml -Doutputname=testerrmav -Dtestfile=testerrmav.xml -DoddFile=testerrmav.odd validateodd compileodd dtd rng validaterng cleanup
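A quick way to list the affected declarations could look like the sketch below. It is an illustration, not part of the Stylesheets: the helper names are hypothetical, and since the invalid declarations themselves are not quoted in the thread, the sample DTD text (in particular the `loub3` line) is invented. The check relies on the XML rule that an element content model must be EMPTY, ANY, or a single parenthesized group with an optional occurrence indicator; declarations that start with a parameter-entity reference are skipped because they cannot be judged without expanding the entity.

```python
import re

# Matches <!ELEMENT name content-model>; content models contain no ">".
ELEMENT_DECL = re.compile(r"<!ELEMENT\s+(\S+)\s+(.*?)>", re.DOTALL)


def is_wrapped(model: str) -> bool:
    """True if the model is one parenthesized group, e.g. "(a,b)*"."""
    model = model.strip().rstrip("?*+")
    if not model.startswith("("):
        return False
    depth = 0
    for i, ch in enumerate(model):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                # The first "(" must be closed by the very last character.
                return i == len(model) - 1
    return False


def unparenthesized(dtd_text: str) -> list:
    """Return the names of elements whose content model lacks outer parens."""
    bad = []
    for name, model in ELEMENT_DECL.findall(dtd_text):
        model = model.strip()
        if model in ("EMPTY", "ANY") or model.startswith("%"):
            continue
        if not is_wrapped(model):
            bad.append(name)
    return bad


# Invented sample: the loub3 declaration is missing its outer parentheses.
SAMPLE = """
<!ELEMENT loub1 (#PCDATA|hi)*>
<!ELEMENT loub3 (title)?,name,name>
<!ELEMENT lb EMPTY>
"""
print(unparenthesized(SAMPLE))
```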

I tested XSD output on the same file (using teitoxsd with the sydb-occurs2 stylesheets), and the result seems correct -- there were 2 errors:

1) The content of <xenoData> is flagged as ambiguous. This is a problem with anyXML, the Stylesheets, or xenoData itself; it has nothing to do with the changes in sydb-occurs2, as the exact same problem occurs with the dev branch of the stylesheets.

2) The customized content model for <bibl> is flagged as ambiguous; which, it turns out, it is.

When I run the same ODD through teitoxsd with the dev stylesheets, the errors are quite a bit worse: spurious maxOccurs="unbounded" are inserted in some places, breaking the XSD.

The anyElement-xenodata problem is a worrying one. I can reproduce it using your ODD, but this doesn't occur in the Exemplars, where, given the exact same RNG definition, Trang produces a slightly different (and correct) xsd:group. Spooky.

The busted one:

<xs:group name="anyElement-xenoData">
  <xs:choice>
    <xs:any namespace="##other" processContents="skip"/>
    <xs:any namespace="##other" processContents="skip"/>
  </xs:choice>
</xs:group>

The correct one:

<xs:group name="anyElement-xenoData">
  <xs:choice>
    <xs:any namespace="##other" processContents="skip"/>
    <xs:any namespace="##local" processContents="skip"/>
  </xs:choice>
</xs:group>

This from:

<define name="anyElement-xenoData">
  <element>
    <anyName>
      <except>
        <nsName ns="http://www.tei-c.org/ns/1.0"/>
        <name ns="http://www.tei-c.org/ns/Examples">egXML</name>
      </except>
    </anyName>
    <zeroOrMore>
      <attribute>
        <anyName/>
      </attribute>
    </zeroOrMore>
    <zeroOrMore>
      <choice>
        <text/>
        <ref name="anyElement-xenoData"/>
      </choice>
    </zeroOrMore>
  </element>
</define>

On Tue, Nov 29, 2016 at 12:32 AM, Syd Bauman <s.bauman@northeastern.edu> wrote:
I tested XSD output on the same file (using teitoxsd with the sydb-occurs2 stylesheets), and the result seems correct -- there were 2 errors:
1) The content of <xenoData> is flagged as ambiguous. This is a problem with anyXML, the Stylesheets, or xenoData itself; it has nothing to do with the changes in sydb-occurs2, as the exact same problem occurs with the dev branch of the stylesheets.
2) The customized content model for <bibl> is flagged as ambiguous; which, it turns out, it is.
When I run the same ODD through teitoxsd with the dev stylesheets, the errors are quite a bit worse: spurious maxOccurs="unbounded" are inserted in some places, breaking the XSD.
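The difference between the two xs:group fragments quoted above can be detected with a very rough check. The sketch below is an illustration only and is far from a real Unique Particle Attribution checker: it merely flags literally identical particles inside the same xs:choice, which is enough to tell the busted group from the correct one. `duplicate_particles` is a hypothetical helper, and the xmlns:xs declaration is added so that each fragment parses on its own.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

BUSTED = """
<xs:group xmlns:xs="http://www.w3.org/2001/XMLSchema" name="anyElement-xenoData">
  <xs:choice>
    <xs:any namespace="##other" processContents="skip"/>
    <xs:any namespace="##other" processContents="skip"/>
  </xs:choice>
</xs:group>
"""
CORRECT = BUSTED.replace(
    '<xs:any namespace="##other" processContents="skip"/>\n  </xs:choice>',
    '<xs:any namespace="##local" processContents="skip"/>\n  </xs:choice>',
)


def duplicate_particles(xsd_fragment: str) -> list:
    """Return particles that occur more than once within one xs:choice."""
    root = ET.fromstring(xsd_fragment)
    dupes = []
    for choice in root.iter(f"{{{XS}}}choice"):
        seen = set()
        for particle in choice:
            # A particle is identified by its tag plus all of its attributes.
            key = (particle.tag, tuple(sorted(particle.attrib.items())))
            if key in seen:
                dupes.append(key)
            seen.add(key)
    return dupes


print(len(duplicate_particles(BUSTED)))   # number of repeated alternatives
print(len(duplicate_particles(CORRECT)))
```

The same check would also flag the repeated `tei:name` alternatives in the XSD from case 2 of the first message, though a validator remains the authority on whether a schema is ambiguous.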
participants (4)
- Hugh Cayless
- James Cummings
- Lou Burnard
- Syd Bauman