I have altered the P5/Scripts/purify.xml to turn rng:interleave into <sequence preserveOrder="false">. My question for the group is "should I check it in"?

I changed it to do some conversion of WWP ODD files to Pure ODD, and it did what I wanted. But it's not at all clear to me that this is a good general-purpose solution. That's because the semantics of <interleave> may or may not be the same as the semantics of <sequence preserveOrder="false">. And that's because the semantics of the former are well defined and not precisely what many of us would expect, and the semantics of the latter are all but undefined.

First, allow me to gripe again that the naming of the Pure ODD construct for "all of these required, order not constrained" is at least problematic, and at worst makes us look like idiots. By the definition of the word, a sequence is ordered. Adding an attribute that can outright contradict the name of the element is embarrassing.

Now that I have that off my chest ... a quick review of <interleave> is in order. (Pardon the pun. :-) Take the RELAX NG snippet

    element test = { ( a, b, c ) & ( x, y, z ) }

You might think that only two XML fragments would be valid against that, namely

    <a/> <b/> <c/> <x/> <y/> <z/>

and

    <x/> <y/> <z/> <a/> <b/> <c/>

But in truth, because the RELAX NG interleave operator (aka "&") is not the same as the SGML DTD "&" operator, the following (among others) are also valid:

    <a/> <x/> <b/> <c/> <y/> <z/>
    <a/> <b/> <x/> <c/> <y/> <z/>
    <x/> <y/> <a/> <b/> <c/> <z/>
    <a/> <x/> <y/> <b/> <c/> <z/>
    <a/> <b/> <x/> <y/> <c/> <z/>
    <a/> <x/> <y/> <z/> <b/> <c/>

That is, as long as the *relative* order within each sub-group is preserved (i.e., <a> before <b> before <c>, and <x> before <y> before <z>), members of the two groups can be intermixed in any order. The Pure ODD @preserveOrder on <sequence> does not define which of these behaviors, or something else, is intended.
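The interleave behaviour described above can be checked by brute force. The following is only a sketch in Python, not anything taken from purify or the Stylesheets; the helper name `interleavings` is made up for illustration. It merges the two groups in every way that keeps each group's internal order and prints the resulting child sequences.

```python
from itertools import combinations

def interleavings(first, second):
    """Yield every merge of two sequences that keeps each one's internal order."""
    total = len(first) + len(second)
    # Pick which positions of the merged sequence are occupied by `first`;
    # the remaining positions are filled, in order, from `second`.
    for slots in combinations(range(total), len(first)):
        chosen = set(slots)
        a, b = iter(first), iter(second)
        yield tuple(next(a) if i in chosen else next(b) for i in range(total))

# The two groups from the RELAX NG snippet ( a, b, c ) & ( x, y, z ).
valid = list(interleavings(("a", "b", "c"), ("x", "y", "z")))
print(len(valid))
for order in valid:
    print(" ".join(f"<{name}/>" for name in order))
```

The count is the number of ways to choose 3 of the 6 positions for the first group, so the two "obvious" orderings are a small minority of what RELAX NG accepts for this pattern.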
AFAIK the plan is that the Stylesheets should produce <interleave> in the output RELAX NG schema, although they do not do so correctly yet.[1] (There is no equivalent construct in XML DTDs or W3C Schema 1.0, and I don't know about W3C Schema 1.1.)

I guess my instinct is that we should convert <interleave> into <sequence preserveOrder="false">, since we are likely to do that in the other direction. But I don't want to make that decision entirely by myself.

P.S. Personally, it seems to me we should define preserveOrder="false" as interleave, as it is not feasible to express the SGML DTD '&' semantics for an arbitrarily large number of children in any modern schema language.[2]

Notes
-----
[1] Stylesheets issue #241.
[2] The way you do it is to say "this order OR that order" for each possible ordering, but the number of possible orderings goes up factorially. Thus to express the SGML DTD concept "a & b & c & d & e" in XML DTD or in RELAX NG w/o <interleave> takes the disjunction of 120 clauses; to express "a & b & c & d & e & f & g & h & i & j" takes 3.6 million clauses. To say "all the members of model.biblLike are required, but in any order" using this method would take ~10 ZB or ~8.5 ZiB of disk space, or ~75 times all of the data Google holds.
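The factorial growth in note [2] is easy to reproduce. A minimal sketch in Python, assuming each name stands for exactly one required element; the function name `sgml_and_as_choice` is hypothetical, and the output is RELAX NG compact-syntax-style text rather than a complete schema.

```python
from itertools import permutations
from math import factorial

def sgml_and_as_choice(names):
    """Spell out 'all required, any order' as a choice of fixed-order groups."""
    return " | ".join("(" + ", ".join(p) + ")" for p in permutations(names))

# Three names already need 3! = 6 alternatives.
print(sgml_and_as_choice(["a", "b", "c"]))

# Number of alternatives needed for the 5- and 10-name cases in note [2].
for n in (5, 10):
    print(n, factorial(n))
```

Only the clause counts are computed here; the disk-space estimate for model.biblLike depends on the size of that class and of each clause, which this sketch does not attempt to model.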
On 12/03/17 18:18, Syd Bauman wrote:
I have altered the P5/Scripts/purify.xml to turn rng:interleave into <sequence preserveOrder="false">. My question for the group is "should I check it in"?
My answer is "no". The correct translation for rng:interleave into ODD is <interleave>. If we decide to remove that concept from the ODD language (as we seem rather hastily to have done), providing an alternative which doesn't mean the same thing is just silly.

Whether or not you like <sequence preserveOrder="false">, it's not the same as <interleave>. Redefining it to mean <interleave> would just be reintroducing, by a back door, the functionality we've already decided we don't want. Better by far to remove both of them. Better yet to keep it, and bring back interleave!

Also, as far as I am aware, the purify.xsl script is intended for use when converting existing pre-5.0 ODD, as used in the Guidelines source or in ODDs referencing them, to the new syntax, not as a general purpose RNG conversion tool.
The Pure ODD @preserveOrder on <sequence> does not define which of these behaviors, or something else, is intended. AFAIK the plan is that the Stylesheets should produce <interleave> in the output RELAX NG schema, although they do not do so correctly yet.[1] (There is no equivalent construct in XML DTDs or W3C Schema 1.0, and I don't know about W3C Schema 1.1.)
Why do you think that "the plan is to produce <interleave>"? Might an implementor not decide equally well to treat e.g.

    <sequence preserveOrder="false">
      <elementRef key="foo"/><elementRef key="bar"/>
    </sequence>

as a shortcut for

    <alternate>
      <sequence>
        <elementRef key="foo"/><elementRef key="bar"/>
      </sequence>
      <sequence>
        <elementRef key="bar"/><elementRef key="foo"/>
      </sequence>
    </alternate>

with the obvious proviso that the number of possible children in the disordered sequence can't be greater than some manageable maximum.
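The shortcut reading suggested above could be implemented along the following lines. This is a sketch in Python only, not how the TEI Stylesheets work: the function name `expand_unordered` and the cut-off `MAX_CHILDREN` are hypothetical, the TEI namespace is omitted for brevity, and only <elementRef> children are handled.

```python
from itertools import permutations
import xml.etree.ElementTree as ET

MAX_CHILDREN = 5  # hypothetical "manageable maximum"; 5 children already give 120 sequences

def expand_unordered(keys, max_children=MAX_CHILDREN):
    """Expand an unordered group of elementRef keys into <alternate> of ordered <sequence>s."""
    if len(keys) > max_children:
        raise ValueError(f"{len(keys)} children would need too many alternatives")
    alternate = ET.Element("alternate")
    # One <sequence> per possible ordering of the children.
    for order in permutations(keys):
        seq = ET.SubElement(alternate, "sequence")
        for key in order:
            ET.SubElement(seq, "elementRef", key=key)
    return alternate

alt = expand_unordered(["foo", "bar"])
print(ET.tostring(alt, encoding="unicode"))
```

Note that this expansion gives the SGML-style "any order" reading, which is not the same as RELAX NG interleave once the children are themselves groups.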
Participants (2): Lou Burnard, Syd Bauman