I have altered P5/Scripts/purify.xml to turn <rng:interleave> into
<sequence preserveOrder="false">. My question for the group is:
should I check it in?
I made the change in order to convert some WWP ODD files to Pure ODD,
and it did what I wanted. But it is not at all clear to me that this
is a good general-purpose solution, because the semantics of
<interleave> may or may not be the same as the semantics of <sequence
preserveOrder="false">. And that is because the semantics of the
former are well defined and not precisely what many of us would
expect, and the semantics of the latter are all but undefined.
First, allow me to gripe again that the naming of the Pure ODD
construct for "all of these required, order not constrained" is at
least problematic, and at worst makes us look like idiots. By the
definition of the word, a sequence is ordered. Adding an attribute
that can outright contradict the name of the element is embarrassing.
Now that I have that off my chest ... a quick review of <interleave>
is in order. (Pardon the pun. :-)
Take the RELAX NG snippet
element test = { ( a, b, c ) & ( x, y, z ) }
You might think that only two XML fragments would be valid against
that, namely
<a/> <b/> <c/> <x/> <y/> <z/>
and
<x/> <y/> <z/> <a/> <b/> <c/>
But in truth, because the RELAX NG interleave operator (aka "&") is
not the same as the SGML DTD "&" operator, the following are also
valid, among others (there are 20 valid orderings in all):
<a/> <x/> <b/> <c/> <y/> <z/>
<a/> <b/> <x/> <c/> <y/> <z/>
<x/> <y/> <a/> <b/> <c/> <z/>
<a/> <x/> <y/> <b/> <c/> <z/>
<a/> <b/> <x/> <y/> <c/> <z/>
<a/> <x/> <y/> <z/> <b/> <c/>
That is, as long as the *relative* order of each sub-group is
preserved (i.e., <a> must come before <b> must come before <c>),
members of the two groups can be intermixed in any order.
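
For anyone who wants to check the full set, here is a minimal Python sketch (an illustration only, not part of purify.xml or the Stylesheets; the function name `interleavings` is made up for this example). It enumerates every merge of the two groups that keeps each group's relative order, which is what RELAX NG interleave permits for this content model.

```python
from itertools import combinations


def interleavings(first, second):
    """Yield every merge of two sequences that keeps each one's internal order."""
    total = len(first) + len(second)
    # Choose which positions in the merged sequence are taken by `first`;
    # the remaining positions are filled, in order, from `second`.
    for slots in combinations(range(total), len(first)):
        chosen = set(slots)
        it_first, it_second = iter(first), iter(second)
        yield tuple(
            next(it_first) if i in chosen else next(it_second)
            for i in range(total)
        )


results = list(interleavings(("a", "b", "c"), ("x", "y", "z")))
for seq in results:
    print(" ".join(f"<{name}/>" for name in seq))
print(len(results))  # 20, i.e. "6 choose 3"
```

The SGML DTD "&" reading would accept only two of those twenty orderings.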
The Pure ODD @preserveOrder on <sequence> does not define which of
these behaviors, or something else, is intended. AFAIK the plan is
that the Stylesheets should produce <interleave> in the output RELAX
NG schema, although they do not do so correctly yet.[1] (There is no
equivalent construct in XML DTDs or W3C Schema 1.0, and I don't know
about W3C Schema 1.1.)
I guess my instinct is that we should convert <interleave> into
<sequence preserveOrder="false">, since the Stylesheets are likely to
map <sequence preserveOrder="false"> to <interleave> going in the
other direction. But I don't want to make that decision entirely by
myself.
P.S. Personally, it seems to me we should define preserveOrder=false
as interleave, as it is not feasible to express the SGML DTD '&'
semantics for an arbitrarily large number of children in any
modern schema language.[2]
Notes
-----
[1] Stylesheets issue #241.
[2] The way you do it is to say "this order OR that order" for each
possible ordering, but the number of possible orderings goes up
factorially. Thus to express the SGML DTD concept "a & b & c & d
& e" in XML DTD or in RELAX NG w/o <interleave> takes the
disjunction of 120 clauses; to express "a & b & c & d & e & f & g
& h & i & j" takes 3.6 million clauses. To say "all the members
of model.biblLike are required, but in any order" using this
    method would take ~10 ZB or ~8.5 ZiB of disk space, or ~75 times
    all of the data Google holds.
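
The clause counts in note [2] can be reproduced with a short Python sketch. This is an illustration under my own assumptions: the helper name `sgml_and_as_choice` is hypothetical, and the output is a RELAX NG compact-syntax choice of ordered groups written without the "&" operator.

```python
from itertools import permutations
from math import factorial


def sgml_and_as_choice(names):
    """Spell out SGML 'a & b & ...' as a choice of every possible ordered group."""
    return " | ".join("(" + ", ".join(p) + ")" for p in permutations(names))


# Three children already need 3! = 6 alternatives.
print(sgml_and_as_choice(["a", "b", "c"]))

# The number of alternatives is n!, so it grows factorially.
for n in (5, 10):
    print(n, factorial(n))  # 5 -> 120, 10 -> 3628800
```

Only the counts are printed for the larger cases, because writing the alternatives out is exactly what becomes impractical.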