
Kudos to George Bina, who actually simply asked James Clark and Makoto Murata about the "Conflicting ID-types" problem. All we need to do is change the definition of the anyXML pattern so that an @xml:id that it matches that is already defined is declared as ID, not text! Gads, why didn't I think of that?

The problem is this is not all that easy to do. The idea isn't so hard to grok, but implementation may be a tall order.

Pretend we have a schema that defines only 3 elements. The root <test>, which has one or more <para>s, each of which can have text or <name>; of course, any element in our system can have an @xml:id:

    start = element test { attribute xml:id { xsd:ID }?, para+ }
    para = element para { attribute xml:id { xsd:ID }?, ( text | name )* }
    name = element name {
       attribute xml:id { xsd:ID }?,
       ( "Horslips" | "Heart" | "Berlin" | "Blondie"
       | "Quarterflash" | "Renaissance" )
    }

That's a lovely working schema.[1] Let's say I also want to allow a single element <otherStuff> to precede the paragraphs, and allow it to have ANY content. Easy, with this method:

    start = element test { attribute xml:id { xsd:ID }?, otherStuff?, para+ }
    para = element para { attribute xml:id { xsd:ID }?, ( text | name )* }
    name = element name {
       attribute xml:id { xsd:ID }?,
       ( "Horslips" | "Heart" | "Berlin" | "Blondie"
       | "Quarterflash" | "Renaissance" )
    }
    otherStuff = element otherStuff {
       attribute xml:id { xsd:ID }?,
       ( text | anyXMLelement )*
    }
    anyXMLelement =
       # for elements that already have @xml:id defined as ID, define it
       # as ID here, too
       element test | otherStuff | name | para {
          attribute * - xml:id { text }*,
          attribute xml:id { xsd:ID }?,
          ( text | anyXMLelement )*
       }
     |
       # for all other elements, define @xml:id (and all other attrs) as
       # just text
       element * - ( test | otherStuff | name | para ) {
          attribute * { text }*,
          ( text | anyXMLelement )*
       }

This is also a lovely working schema. It has the HUGE advantage that it does not trip over the "Conflicting ID-types" error.
It has the (minor) disadvantage that for some elements inside <otherStuff> (namely, those that are NOT the enumerated <test>, <otherStuff>, <name>, <para>) @xml:id values are not required to be NCNames, and uniqueness of @xml:id values is not enforced. BUT those same problems occur with our current method (and our previous method) of doing this, and even more so, as with those methods it applies to *all* descendant elements.

The main problem, as I see it, is that it is hard to create the schema. That list of names (here "test | otherStuff | name | para") cannot be handled with indirection, because (AFAIK) an <rng:name> cannot be inside an <rng:define>. It has to be built at schema-build time, and then tucked into both places. And while building it you have to remember to include namespaces where necessary (for us that is only for "teix:egXML").[2]

Another monkey wrench is that to build the list you not only have to go through all of the <tei:elementSpec>s in your flattened ODD, but also all the <rng:element>s in any schemas included by a <tei:moduleRef url="[MathML, SVG, whatever]"/>. (And no, I have not thought through whether you want *all* element names, or only those that actually occur (perhaps indirectly via a class) in a content model, or if it matters.)

I don't know how to do any of that off the top of my head, but my thought is that it is probably not too hard, albeit a significant amount of work.

To see what a resulting tei_all.rnc would look like, feel free to take a look at [3], particularly the last dozen lines.

Notes
-----
[1] Working because the prefixes "xml:" and "xsd:" are magically defined.

[2] And, if you're writing in the compact syntax (which we wouldn't be), to put a backslash in front of those names RELAX NG already uses (for us that's \list, \namespace, \default, \text, and \div).

[3] http://paramedic.wwp.northeastern.edu/~syd/temp/TEI_Council/tei_all_new_ANY_...
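
As a rough illustration of the list-building step described above, here is a minimal Python sketch, not a worked-out implementation. It assumes the flattened ODD and any included RELAX NG schemas are already parsed, that a non-TEI element such as egXML carries its namespace in @ns on its <tei:elementSpec>, and that <rng:element> names are unprefixed. The function names (names_from_odd, names_from_rng, name_class) and the tiny inline inputs are hypothetical, made up for the example. It does not try to answer the open question of whether *all* element names are wanted.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
RNG_NS = "http://relaxng.org/ns/structure/1.0"


def names_from_odd(odd_root):
    """Yield (namespace, local name) for each tei:elementSpec in a flattened ODD."""
    for spec in odd_root.iter(f"{{{TEI_NS}}}elementSpec"):
        # Assumption: @ns is present only when the element is not in the TEI namespace.
        yield (spec.get("ns", TEI_NS), spec.get("ident"))


def names_from_rng(node, ns=""):
    """Yield (namespace, local name) for each rng:element, inheriting @ns downward."""
    ns = node.get("ns", ns)
    if node.tag == f"{{{RNG_NS}}}element":
        if node.get("name") is not None:
            # Not handled in this sketch: prefixed QNames in @name.
            yield (ns, node.get("name").strip())
        for child in node.findall(f"{{{RNG_NS}}}name"):
            # Not handled in this sketch: names nested inside rng:choice.
            yield (child.get("ns", ns), (child.text or "").strip())
    for child in node:
        yield from names_from_rng(child, ns)


def name_class(names):
    """Build one rng:choice of rng:name elements, sorted and de-duplicated."""
    choice = ET.Element(f"{{{RNG_NS}}}choice")
    for ns, local in sorted(set(names)):
        name = ET.SubElement(choice, f"{{{RNG_NS}}}name", {"ns": ns})
        name.text = local
    return choice


# Hypothetical miniature inputs standing in for a flattened ODD and an included schema.
odd = ET.fromstring(
    f'<schemaSpec xmlns="{TEI_NS}" ident="demo">'
    '<elementSpec ident="para"/><elementSpec ident="name"/>'
    '<elementSpec ident="egXML" ns="http://www.tei-c.org/ns/Examples"/>'
    '</schemaSpec>')
rng = ET.fromstring(
    f'<grammar xmlns="{RNG_NS}" ns="http://www.w3.org/2000/svg">'
    '<define name="c"><element name="circle"><empty/></element></define>'
    '</grammar>')

names = list(names_from_odd(odd)) + list(names_from_rng(rng))
choice = name_class(names)
ET.register_namespace("rng", RNG_NS)
print(ET.tostring(choice, encoding="unicode"))
```

The resulting <rng:choice> would still have to be copied into both places in anyXMLelement (once as the name class, once inside the <rng:except>), since, as noted above, it cannot be shared through a <rng:define>.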
Syd Bauman