RELAXNGV2

From Wg1-wiki

Design Principles

RELAX NG is not yet the most widely used schema language, but quite a few users are happily using RELAX NG. RELAX NG V2 is intended to make such users even happier.

New features of RELAX NG V2 are restricted to those based on feedback from users of RELAX NG. Features not based on such feedback are not introduced, no matter how interesting they are.

It is very important to keep RELAX NG V2 as a simple extension of RELAX NG so that V2 can be easily implemented.

Mechanism for distinguishing V1 schemas and V2 schemas

Requirements

  1. It should be easy to distinguish V1 schemas and V2 schemas.
  2. Both the XML syntax and the compact syntax should allow such distinction.
  3. V2 schemas may reference V1 schemas.
  4. V1 schemas do not reference V2 schemas.
  5. When V1 validators encounter V2 schemas, they stop normal processing.
  6. When V2 validators encounter V1 schemas, they continue normal processing.
  7. If a V2 schema does not have any features specific to V2, it can be easily converted to a V1 schema without changing the semantics.


The third and sixth requirements mean that V2 validators are required to handle a mixture of V1 patterns and V2 patterns. As an extreme case, one definition in a V1 schema and another in a V2 schema may be combined by |=.

XML Syntax

V2 schemas use a different namespace name, namely http://relaxng.org/ns/structure/2.0

Compact syntax

V2 schemas in RNC shall begin with a line $RELAX NG Version 2.

I am inclined not to change the EBNF but to revise the "lexical structure" clause instead. Since this clause already allows comments, which do not occur in the EBNF, I do not think we have to revise the EBNF.
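As an illustration only, the two detection rules above could be sketched in Python as follows. The function names are hypothetical, the check looks at the root element alone (it does not handle V1 patterns nested inside a V2 schema), and comparing the first line after trimming whitespace is an assumption, since the exact lexical rule is still to be written. The V1 namespace name is the one used by the current edition.

```python
import xml.etree.ElementTree as ET

V1_NS = "http://relaxng.org/ns/structure/1.0"  # namespace of the current (V1) XML syntax
V2_NS = "http://relaxng.org/ns/structure/2.0"  # namespace proposed above for V2
V2_MARKER = "$RELAX NG Version 2"              # first line proposed above for V2 compact schemas


def xml_schema_version(text):
    """Return 1 or 2 based on the namespace of the root element (hypothetical helper)."""
    root = ET.fromstring(text)
    # ElementTree spells a qualified tag as "{namespace}local-name".
    if root.tag.startswith("{" + V2_NS + "}"):
        return 2
    if root.tag.startswith("{" + V1_NS + "}"):
        return 1
    raise ValueError("root element is not in a RELAX NG namespace")


def compact_schema_version(text):
    """Return 2 if the compact schema begins with the version line, else 1."""
    lines = text.splitlines()
    first_line = lines[0] if lines else ""
    # Assumption: surrounding whitespace on the first line is not significant.
    return 2 if first_line.strip() == V2_MARKER else 1
```

A V1 validator built this way would stop as soon as either function returns 2 (requirement 5), while a V2 validator would accept both return values (requirement 6).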

Structure of the new edition

The next edition specifies both V1 and V2. The structure of the next edition is basically the same as the current edition, but some changes are certainly needed.

  • Introduce "Version 1" and "Version 2" in the "Terms and definitions" clause (Clause 3).
  • Revise the "Full syntax" clause (Clause 6) for introducing a new namespace name for Version 2.
  • Introduce $RELAX NG Version 2 in the clause "lexical structure" (Clause C.3).
  • Revise the definitions of include and externalRef to disallow V1 schemas from referencing external V2 schemas.
  • Probably introduce a non-normative clause about the relationship between Version 1 and Version 2.
  • Revise the "Conformance" clauses (Clause 11 and subclause C.6) and make clear that V1 implementations are not required to handle V2 schemas.

FutureRequirements from the OASIS RELAX NG wiki

Preventing a schema from being included more than once

Murata: Yes. Quite a few users reported this problem. This is strongly needed.

When multiple include statements reference the same schema, the first one is used and the others are ignored. To be precise, this applies only to include statements that do not have RELAX NG start or define child elements. Moreover, when two include statements have different current grammar elements, the schema is included twice. (The definition of "current" is tricky.)


There was a proposal to use SHA1 or other hash functions rather than input guards.
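The rule above can be sketched roughly as follows, in Python. This is not a proposed algorithm: the class name, the grammar_id argument (a stand-in for whatever "current grammar element" ends up meaning) and the choice not to record includes that carry overrides are all assumptions made for illustration. The use_digest flag switches from comparing resolved URIs to comparing SHA-1 digests of the included content, in the spirit of the hash-based proposal.

```python
import hashlib


class IncludeGuard:
    """Remembers which schemas each grammar has already included (hypothetical)."""

    def __init__(self, use_digest=False):
        self.use_digest = use_digest
        self.seen = set()

    def should_include(self, grammar_id, href, content, has_overrides):
        """Return True if this include must be processed, False if it is a repeat.

        grammar_id    -- hypothetical identifier of the current grammar element
        href          -- resolved URI of the included schema
        content       -- bytes of the included schema
        has_overrides -- True if the include has start or define children
        """
        if has_overrides:
            # The rule only covers includes without start/define children,
            # so an overriding include is always processed and not recorded.
            return True
        identity = hashlib.sha1(content).hexdigest() if self.use_digest else href
        key = (grammar_id, identity)
        if key in self.seen:
            return False  # same schema, same current grammar: ignore the repeat
        self.seen.add(key)
        return True
```

With use_digest=True, two copies of the same schema stored under different URIs are treated as one, which a URI-based guard cannot detect.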

Name classes should be patterns

Murata: Probably, yes. Proposals of the syntax (both the XML syntax and compact syntax) are available at http://www.mail-archive.com/rng-users@yahoogroups.com/msg00755.html.

Add support for xml:id

Murata: No. Maybe in Part 6.

Revise ID/IDREF pattern constraints

Murata: No. Maybe in Part 6. But Rick suggested a shortcut for single-part keys and key references embedded in RELAX NG schemas. See http://lists.dsdl.org/dsdl-discuss/2009-09/0005.html.

Align with WXS datatypes 1.1

Murata: No. Note that even WXS datatypes 1.0 is outside the scope of RNG.

Pattern names should be QNames

Murata: No, since prefixes of pattern names are good enough as a workaround.

Avoiding name collisions between define statements -- actually this is a more general form of the previous item

Murata: See "Pattern names should be QNames" and "Preventing a schema from being included more than once".

Lifting restrictions on interleave -- list//interleave (10.2.4) should be allowed, and "10.5 Restrictions on interleave" should be removed.

Murata: No. Not a showstopper. Lifting this restriction may have impacts on implementations.

CREPDL -- use of CREPDL for text and mixed should be allowed.

Murata: Probably, yes. Should we commit to CREPDL or allow other mechanisms as well? (Probably yes.) Should we allow CREPDL schemas as part of RELAX NG schemas, or should we reference external CREPDL schemas?

The ideas James wanted to consider for 2.0

See RELAX NG 2.0 ideas

1. Repeat M-N times (issue 11); I think this is particularly useful with regexes, so if we add regexes along the lines that John C proposes, then I think the balance tips towards these.

Murata: User requirements certainly exist. However, implementation will certainly become more difficult. What happens when users specify minOccurs="100000" and maxOccurs="100000"? It appears that Xerces-J uses minOccurs="256" and maxOccurs="unbounded" when a number greater than 256 is specified. Should we do the same thing?
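To make the implementation concern concrete, here is a rough Python sketch that rewrites a bounded repetition into existing V1 compact-syntax operators. The function name is hypothetical, the pattern argument is assumed to be a simple name or an already parenthesized pattern, and nothing here is a proposed V2 semantics. The point is only that a naive expansion contains one copy of the pattern per unit of the bound.

```python
def expand_repeat(pattern, min_occurs, max_occurs=None):
    """Rewrite pattern{min,max} using only V1 compact-syntax operators.

    max_occurs=None stands for "unbounded". Hypothetical helper.
    """
    if min_occurs < 0 or (max_occurs is not None and max_occurs < min_occurs):
        raise ValueError("need 0 <= min_occurs <= max_occurs")
    # Mandatory part: min_occurs copies in sequence.
    parts = [pattern] * min_occurs
    if max_occurs is None:
        if parts:
            parts[-1] = pattern + "+"    # last mandatory copy absorbs the rest
        else:
            parts.append(pattern + "*")
    else:
        # Optional part: nested optional groups, built from the inside out.
        tail = ""
        for _ in range(max_occurs - min_occurs):
            tail = "(" + pattern + (", " + tail if tail else "") + ")?"
        if tail:
            parts.append(tail)
    return ", ".join(parts) if parts else "empty"
```

For example, expand_repeat("item", 2, 4) yields "item, item, (item, (item)?)?", and a bound of 100000 yields 100000 copies of the pattern, which is why a cap or a counter-based implementation would be needed.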

Murata: In the transitional schema of OOXML, the values of minOccurs and maxOccurs are distributed as follows (number of occurrences, then value):

minOccurs

    368	unbounded
   2285	0
   2331	1
     32	2
     17	3
      4	4
      1	6
      3	9
      1	32
      1	64
      1	256

maxOccurs

    450	unbounded
   1732	1
     14	2
      9	3
      4	4
      1	6
      3	9
      1	32
      1	45
      1	64
      1	256

2. Richer delimiters for <list> (issue 34); maybe relevant for DSDL datatypes

Murata: No. Since DSDL part 5 provides this feature, I do not see a strong reason to do this extension.

3. A way to reference a pattern overriding a particular attribute as XML Schema allows with xs:restriction (issue 52).

Murata: No. I am not aware of any real requirements. Since derivation by restriction in XSD is considered one of the most dangerous features, I think we can safely ignore it.

4. Better ways to do references between separate grammars (roughly issue 55, http://www.oasis-open.org/committees/relax-ng/issues-20011011.html#externalRef); something more like xs:import.

Murata: No. I am not aware of any real requirements.

5. Mechanism to constrain mixed content, at least the character repertoire (issue 41).

Murata: See CREPDL in "FutureRequirements from the OASIS RELAX NG wiki"

6. concur (issue 15)

Murata: Rick would like to introduce this. TREX had this mechanism but James proposed to drop it (his observations are available at [1])

Murata: Because of the implementation difficulty (alternating tree automata!), I do not want to allow CONCUR without any restrictions. But it might make sense to allow CONCUR for combining multiple <data> patterns so that we can easily mimic derivation of simple types by restriction.
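The restricted form is easy to picture: a value matches the combination only if it matches every <data> pattern. The Python sketch below models each <data> pattern as a predicate on strings. The data and concur helpers and the three facets are hypothetical, and Python's re module is only a stand-in for the XSD regular expression syntax.

```python
import re


def data(pattern=None, min_length=None, max_length=None):
    """Build a predicate standing in for one <data> pattern with facets (hypothetical)."""
    def accepts(value):
        if pattern is not None and re.fullmatch(pattern, value) is None:
            return False
        if min_length is not None and len(value) < min_length:
            return False
        if max_length is not None and len(value) > max_length:
            return False
        return True
    return accepts


def concur(*patterns):
    """A value matches concur(p1, ..., pn) only if it matches every pi."""
    def accepts(value):
        return all(p(value) for p in patterns)
    return accepts


# A "base type" and a restriction of it obtained by adding one more constraint.
upper = data(pattern=r"[A-Z]+")
short_upper = concur(upper, data(max_length=3))
```

Because each operand is checked independently against the same string, this restricted form needs no tree automata at all, unlike CONCUR over element patterns.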

7. Some way to do exclusions (issue 16)

Murata: Since we have Schematron, I do not think that this is a must.

8. Regexes (along the lines suggested by John C)

Murata: No. I think that it is too late to sell another syntax for regexps. Moreover, the syntax in XSD Part 2 is quite readable and widely used.

9. A way to define/ref name classes

Murata: See "Name classes should be patterns" in "FutureRequirements from the OASIS RELAX NG wiki".

10. A better restriction on <interleave> (issue 40); the current one forbids some useful things

Murata: No. See "Lifting restrictions on interleave" in "FutureRequirements from the OASIS RELAX NG wiki".

11. A less free-wheeling approach to annotations

Murata: I do not understand this issue.

12. Type assignment: a way for the schema to say that it wants elements/attributes in the instance to be unambiguously associated with element/attribute patterns in the schema, particularly in conjunction with 11

Murata: Schema-aware use of XSLT 2.0 and XQuery is not common yet. However, if a profile of RELAX NG allows simple type assignment, data binding might become easier. Anyway, this is not V2 but a profile.

13. Parameterized patterns (issue 8)

Murata: No. This is complicated and I am not aware of any user requirements.
