Lines Matching +full:use +full:- +full:external +full:- +full:names

1 <?xml version='1.0' encoding='ISO-8859-1' standalone='no'?>
4 <!-- LAST TOUCHED BY: Tim Bray, 8 February 1997 -->
6 <!-- The words 'FINAL EDIT' in comments mark places where changes
8 publication. -->
13 <!ENTITY w3c.doc.date "02-Feb-1998">
27 <!ENTITY mdash "--"> <!-- &#x2014, but nsgmls doesn't grok hex -->
28 <!ENTITY com "--">
29 <!ENTITY como "--">
30 <!ENTITY comc "--">
32 <!-- <!ENTITY nbsp "&#160;"> -->
40 <!-- audience and distribution status: for use at publication time -->
47 <!-- for Panorama *-->
54 <w3c-designation>REC-xml-&iso6.doc.date;</w3c-designation>
55 <w3c-doctype>W3C Recommendation</w3c-doctype>
59 <loc href="http://www.w3.org/TR/1998/REC-xml-&iso6.doc.date;">
60 http://www.w3.org/TR/1998/REC-xml-&iso6.doc.date;</loc>
61 <loc href="http://www.w3.org/TR/1998/REC-xml-&iso6.doc.date;.xml">
62 http://www.w3.org/TR/1998/REC-xml-&iso6.doc.date;.xml</loc>
63 <loc href="http://www.w3.org/TR/1998/REC-xml-&iso6.doc.date;.html">
64 http://www.w3.org/TR/1998/REC-xml-&iso6.doc.date;.html</loc>
65 <loc href="http://www.w3.org/TR/1998/REC-xml-&iso6.doc.date;.pdf">
66 http://www.w3.org/TR/1998/REC-xml-&iso6.doc.date;.pdf</loc>
67 <loc href="http://www.w3.org/TR/1998/REC-xml-&iso6.doc.date;.ps">
68 http://www.w3.org/TR/1998/REC-xml-&iso6.doc.date;.ps</loc>
71 <loc href="http://www.w3.org/TR/REC-xml">
72 http://www.w3.org/TR/REC-xml</loc>
75 <loc href="http://www.w3.org/TR/PR-xml-971208">
76 http://www.w3.org/TR/PR-xml-971208</loc>
77 <!--
78 <loc href='http://www.w3.org/TR/WD-xml-961114'>
79 http://www.w3.org/TR/WD-xml-961114</loc>
80 <loc href='http://www.w3.org/TR/WD-xml-lang-970331'>
81 http://www.w3.org/TR/WD-xml-lang-970331</loc>
82 <loc href='http://www.w3.org/TR/WD-xml-lang-970630'>
83 http://www.w3.org/TR/WD-xml-lang-970630</loc>
84 <loc href='http://www.w3.org/TR/WD-xml-970807'>
85 http://www.w3.org/TR/WD-xml-970807</loc>
86 <loc href='http://www.w3.org/TR/WD-xml-971117'>
87 http://www.w3.org/TR/WD-xml-971117</loc>-->
97 <author><name>C. M. Sperberg-McQueen</name>
123 corrected) for use on the World Wide Web. It is a product of the W3C
130 ref="Berners-Lee"/>, a work in progress expected to update <bibref
135 <loc href='http://www.w3.org/XML/xml-19980210-errata'>http://www.w3.org/XML/xml-19980210-errata</lo…
137 <loc href='mailto:xml-editor@w3.org'>xml-editor@w3.org</loc>.
144 World-Wide Web Consortium, XML Working Group, 1996, 1997.</p>
151 <language id='ebnf'>Extended Backus-Naur Form (formal grammar)</language>
155 <sitem>1997-12-03 : CMSMcQ : yet further changes</sitem>
156 <sitem>1997-12-02 : TB : further changes (see TB to XML WG,
158 <sitem>1997-12-02 : CMSMcQ : deal with as many corrections and
160 entify hard-coded document date in pubdate element,
163 about reference to Berners-Lee et al.),
167 re-order back matter so normative appendices come first,
168 re-tag back matter so informative appendices are tagged informdiv1,
174 add reference to 'Fielding draft' (Berners-Lee et al.),
176 drop URIchar non-terminal and use SkipLit instead,
193 <sitem>1997-12-01 : JB : add some column-width parameters</sitem>
194 <sitem>1997-12-01 : CMSMcQ : begin round of changes to incorporate
202 change grammar's handling of internal subset (drop non-terminal markupdecls),
204 add integral-declaration constraint on internal subset,
209 add description of how to generate our name-space rules from
212 <sitem>1997-10-08 : TB : Removed %-constructs again, new rules
214 <sitem>1997-10-01 : TB : Case-sensitive markup; cleaned up
215 element-type defs, lotsa little edits for style</sitem>
216 <sitem>1997-09-25 : TB : Change to elm's new DTD, with
217 substantial detail cleanup as a side-effect</sitem>
218 <sitem>1997-07-24 : CMSMcQ : correct error (lost *) in definition
220 <sitem>Allow all empty elements to have end-tags, consistent with
222 <sitem>1997-07-23 : CMSMcQ : pre-emptive strike on pending corrections:
223 introduce the term 'empty-element tag', note that all empty elements
224 may use it, and elements declared EMPTY must use it.
232 <sitem>1997-06-30 : CMSMcQ : change date, some cosmetic changes,
239 <sitem>1997-06-29 : TB : various edits</sitem>
240 <sitem>1997-06-29 : CMSMcQ : further changes:
246 <sitem>1997-06-28 : CMSMcQ : Various changes for 1 July draft:
257 <sitem>1997-04-02 : CMSMcQ : final corrections of editorial errors
259 well-formed: Webster's Second hyphenates it, and that's enough
261 <sitem>1997-04-01 : CMSMcQ : corrections from JJC, EM, HT, and self</sitem>
262 <sitem>1997-03-31 : Tim Bray : many changes</sitem>
263 <sitem>1997-03-29 : CMSMcQ : some Henry Thompson (on entity handling),
268 <sitem>1997-03-28 : CMSMcQ : make as many corrections as possible, from
275 but 8879 uses that name for both internal and external entities.)</sitem>
276 <sitem>1997-03-26 : CMSMcQ : resynch the two forks of this draft, reapply
277 my changes dated 03-20 and 03-21. Normalize old 'may not' to 'must not'
279 <sitem>1997-03-21 : TB : massive changes on plane flight from Chicago
281 <sitem>1997-03-21 : CMSMcQ : correct as many reported errors as possible.
283 <sitem>1997-03-20 : CMSMcQ : correct typos listed in CMSMcQ hand copy of spec.</sitem>
284 <sitem>1997-03-20 : CMSMcQ : cosmetic changes preparatory to revision for
289 <sitem>1996-11-12 : CMSMcQ : revise using Tim's edits:
292 Suppress QuotedNames, Names (not used).
293 Correct trivial-grammar doc type decl.
297 Charref should use just [0-9] not Digit.
301 Clarify discussion of encoding names.
306 Reserve entity names of the form u-NNNN.
312 <sitem>1996-11-11 : CMSMcQ : revise for style.
314 <sitem>1996-11-10 : CMSMcQ : revise for style.
315 Fix / complete section on names, characters.
319 <sitem>1996-10-31 : TB : Add Entity Handling section</sitem>
320 <sitem>1996-10-30 : TB : Clean up term &amp; termdef. Slip in
322 <sitem>1996-10-28 : TB : Change DTD. Implement some of Michael's
324 XML namespace reservation. Add section on white-space handling.
326 <sitem>1996-10-24 : CMSMcQ : quick tweaks, implement some ERB
330 in marked sections. Call them attribute-value pairs not
331 name-value pairs, except once. Internal subset is optional, needs
334 <sitem>1996-10-16 : TB : track down &amp; excise all DSD references;
336 <sitem>1996-10-?? : TB : consistency check, fix up scraps so
338 <sitem>1996-10-10/11 : CMSMcQ : various maintenance, stylistic, and
350 section on partial-DTD summary PIs to end of Logical Structures
352 Revise DSD syntax section to use Tim's subset-in-a-PI
354 <sitem>1996-10-10 : TB : eliminate name recognizers (and more?)</sitem>
355 <sitem>1996-10-09 : CMSMcQ : revise for style, consistency through 2.3
357 <sitem>1996-10-09 : CMSMcQ : re-unite everything for convenience,
359 <sitem>1996-10-08 : TB : first major homogenization pass</sitem>
360 <sitem>1996-10-08 : TB : turn "current" attribute on div type into
362 <sitem>1996-10-02 : TB : remould into skeleton + entities</sitem>
363 <sitem>1996-09-30 : CMSMcQ : add a few more sections prior to exchange
365 <sitem>1996-09-20 : CMSMcQ : finish transcribing notes.</sitem>
366 <sitem>1996-09-19 : CMSMcQ : begin transcribing notes for draft.</sitem>
367 <sitem>1996-09-13 : CMSMcQ : made outline from notes of 09-06,
373 <div1 id='sec-intro'>
376 data objects called <termref def="dt-xml-doc">XML documents</termref> and
385 def="dt-entity">entities</termref>, which contain either parsed
387 Parsed data is made up of <termref def="dt-character">characters</termref>,
389 of which form <termref def="dt-chardata">character data</termref>,
390 and some of which form <termref def="dt-markup">markup</termref>.
394 <p><termdef id="dt-xml-proc" term="XML Processor">A software module
397 id="dt-app" term="Application">It is assumed that an XML processor is
403 <div2 id='sec-origin-goals'>
423 <item><p>XML documents should be human-legible and reasonably
440 <!-- is for &doc.audience;.-->
448 <div2 id='sec-terminology'>
458 <def><p><termdef id="dt-may" term="May">Conforming documents and XML
466 <!-- do NOT change this! this is what defines a violation of
467 a 'must' clause as 'an error'. -MSM -->
472 <def><p><termdef id='dt-error' term='Error'
480 <def><p><termdef id="dt-fatal" term="Fatal Error">An error
481 which a conforming <termref def="dt-xml-proc">XML processor</termref>
505 <termref def="dt-valid">valid</termref> XML documents.
508 <termref def="dt-validating">validating XML processors</termref>.</p></def>
511 <label>well-formedness constraint</label>
513 def="dt-wellformed">well-formed</termref> XML documents.
514 Violations of well-formedness constraints are
515 <termref def="dt-fatal">fatal errors</termref>.</p></def>
520 <def><p><termdef id="dt-match" term="match">(Of strings or names:)
521 Two strings or names being compared must be identical.
541 <def><p><termdef id="dt-compat" term="For Compatibility">A feature of
547 <def><p><termdef id="dt-interop" term="For interoperability">A
548 non-binding recommendation included to increase the chances that XML
559 <!-- &Docs; -->
561 <div1 id='sec-documents'>
564 <p><termdef id="dt-xml-doc" term="XML Document">
567 <termref def="dt-wellformed">well-formed</termref>, as
569 A well-formed XML document may in addition be
570 <termref def="dt-valid">valid</termref> if it meets certain further
575 def="dt-entity">entities</termref>. An entity may <termref
576 def="dt-entref">refer</termref> to other entities to cause their
578 def="dt-docent">document entity</termref>.
586 in <specref ref='wf-entities'/>.
589 <div2 id='sec-well-formed'>
590 <head>Well-Formed XML Documents</head>
592 <p><termdef id="dt-wellformed" term="Well-Formed">
594 a well-formed XML document if:</termdef>
597 matches the production labeled <nt def='NT-document'>document</nt>.</p></item>
599 meets all the well-formedness constraints given in this specification.</p>
601 <item><p>Each of the <termref def='dt-parsedent'>parsed entities</termref>
603 <titleref href='wf-entities'>well-formed</titleref>.</p></item>
608 <prod id='NT-document'><lhs>document</lhs>
609 <rhs><nt def='NT-prolog'>prolog</nt>
610 <nt def='NT-element'>element</nt>
611 <nt def='NT-Misc'>Misc</nt>*</rhs></prod>
614 <p>Matching the <nt def="NT-document">document</nt> production
618 <termref def="dt-element">elements</termref>.</p>
620 <!--* N.B. some readers (notably JC) find the following
626 could however use some recasting when the editors are feeling
627 stronger. -MSM *-->
628 <item><p><termdef id="dt-root" term="Root Element">There is exactly
631 def="dt-content">content</termref> of any other element.</termdef>
632 For all other elements, if the start-tag is in the content of another
633 element, the end-tag is in the content of the same element. More
634 simply stated, the elements, delimited by start- and end-tags, nest
639 <p><termdef id="dt-parentchild" term="Parent/Child">As a consequence
641 for each non-root element
654 <p><termdef id="dt-text" term="Text">A parsed entity contains
656 <termref def="dt-character">characters</termref>,
658 <termdef id="dt-character" term="Character">A <term>character</term>
663 The use of "compatibility characters", as defined in section 6.8
669 <prod id="NT-Char"><lhs>Char</lhs>
670 <rhs>#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD]
671 | [#x10000-#x10FFFF]</rhs>
679 vary from entity to entity. All XML processors must accept the UTF-8
680 and UTF-16 encodings of 10646; the mechanisms for signaling which of
681 the two is in use, or for bringing other encodings into play, are
684 <!--
688 UCS-4 code value.
689 </p>-->
692 <div2 id='sec-common-syn'>
696 <p><nt def="NT-S">S</nt> (white space) consists of one or more space (#x20)
702 <prod id='NT-S'><lhs>S</lhs>
714 <p><termdef id="dt-name" term="Name">A <term>Name</term> is a token
718 Names beginning with the string "<code>xml</code>", or any string
724 <p>The colon character within XML names is reserved for experimentation with
729 (There is no guarantee that any name-space mechanism
730 adopted for XML will in fact use the colon as a name-space delimiter.)
731 In practice, this means that authors should not use the colon in XML
732 names except as part of name-space experiments, but that XML processors
736 <nt def='NT-Nmtoken'>Nmtoken</nt> (name token) is any mixture of
739 <head>Names and Tokens</head>
740 <prod id='NT-NameChar'><lhs>NameChar</lhs>
741 <rhs><nt def="NT-Letter">Letter</nt>
742 | <nt def='NT-Digit'>Digit</nt>
743 | '.' | '-' | '_' | ':'
744 | <nt def='NT-CombiningChar'>CombiningChar</nt>
745 | <nt def='NT-Extender'>Extender</nt></rhs>
747 <prod id='NT-Name'><lhs>Name</lhs>
748 <rhs>(<nt def='NT-Letter'>Letter</nt> | '_' | ':')
749 (<nt def='NT-NameChar'>NameChar</nt>)*</rhs></prod>
750 <prod id='NT-Names'><lhs>Names</lhs>
751 <rhs><nt def='NT-Name'>Name</nt>
752 (<nt def='NT-S'>S</nt> <nt def='NT-Name'>Name</nt>)*</rhs></prod>
753 <prod id='NT-Nmtoken'><lhs>Nmtoken</lhs>
754 <rhs>(<nt def='NT-NameChar'>NameChar</nt>)+</rhs></prod>
755 <prod id='NT-Nmtokens'><lhs>Nmtokens</lhs>
756 <rhs><nt def='NT-Nmtoken'>Nmtoken</nt> (<nt def='NT-S'>S</nt> <nt def='NT-Nmtoken'>Nmtoken</nt>)*</…
763 (<nt def='NT-EntityValue'>EntityValue</nt>),
764 the values of attributes (<nt def='NT-AttValue'>AttValue</nt>),
765 and external identifiers
766 (<nt def="NT-SystemLiteral">SystemLiteral</nt>).
767 Note that a <nt def='NT-SystemLiteral'>SystemLiteral</nt>
771 <prod id='NT-EntityValue'><lhs>EntityValue</lhs>
774 | <nt def='NT-PEReference'>PEReference</nt>
775 | <nt def='NT-Reference'>Reference</nt>)*
781 | <nt def='NT-PEReference'>PEReference</nt>
782 | <nt def='NT-Reference'>Reference</nt>)*
785 <prod id='NT-AttValue'><lhs>AttValue</lhs>
788 | <nt def='NT-Reference'>Reference</nt>)*
794 | <nt def='NT-Reference'>Reference</nt>)*
797 <prod id="NT-SystemLiteral"><lhs>SystemLiteral</lhs>
801 <prod id="NT-PubidLiteral"><lhs>PubidLiteral</lhs>
802 <rhs>'"' <nt def='NT-PubidChar'>PubidChar</nt>*
804 | "'" (<nt def='NT-PubidChar'>PubidChar</nt> - "'")* "'"</rhs>
806 <prod id="NT-PubidChar"><lhs>PubidChar</lhs>
808 |&nbsp;[a-zA-Z0-9]
809 |&nbsp;[-'()+,./:=?;!*#@$_%]</rhs>
819 <p><termref def='dt-text'>Text</termref> consists of intermingled
820 <termref def="dt-chardata">character
822 <termdef id="dt-markup" term="Markup"><term>Markup</term> takes the form of
823 <termref def="dt-stag">start-tags</termref>,
824 <termref def="dt-etag">end-tags</termref>,
825 <termref def="dt-empty">empty-element tags</termref>,
826 <termref def="dt-entref">entity references</termref>,
827 <termref def="dt-charref">character references</termref>,
828 <termref def="dt-comment">comments</termref>,
829 <termref def="dt-cdsection">CDATA section</termref> delimiters,
830 <termref def="dt-doctype">document type declarations</termref>, and
831 <termref def="dt-pi">processing instructions</termref>.
834 <p><termdef id="dt-chardata" term="Character Data">All text that is not markup
839 delimiters, or within a <termref def="dt-comment">comment</termref>, a
840 <termref def="dt-pi">processing instruction</termref>,
841 or a <termref def="dt-cdsection">CDATA section</termref>.
843 They are also legal within the <termref def='dt-litentval'>literal entity
845 <specref ref='wf-entities'/>.
846 <!-- FINAL EDIT: restore internal entity decl or leave it out. -->
848 they must be <termref def="dt-escape">escaped</termref>
849 using either <termref def='dt-charref'>numeric character references</termref>
854 "<code>&amp;gt;</code>", and must, <termref def='dt-compat'>for
862 a <termref def="dt-cdsection">CDATA section</termref>.
867 not contain the start-delimiter of any markup.
869 is any string of characters not including the CDATA-section-close
873 apostrophe or single-quote character (') may be represented as
874 "<code>&amp;apos;</code>", and the double-quote character (") as
878 <prod id='NT-CharData'>
880 <rhs>[^&lt;&amp;]* - ([^&lt;&amp;]* ']]&gt;' [^&lt;&amp;]*)</rhs>
886 <div2 id='sec-comments'>
889 <p><termdef id="dt-comment" term="Comment"><term>Comments</term> may
891 <termref def='dt-markup'>markup</termref>; in addition,
894 They are not part of the document's <termref def="dt-chardata">character
898 <termref def="dt-compat">For compatibility</termref>, the string
899 "<code>--</code>" (double-hyphen) must not occur within
903 <prod id='NT-Comment'><lhs>Comment</lhs>
904 <rhs>'&lt;!--'
905 ((<nt def='NT-Char'>Char</nt> - '-')
906 | ('-' (<nt def='NT-Char'>Char</nt> - '-')))*
907 '-->'</rhs>
916 <div2 id='sec-pi'>
919 <p><termdef id="dt-pi" term="Processing instruction"><term>Processing
925 <prod id='NT-PI'><lhs>PI</lhs>
926 <rhs>'&lt;?' <nt def='NT-PITarget'>PITarget</nt>
927 (<nt def='NT-S'>S</nt>
928 (<nt def='NT-Char'>Char</nt>* -
929 (<nt def='NT-Char'>Char</nt>* &pic; <nt def='NT-Char'>Char</nt>*)))?
931 <prod id='NT-PITarget'><lhs>PITarget</lhs>
932 <rhs><nt def='NT-Name'>Name</nt> -
936 PIs are not part of the document's <termref def="dt-chardata">character
938 PI begins with a target (<nt def='NT-PITarget'>PITarget</nt>) used
940 The target names "<code>XML</code>", "<code>xml</code>", and so on are
944 XML <termref def='dt-notation'>Notation</termref> mechanism
950 <div2 id='sec-cdata-sect'>
953 <p><termdef id="dt-cdsection" term="CDATA Section"><term>CDATA sections</term>
962 <prod id='NT-CDSect'><lhs>CDSect</lhs>
963 <rhs><nt def='NT-CDStart'>CDStart</nt>
964 <nt def='NT-CData'>CData</nt>
965 <nt def='NT-CDEnd'>CDEnd</nt></rhs></prod>
966 <prod id='NT-CDStart'><lhs>CDStart</lhs>
969 <prod id='NT-CData'><lhs>CData</lhs>
970 <rhs>(<nt def='NT-Char'>Char</nt>* -
971 (<nt def='NT-Char'>Char</nt>* ']]&gt;' <nt def='NT-Char'>Char</nt>*))
974 <prod id='NT-CDEnd'><lhs>CDEnd</lhs>
979 Within a CDATA section, only the <nt def='NT-CDEnd'>CDEnd</nt> string is
988 are recognized as <termref def='dt-chardata'>character data</termref>, not
989 <termref def='dt-markup'>markup</termref>:
994 <div2 id='sec-prolog-dtd'>
997 <p><termdef id='dt-xmldecl' term='XML Declaration'>XML documents
1003 def="dt-wellformed">well-formed</termref> but not
1004 <termref def="dt-valid">valid</termref>:
1015 for a document to use the value "<code>1.0</code>"
1022 use any particular numbering scheme.
1030 storage and logical structure and to associate attribute-value pairs
1032 def="dt-doctype">document type declaration</termref>, to define
1033 constraints on the logical structure and to support the use of
1036 <termdef id="dt-valid" term="Validity">An XML document is
1041 the first <termref def="dt-element">element</termref> in the document.
1045 <prod id='NT-prolog'><lhs>prolog</lhs>
1046 <rhs><nt def='NT-XMLDecl'>XMLDecl</nt>?
1047 <nt def='NT-Misc'>Misc</nt>*
1048 (<nt def='NT-doctypedecl'>doctypedecl</nt>
1049 <nt def='NT-Misc'>Misc</nt>*)?</rhs></prod>
1050 <prod id='NT-XMLDecl'><lhs>XMLDecl</lhs>
1052 <nt def='NT-VersionInfo'>VersionInfo</nt>
1053 <nt def='NT-EncodingDecl'>EncodingDecl</nt>?
1054 <nt def='NT-SDDecl'>SDDecl</nt>?
1055 <nt def="NT-S">S</nt>?
1058 <prod id='NT-VersionInfo'><lhs>VersionInfo</lhs>
1059 <rhs><nt def="NT-S">S</nt> 'version' <nt def='NT-Eq'>Eq</nt>
1060 (' <nt def="NT-VersionNum">VersionNum</nt> '
1061 | " <nt def="NT-VersionNum">VersionNum</nt> ")</rhs>
1063 <prod id='NT-Eq'><lhs>Eq</lhs>
1064 <rhs><nt def='NT-S'>S</nt>? '=' <nt def='NT-S'>S</nt>?</rhs></prod>
1065 <prod id="NT-VersionNum">
1067 <rhs>([a-zA-Z0-9_.:] | '-')+</rhs>
1069 <prod id='NT-Misc'><lhs>Misc</lhs>
1070 <rhs><nt def='NT-Comment'>Comment</nt> | <nt def='NT-PI'>PI</nt> |
1071 <nt def='NT-S'>S</nt></rhs></prod>
1075 <p><termdef id="dt-doctype" term="Document Type Declaration">The XML
1078 <termref def='dt-markupdecl'>markup declarations</termref>
1083 The document type declaration can point to an external subset (a
1085 <termref def='dt-extent'>external entity</termref>) containing markup
1092 <p><termdef id="dt-markupdecl" term="markup declaration">
1094 an <termref def="dt-eldecl">element type declaration</termref>,
1095 an <termref def="dt-attdecl">attribute-list declaration</termref>,
1096 an <termref def="dt-entdecl">entity declaration</termref>, or
1097 a <termref def="dt-notdecl">notation declaration</termref>.
1100 within <termref def='dt-PE'>parameter entities</termref>,
1101 as described in the well-formedness and validity constraints below.
1103 <specref ref="sec-physical-struct"/>.</p>
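<p>As a non-normative sketch, the four kinds of markup declaration named
above might look as follows. The names <code>memo</code>, <code>para</code>,
<code>priority</code>, <code>copy</code>, and <code>gif</code>, and the
system identifier <code>viewer.exe</code>, are hypothetical and chosen
only for illustration:</p>
<eg><![CDATA[<!-- element type declarations -->
<!ELEMENT memo (para+)>
<!ELEMENT para (#PCDATA)>
<!-- attribute-list declaration -->
<!ATTLIST memo priority CDATA #IMPLIED>
<!-- entity declaration -->
<!ENTITY copy "(c)">
<!-- notation declaration -->
<!NOTATION gif SYSTEM "viewer.exe">]]></eg>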
1107 <prod id='NT-doctypedecl'><lhs>doctypedecl</lhs>
1108 <rhs>'&lt;!DOCTYPE' <nt def='NT-S'>S</nt>
1109 <nt def='NT-Name'>Name</nt> (<nt def='NT-S'>S</nt>
1110 <nt def='NT-ExternalID'>ExternalID</nt>)?
1111 <nt def='NT-S'>S</nt>? ('['
1112 (<nt def='NT-markupdecl'>markupdecl</nt>
1113 | <nt def='NT-PEReference'>PEReference</nt>
1114 | <nt def='NT-S'>S</nt>)*
1116 <nt def='NT-S'>S</nt>?)? '>'</rhs>
1117 <vc def="vc-roottype"/>
1119 <prod id='NT-markupdecl'><lhs>markupdecl</lhs>
1120 <rhs><nt def='NT-elementdecl'>elementdecl</nt>
1121 | <nt def='NT-AttlistDecl'>AttlistDecl</nt>
1122 | <nt def='NT-EntityDecl'>EntityDecl</nt>
1123 | <nt def='NT-NotationDecl'>NotationDecl</nt>
1124 | <nt def='NT-PI'>PI</nt>
1125 | <nt def='NT-Comment'>Comment</nt>
1127 <vc def='vc-PEinMarkupDecl'/>
1128 <wfc def="wfc-PEinInternalSubset"/>
1135 the <termref def='dt-repltext'>replacement text</termref> of
1136 <termref def='dt-PE'>parameter entities</termref>.
1138 individual nonterminals (<nt def='NT-elementdecl'>elementdecl</nt>,
1139 <nt def='NT-AttlistDecl'>AttlistDecl</nt>, and so on) describe
1141 <termref def='dt-include'>included</termref>.</p>
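<p>A minimal sketch, assuming a hypothetical parameter entity named
<code>memo.decl</code>: the replacement text of the entity is itself a
complete element type declaration, the reference to it occurs where a
markup declaration could occur, and the name in the document type
declaration is the same as the type of the root element:</p>
<eg><![CDATA[<?xml version="1.0"?>
<!DOCTYPE memo [
  <!-- the replacement text is one whole markup declaration -->
  <!ENTITY % memo.decl "<!ELEMENT memo (#PCDATA)>">
  %memo.decl;
]>
<memo>Lunch at noon.</memo>]]></eg>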
1143 <vcnote id="vc-roottype">
1146 The <nt def='NT-Name'>Name</nt> in the document type declaration must
1147 match the element type of the <termref def='dt-root'>root element</termref>.
1151 <vcnote id='vc-PEinMarkupDecl'>
1153 <p>Parameter-entity
1154 <termref def='dt-repltext'>replacement text</termref> must be properly nested
1158 declaration (<nt def='NT-markupdecl'>markupdecl</nt> above)
1160 <termref def='dt-PERef'>parameter-entity reference</termref>,
1163 <wfcnote id="wfc-PEinInternalSubset">
1166 <termref def='dt-PERef'>parameter-entity references</termref>
1170 external parameter entities or to the external subset.)
1174 Like the internal subset, the external subset and
1175 any external parameter entities referred to in the DTD
1177 allowed by the non-terminal symbol
1178 <nt def="NT-markupdecl">markupdecl</nt>, interspersed with white space
1179 or <termref def="dt-PERef">parameter-entity references</termref>.
1182 external subset or of external parameter entities may conditionally be ignored
1184 the <termref def="dt-cond-section">conditional section</termref>
1187 <scrap id="ext-Subset">
1188 <head>External Subset</head>
1190 <prod id='NT-extSubset'><lhs>extSubset</lhs>
1191 <rhs><nt def='NT-TextDecl'>TextDecl</nt>?
1192 <nt def='NT-extSubsetDecl'>extSubsetDecl</nt></rhs></prod>
1193 <prod id='NT-extSubsetDecl'><lhs>extSubsetDecl</lhs>
1195 <nt def='NT-markupdecl'>markupdecl</nt>
1196 | <nt def='NT-conditionalSect'>conditionalSect</nt>
1197 | <nt def='NT-PEReference'>PEReference</nt>
1198 | <nt def='NT-S'>S</nt>
1203 <p>The external subset and external parameter entities also differ
1205 <termref def="dt-PERef">parameter-entity references</termref>
1213 The <termref def="dt-sysid">system identifier</termref>
1217 <eg><![CDATA[<?xml version="1.0" encoding="UTF-8" ?>
1223 If both the external and internal subsets are used, the
1224 internal subset is considered to occur before the external subset.
1225 <!-- 'is considered to'? boo. whazzat mean? -->
1226 This has the effect that entity and attribute-list declarations in the
1227 internal subset take precedence over those in the external subset.
1231 <div2 id='sec-rmd'>
1234 as passed from an <termref def="dt-xml-proc">XML processor</termref>
1239 whether or not there are such declarations which appear external to
1240 the <termref def='dt-docent'>document entity</termref>.
1244 <prod id='NT-SDDecl'><lhs>SDDecl</lhs>
1246 <nt def="NT-S">S</nt>
1247 'standalone' <nt def='NT-Eq'>Eq</nt>
1250 <vc def='vc-check-rmd'/></prod>
1256 are no markup declarations external to the <termref def='dt-docent'>document
1257 entity</termref> (either in the DTD external subset, or in an
1258 external parameter entity referenced from the internal subset)
1262 external markup declarations.
1264 denotes the presence of external <emph>declarations</emph>; the presence, in a
1266 references to external <emph>entities</emph>, when those entities are
1269 <p>If there are no external markup declarations, the standalone document
1271 If there are external markup declarations but there is no standalone
1276 <vcnote id='vc-check-rmd'>
1279 the value "<code>no</code>" if any external markup declarations
1281 <item><p>attributes with <termref def="dt-default">default</termref> values, if
1286 if <termref def="dt-entref">references</termref> to those
1295 <p>element types with <termref def="dt-elemcontent">element content</termref>,
1305 <div2 id='sec-white-space'>
1308 <p>In editing XML documents, it is often convenient to use "white space"
1310 <nt def='NT-S'>S</nt> in this specification) to
1316 <p>An <termref def='dt-xml-proc'>XML processor</termref>
1318 markup through to the application. A <termref def='dt-validating'>
1321 in <termref def="dt-elemcontent">element content</termref>.
1323 <p>A special <termref def='dt-attr'>attribute</termref>
1328 <termref def="dt-attdecl">declared</termref> if it is used.
1330 <termref def='dt-enumerated'>enumerated type</termref> whose only
1334 default white-space processing modes are acceptable for this element; the
1341 <p>The <termref def='dt-root'>root element</termref> of any document
1348 <div2 id='sec-line-ends'>
1349 <head>End-of-Line Handling</head>
1350 <p>XML <termref def='dt-parsedent'>parsed entities</termref> are often stored in
1353 carriage-return (#xD) and line-feed (#xA).</p>
1354 <p>To simplify the tasks of <termref def='dt-app'>applications</termref>,
1355 wherever an external parsed entity or the literal entity value
1357 two-character sequence "#xD#xA" or a standalone literal
1358 #xD, an <termref def='dt-xml-proc'>XML processor</termref> must
1365 <div2 id='sec-lang-tag'>
1371 A special <termref def="dt-attr">attribute</termref> named
1377 <termref def="dt-attdecl">declared</termref> if it is used.
1382 <prod id='NT-LanguageID'><lhs>LanguageID</lhs>
1383 <rhs><nt def='NT-Langcode'>Langcode</nt>
1384 ('-' <nt def='NT-Subcode'>Subcode</nt>)*</rhs></prod>
1385 <prod id='NT-Langcode'><lhs>Langcode</lhs>
1386 <rhs><nt def='NT-ISO639Code'>ISO639Code</nt> |
1387 <nt def='NT-IanaCode'>IanaCode</nt> |
1388 <nt def='NT-UserCode'>UserCode</nt></rhs>
1390 <prod id='NT-ISO639Code'><lhs>ISO639Code</lhs>
1391 <rhs>([a-z] | [A-Z]) ([a-z] | [A-Z])</rhs></prod>
1392 <prod id='NT-IanaCode'><lhs>IanaCode</lhs>
1393 <rhs>('i' | 'I') '-' ([a-z] | [A-Z])+</rhs></prod>
1394 <prod id='NT-UserCode'><lhs>UserCode</lhs>
1395 <rhs>('x' | 'X') '-' ([a-z] | [A-Z])+</rhs></prod>
1396 <prod id='NT-Subcode'><lhs>Subcode</lhs>
1397 <rhs>([a-z] | [A-Z])+</rhs></prod>
1399 The <nt def='NT-Langcode'>Langcode</nt> may be any of the following:
1401 <item><p>a two-letter language code as defined by
1403 for the representation of names of languages"</p></item>
1406 prefix "<code>i-</code>" (or "<code>I-</code>")</p></item>
1408 between parties in private use; these must begin with the
1409 prefix "<code>x-</code>" or "<code>X-</code>" in order to ensure that they do not conflict
1410 with names later standardized or registered with IANA</p></item>
1412 <p>There may be any number of <nt def='NT-Subcode'>Subcode</nt> segments; if
1417 for the representation of names of countries."
1421 unless the <nt def='NT-Langcode'>Langcode</nt> begins with the prefix
1422 "<code>x-</code>" or
1423 "<code>X-</code>". </p>
1426 Note that these values, unlike other names in XML documents,
1430 <p xml:lang="en-GB">What colour is it?</p>
1431 <p xml:lang="en-US">What color is it?</p>
1438 <!--<p>The xml:lang value is considered to apply both to the contents of an
1441 values of all of its attributes with free-text (CDATA) values. -->
1446 <!--
1460 -->
1474 <!-- &Elements; -->
1476 <div1 id='sec-logical-struct'>
1479 <p><termdef id="dt-element" term="Element">Each <termref
1480 def="dt-xml-doc">XML document</termref> contains one or more
1482 either delimited by <termref def="dt-stag">start-tags</termref>
1483 and <termref def="dt-etag">end-tags</termref>, or, for <termref
1484 def="dt-empty">empty</termref> elements, by an <termref
1485 def="dt-eetag">empty-element tag</termref>. Each element has a type,
1490 def="dt-attrname">name</termref> and a <termref
1491 def="dt-attrval">value</termref>.
1494 <prod id='NT-element'><lhs>element</lhs>
1495 <rhs><nt def='NT-EmptyElemTag'>EmptyElemTag</nt></rhs>
1496 <rhs>| <nt def='NT-STag'>STag</nt> <nt def='NT-content'>content</nt>
1497 <nt def='NT-ETag'>ETag</nt></rhs>
1502 <p>This specification does not constrain the semantics, use, or (beyond
1503 syntax) names of the element types and attributes, except that names
1511 The <nt def='NT-Name'>Name</nt> in an element's end-tag must match
1513 the start-tag.
1521 <nt def='NT-elementdecl'>elementdecl</nt> where the
1522 <nt def='NT-Name'>Name</nt> matches the element type, and
1526 <termref def='dt-content'>content</termref>.</p></item>
1527 <item><p>The declaration matches <nt def='NT-children'>children</nt> and
1529 <termref def="dt-parentchild">child elements</termref>
1532 matching the nonterminal <nt def='NT-S'>S</nt>) between each pair
1534 <item><p>The declaration matches <nt def='NT-Mixed'>Mixed</nt> and
1535 the content consists of <termref def='dt-chardata'>character
1536 data</termref> and <termref def='dt-parentchild'>child elements</termref>
1537 whose types match names in the content model.</p></item>
1539 of any <termref def='dt-parentchild'>child elements</termref> have
1544 <div2 id='sec-starttags'>
1545 <head>Start-Tags, End-Tags, and Empty-Element Tags</head>
1547 <p><termdef id="dt-stag" term="Start-Tag">The beginning of every
1548 non-empty XML element is marked by a <term>start-tag</term>.
1550 <head>Start-tag</head>
1552 <prod id='NT-STag'><lhs>STag</lhs>
1553 <rhs>'&lt;' <nt def='NT-Name'>Name</nt>
1554 (<nt def='NT-S'>S</nt> <nt def='NT-Attribute'>Attribute</nt>)*
1555 <nt def='NT-S'>S</nt>? '>'</rhs>
1558 <prod id='NT-Attribute'><lhs>Attribute</lhs>
1559 <rhs><nt def='NT-Name'>Name</nt> <nt def='NT-Eq'>Eq</nt>
1560 <nt def='NT-AttValue'>AttValue</nt></rhs>
1566 The <nt def='NT-Name'>Name</nt> in
1567 the start- and end-tags gives the
1569 <termdef id="dt-attr" term="Attribute">
1570 The <nt def='NT-Name'>Name</nt>-<nt def='NT-AttValue'>AttValue</nt> pairs are
1573 <termdef id="dt-attrname" term="Attribute Name">with the
1574 <nt def='NT-Name'>Name</nt> in each pair
1576 <termdef id="dt-attrval" term="Attribute Value">the content of the
1577 <nt def='NT-AttValue'>AttValue</nt> (the text between the
1584 No attribute name may appear more than once in the same start-tag
1585 or empty-element tag.
1597 <head>No External Entity References</head>
1600 to external entities.
1605 <p>The <termref def='dt-repltext'>replacement text</termref> of any entity
1610 <p>An example of a start-tag:
1611 <eg>&lt;termdef id="dt-dog" term="dog"></eg></p>
1612 <p><termdef id="dt-etag" term="End Tag">The end of every element
1613 that begins with a start-tag must
1614 be marked by an <term>end-tag</term>
1616 start-tag:
1618 <head>End-tag</head>
1620 <prod id='NT-ETag'><lhs>ETag</lhs>
1621 <rhs>'&lt;/' <nt def='NT-Name'>Name</nt>
1622 <nt def='NT-S'>S</nt>? '>'</rhs></prod>
1626 <p>An example of an end-tag:<eg>&lt;/termdef></eg></p>
1627 <p><termdef id="dt-content" term="Content">The
1628 <termref def='dt-text'>text</termref> between the start-tag and
1629 end-tag is called the element's
1634 <prod id='NT-content'><lhs>content</lhs>
1635 <rhs>(<nt def='NT-element'>element</nt> | <nt def='NT-CharData'>CharData</nt>
1636 | <nt def='NT-Reference'>Reference</nt> | <nt def='NT-CDSect'>CDSect</nt>
1637 | <nt def='NT-PI'>PI</nt> | <nt def='NT-Comment'>Comment</nt>)*</rhs>
1642 <p><termdef id="dt-empty" term="Empty">If an element is <term>empty</term>,
1643 it must be represented either by a start-tag immediately followed
1644 by an end-tag or by an empty-element tag.</termdef>
1645 <termdef id="dt-eetag" term="empty-element tag">An
1646 <term>empty-element tag</term> takes a special form:
1650 <prod id='NT-EmptyElemTag'><lhs>EmptyElemTag</lhs>
1651 <rhs>'&lt;' <nt def='NT-Name'>Name</nt> (<nt def='NT-S'>S</nt>
1652 <nt def='NT-Attribute'>Attribute</nt>)* <nt def='NT-S'>S</nt>?
1659 <p>Empty-element tags may be used for any element which has no
1662 <termref def='dt-interop'>For interoperability</termref>, the empty-element
1664 <termref def='dt-eldecl'>declared</termref> <kw>EMPTY</kw>.</p>
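As an illustration of the two representations of an empty element, the sketch below parses both forms and compares the results. It assumes Python's standard `xml.etree.ElementTree` module; the element name `IMG` and the helper `is_empty` are hypothetical.

```python
import xml.etree.ElementTree as ET

# The same empty element written two ways.
pair = ET.fromstring("<IMG></IMG>")  # start-tag immediately followed by end-tag
short = ET.fromstring("<IMG/>")      # empty-element tag


def is_empty(elem):
    """An element is treated as empty here if it has no children and no text."""
    return len(elem) == 0 and not elem.text
```

Both forms produce the same in-memory element, so an application built on such a parser generally cannot tell which form the author used.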
1675 <p>The <termref def="dt-element">element</termref> structure of an
1676 <termref def="dt-xml-doc">XML document</termref> may, for
1677 <termref def="dt-valid">validation</termref> purposes,
1679 using element type and attribute-list declarations.
1681 <termref def="dt-content">content</termref>.
1685 appear as <termref def="dt-parentchild">children</termref> of the element.
1689 <p><termdef id="dt-eldecl" term="Element Type declaration">An <term>element
1694 <prod id='NT-elementdecl'><lhs>elementdecl</lhs>
1695 <rhs>'&lt;!ELEMENT' <nt def='NT-S'>S</nt>
1696 <nt def='NT-Name'>Name</nt>
1697 <nt def='NT-S'>S</nt>
1698 <nt def='NT-contentspec'>contentspec</nt>
1699 <nt def='NT-S'>S</nt>? '>'</rhs>
1701 <prod id='NT-contentspec'><lhs>contentspec</lhs>
1704 | <nt def='NT-Mixed'>Mixed</nt>
1705 | <nt def='NT-children'>children</nt>
1710 where the <nt def='NT-Name'>Name</nt> gives the element type
1727 <div3 id='sec-element-content'>
1730 <p><termdef id='dt-elemcontent' term='Element content'>An element <termref
1731 def="dt-stag">type</termref> has
1733 type must contain only <termref def='dt-parentchild'>child</termref>
1736 <nt def='NT-S'>S</nt>).
1743 content particles (<nt def='NT-cp'>cp</nt>s), which consist of names,
1747 <head>Element-content Models</head>
1749 <prod id='NT-children'><lhs>children</lhs>
1750 <rhs>(<nt def='NT-choice'>choice</nt>
1751 | <nt def='NT-seq'>seq</nt>)
1753 <prod id='NT-cp'><lhs>cp</lhs>
1754 <rhs>(<nt def='NT-Name'>Name</nt>
1755 | <nt def='NT-choice'>choice</nt>
1756 | <nt def='NT-seq'>seq</nt>)
1758 <prod id='NT-choice'><lhs>choice</lhs>
1759 <rhs>'(' <nt def='NT-S'>S</nt>? <nt def='NT-cp'>cp</nt>
1760 ( <nt def='NT-S'>S</nt>? '|' <nt def='NT-S'>S</nt>? <nt def='NT-cp'>cp</nt> )*
1761 <nt def='NT-S'>S</nt>? ')'</rhs>
1762 <vc def='vc-PEinGroup'/></prod>
1763 <prod id='NT-seq'><lhs>seq</lhs>
1764 <rhs>'(' <nt def='NT-S'>S</nt>? <nt def='NT-cp'>cp</nt>
1765 ( <nt def='NT-S'>S</nt>? ',' <nt def='NT-S'>S</nt>? <nt def='NT-cp'>cp</nt> )*
1766 <nt def='NT-S'>S</nt>? ')'</rhs>
1767 <vc def='vc-PEinGroup'/></prod>
1771 where each <nt def='NT-Name'>Name</nt> is the type of an element which may
1772 appear as a <termref def="dt-parentchild">child</termref>.
1775 def="dt-elemcontent">element content</termref> at the location where
1778 appear in the <termref def="dt-elemcontent">element content</termref> in the
1794 def='dt-compat'>For compatibility</termref>, it is an error
1798 <!-- appendix <specref ref="determinism"/>. -->
1799 <!-- appendix on deterministic content models. -->
1801 <vcnote id='vc-PEinGroup'>
1803 <p>Parameter-entity
1804 <termref def='dt-repltext'>replacement text</termref> must be properly nested
1807 in a <nt def='NT-choice'>choice</nt>, <nt def='NT-seq'>seq</nt>, or
1808 <nt def='NT-Mixed'>Mixed</nt> construct
1810 <termref def='dt-PERef'>parameter entity</termref>,
1812 <p><termref def='dt-interop'>For interoperability</termref>,
1813 if a parameter-entity reference appears in a
1814 <nt def='NT-choice'>choice</nt>, <nt def='NT-seq'>seq</nt>, or
1815 <nt def='NT-Mixed'>Mixed</nt> construct, its replacement text
1817 neither the first nor last non-blank
1822 <p>Examples of element-content models:
1825 &lt;!ELEMENT dictionary-body (%div.mix; | %dict.mix;)*></eg></p>
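One way to picture how an element-content model constrains child elements is to translate the model into a regular expression over the sequence of child element types. The sketch below is an assumption-laden illustration, not a validator: it accepts only ASCII names, does not expand parameter-entity references, does not handle `#PCDATA`, and does not check that the model is deterministic. The function names and the model string in the usage note are hypothetical.

```python
import re

# ASCII-only approximation of the Name production (an assumption of this sketch).
NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9._:-]*")


def model_to_regex(model):
    """Translate a content model such as '(a, (b | c)*, d?)' into a regex."""
    parts = []
    pos = 0
    while pos < len(model):
        ch = model[pos]
        if ch.isspace() or ch == ",":
            pos += 1              # a sequence is plain concatenation
        elif ch in "()|*+?":
            parts.append(ch)      # these operators mean the same in a regex
            pos += 1
        else:
            match = NAME.match(model, pos)
            if match is None:     # e.g. '%' or '#', which are unsupported here
                raise ValueError("unsupported character: " + ch)
            # Each child is encoded as its name followed by one space.
            parts.append("(?:" + re.escape(match.group()) + " )")
            pos = match.end()
    return re.compile("".join(parts))


def children_match(model, children):
    """Return True if the list of child element types satisfies the model."""
    encoded = "".join(name + " " for name in children)
    return model_to_regex(model).fullmatch(encoded) is not None
```

For example, `children_match("(head, (p | list | note)*, div2*)", ["head", "p", "note"])` is accepted, while a sequence that starts with `p` is not, because the model requires `head` first.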
1828 <div3 id='sec-mixed-content'>
1831 <p><termdef id='dt-mixed' term='Mixed Content'>An element
1832 <termref def='dt-stag'>type</termref> has
1835 <termref def="dt-parentchild">child</termref> elements.</termdef>
1839 <head>Mixed-content Declaration</head>
1841 <prod id='NT-Mixed'><lhs>Mixed</lhs>
1842 <rhs>'(' <nt def='NT-S'>S</nt>?
1844 (<nt def='NT-S'>S</nt>?
1846 <nt def='NT-S'>S</nt>?
1847 <nt def='NT-Name'>Name</nt>)*
1848 <nt def='NT-S'>S</nt>?
1850 <rhs>| '(' <nt def='NT-S'>S</nt>? '#PCDATA' <nt def='NT-S'>S</nt>? ')'
1851 </rhs><vc def='vc-PEinGroup'/>
1852 <vc def='vc-MixedChildrenUnique'/>
1857 where the <nt def='NT-Name'>Name</nt>s give the types of elements
1860 <vcnote id='vc-MixedChildrenUnique'>
1862 <p>The same name must not appear more than once in a single mixed-content
1873 <head>Attribute-List Declarations</head>
1875 <p><termref def="dt-attr">Attributes</termref> are used to associate
1876 name-value pairs with <termref def="dt-element">elements</termref>.
1878 def="dt-stag">start-tags</termref>
1879 and <termref def="dt-eetag">empty-element tags</termref>;
1881 recognize them appear in <specref ref='sec-starttags'/>.
1882 Attribute-list
1889 <item><p>To provide <termref def="dt-default">default values</termref>
1893 <p><termdef id="dt-attdecl" term="Attribute-List Declaration">
1894 <term>Attribute-list declarations</term> specify the name, data type, and default
1897 <head>Attribute-list Declaration</head>
1898 <prod id='NT-AttlistDecl'><lhs>AttlistDecl</lhs>
1899 <rhs>'&lt;!ATTLIST' <nt def='NT-S'>S</nt>
1900 <nt def='NT-Name'>Name</nt>
1901 <nt def='NT-AttDef'>AttDef</nt>*
1902 <nt def='NT-S'>S</nt>? '&gt;'</rhs>
1904 <prod id='NT-AttDef'><lhs>AttDef</lhs>
1905 <rhs><nt def='NT-S'>S</nt> <nt def='NT-Name'>Name</nt>
1906 <nt def='NT-S'>S</nt> <nt def='NT-AttType'>AttType</nt>
1907 <nt def='NT-S'>S</nt> <nt def='NT-DefaultDecl'>DefaultDecl</nt></rhs>
1910 The <nt def="NT-Name">Name</nt> in the
1911 <nt def='NT-AttlistDecl'>AttlistDecl</nt> rule is the type of an element. At
1914 error. The <nt def='NT-Name'>Name</nt> in the
1915 <nt def='NT-AttDef'>AttDef</nt> rule is
1918 When more than one <nt def='NT-AttlistDecl'>AttlistDecl</nt> is provided for a
1923 <termref def='dt-interop'>For interoperability,</termref> writers of DTDs
1924 may choose to provide at most one attribute-list declaration
1927 in each attribute-list declaration.
1929 issue a warning when more than one attribute-list declaration is
1935 <div3 id='sec-attribute-types'>
1945 <prod id='NT-AttType'><lhs>AttType</lhs>
1946 <rhs><nt def='NT-StringType'>StringType</nt>
1947 | <nt def='NT-TokenizedType'>TokenizedType</nt>
1948 | <nt def='NT-EnumeratedType'>EnumeratedType</nt>
1951 <prod id='NT-StringType'><lhs>StringType</lhs>
1954 <prod id='NT-TokenizedType'><lhs>TokenizedType</lhs>
1957 <vc def='one-id-per-el'/>
1958 <vc def='id-default'/>
1978 <nt def='NT-Name'>Name</nt> production.
1984 <vcnote id='one-id-per-el'>
1988 <vcnote id='id-default'>
1997 the <nt def="NT-Name">Name</nt> production, and
1999 <nt def="NT-Names">Names</nt>;
2000 each <nt def='NT-Name'>Name</nt> must match the value of an ID attribute on
2009 must match the <nt def="NT-Name">Name</nt> production,
2011 <nt def="NT-Names">Names</nt>;
2012 each <nt def="NT-Name">Name</nt> must
2014 name of an <termref def="dt-unparsed">unparsed entity</termref> declared in the
2015 <termref def="dt-doctype">DTD</termref>.
2022 <nt def="NT-Nmtoken">Nmtoken</nt> production;
2024 match <nt def="NT-Nmtokens">Nmtokens</nt>.
2027 <!-- why?
2030 <specref ref="AVNormalize"/>.</p>-->
2031 <p><termdef id='dt-enumerated' term='Enumerated Attribute
2037 <prod id='NT-EnumeratedType'><lhs>EnumeratedType</lhs>
2038 <rhs><nt def='NT-NotationType'>NotationType</nt>
2039 | <nt def='NT-Enumeration'>Enumeration</nt>
2041 <prod id='NT-NotationType'><lhs>NotationType</lhs>
2043 <nt def='NT-S'>S</nt>
2045 <nt def='NT-S'>S</nt>?
2046 <nt def='NT-Name'>Name</nt>
2047 (<nt def='NT-S'>S</nt>? '|' <nt def='NT-S'>S</nt>?
2048 <nt def='NT-Name'>Name</nt>)*
2049 <nt def='NT-S'>S</nt>? ')'
2052 <prod id='NT-Enumeration'><lhs>Enumeration</lhs>
2053 <rhs>'(' <nt def='NT-S'>S</nt>?
2054 <nt def='NT-Nmtoken'>Nmtoken</nt>
2055 (<nt def='NT-S'>S</nt>? '|'
2056 <nt def='NT-S'>S</nt>?
2057 <nt def='NT-Nmtoken'>Nmtoken</nt>)*
2058 <nt def='NT-S'>S</nt>?
2063 <termref def='dt-notation'>notation</termref>, declared in the
2073 one of the <titleref href='Notations'>notation</titleref> names included in
2074 the declaration; all notation names in the declaration must
2082 must match one of the <nt def='NT-Nmtoken'>Nmtoken</nt> tokens in the
2086 <p><termref def='dt-interop'>For interoperability,</termref> the same
2087 <nt def='NT-Nmtoken'>Nmtoken</nt> should not occur more than once in the
2092 <div3 id='sec-attr-defaults'>
2095 <p>An <termref def="dt-attdecl">attribute declaration</termref> provides
2102 <prod id='NT-DefaultDecl'><lhs>DefaultDecl</lhs>
2105 <rhs>| (('#FIXED' S)? <nt def='NT-AttValue'>AttValue</nt>)</rhs>
2118 <!-- not any more!!
2123 of the application. -->
2124 <termdef id="dt-default" term="Attribute Default">If the
2127 <nt def='NT-AttValue'>AttValue</nt> value contains the declared
2138 all elements of the type in the attribute-list declaration.
2154 <p>Examples of attribute-list declarations:
2164 <head>Attribute-Value Normalization</head>
2175 is appended for a "#xD#xA" sequence that is part of an external
2189 by a non-validating parser as if declared
2194 <div2 id='sec-condition-sect'>
2196 <p><termdef id='dt-cond-section' term='conditional section'>
2198 <termref def='dt-doctype'>document type declaration external subset</termref>
2205 <prod id='NT-conditionalSect'><lhs>conditionalSect</lhs>
2206 <rhs><nt def='NT-includeSect'>includeSect</nt>
2207 | <nt def='NT-ignoreSect'>ignoreSect</nt>
2210 <prod id='NT-includeSect'><lhs>includeSect</lhs>
2213 <nt def="NT-extSubsetDecl">extSubsetDecl</nt>
2217 <prod id='NT-ignoreSect'><lhs>ignoreSect</lhs>
2219 <nt def="NT-ignoreSectContents">ignoreSectContents</nt>*
2223 <prod id='NT-ignoreSectContents'><lhs>ignoreSectContents</lhs>
2224 <rhs><nt def='NT-Ignore'>Ignore</nt>
2225 ('&lt;![' <nt def='NT-ignoreSectContents'>ignoreSectContents</nt> ']]&gt;'
2226 <nt def='NT-Ignore'>Ignore</nt>)*</rhs></prod>
2227 <prod id='NT-Ignore'><lhs>Ignore</lhs>
2228 <rhs><nt def='NT-Char'>Char</nt>* -
2229 (<nt def='NT-Char'>Char</nt>* ('&lt;![' | ']]&gt;')
2230 <nt def='NT-Char'>Char</nt>*)
2236 <p>Like the internal and external DTD subsets, a conditional section
2256 parameter-entity reference, the parameter entity must be replaced by its
2274 <!--
2275 <div2 id='sec-pass-to-app'>
2277 <p>When an XML processor encounters a start-tag, it must make
2284 <p>the names of attributes known to apply to this element type
2285 (validating processors must make available names of all attributes
2286 declared for the element type; non-validating processors must
2287 make available at least the names of the attributes for which
2294 -->
2297 <!-- &Entities; -->
2299 <div1 id='sec-physical-struct'>
2302 <p><termdef id="dt-entity" term="Entity">An XML document may consist
2306 the <termref def='dt-doctype'>external DTD subset</termref>)
2310 called the <termref def="dt-docent">document entity</termref>, which serves
2311 as the starting point for the <termref def="dt-xml-proc">XML
2314 <termdef id="dt-parsedent" term="Text Entity">A <term>parsed entity's</term>
2316 <termref def='dt-repltext'>replacement text</termref>;
2317 this <termref def="dt-text">text</termref> is considered an
2320 <p><termdef id="dt-unparsed" term="Unparsed Entity">An
2323 <termref def='dt-text'>text</termref>, and if text, may not be XML.
2326 def="dt-notation">notation</termref>, identified by name.
2337 <p><termdef id='gen-entity' term='general entity'
2339 are entities for use within the document content.
2343 <termdef id='dt-PE' term='Parameter entity'>Parameter entities
2344 are parsed entities for use within the DTD.</termdef>
2345 These two types of entities use different forms of reference and
2351 <div2 id='sec-references'>
2353 <p><termdef id="dt-charref" term="Character Reference">
2359 <prod id='NT-CharRef'><lhs>CharRef</lhs>
2360 <rhs>'&amp;#' [0-9]+ ';' </rhs>
2361 <rhs>| '&hcro;' [0-9a-fA-F]+ ';'</rhs>
2362 <wfc def="wf-Legalchar"/>
2365 <wfcnote id="wf-Legalchar">
2369 <nt def="NT-Char">Char</nt>.</p>
2379 <p><termdef id="dt-entref" term="Entity Reference">An <term>entity
2381 <termdef id='dt-GERef' term='General Entity Reference'>References to
2383 use ampersand (<code>&amp;</code>) and semicolon (<code>;</code>) as
2385 <termdef id='dt-PERef' term='Parameter-entity reference'>
2386 <term>Parameter-entity references</term> use percent-sign (<code>%</code>) and
2392 <prod id='NT-Reference'><lhs>Reference</lhs>
2393 <rhs><nt def='NT-EntityRef'>EntityRef</nt>
2394 | <nt def='NT-CharRef'>CharRef</nt></rhs></prod>
2395 <prod id='NT-EntityRef'><lhs>EntityRef</lhs>
2396 <rhs>'&amp;' <nt def='NT-Name'>Name</nt> ';'</rhs>
2397 <wfc def='wf-entdeclared'/>
2398 <vc def='vc-entdeclared'/>
2402 <prod id='NT-PEReference'><lhs>PEReference</lhs>
2403 <rhs>'%' <nt def='NT-Name'>Name</nt> ';'</rhs>
2404 <vc def='vc-entdeclared'/>
2410 <wfcnote id='wf-entdeclared'>
2415 the <nt def='NT-Name'>Name</nt> given in the entity reference must
2416 <termref def="dt-match">match</termref> that in an
2417 <titleref href='sec-entity-decl'>entity declaration</titleref>, except that
2418 well-formed documents need not declare
2422 reference to it which appears in a default value in an attribute-list
2424 <p>Note that if entities are declared in the external subset or in
2425 external parameter entities, a non-validating processor is
2426 <titleref href='include-if-valid'>not obligated to</titleref> read
2428 an entity must be declared is a well-formedness constraint only
2429 if <titleref href='sec-rmd'>standalone='yes'</titleref>.</p>
2431 <vcnote id="vc-entdeclared">
2433 <p>In a document with an external subset or external parameter
2435 the <nt def='NT-Name'>Name</nt> given in the entity reference must <termref
2436 def="dt-match">match</termref> that in an
2437 <titleref href='sec-entity-decl'>entity declaration</titleref>.
2440 specified in <specref ref="sec-predefined-ent"/>.
2443 reference to it which appears in a default value in an attribute-list
2446 <!-- FINAL EDIT: is this duplication too clumsy? -->
2451 def="dt-unparsed">unparsed entity</termref>. Unparsed entities may be referred
2452 to only in <termref def="dt-attrval">attribute values</termref> declared to
2466 Parameter-entity references may only appear in the
2467 <termref def='dt-doctype'>DTD</termref>.
2471 <eg>Type &lt;key>less-than&lt;/key> (&hcro;3C;) to save options.
2473 is classified &amp;security-level;.</eg></p>
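The character-reference forms can be expanded mechanically. The following is a minimal sketch, assuming the character ranges of the XML 1.0 `Char` production for the Legal Character check; the names `CHAR_REF`, `is_xml_char`, and `expand_char_refs` are illustrative only, and general-entity references such as `&docdate;` are deliberately left untouched.

```python
import re

# '&#' digits ';'  or  '&#x' hex-digits ';'
CHAR_REF = re.compile(r"&#(?:([0-9]+)|x([0-9a-fA-F]+));")


def is_xml_char(cp):
    """True if code point `cp` matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def expand_char_refs(text):
    """Replace every character reference in `text` with its character."""
    def replace(match):
        if match.group(1) is not None:
            cp = int(match.group(1), 10)   # decimal form
        else:
            cp = int(match.group(2), 16)   # hexadecimal form
        if not is_xml_char(cp):
            raise ValueError("character reference to a non-Char: " + match.group())
        return chr(cp)

    return CHAR_REF.sub(replace, text)
```

With this sketch, `expand_char_refs("(&#x3C;)")` and `expand_char_refs("(&#60;)")` both yield the text `(<)`, while a reference such as `&#0;` raises an error.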
2474 <p>Example of a parameter-entity reference:
2475 <eg><![CDATA[<!-- declare the parameter entity "ISOLat2"... -->
2477 SYSTEM "http://www.xml.com/iso/isolat2-xml.entities" >
2478 <!-- ... now reference it. -->
2482 <div2 id='sec-entity-decl'>
2485 <p><termdef id="dt-entdecl" term="entity declaration">
2490 <prod id='NT-EntityDecl'><lhs>EntityDecl</lhs>
2491 <rhs><nt def="NT-GEDecl">GEDecl</nt><!--</rhs><com>General entities</com>
2492 <rhs>--> | <nt def="NT-PEDecl">PEDecl</nt></rhs>
2493 <!--<com>Parameter entities</com>-->
2495 <prod id='NT-GEDecl'><lhs>GEDecl</lhs>
2496 <rhs>'&lt;!ENTITY' <nt def='NT-S'>S</nt> <nt def='NT-Name'>Name</nt>
2497 <nt def='NT-S'>S</nt> <nt def='NT-EntityDef'>EntityDef</nt>
2498 <nt def='NT-S'>S</nt>? '&gt;'</rhs>
2500 <prod id='NT-PEDecl'><lhs>PEDecl</lhs>
2501 <rhs>'&lt;!ENTITY' <nt def='NT-S'>S</nt> '%' <nt def='NT-S'>S</nt>
2502 <nt def='NT-Name'>Name</nt> <nt def='NT-S'>S</nt>
2503 <nt def='NT-PEDef'>PEDef</nt> <nt def='NT-S'>S</nt>? '&gt;'</rhs>
2504 <!--<com>Parameter entities</com>-->
2506 <prod id='NT-EntityDef'><lhs>EntityDef</lhs>
2507 <rhs><nt def='NT-EntityValue'>EntityValue</nt>
2508 <!--</rhs>
2509 <rhs>-->| (<nt def='NT-ExternalID'>ExternalID</nt>
2510 <nt def='NT-NDataDecl'>NDataDecl</nt>?)</rhs>
2511 <!-- <nt def='NT-ExternalDef'>ExternalDef</nt></rhs> -->
2513 <!-- FINAL EDIT: what happened to WFs here? -->
2514 <prod id='NT-PEDef'><lhs>PEDef</lhs>
2515 <rhs><nt def='NT-EntityValue'>EntityValue</nt>
2516 | <nt def='NT-ExternalID'>ExternalID</nt></rhs></prod>
2519 The <nt def='NT-Name'>Name</nt> identifies the entity in an
2520 <termref def="dt-entref">entity reference</termref> or, in the case of an
2528 <div3 id='sec-internal-ent'>
2531 <p><termdef id='dt-internent' term="Internal Entity Replacement Text">If
2533 <nt def='NT-EntityValue'>EntityValue</nt>,
2539 <termref def='dt-litentval'>literal entity value</termref> may be required to
2540 produce the correct <termref def='dt-repltext'>replacement
2541 text</termref>: see <specref ref='intern-replacement'/>.
2543 <p>An internal entity is a <termref def="dt-parsedent">parsed
2546 <eg>&lt;!ENTITY Pub-Status "This is a pre-release of the
2550 <div3 id='sec-external-ent'>
2551 <head>External Entities</head>
2553 <p><termdef id="dt-extent" term="External Entity">If the entity is not
2554 internal, it is an <term>external
2557 <head>External Entity Declaration</head>
2558 <!--
2559 <prod id='NT-ExternalDef'><lhs>ExternalDef</lhs>
2560 <rhs></prod> -->
2561 <prod id='NT-ExternalID'><lhs>ExternalID</lhs>
2562 <rhs>'SYSTEM' <nt def='NT-S'>S</nt>
2563 <nt def='NT-SystemLiteral'>SystemLiteral</nt></rhs>
2564 <rhs>| 'PUBLIC' <nt def='NT-S'>S</nt>
2565 <nt def='NT-PubidLiteral'>PubidLiteral</nt>
2566 <nt def='NT-S'>S</nt>
2567 <nt def='NT-SystemLiteral'>SystemLiteral</nt>
2570 <prod id='NT-NDataDecl'><lhs>NDataDecl</lhs>
2571 <rhs><nt def='NT-S'>S</nt> 'NDATA' <nt def='NT-S'>S</nt>
2572 <nt def='NT-Name'>Name</nt></rhs>
2573 <vc def='not-declared'/></prod>
2575 If the <nt def='NT-NDataDecl'>NDataDecl</nt> is present, this is a
2576 general <termref def="dt-unparsed">unparsed
2578 <vcnote id='not-declared'>
2581 The <nt def='NT-Name'>Name</nt> must match the declared name of a
2582 <termref def="dt-notation">notation</termref>.
2585 <p><termdef id="dt-sysid" term="System Identifier">The
2586 <nt def='NT-SystemLiteral'>SystemLiteral</nt>
2599 <termref def='dt-docent'>document entity</termref>, to the entity
2600 containing the <termref def='dt-doctype'>external DTD subset</termref>,
2601 or to some other <termref def='dt-extent'>external parameter entity</termref>.
2603 <p>An XML processor should handle a non-ASCII character in a URI by
2604 representing the character in UTF-8 as one or more bytes, and then
2608 <p><termdef id="dt-pubid" term="Public identifier">
2609 In addition to a system identifier, an external identifier may
2611 An XML processor attempting to retrieve the entity's content may use the public
2613 is unable to do so, it must use the URI specified in the system
2617 <p>Examples of external entity declarations:
2618 <eg>&lt;!ENTITY open-hatch
2620 &lt;!ENTITY open-hatch
2621 PUBLIC "-//Textuality//TEXT Standard open-hatch boilerplate//EN"
2623 &lt;!ENTITY hatch-pic
2632 <div3 id='sec-TextDecl'>
2634 <p>External parsed entities may each begin with a <term>text
2639 <prod id='NT-TextDecl'><lhs>TextDecl</lhs>
2641 <nt def='NT-VersionInfo'>VersionInfo</nt>?
2642 <nt def='NT-EncodingDecl'>EncodingDecl</nt>
2643 <nt def='NT-S'>S</nt>? &pic;</rhs>
2651 an external parsed entity.</p>
2653 <div3 id='wf-entities'>
2654 <head>Well-Formed Parsed Entities</head>
2655 <p>The document entity is well-formed if it matches the production labeled
2656 <nt def='NT-document'>document</nt>.
2657 An external general
2658 parsed entity is well-formed if it matches the production labeled
2659 <nt def='NT-extParsedEnt'>extParsedEnt</nt>.
2660 An external parameter
2661 entity is well-formed if it matches the production labeled
2662 <nt def='NT-extPE'>extPE</nt>.
2664 <head>Well-Formed External Parsed Entity</head>
2665 <prod id='NT-extParsedEnt'><lhs>extParsedEnt</lhs>
2666 <rhs><nt def='NT-TextDecl'>TextDecl</nt>?
2667 <nt def='NT-content'>content</nt></rhs>
2669 <prod id='NT-extPE'><lhs>extPE</lhs>
2670 <rhs><nt def='NT-TextDecl'>TextDecl</nt>?
2671 <nt def='NT-extSubsetDecl'>extSubsetDecl</nt></rhs>
2674 An internal general parsed entity is well-formed if its replacement text
2676 <nt def='NT-content'>content</nt>.
2677 All internal parameter entities are well-formed by definition.
2679 <p>A consequence of well-formedness in entities is that the logical
2681 <termref def='dt-stag'>start-tag</termref>,
2682 <termref def='dt-etag'>end-tag</termref>,
2683 <termref def="dt-eetag">empty-element tag</termref>,
2684 <termref def='dt-element'>element</termref>,
2685 <termref def='dt-comment'>comment</termref>,
2686 <termref def='dt-pi'>processing instruction</termref>,
2687 <termref def='dt-charref'>character
2689 <termref def='dt-entref'>entity reference</termref>
2695 <p>Each external parsed entity in an XML document may use a different
2697 entities in either UTF-8 or UTF-16.
2700 <p>Entities encoded in UTF-16 must
2702 Unicode Appendix B (the ZERO WIDTH NO-BREAK SPACE character, #xFEFF).
2705 XML processors must be able to use this character to
2706 differentiate between UTF-8 and UTF-16 encoded documents.</p>
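The use of the Byte Order Mark to tell the two required encodings apart can be sketched as follows. This is an illustration under a strong assumption: the entity is known to be in either UTF-8 or UTF-16, and no encoding declaration or external information is consulted. The function name `guess_utf_family` is hypothetical; `codecs.BOM_UTF16_BE` and `codecs.BOM_UTF16_LE` are constants from Python's standard library.

```python
import codecs


def guess_utf_family(data):
    """Guess 'UTF-16' or 'UTF-8' from the first bytes of an entity.

    Only these two encodings are distinguished; other encodings that
    happen to start with the same bytes are not considered.
    """
    if data.startswith((codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)):
        return "UTF-16"
    return "UTF-8"
```

A processor that supports further encodings would need additional checks beyond this two-way test.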
2708 the UTF-8 and UTF-16 encodings, it is recognized that other encodings are
2710 to read entities that use them.
2712 UTF-8 or UTF-16 must begin with a <titleref href='TextDecl'>text
2716 <prod id='NT-EncodingDecl'><lhs>EncodingDecl</lhs>
2717 <rhs><nt def="NT-S">S</nt>
2718 'encoding' <nt def='NT-Eq'>Eq</nt>
2719 ('"' <nt def='NT-EncName'>EncName</nt> '"' |
2720 "'" <nt def='NT-EncName'>EncName</nt> "'" )
2723 <prod id='NT-EncName'><lhs>EncName</lhs>
2724 <rhs>[A-Za-z] ([A-Za-z0-9._] | '-')*</rhs>
2728 In the <termref def='dt-docent'>document entity</termref>, the encoding
2729 declaration is part of the <termref def="dt-xmldecl">XML declaration</termref>.
2730 The <nt def="NT-EncName">EncName</nt> is the name of the encoding used.
2732 <!-- FINAL EDIT: check name of IANA and charset names -->
2734 "<code>UTF-8</code>",
2735 "<code>UTF-16</code>",
2736 "<code>ISO-10646-UCS-2</code>", and
2737 "<code>ISO-10646-UCS-4</code>" should be
2740 "<code>ISO-8859-1</code>",
2741 "<code>ISO-8859-2</code>", ...
2742 "<code>ISO-8859-9</code>" should be used for the parts of ISO 8859, and
2744 "<code>ISO-2022-JP</code>",
2746 "<code>EUC-JP</code>"
2747 should be used for the various encoded forms of JIS X-0208-1997. XML
2753 using their registered names.
2754 Note that these registered names are defined to be
2755 case-insensitive, so processors wishing to match against them
2756 should do so in a case-insensitive
2758 <p>In the absence of information provided by an external
2760 it is an <termref def="dt-error">error</termref> for an entity including
2764 of an external entity, or for
2766 declaration to use an encoding other than UTF-8.
2768 is a subset of UTF-8, ordinary ASCII entities do not strictly need
2771 <p>It is a <termref def='dt-fatal'>fatal error</termref> when an XML processor
2774 <eg>&lt;?xml encoding='UTF-8'?>
2775 &lt;?xml encoding='EUC-JP'?></eg></p>
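The `EncName` production and the case-insensitive matching of encoding names can be checked with a short sketch. The helper names `is_enc_name` and `same_encoding` are illustrative and not defined by this specification; the sketch checks only the syntax of a name, not whether the encoding is registered or supported.

```python
import re

# EncName ::= [A-Za-z] ([A-Za-z0-9._] | '-')*
ENC_NAME = re.compile(r"[A-Za-z][A-Za-z0-9._-]*")


def is_enc_name(name):
    """True if `name` matches the EncName production."""
    return ENC_NAME.fullmatch(name) is not None


def same_encoding(declared, supported):
    """Compare two encoding names without regard to case."""
    return declared.lower() == supported.lower()
```

For instance, `is_enc_name("EUC-JP")` holds, `is_enc_name("8859-1")` does not because the name must start with a letter, and `same_encoding("utf-8", "UTF-8")` holds.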
2782 required behavior of an <termref def='dt-xml-proc'>XML processor</termref> in
2788 anywhere after the <termref def='dt-stag'>start-tag</termref> and
2789 before the <termref def='dt-etag'>end-tag</termref> of an element; corresponds
2790 to the nonterminal <nt def='NT-content'>content</nt>.</p></def>
2795 <termref def='dt-stag'>start-tag</termref>, or a default
2796 value in an <termref def='dt-attdecl'>attribute declaration</termref>;
2798 <nt def='NT-AttValue'>AttValue</nt>.</p></def></gitem>
2801 <def><p>as a <nt def='NT-Name'>Name</nt>, not a reference, appearing either as
2804 the space-separated tokens in the value of an attribute which has been
2810 <termref def='dt-litentval'>literal entity value</termref> in
2812 <nt def='NT-EntityValue'>EntityValue</nt>.</p></def></gitem>
2814 <def><p>as a reference within either the internal or external subsets of the
2815 <termref def='dt-doctype'>DTD</termref>, but outside
2816 of an <nt def='NT-EntityValue'>EntityValue</nt> or
2817 <nt def="NT-AttValue">AttValue</nt>.</p></def>
2830 <td bgcolor='&cellback;'>External Parsed
2838 <td bgcolor='&cellback;'><titleref href='not-recognized'>Not recognized</titleref></td>
2840 <td bgcolor='&cellback;'><titleref href='include-if-valid'>Included if validating</titleref></td>
2847 <td bgcolor='&cellback;'><titleref href='not-recognized'>Not recognized</titleref></td>
2856 <td bgcolor='&cellback;'><titleref href='not-recognized'>Not recognized</titleref></td>
2857 <td bgcolor='&cellback;'><titleref href='forbidden'>Forbidden</titleref></td>
2858 <td bgcolor='&cellback;'><titleref href='forbidden'>Forbidden</titleref></td>
2874 <td bgcolor='&cellback;'><titleref href='as-PE'>Included as PE</titleref></td>
2882 <div3 id='not-recognized'>
2886 DTD are not recognized as markup in <nt def='NT-content'>content</nt>.
2887 Similarly, the names of unparsed entities are not recognized except
2893 <p><termdef id="dt-include" term="Include">An entity is
2895 <termref def='dt-repltext'>replacement text</termref> is retrieved
2900 <termref def='dt-chardata'>character data</termref>
2901 and (except for parameter entities) <termref def="dt-markup">markup</termref>,
2907 as an entity-reference delimiter.)
2912 <div3 id='include-if-valid'>
2915 to <termref def="dt-valid">validate</termref>
2917 <termref def="dt-include">include</termref> its
2919 If the entity is external, and the processor is not
2921 processor <termref def="dt-may">may</termref>, but need not,
2923 If a non-validating parser does not include the replacement text,
2930 Browsers, for example, when encountering an external parsed entity reference,
2938 <termref def='dt-fatal'>fatal</termref> errors:
2941 <termref def='dt-unparsed'>unparsed entity</termref>.
2943 <item><p>the appearance of any character or general-entity reference in the
2944 DTD except within an <nt def='NT-EntityValue'>EntityValue</nt> or
2945 <nt def="NT-AttValue">AttValue</nt>.</p></item>
2946 <item><p>a reference to an external entity in an attribute value.</p>
2953 <p>When an <termref def='dt-entref'>entity reference</termref> appears in an
2955 value, its <termref def='dt-repltext'>replacement text</termref> is
2961 For example, this is well-formed:
2966 &lt;element attribute='a-&amp;EndAttr;></eg>
2970 <p>When the name of an <termref def='dt-unparsed'>unparsed
2974 application of the <termref def='dt-sysid'>system</termref>
2975 and <termref def='dt-pubid'>public</termref> (if any)
2977 <termref def="dt-notation">notation</termref>.</p>
2982 <nt def='NT-EntityValue'>EntityValue</nt> in an entity declaration,
2985 <div3 id='as-PE'>
2987 <p>Just as with external parsed entities, parameter entities
2988 need only be <titleref href='include-if-valid'>included if
2990 When a parameter-entity reference is recognized in the DTD
2992 <termref def='dt-repltext'>replacement
3001 <div2 id='intern-replacement'>
3006 <termdef id="dt-litentval" term='Literal Entity Value'>The <term>literal
3009 non-terminal <nt def='NT-EntityValue'>EntityValue</nt>.</termdef>
3010 <termdef id='dt-repltext' term='Replacement Text'>The <term>replacement
3012 replacement of character references and parameter-entity
3018 (<nt def='NT-EntityValue'>EntityValue</nt>) may contain character,
3019 parameter-entity, and general-entity references.
3023 <termref def='dt-include'>included</termref> as described above
3028 general-entity references must be left as-is, unexpanded.
3038 The general-entity reference "<code>&amp;rights;</code>" would be expanded
3043 <specref ref='sec-entexpand'/>.
3047 <div2 id='sec-predefined-ent'>
3049 <p><termdef id="dt-escape" term="escape">Entity and character
3061 <termref def='dt-interop'>For interoperability</termref>,
3077 be well-formed.
3084 <p><termdef id="dt-notation" term="Notation"><term>Notations</term> identify by
3085 name the format of <termref def="dt-extent">unparsed
3089 a <termref def="dt-pi">processing instruction</termref> is
3091 <p><termdef id="dt-notdecl" term="Notation Declaration">
3093 provide a name for the notation, for use in
3094 entity and attribute-list declarations and in attribute specifications,
3095 and an external identifier for the notation which may allow an XML
3100 <prod id='NT-NotationDecl'><lhs>NotationDecl</lhs>
3101 <rhs>'&lt;!NOTATION' <nt def='NT-S'>S</nt> <nt def='NT-Name'>Name</nt>
3102 <nt def='NT-S'>S</nt>
3103 (<nt def='NT-ExternalID'>ExternalID</nt> |
3104 <nt def='NT-PublicID'>PublicID</nt>)
3105 <nt def='NT-S'>S</nt>? '>'</rhs></prod>
3106 <prod id='NT-PublicID'><lhs>PublicID</lhs>
3107 <rhs>'PUBLIC' <nt def='NT-S'>S</nt>
3108 <nt def='NT-PubidLiteral'>PubidLiteral</nt>
3112 <p>XML processors must provide applications with the name and external
3115 additionally resolve the external identifier into the
3116 <termref def="dt-sysid">system identifier</termref>,
3120 notations for which notation-specific applications are not available on
3125 <div2 id='sec-doc-entity'>
3128 <p><termdef id="dt-docent" term="Document Entity">The <term>document
3130 tree and a starting-point for an <termref def="dt-xml-proc">XML
3141 <!-- &Conformance; -->
3143 <div1 id='sec-conformance'>
3146 <div2 id='proc-types'>
3147 <head>Validating and Non-Validating Processors</head>
3148 <p>Conforming <termref def="dt-xml-proc">XML processors</termref> fall into two
3149 classes: validating and non-validating.</p>
3150 <p>Validating and non-validating processors alike must report
3151 violations of this specification's well-formedness constraints
3153 <termref def='dt-docent'>document entity</termref> and any
3154 other <termref def='dt-parsedent'>parsed entities</termref> that
3156 <p><termdef id="dt-validating" term="Validating Processor">
3159 <termref def="dt-doctype">DTD</termref>, and
3164 DTD and all external parsed entities referenced in the document.
3166 <p>Non-validating processors are required to check only the
3167 <termref def='dt-docent'>document entity</termref>, including
3168 the entire internal DTD subset, for well-formedness.
3169 <termdef id='dt-use-mdecl' term='Process Declarations'>
3177 use the information in those declarations to
3181 <titleref href='sec-attr-defaults'>default attribute values</titleref>.
3183 They must not <termref def='dt-use-mdecl'>process</termref>
3184 <termref def='dt-entdecl'>entity declarations</termref> or
3185 <termref def='dt-attdecl'>attribute-list declarations</termref>
3190 <div2 id='safe-behavior'>
3193 must read every piece of a document and report all well-formedness and
3195 Less is required of a non-validating processor; it need not read any
3199 <item><p>Certain well-formedness errors, specifically those that require
3200 reading external entities, may not be detected by a non-validating processor.
3202 <titleref href='wf-entdeclared'>Entity Declared</titleref>,
3203 <titleref href='wf-textent'>Parsed Entity</titleref>, and
3204 <titleref href='wf-norecursion'>No Recursion</titleref>, as well
3210 parameter and external entities.
3211 For example, a non-validating processor may not
3215 <titleref href='sec-attr-defaults'>default attribute values</titleref>,
3217 external or parameter entities.</p></item>
3221 processors, applications which use non-validating processors should not
3223 Applications which require facilities such as the use of default
3224 attributes or internal entities which are declared in external
3225 entities should use validating XML processors.</p>
3229 <div1 id='sec-notation'>
3233 Extended Backus-Naur Form (EBNF) notation. Each rule in the grammar defines
3243 <p>Within the expression on the right-hand side of a rule, the following
3250 (UCS-4)
3256 encoding in use and is not significant for XML.</p></def>
3259 <label><code>[a-zA-Z]</code>, <code>[#xN-#xN]</code></label>
3260 <def><p>matches any <termref def='dt-character'>character</termref>
3264 <label><code>[^a-z]</code>, <code>[^#xN-#xN]</code></label>
3265 <def><p>matches any <termref def='dt-character'>character</termref>
3271 <def><p>matches any <termref def='dt-character'>character</termref>
3276 <def><p>matches a literal string <termref def="dt-match">matching</termref>
3281 <def><p>matches a literal string <termref def="dt-match">matching</termref>
3306 <label><code>A - B</code></label>
3329 <def><p>well-formedness constraint; this identifies by name a
3331 <termref def="dt-wellformed">well-formed</termref> documents
3337 <termref def="dt-valid">valid</termref> documents associated with
3345 <!-- &SGML; -->
3348 <!-- &Biblio; -->
3349 <div1 id='sec-bibliography'>
3352 <div2 id='sec-existing-stds'>
3357 (Internet Assigned Numbers Authority) <emph>Official Names for
3360 …oc href='ftp://ftp.isi.edu/in-notes/iana/assignments/character-sets'>ftp://ftp.isi.edu/in-notes/ia…
3373 Code for the representation of names of languages.</emph>
3379 <emph>ISO 3166-1:1997 (E).
3380 Codes for the representation of names of countries and their subdivisions
3387 <emph>ISO/IEC 10646-1993 (E). Information technology &mdash; Universal
3388 Multiple-Octet Coded Character Set (UCS) &mdash; Part 1:
3396 Reading, Mass.: Addison-Wesley Developers Press, 1996.</bibl>
3409 Reading: Addison-Wesley, 1986, rpt. corr. 1988.</bibl>
3411 <bibl id="Berners-Lee" xml-link="simple" key="Berners-Lee et al.">
3412 Berners-Lee, T., R. Fielding, and L. Masinter.
3418 <bibl id='ABK' key='Brüggemann-Klein'>Brüggemann-Klein, Anne.
3421 S. 97-98. Springer-Verlag, Berlin 1992.
3422 Full Version in Theoretical Computer Science 120: 197-213, 1993.
3426 <bibl id='ABKDW' key='Brüggemann-Klein and Wood'>Brüggemann-Klein, Anne,
3435 <loc href='http://www.w3.org/TR/NOTE-sgml-xml-971215'>http://www.w3.org/TR/NOTE-sgml-xml-971215</lo…
3437 <bibl id="RFC1738" xml-link="simple" key="IETF RFC1738">
3440 ed. T. Berners-Lee, L. Masinter, M. McCahill.
3444 <bibl id="RFC1808" xml-link="simple" key="IETF RFC1808">
3451 <bibl id="RFC2141" xml-link="simple" key="IETF RFC2141">
3462 edition &mdash; 1986-10-15. [Geneva]: International Organization for
3469 <emph>ISO/IEC 10744-1992 (E). Information technology &mdash;
3470 Hypermedia/Time-based Structuring Language (HyTime).
3496 <prod id="NT-Letter"><lhs>Letter</lhs>
3497 <rhs><nt def="NT-BaseChar">BaseChar</nt>
3498 | <nt def="NT-Ideographic">Ideographic</nt></rhs> </prod>
3499 <prod id='NT-BaseChar'><lhs>BaseChar</lhs>
3500 <rhs>[#x0041-#x005A]
3501 |&nbsp;[#x0061-#x007A]
3502 |&nbsp;[#x00C0-#x00D6]
3503 |&nbsp;[#x00D8-#x00F6]
3504 |&nbsp;[#x00F8-#x00FF]
3505 |&nbsp;[#x0100-#x0131]
3506 |&nbsp;[#x0134-#x013E]
3507 |&nbsp;[#x0141-#x0148]
3508 |&nbsp;[#x014A-#x017E]
3509 |&nbsp;[#x0180-#x01C3]
3510 |&nbsp;[#x01CD-#x01F0]
3511 |&nbsp;[#x01F4-#x01F5]
3512 |&nbsp;[#x01FA-#x0217]
3513 |&nbsp;[#x0250-#x02A8]
3514 |&nbsp;[#x02BB-#x02C1]
3516 |&nbsp;[#x0388-#x038A]
3518 |&nbsp;[#x038E-#x03A1]
3519 |&nbsp;[#x03A3-#x03CE]
3520 |&nbsp;[#x03D0-#x03D6]
3525 |&nbsp;[#x03E2-#x03F3]
3526 |&nbsp;[#x0401-#x040C]
3527 |&nbsp;[#x040E-#x044F]
3528 |&nbsp;[#x0451-#x045C]
3529 |&nbsp;[#x045E-#x0481]
3530 |&nbsp;[#x0490-#x04C4]
3531 |&nbsp;[#x04C7-#x04C8]
3532 |&nbsp;[#x04CB-#x04CC]
3533 |&nbsp;[#x04D0-#x04EB]
3534 |&nbsp;[#x04EE-#x04F5]
3535 |&nbsp;[#x04F8-#x04F9]
3536 |&nbsp;[#x0531-#x0556]
3538 |&nbsp;[#x0561-#x0586]
3539 |&nbsp;[#x05D0-#x05EA]
3540 |&nbsp;[#x05F0-#x05F2]
3541 |&nbsp;[#x0621-#x063A]
3542 |&nbsp;[#x0641-#x064A]
3543 |&nbsp;[#x0671-#x06B7]
3544 |&nbsp;[#x06BA-#x06BE]
3545 |&nbsp;[#x06C0-#x06CE]
3546 |&nbsp;[#x06D0-#x06D3]
3548 |&nbsp;[#x06E5-#x06E6]
3549 |&nbsp;[#x0905-#x0939]
3551 |&nbsp;[#x0958-#x0961]
3552 |&nbsp;[#x0985-#x098C]
3553 |&nbsp;[#x098F-#x0990]
3554 |&nbsp;[#x0993-#x09A8]
3555 |&nbsp;[#x09AA-#x09B0]
3557 |&nbsp;[#x09B6-#x09B9]
3558 |&nbsp;[#x09DC-#x09DD]
3559 |&nbsp;[#x09DF-#x09E1]
3560 |&nbsp;[#x09F0-#x09F1]
3561 |&nbsp;[#x0A05-#x0A0A]
3562 |&nbsp;[#x0A0F-#x0A10]
3563 |&nbsp;[#x0A13-#x0A28]
3564 |&nbsp;[#x0A2A-#x0A30]
3565 |&nbsp;[#x0A32-#x0A33]
3566 |&nbsp;[#x0A35-#x0A36]
3567 |&nbsp;[#x0A38-#x0A39]
3568 |&nbsp;[#x0A59-#x0A5C]
3570 |&nbsp;[#x0A72-#x0A74]
3571 |&nbsp;[#x0A85-#x0A8B]
3573 |&nbsp;[#x0A8F-#x0A91]
3574 |&nbsp;[#x0A93-#x0AA8]
3575 |&nbsp;[#x0AAA-#x0AB0]
3576 |&nbsp;[#x0AB2-#x0AB3]
3577 |&nbsp;[#x0AB5-#x0AB9]
3580 |&nbsp;[#x0B05-#x0B0C]
3581 |&nbsp;[#x0B0F-#x0B10]
3582 |&nbsp;[#x0B13-#x0B28]
3583 |&nbsp;[#x0B2A-#x0B30]
3584 |&nbsp;[#x0B32-#x0B33]
3585 |&nbsp;[#x0B36-#x0B39]
3587 |&nbsp;[#x0B5C-#x0B5D]
3588 |&nbsp;[#x0B5F-#x0B61]
3589 |&nbsp;[#x0B85-#x0B8A]
3590 |&nbsp;[#x0B8E-#x0B90]
3591 |&nbsp;[#x0B92-#x0B95]
3592 |&nbsp;[#x0B99-#x0B9A]
3594 |&nbsp;[#x0B9E-#x0B9F]
3595 |&nbsp;[#x0BA3-#x0BA4]
3596 |&nbsp;[#x0BA8-#x0BAA]
3597 |&nbsp;[#x0BAE-#x0BB5]
3598 |&nbsp;[#x0BB7-#x0BB9]
3599 |&nbsp;[#x0C05-#x0C0C]
3600 |&nbsp;[#x0C0E-#x0C10]
3601 |&nbsp;[#x0C12-#x0C28]
3602 |&nbsp;[#x0C2A-#x0C33]
3603 |&nbsp;[#x0C35-#x0C39]
3604 |&nbsp;[#x0C60-#x0C61]
3605 |&nbsp;[#x0C85-#x0C8C]
3606 |&nbsp;[#x0C8E-#x0C90]
3607 |&nbsp;[#x0C92-#x0CA8]
3608 |&nbsp;[#x0CAA-#x0CB3]
3609 |&nbsp;[#x0CB5-#x0CB9]
3611 |&nbsp;[#x0CE0-#x0CE1]
3612 |&nbsp;[#x0D05-#x0D0C]
3613 |&nbsp;[#x0D0E-#x0D10]
3614 |&nbsp;[#x0D12-#x0D28]
3615 |&nbsp;[#x0D2A-#x0D39]
3616 |&nbsp;[#x0D60-#x0D61]
3617 |&nbsp;[#x0E01-#x0E2E]
3619 |&nbsp;[#x0E32-#x0E33]
3620 |&nbsp;[#x0E40-#x0E45]
3621 |&nbsp;[#x0E81-#x0E82]
3623 |&nbsp;[#x0E87-#x0E88]
3626 |&nbsp;[#x0E94-#x0E97]
3627 |&nbsp;[#x0E99-#x0E9F]
3628 |&nbsp;[#x0EA1-#x0EA3]
3631 |&nbsp;[#x0EAA-#x0EAB]
3632 |&nbsp;[#x0EAD-#x0EAE]
3634 |&nbsp;[#x0EB2-#x0EB3]
3636 |&nbsp;[#x0EC0-#x0EC4]
3637 |&nbsp;[#x0F40-#x0F47]
3638 |&nbsp;[#x0F49-#x0F69]
3639 |&nbsp;[#x10A0-#x10C5]
3640 |&nbsp;[#x10D0-#x10F6]
3642 |&nbsp;[#x1102-#x1103]
3643 |&nbsp;[#x1105-#x1107]
3645 |&nbsp;[#x110B-#x110C]
3646 |&nbsp;[#x110E-#x1112]
3653 |&nbsp;[#x1154-#x1155]
3655 |&nbsp;[#x115F-#x1161]
3660 |&nbsp;[#x116D-#x116E]
3661 |&nbsp;[#x1172-#x1173]
3666 |&nbsp;[#x11AE-#x11AF]
3667 |&nbsp;[#x11B7-#x11B8]
3669 |&nbsp;[#x11BC-#x11C2]
3673 |&nbsp;[#x1E00-#x1E9B]
3674 |&nbsp;[#x1EA0-#x1EF9]
3675 |&nbsp;[#x1F00-#x1F15]
3676 |&nbsp;[#x1F18-#x1F1D]
3677 |&nbsp;[#x1F20-#x1F45]
3678 |&nbsp;[#x1F48-#x1F4D]
3679 |&nbsp;[#x1F50-#x1F57]
3683 |&nbsp;[#x1F5F-#x1F7D]
3684 |&nbsp;[#x1F80-#x1FB4]
3685 |&nbsp;[#x1FB6-#x1FBC]
3687 |&nbsp;[#x1FC2-#x1FC4]
3688 |&nbsp;[#x1FC6-#x1FCC]
3689 |&nbsp;[#x1FD0-#x1FD3]
3690 |&nbsp;[#x1FD6-#x1FDB]
3691 |&nbsp;[#x1FE0-#x1FEC]
3692 |&nbsp;[#x1FF2-#x1FF4]
3693 |&nbsp;[#x1FF6-#x1FFC]
3695 |&nbsp;[#x212A-#x212B]
3697 |&nbsp;[#x2180-#x2182]
3698 |&nbsp;[#x3041-#x3094]
3699 |&nbsp;[#x30A1-#x30FA]
3700 |&nbsp;[#x3105-#x312C]
3701 |&nbsp;[#xAC00-#xD7A3]
3703 <prod id='NT-Ideographic'><lhs>Ideographic</lhs>
3704 <rhs>[#x4E00-#x9FA5]
3706 |&nbsp;[#x3021-#x3029]
3708 <prod id='NT-CombiningChar'><lhs>CombiningChar</lhs>
3709 <rhs>[#x0300-#x0345]
3710 |&nbsp;[#x0360-#x0361]
3711 |&nbsp;[#x0483-#x0486]
3712 |&nbsp;[#x0591-#x05A1]
3713 |&nbsp;[#x05A3-#x05B9]
3714 |&nbsp;[#x05BB-#x05BD]
3716 |&nbsp;[#x05C1-#x05C2]
3718 |&nbsp;[#x064B-#x0652]
3720 |&nbsp;[#x06D6-#x06DC]
3721 |&nbsp;[#x06DD-#x06DF]
3722 |&nbsp;[#x06E0-#x06E4]
3723 |&nbsp;[#x06E7-#x06E8]
3724 |&nbsp;[#x06EA-#x06ED]
3725 |&nbsp;[#x0901-#x0903]
3727 |&nbsp;[#x093E-#x094C]
3729 |&nbsp;[#x0951-#x0954]
3730 |&nbsp;[#x0962-#x0963]
3731 |&nbsp;[#x0981-#x0983]
3735 |&nbsp;[#x09C0-#x09C4]
3736 |&nbsp;[#x09C7-#x09C8]
3737 |&nbsp;[#x09CB-#x09CD]
3739 |&nbsp;[#x09E2-#x09E3]
3744 |&nbsp;[#x0A40-#x0A42]
3745 |&nbsp;[#x0A47-#x0A48]
3746 |&nbsp;[#x0A4B-#x0A4D]
3747 |&nbsp;[#x0A70-#x0A71]
3748 |&nbsp;[#x0A81-#x0A83]
3750 |&nbsp;[#x0ABE-#x0AC5]
3751 |&nbsp;[#x0AC7-#x0AC9]
3752 |&nbsp;[#x0ACB-#x0ACD]
3753 |&nbsp;[#x0B01-#x0B03]
3755 |&nbsp;[#x0B3E-#x0B43]
3756 |&nbsp;[#x0B47-#x0B48]
3757 |&nbsp;[#x0B4B-#x0B4D]
3758 |&nbsp;[#x0B56-#x0B57]
3759 |&nbsp;[#x0B82-#x0B83]
3760 |&nbsp;[#x0BBE-#x0BC2]
3761 |&nbsp;[#x0BC6-#x0BC8]
3762 |&nbsp;[#x0BCA-#x0BCD]
3764 |&nbsp;[#x0C01-#x0C03]
3765 |&nbsp;[#x0C3E-#x0C44]
3766 |&nbsp;[#x0C46-#x0C48]
3767 |&nbsp;[#x0C4A-#x0C4D]
3768 |&nbsp;[#x0C55-#x0C56]
3769 |&nbsp;[#x0C82-#x0C83]
3770 |&nbsp;[#x0CBE-#x0CC4]
3771 |&nbsp;[#x0CC6-#x0CC8]
3772 |&nbsp;[#x0CCA-#x0CCD]
3773 |&nbsp;[#x0CD5-#x0CD6]
3774 |&nbsp;[#x0D02-#x0D03]
3775 |&nbsp;[#x0D3E-#x0D43]
3776 |&nbsp;[#x0D46-#x0D48]
3777 |&nbsp;[#x0D4A-#x0D4D]
3780 |&nbsp;[#x0E34-#x0E3A]
3781 |&nbsp;[#x0E47-#x0E4E]
3783 |&nbsp;[#x0EB4-#x0EB9]
3784 |&nbsp;[#x0EBB-#x0EBC]
3785 |&nbsp;[#x0EC8-#x0ECD]
3786 |&nbsp;[#x0F18-#x0F19]
3792 |&nbsp;[#x0F71-#x0F84]
3793 |&nbsp;[#x0F86-#x0F8B]
3794 |&nbsp;[#x0F90-#x0F95]
3796 |&nbsp;[#x0F99-#x0FAD]
3797 |&nbsp;[#x0FB1-#x0FB7]
3799 |&nbsp;[#x20D0-#x20DC]
3801 |&nbsp;[#x302A-#x302F]
3805 <prod id='NT-Digit'><lhs>Digit</lhs>
3806 <rhs>[#x0030-#x0039]
3807 |&nbsp;[#x0660-#x0669]
3808 |&nbsp;[#x06F0-#x06F9]
3809 |&nbsp;[#x0966-#x096F]
3810 |&nbsp;[#x09E6-#x09EF]
3811 |&nbsp;[#x0A66-#x0A6F]
3812 |&nbsp;[#x0AE6-#x0AEF]
3813 |&nbsp;[#x0B66-#x0B6F]
3814 |&nbsp;[#x0BE7-#x0BEF]
3815 |&nbsp;[#x0C66-#x0C6F]
3816 |&nbsp;[#x0CE6-#x0CEF]
3817 |&nbsp;[#x0D66-#x0D6F]
3818 |&nbsp;[#x0E50-#x0E59]
3819 |&nbsp;[#x0ED0-#x0ED9]
3820 |&nbsp;[#x0F20-#x0F29]
3822 <prod id='NT-Extender'><lhs>Extender</lhs>
3831 |&nbsp;[#x3031-#x3035]
3832 |&nbsp;[#x309D-#x309E]
3833 |&nbsp;[#x30FC-#x30FE]
3847 <p>Name characters other than Name-start characters
3853 names.</p>
3857 with a "compatibility formatting tag" in field 5 of the database --
3861 <p>The following characters are treated as name-start characters
3863 them as Alphabetic: [#x02BB-#x02C1], #x0559, #x06E5, #x06E6.</p>
3866 <p>Characters #x20DD-#x20E0 are excluded (in accordance with
3878 <p>Characters ':' and '_' are allowed as name-start characters.</p>
3881 <p>Characters '-' and '.' are allowed as name characters.</p>
3886 <inform-div1 id="sec-xml-and-sgml">
3890 <termref def="dt-valid">valid</termref> XML document should also be a
3895 </inform-div1>
3896 <inform-div1 id="sec-entexpand">
3899 sequence of entity- and character-reference recognition and
3917 start- and end-tags of the "<code>p</code>" element will be recognized
3933 5 <!ENTITY % zz '&#60;!ENTITY tricky "error-prone" >' >
3949 "<code>&lt;!ENTITY tricky "error-prone" ></code>",
3950 which is a well-formed entity declaration.</p></item>
3955 ("<code>&lt;!ENTITY tricky "error-prone" ></code>") is parsed.
3957 declared, with the replacement text "<code>error-prone</code>".</p></item>
3961 "<code>test</code>" element is the self-describing (and ungrammatical) string
3962 <emph>This sample shows a error-prone method.</emph>
3966 </inform-div1>
3967 <inform-div1 id="determinism">
3969 <p><termref def='dt-compat'>For compatibility</termref>, it is
3973 <!-- FINAL EDIT: WebSGML allows ambiguity? -->
3977 flag non-deterministic content models as errors.</p>
3979 non-deterministic, because given an initial <code>b</code> the parser
4003 <p>Algorithms exist which allow many but not all non-deterministic
4005 models; see Brüggemann-Klein 1991 <bibref ref='ABK'/>.</p>
4006 </inform-div1>
4007 <inform-div1 id="sec-guessing">
4010 entity, indicating which character encoding is in use. Before an XML
4012 know what character encoding is in use&mdash;which is what the internal label
4018 make it feasible to autodetect the character encoding in use in each
4024 (external) information. We consider the first case first.
4027 Because each XML entity not in UTF-8 or UTF-16 format <emph>must</emph>
4031 In reading this list, it may help to know that in UCS-4, '&lt;' is
4033 Order Mark required of UTF-16 data streams is "<code>#xFEFF</code>".</p>
4037 <p><code>00 00 00 3C</code>: UCS-4, big-endian machine (1234 order)</p>
4040 <p><code>3C 00 00 00</code>: UCS-4, little-endian machine (4321 order)</p>
4043 <p><code>00 00 3C 00</code>: UCS-4, unusual octet order (2143)</p>
4046 <p><code>00 3C 00 00</code>: UCS-4, unusual octet order (3412)</p>
4049 <p><code>FE FF</code>: UTF-16, big-endian</p>
4052 <p><code>FF FE</code>: UTF-16, little-endian</p>
4055 <p><code>00 3C 00 3F</code>: UTF-16, big-endian, no Byte Order Mark
4059 <p><code>3C 00 3F 00</code>: UTF-16, little-endian, no Byte Order Mark
4063 <p><code>3C 3F 78 6D</code>: UTF-8, ISO 646, ASCII, some part of ISO 8859,
4064 Shift-JIS, EUC, or any other 7-bit, 8-bit, or mixed-width encoding
4069 use the same bit patterns for the ASCII characters, the encoding
4076 use)</p>
4079 <p>other: UTF-8 without an encoding declaration, or else
4087 declaration and parse the character-encoding identifier, which is
4089 of encodings (e.g. to tell UTF-8 from 8859, and the parts of 8859
4091 use, and so on).
4097 use. Since in practice, all widely used character encodings fall into
4099 reasonably reliable in-band labeling of character encodings, even when
4100 external sources of information at the operating-system or
4101 transport-protocol level are unreliable.
4104 Once the processor has detected the character encoding in use, it can
4110 Like any self-labeling system, the XML encoding declaration will not
4113 character-encoding routines should be careful to ensure the accuracy
4114 of the internal and external information used to label the entity.
4123 specified as part of the higher-level protocol used to deliver XML.
4125 MIME-type label in an external header, for example, should be part of the
4130 <item><p>If an XML entity is in a file, the Byte-Order Mark
4131 and encoding-declaration PI are used (if present) to determine the
4143 MIME type of application/xml, then the Byte-Order Mark and
4144 encoding-declaration PI are used (if present) to determine the
4149 These rules apply only in the absence of protocol-level documentation;
4155 </inform-div1>
4157 <inform-div1 id="sec-xml-wg">
4168 <member><name>Tim Bray, Textuality and Netscape</name><role>XML Co-editor</role></member>
4169 <member><name>Jean Paoli, Microsoft</name><role>XML Co-editor</role></member>
4170 <member><name>C. M. Sperberg-McQueen, U. of Ill.</name><role>XML
4171 Co-editor</role></member>
4187 </inform-div1>
4190 <!-- Keep this comment at the end of the file
4193 sgml-default-dtd-file:"~/sgml/spec.ced"
4194 sgml-omittag:t
4195 sgml-shorttag:t
4197 -->