An SGML document has two parts: the SGML prolog and the document instance. The SGML prolog consists of the SGML declaration and the document type definition.
The SGML declaration gives the processing system information about the document, such as the character set it uses.
SGML declaration example.
Document Type Definition
The Document Type Definition (DTD) specifies the logical structure of the document. Element declarations set the name (and therefore the tags) of each element and define its content (subelements, #PCDATA).
Each element must occur exactly once if no occurrence indicator is attached to it. Occurrence indicators are:
- ? Optional, 0 or 1 occurrence,
- sample - - (elem1, elem2?)
- + Required and repeatable, 1 or more occurrences,
- sample - - (elem1+, elem2?)
- * Optional and repeatable, 0 or more occurrences,
- sample - - (elem1+, elem2*)
Connectors between element names:
- , All must occur in that order,
- sample - - (elem1, elem2, elem3)
- | Only one can occur,
- sample - - ((elem1|elem2), elem3)
- & All must occur but in any order,
- sample - - ((elem1&elem2), elem3)
Parentheses can be used to group elements.
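The meaning of the occurrence indicators and connectors can be checked with a small script. The following is only a sketch, not how an SGML parser works: it translates a content model into a regular expression over the names of the child elements. The function names (tokenize, parse_item, matches) are invented for this illustration, and the sketch ignores details such as SGML's content model ambiguity rules.

```python
import itertools
import re

# One token: a delimiter, an occurrence indicator, or a name such as elem1 or #PCDATA.
TOKEN = re.compile(r"\s*([(),|&?+*]|#?[A-Za-z][A-Za-z0-9.-]*)")
CONNECTORS = (",", "|", "&")
INDICATORS = ("?", "+", "*")


def tokenize(model):
    """Split a content model such as '(elem1, elem2?)' into tokens."""
    model = model.strip()
    pos, tokens = 0, []
    while pos < len(model):
        m = TOKEN.match(model, pos)
        if not m:
            raise ValueError(f"unexpected character {model[pos]!r} at {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse_item(tokens, i):
    """Return (regex, next_index) for one name or group plus its indicator."""
    if tokens[i] == "(":
        parts, connector = [], None
        i += 1
        while True:
            part, i = parse_item(tokens, i)
            parts.append(part)
            if tokens[i] == ")":
                i += 1
                break
            if tokens[i] not in CONNECTORS or connector not in (None, tokens[i]):
                raise ValueError("expected a single kind of connector in a group")
            connector = tokens[i]
            i += 1
        if connector == "|":      # only one can occur
            body = "|".join(parts)
        elif connector == "&":    # all must occur, in any order
            body = "|".join("".join(p) for p in itertools.permutations(parts))
        else:                     # "," (or a single item): all, in this order
            body = "".join(parts)
        regex = f"(?:{body})"
    else:
        # Each child name is followed by one space in the encoded sequence.
        regex = f"(?:{re.escape(tokens[i])} )"
        i += 1
    if i < len(tokens) and tokens[i] in INDICATORS:
        regex += tokens[i]        # ?, + and * mean the same in a regex
        i += 1
    return regex, i


def matches(model, children):
    """True if the list of child element names satisfies the content model."""
    tokens = tokenize(model)
    regex, end = parse_item(tokens, 0)
    if end != len(tokens):
        raise ValueError("trailing tokens after content model")
    sequence = "".join(name + " " for name in children)
    return re.fullmatch(regex, sequence) is not None


print(matches("(elem1, elem2?)", ["elem1"]))                           # True
print(matches("((elem1|elem2), elem3)", ["elem1", "elem2", "elem3"]))  # False
print(matches("((elem1&elem2), elem3)", ["elem2", "elem1", "elem3"]))  # True
```

The & connector is expanded into every ordering of its members, which is fine for two or three elements but grows quickly for longer groups.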
Example of a simple DTD:
<!ELEMENT memo - - ((date & from), to, content)>
<!ELEMENT date - - (#PCDATA)>
<!ELEMENT from - - (#PCDATA)>
<!ELEMENT to - - (#PCDATA)>
<!ELEMENT content - - (chapter+)>
<!ELEMENT chapter - - (heading, para+)>
<!ELEMENT heading - - (#PCDATA)>
<!ELEMENT para - O (#PCDATA)>
<!ATTLIST content confidential CDATA #REQUIRED>
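To see what the memo DTD permits, its declarations can be written out by hand as data and checked against a document tree. The sketch below assumes a made-up tree representation, (name, attributes, children), and a hypothetical validate function; the sample values (names, date) are placeholders, and real SGML tools validate while parsing rather than in this way.

```python
import re

# Content models of the memo DTD, hand-translated to regular expressions over
# space-terminated child names. (date & from) becomes both possible orderings.
CONTENT_MODELS = {
    "memo": re.compile(r"(?:date from |from date )to content "),
    "content": re.compile(r"(?:chapter )+"),
    "chapter": re.compile(r"heading (?:para )+"),
}
# Elements declared as (#PCDATA) may contain character data only.
TEXT_ONLY = {"date", "from", "to", "heading", "para"}
# From the ATTLIST declaration: confidential is #REQUIRED on content.
REQUIRED_ATTRS = {"content": {"confidential"}}


def validate(element):
    """Return a list of error strings for a (name, attrs, children) tuple."""
    name, attrs, children = element
    errors = [f"{name}: missing required attribute {a}"
              for a in sorted(REQUIRED_ATTRS.get(name, set()) - set(attrs))]
    if name in TEXT_ONLY:
        if not all(isinstance(c, str) for c in children):
            errors.append(f"{name}: only character data is allowed")
        return errors
    if name not in CONTENT_MODELS:
        return errors + [f"{name}: element is not declared"]
    # Simplification: strings between subelements are ignored here.
    subelements = [c for c in children if not isinstance(c, str)]
    sequence = "".join(child[0] + " " for child in subelements)
    if not CONTENT_MODELS[name].fullmatch(sequence):
        errors.append(f"{name}: children do not match the content model")
    for child in subelements:
        errors.extend(validate(child))
    return errors


def text(name, value):
    """Helper: an element without attributes that holds only text."""
    return (name, {}, [value])


memo = ("memo", {}, [
    text("from", "A. Writer"),      # from before date is allowed by &
    text("date", "1 May"),
    text("to", "All staff"),
    ("content", {"confidential": "no"}, [
        ("chapter", {}, [text("heading", "About this memo"),
                         text("para", "First paragraph.")]),
    ]),
])
print(validate(memo))  # prints []
```

Leaving out the confidential attribute, or a chapter without any para, makes validate return a non-empty list of messages.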
If the DTD is not included in the document itself, it can be referenced with a document type declaration such as
<!DOCTYPE memo SYSTEM "memo.dtd">
The document instance is the content of the document, marked up with tags from the DTD. In most cases the document instance is the only part of an SGML document that the user has to deal with; writers usually write their documents to conform to an existing DTD.
Example of a document marked up according to the example DTD above:
<heading>About this memo</heading>
<para>This memo is an example of a document that conforms to the example DTD.
<para>The DTD declares that the PARA element does not need an end tag, because the end of a paragraph can be recognized from the start of the next paragraph or the end of the chapter.
<para>The tags used to mark up this memo come from the DTD; each element is marked with its tag.
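How a processing system can recover the omitted PARA end tags may be sketched as follows. This is a simplified illustration, not a real SGML parser: the function name close_omitted_para is invented here, tags are assumed to carry no attributes, and the only rule applied is that PARA holds character data only, so any other tag (or the end of the input) ends an open paragraph.

```python
import re

# Matches a start tag or end tag without attributes, e.g. <para> or </chapter>.
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9.-]*)>")


def close_omitted_para(fragment):
    """Insert the </para> end tags that the author was allowed to omit."""
    out, pos, para_open = [], 0, False
    for m in TAG.finditer(fragment):
        out.append(fragment[pos:m.start()])
        is_end = m.group(1) == "/"
        name = m.group(2).lower()  # assumption: element names are case-insensitive
        if para_open and not (is_end and name == "para"):
            # PARA contains only #PCDATA, so any other tag implies its end.
            out.append("</para>")
            para_open = False
        if name == "para":
            para_open = not is_end
        out.append(m.group(0))
        pos = m.end()
    out.append(fragment[pos:])
    if para_open:
        out.append("</para>")      # end of input closes the last paragraph
    return "".join(out)


print(close_omitted_para("<heading>H</heading><para>a<para>b"))
# prints <heading>H</heading><para>a</para><para>b</para>
```

Paragraphs that already have an explicit end tag are left unchanged.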