SGML document


An SGML document has two parts: the SGML prolog and the document instance. The prolog consists of the SGML declaration and the document type definition.

SGML declaration

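The SGML declaration tells the processing system which character set the document uses, which concrete syntax applies (delimiter characters, name length limits), and which optional features, such as tag omission, are enabled.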

SGML declaration example.


Document Type Definition

The document type definition (DTD) specifies the logical structure of the document. Each element declaration sets the name of an element (which is also its tag name) and the content of that element: subelements, character data (#PCDATA), or both.

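An element in a content model must occur exactly once unless an occurrence indicator is attached to it. The occurrence indicators are:

?  optional (zero or one occurrence)
+  required and repeatable (one or more occurrences)
*  optional and repeatable (zero or more occurrences)

Connectors between element names:

,  the elements must occur in the given order
&  the elements must all occur, in any order
|  exactly one of the elements must occur

Parentheses can be used to group elements.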
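
The indicators and connectors can be combined in a single content model. The declaration below is a minimal sketch; the element names (report, title, author, abstract, section, appendix) are hypothetical and chosen only for illustration.

```sgml
<!-- Hypothetical elements, for illustration only -->
<!ELEMENT report - - (title, author+, abstract?, (section | appendix)*)>
```

Read from left to right: a report contains exactly one title, then one or more authors, then an optional abstract, and finally any number of sections and appendices in any mix. The parentheses make the * indicator apply to the whole (section | appendix) group.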

HTML DTD


Example of a simple DTD:

<!DOCTYPE memo [
<!ELEMENT memo - - ((date & from), to, content)>
<!ELEMENT date - - (#PCDATA)>
<!ELEMENT from - - (#PCDATA)>
<!ELEMENT to - - (#PCDATA)>
<!ELEMENT content - - (chapter+)>
<!ELEMENT chapter - - (heading, para+)>
<!ELEMENT heading - - (#PCDATA)>
<!ELEMENT para - O (#PCDATA)>
<!ATTLIST content confidential CDATA #REQUIRED>
]>

If the DTD is not included in the document itself, it can be kept in an external file and referenced with a declaration such as
<!DOCTYPE memo SYSTEM "memo.dtd">
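
As a sketch of that arrangement, assume the declarations are stored in a file named memo.dtd in the same directory as the document. The external file would then hold only the declarations, without the surrounding <!DOCTYPE memo [ and ]> lines:

```sgml
<!ELEMENT memo - - ((date & from), to, content)>
<!ELEMENT date - - (#PCDATA)>
<!ELEMENT from - - (#PCDATA)>
<!ELEMENT to - - (#PCDATA)>
<!ELEMENT content - - (chapter+)>
<!ELEMENT chapter - - (heading, para+)>
<!ELEMENT heading - - (#PCDATA)>
<!ELEMENT para - O (#PCDATA)>
<!ATTLIST content confidential CDATA #REQUIRED>
```

The document file starts with the reference and continues directly with the document instance. The memo content here is made up for illustration:

```sgml
<!DOCTYPE memo SYSTEM "memo.dtd">
<memo>
<date>09.09.1995</date>
<from>Teemu Rautanen</from>
<to>All</to>
<content confidential="no">
<chapter>
<heading>External DTD</heading>
<para>The declarations for this memo are read from memo.dtd.
</chapter>
</content>
</memo>
```

Keeping the DTD in its own file lets many documents share the same declarations.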


Document instance

The document instance is the content of the document, marked up with the tags defined in the DTD. In most cases the document instance is the only part of an SGML document that the user has to deal with; writers usually write their documents to conform to an existing DTD.

Example of a document that conforms to the example DTD above:

<memo>
<date>09.09.1995</date>
<from>Teemu Rautanen</from>
<to>All</to>
<content confidential="no">
<chapter>
<heading>About this memo</heading>
<para>This memo is an example of a document that conforms to the example DTD.
<para>The DTD declares that the PARA element does not need an end tag, because the end of a paragraph can be recognized from the start of the next paragraph or the end of the chapter.
</chapter>
<chapter>
<heading>Tags</heading>
<para>The tags used to mark up this memo come from the DTD; each element is marked with its tag.
</chapter>
</content>
</memo>
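
For comparison, the first chapter of the memo is shown below with the omitted end tags written out. This is a sketch of the equivalent fully tagged markup, as a parser would be expected to interpret it under the example DTD; it is not output from any particular tool.

```sgml
<chapter>
<heading>About this memo</heading>
<para>This memo is an example of a document that conforms to the example DTD.</para>
<para>The DTD declares that the PARA element does not need an end tag, because the end of a paragraph can be recognized from the start of the next paragraph or the end of the chapter.</para>
</chapter>
```

The first </para> is implied by the start of the second paragraph, and the second by the </chapter> end tag. Both forms describe the same document structure.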