HTML Syntax

This document introduces HTML syntax; it also describes tags for document meta-data -- information about the document itself. Other documents describe tags for document content.

HTML Syntax, Style

Here is a simple, minimal yet complete HTML document:

<HTML><HEAD>
<TITLE>the title</TITLE>
</HEAD><BODY>
<H1>Hello</H1>
<P>goodbye
</BODY></HTML>

Some of these tags (e.g., BODY) can be safely omitted, but they should be included for clarity and for compatibility with other HTML applications. Some tags are not currently supported by Newt's Cape; the parser should safely ignore them.
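As an illustration of omitted tags, here is a sketch of the same document without the HTML, HEAD and BODY tags, which HTML 2.0 treats as optional. That Newt's Cape renders it identically to the full version is an assumption, not something stated here.

```html
<!-- HTML, HEAD and BODY tags omitted; TITLE is still required -->
<TITLE>the title</TITLE>
<H1>Hello</H1>
<P>goodbye
```

The fully tagged form remains the safer choice when the same source will be read by other HTML applications.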

Compatibility, Standards

Newt's Cape supports most of the HTML 2.0 specification.

Certain obsolete HTML 1.0 tags may not be implemented; certain HTML 3.2 features, such as tables and new phrase tags, have been added. Other browser-specific or format-oriented extensions may be added in the future as they become part of the official or de facto standard or are requested by our users.

Remember that Newt's Cape, like WWW browsers, should ignore tags it doesn't recognize or understand. The closest thing to a "crash" in Newt's Cape is a badly-rendered book (hopefully). Of course, if a site uses sloppy syntax or makes assumptions about screen size, results are less predictable. If in doubt about what Newt's Cape supports, examine this collection of documents, since Newt's Cape has converted all of them into books. The Index summarizes current tags.

Newt's Cape now supports "well-formed" expressions, e.g., tags like <BR/> and <BR />, which are needed to support XML, in particular XHTML. XML requires that every attribute have an explicit value (no unary attributes), that values be delimited by " or ', and that all tags be closed, e.g., <IMG.../>, </P>, </LI>, etc. Other dialects of XML, e.g., eBook, WML, etc., will require different helper apps to interpret the tags that Newt's Cape parses.
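The well-formedness rules can be sketched by rewriting the minimal document from earlier in XHTML style. This is a fragment for illustration only: it leaves out the XML declaration and DOCTYPE that a complete XHTML document would carry, and the file name logo.gif and the field name demo are hypothetical.

```html
<html><head>
<title>the title</title>
</head><body>
<h1>Hello</h1>
<!-- every element is closed explicitly -->
<p>goodbye</p>
<!-- empty elements close themselves with /> -->
<p>line one<br />line two</p>
<p><img src="logo.gif" alt="logo" /></p>
<!-- no unary attributes: CHECKED becomes checked="checked" -->
<p><input type="checkbox" name="demo" checked="checked" /></p>
</body></html>
```

Note that XHTML itself requires lowercase element and attribute names, unlike the case-insensitive HTML described elsewhere in this document.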

Delimiters vs. Data

If you are familiar with programming languages, then you'll realize that there is a need to help the computer distinguish between commands and content -- directions and data. Tags are used to indicate what something is, not necessarily how it should appear. Tag names such as H1 are delimited by < and >; a matching end tag, if needed, begins with </. Tag names are not case sensitive, but we generally use UPPERCASE to make them more visible in source. Tags can be modified with attributes -- required or optional names -- which may be followed by = and an attribute value, usually surrounded with "".
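A short sketch of these pieces follows. The file name other.html is hypothetical, and COMPACT is used only as an example of an attribute that takes no value; whether Newt's Cape acts on it is not stated here.

```html
<!-- start tag, content, matching end tag -->
<H1>Hello</H1>

<!-- tag names are not case sensitive: this is the same element -->
<h1>Hello</h1>

<!-- attribute name, =, and a quoted attribute value -->
<A HREF="other.html">another page</A>

<!-- an attribute with no value -->
<UL COMPACT>
<LI>first item
<LI>second item
</UL>
```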

If you are not using a WYSIWYG editor that outputs HTML, you may wish to do some minimal formatting in an HTML source document to help you, as the author, see the structure that you are creating: for example, putting tags in uppercase, leaving extra lines between sections and paragraphs, and putting tabs in front of list items. Remember though that, except in PRE, extra separators (spaces, tabs and return characters) are ignored inside tags, between attributes, and within data text.
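To illustrate the separator rule, the first two paragraphs in this sketch should render the same way, while the PRE block keeps its spacing and line breaks. The exact rendering in Newt's Cape is assumed to follow ordinary HTML behavior.

```html
<P>one   two
      three

<P>one two three

<PRE>
one   two
      three
</PRE>
```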

You may occasionally want to use command characters such as < and > in your content. In order to do this, you need to use a special mechanism -- "entity names" -- to treat a character as data, not as a delimiter. In addition, you may want to include special, international non-ASCII characters. You can specify these characters symbolically by name (names are case sensitive) or numerically. For example:

&amp; or &#38; for & (ampersand)
&lt; for < (less than)
&gt; for > (greater than)
&quot; for " (quote)
&iquest; for ¿, &Agrave; for À, &yen; for ¥, etc.
List of Character Entities
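A brief sketch of entities used inside content follows. The sample sentences are invented for illustration; the entity names and the numeric code 38 are standard HTML.

```html
<!-- show a tag name as text instead of having it parsed as a tag -->
<P>Use &lt;H1&gt; for a top-level heading.

<!-- the same character by name and by number -->
<P>Fish &amp; chips, also written fish &#38; chips.

<P>She said &quot;hello&quot;.

<!-- names are case sensitive: &Agrave; is uppercase, &agrave; is lowercase -->
<P>&iquest;Cu&aacute;nto cuesta? 100 &yen;.
```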

NewtonScript

You do not need to know NewtonScript (the Newton's built-in object-oriented language) in order to create HTML source documents, even ones that include forms. If you want to customize the behavior of URLs for IMG, HREF and FORM, or add shared scripts and slots to the book itself, then some knowledge of programming and NewtonScript is useful. See also: Newt Development Environment.

Since the NewtonScript source for expressions and methods is processed as ordinary HTML before being compiled on your Newton (assuming that you've enabled this in i:General:NewtonScript), the HTML quoting conventions apply. You need to do a few things differently so that your embedded NewtonScript source is converted as you intended. See also: Quoting conventions, scripting and writing NewtonScript in Newt's Cape.

!-- (comment)

Use comments to include notes in the source that do not appear in the viewer. It should be possible to include comments almost anywhere in an HTML document, e.g., inside attributes. The "remove comments?" option (for the HTML cache) can be useful to save space and slightly speed up processing.

Example:

<!-- this is a comment. nothing should appear -->

Result:

For More Info

Version 2.1. Last updated: Dec 2000