Hypertext Markup Language - an Overview

A web page is a document written in a special markup language that can be rendered (displayed) by a web browser such as Firefox, Chrome or Microsoft Internet Explorer. A markup language is essentially a set of markup elements (or tags) that tell a user agent (in this context, a web browser) how the content they are applied to should be displayed.

The markup language used to write web pages is called hypertext markup language (HTML). The most recent major version of HTML at the time of writing is HTML5. This version of HTML can do everything that previous versions could do, and in addition implements a number of new tags designed to make it easier for web developers to add features like multimedia content to their pages.

A web page consists of content (such as text and images), and markup elements (tags). Text content is usually (though not always) part of the HTML document itself. Media elements such as images, audio, and video are stored in separate files, and are included in the page (when displayed by the web browser) via the use of special tags that reference the relevant media files.
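
As a rough illustration of how content and markup fit together, the sketch below shows a minimal HTML5 page. The text content sits inside the document itself, while the image and video are pulled in from separate files by tags that reference them. The file names holiday_photo.jpg and holiday_clip.mp4 are placeholders invented for this example, and the surrounding html, head and body elements are simply the usual skeleton of a page.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>A minimal page</title>
  </head>
  <body>
    <!-- Text content is part of the HTML document itself -->
    <p>This paragraph is stored in the HTML file.</p>

    <!-- Media content lives in separate files (hypothetical names) -->
    <img src="holiday_photo.jpg" alt="A beach at sunset">
    <video src="holiday_clip.mp4" controls></video>
  </body>
</html>
```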

The most common kind of HTML tag actually consists of a pair of tags - an opening tag and a closing tag. The element to which the tag applies is enclosed between them. The opening tag tells the browser where a particular element begins, and the closing tag tells the browser where it ends. It is up to the browser implementation to interpret the meaning of each tag, and to render the enclosed content accordingly.

Note that there may be slight variations in the way that different browser implementations render (i.e. display) an HTML element. Generally speaking, however, your pages should look more or less the same in Firefox, Chrome, Internet Explorer, or any of the other popular web browsers.

Here is an example of the HTML code used to format a paragraph of text:

  <p>Now we are engaged in a great civil war, testing whether that nation, or any nation so conceived and so dedicated, can long endure. We are met on a great battle-field of that war. We have come to dedicate a portion of that field, as a final resting place for those who here gave their lives that that nation might live. It is altogether fitting and proper that we should do this.</p>

The text, which is taken from Abraham Lincoln's famous Gettysburg Address, is enclosed between an opening paragraph tag (<p>) and a closing paragraph tag (</p>). The only difference between the opening and closing tags is that the closing tag has a forward slash (or solidus) in front of the letter p (which stands here for paragraph).

There is a special image tag (<img>) for displaying images in web documents. Image tags are among the few HTML tags that do not have a closing tag as such, and for that reason they are sometimes referred to as empty (or void) tags. The image file to which the image tag refers is named in the tag's src attribute, as in the following example:

<img src="vintage_rolls_royce.jpg" alt="Vintage Rolls Royce">

There are a couple of things to note here. First, we have included an alt attribute that specifies the (alternative) text that will be displayed in the event that the image itself cannot be displayed. Browsers will display the image without it, but the HTML specification expects an alt attribute in almost all cases, and validators report its absence as an error. If the user cannot see the image - maybe they have a slow Internet connection, or they are sight impaired and use a screen reader - then they will at least receive some information about the image.

The second thing to note is that, in previous versions of these pages, we inserted a space followed by a forward slash (solidus) immediately before the closing angle bracket. This is an XHTML convention which we chose to adopt for HTML5 tags that do not require a closing tag - sometimes known as self-closing tags. XHTML, which succeeded HTML4 (and preceded HTML5), was syntactically much stricter than other versions of HTML and required all tags (including empty tags) to be closed.
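
For comparison, the XHTML-style form of the image tag shown earlier might look like the line below. This is a sketch of the convention only; the sole difference is the space and forward slash before the closing angle bracket.

```html
<!-- XHTML-style void element: space and solidus before the closing bracket -->
<img src="vintage_rolls_royce.jpg" alt="Vintage Rolls Royce" />
```

An HTML5 browser should treat this in the same way as the version without the trailing slash.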

Closing empty tags is no longer necessary in HTML5. Closing an empty tag will not prevent a web page from displaying properly, however, and can actually avoid problems with some HTML parsing software. The World Wide Web Consortium (W3C) has this to say on the subject:

"The term void elements is used to designate elements that must be empty . . . In HTML, these elements have a start tag only. The self-closing tag syntax may be used."

The 'self-closing tag syntax' referred to is the use of a space followed by a forward slash immediately preceding the closing angle bracket. You can make up your own mind as to whether you wish to use this convention, although going forward we would recommend not using it because the W3C HTML validator now issues the following warning when it encounters an empty tag that has been closed in this way:

"Trailing slash on void elements has no effect and interacts badly with unquoted attribute values."

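The interaction that the validator warns about can be sketched as follows. The file name photo.jpg is a placeholder chosen for this example. When an attribute value is not enclosed in quotes, an HTML parser treats a slash that immediately follows it as part of the value rather than as the end of the tag.

```html
<!-- Quoted value: the trailing slash is ignored, src is "photo.jpg" -->
<img alt="A photo" src="photo.jpg"/>

<!-- Unquoted value, no space: the slash becomes part of the value,
     so src is read as "photo.jpg/" and the image will probably fail to load -->
<img alt="A photo" src=photo.jpg/>

<!-- No trailing slash: nothing to misread, src is "photo.jpg" -->
<img alt="A photo" src=photo.jpg>
```

Quoting every attribute value, as the examples in these pages do, avoids the problem whether or not the trailing slash is used.
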
One convention we have chosen to adopt from XHTML for these pages is that all HTML5 tag names and attribute names will be written in lower case. According to the W3C HTML5 specification, HTML tag names and attribute names are case-insensitive; they could even, if you so wished, be written in a mix of upper and lower-case characters. The choice is entirely yours. We recommend, however, that you choose a convention and stick to it.
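
To illustrate, the three image tags below differ only in the case of their tag and attribute names, and a browser should treat them identically. Note that this applies to the names only. Attribute values such as the file name are left exactly as written, and may well be case-sensitive depending on the web server.

```html
<!-- Lower case: the convention used in these pages -->
<img src="vintage_rolls_royce.jpg" alt="Vintage Rolls Royce">

<!-- Upper case: equally valid HTML -->
<IMG SRC="vintage_rolls_royce.jpg" ALT="Vintage Rolls Royce">

<!-- Mixed case: permitted, but harder to read -->
<Img Src="vintage_rolls_royce.jpg" Alt="Vintage Rolls Royce">
```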

In this section we will be showing you how to create web pages using HTML5. As well as the page elements already mentioned, we will show you how to add lists, tables and forms to your pages, how to display video and other multimedia content, how to work with different colours and fonts, how to add structure to your pages, and how to create hypertext links. We will also be looking at how you can make life easier for yourself using server-side includes.

It probably goes without saying that not everything you want to do with your web pages can be achieved using HTML alone. We will be looking at the use of cascading style sheets and scripting languages to enhance the functionality and interactivity of your pages elsewhere. For the moment, it is worth remembering that the whole point of producing a web page is to present information of some kind to your intended audience.

Your web pages should be informative and contain useful content. Whether you are selling goods and services or simply sharing your ideas and experiences with a global audience, you will want to ensure that your pages are usable and easily accessible. You can go a long way towards achieving this goal by implementing a logical navigation scheme, being consistent in the way you structure your pages, and choosing a suitable colour scheme.

It is a good idea to familiarise yourself with what the World Wide Web Consortium has to say about standards. The information published by the W3C provides web developers with a set of guidelines designed to promote good practice in the design and implementation of web pages and web applications. Adhering to standards helps to ensure that your pages are easily found and indexed by search engines, that they display correctly on a broad range of devices, and that they are accessible to as many people as possible, including people with disabilities.