
(Republished with permission from the author. Originally
published on the Web site of Marisol
Productions)
Web Site Indexing
By Marilyn Joyce Rowland
Prepared
for the Web Indexing Workshop, presented at the June 1999 Annual
Conference of the American Society of Indexers and updated July 2000.
Introduction:
Web site indexing involves two basic skills: indexing and Web
site design, and an intangible one: the willingness and resourcefulness
to invent new ways to present information to people. Web site
indexes can be similar to book indexes or to tables of contents,
they may incorporate features of search engines, or they can be
whole new approaches to that basic task of an index: connecting
people with the information they are seeking, and, perhaps, with
information they don't yet know is there.
This article provides the basic information you will need to
construct a basic Web site index. But, as you do so, remember
that there are few rules in Web site indexing; that the "best"
form of a Web site index is highly dependent on the type of Web
site you are indexing and the needs of the audience you are creating
the index for.
Remember that the online user generally wants answers in a hurry.
While surfing the Web can be fun and diverting, index users are
generally looking for specific information, and your job is to
help them find it. Experiment with new ways of quickly connecting
people to information, and look at Web site indexing as adding
a whole new dimension to the puzzle and challenge of creating
useful indexes for the needs of specific audiences.
1. Basic HTML:
Unless you are working with a Web site design team and are only
expected to come up with index terms, you’ll need to know
basic HTML commands before you start indexing. Sure, you can use
an HTML editor or fancy What-You-See-Is-What-You-Get Web site
design program, but you should have an understanding of what HTML
is and how it works, as well as a knowledge of how to construct
an index in plain HTML. This will help you format your index page
more precisely, and it will be invaluable in helping you to identify
and fix HTML problems.
As you become more involved with Web site indexing, you may learn
other techniques that will help you devise new and wonderful Web
site indexes: JavaScript and other scripting languages, CGI, Java,
DHTML, XML, Active Server Pages, and a whole range of yet-to-be-invented
Web technologies. We won’t deal with any of these today,
but, should you decide to learn more about Web design options,
keep Web site indexing in mind, and look for ways these new technologies
might make your index more useful, more clear, or even more entertaining
to the user.
HTML stands for HyperText Markup Language. It is simply a means
of coding text files so that they can be read by any computer.
HTML allows you to format the text (add bold, italics, and different
fonts, paragraph formatting and indentation), and add graphics,
sound, video, animation, and hyperlinks that allow users to instantly
jump from one spot in your page to another in the same page, or
(somewhat less than instantly, but still pretty quickly) from
another spot in your page to a relevant Web site on the other
side of the world. Links are one of the key aspects of HTML; they
allow the creation of the World Wide Web, and they are
crucial to Web site indexing, which, in its simplest form, is
a list of links.
There are a multitude of excellent books on how to write HTML.
There are a plethora of Web sites on various aspects of writing
HTML: articles, tutorials, lists of tips, glossaries, and encyclopedic
approaches. Think of this brief article as an introduction to
these books and articles, not as a substitute for them. Keep in
mind that the Web is constantly changing, that new technologies
are constantly being developed, that you should try to learn as
much as you can about HTML, but, also, that you can do a lot with
very simple HTML.
a. Simple HTML Concepts:
HTML is written using tags, which are commands written
inside angle brackets: < and >.
Most tags are written as pairs, with opening tags and closing
tags surrounding the affected text. Opening and closing tags are
identical except that the closing tag contains a forward slash
symbol (/). In <B>American
Society of Indexers</B>, <B>
is the opening tag and </B>
is the closing tag. The tag is used to create boldface text. When
you look at <B> American Society
of Indexers </B> in a browser,
you will not see the tags, but you will see the formatting: American
Society of Indexers.
The "B" can be typed in upper case or lower case; it makes no
difference. Many people prefer to use upper case because the codes
stand out more clearly from the text this way. Others prefer to
use lower case. If you use an HTML editor, you may find that the
editor automatically changes all tags to upper case. It’s
best to be flexible!
Many tags have attributes and values that provide
options for the text. If you want to make text appear red, for
instance, you can use the <FONT>
tag, the COLOR attribute, and the #FF0000 hexadecimal value. Values
are inserted within quotation marks. For instance, the following
code
<FONT COLOR="#FF0000">Web Site Indexing</FONT>
will create red text. Spacing between lines of text is handled
with special tags, as browsers ignore extra spaces. <BR>
will produce a line break; <P>
will produce a paragraph break. There are many additional ways
of controlling spacing.
Links are created with the <A>
tag and HREF attribute. In this case,
the value is the URL of the page you want to link to. Thus, we can create a link to the publications page
on the ASI Web site from the words "American Society of Indexers"
by typing:
<A HREF="http://www.asindexing.org/asipub.htm">American
Society of Indexers</A>
b. Creating a Simple Web Page:
Write your HTML page in ASCII format and save it with an .htm
extension. You can use any word processor or HTML editor.
All HTML codes are placed within <HTML>
tags, so start by typing <HTML>
at the top of your page and </HTML>
a couple of lines down. Place all other tags between these two
tags.
Web pages generally have two sections: a HEAD section and BODY
section. Information about the Web page goes into the HEAD section,
including the title and any information you may wish to include
in META tags (such as keywords to better enable search engines
to find your site), while the BODY section contains the bulk of
the page: the text, graphics, links, and formatting that users
will see in their browsers.
For now, we will put only the title of the Web page in the HEAD
section. The title appears on the top of the browser, not in the
page itself. To create the HEAD section, and insert a title, type
<HEAD>
<TITLE>American Society of
Indexers Web Workshop Web Page--Web Site Index</TITLE>
</HEAD>
It is usually desirable to use the name of your Web
site as the title for all your pages, along with subtitles for specific
pages, as above. This will make your page more identifiable to both
users and search engines. Now, create the BODY by typing
<BODY>
</BODY>
All the rest of your HTML coding will be placed between
the opening BODY tag and the closing BODY tag. First, create
a title that will be seen on the page itself. HTML offers six levels of
headers (conveniently named H1, H2, H3, H4, H5,
and H6) that can be used to organize your
page. H1 is the largest heading and H6 is the smallest. These tags affect
size and placement of the text you place within them. Experiment with
them. You will rarely use all of them in a single document. You don’t
need to use any of them, as there are other ways to format a heading,
but let’s start by creating an H1 tag and an introductory paragraph.
<BODY>
<H1>Web Site Index</H1>
Welcome to my Web Site Index. Here, you’ll find links to
all the information contained in this Web site.
</BODY>
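Putting the pieces so far together, the complete file might look something like the sketch below. The META keywords line only illustrates the kind of information mentioned earlier; the keyword values are assumptions, not part of the workshop page.

```html
<HTML>
<HEAD>
<TITLE>American Society of Indexers Web Workshop Web Page--Web Site Index</TITLE>
<!-- Optional: keywords for search engines (example values only) -->
<META NAME="keywords" CONTENT="Web site indexing, HTML, indexes">
</HEAD>
<BODY>
<H1>Web Site Index</H1>
Welcome to my Web Site Index. Here, you'll find links to
all the information contained in this Web site.
</BODY>
</HTML>
```

Save this as an .htm file and open it in a browser: the title should appear at the top of the browser window, and the heading and welcome text in the page itself.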
Now, let’s add the A entries in our index:
<BODY>
<H1>Web Site Index</H1>
Welcome to my Web Site Index. Here, you’ll find links
to all the information contained in this Web site.
A
absolute URLs
accents
alignment
  of images
  of text
alphabetic order
American Society of Indexers
anchors
  creating
  links to
    in the same page
    in a different page
ASCII format
asterisks, in frames
attributes
</BODY>
Unless you specify paragraphs or line breaks, browsers will run
all this text together. Instead of a nice index, you’ll
see something like this when you look at your page in a browser:
Web Site
Index
Welcome to my Web Site Index. Here, you’ll find links to
all the information contained in this Web site. A absolute URLs
accents alignment of images of text alphabetic order American
Society of Indexers anchors creating links to in the same page
in a different page ASCII format asterisks, in frames attributes
One way of specifying paragraphs is with paragraph tags.
Simply place a <P> tag before
each line and a </P> tag at the
end of each line. (Actually, the closing P tag is not necessary,
but some people prefer to use both the opening and the closing tags.)
Another way is to use the line break tag, <BR>,
where you want the line break to occur. There is no closing tag.
The <P> tag creates spacing
after the text. The <BR> tag
does not. Let's use the <P> tag
to set off the letter A.
<P>A</P>
We could use the <BR> tag to separate
the index entries, but how do we deal with the indented subentries
and sub-subentries? HTML provides many options, but some are easier
than others to implement, and not all are viewed the same way by
all browsers. You may want to experiment with various options, including
fancy ones like subentries that are hidden until the user places
his or her mouse over the main entry. In the next section, we’ll
discuss one widely used method, the use of definition lists.
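To see the limitation, here is a rough sketch of the first few entries separated with <BR> tags alone. Each entry lands on its own line, but the subentries under "alignment" start at the left margin just like the main entries, so the hierarchy is lost.

```html
<P>A</P>
absolute URLs<BR>
accents<BR>
alignment<BR>
<!-- These two are subentries, but nothing marks them as indented -->
of images<BR>
of text<BR>
alphabetic order<BR>
```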
2. Formatting an Indented Index:
Definition lists are used to create glossary lists, or any other
kind of list in which a word or phrase is paired with a longer
description. The description is indented and placed on the line
below the word or phrase that is defined. In our index example,
the word or phrase is the main entry, and the description or definition
is the subentry.
Definition lists are created with <DL>,
<DT> and <DD>
tags. DL stands for definition list;
DT stands for definition term, and
DD stands for definition description.
First, enclose the whole index section in
<DL></DL> tags. Then add <DT>
tags around main headings and <DD>
tags around subentries.
<DL>
<DT>absolute URLs</DT>
<DT>accents</DT>
<DT>alignment</DT>
<DD>of images</DD>
<DD>of text</DD>
<DT>alphabetic order</DT>
<DT>American Society of Indexers</DT>
<DT>anchors</DT>
<DD>creating</DD>
<DD>links to</DD>
in the same page
in a different page
<DT>ASCII format</DT>
<DT>asterisks, in frames</DT>
<DT>attributes</DT>
</DL>
This takes care of all but the sub-subentries. To format
sub-subentries, you’ll need to use nested <DL>
tags. Add another set of <DL>
tags around the material to be further indented. Then add additional
<DD> tags, as needed. The anchors
entry and associated subentries and sub-subentries will look like
this:
<DT>anchors</DT>
<DD>creating</DD>
<DD>links to</DD>
<DL>
<DD>in the same page</DD>
<DD>in a different page</DD>
</DL>
The whole A section of the index will look like this:
<DL>
<DT>absolute URLs</DT>
<DT>accents</DT>
<DT>alignment</DT>
<DD>of images</DD>
<DD>of text</DD>
<DT>alphabetic order</DT>
<DT>American Society of Indexers</DT>
<DT>anchors</DT>
<DD>creating</DD>
<DD>links to</DD>
<DL>
<DD>in the same page</DD>
<DD>in a different page</DD>
</DL>
<DT>ASCII format</DT>
<DT>asterisks, in frames</DT>
<DT>attributes</DT>
</DL>
It can get confusing coding definition lists. Be sure
to test your HTML carefully by looking at it through more than one
browser (at least Netscape and Internet Explorer) as you go along
to make sure you are achieving the effect you are seeking.
Experiment with various methods of coding an index, using definition
lists as well as other methods to find a method that works for
you and the particular project you are working on.
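One variation worth trying, offered here as a sketch rather than as the method used above: place the nested <DL> inside the <DD> of the subentry it belongs to, instead of directly inside the outer list. The HTML 4 specification lists only <DT> and <DD> as direct contents of a <DL>, so this arrangement should be less likely to draw complaints from an HTML validator, and browsers should indent the sub-subentries one step further than the subentries.

```html
<DL>
<DT>anchors</DT>
<DD>creating</DD>
<!-- The nested list sits inside the "links to" subentry -->
<DD>links to
  <DL>
  <DD>in the same page</DD>
  <DD>in a different page</DD>
  </DL>
</DD>
</DL>
```

As with any variation, check the result in more than one browser before settling on it.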
3. Adding Links and Targets:
Now you have the index formatted so that it looks like an index.
It’s time to add links so that it actually functions as
an index.
First, you have to decide whether you want your index entry to
link to a specific point in a page, or just to the top of the
page, or whether you want to use some combination of specific
and general references. Your decision may be based on several
factors, including how information-packed the pages are, whether
your entry is a general concept that is discussed throughout the
Web page, or whether it is a very specific mention that occurs
only once and may be hard to find; whether you have access to
the pages that are referenced, or whether you have control only
over the index document itself, and, to some degree, your personal
preference.
Let’s assume your index will include both types of links.
We’ll start with general links to the top of other Web pages.
As noted above, links are created with the <A> tag and HREF
attribute. The value is the complete URL of the Web page you want
to link to, and it is placed inside quotation marks.
<A HREF="http://www.asindexing.org/asipub.htm">American
Society of Indexers</A>
When, as in our sample Web site index, the pages referred to in the index
are on the same site and in the same directory as the index page
itself, our job is simplified. We need only include the name of
the Web page itself:
<A HREF="asipub.htm">American Society of Indexers</A>
Let’s assume that, in our Web site, absolute URLs are discussed on a Web
page named urls.htm; accents are discussed on a page named fonts.htm;
alignment in general is discussed on page align.htm; alignment of
images is discussed on page image.htm; and alignment of text is
discussed on page text.htm. We need to add an appropriate <A
HREF> tag to each entry and subentry in our index:
<DL>
<DT><A HREF="urls.htm">absolute
URLs</A></DT>
<DT><A HREF="fonts.htm">accents</A></DT>
<DT><A HREF="align.htm">alignment</A></DT>
<DD><A HREF="image.htm">of
images</A></DD>
<DD><A HREF="text.htm">of
text</A></DD>
….
</DL>
When your page is viewed in a browser, your links will
be underlined (unless you specify another approach):
- absolute URLs
- accents
- alignment
  - of images
  - of text
There are many factors that may affect your index formatting
choices. If, for instance, alignment of text is discussed on several
pages, you can devise a separate subentry for each occurrence:
- alignment
  - of text in paragraphs
  - of text in tables
  - of text in titles
Or you can devise a method of indicating pages, perhaps by
the title of the page referenced, or through a numerical system:
- alignment
  - of text paragraphs, tables, titles
- alignment
  - of text, Formatting with Tables, Title Style,
    Working with Paragraphs
- alignment
  - of text, ref. 1, ref. 2, ref. 3
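As an illustration of the numerical approach, the entry could be coded roughly as follows. The file names table.htm and title.htm are hypothetical stand-ins for the Formatting with Tables and Title Style pages; parag.htm is the name given to the Working with Paragraphs page later in this section.

```html
<DL>
<DT>alignment</DT>
<!-- One subentry, three numbered locators, each linked to its own page -->
<DD>of text,
  <A HREF="table.htm">ref. 1</A>,
  <A HREF="title.htm">ref. 2</A>,
  <A HREF="parag.htm">ref. 3</A>
</DD>
</DL>
```

Numbered locators keep the entry short, but they tell the user nothing about where each link leads; page titles are longer but more informative.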
Suppose the Working with Paragraphs page is very long and
you want to make sure that the user can easily find the information
on text alignment within the page. You may want to link the index
term directly to the discussion of text alignment in the Working
with Paragraphs page. You can do this by creating an anchor in the
Working with Paragraphs page and specifying a link to that anchor
when you create your <A HREF>
link. Let’s say the Working with Paragraphs
page is called parag.htm and that
the paragraph relating to text alignment starts off like this:
To begin a new paragraph, begin
with "<P". If you want to align text in the paragraph to the
center or to the right, type ALIGN=CENTER or ALIGN=RIGHT. Then type
the closing angle bracket, ">".
You want the index user to be transported to the beginning of this paragraph,
so you place an anchor (sometimes called a target)
there. To do this, use the <A>
tag with the NAME attribute.
<A NAME="textalign">To
begin a new paragraph, begin with "&lt;P". If you want to align
text in the paragraph to the center or to the right, type ALIGN=CENTER
or ALIGN=RIGHT. Then type the closing angle bracket, "&gt;".</A>
(Because this paragraph displays angle brackets as text, they are written
as &lt; and &gt; in the HTML so that the browser does not read them as tags.)
Now write a link tag that includes the anchor by adding a # sign and
the name of the anchor:
alignment
<A HREF="parag.htm#textalign">of
text</A>
4. Index Navigation Systems:
Your Web site index may fit on one page or it may take up many
pages, depending on the nature and size of the Web site you are
indexing and the comprehensiveness of your index. In any case,
you will need some sort of navigation system to transport users
to specific parts of your index. Assuming your index is alphabetical,
this means you will probably want to create an alphabet across
the top of your index page with links from each letter to the
beginning of each alphabetical section of your index.
The most common method is to set up an alphabetical menu across
the top of your page, using a | symbol between each letter:
A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P
| Q | R | S | T | U | V | W | X | Y | Z
Add a target to each alphabetical letter heading in your index
and create a link from each alphabetical letter in the menu to
the targets in the body of your index. If your index is spread
over numerous Web pages, use appropriate links and targets.
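A minimal sketch of such a menu on a single-page index follows. The anchor names a, b, and c are assumptions chosen for the example; any names will do as long as each link and its target match.

```html
<!-- Menu at the top of the index page -->
<P>
<A HREF="#a">A</A> |
<A HREF="#b">B</A> |
<A HREF="#c">C</A>
<!-- ...continue through Z... -->
</P>

<!-- Letter headings in the body of the index, each with a target -->
<P><A NAME="a">A</A></P>
<DL>
<DT><A HREF="urls.htm">absolute URLs</A></DT>
</DL>

<P><A NAME="b">B</A></P>
```

If the index is split across several pages, the menu link would also name the page that holds the target, for example HREF="index2.htm#n", where index2.htm is a hypothetical file name.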
One problem with this type of menu is that the user may find
herself continually having to scroll back to the beginning of
the page to find the menu. You can resolve that in several ways,
including placing the menu intermittently throughout the Web page,
before each letter header, for instance; or placing the index
navigation letters (or the index itself, if it is small) in a
frame so that it is always at the top (or side or bottom) available
to the user, whenever she needs it.
Some Web designers and users dislike frames, but they can be
very useful as a design tool, and they are especially useful in
creating an index that becomes an integral part of the navigation
system of the Web site.
The following section explains how to create a simple frames-based
Web site index.
5. Frames
With frames, you can divide your page into two areas, one for
the index and one for the indexing navigation letters (A, B, C,
etc.). When you click on the letter "P," you'll find yourself
in the "P" section of the index. Once you find the term you're
looking for, let's say periodical indexing, you can click on the
locator, and the area of the page that once held the index now
holds the information on periodical indexing. The index letters
at the top of the page have remained in place though, and you
can get back to the index by clicking on them.
The first step in creating a frames page is to create a frames
document that defines how the page is to be divided. You can make
horizontal or vertical divisions, and you can nest frames within
each other. We will discuss only a simple frame system that divides
the page into a main text area and a narrow band at the top of
the page containing the index letters.
First, create your Web site index and name it siteindex.htm,
and create your alphabet letters page, and name it letters.htm.
(Or whatever names you choose.)
Next, open a blank page and give it a name. Let's call it Websiteindex.htm.
A frames page begins with an <HTML> tag, followed by a <frameset>
tag rather than by a <body>
tag. The <frameset>
tag allows you to size the frames by percentage of the screen
or by absolute value (number of pixels). When you want to create
a narrow band of a specific size to hold the alphabet letters,
sizing by pixels makes more sense.
<frameset rows="10,*">
This tag creates two horizontal frames, one 10 pixels high at the
top of your page. The asterisk means that the second frame takes
up the rest of the space on your page.
To designate frames by percentage, simply add a % sign.
Now we have to describe the two frames we created. Each requires
a <frame> tag with two attributes, the source of the frame
and the name of the frame. The source tells the browser which
document to pull into the frame, while the name is important when
you start linking between frames.
Frames are always described from top to bottom (or left to right,
if you are using vertical frames), so the first frame refers to
the alphabet letters, letters.htm. The second frame will contain
your index, siteindex.htm. We describe the frames with <frame>
tags:
<frame src="letters.htm" name="frame1">
<frame src="siteindex.htm" name="frame2">
You can control frame margins with marginwidth and marginheight
attributes; whether or not the frame has a scrollbar with the
scrolling attribute, and whether or not the frame can be resized
with the noresize attribute.
To eliminate scrolling and resizing of the letters frame, we
add these attributes:
<frame src="letters.htm" name="frame1" scrolling="no" noresize>
Now, when you load Websiteindex.htm, you should see a narrow
frame at the top, containing the letters and a larger area below
it, containing the index.
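Assembled, the frames document described above might look like the sketch below. The title text and the <noframes> fallback are additions for illustration, not part of the steps above; the fallback is shown only by browsers that cannot display frames.

```html
<html>
<head>
<title>Web Site Index</title>
</head>
<!-- Top frame: 10 pixels high; bottom frame: the rest of the window -->
<frameset rows="10,*">
  <frame src="letters.htm" name="frame1" scrolling="no" noresize>
  <frame src="siteindex.htm" name="frame2">
  <!-- Fallback for browsers without frame support (illustrative addition) -->
  <noframes>
  <body>
  <a href="siteindex.htm">Go to the Web site index</a>
  </body>
  </noframes>
</frameset>
</html>
```

If the letters do not fit in the top frame, try a larger pixel value in the rows attribute and check the result in more than one browser.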
The last step in the process is to add links to your letters.htm
page so that the viewer will see the "M" section of the index
when he or she clicks on the M. Next, add anchors at the beginning of
each letter section in the index so that you can provide links
to the letters, not just to the top of the page. Finally, create
links designating frame2 as the target. Below is an example for
letter A.
<a href="siteindex.htm#a" target="frame2">A</a>
If you don't designate a target, the index will open up in your
narrow letters frame. The target tells the browser to load the
index in the larger frame. The links in the index itself do not
have targets, so, when you click on a locator in the index, the
page will open in the larger frame.
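If every link in letters.htm should open in the index frame, HTML 4 also allows a default target to be set once with a <base> tag in the HEAD of that page, so the target attribute need not be repeated on each link. This is an alternative to the per-link approach described above, sketched here for the first three letters; the title text is an assumption.

```html
<html>
<head>
<title>Index letters</title>
<!-- Default target for every link on this page -->
<base target="frame2">
</head>
<body>
<a href="siteindex.htm#a">A</a> |
<a href="siteindex.htm#b">B</a> |
<a href="siteindex.htm#c">C</a>
<!-- ...continue through Z... -->
</body>
</html>
```

A link that carries its own target attribute still overrides the default, so individual letters can be sent elsewhere if needed.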
There are many ways to customize frames so that they will do
exactly what you want, and there are many online tutorials and
books that can help you. After you have created a simple frame,
such as this one, experiment with some different types of frames
for different purposes.
6. Index maintenance and troubleshooting
Chances are you’ll make a lot of mistakes in HTML when
you first start out—and for some time after that.
Check your work in a browser to make sure all the links work.
Then check it in a number of different browsers (at least the
most popular ones) to make sure it looks the way you want it to.
It is also very helpful to have someone else look at the site.
Don't forget to update your index when you make changes to your
Web site.
There are a myriad of useful programs available to help you in
all aspects of Web site design. You may want to use several programs,
to take advantage of their different features. You can use some
of these programs without any understanding of HTML at all,
but it is best to create a few pages using "raw" HTML. Your understanding
of the codes and what they do and the process of HTML will help
greatly when you run into problems--and you will. Fortunately,
however, there is no lack of sources of help: books, magazines,
Web sites, and trusted friends.
7. Web Indexing Programs:
There are several Web indexing programs on the market that
can alleviate most of the tedium of writing HTML by hand. Evaluate
the programs carefully to make sure you are buying the one that
best suits your needs. It is possible to have a program modified
to suit your specific needs, too. The client may be willing to
pay for this additional programming, especially if it will help
you make the Web index more accurate and speed up the process
of Web indexing.
Two programs you may wish to look into are:
HTML Indexer™ 3.0 (for Windows)
David M. Brown
Brown Inc.
7417 SW B-H Hwy., #524
Portland, OR 97225-2169
E-Mail: dmbrown@brown-inc.com
URL: http://www.html-indexer.com/
HTML/Prep
David K. Ream
Leverage Technologies, Inc.
9519 Greystone Parkway
Cleveland, OH 44141-2939
Tel: (888)-838-1203
Fax: (440)-838-1203
Email: info@LevTechInc.com
URL: http://LevTechInc.com
For More Information:
There are a multitude of resources available in bookstores and
on the Web for beginning and advancing Web site designers and
indexers. Here are some of them.
Books
Web site design books are out of date almost as soon as they
are published, so I highly recommend browsing in your local bookstore
or your favorite cyberspace bookstore for current and additional
selections.
Bryant, Stephanie Cottrell, Teach Yourself HTML 4 (Foster
City, CA: IDG Press, 1999)
Castro, Elizabeth, HTML 4 for the World Wide Web (Berkeley,
CA: Peachpit Press, 1998)
Web Sites
These are some of the larger and longer established sites devoted
to aiding the new HTML user. Each has links to lots more general
and specific sites, tutorials, and articles on HTML and Web site
design. For even more options, use your favorite search engine
to search for Web site design, HTML tutorials, or related terms.
Art and the Zen of Web Sites
Lots of good advice on goals, navigation, generating repeat business,
color, image maps, page titles, images, loading time, animated
images, Java, frames, text, and advertising, told with humor.
A Beginner's Guide to HTML
Good introduction, includes links to other tutorials, from the
National Center for Supercomputing Applications, University of
Illinois at Urbana-Champaign
HTML Writer's Guild
The WWW Development Resources contains a large and useful collection of
links.
Webreference: Developer's Corner
Many links to tutorials and information on all aspects of Web
page development
Web Developers Virtual Library
Many links to topics and tutorials
WebMonkey
Has lots of good, easy-to-understand tutorials, and WebMonkey for Kids,
if you want a nice simple introduction to Web site design.
For more information, please e-mail Marilyn
Rowland.
© 2000 Marilyn Joyce Rowland