The Articles and Conventions of CoOL

This Will Be on the Exam

If you're reading this, it's probably because you either have pages in CoOL or are preparing them. Please stow this away for future reference and check back regularly, as this document is amended frequently. I hope this helps. Please let me know if anything is unclear.

I. FILE NAMING CONVENTIONS

Please adhere religiously to the following conventions.

A. General conventions

1. All file names, including directory names

must be all lowercase
must begin with a letter or number
File names must be composed of letters, numerals, underscores and hyphens, plus the single dot before the extension (see below), but no other character.
Directory names must be composed of letters, numerals and hyphens but no other character.
Directory names must not contain periods (dots).
NB spaces and '8 bit' characters are especially forbidden. If you prepare your docs on a Mac, please pay special attention to this, as it's very hard for me to fix things once they reach my machines. Be especially conscious of this if you use Web authoring tools like PageMill or BeyondPress, as they will mangle URLs quite happily, turning those spaces into %20.
There must be a file extension (normally .html but see below). There must be one and only one dot (period) in the file name and it must immediately precede the extension.

Good
solvents.html (a file)
solvents (a directory)
Bad
Solvents.html (uppercase)
solvents.special.html (two dots)
solvents..html (two dots)
solvents.html.htm (double extension)
solvents special.html (space in name)
solvents%20.html (illegal %-sign in name)

Long names are acceptable for both files and directories, but very long names are deprecated (among other things, they make URLs very long and hard to manage).

B. File extensions

All HTML documents must have the extension .html (.htm is acceptable, but .html is preferred).

Other file extensions
.gif
.jpg (not .jpeg)
.tif (not .tiff)
.pdf
.png

C. Path separators

Path separators (which are '/' in Unix, '\' in DOS, and something more baroque in Mac OS) must be Unix style, no matter what operating system you work on. Your browser should handle '/' just fine, so you shouldn't have any trouble editing.

D. Default/Index files

Each directory in your web should (with some possible exceptions) contain a file called "index.html" (not "default.html", "main.html", or "home.html"). This is the "main" document of that directory and is returned when the browser asks for the directory without an additional file name, e.g. when you ask for http://palimpsest.stanford.edu/byorg/abbey/ the server sends http://palimpsest.stanford.edu/byorg/abbey/index.html. (Also, my web maintenance software looks for those files, so you must name them index.html.)

E. Reserved file names

The following names are magical. They should not be used.

File or directory names: garbag* (e.g. garbage, garbage2); trash*; bugger; *.old; old*; obosolet*
Directory names: obs (You may actually use this name for hiding old documents: if you put something in a dir called obs, it will not be served or indexed.)

The following names are reserved for use as "index" files

index.htm, index.html, index.shtml
default.html, main.html, and home.html (or .htm or .shtml) are reserved, and must not be used in CoOL.

Even though CoOL uses "index.html", the other forms are off-limits. .shtml indicates that the file will be subject to server-parsed-html (this is a Netscape term for what is more commonly referred to as Server-Side Includes). Please consult with me before using server-parsed-html, as there are security implications to deal with.

F. Naming newsletter or journal articles

Please contact me concerning approaches to naming files for serial publications, such as newsletters. Depending on your publication frequency, I will assign you a filename format, typically based on a combination of a two-letter code representing your publication, and the volume, issue, and article numbers. If you are working on Xylophone Conservation Monthly, I might assign the code "xc" to your newsletter. Then, if you were working on volume 3, number 5, which contains 5 articles, the articles would be named (in the order they appear in the newsletter):

xc03-501.html
xc03-502.html
...
xc03-505.html

G. Naming related images

Please try to name images using part of the HTML file's name with incremental numbers/letters. You can use XXXXt.gif to indicate a thumbnail associated with XXXX.gif.

For example, an article by Robertson with 103 images and 103 thumbnails might go like this:

rob.html
rob1at.gif
rob1a.gif
...
rob1zt.gif
rob1z.gif
rob2at.gif
rob2a.gif

If you are working on a serial (e.g. a newsletter), see section F above. Your images should be named by appending sequential letters to the base of your issue name. E.g., in the serial example above we could use:

xc03-5a.gif
xc03-5b.gif
...
xc03-5z.gif
xc03-5aa.jpg
xc03-5az.gif
....

If you happen to have, say, both .gif and .jpg files, please use a different base name for each (that is, do not use xc03-5a.gif and xc03-5a.jpg).

See individual articles of The Abbey Newsletter for examples.

A clickable thumbnail that links to the full-size image would then look like this:

    <p><a href="rob1z.gif"><img
    src="rob1zt.gif" width="302" height="156"
    border="0"></a></p>

II. HTML

CoOL uses a superset of HTML 3.2 (including frames and some relatively harmless browser-specific elements/attributes). If you use features that are not in HTML 2.0, or that are specific to a browser, think twice and hard, as the rendering on other browsers may not be appropriate. You may use valid HTML 4, but think thrice and hard. If you want to see the syntax for the "custom" version of HTML I use here (called HTMLX), you can check out
http://cool.conservation-us.org/admin_cool/dtdguides/htmlx/DTD-HOME.html. The DTD itself is at http://cool.conservation-us.org/admin_cool/dtdguides/dtds/htmlx/htmlx.dtd

For general guidance on HTML authoring (somewhat obsolete), and other tools, see http://www-sul.stanford.edu/tools/

For those who haven't read it, I strongly advise reading the tutorial HTML Basics, even though it is terribly outdated. Also recommended: A Gentle Introduction to SGML

You might also want to look at The Library of Congress World Wide Web Style Guide, which offers some very good guidelines for creating sane, maintainable, accessible web pages.

Using CSS style sheets is encouraged. Such style sheets must be external and referenced via a LINK element.
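
As a sketch of what that looks like in practice, a HEAD might reference an external style sheet as follows. The file name mystyle.css and the page title are hypothetical; the href is relative, in keeping with the URL rules later in this document.

```html
<head>
<title>Solvents</title>
<!-- mystyle.css is a hypothetical file name, given as a relative URL -->
<link rel="stylesheet" type="text/css" href="mystyle.css">
</head>
```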

A. HEADINGS

Please pay attention to the HTML spec with regard to H1..H6 (any version you use will have the same basic rules about headers):

Headings indicate a structural division, not a type style and they are intended to be used in order:

    <h1>The Story of Me</h1>
    <h2>Chapter 1. My early Life</h2>
    <h3>Adventures in Nursery School</h3>
    <h4>The first time the teacher bit me</h4>
    <h4>The last time the teacher bit me</h4>
    <h3>Graduation from Nursery School</h3>
    <h2>Chapter 2. Fear and Loathing in K-12</h2>
    <h3>Filing down the teacher's teeth</h3>
    <h4>Why I chose a bastard file</h4>
    <h2>Chapter 3</h2>

Note that H1 is the first heading, has the title of the page, and is used only once. Note also that there are no skips (not <h2>foo</h2><h4></h4>). A heading is intended to be a SECTION indicator: it identifies everything that follows it until the next heading of equal or higher level. Alles klar?

Exceptions: Sometimes you have repeating info (e.g. for a newsletter you might have a series title, volume, and issue number). You may use a lower-level heading (h2, h3, h4, etc.) before the h1.

Similarly, some repeating things--like button bars at the bottom of every page--don't really fit into the neat hierarchy described above, but are clearly "headings of some arbitrary lowness", so you may do something like:

    <h4 align="center"><a href="../">[Up]</a> <a href="xxx/">[Down]</a></h4>

See http://www-sul.stanford.edu/tools/tutorials/html2.0/headings.html for more detail.

B. Tables

1. Sample table

Here is a small table you can use as a model. Notice that the first th or td of each row contains a p. This will cause the rows to be rendered intelligibly by non-table-capable browsers (the columns will not line up properly of course).

    <table border="3" cols="3">
    <caption><b>Famous People</b></caption>

    <tr>
    <th><p>Author
    <th>Title
    <th>Color

    <tr>
    <td><p>Alan Burrows
    <td>Art Criticism from the Laboratory
    <td>Tan

    </table>

C. Centering text

The CENTER tag is deprecated in CoOL. In general, centering should be done by adding an align attribute on the start tag of a block element, such as H2, P etc. E.g.

    <p align="center">This would be centered</p>

or, as an attribute on the DIV element:

    <div align="center">
    <p>blah blah</p>
    <p>more blah</p>
    </div>

III. IMAGES

All images, no matter how trivial, should have an ALT attribute briefly identifying their content.

    <img src="foo.gif" alt="Diagram" ... >
    <img src="foo.gif" alt="Graph" ...... >
    <img src="foo.gif" alt="Photomacrograph" ... >
    <img src="foo.gif" alt="Figure 1. The painting after treatment" ... >
    <img src="foo.gif" alt="Dingbat" ... >
    <img src="foo.gif" alt="" ... >

The last example, an empty ALT attribute, should be used only when the image is purely ornamental.

In the past I've asked that inline images for CoOL be limited to GIFs and that JPEGs be restricted to links. Times change, and now all major browsers support inline JPEGs (and, as the FAQ points out, there's a good library available to browser authors, so there's no excuse for them not to support it anymore). So from now on, I'm encouraging you to use inline JPEGs where they are appropriate (i.e. photographs and other things with continuous-tone color or grayscale; line art is still best done with GIF, as is anything that needs a transparent border). PNG files are also acceptable, if you are willing to live with the fact that some older browsers won't handle them.

Just a reminder (from one who keeps forgetting): you shouldn't edit JPEGs, but should edit the (TIFF/Photoshop/whatever) source from which you prepared the JPEG, because JPEG is lossy and every re-save loses a little more, so you'll get cascading lossiness.

Also, whenever possible, when using any inline images, provide width and height attributes on the img tag. This lets the browser lay out the page before the image has arrived, and it really does make a big difference.

(btw, feel free to use interlaced gifs/progressive jpegs or not; whatever suits your pages).

III. URLs

All URLs referring to your own web, or to a web in CoOL, should be relative.

Good: <a href="../../img/foo.gif">
Bad: <a href="/byorg/myorg/img/foo.gif">; <a href="http://palimpsest.stanford.edu/byorg/myorg/img/foo.gif">

References to the default index file "index.html" must omit the name "index.html" and indicate the directory with ./ ../ ../.. etc. Thus:

Good: <a href="./">; <a href="../../">
Bad: <a href="./index.html">; <a href="../../index.html">

This is very important because we might switch your default index file from .html to .shtml. If the filename is omitted the server will handle the switch gracefully.

All URLs pointing to sites outside of CoOL must be fully qualified. They must begin with "http://" (or "ftp://", "gopher://", etc.).

Good: <a href="http://www.foobar.com">
Bad: <a href="www.foobar.com">

IV. Miscellany

Please put double quotes around *all* attribute values, even when not required by SGML. For example, even though

    <img src=foo.gif width=322 height=120>

is perfectly legal, I want

    <img src="foo.gif" width="322" height="120">

This does two things. First, the rules about when quotes can be omitted are a bit arcane, and as it is always legal to have the quotes, it is safer to use them by habit. Second, if I can count on the double quotes being there, it makes it easier for me to search through thousands of files.

In name anchors, such as

    <a name="foo"></a>

note several things:

The attribute value ("foo") must be in lowercase (it is not illegal HTML to have mixed-case attribute values, but it makes my troubleshooting more difficult).
The attribute value must consist only of letters, numbers, and underscores, and must begin with a letter.

Please keep attribute values short.

    <a name="chap1"></a>

is easier to deal with than

    <a name="chapterone_the_history_of_solvent_based_treatments_in_the_late_1980s"></a>

A. Formatting your html

1. Line length

Try to keep the line length down to 68 characters. Unix editors get nasty about long lines, and I sometimes have to edit things using brain-dead terminal emulators. If you have long lines, I end up having to download the files to work on them, which is extra work. (Remember that HTML doesn't, in most circumstances, care about extraneous whitespace, including vertical whitespace; the main exception is inside a PRE element.)

NB Please do not break up a URL, no matter how long it is, and make sure your editor doesn't try to wrap a URL.
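
As an illustration of both points, here is a sketch of a paragraph whose text is wrapped short of 68 characters while the URL is left in one piece, even though that one line runs long. The sentence and link text are invented for the example; the URL is the headings tutorial cited earlier.

```html
<p>Headings mark structure, not type style. For more detail, see
<a
href="http://www-sul.stanford.edu/tools/tutorials/html2.0/headings.html">the
headings tutorial</a>.</p>
```

The line breaks fall between words or between the tag name and its attribute, never inside the quoted URL.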

btw: I normally mung all incoming files so that tags and attribute names are lowercased, there are empty lines around paragraphs and headings etc, so if you get a file back from me, it will nearly always be changed.

If you edit a file that is on the server (e.g. if you view it with your browser and Save As Source), you must strip off anything following the HTML end-tag (</html>). The server adds a datestamp and some other stuff at the bottom of every page. If you don't strip it off, the next time you see it it will have 2, and the next time you repeat the process it will have 3 ...

<head><body>

Please make sure that every .html file

has both a </head> end tag and a <body> start tag
has an empty line between the <body> start tag and the rest of your document.

Neither of these rules is required by the HTML specification, but they are very important in CoOL; some of my software depends on this.
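
A minimal skeleton that satisfies both rules might look like this (the title and body text are placeholders, not required wording):

```html
<html>
<head>
<title>Solvents</title>
</head>
<body>

<h1>Solvents</h1>

<p>First paragraph of the document.</p>

</body>
</html>
```

Note the explicit </head> end tag, the <body> start tag on its own line, and the empty line that follows it.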

B. Metadata

You are encouraged to use Dublin Core metadata elements in the HEAD element of your documents. In addition, CoOL also uses its own metadata schema. Articles that use the "DC.Creator" and "DC.Title" elements (or the deprecated CoOL equivalents "au" and "ti") will normally be included in master indexes (e.g. CoOL's Author index). For this reason, please use "DC.Creator" and "DC.Title" elements only:

on the main page of a multi-file article (i.e. if you make an article that has 4 files: index.html, intro.html, toc.html, and appendix.html, only index.html should have "DC.Creator" and "DC.Title" elements).
on the ARTICLE-level documents of serials (i.e. each article should have these elements, but your title page, "about the organization", and other non-article pages should not).

If you have any doubt about whether to use these metadata elements, please ask me.
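
For illustration only, here is how these elements are commonly written as META tags in the HEAD of an article's main page. The name/content form shown is the usual Dublin Core convention for HTML and is an assumption here, as are the invented author and title; confirm with the webmaster what CoOL's indexing software expects.

```html
<head>
<title>Solvents</title>
<!-- Author and title are invented for this example -->
<meta name="DC.Creator" content="Robertson, A.">
<meta name="DC.Title" content="Solvents">
</head>
```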

C. Uploading

Upload only the files and directories that you intend the public to see. Take care that any backup files, temporary files, or anything else that may be lying around on your own machine are not uploaded to palimpsest. If your FTP client is in the habit of dropping a .LOG file in amongst the uploaded files (as does ws_ftp), reconfigure it so that it doesn't.
If you have something that you need to preview, get in touch with the webmaster and arrange to have that material temporarily placed in a test area.
Transfer .html and .text files as ASCII. Transfer .gif, .jpg, .pdf and other image files as BINARY.

V. Copyright

If you are not the copyright owner of the material you are contributing (including any included material, such as images, charts, etc.), you are responsible for obtaining written permission from the copyright owner.
