The Articles and Conventions of CoOL
This Will Be on the Exam
If you're reading this it's probably because you either have pages in CoOL or are preparing them. Please stow this away for future reference and check back regularly, as this document is amended frequently. I hope this helps. Please let me know if anything is unclear.
Please adhere religiously to the following conventions.
1. All file names, including directory names
NB spaces and '8 bit' characters are especially forbidden. If you prepare your docs on a Mac, please pay special attention to this, as it's very hard for me to fix things once they reach my machines. Be especially conscious of this if you use Web authoring tools like PageMill or BeyondPress, as they will mangle URLs quite happily, turning those spaces into %20.
(a file)
(a directory)
(uppercase)
(two dots)
(two dots)
(double extension)
(space in name)
(illegal %-sign in name)
Long file names are acceptable for both file and directory names, but very long names are deprecated (among other things, they make URLs very long and hard to manage).
All HTML documents must have the extension .html (.htm is acceptable, but .html is preferred).
Other file extensions
.gif
.jpg (not .jpeg)
.tif (not .tiff)
.pdf
.png
Path separators (which are '/' in Unix and '\' in DOS and something more baroque in Mac OS) must be Unix style, no matter which operating system you work on. Your browser should handle '/' just fine, so you shouldn't have any trouble editing.
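As a small illustration (the directory and file names here are hypothetical), a link into an img subdirectory is written with forward slashes regardless of the system you edit on. The backslash form is shown only as the thing to avoid:

```html
<!-- Right: Unix-style separators -->
<a href="img/foo.gif">Figure 1</a>

<!-- Wrong: DOS-style separators -->
<a href="img\foo.gif">Figure 1</a>
```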
Each directory in your web should (with some possible
exceptions) contain a file called "index.html" (not "default.html",
"main.html", or "home.html"). This is the "main" document of that
directory and is returned when the browser asks for the directory
without an additional file name, e.g. when you ask for
http://palimpsest.stanford.edu/byorg/abbey/
the server sends
http://palimpsest.stanford.edu/byorg/abbey/index.html
(Also, my web maintenance software looks for those files, so you must name
them index.html.)
The following names are magical. They should not be used.
The following names are reserved for use as "index" files.
Even though CoOL uses "index.html", the other forms are
off-limits.
.shtml
indicates that the file will be subject to
server-parsed-html (this is a Netscape term for what is more
commonly referred to as Server-Side Includes). Please consult with me
before using server-parsed-html, as there are security implications
to deal with.
Please contact me concerning approaches to naming files for serial publications, such as newsletters. Depending on your publication frequency, I will assign you a filename format, typically based on a combination of a two-letter code representing your publication, and the volume, issue, and article numbers. If you are working on Xylophone Conservation Monthly, I might assign a code "xc" for your newsletter. Then if you were working on volume 3, number 5, which contains 5 articles, the articles would be named (in the order they appear in the newsletter)
xc03-501.html
xc03-502.html
...
xc03-505.html
See also Naming related images below
Note that for serials, we are trying to keep every single article (and every image file that goes with it) uniquely named, so that if we need to copy all your articles into a single directory or load them into a database there's no danger of something being clobbered. (Trust me, this is a Good Thing.)
Please try to name images using part of the HTML file's name with
incremental numbers/letters. You can use XXXXt.gif to
indicate a thumbnail associated with XXXX.gif
For example, an article by Robertson with 103 images and 103 thumbnails might go like this:
rob.html
rob1at.gif
rob1a.gif
...
rob1zt.gif
rob1z.gif
rob2at.gif
rob2a.gif
If you are working on a serial (e.g. a newsletter), then see above. Your images should be named by appending sequential letters to the base of your issue name. E.g., in the above example we could use:
xc03-5a.gif
xc03-5b.gif
...
xc03-5z.gif
xc03-5aa.jpg
xc03-5az.gif
....
If you happen to have, say, both .gif and .jpg files, please use a different base name for each (that is, do not use xc03-5a.gif and xc03-5a.jpg).
See individual articles of The Abbey Newsletter for examples.
A clickable thumbnail link (using the Robertson example) would look like this:
<p><a href="rob1z.gif"><img src="rob1zt.gif" width="302" height="156" border="0"></a></p>
CoOL uses a superset of HTML 3.2 (including frames and some
relatively harmless browser-specific elements/attributes). If you use
features that are not in HTML 2.0, or that are specific to a browser,
think twice and hard, as the rendering on other browsers may not be
appropriate. You may use valid HTML 4, but think thrice and
hard. If you want to see the syntax for the "custom" version of
HTML I use here (called HTMLX), you can check out
http://cool.conservation-us.org/admin_cool/dtdguides/htmlx/DTD-HOME.html
The DTD itself is at http://cool.conservation-us.org/admin_cool/dtdguides/dtds/htmlx/htmlx.dtd
For general guidance on HTML authoring (somewhat obsolete), and other tools, see http://www-sul.stanford.edu/tools/
For those who haven't read it, I strongly advise reading the tutorial HTML Basics, even though it is terribly outdated. Also recommended: A Gentle Introduction to SGML
You might also want to look at The Library of Congress World Wide Web Style Guide, which offers some very good guidelines for creating sane, maintainable, accessible web pages.
Using CSS style sheets is encouraged. Such style sheets must be
external and referenced via a LINK element.
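A minimal sketch of what this looks like, assuming a hypothetical style sheet named mystyle.css sitting in the same directory as the page (the file name and the page title are placeholders, not a CoOL requirement):

```html
<head>
<title>My Page</title>
<!-- external style sheet, referenced with a relative URL -->
<link rel="stylesheet" type="text/css" href="mystyle.css">
</head>
```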
Please pay attention to the HTML spec with regard to
H1 ... H6 (any version you use will have the
same basic rules about headers):
Headings indicate a structural division, not a type style, and they are intended to be used in order:
<h1>The Story of Me</h1>
<h2>Chapter 1. My Early Life</h2>
<h3>Adventures in Nursery School</h3>
<h4>The first time the teacher bit me</h4>
<h4>The last time the teacher bit me</h4>
<h3>Graduation from Nursery School</h3>
<h2>Chapter 2. Fear and Loathing in K-12</h2>
<h3>Filing down the teacher's teeth</h3>
<h4>Why I chose a bastard file</h4>
<h2>Chapter 3</h2>
Note that H1 is the first heading, has the
title of the page and is used only once. Note also that
there are no skips (not <h2>foo</h2><h4></h4>). A
heading is intended to be a SECTION indicator: it identifies everything
that follows it until the next heading of equal or higher level. All
clear?
Exceptions: Sometimes you have repeating info (e.g. for a
newsletter you might have a series title, vol, issue number). You may
use a lower level heading (h2, h3, h4, etc.) before the h1.
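For instance, a newsletter article might open like this. The volume, issue and article title are made up for illustration (the newsletter is the hypothetical Xylophone Conservation Monthly); only the ordering of the headings is the point:

```html
<!-- repeating series information, at a lower heading level -->
<h3>Xylophone Conservation Monthly</h3>
<h4>Volume 7, Number 2</h4>

<!-- the one and only h1: the title of this page -->
<h1>Cleaning Rosewood Bars</h1>
```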
Similarly, some repeating things--like button bars at the bottom of every page--don't really fit into the neat hierarchy described above, but are clearly "headings of some arbitrary lowness" so you may do something like:
<h4 align="center"><a href="../">[Up]</a> <a href="xxx/">[Down]</a></h4>
See http://www-sul.stanford.edu/tools/tutorials/html2.0/headings.html for more detail.
Here is a small table you can use as a model. Notice that the first
th or td of each row contains a p.
This will cause the rows to be rendered intelligibly by
non-table-capable browsers (the columns will not line up properly, of
course).
<table border="3" cols="3">
<caption><b>Famous People</b></caption>
<tr>
<th><p>Author
<th>Title
<th>Color
<tr>
<td><p>Alan Burrows
<td>Art Criticism from the Laboratory
<td>Tan
</table>
The CENTER tag is deprecated in CoOL.
In general, centering should be done by adding an align attribute
on the start tag of a block element, such as H2, P, etc. E.g.
<p align="center">This would be centered</p>
or, as an attribute on the DIV element:
<div align="center">
<p>blah blah</p>
<p>more blah</p>
</div>
All images, no matter how trivial, should have an ALT attribute briefly identifying their content.
<img src="foo.gif" alt="Diagram" ... >
<img src="foo.gif" alt="Graph" ...... >
<img src="foo.gif" alt="Photomacrograph" ... >
<img src="foo.gif" alt="Figure 1. The painting after treatment" ... >
<img src="foo.gif" alt="Dingbat" ... >
<img src="foo.gif" alt="" ... >
The last example, an empty ALT attribute, should be used only when the image is purely ornamental.
In the past I've asked that inline images for CoOL be limited to GIFs and that JPEGs be restricted to links. Times change, and now all major browsers support inline JPEGs (and as the FAQ points out, there's a good library available to browser authors, so there's no excuse for them not to support it anymore), so from now on I'm encouraging you to use inline JPEGs where they are appropriate (i.e. photographs, things with continuous tone color or grayscale; line art is still best done with GIF, as is anything that needs a transparent border). PNG files are also acceptable, if you are willing to live with the fact that some older browsers won't handle them.
Just a reminder (from one who keeps forgetting): you shouldn't edit JPEGs, but should edit the (TIFF/PHOTOSHOP/WHATEVER) source from which you prepared the JPEG, because JPEG is lossy, and you'll get cascading lossiness.
Also, whenever possible, when using any inline images, provide width and height attributes on the img tag, as it really does make a big difference.
(btw, feel free to use interlaced gifs/progressive jpegs or not; whatever suits your pages).
All URLs referring to your own web, or to a web in CoOL, should be relative.
<a href="../../img/foo.gif">
<a href="/byorg/myorg/img/foo.gif">
<a href="http://palimpsest.stanford.edu/byorg/myorg/img/foo.gif">
References to the default index file "index.html" must omit the name "index.html" and indicate the directory with ./ ../ ../.. etc. Thus:
<a href="./">
<a href="../../">
not
<a href="./index.html">
<a href="../../index.html">
This is very important because we might switch your default index file from .html to .shtml. If the filename is omitted the server will handle the switch gracefully.
All URLs pointing to sites outside of CoOL must be fully qualified. They must begin with "http://" (or "ftp://" or "gopher://" etc.).
<a href="http://www.foobar.com">
not
<a href="www.foobar.com">
Please put double quotes around *all* attribute values, even when not required by SGML. E.g., even though
<img src=foo.gif width=322 height=120>
is perfectly legal, I want
<img src="foo.gif" width="322" height="120">
This does 2 things. First, the rules about when quotes can be omitted are a bit arcane, and as it is always legal to have the quotes, it is safer to use them by habit. Second, if I can count on the double quotes being there, it makes it easier for me to search through thousands of files.
In name anchors, such as
<a name="foo"></a>
Note several things
<a name="chap1"></a>
is easier to deal with than
<a name="chapterone_the_history_of_solvent_based_treatments_in_the_late_1980s"></a>
Try to keep the line length down to 68 characters. Unix editors
get nasty about long lines, and I sometimes have to edit things using
brain-dead terminal emulators. If you have long lines, I end up
having to download the files to work on them, which is extra work.
(Remember that HTML, in most circumstances (a PRE element being
an exception), doesn't care about extraneous whitespace,
including vertical whitespace.)
NB Please do not break up a URL, no matter how long it is, and make sure your editor doesn't try to wrap a URL.
btw: I normally mung all incoming files so that tags and attribute names are lowercased, there are empty lines around paragraphs and headings etc, so if you get a file back from me, it will nearly always be changed.
If you edit a file that is on the server (e.g. if you view it
with your browser and Save As Source), you must strip
off anything following the HTML end-tag (</html>).
The server adds a datestamp and some
other stuff at the bottom of every page. If you don't strip it off,
next time you see it it will have 2, and the next time you repeat
the process it will have 3 ...
Please make sure that every .html file
has a </head> end tag and a <body> start tag
... <body> start tag and the rest of your document.
Neither of these rules is required by the HTML specification, but they are very important in CoOL; some of my software depends on this.
You are encouraged to use Dublin Core metadata elements in the HEAD element of your documents. In addition, CoOL also uses its own metadata schema. Articles that use the "DC.Creator" and "DC.Title" elements (or the deprecated CoOL equivalents "au" and "ti") will normally be included in master indexes (e.g. CoOL's Author index). For this reason, please use the "DC.Creator" and "DC.Title" elements only when appropriate.
If you have any doubt about whether to use these metadata elements, please ask me.
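A sketch of how these elements are commonly written in the HEAD, using the widespread META name/content convention for Dublin Core. The author and title values are invented, and whether CoOL's indexing software expects exactly this form is an assumption, so ask if in doubt:

```html
<head>
<title>Cleaning Rosewood Bars</title>
<!-- hypothetical values; picked up for the author and title indexes -->
<meta name="DC.Creator" content="Robertson, Jane">
<meta name="DC.Title" content="Cleaning Rosewood Bars">
</head>
```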
If you are not the copyright owner of the material you are contributing (including any included material, such as images, charts, etc.), you are responsible for obtaining written permission from the copyright owner.