Web Informant #244, 16 April 2001:
Improving your site's chances with search engines

http://strom.com/awards/244.html

I asked my friend Tara Calishain to write this week's essay. Tara is the co-author of Official Netscape Guide to Internet Research, 2nd Edition, and writes the weekly newsletter ResearchBuzz.com, a great source for anyone interested in keeping current with Internet research sites and trends. She describes herself as "crazy about search engines," and after reading her contribution, you'll understand why.

I've been searching on the Internet since I got a UNIX shell account back in 1993. Searching at first was difficult (because shell accounts ain't exactly friendly to the novice), then easier (as the Web cranked up and Mosaic became available, mostly because there just wasn't much stuff up yet), then difficult (as the Internet exploded), then really difficult (as most companies got an online presence, government e-initiatives took hold, and universities started digitizing collections).

Perhaps I'm misusing words. Finding things is getting more difficult. The act of searching, in itself, is getting easier. Search engines are implementing more options, more tools, and better technology to deal with the flood of information that threatens to overwhelm even the simplest search. I'm sure that in time search engine technology will get to the point where indexing web pages every day or so is normal (actually, you can purchase this option now from Inktomi) or content can be updated instantaneously, making sure site searches always find fresh material (there was a beta of this kind of technology up from Infrasearch until it went gonesilent).

But until then, I propose eight things sites can do to make it easier for the rest of the world to find their materials via a search engine. (I know that in the case of some of these examples, I don't walk my talk on ResearchBuzz.com. Alas, I am but a staff of one and am too busy generating content to make sure the search engines see it properly. Large companies with actual staffs to keep up their sites should read on.)

  1. Put the date of the article in the page title: Don't rely on the date's appearance in the body of the article page. Putting the date in the page title makes it much easier to scan search engine results for timeliness and relevance.

  2. Put the date of the article underneath the article title. If the article has already been published offline, and if there's a difference between published and posted date, make sure it's obvious. If a page has a dynamically-generated date, be sure that's off to the side and not linked to the article at all!

  3. Don't post articles using links that'll expire in a few days. If materials are moved to a paid-access archive, at least leave up a synopsis of the article and a link to the paid archive. That'll really cut down on 404 (page not found) errors and might bring in some more business to the pay-for-access archives.

  4. Put up links to related materials when possible. Seems obvious, but it's amazing how many sites don't do this.

  5. If a site has some kind of offline presence, make sure every page of the site notes that physical location, at least with city and state. I have no idea where Criminychip County is. I would not expect non-North Carolinians to know where Fuquay-Varina is. Even a footer saying something like, "Serving Fuquay-Varina, North Carolina" is of immeasurable help.

  6. Customize 404 errors. If I find a good-looking article in a search engine but the link is dead, I'll try to force a 404 error on the site, since good 404 pages provide lots of information -- links to search engines, guidance for finding things, etc. Uncustomized 404 errors are a huge wasted opportunity for content-rich sites.

  7. Have some standard way of presenting an article. Headlines always using H2 tags, for example, bylines always italicized H3, whatever. These standards make it easier to scan articles and easier for content aggregators to index headlines.

  8. If there are recent additions or updated news in one corner of a page, configure the internal search engine so those words aren't indexed. There's nothing more frustrating than getting 200 hits on a site's internal search engine and then finding out it's hitting the same place across 200 pages.
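
For sites that want to spot-check tips 1 and 7, here is a minimal sketch in Python, using only the standard library. It reports whether a page's title contains a date and lists the headlines found in H2 tags. The date pattern, the ArticleScanner class, and the check_page function are all hypothetical names made up for this illustration, and the pattern only recognizes dates written like "16 April 2001" or "April 16, 2001"; a real site would adjust it to its own house style.

```python
import re
from html.parser import HTMLParser

# Assumption: dates are written out in English, e.g. "16 April 2001"
# or "April 16, 2001". Other formats would need their own patterns.
MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")
DATE_RE = re.compile(
    rf"\b(?:\d{{1,2}} (?:{MONTHS}) \d{{4}}|(?:{MONTHS}) \d{{1,2}}, \d{{4}})\b"
)


class ArticleScanner(HTMLParser):
    """Collects the text of the <title> element and of every <h2> element."""

    def __init__(self):
        super().__init__()
        self._current = None  # tag whose text is being collected, if any
        self.title = ""
        self.headlines = []

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "h2"):
            self._current = tag
            if tag == "h2":
                self.headlines.append("")

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current == "h2":
            self.headlines[-1] += data


def check_page(html):
    """Return a small report on tips 1 and 7 for one page of HTML."""
    scanner = ArticleScanner()
    scanner.feed(html)
    scanner.close()
    return {
        "title": scanner.title.strip(),
        "date_in_title": bool(DATE_RE.search(scanner.title)),
        "headlines": [h.strip() for h in scanner.headlines],
    }


sample = """<html><head>
<title>Web Informant #244, 16 April 2001: Improving your site's chances with search engines</title>
</head><body><h2>Improving your site's chances with search engines</h2></body></html>"""

report = check_page(sample)
print("Date in title:", report["date_in_title"])
print("H2 headlines:", report["headlines"])
```

Run against a handful of article pages, a check like this shows quickly whether the date and headline conventions are actually being followed from page to page.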

Search engine technology is getting closer and closer to fast and frequent updating, making it easier and easier to find news of current events online. If sites make just a few tweaks, their information will be all set to take advantage of the upgrades to search engine tech.

To subscribe, send a blank email to
informant-subscribe@pez.oreillynet.com

To be removed from this list, send a blank email to
informant-unsubscribe@pez.oreillynet.com

David Strom
david@strom.com
+1 (516) 944-3407
entire contents copyright 2001 by David Strom, Inc.
Web Informant is a ® registered trademark with the U.S. Patent and Trademark Office.
ISSN #1524-6353 registered with U.S. Library of Congress.