[Appeared in Infoworld 1/20/97]
Do you lie awake at night wondering if your web site has any broken links, presenting your visitors with the dreaded "404 -- Not found" message? Well, even if you sleep soundly thinking your site is perfect, you might want to try NetCarta's WebMapper. The latest beta is now available for 32-bit Windows, and you can download a six-megabyte copy, good for 30 days, from their web site.
WebMapper is a powerful tool, producing copious reports about your site's structure. Basically, you point it at a URL and it will suck down the entire site, analyze all the links, and produce a series of reports. It sounds simple, but there is a great deal of sophistication behind it. It took about an hour to download and analyze my site, but mainly that was because of my ISDN connection to the Internet: on a T-1 line, the total time was less than five minutes.
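For readers curious about what "analyze all the links" involves, here is a minimal sketch in Python. It is not NetCarta's code, and the names extract_links, find_broken, and exists are made up for illustration. It pulls link targets out of a page, resolves them against the page's own URL, and reports the ones that a caller-supplied check says are dead.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects <a href> and <img src> targets found on one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Only anchors and images are handled in this sketch.
        wanted = {"a": "href", "img": "src"}.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                # Turn relative references into absolute URLs.
                self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    """Return every link on the page as an absolute URL, in page order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


def find_broken(pages, exists):
    """pages maps page URL -> HTML text; exists(url) says if a link resolves.

    Returns (page, broken_link) pairs. The exists callable is an assumption:
    a real checker would make an HTTP request there.
    """
    return [(page, link)
            for page, html in pages.items()
            for link in extract_links(page, html)
            if not exists(link)]
```

Passing the existence check in as a function keeps the sketch runnable without a network connection. It also hints at why a slow line dominates the running time: nearly all the work is in fetching pages, not in parsing them.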
I knew my site was a mess and was eager to use WebMapper to try to find and then fix many of the broken links. With over 800 individual pages, I found plenty: over 400 broken links needed my attention. The reports are produced in HTML format, and you basically go through your pages with a text editor and try to figure out why things are broken. Usually it is because the external site has changed its structure, so that a URL that I had coded as www.company.com/directory1/document.html was now called www.company.com/directory2/document.html. Whatever. The web changes from day to day, and that's why you need a product like WebMapper to at least keep on top of your own site.
For the most part, the various reports were accurate: I found a few pages that had correct links but for some reason (sunspots or random Internet outages when I ran my analysis) were reported broken. The error report isn't organized the way I'd ideally like: if you have more than one broken link on a certain page, you won't find them grouped together but scattered throughout the report. If you follow the error report as your guide to making corrections, you'll end up editing the same page several times to fix multiple mistakes. That can get annoying, especially when you have 400 broken links to see to. I found navigating around the reports fairly simple, albeit somewhat overwhelming: the report showing which pages link to other pages on my site (called InLinks by NetCarta) went on for 33 long pages with hundreds of links.
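The grouping complaint is easy to work around if you can get the error report into a list of (page, broken link) pairs. The sketch below assumes exactly that input; the function name and the sample data are hypothetical, and it says nothing about how WebMapper stores its reports.

```python
from collections import defaultdict


def group_by_page(broken):
    """Collect broken links under the page that contains them.

    broken is an iterable of (page, link) pairs in report order.
    Pages come out in order of first appearance.
    """
    grouped = defaultdict(list)
    for page, link in broken:
        grouped[page].append(link)
    return dict(grouped)


# Hypothetical report rows, scattered the way the error report lists them.
report = [
    ("links.html", "http://www.company.com/directory1/document.html"),
    ("about.html", "http://www.example.com/old.html"),
    ("links.html", "http://www.example.com/missing.gif"),
]

for page, links in group_by_page(report).items():
    print(page)
    for link in links:
        print("   ", link)
```

With the links grouped this way, each page needs to be opened in the text editor only once.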
There are tons of reports, too: in addition to the error report, there are reports on duplicate pages (yes, I had one set that I had forgotten about), images used, and a nice site index. The index is a listing of every word and a link to the page where it is used. And if that isn't enough, you can produce your own custom reports, sort them by whatever column you'd like (just by clicking on the heading -- very nice), and then print them out. For example, you can view the largest pages (by bytes) and see whether you want to break them into smaller chunks and give your modem-connected users some relief from waiting for these monsters to download. You can also import your server access logs into WebMapper, although manipulating them to produce reports is neither well documented nor simple.
One of the more interesting views of my site is a hyperbolic graph of linked pages, which you can zoom and pan around to see what is linked to what. While pretty, I found it wasn't as useful as the textual reports that gave the details of the various pages. Still, being a webmaster can be a lonely job and it is nice to have something cool to impress your friends and family.
Some quibbles: One thing missing from WebMapper is the ability to synch up a site once you have made your changes and fixed the broken links. You have to use the old standby FTP programs to move the new pages up to your site. (Another Microsoft-owned product, FrontPage, does a better job here.) If you don't specify the full path to your home page (typing www.strom.com instead of www.strom.com/index.html, for example), you'll get duplicate index.htm and index.html files for each of your default directories.
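The duplicate-index quibble comes down to one page being reachable under several URLs. A minimal sketch of one way to collapse them to a single key follows. The list of default document names is an assumption (servers differ), and canonical is a made-up helper, not anything WebMapper exposes.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the server serves one of these for a bare directory URL.
DEFAULT_DOCS = ("index.html", "index.htm")


def canonical(url):
    """Map a directory URL and its default document to the same string."""
    parts = urlsplit(url)
    path = parts.path or "/"
    head, _, tail = path.rpartition("/")
    if tail.lower() in DEFAULT_DOCS:
        # Drop the default document name, keep the directory.
        path = head + "/"
    # Host names are case-insensitive; fragments never reach the server.
    return urlunsplit((parts.scheme, parts.netloc.lower(), path,
                       parts.query, ""))
```

Under these assumptions, http://www.strom.com and http://www.strom.com/index.html both come out as http://www.strom.com/. A directory URL written without its trailing slash is left alone, since the sketch has no way to tell it from a file name.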
408 461 8920
800 461 2449
408 461 8939 fax
copyright 1997 Infoworld Publishing Co.