The Document Standards War is Over: HTML Won?

C11: Interop Vegas 4/3/96
David Strom
+1 516 944 3407 

1. Two things I still can't do

Transfer phone calls ("hook flash")
Attach documents via email

2. 92 billion documents every year

And each one of them uses a different format! 
Sometimes it seems as if there are as many different formats as word processor versions, even from the same vendor!

3. My background

Have used 15 different word processors during my computing career
Have written hundreds of technical magazine articles, but ... 
Can't do footnotes or anything fancy
Write my own HTML for my web site using WordPad
Can program my VCR and do an occasional mail-merge

4. Agenda

Who needs doc format standards, anyway?
Document markup tutorial
A brief review of SGML
HTML history and features
HTML futures
Resources for further study

5. Standards? We don't need no stinkin' standards!

6. Purpose of document formats
Communicate, emphasize, organize
Carry pictorial information
Preserve typographic design
Share information!

7. Why are format standards important?
Reduce the cost of updating docs
Reduce the time to prepare docs
Exchange docs across multiple hardware types and users
Create a standard look for multiple authors

8. So why are there so many of 'em?
No common agreement among vendors
No common agreement among programmers!
No open systems mindset or motivation
Remember RTF failure?
Everyone wants one more feature...

9. Six eras of word processing
Wylbur (1974-80)
TeX and other VT page editors (1976-85)
NBI, Xerox, Vydec word processors (1977-83)
MultiMate/Wang (1982-85)
WordPerfect (1984-96)
MS Word (1992-)

10. Welcome to the seventh era!
HTML (1993-)

11. Lessons from word processing history
Dedicated machines with incompatible formats
New hardware platforms every 3-4 years
Alternating between WYSIWYG and tagged text

12. A brief tutorial on document markup 
Content vs. markup distinction
Procedural vs. descriptive distinction
Who cares?

13. Content vs. markup
Content: the actual text and information of a document
Markup: everything else, such as formatting commands
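
To make the distinction concrete, here is a minimal sketch that pulls an HTML fragment apart into its content and its markup. It uses Python's standard html.parser module; the class name, the function name and the sample fragment are made up for illustration and are not part of the talk.

```python
from html.parser import HTMLParser


class ContentMarkupSplitter(HTMLParser):
    """Collects text content and tag markup into two separate lists."""

    def __init__(self):
        super().__init__()
        self.content = []  # the actual text of the document
        self.markup = []   # everything else: the tags

    def handle_starttag(self, tag, attrs):
        self.markup.append(f"<{tag}>")

    def handle_endtag(self, tag):
        self.markup.append(f"</{tag}>")

    def handle_data(self, data):
        text = data.strip()
        if text:  # skip whitespace-only runs between tags
            self.content.append(text)


def split_document(source):
    """Return (content, markup) for an HTML fragment."""
    splitter = ContentMarkupSplitter()
    splitter.feed(source)
    splitter.close()
    return splitter.content, splitter.markup


content, markup = split_document("<h1>Agenda</h1><p>HTML <b>futures</b></p>")
print(content)  # the words a reader sees
print(markup)   # the tags around them
```

Attributes are dropped from the markup list to keep the sketch short; a fuller version would keep them.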

14. Procedural vs. descriptive document markup
Procedural: type style, printing instructions
Descriptive: sections, paragraphs, page numbers
Procedural is tied to the finished product, descriptive is tied to content itself
SGML is descriptive, TeX and troff are procedural 

15. Example
troff (procedural):  .sp1; .ss; sin +12 -12
HTML (descriptive):  <B>This is in bold</B>
WYSIWYG (some of both): This is in italics.
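
One way to see why descriptive markup travels better: the same labeled document can be handed to different renderers without being edited. The sketch below is an illustration only; the role names ("heading", "paragraph", "emphasis"), the sample text and both renderers are assumptions, not anything from a real tool.

```python
from html import escape

# A descriptive document: each piece of text is labeled by what it IS,
# not by how it should look.
document = [
    ("heading", "Agenda"),
    ("paragraph", "Who needs doc format standards, anyway?"),
    ("emphasis", "Share information!"),
]


def render_plain(doc):
    """Render for a text-only display: the look is decided here."""
    lines = []
    for role, text in doc:
        if role == "heading":
            lines.append(text.upper())
        elif role == "emphasis":
            lines.append(f"*{text}*")
        else:
            lines.append(text)
    return "\n".join(lines)


def render_html(doc):
    """Render the same document as HTML, mapping each role to a tag."""
    tags = {"heading": "h1", "paragraph": "p", "emphasis": "em"}
    return "\n".join(
        f"<{tags[role]}>{escape(text)}</{tags[role]}>" for role, text in doc
    )


print(render_plain(document))
print(render_html(document))
```

With procedural markup the look would be baked into the document itself, so each new output device would mean converting the file rather than swapping the renderer.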

16. Who cares?
Procedural markup is used by most word processing and desktop publishing software: change software, and you have to convert your docs
Move from printed pages to screens and you get formatting problems!
Not platform independent
Procedural markup is tedious for authors

17. When I was busy word processing...
The web ran amok
Microsoft Word: de facto document interchange
Tagged text became fashionable once again
Every 15-year-old knows HTML

18. So, our goals are...
(I can read your docs)
Cross-platform compatibility
(Macs can read PC docs)
Collaborate with my colleagues!
(We can jointly author docs)

19. A brief review of SGML
SGML Standard (ISO 8879, '86)
Terms and concepts
Differences with HTML

20. Differences between SGML and HTML (1)
SGML: not the current fashion 
used by various industry bodies including Dept of Defense
a metalanguage, superset of HTML
many different DTDs
HTML: abused by every browser vendor to some extent
a single DTD (per version)
an application of SGML
developed at CERN for document distribution first, publishing later

21. One view (Laura Lemay)
SGML is big and bloated, hard to write docs by hand
HTML is small and simple
SGML is pure content
HTML is pure presentation

22. HTML history and features
Originated as distribution tool
Basic features
Version chronology 
Docs don't look the same in different browsers
Checking your code

23. Remember HTML's origins:
"HTML was not designed to be a WYSIWYG publishing tool. It was designed to be a universal, document distribution and publishing medium."  -- Mary Morris

24. Features of HTML
Operating system independent
Browser independent
The user controls the browser
The author controls organization
The server controls -- well, not much!

25. Chronology of HTML standards
HTML v1 ('90)
HTML v2 (RFC 1866, 11/95): forms
HTML v3: tables, frames, motion (varying interpretations)

26. Different browsers, different views!
Not all browsers see the same thing
Text (Lynx) vs. graphical 
Images on or off
Older vs. newer software versions
R. Scott: how various browsers stack up in supporting HTML v3 features
Graphical examples of different browsers' feature support
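
As a rough illustration of the text vs. graphical gap, the sketch below produces a text-only view of a page, showing each image's ALT text where a graphical browser would show the picture. It is a toy built on Python's html.parser, not a description of how Lynx or any real browser works; the "[IMAGE]" placeholder and all names are assumptions.

```python
from html.parser import HTMLParser


class TextOnlyView(HTMLParser):
    """Keeps the text of a page and replaces images with their ALT text."""

    def __init__(self):
        super().__init__()
        self.pieces = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            alt = dict(attrs).get("alt")
            # Fall back to a generic placeholder when the author gave no ALT.
            self.pieces.append(f"[{alt}]" if alt else "[IMAGE]")

    def handle_data(self, data):
        self.pieces.append(data)


def text_view(source):
    """Return what a reader without images would see."""
    view = TextOnlyView()
    view.feed(source)
    view.close()
    # Collapse runs of whitespace into single spaces.
    return " ".join("".join(view.pieces).split())


page = '<p>Our logo: <img src="logo.gif" alt="Acme logo"> and a chart <img src="chart.gif"></p>'
print(text_view(page))
```

The second image has no ALT text, so a reader with images off learns nothing about it, which is one reason the same page can look so different from one browser to the next.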

27. Checking your HTML code
HTML Checker 
Weblint (perl script) 
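
To give a feel for the simplest kind of check such tools can make, here is a toy tag-balance checker. It is a sketch only and is not how HTML Checker or Weblint actually work: the class, the function and the list of tags without closing tags are assumptions, and it wrongly treats optional end tags (such as the one for P) as required.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag (a partial, assumed list).
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class TagBalanceChecker(HTMLParser):
    """Tracks open elements on a stack and records mismatched end tags."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of currently open elements
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            self.problems.append(f"unexpected </{tag}>")


def check_html(source):
    """Return a list of problem descriptions; empty means balanced."""
    checker = TagBalanceChecker()
    checker.feed(source)
    checker.close()
    problems = list(checker.problems)
    # Anything still on the stack was opened but never closed.
    problems.extend(f"unclosed <{tag}>" for tag in checker.open_tags)
    return problems


print(check_html("<p>Hello <b>world</b></p>"))
print(check_html("<p>Hello <b>world</p>"))
```

A real checker also validates attributes, nesting rules and vendor extensions against a DTD, which is where the different browser dialects start to matter.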

28. HTML futures
Netscape extensions
Microsoft extensions
Where it is all going

29. HTML v3 (Netscape version)
Sub, superscripts
Client-side image maps
Client file uploads

30. HTML v3 (Microsoft version)
OLE Controls embedded in pages
Customize frame borders and floating frames
Colored cells in tables and better alignment tools

31. HTML NG: other extensions
Style sheets
Math equations (remember TeX?)
More presentation controls

32. Small bump in the road: standards
A different DTD for each extension
Market is moving quickly ahead of standards 
Microsoft and Netscape aren't the best of buddies

33. Possible HTML futures
Microsoft and Netscape kiss and make up
Or don't
W3C/IETF/et al. have actual influence 
Vendors do their own thing(s)

34. SGML Resources for further study

35. Web-based HTML resources

36. Suggested books to read to learn HTML
Additional web-based resources

Cye Waldman's HTML book list