ATPM 11.07
July 2005

Publisher’s Letter

by Michael Tsai, mtsai@atpm.com

This is the first issue of ATPM produced with our new publishing system. The formats remain the same—two HTML and two PDF editions—but we’re producing the issues using new tools. The new system should let us produce better PDFs in less time. Although we had been using the old system for nearly five years, ATPM has had several format changes over the years, so I thought I would take a little time here to examine them.

DOCMaker

From April 1995 to April 1996, ATPM was available exclusively in DOCMaker format. DOCMaker was a nifty application that provided a primitive, SimpleText-style editing environment. DOCMaker didn’t support spell-checking, tabs, or wrapping text around graphics, but its signature feature was that instead of saving its documents in SimpleText format, it could save them as stand-alone documents. These “documents” were actually applications: your document, combined with the software needed to view it. So, you could double-click an issue of ATPM, and it would display itself in a nice interface, complete with a search feature and commands for navigating between different articles. Most people received the ATPM DOCMaker files via e-mail or downloaded them from AOL or eWorld.

HTML

In May 1996, we first made ATPM available on the Web in HTML format. At first, Evan Trent and Nancy Ross converted the DOCMaker issues to HTML by hand. Starting with ATPM 3.09, I began using Myrmidon to do this (mostly) automatically by printing to HTML. Most people, however, preferred to download the DOCMaker files from our FTP site. They were compressed with StuffIt for faster downloading and, once downloaded, provided a faster and more Mac-like reading experience. Web browsers weren’t always as nice as they are today.

PDF

A few years later, Adobe Acrobat was widely available, and ATPM readers were clamoring for PDF-formatted issues. There were a variety of reasons for this. DOCMaker documents didn’t print well. Graphics would shift around because the layout was done using spaces, and sometimes graphics or even lines of text would be split across page boundaries. The main body font that we used, Geneva, looked great on-screen, but wasn’t optimal for printing. DOCMaker couldn’t embed fonts, so text using fonts that weren’t built into the Mac OS had to be converted into graphics. (This also let us “de-bump” the edges of the text, since font smoothing wasn’t yet built into the Mac OS.) Anyway, these fonts were rasterized at screen resolution, so they looked blocky when printed.

In July 1999, we introduced the first PDF version of ATPM. Aside from printing better (because of better pagebreaks, vector graphics, and embedded fonts), PDF offered a number of benefits. Since it was a cross-platform format, ATPM readers could download and view it using Windows machines at work. By this time, URLs were more common in ATPM articles. DOCMaker, via ICeTEe, could make URLs in the text clickable if you held down the Command key. We wanted to include more URLs, but sprinkling too many of them throughout the text made the paragraphs unreadable. PDF had the ability to make any text clickable, just like the Web, and the actual URL could be hidden at the bottom of the page (so it would be visible when printed).

Finally, switching to PDF made it much easier to produce ATPM. We could now use Adobe FrameMaker, a real word-processing and page-layout program. Besides providing a nice editing environment, FrameMaker made it possible to set up the clickable links and bookmarks (for article navigation) ahead of time. Most other software, even today, requires a separate, manual pass using Adobe Acrobat to add these features. Aside from being more work (you have to draw boxes to set the clickable area for each link), this wasn’t feasible for us because it would have to be done at the last minute, after converting the issue to PDF. On top of that, in addition to exporting to PDF, FrameMaker had a customizable HTML export feature, which actually generated pretty good HTML, needing just a little post-processing via a Perl script.

eDOC and Offline Webzine

The PDF-formatted ATPM got a great reception, but many ATPM readers continued to prefer the DOCMaker format because it was easier on the eyes (on-screen) and because Acrobat Reader was slow and unwieldy. With ATPM now being produced in FrameMaker, it wasn’t possible (given our staff and time constraints) to continue producing a DOCMaker version. However, FrameMaker had some powerful features that allowed us to solve this problem with the help of a wonderful utility called eDOC. Like Myrmidon, eDOC is a printer driver that intercepts drawing commands to create a file instead of a printed document. However, instead of creating HTML, it creates a stand-alone document that’s similar to DOCMaker, but that supports arbitrary layout (like PDF). Thus, with the help of FrameMaker’s stylesheets and its conditional text feature, we were able to produce PDFs as well as stand-alone eDOC documents from the same source files. The eDOC version of ATPM was introduced in August 1999, along with the Offline Webzine edition that continues to this day.

Screen and Print PDFs

In May 2000, we had to stop producing the popular eDOC edition. The eDOC driver had stopped working with FrameMaker, and despite a clean installation and help from eDOC’s author, we were not able to get it working again.

Instead, we upgraded from FrameMaker to FrameMaker+SGML. FrameMaker works like a word processor, with both character- and paragraph-level stylesheets. The more powerful FrameMaker+SGML is designed for creating structured documents. Instead of having two styles (character and paragraph), each run of text is part of a tree of elements (think HTML tags). This richer markup enables the creation of context-sensitive layout rules, and this made it possible to create screen- and print-optimized layouts, which we debuted in August 2000. The screen version had larger fonts and a smaller page size, while the print version had a two-column US Letter layout with smaller fonts and footnotes showing where each URL would lead.
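
To make the idea of context-sensitive layout rules concrete, here is a minimal Python sketch. It is not the rule syntax that FrameMaker+SGML uses; the element names, the `resolve_style` helper, and the font sizes are all made up for illustration. Each rule names a context (the tail of an element's ancestor path) and a medium, and the first matching rule decides the formatting.

```python
# Hypothetical rules: (context, medium, style). A context is the tail of the
# element path, so more specific rules are listed before general ones.
RULES = [
    (("footnote", "para"), "print", {"font_pt": 7}),
    (("para",), "print", {"font_pt": 9, "columns": 2}),
    (("para",), "screen", {"font_pt": 14, "columns": 1}),
]


def resolve_style(path, medium, rules):
    """Return the style of the first rule matching this element's context.

    `path` lists element names from the document root down to the element
    being formatted, e.g. ["article", "footnote", "para"].
    """
    for context, rule_medium, style in rules:
        if rule_medium != medium:
            continue
        # Compare the rule's context with the end of the element path.
        if tuple(path[-len(context):]) == context:
            return style
    return {}  # no rule applies; a real system would fall back to defaults
```

With these rules, a `para` inside a `footnote` gets the small print size, while a top-level `para` gets a different size depending on whether the screen or print layout is being produced. A flat paragraph stylesheet cannot express that distinction without a separate style name for every situation.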

Goodbye FrameMaker

This combination of four formats (online and offline HTML editions, plus two PDF formats generated with FrameMaker+SGML) has served us well. Each has a loyal following, and FrameMaker+SGML’s power, combined with various scripts developed over the years, made this a comfortable combination of formats to produce. However, the Mac world has changed in the nearly five years since we started using FrameMaker+SGML. FrameMaker (version 7 incorporates the SGML features and is simply called FrameMaker) is out of favor at Adobe, which is now focusing its development efforts on InDesign. FrameMaker, which started out as a Unix application, was behind the times interface-wise when we started using it. The latest version is little better, and Adobe has halted development of the Mac version of FrameMaker. It will never be Carbonized for Mac OS X.

FrameMaker continues to run very well in the Classic environment. In fact, it’s the only Classic application that I use regularly, and it runs much faster now on my G5 with Mac OS X 10.4 than it ever did running directly in Mac OS 9. All that said, it’s time for a change. FrameMaker has bugs and limitations that add several hours to the production of each issue of ATPM. It no longer makes sense to live with them now that it’s clear that they’ll never be addressed. Also, the upcoming Macs with Intel processors will not be able to run Classic, which means that in a few years Apple will no longer make any Macs that run FrameMaker. (It seems that Apple’s own technical writers are in the process of switching from FrameMaker to a system using XML and XEP.)

New Tools

There are surprisingly few options on Mac OS X for producing rich PDFs (with bookmarks and clickable links) and HTML documents from the same source. One might first consider Microsoft Word, or Apple’s new Pages, but the Mac version of Word can’t create rich PDFs, and neither can Pages, despite Mac OS X’s overall excellent support for PDF. Plus, both have poor HTML export abilities. Adobe InDesign can, but it’s more for page-layout than word-processing, and it lacks some of FrameMaker’s powerful auto-formatting features. Every time I contemplated switching to InDesign, I vowed to continue using FrameMaker, even if that meant running it on an older Mac if it stopped working with the latest Apple hardware and software.

Fortunately, Mac OS X’s Unix base provides new options in the form of open-source software that can be programmed to do what we want. The core of our new publishing system is Docutils, a text processing system designed for use with the Python programming language. I now format ATPM articles in reStructuredText format, a plaintext markup language that’s similar to Setext and Markdown.

Docutils includes a parser, which reads reStructuredText files into an internal representation. Then, a writer translates the internal representation into an output format such as HTML. The beauty of the Docutils system is that both sides of this process are extensible. reStructuredText doesn’t have support for the blue ATPM article headers, but I was able to teach it about them by extending the parser. Similarly, Docutils doesn’t have writers that generate ATPM-style HTML or PDFs, but it lets you plug in writers of your own. As a programmer, it was simple for me to make these and other extensions, and Docutils’ good design meant that I didn’t have to re-invent the wheel to do so.
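
The parser/writer split described above can be sketched in a few lines of plain Python. This is a toy model, not the Docutils API: the `Node` class, the `LINE_RULES` table, the `@header ` marker, and the writer dictionaries are all hypothetical names chosen for illustration. The point is only that the input side and the output side each have a table that can be extended independently.

```python
from dataclasses import dataclass, field
from html import escape as html_escape


@dataclass
class Node:
    """One element of the internal document tree."""
    kind: str
    text: str = ""
    children: list = field(default_factory=list)


# Parser side: a line prefix selects the node kind. Unmatched lines
# become paragraphs.
LINE_RULES = {"# ": "title"}


def parse(source):
    """Read the toy markup into a tree with one child node per line."""
    doc = Node("document")
    for raw in source.splitlines():
        line = raw.strip()
        if not line:
            continue
        for prefix, kind in LINE_RULES.items():
            if line.startswith(prefix):
                doc.children.append(Node(kind, line[len(prefix):]))
                break
        else:
            doc.children.append(Node("paragraph", line))
    return doc


LATEX_SPECIALS = {"\\": r"\textbackslash{}", "&": r"\&", "%": r"\%",
                  "$": r"\$", "#": r"\#", "_": r"\_", "{": r"\{", "}": r"\}"}


def latex_escape(text):
    """Escape characters that LaTeX would otherwise treat as markup."""
    return "".join(LATEX_SPECIALS.get(ch, ch) for ch in text)


# Writer side: one table per output format, mapping node kind to a renderer.
HTML_WRITER = {
    "title": lambda t: "<h1>" + html_escape(t) + "</h1>",
    "paragraph": lambda t: "<p>" + html_escape(t) + "</p>",
}
LATEX_WRITER = {
    "title": lambda t: r"\section*{" + latex_escape(t) + "}",
    "paragraph": lambda t: latex_escape(t) + "\n",
}


def write(doc, writer):
    """Translate the tree into the writer's output format."""
    return "\n".join(writer[child.kind](child.text) for child in doc.children)


# Extending both sides for a hypothetical colored article header.
LINE_RULES["@header "] = "atpm_header"
HTML_WRITER["atpm_header"] = (
    lambda t: '<div class="atpm-header">' + html_escape(t) + "</div>")
# \textcolor assumes the LaTeX preamble loads the color or xcolor package.
LATEX_WRITER["atpm_header"] = (
    lambda t: r"\textcolor{blue}{" + latex_escape(t) + "}")
```

Parsing a source once and passing the resulting tree to `write` with either table yields HTML or LaTeX from the same input, which is the property that makes single-source publishing practical.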

Actually, developing a writer for the PDF format would be quite a bit of work; PDF is much more complicated than HTML. Instead, I created a writer that, with the help of PyObjC, generates output in LaTeX format. LaTeX is a somewhat arcane publishing system built on top of Donald Knuth’s TeX language, and TeX incorporates a lot of typesetting knowledge, such as how much space to put between each word so that the edges of the columns are flush. In fact, it’s significantly better at this than FrameMaker, or pretty much any other program. So, the writer produces LaTeX (slightly different LaTeX for the screen and print versions), which I then feed into the open-source pdfLaTeX program to produce the PDFs.
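
The final step, turning the generated LaTeX into PDFs, could be scripted along these lines. This is a sketch under assumptions rather than the actual ATPM build code: the helper names and the `build` directory are hypothetical, and it assumes a `pdflatex` from a standard TeX distribution is on the PATH.

```python
import subprocess
from pathlib import Path


def pdflatex_command(tex_file, out_dir="build"):
    """Build the argument list for one non-interactive pdflatex run."""
    return [
        "pdflatex",
        "-interaction=nonstopmode",  # do not stop to prompt on errors
        "-halt-on-error",            # fail fast so a script can notice
        "-output-directory", str(out_dir),
        str(tex_file),
    ]


def build_issue(tex_files, out_dir="build", passes=2):
    """Run pdflatex on each file, raising if any run fails.

    Two passes are used because cross-references and PDF bookmarks are
    written to auxiliary files on one run and read back on the next.
    """
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for tex_file in tex_files:
        for _ in range(passes):
            subprocess.run(pdflatex_command(tex_file, out_dir), check=True)
```

Calling `build_issue` with two hypothetical files such as `atpm-screen.tex` and `atpm-print.tex` would produce the screen and print PDFs in one step, with no manual pass afterward to add links or bookmarks.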

My ATPM holy grail is to be able to edit and assemble the articles for an ATPM issue, press a button, and have all four formats automatically generated. With this new publishing system, we are finally close to this goal. Since it’s programmed using lower-level tools rather than off-the-shelf applications, it’s possible to get the new system to do exactly what we want (subject to the limitations of my LaTeX skills), and easier to automate it. In 1996, I never would have expected that ATPM would someday be published in this way, but Mac OS X opened the doors for this new system, even as it hastened the end of the old FrameMaker-based one.

Finally, it occurs to me that nearly all of ATPM is now produced using BBEdit, the flagship product of our current sponsor, Bare Bones Software. BBEdit provides spell-checking and various other tools for editing ATPM. Its function pop-up lets me navigate the articles by section, and its text formatting tools make it easy to convert our submissions into reStructuredText format. But BBEdit is also a programmer’s editor, and so it was essential in writing the Python, Perl, and Make code that comprises the new publishing system, in debugging the HTML and LaTeX files that said code generated, in writing the content management system that combines the HTML files into the ATPM Web site, and in helping to store all the text and code files safely using the Subversion version control system. I suppose every Mac user has one primary application, be it a mail program, an outliner, Photoshop, or Microsoft Word. For me, all roads lead to BBEdit.

Reader Comments (12)

Andrei Popov · July 3, 2005 - 02:30 EST #1
Has your choice of reStructuredText been mostly driven by a desire not to mess with XML? You could, using, say, XXE + DocBook + XSL stylesheets, produce all the various HTML/PS/PDF/RTF flavors of ATPM. True, the stylesheets could require some tweaking, but I would think that overall control over document appearance could be higher than with reStructuredText, closer to that achieved with FrameMaker-SGML.
Michael Tsai (ATPM Staff) · July 3, 2005 - 10:57 EST #2
Andrei: It's true that XML would probably be a bit more flexible. However, it's much easier to get our content into reStructuredText format than into XML. reStructuredText is also clean enough that the content can then be edited directly, without first rendering it into a readable format. These are the activities that will continue to happen each month, so they're the ones that I want to make as efficient as possible.

I don't think DocBook is a very good match for ATPM. In fact, I tried using it when we first started using FrameMaker. It was missing a few elements that we wanted, and the rest of it is mostly overkill. ATPM content is pretty simple, requiring only about four more directives beyond what ReST natively provides. If we ever do want to go to XML, I think it would be relatively simple to build, say, a DocBook writer for Docutils, and this would let us keep using the ReST front-end.

In terms of tools, the XML+XSL+FO system is much more complicated than ReST+LaTeX. I was able to get our core system working in a matter of hours. It was actually quicker than setting up the Element Definition Documents for FrameMaker. With the XML tools, I would have had to read about XSL and FO, and even then I think it would have required writing more code. That would be OK if it provided significant benefits, but in this case it doesn't. The XML is in some sense more elegant, because it provides better separation of concerns. But for relatively simple documents like ATPM, it's quicker (and, I think, clearer) to walk the node tree in Python to gather context, and then just generate the LaTeX code that we want, rather than rely on extra language machinery.
Mavan Atapattu · July 3, 2005 - 14:27 EST #3
What on earth makes you think that LaTeX is "somewhat arcane"?!
Maarten Sneep · July 3, 2005 - 16:30 EST #4
In response to "arcane": TeX is a full programming language, but with some rules that can certainly cause surprising results to the uninitiated. If TeX were to be designed today, it a) would never get off the ground in the first place - way too much time, and way too complicated and b) the programming side would probably resemble a macro/programming language like Python or Ruby much more closely.

It is funny though: the XML/FO engines that produce the best results all seem to use TeX as their engine. Donald Knuth must have done something right ;-)

Michael: what are the things you couldn't get done? There is a rather active LaTeX Macintosh user community, and we may be able to offer some suggestions. Start at http://www.esm.psu.edu/mac-tex/ and the mailing list hosted there.
Michael Tsai (ATPM Staff) · July 3, 2005 - 17:40 EST #5
Mavan: Perhaps I should have omitted the word "somewhat." Seriously. I like LaTeX, and I've used it on a wide range of documents, with and without lots of math, over the past eight or so years. I've read probably six books about it, and I've often helped people debug their LaTeX documents. In short, I know enough about LaTeX to know how little I truly know about it. If you stay on the beaten path, it's quite simple, but if you veer off it you generally need to reduce your expectations or find a wizard. And package conflicts bring back memories of OS 9 extension conflicts. :-)

Maarten: I had a limited amount of time, so some of the following are probably things that I could figure out how to do, but they are nevertheless open issues.
  • Adjust the size of the hanging indent for the blue review info list environment. No matter which length I changed, it always seemed to have the side effect of throwing another parameter out of whack.
  • Make runs of hyperlinked words automatically line-wrap at word boundaries.
  • Use arbitrary PostScript fonts. Theoretically, I know how to do this, but I haven't yet had the time or stomach to generate the metric files, put them in the right places, etc.
  • Switch in and out of two-column mode multiple times without introducing any page breaks. Span floating figures across two columns. Get good automatic placement of floating figures, both spanning and non-spanning, when in two-column mode.
  • Wrap text around images. I could get this to work in some cases, but in others I couldn't get the right vertical positioning, and sometimes the text would completely fail to wrap as though the image weren't there.
  • Automatically wrap long URLs in footnotes.
  • Let a paragraph of colored text continue from one page to the next without having the color disappear after the pagebreak.
  • Achieve the layout and PDF bookmarks that we currently have using the built-in sectioning commands. I had to manually format the text and add it to the table of contents in order to avoid various weird problems when the section names contained hyperlinks.
Andrei Popov · July 4, 2005 - 23:30 EST #6
Michael -- but you don't necessarily need to do all the editing in XML. I agree that working with a plain text setup like ReST (or Markdown, or Textile) is easier, but you're losing the benefit of semantic mark-up and structuring your content in a way that could facilitate further processing.

At the end of the day, though, it really matters little -- as long as ATPM gets released :)
Michael Tsai (ATPM Staff) · July 5, 2005 - 08:09 EST #7
Andrei: What good XML editors are there that hide the tags--aside from FrameMaker? :-) As to semantic markup, I think ReST is much more capable than Markdown or Textile in that regard. With a bit more work, I think I could set up a structure that's isomorphic to our FrameMaker EDD, because ReST handles custom block elements as well as custom inline elements. The main limitation, as I see it, is that it doesn't handle nested inline elements, but I don't think ATPM has ever used those.
Andrei Popov · July 5, 2005 - 22:57 EST #8
Michael: Have you seen XXE from the link in my first post? It does not really hide XML/SGML the way FrameMaker does, but comes *very* close to it.
Andrei Popov · July 5, 2005 - 23:00 EST #9
To add: in case screenshots are a bit misleading -- it's written in Java, hence would work on a Mac just as well as on a Wintel box.
Michael Tsai (ATPM Staff) · July 6, 2005 - 07:45 EST #10
Andrei: I haven't used it, but I think I'd prefer editing the tags directly or running FrameMaker in Virtual PC to using a Java app.
Andrei Popov · July 6, 2005 - 13:04 EST #11
While this is probably not worth a post on the site -- may I inquire why? Not that I've got any interest in XML Mind as a company, but as a person that tried XXE out I can say that it is a very well-designed and easy-to-use application. Sitting on a dial-up link I don't feel like pulling it off the site and checking on my PB how well it behaves, yet I'd doubt it would do so badly.

Though at the end of the day -- personal choice is personal choice :)

[Feel free to carry this on over e-mail if desired, I am marking this last comment as 'E-mail me']
Michael Tsai (ATPM Staff) · July 7, 2005 - 09:39 EST #12
I just don't think XXE feels at all like a Mac app. At least FrameMaker on Windows would still be FrameMaker.
