[Novalug] Replacement for abiword?? -- lightweight markup languages

Michael Henry novalug at drmikehenry.com
Mon Jan 22 09:10:52 EST 2007


Beartooth wrote:
 >
 >     With apologies for the bandwidth, I now have a short list of likely
 > apps which I can put onto Jo's machine yet tonight or tomorrow, and let
 > her get started choosing between.
 >
 >     Many many thanks to all!

There's another approach that I didn't see in the thread (my apologies
if someone has already mentioned this).  You can use one of the
lightweight markup languages to write your text, then convert the markup
to another format as desired.  Markdown[1] and reStructuredText[2] come
to mind first, though there are probably other worthy choices.

One of the benefits of using a markup language[3] is that you write your
documents in plain text[4].  For your application, nearly all of the
content of your document would be simple text, just like you see in this
email message - simply plain English words, typed into your document
just like you'd type on a typewriter.

Plain text, by itself, has almost no direct support for "formatting"
your document.  Changing fonts, making text bold or italic, and other
issues of typesetting require some extension to the "typewriter" model
of document entry.  Loosely speaking, there are two main ways to
accomplish the goal of adding formatting to your typed document.

In the "word processor" model, you type your document into a specialized
word processing program.  When you want to change the look of a
particular piece of text, you highlight it and select a menu option that
applies some formatting (e.g., making a word bold or italic).  Because
this formatting is a form of metadata[5], something special must be
done to store the formatting along with the text of your document.
Often, your text and the formatting metadata are combined and stored in
a "custom" format in a (typically binary) file.  This custom format is
frequently specific to the particular word processing program, and is
therefore not usable directly by other word processing programs (though
there are efforts to standardize on some open formats such as the Open
Document Format[6]).  Because of the custom file format, the tools
available for manipulating the document are limited.

In the "markup" model, the writer intermingles his own prose with
formatting metadata.  Both the prose and the metadata are written in
plain text, following some structured rules defined by the markup
language.  For example, in an email it is conventional to emphasize a
word by surrounding it with *asterisks*.  To a human reading the email
in plain text format, the asterisks do serve to make the emphasized word
visually stand out.  If the author takes care to follow the rules of the
markup language, it is also possible for a program to interpret the
marked up text.  Such a program could convert the document into any
number of other representations, such as the aforementioned custom word
processor document format, or into HTML for publication to the web, or
into a PDF document, etc.  If you'd like to try out a markup language to
see how well it works for you, you can try the Markdown dingus[7].  It
lets you type some Markdown-formatted text into a web form and click a
button to see the resulting formatted output.
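
As a rough illustration of how a program might interpret the asterisk
convention described above, here is a toy Python sketch.  It is not
Markdown or reStructuredText, only a hand-rolled example handling one
rule; the function name emphasis_to_html and the regular expression are
made up for this illustration.

```python
import html
import re

# Hypothetical rule for this sketch: text wrapped in single asterisks,
# with no space just inside either asterisk, counts as emphasis.
EMPHASIS = re.compile(r"\*([^\s*](?:[^*]*[^\s*])?)\*")


def emphasis_to_html(text):
    """Convert *emphasized* spans in plain text to HTML <em> elements."""
    # Escape &, < and > first so ordinary prose cannot inject HTML tags.
    escaped = html.escape(text, quote=False)
    return EMPHASIS.sub(r"<em>\1</em>", escaped)


print(emphasis_to_html("Plain text is *universal* and easy to read."))
```

The real converters apply many such rules (headings, lists, links, and
so on), but each one amounts to the same idea: recognize a plain text
pattern and emit the matching output format.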

There are several advantages to using a plain text markup language.
Plain text is universal.  There will always be tools for reading plain
text files, so your documents are "future proof".  There are hundreds
(thousands?) of "plain text" editors[8][9] that are designed for
capturing your keystrokes and generating a plain text file.  You can
learn the rules of the markup language once, and apply those rules using
any text editor on any platform from now on.  You can search your files
easily using standard tools like grep.
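
grep itself is the tool to reach for here.  Purely to show how little
machinery plain text needs, a rough Python equivalent of a recursive
search might look like the sketch below.  The function name
search_text_files and the ".txt" suffix default are assumptions made
for this example.

```python
import re
from pathlib import Path


def search_text_files(root, pattern, suffix=".txt"):
    """Yield (path, line_number, line) for each line matching pattern."""
    regex = re.compile(pattern)
    for path in sorted(Path(root).rglob("*" + suffix)):
        if not path.is_file():
            continue  # skip directories that happen to end in the suffix
        # errors="replace" keeps the search going past undecodable bytes.
        with path.open(encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if regex.search(line):
                    yield path, number, line.rstrip("\n")
```

Calling list(search_text_files("docs", r"typewriter")) would collect
every matching line under a (hypothetical) "docs" directory.  Nothing
comparable works on a binary word processor file without first
decoding its custom format.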

For your purposes, I'd look toward lightweight markup languages such as
Markdown and reStructuredText.  Documents employing these languages are
easy to read and write in their marked-up plain text format.  Since you
are not planning to typeset mathematics, enter tables, or perform other
fancy formatting tricks, you probably would have all the power you need
without needing to traverse the steep learning curve of fancier markup
languages.

By the way, HTML is itself a markup language (that's the "ML" in
"HTML").  So is XML.  Both TeX (Donald Knuth's brilliant typesetting
package) and LaTeX (Leslie Lamport's macro package that extends TeX)
are markup languages of great power and commensurately steep learning
curve.  These other markup languages all have their place, and you may
decide to learn them as well.

But given the gentle learning curve of Markdown and reStructuredText,
and the fact that you can later convert them into HTML or another
format, you might just want to set up Jo with an arbitrarily chosen
text editor and set her to work typing up her prose.  You can always go
back and add formatting markup later (anything from lightweight
formatting to whole-hog TeX/LaTeX).  In fact, if you decide in the end
that you want to go the word processing route, you can always simply
open your text document and begin adding formatting using the word
processor's custom formatting options.  Conversely, if you've already
selected a word processor as the method of text entry, you can simply
save the document in a plain text format so that you can still use a
markup language for formatting if you so desire.

Good luck with your decision,
Michael Henry


[1]: http://daringfireball.net/projects/markdown/
[2]: http://docutils.sourceforge.net/rst.html
[3]: http://en.wikipedia.org/wiki/Markup_language
[4]: http://www.bellevuelinux.org/plain_text.html
[5]: http://en.wikipedia.org/wiki/Metadata
[6]: http://en.wikipedia.org/wiki/OpenDocument
[7]: http://daringfireball.net/projects/markdown/dingus
[8]: http://kate-editor.org/
[9]: http://www.gnome.org/projects/gedit/



