mldown is a modular text formatter (like markdown) that accepts several input syntaxes, performs complex tasks on the abstract representation (such as generating a table of contents), and outputs the result in different languages (XHTML and LaTeX are currently supported).

1 — What is mldown ?

mldown is very similar to markdown: it is built to parse and analyse natural syntaxes, which have two goals:

Unlike markdown and similar tools, mldown can take different syntaxes as input (for now markdown and mdown2) and extend them with small syntactic extensions (footnotes, unicode support, and so on). mldown also lets you build complex transformations on the abstract tree, for instance to extract information from the document or to improve the way the document is printed.

mldown is written in OCaml.

2 — Getting mldown

mldown is hosted on Gitorious. To compile, just type make; all you need is OCaml >= 3.11.

3 — mldown documents

To see what an mldown document actually looks like, read the source of this document.

3.1 — Module loading

At the beginning of a document, mldown lets you load several extensions (or modules) and pass some options to tweak the way they behave. You must load at least one syntax, either mdown2 or markdown. If you do not load any syntax, the parser will have no rules and will fail on your input.

To load a module, write the following (at the beginning of the file only):

%& <name-of-the-module> <option1>, <option2>

For instance, to load the footnotes module and use Greek letters as indices for the footnotes, use:

%& footnotes transformer.footnotes.greek

The section Modules explains how modules work. When loading syntax extensions, pay attention to their order, as they may conflict. Here is a general rule: load small extensions first and big ones afterwards, as small ones often partially override constructions of the big ones.
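
To make the shape of a loading line concrete, here is a minimal sketch of how such a line could be split into a module name and its options. The function parse_load_line is hypothetical and is not part of mldown; it only assumes the "%& name option1, option2" pattern shown above, with options separated by commas.

```ocaml
(* Hypothetical helper, not part of mldown: split a "%& name opt1, opt2"
   line into the module name and the list of its options. *)
let parse_load_line (line : string) : (string * string list) option =
  let prefix = "%& " in
  let plen = String.length prefix in
  if String.length line < plen || String.sub line 0 plen <> prefix then None
  else
    let rest = String.trim (String.sub line plen (String.length line - plen)) in
    if rest = "" then None
    else
      match String.index_opt rest ' ' with
      | None -> Some (rest, [])            (* module loaded without options *)
      | Some i ->
        let name = String.sub rest 0 i in
        let opts =
          String.sub rest (i + 1) (String.length rest - i - 1)
          |> String.split_on_char ','      (* assumption: comma-separated *)
          |> List.map String.trim
          |> List.filter (fun o -> o <> "")
        in
        Some (name, opts)
```

Lines that do not start with "%& " yield None, which a reader could use to detect the end of the module-loading header.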

3.2 — Metadata

mldown lets you store metadata in your file, so that you can easily extract it later. The metadata must directly follow the module-loading lines (with no empty lines or lines not matching the pattern) and looks as follows:

%? title
%? author
%? date

Of course, they are not all mandatory: you can specify only the title if you like.

It is also possible to use named metadata, if you want to store complex metadata:

%! tags OCaml, text-formatting

Then your document will have the metadata tags with the value OCaml, text-formatting. You can use either the mldown program or the mldown library to retrieve metadata. You can even use a simple shell script if you like!
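
As an illustration of how little is needed to retrieve metadata, here is a rough sketch of a header reader. It is not mldown's own code: read_metadata is a hypothetical name, and it assumes that the "%?" lines are positional (title, then author, then date) and that the value of a "%!" line is everything after the key.

```ocaml
(* Assumed names of the positional "%?" lines, in order of appearance. *)
let positional_names = [ "title"; "author"; "date" ]

let has_prefix p s =
  String.length s >= String.length p && String.sub s 0 (String.length p) = p

(* Drop the first [n] characters and trim surrounding blanks. *)
let drop n s = String.trim (String.sub s n (String.length s - n))

(* Read metadata lines until the first line that is neither "%?" nor "%!". *)
let read_metadata (lines : string list) : (string * string) list =
  let rec go names acc = function
    | l :: rest when has_prefix "%? " l ->
      (match names with
       | n :: names' -> go names' ((n, drop 3 l) :: acc) rest
       | [] -> go [] acc rest)             (* extra positional lines ignored *)
    | l :: rest when has_prefix "%! " l ->
      let body = drop 3 l in
      (match String.index_opt body ' ' with
       | Some i ->
         let key = String.sub body 0 i in
         let value =
           String.trim (String.sub body (i + 1) (String.length body - i - 1))
         in
         go names ((key, value) :: acc) rest
       | None -> go names ((body, "") :: acc) rest)
    | _ -> List.rev acc
  in
  go positional_names [] lines
```

The result is an association list, so a single field can be fetched with List.assoc_opt "title".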

3.3 — Blocks and inline markup

After the metadata, the real document starts. The document is split into blocks, in which inline markup appears.

The semantics of blocks are quite complex, as the parser gives you a lot of freedom in defining your blocks. General principles:

Blocks are like those in HTML: paragraphs, lists, quotations, code, and so on. They contain inline markup (for most of them at least): links, emphasis, etc.

4 — Modules

4.1 — Kinds of modules

For now, there are five types of modules:

Each module has some metadata, including:

Blacklisting modules allows backends to disable modules that provide abstract ways to do what the backend already does well (such as tables of contents and footnotes for LaTeX).
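
The effect of blacklisting can be pictured with a toy model. The sketch below is not the mldown library: the context record and both functions are hypothetical, and it assumes that loading a blacklisted module is silently ignored, which may differ from what mldown really does.

```ocaml
(* Toy model of a context: blacklisted extensions and loaded modules. *)
type context = {
  blacklisted : (string * string) list;  (* extension name, reason *)
  loaded : string list;                  (* module names, in loading order *)
}

let empty = { blacklisted = []; loaded = [] }

(* Record that [extension] must not be loaded, and why. *)
let blacklist ctx ~extension ~reason =
  { ctx with blacklisted = (extension, reason) :: ctx.blacklisted }

(* Assumption: a blacklisted module is skipped rather than reported. *)
let load ctx name =
  if List.mem_assoc name ctx.blacklisted then ctx
  else { ctx with loaded = ctx.loaded @ [ name ] }
```

With this model, a backend that blacklists transformer.contents before the document is read keeps that transformer out of the context even if the document asks for it.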

4.2 — Option passing

You can give modules some options when loading them. A module can be loaded several times, in which case the options are concatenated. When loading a module, you can specify options for the module itself or for the modules it autoloads. In the former case you don't need the prefix (<type>.<name>), but in the latter you do:

%& footnotes transformer.footnotes.greek

because footnotes autoloads transformer.footnotes, but

%& contents number-heading
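
The two rules above (concatenation on repeated loads, and prefixed options for autoloaded modules) can be sketched as follows. Both functions are hypothetical and only model the behaviour described in this section; they are not taken from mldown.

```ocaml
(* Loading a module again appends the new options to the earlier ones. *)
let load_with_options (name : string) (opts : string list)
    (loaded : (string * string list) list) : (string * string list) list =
  if List.mem_assoc name loaded then
    List.map (fun (n, o) -> if n = name then (n, o @ opts) else (n, o)) loaded
  else loaded @ [ (name, opts) ]

(* Decide which module an option is meant for. [autoloads] lists the
   modules autoloaded by the module being loaded. Returns the target
   (None for the module itself) and the option without its prefix. *)
let route_option (autoloads : string list) (opt : string)
    : string option * string =
  let is_prefixed_by m =
    let p = m ^ "." in
    String.length opt > String.length p
    && String.sub opt 0 (String.length p) = p
  in
  match List.find_opt is_prefixed_by autoloads with
  | Some m ->
    let n = String.length m + 1 in
    (Some m, String.sub opt n (String.length opt - n))
  | None -> (None, opt)
```

For example, with autoloads set to ["syntax.footnotes"; "transformer.footnotes"], the option transformer.footnotes.greek is routed to transformer.footnotes as greek, while an unprefixed option stays with the module being loaded.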

5 — mldown's builtin extensions

Here is a list of the builtin extensions in mldown, each with a short description. Note that you can get this list for your version of mldown with the command:

mldown --list-extensions

5.1 — Syntax extensions

5.1.1 — macros

Provides a means to insert the result of a given command into the document. This adds two constructions:

The output of the command is itself parsed, so a command can generate mldown code that is then integrated into the document.

5.1.2 — reference

Provides indirect links à la markdown. Adds the following constructions:

- [text][link] ([text][] is a shorthand for [text][text]);
- ![image][link];
- reference definition:

  [reference name]: http://link.com "Optional title"
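
To show how the pieces fit together, here is a small sketch that reads a reference definition and applies the [text][] shorthand. The names definition, parse_definition and reference_label are hypothetical; the code only assumes the syntax shown above and is much simpler than a real markdown parser (for instance, it assumes the URL contains no spaces).

```ocaml
type definition = { url : string; title : string option }

(* Parse a line such as: [reference name]: http://link.com "Optional title" *)
let parse_definition (line : string) : (string * definition) option =
  let line = String.trim line in
  let n = String.length line in
  if n = 0 || line.[0] <> '[' then None
  else
    match String.index_opt line ']' with
    | Some i when i + 1 < n && line.[i + 1] = ':' ->
      let name = String.sub line 1 (i - 1) in
      let rest = String.trim (String.sub line (i + 2) (n - i - 2)) in
      (match String.index_opt rest ' ' with
       | None ->
         if rest = "" then None else Some (name, { url = rest; title = None })
       | Some j ->
         let url = String.sub rest 0 j in
         let t =
           String.trim (String.sub rest (j + 1) (String.length rest - j - 1))
         in
         let m = String.length t in
         (* The title is kept only when it is enclosed in double quotes. *)
         let title =
           if m >= 2 && t.[0] = '"' && t.[m - 1] = '"'
           then Some (String.sub t 1 (m - 2))
           else None
         in
         Some (name, { url; title }))
    | _ -> None

(* [text][] is a shorthand for [text][text]. *)
let reference_label ~text ~link = if link = "" then text else link
```

A link can then be resolved by looking up reference_label ~text ~link in the list of parsed definitions with List.assoc_opt.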

5.1.3 — footnotes

Provides syntactic support for footnotes: {^My footnote} (inline)

5.1.4 — expsub

Provides syntactic support for superscripts and subscripts: ^{My superscript} and _{My subscript}

5.2 — transformer

5.2.1 — contents

Generates a table of contents in a div named Contents. The table of contents is added after the existing contents of the div. You can define the depth of the table of contents:

:Contents[depth_max=3]
   Contents

will only show headings with a level up to three (you can also use depth_min).
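
The depth options amount to a filter on heading levels. The following sketch illustrates the idea with a hypothetical heading type; it assumes both bounds are inclusive and that depth_min defaults to 1, which the text above does not state.

```ocaml
type heading = { level : int; title : string }

(* Keep the headings whose level lies between depth_min and depth_max.
   Assumptions: inclusive bounds, depth_min defaults to 1. *)
let select_headings ?(depth_min = 1) ?(depth_max = max_int)
    (headings : heading list) : heading list =
  List.filter
    (fun h -> h.level >= depth_min && h.level <= depth_max)
    headings
```

Calling select_headings ~depth_max:3 on the headings of a document mirrors the Contents[depth_max=3] example.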

5.2.2 — footnotes

Transforms footnotes using hyperlinks. You can use Greek letters to denote the footnotes (greek option), and you can specify the level at which footnotes are flushed (level=(none|section|subsection|subsubsection)). By default, footnotes are flushed at the end of the document. In any case, you can manually flush footnotes with an empty div called flushFootnotes.
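
As a rough illustration of the greek option, the sketch below maps a footnote number to its mark. The function footnote_mark is hypothetical, footnotes are assumed to be numbered from 1, and the fallback to plain numbers after the 24th footnote is an assumption, not documented mldown behaviour.

```ocaml
let greek_letters =
  [| "α"; "β"; "γ"; "δ"; "ε"; "ζ"; "η"; "θ"; "ι"; "κ"; "λ"; "μ";
     "ν"; "ξ"; "ο"; "π"; "ρ"; "σ"; "τ"; "υ"; "φ"; "χ"; "ψ"; "ω" |]

(* Mark used for the n-th footnote (counting from 1).
   Assumption: once the alphabet is exhausted, fall back to numbers. *)
let footnote_mark ~(greek : bool) (n : int) : string =
  if greek && n >= 1 && n <= Array.length greek_letters
  then greek_letters.(n - 1)
  else string_of_int n
```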

5.2.3 — unicode

This extension replaces some sequences with their Unicode equivalents. The transformations happen in two contexts: inline text and code. The extension loads transformations from a file (file option) with the following syntax:

 regexp replacement
 ...

type:name
 regexp replacement
 ...

The file is split into sections, each defined by a type (code or text) and a name. The first section has no name and is used in every text (it corresponds to text:). Likewise, code: is used in every code block, whereas code:language is used in every code block of language language. text:dict is used if the dictionary dict is loaded (with the option dict). Here is a small example:

 \.\.\. …
 \([^a-z-]\)---\([^a-zA-Z^]\) \1—\2
code:
 -> →

code:ocaml
 'a α
 'b β
 'c γ
 'd δ 
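
The section rules described above can be sketched as a small reader for such a file. This is only an illustration under stated assumptions: the names section, parse_rules and applicable are hypothetical, rule lines are assumed to start with a space, and the regexp is assumed to end at the first space. Applying the regexps themselves is left out.

```ocaml
type section = {
  kind : string;                    (* "text" or "code" *)
  name : string;                    (* "" for the unnamed section *)
  rules : (string * string) list;   (* regexp, replacement *)
}

let parse_rules (lines : string list) : section list =
  let close cur acc = { cur with rules = List.rev cur.rules } :: acc in
  let rec go cur acc = function
    | [] -> List.rev (close cur acc)
    | l :: rest ->
      if String.trim l = "" then go cur acc rest
      else if l.[0] = ' ' then begin
        (* Rule line: " regexp replacement". *)
        let r = String.trim l in
        match String.index_opt r ' ' with
        | Some i ->
          let re = String.sub r 0 i in
          let rep =
            String.trim (String.sub r (i + 1) (String.length r - i - 1))
          in
          go { cur with rules = (re, rep) :: cur.rules } acc rest
        | None -> go cur acc rest    (* malformed rule: skipped *)
      end
      else begin
        (* Header line: "type:name" opens a new section. *)
        let kind, name =
          match String.index_opt l ':' with
          | Some i ->
            ( String.sub l 0 i,
              String.trim (String.sub l (i + 1) (String.length l - i - 1)) )
          | None -> (String.trim l, "")
        in
        go { kind; name; rules = [] } (close cur acc) rest
      end
  in
  (* The first, unnamed section corresponds to "text:". *)
  go { kind = "text"; name = ""; rules = [] } [] lines

(* Sections that apply to a context: the unnamed one of the right kind,
   plus those named after the current language or a loaded dictionary. *)
let applicable ~kind ~names (sections : section list) : section list =
  List.filter
    (fun s -> s.kind = kind && (s.name = "" || List.mem s.name names))
    sections
```

For OCaml code, applicable ~kind:"code" ~names:["ocaml"] would select both the code: section and the code:ocaml section of the example file.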

5.3 — backend

5.3.1 — xhtml

Outputs XHTML 1.1. Provides syntax highlighting thanks to Pygments.

5.3.2 — xml-dvp

Outputs the XML format used by développez. It is not kept up to date and is not currently used, but it can be updated on demand.

5.3.3 — zcode and zminituto

Outputs the zCode and zTuto formats from Site du Zéro.

5.4 — preprocessor

5.4.1 — standard

Just removes commented lines (that is, the ones starting with %).
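
The behaviour described above fits in a one-line filter. The sketch below is an illustration, not the module's source: strip_comments is a hypothetical name, and it assumes that only lines whose very first character is % count as comments.

```ocaml
(* Drop every line whose first character is '%'; keep empty lines. *)
let strip_comments (lines : string list) : string list =
  List.filter (fun l -> String.length l = 0 || l.[0] <> '%') lines
```

Note that module-loading and metadata lines also start with %, so a filter like this presumably runs once the header has been read.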

5.5 — At the toplevel

5.5.1 — footnotes

Autoloads syntax.footnotes and transformer.footnotes

5.5.2 — reference

Autoloads syntax.reference and transformer.reference

5.5.3 — markdown

This is an implementation of markdown, flaws included, that tends to be as compatible as possible.

5.5.4 — mdown2

This is a (partial) implementation of mdown2

6 — The mldown program

For a basic manual, run mldown --help.

You can append and prepend a file to the output with the options -H and -F. These files can use the document metadata by enclosing a metadata name in ${variable} (it is a simple substitution). You can require a metadata with meta and load or blacklist modules from the command line. There is a shortcut for --load backend.xhtml: -f xhtml.
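
The ${variable} substitution can be reproduced with the OCaml standard library. The sketch below uses Buffer.add_substitute; the function substitute is a hypothetical name, and expanding unknown variables to the empty string is an assumption. Note that Buffer.add_substitute is slightly broader than the syntax described here, since it also accepts $variable and $(variable).

```ocaml
(* Replace each ${name} in [template] by its value in [meta]. *)
let substitute (meta : (string * string) list) (template : string) : string =
  let buf = Buffer.create (String.length template) in
  let lookup name =
    match List.assoc_opt name meta with
    | Some value -> value
    | None -> ""   (* assumption: unknown variables expand to nothing *)
  in
  Buffer.add_substitute buf lookup template;
  Buffer.contents buf
```

A header file containing <title>${title}</title> would then be filled in from the title metadata of the document.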

7 — The mldown library

You can use the mldown library to embed mldown in your own programs. The documentation is available here.

To use mldown, you need to create a context. This holds the loaded modules, the parser rules, and so on. In this context, you load or blacklist modules, and you can then parse the lines.

For instance, you could use this function to compile comments for a blog:

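(* Compile blog comments. with_open_out_file is assumed to be a helper
   defined elsewhere. *)
let mldown_compile ~out ~lines ~meta =
  let context = Context.empty () in
  (* Disable macros, which insert the result of a command into the document. *)
  let context = Context.blacklist context
    ~extension:"syntax.macros" ~reason:"blog engine" in
  (* Load the backend and the preprocessor, then the modules of the document. *)
  let context = Context.load_many
    (("backend.xhtml", "") :: ("preprocessor.standard", "") :: meta.Reader.modules) context in
  let context = Context.initialize context in
  (* Preprocess and parse the lines, then write the rendered tree to out. *)
  let lines = context.Context.preprocess lines in
  let tree = Reader.parse_lines context lines in
  with_open_out_file (context.Context.output tree) out;
  tree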
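
The function above relies on with_open_out_file, which is not shown. One possible definition is sketched here, assuming that out is a file path and that context.Context.output tree is a function expecting an output channel; both are guesses about types this document does not spell out.

```ocaml
(* Open [path] for writing, run [write] on the channel, and close the
   channel whether or not [write] raises an exception. *)
let with_open_out_file (write : out_channel -> unit) (path : string) : unit =
  let oc = open_out path in
  match write oc with
  | () -> close_out oc
  | exception e ->
    close_out_noerr oc;
    raise e
```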
