mldown is a modular text formatter (like markdown) that takes several syntaxes as input, performs complex tasks on the abstract representation (such as generating a table of contents) and outputs the result in different languages (XHTML and LaTeX are currently supported).
1 — What is mldown ?
mldown is very similar to markdown : it is built to parse and analyse natural syntaxes, which have two goals :
- to be as concise as possible : we don't want to have to type painful html tags or latex commands ;
- to be designed in a way that keeps source files readable as is in a simple text editor.
Unlike markdown and similar tools, mldown can take different syntaxes as input (for now markdown and mdown2) and extend them with small syntactic extensions (footnotes, unicode support, and so on). mldown also lets you build complex transformations on the abstract tree, for instance to extract information from the document or to improve the way the document is printed.
mldown is written in OCaml.
2 — Getting mldown
mldown is on gitorious. To compile it, just type make ; all you need is ocaml >= 3.11.
3 — mldown documents
To see what a mldown document actually looks like, you can read the source of this document.
3.1 — Module loading
At the beginning of a document, mldown lets you load several extensions (or modules) and pass them options to tweak the way they behave. You must load at least one syntax, mdown2 or markdown. If you do not load any syntax, the parser will have no rules and will fail on your input.
To load a module, write the following line (at the beginning of the file only) :
%& <name-of-the-module> <option1>, <option2>
For instance, to load the footnotes module and use greek letters as footnote indexes, use :
%& footnotes transformer.footnotes.greek
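To make the shape of a loading line concrete, here is a minimal sketch that splits such a line into a module name and a list of options. The function parse_load_line is hypothetical and is not part of the mldown library; it only assumes the "%& <name> <option1>, <option2>" layout shown above.

```ocaml
(* Hypothetical helper, not part of the mldown API: split a loading line
   "%& <module> <option1>, <option2>" into the module name and its options. *)
let parse_load_line (line : string) : (string * string list) option =
  let prefix = "%& " in
  let plen = String.length prefix in
  if String.length line < plen || String.sub line 0 plen <> prefix then None
  else
    let rest = String.trim (String.sub line plen (String.length line - plen)) in
    if rest = "" then None
    else
      match String.index_opt rest ' ' with
      | None -> Some (rest, [])  (* a module name with no option *)
      | Some i ->
        let name = String.sub rest 0 i in
        let opts =
          String.sub rest (i + 1) (String.length rest - i - 1)
          |> String.split_on_char ','
          |> List.map String.trim
          |> List.filter (fun o -> o <> "")
        in
        Some (name, opts)
```

Lines that do not start with "%& " yield None, which is one way to detect the end of the loading section.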
The section Modules explains how modules work. When loading syntax extensions, pay attention to their order, as they may conflict. Here's a general rule : load small extensions first and big ones afterwards, as small ones often partially override constructions of the big ones.
3.2 — Metadata
mldown lets you store metadata in your file, so that you can easily extract it afterwards. The metadata must directly follow the loading of modules (with no empty line or any line not matching the pattern) and looks as follows :
%? title
%? author
%? date
Of course, they are not all mandatory : you can specify only the title if you like.
It is also possible to use named metadata, if you want to store complex metadata:
%! tags OCaml, text-formatting
Then your document will have the metadata tags with the value OCaml, text-formatting. You can use either the mldown program or the mldown library to retrieve metadata. You can even use a simple shell script if you like!
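As an illustration of how simple this format is to read, the following sketch collects the metadata lines found at the top of a list of lines. It is not the mldown library: the metadata type and the read_metadata function are hypothetical, and the sketch assumes that each entry sits on its own line, that the module loading lines have already been consumed, and that a named entry is a name followed by a space and its value.

```ocaml
(* Hypothetical types and helpers, not part of the mldown API. *)
type metadata = {
  positional : string list;        (* values of "%? ..." lines, in file order *)
  named : (string * string) list;  (* (name, value) pairs of "%! ..." lines *)
}

(* Return the trimmed text after [prefix] if [line] starts with it. *)
let strip_prefix prefix line =
  let n = String.length prefix in
  if String.length line >= n && String.sub line 0 n = prefix
  then Some (String.trim (String.sub line n (String.length line - n)))
  else None

(* Read metadata lines until the first line that matches neither pattern. *)
let read_metadata (lines : string list) : metadata =
  let rec go pos named = function
    | line :: rest ->
      (match strip_prefix "%? " line, strip_prefix "%! " line with
       | Some v, _ -> go (v :: pos) named rest
       | None, Some nv ->
         let name, value =
           match String.index_opt nv ' ' with
           | None -> (nv, "")
           | Some i ->
             (String.sub nv 0 i,
              String.trim (String.sub nv (i + 1) (String.length nv - i - 1)))
         in
         go pos ((name, value) :: named) rest
       | None, None -> { positional = List.rev pos; named = List.rev named })
    | [] -> { positional = List.rev pos; named = List.rev named }
  in
  go [] [] lines
```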
3.3 — Blocks and inline markup
After the metadata, the real document starts. The document is split into blocks, in which inline markup appears.
The semantics of blocks is quite complex, as the parser gives you a lot of freedom in defining your blocks. General principles :
- to separate blocks, use empty lines. This is mandatory for paragraphs : if you are in a paragraph and want to start a new block, leave a line blank.
- however, some blocks (such as lists) allow empty lines (but only one) for readability.
- blocks can often be nested and define an indentation level : it is usually fixed for a given syntax (2 for mdown2 and 4 for markdown), although it may be good to let the user choose. This means that, in a list, if you want to start a sublist, you have to indent it. Moreover, in nested structures, pay attention to empty lines, as they may break your blocks.
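The first and last principles can be sketched in a few lines. This is only an approximation and not the actual parser : split_blocks treats every empty line as a separator (so it ignores blocks that tolerate one empty line), and nesting_level simply counts leading spaces. Both function names are hypothetical.

```ocaml
(* Group consecutive non-empty lines into blocks; empty lines separate blocks. *)
let split_blocks (lines : string list) : string list list =
  let flush current blocks =
    if current = [] then blocks else List.rev current :: blocks
  in
  let rec go current blocks = function
    | [] -> List.rev (flush current blocks)
    | line :: rest ->
      if String.trim line = "" then go [] (flush current blocks) rest
      else go (line :: current) blocks rest
  in
  go [] [] lines

(* Nesting depth of a line for a syntax that indents by [indent] spaces
   (2 for mdown2, 4 for markdown according to the text above). *)
let nesting_level ~indent line =
  let n = String.length line in
  let rec count i = if i < n && line.[i] = ' ' then count (i + 1) else i in
  count 0 / indent
```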
Blocks are like those of html : paragraphs, lists, quotations, code, and so on. Most of them contain inline markup : links, emphasis, etc.
4 — Modules
4.1 — Kinds of modules
There are for now five types of modules :
- syntax extensions, which add rules to the parser ;
- transformations, which operate on the AST generated by the parser ;
- backends, which output the AST ;
- preprocessors, which operate on the document's lines ;
- virtual extensions, which load several extensions.
Each module has some metadata, including :
- name ;
- short description ;
- options ;
- the modules it autoloads ;
- the modules it blacklists.
Blacklisting lets a backend disable modules that provide an abstract way to do what the backend already does well (such as table of contents and footnotes for LaTeX).
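The metadata and the blacklisting mechanism can be pictured with the sketch below. The record module_info and the functions expand and effective_modules are hypothetical and do not come from the mldown sources; in particular, the sketch assumes that autoloading is transitive and that a blacklist removes modules wherever they were requested.

```ocaml
(* Hypothetical record mirroring the module metadata listed above. *)
type module_info = {
  name : string;
  description : string;
  options : string list;
  autoloads : string list;
  blacklists : string list;
}

(* Add [name] and everything it autoloads (transitively) to [seen].
   [seen] is kept in reverse loading order. *)
let rec expand registry seen name =
  if List.mem name seen then seen
  else
    match List.find_opt (fun m -> m.name = name) registry with
    | None -> name :: seen
    | Some m -> List.fold_left (expand registry) (name :: seen) m.autoloads

(* Expand the requested modules, then drop those blacklisted by a loaded one. *)
let effective_modules registry requested =
  let loaded = List.fold_left (expand registry) [] requested in
  let banned =
    List.concat_map
      (fun m -> if List.mem m.name loaded then m.blacklists else [])
      registry
  in
  List.filter (fun n -> not (List.mem n banned)) (List.rev loaded)
```

With a registry in which footnotes autoloads syntax.footnotes and transformer.footnotes, and a LaTeX backend blacklists transformer.footnotes, requesting both keeps the syntax extension but drops the transformer.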
4.2 — Option passing
You can give modules some options when loading them. A module can be loaded several times, in which case the options are concatenated. When loading a module, you can specify options for the module itself or for the modules it autoloads. In the former case, you don't need the prefix (<type>.<name>) but in the latter, you do :
%& footnotes transformer.footnotes.greek
because footnotes autoloads transformer.footnotes and greek is an option of the latter, but an option of the loaded module itself needs no prefix :
%& contents number-heading
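The two rules above (prefix routing and concatenation) can be sketched as follows. The functions route_option and add_option are hypothetical illustrations, not the library's implementation, and the list of autoloaded modules is passed explicitly as an assumption of the sketch.

```ocaml
(* Return [Some rest] when [s] starts with [prefix], else [None]. *)
let strip_prefix prefix s =
  let n = String.length prefix in
  if String.length s >= n && String.sub s 0 n = prefix
  then Some (String.sub s n (String.length s - n))
  else None

(* Decide which module an option belongs to: an autoloaded module if the
   option carries its "<type>.<name>." prefix, the loaded module otherwise. *)
let route_option ~autoloads loaded opt =
  let rec try_targets = function
    | [] -> (loaded, opt)
    | target :: rest ->
      (match strip_prefix (target ^ ".") opt with
       | Some o -> (target, o)
       | None -> try_targets rest)
  in
  try_targets autoloads

(* Append [opt] to the options already recorded for [name], so that loading
   a module several times concatenates its options. *)
let add_option table (name, opt) =
  let previous = try List.assoc name table with Not_found -> [] in
  (name, previous @ [opt]) :: List.remove_assoc name table
```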
5 — mldown's builtin extensions
Here is a list of the builtin extensions in mldown, with a short description. Note that you can have this list for your version of mldown with the command
mldown --list-extensions
5.1 — Syntax extensions
5.1.1 — macros
Provides a way to insert the result of a given command into the document. This adds two constructions :
- inline :
$(command)
- block (a single space before the dollar) :
 $ command
The result of the command is parsed so the command can generate mldown code that will be parsed and then integrated.
5.1.2 — reference
Provides indirect links à la markdown. Adds the following constructions :
- [text][link] ([text][] is a shorthand for [text][text]) ;
- ![image][link] ;
- reference definition :
[reference name]: http://link.com "Optional title"
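To show what a reference definition carries, here is a minimal sketch that parses one such line into a name, a URL and an optional title, together with the shorthand rule for an empty link. Both functions are hypothetical and much stricter than a real markdown parser : the sketch assumes the URL contains no space and the title is enclosed in double quotes.

```ocaml
(* Parse "[name]: url" or "[name]: url \"title\"" into (name, url, title). *)
let parse_reference (line : string) : (string * string * string option) option =
  let len = String.length line in
  if len = 0 || line.[0] <> '[' then None
  else
    match String.index_opt line ']' with
    | Some close when close + 1 < len && line.[close + 1] = ':' ->
      let name = String.sub line 1 (close - 1) in
      let rest = String.trim (String.sub line (close + 2) (len - close - 2)) in
      if rest = "" then None
      else begin
        match String.index_opt rest ' ' with
        | None -> Some (name, rest, None)
        | Some sp ->
          let url = String.sub rest 0 sp in
          let tail = String.trim (String.sub rest sp (String.length rest - sp)) in
          let n = String.length tail in
          let title =
            if n >= 2 && tail.[0] = '"' && tail.[n - 1] = '"'
            then Some (String.sub tail 1 (n - 2))
            else None
          in
          Some (name, url, title)
      end
    | _ -> None

(* [text][] is a shorthand for [text][text]. *)
let reference_key ~text ~link = if link = "" then text else link
```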
5.1.3 — footnotes
Provides syntactic support for footnotes : {^My footnote} (inline)
5.1.4 — expsub
Provides syntactic support for superscripts/subscripts : ^{My superscript} and _{My subscript}
5.2 — transformer
5.2.1 — contents
Generates a table of contents in the div named Contents : the table is added after the existing contents of the div. You can set the depth of the table of contents :
:Contents[depth_max=3] Contents
will only show headings with a level up to three (you can also use depth_min).
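The effect of the two depth options can be sketched as a simple filter over (level, title) pairs. The function toc_entries is hypothetical; the sketch assumes both bounds are inclusive and that headings are numbered from level 1.

```ocaml
(* Keep the headings whose level lies within [depth_min, depth_max],
   both bounds included (an assumption of this sketch). *)
let toc_entries ?(depth_min = 1) ?(depth_max = max_int) headings =
  List.filter
    (fun (level, _title) -> depth_min <= level && level <= depth_max)
    headings
```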
5.2.2 — footnotes
Transforms footnotes with hyperlinks. You can use greek letters to denote the footnotes (greek option) and you can specify the level at which footnotes are flushed (level=(none|section|subsection|subsubsection)). By default, footnotes are flushed at the end of the document. In any case, you can manually flush footnotes with an empty div called flushFootnotes.
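As an illustration of the greek option, the sketch below computes the label of a footnote from its index. The function footnote_label is hypothetical, and falling back to digits once the alphabet is exhausted is an assumption of the sketch, not documented behaviour.

```ocaml
(* The 24 lowercase Greek letters, as UTF-8 strings. *)
let greek = [| "α"; "β"; "γ"; "δ"; "ε"; "ζ"; "η"; "θ"; "ι"; "κ"; "λ"; "μ";
               "ν"; "ξ"; "ο"; "π"; "ρ"; "σ"; "τ"; "υ"; "φ"; "χ"; "ψ"; "ω" |]

(* Label of the [n]-th footnote (1-based). With [~greek:true] the label is
   a Greek letter; past omega, or without the option, it is a number. *)
let footnote_label ~greek:use_greek n =
  if use_greek && n >= 1 && n <= Array.length greek then greek.(n - 1)
  else string_of_int n
```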
5.2.3 — unicode
This extension replaces some sequences with their unicode equivalent. The transformations happen in two contexts : inline text and code. The extension loads its transformations from a file (file option) with the following syntax :
regexp replacement
...
type:name
regexp replacement
...
The file is split into sections defined by a type (code or text) and a name. The first section has no name and is used in every text (it corresponds to text:). Likewise, code: is used in every code whereas code:language is used in every code of language language. text:dict is used if the dictionary dict is loaded (with the option dict). Here is a small example :
\.\.\. …
\([^a-z-]\)---\([^a-zA-Z^]\) \1—\2
code:
-> →
code:ocaml
'a α
'b β
'c γ
'd δ
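The layout of such a file can be sketched with a small reader that groups the rules by section. This is not the extension's own loader : the section type and both functions are hypothetical, and the sketch assumes one entry per line, a header of the form type:name without any space, and a rule whose regexp and replacement are separated by the last space of the line. It does not compile or apply the regular expressions.

```ocaml
(* Hypothetical representation of one section of the replacement file. *)
type section = { kind : string; name : string; rules : (string * string) list }

(* Recognise a "code:name" or "text:name" header; headers contain no space. *)
let header_of line =
  if String.contains line ' ' then None
  else
    match String.index_opt line ':' with
    | Some i ->
      let kind = String.sub line 0 i in
      if kind = "code" || kind = "text"
      then Some (kind, String.sub line (i + 1) (String.length line - i - 1))
      else None
    | None -> None

(* Group the rules by section; the first, unnamed section is "text:". *)
let parse_rules lines =
  let close sec acc = { sec with rules = List.rev sec.rules } :: acc in
  let rec go sec acc = function
    | [] -> List.rev (close sec acc)
    | line :: rest ->
      (match header_of line with
       | Some (kind, name) -> go { kind; name; rules = [] } (close sec acc) rest
       | None ->
         (match String.rindex_opt line ' ' with
          | Some i ->
            let rule = (String.sub line 0 i,
                        String.sub line (i + 1) (String.length line - i - 1)) in
            go { sec with rules = rule :: sec.rules } acc rest
          | None -> go sec acc rest))  (* skip lines that are not rules *)
  in
  go { kind = "text"; name = ""; rules = [] } [] lines
```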
5.3 — backend
5.3.1 — xhtml
Outputs XHTML 1.1. Provides syntax highlighting thanks to Pygments.
5.3.2 — xml-dvp
Outputs the XML format used by développez. It is not kept up to date and is currently unused, but it can be updated on demand.
5.3.3 — zcode and zminituto
Outputs the zCode and zTuto formats from site du zéro.
5.4 — preprocessor
5.4.1 — standard
Just removes commented lines (that is, the ones starting with %).
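In terms of lines, this behaviour amounts to a filter like the one below. It is a sketch of the described behaviour rather than the preprocessor's source, and remove_comments is a hypothetical name; it assumes that only a % in the very first column marks a comment.

```ocaml
(* Drop every line whose first character is '%'; keep empty lines. *)
let remove_comments (lines : string list) : string list =
  List.filter (fun l -> String.length l = 0 || l.[0] <> '%') lines
```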
5.5 — At the toplevel
5.5.1 — footnotes
Autoloads syntax.footnotes and transformer.footnotes
5.5.2 — reference
Autoloads syntax.reference and transformer.reference
5.5.3 — markdown
This is an implementation of markdown, flaws included, that aims to be as compatible as possible.
5.5.4 — mdown2
This is a (partial) implementation of mdown2
6 — The mldown program
For basic help, run mldown --help.
You can append and prepend a file to the output with the options -H and -F. These files can use the document metadata by enclosing their names in ${variable} (it is a simple substitution). You can require a metadata with meta and load or blacklist modules from the command line. There is a shortcut for --load backend.xhtml : -f xhtml.
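The ${variable} substitution can be approximated with the OCaml standard library, as in the sketch below. The function substitute is hypothetical and is not how mldown is known to implement it; replacing unknown names with the empty string is an assumption of the sketch.

```ocaml
(* Replace each ${name} in [template] with its value in [meta]; unknown
   names become the empty string (an assumption of this sketch). *)
let substitute (meta : (string * string) list) (template : string) : string =
  let buf = Buffer.create (String.length template) in
  Buffer.add_substitute buf
    (fun name -> try List.assoc name meta with Not_found -> "")
    template;
  Buffer.contents buf
```

Note that Buffer.add_substitute also expands $name and $(name), and raises Not_found on an unterminated ${, so it is broader than the plain substitution described above.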
7 — The mldown library
You can use the mldown library to be able to use mldown inside your programs. The documentation is available here.
To use mldown, you need to create a context. It holds the loaded modules, the parser rules, and so on. In this context, you load or blacklist modules, and you can then parse the lines.
For instance, you could use this function to compile comments for a blog :
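let mldown_compile ~out ~lines ~meta =
  (* Start from an empty context : no module loaded, no parser rule. *)
  let context = Context.empty () in
  (* Forbid macros, which insert the result of a command into the document. *)
  let context = Context.blacklist context
    ~extension:"syntax.macros" ~reason:"blog engine" in
  (* Load the backend and the preprocessor, then the modules listed in [meta]. *)
  let context = Context.load_many
    (("backend.xhtml", "") :: ("preprocessor.standard", "") :: meta.Reader.modules) context in
  let context = Context.initialize context in
  (* Preprocess and parse the lines, then write the rendered tree to [out].
     with_open_out_file is not defined here and is assumed to be a helper
     provided by the calling program. *)
  let lines = context.Context.preprocess lines in
  let tree = Reader.parse_lines context lines in
  with_open_out_file (context.Context.output tree) out;
  tree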