Content-Type: text/x-zim-wiki
Wiki-Format: zim 0.4
Creation-Date: Tue, 19 May 2009 19:48:40 +0200

====== Parse Tree ======

FIXME - make rules more lax - p should be optional
FIXME - add bullet and checkbox lists (+ numbered lists in proposals)

This page documents the parse tree format as used in zim. This is the format all the formats should support and is also understood by the gui editor widget.

The parse tree is a parse tree object that supports the xml.etree API. The root node tag always must be "**zim-tree**".

Structure below consists of headings and paragraphs. Top level can not contain any text except for whitespace. Whitespace between headings and paragraphs is considered a hint but can be ignored when exporting.

//Top level tags://
**h** - heading
	attribute //level// 1 .. 6 gives nesting
	can only contain text
**p** - paragraph
	can contain text, formats, links, images and lists
	attribute //indent// gives indent level
**tt** - verbatim paragraph
	con only contain text, nothing else
	attribute //indent// gives indent level

Basically all content should be wrapped in paragraph nodes.

//Available formattings://
**em** - emphasis, default rendered italic
**strong** - default rendered in bold
**mark** - default rendered as highlighted, alternatively underline
**strike **- strike through
**code** - verbatim text

	Formats can only contain text and links, with the exception of the **tt** format, which can only contains text.

Links:
**link**
    can contain: none
    text: any
    attrib:
        href: string, actual link
        _href-type: page | url | file | ... (if none, href not yet resolved)
        _href: string, resolved target, depends on type
        type: string, semantic type of the link

//Images://
**img**
    text: none
    attrib:
        src: string, link as it apeared in the source (optional)
        file: string, absolute file name
        width int
        height: int
        alt: string, name / short description, used e.g. for tooltip
        obj-type: string, reserved for objects that render as image in the gui

Captions can be implied if the image is followed directly by a paragraph without whitespace in between (so the tail of the img element is empty). In this case the exporter can render this paragraph the caption of the image.

===== Proposed extensions =====

==== Horizontal lines ====
Allow a <hr> element.

==== Image links ====
How to add linking capability to images ? Either add the link as a child node of the image or merge the attributes.

Alternatively we could consider clickable images an object. 

==== Image captions ====
For more elaborate documents you may want to have image captions. These should start with "Image XXX:" where the images are numbered automatically. In that case you also want references, links that link these images with their text automatically updated to the right number. (How to integrate this with latex export, which does this natively ?)

One way to do this is to assume that any text directly below an image is the caption, if the image is on it's own line and there is no text above the image. In the tree this is a paragraph starting with an image and the tail of the image starting with "\n".

==== Icons ====
Special case of images, instead of a image file we give a stock image name. This can be used e.g. for smileys or tags. When images can be links as well an icon that is clickable can be nice for plugin development.

==== Objects ====
Generic tag "obj" which contains arbitrary content. In the interface there is a class for each object type which can render it. For serializing it could have special code for specific formats and a generic fallback that returns some parse tree consisting of basic types. For example an equation object would have a method to put a pixbuf in the gui + override the context menu. It also would have support for latex and wiki formats, (wiki parser needs to know about it though) and the fallback gives the image, which is used e.g. in html export.

Base class for object extensions could fallback to parse tree with "image not found" icon.

If we want captions for objects it should work same as for images. This is something we want for tables - although maybe it should be configurable whether the caption is above or below the object per type (below for images, above for tables). Also equations can have a captions etc.
