An OctoML Language is a JavaScript file that exports an Object containing 4 properties:

module.exports = {
   'global': {},               // Object containing any global attributes
   'pre': [Word,Word,...],     // text preprocessing
   'main': [Word,Word,...],    // main DOM replacement
   'post': [Word,Word,...]     // text postprocessing
};

I'm using the term 'Words' to refer to the individual objects listed. Words process and replace sections of HTML. When OctoML converts a file, it first runs through pre, the text preprocessing Words. It then creates a DOM (using jsdom) and provides a jQuery object to the Words in the main list. Finally, the results are put together in postprocessing.

The global object is a store for non-HTML data. It can contain strings, Arrays or Objects. For example, in the OctoML library language, Words can add bits of CSS. To give them somewhere to store this, add a css string to the global object:

global: {
   css: "",
}

The Words can all access and add to the global css string. In postprocessing, the css string can then be added (in this case appended to the <head>, wrapped in a <style> tag).

Words

I'll start with the main Words, since they're the most complete. Below is a full list of all the properties that can be in a main Word. These are for reference; they'll become clear through the examples.

{
   'tag': 'css selector(s)',
// flags
   'single': false,     //set to true for a single tag (like <img> that has no close tag)
   'async': false,      //async function - only allowed for root tags (not inner)
   'replace': true,     //replace %variables% in the output
   'nested': false,     //nested tag - may contain other versions of itself
// attributes
   'attr': ['attr1','attr2','...'],
    or : { 'attr1': '%val%',
           'attr2': function() {return 'string'}
         },
   'attrDefault': {
      'attr1': 'alternative',
      'attr2': function(data) {},
   },
   'attrType': {
      'fruit': ['apples','oranges']
   },
   'attrInherit': ['attr3','attr4',...],   //inherited from parent (for inner only)
// main functions
   'first': 
   'last': 
   'html':    //all three of these can be any of the following:
        'html string',
   or  ['html','parent top','parent bottom'],
   or   {
         html: 'html string',
         css: 'other global data',
      },
   or   function(data) {}
   or  function(data,callback) {}   //if async
// inner tags
   'inner':  [ {Word},{Word},... ]
   or      function(data) {}       //always sync
}

Simple Replacement

OctoML allows you to convert custom (usually non-standard) HTML tags into other (usually valid) HTML.

So, let's create a new HTML tag: <box>. In the OctoML Language file you add a Word into the 'main' array.

You add the tag property, which is the CSS selector for your new tag. You also give an output, the html property.

We'll start with a single tag (ie one without a close tag) - so just <box> will do, not <box></box>.

For single tags, set the single property to true.

{
   tag: "box",
   single: true,
   html: '<div class="box"></div>',
},

<section>
   <box>
   <box>
</section>

<section>
   <div class="box"></div>
   <div class="box"></div>
</section>

By default, tags are non-single, and wrap around contents. To copy the contents of your tag, use the %html% variable (all variables are of the form %variable-name%):

{
   tag: "box",
   html: '<div class="box">Box contents: %html%</div>',
},
<box>Box 1</box>
<box>Box 2</box>
<box>Box 3</box>
<div class="box">Box contents: Box 1</div>
<div class="box">Box contents: Box 2</div>
<div class="box">Box contents: Box 3</div>
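
To make the %variable% substitution concrete, here is a minimal sketch of how such a replacement could work. It is not OctoML's own code: the function name replaceVariables and the decision to leave unknown variables untouched are assumptions made for illustration.

```javascript
// Hypothetical helper, not part of OctoML: swaps %name% placeholders
// for values taken from a plain object.
function replaceVariables(template, vars) {
  return template.replace(/%([\w-]+)%/g, function (match, name) {
    // Assumption: unknown variables are left exactly as written.
    return Object.prototype.hasOwnProperty.call(vars, name) ? String(vars[name]) : match;
  });
}

const out = replaceVariables('<div class="box">Box contents: %html%</div>', { html: 'Box 1' });
console.log(out); // <div class="box">Box contents: Box 1</div>
```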

HTML functions

The html property doesn't have to be a static string; it can be a function:

{
   tag: "box",
   html: function(data) {
      return '<div class="box">Box contents: %html%</div>';
   },
},

This will give the same output as before. Note that the function is passed a data Object. This gives various pieces of information about the current element. More on that in a moment.

By default html functions are synchronous, but you can make them asynchronous by setting the async property to true.

Asynchronous functions are passed the data Object and a callback function.

{
   tag: "box",
   async: true,
   html: function(data,callback) {
      setTimeout(function() {
         callback('<div class="box">Box contents: %html%</div>');
      },100);
   }
},
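
The difference between the three forms of html (string, synchronous function, asynchronous function) can be pictured with a small dispatcher. This is only a sketch of the idea, not how OctoML is implemented; runHtml is a hypothetical name.

```javascript
// Hypothetical dispatcher, not OctoML's code: hands the Word's output to
// `done` whether html is a string, a sync function or an async function.
function runHtml(word, data, done) {
  if (typeof word.html !== 'function') {
    done(word.html);            // static string
  } else if (word.async) {
    word.html(data, done);      // async: the Word calls done itself
  } else {
    done(word.html(data));      // sync: pass on the return value
  }
}

const asyncBox = {
  tag: "box",
  async: true,
  html: function (data, callback) {
    setTimeout(function () {
      callback('<div class="box">Box contents: %html%</div>');
    }, 100);
  },
};

runHtml(asyncBox, {}, function (html) {
  console.log(html); // logged once the timeout has fired
});
```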

Data Properties

The full list of data properties is:

data = {
   '$':     jQuery
   'attr':   Object with all specified attributes 
   'attrOther': any other non-specified attributes
   'obj':   the current element (jQuery object)
   'tag':   current element's tag name (lowercase)
   'length': number of elements that match the selector (within parent)
   'index':  position of current element (starting at 0)
   'first':  true if first element (index == 0)
   'last':   true if last element (index == length-1)
   'html':   obj.html() // html code of the current element's contents (before inner is converted)
   'text':   obj.text() // text of element's contents (before inner is converted)
   'id':    function() {returns current ID (or generate one)}
   'top':   "" //text which will be added to the beginning of the current object's contents (prepended)
   'bottom': "" //text which will append to the end of the current object's contents
   'parent':   // data Object for parent (if this is an inner nested tag)
   'global': global data
};

OctoML uses jQuery to give you a set of tools to manipulate the DOM. The data.obj is the jQuery element itself. Any alterations you make to it will remain. For example:

{
   tag: "box",
   html: function(data) {
      data.obj.append('!!!!!!!!');
      return '<div class="box">%html%</div>';
   },
},
<box>Box 1</box>
<box>Box 2</box>
<box>Box 3</box>
<div class="box">Box 1!!!!!!!!</div>
<div class="box">Box 2!!!!!!!!</div>
<div class="box">Box 3!!!!!!!!</div>

The jQuery function is also provided as data.$. This allows you to use all of jQuery's functions, and provides access to the entire DOM:

{
   tag: "box",
   html: function(data) {
      var boxCount = data.$('box').length;
      return '<div class="box">%html%/'+boxCount+'</div>';
   },
},
<box>Box 1</box>
<box>Box 2</box>
<box>Box 3</box>
<div class="box">Box 1/3</div>
<div class="box">Box 2/2</div>
<div class="box">Box 3/1</div>

You'll note that this example doesn't work as intended. As OctoML runs through, it changes the DOM. So as each <box> tag is converted, the total count goes down.
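
One way around this is to avoid counting live elements and to use the data.length and data.index properties from the data list instead. This is a sketch, assuming data.length is fixed before any element is converted (the property list suggests so, but that is not verified here). The mock data objects exist only to exercise the function outside OctoML.

```javascript
const boxWord = {
  tag: "box",
  html: function (data) {
    // data.index starts at 0, so add 1 for a human-readable position
    return '<div class="box">%html% ' + (data.index + 1) + '/' + data.length + '</div>';
  },
};

// Mock data objects standing in for what OctoML would pass
for (let i = 0; i < 3; i++) {
  console.log(boxWord.html({ index: i, length: 3 }));
}
```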

HTML tag name

Since the Word's tag property can be any CSS selector, you can select elements with different HTML tags. To tell the function which HTML tag it is currently handling, the data Object includes a tag string holding the current tag name (in lowercase).

You could also get this by interrogating the jQuery obj (data.obj[0].tagName.toLowerCase()).

{
   tag: "box,circle",
   html: function(data) {
      return '<div class="'+data.tag+'">%html%</div>';
   },
},

Here we've defined two new tags <box> and <circle>. Both simply create divs with the class as 'box' or 'circle'. So:

<box>Box 1</box>
<circle>Circle 1</circle>
<circle>Circle 2</circle>
<box>Box 2</box>
<div class="box">Box 1</div>
<div class="circle">Circle 1</div>
<div class="circle">Circle 2</div>
<div class="box">Box 2</div>

Variables

A shorthand for the tag property is the %tag% variable:

{
   tag: "box,circle",
   html: '<div class="%tag%">%html%</div>',
},

%tag% is the second variable covered so far (along with %html%, the contents of the current element). There are 4 main variables: %html%, %tag%, %id% and %attr%, along with variables for any attributes you declare (don't use these 4 names for your own attributes).

If you don't want your HTML to have these variables replaced (for example if you are inserting arbitrary HTML), you can add the property replace: false to your Word.

Element ID

If you want to give your HTML an ID, you can use the %id% variable. This will be the current element's ID - if it has one - or if not, one will be generated. By default these are of the form 'octoml-x' where x is a number.

In an html function you can access the ID by using the data property id. This is a function which returns the current ID or generates one. If it is called multiple times in the same function it will always give the same result.

{
   tag: "circle",
   html: '<div class="circle" id="%id%"></div>'
},
<circle></circle>
<circle id="middle"></circle>
<circle></circle>
<div class="circle" id="octoml-1"></div>
<div class="circle" id="middle"></div>
<div class="circle" id="octoml-2"></div>
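
The behaviour of data.id (return the existing ID, otherwise generate one, and always give the same answer on repeat calls) can be sketched as below. This is an illustration rather than OctoML's implementation; createIdSource and makeIdGetter are hypothetical names.

```javascript
// Hypothetical sketch, not OctoML's code.
function createIdSource(prefix) {
  let count = 0;
  // Returns a data.id-style function for one element.
  return function makeIdGetter(existingId) {
    let cached = existingId || null;
    return function id() {
      if (cached === null) {
        count += 1;               // only consume a number when one is needed
        cached = prefix + count;
      }
      return cached;
    };
  };
}

const makeIdGetter = createIdSource('octoml-');
const ids = [makeIdGetter(), makeIdGetter('middle'), makeIdGetter()];
console.log(ids.map(function (id) { return id(); }));
// [ 'octoml-1', 'middle', 'octoml-2' ]
console.log(ids[0]() === ids[0]()); // true
```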

Element Order

data.index gives the position of the current element among the elements matched by the Word's tag CSS selector. The first element's index is 0. The total number of matching elements is given by data.length.

There are two more shorthand data properties: data.first and data.last. These are Booleans, and are pretty self explanatory. If it's the first element, data.first will be true, if it's the last, data.last will be true, and if it's the only one, both will be true.

You can also declare different html strings or functions for the first or last elements only:

{
   tag: "box",
   first: '<div class="first box">%html%</div>',
   html: '<div class="box">%html%</div>',
   last: '<div class="last box">%html%</div>',
}
<box>Box 1</box>
<box>Box 2</box>
<box>Box 3</box>
<box>Box 4</box>
<div class="first box">Box 1</div>
<div class="box">Box 2</div>
<div class="box">Box 3</div>
<div class="last box">Box 4</div>

Care needs to be taken with all these properties when using nested tags, since they count the number within the parent. If there is only one element, the first will run rather than last.
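
The choice between first, html and last can be sketched as a small function. This is a hypothetical illustration, not OctoML's code; in particular, falling back to html when first or last is not defined is an assumption.

```javascript
// Hypothetical sketch, not OctoML's code.
function pickTemplate(word, data) {
  // first is checked before last, so a lone element uses first
  if (data.first && word.first !== undefined) return word.first;
  if (data.last && word.last !== undefined) return word.last;
  return word.html;
}

const box = {
  tag: "box",
  first: '<div class="first box">%html%</div>',
  html: '<div class="box">%html%</div>',
  last: '<div class="last box">%html%</div>',
};

// Build the first/last flags the way the data properties define them
function flags(index, length) {
  return { first: index === 0, last: index === length - 1 };
}

console.log(pickTemplate(box, flags(0, 1))); // <div class="first box">%html%</div>
console.log(pickTemplate(box, flags(3, 4))); // <div class="last box">%html%</div>
```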

Attributes

You can read attributes in your custom HTML and use them as well. For example, let's add a color attribute to a <circle> tag.

The simplest way is to add an Array of attribute names:

{
   tag: "circle",
   attr: ['color'],
   html: '<div class="circle" style="background:%color%;"></div>'
},

A new variable, %color%, has been created. It gives the attribute's content:

<circle color="blue"></circle>
<circle color="green"></circle>
<circle></circle>
<div class="circle" style="background:blue;"></div>
<div class="circle" style="background:green;"></div>
<div class="circle" style="background:;"></div>

In that example, the third <circle> didn't work. Without a color attribute the %color% variable converted to an empty string.

To solve this, we can make the attr an Object. The key will be the attribute name (in this case 'color'), the value will be a string which the variable (%color%) will convert to.

Inside the value string, you can use the variable %val% to refer to the attribute contents.

{
   tag: "circle",
   attr: {
      'color': 'style="background:%val%;"',
   },
   html: '<div class="circle" %color%></div>'
},
<circle color="blue"></circle>
<circle color="green"></circle>
<circle></circle>
<div class="circle" style="background:blue;"></div>
<div class="circle" style="background:green;"></div>
<div class="circle" ></div>

The attributes are also available to html functions. The data.attr is an Object. The keys are all the attribute names. The values are false if the attribute isn't there, true if the attribute is there but has no content, or the attribute's content.

{
   tag: "circle",
   attr: {
      'color': 'style="background:%val%;"',
   },
   html: function(data) {
      var color = data.attr['color'];
      if(color === true) return '<div class="circle" style="background:red;"></div>';
      if(color === false) return '<div class="circle"></div>';
      if(color === "blue") return '<div class="circle" style="background:#3228ef;"></div>';
      return '<div class="circle" %color%></div>';
   },
},
<circle color="blue"></circle>
<circle color="green"></circle>
<circle color></circle>
<circle></circle>

<div class="circle" style="background:#3228ef;"></div>
<div class="circle" style="background:green;"></div>
<div class="circle" style="background:red;"></div>
<div class="circle"></div>

Using data.attr, we can distinguish between an attribute with no content and an attribute that is missing altogether.

Note that the data.attr contains the attribute's content, but the variable %color% is converted into the text given in the Word's attr Object.
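
The two views of an attribute (the raw value in data.attr versus the text the %color% variable expands to) can be sketched as two small functions. Both names are hypothetical and this is not OctoML's code. How an attribute without content expands is an assumption here: %val% is replaced with an empty string.

```javascript
// Hypothetical sketch: rawAttrs is a plain object of the attributes
// found on an element, e.g. { color: "blue" } or { color: "" }.
function attrValue(rawAttrs, name) {
  if (!Object.prototype.hasOwnProperty.call(rawAttrs, name)) return false; // missing
  if (rawAttrs[name] === '') return true;   // present, but no content
  return rawAttrs[name];                    // the attribute's content
}

// Expands the Word's attr template into the text of the variable.
function attrVariable(template, value) {
  if (value === false) return '';
  // Assumption: an attribute without content substitutes an empty %val%.
  return template.replace(/%val%/g, function () {
    return value === true ? '' : String(value);
  });
}

const template = 'style="background:%val%;"';
console.log(attrVariable(template, attrValue({ color: 'blue' }, 'color'))); // style="background:blue;"
console.log(attrVariable(template, attrValue({}, 'color')) === '');         // true
```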

%attr%

Any other attributes that aren't listed are stored in data.attrOther. You can keep these other attributes intact using the %attr% variable:

{
   tag: "circle",
   attr: ['color'],
   html: '<div class="circle" %attr%>%html%</div>',
},
<circle id="blue" color="blue"></circle>
<circle type="button" color="green"></circle>

<div class="circle" id="blue"></div>
<div class="circle" type="button"></div>

You'll note that because the 'color' attribute was listed, it was not included, and since it wasn't used, that information was lost. It's not advised to use %attr% more than once.

Default Attribute

To define the text used when an attribute is missing, add an attrDefault Object to the Word:

{
   tag: "circle",
   attr: {
      'color': 'style="background:%val%;"',
   },
   attrDefault: {
      'color': 'style="background:red;"',
   },
   html: '<div class="circle" %color%></div>'
},
<circle color="blue"></circle>
<circle color="green"></circle>
<circle></circle>
<div class="circle" style="background:blue;"></div>
<div class="circle" style="background:green;"></div>
<div class="circle" style="background:red;"></div>

Mutually Exclusive Attributes

To shorthand things even further, you can create whole groups of attributes. In the circle example, we can create a color group, with shorthand attributes red, green and blue.

Add an Object attrType to your Word. The key is the group's name ('color'), and the value is an Array of shorthand attribute names.

{
   tag: "circle",
   attrType: {
      'color': ['red','blue','green'],
   },
   html: '<div class="circle" style="background:%color%;"></div>'
}

The variable %color% will now be the contents of the color attribute or one of the shorthand attribute names.

<circle red></circle>
<circle blue></circle>
<circle green></circle>
<circle color="purple"></circle>
<div class="circle" style="background:red;"></div>
<div class="circle" style="background:blue;"></div>
<div class="circle" style="background:green;"></div>
<div class="circle" style="background:purple;"></div>

The shorthand attributes have to be mutually exclusive. If more than one of them appears in an element, the variable will hold just one of them. You can still put your group attribute ('color') in the attr and attrDefault Word properties:

{
   tag: "circle",
   attr: {
      'color': ' style="background:%val%;"',
   },
   attrType: {
      'color': ['red','blue','green'],
   },
   attrDefault: {
      'color': ' style="background:red;"',
   },
   html: '<div class="circle"%color%></div>'
}
<circle green></circle>
<circle blue="this will be ignored"></circle>
<circle></circle>
<div class="circle" style="background:green;"></div>
<div class="circle" style="background:blue;"></div>
<div class="circle" style="background:red;"></div>
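
How a group value might be resolved from an element's attributes is sketched below. This is a hypothetical illustration, not OctoML's code. Two points are assumptions, since the text only says that one shorthand wins: an explicit color attribute takes priority, and otherwise the first shorthand in the list is used.

```javascript
// Hypothetical sketch, not OctoML's code.
function resolveGroup(rawAttrs, group, shorthands) {
  const has = function (name) {
    return Object.prototype.hasOwnProperty.call(rawAttrs, name);
  };
  // Assumption: an explicit group attribute wins over shorthands.
  if (has(group) && rawAttrs[group] !== '') return rawAttrs[group];
  // Assumption: the first shorthand in the list wins if several are present.
  for (const name of shorthands) {
    if (has(name)) return name;   // any content on the shorthand is ignored
  }
  return false;                   // nothing found: attrDefault would apply
}

const colors = ['red', 'blue', 'green'];
console.log(resolveGroup({ green: '' }, 'color', colors));                    // green
console.log(resolveGroup({ blue: 'this will be ignored' }, 'color', colors)); // blue
console.log(resolveGroup({ color: 'purple' }, 'color', colors));              // purple
console.log(resolveGroup({}, 'color', colors));                               // false
```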

The final attribute type is inherited attributes. These are for nested inner Words. They inherit their attributes from their parent object.

To explain those, we'll first look at nested Words.

Nested Words

You can convert the contents of your element, by adding an inner property to your Word. The inner is an Array of Words that operate on the contents of each element.

In the example below, the inner Word is for an <inner> tag. Its parent element will be an <outer>.

{
   tag: "outer",
   html: '<div>%html%</div>',
   inner: [{
      tag: "inner",
      html: '<span>%html%</span>',
   }],
},
<outer>
   <inner>Inner 1</inner>
   <inner>Inner 2</inner>
</outer>
<inner>Not inside an outer</inner>
<div>
  <span>Inner 1</span>
  <span>Inner 2</span>
</div>
<inner>Not inside an outer</inner>

You'll note that the <inner> that is not inside an <outer> is not converted.

Nested Ordering

For inner tags, the index starts at 0 inside each outer tag. This affects the length, and first and last Booleans as well.

{
   tag: "outer",
   first: '<div class="first">%html%</div>',
   html: '<div>%html%</div>',
   last: '<div class="last">%html%</div>',
   inner: [{
      tag: "inner",
      first: '<span class="first">%html%</span>',
      html: '<span>%html%</span>',
      last: '<span class="last">%html%</span>',
   }],
}
<outer>
   <inner>Inner 1</inner>
   <inner>Inner 2</inner>
   <inner>Inner 3</inner>
</outer>
<outer>
   <inner>Inner 4</inner>
   <inner>Inner 5</inner>
   <inner>Inner 6</inner>
</outer>
<outer>
   <inner>Inner 7</inner>
   <inner>Inner 8</inner>
   <inner>Inner 9</inner>
</outer>

<div class="first">
  <span class="first">Inner 1</span>
  <span>Inner 2</span>
  <span class="last">Inner 3</span>
</div>
<div>
  <span class="first">Inner 4</span>
  <span>Inner 5</span>
  <span class="last">Inner 6</span>
</div>
<div class="last">
  <span class="first">Inner 7</span>
  <span>Inner 8</span>
  <span class="last">Inner 9</span>
</div>

Top and Bottom

There are two strings, data.top and data.bottom. These are prepended and appended to %html%. Used directly these aren't much use (you could have just put the text in your html). Their use comes from inner Words being able to access and add to their parent's data.top and data.bottom.

To add to the parent object's data.top and data.bottom, make the inner Word's html property an Array of strings rather than a single string. The first string is the replacement HTML (like before). The second string is added to the parent object's top.

{
   tag: "outer",
   html: '<div>%html%</div>',
   inner: [{
      tag: "inner",
      html: ['<span>%html%</span>','\n<header>%html%</header>'],
   }],
}
<outer>
   <section>
      <inner>Inner 1</inner>
      <inner>Inner 2</inner>
      <inner>Inner 3</inner>
   </section>
</outer>
<div>
<header>Inner 1</header>
<header>Inner 2</header>
<header>Inner 3</header>
  <section>
   <span>Inner 1</span>
   <span>Inner 2</span>
   <span>Inner 3</span>
  </section>
</div>

You can also add to the parent element's bottom text by putting it in the third Array string. This time, it is added to the beginning of the bottom text. Note in the next example that the order is reversed:

{
   tag: "outer",
   html: '<div>%html%</div>',
   inner: [{
      tag: "inner",
      html: ['<span>%html%</span>','','\n<footer>%html%</footer>'],
   }],
}
<outer>
   <section>
      <inner>Inner 1</inner>
      <inner>Inner 2</inner>
      <inner>Inner 3</inner>
   </section>
</outer>
<div>
  <section>
   <span>Inner 1</span>
   <span>Inner 2</span>
   <span>Inner 3</span>
  </section>
<footer>Inner 3</footer>
<footer>Inner 2</footer>
<footer>Inner 1</footer>
</div>

The reason that the parent's top text is appended while the bottom is prepended is to help wrapping the parent's content by using both.
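
A short sketch shows why this ordering helps with wrapping. It is a hypothetical illustration of the rule only (addToParent is not an OctoML function): because top grows at its end and bottom grows at its start, opening and closing tags added in pairs stay correctly nested.

```javascript
// Hypothetical sketch of the ordering described above.
function addToParent(parent, top, bottom) {
  parent.top = parent.top + top;          // top: appended
  parent.bottom = bottom + parent.bottom; // bottom: prepended
}

const wrapper = { top: '', bottom: '' };
addToParent(wrapper, '<section>', '</section>');
addToParent(wrapper, '<article>', '</article>');

console.log(wrapper.top + 'contents' + wrapper.bottom);
// <section><article>contents</article></section>
```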

The Parent Object

There is a data.parent Object that contains the parent element's data. You can inspect and alter the data properties of the parent.

If you want your footer to go in the right order, you can always append text to data.parent.bottom.
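
As a sketch of that approach, the inner Word below appends to data.parent.bottom itself instead of returning a third Array entry. It uses data.html rather than %html%, because it is not clear from the description whether variables placed in the parent's bottom text would be replaced. The mock data objects are stand-ins for what OctoML would pass; the result has not been checked against OctoML itself.

```javascript
const outerWord = {
  tag: "outer",
  html: '<div>%html%</div>',
  inner: [{
    tag: "inner",
    html: function (data) {
      // Appending keeps the footers in DOM order.
      data.parent.bottom += '\n<footer>' + data.html + '</footer>';
      return '<span>%html%</span>';
    },
  }],
};

// Mock data objects standing in for what OctoML would pass
const mockParent = { bottom: '' };
['Inner 1', 'Inner 2', 'Inner 3'].forEach(function (text) {
  outerWord.inner[0].html({ html: text, parent: mockParent });
});
console.log(mockParent.bottom);
```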

You can also inspect the parent object's attributes (data.parent.attr and data.parent.attrOther). Which brings us back to inherited attributes.

Inherited Attributes

Inherited attributes can be inherited from the parent object, but can also be overwritten locally. For example, with a 'color' attribute:

{
   tag: "outer",
   attr: {
      color: 'style="background:%val%;"',
   },
   html: '<div>%html%</div>',
   inner: [{
      tag: "inner",
      attrInherit: ['color'],
      html: function(data) {
         return '<span %color%>%html%</span>';
      },
   }],
},
<outer color="red">
   <inner>Inner 1</inner>
   <inner>Inner 2</inner>
</outer>
<outer color="blue">
   <inner>Inner 1</inner>
   <inner color="green">Inner 2</inner>
</outer>
<div>
   <span style="background:red;">Inner 1</span>
   <span style="background:red;">Inner 2</span>
</div>
<div>
   <span style="background:blue;">Inner 1</span>
   <span style="background:green;">Inner 2</span>
</div>

On Timing

When using inner Words, it's important to understand the order in which the page is processed. The html function of the parent Word (<outer>) is run first, before its contents have been converted by any inner Words.

Any %variables% (including, crucially, %html%) are not replaced until after the inner contents have been converted by any inner Words.

This has a few consequences. Since the <outer> function is run first, the data property data.html will give the original contents, while the %html% variable will give the converted contents:

{
   tag: "outer",
   html: function(data) {return '<div class="before">'+data.html+'</div><div class="after">%html%</div>'},
   inner: [{
      tag: "inner",
      html: function(data) {
         return '<span>%html%</span>';
      },
   }],
},
<outer>
   <inner>Inner 1</inner>
   <inner>Inner 2</inner>
</outer>
<div class="before">
   <inner>Inner 1</inner>
   <inner>Inner 2</inner>
</div><div class="after">
   <span>Inner 1</span>
   <span>Inner 2</span>
</div>

The fact that the <outer> function runs first also means you can alter the jQuery element data.obj, and this will alter the contents of your object before the inner Words are run.

The inner Words are run sequentially (if there is more than one). Each one runs through any elements found in DOM order.

Inner Functions

Rather than specify your Word's inner as an Array of Words, you can use a function which returns an Array of Words. The function is passed a data object, just like the html functions.

{
   tag: "outer",
   html: "<div>%html%</div>",
   inner: function(data) {
      var numInner = data.obj.find('inner').length;
      if(numInner == 2) return [{tag: "inner", html: '<span class="two-col">%html%</span>'}];
      else if(numInner == 3) return [{tag: "inner", html: '<span class="three-col">%html%</span>'}];
      else return [];
   },
}
<outer>
   <inner></inner>
</outer>
<outer>
   <inner></inner>
   <inner></inner>
</outer>
<outer>
   <inner></inner>
   <inner></inner>
   <inner></inner>
</outer>
<div>
   <inner></inner>
</div>
<div>
   <span class="two-col"></span>
   <span class="two-col"></span>
</div>
<div>
   <span class="three-col"></span>
   <span class="three-col"></span>
   <span class="three-col"></span>
</div>

Deep Nesting

Your inner Words can themselves have Arrays of inner Words. The parent object is always the immediate parent (but you can reach further up the chain by inspecting e.g. data.parent.parent).

You can also append to the top and bottom by returning a longer Array. The order goes:

["html","top of parent","bottom of parent","top of parent's parent", "bottom of parent's parent" ... etc]

Global Object

We've come full circle back to the Language's global object defined at the beginning. The example given was CSS text. Here I'll add in an Array of external stylesheets.

global: {
   css: "@charset utf-8\n", // initialised with something
   stylesheets: [],
},

For a Word to add to either of these global attributes, the html can be an Object. The html string or Array is given the key 'html', and any global data is given its own key.

{
   tag: "css",
   html: { css: '%html%', html: ''}
}

This takes the contents of any <css> tags and appends them to global.css. The <css> tags will be removed from the page.

For Arrays, like global.stylesheets, new entries will be appended. For Objects, each value in a key/value pair that is a string or Array will be appended.

{
   tag: "stylesheet",
   attr: ['src'],
   html: function(data) {
      return { stylesheets: [data.attr.src] };
   },
}
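
The appending rules for strings and Arrays can be sketched as a small merge function. This is a hypothetical illustration, not OctoML's code; mergeGlobal is an invented name, and nested Objects are left out to keep the sketch short.

```javascript
// Hypothetical sketch, not OctoML's code: folds a Word's result Object
// into the global store. The 'html' key is the replacement HTML, not global data.
function mergeGlobal(global, result) {
  Object.keys(result).forEach(function (key) {
    if (key === 'html') return;
    const value = result[key];
    if (typeof global[key] === 'string' && typeof value === 'string') {
      global[key] += value;                     // strings are appended
    } else if (Array.isArray(global[key]) && Array.isArray(value)) {
      global[key] = global[key].concat(value);  // Array entries are appended
    }
    // Assumption: anything else is ignored in this sketch.
  });
  return global;
}

const store = { css: "@charset utf-8\n", stylesheets: [] };
mergeGlobal(store, { css: 'box { color: red; }', html: '' });
mergeGlobal(store, { stylesheets: ['main.css'] });
console.log(store);
```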

Preprocessing

Before the DOM is created, the preprocessing Words are run. These allow for sections that would break the DOM, like CSS. For preprocessing Words, the tag selector is much simpler: it can only be a tag name.

You can still use attributes and html functions.

Postprocessing

Postprocessing Words have no tag. They act over the whole page. They contain an html function. The data object passed to it is the global attributes, with one additional piece: data.body contains the DOM html as text. Postprocessing Words return the final HTML page (as text), or a plain object for additional outputs (perhaps JavaScript). OctoML returns a plain object containing the page HTML (stored as html) and any additional outputs.
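
As a sketch, a postprocessing Word that adds the global css string to the <head> inside a <style> tag might look like the following. It assumes data.body holds a full page containing a closing </head>, which the description above does not confirm. The mock data object is a stand-in for what OctoML would pass.

```javascript
// A postprocessing Word has no tag, only an html function.
const addStyles = {
  html: function (data) {
    if (!data.css) return data.body;
    // Assumption: data.body is a full page that contains a closing </head>.
    return data.body.replace('</head>', function () {
      return '<style>' + data.css + '</style></head>';
    });
  },
};

// Mock data standing in for what OctoML would pass
const page = addStyles.html({
  css: '.box{color:red}',
  body: '<html><head></head><body><div class="box"></div></body></html>',
});
console.log(page);
```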

For more details on the Words, look through the Bootstrap examples, and read the source.