Deep dive into TOML, JSON and YAML

I have to admit that before I started working with Hugo, TOML was new territory for me, although I was very familiar with YAML and JSON. This article should help you see how to structure your data in each of these formats.

In Hugo, you can use all three data formats for configuration, front matter, and custom data, but TOML is the recommended format for the whole project. I would first like to give you a short introduction to each data format before we jump into the specifications and comparisons.

TOML (Tom’s Obvious, Minimal Language)

TOML was obviously written by Tom, Tom Preston-Werner to be precise. It is an open source project licensed under MIT and currently has more than 5k stars on GitHub. The first version of TOML was released in March 2013, which makes TOML the youngster of the three standards.

TOML aims to be a minimal configuration file format that’s easy to read due to precise semantics. TOML is designed to map unambiguously to a hash table. TOML should be easy to parse into data structures in a wide variety of languages.

Quick facts about TOML syntax

  • TOML is case sensitive.
  • A TOML file must contain only UTF-8 encoded Unicode characters.
  • Whitespace means tab (0x09) or space (0x20).
  • Newline means LF (0x0A) or CRLF (0x0D0A).

To use TOML in the front matter, you need to wrap it between +++ like:

+++
date = "2016-12-14T21:27:05.454Z"
publishdate = "2016-12-14T21:27:05.454Z"

title = "Deep dive into TOML, JSON and YAML"
tags = ["toml","yaml","json", "front matter"]

type = "article"

[amp]
    elements = []
    
[article]
    lead = "Lorem ipsum."
    category = "frontmatter"
    related = []

[sitemap]
  changefreq = "monthly"
  priority = 0.5
  filename = "sitemap.xml"
+++

YAML (YAML Ain’t Markup Language)

YAML is a widely used language for configuration files across different languages and frameworks. It was created by Clark C. Evans, who started out on SML-DEV, a mailing list of XML people focused on simplifying XML that helped produce Common XML, a highly functional subset of XML. YAML grew out of that work as an alternative to XML for data serialization, especially with Python, Perl, and Ruby. The project started in 2001, and the 1.0 release by Oren Ben-Kiki, Clark Evans, and Brian Ingerson came out in January 2004. Version 1.2, the current version, has been in use since 2009.

Quick facts about YAML syntax

  • .yml files can begin with ‘---’, marking the start of the document
  • key-value pairs are separated by a colon
  • lists start with a hyphen
  • YAML uses indentation with one or more spaces to describe nested collections

To use YAML in the front matter, you need to wrap it between --- like:

---
date: '2016-12-14T21:27:05.454Z'
publishdate: '2016-12-14T21:27:05.454Z'
title: Deep dive into TOML, JSON and YAML
tags:
- toml
- yaml
- json
- front matter
type: article
amp:
  elements: []
article:
  lead: Lorem ipsum.
  category: frontmatter
  related: []
sitemap:
  changefreq: monthly
  priority: 0.5
  filename: sitemap.xml
---

JSON (JavaScript Object Notation)

JSON is a lightweight data-interchange format. It is widely used for API communication between browser and server on the web, since JavaScript and most server-side languages support JSON natively. In the early 2000s, Douglas Crockford introduced the first specification of the format. JSON is standardized as ECMA-404, first published in October 2013.

Quick facts about JSON syntax

  • Data is stored in name/value pairs.
  • Records are separated by commas. A trailing comma after the last property is not allowed.
  • Double quotes wrap property names & strings. Single quotes are not allowed.
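
The rules above are easy to check with a parser. The following is a small sketch using Python's standard json module; the helper name is_valid_json and the sample strings are my own and only serve as illustration.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if the standard-library parser accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


valid = '{"title": "Deep dive", "tags": ["toml", "yaml"]}'
trailing_comma = '{"title": "Deep dive", "tags": ["toml", "yaml"],}'
single_quotes = "{'title': 'Deep dive'}"

print(is_valid_json(valid))           # True
print(is_valid_json(trailing_comma))  # False
print(is_valid_json(single_quotes))   # False
```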

Since a JSON document is already wrapped in a pair of curly braces {}, no special wrapping is necessary to use it in the front matter in Hugo:

{
    "date" : "2016-12-14T21:27:05.454Z",
    "publishdate" : "2016-12-14T21:27:05.454Z",
    "title" : "Deep dive into TOML, JSON and YAML",
    "tags" : ["toml","yaml","json", "front matter"],
    "type" : "article",
    "amp" : {
        "elements" : []
    },
    "article" : {
        "lead" : "Lorem ipsum.",
        "category" : "frontmatter",
        "related" : []
    },
    "sitemap" : {
      "changefreq" : "monthly",
      "priority" : 0.5,
      "filename" : "sitemap.xml"
    }
}

Syntactical differences between TOML, YAML, and JSON

Let’s now look at the differences in syntax and feature set for the most common use cases.

Strings

All three formats support strings. The only feature difference here is that JSON does not support multiline strings.

TOML

key = "String Value"
multiline = """\
       The quick brown \
       fox jumps over \
       the lazy dog.\
       """

YAML

key: String Value
multilinePreservedLinebreaks: |
  L1 - The quick brown
  L2 - fox jumps over
  L3 - the lazy dog.
multilineReplaceLinebreaksWithWhitespace: >
  This sentence is just too long to keep it
  on the same line.

JSON

{
  "key" : "String Value"
}
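
JSON can still carry text with line breaks, as long as they are written as \n escapes inside a single-line string. Here is a short sketch with Python's standard json module; the key name and sample text are only illustrative.

```python
import json

# A line break inside a JSON string must be written as the escape \n.
escaped = r'{"multiline": "The quick brown\nfox jumps over\nthe lazy dog."}'
value = json.loads(escaped)["multiline"]
print(value.splitlines())  # ['The quick brown', 'fox jumps over', 'the lazy dog.']

# A raw line break inside the quotes is rejected by the parser.
raw = '{"multiline": "The quick brown\nfox"}'
try:
    json.loads(raw)
    raw_accepted = True
except json.JSONDecodeError:
    raw_accepted = False
print(raw_accepted)  # False
```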

Objects / Hash Tables / Collections

Tables in TOML are pretty much the same as objects in JSON and collections in YAML. To access a collection in Hugo templates, navigate with the dot, like {{ .Params.objectkey.subkey }}.

TOML

[table_key]
property = "Value"
secondProperty = "2nd Value"

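[alternative.direct]
access = "results in alternative.direct.access for this value"

# An inline table is an ordinary key inside the current table,
# so this value ends up at alternative.direct.alternativeCalledInlineTable.
alternativeCalledInlineTable = { property = "Value", "etc" = "You got it." }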

YAML

objectKey:
  property: Value
  secondProperty: 2nd Value
alternative: { subkey: 5.0, another: 123 }

JSON

{
  "objectKey" : {
    "property" : "Value",
    "secondProperty" : "2nd Value"
  }
}
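
The dot navigation mentioned for Hugo templates boils down to following one key after another through nested hash tables. The sketch below mimics that idea in Python; the lookup helper is hypothetical and is not how Hugo itself is implemented.

```python
import json
from functools import reduce

params = json.loads("""
{
  "objectKey": {
    "property": "Value",
    "secondProperty": "2nd Value"
  }
}
""")


def lookup(data: dict, dotted_path: str):
    """Hypothetical helper: follow a dot-separated path through nested dicts."""
    return reduce(lambda node, key: node[key], dotted_path.split("."), data)


print(lookup(params, "objectKey.secondProperty"))  # 2nd Value
```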

Arrays / Lists

Arrays or lists are supported by all three formats.

TOML

fruits = [ "Apple", "Banana", "Strawberry" ]
formats = [
  "YAML",
  "JSON",
  "TOML"
]

YAML

fruits:
  - Apple
  - Banana
  - Strawberry
formats: [ YAML, JSON, TOML ]

JSON

{
  "fruits": ["Apple","Banana","Strawberry"],
  "formats": [
    "YAML",
    "JSON",
    "TOML"
  ]
}

To extend these examples a little, we can also create a list of objects / tables / collections like this:

TOML

[[fruits]]
name = "Apple"
weight = 600

[[fruits]]
name = "Banana"
weight = 300

[[fruits]]
name = "Strawberry"
weight = 40

YAML

fruits:
- name: Apple
  weight: 600
- name: Banana
  weight: 300
- name: Strawberry
  weight: 40

JSON

{
  "fruits": [
    {
        "name" : "Apple",
        "weight" : 600
    },
    {
        "name" : "Banana",
        "weight" : 300
    },
    {
        "name" : "Strawberry",
        "weight" : 40
    }
  ]
}

All of the examples above result in a list that you can iterate through in Hugo template files with {{ range .Params.fruits }}<strong>{{ .name }}</strong> - Weight: {{ .weight }}{{ end }}.
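
If you are more at home in a general-purpose language than in Hugo templates, the same iteration can be sketched in Python. This only imitates what the range loop produces; it parses the JSON variant with the standard json module, and the variable names are my own.

```python
import json

params = json.loads("""
{
  "fruits": [
    {"name": "Apple", "weight": 600},
    {"name": "Banana", "weight": 300},
    {"name": "Strawberry", "weight": 40}
  ]
}
""")

# Roughly what the range loop does: visit each item and read its fields.
lines = [
    f"<strong>{fruit['name']}</strong> - Weight: {fruit['weight']}"
    for fruit in params["fruits"]
]
print("\n".join(lines))
```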

I think you have a pretty good understanding now of how arrays and tables work together; let’s extend it once more to get the complete overview.

TOML

[[fruits]]
  name = "Apple"
  weight = 600

  [fruits.physical]
    color = "red"
    shape = "round"

  [[fruits.variety]]
    name = "red delicious"

  [[fruits.variety]]
    name = "granny smith"

[[fruits]]
  name = "Banana"
  weight = 300

  [fruits.physical]
    color = "yellow"
    shape = "curved"

  [[fruits.variety]]
    name = "plantain"

[[fruits]]
  name = "Strawberry"
  weight = 40

  [fruits.physical]
    color = "red"
    shape = "kind-of-oval"

  [[fruits.variety]]
    name = "the-good-one"

YAML

fruits:
- name: Apple
  weight: 600
  physical:
    color: red
    shape: round
  variety:
  - name: red delicious
  - name: granny smith
- name: Banana
  weight: 300
  physical:
    color: yellow
    shape: curved
  variety:
  - name: plantain
- name: Strawberry
  weight: 40
  physical:
    color: red
    shape: kind-of-oval
  variety:
  - name: the-good-one

JSON

{
  "fruits": [
    {
        "name" : "Apple",
        "weight" : 600,
        "physical": {
          "color": "red",
          "shape": "round"
        },
        "variety": [
          { "name": "red delicious" },
          { "name": "granny smith" }
        ]
    },
    {
        "name" : "Banana",
        "weight" : 300,
        "physical": {
          "color": "yellow",
          "shape": "curved"
        },
        "variety": [
          { "name": "plantain" }
        ]
    },
    {
        "name" : "Strawberry",
        "weight" : 40,
        "physical": {
          "color": "red",
          "shape": "kind-of-oval"
        },
        "variety": [
          { "name": "the-good-one" }
        ]
    }
  ]
}

Numbers (Integer, Floats, Infinity etc.)

Numbers are written very similarly in all three formats, with differences in the feature set:

TOML

explicit_pos = +99
positive = 42
zero = 0
negative = -17

# For large numbers, you may use underscores to enhance readability.
# Each underscore must be surrounded by at least one digit on each side.
large = 1_000
verylarge = 5_349_221

# fractional
float = +1.0
float_pi = 3.1415
negative_float = -0.01

# exponent
flt4 = 5e+22
flt5 = 1e6
flt6 = -2E-2

# both
flt7 = 6.626e-34

YAML

integer: 12
octal_number: 014 # YAML 1.1 octal notation; YAML 1.2 writes 0o14
hexadecimal: 0xC
float: 18.6
exponential: 1.2e+32
infinity: .inf

JSON (octal and hexadecimal notation, Infinity, and NaN are not supported in JSON, so the values are written in decimal)

{
  "integer": 12,
  "octal_number": 12,
  "hexadecimal": 12,
  "float": 18.6,
  "exponential": 1.2e+32
}
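
The limits of JSON numbers can be tried out with Python's standard json module. Treat this as a sketch rather than a reference: the helper name parses is made up, and because Python's parser accepts the literal Infinity by default, the sketch checks the writing side with allow_nan=False instead.

```python
import json

# JSON has a single number syntax: decimal digits, optional fraction and exponent.
numbers = json.loads('{"integer": 12, "float": 18.6, "exponential": 1.2e+32}')
print(type(numbers["integer"]).__name__, type(numbers["float"]).__name__)  # int float


def parses(text: str) -> bool:
    """Return True if the text is accepted as JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(parses('{"octal_number": 014}'))  # False
print(parses('{"hexadecimal": 0xC}'))   # False

# Octal 14 and hexadecimal C are both decimal 12, hence the 12s in the JSON example.
print(int("14", 8), int("C", 16))  # 12 12

# Asking for strictly compliant output makes infinity an error.
try:
    json.dumps(float("inf"), allow_nan=False)
    inf_allowed = True
except ValueError:
    inf_allowed = False
print(inf_allowed)  # False
```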

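Misc - Datetime, Boolean, Null

Booleans are supported by all three formats. TOML has no null value, and JSON has no date type, so dates are stored as strings there.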

TOML

bool1 = true
bool2 = false

date1 = 1979-05-27T07:32:00Z
date2 = 1979-05-27T00:32:00-07:00
date3 = 1979-05-27T00:32:00.999999-07:00

YAML

bool1: true
bool2: false

null1: null
null2: ~

date_iso: 2016-12-14T21:59:43.10-05:00 # ISO-8601
date_simple: 2016-12-14

JSON

{
  "bool1": true,
  "bool2": false,
  "null1": null,
  "date_iso": "2016-12-14T21:59:43.10-05:00",
  "date_simple": "2016-12-14"
}
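
Because JSON dates are plain strings, turning them into real date values is a separate step after parsing. Here is a minimal sketch with Python's standard library. The timestamp is a simplified variant of the one above, without fractional seconds, and the variable names are my own.

```python
import json
from datetime import date, datetime, timedelta

doc = json.loads("""
{
  "date_iso": "2016-12-14T21:59:43-05:00",
  "date_simple": "2016-12-14"
}
""")

# After parsing, both values are ordinary strings.
print(type(doc["date_iso"]).__name__)  # str

# Converting them is an explicit second step.
moment = datetime.fromisoformat(doc["date_iso"])
day = date.fromisoformat(doc["date_simple"])

print(moment.utcoffset() == timedelta(hours=-5))  # True
print(day.year, day.month, day.day)               # 2016 12 14
```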

I hope this gave you a good overview of the differences between these three data formats and that you feel comfortable using any of them in your future projects.