Concept Maps helper

Because writing raw graphviz is annoying

July 29, 2023 6 minutes read | 1222 words by Ruben Berenguel

I am a big fan of Concept maps, but writing them in Graphviz is annoying. So I wrote a helper.

The post below serves as a kind of manual, reminding me of how it works in case of need. If you want to use it, you can find it here.

It consists of two contenteditable divs. Paste your simplified Graphviz representation of the map on the right, and you will get (hopefully) valid Graphviz on the left and on your clipboard. You may wonder what simplified means here, and how much better than Graphviz it is.

As an example of the differences, let’s look at the following Concept Map for Apache Kafka, one of the first I wrote.


The core of the graph, after removing the style definition headers, is:

  Topics
  Topics -> Internal [label="can be"]
  Topics -> User [label="can be"]
  cg [label=" Consumer groups "]
  cg -> Consumers [label="have"]
  User -> cg [label="consumed by"]
  Topics -> Partitions [label="have"]
  {
    rank=same
    Partitions [label=" Partitions "]
    Offset
  }
  Partitions -> Records [label="contain"]
  Partitions -> Consumers [label="read from"]
  Consumers -> Topics [label="can read from several" arrowhead=normal]
  Records -> Ordered [label="are partially"]
  kv [label="Key-value"]
  Records -> kv [label="are"]
  kv -> Tombstone [label="value=null"]
  sk [label="Same key"]
  kv -> sk [label="can have"]
  sk -> sp [label="go to"]
  Compaction -> sk [label="keeps latest with"]
  Time -> Compaction [label="required by"]
  ret [label="Record expiration time" shape="none" style=none fontsize=9]
  Time -> Expiration [label="required by"]
  Expiration -> ret [style=dashed]
  Records -> Time [label="have"]
  Records -> sk [label="can have"]
  Ordered -> Time [label="by"]
  Consumers -> Offset [label="start at"]
  Consumers -> Offset [label="\n\nkeep track of"]
  Offset -> Internal [label="is stored in"]
  Offset -> Ordered [label="requires"]
  sp [label="   Same partition   "]
  sp -> Ordered [label=" is completely" arrowhead=normal]

It is not complex, but having to write label so many times, along with so many opening and closing brackets, makes for a very poor writing experience.

An equivalent, simplified concept map is:

  Topics
  CG            Consumer    groups
  Consumers
  Topics    ->  Internal    can be
  Topics    ->  User        can be
  CG        ->  Consumers   have
  User      ->  CG          consumed by
  Topics    ->  Partitions  have
  {
    rank=same
    Partitions
    Offset
  }
  Partitions  -> Records      contain
  Partitions  -> Consumers    read from
  Consumers   -> Topics       can read from several; arrowhead=normal
  Records     -> Ordered      are partially
  KV          Key-value
  Records     -> KV           are
  KV          -> Tombstone    value=null
  SK          Same key
  KV          -> SK           can have
  SK          -> SP           go to
  Compaction  -> SK           keeps latest with
  Time        -> Compaction   required by
  RET         Record expiration time; shape="none" style=none fontsize=9
  Time        -> Expiration   required by
  Expiration  -> RET          ;style=dashed
  Records     -> Time         have
  Records     -> SK           can have
  Ordered     -> Time         by
  Consumers   -> Offset       start at
  Consumers   -> Offset       \n\nkeep track of
  Offset      -> Internal     is stored in
  Offset      -> Ordered      requires
  SP          Same partition
  SP          -> Ordered      is completely; arrowhead=normal

Basically, it removes all the unnecessary repetition when writing a concept map in the format I use.

Syntax

nodename    label; props

node1  ->  node2  edge label; props

Anything after the name of the node or an arrow assignment is split by ;. To the left is the label (which can be empty), and to the right are the properties, which are added after the label. Any number of spaces can appear anywhere; the label itself won’t have any leading or trailing spaces. Properties keep their leading spaces.
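The split rule can be sketched in a few lines of JavaScript. This is a minimal illustration and not the helper's actual source: the function names are hypothetical, it assumes the split happens at the first ;, and it trims the properties for tidier output (the real tool keeps their leading spaces).

```javascript
// Hypothetical sketch of the "label; props" rule, not the helper's real code.

// Split the text that follows a node name or an arrow into label and props.
function splitLabelAndProps(rest) {
  const semi = rest.indexOf(";"); // assumption: split at the first ";"
  const label = (semi === -1 ? rest : rest.slice(0, semi)).trim();
  const props = (semi === -1 ? "" : rest.slice(semi + 1)).trim();
  return { label, props };
}

// Build the bracketed attribute list, omitting it when there is nothing to say.
function attrs(label, props) {
  const parts = [];
  if (label) parts.push(`label="${label}"`);
  if (props) parts.push(props);
  return parts.length ? ` [${parts.join(" ")}]` : "";
}

// Convert one simplified line (edge or node) into a Graphviz statement.
function convertLine(line) {
  const edge = line.match(/^\s*([^\s;]+)\s*->\s*([^\s;]+)(.*)$/);
  if (edge) {
    const { label, props } = splitLabelAndProps(edge[3]);
    return `${edge[1]} -> ${edge[2]}${attrs(label, props)}`;
  }
  const node = line.match(/^\s*([^\s;]+)(.*)$/);
  if (!node) return line; // blank lines are returned untouched
  const { label, props } = splitLabelAndProps(node[2]);
  return `${node[1]}${attrs(label, props)}`;
}

console.log(convertLine("Topics    ->  Internal    can be"));
// Topics -> Internal [label="can be"]
console.log(convertLine("Expiration  -> RET          ;style=dashed"));
// Expiration -> RET [style=dashed]
```

With this sketch, a bare node name such as Topics comes out unchanged, which matches the hand-written Graphviz shown earlier.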

{
//
}
foo=bar

Several types of lines are left as they are, to be handled by Graphviz: lone braces, properties and comments. Lone braces are useful for rank=same blocks, for alignment. I think I set the code to require no spaces around the = sign in properties, but that could be fixed. In all cases, there can be any number of leading or trailing spaces.
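A check for these pass-through lines could look roughly like the following. It is a sketch under assumptions: isPassthrough is a made-up name, and the property pattern (a word, then = with no spaces around it) is a guess at what the description above implies.

```javascript
// Hypothetical sketch: decide whether a line is handed to Graphviz unchanged.
function isPassthrough(line) {
  const t = line.trim(); // leading and trailing spaces never matter
  if (t === "{" || t === "}") return true; // lone braces
  if (t.startsWith("//")) return true; // comments
  // Properties: assumed to be name=value with no spaces around "=".
  return /^\w+=\S/.test(t);
}

console.log(isPassthrough("    rank=same")); // true
console.log(isPassthrough("Topics -> User can be")); // false
```

Under this pattern, a line such as foo = bar would not count as a property, which is the limitation mentioned above.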

# Foo

The first line (with any hash removed) will be used as a title label for the whole diagram, as in label="\n\nFoo\n". I have found this layout of newlines to be the best so far. You can avoid having such a label by leaving the first line empty and setting label="" as the first content line. I remove hashes so I can write these as if they were Markdown.
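As a rough illustration, the title handling might be written like this. The name titleStatement is hypothetical, and the sketch assumes that only leading hashes are stripped.

```javascript
// Hypothetical sketch: turn the first line into a graph-level title label.
function titleStatement(firstLine) {
  // Assumption: only leading "#" characters (Markdown heading style) are removed.
  const title = firstLine.replace(/^\s*#+\s*/, "").trim();
  // The doubled backslashes emit a literal \n for Graphviz to interpret.
  return `label="\\n\\n${title}\\n"`;
}

console.log(titleStatement("# Foo")); // label="\n\nFoo\n"
```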

$foo = something_or_other

This defines a replacement that will be applied before line parsing. So anything below that needs a hex color value can use a name instead: to handle red, define $RED = #99222277 and use $RED wherever you would need the hex color definition.
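A minimal sketch of this substitution step follows, assuming that definitions only affect the lines after them and that names are made of word characters. The function name applyReplacements is made up for the example.

```javascript
// Hypothetical sketch: expand "$name = value" definitions before line parsing.
function applyReplacements(source) {
  const vars = new Map();
  const out = [];
  for (const line of source.split("\n")) {
    const def = line.match(/^\s*(\$\w+)\s*=\s*(.+?)\s*$/);
    if (def) {
      vars.set(def[1], def[2]); // remember the definition, drop the line
      continue;
    }
    // Replace whole $names only, so $RED does not clobber $REDDISH.
    out.push(line.replace(/\$\w+/g, (m) => (vars.has(m) ? vars.get(m) : m)));
  }
  return out.join("\n");
}

const input = '$RED = #99222277\na =;fillcolor="$RED"';
console.log(applyReplacements(input)); // a =;fillcolor="#99222277"
```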

subgraph cluster_foo { #12345678
  property=value
  
cluster foo { #12345678
  property=value

This is an automation to make compound subgraphs work better. A compound graph is one where edges can end at a cluster boundary, instead of ending at a node inside the cluster. This is useful if you have a complex entity that does something as a whole. Creating such a subgraph will automatically generate an invisible node foo inside subgraph cluster_foo, to have a target for the edge. By default the cluster will have a dotted and rounded contour. You can change this by adding any styling properties inside the subgraph.

Optionally, passing an RGBA hex string like the one above (anywhere in the line) will add fillcolor="#12345678" to the subgraph properties.

You can avoid writing subgraph if you want and just say cluster.
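The header expansion could be sketched as below. Everything here is an assumption made for illustration: the function name, the exact attributes given to the invisible node, and adding filled to the style when a color is present (Graphviz only paints fillcolor on a cluster when its style includes filled).

```javascript
// Hypothetical sketch: expand a cluster header into Graphviz lines.
function expandClusterHeader(line) {
  // Accept both "subgraph cluster_foo {" and the shorter "cluster foo {".
  const m = line.match(/^\s*(?:subgraph\s+cluster_|cluster\s+)(\w+)\s*\{/);
  if (!m) return null; // not a cluster header
  const name = m[1];
  const color = line.match(/#[0-9a-fA-F]{8}/); // RGBA hex anywhere in the line
  const out = [
    `subgraph cluster_${name} {`,
    // Assumption: "filled" is added so that fillcolor is actually drawn.
    `  style="${color ? "dotted,rounded,filled" : "dotted,rounded"}"`,
    `  ${name} [style=invis]`, // invisible node used as the edge target
  ];
  if (color) out.push(`  fillcolor="${color[0]}"`);
  return out;
}

console.log(expandClusterHeader("cluster foo { #12345678").join("\n"));
```

Styling properties written inside the subgraph come after these generated lines, so they would override the defaults.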

Caveat: to use node -> foo or foo -> node as a compound (i.e. pointing to the cluster and not a node) you need to declare the subgraph before using it. This would not work if you have loops of clusters pointing to each other, though. To handle this case you need to manually solve the loop by adding something like node -> foo whatever label;lhead=cluster_foo.
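The declare-before-use caveat follows directly from reading the input once, top to bottom. The sketch below shows the idea with a hypothetical function; lhead and ltail are the standard Graphviz attributes for clipping an edge at a cluster boundary, and they need compound=true set on the graph.

```javascript
// Hypothetical sketch: one pass over the lines, tracking clusters seen so far.
function collectCompoundProps(lines) {
  const seen = new Set();
  const result = [];
  for (const line of lines) {
    const cluster = line.match(/^\s*(?:subgraph\s+cluster_|cluster\s+)(\w+)\s*\{/);
    if (cluster) {
      seen.add(cluster[1]); // from now on, edges to this name are compound
      continue;
    }
    const edge = line.match(/^\s*(\S+)\s*->\s*(\S+)/);
    if (!edge) continue;
    const props = [];
    if (seen.has(edge[1])) props.push(`ltail=cluster_${edge[1]}`);
    if (seen.has(edge[2])) props.push(`lhead=cluster_${edge[2]}`);
    result.push({ edge: `${edge[1]} -> ${edge[2]}`, props: props.join(" ") });
  }
  return result;
}

const edges = collectCompoundProps([
  "x -> foo too early", // foo is not known yet: treated as a plain node
  "cluster foo {",
  "}",
  "x -> foo points at the cluster",
]);
console.log(edges[0].props); // (empty string)
console.log(edges[1].props); // lhead=cluster_foo
```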

subgraph cluster_foo {
  a =;fillcolor=red
  b
  a -> b !
}

Quite regularly I find myself adding properties to nodes where the name does not change. To avoid having to retype the name (and to prevent an empty label from being invalid), if the node label is =, the conversion will keep the node identifier as the label. The rare case where I may need to use = as a label can be handled by adding the property label="=".

Another usual situation is wanting a different layout inside the subgraph. For that I usually add edges between internal nodes, which I set to label="", style="invis". Creating an edge with the label ! automates this. I have not decided if I want to automate the setting I usually add in these cases, which is overall edge styling in the subgraph as edge[margin=0,minlen=1].
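The = and ! shorthands can be illustrated with a small sketch. The function name and the exact output spacing are assumptions, not taken from the helper itself.

```javascript
// Hypothetical sketch of the "=" node label and "!" edge label shorthands.
function expandShorthand(line) {
  // "a -> b !" becomes an invisible, unlabeled edge used only for layout.
  const edge = line.match(/^\s*(\S+)\s*->\s*(\S+)\s*!\s*$/);
  if (edge) return `${edge[1]} -> ${edge[2]} [label="" style="invis"]`;
  // "a =;props" keeps the node identifier as its label.
  const node = line.match(/^\s*([^\s=;]+)\s*=\s*(?:;(.*))?$/);
  if (node) {
    const props = (node[2] || "").trim();
    return `${node[1]} [label="${node[1]}"${props ? " " + props : ""}]`;
  }
  return null; // not a shorthand line
}

console.log(expandShorthand("  a =;fillcolor=red")); // a [label="a" fillcolor=red]
console.log(expandShorthand("  a -> b !")); // a -> b [label="" style="invis"]
```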

Why?

I love writing concept maps in graphviz, but having to type so many braces annoyed me.

How?

A single-pass parser, like most of the stupid tools I write. Since it has no state, subgraphs need to be declared before using them as compound nodes, because I don’t know what the clusters are named otherwise.

All is JavaScript, inlined in a single HTML page.

It embeds my standard Graphviz header, although using Roboto instead of Monoid as the main font to make sure it works well anywhere.

Is it safe to use?

It is a self-contained page; the only external resource is a link to the CSS here to use Monoid as the font. And that is only because I was lazy: glancer has the font base64 encoded in the pages it generates.