Pete's Alley - Data and Metadata
Written by Rich Morin.
Precis: how Pete's Alley uses data and metadata
In Pete’s Alley, the line between data
and metadata is more like a spectrum.
At one end, we have body text,
such as the paragraph you’re currently reading.
For convenience in reading and editing,
it’s encoded as Markdown, with some minor extensions.
At the other end, we have the file ID, keys for other items, etc.
In the middle, there is a range of sub-trees
whose content may be interpreted as either data or metadata.
This information is stored in a flexible, if somewhat unusual, manner.
The input data is stored as a shallow tree
of TOML files.
Each file contains a tree of
hash maps, using
text strings as both keys and values.
At load time, the file tree is flattened into a single hash
whose keys are the relative path names of each TOML file.
The content of the file, with minor additions and changes,
is stored as the item’s value.
For example, in a nod toward efficiency,
the item’s hash keys are converted to symbols.
Some metadata is also harvested and added to the item.
Finally, a set of inverted indexes is created,
allowing tags and types to be located rapidly.
Most of the item’s metadata resides in its meta hash.
The meta.tags hash, in particular,
stores sets of tag values
under a limited number of types.
Our search code uses these sets
to support complex queries,
built from the intersection
and union of query results.
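As a rough sketch of how such queries might combine, here is a small Python example built on plain sets.
The index contents, the tag types (`topics`, `roles`), and the function names are hypothetical; only the use of intersection and union comes from the description above.

```python
# Hypothetical inverted index: (tag type, tag value) -> set of item keys.
INDEX = {
    ("topics", "jazz"): {"areas/a.toml", "b.toml"},
    ("topics", "blues"): {"areas/a.toml"},
    ("roles", "editor"): {"b.toml", "c.toml"},
}


def lookup(index, tag_type, value):
    """Return a copy of the item keys tagged with value under tag_type."""
    return set(index.get((tag_type, value), set()))


def match_all(index, queries):
    """Intersection: items that satisfy every (type, value) query."""
    results = [lookup(index, tag_type, value) for tag_type, value in queries]
    return set.intersection(*results) if results else set()


def match_any(index, queries):
    """Union: items that satisfy at least one (type, value) query."""
    results = [lookup(index, tag_type, value) for tag_type, value in queries]
    return set().union(*results)


if __name__ == "__main__":
    print(match_all(INDEX, [("topics", "jazz"), ("roles", "editor")]))
    print(match_any(INDEX, [("topics", "blues"), ("roles", "editor")]))
```

Because both functions return ordinary sets, their results can be combined again with `&` and `|` to express nested queries.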
Basically, we are using collections of hashes and functions to implement a small part of the capabilities offered by a graph database such as Neo4j. Indeed, we expect to experiment with Neo4j as we continue, but the current, informal approach provides all of the flexibility and performance we need for this stage of development.
To be continued…