Pete's Alley - Implementation

Written by Rich Morin.

Precis:  early implementation notes on Pete's Alley

This site is my third prototype of a server for Pete’s Alley. Each prototype has taught me things about what the code needs to do and let me experiment with ways to meet those needs. There are still a lot of things that need to be done, but the current version seems ready for limited public viewing.

Initial Prototypes

The first two prototypes used Ruby scripts to process a tree of data files into a set of markup files. This approach allows humans to read and edit the data conveniently, which is a basic requirement. It also lets the system deal with a defined data model, which is a huge win for error checking, indexing, and other processing needs. For example, schema files can be used to document the structure and semantics of the data files, as well as to support error detection.

Initially, the data files were encoded in YAML, but I soon switched to TOML. YAML’s structuring approach relies on indentation, which isn’t really blind-friendly. The language also has an extremely complicated specification, making life difficult for both its implementers and its users. More generally, TOML is a better fit for the data structure I’m using: a tree of associative arrays (aka dictionaries, hashes, or maps) of text strings. In YAML, each level of the tree requires another level of indentation; in TOML, a section header (e.g., [foo.bar]) defines the current context.
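For illustration, here is a tiny (hypothetical) TOML fragment; the [foo.bar] section header establishes the context for the key/value pairs that follow, with no indentation required:

  [foo.bar]
  precis   = "an example entry"
  title    = "Example Item"

The equivalent YAML would have to nest bar (and its keys) under foo, adding a level of indentation for each level of the tree.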

The first prototype used Foswiki to process and present files of wiki markup. This was a successful “proof of concept”, but the approach was inappropriate for even a moderate-scale production web server. Given that the wiki engine (written in Perl 5) must interpret the markup file(s) for each web request, this approach is clearly not scalable.

Also, although Foswiki markup is perfectly reasonable, it isn’t very popular. Markdown, in contrast, has a great deal of documentation, tooling, and user support. So, for the second prototype I switched to Markdown, presented by Hugo, a static site generator (written in Go). This would have been a scalable approach, but I wanted to support all sorts of dynamic behavior.

Current Prototype

Rather than add a dynamic subsystem to a Hugo-based web site, I decided to switch to a framework that offers scalable performance and great support for dynamic behavior, including client applications, persistent sessions, etc. Specifically, I’m using the Elixir programming language and the Phoenix web framework. These technologies are very well suited to building concurrent, distributed, fail-soft, and performant software systems. So, Pete’s Alley should be able to handle any expected needs and workloads.

The web site’s content is loaded (and reloaded) from a tree of text files, indexed in various ways, then made available by an Elixir process. I plan to add support for user accounts and online editing of content, but these features aren’t needed at this stage of the process. The search interface is based on the selection and reuse of typed tags. So, for example, the user can search for items that have a value of “bar” for type “foo”. This is an unusual (and possibly novel) approach; I would be happy to get feedback and/or learn about any other work of a similar nature.
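As a very rough sketch (not the actual implementation), the index might map each typed tag to a set of item IDs, so that a multi-tag query reduces to a lookup and a set intersection:

  defmodule Search do
    # A minimal sketch, assuming an index of the form:
    #   %{ {type, value} => MapSet of item IDs }
    #
    # Return the IDs of items that match all of the typed tags.
    def run(index, tag_list) do
      tag_list
      |> Enum.map(fn tag -> Map.get(index, tag, MapSet.new()) end)
      |> Enum.reduce(&MapSet.intersection/2)
    end
  end

  # Find items that have the value "bar" for the type "foo":
  #   Search.run(index, [ {"foo", "bar"} ])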

Project Approach

I have been following the development of Elixir for years, but this is the first project I’ve actually done in the language. So far, the experience has been very positive. The language and tooling are flexible and powerful, the community is helpful and thoughtful, etc. That said, there has been (and continues to be) a substantial learning curve. Although Elixir and Phoenix are reminiscent of Ruby and Rails, many things are profoundly different. So, I’ve been forced to abandon some familiar programming idioms, while learning ones more suitable to functional programming, etc.

Architecture

Dave Thomas makes a number of excellent points in his Keynote, given at the 2018 EMPEX conference in New York City. He explains why he doesn’t like Elixir’s GenServer API, Phoenix’s contexts and directory structure, Mix’s umbrella projects, and so forth. Basically, he contends that we’re doing it all wrong and sketches out a way to do it better.

He also explains that the term “Application” is highly overloaded and muddles our thinking. Instead, he suggests using “Library” for stateless modules and “Component” for stateful ones. He contends that we should be dynamically (and automagically) constructing and configuring “Assemblies” of Components and Libraries. Finally, he presents a description of Noddy, his first cut at infrastructure to support all this. It’s a great talk: both entertaining and thought provoking.

Unfortunately, although I love the ideas behind Noddy, the noddy-test GitHub repo is a long way from being production infrastructure. So, I’m kind of stuck with the current Elixir, Mix, and Phoenix tooling. Fortunately, Dave has also presented some ideas on how to use these. In his Coding Gnome video course, Elixir for Programmers, Dave makes a number of useful architectural suggestions. By and large, my design approach is modeled on these ideas.

I use an umbrella project, because Mix strongly promotes this approach. However, I eschew contexts and keep the Phoenix app (phx_http) as small as possible, using other apps for the business logic (e.g., searching). So, the Phoenix app only has to deal with presentation and user interaction. I also attempt to keep everything as simple as possible: each module and function tends to do only one thing, making the code easy to understand and modify.
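So, the umbrella’s apps/ directory looks something like this (the division of labor suggested in the comments is approximate):

  apps/
    info_files/   # business logic
    info_toml/    # business logic (e.g., loading the TOML tree)
    info_web/     # business logic
    phx_http/     # presentation and user interaction (Phoenix)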

The phx_http app uses the standard directory layout for Phoenix, but all of the other apps use an approach that Dave suggests. Each app has an interface file (e.g., lib/foo.ex) which uses a set of defdelegate calls to define the app’s API. The implementation code is stored in a sibling directory (e.g., lib/foo/bar.ex). The implementation files tend to be modular and cohesive; they may also include testable “public” implementation functions which are not presented by the app’s API.
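Sticking with the foo/bar example, a minimal sketch of this pattern might look like the following (the munge and trim functions are invented, purely for illustration):

  # lib/foo.ex -- the interface file; defdelegate defines the API.

  defmodule Foo do
    defdelegate munge(item), to: Foo.Bar
  end

  # lib/foo/bar.ex -- a sibling implementation file.

  defmodule Foo.Bar do
    def munge(item), do: trim(item)

    # "Public" (hence testable), but not part of the app's API.
    def trim(item), do: String.trim(item)
  end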

Coding Style

The coding style is reminiscent of Avdi Grimm’s Confident Ruby, in that functions often call other functions without doing any checking beforehand. That is, they assume that the called function will do what is needed (if anything). Elixir’s function dispatching (based on pervasive pattern matching) and “Let it crash” philosophy (all inherited from Erlang) make this a very natural approach.
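For example (a contrived sketch), a function can simply match on the expected shapes of its argument; any other input crashes the process, to be dealt with elsewhere:

  defmodule Confident do
    # Each clause handles one expected shape; no up-front checks.
    def item_count({:ok,    map}),     do: map_size(map)
    def item_count({:error, _reason}), do: 0
  end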

Like Dave, I am unhappy with the current state of mix format, the Elixir code formatter. I use a lot of inline spacing to make my code more readable. For example, I line up the values in data structure literals so that I can scan down the columns. I also use inline spacing to show parallel construction, etc. A code formatter which removes my carefully considered spacing is not one which I will use on my own projects.
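For example (with invented values), I might write a map literal like this; mix format would collapse the alignment:

  defaults = %{
    limit:    42,              # maximum items to display
    reload:   true,            # reload the data tree on change?
    title:    "Pete's Alley"   # site title
  }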

Careful Reader may notice that I use a variety of marker comments to make it easier to find things in the code base. For example, I tend to put an abridged form of the directory path at the start of each file. I also use a set of flag comments to mark specific code attributes.
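By way of illustration (the path and flag shown here are invented), a file might begin with a marker comment like this:

  # apps/foo/lib/foo/bar.ex
  #
  # An abridged directory path identifies the file at a glance.

  defmodule Foo.Bar do
    def munge(item) do
      #!K - this flags a known kludge
      item
    end
  end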

Futures

The presentation module (phx_http) talks to the data reporting modules (info_files, info_toml, info_web) by means of a rather ad hoc API. This could be recast into GraphQL, allowing other back ends to be added or substituted. For example, Neo4j could easily handle many of the things we’re currently doing in Elixir code (and far more).

Adding user accounts to the system is an important short-term goal. Aside from supporting persistence of session and preference data, this would provide a solid foundation for online entry of data and text. The Ueberauth and Guardian libraries look like the Golden Path for this sort of thing. This example seems apposite. It expects us to use Ecto and PostgreSQL, but if we already have GraphQL and Neo4j hooked up, we could use them for account-related information…