Notes on Design in Practice

Rich Hickey's talks have been influential in my thinking about building software, in particular his thoughts on the importance of "hammock time" and craftsmanship.

Rich's latest talk is titled Design in Practice and clarifies a structure for building and recording a design that I feel is "just enough" to be helpful without being burdensome. Further, it elaborates on Architecture Decision Records (ADRs), which is the design structure I'm most familiar with. I still regularly refer back to Mike Nygard's 2011 post on ADRs.

video: https://www.youtube.com/watch?v=c5QF2HjHLSE

full transcript: https://github.com/matthiasn/talk-transcripts/blob/master/Hickey_Rich/DesignInPractice.md

This talk is about formalizing the process of design. Given Rich's background, he's primarily talking about software design but you may apply these concepts to other domains.

Some of the benefits and goals include

He begins by talking about precision in words. Using precise words helps keep wording concise and makes understanding easier. An artifact of using precise words is a glossary.

Rich reminds us that questions and the Socratic method are powerful tools for building understanding. Exposing answers to find the truth helps everyone learn. Detach yourself from your ideas. There is an objective truth and the goal is to discover it together.

A framework for questions:

These questions frame your current status and your direction, in terms of both progress and understanding. Rich suggests these questions can help with reflective inquiry: being aware of your own thinking.

He suggests that we should try to structure the record of our design in stories. A story should have these sections:

This is not a checklist. The goal is to record the decisions made and why.

The suggested phases that a design process follows are:

  1. Describe (situation)
  2. Diagnose (possible problems)
  3. Delimit (the problem you are going to solve)
  4. Direction (strategy, approach)
  5. Design (tactics, implementation plan)
  6. Dev (build it)

Rich continues to explain the 6 phases.

Describe

Diagnose

Delimit

Direction

Design

Published: 2023-08-17

Tagged: talk clojure design richhickey

Perfectionism-induced hiatus complete

In my previous writings (more than 2 years ago), I discussed building a simple blog generator. These posts described the basic ideas behind the blog generator I built for this (personal) blog.

I built my own software but did not use it. Why?

In my experience, root cause analysis often finds many factors. Why I haven't updated the blog in a couple of years is no exception.

First, my personal life became busier. I started a new job. I had a second child. I haven't been doing a lot of personal programming. These are all excuses - if I wanted to, I could have found time to make a new post.

The real reason I stopped updating the blog was that I wasn't happy with it. Writing new posts involved some friction: copying and pasting DSL snippets to add new paragraphs or headings. I wanted to make some changes to reduce the friction, and didn't want to write code for more posts until I'd made these changes. In my mind, new posts hinged on completing these updates first, so I just didn't write any.

What are priorities for a blog?

I think building the blog generator was a great experience. It may not have been a challenging pursuit, but it was enjoyable (if a bit of a yak-shave). It gave me complete control over my site.

I had wanted to do major surgery on the blog generator to take the DSL in a new direction. As I thought about the mental burden of the changes, I realized that the benefits of building my own software were no longer outweighing the costs.

If I want the blog to be an effective communication tool, or just a personal record of my thoughts, the most important feature is ease of writing. My own blog was not easy (at least, not yet) to write for.

The benefits of building a blog generator had run their course; I know I can do it, but is building and maintaining this tool how I want to spend my time? The answer came back a resounding no.

Priorities come first

One important lesson I’ve been pondering lately is that priorities and judgement matter; it’s orders of magnitude faster to get on an effective path sooner than to move quickly on an ineffective one.

Prioritizing is discussed so often that it’s boring for me to mention it here, but it has been revelatory for me, including several moments of thinking “how could I have been so lost before?”

As a mid-career engineer, I’ve discovered that I can have an outsized impact by guiding efforts away from rabbit holes and pitfalls. For example, an early-career Adam would have spent many hours considering the dozens of blog generator tools available and the merits of each. Now, I’m able to first consider my priorities and criteria then quickly filter the available options to make a decision.

So, how does the blog work now?

Once I realized that I wanted to get out of the blog generator game, it became clear that I should just leverage an existing tool that makes it easy to write and publish content.

I considered many blogging tools before landing on quickblog. I've been enjoying Clojure for nearly a decade and quickblog is built on tools and a language that I already understand while fitting with my priorities.

Porting over the handful of previous posts and existing styles required a bit of work, but I expect the investment will pay for itself in the long run. And you're likely to read new content from me in the future.

A short segue on parenting with priorities

My daughter’s favorite reminder when we’re late is to “remember the story of the tortoise and the hare? Slow and steady wins!”. She’s quite wise for a 5-year-old! Even at my age, the rushing mindset is an easy trap to fall into. I think it’s actually quite rare that rushing is the optimal path; the downside of being late is often less than the risk of making mistakes by mindlessly rushing. My daughter may test my patience by being so frequently distracted, but does trying to rush her help? I’d venture “no”.

Published: 2023-02-01

Tagged: priorities quickblog personal blog

Parsing and grammars

Parsing is a common task for software systems. Most domain specific languages and every programming language require a parser to process their input before acting. Most bridges between two or more systems need to encode then parse the data passed between them.

I've probably written dozens of parsers over the years, of which I remember fewer than half. The following experience report and light introduction to the topics of parsing & grammars may lead to better decisions when building parsers.

So, we need to build a parser?

We've got some input. It's a string. The string has some structure which follows a recognizable format. We want to turn that string into data we can use. We need a parser.

There are two primary approaches (that I know of) to writing parsers: hand built, or a parser generator with a grammar.

In my experience, parsers begin hand built. The input syntax is simple or you just want to get it done quickly. You write a small regular expression. You add an iterative loop or recursion. Suddenly, you've got a hand built parser.

Hand built parser

You've got a string with a general syntax. You need code that finds the parts of every string matching the syntax and acts on them. You write code that finds matches then directly calls the action code.
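As a minimal sketch of that shape, here is a hand built "parser" for a made-up input format: wiki-style links such as [[home page]]. The format, the `render_links` name and the HTML it emits are all assumptions for illustration, not code from my blog generator.

```python
import re

# Hypothetical input format: wiki-style links such as [[home page]].
LINK = re.compile(r"\[\[([a-zA-Z0-9 ]+)\]\]")


def render_links(text):
    """Find every [[target]] and immediately act on it by emitting an anchor."""

    def action(match):
        target = match.group(1)
        href = target.strip().lower().replace(" ", "-")
        return f'<a href="/{href}">{target}</a>'

    # Matching and acting are fused: no token stream or tree sits in between.
    return LINK.sub(action, text)


print(render_links("see [[Home Page]] now"))
```

Note how the regular expression, the search and the action all live in one function. That is what makes this style quick to write and, later, painful to change.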

Hand built parsers can be fast. Being purpose built for the task, the code can be optimized for performance. Any abstraction would likely require more machine effort than a well chosen algorithm.

Time passes and after a couple updates or changes in syntax, the code gets messy. Each change brings an accumulating pain. You've got difficult-to-follow recursion or incomprehensible clauses in your switch/cond statement. You long for a better abstraction or easier debugging but you're vibing sunk cost fallacy and can't bear to toss this significant subsystem. If you muster enough courage or 20% time then you go for the full refactor but like an old back injury, the pain returns in time.

Breaking down the work

Whether explicit or not, hand built parsers perform 3 duties. First, they search the input for specific tokens. Often input languages are defined in mutually exclusive states. In the JavaScript programming language for example, some characters are invalid in identifiers but valid in strings.

Second, they parse the token stream into the rules for the domain specific language. In JavaScript, the var keyword must be followed by an identifier string.

Third, hand built parsers (often) act on the rules of the domain specific language.

Let's use this information to find a better abstraction. As Rich Hickey would say "let's decomplect it".

Lexer

A lexical analyzer (or lexer) scans the input and splits it into tokens. In a string, a token is a collection of characters (including a collection of size one). Tokens should carry the meaning that a parser needs in order to apply the rules of the domain specific language.

A lexer definition often looks like a set of regular expressions, each recognizing a specific character or sequence of characters. The lexer produces a series of tokens pulled from the input.
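A rough sketch of a regular expression driven lexer might look like the following. The token names and patterns are assumptions chosen to cover a tiny JavaScript-like statement; they are not taken from Lex or any real language definition.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str
    text: str


# Each (hypothetical) token kind is defined by one regular expression.
# Order matters: VAR is listed before IDENT so the keyword wins.
TOKEN_SPEC = [
    ("VAR", r"var\b"),
    ("IDENT", r"[A-Za-z_$][A-Za-z0-9_$]*"),
    ("NUMBER", r"\d+"),
    ("STRING", r'"[^"]*"'),
    ("EQUALS", r"="),
    ("SEMI", r";"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def lex(source: str) -> Iterator[Token]:
    """Split the input into tokens, dropping whitespace."""
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        pos = match.end()
        if match.lastgroup != "SKIP":
            yield Token(match.lastgroup, match.group())


for token in lex('var greeting = "hi";'):
    print(token)
```

The lexer knows nothing about which token may follow which. It only labels pieces of the input so a later stage can apply the rules.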

A common example of a lexical analyzer generator is Lex. Interestingly, Lex was originally written in 1975 by Mike Lesk and Eric Schmidt (the future CEO of Novell & Google).

Parser

Using the rules of a language, a parser takes a stream of tokens and produces a tree. Most languages are recursive so a tree data structure makes it clear which tokens are composed within the body of others.
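To make the token-stream-to-tree step concrete, here is a small recursive descent sketch. The rule it implements (an expression is either an atom or a parenthesized list of expressions) is an assumption picked because it is about the smallest recursive language there is.

```python
def parse(tokens):
    """Turn a flat list of tokens into a nested tree (lists within lists).

    Hypothetical rule: expr := ATOM | '(' expr* ')'
    """
    tree, rest = _parse_expr(list(tokens))
    if rest:
        raise SyntaxError(f"unexpected trailing tokens: {rest}")
    return tree


def _parse_expr(tokens):
    if not tokens:
        raise SyntaxError("unexpected end of input")
    head, rest = tokens[0], tokens[1:]
    if head == "(":
        children = []
        # Recurse until the matching ')' closes this level of the tree.
        while rest and rest[0] != ")":
            child, rest = _parse_expr(rest)
            children.append(child)
        if not rest:
            raise SyntaxError("missing closing ')'")
        return children, rest[1:]  # drop the ')'
    if head == ")":
        raise SyntaxError("unexpected ')'")
    return head, rest


print(parse(["(", "add", "1", "(", "mul", "2", "3", ")", ")"]))
```

The nesting of the resulting lists mirrors the nesting of the input, which is exactly the "which tokens are composed within the body of others" information that a flat token stream lacks.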

Yacc is a commonly used parser generator, often paired with Lex. This is what my university computer science courses required (15 years ago).

Grammar

Grammars are an expressive language for describing rules of a domain specific language. You write a grammar then give it to a parser generator, which generates code for interpreting the input (usually a string).

Here's an example grammar for the common CSV (comma separated values) format. This grammar is defined in ANTLR 4 which combines both lexer and parser definitions in the same grammar.

csvFile: hdr row+ ;
hdr : row ;
row : field (',' field)* '\r'? '\n' ;
field
: TEXT
| STRING
|
;
TEXT : ~[,\n\r"]+ ;
STRING : '"' ('""'|~'"')* '"' ;

ANTLR combines both lexer and parser rules in the same grammar. In its language, a lexer rule identifier begins with an upper case letter and a parser rule does not. TEXT and STRING are both lexer rules which result in tokens. The field parser rule uses the tokens (including the inline ',' in the row rule) to build the higher level abstractions. In ANTLR rules that use alternatives (|), order matters; the field rule will prefer TEXT tokens over STRING tokens.
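For comparison, here is a rough hand translation of that CSV grammar into Python. This is not what ANTLR generates; it is only a sketch to show how each rule maps onto code. The function names are my own, and it simplifies by splitting on line breaks first, so a quoted field containing a newline is not supported.

```python
import re

# Mirrors of the two lexer rules.
TEXT = re.compile(r'[^,\n\r"]+')
STRING = re.compile(r'"(?:""|[^"])*"')


def parse_row(line):
    """row : field (',' field)* ; the line terminator is already stripped."""
    fields, pos = [], 0
    while True:
        # field : TEXT | STRING | <empty>, tried in that order.
        match = TEXT.match(line, pos) or STRING.match(line, pos)
        fields.append(match.group() if match else "")
        pos = match.end() if match else pos
        if pos == len(line):
            return fields
        if line[pos] != ",":
            raise SyntaxError(f"expected ',' at column {pos}")
        pos += 1


def parse_csv(text):
    """csvFile : hdr row+ ; returns (header, rows)."""
    # Simplification: splitting lines up front breaks multi-line STRING fields.
    lines = text.splitlines()
    if len(lines) < 2:
        raise SyntaxError("expected a header and at least one row")
    header, *rows = [parse_row(line) for line in lines]
    return header, rows


header, rows = parse_csv('name,note\nada,"said ""hi"", left"\n')
print(header, rows)
```

Even for a format this small, the grammar is ten lines while the hand translation is several times that, and the grammar is the easier of the two to check against the format.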

Ambiguous and unambiguous languages

Beware: there are languages that cannot be specified in a grammar, but (in my experience) they are rare. More commonly, you're going to find languages that are ambiguous.

An ambiguous language can have more than one parser rule match a set of characters. For example, let's say you have a language with the following rules.

link: [[ STRING ]]
alias: [ STRING ]( STRING )
STRING: [a-zA-Z0-9 ]+

These two rules share the same left stop character. If a parser were to read [[alias](target)] then it would be unable to determine which rule to follow. Likely, the parser would fail trying to apply the link rule but not finding the ]] right stop characters.
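The failure can be reproduced with a small sketch. This is not how ANTLR decides between rules; `parse_committed` is a hypothetical matcher that commits to a rule purely from the left stop characters, which is enough to show the problem.

```python
import re

STRING = r"[a-zA-Z0-9 ]+"
LINK = re.compile(rf"\[\[({STRING})\]\]")
ALIAS = re.compile(rf"\[({STRING})\]\(({STRING})\)")


def parse_committed(text, pos=0):
    """Pick a rule from the left stop characters alone, then commit to it."""
    if text.startswith("[[", pos):
        match = LINK.match(text, pos)
        if match is None:
            raise SyntaxError("saw '[[' but never found the closing ']]'")
        return ("link", match.group(1))
    if text.startswith("[", pos):
        match = ALIAS.match(text, pos)
        if match is None:
            raise SyntaxError("saw '[' but the alias rule did not match")
        return ("alias", match.group(1), match.group(2))
    raise SyntaxError("no rule applies")


print(parse_committed("[[home]]"))
print(parse_committed("[alias](target)"))
try:
    parse_committed("[[alias](target)]")
except SyntaxError as error:
    print("failed:", error)
```

Starting one character later, at the second [, the alias rule matches fine. The input is not malformed; the matcher simply chose the wrong rule too early.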

There are ways to work around ambiguous rules, but it would be better to design the language to remove these ambiguities if possible. The best workaround I have discovered is to define each rule with optional characters to cover other ambiguous rules. From our previous example, you could add an optional [ like so.

link: [[ STRING ]]
alias: [? [ STRING ]( STRING )
STRING: [a-zA-Z0-9 ]+

The parser can remove the ambiguity by matching the left stop characters on both rules. Note that this is ANTLR 4 specific, but you may be able to find a similar solution in other grammar definition languages.

Further reading

I am a fan of ANTLR 4. I have found it to be powerful, easy to use, performant and well supported. A Clojure wrapper exists for its Java implementation. @aphyr even did some performance tests of it (specifically comparing it to Instaparse). If you want a deeper dive into using ANTLR then I'd recommend The Definitive ANTLR 4 Reference. There are plenty of helpful examples of ANTLR-based grammars for different languages available on GitHub.

Published: 2022-12-11

Tagged: dsl clojure blog