Dmitry Petrov Back

Part 5: Gems and pitfalls

All posts from the series can be found here

All the points below summarize the things I enjoy using in Common Lisp and are not especially the ones preferred or recommended by community. Also proof reading showed that many of the examples were already mentioned earlier, but I still think it’s good to have them all in one place for the reference. Here you go.

Generic functions

If I were to choose the single most amazing feature of the language I would choose generic functions. Why?

The reason is that decoupling them from the object resulted in amazing freedom and flexibility when using them.

First of all, generic function dispatch against all it’s arguments and second, all of the arguments can be of any type and not some object hierarchy. What the means that it’s very easy to build recursive implementations that do some preliminary work with arguments and then call the same function again which makes generic function dispatch differently and execute another implementation. I used this trick a lot.

One example of such a definition could be fetch-posts function. What we know that we want to fetch posts to some kind of database, which has a store. Such a definition means that we can have objects of both types at our disposal. With described technique it’s simple to make it work as desired:

(defgeneric fetch-posts (db))

(defmethod fetch-posts ((db <db>))
  (set-credentials db)
  (fetch-posts (fetch-store db)))

(defmethod fetch-posts ((store <store>))
  (do-actual-fetch))

When we call fetch-posts against the database implementation just extracts store from it and call the function again which execute another method and now we can fetch posts both with store or database instance without any effort.

Another bonus from decoupling is that you are free to create any new function and implement it against any object hierarchy. It may be not even a hierarchy but just a set of types.

Remember function to-hash-table that I used before? It is implemented as a generic function.

(defgeneric to-hash-table (source &key))

With such a definition you can just implement it against any type that you want without any restrictions. When I needed it for the database I wrote this implementation:

(defmethod to-hash-table ((db <db>) &key (key-sub #'itemid))
  (let ((ht (make-hash-table :test 'equal)))
    (dolist (post (posts db) ht)
        (setf (gethash (funcall key-sub post) ht) post))))

And when later it became clear that <store> class can also benefit from one, I just implemented it:

(defmethod to-hash-table ((store <store>) &key)
  (let ((ht (make-hash-table :test 'equal)))
    (dolist (item (events store) ht)
      (let ((itemid (-<> item
                         (getf <> :event)
                         (getf <> :itemid)))
            (ts (getf item :sync-ts)))
        (setf (gethash itemid ht) ts)))))

Having said that I need to mention that this freedom applies to external functions as well and many of the external libraries expose precisely generic functions to control their behavior. The good example is with plump library when it was enough just to implement function against the set of types without doing any work at all with types.

Last bit that I want to praise is auxiliary methods. This is just incredible because of flexibility it gives. When I just started coding I wanted to have database saved on any operation. Post created? Save! Updated? Save! Deleted? Save!

Using :after modified allowed me to completely decouple the logic and for example, the main implementation of publish-post knew nothing about saving to the database, but meanwhile, in the other package it was as easy as

(defmethod publish-post :after ((db <db>) (post-file <post-file>))
  (save-posts))

And it’s done.

Another use case can be found in markdown handling that I did. What I wanted to do was to write a file with links pointing to the local files and translate them to the real urls on the markdown compilation phase.

(defmethod render-span-to-html :before
    ((code (eql 'inline-link)) body encoding-method)
  (let ((record (cl-journal.db:get-by-fname cl-journal::*posts* (cadr body))))
    (if record
        (setf (cadr body) (cl-journal.db:url record)))))

:before method does not in general change the behavior of the main code, however, it has access to all it’s arguments and in this case I check if whatever is passed as an url can be resolved to a post url from the database and modify argument accordingly. After that, I don’t really need to touch library logic, it works as usual.

Streams

Streams are really powerful. Languages that don’t have them (let’s say javascript) often end up with ugly hacks or additional code or with string concatenation all over the place and with streams all the api gets streamlined immediately and the neat thing is that it’s still totally under control.

I benefited from this fact quite a lot. One example is saving data to the file. Common Lisp provides a function pprint that prints data structure in a nice way. Default is stdout, but stream can also be passed as an argument. Given that saving state to file becomes as simple as:

(defun save-posts ()
  (with-open-file (out *posts-file*
                       :direction :output
                       :if-exists :supersede)
    (with-standard-io-syntax
      (pprint (to-list *posts*) out))))

And what’s cool is that whatever is printed with pprint by the definition can be consumed by read. Since read invokes Common Lisp reader, it has some security implications for sure. In my use case however, I didn’t really bother about this part, because I’m a sole user of the cl-journal and database reading and writing logic is written in such a way, that I can change it to whatever format anytime without any friction.

Another usage was showcased in part about html to markdown conversion. Serialization logic in plump simply prints results in *stream* variable that by default happens to be equal to stdout (if I’m not mistaken) and in the method I could capture any part of output simply by temporary setting *stream* to the other stream and then doing whatever I found fit with the output.

(with-output-to-string (out)
  (let ((*stream* out))
    (loop for child across (children node)
          do (serialize-object child))))

format works with streams too, I’ll talk about it later.

Special variables

Special variables is yet another super powerful concept. What’s special about them is that they use dynamic binding instead of a lexical one and that means the changing the value of such not only for the current scope but for all the code that is called from there.

This gives a powerful weapon to penetrate through layers of abstractions without a cost of passing this variable through or making it global in the true sense of this word.

In case of plump special variable provided an easy way to control the output.

Loop

If were to write this post a couple of months ago I wouldn’t mention loop at all. I came to Common Lisp from clojure and my brain was really fixed on immutable data structures, hence I tried to avoid any imperative constructs. Another moment was that loop had a controversial perception across community some of which preferred iterate and the rest tried to avoid both.

The turning point was when I started writing fetch and sync logic and I had to iterate there a lot in many different ways. I tried loop once then twice and then I ended up using it all over the place. Why so?

Simply because loop unify all different looping constructs that other languages have like for or while or even map in one unified call and in addition to that gives local bindings all an easy way to execute body only under certain conditions and return from any point, and I’m sure there are lots of things I’m not aware of.

Let me show a couple of especially impressive examples from the code. Here is a function that prints a list of files that need to be merged:

(defun get-merge-candidates (db)
  (let ((store (restore-source-posts (fetch-store db)))
        (ht (to-hash-table *posts*))
        (visited (make-hash-table)))
    (loop for event in (events store)
          for itemid = (getf (getf event :event) :itemid)
          for post = (gethash itemid ht)
          when (and (not (gethash itemid visited))
                    (or (not post)
                        (older-than-p post (getf event :sync-ts))))
            collect
            (progn
              (setf (gethash itemid visited) t)
              (if (null post)
                  (format nil "~a - ~a"
                          itemid
                          (getf (getf event :event) :url))
                  (format nil "~a - ~a (~a)"
                          itemid
                          (getf (getf event :event) :url)
                          (filename post))
                  )))))

Iteration goes over (events store), two other fors work just like local bindings. when part uses local bindings as well as bindings from function scope to understand if this particular post needs to be merged. If this check succeeds collect executes next form and appends it to a resulting list to be returned. In this body we update a list of visited posts so that any duplicate post is ignored.

Next example is silly but show loop can serve as a loop construct.

(defun lj-get-server-ts ()
  ;; scary hack to get server ts in a single timezone
  (labels ((r () (getf (lj-getevents '(1000000)) :lastsync))) ;; something big enough to have empty lookup
    (loop with a = (r)
          do
             (let ((b (r)))
               (format t "~a~%" b)
               (when (older-p a b 10) (return a))
               (when (older-p b a 10) (return b))))))

with serves as a local binding there and whenever we want to finish we can simply call return and it’s argument will be the return value.

Yet another loop combo allows an easy traversal of plist with destructuring of key/value pair:

(loop for (key value . rest)
        on (getf post-file :fields)
      by #'cddr
      do
         (format out "~a: ~a~%"
                 (string-downcase (symbol-name key))
                 value))

From examples above you can already spot the greatest weakness of this macro - it’s syntax is so diverse that using it looks natural only when you read it, writing tends to be more in the field of trial and error.

Macros

There is a lot written about macros and their pros and cons. Main drawback for me is that their usage or, better, usage of nonstandard ones has a huge impact in readability of the code simply because you need to go and understand them first and macros are not the easiest thing to understand, especially for people without years of full time lisp development like me.

From the other side writing them is a total pleasure because you can watch the language mold under your hands. I’ll show one example there and it’s naming is probably totally incorrect, but it solved the issue I had.

To print the status I had to print information about different states - print a list of new files, drafts, updated files etc. Each of them had it’s own sub to produce a list and it’s own text of course. To make it more human-friendly I wanted to have not one but three text - for zero, one and many results.

I wrote initial logic as a function and amount of duplication became obvious very soon. I decided to give macros ago and imagined a perfect way to generate a status of a certain kind. I could do a macro that does code execution in place but that would sit in the body of a single function inflate it’s size. Based on that I decided to make a macro to generate a function definition.

This was a perfect syntax in my opinion:

(with-files new (get-new-files)
  "There are ~a new files to publish~%"
  "There is a new file to publish~%"
  "No new files to publish~%")

That would generate a function with the name with-new-files that takes a callback to print a list that’s a result of calling (get-new-files) form and prints header by itself and items list with this callback.

The callback was an extension point to remove any constraint on the type of list items and let actual code deal with it.

With the macro implemented actual status code again became really trivial.

(defun print-status ()
  (flet ((print-names (items)
           (format t "~%~{    ~a~^~%~}~%~%" (mapcar #'filename items)))
         (print-string-names (items)
           (format t "~%~{    ~a~^~%~}~%~%" items))
         )
    (with-draft-files #'print-string-names)
    (with-new-files #'print-names)
    (with-modified-files #'print-names)
    (with-deleted-files #'print-names)
    (with-fetched-files #'print-string-names)))

And here is the actual macro code:

(defmacro with-files (name accessor multiplemsg singlemsg nomsg)
  (let ((fn-name (intern
                  (concatenate 'string
                               "WITH-"
                               (symbol-name name)
                               "-FILES"))))
    `(defun ,fn-name (cb)
       (let ((items ,accessor))
         (if (> (length items) 0)
             (progn
               (if (> (length items) 1)
                   (format t ,multiplemsg (length items))
                   (format t ,singlemsg))
               (funcall cb items))
             (format t ,nomsg))))))

What happens there is I’m assembling function name and then return function definition with placeholders replaced with the data from arguments.

Macro is not a tool for any task but it can be really life changing if you only ever programmed with macroless languages.

Format

I find format really fascinating. Most of the languages mimic C, they implement sprintf style functions and it’s a shame given how much more powerful format is.

First of all, it works on streams, that gives a lot of power on their own, as I wrote earlier. Next it literary has all functionality necessary to always print to output with one call.

It’s possible to print a quoted value, a value without surrounding quotes, add all sorts of paddings to the printed data and to even print arrays!

Here are few spells:

(format nil "~{~a/~}~a-~2,'0d-~2,'0d-~a.md"
        rest
        (getf date :year)
        (getf date :mon)
        (getf date :day)
        name
        )

What happens there? We print a pathname. rest contains a list with folders, date parts go after that and in the end we want to print the name. ~{~a/~} prints folders separated by the forward slash, ~2,'0d ensure that two digit number is printed and pads it with leading zeros if necessary. If arguments are ((list "path" "to" "file") 2018 3 4 title result will be path/to/file/2018-03-04-tile.md

Or here is how I print a list of files in status, ~% means newline:

(format t "~%~{    ~a~^~%~}~%~%" items)

Format can do much more and it simply eliminates manual fiddling with data to prepare it for printing.

Little things

There are lots of decisions in Common Lisp standard package that make you only wonder why didn’t the find the way to any other language.

For example, read-line function accepts an argument that will be returned if the end of stream was reached. Why would you need this? Simply because returning custom value there can make some upper-level logic more generic. Btw, dolist does that too.

Another small nicety is that many functions on collections accept parameters like :test or :key that immediately make them more useful.

Here is how last published post is found, for example:

(defun get-last-published-post (db)
  (car (sort (posts db) #'> :key #'created-at)))

Not effective, I know. Or member function in this example:

(defun known-editor (e)
  (member e '("vim" "emacs" "emacsclient") :test 'string=))

It’s string there, but having this parameter immediately allows to have list members of any type as long as there is an equality check for them.

Common Lisp pitfalls

Packages

Before writing cl-journal I didn’t have too much experience working with Common Lisp, so I decided to pick up whatever I considered to be latest best practices and try to live with it. I’ve split all the functionality into separate packages and tried to export and import only really necessary functionality.

Soon I’ve found out that this way of development was much more verbose there than in other languages and the reason was mostly CLOS. Packages serve as containers for symbols in Common Lisp and any symbol you want to share between packages needs to be exported.

That means that class name should be exported as well as all generic functions generated for it accessors. A good example is cl-journal.db package. Three classes with a handful of slots each generated a long list of symbols to export and besides that, there were still ordinary functions and special variables. And since generic functions where magically generated it still left open questions about how they will work if such generic functions were imported from two different packages into third one. I’m sure this behavior is defined somewhere, but I don’t know.

In the end, I got so tired of all this maintenance that I started importing whole packages with :use even though I treated that earlier as a non recommended way.

Standard library

I think a very thick book can be compiled from all the complaints regarding the standard package of the language. Sometimes there are functions that are of no interest to most people and there are many cases when obviously necessary functions are not there.

Good examples are user input, string manipulations or external processes. Some things like prompt are more or less easy to do, but password input proved to be a really painful exercise. uiop/run-program is also far from the easiest function to use. Ok, maybe I missed a very good tutorial on this one, but I had to go through it’s code several times to understand the details or meaning of it’s parameters.

Common Lisp comes from the time when it was if not mainstream but a widespread language with lisp machine legacy and from what I understand this had an influence on it’s relations with outside world and that really hurts, especially comparing string and io operations with languages like perl that do that this bit particularly well.

Third-party libraries

That’s another very common complaint. Common Lisp libraries are often of fantastic quality feature-wise but it doesn’t really help if they have no documentation or a brief one that explains 10% of the functionality. And you can almost forget about library specific tutorials. Getting over this was a rewarding intellectual achievement for me but the price was time, lots of time.

Here I’d like to admit that plump library had one of the best documentations and even a couple of projects using it in neighbor repos. This bit really helped me in understanding the library and coming up with a proper solution for html conversion.

Final thoughts

You probably spotted already that I’ve been mentioning Livejournal everywhere it can give an impression that there is no way in life for the client to support any other service. That’s actually not true, but will require a good amount of work of course. The first step would be to abstract away remote api and the second one will be to make <post> and <post-file> classes service agnostic. It’ll be even easier if we do not set the aim to support the same feature set for every single platform and provide a cl-journal as a platform that can give the same experience as it does now for Livejournal with service specific changes.

In terms of featureset, a lot of things can be done better of course. For example, I can think of template posts, or client can support failures much nicer, but after using it for more than two years I can say that none of it is something that turns it into something unusable.

Was it a correct choice to use Common Lisp for implementation? For me, absolutely yes, because a lot of things I implemented would have taken twice as code to get them working and interactive development mad me so performant that I was able to add significant features to the code even with a very limited time I had.

One good learning for me was to add test into interactive development workflow. It was really not that obvious for me, but comment driven development as I did for syncitems with an addition of test allowed me to write functional code even in the time of heavy sleep deprivation that I tend to fall into.

Another good learning that I had was to follow a bottom-up approach, you could see an example in the merge description. Whenever I approached big task I started asking myself very simple question starting from “How do I get the changes?” or “How do I get the filename” and solving them slowly one by one. After they were all complete the final task that was thought to be complex and indeed was turned out to be a very small function with simple logic.

Thank you for reading.