Hacklog: Diffing and Comparing Guix Derivations Using Breadth-first Search & Jaccard (February, 2026)

A cool thing about Guix (and probably functional package managers in general) is that derivations form a directed acyclic graph (DAG), which means that every package with its dependencies, and every system configuration, can be represented as one. Another, even cooler, thing is that Guix provides a graphing utility called `guix graph` which helps visualise these DAGs with Graphviz (if you ever wanted to frame a picture of your favorite package graph or play a game of "is this the dependency graph of a Rust package or the visualization of a Mandelbrot set?", this should be the tool of your choice).

This post is me thinking out loud about my diff-drv utility, which helps compare these DAGs, and about my current approach to measuring how deep the changes go. I plan to write a more regular hacklog, so feedback on this kind of post is very welcome!

The utility itself can be found on Codeberg as theesm/diff-drv.

SRFI-1, Guix Graph ...

As someone who a. exclusively uses Guix System on their personal/work computers and servers and b. from time to time contributes to the packaging side of Guix in their spare time, I sometimes want to know:

and up until two years ago I did so by:

... And The My REPL Hackery Should Actually Be A Program Pipeline

I shared my second approach on StackOverflow two years ago, in a question on how to find the build information of a Guix store item. If I find myself doing something repetitive often enough on a REPL, I usually turn it into a program, which is why I wrote my first draft of diff-drv around the same time. The output looks like this:

    diff-drv /gnu/store/7z3ij54rnhff65wa6byg0rrycrh5442g-system.drv /gnu/store/5s0y5c3fxkxrdak5n5knv0pp2wx6g8q2-system.drv
      outputs  (  -1  +1  U0)
               - ("out" . "ysnz14g7drrmddwp0vdqyy0dvvrh9wv5-system")
               + ("out" . "dsz5sb6ccvb2yfpjmz6a6hs9nwraj5pb-system")

       inputs  (  -4  +4  U7)
               - "0aqfbi51i9l1pa1h2f83kb45add9iy86-provenance.drv"
               - "hnaqz9r5b2sp1556fck3c3gka541a5w2-etc.drv"
               - "mq5ypn7f5cmb70nmaww65w2hzzqm8gvj-boot.drv"
               - "ys9axwc1w8v6jpfqgr0p6zxp8blnqzgp-activate.scm.drv"
               + "2il6si89psh41265j6y9lapwqb9v8w55-activate.scm.drv"
               + "fn8pg1jg664zbak6lz2id3dw0fr57153-boot.drv"
               + "j3wj6yqqz35xbxkbp63vh7cjcdahpl1m-provenance.drv"
               + "v3m0vrrgbix8nb2q7i085l1jakbndvjz-etc.drv"
               u "09qcqgni16z3hpv98hlvg5506iyzyy87-profile.drv"
               u "2l9815zdbf5nm4r5ia1njcl6yvrwyrda-module-import-compiled.drv"
               u "b5flkz0fxnprg9qhb38mp91b87n6c2ar-raw-initrd.drv"
               u "hcll0jzfrikv9km83799xqkcl25vx5vc-parameters.drv"
               u "lmw8m5hqr5x476scw4gmc5qdxqlfx56p-locale-multiple-versions.drv"
               u "lvwn6apbaanyiqq1n3ivcypf6b9pvnay-profile.drv"
               u "rhmvzpzk4qc33h0nhd15swfrh7155qsv-guile-3.0.9.drv"

      sources  (  -2  +2  U2)
               - "nr7h7kgdzgdpbqpnrpc9rkzx65giajin-configuration.scm"
               - "v8rw1dvznmc89qvr8lnb633cnrdzwah1-system-builder"
               + "av2ag8gj3k6m09lx17vf53p053h5znwn-configuration.scm"
               + "yl1i65cvxq2ng6570midxar0ziryc52w-system-builder"
               u "c6cf35bavqqs5mqsffl45izqaf0qn4dg-module-import"
               u "qq9fwykdm519k5rqx9j8c8h97x2l39p7-channels.scm"

       system  (  -0  +0  U1)
               u "aarch64-linux"

      builder  (  -0  +0  U1)
               u #f

         args  (  -1  +1  U5)
               - "v8rw1dvznmc89qvr8lnb633cnrdzwah1-system-builder"
               + "yl1i65cvxq2ng6570midxar0ziryc52w-system-builder"
               u "--no-auto-compile"
               u "-L"
               u "c6cf35bavqqs5mqsffl45izqaf0qn4dg-module-import"
               u "-C"
               u "hxbq4i5516rrfhmfjhwfhvlray9gbjmx-module-import-compiled"

     env-vars  (  -1  +1  U2)
               - ("out" . "ysnz14g7drrmddwp0vdqyy0dvvrh9wv5-system")
               + ("out" . "dsz5sb6ccvb2yfpjmz6a6hs9nwraj5pb-system")
               u ("LC_CTYPE" . "C.UTF-8")
               u ("preferLocalBuild" . "1")

    file-name  (  -1  +1  U0)
               - "7z3ij54rnhff65wa6byg0rrycrh5442g-system.drv"
               + "5s0y5c3fxkxrdak5n5knv0pp2wx6g8q2-system.drv"

and it already tells us superficially what has changed between two derivations. We're able to answer whether a direct dependency has changed or has been added/removed, whether builder args/env-vars have changed, and whether an output has changed (which is more useful for packages with multiple outputs).
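The per-field counters such as (-4 +4 U7) boil down to set differences between the two lists. Here is a minimal sketch of that idea using SRFI-1; diff-field and the shortened file names are made up for illustration and are not diff-drv's actual code:

```scheme
(use-modules (srfi srfi-1))

;; Hypothetical helper: split two lists of field values into
;; removed (only in OLD), added (only in NEW) and unchanged entries.
(define (diff-field old new)
  (values (lset-difference equal? old new)
          (lset-difference equal? new old)
          (lset-intersection equal? old new)))

(call-with-values
    (lambda ()
      (diff-field '("aaa-etc.drv" "bbb-profile.drv")
                  '("ccc-etc.drv" "bbb-profile.drv")))
  (lambda (removed added unchanged)
    (format #t "(-~a +~a U~a)\n"
            (length removed) (length added) (length unchanged))))
;; prints (-1 +1 U1)
```

Comparing with equal? means the same helper works for plain strings (inputs, sources) and for pairs such as ("out" . "...") in outputs and env-vars.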

Down This Weekend's Rabbit Hole

However, this doesn't give us any clue how deep the changes go, how much of the DAG has changed, or where down the graph the cause of the change lies. While I solved my REPL shenanigans with diff-drv, I still found myself looking at visualized graphs to figure out what exactly has changed.

A mail written by me to guix-devel on possibly introducing a guix store command (that would cover store plumbing commands such as diffing and pretty-printing items) and this brief IRC conversation:

> ieure: wonders if there's a .drv file pretty-printer lurking in Guix somewhere
> Rutherther: https://codeberg.org/guix/guix/pulls/6199 :)
> theesm: Rutherther: almost forgot about the mail regarding the drv/store interactions commands i wanted to send to guix devel as there's been several codeberg issues discussing the same thing (mentioned in 6199 that i wanted to write one, maybe now's a good time to do so)
> Rutherther: theesm: yeah, this is a recurring topic for sure
> theesm: think i'll write the mail now^^ (especially as i'm looking for a reason to finally retire my hacky awful aterm .drv files diff thing (<https://codeberg.org/theesm/diff-drv>) if stuff regarding store commands land in guix proper one day)
> civodul: theesm: diff-drv, i’ve always wanted that!
> civodul: ah but i think it’s not quite what i want: i’d want a recursive diff, that finds the “deepest” differing input

led me to spend parts of this weekend improving diff-drv to be actually (and not just superficially) useful. And of course, there's an XKCD to explain my weekend shenanigans!

The new, additional output regarding depth and similarity that diff-drv provides looks like this:

    theesm@minty ~/diff-drv [env]$ diff-drv --bfs /gnu/store/89d1x358yq6flynsq29kdgfrzlyaz7p5-linux-libre-arm64-mnt-reform-6.18.7.drv /gnu/store/wfmv6z3bblly1akwa4hml4jp43csdi2h-linux-libre-arm64-mnt-reform-6.18.10.drv
    Similarity (Jaccard): 0.9240  (92.40%)
    A (linux-libre-arm64-mnt-reform-6.18.7.drv): 420 (only A: 19, shared: 401 = 95.48%)
    B (linux-libre-arm64-mnt-reform-6.18.10.drv): 415 (only B: 14, shared: 401 = 96.63%)
    A ∪ B: 434 |  A ∩ B: 401 |  Diff: 33

    paths only present in /gnu/store/89d1x358yq6flynsq29kdgfrzlyaz7p5-linux-libre-arm64-mnt-reform-6.18.7.drv:
     * 89d1x358yq6flynsq29kdgfrzlyaz7p5-linux-libre-arm64-mnt-reform-6.18.7.drv
      * 7laq0xpvkwixyj8yghv2iap9l7qyia68-dwarves-1.29.drv
        * az4im9jak383yfmivk0nd0h5r0sm26g2-cmake-minimal-3.31.10.drv
          * hy51f0d74f8jxqwpgb5k729z5j1w67bx-jsoncpp-1.9.6.drv
            * wx3w3szw60f7wbdx0kymmn3bakyg7gly-meson-1.9.0.drv
              * 3c1rpfvx101cikrxn8pm58nw8ciaj2rk-python-3.11.14.drv
              * ad8sr2mrd3kah317sb6cs8f0ij5dbq4j-python-setuptools-80.9.0.drv
              * bb1xi2z3rdg8nfx82i25lias7n9n045n-guile-json-4.7.3.drv
                * g934s5hvgh39371kcld10q1hmvs6chzz-guile-json-4.7.3.tar.gz.drv
              * h32mzli797v881bccxyqhkiqk59y3w9x-python-wrapper-3.11.14.drv
              * jjadn60fv200w7i224d7inad2z3nmh30-module-import-compiled.drv
          * rmmlcn20a28ba6kgvax5qa6skca84vkd-cmake-bootstrap-3.31.10.drv
      * a0jpykjbgn45w8i98r31z69shmx8gfkc-reform-debian-packages-2023-07-10-515-gc527a1d.drv
        * rirbz073z6g5l2w5n1v43l0wp341k9ya-reform-debian-packages-2023-07-10-515-gc527a1d-checkout.drv
      * xia4jsdqvq3776106v7372nks2y2yhm4-linux-libre-6.18.7-guix.tar.zst.drv
        * ap3916dn6ql87il8iqbcmhbfgmysx5v5-linux-libre-6.18.7-guix.tar.xz.drv
          * fgk7agjgd291nhrb03zgwz63qp9c94b1-linux-libre-deblob-6.18.7-gnu.drv
          * mnsa49q6dln255lhcg6vqnz6pcd55si6-linux-6.18.7.tar.xz.drv
          * vvha80z5g5ij8r0cznlhcz6v6xkvqh80-linux-libre-deblob-check-6.18.7-gnu.drv

    paths only present in /gnu/store/wfmv6z3bblly1akwa4hml4jp43csdi2h-linux-libre-arm64-mnt-reform-6.18.10.drv:
     * wfmv6z3bblly1akwa4hml4jp43csdi2h-linux-libre-arm64-mnt-reform-6.18.10.drv
      * 8jcjzpik7q2lay8hl4r342bpmh7s82sa-reform-debian-packages-2023-07-10-525-g7e8a95c.drv
        * pm06jz5fbdyb5dwwbwmzlj69i4lhkjh1-reform-debian-packages-2023-07-10-525-g7e8a95c-checkout.drv
      * rdc9x694chl1gjyzhls0hwrsama2b50j-dwarves-1.29.drv
        * s4rqk30j1wpzjn79hfqyglkqfmsrc13i-cmake-minimal-3.31.10.drv
          * 9avyw2v2q8m8mr4d6v3sxnx32524hipa-jsoncpp-1.9.6.drv
            * 41pxa9sv01v5gp0sdg1l1zjqmpydqibw-meson-1.9.0.drv
              * 1h7k8vhd33h0kq1kwfc3rgyyjwjspycr-python-setuptools-bootstrap-80.9.0.drv
          * wbvkvaxf46jxjr001hiqnp983s0mkixa-cmake-bootstrap-3.31.10.drv
      * z6rdck21vbpvci6kqa85ibl510hy0vxp-linux-libre-6.18.10-guix.tar.zst.drv
        * hxyfxbc9b9bra3l3akasm7cs2n6s8s8n-linux-libre-6.18.10-guix.tar.xz.drv
          * 53b2d046s5jfa5wvryry1j6jf80821vs-linux-libre-deblob-6.18.10-gnu.drv
          * i01ykrczzklch882hy0yrwamv0ja1byg-linux-6.18.10.tar.xz.drv
          * mp0825xk3vcvsrjp7jbzygzchfwp3r13-linux-libre-deblob-check-6.18.10-gnu.drv

Meaning that we now have a Jaccard index as a means to describe the similarity of both DAGs, stats on how many nodes both derivations have and how they differ:

    Similarity (Jaccard): 0.9240  (92.40%)
    A (linux-libre-arm64-mnt-reform-6.18.7.drv): 420 (only A: 19, shared: 401 = 95.48%)
    B (linux-libre-arm64-mnt-reform-6.18.10.drv): 415 (only B: 14, shared: 401 = 96.63%)
    A ∪ B: 434 |  A ∩ B: 401 |  Diff: 33

and listings of the actual paths that differ between both graphs. Let's talk about how it works next, so sit back, grab a cup of coffee, and put on some Lucy Dacus.

Bread(th) & Roses

My first step in this was to come up with a reasonable way to build the input graph of a derivation. Luckily, Guix's (guix derivations) module provides us with functions to make this task easier. I also took some inspiration from what guix graph is doing, especially in derivation-dependencies, but I didn't manage to wrap my head around what lift1 does and how (guix monads), which it uses, works, so I decided to roll my own.

I used breadth-first search (BFS) to get the set of all reachable derivations and a path from the root to each derivation.

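For this I keep a queue of derivations still to visit, a parent map that records through which derivation each node was first reached, and a cache of each derivation's direct inputs: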

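    ;; make-q, enq!, deq! and q-empty? come from (ice-9 q); the derivation
    ;; accessors come from (guix derivations).
    (use-modules (ice-9 q)
                 (guix derivations))

    ;; Return a hash table that maps every derivation reachable from ROOT to
    ;; the parent it was first reached through (#f for ROOT itself).
    (define (build-derivation-graph root)
      (let ((parent-map (make-hash-table))
            (queue      (make-q))
            (child-cache (make-hash-table)))

        ;; Direct inputs of DRV-PATH as a list of .drv file names, memoized.
        (define (children drv-path)
          (or (hash-ref child-cache drv-path #f)
              (let* ((drv  (read-derivation-from-file drv-path))
                     (deps (map (lambda (input)
                                  (derivation-file-name
                                   (derivation-input-derivation input)))
                                (derivation-inputs drv))))
                (hash-set! child-cache drv-path deps)
                deps)))

        (hash-set! parent-map root #f)
        (enq! queue root)

        ;; Plain BFS: a node is enqueued only the first time it is seen, so
        ;; its recorded parent lies on a shortest path from ROOT.
        (let bfs ()
          (if (q-empty? queue)
              parent-map
              (let ((node (deq! queue)))
                (for-each
                 (lambda (child)
                   ;; hash-get-handle instead of hash-ref, because the
                   ;; root is stored with the value #f.
                   (unless (hash-get-handle parent-map child)
                     (hash-set! parent-map child node)
                     (enq! queue child)))
                 (children node))
                (bfs))))))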

build-derivation-graph explores the graph layer by layer and remembers, for every derivation, only the parent it was first reached through. A simplification this implementation makes is therefore that we don't keep the full DAG: we construct a shortest-path spanning tree of all reachable derivations, which I consider a "good enough" approach.
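Because the parent map stores exactly one parent per node, the path from the root to any derivation can be recovered by following parent links until #f is reached. A minimal sketch, assuming a parent map shaped like the one build-derivation-graph returns; path-from-root and the toy entries are hypothetical and not part of diff-drv:

```scheme
;; Hypothetical helper: follow parent links upwards from NODE.
;; Consing onto PATH makes the result run from the root down to NODE.
(define (path-from-root parent-map node)
  (let loop ((cur node) (path '()))
    (if cur
        (loop (hash-ref parent-map cur #f) (cons cur path))
        path)))

;; Toy parent map with made-up names; the root maps to #f.
(define toy-parents (make-hash-table))
(hash-set! toy-parents "system.drv" #f)
(hash-set! toy-parents "etc.drv" "system.drv")
(hash-set! toy-parents "udev.drv" "etc.drv")

(path-from-root toy-parents "udev.drv")
;; => ("system.drv" "etc.drv" "udev.drv")
```

If a derivation is reachable through several branches, only the branch BFS found first shows up in this path, which is exactly the simplification described above.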

The only "downside" of this becomes apparent when a deeper-lying dependency changes: because derivation input hashes propagate upwards, a single change will most likely cause many nodes along different branches to appear modified, resulting in a pretty verbose diff output which can become pretty hard to read.

To make this more readable, we could probably build something that filters for the deepest-lying changes and only displays a list of those, but that's for another time.
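One way such a filter could look, as a rough sketch rather than anything diff-drv does today: keep only those changed derivations none of whose direct inputs changed as well. deepest-changes and the toy graph are made up for illustration; the children argument is assumed to be a procedure returning a node's direct inputs, like the local children helper in build-derivation-graph.

```scheme
(use-modules (srfi srfi-1))

;; Hypothetical filter: ONLY is the list of nodes missing from the other
;; graph, CHILDREN returns the direct inputs of a node.
(define (deepest-changes only children)
  (let ((changed (make-hash-table)))
    (for-each (lambda (n) (hash-set! changed n #t)) only)
    ;; Keep a node when none of its direct inputs is itself changed.
    (filter (lambda (n)
              (not (any (lambda (c) (hash-ref changed c #f))
                        (children n))))
            only)))

;; Toy graph with made-up names: node -> direct inputs.
(define toy-graph
  '(("kernel" "dwarves" "tarball")
    ("dwarves" "cmake")
    ("cmake" "python")
    ("tarball")
    ("python")))

(define (toy-children node)
  (or (assoc-ref toy-graph node) '()))

(deepest-changes '("cmake" "dwarves" "kernel" "python" "tarball")
                 toy-children)
;; => ("python" "tarball")
```

The sketch deliberately asks for the real inputs instead of the BFS parent map: the spanning tree drops edges, so a node could look like a leaf there although one of its inputs changed too.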

Printing The Diff Dance

So if A, B and C all depend on libfoo somewhere down their dependency chain, we would see the full paths leading to all three changes wherever the hash of the store items differs. I think it would be better UX to just isolate the cause and print only that, but that's for another weekend. Right now we're doing this:

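    ;; Print the part of the BFS tree below ROOT that leads to the nodes in
    ;; ONLY (the derivations missing from the other graph). PARENTS is the
    ;; parent map returned by build-derivation-graph.
    (define (print-diff-tree root parents only)
      (format #t "paths only present in ~a:\n" root)
      (let ((mark (make-hash-table))   ; nodes that are in ONLY
            (keep (make-hash-table))   ; those nodes plus all their ancestors
            (kids (make-hash-table)))  ; kept node -> list of kept children

        ;; Mark every node in ONLY and keep its chain of parents up to ROOT.
        (hash-set! keep root #t)
        (for-each
         (lambda (n)
           (hash-set! mark n #t)
           (let loop ((cur n))
             (when (and cur (not (hash-ref keep cur #f)))
               (hash-set! keep cur #t)
               (loop (hash-ref parents cur #f)))))
         only)

        ;; Invert the parent links of the kept nodes into child lists.
        (hash-for-each
         (lambda (node _)
           (let ((p (hash-ref parents node #f)))
             (when (and p (hash-ref keep p #f))
               (hash-set! kids p (cons node (hash-ref kids p '()))))))
         keep)

        ;; Sort the children so the output is stable between runs.
        (hash-for-each (lambda (p xs) (hash-set! kids p (sort xs string<?))) kids)

        ;; Depth-first print, two spaces of indentation per level;
        ;; nodes from ONLY get a "*" marker.
        (let walk ((node root) (depth 0))
          (format #t "~a~c ~a\n"
                  (make-string (* 2 depth) #\space)
                  (if (hash-ref mark node #f) #\* #\space)
                  (store-path-base node))
          (for-each (lambda (c) (walk c (+ depth 1)))
                    (hash-ref kids node '())))
        (newline)))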

to get to our diff tree. It currently depends on the report-jaccard+diffs function, which I'll introduce next, for its data, as that's where only is coming from:

    (call-with-values
        (lambda () (report-jaccard+diffs a b parentsA parentsB))
      (lambda (only-a only-b)
        (print-diff-tree a parentsA only-a)
        (print-diff-tree b parentsB only-b)))

print-diff-tree mostly does really ugly plumbing to get a printable, pruned tree with formatted nodes. As I'm hopefully rewriting this part soon to fit the existing formatting of diff-drv better, I won't go into detail for now.

Jaccard et al

I have the awful tendency to use single/dual-letter variable names when hacking things on a REPL and to come up with abstractions that sounded more useful than they turned out to be in the end (hence walky!). I'll ignore the rough edges for now, and will focus on report-jaccard+diffs:

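    ;; don't know if i shoot myself in the foot with this abstraction
    ;; PRs suggesting a better name for this welcome
    ;;
    ;; Calls on-key for every key of hash table src, then on-shared if the
    ;; key is also present in other, or on-only if it is not.
    (define* (walky! src other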
                     #:key
                     (on-key    (lambda (_k) #t))
                     (on-shared (lambda (_k) #t))
                     (on-only   (lambda (_k) #t)))
      (hash-for-each
       (lambda (k _)
         (on-key k)
         (if (hash-get-handle other k)
             (on-shared k)
             (on-only k)))
       src))

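    ;; A and B are the parent maps of the derivations a and b.
    ;; sa = |A|, sb = |B|, i = |A ∩ B|; only-a and only-b collect the nodes
    ;; that appear on one side only and are returned as two sorted lists.
    (define (report-jaccard+diffs a b A B)
      (let ((sa 0) (sb 0) (i 0) (only-a '()) (only-b '()))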

        (walky! A B
                #:on-key    (lambda (_k) (set! sa (+ sa 1)))
                #:on-shared (lambda (_k) (set! i  (+ i 1)))
                #:on-only   (lambda (k)  (set! only-a (cons k only-a))))

        (walky! B A
                #:on-key  (lambda (_k) (set! sb (+ sb 1)))
                #:on-only (lambda (k)  (set! only-b (cons k only-b))))

            ;; u = |A ∪ B| by inclusion-exclusion, j = |A ∩ B| / |A ∪ B|
            (let* ((u (- (+ sa sb) i))
                   (j (if (= u 0) 1.0 (/ (exact->inexact i) (exact->inexact u))))
               (only-a (sort only-a string<?))
               (only-b (sort only-b string<?))
               (da (- sa i))
               (db (- sb i))
               (pU (if (= u 0) 100.0 (* 100.0 j)))
               (pA (if (= sa 0) 100.0 (* 100.0 (/ (exact->inexact i) (exact->inexact sa)))))
               (pB (if (= sb 0) 100.0 (* 100.0 (/ (exact->inexact i) (exact->inexact sb))))))

          (format #t "Similarity (Jaccard): ~,4f  (~,2f%)\n" j pU)
          (format #t "A (~a): ~a (only A: ~a, shared: ~a = ~,2f%)\n"
                  (store-path-package-name a) sa da i pA)
          (format #t "B (~a): ~a (only B: ~a, shared: ~a = ~,2f%)\n"
                  (store-path-package-name b) sb db i pB)
          (format #t "A ∪ B: ~a |  A ∩ B: ~a |  Diff: ~a\n\n"
                  u i (+ da db))

          (values only-a only-b))))

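Basically, we've got the node counts of both graphs (sa and sb), the number of shared nodes (i) and the lists of nodes that only appear on one side (only-a and only-b) at this point, which allows us to calculate the union size (sa + sb - i) as well as the Jaccard similarity (which is intersection over union, IoU) and percentages, so we're able to include a simple summary in our output: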

    Similarity (Jaccard): 0.9240  (92.40%)
    A (linux-libre-arm64-mnt-reform-6.18.7.drv): 420 (only A: 19, shared: 401 = 95.48%)
    B (linux-libre-arm64-mnt-reform-6.18.10.drv): 415 (only B: 14, shared: 401 = 96.63%)
    A ∪ B: 434 |  A ∩ B: 401 |  Diff: 33

that gives us an idea of how similar two derivations are, how much of the dependency graph is shared, whether the dependency graph of A fits in B, you get the gist.
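To make the arithmetic concrete, here is a small sketch that recomputes the index from the three counts and, alternatively, from two plain lists via SRFI-1. Both procedure names are made up for illustration; only the numbers 420, 415 and 401 come from the summary above.

```scheme
(use-modules (srfi srfi-1))

;; Hypothetical helper: Jaccard index from |A|, |B| and |A ∩ B|.
(define (jaccard-from-counts sa sb i)
  (let ((u (- (+ sa sb) i)))            ; |A ∪ B| by inclusion-exclusion
    (if (zero? u) 1.0 (exact->inexact (/ i u)))))

;; Hypothetical helper: the same index, starting from two lists of names.
(define (jaccard a b)
  (jaccard-from-counts (length a)
                       (length b)
                       (length (lset-intersection string=? a b))))

(jaccard-from-counts 420 415 401)
;; 401 / 434, roughly 0.9240 as in the summary

(jaccard '("hello" "gcc" "glibc") '("gcc" "glibc" "linux" "perl"))
;; 2 shared out of 5 distinct names => 0.4
```

The per-side percentages work the same way with a different denominator: 401 / 420 for A and 401 / 415 for B.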

Things You Most Likely Didn't Ask Yourself About Derivation Similarity

Now we're able to put this to good use. I bet you've always been wondering if you can fit the dependency tree of hello inside the dependency tree of a kernel variant such as linux-libre-arm64-mnt-pocket-reform and how similar that tree is:

    theesm@minty ~/diff-drv [env]$ diff-drv /gnu/store/fxy567lyimwkxxi23brf8y8g2106anlx-linux-libre-arm64-mnt-pocket-reform-6.17.12.drv /gnu/store/qzh7rvz609rm845xxf3jsasxzgbp208x-hello-2.12.2.drv
    Similarity (Jaccard): 0.3957  (39.57%)
    A (linux-libre-arm64-mnt-pocket-reform-6.17.12.drv): 420 (only A: 253, shared: 167 = 39.76%)
    B (hello-2.12.2.drv): 169 (only B: 2, shared: 167 = 98.82%)
    A ∪ B: 422 |  A ∩ B: 167 |  Diff: 255

turns out: it fits (well, except for the source package):

    paths only present in /gnu/store/qzh7rvz609rm845xxf3jsasxzgbp208x-hello-2.12.2.drv:
     * qzh7rvz609rm845xxf3jsasxzgbp208x-hello-2.12.2.drv
      * kbr9zcwsln63dlr0217ir7any1dcb2ss-hello-2.12.2.tar.gz.drv

Comparing System Reconfigurations

We're also now able to tell how similar two system configurations are, so we can tell whether something has been a small change like this one:

    theesm@minty ~/diff-drv [env]$ diff-drv --bfs $(guix gc --derivers /var/guix/profiles/system-12-link) $(guix gc --derivers /var/guix/profiles/system-11-link)
    Similarity (Jaccard): 0.9906  (99.06%)
    A (system.drv): 4441 (only A: 13, shared: 4428 = 99.71%)
    B (system.drv): 4457 (only B: 29, shared: 4428 = 99.35%)
    A ∪ B: 4470 |  A ∩ B: 4428 |  Diff: 42

    paths only present in /gnu/store/2fm780820z78knqn1r6hnsgc6q8ppfj6-system.drv:
     * 2fm780820z78knqn1r6hnsgc6q8ppfj6-system.drv
      * 4nyvgv999pxixffrm5jnk5fznvrvg6h7-activate.scm.drv
        * sncv8fq3dcjcz4n4n69k54iqar43li4b-activate-service.scm.drv
      * 787rdicqj5amcpdxgba95zjryv5n9j3z-etc.drv
        * dn7zij6x594c2np0g7pv8yrrz6cpfpbd-udev.drv
          * 5rj188vvgrrsd76i5c7fzz68k1csbjjx-udev-rules.d.drv
          * ic9vasgj8bfl6xhcpvbw2nfawya0jf4j-hwdb.bin.drv
            * rvqwppv5aqg3wh9v29kdlmkzqm0q7w14-udev-hwdb.d.drv
        * xm1fcyckiqhampwcvyzvn77mmnb3bg1j-dbus-configuration.drv
          * 17v8dln6lisjz0svdzx3nyja5j4i5nnm-dbus-system-services.drv
      * d5nagfkl8zh21mh6dhbyklinapybdwgh-boot.drv
        * qk1936p5qz4fknsawlk3iln0v68jdmk8-shepherd.conf.drv
      * wn8bnk0ljxjazy82ck0rdyfgrn6kb2gz-provenance.drv

    paths only present in /gnu/store/8b9ng0l6jybb9fv76kp4pfw5hpdgndja-system.drv:
     * 8b9ng0l6jybb9fv76kp4pfw5hpdgndja-system.drv
      * 3x4hnlkpphw0w12dv9xpfb2l7rhkrza5-etc.drv
        * 1mdxg10na89hz0njnz96v54ki0df31nw-dbus-configuration.drv
          * f4s0y8wha0zhgi8kkvh8pk61przxfy3w-dbus-system-services.drv
          * nmdlx1g3j09vcvbynrkvcv248j10dfhb-blueman-2.4.6.drv
            * 66fmbrkqf74v2sv8bp1pszq614r7s567-blueman-2.4.6.tar.xz.drv
            * crhnzh1fv1z0a6ag8qvvck95p69s6jgp-libappindicator-12.10.1-0-298.drv
              * 5mj32f343083kz5v9ahli49fx6mz5v4z-dbus-test-runner-19.04.0.drv
                * 38mwiq495v802y7p9chkp413ymyh6j6m-python-dbusmock-0.37.2.drv
                  * 6bzwp0gddzipd2p4fvxic409bfh8ps0b-python-dbusmock-0.37.2-checkout.drv
                  * drs86vgpcflg0p5xfhdi4klm1041zcww-upower-1.90.2.drv
                    * ppd9aq1byjw9mc8921fbq4845skvi4rn-umockdev-0.17.13.drv
                      * 02iyj43k38k58nrw3g77dg9xfdj36iwf-umockdev-0.17.13.tar.xz.drv
                    * rzamla8y0dswyx60f85cks4r8yyjwn81-upower-1.90.2-checkout.drv
                      * k6299ri40k1bzlagvl3z1059x340zz1p-upower-1.90.2-checkout.drv
                * gy4df8aqrw0m6v3b7fxg36phagnijbnf-dbus-test-runner-19.04.0.tar.gz.drv
              * klsks181ljvbllvzf8kxrn51qv4d1rd8-libappindicator-12.10.1-0-298-checkout.drv
        * 66clmb1rmmf4lhmaqbpdbfx6dinpkbzb-etc-bluetooth.drv
        * px16kpgqsfqlyv9djqn9yb78njkp02aw-udev.drv
          * bika5wj6qsw41q1jsqvn07dswqb38cxx-hwdb.bin.drv
            * zz6bhc26mr99pf0j3cldvwxn6swq9p4b-udev-hwdb.d.drv
          * lgw3qpm1x942wwz7605dwkg0jxdpkjhs-udev-rules.d.drv
      * bdby9zxazgqnhh5fnslfp2g0bs556fb3-activate.scm.drv
        * 0377sfp2h3b04wkgc8467f7i7ckyiicl-activate-service.scm.drv
      * bqrd2fxkfvrcxccnmsrnlql62zbp9ccd-boot.drv
        * cx33w99s4i8y29awsbdgfbkl2fhqi8v2-shepherd.conf.drv
          * 0vcgmxz7dw33y3ib29xndnn9khyifxkn-shepherd-bluetooth.go.drv
            * wyqbs6rhg616hb0hlpxy78rn3mjfm2dq-shepherd-bluetooth.scm.drv
      * rkyd816vsa59gycv0q9bnkqjp5s9wqlw-provenance.drv

or a rather big change like this one between the 16th and 14th system generation:

    diff-drv --bfs $(guix gc --derivers /var/guix/profiles/system-16-link) $(guix gc --derivers /var/guix/profiles/system-14-link)
    Similarity (Jaccard): 0.5509  (55.09%)
    A (system.drv): 4441 (only A: 1242, shared: 3199 = 72.03%)
    B (system.drv): 4565 (only B: 1366, shared: 3199 = 70.08%)
    A ∪ B: 5807 |  A ∩ B: 3199 |  Diff: 2608

and what impact a reconfiguration has had. I won't paste the full tree output here, but this has been a major upgrade for a lot of packages.

Analyzing Dependency Hells

This also helps figure out why the hash of a package changed even though its semver didn't change and the package definition stayed the same. I use senpai as my IRC client as it works well with soju (which is the bouncer I use). During one upgrade the hash of senpai changed while the semver stayed at 0.4.1. So diff-drv can also be used as fuel for my confirmation bias to stay away from Go and Rust and stay in C/PHP/Perl (which I write at my job) and Guile/Elisp happyland:

    theesm@minty ~/diff-drv [env]$ diff-drv --bfs /gnu/store/6ghn9zl05p9jrjps2idzd8lka44s8gwl-senpai-0.4.1.drv /gnu/store/10vwgxsy3hb434kdazgyskm4yh0kh2cf-senpai-0.4.1.drv
    Similarity (Jaccard): 0.7470  (74.70%)
    A (senpai-0.4.1.drv): 570 (only A: 80, shared: 490 = 85.96%)
    B (senpai-0.4.1.drv): 576 (only B: 86, shared: 490 = 85.07%)
    A ∪ B: 656 |  A ∩ B: 490 |  Diff: 166

even though I again won't include the 166-path diff in this post for readability's sake. System derivations can be dissected further, and the output of diff-drv could probably be optimized for this scenario to give a summary of whether packages, services, users, file-systems, the kernel etc. changed between derivations, but that would be a whole different tool in my book (even though the idea seems worth exploring).

Wrapping Up

As stated before, diff-drv can be found on Codeberg as theesm/diff-drv. My approach of using BFS and Jaccard similarity is somewhat useful in answering the questions I have regarding the comparison of derivations, even though the output could probably be a bit more ergonomic. Either way, it was a fun hack session and fun to implement!