A cool thing about Guix (and probably functional package managers in general) is that derivations form a directed acyclic graph (DAG), which means that every package with its dependencies, and every system configuration, can be represented as one. Another, even cooler, thing is that Guix provides a graphing utility called `guix graph`, which helps visualise these DAGs with Graphviz (if you ever wanted to frame a picture of your favorite package graph or play a game of "is this the dependency graph of a Rust package or the visualization of a Mandelbrot set?", this should be the tool of your choice).
This post is me thinking out loud about my diff-drv utility, which
helps compare these DAGs, and my current approach to comparing the
depth of changes. I plan to write a more regular hacklog, so feedback
on this kind of post is very welcome!
The utility itself can be found on Codeberg as theesm/diff-drv.
SRFI-1, Guix Graph ...
As someone who a. exclusively uses Guix System on their personal/work computers and servers and b. from time to time contributes to the packaging side of Guix in their spare time, I sometimes want to know:
- why the hash of a package changes if the semver of the package itself didn't change and what down the dependency graph caused this change.
- how similar the dependency graphs of two derivations of the same package are.
- if the dependency graph grew over time for a specific package.
- how to measure how big of a change my system rebuild was and how much two system derivations still share.
and up until two years ago I did so by:
- rendering the graphs and comparing them visually (having two Graphviz graphs open in a split and figuring out what the heck changed between those two)
- firing up a REPL and gluing together `(srfi srfi-1)` for set operations and `(ice-9 pretty-print)` for pretty-printing to get a somewhat superficial, human-readable output of how .drv files differ.
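For illustration, that REPL hackery boiled down to something like the following sketch. The two input lists and the `diff-sets` helper are made up for this example; in a real session the lists would come from reading the .drv files.

```scheme
(use-modules (srfi srfi-1)
             (ice-9 pretty-print))

;; Hypothetical stand-ins for the input lists of two derivations.
(define inputs-a '("aaa-etc.drv" "bbb-boot.drv" "ccc-guile-3.0.9.drv"))
(define inputs-b '("ddd-etc.drv" "eee-boot.drv" "ccc-guile-3.0.9.drv"))

;; Split two lists of strings into removed, added and unchanged elements
;; using the SRFI-1 list-as-set procedures.
(define (diff-sets a b)
  `((removed   . ,(lset-difference string=? a b))
    (added     . ,(lset-difference string=? b a))
    (unchanged . ,(lset-intersection string=? a b))))

(pretty-print (diff-sets inputs-a inputs-b))
```

This only compares direct inputs, which is exactly the "superficial" part: it says nothing about what changed further down the graph.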
... And The My REPL Hackery Should Actually Be A Program Pipeline
I shared my second approach on StackOverflow in a question on how to
find build information of a guix store item two years ago. If I find
myself doing something repetitive often enough on a REPL I usually
come up with a program, which is why I wrote my first draft of
diff-drv around the same time. The output looks like this:
diff-drv /gnu/store/7z3ij54rnhff65wa6byg0rrycrh5442g-system.drv /gnu/store/5s0y5c3fxkxrdak5n5knv0pp2wx6g8q2-system.drv
outputs ( -1 +1 U0)
- ("out" . "ysnz14g7drrmddwp0vdqyy0dvvrh9wv5-system")
+ ("out" . "dsz5sb6ccvb2yfpjmz6a6hs9nwraj5pb-system")
inputs ( -4 +4 U7)
- "0aqfbi51i9l1pa1h2f83kb45add9iy86-provenance.drv"
- "hnaqz9r5b2sp1556fck3c3gka541a5w2-etc.drv"
- "mq5ypn7f5cmb70nmaww65w2hzzqm8gvj-boot.drv"
- "ys9axwc1w8v6jpfqgr0p6zxp8blnqzgp-activate.scm.drv"
+ "2il6si89psh41265j6y9lapwqb9v8w55-activate.scm.drv"
+ "fn8pg1jg664zbak6lz2id3dw0fr57153-boot.drv"
+ "j3wj6yqqz35xbxkbp63vh7cjcdahpl1m-provenance.drv"
+ "v3m0vrrgbix8nb2q7i085l1jakbndvjz-etc.drv"
u "09qcqgni16z3hpv98hlvg5506iyzyy87-profile.drv"
u "2l9815zdbf5nm4r5ia1njcl6yvrwyrda-module-import-compiled.drv"
u "b5flkz0fxnprg9qhb38mp91b87n6c2ar-raw-initrd.drv"
u "hcll0jzfrikv9km83799xqkcl25vx5vc-parameters.drv"
u "lmw8m5hqr5x476scw4gmc5qdxqlfx56p-locale-multiple-versions.drv"
u "lvwn6apbaanyiqq1n3ivcypf6b9pvnay-profile.drv"
u "rhmvzpzk4qc33h0nhd15swfrh7155qsv-guile-3.0.9.drv"
sources ( -2 +2 U2)
- "nr7h7kgdzgdpbqpnrpc9rkzx65giajin-configuration.scm"
- "v8rw1dvznmc89qvr8lnb633cnrdzwah1-system-builder"
+ "av2ag8gj3k6m09lx17vf53p053h5znwn-configuration.scm"
+ "yl1i65cvxq2ng6570midxar0ziryc52w-system-builder"
u "c6cf35bavqqs5mqsffl45izqaf0qn4dg-module-import"
u "qq9fwykdm519k5rqx9j8c8h97x2l39p7-channels.scm"
system ( -0 +0 U1)
u "aarch64-linux"
builder ( -0 +0 U1)
u #f
args ( -1 +1 U5)
- "v8rw1dvznmc89qvr8lnb633cnrdzwah1-system-builder"
+ "yl1i65cvxq2ng6570midxar0ziryc52w-system-builder"
u "--no-auto-compile"
u "-L"
u "c6cf35bavqqs5mqsffl45izqaf0qn4dg-module-import"
u "-C"
u "hxbq4i5516rrfhmfjhwfhvlray9gbjmx-module-import-compiled"
env-vars ( -1 +1 U2)
- ("out" . "ysnz14g7drrmddwp0vdqyy0dvvrh9wv5-system")
+ ("out" . "dsz5sb6ccvb2yfpjmz6a6hs9nwraj5pb-system")
u ("LC_CTYPE" . "C.UTF-8")
u ("preferLocalBuild" . "1")
file-name ( -1 +1 U0)
- "7z3ij54rnhff65wa6byg0rrycrh5442g-system.drv"
+ "5s0y5c3fxkxrdak5n5knv0pp2wx6g8q2-system.drv"
This already tells us superficially what has changed between two derivations. We're able to answer whether a direct dependency has changed or has been added/removed, whether builder args/env-vars have changed, and whether an output has changed (which is more useful for packages with multiple outputs).
Down This Weekend's Rabbit Hole
However, this doesn't give us any clue how deep the changes go, how
much of the DAG has changed, and where down the graph the cause for
the change appeared. While I solved my REPL shenanigans with
diff-drv, I still found myself looking at visualized graphs to
figure out what exactly has changed.
A mail I wrote to guix-devel about possibly introducing a guix store command (that would cover store plumbing commands such as
diffing and pretty-printing items) and this brief IRC conversation:
> ieure: wonders if there's a .drv file pretty-printer lurking in Guix somewhere
> Rutherther: https://codeberg.org/guix/guix/pulls/6199 :)
> theesm: Rutherther: almost forgot about the mail regarding the drv/store interactions commands i wanted to send to guix devel as there's been several codeberg issues discussing the same thing (mentioned in 6199 that i wanted to write one, maybe now's a good time to do so)
> Rutherther: theesm: yeah, this is a recurring topic for sure
> theesm: think i'll write the mail now^^ (especially as i'm looking for a reason to finally retire my hacky awful aterm .drv files diff thing (<https://codeberg.org/theesm/diff-drv>) if stuff regarding store commands land in guix proper one day)
> civodul: theesm: diff-drv, i’ve always wanted that!
> civodul: ah but i think it’s not quite what i want: i’d want a recursive diff, that finds the “deepest” differing input
led me to spend parts of this weekend improving diff-drv to be
actually (and not just superficially) useful. And of course, there's
an XKCD to explain my weekend shenanigans!
The new, additional output regarding depth and similarity that diff-drv
provides looks like this:
theesm@minty ~/diff-drv [env]$ diff-drv --bfs /gnu/store/89d1x358yq6flynsq29kdgfrzlyaz7p5-linux-libre-arm64-mnt-reform-6.18.7.drv /gnu/store/wfmv6z3bblly1akwa4hml4jp43csdi2h-linux-libre-arm64-mnt-reform-6.18.10.drv
Similarity (Jaccard): 0.9240 (92.40%)
A (linux-libre-arm64-mnt-reform-6.18.7.drv): 420 (only A: 19, shared: 401 = 95.48%)
B (linux-libre-arm64-mnt-reform-6.18.10.drv): 415 (only B: 14, shared: 401 = 96.63%)
A ∪ B: 434 | A ∩ B: 401 | Diff: 33
paths only present in /gnu/store/89d1x358yq6flynsq29kdgfrzlyaz7p5-linux-libre-arm64-mnt-reform-6.18.7.drv:
* 89d1x358yq6flynsq29kdgfrzlyaz7p5-linux-libre-arm64-mnt-reform-6.18.7.drv
* 7laq0xpvkwixyj8yghv2iap9l7qyia68-dwarves-1.29.drv
* az4im9jak383yfmivk0nd0h5r0sm26g2-cmake-minimal-3.31.10.drv
* hy51f0d74f8jxqwpgb5k729z5j1w67bx-jsoncpp-1.9.6.drv
* wx3w3szw60f7wbdx0kymmn3bakyg7gly-meson-1.9.0.drv
* 3c1rpfvx101cikrxn8pm58nw8ciaj2rk-python-3.11.14.drv
* ad8sr2mrd3kah317sb6cs8f0ij5dbq4j-python-setuptools-80.9.0.drv
* bb1xi2z3rdg8nfx82i25lias7n9n045n-guile-json-4.7.3.drv
* g934s5hvgh39371kcld10q1hmvs6chzz-guile-json-4.7.3.tar.gz.drv
* h32mzli797v881bccxyqhkiqk59y3w9x-python-wrapper-3.11.14.drv
* jjadn60fv200w7i224d7inad2z3nmh30-module-import-compiled.drv
* rmmlcn20a28ba6kgvax5qa6skca84vkd-cmake-bootstrap-3.31.10.drv
* a0jpykjbgn45w8i98r31z69shmx8gfkc-reform-debian-packages-2023-07-10-515-gc527a1d.drv
* rirbz073z6g5l2w5n1v43l0wp341k9ya-reform-debian-packages-2023-07-10-515-gc527a1d-checkout.drv
* xia4jsdqvq3776106v7372nks2y2yhm4-linux-libre-6.18.7-guix.tar.zst.drv
* ap3916dn6ql87il8iqbcmhbfgmysx5v5-linux-libre-6.18.7-guix.tar.xz.drv
* fgk7agjgd291nhrb03zgwz63qp9c94b1-linux-libre-deblob-6.18.7-gnu.drv
* mnsa49q6dln255lhcg6vqnz6pcd55si6-linux-6.18.7.tar.xz.drv
* vvha80z5g5ij8r0cznlhcz6v6xkvqh80-linux-libre-deblob-check-6.18.7-gnu.drv
paths only present in /gnu/store/wfmv6z3bblly1akwa4hml4jp43csdi2h-linux-libre-arm64-mnt-reform-6.18.10.drv:
* wfmv6z3bblly1akwa4hml4jp43csdi2h-linux-libre-arm64-mnt-reform-6.18.10.drv
* 8jcjzpik7q2lay8hl4r342bpmh7s82sa-reform-debian-packages-2023-07-10-525-g7e8a95c.drv
* pm06jz5fbdyb5dwwbwmzlj69i4lhkjh1-reform-debian-packages-2023-07-10-525-g7e8a95c-checkout.drv
* rdc9x694chl1gjyzhls0hwrsama2b50j-dwarves-1.29.drv
* s4rqk30j1wpzjn79hfqyglkqfmsrc13i-cmake-minimal-3.31.10.drv
* 9avyw2v2q8m8mr4d6v3sxnx32524hipa-jsoncpp-1.9.6.drv
* 41pxa9sv01v5gp0sdg1l1zjqmpydqibw-meson-1.9.0.drv
* 1h7k8vhd33h0kq1kwfc3rgyyjwjspycr-python-setuptools-bootstrap-80.9.0.drv
* wbvkvaxf46jxjr001hiqnp983s0mkixa-cmake-bootstrap-3.31.10.drv
* z6rdck21vbpvci6kqa85ibl510hy0vxp-linux-libre-6.18.10-guix.tar.zst.drv
* hxyfxbc9b9bra3l3akasm7cs2n6s8s8n-linux-libre-6.18.10-guix.tar.xz.drv
* 53b2d046s5jfa5wvryry1j6jf80821vs-linux-libre-deblob-6.18.10-gnu.drv
* i01ykrczzklch882hy0yrwamv0ja1byg-linux-6.18.10.tar.xz.drv
* mp0825xk3vcvsrjp7jbzygzchfwp3r13-linux-libre-deblob-check-6.18.10-gnu.drv
This means that we now have a Jaccard index to describe the similarity of both DAGs, plus stats on how many nodes each derivation has and how they differ:
Similarity (Jaccard): 0.9240 (92.40%)
A (linux-libre-arm64-mnt-reform-6.18.7.drv): 420 (only A: 19, shared: 401 = 95.48%)
B (linux-libre-arm64-mnt-reform-6.18.10.drv): 415 (only B: 14, shared: 401 = 96.63%)
A ∪ B: 434 | A ∩ B: 401 | Diff: 33
followed by listings of the actual paths that differ between both graphs. Let's talk about how it works next, so sit back, grab a cup of coffee, and put on some Lucy Dacus.
Bread(th) & Roses
My first step was to come up with a reasonable way to build
the input graph of a derivation. Luckily, Guix's (guix derivations)
module provides functions that make this task easier. I also
took some inspiration from what guix graph is doing, especially in
derivation-dependencies, but I didn't manage to wrap my head around what
lift1 does and how (guix monads), which it uses, works, so
I decided to roll my own.
I used breadth-first search (BFS) to get the set of all reachable derivations and a path from the root to each derivation.
For this I:
- defined a children helper for the outgoing edges.
- put a good old BFS (as famously seen in probably every introductory university algorithms course) to work, starting at the root derivation.
- recorded, for each discovered node, from where it was first reached.
- tried to be smart about handling .drv files that appear multiple times within the DAG, but in retrospect this falls into premature-optimization land and shouldn't have been necessary at all.
(use-modules (guix derivations)   ; read-derivation-from-file and accessors
             (ice-9 q))           ; make-q, enq!, deq!, q-empty?

;; Return a hash table mapping every derivation reachable from ROOT to
;; the node it was first discovered from (#f for ROOT itself).
(define (build-derivation-graph root)
  (let ((parent-map (make-hash-table))
        (queue (make-q))
        (child-cache (make-hash-table)))
    ;; Outgoing edges: file names of the input derivations of DRV-PATH.
    (define (children drv-path)
      (or (hash-ref child-cache drv-path #f)
          (let* ((drv (read-derivation-from-file drv-path))
                 (deps (map (lambda (input)
                              (derivation-file-name
                               (derivation-input-derivation input)))
                            (derivation-inputs drv))))
            (hash-set! child-cache drv-path deps)
            deps)))
    (hash-set! parent-map root #f)
    (enq! queue root)
    (let bfs ()
      (if (q-empty? queue)
          parent-map
          (let ((node (deq! queue)))
            (for-each
             (lambda (child)
               ;; Only the first discovery of a node is recorded.
               (unless (hash-get-handle parent-map child)
                 (hash-set! parent-map child node)
                 (enq! queue child)))
             (children node))
            (bfs))))))
BFS explores the DAG layer by layer, so the resulting parent map describes a shortest-path spanning tree of all reachable derivations. That is a simplification (a node reachable via several paths is only reported along the first, shortest one), but I consider it a "good enough" approach.
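To make the "path from root to each derivation" part concrete, here is a small sketch of how such a parent map can be walked back up. The `path-from-root` helper and the toy graph are made up for illustration and are not part of diff-drv.

```scheme
;; Walk a parent map (node -> parent, root -> #f) from NODE up to the
;; root and return the path root-first.
(define (path-from-root parent-map node)
  (let loop ((cur node) (path '()))
    (if cur
        (loop (hash-ref parent-map cur #f) (cons cur path))
        path)))

;; Hypothetical toy graph: system.drv -> etc.drv -> udev.drv
(define parents (make-hash-table))
(hash-set! parents "system.drv" #f)
(hash-set! parents "etc.drv" "system.drv")
(hash-set! parents "udev.drv" "etc.drv")

(path-from-root parents "udev.drv")
;; → ("system.drv" "etc.drv" "udev.drv")
```

Note that a node missing from the map simply yields a one-element path, so a real version would probably want to check for that first.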
The only "downside" of this approach becomes apparent when a deeper-lying dependency changes: because derivation input hashes propagate upwards, a single change will most likely cause many nodes along different branches to appear modified, resulting in a pretty verbose diff output that can become hard to read.
To make this more readable, we could probably filter for the deepest-lying changes and only display those, but that's for another time.
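One way such a filter could look, as a rough sketch and not something diff-drv currently does: keep only those changed nodes none of whose direct inputs changed as well. The `deepest-changes` name, the `children` argument and the toy graph are assumptions made for this example.

```scheme
(use-modules (srfi srfi-1))

;; Keep only those changed nodes none of whose direct inputs changed.
;; ONLY is a list of changed nodes, CHILDREN a procedure returning the
;; direct inputs of a node.
(define (deepest-changes only children)
  (let ((changed (make-hash-table)))
    (for-each (lambda (n) (hash-set! changed n #t)) only)
    (filter (lambda (n)
              (not (any (lambda (c) (hash-ref changed c #f))
                        (children n))))
            only)))

;; Hypothetical toy graph: system -> etc -> udev, system -> boot.
(define toy-graph
  '(("system.drv" "etc.drv" "boot.drv")
    ("etc.drv" "udev.drv")
    ("boot.drv")
    ("udev.drv")))

(define (toy-children node)
  (or (assoc-ref toy-graph node) '()))

(deepest-changes '("system.drv" "etc.drv" "udev.drv") toy-children)
;; → ("udev.drv")
```

This heuristic has a blind spot: a node whose own sources or builder arguments changed at the same time as one of its inputs would be hidden behind that input.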
Printing The Diff Dance
So if A, B and C all depend on libfoo somewhere down their dependency chain, we would see the full paths leading to all three of them wherever the hashes of the store items differ. I think it would be better UX to isolate the cause and print only that, but that's for another weekend. Right now we're doing this:
(use-modules (guix store)     ; store-path-base
             (ice-9 format))  ; needed for the ~c directive

(define (print-diff-tree root parents only)
  (format #t "paths only present in ~a:\n" root)
  (let ((mark (make-hash-table))   ; nodes that differ
        (keep (make-hash-table))   ; differing nodes and their ancestors
        (kids (make-hash-table)))  ; pruned tree: parent -> children
    (hash-set! keep root #t)
    ;; Mark every differing node and keep its chain of parents up to ROOT.
    (for-each
     (lambda (n)
       (hash-set! mark n #t)
       (let loop ((cur n))
         (when (and cur (not (hash-ref keep cur #f)))
           (hash-set! keep cur #t)
           (loop (hash-ref parents cur #f)))))
     only)
    ;; Invert the parent map for all kept nodes.
    (hash-for-each
     (lambda (node _)
       (let ((p (hash-ref parents node #f)))
         (when (and p (hash-ref keep p #f))
           (hash-set! kids p (cons node (hash-ref kids p '()))))))
     keep)
    ;; Sort siblings so the output is stable.
    (hash-for-each (lambda (p xs) (hash-set! kids p (sort xs string<?))) kids)
    ;; Print the pruned tree, indented by depth; "*" flags differing nodes.
    (let walk ((node root) (depth 0))
      (format #t "~a~c ~a\n"
              (make-string (* 2 depth) #\space)
              (if (hash-ref mark node #f) #\* #\space)
              (store-path-base node))
      (for-each (lambda (c) (walk c (+ depth 1)))
                (hash-ref kids node '())))
    (newline)))
That gets us our diff tree. It currently depends on the
report-jaccard+diffs function I'll introduce next for its data, as that's
where only comes from:
(call-with-values
    (lambda () (report-jaccard+diffs a b parentsA parentsB))
  (lambda (only-a only-b)
    (print-diff-tree a parentsA only-a)
    (print-diff-tree b parentsB only-b)))
print-diff-tree mostly does really ugly plumbing to get a printable,
pruned tree with formatted nodes. As I'm hopefully rewriting this part
soon to better fit the existing formatting of diff-drv, I won't go
into detail for now.
Jaccard et al
I have the awful tendency to use single/dual-letter variable names
when hacking things on a REPL and to come up with abstractions that
sounded more useful than they turned out to be in the end (hence
walky!). I'll ignore the rough edges for now, and will focus on
report-jaccard+diffs:
(use-modules (guix store)     ; store-path-package-name
             (ice-9 format))  ; needed for the ~,4f and ~,2f directives

;; don't know if i shoot myself in the foot with this abstraction
;; PRs suggesting a better name for this welcome
;; Call ON-KEY for every key of SRC, then ON-SHARED or ON-ONLY depending
;; on whether the key also exists in OTHER.
(define* (walky! src other
                 #:key
                 (on-key (lambda (_k) #t))
                 (on-shared (lambda (_k) #t))
                 (on-only (lambda (_k) #t)))
  (hash-for-each
   (lambda (k _)
     (on-key k)
     (if (hash-get-handle other k)
         (on-shared k)
         (on-only k)))
   src))

;; a and b are the root paths, A and B their parent maps.
(define (report-jaccard+diffs a b A B)
  (let ((sa 0) (sb 0) (i 0) (only-a '()) (only-b '()))
    (walky! A B
            #:on-key (lambda (_k) (set! sa (+ sa 1)))
            #:on-shared (lambda (_k) (set! i (+ i 1)))
            #:on-only (lambda (k) (set! only-a (cons k only-a))))
    (walky! B A
            #:on-key (lambda (_k) (set! sb (+ sb 1)))
            #:on-only (lambda (k) (set! only-b (cons k only-b))))
    (let* ((u (- (+ sa sb) i))          ; union size
           (j (if (= u 0) 1.0 (/ (exact->inexact i) (exact->inexact u))))
           (only-a (sort only-a string<?))
           (only-b (sort only-b string<?))
           (da (- sa i))
           (db (- sb i))
           (pU (if (= u 0) 100.0 (* 100.0 j)))
           (pA (if (= sa 0) 100.0 (* 100.0 (/ (exact->inexact i) (exact->inexact sa)))))
           (pB (if (= sb 0) 100.0 (* 100.0 (/ (exact->inexact i) (exact->inexact sb))))))
      (format #t "Similarity (Jaccard): ~,4f (~,2f%)\n" j pU)
      (format #t "A (~a): ~a (only A: ~a, shared: ~a = ~,2f%)\n"
              (store-path-package-name a) sa da i pA)
      (format #t "B (~a): ~a (only B: ~a, shared: ~a = ~,2f%)\n"
              (store-path-package-name b) sb db i pB)
      (format #t "A ∪ B: ~a | A ∩ B: ~a | Diff: ~a\n\n"
              u i (+ da db))
      (values only-a only-b))))
Basically, we've got:
- sa representing number of nodes in A
- sb doing the same for B
- i which is the intersection size of A and B
- only-a: nodes in A but not B
- only-b: nodes in B but not A
With these we can calculate the union size (sa + sb - i) as well as the Jaccard similarity (intersection over union, IoU) and the percentages, so we're able to include a simple summary in our output:
Similarity (Jaccard): 0.9240 (92.40%)
A (linux-libre-arm64-mnt-reform-6.18.7.drv): 420 (only A: 19, shared: 401 = 95.48%)
B (linux-libre-arm64-mnt-reform-6.18.10.drv): 415 (only B: 14, shared: 401 = 96.63%)
A ∪ B: 434 | A ∩ B: 401 | Diff: 33
This gives us an idea of how similar two derivations are, how much of the dependency graph is shared, whether the dependency graph of A fits in B, you get the gist.
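Plugging the numbers from the summary above into the definitions, with |A| = sa = 420, |B| = sb = 415 and |A ∩ B| = i = 401, works out as follows:

```latex
J(A, B) = \frac{|A \cap B|}{|A \cup B|}
        = \frac{|A \cap B|}{|A| + |B| - |A \cap B|}
        = \frac{401}{420 + 415 - 401}
        = \frac{401}{434} \approx 0.9240

\frac{|A \cap B|}{|A|} = \frac{401}{420} \approx 95.48\%
\qquad
\frac{|A \cap B|}{|B|} = \frac{401}{415} \approx 96.63\%
```

The two per-side percentages answer the "does A fit in B" question: a value close to 100% means almost all of that side's graph is also part of the other one.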
Things You Most Likely Didn't Ask Yourself About Derivation Similarity
Now we're able to put this to good use. I bet you've always been
wondering if you can fit the dependency tree of hello inside the
dependency tree of a kernel variant such as
linux-libre-arm64-mnt-pocket-reform and how similar that tree is:
theesm@minty ~/diff-drv [env]$ diff-drv /gnu/store/fxy567lyimwkxxi23brf8y8g2106anlx-linux-libre-arm64-mnt-pocket-reform-6.17.12.drv /gnu/store/qzh7rvz609rm845xxf3jsasxzgbp208x-hello-2.12.2.drv
Similarity (Jaccard): 0.3957 (39.57%)
A (linux-libre-arm64-mnt-pocket-reform-6.17.12.drv): 420 (only A: 253, shared: 167 = 39.76%)
B (hello-2.12.2.drv): 169 (only B: 2, shared: 167 = 98.82%)
A ∪ B: 422 | A ∩ B: 167 | Diff: 255
Turns out: it fits (well, except for the source package):
paths only present in /gnu/store/qzh7rvz609rm845xxf3jsasxzgbp208x-hello-2.12.2.drv:
* qzh7rvz609rm845xxf3jsasxzgbp208x-hello-2.12.2.drv
* kbr9zcwsln63dlr0217ir7any1dcb2ss-hello-2.12.2.tar.gz.drv
Comparing System Reconfigurations
We're now also able to tell how similar two system configurations are, and whether something has been a small change like this one:
theesm@minty ~/diff-drv [env]$ diff-drv --bfs $(guix gc --derivers /var/guix/profiles/system-12-link) $(guix gc --derivers /var/guix/profiles/system-11-link)
Similarity (Jaccard): 0.9906 (99.06%)
A (system.drv): 4441 (only A: 13, shared: 4428 = 99.71%)
B (system.drv): 4457 (only B: 29, shared: 4428 = 99.35%)
A ∪ B: 4470 | A ∩ B: 4428 | Diff: 42
paths only present in /gnu/store/2fm780820z78knqn1r6hnsgc6q8ppfj6-system.drv:
* 2fm780820z78knqn1r6hnsgc6q8ppfj6-system.drv
* 4nyvgv999pxixffrm5jnk5fznvrvg6h7-activate.scm.drv
* sncv8fq3dcjcz4n4n69k54iqar43li4b-activate-service.scm.drv
* 787rdicqj5amcpdxgba95zjryv5n9j3z-etc.drv
* dn7zij6x594c2np0g7pv8yrrz6cpfpbd-udev.drv
* 5rj188vvgrrsd76i5c7fzz68k1csbjjx-udev-rules.d.drv
* ic9vasgj8bfl6xhcpvbw2nfawya0jf4j-hwdb.bin.drv
* rvqwppv5aqg3wh9v29kdlmkzqm0q7w14-udev-hwdb.d.drv
* xm1fcyckiqhampwcvyzvn77mmnb3bg1j-dbus-configuration.drv
* 17v8dln6lisjz0svdzx3nyja5j4i5nnm-dbus-system-services.drv
* d5nagfkl8zh21mh6dhbyklinapybdwgh-boot.drv
* qk1936p5qz4fknsawlk3iln0v68jdmk8-shepherd.conf.drv
* wn8bnk0ljxjazy82ck0rdyfgrn6kb2gz-provenance.drv
paths only present in /gnu/store/8b9ng0l6jybb9fv76kp4pfw5hpdgndja-system.drv:
* 8b9ng0l6jybb9fv76kp4pfw5hpdgndja-system.drv
* 3x4hnlkpphw0w12dv9xpfb2l7rhkrza5-etc.drv
* 1mdxg10na89hz0njnz96v54ki0df31nw-dbus-configuration.drv
* f4s0y8wha0zhgi8kkvh8pk61przxfy3w-dbus-system-services.drv
* nmdlx1g3j09vcvbynrkvcv248j10dfhb-blueman-2.4.6.drv
* 66fmbrkqf74v2sv8bp1pszq614r7s567-blueman-2.4.6.tar.xz.drv
* crhnzh1fv1z0a6ag8qvvck95p69s6jgp-libappindicator-12.10.1-0-298.drv
* 5mj32f343083kz5v9ahli49fx6mz5v4z-dbus-test-runner-19.04.0.drv
* 38mwiq495v802y7p9chkp413ymyh6j6m-python-dbusmock-0.37.2.drv
* 6bzwp0gddzipd2p4fvxic409bfh8ps0b-python-dbusmock-0.37.2-checkout.drv
* drs86vgpcflg0p5xfhdi4klm1041zcww-upower-1.90.2.drv
* ppd9aq1byjw9mc8921fbq4845skvi4rn-umockdev-0.17.13.drv
* 02iyj43k38k58nrw3g77dg9xfdj36iwf-umockdev-0.17.13.tar.xz.drv
* rzamla8y0dswyx60f85cks4r8yyjwn81-upower-1.90.2-checkout.drv
* k6299ri40k1bzlagvl3z1059x340zz1p-upower-1.90.2-checkout.drv
* gy4df8aqrw0m6v3b7fxg36phagnijbnf-dbus-test-runner-19.04.0.tar.gz.drv
* klsks181ljvbllvzf8kxrn51qv4d1rd8-libappindicator-12.10.1-0-298-checkout.drv
* 66clmb1rmmf4lhmaqbpdbfx6dinpkbzb-etc-bluetooth.drv
* px16kpgqsfqlyv9djqn9yb78njkp02aw-udev.drv
* bika5wj6qsw41q1jsqvn07dswqb38cxx-hwdb.bin.drv
* zz6bhc26mr99pf0j3cldvwxn6swq9p4b-udev-hwdb.d.drv
* lgw3qpm1x942wwz7605dwkg0jxdpkjhs-udev-rules.d.drv
* bdby9zxazgqnhh5fnslfp2g0bs556fb3-activate.scm.drv
* 0377sfp2h3b04wkgc8467f7i7ckyiicl-activate-service.scm.drv
* bqrd2fxkfvrcxccnmsrnlql62zbp9ccd-boot.drv
* cx33w99s4i8y29awsbdgfbkl2fhqi8v2-shepherd.conf.drv
* 0vcgmxz7dw33y3ib29xndnn9khyifxkn-shepherd-bluetooth.go.drv
* wyqbs6rhg616hb0hlpxy78rn3mjfm2dq-shepherd-bluetooth.scm.drv
* rkyd816vsa59gycv0q9bnkqjp5s9wqlw-provenance.drv
or a rather big change like this one between the 16th and 14th system generation:
diff-drv --bfs $(guix gc --derivers /var/guix/profiles/system-16-link) $(guix gc --derivers /var/guix/profiles/system-14-link)
Similarity (Jaccard): 0.5509 (55.09%)
A (system.drv): 4441 (only A: 1242, shared: 3199 = 72.03%)
B (system.drv): 4565 (only B: 1366, shared: 3199 = 70.08%)
A ∪ B: 5807 | A ∩ B: 3199 | Diff: 2608
This shows what impact a reconfiguration has had. I won't paste the full tree output here, but this has been a major upgrade for a lot of packages.
Analyzing Dependency Hells
This also helps figure out why the hash of a package changed even if
the semver of the package didn't change and the package definition
stayed the same. I use senpai as my IRC client as it works well with
soju (which is the bouncer I use). During one upgrade the hash of
senpai changed while the semver stayed at 0.4.1. So diff-drv can
also be used as fuel for my confirmation bias to stay away from Go and
Rust and stay in C/PHP/Perl (which I write at my job) and Guile/Elisp
happyland:
theesm@minty ~/diff-drv [env]$ diff-drv --bfs /gnu/store/6ghn9zl05p9jrjps2idzd8lka44s8gwl-senpai-0.4.1.drv /gnu/store/10vwgxsy3hb434kdazgyskm4yh0kh2cf-senpai-0.4.1.drv
Similarity (Jaccard): 0.7470 (74.70%)
A (senpai-0.4.1.drv): 570 (only A: 80, shared: 490 = 85.96%)
B (senpai-0.4.1.drv): 576 (only B: 86, shared: 490 = 85.07%)
A ∪ B: 656 | A ∩ B: 490 | Diff: 166
Again, I won't include the 166-path diff in this post for
readability's sake. System derivations can be dissected further, and the
output of diff-drv could probably be optimized for this scenario to
give a summary of whether packages, services, users, file-systems, the
kernel etc. changed between derivations, but that would be a whole
different tool in my book (even though the idea seems worth
exploring).
Wrapping Up
As stated before, diff-drv can be found on Codeberg as
theesm/diff-drv. My approach of
using BFS and Jaccard similarity is somewhat useful in answering
the questions I have regarding the comparison of derivations, even though
the output could probably be a bit more ergonomic. Either way, it
was a fun hack session and fun to implement!