Reproducible Builds In Koji

Mike McLean

Sr. Software Engineer

Red Hat

About Mike

  • Koji author and maintainer
  • Release Configuration Management team
    • aka Release Engineering
  • Long ago: installer QA
  • Past life: mathematics

  • FAS: mikem

  • mikem@redhat.com

About Koji

  • Fedora's build system
  • but also used by
    • Red Hat
    • CentOS
    • Scientific Linux
    • Amazon
    • and more

Reproducibility is the ability of an entire experiment or study to be duplicated, either by the same researcher or by someone else working independently.

Ingredients

  • Source code
  • Build parameters
  • Build environment

Building

  • Fresh buildroot
  • Generated using mock
  • ...from a repo generated by koji
  • Repo content comes from the build tag
    • at a specific point in time
  • Run build in buildroot

What Koji Tracks

  • Source code
    • srpm saved with build
    • also, git ref stored in task info
  • Build parameters
    • captured in one of the following
      • srpm
      • task parameters
      • build tag
  • Build environment
    • repo id used to generate buildroot
      • repo references build tag, event id
    • contents also recorded in the database

Untracked, so far

  • software outside the buildroot
    • mock
    • yum
    • koji itself
  • running kernel version

Debian

These guys are doing great work!

Much of it is landing upstream

Back to Koji

  • Koji has tracked build environments since day 1
  • Different goals
    • reproducing failures
    • build consistency
    • determining impact of toolchain bugs

Koji buildroots

  • Each build task is given:
    • source code
    • target (build tag, destination tag)
  • Determine current repo for build tag
    • tag contents are versioned
    • each repo references a particular event id
  • Generate mock buildroot from current repo
  • Run mock build
  • Data trail
    • task parameters
    • buildroot reference for each rpm
      • buildroot references repo id
    • buildroot contents recorded

Getting the Data

  • Task parameters:
    • koji taskinfo command (or web ui)
  • Buildroot contents
    • koji list-buildroot <id>
    • get buildroot id from rpminfo
    • (or web ui)
  • Yum repo
    • for recent builds, koji may still have the repo
    • for older builds, koji still has the data
    • Koji can remake the repo

Local Mock

$ koji mock-config --task 10676157 -o mikem.cfg
(download source rpm)
(move mock config in place)
$ mock -r mikem --rebuild ./ast-8.0.5-1.fc23.src.rpm

Build succeeds? yes

Byte for byte? no (varies by package)

Via Koji

Approach:

  • Determine parameters of original build
    • source url
    • build tag
    • koji event id
  • Get koji to replicate the repo
  • Rebuild using the replicated repo

 

Requires special access now, but hopefully that can change

$ koji call repoInfo 510427
{'create_event': 11952446,
 'create_ts': 1439297485.77903,
 'creation_time': '2015-08-11 12:51:25.779027',
 'id': 510427,
 'state': 2,
 'tag_id': 315,
 'tag_name': 'f24-build'}
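The two values needed to replicate the repo are tag_name and create_event. A minimal sketch of pulling them out of the output above, assuming the output is a plain Python literal as shown; the `new_repo_args` helper is hypothetical and only assembles the command line, it does not run anything:

```python
import ast

# Output of "koji call repoInfo 510427", copied from the slide above.
REPO_INFO_TEXT = """{'create_event': 11952446,
 'create_ts': 1439297485.77903,
 'creation_time': '2015-08-11 12:51:25.779027',
 'id': 510427,
 'state': 2,
 'tag_name': 'f24-build',
 'tag_id': 315}"""


def new_repo_args(repo_info_text):
    """Build the argv for a newRepo call pinned to the original event."""
    info = ast.literal_eval(repo_info_text)  # safe parse of a literal dict
    return ["koji", "call", "newRepo",
            info["tag_name"], str(info["create_event"])]


print(" ".join(new_repo_args(REPO_INFO_TEXT)))
# koji call newRepo f24-build 11952446
```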

Let's Do This

$ koji call newRepo f24-build 11952446
10678016
$ koji watch-task 10678016
Watching tasks (this may be safely interrupted)...
-snip-
10678016 newRepo (f24-build) completed successfully
$ koji call getTaskResult 10678016
[510635, 11952446]
$ koji build --nowait --arch-override=x86_64 --scratch --repo-id=510635 none 'git://pkgs.fedoraproject.org/utf8proc?#f6336ba871e3e049c09c931c4139748d8eee080a'
Created task: 10678134
Task info: https://koji.fedoraproject.org/koji/taskinfo?taskID=10678134
$ koji watch-task 10678134
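The hand-off between the newRepo result and the scratch build can be sketched as follows. This assumes the task result is a two-element list whose first item is the new repo id (the second matches the event id passed in above); the `scratch_build_args` helper is hypothetical and only assembles the arguments used in the transcript:

```python
import ast
import shlex


def scratch_build_args(task_result_text, source_url, arch="x86_64"):
    """Build the argv for a scratch build against a specific repo id."""
    # Assumption: the result looks like "[<repo id>, <event id>]".
    repo_id, _event_id = ast.literal_eval(task_result_text)
    return [
        "koji", "build", "--nowait",
        f"--arch-override={arch}",
        "--scratch",
        f"--repo-id={repo_id}",
        "none",          # target, as given in the transcript
        source_url,
    ]


args = scratch_build_args(
    "[510635, 11952446]",
    "git://pkgs.fedoraproject.org/utf8proc?#f6336ba871e3e049c09c931c4139748d8eee080a",
)
print(shlex.join(args))  # shell-quoted, ready to paste
```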

 

Comparison

$ (cd a; koji download-build --arch x86_64 utf8proc-1.3-1.fc24 )
utf8proc-1.3-1.fc24.x86_64.rpm
utf8proc-devel-1.3-1.fc24.x86_64.rpm

$ (cd b; koji download-task 10678134 )
Downloading [1/4]: utf8proc-1.3-1.fc24.x86_64.rpm
Downloading [2/4]: utf8proc-debuginfo-1.3-1.fc24.x86_64.rpm
Downloading [3/4]: utf8proc-devel-1.3-1.fc24.x86_64.rpm
Downloading [4/4]: utf8proc-1.3-1.fc24.src.rpm

$ rpmdiff -iT a/utf8proc-1.3-1.fc24.x86_64.rpm b/utf8proc-1.3-1.fc24.x86_64.rpm
$
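An empty rpmdiff result means it found no differences under the options given, which is not quite the same claim as byte-for-byte identical files. A stricter check is to hash the files directly. A minimal sketch, assuming the two download directories a and b from above; the `sha256_of` and `compare_rpm_dirs` helpers are hypothetical:

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large rpms need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_rpm_dirs(dir_a, dir_b):
    """Map each rpm name present in both directories to True if identical."""
    dir_a, dir_b = Path(dir_a), Path(dir_b)
    names_a = {p.name for p in dir_a.glob("*.rpm")}
    names_b = {p.name for p in dir_b.glob("*.rpm")}
    return {
        name: sha256_of(dir_a / name) == sha256_of(dir_b / name)
        for name in sorted(names_a & names_b)
    }


# Usage, from the directory that contains a/ and b/:
#   for name, same in compare_rpm_dirs("a", "b").items():
#       print(name, "identical" if same else "DIFFERS")
```

Files that exist on only one side (the debuginfo and source rpms in the listing above) are skipped, since there is nothing to compare them against.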

 

Open Questions

  • With careful replication, what percentage of Fedora will rebuild byte-for-byte?
  • Do we have failure cases beyond those found by the Debian reproducibility project?
  • Is there interest in a Fedora reproducibility effort?

Q & A

[thank you]