Diffstat (limited to 'wiki/src/contribute/APT_repository/tagged_snapshots.mdwn')
1 files changed, 154 insertions, 0 deletions
diff --git a/wiki/src/contribute/APT_repository/tagged_snapshots.mdwn b/wiki/src/contribute/APT_repository/tagged_snapshots.mdwn
new file mode 100644
index 0000000..7d80dca
--- /dev/null
+++ b/wiki/src/contribute/APT_repository/tagged_snapshots.mdwn
@@ -0,0 +1,154 @@
+[[!meta title="Tagged snapshots of upstream APT repositories"]]
+[[!toc levels=2]]
+# Overview
+Our tagged snapshots of upstream APT repositories are published on
+These are _partial_, tagged snapshots of upstream APT repositories we
+need, so that one can rebuild a released ISO in the future, and we
+keep the corresponding source code around.
+The main goals here are to have reproducible builds some day, and to
+comply with various licenses such as the GPL.
+These snapshots are partial: in a given snapshot, we import only the
+packages needed by a given build of Tails.
+The corresponding data shall be backed up, and expired very
+cautiously, if ever.
+# Source code
+* `tails::reprepro::snapshots::tagged` class in
+ [[!tails_gitweb_repo puppet-tails]]
+* bits scattered in the main Tails Git repository (details below)
+# Design notes
+## Listing needed packages
+To generate partial APT repositories, we need to know what to include
+in them. Therefore, we create a _build manifest_ at the end of an ISO
+build. It is generated by
+[[!tails_gitweb auto/scripts/generate-build-manifest]], thanks to
+[[!tails_gitweb data/wrappers/apt-get]] and
+[[!tails_gitweb data/debootstrap/scripts/jessie.patch]].
+The build manifest records:
+- for each APT repository we use time-based snapshots for: name, serial
+- for each binary package: name, version, architecture
+- for each source package: name, version
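+As a rough illustration of the fields listed above, the manifest can be
+modelled as three lists of records. This is only a sketch: the class and
+field names, the serial value and the sample package are assumptions made
+for the example, not the format actually written by
+`generate-build-manifest`.

```python
from dataclasses import dataclass, field, asdict


@dataclass(frozen=True)
class Snapshot:
    name: str    # name of the APT repository
    serial: str  # serial of the time-based snapshot that was used


@dataclass(frozen=True)
class BinaryPackage:
    name: str
    version: str
    architecture: str


@dataclass(frozen=True)
class SourcePackage:
    name: str
    version: str


@dataclass
class BuildManifest:
    snapshots: list = field(default_factory=list)
    binaries: list = field(default_factory=list)
    sources: list = field(default_factory=list)


# Hypothetical values, chosen only to show the shape of the data.
manifest = BuildManifest(
    snapshots=[Snapshot("debian", "2016030101")],
    binaries=[BinaryPackage("a2ps", "1:4.14-1.1+deb7u1", "amd64")],
    sources=[SourcePackage("a2ps", "1:4.14-1.1+deb7u1")],
)

# Plain dictionaries are convenient for serialising or comparing manifests.
manifest_dict = asdict(manifest)
```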
+In passing, here are some nice side-effects of having this build
+manifest:
+- It allows inspecting the diff between the subsets of two different
+ snapshots that were used at build time; the benefit is quite small as
+ long as we're based on Debian stable (although we also fetch packages
+ from testing, sid, backports, etc.), but if/when we switch to
+ being based on Debian testing, then we will definitely want that.
+- If a branch (a topic branch, devel, etc.) introduces a regression
+ and changes the set of packages used at build time, we may
+ want to check exactly how that set changed. Think "check the
+ diff between `.packages`" as we do at release time, but done in
+ a more correct way.
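+The kind of comparison described in these two points can be sketched as a
+set difference between the binary packages recorded in two manifests. The
+dictionary layout, the function names and the `foo` package below are
+assumptions made for the example; no such tool is described in this
+document.

```python
def package_set(manifest):
    """Return the binary packages of a manifest as a set of tuples."""
    return {
        (p["name"], p["version"], p["architecture"])
        for p in manifest["binaries"]
    }


def diff_manifests(old, new):
    """List the (name, version, architecture) tuples removed and added."""
    old_set, new_set = package_set(old), package_set(new)
    return {
        "removed": sorted(old_set - new_set),
        "added": sorted(new_set - old_set),
    }


# Two hypothetical manifests that differ only in the version of a2ps.
old = {"binaries": [
    {"name": "a2ps", "version": "1:4.14-1.1+deb7u1", "architecture": "amd64"},
    {"name": "foo", "version": "1.0-1", "architecture": "all"},
]}
new = {"binaries": [
    {"name": "a2ps", "version": "1:4.14-1.3", "architecture": "amd64"},
    {"name": "foo", "version": "1.0-1", "architecture": "all"},
]}

changes = diff_manifests(old, new)
```

+Here `changes["removed"]` holds the old `a2ps` version and
+`changes["added"]` the new one, while the unchanged `foo` package appears
+in neither list.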
+## Importing packages into partial snapshots
+### How it's done in practice
+* [[!tails_gitweb auto/scripts/tag-apt-snapshots]]
+* [tails-prepare-tagged-apt-snapshot-import](
+* [tails-publish-tagged-apt-snapshot](
+### A corner case: APT pinning magic
+If a (package, version) is seen at build time in 2 or more APT
+sources, `tails-prepare-tagged-apt-snapshot-import` injects it
+into each of the tagged snapshots corresponding to these sources.
+The goal is to avoid this scenario, that could happen if we injected
+each package _only_ into the distribution it was downloaded from:
+ - version X of package P is available both in suite S1 on origin O1,
+ and in suite S2 on origin O2
+ - version Y of package P is available in suite S3 of origin O3
+ - our pinning makes us prefer version X of package P *because it's
+ available in O1/S1*; otherwise, if it wasn't in there, then our
+ pinning would make APT prefer version Y to version X
+ - at ISO build time, APT fetches package P version X from O2/S2
+ - given this build manifest, we import package P version X into our
+ tagged snapshot of O2/S2, but not into our tagged snapshot of O1/S1
+ - if we rebuild from the same source tree using that set of tagged
+ snapshots, then version Y of package P will be installed
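+The injection rule that avoids this scenario amounts to grouping every
+(package, version) pair by each source it was seen in. The sketch below
+illustrates that rule only; the function name and the input layout are
+hypothetical and do not describe how
+`tails-prepare-tagged-apt-snapshot-import` is implemented.

```python
from collections import defaultdict


def import_plan(sightings):
    """Map each tagged snapshot (origin, suite) to the packages to import.

    `sightings` holds (package, version, origin, suite) tuples: one entry
    per APT source in which that exact version was seen at build time.
    """
    plan = defaultdict(set)
    for package, version, origin, suite in sightings:
        plan[(origin, suite)].add((package, version))
    return dict(plan)


# Version X of package P was seen both in O1/S1 and in O2/S2.
sightings = [
    ("P", "X", "O1", "S1"),
    ("P", "X", "O2", "S2"),
]

plan = import_plan(sightings)
```

+With this input, version X of package P ends up in the tagged snapshots of
+both O1/S1 and O2/S2, regardless of which one APT actually downloaded it
+from.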
+This scenario can happen in practice:
+ # cat /etc/apt/sources.list
+ deb wheezy/updates main
+ deb wheezy main
+ deb jessie main
+ # cat /etc/apt/preferences
+ Package: *
+ Pin: origin
+ Pin-Priority: -10
+ Package: *
+ Pin: release o=Debian,n=wheezy
+ Pin-Priority: 990
+ Package: *
+ Pin: release o=Debian,n=jessie
+ Pin-Priority: 700
+ # apt-cache madison a2ps
+ a2ps | 1:4.14-1.3 | jessie/main amd64 Packages
+ a2ps | 1:4.14-1.1+deb7u1 | wheezy/updates/main amd64 Packages
+ a2ps | 1:4.14-1.1+deb7u1 | wheezy/main amd64 Packages
+ # apt-cache policy a2ps
+ a2ps:
+ Installed: (none)
+ Candidate: 1:4.14-1.1+deb7u1
+ Version table:
+ 1:4.14-1.3 0
+ 700 jessie/main amd64 Packages
+ 1:4.14-1.1+deb7u1 0
+ -10 wheezy/updates/main amd64 Packages
+ 990 wheezy/main amd64 Packages
+And then, APT will download `a2ps` from security.d.o:
+ # apt-get download a2ps --print-uris
+ '' a2ps_4.14-1.1+deb7u1_amd64.deb 956298 sha256:e47d7fe9adb7aa62421108debf425830f4e2385e98151c5cb359d3eb8688eea8
+... but if `a2ps` were not available in the regular Wheezy archive,
+e.g. because we were using a tagged snapshot that imported `a2ps` only
+into the security archive, then APT would prefer `a2ps` from Jessie, which
+demonstrates the problem.
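+The effect can be reproduced with a deliberately simplified model of
+pinning, in which a version gets the highest priority of any source
+offering it and negative priorities are never selected. This is a sketch
+using the priorities from the `apt-cache policy` output above; it is not
+APT's actual algorithm, and in particular it does not compare Debian
+version numbers when priorities tie.

```python
def candidate(sources):
    """Pick a candidate version under a simplified pinning model.

    `sources` maps a source label to (pin priority, versions offered).
    """
    best = {}
    for priority, versions in sources.values():
        for version in versions:
            # A version inherits the highest priority among its sources.
            best[version] = max(priority, best.get(version, priority))
    # Versions that only have a negative priority are never installed.
    eligible = {v: p for v, p in best.items() if p >= 0}
    if not eligible:
        return None
    return max(eligible, key=eligible.get)


WHEEZY_VERSION = "1:4.14-1.1+deb7u1"
JESSIE_VERSION = "1:4.14-1.3"

# Sources as seen at build time.
full = {
    "wheezy/updates": (-10, {WHEEZY_VERSION}),
    "wheezy": (990, {WHEEZY_VERSION}),
    "jessie": (700, {JESSIE_VERSION}),
}

# Same sources, but a2ps was imported only into the security snapshot.
naive = dict(full, wheezy=(990, set()))

build_time_choice = candidate(full)
rebuild_choice = candidate(naive)
```

+In this model `build_time_choice` is the Wheezy version and
+`rebuild_choice` is the Jessie one, which is the mismatch described above.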
+## Valid-Until
+A tagged APT repository snapshot that was used to build a given Tails
+release is immutable by design, so it does not need the protections
+provided by `Valid-Until`. Besides, not using `Valid-Until` for those
+makes it much easier to reproduce a given ISO build in the future.
+So, the `Release` files for tagged snapshots have no
+`Valid-Until` field.
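+A quick way to check this property on a given `Release` file is to parse
+its top-level fields and look for `Valid-Until`. The sketch below assumes
+a plain, unsigned `Release` file; the sample content is made up for the
+example and the function names are hypothetical.

```python
def release_fields(release_text):
    """Parse the top-level `Field: value` lines of a Release file."""
    fields = {}
    for line in release_text.splitlines():
        # Checksum entries are continuation lines that start with a space.
        if not line or line[0].isspace() or ":" not in line:
            continue
        name, _, value = line.partition(":")
        fields[name] = value.strip()
    return fields


def has_valid_until(release_text):
    """Tell whether a Release file carries a Valid-Until field."""
    return "Valid-Until" in release_fields(release_text)


# Made-up Release content, shortened to a few fields.
tagged = """Origin: Debian
Suite: stable
Codename: wheezy
Date: Sat, 06 Jun 2015 10:00:00 UTC
SHA256:
 0123abcd 1234 main/binary-amd64/Packages
"""

# The same file with an expiry date added, as a time-limited archive has.
expiring = tagged.replace(
    "SHA256:", "Valid-Until: Sat, 13 Jun 2015 10:00:00 UTC\nSHA256:"
)
```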
+## Garbage collection
+We want to keep "forever" the tagged snapshots used by Tails releases.
+In practice, "forever" == min(3 years for GPL, how long we want to be
+able to reproduce the build of a released ISO) = 3 years.
+Depending on the growth rate of our tagged snapshots in practice, we
+may or may not need to implement expiration of these snapshots any
+time soon. Time will tell.
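+If expiration is ever implemented, the retention rule above could be
+checked along these lines. This is purely a sketch: the function name is
+hypothetical, and using the release date as the reference point is an
+assumption.

```python
from datetime import date

RETENTION_YEARS = 3  # "forever" in practice, as defined above


def may_expire(release_date, today):
    """True once the release is at least RETENTION_YEARS old."""
    target_year = release_date.year + RETENTION_YEARS
    try:
        cutoff = release_date.replace(year=target_year)
    except ValueError:
        # 29 February has no counterpart in a non-leap target year;
        # round up to 1 March so that nothing expires early.
        cutoff = date(target_year, 3, 1)
    return today >= cutoff


# Hypothetical release date, checked one day before and on the cutoff.
too_early = may_expire(date(2016, 1, 26), date(2019, 1, 25))
old_enough = may_expire(date(2016, 1, 26), date(2019, 1, 26))
```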