author intrigeri <intrigeri@boum.org> 2016-05-18 18:33:08 +0000
committer intrigeri <intrigeri@boum.org> 2016-05-18 18:33:08 +0000
commit d0b190717f1b524b15e52801e26c76c5d1824a24
tree 17dce4a72c184d85bedb36d317e06d8f1e7f9ebd
parent 9273c6d9b8e194918743554870d37d4c1b5d7143
Import lots of design and implementation notes from the blueprint.
-rw-r--r-- wiki/src/contribute/APT_repository.mdwn 116
-rw-r--r-- wiki/src/contribute/APT_repository/tagged_snapshots.mdwn 31
-rw-r--r-- wiki/src/contribute/APT_repository/time-based_snapshots.mdwn 146
3 files changed, 289 insertions, 4 deletions
diff --git a/wiki/src/contribute/APT_repository.mdwn b/wiki/src/contribute/APT_repository.mdwn
index dce330f..4f9e2e0 100644
--- a/wiki/src/contribute/APT_repository.mdwn
+++ b/wiki/src/contribute/APT_repository.mdwn
@@ -1,3 +1,7 @@
+[[!toc levels=2]]
+
+# Our APT repositories
+
We have three kinds of APT repositories:
* our [[custom APT repository|contribute/APT_repository/custom]],
@@ -8,3 +12,115 @@ We have three kinds of APT repositories:
* (partial) [[contribute/APT_repository/tagged snapshots]] of upstream
APT repositories we need, so that one can rebuild a released ISO in
the future, and we keep the corresponding source code around.
+
+# Snapshots and branches
+
+Here we discuss what APT snapshots of upstream repositories are used
+when building a Tails ISO image. This depends on the branch we
+build from, and on whether we are building an ISO that is meant to be
+released (i.e. whether there is a tag in Git corresponding to the last
+entry in debian/changelog).
+
+Building an ISO from the `devel` branch always uses the freshest set
+of APT repository snapshots available. We resolve the set of
+freshest APT repository snapshots once, at the beginning of the
+build ([[!tails_gitweb auto/config]],
+[[!tails_gitweb auto/scripts/apt-mirror]]), so that the entire build
+uses the exact same state of these
+repositories. This is needed for reproducible builds, and has a nice
+side effect: so long, `Hashsum mismatch`, and thanks for the fish.
+
+When building an ISO from the branch used to prepare the next major release
+(`testing`), or a topic branch based on it (`config/base_branch` contains `testing`):
+
+ * **outside of the freeze period**: we use the latest set of APT
+ repository snapshots, just like when building from `devel`;
+ * **during the freeze period**: at freeze time, the RM encodes in the Git
+ `testing` branch the set of APT repository snapshots (via their
+ serial numbers) that shall be used during the freeze; the only
+ exception is security.debian.org, for which we always use our
+ latest snapshot;
+ * **at release time**: when building from a tagged branch, similarly to
+ what we do for our
+ [[custom APT repository|contribute/APT_repository/custom]], instead
+ of using time-based APT repository snapshots, we use snapshots
+ labeled with the Git tag. Strictly speaking, this is not needed,
+ as the APT sources used at Tails runtime will anyway be the
+ official (and not frozen) Debian ones. It is mostly needed for
+ legal purposes (it allows us to distribute, for a long
+ time, the source packages needed to build a given Tails ISO image),
+ and it will be useful when we want to be able to reproduce a given
+ Tails ISO build;
+ * **after releasing**, the RM encodes in the `testing` Git branch the
+ fact that it is not frozen anymore, that is: the RM removes the
+ indication that a specific set of APT repository snapshots must be
+ used; and then, we're back to the "outside of the freeze
+ period" case.
+
+When building an ISO from the branch used to prepare the next point-release
+(`stable`), or a topic branch based on it (`config/base_branch`
+contains `stable`), we
+use snapshots labeled with the Git tag of the latest Tails release,
+except:
+
+ * we generally use our latest snapshot of security.debian.org;
+ * at release time: when building from a tagged branch, similarly to
+ what we do for our
+ [[custom APT repository|contribute/APT_repository/custom]], instead
+ of using time-based APT repository snapshots, we use snapshots
+ labeled with the Git tag;
+ * if a set of APT repository snapshots is encoded directly in that
+ branch: use them, even for security.debian.org.
+
+# Design notes
+
+## Miscellaneous
+
+A given APT repository snapshot is immutable after it's been taken.
+We
+[[deal with freeze exceptions separately|contribute/APT_repository/freeze exception]].
+
+We want to have reproducible builds some day. Therefore, the APT
+`sources.list` shipped in the ISO must be stable across rebuilds from
+the same release Git tag.
+
+Say `kedit` is a package shipped in Debian, but not in Tails. Then,
+when run inside Tails, `apt install kedit` must fetch `kedit` from
+current Debian, as opposed to installing it from a Tails-specific, and
+generally obsolete, snapshot of the Debian APT repository.
+
+<a id="runtime-sources"></a>
+
+## APT sources used inside Tails
+
+A running Tails' APT must be pointed at the official, live Debian
+archive, and not at a Tails-specific and already obsolete snapshot.
+
+To achieve that we tweak `sources.list` in
+[[!tails_gitweb config/chroot_local-includes/lib/live/config/1500-reconfigure-APT]].
+
+## Upgrading to a new snapshot
+
+In other words: bumping, in Git, the pointers to the set of snapshots
+that shall be used by a given branch.
+
+Let's use, as an example of a situation in which we might want to do
+that, upgrading to a new Debian point-release.
+
+With this design:
+
+ * `devel` gets the new point-release automatically because it closely
+ tracks the Debian archive;
+ * for release branches (`stable`, `testing`): on a case-by-case
+ basis, depending on the respective Debian/Tails release schedule
+ timing, we can choose whether to switch to using a new snapshot of
+ the Debian archive for the next release. Note that this can be done
+ via a topic-branch since this information is encoded in Git. If we
+ choose not to manually pick the point release, which is the default
+ if we don't act at all, then:
+ - `testing` will start using the new Debian point-release as soon
+ as it is unfrozen, that is as soon as it has been used to release
+ a new major version of Tails;
+ - `stable` will start using the new Debian point-release once
+ a `testing` branch that uses that point-release is merged into
+ `stable`.
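
The rules in "Snapshots and branches" can be condensed into one decision function. The following is a minimal sketch, not the actual logic of `auto/scripts/apt-mirror`: the function name, its arguments, the `latest` sentinel meaning "no serial is encoded in Git", and the `kind:identifier` output format are all assumptions made for illustration. It also assumes that a tagged build takes precedence over serials encoded in the branch, which the text above does not spell out.

```sh
# Hypothetical helper: print which snapshot a given APT origin should use.
#   $1: base branch (devel, testing or stable)
#   $2: APT origin (e.g. debian, debian-security)
#   $3: serial encoded in Git for this origin, or "latest" (assumed sentinel)
#   $4: Git tag of the release being built, empty for untagged builds
#   $5: Git tag of the latest Tails release
resolve_snapshot() {
    base_branch=$1 origin=$2 encoded=$3 build_tag=$4 last_tag=$5

    case $base_branch in
        devel)
            # devel always follows the freshest snapshots.
            echo "time-based:latest"
            ;;
        testing)
            if [ -n "$build_tag" ]; then
                # Release build: snapshots labeled with the Git tag.
                echo "tagged:$build_tag"
            elif [ "$encoded" = latest ] || [ "$origin" = debian-security ]; then
                # Not frozen, or security.debian.org during the freeze.
                echo "time-based:latest"
            else
                # Frozen: use the serial encoded in the branch.
                echo "time-based:$encoded"
            fi
            ;;
        stable)
            if [ -n "$build_tag" ]; then
                echo "tagged:$build_tag"
            elif [ "$encoded" != latest ]; then
                # Serials encoded in the branch win, even for security.
                echo "time-based:$encoded"
            elif [ "$origin" = debian-security ]; then
                echo "time-based:latest"
            else
                # Default: tagged snapshots of the latest Tails release.
                echo "tagged:$last_tag"
            fi
            ;;
        *)
            return 1
            ;;
    esac
}
```

For instance, `resolve_snapshot testing debian-security 2016031102 '' "$last_tag"` prints `time-based:latest`, which reflects the security.debian.org exception during the freeze.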
diff --git a/wiki/src/contribute/APT_repository/tagged_snapshots.mdwn b/wiki/src/contribute/APT_repository/tagged_snapshots.mdwn
index d06c17b..912b627 100644
--- a/wiki/src/contribute/APT_repository/tagged_snapshots.mdwn
+++ b/wiki/src/contribute/APT_repository/tagged_snapshots.mdwn
@@ -29,7 +29,9 @@ cautiously, if ever.
[[!tails_gitweb_repo puppet-tails]]
* bits scattered in the main Tails Git repository (details below)
-# Listing needed packages
+# Design notes
+
+## Listing needed packages
To generate partial APT repositories, we need to know what to include
in them. Therefore, we create a _build manifest_ at the end of an ISO
@@ -58,15 +60,15 @@ manifest:
diff between `.packages`" as we do at release time, but done in
a more correct way.
-# Importing packages into partial snapshots
+## Importing packages into partial snapshots
-## How it's done in practice
+### How it's done in practice
* [[!tails_gitweb auto/scripts/tag-apt-snapshots]]
* [tails-prepare-tagged-apt-snapshot-import](https://git-tails.immerda.ch/puppet-tails/tree/files/reprepro/snapshots/tagged/tails-prepare-tagged-apt-snapshot-import)
* [tails-publish-tagged-apt-snapshot](https://git-tails.immerda.ch/puppet-tails/tree/files/reprepro/snapshots/time_based/tails-publish-tagged-apt-snapshot)
-## A corner case: APT pinning magics
+### A corner case: APT pinning magics
If a (package, version) is seen at build time in 2 or more APT
sources, `tails-prepare-tagged-apt-snapshot-import` injects it
@@ -132,3 +134,24 @@ And then, APT will download `a2ps` from security.d.o:
e.g. because we were using a tagged snapshot that imported `a2ps` into
the security archive, then APT would prefer `a2ps` from Jessie, which
demonstrates the problem.
+
+## Valid-Until
+
+A tagged APT repository snapshot that was used to build a given Tails
+release is immutable by design, so it does not need the protections
+provided by `Valid-Until`. Besides, not using `Valid-Until` for those
+makes it much easier to reproduce a given ISO build in the future.
+
+So, the `Release` files for tagged snapshots have no
+`Valid-Until` field.
+
+## Garbage collection
+
+We want to keep "forever" the tagged snapshots used by Tails releases.
+
+In practice, "forever" = min(3 years for GPL, how long we want to be
+able to reproduce the build of a released ISO) = 3 years.
+
+Depending on the growth rate of our tagged snapshots in practice, we
+may or may not need to implement expiration of these snapshots any
+time soon. Time will tell.
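
The "Valid-Until" note above relies on the fact that a `Release` file without a `Valid-Until` field never expires. The sketch below illustrates that distinction; it is not taken from the Tails scripts. The function name and its arguments are hypothetical, and it assumes GNU `date` (for `-d`) and the usual `Valid-Until: Sat, 19 Mar 2016 12:00:00 UTC` field format.

```sh
# Hypothetical helper: succeed (exit 0) if a Release file has expired.
#   $1: path to a Release file
#   $2: current time in seconds since the epoch (defaults to now)
release_expired() {
    now=${2:-$(date +%s)}

    # Extract the value of the first Valid-Until field, if any.
    valid_until=$(sed -n 's/^Valid-Until: //p' "$1" | head -n 1)

    # No Valid-Until (e.g. a tagged snapshot): the file never expires.
    [ -n "$valid_until" ] || return 1

    # Expired if the Valid-Until date is earlier than "now".
    # Requires GNU date to parse the field value.
    [ "$(date -u -d "$valid_until" +%s)" -lt "$now" ]
}
```

With this helper, a time-based snapshot's `Release` file is reported as expired once its `Valid-Until` date has passed, whereas a tagged snapshot's `Release` file is never reported as expired.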
diff --git a/wiki/src/contribute/APT_repository/time-based_snapshots.mdwn b/wiki/src/contribute/APT_repository/time-based_snapshots.mdwn
index 0c810d5..db58044 100644
--- a/wiki/src/contribute/APT_repository/time-based_snapshots.mdwn
+++ b/wiki/src/contribute/APT_repository/time-based_snapshots.mdwn
@@ -48,6 +48,10 @@ The corresponding data is not critical: we can restart the whole thing
from scratch if needed, without too much pain ⇒ no need to synchronize
this content to the failover server; no need to back it up.
+We don't bother merging mirrored APT repositories / suites into
+aggregated ones. Doing so would lose information, give us more work,
+and bring little value.
+
# Source code
* `tails::reprepro::snapshots::time_based` class in
@@ -95,6 +99,8 @@ a specific set of APT repository snapshots must be used:
-m 'Thaw APT snapshots after Tails $VERSION was released.' \
config/APT_snapshots.d/*/serial
+<a id="bump-expiration-date"></a>
+
Bump expiration date
--------------------
@@ -134,3 +140,143 @@ days from now:
fi
done
)
+
+Stop tracking a distribution
+----------------------------
+
+After we stop tracking a distribution, e.g. after we release Tails
+based on a new Debian, we need to manually remove all corresponding
+time-based snapshots, and the packages that are not referenced
+anymore.
+
+For example, when we stopped tracking Wheezy, we did:
+
+ reprepro dumpreferences \
+ | grep -E '^s=wheezy' \
+ | awk '{print $1}' \
+ | sort -u \
+ | xargs -n 1 reprepro _removereferences \
+ && reprepro deleteunreferenced
+
+# Design notes
+
+## gensnapshot
+
+We use reprepro's `gensnapshot` command, which basically copies
+a distribution, keeping references to the packages it contains.
+
+Compared to the "snapshots as full-blown distributions + `reprepro
+pull`" option we
+[used in our initial experiments](https://labs.riseup.net/code/issues/6295#note-14),
+we save _a lot_ on database size, and thus gain in performance,
+because reprepro does less tracking on snapshots than it does
+for real distributions.
+
+The downside of using snapshots created with `gensnapshot` is that:
+
+ * garbage collecting expired snapshots is a bit more involved, i.e.
+ we have to
+ [do it ourselves](https://git-tails.immerda.ch/puppet-tails/tree/files/reprepro/snapshots/time_based/tails-delete-expired-apt-snapshots);
+ * bumping `Valid-Until` for a given time-based snapshot has to be
+ done directly in `dist`, without any help from reprepro; so here
+ again, we
+ [do it ourselves](https://git-tails.immerda.ch/puppet-tails/tree/files/reprepro/snapshots/time_based/tails-bump-apt-snapshot-valid-until).
+
+None of these problems warrant going back to the other option... and
+having to deal with 80GB+ Berkeley DB databases.
+
+## Garbage collection and Valid-Until
+
+We expire snapshots older than 10 days in order to save disk space,
+and to keep the reprepro database from growing too much.
+
+To ensure that garbage collection doesn't delete a snapshot we still
+need, e.g. the one currently referenced in the frozen `testing`
+branch, we rely on the `Valid-Until` field found in `Release` files:
+the way to express "I want to keep a given snapshot around" is to
+postpone its expiration date; i.e. we don't differentiate "keep
+a given snapshot around" from "keep a given snapshot usable", which
+seems to make sense.
+
+See [[above|time-based_snapshots#bump-expiration-date]] for how we
+can manage `Valid-Until` manually, whenever needed.
+
+One advantage of this design is that we don't have to regularly update
+`Valid-Until` fields, and the corresponding signatures: we only do
+that on a case-by-case basis, when needed. And thus, we can actually
+benefit from the protections offered by APT when `Valid-Until` fields
+are present, as any snapshot will expire unless we do something
+about it.
+
+In practice, the main use case for keeping a given time-based APT
+repository snapshot around and valid is when it's being used by
+a release branch:
+
+ - `testing`: while it's frozen, that is, for 5-10 days most of the
+ time;
+ - `stable`: that's a corner case, since `stable` generally uses the
+ set of tagged snapshots of the latest Tails release; if and when we
+ decide to manually point `stable` to a different set of snapshots,
+ then we can as well deal with `Valid-Until` manually.
+
+In passing, note that we ship an empty `/var/cache/apt/lists/` in the
+ISO ⇒ modifying `Release` and `Release.gpg` files on our APT
+repository won't prevent the ISO build from being deterministic.
+
+## APT vs. reprepro: dist names
+
+We need to encode in the APT sources' base URL the exact snapshot we
+want to use, in order to be able to pass it to `lb config --mirror-*`.
+But this doesn't match reprepro's directory structure as-is.
+
+Thankfully this problem can be worked around with some symlinks or
+HTTP rewrite rules. Here's how.
+
+Let's assume:
+
+ lb config --distribution jessie
+ lb config --mirror-chroot \
+ http://time-based.snapshots.deb.tails.boum.org/debian/2016031101/
+ lb config --mirror-chroot-security \
+ http://time-based.snapshots.deb.tails.boum.org/debian-security/2016031102/
+ [...]
+
+Which generates this APT `sources.list`:
+
+ deb http://time-based.snapshots.deb.tails.boum.org/debian/2016031101/ jessie main
+ deb http://time-based.snapshots.deb.tails.boum.org/debian-security/2016031102/ jessie/updates main
+ [...]
+
+As a result APT sends HTTP requests with URLs such as:
+
+ * <http://time-based.snapshots.deb.tails.boum.org/debian/2016031101/dists/jessie/Release>
+ * <http://time-based.snapshots.deb.tails.boum.org/debian/2016031101/pool/XXX>
+ * <http://time-based.snapshots.deb.tails.boum.org/debian-security/2016031102/dists/jessie/updates/Release>
+ * <http://time-based.snapshots.deb.tails.boum.org/debian-security/2016031102/pool/XXX>
+
+The corresponding files in reprepro's filesystem (given that we have
+one reprepro instance per mirrored archive) are:
+
+ * in Debian archive's reprepro:
+ - `/srv/apt-snapshots/time-based/repositories/debian/dists/jessie/snapshots/2016031101/Release`,
+ that contains `Suite: jessie/snapshots/2016031101` and `Codename: jessie`
+ - `/srv/apt-snapshots/time-based/repositories/debian/pool/XXX`
+
+ * in Debian security archive's reprepro:
+ - `/srv/apt-snapshots/time-based/repositories/debian-security/dists/jessie/updates/snapshots/2016031102/Release`,
+ that contains `Suite: jessie/updates/snapshots/2016031102` and
+ `Codename: jessie/updates`
+ - `/srv/apt-snapshots/time-based/repositories/debian-security/pool/XXX`
+
+To have the above HTTP requests translate to access to these files,
+we use
+[a set of HTTP rewrite rules](https://git-tails.immerda.ch/puppet-tails/tree/templates/reprepro/snapshots/time_based/nginx_site.erb).
+
+Note: this works because APT only warns when the codename in the
+`Release` file doesn't match the one requested in `sources.list`.
+There's a code comment around this check, dating back to 2004, that
+says something like "This might become fatal in the future". We bet that if it
+becomes fatal some day, it will be possible to turn it back into
+a warning via configuration. This affects only development builds
+since we're not going to configure APT _in the Tails ISO_ to point to
+our own snapshots of the Debian archive anyway.
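
The URL-to-file translation described in "APT vs. reprepro: dist names" can be sketched as a small shell function. This is an illustration only, not the nginx rewrite rules linked above: the function name is hypothetical, the serial is assumed to be a run of digits, and the only two-component dist name handled is `<codename>/updates`. The repository root is the one given in the text.

```sh
# Hypothetical helper: map the path of an APT request to the file that
# reprepro stores for it.
#   $1: URL path, e.g. /debian/<serial>/dists/jessie/Release
snapshot_url_to_path() {
    root=/srv/apt-snapshots/time-based/repositories

    # Rule 1: dists/<codename>/updates/... (security archive layout).
    # Rule 2: dists/<codename>/...         (regular archive layout).
    # Rule 3: pool/... is shared by all snapshots, so the serial is dropped.
    # Input that matches no rule is printed unchanged.
    printf '%s\n' "$1" | sed -E \
        -e "s#^/([^/]+)/([0-9]+)/dists/([^/]+/updates)/(.*)\$#$root/\1/dists/\3/snapshots/\2/\4#" \
        -e "s#^/([^/]+)/([0-9]+)/dists/([^/]+)/(.*)\$#$root/\1/dists/\3/snapshots/\2/\4#" \
        -e "s#^/([^/]+)/([0-9]+)/pool/(.*)\$#$root/\1/pool/\3#"
}
```

The serial moves from right after the archive name in the URL to a `snapshots/<serial>` directory below the dist name on disk, which is where `gensnapshot` places its output.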