summaryrefslogtreecommitdiffstats
path: root/wiki/src/blueprint/Debian_Stretch.mdwn
blob: f0ebd786df6cbd09a1c2713585eafa11e3223753 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
[[!meta title="Porting Tails to Debian Stretch"]]

This is about the effort to create [Tails
3.0](https://labs.riseup.net/code/projects/tails/issues?query_id=198),
based on Debian Stretch.

[[!toc levels=2]]

# Big picture

When porting to Wheezy, our main problem has been that we started the
work too late, which made us discover Debian bugs too late (after the
Debian Wheezy freeze, or even after its release), and in the end we
had to workaround lots of problems on our side.

So we started much earlier the porting work to Jessie, which indeed
essentially avoided the aforementioned problem. But we didn't allocate
enough focused resources to this effort, and as a result the total
duration of the porting to Jessie work has lasted too much (July 2014
to January 2016, included = 19 months), and we've spent too much time
just keeping our feature/jessie branch up-to-date wrt. the changes we
were making in our Wheezy production branches.

For Stretch we would like to avoid both problems. We want to start early,
in order to fix problems directly in Debian Stretch before it's
released. And, we want the porting work to fit into a shorter time
span, so as soon as we start we'll allocate more resources to it, and
in a better organized, team-based and focused way.

Additionally, we would like to use this process as an opportunity to
evaluate the idea of basing Tails on snapshots of Debian testing.

<a id="schedule"></a>

# Schedule

* 2016Q1 — Tails 2.0 is out
* August 2016 — start working on Tails 3.0 (1 week sprint with
  all involved people) :intrigeri:anonym:
  - get feature/stretch to build and boot
  - update the automated test suite to test Tails/Stretch ISO images
* August 2016 to April 2017 — have one dedicated half-time, 1-week sprint
  every 6 weeks.
  - Schedule these sprints in advance, announce them
    publicly, and invite other contributors (e.g. doc writers).
  - Most of these sprints will probably be attended remotely, but at
    least some could happen face-to-face (at least the first sprint;
    and then, one out of two, i.e. every one every 3 months?).
  - At the beginning of each Stretch sprint, we import a new snapshot
    of Debian testing into our freezable APT repository, so that:
    * during the sprint we update our stuff as required by changes
      that happened in Debian since the last sprint;
    * our stuff is not broken by changes in Debian neither during
      sprints, nor between them;
    * we get a feeling of how "being based on snapshots of Debian
      testing" would work.
* November 14-18th: second sprint (in-person, organized by intrigeri),
  release Tails 3.0~alpha1
* December 20-23: third sprint (remotely attended for everybody except
  a couple of us)
* January 30 - February 3: fourth sprint (remotely attended); release
  Tails 3.0~beta1
* February 2017 to Tails 3.0 release: keep 3.0~beta as up-to-date,
  wrt. security vulnerabilities, as our 2.x channel is
* February 5 2017 — Debian Stretch freeze starts
* March 8: release Tails 3.0~beta2
* March 17-19th: fifth sprint (in-person, organized by intrigeri),
  release Tails 3.0~beta3
* April 19: release Tails 3.0~beta4
* May 16-19: sixth sprint (remotely attended for everybody except
  a couple of us), release Tails 3.0~rc1
* June 2017 — Debian Stretch and Tails 3.0 are released

<a id="rolling"></a>

# Let's go rolling

This material is mostly useful for a historical perspective. The next
steps and up-to-date information are documented on
[[blueprint/Debian_testing]] instead.

## Stretch cycle

Let's use this porting cycle to evaluate how being based on snapshots
of Debian testing would feel like. During the entire cycle:

* Keep the automated test suite up-to-date on feature/stretch
  - Whenever the porting work feels painful, e.g. because it requires
    updating images, consider fixing the problem at its root in our
    current devel branch, e.g. thanks to dogtail
    ([[!tails_ticket 10721]]).

* Keep the documentation up-to-date on feature/stretch (:sajolida:),
  or if it is too ambitious:
  - attend at least one sprint really early in the process, and one
    quite late
  - start doing it for real only once Stretch is frozen
  - before Stretch is frozen, during each Stretch sprint, test the
    documentation in order to:
    * find and report bugs while it is still time to fix them in the
      right place
    * identify what would have required work if we wanted to keep the
      documentation up-to-date on feature/stretch, and take notes
      about it; the goal is here is to gather useful data regarding
      the need to update the same piece of documentation multiple
      times over a given Debian release cycle, which will help us make
      a decision further down the road; it also might help us pinpoint
      specific parts of our documentation that could be updated to
      depend less on varying factors, if any
    * this testing work can be easily joined by newcomers (and we
      should make this clear)

* Take notes of issues we see from the perspective of "how would it go
  if we were based on testing already?". E.g.:
  - How to keep track of security issues that affect us, and whether
    their fix has been uploaded and has migrated to testing yet?
    See e.g. how security support for Debian testing [used to be
    (briefly) done](http://secure-testing-master.debian.net/), and in
    particular the
    [Vulnerable source packages in the testing suite](https://security-tracker.debian.org/tracker/status/release/testing)
    and the
    [Candidates for DTSAs](https://security-tracker.debian.org/tracker/status/dtsa-candidates).
  - How to deal with
    [[!debwiki OngoingTransitions desc="ongoing transitions"]] that
    block migration of security updates from sid to testing?

* Make testing ISOs available widely with clear explanations of what
  to test and what *not* to test. This should happen starting around the freeze
  of Stretch. We could have a message about what to test displayed at bootup of
  test ISOs.

Note that during half of the cycle, we'll be based on a frozen Debian
testing, so the changes rate coming from Debian, that impact the
documentation and test suite work, won't be crazy. Of course there
will be changes coming from our own porting efforts.

Also, note that the work we will be doing after Stretch is frozen is
work we would have to do to port Tails to Stretch anyway: so the only
part of this that is somewhat experimental, and might make us do extra
work, is limited to 4 months (August to November 2016).

## And after?

And then, once Tails 3.0 is out: we're not lagging behind anymore, and
are thus in a position to decide whether we want to base Tails on
snapshots of Debian testing. Chances are that we won't want to be
based on Debian testing during the 6 months that follow a Debian
stable release, so we will have time to make up our mind, and to
prepare the infrastructure and tooling we may need if we decide to
be based on Debian testing.

To sum up how a ~2 years Debian release cycle could look like from our
perspective, if we were based on snapshots of Debian testing:

1. a new Debian stable is released
2. 6 months during which Tails is based on Debian stable that was just
   released
3. 12 months during which Tails is based on a non-frozen Debian
   testing
4. 6 months during which Tails is based on a frozen Debian testing

Let's now analyze and discuss some consequences of this scenario.

We would be based on a moving target half of the time; the remaining
half of the time we are based on something that doesn't change much.

We would need to track 2 Debian versions at the same time (and
continuously forward-port development that was based on the oldest
one) during the first phase only, that is 6 months, compared to the 19
months during which we did that in the Jessie cycle. We would save
lots of time there, that could be re-invested in aspects of this
proposal that will require more time: essentially this idea is about
doing a different kind of work, and doing it at different places; it
won't necessarily lower the total amount of effort we have to do, but
it likely will make such efforts generate less frustration, more
happiness, and a warm feeling of being part of a broader community.

Another aspect is that we would essentially stop having to produce and
maintain backports of Debian packages.

<a id="rolling-stretch-analysis"></a>

## Analysis of the Stretch cycle

### Statistics of Git activity

We have generated
[statistics and graphs](https://labs.riseup.net/code/attachments/download/1640/gitstats-2.x-to-3.0.tar.bz2)
of the Git activity between our Jessie-based branch and our
Stretch-based one. Long story short:

 * We really started this work in 2016-08.
 * The first peeks of activity are the two first sprints we had in
   2016-08 and 2016-11. Then quite some work was done every month
   until the release, with more sustained activity starting in
   2017-03.
 * The biggest peaks are:
   - 2017-03: we had a Stretch + reproducible builds sprint, and the
     work on reproducible builds is accounted for in these stats (it
     was based on Stretch).
   - 2017-05: we had a Stretch sprint and our technical writers were
     busy updating the doc.

### Code

XXX:anonym, please add whatever input data, feelings and analysis
you can come up with.

Commits history in the `auto` and `config` directories between our
Jessie and Stretch -based branches:

 - 2016-01: 17
 - 2016-02: 1
 - 2016-03: 0
 - 2016-04: 0
 - 2016-05: 14
 - 2016-06: 0
 - 2016-07: 16
 - 2016-08: 59 (in-person sprint)
 - 2016-09: 2
 - 2016-10: 0
 - 2016-11: 88 (in-person sprint)
 - 2016-12: 35 (remote sprint)
 - 2017-01: 24 (remote sprint)
 - 2017-02: 10 (remote sprint)
 - 2017-03: 108 (in-person sprint + lots of work on reproducible builds)
 - 2017-04: 27
 - 2017-05: 97 (in-person sprint)
 - 2017-06: 42

So about 2/3 of the commits were done during 4 in-person sprints,
which was the plan. Interestingly, remote sprints were less
productive, but let's take this metric with a grain of salt.

Clocking data:

 * intrigeri worked 44 days on the Stretch migration. Sadly, this
   includes some work that's irrelevant here, like:
      * Some test suite work
      * Porting to the new Greeter
      * New memory wipe implementation
      * Preparing 5 alpha/beta releases, most of which we wouldn't
        have needed if we had been tracking Debian testing.
      * Reviewing other contributors' work targeted at Stretch, which
        would be instead part of RM'ing if we were based on
        Debian testing.
 * anonym mostly worked on the test suite, which doesn't count in this
   section. Let's assume the code work he did somewhat compensates the
   time erroneously clocked by intrigeri.
 * A few other people also did some work, which was helpful, but
   that's negligible vs. the total time spent.

So all in all, we can assume about 44 days of work in this area.
Dividing this number by the number of 3-months cycles in a Debian
release cycle (8), that's 5.5 days every 3 months. But of course some
work would have to be done more than once in two years if we tracked
Debian testing, so this figure probably reflects an unrealistic best
case scenario.

Regarding code changes we had to do to adjust for changes in testing:
Stretch was frozen for half of the experiment, so we haven't much data
here. After looking closely at the Git log, what I could spot is
mostly unfuzzying patches, cherry-picking packages from sid when they
were auto-removed from testing, and rebasing packages we patch on top
of the current version, i.e. trivial (although somewhat boring) work
whose amount we could substantially lower on the long term by working
more closely with upstream projects; the only bigger pieces I could
spot are the migration to GnuPG v2 and Icedove → Thunderbird, but
those would have to be made exactly once as well if we were based on
Debian testing.

Regarding tracking security issues fixed in sid but not in testing:

 * blocked due to the normal transition time from unstable to testing:
   we didn't pay attention to this when releasing Tails 3.0~, so we
   have no data here; too bad.
 * blocked by some ongoing transition: we started providing security
   support for Tails 3.0~ after Stretch was frozen, so we had no
   direct experience of this problem; too bad.

### Test suite

XXX:anonym, please add whatever input data, feelings and analysis
you can come up with.

Non-merge commits in `features/` and `run_test_suite` between our
Jessie-based branch and our Stretch-based one:

 - 2016-05: 1
 - 2016-06: 0
 - 2016-07: 1
 - 2016-08: 62
 - 2016-09: 0
 - 2016-10: 0
 - 2016-11: 41
 - 2016-12: 39
 - 2017-01: 7
 - 2017-02: 30
 - 2017-03: 29
 - 2017-04: 19
 - 2017-05: 20

Here as well, most of the work was done during a few sprints.

Clocking data:

 * intrigeri didn't clock his test suite work separately from his
   other Stretch work, but that probably wasn't more than 2-4 days.

 * anonym worked 24 days during sprints + XXX days outside of sprints
   on the Stretch migration; most of this time was spent on the
   test suite.

Total diff stat:

 - 258 files changed, 1252 insertions(+), 986 deletions(-)
 - among those, 200 images changed: 52 removed, 13 added, 135 updated;
   and among these 200 image changes, at least 50 are orthogonal to
   what we're analyzing here:
    * many of the removals are thanks to dotail-ification, so at least
      those won't need to be updated ever again
    * 21 due to switching to Tor Browser 7.0
    * 1 due to the memory wipe system change
    * 23 due to the new Greeter

Now let's do some stats ignoring the changes that are orthogonal to
what we're analyzing here, in that we would have to do them once,
regardless of what version of Debian (stable or testing) we're
tracking, i.e. the Icedove → Thunderbird rename and the memory erasure
system replacement:

 - `*.feature`: 19 files changed, 152 insertions(+), 125 deletions(-);
   most of the changes are still orthogonal to what we're analyzing
   here (refactoring and a few new regressions tests for bugs that
   affect Tails 2.x too)
 - `*.rb`: 21 files changed, 661 insertions(+), 502 deletions(-); many
   of the changes are still orthogonal to what we're analyzing here
   (refactoring and porting to the new Greeter and memory erasure
   system)

In short, by far the biggest part of the porting work was done by
updating images and dogtail-ifying tests. Only very few other bits had
to be updated (GnuPG v2, migrating away from obsolete CLI, adjusting
to new `nmcli`, this kind of things).

Now, of course we can't ignore the dogtail-ification work when we
think ahead about how test suite maintenance would look like if we
tracked Debian testing:

 * if we hadn't done this work, we would have lots more images
   updates, and way fewer image deletions;
 * it's mostly one-time work, so what's been done is done for good,
   still we should keep investing in it and dogtail-ify more tests in
   order to decrease the amount of images we need to maintain. So at
   least for a few 3-months cycles, this work will be recurring.

Additional data that would be interesting:

 * Did we have to update some tests (e.g. images) more than once due
   to changes in Debian testing?

### Documentation

- sajolida spent 6.3 days of work on the update to 3.0 in 2017
  (including amd64, Debian installation, new Greeter, KeePassX, etc.).
  spriver and cbrownstein combined spent as much time. So let's say 15
  days in total.
- Release notes were a big chunk for sajolida (1.3 days).
- But we (mostly spriver) also took time to go through all the
  documentation and test it and that's probably ~2 days of work. I don't
  think that's realistic or worth it to go through this every quarter so
  we should think about alternatives:
  * Get help from the Foundations team to spot what's worth testing
  * based
    on the underlying Debian packages?
  * Only test core pages?  Spread the load between different release
  * (test some pages for some
    release and some others for another release)?
- This time we also updated our installation documentation to the new
  Debian release. If we're based on Stretch we would have to do this at
  a different time but that shouldn't be a problem.

Note: the following stats ignore PO files, the Icedove → Thunderbird
renaming, and the `blueprint`, `contribute`, `inc` and `news`
directories. Sorry I realized too late that I should have taken the
3.0 release notes into account :/

Non-merge commits in `wiki/src` between our Jessie-based branch and our
Stretch-based one:

 - 2016-08: 1
 - 2016-09: 0
 - 2016-10: 0
 - 2016-11: 2
 - 2016-12: 0
 - 2017-01: 3
 - 2017-02: 2
 - 2017-03: 42
 - 2017-04: 26
 - 2017-05: 45
 - 2017-06: 1

So this confirms what we already knew: the bulk of the work was done
in the last 3 months before the release, and started after Stretch was
already deeply frozen, so obviously we won't learn much here regarding
_continuously_ updating our doc to match a moving target.

But at least we can get an idea of the total amount of work this has
required :)

 - 84 files changed, 417 insertions(+), 355 deletions(-)
 - Among the 84 changed files: 35 pictures + 49 text files.

Some of these changes are orthogonal to what we're analyzing here, in
that we would have to do them once regardless of what version of
Debian (stable or testing) we're tracking:

 - New Greeter: 8 files changed, 151 insertions(+), 21 deletions(-)
 - Most of the changes in the `install` directory are either about the
   move to 64-bit, or about supporting installation from Debian
   Stretch, or about the new Greteer as well:  21 files changed, 81
   insertions(+), 100 deletions(-)
 - KeePassX migration: even if we had been tracking testing, this
   would have had to be done only once:  1 file changed, 64
   insertions(+), 29 deletions(-)

So very roughly, the changes that would be impacted by the "let's go
rolling?" decision are closer to 54 files changed (24 pictures and 30
text files), 121 insertions, 205 deletions.

So, the worst case scenario is that every 3 months we have to:

 * go through changes in the packages list, and identify what doc
   pages and screenshots need updates; this can be semi-automated;
 * update about 24 pictures
 * update 100-200 lines in about 30 text files

Let's keep in mind some other factors that will make reality a bit
nicer:

 * There's a GNOME release only every 6 months, so a non-negligible
   part of our doc will have to be updated only twice a year, instead
   of every 3 months. I bet many of our screenshots are part of
   this lot.
 * Our test suite can help identify the doc bits that need an update:
   either text-based instructions (when our tests use dogtail), or
   screenshots (when our tests use picture recognition). There are
   a few ways we could boost this beneficial factor as we go e.g.:
   - add automated tests specifically about things we document;
   - add steps to our existing tests to validate the screenshots we
     have in our documentation.

### Summary

Looking at the data we have, my (intrigeri) general feeling is rather
positive but I was sold to this idea from the beginning, so let's be
clear: I'm biased :)

Now, I feel we haven't gathered enough data during the Stretch cycle
to make a final decision wrt. starting to track Buster in 2018-01 or
not: particularly in the doc and security support areas, we have very
little data about _continuously_ adapting to a moving target.

But I think we do have enough data to decide at the summit something
like "if X happens by the end of November and takes us less than
Y days, then we'll go ahead and switch to Buster in 2018-01".
Worst case we'll suffer for a year until Buster is frozen, and then we
won't try again before fixing some pre-conditions we'll have
identified during our failed first try.

I think the only way we can gather the missing data, and check if "X
happens" and whether it takes "less than Y days", is to have sprints
in 2017-10 or 2017-11, and actually port most of our code, test suite
and documentation to Buster. We could take some shortcuts though, e.g.
by estimating the amount work needed instead of doing it, when we feel
very confident we can come up with a realistic estimate.

It'll be impossible to have all these kinds of work done in parallel
during one single sprint though; the best process I can think of is to
serialize the tasks like this:

1. 3 days: make the ISO build and boot (anonym + intrigeri)
2. 3 days: loop over "port some of the test suite, fix regressions in
   our code to make the ported tests pass" (anonym + intrigeri)
3. 1 day: give our technical writers a list of software that has
   changed in a way that may affect our documentation, so they can
   update it (anonym + intrigeri)
4. 3 days: update the doc (sajolida & his amazing tech writing band)

If we eventually decide to postpone our migration to Stretch, some of
this work will be wasted, and hopefully some won't. To minimize the
amount of wasted work in the worst case, we should be ready to give up
in the middle of the afore-described process, if we notice it's going
to take us more time than "Y days".

While doing this we should carefully gather the data we did not get
during the Stretch cycle, in particular:

1. How would we address security issues fixed in sid but not in
   testing? (see the "Stretch cycle" section for details)
2. How can developers convey to technical writers what changes may
   affect the documentation? (see ideas in the "Documentation" section
   above)
3. Thanks to #2, can we identify what piece of doc needs updating
   _without_ having to test the entire documentation every
   three months?

When deciding what X and Y should be, and when analyzing how the
experiment went, we shall keep in mind that Debian testing changes _a
lot_ when the gates open after the freeze. It's not exactly the time
when it's the most reliable, even though there's been big progress in
this area. And there are lots of library transitions going on, that
could be painful for us, even though there's been big progress here as
well. In other words, the delta between Stretch and "Buster in
October" will be substantially bigger than what will happen during the
next 3-months cycles, and the porting work will be close to the
maximum one can expect during these 3-months cycles.