Implementing "testing" (was: Re: Potato now stable)

From: Anthony Towns (aj@azure.humbug.org.au)
Date: Thu Aug 17 2000 - 14:17:30 CEST

    Hello world,

    So, on -devel-announce, I mentioned:

    > * New "testing" distribution
    > This is a (mostly finished) project that will allow us
    > to test our distribution by making it "sludgey" rather
    > than frozen: that is, a new distribution is added between
    > stable and unstable, that is regularly and automatically
    > updated with new packages from unstable when they've
    > had a little testing and no new RC bugs.
    >
    > (Anthony Towns; debian-devel)

    It's basically ready to be stuck in the archive now, as far as I can
    tell, but since it's not exactly a trivial change, it's probably time
    to discuss it a bit more.

    The basic idea, simplified immensely, is to address this problem:

    > * Testing updates to frozen is suboptimal: updates go into
    > incoming, wait there for a while, get added to frozen,
    > we discover they introduce as many release critical bugs
    > as they solve, rinse, repeat. The "wait for a while" part
    > is particularly suboptimal, but without it, it's not really
    > a freeze.

    The current way we do things is basically to build a new package, hope it
    works as advertised, and let people test it. If it doesn't work, we repeat
    as many times as necessary, or eventually just throw the package out.

    A better way to handle this, which I suspect everyone's just spontaneously
    reinvented as they read the above, is to try to keep around a previous
    version of the package that was usable. That way, if the new package
    doesn't work, we can just keep the old one rather than having to throw
    the package out entirely.

    That, essentially, is the point of the "testing" distribution: to contain
    a consistent set of the most recent "believed-to-be-reliable" packages.

    Some subheadings follow.

    Why call it testing?
    ~~~~~~~~~~~~~~~~~~~~
    One thing that the freeze is really bad at is fixing "normal" bugs. The
    point of packages in testing is not that they should be perfect or
    bug-free, just that they should be usable. There's a lot of difference
    between what we'd like to release (0 bugs, many many features) and what
    we'll accept for release (~0.005 RC bugs :), and this is really where
    beta testing should fit in.

    It also sorts nicely compared to "stable" and "unstable" :)

    What does "acceptable for release" mean?
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    For one thing, it means the packages are all consistent: if libgtk1.2.7
    is in the distribution, none of the packages should be depending on
    libgtk1.2.8. For another, it means packages shouldn't have any release
    critical bugs. It also means a package should be at the same version
    across all architectures it's present in [0]. It also means the maintainer
    of the package should be relatively happy with it.
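
    To make "consistent" a bit more concrete, here's a toy sketch (in Python;
    much simpler than anything the real scripts do, since it ignores version
    constraints, alternatives and virtual packages) of what it means for every
    dependency to be satisfiable within the distribution:

        # Toy consistency check: a distribution is consistent if every
        # dependency of every package is satisfied by some package that's
        # actually in the distribution.
        # `dist` maps package name -> {"version": ..., "depends": [names]}.
        def uninstallable(dist):
            broken = set()
            for name, info in dist.items():
                if any(dep not in dist for dep in info["depends"]):
                    broken.add(name)
            return broken

        # Made-up example data, just to show the shape of things:
        woody = {
            "libfoo": {"version": "1.2.7-1", "depends": []},
            "bar": {"version": "0.9-3", "depends": ["libfoo"]},
        }
        assert not uninstallable(woody)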

    It means the package shouldn't have any release critical bugs: that is,
    no security holes [1] (critical or grave), the package shouldn't crash
    your system (critical), it should be usable for someone on the planet at
    least (grave), and it shouldn't violate policy too severely, by having,
    e.g., incorrect dependencies or no copyright [2] (important).

    Note that what I'm writing here is what I think's best, and what's
    implemented. If there's an objectively better way of doing things, well,
    that's why I'm posting. [3]

    Okay. So the next question you're probably asking yourselves is "how does
    it work". Well, you don't have to ask yourself, you can ask me. Here's a
    summary.

    Archive Layout
    ~~~~~~~~~~~~~~
    As package pools aren't close to being rolled out, I'm opting for as
    minor a change as possible (which isn't really very minor). So instead
    of two distributions, stable and unstable, we have three distributions,
    stable, testing and unstable. As usual, packages get uploaded via dinstall
    to unstable, broken and buggy however they might be. Eventually, by some
    automated process yet to be described, they get added to the
    testing distribution. After some amount of time testing gets frozen,
    fixed, and released (the theory being that this will be easier than
    freezing unstable, fixing it, and releasing).

    So basically we'd have:

            unstable -- bleeding edge, broken, etc
            testing -- leading edge, maybe buggy, but working
            stable -- static, usable, going out of date
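
    (Assuming testing shows up under dists/ alongside the other two, a user
    who wanted to track it would just need a sources.list line something like
    the following, with their favourite mirror in place of the made-up
    hostname:

        deb http://some.mirror.example/debian testing main

    and apt would take care of the rest.)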

    Automated Process?
    ~~~~~~~~~~~~~~~~~~
    So pretty much all the policy is encoded in some "automated process"
    which updates testing. It works at the moment, basically as follows:

            1. First, it loads up all the Sources and Packages files in
               testing and unstable.
            2. It compares and contrasts them, working out what source
               packages are new in unstable.
            3. For each of these new source packages it checks:
                    a. That the package has had two weeks of testing,
                       or it's a medium or high urgency package (and has
                       had one week or three days of testing, respectively).
                    b. That each binary has been recompiled for each arch
                       it's on.
                    c. That each binary has 0 RC bugs, or fewer than the
                       testing version does [4].
            4. It then collects the source packages that pass 3, and
               tries installing them in various combinations to see if the
               number of uninstallable packages in "testing" either drops
               or remains the same. If so, they're in. If not, they're not.
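
    To give a feel for step 3, here's some illustrative Python (not the
    actual scripts on auric; the field names and record layout are made up
    for the example, and the per-binary details are glossed over):

        # Rough sketch of the per-source-package checks in step 3.
        # `new` is the source package's record as derived from unstable's
        # Sources/Packages files plus the BTS; `old` is the corresponding
        # record for testing, or None if the source isn't in testing yet.
        REQUIRED_DAYS = {"low": 14, "medium": 7, "high": 3}

        def is_candidate(new, old):
            # 3a. been in unstable long enough for its urgency
            if new["days_in_unstable"] < REQUIRED_DAYS[new["urgency"]]:
                return False
            # 3b. every binary rebuilt at the new version on every
            #     architecture it's currently built for
            if any(built != new["version"]
                   for built in new["built_versions"].values()):
                return False
            # 3c. no RC bugs, or strictly fewer than the testing version has
            old_rc = old["rc_bugs"] if old is not None else 0
            return new["rc_bugs"] == 0 or new["rc_bugs"] < old_rc

        # Step 4 then takes everything that passes, tries adding it to
        # testing in various combinations, and keeps only the combinations
        # that don't increase the number of uninstallable packages (much
        # like the toy consistency check sketched earlier).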

    There are a bunch of helper scripts that ensure that dists/testing
    is fully populated, either by symlinks to unstable or by the files
    themselves, and that ensure that if a file in unstable is deleted
    by dinstall, the symlink is changed to a hard link to the old file
    rather than being left dangling.

    This is being prototyped on auric, so you can see some stuff
    about it at http://auric.debian.org/~ajt/, and you can point apt at it
    too. Pointing apt at it probably isn't really too clever: it doesn't
    really have the bandwidth for users doing upgrades, or random people
    doing mirrors, and I keep changing things around fairly frequently to
    see how the scripts hold up. But you can do it.

    The actual scripts to do this are all in my home directory on auric,
    and so are probably only accessible to developers. auric:~ajt/doit.sh
    is the place to start if you want to have a look.

    Okay, so what next?

    Effects on the Release Cycle
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    So the main point of this is to create a distribution that, essentially,
    doesn't have any release critical bugs [5] and can be kept that way
    with much less effort on the part of the release manager. That should
    have a pretty profound effect with regard to speeding up the freeze,
    since it removes one of the two main bottlenecks [6].

    So, here's a rough guide as to how releases might work with a testing
    distribution with a focus on minimising time in the freeze:

            * Development time: packages are worked on, new upstream versions
              are installed. testing is kept fairly bug-free. Users can point
              apt at testing, and give feedback to the developers before the
              freeze, without having to worry about bash not working.

            * Freeze preparation: boot-floppies, CD scripts, release notes are
              updated to work with the new and updated packages.

            * Freeze: any remaining problems in testing are dealt with, either
              by adding them to the release errata, downgrading them, fixing
              them, or removing the package entirely.

    Since the remaining problems should be few and small, we should be able
    to keep the freeze very short.

    In addition, development and freeze preparation are entirely
    parallelizable. It's plausible and even desirable to simply continue
    to maintain boot-floppies, CD scripts, and release notes throughout the
    development phase. In an ideal world, testers would generally be able to
    obtain bootable CDs for testing as well as for stable.

    Even if the latter doesn't happen, though, eliminating the few bugs
    remaining in testing, plus any new bugs uncovered by boot-floppies or CD
    generation, should be a lot easier than fixing all the bugs in unstable
    as well as any new bugs uncovered by boot-floppies and CD generation.

    There's a bit more to it than that, actually, but this mail's probably
    already getting long enough.

    So that leaves...

    Transitioning
    ~~~~~~~~~~~~~
    So, here's how I see us ending up when we've *finished* the transition:

            potato/ woody/ sid/
            stable -> potato
            testing -> woody
            unstable -> sid
            
    That is, all the unreleased architectures, and all the new and broken or
    untested packages are in sid; potato's still stable; and the packages in
    woody are getting less and less buggy.

    To effect this, we would:

            * desymlink potato/binary-powerpc and potato/binary-arm (which point
              to sid presently)
            * remove sid/binary-powerpc and sid/binary-arm
            * create symlink trees in sid for each of the released architectures
              pointing at woody.
            * remove the symlinks for unreleased architectures from woody
            * point unstable at sid, and update dinstall so uploads to unstable
              go to sid
            * point the testing scripts at woody

    The testing scripts need to cope with a few things here:

            * Some .deb's in sid will symlink to woody. The .deb's in woody
              shouldn't be deleted while they're needed by sid.

            * When a .deb in woody is updated, the .deb will already be in sid
              (and will have been for two weeks). As such, there should
              simply be a symlink from woody to the actual .deb in sid to
              conserve mirror space.

            * When a .deb in sid is updated, there may be a symlink from woody
              to it. This symlink needs to be replaced with a copy of the real
              .deb since there's nothing to link to anymore.

    They do cope with this at the moment, and they're being prototyped on auric
    in /org/scratch/ajt/froody (like woody, but a little weird :).

    The way they cope with this is to keep a separate copy of the testing
    tree in /org/scratch/ajt/hidden, which, rather than having any symlinks,
    is all hard links to the actual .debs. When any of the .debs is removed,
    the hard links still remain, and can be copied (well, hard linked again,
    actually) into the visible tree.
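
    In Python-ish terms (again, just a sketch of the idea rather than the
    real scripts, with made-up paths and function names, and only handling
    a single directory rather than a whole tree), the repair step looks
    something like:

        import os

        # The "hidden" tree hard links every .deb that testing needs, so
        # when the real file disappears from sid, a dangling symlink in
        # the visible tree can be replaced from the hidden copy.
        def repair_dangling(visible_dir, hidden_dir):
            for name in os.listdir(visible_dir):
                path = os.path.join(visible_dir, name)
                if os.path.islink(path) and not os.path.exists(path):
                    # the symlink's target (the .deb in sid) is gone;
                    # swap the symlink for a hard link to the hidden copy
                    os.remove(path)
                    os.link(os.path.join(hidden_dir, name), path)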

    So there you have it.

    It's coded. It works. It serves a useful purpose. I think we should
    use it.

    Cheers,
    aj

    [0] As opposed to a package being present in all architectures. That is,
        I think it's only appropriate to consider "foo doesn't build on the
        bar architecture" a release critical bug if it's already been built
        there before. And I also think it's appropriate for that bug to be
        downgradable if foo is simply removed from binary-bar.

    [1] That compromise root, or user's data. What about denial-of-service
        bugs? They don't actually fit into the existing severity levels,
        that I can see.

    [2] An explicit enumeration of what "too severely" means should appear in
        the next policy update, hopefully.

    [3] Here's hoping it won't degenerate like the "Intent to split" mail did.
        Yeesh.

    [4] The number of RC bugs against the testing version is assumed to be the
        number of RC bugs against the package when that version was the latest
        in unstable. If it's wrong, it's probably an underestimate so requiring
        fewer RC bugs in the new package isn't likely to introduce too many new
        errors.

    [5] What release critical bugs will it have? Obviously, it'll still have
        any security problems that get discovered, but presumably they'll be
        fixed within a day or two. There'll be bugs that have existed for a
        long time, but that no one's noticed until recently: things like the
        strange copyright and Depends: of dvidvi.

        One source of significant numbers of RC bugs in testing might be
        -policy changes: requiring Build-Depends:, or moving /usr/doc to
        /usr/share/doc, or requiring packages to be built with libc6 can
        declare huge numbers of packages buggy, and take a while to fix.

        The other source of bugs that could be problematic is problems like
        the bugs against net-tools and nscd: ones that are obviously critical,
        but aren't reproducible or diagnosed well enough to be fixed.

    [6] The other being getting boot-floppies to a point where they can be
        released.

    -- 
    Anthony Towns <aj@humbug.org.au> <http://azure.humbug.org.au/~aj/>
    I don't speak for anyone save myself. GPG signed mail preferred.
    

    ``We reject: kings, presidents, and voting. We believe in: rough consensus and working code.'' -- Dave Clark

