|
On Mon, Jun 18, 2012 at 11:49:46AM +0200, Rick Spencer wrote:
> On Mon, Jun 18, 2012 at 7:02 AM, Martin Pitt <[hidden email]> wrote:
> > Sebastien Bacher [2012-06-15 17:26 +0200]:
> >> Can we just drop the image rolling part of milestones?
> >
> > Our automated tests are still waaaay to incomplete for this step. In
> > manual testing we have found quite a number of real deal-breaker bugs
> > which the automatic tests didn't pick up. We also need to test the
> > current images on a wider range of real iron; which is something our
> > automated QA could do one day, but doesn't right now.
> >
> > So regular manual testing rounds are still required, and the points
> > when we do them might just as well be called "milestones".
>
> But if the focus is testing, we should optimize the schedule around
> testing. For example, I think Ubuntu would benefit from more frequent
> "rounds" of such in depth testing than the current alpha/beta
> milestones provide. (I think every 2 weeks would be a good cadence).

This could be very beneficial if it were more aggressively organized. We did something similar with proprietary driver testing one release a few years back. We had people "join" a team, and then had them install ISOs and run through a checklist once a week. I found it to be quite valuable, but you had to be very organized for it to be useful.

So this wasn't just an "install the image and file bugs" exercise, but a deliberate look for serious regressions. By having each person provide a continuous series of data points we could spot anomalies much more easily. If someone is installing things exactly the same way, on the same hardware, every week, and all of a sudden one week it fails, that helps you narrow things down a lot. Equally important is seeing that a fix you roll out does indeed restore functionality across multiple testers.

The key was to be very specific in the data collection, else you can generate a lot of noise quickly.
Make a printable survey form they can fill in as they go through the checklist procedure, and a system info dumping tool that captures all the logs when they're done that might be needed for bug reports. The QA team has a tool for capturing all this data and showing it in a tabulated form so you can spot patterns and changes over time. The most important thing is that the data actually get used. This testing can take a fair bit of time, but if the testers know their efforts are helping to make things tangibly better they can really get passionate about doing it. Bryce -- ubuntu-devel mailing list [hidden email] Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/ubuntu-devel |
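The weekly data-point approach described above can be sketched roughly as follows. This is only an illustration of spotting a pass-then-fail transition for one tester on one machine; the record layout and the function name `find_regressions` are assumptions made for the example, not the QA team's actual tool.

```python
from collections import defaultdict


def find_regressions(results):
    """Find checklist items that passed one week and failed the next recorded week.

    `results` is an iterable of (tester, hardware, week, test_case, passed)
    tuples. This layout is hypothetical, chosen only for the sketch.
    Returns a sorted list of (tester, hardware, test_case, week_of_failure).
    """
    # Group runs by (tester, hardware, test_case) so each series compares
    # like with like: same person, same machine, same checklist item.
    history = defaultdict(list)
    for tester, hardware, week, test_case, passed in results:
        history[(tester, hardware, test_case)].append((week, passed))

    regressions = []
    for key, runs in history.items():
        runs.sort()  # order each series by week number
        # Walk consecutive pairs and flag every pass -> fail transition.
        for (_, prev_ok), (week, ok) in zip(runs, runs[1:]):
            if prev_ok and not ok:
                regressions.append((*key, week))
    return sorted(regressions)
```

Keeping one series per tester and machine is what makes a sudden failure stand out; mixing results from different hardware into one series would produce the kind of noise the message warns about.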
|
In reply to this post by Rick Spencer-2
On Mon, 2012-06-18 at 11:49 +0200, Rick Spencer wrote:
> On Mon, Jun 18, 2012 at 7:02 AM, Martin Pitt <[hidden email]> wrote: > > Sebastien Bacher [2012-06-15 17:26 +0200]: > >> Can we just drop the image rolling part of milestones? We still > >> probably want fixed checkpoints in the cycle to review the features, > >> etc but they don't especially need to be linked with a special > >> image... > > > > Our automated tests are still waaaay to incomplete for this step. In > > manual testing we have found quite a number of real deal-breaker bugs > > which the automatic tests didn't pick up. We also need to test the > > current images on a wider range of real iron; which is something our > > automated QA could do one day, but doesn't right now. > > > > So regular manual testing rounds are still required, and the points > > when we do them might just as well be called "milestones". > > But if the focus is testing, we should optimize the schedule around > testing. For example, I think Ubuntu would benefit from more frequent > "rounds" of such in depth testing than the current alpha/beta > milestones provide. (I think every 2 weeks would be a good cadence). https://wiki.ubuntu.com/QuantalQuetzal/ReleaseInterlock Between the 12.04.1 and Quantal Milestones, the QA Testing and QA Community testing have a pretty full load already. (see columns) What was decided to try with Quantal was to do a more intense round of manual testing on the dailies, the week before the milestone, so that the bugs found could be fixed by development, and still give the developers a good window of focused transitions and feature development time. This possibly could be adjusted to a round of testing 2 weeks prior, but would have to be juggled in with the testing team's other commitments? We're releasing Beta 1 on 9/6, Beta 2 on 9/27 and Final on 10/18 - each 3 weeks apart, so not as much room there. Not sure how many of the other Ubuntu flavors (Kubuntu, Xubuntu, etc.) 
that come out with the alpha milestones would want to participate in a more frequent testing schedule though. They already skip some of the milestones, based on which of their packages are landing and resources are available to do the manual testing, but do have an implied dependency on Ubuntu alpha/betas being available. For Alpha1, we did 2 respin sets after the first set was built, based on what the manual testing was finding and trying to get a set of ARM desktop images. (Note: We did not have quantal arm desktop images until the week of alpha 1, and then didn't have them again with the dailies between 6/10-6/14). Having milestones does force a focus on the full set of images. Daily images and the automated testing are still mostly focusing on unit tests for the x86 desktop and server images in virtualized hardware, and as Martin says, the manual testing is still finding issues on the real hardware that are causing respins. Kate -- ubuntu-devel mailing list [hidden email] Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/ubuntu-devel |
|
On Monday, June 18, 2012 05:15:52 PM Kate Stewart wrote:
> On Mon, 2012-06-18 at 11:49 +0200, Rick Spencer wrote: > > On Mon, Jun 18, 2012 at 7:02 AM, Martin Pitt <[hidden email]> wrote: > > > Sebastien Bacher [2012-06-15 17:26 +0200]: > > >> Can we just drop the image rolling part of milestones? We still > > >> probably want fixed checkpoints in the cycle to review the features, > > >> etc but they don't especially need to be linked with a special > > >> image... > > > > > > Our automated tests are still waaaay to incomplete for this step. In > > > manual testing we have found quite a number of real deal-breaker bugs > > > which the automatic tests didn't pick up. We also need to test the > > > current images on a wider range of real iron; which is something our > > > automated QA could do one day, but doesn't right now. > > > > > > So regular manual testing rounds are still required, and the points > > > when we do them might just as well be called "milestones". > > > > But if the focus is testing, we should optimize the schedule around > > testing. For example, I think Ubuntu would benefit from more frequent > > "rounds" of such in depth testing than the current alpha/beta > > milestones provide. (I think every 2 weeks would be a good cadence). > > https://wiki.ubuntu.com/QuantalQuetzal/ReleaseInterlock > Between the 12.04.1 and Quantal Milestones, the QA Testing and QA > Community testing have a pretty full load already. (see columns) > What was decided to try with Quantal was to do a more intense round > of manual testing on the dailies, the week before the milestone, > so that the bugs found could be fixed by development, and still give the > developers a good window of focused transitions and feature development > time. This possibly could be adjusted to a round of testing 2 weeks > prior, but would have to be juggled in with the testing team's other > commitments? We're releasing Beta 1 on 9/6, Beta 2 on 9/27 and Final > on 10/18 - each 3 weeks apart, so not as much room there. 
> > Not sure how many of the other Ubuntu flavors (Kubuntu, Xubuntu, etc.) > that come out with the alpha milestones would want to participate in a > more frequent testing schedule though. They already skip some of the > milestones, based on which of their packages are landing and resources > are available to do the manual testing, but do have an implied > dependency on Ubuntu alpha/betas being available. > > For Alpha1, we did 2 respin sets after the first set was built, > based on what the manual testing was finding and trying to get > a set of ARM desktop images. (Note: We did not have quantal arm > desktop images until the week of alpha 1, and then didn't have them > again with the dailies between 6/10-6/14). Having milestones does force > a focus on the full set of images. Daily images and the automated > testing are still mostly focusing on unit tests for the x86 desktop and > server images in virtualized hardware, and as Martin says, the manual > testing is still finding issues on the real hardware that are causing > respins. I think it's also worth mentioning that all this fancy automatic testing that's supposed to replace milestone testing is only fully planned for Canonical sponsored flavors. For the rest of us we still have to do manual testing. I, for one, only do install testing on the milestones (because of the pressure to get a release tested and out the door). The rest of us are going to have to do this and it doesn't happen in one day. I've also been considering how things are supposed to work with incremental uploads if the development release is always supposed to work. For Kubuntu, we upload (if we have resources) each KDE SC beta, RC, final, and bug fix update. The beta releases have bugs. It seems a bit counter to always wanting a working development release to upload beta releases to it, but that's how the software gets used and bugs detected and reported so they can be fixed. 
Scott K
|
In reply to this post by Kate Stewart
On Mon, Jun 18, 2012 at 3:15 PM, Kate Stewart
<[hidden email]> wrote: > Between the 12.04.1 and Quantal Milestones, the QA Testing and QA > Community testing have a pretty full load already. (see columns) > What was decided to try with Quantal was to do a more intense round > of manual testing on the dailies, the week before the milestone, > so that the bugs found could be fixed by development, and still give the > developers a good window of focused transitions and feature development > time. This possibly could be adjusted to a round of testing 2 weeks > prior, but would have to be juggled in with the testing team's other > commitments? We're releasing Beta 1 on 9/6, Beta 2 on 9/27 and Final > on 10/18 - each 3 weeks apart, so not as much room there. I think the gist of Rick's proposal makes sense; the goal here is to assure quality, and the "assurance" side of QA has become a core part of how Nick Skaggs on my team is approaching his work in the 12.10 cycle (such as assuring that mandatory ISO tests get run before we release an Alpha/Beta). As one voice in this discussion I would be more than supportive of Nick not focusing on milestones but instead a regular cadence of testing (e.g. every two weeks) to support our quality efforts. Jono -- Jono Bacon Ubuntu Community Manager www.ubuntu.com / www.jonobacon.org www.identi.ca/jonobacon www.twitter.com/jonobacon -- ubuntu-devel mailing list [hidden email] Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/ubuntu-devel |
|
In reply to this post by Kate Stewart
On Tue, Jun 19, 2012 at 12:15 AM, Kate Stewart
<[hidden email]> wrote: > > For Alpha1, we did 2 respin sets after the first set was built, > based on what the manual testing was finding and trying to get > a set of ARM desktop images. (Note: We did not have quantal arm > desktop images until the week of alpha 1, and then didn't have them > again with the dailies between 6/10-6/14). Having milestones does force > a focus on the full set of images. Daily images and the automated > testing are still mostly focusing on unit tests for the x86 desktop and > server images in virtualized hardware, and as Martin says, the manual > testing is still finding issues on the real hardware that are causing > respins. I believe there is widespread agreement on this thread that manual testing is good and necessary. I also think there is agreement that a faster cadence of complete manual testing than is accommodated by our current milestones would be desirable. I think it's fair to say that we can move ahead with increasing the frequency of manual testing with or without changes to our milestones. I will look to the Ubuntu Community team to begin with this, as they don't believe they are blocked by any other decisions to be made. I think the question on the table is, shall we drop most milestones altogether, or adopt a system such as Thierry suggests, where we use the most recent "good" daily as the milestone image? There is another question about what is the correct frequency of previous images to archive for purposes of "bisecting". This is also not directly tied to our milestone process. I suspect that a weekly snapshot would be quite useful, but we would have to determine how and where to store those snapshots, and for which images. Cheers, Rick -- ubuntu-devel mailing list [hidden email] Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/ubuntu-devel |
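To make the "bisecting" use of archived images concrete, here is a minimal sketch. It assumes the snapshots are listed oldest to newest, that the oldest is known good and the newest is known bad, and that some manual or automated check can classify an image. The names `first_bad_snapshot` and `is_good` are hypothetical.

```python
def first_bad_snapshot(snapshots, is_good):
    """Binary-search an ordered list of archived images for the first bad one.

    Assumes snapshots[0] is known good and snapshots[-1] is known bad.
    `is_good` is a caller-supplied check (for example, "does the installer
    complete on this image?") and is purely illustrative.
    """
    lo, hi = 0, len(snapshots) - 1  # lo: last known good, hi: first known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(snapshots[mid]):
            lo = mid
        else:
            hi = mid
    return snapshots[hi]
```

With weekly snapshots this narrows a regression to a one-week window in roughly log2(n) image tests; how finely it can be narrowed after that depends on which dailies are still kept.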
|
In reply to this post by Kate Stewart
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1 On Mon, Jun 18, 2012 at 11:33 PM, Rick Spencer <[hidden email]> wrote: > On Tue, Jun 19, 2012 at 12:15 AM, Kate Stewart > <[hidden email]> wrote: >> >> For Alpha1, we did 2 respin sets after the first set was built, >> based on what the manual testing was finding and trying to get a >> set of ARM desktop images. (Note: We did not have quantal arm >> desktop images until the week of alpha 1, and then didn't have >> them again with the dailies between 6/10-6/14). Having >> milestones does force >> a focus on the full set of images. Daily images and the >> automated testing are still mostly focusing on unit tests for the >> x86 desktop and server images in virtualized hardware, and as >> Martin says, the manual testing is still finding issues on the >> real hardware that are causing respins. > > I believe there is widespread agreement on this thread that manual > testing is good and necessary. I also think there is agreement that > a faster cadence of complete manual testing than is accommodated by > our current milestones would be desirable. I think it's fair to > say that we can move ahead with increasing the frequency of manual > testing with or without changes to our milestones. I will look to > the Ubuntu Community team to begin with this, as they don't believe > they are blocked by any other decisions to be made. > > I think the question on the table is, shall we drop most > milestones altogether, or adopt a system such as Thierry suggests, > where we use the most recent "good" daily as the milestone image? > I have serious concerns with removing the milestones. As it stands, several images, including the vast majority of the ARM images, only get extensively tested at milestones due to the limited userbase of the image (specifically, highbank and armadaxp as of right now is limited to a handful of individuals in the world at the moment). Many critical issues with ARM (and to a lesser extent x86) have only been found during milestone testing. 
Without a set of defined and organized images for testing, more obscure parts of the installer simply do not get tested; for instance, how many people are going to test all possible server configurations or test the installer with no network?

These scenarios are not common for development, but can and do occur regularly for many users who install Ubuntu for the first time. During 12.04 development, during milestone testing, three bugs* relating to both use cases were found to cause the installer to silently fail midway through installation, leaving the user with only a partially configured system.

Each milestone represents an opportunity for end-users and QA to test our images in something more resembling a production environment, and to test use-cases and recipes that may normally not see a lot of coverage unless one is explicitly checking for edge cases.

Milestones exist to give the Ubuntu developer community a chance to step back, check to make sure nothing important has broken, and gauge our progress through a cycle. In addition, they provide a dedicated time where as a community we step forth and check our images to ensure no regressions have slipped by.

If we remove the milestones, the only period of extensive testing and review of the images will be just before release. As such, any regressions that are found would require a scrambled fix during a period when minimal archive changes are desired. That would be both costly in terms of development effort and risky, as each final freeze upload always carries the inherent chance of hosing something important.

Unless the final intent is to ultimately abolish releases altogether and move to a rolling-release model, I don't believe we, as a community, could successfully ship Ubuntu with its excellent state of quality assurance without the cycles of alpha and beta images.
* - Relevant bug reports: https://bugs.launchpad.net/ubuntu/+source/livecd-rootfs/+bug/985737 https://bugs.launchpad.net/livecd-rootfs/+bug/985258 https://bugs.launchpad.net/ubuntu/+source/livecd-rootfs/+bug/985280 Michael -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQIcBAEBAgAGBQJP4L+RAAoJEHM+GkLSJHY507UP/3G1IOSwI3zotWDi4Kr4mbkg PtlL2ml7gig4wv7c46ZcWHH9xYj0hbFPxBJDWQrzTVVRQCrKEOscWmV/AAO7kLtO CdBg9ifI15SWwuvKXvuWPqQLhSe0IcVnZ+BSb7eVq+iouOY1Vw/vPdFMeBxEZc1P +hMa//9eggQKQalnjX6O57Be2TTZt3+o8k8lQ9GFjnxrezpw6XGT2IpiXh/l0L6D +eQRgzf7HhDqYDeqsaLPDJSS4fVdtvhFS3C5G1aJW8t4YI5WqccgIRuYwQQ9QwFd XSzp77nYw5dlnufQsQYS5Hm6KQaEaxfvLkUNaY5dTC1vtvoeZQQhpo/CGnji5bqK 6Bl5/CzYg15O7H8xzcnDYWvbnc8f0cQzwfzIFQFKkuWftdM3D3Oim8MTU6dumXUW qQssOqkTmxHpljI8m7TuybcjPR+/JnKXHVkNt8xT/hs89DL4XN4AZVU9wLjC1Cl3 shxhPVNI7qLlM0H267f2RBhx5MeEa9Lrz9loaGsrXFgtAxO5lqG548UaAWo90zrx fHmETYZhGQIUVNk1llyJ6xjpvr1SSOAp3FHYg0G43mQmH09EUxM/S+Gr4UFlZ7sw JOJObzMNfQ8akLt0mJUJpazWbpriXi24053M+M0QtmdIflcIIPbs8DLmaLSLZ0YB p7l1IO0P1ztRnHUyaS3s =QrQF -----END PGP SIGNATURE----- -- ubuntu-devel mailing list [hidden email] Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/ubuntu-devel |
|
On Tuesday, June 19, 2012 11:06:14 AM Michael Casadevall wrote:
> -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On Mon, Jun 18, 2012 at 11:33 PM, Rick Spencer > > <[hidden email]> wrote: > > On Tue, Jun 19, 2012 at 12:15 AM, Kate Stewart > > > > <[hidden email]> wrote: > >> For Alpha1, we did 2 respin sets after the first set was built, > >> based on what the manual testing was finding and trying to get a > >> set of ARM desktop images. (Note: We did not have quantal arm > >> desktop images until the week of alpha 1, and then didn't have > >> them again with the dailies between 6/10-6/14). Having > >> milestones does > > force > > >> a focus on the full set of images. Daily images and the > >> automated testing are still mostly focusing on unit tests for the > >> x86 desktop and server images in virtualized hardware, and as > >> Martin says, the manual testing is still finding issues on the > >> real hardware that are causing respins. > > > > I believe there is widespread agreement on this thread that manual > > testing is good and necessary. I also think there is agreement that > > a faster cadence of complete manual testing than is accommodated by > > our current milestones would be desirable. I think it's fair to > > say that we can move ahead with increasing the frequency of manual > > testing with or without changes to our milestones. I will look to > > the Ubuntu Community team to begin with this, as they don't believe > > they are blocked by any other decisions to be made. > > > > I think the question on the table is, shall we drop most > > milestones altogether, or adopt a system such as Thierry suggests, > > where we use the most recent "good" daily as the milestone image? > > I have serious concerns with removing the milestones. As it > stands, several images, including the vast majority of the ARM images, > only get extensively tested at milestones due to the limited userbase > of the image (specifically, highbank and armadaxp as of right now is > limited to a handful of individuals in the world at the moment). 
> > Many critical issues with ARM (and to a lesser extent x86) > have only been found during milestone testing. Without a set of > defined and organized images for testing, more obscure parts of the > installer simply do not get tested; for instance, how many people are > going to test all possible server configurations or test the installer > with no network. > > These scenarios are not common for development, but can and do occur > regularly for many users who install Ubuntu for the first time. During > 12.04 development, during milestone testing, three bugs* relating to > both usecases were found to cause the installer to silently fail > midway through installation leaving the user with only a partially > configured system. > > Each milestone represents an opportunity for end-users and QA to test > our images in something more resembling a production environment, and > to test use-cases and recipes that may normally not see a lot of > coverage unless one is explicatively checking for edge cases. > > Milestones exist to give the Ubuntu developer community to step back, > and check to make sure nothing important has broken, and to gauge our > progress through a cycle. In addition, they provide a dedicated time > where as a community we step forth and check our images to ensure no > regressions have slipped by. > > If we remove the milestones, the only period of extensively and review > the images will be just before release. As such, any regressions that > are found would require a scrambled fix during a period that minimal > archive changes are desired and would both be costly in terms of > development effort, and risky as each final freeze upload always > carries the inherent chance of hosing something important. 
> > Unless the final intent is to ultimately abolish releases all > together and move to a rolling-release model, I don't believe we, > as a community, could successfully ship Ubuntu with its excellent > state of quality assurance without the cycles of alpha and beta images. > > * - Relevant bug reports: > https://bugs.launchpad.net/ubuntu/+source/livecd-rootfs/+bug/985737 > https://bugs.launchpad.net/livecd-rootfs/+bug/985258 > https://bugs.launchpad.net/ubuntu/+source/livecd-rootfs/+bug/985280 +1 I have to confess that when I threw out the idea of just abolishing the milestones way back in this thread I thought it was a sufficiently ridiculous idea that it would give people pause about dropping the freezes. People worried about velocity through a freeze can publish stuff in a PPA and ask them to test it during a freeze. I think this entire notion is going to add significant risk to the development cycle. Michael is right on target. Without a dedicated focus on human testing of various components things are going to be missed until the end game when broad user testing starts. Scott K -- ubuntu-devel mailing list [hidden email] Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/ubuntu-devel |
|
Scott Kitterman [2012-06-19 22:16 -0400]:
> +1

+1 as well, thanks Michael and Scott.

Also, saying "we'll drop milestones and introduce a regular schedule for testing some particular dailies to fix bugs not caught by automatic tests" means nothing more than a simple renaming -- because that's exactly what milestones are today. So I also think that proved nicely what their purpose is and that we cannot do without them.

We can certainly discuss having more or fewer of them, if the current distance between them is too high.

Martin
--
Martin Pitt | http://www.piware.de
Ubuntu Developer (www.ubuntu.com) | Debian Developer (www.debian.org)
|
In reply to this post by Michael Casadevall-5
On Tue, Jun 19, 2012 at 11:06:14AM -0700, Michael Casadevall wrote:
> > Milestones exist to give the Ubuntu developer community to step back, > and check to make sure nothing important has broken, and to gauge our > progress through a cycle. In addition, they provide a dedicated time > where as a community we step forth and check our images to ensure no > regressions have slipped by. I don't think anyone is arguing that we should do less manual testing. In fact, I think that milestones create a culture of less testing, in the sense that people ONLY test during milestones. What's even worse is that many people work from milestones as if they're somehow blessed images that are better than the daily produced two weeks later (we even have teams that take snapshots of the archive to avoid developing against a moving target which, again, means they're testing against a milestone, not the real deal). I think any discussion of "dropping milestones" can only come paired with a conversation about better continuous testing practices. If people have the time to test, they should test what's current. If people don't have the time to test, milestones don't magically create time, in fact, they drain time from people who have to make them go. Maybe the above math seems simpler to me than it is to others, I'm not sure, but if the argument is that we don't have the manpower to test more, then I don't see how halting several people in their tracks every once in a while creates manpower. It doesn't. It can't. Now, one could argue instead that every team should take a few hours a week to test a daily image, and I'd be all for that. Of course, if we pick different teams per day, all the best bugs will get filed on Wednesday by the kernel team, because they're clearly the best non-QA image testers we've got. :P Seriously, though. There's no way that a process that takes time can create time. It's simple physics. I understand the fear that dropping milestones could lead to less testing. 
And, in a culture where we only test milestones, that's obviously true. We need a cultural shift.

The only reason milestones sometimes still look like a "mess" and need time to settle is because we don't test enough between them. If we tested enough between them, we wouldn't need milestones. We've put serious (and I'd say quite successful) effort into making the archive installable and generally sane all cycle with the +1 maint effort, and I think it's time we extended that culture to a +1 QA culture, where testing is consistent, weird bugs are filed every day, and we don't have these 4-day long panic periods where WE HAVE TO FIX EVERY WEIRD BUG RIGHT NAO 'cause no one bothered to install from a daily for the last month.

Maybe I'm way off base. Maybe I'm a nutter. But I think I see where Rick's coming from here, and it's not about "muahaha, let's eliminate testing by eliminating milestones", it's "hey, maybe we could be testing so consistently that milestones become an obvious waste of time".

... Adam

PS: I leave you with a quote from IRC from the last milestone week: "most of the bugs we know about, while not good, we can live with for an A1, and they'll get fixed on the next daily."
|
On Wednesday, June 20, 2012 04:57:19 AM Adam Conrad wrote:
> On Tue, Jun 19, 2012 at 11:06:14AM -0700, Michael Casadevall wrote: > > Milestones exist to give the Ubuntu developer community to step back, > > and check to make sure nothing important has broken, and to gauge our > > progress through a cycle. In addition, they provide a dedicated time > > where as a community we step forth and check our images to ensure no > > regressions have slipped by. > > I don't think anyone is arguing that we should do less manual testing. > In fact, I think that milestones create a culture of less testing, in > the sense that people ONLY test during milestones. What's even worse > is that many people work from milestones as if they're somehow blessed > images that are better than the daily produced two weeks later (we > even have teams that take snapshots of the archive to avoid developing > against a moving target which, again, means they're testing against a > milestone, not the real deal). I do ISO/upgrade testing at milestones because that's the priority for that period in the development cycle. If there wasn't a milestone release and the related social pressures associated with it, I'd never do it (It's a PITA and takes time from other non-Ubuntu things I can't really spare on a regular basis). I agree with your point about people having faith in the ISOs for too long, OTOH, at least it's a stable point to install from. The later images may be better, but they may not. I think what we mostly need is more people running the development release day to day so that actual application bugs get detected, upstreamed, and fixed earlier in the cycle. From that POV it doesn't matter if one installs from a daily or from an earlier milestone image and immediately does a massive dist-upgrade. > I think any discussion of "dropping milestones" can only come paired > with a conversation about better continuous testing practices. If > people have the time to test, they should test what's current. 
> If people don't have the time to test, milestones don't magically create
> time, in fact, they drain time from people who have to make them go.

In my experience it does create time because it makes it a priority.

> Maybe the above math seems simpler to me than it is to others, I'm
> not sure, but if the argument is that we don't have the manpower to
> test more, then I don't see how halting several people in their tracks
> every once in a while creates manpower. It doesn't. It can't. Now,
> one could argue instead that every team should take a few hours a week
> to test a daily image, and I'd be all for that. Of course, if we
> pick different teams per day, all the best bugs will get filed on
> Wednesday by the kernel team, because they're clearly the best non-QA
> image testers we've got. :P

I don't understand how the milestone freezes halt people in their tracks. There are a few people who are heavily involved in archive maintenance, but for most people I don't think landing things in the official archive a few days either way affects much.

> Seriously, though. There's no way that a process that takes time can
> create time. It's simple physics. I understand the fear that dropping
> milestones could lead to less testing. And, in a culture where we
> only test milestones, that's obviously true. We need a cultural shift.

It's not creating time, it's shifting priorities.

> The only reason milestones sometimes still look like a "mess" and need
> time to settle is because we don't test enough between them. If we
> tested enough between them, we wouldn't need milestones.
We've put > serious (and I'd say quite successful) effort into making the archive > installable and generally sane all cycle with the +1 maint effort, and > I think it's time we extended that culture to a +1 QA culture, where > testing is consistent, weird bugs are filed every day, and we don't > have these 4-day long panic periods where WE HAVE TO FIX EVERY WEIRD > BUG RIGHT NAO 'cause no one bothered to install from a daily for the > last month. > > Maybe I'm way off base. Maybe I'm a nutter. But I think I see where > Rick's coming from here, and it's not about "muahaha, let's elminate > testing by eliminating milestones", it's "hey, maybe we could be testing > so consistently that milestones become an obvious waste of time". > > ... Adam > > PS: I leave you with a quote from IRC from the last milestone week: > "most of the bugs we know about, while not good, we can live with > for an A1, and they'll get fixed on the next daily." Sure, but how did we find out about those bugs? It was generally because we had the milestone and people tested. I think that at some point we may discover that our daily testing is sufficient that we don't need the pauses to do the milestones. I think the way to do that is improve the daily testing and have multiple milestones with no surprises because we demonstrate the daily testing is adequate. Any discussion of dropping things now is putting the cart way in front of the horse. Scott K -- ubuntu-devel mailing list [hidden email] Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/ubuntu-devel |
|
In reply to this post by Michael Casadevall-5
On Tue, Jun 19, 2012 at 8:06 PM, Michael Casadevall
<[hidden email]> wrote:
> Many critical issues with ARM (and to a lesser extent x86)
> have only been found during milestone testing. Without a set of
> defined and organized images for testing, more obscure parts of the
> installer simply do not get tested; for instance, how many people are
> going to test all possible server configurations or test the installer
> with no network.

But, again, why is the set of "defined organized testing" tied to
milestones? That seems much too infrequent. Dependence on milestones is
creating a lack of quality in this area, not improving it. We should not
be allowing days to go by without a usable image.

And, again, this is completely orthogonal to whether we freeze the
archive for milestones, or have milestones at all. In other words,
milestones seem a poor means to accomplish your goals here. We should
organize a more rigorous and frequent cadence of testing for ARM images.
|
In reply to this post by Martin Pitt-4
On Wed, Jun 20, 2012 at 5:56 AM, Martin Pitt <[hidden email]> wrote:
> Scott Kitterman [2012-06-19 22:16 -0400]:
>> +1
>
> +1 as well, thanks Michael and Scott.
>
> Also, saying "we'll drop milestones and introduce a regular schedule
> for testing some particular dailies to fix bugs not caught by
> automatic tests" means nothing more than a simple renaming -- because
> that's exactly what milestones are today. So I also think that proved
> nicely what their purpose is and that we cannot do without them.

of milestones. If we need more frequent and rigorous manual testing, we
should just do it.

Cheers, Rick
|
I think this was a very productive discussion. We considered a lot of
possibilities from a lot of angles.

All told, I think there are four points under discussion. I'd like to
tease them out so we can move forward.

Question 1: shall we stop freezing the archive at milestones?
I believe there is not 100% consensus on this point, but enough
support to try it for Alpha 2, a la Thierry's suggestion.
QA Team/Foundations Team, do we/will we have the tools in place for Alpha 2?

Question 2: shall we stop having milestones altogether?
This question arose in the thread. I don't believe there is consensus
for doing this suddenly in 12.10.

Question 3: shall we increase the rate of manual testing?
This question also arose in the thread. I think there is widespread
consensus that we should do this, and it is not actually related to
the other questions.
Community Team, is it feasible to increase the rate of full manual
testing runs to every 2 weeks or similar?

Question 4: shall we keep snapshots of the development release so that
we can "bisect" more easily and find when bugs were introduced?
This question also arose, and also is not tied to the other questions.
QA Team, is it feasible to keep a set of snapshots somewhere for this
purpose?

Cheers, Rick
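To make the "bisect" idea in Question 4 concrete, here is a minimal sketch in Python. It assumes a set of retained snapshots ordered oldest to newest, with the oldest known good and the newest known bad. The image identifiers and the `checklist_fails` function are hypothetical stand-ins for "install this snapshot and run the manual checklist"; nothing here reflects actual cdimage tooling.

```python
from typing import Callable, Sequence


def first_bad_snapshot(snapshots: Sequence[str],
                       is_bad: Callable[[str], bool]) -> str:
    """Return the earliest snapshot for which is_bad() is true.

    Assumes snapshots are ordered oldest to newest, the first one is
    known good, the last one is known bad, and the bug does not come
    and go between snapshots.
    """
    lo, hi = 0, len(snapshots) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(snapshots[mid]):
            hi = mid  # bug already present here, look earlier
        else:
            lo = mid  # still good here, look later
    return snapshots[hi]


# Hypothetical snapshot identifiers, oldest first.
dailies = ["20120601", "20120605", "20120608",
           "20120612", "20120615", "20120619"]

tested = []  # records which snapshots actually had to be installed


def checklist_fails(image: str) -> bool:
    # Stand-in for a manual test run; we pretend the bug first
    # appeared in the 20120612 snapshot.
    tested.append(image)
    return image >= "20120612"


culprit = first_bad_snapshot(dailies, checklist_fails)
print("first bad snapshot:", culprit, "after testing", tested)
```

The point of the sketch is the cost: the number of installs grows with the logarithm of the number of snapshots kept, so retaining more snapshots narrows the window a bug can hide in without adding much testing work.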
|
On Wed, Jun 20, 2012 at 3:07 AM, Rick Spencer
<[hidden email]> wrote:
> I think this was a very productive discussion. We considered a lot of
> possibilities from a lot of angles.
>
> All told, I think there are four points under discussion. I'd like to
> tease them out so we can move forward.
>
> Question 1: shall we stop freezing the archive at milestones?
> I believe there is not 100% consensus on this point, but enough
> support to try it for Alpha 2, a la Theirry's suggestion.
> QA Team/Foundations Team, do we/will we have the tools in place for Alpha 2?
>
> Question 2: shall we stop having milestones altogether?
> This question arose in thread. I don't believe there is consensus for
> doing this suddenly in 12.10.
>
> Question 3: shall we increase the rate of manual testing?
> This question also arose in the thread. I think there is widespread
> consensus that we should do this, and it is not actually related to
> the other questions.
> Community Team, is it feasible to increase the rate of full manual
> testing runs to every 2 weeks or similar?
>
> Question 4: shall we keep snapshots of the development release so that
> we can "bisect" more easily and find when bugs were introduced?
> This question also arose, and also is not tied to the other questions.
> QA Team, is it feasible to keep a set of snap shots somewhere for this purpose?

Yes. I spoke to Canonical IS yesterday and we have the space to keep 1
or 2 of the testing snapshots around. There will be some work on the
cdimage tooling to allow them to stay around and I'll coordinate with
the proper people to make that happen.

Thanks
~pete

--
Pete Graner - Release Engineering & QA Team Manager - <[hidden email]>
Canonical Ltd. - http://www.canonical.com/
|
In reply to this post by Rick Spencer-2
On Wed, Jun 20, 2012 at 12:07 AM, Rick Spencer
<[hidden email]> wrote:
> Question 3: shall we increase the rate of manual testing?
> This question also arose in the thread. I think there is widespread
> consensus that we should do this, and it is not actually related to
> the other questions.
> Community Team, is it feasible to increase the rate of full manual
> testing runs to every 2 weeks or similar?

I talked with Nick Skaggs this week and we are happy to commit to
manual testing every two weeks, starting a week on Thursday.

Originally I required that Nick *assure* testing of all mandatory
tests for each milestone, and I am asking him to provide the same
assurances every two weeks.

Jono

--
Jono Bacon
Ubuntu Community Manager
www.ubuntu.com / www.jonobacon.org
www.identi.ca/jonobacon www.twitter.com/jonobacon
|
In reply to this post by Rick Spencer-2
On 06/19/2012 11:20 PM, Rick Spencer wrote:
> On Tue, Jun 19, 2012 at 8:06 PM, Michael Casadevall
> <[hidden email]> wrote:
>> Many critical issues with ARM (and to a lesser extent x86) have
>> only been found during milestone testing. Without a set of
>> defined and organized images for testing, more obscure parts of
>> the installer simply do not get tested; for instance, how many
>> people are going to test all possible server configurations or
>> test the installer with no network.
> But, again, why is the set of "defined organized testing" for
> milestones? That seems much too infrequent. Dependence on
> milestones is creating a lack of quality in this area, not
> improving it. We should not be allowing days to go by without a
> usable image. And, again, this is completely orthogonal to whether
> we freeze the archive for milestones, or have milestones at all. In
> other words, milestones seem a poor means to accomplish your goals
> here. We should organize a more rigorous and frequent cadence of
> testing for ARM images.

I for one greatly welcome the return of more manual testing. Due to
the massive number of images for ARM, a consequence of the unique
quirks of the architecture, and the limited manpower behind said
efforts, the coverage per image was unfortunately lacking. Because of
the amount of work, certain images were not as rigorously tested as
they should have been, and as a result both Beta 2 and the release
candidate for the omap4 images were woefully under-tested, and were
only brought back up to a releasable quality at the zero-hour due to
the herculean efforts of far too many parties to list. This new manual
testing initiative should prevent us from ever having such a crisis
again.

That being said, because ARM requires one image per board, the sheer
number of ARM images has made the lack of manpower a constant issue,
and our previous manual testing efforts were woefully undermanned
(granted, at the time, we had upwards of 10-15 actively supported
images). Although we do not have multiple image types as with x86
(live/alternate), we still have 6 images that are officially supported
as of this moment:

* Ubuntu Desktop for armhf+omap4
* Ubuntu Server for armhf+omap4
* Ubuntu Core for armhf
* Netboot for armhf+omap4
* Netboot for armhf+armadaxp
* Netboot for armhf+highbank

(This obviously does not take into account any additional
subarchitectures or flavors that may be added during quantal
development.)

Michael
|
In reply to this post by Jono Bacon-3
Jono Bacon <[hidden email]> wrote:
> On Wed, Jun 20, 2012 at 12:07 AM, Rick Spencer
> <[hidden email]> wrote:
>> Question 3: shall we increase the rate of manual testing?
>> This question also arose in the thread. I think there is widespread
>> consensus that we should do this, and it is not actually related to
>> the other questions.
>> Community Team, is it feasible to increase the rate of full manual
>> testing runs to every 2 weeks or similar?
>
> I talked with Nick Skaggs this week and we are happy to commit to
> manual testing every two weeks, starting a week on Thursday.
> Originally I required that Nick *assured* testing of all mandatory
> tests for each milestone, and I am asking him to provide the same
> assurances every two weeks.

For all flavors or just the Canonical ones?

Scott K
|
In reply to this post by Rick Spencer-2
On 06/20/2012 12:07 AM, Rick Spencer wrote:
> I think this was a very productive discussion. We considered a lot
> of possibilities from a lot of angles.
>
> All told, I think there are four points under discussion. I'd like
> to tease them out so we can move forward.
>
> Question 1: shall we stop freezing the archive at milestones? I
> believe there is not 100% consensus on this point, but enough
> support to try it for Alpha 2, a la Theirry's suggestion. QA
> Team/Foundations Team, do we/will we have the tools in place for
>

After much thought, I strongly disagree with discontinuing freezes in
general. Successfully building images requires a constant and stable
base, else images can be trivially skewed by an ill-timed upload.
This is a fundamental aspect of the image mastering process, and
without freezes we break a fundamental assumption of creating stable
images.

While it's not as obvious on x86/amd64, both ARM and powerpc have
slower buildds and as such have missed daily images before due to the
archive entering an uninstallable state. This skew exists from the
moment a single architecture completes a package build to whenever the
other four finally catch up. For any given package, it is within the
realm of possibility that any of the five architectures in Launchpad
could be the first to complete a build and upload a package; for some
packages such as libreoffice, which have large amounts of arch:all
packages and associated post-processing work, amd64 will regularly
complete the build and upload to Launchpad before i386.*

While the largest of these issues have been solved (arch-any/all
skew), this underlying problem still exists, and as such we have no
way without freezes to ensure a consistent image across all
architectures. While the archive should never be in a state of
uninstallability, there is no guarantee that doing an apt-get install
will net the same package versions across architectures. no additional
QA resources behind it.

In addition, for packages with inter-arch:any-all relations, until the
i386 build of a package is complete, apt will install the older
version in-archive until the builds are complete and the skew is
resolved.

By abolishing freezes, alpha 2 would effectively become just a
specific daily image with no special preparation work behind it. For
instance, let's assume a new unity version that fixes a critical crash
when a user does X, Y or Z is uploaded the day before the Alpha 2
release. Under the current system, the desktop team would request a
freeze exception, and given the nature of the bug, it would be
accepted, and then all images would be respun to include it, ensuring
that all images have consistent package versioning across them.

Without the freeze process, it is possible (and in the case of ARM,
even likely) that unity would not finish building before the final
respin, creating a skew between package versions on the alpha 2
images. When people go to run the alpha 2 images, they would find that
ARM still exhibits the crash behavior even though the image was spun
after the upload that fixed it. Alternatively, it could also be that
the respin triggers before the publisher has finished, and thus none
of the images would include the unity crash fix.

During freezes, known issues are collected and placed in the release
notes for those adventurous enough to dive into Ubuntu alpha images.
With a constantly changing base, creating such a list would be, in my
opinion, close to impossible.

> Question 3: shall we increase the rate of manual testing? This
> question also arose in the thread. I think there is widespread
> consensus that we should do this, and it is not actually related
> to the other questions. Community Team, is it feasible to increase
> the rate of full manual testing runs to every 2 weeks or similar?
>

There's another aspect to this question that has to be considered.
Several images such as Kubuntu, Edubuntu, and Xubuntu, and a number of
ports (armel, armhf+omap and the entirety of the powerpc port) are not
supported by Canonical and depend solely on community efforts for full
and proper test coverage. As of writing, all of these images have met,
and continue to meet, the standards required by the current release
process to be included as part of the normal Ubuntu development cycle,
and they base their QA and testing efforts on the existing milestone
process.

Before any changes are made that would radically alter the testing
dynamics, these groups must be consulted and their input solicited to
ensure that they are first willing, and second able, to meet the
demands of bi-weekly testing.

> Question 4: shall we keep snapshots of the development release so
> that we can "bisect" more easily and find when bugs were
> introduced? This question also arose, and also is not tied to the
> other questions. QA Team, is it feasible to keep a set of snap
> shots somewhere for this purpose?
>

I personally would welcome this change, or at the very least, the
ability to reconstruct older images 1:1 via jigdo or another similar
tool. As it stands, Librarian and Launchpad retain all older debs, so
modifying jigdo so it can simply reconstruct an image based off
Librarian debs would go a long way toward achieving this goal. That
being said, due to its nature jigdo only works with alternates at the
moment, as old squashfs images aren't retained.

* - libreoffice build times on Launchpad as of the most recent upload
(1:3.5.3-0ubuntu1):
i386: 9 hours, 51 minutes, 33.0 seconds
amd64: 4 hours, 31 minutes, 59.1 seconds

Michael
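The version skew described in the message above can be illustrated with a small Python sketch. It assumes each image (or each architecture's view of the archive) can be reduced to a mapping of package name to version string; the `find_skew` helper and the unity version numbers are made up for illustration, and only the libreoffice version comes from the message. It checks for differing versions only and does not attempt dpkg-style version ordering.

```python
def find_skew(versions_by_arch):
    """Return {package: {arch: version}} for every package whose version
    differs between architectures, or which is missing on some of them
    (reported as None)."""
    all_packages = set()
    for packages in versions_by_arch.values():
        all_packages.update(packages)

    skewed = {}
    for name in sorted(all_packages):
        per_arch = {arch: pkgs.get(name)
                    for arch, pkgs in versions_by_arch.items()}
        if len(set(per_arch.values())) > 1:
            skewed[name] = per_arch
    return skewed


# Hypothetical manifests: the fixed unity upload has finished building on
# amd64 and i386, but the slower armhf buildd still has the old version.
manifests = {
    "amd64": {"unity": "1.0-0ubuntu2", "libreoffice": "1:3.5.3-0ubuntu1"},
    "i386": {"unity": "1.0-0ubuntu2", "libreoffice": "1:3.5.3-0ubuntu1"},
    "armhf": {"unity": "1.0-0ubuntu1", "libreoffice": "1:3.5.3-0ubuntu1"},
}

skew = find_skew(manifests)
for name, per_arch in skew.items():
    print(name, per_arch)
```

A check along these lines could be run before a respin is declared final; an empty result would mean every architecture carries the same package versions, which is the property the freeze is meant to guarantee.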
|
On 20/06/2012 21:52, Michael Casadevall wrote:
> While its not as obvious on x86/amd64, both ARM and powerpc
> have slower buildds and as such have missed daily images before due to
> the archive entering an uninstallable state.

What you describe here as an issue doesn't apply in a world where
proposed is used as a staging area: things would not be copied to the
release pocket before being built on all archs...

(Btw, why did you feel like crossposting on a second list was a good
idea? It just makes the discussion harder to follow, and you can assume
that release team members read the devel list to follow what's
happening in Ubuntu.)

--
Sebastien Bacher
|
In reply to this post by Adam Conrad-3
On Wed, Jun 20, 2012 at 04:57:19AM +0000, Adam Conrad wrote:
> On Tue, Jun 19, 2012 at 11:06:14AM -0700, Michael Casadevall wrote:
> > Milestones exist to give the Ubuntu developer community a chance to
> > step back, and check to make sure nothing important has broken, and
> > to gauge our progress through a cycle. In addition, they provide a
> > dedicated time where as a community we step forth and check our
> > images to ensure no regressions have slipped by.
>
> I don't think anyone is arguing that we should do less manual testing.
> In fact, I think that milestones create a culture of less testing, in
> the sense that people ONLY test during milestones.
>
> I think any discussion of "dropping milestones" can only come paired
> with a conversation about better continuous testing practices. If
> people have the time to test, they should test what's current. If
> people don't have the time to test, milestones don't magically create
> time, in fact, they drain time from people who have to make them go.

I think if continuous testing is done in a well organized way, you may
see an improvement in efficiency which could in fact get a lot more
bang for a lot less time invested.

A non-organized testing effort takes kind of a shotgun approach:
"Here are some ISOs; go install them and then file bugs." These bug
reports can vary widely in quality. Since that's the only output,
scaling this style of testing up just means a lot more bug reports.
Yet, we often can't answer some rather basic questions. From our test
effort, was Alpha-2 measurably better or worse than Alpha-1? Were
there any particular anomalies or regressions that affected a lot of
people? How broad was the hardware coverage we achieved?

Organized testing efforts know ahead of time specifically what to
test, how to test it, and how to capture all the data in machine
digestible formats so analysts can look for patterns later. They use
scripts or paint-by-number procedures for folks to follow so the same
data is gathered consistently from everyone. And over time people will
script the more time consuming procedures, which can make everyone
more and more efficient.

I've seen lots of examples of this style in Ubuntu over the years. I
like how the kernel team passes around USB keys with the kernel they
want tested, and whatever tests or scripts they need. The ISO tracker
folks have a nifty collection of testing procedures and
infrastructure. Checkbox is another example.

In this style of testing the tangible output is sets of pass/fail data
points; bug reports are generated too but those are just derivative
data. Your goal is to end up with a consistent collection of data you
can plot to show that yes, Alpha-2 gives 20% more passes than Alpha-1
did, that overall performance is 5% faster, and that testing covered
42% more types of hardware.

Since you know what you want to test, and have records of what you've
measured in the past, you will know what you *don't* need to test. For
example, you may find that most graphics regressions tend to happen on
systems with the newest video cards, so you could scale back testing
on older hardware, and focus more on the newer systems.

Bryce
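As a rough illustration of the pass/fail data points described above, here is a minimal Python sketch. The record layout (tester, hardware, image, test case, passed) and all of the data are invented for the example; they are not the ISO tracker's or Checkbox's actual format.

```python
# Each record: (tester, hardware, image, test case, passed). Made-up data.
results = [
    ("alice", "laptop-a", "alpha-1", "install", True),
    ("alice", "laptop-a", "alpha-1", "suspend", False),
    ("alice", "laptop-a", "alpha-2", "install", True),
    ("alice", "laptop-a", "alpha-2", "suspend", True),
    ("bob", "desktop-b", "alpha-1", "install", True),
    ("bob", "desktop-b", "alpha-1", "suspend", False),
    ("bob", "desktop-b", "alpha-2", "install", False),
    ("bob", "desktop-b", "alpha-2", "suspend", True),
]


def pass_rate(records, image):
    """Fraction of recorded test cases that passed on the given image,
    or None if nothing was recorded for it."""
    outcomes = [passed for (_, _, img, _, passed) in records if img == image]
    if not outcomes:
        return None
    return sum(outcomes) / len(outcomes)


def regressions(records, old, new):
    """Test cases that passed on `old` but failed on `new` for the same
    tester on the same hardware."""
    outcome = {(tester, hw, img, case): passed
               for (tester, hw, img, case, passed) in records}
    found = []
    for (tester, hw, img, case), passed in outcome.items():
        if img == old and passed and outcome.get((tester, hw, new, case)) is False:
            found.append((tester, hw, case))
    return sorted(found)


print("alpha-1 pass rate:", pass_rate(results, "alpha-1"))
print("alpha-2 pass rate:", pass_rate(results, "alpha-2"))
print("regressions:", regressions(results, "alpha-1", "alpha-2"))
```

Because each tester repeats the same checklist on the same hardware, a regression shows up as a single changed data point rather than as a free-form bug report that someone has to interpret.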
