Wednesday, August 5, 2009

Daily Build and Smoke Test

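Steve McConnell, the author of Code Complete, published the following
article on the daily build and smoke test in IEEE Software in 1996. Even
then it was already an established best practice elsewhere.
Several of our project teams now run a daily build. On this point we are
really ten years behind, but we hope to keep doing the daily build well so
that it safeguards the normal progress of development.
The original is at http://www.stevemcconnell.com/bp04.htm
The original text follows, along with my own translation. Corrections are welcome.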
Daily Build and Smoke Test
If you want to create a simple computer program consisting of only one
file, you merely need to compile and link that one file. On a typical team
project involving dozens, hundreds, or even thousands of files, however,
the process of creating an executable program becomes more complicated and
time consuming. You must "build" the program from its various components.
A common practice at Microsoft and some other shrink-wrap software
companies is the "daily build and smoke test" process. Every file is
compiled, linked, and combined into an executable program every day, and
the program is then put through a "smoke test," a relatively simple check
to see whether the product "smokes" when it runs.
BENEFITS. This simple process produces several significant benefits.
It minimizes integration risk. One of the greatest risks that a team
project faces is that, when the different team members combine or
"integrate" the code they have been working on separately, the resulting
composite code does not work well. Depending on how late in the project
the incompatibility is discovered, debugging might take longer than it
would have if integration had occurred earlier, program interfaces might
have to be changed, or major parts of the system might have to be
redesigned and reimplemented. In extreme cases, integration errors have
caused projects to be cancelled. The daily build and smoke test process
keeps integration errors small and manageable, and it prevents runaway
integration problems.
It reduces the risk of low quality. Related to the risk of unsuccessful or
problematic integration is the risk of low quality. By minimally
smoke-testing all the code daily, quality problems are prevented from
taking control of the project. You bring the system to a known, good
state, and then you keep it there. You simply don't allow it to
deteriorate to the point where time-consuming quality problems can occur.
It supports easier defect diagnosis. When the product is built and tested
every day, it's easy to pinpoint why the product is broken on any given
day. If the product worked on Day 17 and is broken on Day 18, something
that happened between the two builds broke the product.
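As a rough illustration of the Day 17 / Day 18 reasoning, the sketch below scans a record of daily builds and reports the first good-to-broken transition together with the changes that went into the broken build. The `Build` record, its fields, and the sample change descriptions are all hypothetical; a real project would pull this information from its own build log and version control system.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Build:
    """One daily build (hypothetical record, not from the article)."""
    day: int
    passed: bool
    # Descriptions of the changes integrated since the previous build.
    changes: List[str] = field(default_factory=list)


def first_regression(history: List[Build]) -> Optional[Tuple[int, int, List[str]]]:
    """Return (last_good_day, first_bad_day, suspect_changes), or None.

    Only the changes that entered between the last good build and the
    first broken one need to be examined.
    """
    ordered = sorted(history, key=lambda build: build.day)
    for previous, current in zip(ordered, ordered[1:]):
        if previous.passed and not current.passed:
            return previous.day, current.day, current.changes
    return None


history = [
    Build(day=16, passed=True, changes=["add report header"]),
    Build(day=17, passed=True, changes=["rename config keys"]),
    Build(day=18, passed=False, changes=["new login form", "parser rewrite"]),
]
```

With daily builds the suspect list stays short because only one day of changes separates a good build from a broken one; with weekly builds the same search would have to cover a week of changes.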
It improves morale. Seeing a product work provides an incredible boost to
morale. It almost doesn't matter what the product does. Developers can be
excited just to see it display a rectangle! With daily builds, a bit more
of the product works every day, and that keeps morale high.
USING THE DAILY BUILD AND SMOKE TEST. The idea behind this process is
simply to build the product and test it every day. Here are some of the
ins and outs of this simple idea.
Build daily. The most fundamental part of the daily build is the "daily"
part. As Jim McCarthy says (Dynamics of Software Development, Microsoft
Press, 1995), treat the daily build as the heartbeat of the project. If
there's no heartbeat, the project is dead. A little less metaphorically,
Michael Cusumano and Richard W. Selby describe the daily build as the sync
pulse of a project (Microsoft Secrets, The Free Press, 1995). Different
developers' code is allowed to get a little out of sync between these
pulses, but every time there's a sync pulse, the code has to come back
into alignment. When you insist on keeping the pulses close together, you
prevent developers from getting out of sync entirely.
Some organizations build every week, rather than every day. The problem
with this is that if the build is broken one week, you might go for
several weeks before the next good build. When that happens, you lose
virtually all of the benefit of frequent builds.
Check for broken builds. For the daily-build process to work, the software
that's built has to work. If the software isn't usable, the build is
considered to be broken and fixing it becomes top priority.
Each project sets its own standard for what constitutes "breaking the
build." The standard needs to set a quality level that's strict enough to
keep showstopper defects out but lenient enough to disregard trivial
defects, an undue attention to which could paralyze progress.
At a minimum, a "good" build should
● compile all files, libraries, and other components successfully;
● link all files, libraries, and other components successfully;
● not contain any showstopper bugs that prevent the program from being
launched or that make it hazardous to operate; and
● pass the smoke test.
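The four criteria above can be read as an ordered gate: the build is good only if every step succeeds, and the first failing step is what "broke the build." The sketch below is one minimal way to express that, assuming each step is wrapped in a callable that returns True on success. The step names and the `todays_steps` stand-ins are hypothetical; in practice each callable would invoke the project's real compiler, linker, launcher, and smoke test.

```python
from typing import Callable, List, Tuple

# A step is a label plus a zero-argument callable that returns True on success.
Step = Tuple[str, Callable[[], bool]]


def evaluate_build(steps: List[Step]) -> Tuple[bool, str]:
    """Run the steps in order and stop at the first failure.

    Returns (True, "") for a good build, or (False, name_of_failed_step).
    A step that raises an exception counts as a failure.
    """
    for name, run in steps:
        try:
            succeeded = bool(run())
        except Exception:
            succeeded = False
        if not succeeded:
            return False, name
    return True, ""


def todays_steps() -> List[Step]:
    # Hypothetical stand-ins: here the smoke test is made to fail on purpose.
    return [
        ("compile", lambda: True),
        ("link", lambda: True),
        ("launch without showstoppers", lambda: True),
        ("smoke test", lambda: False),
    ]


if __name__ == "__main__":
    good, failed_step = evaluate_build(todays_steps())
    print("good build" if good else f"BROKEN BUILD at step: {failed_step}")
```

Stopping at the first failure keeps the report unambiguous: there is no point smoke-testing a program that did not link.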
Smoke test daily. The smoke test should exercise the entire system from
end to end. It does not have to be exhaustive, but it should be capable of
exposing major problems. The smoke test should be thorough enough that if
the build passes, you can assume that it is stable enough to be tested
more thoroughly.
The daily build has little value without the smoke test. The smoke test is
the sentry that guards against deteriorating product quality and creeping
integration problems. Without it, the daily build becomes just a
time-wasting exercise in ensuring that you have a clean compile every day.
The smoke test must evolve as the system evolves. At first, the smoke test
will probably test something simple, such as whether the system can say,
"Hello, World." As the system develops, the smoke test will become more
thorough. The first test might take a matter of seconds to run; as the
system grows, the smoke test can grow to 30 minutes, an hour, or more.
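One way to let a smoke test grow with the system is to keep a registry of small checks and add to it as features arrive. The sketch below starts with the "Hello, World" stage described above. The registry, the decorator, and the check name are assumptions for illustration, and the child Python process is only a stand-in for launching the real product.

```python
import subprocess
import sys
from typing import Callable, Dict, List

# Registry of smoke checks; new checks are added as the system evolves.
SMOKE_CHECKS: Dict[str, Callable[[], bool]] = {}


def smoke_check(func: Callable[[], bool]) -> Callable[[], bool]:
    """Register a check under its function name."""
    SMOKE_CHECKS[func.__name__] = func
    return func


@smoke_check
def says_hello() -> bool:
    # Stand-in for launching the product: run a child process end to end
    # and confirm it exits cleanly with the expected output.
    result = subprocess.run(
        [sys.executable, "-c", "print('Hello, World')"],
        capture_output=True,
        text=True,
        timeout=30,
    )
    return result.returncode == 0 and result.stdout.strip() == "Hello, World"


def run_smoke_test() -> List[str]:
    """Run every registered check and return the names of those that failed."""
    failures = []
    for name, check in SMOKE_CHECKS.items():
        try:
            passed = bool(check())
        except Exception:
            passed = False
        if not passed:
            failures.append(name)
    return failures
```

An empty list from `run_smoke_test()` means the build passed. Each new feature would contribute another decorated check, which is how the run time can grow from seconds toward the 30 minutes or more mentioned above.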
Establish a build group. On most projects, tending the daily build and
keeping the smoke test up to date becomes a big enough task to be an
explicit part of someone's job. On large projects, it can become a
full-time job for more than one person. On Windows NT 3.0, for example,
there were four full-time people in the build group (Pascal Zachary,
Showstopper!, The Free Press, 1994).
Add revisions to the build only when it makes sense to do so. Individual
developers usually don't write code quickly enough to add meaningful
increments to the system on a daily basis. They should work on a chunk of
code and then integrate it when they have a collection of code in a
consistent state, usually once every few days.
Create a penalty for breaking the build. Most groups that use daily builds
create a penalty for breaking the build. Make it clear from the beginning
that keeping the build healthy is the project's top priority. A broken
build should be the exception, not the rule. Insist that developers who
have broken the build stop all other work until they've fixed it. If the
build is broken too often, it's hard to take seriously the job of not
breaking the build.
A light-hearted penalty can help to emphasize this priority. Some groups
give out lollipops to each "sucker" who breaks the build. This developer
then has to tape the sucker to his office door until he fixes the problem.
Other groups have guilty developers wear goat horns or contribute $5 to a
morale fund.
Some projects establish a penalty with more bite. Microsoft developers on
high-profile projects such as Windows NT, Windows 95, and Excel have taken
to wearing beepers in the late stages of their projects. If they break the
build, they get called in to fix it even if their defect is discovered at
3 a.m.
Build and smoke even under pressure. When schedule pressure becomes
intense, the work required to maintain the daily build can seem like
extravagant overhead. The opposite is true. Under stress, developers lose
some of their discipline. They feel pressure to take design and
implementation shortcuts that they would not take under less stressful
circumstances. They review and unit-test their own code less carefully
than usual. The code tends toward a state of entropy more quickly than it
does during less stressful times.
Against this backdrop, daily builds enforce discipline and keep
pressure-cooker projects on track. The code still tends toward a state of
entropy, but the build process brings that tendency to heel every day.
Who can benefit from this process? Some developers protest that it is
impractical to build every day because their projects are too large. But
what was perhaps the most complex software project in recent history used
daily builds successfully. By the time it was released, Microsoft Windows
NT 3.0 consisted of 5.6 million lines of code spread across 40,000 source
files. A complete build took as many as 19 hours on several machines, but
the NT development team still managed to build every day (Zachary, 1994).
Far from being a nuisance, the NT team attributed much of its success on
that huge project to their daily builds. Those of us who work on projects
of less staggering proportions will have a hard time explaining why we
aren't also reaping the benefits of this practice.
Editor: Steve McConnell, Construx Software, 11820 Northup Way #E200,
Bellevue, WA 98005.
E-mail: steve.mcconnell@construx.com - WWW:
http://www.construx.com/stevemcc/
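Daily Build and Smoke Test
If you want to create a simple program consisting of a single source file, you only need to compile and link that one file. On a team project with dozens or even thousands of source files, creating an executable program becomes more complicated and time consuming. You must build the program up from its various components.
A common practice at Microsoft and some other software companies is to build daily and run a "smoke test." Every day the completed source files are compiled, then linked and combined into an executable program, which is put through a "smoke test," a simple check of whether the program "smokes" when it runs.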
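Benefits
Although this is a very simple process, it brings significant benefits:
1. It minimizes integration risk
One of the greatest risks a project team faces is that members develop separate pieces of code for different parts of the system, and when those pieces are integrated, the resulting system may not work as intended. Depending on how late in the project the incompatibility is discovered, debugging can become much harder and more time consuming, program interfaces may have to be changed, or major parts of the system may have to be redesigned and reimplemented. In extreme cases, integration errors can cause a project to be cancelled. The daily build and smoke test keeps integration errors small and easy to resolve, and prevents many integration problems from arising.
2. It reduces the risk of low quality
This risk is related to the risk of unsuccessful or faulty integration. Running a minimal smoke test on the integrated code every day keeps basic quality problems from taking hold of the project. You bring the system to a known, good state and keep it there, so that it never deteriorates to the point where tracking down quality problems consumes large amounts of time.
3. It simplifies defect diagnosis
When the system is built and tested every day, a failure on any given day is easy to pinpoint. If the system worked on Day 17 and broke on Day 18, you only need to examine what changed in the code between those two builds.
4. It greatly boosts team morale
Seeing the product grow steadily is a great boost to team morale, sometimes regardless of what the product actually does. Developers can be excited just to see the system display a rectangle. With daily builds the product gets a little further every day, which keeps morale high.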
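Using the Daily Build and Smoke Test
The idea is simple, even tedious: build every day and test every day. Still, some details deserve attention:
1. Build every day
The most important part of the daily build is the "daily" part. As Jim McCarthy puts it, treat the daily build as the heartbeat of the project; without a heartbeat, the project is dead (Dynamics of Software Development, Microsoft Press, 1995). Michael Cusumano and Richard W. Selby use a less figurative image, describing the daily build as the project's "sync pulse" (Microsoft Secrets, The Free Press, 1995).
Code written by different developers is bound to drift somewhat out of sync between pulses, but at every sync pulse the code has to come back together as a whole. When the team insists on keeping the pulses close together, developers are largely prevented from drifting away from the rest of the system.
Some project teams relax this to "build once a week." The problem is that when one build fails, it may be several weeks before the next good build. When that happens, almost all of the benefit of frequent builds is lost.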
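2. Check every build rigorously
For the daily build to succeed, the product of each build (also called a build) must actually run. If it does not, the build is considered broken, and fixing it becomes the team's top priority.
Each project team defines its own standard for judging a build. The standard must set a quality level strict enough to catch severe defects, yet flexible enough to ignore trivial ones, since undue attention to trivia can bring the whole process to a standstill.
At a minimum, a good build should:
● compile all files, libraries, and other components successfully;
● link all files, libraries, and other components successfully;
● contain no severe defects that prevent the system from running or make it run incorrectly;
● and, of course, pass the smoke test.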
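3. Smoke test every day
The smoke test should exercise the whole system end to end, from input to output. It need not be exhaustive, but it should be able to expose major problems. It should be thorough enough that a build that passes can be considered stable enough for more thorough testing.
A build without a smoke test has little value. The smoke test is like a sentry guarding against deteriorating product quality and creeping integration problems; without it, the daily build can become a time-wasting exercise.
The smoke test must grow as the system grows. At first it may be very simple, such as verifying that the system prints "Hello World." As functionality expands, the smoke test needs to become more thorough. The first smoke test might take only a few seconds to run; gradually it may take 30 minutes, an hour, or even longer.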
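4. Establish a dedicated build group
On many projects, maintaining the daily build and keeping the smoke test cases up to date takes up most of one person's working time. On some large projects it becomes a full-time job for more than one person. During the development of Windows NT 3.0, for example, there was a dedicated build group of four full-time people (Pascal Zachary, Showstopper!, The Free Press, 1994).
5. Add revisions to the build only when it makes sense
Individual developers usually do not write code fast enough to add meaningful increments to the system every day. Typically they finish a set of code that implements some feature and then integrate it into the system, usually once every few days.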
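6. Set a penalty for breaking the build
Most teams that run daily builds set some penalty for breaking the build. Make it clear to every team member from the start that keeping the build healthy is the team's top priority. A broken build should be the exception, not the rule. Insist that whoever breaks the build stop all other work and fix it first. If a team's build is broken too often, it eventually becomes meaningless to talk about keeping it healthy.
A light-hearted penalty can help emphasize this priority. Some groups give out lollipops to each "sucker" who breaks the build. This developer then has to tape the sucker to his office door until he fixes the problem. Other teams have the offending developer wear goat horns or contribute $5 to a project fund.
Some project teams use harsher penalties. Microsoft developers on high-profile, important products such as Windows NT, Windows 95, and Excel were required to carry pagers in the late stages of development. If your code breaks the build, you are called in to fix it immediately, even at 3 a.m.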
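7. Keep building and smoke testing even under pressure
As schedule pressure mounts, the work of maintaining the daily build can look like a waste of time, but the opposite is true. Under pressure, developers let some of their usual discipline slip and take design and implementation shortcuts they would not take under less stressful conditions. Code reviews and unit tests may be done less carefully than usual, and the code slides toward disorder faster than usual.
Against this backdrop, the daily build enforces discipline and keeps projects under pressure on track. The code still tends toward disorder, but the build process brings that tendency under control every day.
Who can benefit from this process? Some developers protest that building every day is impractical because their projects are too large. Yet what was perhaps the most complex software project of its time ran daily builds successfully. By the time it was released, Windows NT consisted of 5.6 million lines of code spread across 40,000 source files, and the project team still managed to build every day.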
