Business readable progress and flow based on acceptance tests improves collaboration and trust

A couple of weeks ago I was attending Agile 2011 in Salt Lake City, where I held a brief 8 minute lightning talk about how we made progress and team flow in our projects business readable and what benefits we gained out of it. I want to summarize this talk with the following blog post.

Tasks are not business readable

So, what does business readable progress and team flow mean? Typically, teams break down user stories into tasks in order to self-organize their work in delivering the story. Unlike user stories, tasks are not necessarily understandable to business:


Picture: the team breaks a user story into tasks (<8h) to self-organize their work

Business stakeholders can only see on the task board when all tasks of a user story are done, and assume from this that the story is done. Individual pending or completed tasks usually don’t tell outsiders what is already working and what is not – sometimes this is not even obvious to other team members who are not working on this particular story.

Planning work and limiting work-in-progress based on tasks

When we started using SpecFlow, our workflow of preparing a user story for implementation looked like this:

  1. Collect examples and artifacts related to the user story: samples or existing documents, lo-fi UI scribbles, etc.
  2. Have a discussion around these artifacts and examples, and identify individual acceptance criteria, each illustrated by corresponding scenarios using examples.
  3. Team commits to build the story based on the discussed acceptance criteria and scenarios.
  4. Team performs task planning for the committed user stories (we are working in sprints, but you could as well have a WIP and size limit for continuously pulling user stories into implementation).
  5. Team implements individual user stories based on tasks, starting with formalizing the scenarios into Gherkin for automating them. They organize work along the planned tasks. Actually, formalizing scenarios into Gherkin is a task in itself. And of course, tasks are refined and adjusted as needed as the story is implemented.

This approach had several problems:

  • Tasks spanned multiple acceptance criteria. User stories often did not fulfill a single scenario until the very end, when the final tasks of a story were completed.
  • Sometimes the team dealt with formalizing and automating the Gherkin scenarios only late, when the user story was already (almost) done. This caused friction in automation (making an already implemented feature “automatable” in the end).
  • Delaying formalization and automation until after task planning, or even until the story was almost done, brought up important discussions at a time when it was already rather late to act on the conclusions. Also, tasks that were planned before the formalization of scenarios became void as new things were discovered during formalization.

Planning work and limiting work-in-progress based on scenarios

In our retrospectives we tried to address those problems and defined the following goals:

  • Finish individual scenarios earlier, ideally one scenario of a user story after another
  • Delay planning of tasks further to take important conclusions into account

After several experiments, we arrived at a workflow where we formalized the Gherkin scenarios before doing the task planning. Formalization was done in pairs within the team, while task planning was still a collaborative effort for the whole team. During formalization, new questions that came up could be clarified with other project stakeholders as needed. The formalized scenarios were reviewed by the whole team during task planning:


Picture: list of scenarios discussed for a user story


Picture: one scenario formalized into Gherkin

This workflow had the following positive effects on the flow of implementation:

  • It was easier for the team to plan tasks, and fewer tasks became irrelevant during implementation.
  • Tasks aligned with individual scenarios, which allowed us to limit work-in-progress to the tasks for a specific scenario. As a result, scenarios for a given story could be fulfilled one after another.
  • Some teams even managed to fulfill individual scenarios in less than a day, and dropped additional task planning completely. Instead they just reviewed the formalized Gherkin scenarios together, before starting with the implementation.

Business readable progress yields faster feedback

After establishing this improved implementation flow, we discovered further benefits. To explain these, I need to describe our continuous integration setup:

After each check-in, our build server runs all automated scenario tests and deploys the new build to a staging system. Actually, we have two kinds of builds:

  • One build runs only the scenarios for stories currently in implementation. This build runs on every check-in.
  • The other build runs all automated scenarios of all stories already completed. As running all these tests takes quite some time, we start this build only if no previous build has been started within a certain period, to avoid having multiple instances of the build running in parallel on our build server.
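
The trigger rule for the second build can be sketched roughly as follows. This is a minimal illustration in Python, not our actual build server configuration: the function names, the build names and the 60 minute quiet period are assumptions made up for this example, and I assume here that check-ins are what trigger the decision.

```python
from datetime import datetime, timedelta
from typing import List, Optional

# Hypothetical quiet period; a real value depends on how long the full run takes.
QUIET_PERIOD = timedelta(minutes=60)


def should_start_full_build(last_started: Optional[datetime],
                            now: datetime,
                            quiet_period: timedelta = QUIET_PERIOD) -> bool:
    """Return True if the build running all completed scenarios may start now.

    The build is skipped while a previous full build was started within the
    quiet period, so that two instances do not run in parallel.
    """
    if last_started is None:
        return True  # no full build has been started yet
    return now - last_started >= quiet_period


def builds_for_checkin(last_full_build: Optional[datetime],
                       now: datetime) -> List[str]:
    """List the builds a check-in triggers (build names are illustrative)."""
    builds = ["in-progress-stories"]  # always runs on every check-in
    if should_start_full_build(last_full_build, now):
        builds.append("all-completed-stories")
    return builds
```

With these assumptions, a check-in ten minutes after the last full build only triggers the fast build, while a check-in an hour later triggers both.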

Along with each deployed build, there is a test report, which lists all specified scenarios and their execution results:


Picture: test report listing all user stories committed for implementation as features, along with the execution results of their scenarios

  • Pending (purple): the scenario has not been automated yet
  • Failed (red): the scenario is automated and not fulfilled by the system
  • Success (green): the scenario is automated and fulfilled by the system
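
To illustrate how such a report turns raw test results into progress per user story, here is a minimal sketch in Python. It is not how SpecFlow produces its report: the `Status` enum, the `summarize` and `progress_line` helpers and the sample story are hypothetical names invented for this example.

```python
from collections import Counter
from enum import Enum


class Status(Enum):
    PENDING = "pending"   # scenario has not been automated yet
    FAILED = "failed"     # automated, but not fulfilled by the system
    SUCCESS = "success"   # automated and fulfilled by the system


def summarize(results):
    """Count scenario statuses per user story.

    `results` is an iterable of (story, scenario, Status) tuples.
    Returns a dict mapping each story to a Counter of statuses.
    """
    summary = {}
    for story, _scenario, status in results:
        summary.setdefault(story, Counter())[status] += 1
    return summary


def progress_line(story, counts):
    """Render one business readable progress line for a story."""
    total = sum(counts.values())
    return f"{story}: {counts[Status.SUCCESS]} of {total} scenarios working"


# Hypothetical sample data for one story with three scenarios.
sample = [
    ("Search books", "Find by title", Status.SUCCESS),
    ("Search books", "Find by author", Status.FAILED),
    ("Search books", "No match found", Status.PENDING),
]
for story, counts in summarize(sample).items():
    print(progress_line(story, counts))  # Search books: 1 of 3 scenarios working
```

The point of the sketch is that the unit of progress is the scenario, not the task, so a stakeholder can read the output without knowing how the team organized its work.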

The deployed builds are posted along with their test report to a project dashboard, where all stakeholders can access them easily:


Picture: Project dashboard showing work-in-progress, including currently deployed build and automated scenarios report

This approach made the progress and flow of the team business readable. Instead of waiting for all tasks to be completed, stakeholders can now track completion of individual scenarios. They can see which user story is already working in a given deployed build, to start testing and give feedback.

Since user stories are implemented scenario after scenario, testing and giving feedback can start even earlier, when the core scenarios of a story are working already. Additional scenarios, for example extended validation or optional flows to be supported, can be tested later.

Better involving business in the BDD cycle

In our previous approach, business reviewed formalized Gherkin scenarios only very late – partly because they were completed late. It was sometimes even hard to keep business interested in the formalized scenarios, after we had the initial discussions for planning them.

Improving transparency and shortening the feedback cycle helped us to keep business and stakeholders more involved during the implementation. Conversations around the scenarios continued until the story was finally completed. This closer collaboration also increased the trust among all parties.

Summary

In summary, our approach of making progress and team flow business readable gave us the following benefits:

  • Deferring task planning until after the formalization of Gherkin scenarios improved planning quality and aligned tasks with scenarios. Some teams even dropped task planning and organized their work based only on formalized scenarios.
  • Aligning tasks with scenarios allowed the team to limit work-in-progress on a scenario level, finishing scenario after scenario of a given user story. Stories could already be tested with a subset of scenarios, before the story was fully completed.
  • Providing transparency about already completed scenarios allowed earlier testing and feedback within the team as well as for business stakeholders.
  • Shortening the feedback cycle improved collaboration with business and helped build trust.

Further references

  • The Agile Alliance website hosts the description and slides of my presentation.
  • We are using SpecFlow to write automated acceptance tests that are business readable.
  • We are mapping and visualizing SpecFlow scenarios using SpecLog.

BDD with SpecFlow @ SkillsMatter

Gaspar and Jonas did two workshops about BDD with SpecFlow at SkillsMatter in London.

In the first workshop they introduced the concept of BDD, and demonstrated the development workflow for building a system based on Gherkin acceptance criteria.

Jonas explained the key concept of “specification by example” and how it can be formalized using Gherkin to build the ubiquitous language amongst all stakeholders in a project. That’s why SpecFlow and Gherkin are actually aiming at specifying the details of a system. Testing of the completed system for compliance is just a by-product. Having the detail specification bound to the actual implementation ensures, however, that it is kept up-to-date throughout the whole application life cycle. It makes detail specifications valuable even after finishing the implementation of a given feature, as they are a business readable description of how a system truly behaves.

After the theoretical introduction, Gaspar did a coding demo, binding and implementing the first acceptance criteria of their sample domain formulated in Gherkin. He used the SpecFlow book shop sample (using an MVC2 architecture), where he automated the first Gherkin scenario through the controller. From there, he implemented the necessary logic using TDD, until the scenario turned green. After that, there was an exercise for the audience to bind and then implement the next acceptance criteria for the application.

The second workshop focused on the process of how to harvest useful acceptance criteria and the challenges involved with that.

After a short introduction, the audience got the task to formulate acceptance criteria (scenarios) in Gherkin for a given user story (feature) – using pencil and paper. The groups presented their results and there was a discussion about the challenges people encountered and anti-patterns we found in the results. This exercise made it obvious to everyone that the key concept to master in BDD is not the tool (e.g. SpecFlow), but understanding the process of distilling and formalizing acceptance criteria in Gherkin.

I particularly liked this part of the workshop, as it was not only an exercise for the individual participants, but also a good demonstration of the challenges in BDD. Even if you are just watching the podcast, the discussion of the results and of the challenges everyone found helps a lot in getting a better understanding of detail specifications in Gherkin. Also Gaspar’s comment on the technical “TDD perspective” of a developer vs. the “business perspective” on acceptance criteria provides lots of insight.

The second workshop concluded with a summary of challenges and anti-patterns we have encountered so far in our experience, and a discussion about how testing fits into BDD. In our projects we mostly automate below the UI, as this is much cheaper to achieve and maintain. It also ensures that we really can automate all specified acceptance criteria. The main goal of using SpecFlow is not automating through the UI, but having an automatically verifiable detail specification. Even if you automate through the UI, it doesn’t spare you additional efforts such as exploratory testing.

If you are interested in using SpecFlow, I strongly recommend watching those two workshops. SkillsMatter has put both of them online as a video podcast:

(Note: podcasts will become available in the next few days)

You can also download the hands-on material for Day1 and Day2 to try out the exercises on your own.

If you are interested in getting deeper into SpecFlow and need a head start, we can provide similar workshops on-site in the domain of your own project. Contact us at info_at_specflow_dot_org.

Scrum in fixed-everything contracts

I did an interview with Mitch Lacey, a Certified Scrum Trainer we work with, about how Scrum fits with the fixed-everything contracts that we all know from our daily software development business.

He talks about

  • Why customers traditionally demand fixed-everything contracts
  • How to introduce Scrum in fixed-everything project deliveries and what to watch out for in such engagements
  • Who should take the role of the Product Owner in such cases
  • Contract models that provide more business value than fixed-everything

Access to the podcast: as RSS feed or through iTunes.

Choosing the proper sprint length in Scrum

A discussion that regularly comes up with teams new to Scrum is deciding what sprint length to choose for a project.

Although changing the sprint length during a project may help the team become more efficient, it is something that should not happen too often. The goal of maintaining a fixed sprint length is to establish a rhythm of delivering value to the product owner. It introduces routine, making the team more efficient and letting it predict the scope it can commit to in a sprint more accurately. A team should therefore try to select a sprint length that will most likely not require any changes during the project.

While the general rule demands a sprint length of 1-4 weeks, I personally find it important to start with two week sprints when you are new to Scrum and have yet to find out what is optimal for the team. The general tendency of teams coming from a less agile approach, however, is to start with four week sprints instead, which makes it harder for them to successfully adopt Scrum.

Impact of sprint length

Understanding the impact of shorter or longer sprints is crucial for making the right decision about the ideal length for your project. I want to discuss the aspects I have found so far that need to be considered when choosing the length that is right for your project.

Pressure on the team to finish

The most obvious impact a team will usually consider is the pressure of delivering something of value to the product owner. The idea of delivering early and regularly might be scary for a team new to agile methodologies. It’s almost natural that four weeks seem to be the most feasible (but still inconvenient) option under this general impression.

While one week sprints can put even experienced teams under constant pressure, two week sprints already allow the team to cope with the tension and release it.

That’s because in one week sprints you are already worried about the next deadline, when you just got out of the review of the previous sprint. You don’t even have a single weekend where you can relax and stop worrying about what you are going to deliver at the end of the next sprint.

Two weeks, however, give at least a week and a weekend, where the deadline is not in immediate proximity. For the second week there is again a build up of tension for the team, as the deadline approaches. This makes a good balance of recovery and pressure, which usually establishes a productive rhythm for the team.

Longer sprints of three or four weeks, on the other hand, disturb the balance of this rhythm. Only in the last third or quarter of a sprint will the team have the imminent deadline in mind again. Only disciplined and well focused teams will not lose sight of the sprint goal in the first two or three weeks.

Coordination with external parties

Another reason that teams are quick to bring up in favour of longer sprints is the need to coordinate with external parties. This might concern integration work necessary for each sprint release, or dependencies between the team and external parties in either direction.

One week sprints again are hard to achieve when considering this aspect. They simply don’t allow for any contingency when either party fails to deliver in time to the other.

Two week sprints provide a good balance between flexibility to deal with glitches and keeping up the pressure on the other party to deliver.

Longer sprints bear the danger that problems are exposed too late to the other party. While a Scrum team has the daily stand-up as a tool to monitor progress and status within a sprint, external parties can track progress only based on what they deliver to each other.

Another reason that can be brought up against a short sprint length is the high effort for integration at the end of each sprint. In this case, however, four week sprints will also cause trouble that needs to be dealt with. A common approach to overcome such problems is to invest in sprint automation – you need to do this anyway, even with four week sprints.

Having enough time to finish a story

Worrying about having enough time to finish a story usually smells of something else going wrong in the project. The most common reasons I have come across for why a team complains about not being able to deliver business value within a sprint are:

  • Inability to identify the individual user intentions contained within a story, which would allow the team to split up the story into smaller deliverables.
  • Implementation tasks are distributed by layer instead of by user story, causing overhead and reducing focus on finishing a single story.
  • Big Design Up Front approach leading to implementation of functionality not required for a particular story.
  • Team members are not fully committed to the project, causing problems in communication and less efficiency because of multi-tasking between projects.

Reducing the sprint length helps teams identify and resolve these problems. Two week sprints are more than long enough for complex stories that are hard to split up further. Even one week sprints should allow enough time to deliver business value to the product owner in most cases.

Planning efforts

Shorter sprints mean running more sprints to deliver the same scope. A typical assumption derived from this is that shorter sprints mean more planning overhead.

Besides the daily stand-up, which is a constant engagement independent of the sprint length, the following meetings are held for each sprint according to Scrum:

  • Sprint planning
  • Sprint review
  • Sprint retrospective

I have found that the time required for the sprint planning in four week sprints is almost twice as much as in two week sprints. I don’t have comparable numbers for review and retrospective, but I assume that the amount of time required to demo is also nearly in linear relation to the length and scope of a sprint. The retrospective might be an additional overhead for each sprint. But on the other hand, learning about problems and possible improvements earlier can also be a benefit for the project.

While the overhead of the additional meetings might not be fully negligible in one week sprints, two week sprints do not introduce significantly higher planning efforts compared to longer sprints.
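
The reasoning behind this can be made explicit with a small calculation. The following Python sketch assumes that planning and review time grow strictly linearly with sprint length (I only observed "almost twice" for planning, and merely assume it for the review) and that the retrospective is a fixed cost per sprint. All hour figures are hypothetical and only serve to show the shape of the relation.

```python
def meeting_hours_per_week(sprint_weeks: int,
                           planning_per_sprint_week: float = 2.0,
                           review_per_sprint_week: float = 1.0,
                           retrospective_per_sprint: float = 1.5) -> float:
    """Average hours per project week spent in sprint meetings.

    Planning and review are assumed to scale linearly with the sprint
    length; the retrospective is assumed to be a fixed cost per sprint.
    All default values are made-up illustration figures.
    """
    if sprint_weeks < 1:
        raise ValueError("sprint_weeks must be at least 1")
    scaled = (planning_per_sprint_week + review_per_sprint_week) * sprint_weeks
    return (scaled + retrospective_per_sprint) / sprint_weeks


for weeks in (1, 2, 4):
    print(f"{weeks} week sprint: {meeting_hours_per_week(weeks):.3f} h/week")
```

Under these assumptions the result is 4.5 hours per week for one week sprints, 3.75 for two week sprints and 3.375 for four week sprints. Only the fixed retrospective cost makes shorter sprints more expensive, and the difference between two and four weeks is small.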

Amount of transient knowledge to be maintained

One goal of time boxing a sprint to 1-4 weeks is to limit the amount of transient detail knowledge the team is required to hold about a system.

Agile methodologies like Scrum acknowledge the fact that, at a certain level of detail, system requirements cannot be explicitly documented anymore. Other methodologies that try to ignore this fundamental law in software development impose huge inefficiencies on the development process: product owners try to specify details they have no clue about in advance, and developers desperately try to extract the business value and user intentions from loads of detailed specifications with no value, as they are not applicable anymore by the time they are needed.

However, the details required to implement a system still need to be agreed on between the product owner and the team. The sprint is the time box into which a feasible amount of detail knowledge should be fit. Each story is a reminder of a discussion that takes place before implementation, where the developer can ask the questions required to know what exactly to do. Of course the developer will take notes, either on the back of the story card, or in any other informal way that is suitable. In no case is a formal document created to capture the collected details. Instead, transient knowledge is built within the team and the product owner, starting with the sprint planning and further derived in discussions during the sprint.

The knowledge can be transient as it is not required beyond the scope of a sprint, and almost immediately transformed into (potentially automated) acceptance test scripts, unit tests and implementation.

The shorter the sprint length, the less transient knowledge has to be maintained. Shorter sprints also reduce the risk of defining outdated detail information, and a shorter period to look ahead makes it easier for the product owner to understand the impact of details defined for the system.

Having said this, one week sprints should be preferred when considering only this aspect. Two week sprints, however, still produce a well manageable amount of transient knowledge, and provide a good compromise with the other aspects considered so far.

Flexibility – opportunities to introduce change

While a Scrum project is agile, and allows the product owner to change scope and priorities between each sprint, it is quite strict about introducing changes during a sprint. This concerns organisational changes as well as functional scope changes.

Important organisational aspects to keep within a sprint are the length and the team. You shouldn’t finish a sprint later or earlier in order to accommodate a certain deadline, or deliver more scope than actually possible within the sprint. This would disturb the rhythm of the team, important for efficiency and for predicting what scope can be achieved. You also shouldn’t change the members of the team during a sprint. Important transient knowledge of leaving members may get lost before it can be transformed into implementation and new team members might miss knowledge that was built up in discussions during the sprint.

A team may choose to accept minor changes during a sprint, or commit to additional scope, when finishing early. But regular changes of the scope during the sprint will reduce the efficiency of the team, and the product owner should ideally make decisions based on the demonstrated current feature set of the product, which is not available before the end of the sprint.

Choosing a short sprint length allows the project to be delivered closer to externally set deadlines and provides more opportunities for change. Therefore this aspect would again be ideally fulfilled with one week sprints. But given the constraints introduced by the previously considered aspects of well balanced pressure and coordination with external parties, two week sprints allow for enough flexibility in most cases.

Summary

As the bottom line, I advocate starting projects in general with two week sprints, because they:

  • Provide the best balance of pressure and recovery for the team.
  • Allow enough time to coordinate with external parties while keeping up short feedback cycles that expose potential problems of integration.
  • Better expose smells in the project related to organisational or engineering practices not well suited to agile development.
  • Do not introduce higher planning efforts than longer sprints.
  • Result in a well predictable and well manageable amount of transient detailed knowledge about the scope of the sprint.
  • Provide enough flexibility to meet external deadlines and provide enough opportunities to introduce changes to the project (team, scope, priorities).

Why start a blog now?

Well, I do have to admit that I am part of the late majority when it comes to blogging. So why am I bothering to start a blog now?

I am a managing partner at TechTalk – we are about 50 people, doing consulting for software engineering in enterprise application development, with a technological focus on the Microsoft .NET platform. Besides general management tasks, my primary role is to lead the advancement of our technical and methodical expertise in software engineering. I share management responsibility with my two other partners, who focus on leading our sales/marketing efforts and our project operations.

Being a technical leader was easy when we started. I joined TechTalk in 1997, and although the team grew quite quickly in the first years, projects were pretty much a hands-on experience for me. Communication was also easy, up to the point where we hit the 15 people mark. Since then, my role has gradually transformed from daily coding to being regularly involved with architecture (I wouldn’t call myself an architect, though), and my daily practice is now rather about facilitating teams so they can successfully build enterprise software systems.

While I am happy with the transformation of my daily work, the lack of communication within the growing team increasingly started to bother me. In the early days, I knew about every project detail and all the problems and solutions we found, but now there are some projects which I only learn about in more detail when there are already problems to deal with. Even worse, I also have the impression that we have to overcome great resistance in order to transmit findings and perceptions from individual projects to other teams, let alone spread them generally in the whole company.

Another interesting aspect is that my opinions and standpoints are constantly evolving through the experience I gain in my daily work and the input I get from external peers. What I found useful a year ago might not be the approach I would take to solve the same problem today. My colleagues are sometimes surprised when I disagree with positions I fiercely defended not long ago. But I think it is essential for a successful team or company to constantly strive for improvement, learn new things, and embrace the resulting change, and facilitating this process at our company is one of my primary objectives.

Out of these aspects, I decided to start a blog with the following motivations:

1. Document problems, discussions and findings from our daily work, to have them spread more easily within our teams and organisation.

2. Provide a log of my opinions and standpoints, to make it easier for myself, my colleagues and peers to track how they evolve.

The primary target audience of my blog is our own organisation and maybe also some of my peers at our customers. In the medium term, I also hope to draw the attention of external readers to some of the topics I write about, to gain additional outside viewpoints on them.