Agility to the rescue of slipping tasks!
At Marmelab, we develop projects for our customers using agile methodologies. And just like in every development project, our tasks sometimes slip quite a bit!
We recently saw a user story that we had estimated at 0.5 points slip to over 5 points (the gap between the initial estimate and what it actually cost to develop). I'm not even sure we can still call it slipping: at more than 10 times the initial estimate, it looks a lot more like free fall.
Surely we can learn from our mistakes.
The User Story: Accepting Changes On Terms of Use
So what was this user story about?
We're maintaining an application where we ask users to accept the terms of use on login if they haven't already. To make it easier for legal people to update the terms themselves, we're using a third-party CMS-as-a-service with a neat graphical editor.
Our user story consisted of making end-users accept the terms of use again when the legal people change these terms on the CMS side. Seems quite simple, doesn't it?
Planning poker
First came the planning poker. We took advantage of our product owner being with us to estimate the user stories and clear up the remaining uncertainties.
After a brief discussion, all developers agreed that this particular user story was worth 0.5 points. There wasn't any second round of estimation or anything; the user story seemed clear enough, and the scope well defined.
We should have been much more cautious about this User Story. Whenever a US implies synchronization between two systems, there are many corner cases to discuss. And this US implies synchronization between 3 systems: the CMS, the server, and the client-side app (we use Single-Page Applications). Also, it lacked acceptance tests; trying to write them would have revealed the hidden complexity in no time. We should have challenged ourselves and the Product Owner about corner cases.
Development
The development of the user story started, and after some investigation we found different possible implementations:
- Add a button "Reset terms of use acceptance" on the admin UI, in the users list page: when clicked, all users must accept it again.
- Add a "TOS Last update" field on the third-party CMS service: when the acceptance date of the terms of use for a given user is before that date, they must accept it again.
- Store last publication date automagically every time the terms of use are updated by the legal people: everything is automated, and users must accept terms of use every time they are updated on the third-party service.
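The date comparison behind the second and third options can be sketched as follows. This is only an illustration: the TermsAcceptance shape and the mustAcceptAgain function are hypothetical names, not the application's actual model.

```typescript
// Hypothetical shape: these names are illustrative, not the real data model.
interface TermsAcceptance {
  userId: string;
  acceptedAt: Date | null; // null when the user has never accepted the terms
}

// Returns true when the user must accept the terms of use again,
// i.e. they never accepted, or they accepted before the last publication.
function mustAcceptAgain(
  acceptance: TermsAcceptance,
  lastPublishedAt: Date
): boolean {
  if (acceptance.acceptedAt === null) {
    return true;
  }
  return acceptance.acceptedAt.getTime() < lastPublishedAt.getTime();
}
```

The only difference between the two options is where lastPublishedAt comes from: a field filled in by hand on the CMS, or a date recorded automatically on each publication.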
We do stand-up meetings with our customers on a daily basis, so we talked about this user story the next day. It was really just about figuring out which possible implementation was the best fit, considering user experience and how much we were willing to invest in the feature.
During this conversation, the scope changed. While we were explaining the possibilities, our product owner saw what was actually possible. In the initial User Story description, we only had to keep the latest date of acceptance. But the Product Owner asked us to also keep track of which user accepted which version of the terms of use in the past, and when. The product owner also acknowledged that the less the admins had to do, the better the experience would be.
So we went for the third possible implementation - the most expensive. In retrospect, we should have asked this question during the Planning Poker. It wouldn't have saved us development time, but it would have given the Product Owner a more accurate view of the cost of that feature.
We started developing this US and boom💥 already 1.5 points consumed, and it was just the start.
We then realized there were still more challenges ahead...
Technical Challenges
The application was already fetching the terms of use from the third-party CMS service, to display them on demand. We knew a webhook system existed in the CMS service, calling whatever URL we provided when the content changes. So we could easily know when the document would be updated.
That's without considering how everything was set up in the third-party service. You can edit and version the documents on different branches, just like with any Version Control System (VCS). But since the people editing those documents are not IT people, they just publish everything to a single branch, master. Guess what? Webhooks are branch-based on the CMS. So we can only configure one webhook... But we need to configure these webhooks for 4 different environments (dev, integration, staging, production). Impossible: editing the TOS on the CMS side can only trigger the webhook in 1 environment.
Webhook: not possible! So we had to investigate further possibilities. Since our third-party service caches requests, we decided to fetch the terms of use every time we needed them (💥 2.5 points).
This is the kind of surprise that happens all the time in programming. In retrospect, it's very unlikely that we could have foreseen this one with longer preparation. Only by trying the first solution (webhooks) did we discover that it was impractical. We benefited a lot from the Agile process here (start early, choose a better solution when blocked), while a waterfall process would have pushed the delay even further.
Functional Challenges
Now that we had up-to-date terms of use from the CMS, how should we display them?
- Should we only display them when the user logs in?
- What happens if there are new terms of use when the user is already logged in?
- Should we log the user off before any new request so they need to accept the new terms of use?
- What if they are submitting a form? Do we force them off at the risk of losing data?
After some chat with our product owner, we decided that the best solution would be to pause pending actions, ask users to accept the new terms of use, and automatically resume after acceptance.
If you're into web development, you know that pausing HTTP requests and resuming them is not an easy problem to tackle.
Anyway, we went that way and it started working (💥 4 points).
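To give an idea of what pausing and resuming involves, here is a minimal sketch of a gate that holds back requests until the user accepts the new terms. It is not the code we wrote: the AcceptanceGate class and its method names are hypothetical, and the sketch ignores error handling and timeouts.

```typescript
type Task<T> = () => Promise<T>;

// Hypothetical helper: holds tasks while the gate is closed, resumes them on open().
class AcceptanceGate {
  private closed = false;
  private waiters: Array<() => void> = [];

  // Call when new terms of use are detected.
  close(): void {
    this.closed = true;
  }

  // Call once the user has accepted the new terms: wakes up every paused task.
  open(): void {
    this.closed = false;
    const waiters = this.waiters;
    this.waiters = [];
    waiters.forEach((resume) => resume());
  }

  // Number of tasks currently paused.
  pendingCount(): number {
    return this.waiters.length;
  }

  // Runs the task right away when the gate is open, otherwise after open().
  async run<T>(task: Task<T>): Promise<T> {
    while (this.closed) {
      await new Promise<void>((resolve) => this.waiters.push(resolve));
    }
    return task();
  }
}
```

Every HTTP call would then go through something like gate.run(() => fetch(url)). Note that, from the caller's point of view, the time spent waiting at the gate counts as part of the request duration.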
These kinds of corner cases could have been discovered during the planning session, but they are often hidden until the implementation starts. Once again, direct interaction with the PO is a very efficient tool to converge quickly.
More Technical Challenges
But there was one more thing we didn't expect. Since the application also runs on mobile devices, we had recently implemented a slow connection detector. Pausing a request to wait for a TOS agreement triggered the slow network detector. We didn't find any quick and easy way to overcome this without rewriting a lot of the REST logic on the client side. And we were already slipping pretty badly on this user story.
So we went back to our product owner to explain our technical difficulties. Since we had already spent enough time exploring possibilities, we decided to go back to basics. What trade-offs could we accept for a simpler implementation?
We ended up with the simplest solution we could imagine: fetch the terms of use asynchronously once a day, and just disconnect users so they are forced to accept them again when logging in. There are two drawbacks: modifications to the terms of use only take effect the next day, and users may lose their current action. But the PO was ready to live with these limitations.
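A minimal sketch of that daily check could look like this. Everything named here (TermsCheckDeps, shouldDisconnect, checkTermsOnce, scheduleDailyCheck) is hypothetical, and the calls to the CMS and to the session store are injected as plain functions rather than real APIs. The sketch also assumes that the very first check only records the publication date without disconnecting anyone.

```typescript
// Hypothetical collaborators, injected so the sketch does not depend on a real CMS or session store.
interface TermsCheckDeps {
  fetchLastPublishedAt: () => Promise<Date>;
  getKnownPublishedAt: () => Date | null;
  saveKnownPublishedAt: (date: Date) => void;
  disconnectAllUsers: () => void;
}

// Pure decision: disconnect only when a newer publication date shows up.
// Assumption: on the very first check (nothing recorded yet) we only record the date.
function shouldDisconnect(known: Date | null, latest: Date): boolean {
  return known !== null && latest.getTime() > known.getTime();
}

// One check: compare the CMS date with the recorded one, then act on the result.
async function checkTermsOnce(deps: TermsCheckDeps): Promise<void> {
  const latest = await deps.fetchLastPublishedAt();
  const known = deps.getKnownPublishedAt();
  if (shouldDisconnect(known, latest)) {
    deps.disconnectAllUsers();
  }
  if (known === null || latest.getTime() > known.getTime()) {
    deps.saveKnownPublishedAt(latest);
  }
}

const ONE_DAY_MS = 24 * 60 * 60 * 1000;

// Repeats the check once a day; a failed check is logged and retried the next day.
function scheduleDailyCheck(deps: TermsCheckDeps): ReturnType<typeof setInterval> {
  return setInterval(() => {
    checkTermsOnce(deps).catch((error) => console.error("terms check failed", error));
  }, ONE_DAY_MS);
}
```

Disconnected users then go through the regular login flow, which already asks them to accept the terms of use when needed.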
Reducing scope to meet a deadline, or to maintain a good benefit/cost ratio, is one of the main benefits of agility. We're glad our Product Owner was already used to dropping features instead of putting more pressure on the dev team to reach the initial scope.
After some reverting, a few lines of code, and a whole lot of tests to turn green, development was finally over! (💥 over 5 points)
Conclusion
So what happened? We clearly didn't have enough functional and technical insight when estimating the user story. But development is a really complicated thing to predict. We are left with a choice: fail to deliver something that fits the needs, or accept that we will discover the difficulty bit by bit and adapt as fast as possible.
At Marmelab, we clearly made the choice of agility. We endlessly try to adapt to an ever-changing situation. We believe that with a lot of communication, real-time availability, fast feedback, and fast decision cycles, we can decrease wasted time and deliver high-quality products.
In this particular case, communication made our user story evolve into a better-fitting solution. Fast feedback helped us realize our difficulties sooner. And fast decision cycles let us adapt the user story to the time the customer was willing to invest in it.
We are lucky to have customers who are already sold on agility, and who make efforts to be available every day as much as possible. But it's also the role of our scrum masters to enforce agility, and to make our customers realize we need them as much as they need us in order to deliver awesome code.
What would have happened in a pipelined / waterfall process? Would the solution have fit the user needs? How long would it have taken in the end? Fortunately, we don't need to answer these questions - all our projects are 100% agile.