Sunday, 27 October 2019

How many iterations are required


Some people believe that increasing the number of iterations in a Monte Carlo simulation will give them much more accurate results. But remember that the accuracy of the results depends on the accuracy of the input data or, in our case, on how we define the statistical distributions for task cost, duration, and other parameters. Let’s assume that you estimate the low and high durations of a task in a software development project. Even if you have performed this task a few times, you cannot tell whether the duration is between 4 and 6 days or between 3.5 and 6.5 days. In this case, the uncertainty of the estimate is greater than 10%. This uncertainty differs between projects and between tasks. Sometimes it may be 1-5% if you maintain records and do repeatable projects. However, sometimes it is a very significant number. If everybody estimated project duration very accurately, we would not need any analysis and you would not need to read this book.


Standard deviation of project duration vs. number of iterations.



Mean of project duration vs. number of iterations.

Now let’s take a look at how much accuracy additional Monte Carlo iterations would add. When we perform a Monte Carlo simulation, we calculate the mean and standard deviation of project duration. Here are the results for the standard deviation and mean of project duration for a very small real software development project (Figures 5.13 and 5.14).
As you can see, when the number of iterations is small, there is a significant difference between the results of the current and previous iterations. However, after a few hundred iterations the difference is reduced significantly. In fact, the difference between the standard deviation for the 500-iteration calculation and the 550-iteration calculation is only 0.4%. The difference between the means for the 500-iteration and 550-iteration cases is even smaller: 0.04%. Remember that the accuracy of our input uncertainties is around 10%? We found that in the real world it does not make sense to do many iterations. For most schedules, 300-500 iterations will be the optimal number. There are two cases where you would need to do more iterations:
1. You have very rare events that you would like to capture in your schedule risk analysis, for example an earthquake with a probability of 0.01% over the duration of the project. In this case you should run at least 10,000 iterations.
2. It is important for you to monitor the results for “extreme percentiles”, for example P1 or P99. If your input distributions have low probability in the tails, such as the triangular distribution, you may need to increase the number of iterations to get more accurate simulation results.
Modern schedule risk analysis software can perform Monte Carlo simulations quite fast. Results can differ considerably between schedules, hardware, and software packages, but we can give you some idea. It may take 30 seconds to 1 minute to run 2,000 iterations for a 5,000-task schedule on an average computer. Performance problems may occur with large integrated schedules with tens of thousands of tasks if you attempt to do very many iterations.
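The arithmetic behind the rare-event case can be sketched in a few lines of Python. This is only an illustration, assuming that the event is sampled independently in every iteration; the function names are hypothetical and do not come from any particular risk analysis package. It shows why 10,000 iterations is a minimum rather than a comfortable number for a 0.01% event: on average the event is sampled once, and there is still a sizeable chance that it is never sampled at all.

```python
import math

def prob_event_sampled(p_event, n_iterations):
    """Probability that the event occurs in at least one of n independent iterations."""
    return 1.0 - (1.0 - p_event) ** n_iterations

def iterations_for_confidence(p_event, confidence):
    """Smallest number of iterations for which the event is sampled at least
    once with the given confidence (assumes independent iterations)."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_event))

p_earthquake = 0.0001  # 0.01% over the duration of the project
n = 10_000

expected_occurrences = p_earthquake * n                 # average count across the run
chance_seen = prob_event_sampled(p_earthquake, n)       # chance of seeing it at least once
n_for_95 = iterations_for_confidence(p_earthquake, 0.95)

print(f"expected occurrences in {n} iterations: {expected_occurrences:.2f}")
print(f"chance the event is sampled at least once: {chance_seen:.1%}")
print(f"iterations needed for a 95% chance: {n_for_95}")
```

The same reasoning applies to extreme percentiles: P1 and P99 are determined by roughly 1% of the iterations, so with a few hundred iterations they rest on only a handful of samples.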
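If you want to check the convergence described above on your own numbers, a rough experiment is easy to set up. The sketch below is not the schedule behind the figures in this section; it assumes a hypothetical serial schedule of three tasks with triangular duration distributions and compares the mean and standard deviation after 500 and after 550 iterations. The task estimates, the seed, and the function names are all illustrative assumptions.

```python
import random
import statistics

# Hypothetical serial schedule: (low, most likely, high) task durations in days.
TASKS = [(4.0, 5.0, 6.0), (2.0, 3.0, 5.0), (8.0, 10.0, 14.0)]

def simulate_durations(n_iterations, seed=1):
    """Return one sampled project duration per Monte Carlo iteration."""
    rng = random.Random(seed)  # fixed seed so the experiment is repeatable
    return [
        # random.triangular takes (low, high, mode), in that order
        sum(rng.triangular(low, high, mode) for low, mode, high in TASKS)
        for _ in range(n_iterations)
    ]

def relative_change(durations, n_prev, n_curr, stat):
    """Relative change of a statistic between the first n_prev and n_curr iterations."""
    prev = stat(durations[:n_prev])
    curr = stat(durations[:n_curr])
    return abs(curr - prev) / prev

durations = simulate_durations(550)
mean_change = relative_change(durations, 500, 550, statistics.mean)
std_change = relative_change(durations, 500, 550, statistics.stdev)

print(f"change in mean, 500 vs 550 iterations: {mean_change:.3%}")
print(f"change in standard deviation, 500 vs 550 iterations: {std_change:.3%}")
```

The exact percentages depend on the schedule and the seed, but they can be compared directly with the uncertainty of the input estimates to judge whether more iterations are worth the run time.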