@misc{15692, author = {Morten Welde and Magne J{\o}rgensen and Per Larssen and Torleif Halkjelsvik}, title = {Estimering av kostnader i store statlige prosjekter: Hvor gode er estimatene og usikkerhetsanalysene i KS2-rapportene?}, abstract = {The external quality assurance scheme for large government investment projects (the QA scheme / the state project model) aims, among other things, to ensure that budgets are realistic and that the risk analyses of the cost estimates reflect real cost uncertainty. The extent to which budgets, estimates and risk analyses are realistic, and where there may be potential for improvement, are the main themes of this study. Chapter 1 describes the background and motivation for the study. The starting point is that the Concept research programme collects final costs in projects that have been through QA2 (quality assurance of the cost estimate before the parliament{\textquoteright}s investment decision). This provides a basis for studies of cost performance. As the sample of projects increases, more detailed studies of the estimates that formed the basis for the parliament{\textquoteright}s investment decision become possible. The study has three main topics: (1) the realism of the projects{\textquoteright} budgets, (2) the realism of the point estimates in the QA2 reports, and (3) the realism and information value of the prediction intervals and estimate distributions. Chapter 2 provides a review of previous studies of cost performance in projects that have been through QA2. They all show relatively good results, both in terms of deviations from budgets and in terms of risk assessments. While average cost overruns reported in international studies have typically been around 30 per cent, Norwegian studies report average overruns of between two and six per cent. Other studies also typically report a strong underestimation of uncertainty.
The P50 and P85 estimates from the QA2 reports, on the other hand (that is, estimates that are not expected to be exceeded in 50 and 85 per cent of cases, respectively), seem to have been reasonably well calibrated. However, several authors have pointed out that final costs relative to the budgets have been somewhat higher than assumed at the time of the investment decision. The data used in the study, which are described in Chapter 3, are based on a larger sample of projects than previous studies. The analyses focus more on the estimates than previous studies have done. The analysis of the P50 and P85 estimates is based on samples of 83 and 85 projects respectively. Sufficient data for our analysis of the cost estimates were found for 70 of these projects. In Chapter 4, we outline detailed research questions and the methodology for the analyses. In doing so, we motivate and indicate, based on the latest research in the area, how probability-based cost estimates should be evaluated. We introduce an analysis of estimate deviations and estimation bias based on what constitutes a reasonable "loss function", where the loss function is what we attempt to minimise in the estimates. We evaluate the extent to which the real uncertainty of projects has been successfully estimated ex ante. We also assess how informative the prediction intervals and estimate distributions have been. We argue that well-calibrated probability-based estimates (e.g., that 50 per cent of P50 estimates are not exceeded) are not a sufficient evaluation criterion. In addition, we need indicators of how informative the probability-based estimates have been. In Chapter 5, we find that the median deviation between actual costs and the P50, measured as absolute percentage deviation, is 10 per cent (mean = 12.5 per cent), and that the median deviation from the P85 is 1.5 per cent (mean = 3.4 per cent).
In other words, across all the projects there is only a slight tendency towards overruns, much lower than what has been reported in international studies. Over time, however, there has been a somewhat worrying development. While there was a tendency towards cost underruns in the past (an average underrun of the P50 of 6 per cent for projects with an investment decision between 2001 and 2003), there has been a tendency towards cost overruns in later years (an average overrun of 12 per cent in the period 2010-2012). Given well-calibrated estimates, the actual cost should be below the P50 in about 50 per cent of cases and below the P85 in about 85 per cent of cases. However, we find that this applies in only 40 per cent of cases for the P50 and 73 per cent for the P85. The shares have been declining over time. While in 2001-2003, 62 and 100 per cent of projects were within the P50 and P85 respectively, in 2010-2012 only 21 and 43 per cent were within, albeit based on a smaller sample than in the earlier periods. The reason why the hit rates for the P50 and the P85 for all projects taken together are not far from the intended targets is that estimation has shifted from overestimation to underestimation, so the errors partly cancel out across the full sample. The tendency towards underestimation should be reversed through better estimation and governance in future projects. The analyses of the estimates in Chapter 6 find about the same degree of overruns and estimate deviations for the P50 and P85 estimates as those reported in Chapter 5. The P50 estimates showed a median estimation bias of -1 per cent (mean = 3 per cent). The median percentage deviation (regardless of sign) was 12 per cent (mean = 14 per cent). We calculated that the expected deviation from the P50 budget could not be less than 8-10 per cent, given certain assumptions, including that projects do not adapt deliveries to reduce deviations. Although the latter assumption is hardly met, this calculation suggests that the deviations are not particularly high.
We observe that there is typically a reduction from estimate to budget. The P50 budget was on average seven per cent lower than the P50 estimate, and the P85 budget seven per cent lower than the P85 estimate. Although several projects should have retained the original P50 and P85 estimates as the P50 and P85 budgets, respectively, we did not find that the adjustments reduced realism overall. Many of the adjustments seem to be well justified. The estimates in the QA2 reports include point estimates, prediction intervals and estimate distributions (S-curves). Our analyses cover all of these; the main findings are as follows: The estimate distributions and prediction intervals are typically too narrow to reflect actual uncertainty. For example, as many as 19 per cent of the projects have a lower cost than the P10 estimate, and 20 per cent a higher cost than the P90 estimate. Future estimation should take into account that the range of possible project costs is broader than has typically been assumed. Estimated cost uncertainty, measured by the width of the prediction interval and estimate distribution, does not correlate with actual cost uncertainty, measured by cost deviations and overruns. This indicates a low ability to distinguish between projects with high and low cost uncertainty. If we become better at identifying the high-risk projects, we could potentially reduce the need for risk contingency without compromising cost performance and project execution. We show, given some assumptions, that the P85 could be 17 per cent lower if the ability to distinguish between low- and high-risk projects had been better. Measures to improve this capability should be given priority in the estimation work. There are differences in estimation performance between agencies and between the consultancies carrying out the external QA.
Defence projects stand out by having a strong tendency to overestimate costs (their average underrun of the P50 estimate is 19 per cent) and overly narrow prediction intervals (29 per cent of projects within the 80 per cent prediction interval). The Norwegian Public Roads Administration also tends to estimate too narrow prediction intervals (57 per cent of projects within the 80 per cent prediction interval). Among the QA consultancies, there are no major differences in estimate deviations, but larger differences in how realistically the uncertainty is estimated. Differences in project complexity or other factors may explain these differences. Given the inability to distinguish between low- and high-risk projects in the estimation work, a simple mechanical mark-up model could in theory do just as well as the more demanding QA2 estimation work. We investigated this using mark-ups based on historical estimate deviations, but found that the QA2 estimates did better. This indicates that the QA2 estimation work provides added value compared with simple mark-up models. In Chapter 7, we summarise and discuss the findings. Overall, the main conclusions are that the QA2 framework is useful and that cost estimates appear to be realistic and reasonably well calibrated. However, developments over time are worrying and should lead to improvements in the estimation work. Two major areas for improvement are to specify broader estimate distributions, that is, to recognise that cost uncertainty is typically greater than has previously been identified in the estimation work, and to better distinguish between projects with low and high cost uncertainty.}, year = {2019}, journal = {Concept-rapport nr. 59}, publisher = {Ex ante akademisk forlag}, address = {Trondheim}, issn = {0803-9763}, isbn = {978-82-93253-81-5}, }
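The calibration check the abstract describes (comparing the share of projects whose actual cost stays within the P50 and P85 estimates against the nominal 50 and 85 per cent) can be sketched as follows. This is an illustrative example only: the `hit_rate` function and all project figures are hypothetical, not the report's method or data.

```python
# Illustrative sketch (hypothetical data): checking calibration of
# probabilistic cost estimates. In a well-calibrated sample, the share of
# projects whose actual cost does not exceed the P50 (P85) estimate
# should be close to 50 (85) per cent.

def hit_rate(actuals, quantile_estimates):
    """Share of projects whose actual cost did not exceed the estimate."""
    hits = sum(1 for a, q in zip(actuals, quantile_estimates) if a <= q)
    return hits / len(actuals)

# Hypothetical projects: (actual cost, P50 estimate, P85 estimate), in MNOK
projects = [
    (110, 100, 125), (95, 100, 120), (140, 120, 150),
    (80, 90, 110), (130, 115, 135), (105, 100, 118),
]
actuals = [p[0] for p in projects]
p50s = [p[1] for p in projects]
p85s = [p[2] for p in projects]

p50_rate = hit_rate(actuals, p50s)  # ideally close to 0.50
p85_rate = hit_rate(actuals, p85s)  # ideally close to 0.85
print(f"P50 hit rate: {p50_rate:.0%}, P85 hit rate: {p85_rate:.0%}")
```

A hit rate well below nominal at the P50 indicates underestimation (overruns), as the report finds for the later cohorts; rates at or above nominal at the P85 alone do not establish that the intervals are informative, which is why the report also evaluates interval width.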