Evaluating data quantity and distributional uncertainties in the rating relationship
The rating curve, commonly referred to as the ‘stage-discharge relationship’, is a fit to concurrent hydrometric measurements of stage and discharge (spot-gauging data). It is widely used by hydrologists as a convenient method of predicting flow from continuous measurement of stage alone. However, uncertainty in rating curves is a constant challenge for hydrometric experts and has a profound impact on the accuracy of the stage-discharge relationship. Uncertainties can arise from several factors, including measurement errors, seasonal differences in stage-discharge relationships and non-constant rating relationships at different stages. Quantifying and understanding rating curve uncertainties is thus vital to maintaining high accuracy in hydrometric measurements, which in turn improves the rating curve.
This thesis applies novel frameworks and a statistical sampling methodology to spot-gauging data records in order to quantify the general uncertainties in UK velocity-area rating curves related to the quantity and distribution (in terms of different stages and seasons) of the spot-gauging data used for the rating curve fit. By fitting rating curves to repeated bootstrapped samples of the spot-gauging data, the impacts of increasing sample size, temporal distribution and the distributional characteristics of the sample are assessed. The main results on variation and uncertainty are expressed in terms of the fitting parameters, the set of parameters produced when fitting the rating curve with the power law equation. They represent the shape of the rating relationship as well as the degree of variation in response to uncertainties within the rating curve. In this thesis, α, the linear component of the rating relationship, is used as a proxy for vegetation growth and seasonal variation, as well as for variation of the rating curve in the low-flow regime. β, on the other hand, is used for long-term variation of channels and as a proxy for variation of the rating curve in the high-flow regime.
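The bootstrap procedure described above can be sketched as follows. This is a minimal illustration and not the thesis's implementation: it assumes a single-segment power law Q = α·h^β with no stage offset, fitted by ordinary least squares in log-log space. The function names (`fit_power_law`, `bootstrap_fits`, `make_synthetic_gaugings`) and the synthetic gauging values are hypothetical.

```python
import math
import random
import statistics

def fit_power_law(stage, discharge):
    """Fit Q = alpha * h**beta by least squares on log Q against log h."""
    x = [math.log(h) for h in stage]
    y = [math.log(q) for q in discharge]
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = math.exp(my - beta * mx)
    return alpha, beta

def bootstrap_fits(stage, discharge, sample_size, n_boot=500, seed=0):
    """Refit the curve to n_boot resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(stage)
    fits = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(sample_size)]
        fits.append(fit_power_law([stage[i] for i in idx],
                                  [discharge[i] for i in idx]))
    return fits  # list of (alpha, beta) pairs

def make_synthetic_gaugings(n=300, alpha=5.0, beta=1.7, noise=0.1, seed=42):
    """Assumed data for illustration only: power law with log-normal scatter."""
    rng = random.Random(seed)
    stage = [rng.uniform(0.2, 3.0) for _ in range(n)]
    discharge = [alpha * h ** beta * math.exp(rng.gauss(0.0, noise))
                 for h in stage]
    return stage, discharge

stage, discharge = make_synthetic_gaugings()
for n in (10, 50, 200):
    fits = bootstrap_fits(stage, discharge, n)
    sd_alpha = statistics.stdev([a for a, _ in fits])
    sd_beta = statistics.stdev([b for _, b in fits])
    print(f"n={n:3d}  sd(alpha)={sd_alpha:.3f}  sd(beta)={sd_beta:.3f}")
```

The spread of the fitted α and β across resamples at each sample size is the kind of quantity the data quantity analysis tracks. Real rating curves are often segmented and include a stage offset, both of which this sketch omits.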
The results show that, for the selected sites, uncertainty due to data quantity decreases initially before plateauing due to the effects of noise. At the national scale, gauging sites typically require 77 data points to reduce uncertainty below 10% and 157 to reduce it below 5% (with 95% confidence). In the context of the existing UK gauge station network, 97% of current UK sites have sufficient data points for <10% parametric deviation in both fitting parameters. However, only 76% of sites have data quantity uncertainties constrained within 5% parametric deviation, meaning that 24% of sites would require significantly more measurements. In addition, data quantity uncertainties may be affected by temporal changes to the rating curve, demonstrated by significant changes to the rating curve parameters. For sites that have changed in the last decade, 39% of total UK sites do not have sufficient data quantity post-change to constrain the uncertainty within a 10% parametric deviation. In other words, a simple lack of measurements is not enough to explain the channel change for those sites.
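One way to read a "data points required" figure is as the smallest sample size at which a chosen quantile of the absolute parametric deviation falls below a threshold. The sketch below illustrates that reading only; it is an assumption about how such a criterion could be computed, not the thesis's procedure. The helper names (`quantile`, `min_sample_size`) are hypothetical, and the deviations are synthetic, generated with a spread that shrinks like 1/√n.

```python
import random

def quantile(values, q):
    """Empirical quantile by linear interpolation between order statistics."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def min_sample_size(deviations_by_n, threshold, confidence=0.95):
    """Smallest n whose `confidence` quantile of |deviation| is below threshold."""
    for n in sorted(deviations_by_n):
        abs_dev = [abs(d) for d in deviations_by_n[n]]
        if quantile(abs_dev, confidence) < threshold:
            return n
    return None  # threshold not reached within the sample sizes tested

# Synthetic relative deviations from a full-data fit (assumed, not real data)
rng = random.Random(1)
deviations_by_n = {
    n: [rng.gauss(0.0, 0.5 / n ** 0.5) for _ in range(2000)]
    for n in range(10, 601, 10)
}

n_10pct = min_sample_size(deviations_by_n, 0.10)
n_05pct = min_sample_size(deviations_by_n, 0.05)
print("n for <10% deviation:", n_10pct)
print("n for <5% deviation:", n_05pct)
```

In practice the deviations would come from bootstrap refits of a site's gaugings relative to the fit on its full record, and the plateau caused by noise means the tighter threshold may not be reachable at every site.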
Within the data quantity analysis, it was found that α has higher variance at low sample sizes but converges more quickly with increasing sample size than β, the parameter that is most influential at high flow. This suggests that, at low sample sizes, the low-flow regime is the more unpredictable. However, as the number of measurements increases, this unpredictability quickly subsides and, at some point, the high-flow regime becomes the dominant source of error in the rating relationship. In addition, in data-scarce environments, discharge estimates are likely to be higher than the values predicted by the converged rating curve because of the over-dominance of scarce high-flow data points. This has implications for flood and drought prediction with limited data points available, which has become more likely with climate change. The findings show that, for flood management with a low sample size, it is always necessary to take precautionary measures; in drought conditions, by contrast, the overestimation of flow is undesirable, as it may compromise decision-making.
Considering the seasonality of spot-gauging data, the results demonstrate that 84% of UK sites exhibit a seasonal uncertainty larger than the equivalent uncertainty due to data quantity at the site. Nationally, these seasonal differences coincide with the summer months (June-September) and result in average shifts in the rating curve equivalent to a change in Manning’s coefficient of 0.0051. The interaction of high-flow and low-flow datasets was also investigated. As more high-flow data points are introduced into a rating curve, the regime dominated by lower flows deviates further and further from the baseline fit (the fit that uses all datasets). As the flow quantile increases, progressively fewer high-flow datasets are needed to constrain it. The impact of the data point distribution on rating curve error was also studied: the maximum possible uncertainty due to any combination of flow data points was 14% at the national level (75th percentile), which has implications for determining possible unseen segmentation patterns in UK rating curves. Finally, the results from the stage distribution of spot-gauging data demonstrate the relative proportions of data below key flow thresholds (Q95, Q70, Q50, Q10, Q05) required to produce a curve representative of the full dataset: 38.18%, 18.34%, 14.11%, 4.52% and 4.15% for Q95, Q70, Q50, Q10 and Q05 respectively. At those percentages, for sites with different dominant flow quantiles, the fitted curve most closely resembles the true rating relationship.
The interaction between the three aspects of this thesis was also studied. The interaction between data quantity and seasonality justified the seasonality analysis, while the seasonality and flow magnitude analysis showed that α is influenced by seasonality and flow magnitude combined, whereas β is predominantly affected by changes in flow magnitude. Finally, the investigation of the interaction between data quantity and flow magnitude addressed the question of data quality versus data quantity, with suggestions made for Q50 (the representative quantile): that is, how much data below the Q50 flow is needed to fully constrain the deviation of the rating curve shape and its fitting parameters caused by a lack of data points. To be conservative, 75% of data points need to be
The findings of this thesis provide guidance on the data required to reduce uncertainty in rating curves. This guidance can be used to optimise the frequency and timing of spot-gauging data collection, and it provides context for the general uncertainties of rating curves within channel evolution.
Funding
Loughborough ABCE
School
- Architecture, Building and Civil Engineering
Publisher
Loughborough University
Rights holder
© Jiajun Li
Publication date
2022
Notes
A Doctoral Thesis. Submitted in partial fulfilment of the requirements for the award of the degree of Doctor of Philosophy of Loughborough University.
Language
- en
Supervisor(s)
Tim Marjoribanks ; Louise Slater ; Ian Pattison
Qualification name
- PhD
Qualification level
- Doctoral