This report presents work undertaken by CEH and the Met Office under the “Rainfall and River Flow Ensemble Verification” project, commissioned by the Flood Forecasting Centre on behalf of the Scottish Environment Protection Agency, Environment Agency and Natural Resources Wales. A Prototype Framework for joint river flow and precipitation ensemble verification is developed and its use demonstrated on example verification periods and case study storms. This Framework constitutes a set of recommended metrics (scores and diagrams) for verifying ensemble forecasts of precipitation and river flow, along with consideration of their application to the forecast and observation datasets involved. The work provides a foundation for the follow-on FCERM R&D Project SC150016 entitled “Improving confidence in Flood Guidance through verification of rainfall and river flow ensembles”.

The proposed Prototype Framework is presented, with details given of the chosen verification metrics along with supporting data requirements. Aspects requiring coordination between the hydrological and meteorological components of the study are discussed, including the use of thresholds, accumulation periods for precipitation, and lead-time considerations. Sources of precipitation verification truth data (radar- and raingauge-based) and their effect on the verification analyses are explored, and the Rank Histogram is found to be particularly sensitive. Daily precipitation accumulations are used to obtain an upper bound on precipitation forecast skill, whilst hourly accumulations expose the effect of timing uncertainties and allow a closer link to the river flow verification made at a 15-minute time-step.

To provide an overview of ensemble performance, verification statistics are calculated at national and regional scales.
Overall, for the 32-day verification periods considered here, it is found that probabilistic forecasts derived from both the river flow and the precipitation accumulation ensembles tend to be over-confident (the forecast probabilities are higher than the observed frequency of occurrence suggests). Whilst the river flow ensemble is under-spread according to the Rank Histogram, the outcome is less clear for the precipitation ensembles and depends on the type of truth data used. Considering the ROC Skill Score, both river flow and precipitation accumulation ensembles show good potential skill. Threshold-based verification scores are regionally dependent for both river flow and precipitation accumulations, although the details of these dependencies vary with ensemble type and verification metric. Overall, ensemble skill is lower in the drier regions towards the southeast of England.

Maps of verification scores for individual catchment sites provide a visual overview of ensemble performance. However, using information only for an individual catchment, particularly for rare events, is found to give analyses dominated by sampling uncertainty. This is the case for river flow when using return period thresholds of interest for flood forecasting. To obtain a meaningful verification analysis at the scale of individual catchments, river flow data are pooled by catchment area within a given region.

A prototype ensemble verification site performance summary, containing relevant verification information for a specific site of interest, is presented. This summary brings together different verification scores and diagrams for both river flow and precipitation accumulations. Verification information from each site summary can be incorporated directly into the forecasting process. Possible methods of displaying this information are presented, with examples given for river flow over specific flood-producing case-study storms.
These displays are shown alongside those placing the ensemble uncertainty in the context of the climatological ensemble spread. Thus it is demonstrated how, with appropriate understanding of sampling uncertainties, relevant verification information can be used to give an informed interpretation of the quantitative likelihood from an ensemble forecast.

The main findings of the study are summarised as a set of key conclusions, with sampling uncertainty identified as a major consideration for meaningful verification; it is influenced by threshold choice, period length and flooding regime. Building on the Prototype Framework, and making use of the demonstration verification findings, recommendations are made for the possible characteristics of an ensemble verification system. As this is a foundation report guiding a follow-on R&D project of greater depth, areas requiring further consideration are identified. These aim to align future research with the development of robust and effective verification systems having real operational value to flood forecasting, guidance and warning.
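
As an illustration of the Rank Histogram diagnostic referred to in this summary, the following minimal Python sketch counts where each observation falls among the sorted ensemble members. It is not the project's implementation: the function names (rank_histogram, synthetic_cases), the synthetic Gaussian data and the choice of nine members are all assumptions made purely for illustration, and ties (such as zero-rainfall cases) are not handled.

```python
import random


def rank_histogram(ensembles, observations):
    """Return counts of the observation's rank within each ensemble.

    With n members there are n + 1 possible ranks (0 = below every member,
    n = above every member). Ties are ignored, which is acceptable for
    continuous synthetic data but not for real zero-rainfall cases.
    """
    n_members = len(ensembles[0])
    counts = [0] * (n_members + 1)
    for members, obs in zip(ensembles, observations):
        # Rank = number of members strictly below the observation.
        rank = sum(1 for m in members if m < obs)
        counts[rank] += 1
    return counts


def synthetic_cases(n_cases, n_members, ensemble_sd, seed):
    """Hypothetical data: observations ~ N(0, 1), members ~ N(0, ensemble_sd)."""
    rng = random.Random(seed)
    observations = [rng.gauss(0.0, 1.0) for _ in range(n_cases)]
    ensembles = [[rng.gauss(0.0, ensemble_sd) for _ in range(n_members)]
                 for _ in range(n_cases)]
    return ensembles, observations


# ensemble_sd = 1.0 matches the spread of the observations; 0.5 is under-spread.
for label, sd in [("well spread", 1.0), ("under-spread", 0.5)]:
    ens, obs = synthetic_cases(n_cases=5000, n_members=9, ensemble_sd=sd, seed=1)
    print(label, rank_histogram(ens, obs))
```

In the under-spread case the counts accumulate in the two outermost bins, giving the U-shaped histogram conventionally read as under-dispersion, whereas the well-spread case gives an approximately flat histogram. With few forecast cases the bin counts become noisy, which is one simple way of seeing the sampling uncertainty emphasised above.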