Validate the completeness and data coverage of a short-term dataset
Legislative statistics often require a minimum data coverage. This function
takes short-term data, pads it with missing values to create a complete time
series, and calculates a data capture percentage per year, pollutant and
site. It may be prudent to run this function after
validate_mod_obs_pairs(). pad_data(), which performs the data padding
without the data coverage validation step, is also exported.
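
The padding and data capture calculation described above can be sketched in base R. This is only an illustration of the idea for a single site and pollutant, not the package's implementation; the column names (site, pollutant, date, value) and the example data are assumptions made for the sketch.

```r
# Complete hourly sequence for one year (2023 has 8760 hours).
full_hours <- seq(
  as.POSIXct("2023-01-01 00:00", tz = "UTC"),
  as.POSIXct("2023-12-31 23:00", tz = "UTC"),
  by = "hour"
)

# Hypothetical observations with gaps: only every other hour is present.
obs <- data.frame(
  site = "A",
  pollutant = "no2",
  date = full_hours[seq(1, length(full_hours), by = 2)],
  value = 10
)

# Pad: place the observations on the complete sequence so gaps become NA.
padded <- data.frame(site = "A", pollutant = "no2", date = full_hours)
padded$value <- obs$value[match(padded$date, obs$date)]

# Data capture: proportion of non-missing values in the padded year.
data_coverage <- mean(!is.na(padded$value))
data_coverage
```

Here half of the hours are missing, so the capture is 0.5, which would fall below the default min_coverage of 0.75.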
Arguments
- data
A short-term data.frame with columns defined by dict.
- resolution
One of "guess", "hourly" or "daily". If "guess", the function will attempt to guess the resolution of the data. Note that only hourly and daily data are supported.
- min_coverage
The minimum data coverage, expressed as a decimal (i.e., this option should be between 0 and 1, representing 0% and 100%). Defaults to 0.75 (75%).
- mode
One of "fix" (the default), "error" or "warn". If "fix", the function will return a data frame where site-year-pollutant combinations with low data capture have been removed. If "error" or "warn", the function will either error or warn if low data capture is detected. Regardless, a data capture column data_coverage will be appended to data.
- dict
See mqo_dict() for more information. Acts as a data dictionary to specify the columns in the data {mqor} should use.
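
As a rough illustration of how the three mode options differ, the sketch below applies a min_coverage threshold to a made-up coverage table. The helper apply_mode() is hypothetical and not part of {mqor}; whether the real comparison is strict (below the threshold) or inclusive is also an assumption here.

```r
# Hypothetical coverage table; values are made up for illustration.
coverage <- data.frame(
  site = c("A", "B", "C"),
  data_coverage = c(0.92, 0.75, 0.40)
)

# Illustrative helper, not an {mqor} function.
apply_mode <- function(x, min_coverage = 0.75,
                       mode = c("fix", "error", "warn")) {
  mode <- match.arg(mode)
  # Assumption: a combination fails only if strictly below the threshold.
  low <- x$data_coverage < min_coverage
  if (mode == "fix") {
    # Drop the rows with low data capture.
    return(x[!low, , drop = FALSE])
  }
  if (any(low)) {
    msg <- sprintf(
      "%d combination(s) below %.0f%% coverage.",
      sum(low), 100 * min_coverage
    )
    if (mode == "error") stop(msg) else warning(msg)
  }
  # In "warn" mode (or when nothing is low) the data are returned unchanged.
  x
}

fixed <- apply_mode(coverage, mode = "fix")
```

With this table, "fix" keeps sites A and B, "warn" returns all three rows alongside a warning, and "error" stops.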
See also
Other data utilities:
filter_year(),
mqo_percentile(),
mutate_rolling_mean(),
summarise_daily(),
validate_mod_obs_pairs()