
Overall takeaways

There is substantial under-reporting behavior that is fairly consistent within a single geo. We can probably improve our forecasts by modeling revision behavior. Further, flu and covid revision behavior is fairly strongly correlated; both are reported through the same channels by the same people, so this makes sense. We should look into the extra columns to see whether they provide useful information for handling revision behavior.

Flu revision

NHSN

Revision behavior

First we get the archive, remove data older than the first version so as not to clog up the revision behavior, and display the overall revision summary¹.

# Load the production archive, keep only data from the first version onward,
# drop the geos we don't forecast, and summarize the revision behavior.
nhsn_archive_flu <- qs2::qs_read(here::here("flu_hosp_prod", "objects", "nhsn_archive_data"))
nhsn_archive_flu <- nhsn_archive_flu$DT %>%
  filter(time_value >= "2024-11-19", geo_value %nin% c("vi", "as", "gu")) %>%
  as_epi_archive()
nhsn_archive_flu$time_type <- "day"
revision_summary <- nhsn_archive_flu %>%
  epiprocess::revision_analysis(value, min_waiting_period = NULL)
revision_summary %>% print(quick_revision = 7)
## 
## ── An epi_archive spanning 2024-11-23 to 2025-05-03. ──
## 
## ── Min lag (time to first version):
##      min median     mean     max
##   4 days 4 days 4.2 days 25 days
## 
## ── Fraction of epi_key + time_values with
## No revisions:
## • 205 out of 1,282 (15.99%)
## 
## Quick revisions (last revision within 7 days of the `time_value`):
## • 357 out of 1,282 (27.85%)
## 
## Few revisions (At most 3 revisions for that `time_value`):
## • 925 out of 1,282 (72.15%)
## 
## 
## 
## ── Fraction of revised epi_key + time_values which have: 
## 
## Less than 0.1 spread in relative value:
## • 604 out of 1,077 (56.08%)
## 
## Spread of more than 2719.05 in actual value (when revised):
## • 7 out of 1,077 (0.65%)
## 
## 
## 
## ── Days until within 20% of the latest value:
##      min median     mean      max
##   4 days 4 days 6.7 days 160 days

So about a sixth have no revisions, around a quarter resolve within a week, and a bit under three-quarters have at most a few revisions. Just over half of those with revisions have little relative change (under 10% of the max value). The “actual value” spread isn’t really worth dwelling on because this is count data (so there being only 7 such cases doesn’t tell us much).

Here’s a plot of the version changes for all locations.

text_size <- 6
nhsn_archive_flu %>%
  autoplot(value) +
  theme(
    strip.text.x = element_text(size = text_size, margin = margin(.1, 0, .1, 0, "cm")),
    axis.text = element_text(size = text_size, angle = 45),
    legend.title = element_text(size = text_size),
    legend.text = element_text(size = text_size),
    legend.key.size = unit(0.5, "cm")
  ) +
  scale_size_manual(values = c(0.5))

Since this is probably too small to actually be legible, let’s figure out the states with the worst revision behavior and plot those, where “worst” means the largest average relative spread:

av_re_spread <- revision_summary$revision_behavior %>%
  group_by(geo_value) %>%
  summarize(rel_spread = mean(rel_spread, na.rm = TRUE)) %>%
  arrange(desc(rel_spread)) %>%
  filter(geo_value %nin% c("vi", "as", "gu"))
av_re_spread %>%
  print(n=20)
## # A tibble: 54 × 2
##    geo_value rel_spread
##    <chr>          <dbl>
##  1 pr             0.371
##  2 ok             0.266
##  3 az             0.263
##  4 ar             0.220
##  5 nh             0.218
##  6 nm             0.217
##  7 id             0.215
##  8 dc             0.207
##  9 mo             0.205
## 10 or             0.175
## 11 mt             0.155
## 12 ri             0.145
## 13 ak             0.142
## 14 sd             0.134
## 15 md             0.128
## 16 ga             0.118
## 17 mn             0.117
## 18 ky             0.115
## 19 va             0.113
## 20 hi             0.111
## # ℹ 34 more rows

The worst 9, excluding the geos we don’t actually forecast (so no as, vi, or gu):

nhsn_archive_flu %>% autoplot(.facet_filter = geo_value %in% av_re_spread$geo_value[1:9]) + theme(strip.text.x = element_text(size = 8))

And the next 9 worst:

nhsn_archive_flu %>% autoplot(.facet_filter = geo_value %in% av_re_spread$geo_value[10:18]) + theme(strip.text.x = element_text(size = 8))

These cover a range of revision behaviors. Some are badly off on a single time_value, such as mn or nh, while most have fairly constant revisioning. az is unusual in having two bad versions, in which the entire history was revised down by a factor of 2 and then reverted to what it had been previously. Most are systematically reporting lower than the actual values, with some, such as nm, nh, ok, and ak, particularly bad about this. It is probably worth adding to revision_analysis (or perhaps a separate function) a key-level measure of how systematic the bias is; these biases seem likely to be estimable beforehand. Exceptions that include at least one case of over-reporting: mn, id, and nh. Those values are not later backfilled; they look more like bad estimates or entry errors.
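As a first cut at such a measure, here is a minimal sketch assuming nothing beyond the archive itself: the mean signed relative difference between each geo’s first-reported and latest values. The name signed_bias and the construction are hypothetical, not anything currently in epiprocess.

# Hypothetical key-level bias measure: mean signed relative difference between
# the first report and the latest value, per geo. Negative values suggest
# systematic under-reporting. A sketch only.
first_reports <- nhsn_archive_flu$DT %>%
  as_tibble() %>%
  group_by(geo_value, time_value) %>%
  slice_min(version, n = 1) %>%
  ungroup() %>%
  select(geo_value, time_value, first_value = value)
latest_values <- nhsn_archive_flu %>%
  epix_as_of_current() %>%
  select(geo_value, time_value, latest_value = value)
first_reports %>%
  inner_join(latest_values, by = c("geo_value", "time_value")) %>%
  group_by(geo_value) %>%
  summarize(signed_bias = mean((first_value - latest_value) / pmax(latest_value, 1), na.rm = TRUE)) %>%
  arrange(signed_bias)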

Data substitutions

And we actually need to compare this revision behavior with the values we manually substituted during the season:

data_substitutions <- readr::read_csv(
    here::here(glue::glue("flu_data_substitutions.csv")),
    comment = "#",
    show_col_types = FALSE
  ) %>%
  mutate(time_value = round_date(time_value, unit = "week", week_start = 6))
nhsn_archive_flu %>%
  autoplot(value, .facet_filter = geo_value %in% unique(data_substitutions$geo_value)) +
  geom_point(data = data_substitutions, aes(x = time_value, y = value)) +
  facet_wrap(~geo_value, scale = "free")

which doesn’t look all that great. To calculate how much closer to (or further from) the final value our substitutions were, we first construct the relevant snapshots:

final_values <- nhsn_archive_flu %>% epix_as_of_current() %>% mutate(time_value = round_date(time_value, unit = "week", week_start = 6))
data_as_it_was <- map(
  data_substitutions$forecast_date %>% unique(),
  \(version) nhsn_archive_flu %>% epix_as_of(version) %>% mutate(forecast_date = version)
) %>%
  list_rbind() %>% mutate(time_value = round_date(time_value, unit = "week", week_start = 6))

and then we compute several notions of difference:

full_table <- data_substitutions %>%
  left_join(
    final_values,
    by = join_by(geo_value, time_value),
    suffix = c("_substituted", "_final")
  ) %>%
  left_join(
    data_as_it_was,
    by = join_by(geo_value, forecast_date, time_value)
  ) %>%
  rename(as_of_value = value)

diffs <- full_table %>%
  mutate(
    abs_diff = value_substituted - value_final,
    rel_diff = abs_diff / value_final,
    rel_rev_diff = abs_diff / (as_of_value - value_final)
  ) %>%
  arrange(rel_diff)
diffs %>%
  select(-forecast_date, -value_substituted, -value_final, -as_of_value) %>%
  print(n = 23)
## # A tibble: 23 × 5
##    geo_value time_value abs_diff rel_diff rel_rev_diff
##    <chr>     <date>        <dbl>    <dbl>        <dbl>
##  1 sd        2025-01-04      -16  -0.327         0.444
##  2 nm        2025-02-15      -77  -0.312         0.658
##  3 nj        2025-03-22      -84  -0.244         0.272
##  4 ok        2025-02-01     -266  -0.238         5.66 
##  5 ok        2025-02-15     -186  -0.237         0.660
##  6 ms        2025-02-01     -117  -0.218       117    
##  7 ca        2025-02-01     -525  -0.119         3.70 
##  8 de        2025-02-01      -10  -0.0714     -Inf    
##  9 nm        2025-02-01      -14  -0.0654        1.4  
## 10 id        2025-02-01       -9  -0.0566       -3    
## 11 ga        2025-02-15      -34  -0.0463        0.105
## 12 hi        2025-03-01        3   0.0448       -0.2  
## 13 az        2025-02-01       43   0.0533        0.683
## 14 nh        2025-03-15       21   0.266       Inf    
## 15 id        2025-01-04       81   0.335        -1.42 
## 16 tn        2025-03-22      122   0.396        -4.88 
## 17 nh        2025-03-22       27   0.509       Inf    
## 18 dc        2025-02-15       47   0.566        -2.47 
## 19 dc        2025-02-22       58   0.806        -2.64 
## 20 dc        2025-02-22       58   0.806        -4.83 
## 21 dc        2025-02-15       77   0.928       Inf    
## 22 dc        2025-02-15       77   0.928       Inf    
## 23 nh        2025-03-29       66   1.94          0.733

The table is sorted by rel_diff.

  • abs_diff gives how much our adjustment was over by (so positive means we were over the latest value).

  • rel_diff gives the difference relative to the latest value; for nh on 03/29 (the last entry in the table), we were over by nearly twice the latest value. Normalizing this way mostly compensates for the differing magnitudes of values across time and geo.

  • rel_rev_diff is the most appropriate to view as a substitution score. It divides abs_diff by how far off the value as of the forecast date was; for nh on 03/29 again, it is merely 73%, so we did bring the value closer to the final one. Any of these which are <1 are “successful” in the sense that we were closer to the latest value than the as_of value was. An infinite value means we adjusted a value that was never subsequently corrected. The sign of rel_rev_diff tells us whether our estimate and the as_of value were on the same side of the latest value (positive) or on opposite sides (negative). A worked example follows this list.
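To make these definitions concrete, here is the nh 03/29 row worked through. The underlying values are approximate back-solves from abs_diff, rel_diff, and rel_rev_diff in the table above, not the actual data.

# Approximate values for nh on 2025-03-29, back-solved from the table:
value_final <- 34        # latest value: abs_diff / rel_diff = 66 / 1.94
value_substituted <- 100 # our substitution: value_final + abs_diff
as_of_value <- 124       # value on the forecast date: value_final + abs_diff / rel_rev_diff
value_substituted - value_final                                 # abs_diff: 66, we were over
(value_substituted - value_final) / value_final                 # rel_diff: ~1.94
(value_substituted - value_final) / (as_of_value - value_final) # rel_rev_diff: ~0.73 < 1, an improvement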

For how many did we substitute a more accurate value?

mean(abs(diffs$rel_rev_diff) < 1)
## [1] 0.3478261

around 35%, so not a great track record overall. How about when we were lower than the target vs higher than the target?

diffs %>%
  mutate(is_over = rel_diff > 0) %>%
  group_by(is_over) %>%
  summarize(fraction_improved = mean(abs(rel_rev_diff) < 1))
## # A tibble: 2 × 2
##   is_over fraction_improved
##   <lgl>               <dbl>
## 1 FALSE               0.455
## 2 TRUE                0.25

So we did somewhat better when our substitution was below the latest value (45% improved) than when it was above (25% improved).

Overall, it turns out our value substitutions did not actually help much for flu.

NSSP

nssp_archive_flu <- qs2::qs_read(here::here("flu_hosp_prod", "objects", "nssp_archive_data"))
nssp_archive_flu <- nssp_archive_flu$DT %>%
  filter(time_value >= "2024-11-19", geo_value %nin% c("vi", "as", "gu")) %>%
  as_epi_archive()
nssp_archive_flu$time_type <- "day"
revision_summary_nssp <- nssp_archive_flu %>%
  epiprocess::revision_analysis(nssp, min_waiting_period = NULL)
revision_summary_nssp %>% print(quick_revision = 7)
## 
## ── An epi_archive spanning 2024-11-20 to 2025-04-30. ──
## 
## ── Min lag (time to first version):
##      min median     mean     max
##   4 days 4 days 4.6 days 25 days
## 
## ── Fraction of epi_key + time_values with
## No revisions:
## • 159 out of 1,200 (13.25%)
## 
## Quick revisions (last revision within 7 days of the `time_value`):
## • 132 out of 1,200 (11%)
## 
## Few revisions (At most 3 revisions for that `time_value`):
## • 827 out of 1,200 (68.92%)
## 
## 
## 
## ── Fraction of revised epi_key + time_values which have: 
## 
## Less than 0.1 spread in relative value:
## • 899 out of 1,041 (86.36%)
## 
## Spread of more than 0.703 in actual value (when revised):
## • 44 out of 1,041 (4.23%)
## 
## 
## 
## ── Days until within 20% of the latest value:
##      min median     mean     max
##   4 days 4 days 4.8 days 25 days

So very few weeks (13%) have no revision at all, but ~70% have only a small number of revisions, and most of those that revise (86%) stay very close to their final values.

av_re_spread <- revision_summary_nssp$revision_behavior %>%
  group_by(geo_value) %>%
  summarize(rel_spread = mean(rel_spread, na.rm = TRUE)) %>%
  arrange(desc(rel_spread)) %>%
  filter(geo_value %nin% c("vi", "as", "gu"))
nssp_archive_flu %>% autoplot(nssp, .facet_filter = geo_value %in% av_re_spread$geo_value[1:9]) + theme(strip.text.x = element_text(size = 8))

nssp_archive_flu %>% autoplot(nssp, .facet_filter = geo_value %in% av_re_spread$geo_value[10:18]) + theme(strip.text.x = element_text(size = 8))

These corrections are much more regular than NHSN’s, so we can likely do backcasting to adjust the values; a minimal sketch of the idea follows.
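The simplest version of such a backcast, as a sketch (the per-geo ratio and the example version date are illustrative choices, not the production correction): scale a snapshot by each geo’s historical average latest-to-first revision ratio.

# Hypothetical ratio backcast: multiply a geo's reports by its average
# latest/first revision ratio. A sketch, not the production method.
backcast_factors <- nssp_archive_flu$DT %>%
  as_tibble() %>%
  group_by(geo_value, time_value) %>%
  summarize(
    first = nssp[which.min(version)],
    latest = nssp[which.max(version)],
    .groups = "drop"
  ) %>%
  group_by(geo_value) %>%
  summarize(backcast_factor = mean(ifelse(first > 0, latest / first, NA), na.rm = TRUE))
nssp_archive_flu %>%
  epix_as_of(as.Date("2025-04-30")) %>% # snapshot on an example forecast date
  left_join(backcast_factors, by = "geo_value") %>%
  mutate(nssp_backcast = nssp * backcast_factor)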

Revision behavior

Correlation with latest

Does NSSP actually correlate better with the latest value than nhsn itself does? This was a property we were relying on through the season to generate our corrections.
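One way we might check this, as a sketch: snapshot both archives on an example forecast date, align NSSP’s Wednesday time_values to NHSN’s Saturday-indexed weeks, and compare per-geo correlations with the latest NHSN values. The version date and the use of Spearman correlation (NSSP is a percentage, NHSN a count) are choices of this sketch, not what ran in production.

# Per geo: does NSSP-as-of or NHSN-as-of track the latest NHSN values better?
version <- as.Date("2025-02-01") # arbitrary example forecast date
align_weeks <- function(x) mutate(x, time_value = round_date(time_value, unit = "week", week_start = 6))
latest_nhsn <- nhsn_archive_flu %>% epix_as_of_current() %>% align_weeks() %>% rename(latest = value)
nhsn_asof <- nhsn_archive_flu %>% epix_as_of(version) %>% align_weeks() %>% rename(nhsn_asof = value)
nssp_asof <- nssp_archive_flu %>% epix_as_of(version) %>% align_weeks() %>% rename(nssp_asof = nssp)
latest_nhsn %>%
  inner_join(nhsn_asof, by = c("geo_value", "time_value")) %>%
  inner_join(nssp_asof, by = c("geo_value", "time_value")) %>%
  group_by(geo_value) %>%
  summarize(
    cor_nhsn = cor(latest, nhsn_asof, method = "spearman", use = "complete.obs"),
    cor_nssp = cor(latest, nssp_asof, method = "spearman", use = "complete.obs")
  )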

Covid revision

And now for roughly the same idea, but for covid.

NHSN

First we get the archive, remove data older than the first version so as not to clog up the revision behavior, and display the overall revision summary².

nhsn_archive_covid <- qs2::qs_read(here::here("covid_hosp_prod", "objects", "nhsn_archive_data"))

nhsn_archive_covid <- nhsn_archive_covid$DT %>% filter(time_value >= "2024-11-19") %>%
  filter(geo_value %nin% c("vi", "as", "gu")) %>% as_epi_archive()
nhsn_archive_covid$time_type <- "day"
revision_summary <- nhsn_archive_covid %>%
  epiprocess::revision_analysis(value, min_waiting_period = NULL)
revision_summary %>% print(quick_revision = 7)
## 
## ── An epi_archive spanning 2024-11-23 to 2025-05-03. ──
## 
## ── Min lag (time to first version):
##      min median     mean     max
##   4 days 4 days 4.2 days 25 days
## 
## ── Fraction of epi_key + time_values with
## No revisions:
## • 202 out of 1,282 (15.76%)
## 
## Quick revisions (last revision within 7 days of the `time_value`):
## • 366 out of 1,282 (28.55%)
## 
## Few revisions (At most 3 revisions for that `time_value`):
## • 975 out of 1,282 (76.05%)
## 
## 
## 
## ── Fraction of revised epi_key + time_values which have: 
## 
## Less than 0.1 spread in relative value:
## • 612 out of 1,080 (56.67%)
## 
## Spread of more than 929.6 in actual value (when revised):
## • 6 out of 1,080 (0.56%)
## 
## 
## 
## ── Days until within 20% of the latest value:
##      min median     mean      max
##   4 days 4 days 6.8 days 160 days

About a sixth have no revisions, a bit under a third resolve within a week, and around three-quarters have at most a few revisions. Just under 60% of those with revisions have little relative change (under 10% of the max value). The “actual value” spread again isn’t worth dwelling on because this is count data (so there being only 6 such cases doesn’t tell us much).

Here’s a plot of the version changes for all locations.

nhsn_archive_covid %>%
  autoplot(value) +
  theme(
    strip.text.x = element_text(size = text_size, margin = margin(.1, 0, .1, 0, "cm")),
    axis.text = element_text(size = text_size, angle = 45),
    legend.title = element_text(size = text_size),
    legend.text = element_text(size = text_size),
    legend.key.size = unit(0.5, "cm")
  ) +
  scale_size_manual(values = c(0.5))

Since this is probably too small to actually be legible, let’s figure out the states with the worst revision behavior and plot those, where “worst” again means the largest average relative spread:

av_re_spread <- revision_summary$revision_behavior %>%
  group_by(geo_value) %>%
  summarize(rel_spread = mean(rel_spread, na.rm = TRUE)) %>%
  arrange(desc(rel_spread)) %>%
  filter(geo_value %nin% c("vi", "as", "gu"))
av_re_spread %>%
  print(n=20)
## # A tibble: 54 × 2
##    geo_value rel_spread
##    <chr>          <dbl>
##  1 pr             0.358
##  2 az             0.291
##  3 dc             0.273
##  4 mo             0.262
##  5 ar             0.243
##  6 ok             0.220
##  7 nm             0.216
##  8 id             0.188
##  9 mt             0.186
## 10 ks             0.175
## 11 nh             0.171
## 12 ri             0.168
## 13 ky             0.161
## 14 hi             0.150
## 15 sd             0.136
## 16 va             0.128
## 17 mn             0.118
## 18 nv             0.114
## 19 or             0.111
## 20 co             0.105
## # ℹ 34 more rows

If you compare this with the average relative spread in the flu section, it is remarkably similar. The worst 9, excluding the geos we don’t actually forecast (so no as, vi, or gu):

nhsn_archive_covid %>% autoplot(.facet_filter = geo_value %in% av_re_spread$geo_value[1:9]) + theme(strip.text.x = element_text(size = 8))

And the next 9 worst:

nhsn_archive_covid %>% autoplot(.facet_filter = geo_value %in% av_re_spread$geo_value[10:18]) + theme(strip.text.x = element_text(size = 8))

Strictly visually, this seems to revise more than the flu data (which doesn’t fit well with the numeric revision analysis, so something odd is going on); perhaps the revisions are more chaotic. mn, instead of having a single time_value wrong in a single version, has its entire trajectory wrong in the same way that az does, possibly in the same version. az has revision behavior similar to its flu counterpart (a couple of weeks where they were wildly off).

nh has similar behavior, badly over-estimating a single specific time_value, but otherwise has the typical under-reporting problems.

dc has some new and wild behavior; this is likely as visually striking as it is because the values are so small.

mo also shows new revision behavior, primarily in the magnitude of the differences.

ok has more extreme under-reporting than in the case of flu, but again this seems likely to be a function of the number of cases, suggesting their under-reporting happens in absolute counts rather than relative terms³.

Like flu, these seem likely to be estimable beforehand.
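One rough way to probe the absolute-versus-relative question from the revision summary alone, assuming its spread and rel_spread columns: if a geo’s under-reporting is a roughly constant count, the absolute spread should be more stable across time_values than the relative spread, and vice versa.

# Sketch: compare per-geo coefficients of variation of the absolute and
# relative spreads. cv_abs < cv_rel hints that revisions behave like a
# constant count; the reverse hints at a constant ratio.
revision_summary$revision_behavior %>%
  group_by(geo_value) %>%
  summarize(
    cv_abs = sd(spread, na.rm = TRUE) / mean(spread, na.rm = TRUE),
    cv_rel = sd(rel_spread, na.rm = TRUE) / mean(rel_spread, na.rm = TRUE)
  ) %>%
  mutate(looks_absolute = cv_abs < cv_rel) %>%
  arrange(cv_abs)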

Data substitutions

And we actually need to compare revision behavior with our estimates of the correct values:

data_substitutions <- readr::read_csv(
    here::here(glue::glue("covid_data_substitutions.csv")),
    comment = "#",
    show_col_types = FALSE
  ) %>%
  mutate(time_value = round_date(time_value, unit = "week", week_start = 6))
nhsn_archive_covid %>%
  autoplot(value, .facet_filter = geo_value %in% unique(data_substitutions$geo_value)) +
  geom_point(data = data_substitutions, aes(x = time_value, y = value)) +
  facet_wrap(~geo_value, scale = "free")

To calculate how much closer (or further) we were from the final value, first we construct the relevant snapshots:

final_values <- nhsn_archive_covid %>% epix_as_of_current() %>% mutate(time_value = round_date(time_value, unit = "week", week_start = 6))
data_as_it_was <- map(
  data_substitutions$forecast_date %>% unique(),
  \(version) nhsn_archive_covid %>% epix_as_of(version) %>% mutate(forecast_date = version)
) %>%
  list_rbind() %>% mutate(time_value = round_date(time_value, unit = "week", week_start = 6))
final_values %>% filter(time_value > "2025-01-03")
## An `epi_df` object, 964 x 3 with metadata:
## * geo_type  = nation
## * time_type = week
## * as_of     = 2025-05-09
## 
## # A tibble: 964 × 3
##    geo_value time_value value
##    <chr>     <date>     <dbl>
##  1 ak        2025-01-04     6
##  2 ak        2025-01-11    14
##  3 ak        2025-01-18     7
##  4 ak        2025-01-25     5
##  5 ak        2025-02-01     9
##  6 ak        2025-02-08    10
##  7 ak        2025-02-15    11
##  8 ak        2025-02-22    10
##  9 ak        2025-03-01    19
## 10 ak        2025-03-08    14
## # ℹ 954 more rows
full_table <- data_substitutions %>%
  left_join(
    final_values,
    by = join_by(geo_value, time_value),
    suffix = c("_substituted", "_final")
  ) %>%
  left_join(
    data_as_it_was,
    by = join_by(geo_value, forecast_date, time_value)
  ) %>%
  rename(as_of_value = value)

diffs <- full_table %>%
  mutate(
    abs_diff = value_substituted - value_final,
    rel_diff = abs_diff / value_final,
    rel_rev_diff = abs_diff / (as_of_value - value_final)
  ) %>%
  arrange(rel_diff)
diffs %>%
  select(-forecast_date, -value_substituted, -value_final, -as_of_value) %>%
  print(n = 23)
## # A tibble: 8 × 5
##   geo_value time_value abs_diff rel_diff rel_rev_diff
##   <chr>     <date>        <dbl>    <dbl>        <dbl>
## 1 ga        2025-02-15      -52  -0.198         0.473
## 2 nj        2025-03-22      -40  -0.129         0.149
## 3 ky        2025-03-22       -5  -0.0370        0.263
## 4 dc        2025-02-15        5   0.385        -1    
## 5 ar        2025-03-22       24   0.393        -0.8  
## 6 ak        2025-03-22        5   0.5          -1.67 
## 7 nh        2025-01-11       64   0.744       -10.7  
## 8 sd        2025-03-01       15   0.75         -3.75

The table is sorted by rel_diff.

  • abs_diff gives how much our adjustment was over by (so positive means we were over the latest value).

  • rel_diff gives the difference relative to the latest value.

  • rel_rev_diff on the other hand divides that by how far off the value as of the forecast date was. Any of these which are <1 are “successful” in the sense that we were closer to the latest value than the as_of value was. An infinite value means we adjusted a value that was never subsequently corrected. The sign of rel_rev_diff again tells us whether our estimate and the as_of value were on the same side of the latest value (positive) or on opposite sides (negative).

We did fewer substitutions for covid, but they were more consistently good, and there were no infinite values, i.e. no cases where we adjusted a value that was never corrected.

For how many did we substitute a more accurate value?

mean(abs(diffs$rel_rev_diff) < 1)
## [1] 0.5

50%, so overall basically a coin flip. How about when we were lower than the target vs higher than the target?

diffs %>%
  mutate(is_over = rel_diff > 0) %>%
  group_by(is_over) %>%
  summarize(fraction_improved = mean(abs(rel_rev_diff) < 1))
## # A tibble: 2 × 2
##   is_over fraction_improved
##   <lgl>               <dbl>
## 1 FALSE                 1  
## 2 TRUE                  0.2

When we were under, we always improved the value. However, when we were over, we usually made it worse (only 20% improved).

NSSP

nssp_archive_covid <- qs2::qs_read(here::here("covid_hosp_prod", "objects", "nssp_archive_data"))
nssp_archive_covid <- nssp_archive_covid$DT %>%
  filter(time_value >= "2024-11-19", geo_value %nin% c("vi", "as", "gu")) %>%
  as_epi_archive()
nssp_archive_covid$time_type <- "day"
revision_summary_nssp <- nssp_archive_covid %>%
  epiprocess::revision_analysis(nssp, min_waiting_period = NULL)
revision_summary_nssp %>% print(quick_revision = 7)
## 
## ── An epi_archive spanning 2024-11-20 to 2025-04-30. ──
## 
## ── Min lag (time to first version):
##      min median     mean     max
##   4 days 4 days 4.6 days 25 days
## 
## ── Fraction of epi_key + time_values with
## No revisions:
## • 156 out of 1,200 (13%)
## 
## Quick revisions (last revision within 7 days of the `time_value`):
## • 127 out of 1,200 (10.58%)
## 
## Few revisions (At most 3 revisions for that `time_value`):
## • 1,055 out of 1,200 (87.92%)
## 
## 
## 
## ── Fraction of revised epi_key + time_values which have: 
## 
## Less than 0.1 spread in relative value:
## • 848 out of 1,044 (81.23%)
## 
## Spread of more than 0.162 in actual value (when revised):
## • 70 out of 1,044 (6.7%)
## 
## 
## 
## ── Days until within 20% of the latest value:
##      min median     mean     max
##   4 days 4 days 4.8 days 25 days

So few weeks have no revision at all (only 13%), but ~88% have only a small number of revisions, and most of those that revise (81%) stay close to their final values.

av_re_spread <- revision_summary_nssp$revision_behavior %>%
  group_by(geo_value) %>%
  summarize(rel_spread = mean(rel_spread, na.rm = TRUE)) %>%
  arrange(desc(rel_spread)) %>%
  filter(geo_value %nin% c("vi", "as", "gu"))
nssp_archive_covid %>% autoplot(nssp, .facet_filter = geo_value %in% av_re_spread$geo_value[1:9]) + theme(strip.text.x = element_text(size = 8))

nssp_archive_covid %>% autoplot(nssp, .facet_filter = geo_value %in% av_re_spread$geo_value[10:18]) + theme(strip.text.x = element_text(size = 8))

A similar story to flu: these corrections are much more regular than NHSN’s, so we can likely do backcasting to adjust the values.

Correlations between the two archives

Visually, the flu and covid nhsn datasets seem to have related revision behavior. We should probably study this more carefully, but the question is a somewhat difficult one. One way to go about it is, for each (time_value, geo_value) pair, to correlate the sequence of reports across versions (a sketch of this follows). Alternatively, we could do the same for the differences between versions (so correlate the correction amounts).
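Here is a minimal sketch of the first idea, with the caveat that an epi_archive only stores rows that changed, so only versions where both signals were updated get matched.

# For each (geo_value, time_value) pair, correlate the flu and covid report
# sequences across versions. Only versions present in both archives match.
flu_versions <- nhsn_archive_flu$DT %>%
  as_tibble() %>%
  select(geo_value, time_value, version, flu = value)
covid_versions <- nhsn_archive_covid$DT %>%
  as_tibble() %>%
  select(geo_value, time_value, version, covid = value)
inner_join(flu_versions, covid_versions, by = c("geo_value", "time_value", "version")) %>%
  group_by(geo_value, time_value) %>%
  filter(n() >= 3) %>% # need a few shared versions for a meaningful correlation
  summarize(version_cor = cor(flu, covid), .groups = "drop") %>%
  arrange(version_cor)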


  1. min_waiting_period is NULL here since we’re plotting mid-season, while quick_revision = 7 because this is weekly data represented using days (because the versions are days).

  2. min_waiting_period is NULL here since we’re plotting mid-season, while quick_revision = 7 because this is weekly data represented using days (because the versions are days).

  3. which would definitely be a bit of an odd effect.