Overview

Analysis

General

  1. Weekly Average Bed Occupancy Rates Over Time per Hospital Subtype
  2. Weekly Average ICU Bed Occupancy Rate Over Time per Hospital Subtype

Covid-19

  1. Weekly Average hospitalized COVID-19 adult patients overall & per hospital subtype
  2. Weekly Average hospitalized COVID-19 adult patients by state
  3. Weekly Average COVID-19 Admissions by Age Group Over Time per Hospital Subtype
  4. Weekly Average Usage of Staffed Inpatient Beds by COVID-19 patients per Year per Hospital Subtype
  5. Percentage of Inpatient and ICU Beds Used by COVID-19 Patients by Hospital Subtype
  6. Weekly Average Vaccination administered to Healthcare Personnel & Patients by Hospital Subtype
  7. Ratio of Healthcare Personnel to Patient COVID Vaccination Doses

Influenza

  1. Weekly Average Hospitalized Influenza patients Over Time by Hospital Subtype
  2. Percentage of ICU Beds Used by Influenza Patients per Hospital Subtype

Prediction Models

  1. Prediction of Average Weekly Patients based on Vaccination Status and Hospital Subtype

Overview

The dataset was downloaded from healthdata.gov and provided by U.S. Department of Health & Human Services. It was collected from hospitals across US, started from July 31, to understand more on the impact on patient and hospital capacity of four different facilities due to COVID-19. It has one million observations and 127 fields of data. Additionally, we conducted two analysis on influenza data.

To understand more about the data, check the following:

Info about data - COVID-19 Reported Patient Impact and Hospital Capacity by Facility

Link - https://healthdata.gov/Hospital/COVID-19-Reported-Patient-Impact-and-Hospital-Capa/anag-cw7u/about_data

# Define the file path
file_path <- "/Users/thandarmoe/Library/Mobile Documents/com~apple~CloudDocs/teaching/Cloud/me/R-Language/Practice/Dataset/COVID-19_Reported_Patient_Impact_and_Hospital_Capacity_by_Facility_20241005.csv"

# Load the CSV file
data <- read.csv(file_path)

# Set the collection_week to Date
data$collection_week <- as.Date(data$collection_week)

# View the structure of the dataset
str(data)
## 'data.frame':    1045406 obs. of  128 variables:
##  $ hospital_pk                                                                       : chr  "050778" "051325" "060001" "061324" ...
##  $ collection_week                                                                   : Date, format: "2020-04-12" "2020-05-24" ...
##  $ state                                                                             : chr  "CA" "CA" "CO" "CO" ...
##  $ ccn                                                                               : chr  "050778" "051325" "060001" "061324" ...
##  $ hospital_name                                                                     : chr  "LOMA LINDA UNIVERSITY CHILDREN'S HOSPITAL" "ADVENTIST HEALTH MENDOCINO COAST" "BANNER NORTH COLORADO MEDICAL CENTER" "ASPEN VALLEY HOSPITAL" ...
##  $ address                                                                           : chr  "11234 ANDERSON STREET SUITE A" "700 RIVER DRIVE" "1801 16TH ST" "401 CASTLE CREEK RD" ...
##  $ city                                                                              : chr  "LOMA LINDA" "FORT BRAGG" "GREELEY" "ASPEN" ...
##  $ zip                                                                               : int  92354 95437 80631 81611 19973 19958 32763 32164 33606 32932 ...
##  $ hospital_subtype                                                                  : chr  "Short Term" "Critical Access Hospitals" "Short Term" "Critical Access Hospitals" ...
##  $ fips_code                                                                         : int  6071 6045 8123 8097 10005 10005 12127 12035 12057 12009 ...
##  $ is_metro_micro                                                                    : chr  "true" "true" "true" "true" ...
##  $ total_beds_7_day_avg                                                              : num  NA 29 1081 NA 99 ...
##  $ all_adult_hospital_beds_7_day_avg                                                 : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ all_adult_hospital_inpatient_beds_7_day_avg                                       : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ inpatient_beds_used_7_day_avg                                                     : num  NA 10.4 171.4 NA 62.7 ...
##  $ all_adult_hospital_inpatient_bed_occupied_7_day_avg                               : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ inpatient_beds_used_covid_7_day_avg                                               : num  -999999 0 18.7 0 5.7 ...
##  $ total_adult_patients_hospitalized_confirmed_and_suspected_covid_7_day_avg         : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ total_adult_patients_hospitalized_confirmed_covid_7_day_avg                       : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ total_pediatric_patients_hospitalized_confirmed_and_suspected_covid_7_day_avg     : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ total_pediatric_patients_hospitalized_confirmed_covid_7_day_avg                   : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ inpatient_beds_7_day_avg                                                          : num  NA 22 321 NA 99 ...
##  $ total_icu_beds_7_day_avg                                                          : num  99 4 78 4 16 20 30.4 18 127 NA ...
##  $ total_staffed_adult_icu_beds_7_day_avg                                            : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ icu_beds_used_7_day_avg                                                           : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ staffed_adult_icu_bed_occupancy_7_day_avg                                         : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ staffed_icu_adult_patients_confirmed_and_suspected_covid_7_day_avg                : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ staffed_icu_adult_patients_confirmed_covid_7_day_avg                              : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ total_patients_hospitalized_confirmed_influenza_7_day_avg                         : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ icu_patients_confirmed_influenza_7_day_avg                                        : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ total_patients_hospitalized_confirmed_influenza_and_covid_7_day_avg               : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ total_beds_7_day_sum                                                              : int  NA 203 7565 NA 693 1470 NA NA NA NA ...
##  $ all_adult_hospital_beds_7_day_sum                                                 : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ all_adult_hospital_inpatient_beds_7_day_sum                                       : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ inpatient_beds_used_7_day_sum                                                     : int  NA 73 1200 NA 439 982 NA NA NA NA ...
##  $ all_adult_hospital_inpatient_bed_occupied_7_day_sum                               : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ inpatient_beds_used_covid_7_day_sum                                               : int  16 0 131 0 40 13 43 13 8 24 ...
##  $ total_adult_patients_hospitalized_confirmed_and_suspected_covid_7_day_sum         : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ total_adult_patients_hospitalized_confirmed_covid_7_day_sum                       : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ total_pediatric_patients_hospitalized_confirmed_and_suspected_covid_7_day_sum     : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ total_pediatric_patients_hospitalized_confirmed_covid_7_day_sum                   : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ inpatient_beds_7_day_sum                                                          : int  NA 154 2249 NA 693 1365 NA NA NA NA ...
##  $ total_icu_beds_7_day_sum                                                          : int  693 28 546 4 112 140 213 126 254 NA ...
##  $ total_staffed_adult_icu_beds_7_day_sum                                            : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ icu_beds_used_7_day_sum                                                           : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ staffed_adult_icu_bed_occupancy_7_day_sum                                         : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ staffed_icu_adult_patients_confirmed_and_suspected_covid_7_day_sum                : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ staffed_icu_adult_patients_confirmed_covid_7_day_sum                              : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ total_patients_hospitalized_confirmed_influenza_7_day_sum                         : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ icu_patients_confirmed_influenza_7_day_sum                                        : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ total_patients_hospitalized_confirmed_influenza_and_covid_7_day_sum               : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ total_beds_7_day_coverage                                                         : int  0 7 7 0 7 7 0 0 0 0 ...
##  $ all_adult_hospital_beds_7_day_coverage                                            : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ all_adult_hospital_inpatient_beds_7_day_coverage                                  : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ inpatient_beds_used_7_day_coverage                                                : int  0 7 7 0 7 7 0 0 0 0 ...
##  $ all_adult_hospital_inpatient_bed_occupied_7_day_coverage                          : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ inpatient_beds_used_covid_7_day_coverage                                          : int  7 7 7 1 7 7 7 7 1 7 ...
##  $ total_adult_patients_hospitalized_confirmed_and_suspected_covid_7_day_coverage    : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ total_adult_patients_hospitalized_confirmed_covid_7_day_coverage                  : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ total_pediatric_patients_hospitalized_confirmed_and_suspected_covid_7_day_coverage: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ total_pediatric_patients_hospitalized_confirmed_covid_7_day_coverage              : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ inpatient_beds_7_day_coverage                                                     : int  0 7 7 0 7 7 0 0 0 0 ...
##  $ total_icu_beds_7_day_coverage                                                     : int  7 7 7 1 7 7 7 7 2 0 ...
##  $ total_staffed_adult_icu_beds_7_day_coverage                                       : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ icu_beds_used_7_day_coverage                                                      : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ staffed_adult_icu_bed_occupancy_7_day_coverage                                    : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ staffed_icu_adult_patients_confirmed_and_suspected_covid_7_day_coverage           : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ staffed_icu_adult_patients_confirmed_covid_7_day_coverage                         : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ total_patients_hospitalized_confirmed_influenza_7_day_coverage                    : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ icu_patients_confirmed_influenza_7_day_coverage                                   : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ total_patients_hospitalized_confirmed_influenza_and_covid_7_day_coverage          : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ previous_day_admission_adult_covid_confirmed_7_day_sum                            : int  NA NA NA NA NA -999999 NA NA NA NA ...
##  $ previous_day_admission_adult_covid_confirmed_18.19_7_day_sum                      : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ previous_day_admission_adult_covid_confirmed_20.29_7_day_sum                      : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ previous_day_admission_adult_covid_confirmed_30.39_7_day_sum                      : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ previous_day_admission_adult_covid_confirmed_40.49_7_day_sum                      : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ previous_day_admission_adult_covid_confirmed_50.59_7_day_sum                      : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ previous_day_admission_adult_covid_confirmed_60.69_7_day_sum                      : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ previous_day_admission_adult_covid_confirmed_70.79_7_day_sum                      : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ previous_day_admission_adult_covid_confirmed_80._7_day_sum                        : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ previous_day_admission_adult_covid_confirmed_unknown_7_day_sum                    : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ previous_day_admission_pediatric_covid_confirmed_7_day_sum                        : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ previous_day_covid_ED_visits_7_day_sum                                            : int  NA NA NA NA NA 158 NA NA NA NA ...
##  $ previous_day_admission_adult_covid_suspected_7_day_sum                            : int  NA NA NA NA NA 111 NA NA NA NA ...
##  $ previous_day_admission_adult_covid_suspected_18.19_7_day_sum                      : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ previous_day_admission_adult_covid_suspected_20.29_7_day_sum                      : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ previous_day_admission_adult_covid_suspected_30.39_7_day_sum                      : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ previous_day_admission_adult_covid_suspected_40.49_7_day_sum                      : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ previous_day_admission_adult_covid_suspected_50.59_7_day_sum                      : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ previous_day_admission_adult_covid_suspected_60.69_7_day_sum                      : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ previous_day_admission_adult_covid_suspected_70.79_7_day_sum                      : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ previous_day_admission_adult_covid_suspected_80._7_day_sum                        : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ previous_day_admission_adult_covid_suspected_unknown_7_day_sum                    : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ previous_day_admission_pediatric_covid_suspected_7_day_sum                        : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ previous_day_total_ED_visits_7_day_sum                                            : int  NA NA NA NA NA 920 NA NA NA NA ...
##  $ previous_day_admission_influenza_confirmed_7_day_sum                              : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ geocoded_hospital_address                                                         : chr  "POINT (-117.264076004 34.049546981)" "POINT (-123.800331961 39.431600999)" "POINT (-104.709703002 40.414700005)" "POINT (-106.839321892 39.190483484)" ...
##  $ hhs_ids                                                                           : chr  "[C050778-A]" "[C051325-A]" "[C060001-A]" "[C061324-A]" ...
##  $ previous_day_admission_adult_covid_confirmed_7_day_coverage                       : int  0 0 0 0 0 7 0 0 0 0 ...
##   [list output truncated]
#Load tidyverse
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
#Clean the data
data <- data %>%
  mutate(across(where(is.numeric), ~ na_if(., -999999.0)))

Analysis

General

Weekly Average Bed Occupancy Rates Over Time per Hospital Subtype

Variables

  1. inpatient_beds_used_7_day_avg

Average of total number of staffed inpatient beds that are occupied reported during the 7-day period.

  1. total_beds_7_day_avg

Average of total number of all staffed inpatient and outpatient beds in the hospital, including all overflow, observation, and active surge/expansion beds used for inpatients and for outpatients (including all ICU, ED, and observation) reported during the 7-day period.

# Calculate average occupancy rates
bed_occupancy_data <- data %>%
  group_by(hospital_subtype, collection_week) %>%
  summarise(
    avg_inpatient_beds_used = mean(inpatient_beds_used_7_day_avg, na.rm = TRUE),
    avg_total_beds = mean(total_beds_7_day_avg, na.rm = TRUE),
    avg_occupancy_rate = avg_inpatient_beds_used / avg_total_beds * 100,
    .groups = 'drop'
  ) %>%
  filter(!is.na(avg_occupancy_rate))

# Plot the bed occupancy rates over time
ggplot(bed_occupancy_data, aes(x = collection_week, y = avg_occupancy_rate)) +
  geom_line(color = "steelblue") +
  labs(title = "Average Bed Occupancy Rate Over Time per Hospital Subtype",
       y = "Occupancy Rate (%)") +
  theme_minimal() +
  scale_x_date(date_breaks = "6 months", date_labels = "%b %Y") +
  theme(plot.title = element_text(hjust = 0.5),
    legend.position = "top",
    axis.text.x = element_text(angle = 45, hjust = 1)) +
  facet_wrap(~hospital_subtype)

This chart shows the average bed occupancy rates from 2020 to 2024:

  • Children’s Hospitals: Bed occupancy increased to around 70-80%, with some fluctuations and a slight decline by 2024.

  • Critical Access Hospitals: Occupancy remained steady around 40-50% after a rise in 2020.

  • Long-Term Hospitals: Stable around 70-80%, with minor dips after an early drop in 2020.

  • Short-Term Hospitals: Similar pattern to long-term, stabilizing around 60-70% after an initial drop.

Overall, occupancy trends stabilized after early disruptions in 2020.

Weekly Average ICU Bed Occupancy Rate Over Time per Hospital Subtype

Variables

  1. icu_beds_used_7_day_avg

Average number of total number of staffed inpatient ICU beds that are occupied reported in the 7-day period.

  1. total_icu_beds_7_day_avg

Average number of total number of staffed inpatient ICU beds reported in the 7-day period.

# Calculate ICU bed usage
icu_data <- data %>%
  group_by(hospital_subtype, collection_week) %>%
  summarise(
    avg_icu_beds_used = mean(icu_beds_used_7_day_avg, na.rm = TRUE),
    avg_total_icu_beds = mean(total_icu_beds_7_day_avg, na.rm = TRUE),
    avg_icu_occupancy_rate = avg_icu_beds_used / avg_total_icu_beds * 100,
    .groups = "drop"
  ) %>%
  filter(!is.na(avg_icu_occupancy_rate))

# Plot the ICU occupancy rates
ggplot(icu_data, aes(x = collection_week, y = avg_icu_occupancy_rate)) +
  geom_line(color = "red") +
  labs(title = "Weekly Average ICU Bed Occupancy Rate Over Time per Hospital Subtype",
       y = "ICU Occupancy Rate (%)") +
  theme_minimal() +
  scale_x_date(date_breaks = "6 months", date_labels = "%b %Y") +
  theme(plot.title = element_text(hjust = 0.5),
    legend.position = "top",
    axis.text.x = element_text(angle = 45, hjust = 1)) +
  facet_wrap(~hospital_subtype)

  1. Children’s Hospitals:

    • ICU occupancy starts high around early 2020, likely reflecting a spike (possibly due to the COVID-19 pandemic).

    • After the initial peak, the occupancy stabilizes around 60-80% and remains relatively stable through 2024, with minor fluctuations.

  2. Critical Access Hospitals:

    • ICU occupancy starts near zero in early 2020, then spikes significantly, peaking above 80%.

    • Afterward, there’s a fluctuating trend, with the occupancy rates varying between 40% to 80% over time, and it stabilizes somewhat by 2023 and 2024.

  3. Long-Term Hospitals:

    • The ICU occupancy in 2020 starts with a sharp drop from 90% to almost zero, followed by a rapid recovery to a steady state of around 80% occupancy. It stays relatively stable throughout the time period.
  4. Short-Term Hospitals:

    • Similar to long-term hospitals, there’s a sharp drop early in 2020, followed by a steady recovery.

    • After 2020, ICU occupancy remains relatively stable around 60-70% for the rest of the period.

In general, the graph reflects the trends in ICU bed occupancy across different hospital types, likely influenced by external events such as the COVID-19 pandemic. The early peaks, followed by stabilization, suggest that after initial surges in demand, the healthcare system adapted to stabilize ICU occupancy across different hospital types.

Covid-19

Weekly average of hospitalized COVID-19 adult patients overall & per hospital subtype

Variable

  1. total_adult_patients_hospitalized_confirmed_and_suspected_covid_7_day_avg

Average number of patients currently hospitalized in an adult inpatient bed who have laboratory-confirmed or suspected COVID19, including those in observation beds reported during the 7-day period.

# Calculate weekly average of hospitalized COVID-19 patients
weekly_data <- data %>%
  group_by(hospital_subtype, collection_week) %>%
  summarise(
    avg_patients = round(mean(total_adult_patients_hospitalized_confirmed_and_suspected_covid_7_day_avg, na.rm = TRUE)
  ), .groups = 'drop') %>%
  filter(!is.nan(avg_patients), collection_week > as.Date("2020-01-01"))

str(weekly_data)
## tibble [855 × 3] (S3: tbl_df/tbl/data.frame)
##  $ hospital_subtype: chr [1:855] "Childrens Hospitals" "Childrens Hospitals" "Childrens Hospitals" "Childrens Hospitals" ...
##  $ collection_week : Date[1:855], format: "2020-03-29" "2020-04-05" ...
##  $ avg_patients    : num [1:855] 6 4 2 2 0 0 0 1 0 0 ...
# Plot the average number of hospitalized COVID-19 patients over time
ggplot(weekly_data, aes(x = collection_week, y = avg_patients)) +
  geom_line(color = "blue") +
  labs(title = "Average Number of Hospitalized COVID-19 Adult Patients Over Time",
       y = "Weekly Average Patients") +
  theme_minimal() +
scale_x_date(date_breaks = "6 months", date_labels = "%b %Y") +
  theme(plot.title = element_text(hjust = 0.5),
    legend.position = "top",
    axis.text.x = element_text(angle = 45, hjust = 1))

This chart represents the weekly average number of hospitalized COVID-19 adult patients over time, spanning from 2020 to 2024. Here’s an interpretation:

  1. Initial Peaks (2020-2021):

    • In 2020 and early 2021, we see multiple peaks in hospitalizations, likely corresponding to waves of COVID-19 infections. The first wave appears smaller, with the peak reaching around 20 patients, but subsequent peaks, particularly in early 2021, are more pronounced, exceeding 40 patients.
  2. Mid-2021 to Early 2022:

    • There are several large spikes during this period, with the highest peak occurring in late 2021 or early 2022, exceeding 50 hospitalized patients on average. This aligns with significant COVID-19 surges driven by variants such as Delta and Omicron.
  3. Late 2022 to 2024:

    • Hospitalizations generally decrease after early 2022, though there are still smaller, periodic waves. Peaks during this period are smaller, reaching around 20-30 patients, indicating that while the number of COVID-19 hospitalizations remains significant, it has reduced compared to earlier surges.
  4. General Trend:

    • The pattern shows cyclical waves of hospitalizations, with sharp peaks followed by declines. The waves are consistent with known patterns of COVID-19 infection surges during colder months or as new variants emerged.

    • The trend appears to be moderating over time, with peaks becoming smaller, likely reflecting the effects of vaccinations, improved treatments, and perhaps milder variants.

Overall, the chart reflects the ebb and flow of COVID-19 hospitalizations over several years, with the highest surges occurring in 2021 and early 2022, followed by a gradual decline.

# Plot the average number of hospitalized COVID-19 patients over time per hospital subtype
ggplot(weekly_data, aes(x = collection_week, y = avg_patients)) +
  geom_line(color = "blue") +
  labs(title = "Avg Hospitalized COVID-19 Adult Patients Over Time per Hospital Subtype",
       y = "Weekly Average Patients") +
  theme_minimal() +
  scale_x_date(date_breaks = "6 months", date_labels = "%b %Y") +
  theme(plot.title = element_text(hjust = 0.5),
    legend.position = "top",
    axis.text.x = element_text(angle = 45, hjust = 1))+
  facet_wrap(~hospital_subtype)

  1. Children’s Hospitals:

    • The graph shows a very low number of adult COVID-19 patients, remaining near zero throughout the time period from 2021 to 2024, reflecting that adult COVID-19 patients rarely use these facilities.
  2. Critical Access Hospitals:

    • Similar to Children’s Hospitals, the number of hospitalized adult patients is consistently low (mostly close to zero) throughout the time period. There are minor fluctuations, but nothing substantial.
  3. Long-Term Hospitals:

    • These hospitals had a low and relatively stable number of adult COVID-19 patients. There is a slight increase in early 2021, but after that, the numbers remain minimal throughout 2022 to 2024.
  4. Short-Term Hospitals:

    • This subtype shows the most significant variation. There are notable peaks, especially in 2021 and 2022, corresponding to COVID-19 waves. The numbers decrease significantly afterward but show smaller spikes throughout 2023 and into 2024.

Overall Observations:

  • The burden of adult COVID-19 hospitalizations has been highest in Short-Term Hospitals, while other hospital subtypes have seen very few adult COVID-19 patients.

  • There is a clear decline in hospitalizations after 2022, with the peaks during the height of the pandemic’s waves.

Weekly average hospitalized COVID-19 adult patients by state

# Average hospitalized COVID-19 patients by state
state_data <- data %>%
  group_by(state) %>%
  summarise(
    avg_patients = round(mean(total_adult_patients_hospitalized_confirmed_and_suspected_covid_7_day_avg, na.rm = TRUE)))

str(state_data)
## tibble [56 × 2] (S3: tbl_df/tbl/data.frame)
##  $ state       : chr [1:56] "AK" "AL" "AR" "AS" ...
##  $ avg_patients: num [1:56] 4 12 9 1 19 17 8 17 17 26 ...
#Chose the Highest and Lowest 10 states
lowest_states <- state_data %>%
  arrange(avg_patients) %>%
  slice_head(n = 10)

highest_states <- state_data %>%
  arrange(desc(avg_patients)) %>%
  slice_head(n = 10)

result_states <- bind_rows(
  mutate(highest_states, category = "Highest"),
  mutate(lowest_states, category = "Lowest")
)

# Plot
ggplot(result_states, aes(x = reorder(state, avg_patients), y = avg_patients, fill = category)) + 
  geom_bar(stat = "identity") + 
  labs(title = "Weekly Average Adult COVID-19 Hospitalizations by State",
       x = "State",
       y = "Weekly Average Patients") + 
  coord_flip() + 
  theme_minimal() +
  scale_fill_manual(values = c("Highest" = "pink", "Lowest" = "steelblue")) 

Weekly Average COVID-19 Admissions by Age Group Over Time per Hospital Subtype

Variable

  1. previous_day_admission_adult_covid_confirmed_18.19_7_day_sum

Sum of number of patients age 18-19 who were admitted to an adult inpatient bed on the previous calendar day who had confirmed COVID-19 at the time of admission reported in the 7-day period.

  1. previous_day_admission_adult_covid_confirmed_unknown_7_day_sum

Sum of number of patients age unknown who were admitted to an adult inpatient bed on the previous calendar day who had confirmed COVID-19 at the time of admission reported in 7-day period.

  1. previous_day_admission_pediatric_covid_confirmed_7_day_sum

Sum of number of pediatric patients who were admitted to an inpatient bed, including NICU, PICU, newborn, and nursery, on the previous calendar day who had confirmed COVID-19 at the time of admission.

# Average admissions by age group
admissions_by_age <- data %>%
  group_by(hospital_subtype, collection_week) %>%
  summarise(
    avg_admissions_18_29 = mean(previous_day_admission_adult_covid_confirmed_18.19_7_day_sum, na.rm = TRUE),
    avg_admissions_30_39 = mean(previous_day_admission_adult_covid_confirmed_30.39_7_day_sum, na.rm = TRUE),
    avg_admissions_40_49 = mean(previous_day_admission_adult_covid_confirmed_40.49_7_day_sum, na.rm = TRUE),
    avg_admissions_50_59 = mean(previous_day_admission_adult_covid_confirmed_50.59_7_day_sum, na.rm = TRUE),
    avg_admissions_60_69 = mean(previous_day_admission_adult_covid_confirmed_60.69_7_day_sum, na.rm = TRUE),
    avg_admissions_70_79 = mean(previous_day_admission_adult_covid_confirmed_70.79_7_day_sum, na.rm = TRUE),
    avg_admissions_80_plus = mean(previous_day_admission_adult_covid_confirmed_80._7_day_sum, na.rm = TRUE),
    avg_admissions_unknown = mean(previous_day_admission_adult_covid_confirmed_unknown_7_day_sum, na.rm = TRUE),
    avg_admissions_pediatric = mean(previous_day_admission_pediatric_covid_confirmed_7_day_sum, na.rm = TRUE),
    .groups = "drop"
  )


# Reshape for easier plotting
admissions_long <- admissions_by_age %>%
  pivot_longer(cols = -c(hospital_subtype, collection_week), 
               names_to = "age_group", 
               values_to = "avg_admissions") %>%
   filter(!is.na(avg_admissions))



# Define shorter labels for the age groups
short_labels <- c(
  "avg_admissions_18_29" = "18-29",
  "avg_admissions_30_39" = "30-39",
  "avg_admissions_40_49" = "40-49",
  "avg_admissions_50_59" = "50-59",
  "avg_admissions_60_69" = "60-69",
  "avg_admissions_70_79" = "70-79",
  "avg_admissions_80_plus" = "80+",
  "avg_admissions_unknown" = "Unknown",
  "avg_admissions_pediatric" = "Pediatric"
)

# Define a more extensive custom color palette (adjust to match your age groups)
color_palette <- c("darkred", "red", "orange", "lightgreen", "lightblue", "gray", "blue", "purple", "steelblue")  # Add more colors as needed

# Plot admissions by age group with custom colors
ggplot(admissions_long, aes(x = collection_week, y = avg_admissions, color = age_group)) +
  geom_line(linewidth = 1.2) +  # Use linewidth instead of size
  scale_color_manual(values = color_palette, labels = short_labels) +  # Use custom colors
  labs(title = "Avg COVID-19 Admissions by Age Group Over Time Per Hospital Subtype",
       y = "Average Admissions") +
  theme_minimal() +
  scale_x_date(date_breaks = "6 months", date_labels = "%b %Y") +
  theme(plot.title = element_text(hjust = 0.5),
    legend.position = "top",
    axis.text.x = element_text(angle = 45, hjust = 1)) + 
  facet_wrap(~hospital_subtype)

  1. Children’s Hospitals:

    • The pediatric group (purple) dominates admissions, peaking significantly in early 2022 with over 20 average admissions. The trend shows a rise and fall in 2021 and 2022, followed by a gradual decline in 2023 and 2024.

    • Other age groups (e.g., 18-29, 30-39) have minimal to no admissions recorded.

  2. Critical Access Hospitals:

    • The 80+ age group (blue) records minor peaks, notably in early 2021 and 2022. However, admissions are very low overall for all age groups, indicating limited impact on these hospitals.
  3. Long Term Hospitals:

    • There is a consistent near-zero trend for all age groups, suggesting that COVID-19 admissions in these facilities were minimal or highly managed.
  4. Short Term Hospitals:

    • The 80+ age group (blue) again stands out, peaking multiple times around 2021-2022.

    • Younger age groups (e.g., 60-69 and 70-79) also show minor peaks, indicating that elderly patients were more commonly hospitalized in short-term facilities.

In summary, pediatric admissions were particularly high in Children’s Hospitals, while the elderly population (80+) had the most notable admissions across Short Term and Critical Access Hospitals.

Weekly Average Usage of Staffed Inpatient Beds by COVID-19 patients per Year per Hospital Subtype

Variable

  1. inpatient_beds_used_7_day_avg

Average of total number of staffed inpatient beds that are occupied reported during the 7-day period.

  1. inpatient_beds_used_covid_7_day_avg

Average of reported patients currently hospitalized in an inpatient bed who have suspected or confirmed COVID-19 reported during the 7-day period.

# Analyze hospital subtype impact
hospital_type <- data %>%
  group_by(hospital_subtype, year = year(collection_week)) %>%
  summarise(
    avg_beds_used = mean(inpatient_beds_used_7_day_avg, na.rm = TRUE),
    avg_beds_used_covid = mean(inpatient_beds_used_covid_7_day_avg, na.rm = TRUE)
  , .groups = "drop")

# Pivoting
hospital_type_long <- hospital_type %>%
  pivot_longer(cols = c(avg_beds_used, avg_beds_used_covid), names_to = "Legend", values_to = "value")

# Plot impact of hospital type with 100% stacked column
ggplot(hospital_type_long, aes(x = hospital_subtype, y = value, fill = Legend)) +
  geom_bar(stat = "identity", position = "fill") +  # Use "fill" for 100% stacked column
  labs(title = "Proportion of Staffed Bed Usage by COVID-19 Patients per Year",
       x = "Hospital Type",
       y = "Proportion of Avg Weekly Patients") +  # Adjust y-axis label to reflect proportions
  facet_wrap(~year) +
  scale_y_continuous(labels = scales::percent_format()) +  # Show percentages on y-axis
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))  # Set angle to 90 for vertical labels

Percentage of Inpatient and ICU Beds Used by COVID-19 Patients per Hospital Subtype

Variable

  1. inpatient_beds_used_covid_7_day_avg

Average of reported patients currently hospitalized in an inpatient bed who have suspected or confirmed COVID-19 reported during the 7-day period

  1. inpatient_beds_used_7_day_avg

Average of total number of staffed inpatient beds that are occupied reported during the 7-day period.

  1. staffed_icu_adult_patients_confirmed_and_suspected_covid_7_day_avg

Average number of patients currently hospitalized in a designated adult ICU bed who have suspected or laboratory-confirmed COVID-19 reported in the 7-day period.

  1. staffed_adult_icu_bed_occupancy_7_day_avg

Average of total number of staffed inpatient adult ICU beds that are occupied reported in the 7-day period.

inpatent_icu_percent <- data %>%
  group_by(hospital_subtype, collection_week) %>%
  summarise(
    percent_inpatient_covid_beds = mean(
      ifelse(inpatient_beds_used_7_day_avg == 0, NA, (inpatient_beds_used_covid_7_day_avg / inpatient_beds_used_7_day_avg) * 100),
      na.rm = TRUE
    ),
    percent_icu_covid_beds = mean(
      ifelse(staffed_adult_icu_bed_occupancy_7_day_avg == 0, NA, (staffed_icu_adult_patients_confirmed_and_suspected_covid_7_day_avg / staffed_adult_icu_bed_occupancy_7_day_avg) * 100),
      na.rm = TRUE
    ),
    .groups = "drop"  # Add this line to suppress the message
  )


# Reshape the data to long format
inpatient_icu_percent_long <- inpatent_icu_percent %>%
  pivot_longer(cols = c(percent_inpatient_covid_beds, percent_icu_covid_beds), 
               names_to = "metric", 
               values_to = "percentage")%>%
  filter(!is.na(percentage))

# Create the line plot
ggplot(inpatient_icu_percent_long, aes(x = collection_week, y = percentage, color = metric)) +
  geom_line(linewidth = 1) +
  labs(
    title = "% of Inpatient & ICU Beds Used by COVID-19 Patients by Hospital Subtype",
    x = "Year",
    y = "Percentage (%)",
    color = "Metric"
  ) +
  theme_minimal() +
  scale_x_date(date_breaks = "6 months", date_labels = "%b %Y") +
  theme(
    legend.position = "top",
    axis.text.x = element_text(angle = 45, hjust = 1)
  ) + facet_wrap(~hospital_subtype)

The graph presents the percentage of inpatient and ICU beds used by COVID-19 patients across four different types of hospitals: Children’s Hospitals, Critical Access Hospitals, Long Term Care Hospitals, and Short Term Hospitals. The data spans from 2020 through 2024, and key insights include:

  1. Critical Access Hospitals: There is a significant peak in ICU bed usage (red line) around 2021, where the percentage of ICU beds dedicated to COVID-19 patients exceeded 100%, especially in 2022. This suggests that the critical care capacity was heavily burdened during that period, likely due to a surge in severe COVID-19 cases.

  2. Short Term and Long Term Hospitals: Both show relatively consistent usage over time, with occasional fluctuations. Short Term Hospitals had a slightly higher ICU bed usage during certain periods, particularly in early 2021 and 2022, but overall, the percentages remained below 50%. Long Term Hospitals show a low percentage of both ICU and inpatient beds used, likely because these facilities may not specialize in acute care.

  3. Children’s Hospitals: COVID-19 bed usage was minimal here, with little to no strain observed in both ICU and inpatient capacities. This could reflect the lower hospitalization rates for children during the pandemic.

  4. Inpatient vs. ICU Beds: Across all hospital types, inpatient bed usage (blue line) remained relatively stable and low compared to ICU bed usage. ICU beds seem to have experienced more stress during COVID-19 peaks, particularly in Critical Access and Short Term hospitals.

In summary, Critical Access and Short Term hospitals faced the greatest strain in terms of COVID-19 patient care, especially in ICUs, while Children’s and Long Term hospitals were less impacted. This reflects the varying demand placed on different healthcare facilities during the pandemic based on their patient demographics and care capabilities.

Weekly Average Vaccination administered to Healthcare Personnel & Patients by Hospital Subtype

Variable

  1. total_personnel_covid_vaccinated_doses_one_7_day

The current maximum number of healthcare personnel who have received at least one dose of COVID-19 vaccination that is administered in a multi-dose series. This field is meant to represent those who have begun but not completed the vaccination process. Does not include those who received a single-dose vaccine in this field. (This field is only reported on Wednesdays of a given week)

  1. total_personnel_covid_vaccinated_doses_all_7_day

The current number of healthcare personnel who have received a complete series of a COVID-19vaccination.Includes those who have received all doses in a multi-dose series as well as those who received a single-dose vaccine. (This field is only reported on Wednesdays of a given week)

  1. total_personnel_covid_vaccinated_doses_none_7_day

The number reported of healthcare personnel who have not yet received a single vaccine dose. (This field is only reported on Wednesdays of a given week)

  1. previous_week_patients_covid_vaccinated_doses_one_7_day

The number of patients who received the first dose of a COVID-19 vaccine that is administered in a multi-dose series. (This field is only reported on Wednesdays of a given week)

  1. previous_week_patients_covid_vaccinated_doses_all_7_day

The number of patients who received the final dose in a COVID-19 vaccination series. (This field is only reported on Wednesdays of a given week)

#Preparing the data for vaccination status of healthcare personnel and patients
avg_weekly_vaccination_data <- data %>%
  group_by(hospital_subtype, collection_week) %>%
  summarise(
    avg_weekly_personnel_doses_one_administered = mean(
total_personnel_covid_vaccinated_doses_one_7_day, na.rm = TRUE),
avg_weekly_personnel_doses_all_administered = mean(
total_personnel_covid_vaccinated_doses_all_7_day, na.rm = TRUE),
avg_weekly_personnel_doses_none_administered = mean(
total_personnel_covid_vaccinated_doses_none_7_day, na.rm = TRUE),

avg_weekly_patient_doses_one_administered = mean(
previous_week_patients_covid_vaccinated_doses_one_7_day, na.rm = TRUE),
avg_weekly_patient_doses_all_administered = mean(
previous_week_patients_covid_vaccinated_doses_all_7_day, na.rm = TRUE),
    .groups = 'drop'
  ) %>% 
  filter(
    !is.na(avg_weekly_personnel_doses_one_administered) & 
    !is.na(avg_weekly_personnel_doses_all_administered) & 
    !is.na(avg_weekly_patient_doses_one_administered) & 
    !is.na(avg_weekly_patient_doses_all_administered) &
      !is.na(avg_weekly_personnel_doses_none_administered)
  )

# Pivoting the data longer for easier plotting
avg_weekly_vaccination_data_long <- avg_weekly_vaccination_data %>%
  pivot_longer(cols = starts_with("avg_weekly"),
               names_to = "vaccination_type", 
               values_to = "avg_doses")

#For healthcare personnel
avg_weekly_vaccination_data_long_personnel <- avg_weekly_vaccination_data_long %>% 
  filter(vaccination_type %in% c("avg_weekly_personnel_doses_one_administered", 
                                 "avg_weekly_personnel_doses_all_administered", 
                                 "avg_weekly_personnel_doses_none_administered"))

#For Patients
avg_weekly_vaccination_data_long_patients <- avg_weekly_vaccination_data_long %>% 
  filter(vaccination_type %in% c("avg_weekly_patient_doses_one_administered", 
                                 "avg_weekly_patient_doses_all_administered"))

Weekly Average Vaccination administered to Healthcare Personnel by Hospital Subtype

# Creating the line graph
ggplot(avg_weekly_vaccination_data_long_personnel, aes(x = collection_week, y = avg_doses, color = vaccination_type)) +
  geom_line(linewidth = 1) +
  labs(title = "Avg Weekly Vacc_Doses Administered to Health Personnel by Hospital Subtype",
       x = "Collection Week",
       y = "Average Doses Administered (Weekly)",
       color = "Vaccination Type") +
  scale_color_manual(values = c("blue", "darkgreen", "red"),
                     labels = c("One Dose (Personnel)", "All Doses (Personnel)", "None (Personnel)"),
                     guide = guide_legend(nrow = 2)) + # Set legend to 2 rows
  theme_minimal() +
  scale_x_date(date_breaks = "6 months", date_labels = "%b %Y") +
 theme(
    legend.position = "top",
    legend.box = "horizontal",
    legend.title = element_blank(), # Remove legend title if not needed
    legend.text = element_text(size = 10), # Adjust the size of the text
    axis.text.x = element_text(angle = 45, hjust = 1)
) +
  facet_wrap(~ hospital_subtype)

  • Children’s Hospitals:

    • They exhibit a consistent and relatively high number of doses administered to health personnel, particularly with a noticeable increase in both one-dose and no-dose vaccinations (the red and blue lines).

    • The red line (“None Doses Administered”) is especially notable, indicating a significant proportion of health personnel were reported to have not received any dose in recent weeks.

  • Critical Access Hospitals:

    • There is almost no substantial vaccination activity in this category, with the lines for one-dose, all-dose, and no-dose nearly flat throughout the period. This suggests that personnel in critical access hospitals either have low vaccination rates or are vaccinated through other channels.
  • Long-Term Care Facilities:

    • A very flat trend, indicating very few doses were administered to health personnel across the monitored period. Long-term care facilities may have had vaccinations handled in ways that are underreported or completed earlier in the pandemic.
  • Short-Term Care Hospitals:

    • These hospitals show a slightly increasing trend in vaccinations, but the numbers remain low compared to Children’s Hospitals. It indicates a lower but steady vaccination rate among health personnel.

Weekly Average Vaccination administered to Patients by Hospital Subtype

# Creating the line graph
ggplot(avg_weekly_vaccination_data_long_patients, aes(x = collection_week, y = avg_doses, color = vaccination_type)) +
  geom_line(linewidth = 1) +
  labs(title = "Avg Weekly Vacc_Doses Administered to Patients by Hospital Subtype",
       x = "Collection Week",
       y = "Average Doses Administered (Weekly)",
       color = "Vaccination Type") +
  scale_color_manual(values = c("blue", "darkgreen"),
                     labels = c("One Dose (Patient)", "All Doses (Patient)"),
                     guide = guide_legend(nrow = 2)) + # Set legend to 2 rows
  theme_minimal() +
  scale_x_date(date_breaks = "6 months", date_labels = "%b %Y") +
 theme(
    legend.position = "top",
    legend.box = "horizontal",
    legend.title = element_blank(), # Remove legend title if not needed
    legend.text = element_text(size = 10), # Adjust the size of the text
    axis.text.x = element_text(angle = 45, hjust = 1)
) +
  facet_wrap(~ hospital_subtype)

  1. Children’s Hospitals:

    • There was a peak in early 2021, followed by a rapid decline. Small fluctuations occurred until mid-2022, after which both one dose and all doses flattened, indicating a steep drop in vaccinations.
  2. Critical Access Hospitals:

    • Similar to Children’s Hospitals, a large peak is visible early in 2021, followed by a steep drop. Vaccination rates remained low and stable, with very minor increases later on, indicating lower vaccination activity.
  3. Long Term Hospitals:

    • The trend here shows minimal activity throughout the period, with a slight bump early in the vaccine rollout. There’s almost no significant vaccination activity after mid-2021.
  4. Short Term Hospitals:

    • A steady peak from early 2021 is followed by a slow decline until mid-2022. After that, the vaccination rates drop and remain low throughout 2023 and into 2024.

General Trend

  • Vaccination rates spiked early on during the vaccine rollout (early to mid-2021), reflecting the initial surge in vaccine distribution.

  • After this initial spike, the number of doses administered declined significantly across all hospital subtypes, with particularly sharp declines in Children’s, Critical Access, and Long Term hospitals.

  • Short Term hospitals showed more sustained vaccination efforts over a longer period compared to the others.

  • By 2023, all subtypes exhibit very low or near-zero vaccination activity.

Ratio of Healthcare Personnel to Patient COVID Vaccination Doses

vaccination_compare <- avg_weekly_vaccination_data %>%
  mutate(vaccination_ratio = (avg_weekly_personnel_doses_all_administered + avg_weekly_personnel_doses_one_administered) / 
                             (avg_weekly_patient_doses_all_administered + avg_weekly_patient_doses_one_administered))

# Visualizing the ratio
ggplot(vaccination_compare, aes(x = collection_week, y = vaccination_ratio)) +
  geom_line(color = "purple") +
  labs(title = "Ratio of Personnel to Patient COVID Vaccination Doses",
       y = "Ratio",
       x = "Collection Week") +
  theme_minimal() +
  scale_x_date(date_breaks = "6 months", date_labels = "%b %Y") +
  theme(
    legend.position = "top",
    legend.box = "horizontal",
    legend.title = element_blank(), # Remove legend title if not needed
    legend.text = element_text(size = 10), # Adjust the size of the text
    axis.text.x = element_text(angle = 45, hjust = 1)
) +
  facet_wrap(~hospital_subtype)

  1. Children’s Hospitals:

    • There is a sharp drop early in the vaccination period (early 2021) followed by a relatively flat period.

    • Beginning in late 2022, the ratio of personnel to patient doses sharply rises, leveling off at a very high ratio (~7500) and remaining consistent through 2023 and into 2024.

    • This likely indicates a large number of personnel vaccinations compared to patient vaccinations, especially after the initial push of patient vaccines declined.

  2. Critical Access Hospitals:

    • The ratio remains consistently flat, suggesting little or no significant difference between personnel and patient vaccinations throughout the time period.

    • This may suggest relatively balanced vaccination rates or low overall activity.

  3. Long Term Hospitals:

    • A flat line at zero for almost the entire period suggests that very few vaccinations were administered, whether to personnel or patients.
  4. Short Term Hospitals:

    • The ratio is very low early in the vaccination campaign but begins to show small fluctuations starting in late 2022 and continuing into 2023, though still much lower compared to Children’s Hospitals.

    • There are mild increases in the ratio, which may indicate a higher proportion of personnel vaccinations compared to patient doses during this period.

Influenza

Weekly Average Hospitalized Influenza patients Over Time by Hospital Subtype

Variable

total_patients_hospitalized_confirmed_influenza_7_day_avg

Average number of patients (all ages) currently hospitalized in an inpatient bed who have laboratory-confirmed influenza. Including those in observation beds reported in the 7-day period.

# Calculate weekly average of hospitalized influenza patients per hospital subtype
inf_weekly_data <- data %>%
  group_by(hospital_subtype, collection_week) %>%
  summarise(
    avg_patients = round(mean(total_patients_hospitalized_confirmed_influenza_7_day_avg, na.rm = TRUE)
  ), .groups = 'drop') %>%
  filter(!is.nan(avg_patients), collection_week > as.Date("2020-01-01"))

str(inf_weekly_data)
## tibble [803 × 3] (S3: tbl_df/tbl/data.frame)
##  $ hospital_subtype: chr [1:803] "Childrens Hospitals" "Childrens Hospitals" "Childrens Hospitals" "Childrens Hospitals" ...
##  $ collection_week : Date[1:803], format: "2020-09-20" "2020-10-11" ...
##  $ avg_patients    : num [1:803] 0 0 0 0 0 0 0 0 0 0 ...
# Plot the average number of hospitalized influenza patients over time
ggplot(inf_weekly_data, aes(x = collection_week, y = avg_patients)) +
  geom_line(color = "black") +
  labs(title = "Average Number of Hospitalized influenza Patients Over Time",
       y = "Weekly Average Patients") +
  theme_minimal() +
  scale_x_date(date_breaks = "6 months", date_labels = "%b %Y") +
  theme(plot.title = element_text(hjust = 0.5),
    legend.position = "top",
    axis.text.x = element_text(angle = 45, hjust = 1))

  • July 2020 to January 2022: No reported influenza hospitalizations during this period, as indicated by the flat line at zero.

  • Mid-2022: A brief spike in hospitalized influenza patients is visible but relatively small.

  • Late 2022 to Early 2023 (Seasonal Peak): A significant surge in hospitalizations, reaching a peak in early 2023 before gradually decreasing.

  • Late 2023 to Early 2024 (Another Peak): Another wave of hospitalizations occurs, peaking similarly to the previous flu season in 2023, though potentially slightly lower in magnitude.

  • Mid-2024: A decline in hospitalizations follows, indicating the end of the second observed flu season.

The graph shows a clear seasonal pattern of influenza, with peaks in late fall/winter and declines by the spring, typical of influenza virus behavior.

# Plot the average number of hospitalized influenza patients over time per hospital subtype
ggplot(inf_weekly_data, aes(x = collection_week, y = avg_patients)) +
  geom_line(color = "black") +
  labs(title = "Avg Hospitalized Influenza Patients Over Time per Hospital Subtype",
       y = "Weekly Average Patients",
       x = "Collection Week") +
  theme_minimal() +
  scale_x_date(date_breaks = "6 months", date_labels = "%b %Y") +
  facet_wrap(~hospital_subtype) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

  1. Children’s Hospitals:

    • Significant spikes are seen during two flu seasons: one around early 2023 and another during late 2023 to early 2024.

    • The peak in early 2023 reaches over 6 average patients per week, with the second peak slightly lower.

    • No cases are recorded before mid-2022, suggesting the flu was under control during that time.

  2. Critical Access Hospitals:

    • The line is flat at zero throughout the entire period (2020–2024), indicating no recorded influenza hospitalizations in these types of hospitals.
  3. Long-Term Care:

    • Similar to Critical Access Hospitals, the line remains flat at zero, showing no hospitalizations for influenza over this time period.
  4. Short-Term Care:

    • Like Children’s Hospitals, Short-Term Care also experiences two distinct flu seasons, with a peak around early 2023 and another in late 2023/early 2024.

    • The peaks in this category are lower in magnitude compared to Children’s Hospitals, with fewer than 5 patients per week at the highest point.

Conclusion:

Children’s and Short-Term Care Hospitals seem to be the most affected by influenza during flu seasons (early 2023 and late 2023/early 2024).

Percentage of ICU Beds Used by Influenza Patients per Hospital Subtype

Variables

  1. icu_patients_confirmed_influenza_7_day_avg

Average of patients (all ages) currently hospitalized in a designated ICU bed with laboratory-confirmed influenza in the 7-day period.

  1. icu_beds_used_7_day_avg

Average number of total number of staffed inpatient ICU beds that are occupied reported in the 7-day period.

inf_inpatient_icu_percent <- data %>%
  group_by(hospital_subtype, collection_week) %>%
  summarise(
    percent_icu_inf_beds = mean(
      ifelse(icu_beds_used_7_day_avg  == 0, NA, (icu_patients_confirmed_influenza_7_day_avg / icu_beds_used_7_day_avg) * 100),
      na.rm = TRUE
    ),
    .groups = "drop"
  ) %>% filter (!is.na(percent_icu_inf_beds))

# Create the line plot
ggplot(inf_inpatient_icu_percent, aes(x = collection_week, y = percent_icu_inf_beds)) +
  geom_line(linewidth = 0.5, color = "black") +
  labs(
    title = "% of Inpatient & ICU Beds Used by Influenza Patients by Hospital Subtype",
    x = "Year",
    y = "Percentage (%)"
  ) +
  theme_minimal() +
  scale_x_date(date_breaks = "6 months", date_labels = "%b %Y")  +
  theme(
    legend.position = "top",
    axis.text.x = element_text(angle = 45, hjust = 1)
  ) + facet_wrap(~hospital_subtype)

  • Children’s Hospitals:

    • There are two distinct peaks, one around late 2020 and another in late 2022, reaching about 3% bed usage for influenza patients.

    • After these peaks, bed usage drops sharply to zero and stays relatively flat with minor fluctuations through 2023 and 2024.

  • Critical Access Hospitals:

    • There is almost no noticeable bed usage for influenza patients in these hospitals throughout the entire period. The line remains flat at or near zero.
  • Long Term Hospitals:

    • Similar to Critical Access Hospitals, the percentage of beds used by influenza patients remains near zero with a small increase towards the end of 2023, but still under 0.5%.
  • Short Term Hospitals:

    • A slight increase in influenza bed usage is seen around the fall and winter seasons of late 2022 and late 2023, but the peaks are under 1%. There is a steady, small rise towards the end of 2023 into 2024.

Data for Dashboard

# Separate latitude and longitude
data2 <-  data %>%
  mutate(
    # Extract longitude (first number after POINT)
    longitude = as.numeric(str_extract(geocoded_hospital_address, "-?\\d+\\.\\d+")),
    # Extract latitude (second number after POINT)
    latitude = as.numeric(str_extract(geocoded_hospital_address, "-?\\d+\\.\\d+(?=\\))"))
  )

data_icu_inpatient_covid <- data2 %>%
  mutate(
    percent_inpatient_covid_beds = ifelse(
      inpatient_beds_used_7_day_avg == 0, 
      NA, 
      (inpatient_beds_used_covid_7_day_avg / inpatient_beds_used_7_day_avg) * 100
    ),
    percent_icu_covid_beds = ifelse(
      staffed_adult_icu_bed_occupancy_7_day_avg == 0, 
      NA, 
      (staffed_icu_adult_patients_confirmed_and_suspected_covid_7_day_avg / staffed_adult_icu_bed_occupancy_7_day_avg) * 100
    )
  )

test <- data_icu_inpatient_covid %>% select(collection_week, percent_inpatient_covid_beds, percent_icu_covid_beds, longitude, latitude)

# Filtering rows where either percent_inpatient_covid_beds or percent_icu_covid_beds is not NA
test <- test %>%
  filter(!is.na(percent_inpatient_covid_beds) | !is.na(percent_icu_covid_beds))

# Convert to numeric, coerce non-numeric values to NA
test$longitude <- as.numeric(test$longitude)
test$latitude <- as.numeric(test$latitude)

Prediction Models

Prediction of Average Weekly Patients based on Vaccination Status and Hospital Subtype

# Merge vaccination data with weekly average of hospitalized COVID-19 patients
merged_data <- weekly_data %>%
  left_join(avg_weekly_vaccination_data, by = c("hospital_subtype", "collection_week")) %>%
  drop_na()

# Ensure hospital_subtype is a factor
merged_data$hospital_subtype <- as.factor(merged_data$hospital_subtype)

str(merged_data)
## tibble [689 × 8] (S3: tbl_df/tbl/data.frame)
##  $ hospital_subtype                            : Factor w/ 4 levels "Childrens Hospitals",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ collection_week                             : Date[1:689], format: "2021-01-10" "2021-01-17" ...
##  $ avg_patients                                : num [1:689] 1 1 1 1 1 0 1 0 0 0 ...
##  $ avg_weekly_personnel_doses_one_administered : num [1:689] 513 1218 1375 1541 1349 ...
##  $ avg_weekly_personnel_doses_all_administered : num [1:689] 929 519 697 1011 1141 ...
##  $ avg_weekly_personnel_doses_none_administered: num [1:689] 384 915 1101 1221 1170 ...
##  $ avg_weekly_patient_doses_one_administered   : num [1:689] 0.167 8.659 147.074 168.517 150.581 ...
##  $ avg_weekly_patient_doses_all_administered   : num [1:689] 0 2.73 39.63 182.82 17.7 ...
library(caret)
## Loading required package: lattice
## 
## Attaching package: 'caret'
## The following object is masked from 'package:purrr':
## 
##     lift
# Train-test split (70% training, 30% testing)
set.seed(371)
train_index <- createDataPartition(merged_data$avg_patients, p = 0.7, list = FALSE)
train_data <- merged_data[train_index, ]
test_data <- merged_data[-train_index, ]

# Train linear regression model including hospital_subtype
model <- lm(avg_patients ~ hospital_subtype + 
                        avg_weekly_personnel_doses_one_administered + 
                        avg_weekly_personnel_doses_all_administered +
                        avg_weekly_patient_doses_one_administered +
                        avg_weekly_patient_doses_all_administered, data = train_data)

# Model summary
summary(model)
## 
## Call:
## lm(formula = avg_patients ~ hospital_subtype + avg_weekly_personnel_doses_one_administered + 
##     avg_weekly_personnel_doses_all_administered + avg_weekly_patient_doses_one_administered + 
##     avg_weekly_patient_doses_all_administered, data = train_data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -15.0957  -0.8758  -0.0675   0.4344  30.0040 
## 
## Coefficients:
##                                               Estimate Std. Error t value
## (Intercept)                                 -0.1065270  0.7996070  -0.133
## hospital_subtypeCritical Access Hospitals    0.0725926  0.8310922   0.087
## hospital_subtypeLong Term                    0.9662296  0.8380373   1.153
## hospital_subtypeShort Term                  14.1024975  0.7454990  18.917
## avg_weekly_personnel_doses_one_administered  0.0008411  0.0009947   0.846
## avg_weekly_personnel_doses_all_administered -0.0007781  0.0009228  -0.843
## avg_weekly_patient_doses_one_administered    0.0227548  0.0034182   6.657
## avg_weekly_patient_doses_all_administered   -0.0101100  0.0033595  -3.009
##                                             Pr(>|t|)    
## (Intercept)                                  0.89407    
## hospital_subtypeCritical Access Hospitals    0.93043    
## hospital_subtypeLong Term                    0.24950    
## hospital_subtypeShort Term                   < 2e-16 ***
## avg_weekly_personnel_doses_one_administered  0.39819    
## avg_weekly_personnel_doses_all_administered  0.39956    
## avg_weekly_patient_doses_one_administered    7.7e-11 ***
## avg_weekly_patient_doses_all_administered    0.00276 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.852 on 477 degrees of freedom
## Multiple R-squared:  0.7305, Adjusted R-squared:  0.7265 
## F-statistic: 184.7 on 7 and 477 DF,  p-value: < 2.2e-16
  • Short Term: Estimate = 14.1025, p-value < 2e-16 (highly significant). This indicates that, on average, short-term hospitals have 14.1025 more patients compared to the reference category (Children Hospitals).

  • avg_weekly_patient_doses_one_administered: Estimate = 0.0228, p-value = 7.7e-11 (highly significant). Each additional dose administered to patients is associated with an increase in the average number of patients.

  • avg_weekly_patient_doses_all_administered: Estimate = -0.0101, p-value = 0.00276 (significant). This suggests that more doses administered (overall) may be associated with a decrease in average patients, which could indicate a negative correlation in specific contexts.

Overall, the model suggests that avg_weekly_patient_doses_one_administered, avg_weekly_patient_doses_all_administered and hospital_subtypeShort Term are significant predictors of avg_patients. The model explains a substantial portion of the variance in the outcome, but some predictors are not statistically significant, indicating they may not contribute meaningfully to the model.

# Predict on test data
predictions <- predict(model, newdata = test_data)

# Calculate RMSE and R-squared
rmse <- sqrt(mean((predictions - test_data$avg_patients)^2))
r_squared <- 1 - (sum((test_data$avg_patients - predictions)^2) / sum((test_data$avg_patients - mean(test_data$avg_patients))^2))

cat("RMSE:", rmse, "\n")
## RMSE: 5.548382
cat("R-squared:", r_squared, "\n")
## R-squared: 0.5990908
  • Moderate Predictive Power: The R-squared value indicates that the model has a reasonable amount of predictive power, but there is still a significant portion of variability (about 40%) unexplained by the model.

  • Prediction Accuracy: The RMSE provides a measure of how accurate the model’s predictions are. An RMSE of approximately 5.55 means that, on average, the model’s predictions deviate from the actual values of avg_patients by about 5.55 patients.

# Step 1: Create a data frame with actual and predicted values
results <- data.frame(
  Actual = test_data$avg_patients,
  Predicted = predictions
)

# Step 2: Plot the prediction graph
ggplot(results, aes(x = Actual, y = Predicted)) +
  geom_point(alpha = 0.6) + # Scatter plot of actual vs. predicted
  geom_abline(slope = 1, intercept = 0, color = "red", linetype = "dashed") + # 45-degree line
  labs(title = "Actual vs. Predicted Patients",
       x = "Actual Patients",
       y = "Predicted Patients") +
  theme_minimal() +
  xlim(min(results$Actual, results$Predicted), max(results$Actual, results$Predicted)) + # Set limits for x-axis
  ylim(min(results$Actual, results$Predicted), max(results$Actual, results$Predicted)) # Set limits for y-axis

  • Lower Patient Counts : The model’s predictions are more accurate, with many points close to the diagonal red line (ideal prediction line).

  • Higher Patient Counts: Predictions become less accurate, with the model consistently underestimating the actual number of patients. This can be seen as points move farther away from the red line, particularly for actual patient counts above 20.