US Elections

Biden’s Approval Margins

This assignment involves analyzing the poll approval for the US president. Here, I’ll again start by loading the libraries which I might need to do my analysis: and a bit of cleaning the data:

# Imported approval polls data directly off fivethirtyeight website
approval_polllist <- read_csv('https://projects.fivethirtyeight.com/biden-approval-data/approval_polllist.csv')

## 
## ── Column specification ────────────────────────────────────────────────────────
## cols(
##   .default = col_character(),
##   samplesize = col_double(),
##   weight = col_double(),
##   influence = col_double(),
##   approve = col_double(),
##   disapprove = col_double(),
##   adjusted_approve = col_double(),
##   adjusted_disapprove = col_double(),
##   tracking = col_logical(),
##   poll_id = col_double(),
##   question_id = col_double()
## )
## ℹ Use `spec()` for the full column specifications.

glimpse(approval_polllist)

## Rows: 1,598
## Columns: 22
## $ president           <chr> "Joseph R. Biden Jr.", "Joseph R. Biden Jr.", "Jos…
## $ subgroup            <chr> "All polls", "All polls", "All polls", "All polls"…
## $ modeldate           <chr> "9/13/2021", "9/13/2021", "9/13/2021", "9/13/2021"…
## $ startdate           <chr> "1/21/2021", "1/26/2021", "1/26/2021", "1/27/2021"…
## $ enddate             <chr> "2/2/2021", "1/28/2021", "1/28/2021", "1/29/2021",…
## $ pollster            <chr> "Gallup", "Morning Consult", "Rasmussen Reports/Pu…
## $ grade               <chr> "B+", "B", "B", "A+", "B", "B", "B+", "B-", "B/C",…
## $ samplesize          <dbl> 906.00, 3423.00, 1500.00, 1261.00, 2200.00, 15000.…
## $ population          <chr> "a", "a", "lv", "a", "a", "a", "rv", "rv", "lv", "…
## $ weight              <dbl> 1.31470660, 0.19383147, 0.33818752, 2.35547250, 0.…
## $ influence           <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
## $ approve             <dbl> 57.00000, 56.00000, 50.00000, 53.00000, 56.00000, …
## $ disapprove          <dbl> 37.0, 31.0, 45.0, 29.0, 33.0, 32.0, 39.0, 35.0, 36…
## $ adjusted_approve    <dbl> 56.36684, 54.58655, 52.41459, 53.38782, 54.58655, …
## $ adjusted_disapprove <dbl> 36.35828, 34.25346, 39.04388, 33.12161, 36.25346, …
## $ multiversions       <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "*", NA, N…
## $ tracking            <lgl> NA, NA, TRUE, NA, NA, TRUE, NA, NA, NA, TRUE, TRUE…
## $ url                 <chr> "https://news.gallup.com/poll/329348/biden-begins-…
## $ poll_id             <dbl> 74344, 74292, 74290, 74321, 74378, 74339, 74322, 7…
## $ question_id         <dbl> 139651, 139518, 139515, 139560, 139765, 139644, 13…
## $ createddate         <chr> "2/4/2021", "2/1/2021", "1/29/2021", "2/1/2021", "…
## $ timestamp           <chr> "13:37:08 13 Sep 2021", "13:37:08 13 Sep 2021", "1…

#Using `lubridate` to fix dates, as they are given as characters.
approval_polllist <- approval_polllist %>% 
  mutate(enddate = mdy(enddate))

What I did is to calculate the average net approval rate (approve- disapprove) for each week since he got into office. I plot the net approval, along with its 95% confidence interval. There are various dates given for each poll, I used enddate, i.e., the date the poll ended.

#tidy data and calculate CI using formula
net_approval <- approval_polllist %>% 
  filter(!is.na(subgroup)) %>%
  #using lubridate to get week number
  mutate(week = isoweek(enddate),
         net_approval_day = approve - disapprove) %>% 
  group_by(subgroup, week) %>%
  summarise(mean_net_approval = mean(net_approval_day),
            sd_net_approval = sd(net_approval_day),
            count = n(),
            se_twitter = sd_net_approval / sqrt(count),
            t_critical = qt(0.975, count - 1),
            lower_ci =  mean_net_approval - t_critical*se_twitter,
            upper_ci = mean_net_approval + t_critical*se_twitter)

## Warning in qt(0.975, count - 1): NaNs produced

## Warning in qt(0.975, count - 1): NaNs produced

## `summarise()` has grouped output by 'subgroup'. You can override using the `.groups` argument.

#plot Biden's weekly net approval rate
ggplot(net_approval, 
       aes(x= week, 
           y= mean_net_approval)) +
  geom_line(color = "red")+
  geom_point(color = "red")+
  geom_smooth(color = "blue",
              level = 0,
              size = 1)+
  #add orange line at zero
  geom_hline(yintercept=0, 
             color = "orange", 
             size = 2)+
  theme_bw()+
  #add confidence band using calculated CI
  geom_ribbon(aes(ymin = lower_ci, 
                  ymax = upper_ci),
              alpha=0.3,
              fill = "grey",
              color = "red") + 
  labs(
    title = "Estimating Approval Margin (approve-disapprove) for Joe Biden",
    subtitle = "Weekly average of all polls by different subgroups",
    x = "Week of the year",
    y = "Average Approval Margin (approve-disapprove)")+
  #differentiate between Adults, All polls, Voters 
  facet_wrap(vars(subgroup))+
  scale_y_continuous(breaks=seq(-15,10,2.5))

## `geom_smooth()` using method = 'loess' and formula 'y ~ x'

Compare Confidence Intervals

Compare the confidence intervals for week 4 and week 25.

net_approval_4_25 <- net_approval %>% 
  filter(week %in% c(4, 25)) %>% 
  mutate(
    ci_width = upper_ci - lower_ci) %>% 
  select(subgroup, week, lower_ci, upper_ci, ci_width)

net_approval_4_25

## # A tibble: 6 x 5
## # Groups:   subgroup [3]
##   subgroup   week lower_ci upper_ci ci_width
##   <chr>     <dbl>    <dbl>    <dbl>    <dbl>
## 1 Adults        4    21.4      24.4     3.02
## 2 Adults       25    12.0      16.5     4.57
## 3 All polls     4    14.0      24.2    10.2 
## 4 All polls    25     8.57     13.7     5.08
## 5 Voters        4     5.92     28.0    22.1 
## 6 Voters       25     3.61     10.5     6.85

From the results, I can clearly see that for all subgroups, the confidence interval for Biden’s net approval rate has been narrower from week 4 to week 25. Especially for Voters subgroup, the width confidence interval has been drastically decreased from 16.84 to 6.85. I assume this is because as after Biden has been elected for a longer period of time in week 25 (almost half a year), all adults including his voters would become more clear about their approval or disapproval to the president. After Americans took over 25-week time to evaluate their new elected president, they would probably have a clearer attitude towards Biden’s policy changes, administration and national strategies. These clearer perceptions then result in this decreasing confidence interval trend.