Is the Average Opinion Score for Modern Family, Friends & Big Bang Theory equal?

Author

Diya Bijoy

Published

November 4, 2025

R Packages Setup

library(tidyverse) # Tidy data processing

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   4.0.0     ✔ tibble    3.3.0
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.1.0     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(ggformula) # Formula based plots

Loading required package: scales

Attaching package: 'scales'

The following object is masked from 'package:purrr':

    discard

The following object is masked from 'package:readr':

    col_factor

Loading required package: ggridges

New to ggformula?  Try the tutorials: 
    learnr::run_tutorial("introduction", package = "ggformula")
    learnr::run_tutorial("refining", package = "ggformula")

library(mosaic) # Data inspection and Statistical Inference

Registered S3 method overwritten by 'mosaic':
  method                           from   
  fortify.SpatialPolygonsDataFrame ggplot2

The 'mosaic' package masks several functions from core packages in order to add 
additional features.  The original behavior of these functions should not be affected by this.

Attaching package: 'mosaic'

The following object is masked from 'package:Matrix':

    mean

The following object is masked from 'package:scales':

    rescale

The following objects are masked from 'package:dplyr':

    count, do, tally

The following object is masked from 'package:purrr':

    cross

The following object is masked from 'package:ggplot2':

    stat

The following objects are masked from 'package:stats':

    binom.test, cor, cor.test, cov, fivenum, IQR, median, prop.test,
    quantile, sd, t.test, var

The following objects are masked from 'package:base':

    max, mean, min, prod, range, sample, sum

library(broom) # Tidy outputs from Statistical Analyses
library(infer) # Statistical Inference, Permutation/Bootstrap


Attaching package: 'infer'

The following objects are masked from 'package:mosaic':

    prop_test, t_test

library(supernova) # Beginner-Friendly ANOVA Tables


Attaching package: 'supernova'

The following object is masked from 'package:scales':

    number

library(ggstatsplot) # Statistical Plots

You can cite this package as:
     Patil, I. (2021). Visualizations with statistical details: The 'ggstatsplot' approach.
     Journal of Open Source Software, 6(61), 3167, doi:10.21105/joss.03167

library(ggcompare) # Improved p.value brackets on graphs
library(patchwork) # Arranging Plots
library(ggprism) # Interesting Categorical Axes
library(paletteer) # Color Palettes

Reading the file

tvshows <- read.csv("2-tv_series.csv")
glimpse(tvshows)

Rows: 52
Columns: 5
$ Name            <chr> "Diya", "Ihina", "Abhinav", "Nikhita", "Rishi", "Charv…
$ Gender          <chr> "Female", "Female", "Male", "Female", "Male", "Female"…
$ Friends         <dbl> 8.0, 7.0, 8.0, 7.0, 8.0, 6.0, 8.0, 4.0, 9.0, 7.0, NA, …
$ Modern.Family   <dbl> 9.0, 9.0, 10.0, 9.0, 6.0, 9.0, 7.5, 8.0, 8.0, 9.0, 9.5…
$ Big.Bang.Theory <dbl> 9.0, 6.0, 7.0, 6.0, 5.0, 5.5, 9.0, 6.0, 5.0, 3.0, 6.0,…

Cleaning the Data

tvshows_modified <- tvshows %>% drop_na()
tvshows_modified

       Name Gender Friends Modern.Family Big.Bang.Theory
1      Diya Female     8.0           9.0             9.0
2     Ihina Female     7.0           9.0             6.0
3   Abhinav   Male     8.0          10.0             7.0
4   Nikhita Female     7.0           9.0             6.0
5     Rishi   Male     8.0           6.0             5.0
6    Charvi Female     6.0           9.0             5.5
7      Maya Female     8.0           7.5             9.0
8     Anuva Female     4.0           8.0             6.0
9   Shourya   Male     9.0           8.0             5.0
10  sarthak   Male     7.0           9.0             3.0
11  Niyosha Female     8.0           8.0             7.0
12  Alekhya Female     5.0           8.0             7.0
13 Shalmali Female     7.0           9.0             9.0
14   Riddhi Female     8.5          10.0             8.0
15   Aryaan   Male     7.0           9.0             8.0
16  Jeffery   Male     6.0           8.0            10.0
17    Ishaa Female     9.0          10.0             7.0
18     Paru Female     5.0           9.0             8.0
19  Krushna Female     2.0           3.0             4.0
20     Neil   Male     6.5           8.0             4.0
21 Avantika Female     8.0          10.0             2.0
22 prakruti Female     7.0           9.0             7.5
23    Aditi Female     5.0          10.0             3.0
24    Kavya Female     7.0           8.0             3.0
25   Mitali Female     5.0           8.0             7.0
26   Ananya Female     8.0          10.0             8.0
27    Dhruv   Male     6.5           7.5             9.0
28  Prithvi   Male     6.0           8.0             7.0
29     Arya   Male     1.0           6.0             3.0
30 Tathastu   Male    10.0           9.0             9.0
31  Vaibhav   Male     8.0          10.0             6.0
32   Akarsh   Male     7.0           8.0             8.0
33    Arnav   Male     8.0           3.0             5.0
34    Aarav   Male     9.0           8.0             7.0
35   Kreesh   Male     8.0           9.0             7.0
36 Niranjan   Male     4.0           9.0             4.0
37   Vishal   Male     5.0           9.0             5.0
38   Saachi Female     7.0          10.0             7.0
39     Diya Female     8.0          10.0             7.0
40    Rahul   Male     9.0           9.0             7.0
41    Nipun   Male     9.5           8.0             7.0
42  Akshaya   Male     7.5           9.0             9.0
43     Anay   Male     7.0           8.0             5.0
44     Amit   Male     2.0           7.0             4.0
45    Aryan   Male    10.0           6.0             8.0
46  Drishti Female     8.5           7.0             8.5

tvshows_pivot <- tvshows_modified %>%
  pivot_longer(
    cols = c(Friends, Modern.Family, Big.Bang.Theory),
    names_to = "Series",
    values_to = "Opinion_Score"
  ) %>%
  drop_na() %>%
  mutate(
    Series = factor(
      Series,
      levels = c("Friends", "Modern.Family", "Big.Bang.Theory"),
      labels = c("Friends", "Modern Family", "Big Bang Theory")
    )
  ) %>%
  rename("Id" = 1) -> tv_long

tv_long

# A tibble: 138 × 4
   Id      Gender Series          Opinion_Score
   <chr>   <chr>  <fct>                   <dbl>
 1 Diya    Female Friends                     8
 2 Diya    Female Modern Family               9
 3 Diya    Female Big Bang Theory             9
 4 Ihina   Female Friends                     7
 5 Ihina   Female Modern Family               9
 6 Ihina   Female Big Bang Theory             6
 7 Abhinav Male   Friends                     8
 8 Abhinav Male   Modern Family              10
 9 Abhinav Male   Big Bang Theory             7
10 Nikhita Female Friends                     7
# ℹ 128 more rows

Visualising the Data

tv_long %>% gf_boxplot(Opinion_Score~Series, orientation = "x", fill = ~ Series) %>% 
  gf_labs(
    title = "Opinion Score vs Series",
    x = "TV shows",
    y = "Opinion Scores"
  ) %>% 
  gf_hline(yintercept = ~ mean(Opinion_Score), color = "grey") %>% 
  gf_annotate(geom = "text", label = "Overall Mean", x = 2, y = mean(tv_long$Opinion_Score) + -0.5, size = 4)

  gf_theme(theme_minimal)

NULL

Between the three groups, we see a lot of variance from the overall mean.
Friends & Big Bang Theory are closer while Modern Family has a lot of variance.

gf_jitter(Opinion_Score ~ Series,
  color = ~Series, width = 0.2,
  data = tv_long, size = 3, alpha = 0.5
) %>%
  gf_labs(
    title = "Opinion Scores across TV Series",
    x = "TV Series", y = "Opinion Score"
  ) %>%
  gf_hline(yintercept = ~ mean(Opinion_Score), color = "grey") %>%
  gf_annotate(
    geom = "text",
    label = "Overall Mean",
    x = 2,
    y = mean(tv_long$Opinion_Score) + 0.5,
    size = 4
  )

From the above jitter plot, we see that there is a lot of variance within each group (height of each plot is long).

ANOVA

H0: All three series have equal mean opinion scores
H1: All three series have different mean opinion scores

tvshows_anova <- aov(Opinion_Score~Series, data = tv_long)
tvshows_anova

Call:
   aov(formula = Opinion_Score ~ Series, data = tv_long)

Terms:
                  Series Residuals
Sum of Squares   86.6341  472.8098
Deg. of Freedom        2       135

Residual standard error: 1.871442
Estimated effects may be unbalanced

tvshows_supernova <-
  supernova::pairwise(tvshows_anova,
    correction = "Bonferroni", # Try "Tukey"
    alpha = 0.05, # 95% CI calculation
    var_equal = TRUE, # We'll see
    plot = T
  )

tvshows_supernova$Series

Series

Levels: 3

Family-wise error-rate: 0.049


  group_1         group_2        diff pooled_se      t    df  lower  upper p_adj
  <chr>           <chr>         <dbl>     <dbl>  <dbl> <int>  <dbl>  <dbl> <dbl>
1 Modern Family   Friends       1.413     0.276  5.121   135  0.820  2.006 .0000
2 Big Bang Theory Friends      -0.446     0.276 -1.615   135 -1.039  0.148 .3259
3 Big Bang Theory Modern Fami… -1.859     0.276 -6.736   135 -2.452 -1.265 .0000

supernova::supernova(tvshows_anova)

 Analysis of Variance Table (Type III SS)
 Model: Opinion_Score ~ Series

                              SS  df     MS      F   PRE     p
 ----- --------------- | ------- --- ------ ------ ----- -----
 Model (error reduced) |  86.634   2 43.317 12.368 .1549 .0000
 Error (from model)    | 472.810 135  3.502                   
 ----- --------------- | ------- --- ------ ------ ----- -----
 Total (empty model)   | 559.444 137  4.084

supernova::equation(tvshows_anova)

Fitted equation:
Opinion_Score = 6.891304 + 1.413043*SeriesModern Family + -0.4456522*SeriesBig Bang Theory + e

tv_long %>%
  mutate(fitted = fitted(tvshows_anova)) %>%
  gf_jitter(Opinion_Score ~ Series,
    width = 0.2, alpha = 0.6,
    color = ~Series,
    data = .
  ) %>%
  gf_summary(
    group = ~1, 
    fun = "mean", geom = "line", colour = "lightblue",
    lty = 1, linewidth = 2
  ) %>%
  gf_point(fitted ~ Series,
    color = ~Series,
    shape = 2,
    size = 6
  ) %>%
  gf_segment(fitted + fitted ~ 0 + Series, linetype = 2, color = ~Series) %>%
  gf_labs(
    title = "Fitted ANOVA Model Equation",
    x = "Series", y = "Opinion Score"
  ) %>%
  gf_refine(
    scale_x_discrete(guide = "prism_bracket"),
    scale_y_continuous(breaks = c(10, 15, (26.3 - 10.1), 20, (26.3 - 5.3), 25, 26.3, 30, 35))
  )

Warning: The S3 guide system was deprecated in ggplot2 3.5.0.
ℹ It has been replaced by a ggproto system that can be extended.

Inference:-

The test gave an F-value of 14.24 and a p-value < 0.001 which means the result is statistically significant. This suggests that at least one show’s average opinion score is different from the others.
We reject the null hypothesis that all three shows have the same average opinion score. At least one show’s rating differs significantly from the others.

1. Check for Normality

shapiro.test(x = tv_long$Opinion_Score)


    Shapiro-Wilk normality test

data:  tv_long$Opinion_Score
W = 0.91645, p-value = 3.301e-07

tv_long %>%
  group_by(Series) %>%
  group_modify(~ .x %>%
    select(Opinion_Score) %>%
    as_vector() %>%
    shapiro.test() %>%
    broom::tidy())

# A tibble: 3 × 4
# Groups:   Series [3]
  Series          statistic    p.value method                     
  <fct>               <dbl>      <dbl> <chr>                      
1 Friends             0.909 0.00161    Shapiro-Wilk normality test
2 Modern Family       0.808 0.00000294 Shapiro-Wilk normality test
3 Big Bang Theory     0.946 0.0320     Shapiro-Wilk normality test

The p value at each level is very low so we reject the Null Hypothesis that its normally distributed.

tvshows_anova$residuals %>%
  as_tibble() %>%
  gf_dhistogram(~value, data = .) %>%
  gf_labs(
    title = "Residuals Histogram",
    x = "Residuals", y = "Count"
  ) %>%
  gf_fitdistr()

`stat_bin()` using `bins = 30`. Pick better value `binwidth`.

##
tvshows_anova$residuals %>%
  as_tibble() %>%
  gf_qq(~value, data = .) %>%
  gf_qqstep() %>%
  gf_labs(
    title = "Residuals Q-Q Plot",
    x = "Theoretical Quantiles", y = "Sample Quantiles"
  ) %>%
  gf_qqline()

shapiro.test(tvshows_anova$residuals)


    Shapiro-Wilk normality test

data:  tvshows_anova$residuals
W = 0.93715, p-value = 7.456e-06

The residual p value is also very small.

2. Check for Similar Variance

tv_long %>%
  group_by(Series) %>%
  summarise(variance = var(Opinion_Score))

# A tibble: 3 × 2
  Series          variance
  <fct>              <dbl>
1 Friends             4.07
2 Modern Family       2.47
3 Big Bang Theory     3.97

DescTools::LeveneTest(Opinion_Score ~ Series, data = tv_long)

Levene's Test for Homogeneity of Variance (center = median)
       Df F value Pr(>F)
group   2  1.5298 0.2203
      135

fligner.test(Opinion_Score ~ Series, data = tv_long)


    Fligner-Killeen test of homogeneity of variances

data:  Opinion_Score by Series
Fligner-Killeen:med chi-squared = 4.0885, df = 2, p-value = 0.1295

From both tests, we see that p>0.05. Therefore, variances are not significantly different.

Inference:-

The Variables are not normally distributed
The Variables are not significantly different

Effect Size

To see how how strong or big the difference is.

tvshows_supernova <-
  supernova::pairwise(tvshows_anova,
    plot = TRUE,
    alpha = 0.05,
    correction = "Bonferroni"
  )

tvshows_supernova$Series

Series

Levels: 3

Family-wise error-rate: 0.049


  group_1         group_2        diff pooled_se      t    df  lower  upper p_adj
  <chr>           <chr>         <dbl>     <dbl>  <dbl> <int>  <dbl>  <dbl> <dbl>
1 Modern Family   Friends       1.413     0.276  5.121   135  0.820  2.006 .0000
2 Big Bang Theory Friends      -0.446     0.276 -1.615   135 -1.039  0.148 .3259
3 Big Bang Theory Modern Fami… -1.859     0.276 -6.736   135 -2.452 -1.265 .0000

Modern Family has the highest on average.
Friends was rated moderately.
Big Bang Theory received the lowest scores.
There is no significant difference between Friends and Big Bang Theory.

Visualizing ANOVA using ggstatsplot

library(ggstatsplot)
tv_long %>%
  ggstatsplot::ggbetweenstats(
    x = Series, y = Opinion_Score,
    colour = Series, alpha = 0.8,
    type = "parametric",
    p.adjust.method = "bonferroni",
    conf.level = 0.95,

    violin.args = list(width = 0.0)
  ) +

  scale_colour_paletteer_d("ggthemes::colorblind") +
  labs(
    title = "ANOVA : Series vs Opinion Scores",
    x = "Series", y = "Opinion Score"
  )

Scale for colour is already present.
Adding another scale for colour, which will replace the existing scale.

F stat value is 14.24 - which is a very large variance.
P value is 4.32 * 10^-06 is really small, and such a large F value is very unlikely to happen by chance.
So we reject the Null Hypothesis that all three means are equal.
The graph shows the means for each series:
- Friends: 6.89
- Modern Family: 8.30
- Big Bang Theory: 6.45
Modern Family was rated significantly higher than both Friends and Big Bang Theory.

Permutation Test

observed_infer <-
  tv_long %>%
  specify(Opinion_Score ~ Series) %>%
  hypothesise(null = "independence") %>%
  calculate(stat = "F")
observed_infer

Response: Opinion_Score (numeric)
Explanatory: Series (factor)
Null Hypothesi...
# A tibble: 1 × 1
   stat
  <dbl>
1  12.4

null_dist_infer <- tv_long %>%
  specify(Opinion_Score ~ Series) %>%
  hypothesise(null = "independence") %>%
  generate(reps = 4999, type = "permute") %>%
  calculate(stat = "F")
null_dist_infer

Response: Opinion_Score (numeric)
Explanatory: Series (factor)
Null Hypothesi...
# A tibble: 4,999 × 2
   replicate  stat
       <int> <dbl>
 1         1 0.536
 2         2 0.574
 3         3 0.652
 4         4 2.96 
 5         5 2.73 
 6         6 0.643
 7         7 1.12 
 8         8 1.16 
 9         9 0.428
10        10 0.625
# ℹ 4,989 more rows

null_dist_infer %>%
  visualize() +
  shade_p_value(observed_infer, direction = "greater") +
  labs(
    title = "Permutation Test: Distribution of F-statistic",
    x = "F-statistic",
    y = "Count"
  )

null_dist_infer %>%
  get_p_value(obs_stat = observed_infer, direction = "two-sided")

Warning: Please be cautious in reporting a p-value of 0. This result is an approximation
based on the number of `reps` chosen in the `generate()` step.
ℹ See `get_p_value()` (`?infer::get_p_value()`) for more information.

# A tibble: 1 × 1
  p_value
    <dbl>
1       0

Inference:-

The observed F-statistic was 12.37, shown by the red line on the graph. The red line lies far to the right of the null distribution which represents what we would expect if there were no real differences between shows.
The p-value = 0 which indicates none of the 4,999 random permutations produced an F value as large as the observed one.
We reject the null hypothesis which confirms that at least one show’s average opinion score is significantly different from the others.

Conclusion

The ANOVA test showed a significant difference in average opinion scores (F = 12.37, p < 0.001), indicating that not all shows were rated the same.
Because the data was not perfectly normal, a permutation test was also conducted. The observed F-statistic lay far beyond the null distribution with a p-value of 0, confirming that the differences are statistically significant and not due to random variation.

Among the three shows, Modern Family received the highest average rating (8.3), followed by Friends (6.9) and The Big Bang Theory (6.4). This shows that viewers clearly preferred Modern Family, while Friends and The Big Bang Theory received comparatively lower and similar scores.

Overall, both tests lead to the same conclusion - there is a real and significant difference in how audiences rated the three shows, with Modern Family being the most favorite.