Is the average weekly expenditure for boys equal to girls?

Author

Diya Bijoy

Published

November 4, 2025

R Packages Setup

library(tidyverse)

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   4.0.0     ✔ tibble    3.3.0
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.1.0     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(mosaic)

Registered S3 method overwritten by 'mosaic':
  method                           from   
  fortify.SpatialPolygonsDataFrame ggplot2

The 'mosaic' package masks several functions from core packages in order to add 
additional features.  The original behavior of these functions should not be affected by this.

Attaching package: 'mosaic'

The following object is masked from 'package:Matrix':

    mean

The following objects are masked from 'package:dplyr':

    count, do, tally

The following object is masked from 'package:purrr':

    cross

The following object is masked from 'package:ggplot2':

    stat

The following objects are masked from 'package:stats':

    binom.test, cor, cor.test, cov, fivenum, IQR, median, prop.test,
    quantile, sd, t.test, var

The following objects are masked from 'package:base':

    max, mean, min, prod, range, sample, sum

library(ggformula)
library(infer)


Attaching package: 'infer'

The following objects are masked from 'package:mosaic':

    prop_test, t_test

library(broom) # Clean test results in tibble form
library(resampledata) # Datasets from Chihara and Hesterberg's book


Attaching package: 'resampledata'

The following object is masked from 'package:datasets':

    Titanic

library(openintro) # More datasets

Loading required package: airports
Loading required package: cherryblossom
Loading required package: usdata

Attaching package: 'openintro'

The following object is masked from 'package:mosaic':

    dotPlot

The following objects are masked from 'package:lattice':

    ethanol, lsegments

library(visStatistics) # One package to rule them all
library(ggstatsplot)

You can cite this package as:
     Patil, I. (2021). Visualizations with statistical details: The 'ggstatsplot' approach.
     Journal of Open Source Software, 6(61), 3167, doi:10.21105/joss.03167

Reading file

exp <- read.csv("4-weekly_expenditure.csv")
glimpse(exp)

Rows: 40
Columns: 6
$ Name                        <chr> "Radha", "Prerana", "Chris", "Nireeksha", …
$ Gender                      <chr> "Female", "Female", "Male", "Female", "Mal…
$ Total_Expenditure_Last_Week <dbl> 2000.0, 1200.0, 15000.0, 3620.0, 560.0, 22…
$ X                           <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ X.1                         <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ X.2                         <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…

Munging

exp_modified <- exp %>%
  select(c("Name","Gender","Total_Expenditure_Last_Week")) %>% 
  mutate(across(where(is.character), as.factor))%>%
  dplyr::relocate(where(is.factor), .before = Total_Expenditure_Last_Week)
glimpse(exp_modified)

Rows: 40
Columns: 3
$ Name                        <fct> Radha, Prerana, Chris, Nireeksha, Supraj, …
$ Gender                      <fct> Female, Female, Male, Female, Male, Male, …
$ Total_Expenditure_Last_Week <dbl> 2000.0, 1200.0, 15000.0, 3620.0, 560.0, 22…

Examining the Data

exp_modified %>% 
  count(Gender)

  Gender  n
1 Female 20
2   Male 20

exp_modified %>% 
  group_by(Gender) %>% 
  summarise(
    avg_exp = mean(Total_Expenditure_Last_Week, narm = TRUE),
     sd_exp = sd(Total_Expenditure_Last_Week, na.rm = TRUE),
  )

# A tibble: 2 × 3
  Gender avg_exp sd_exp
  <fct>    <dbl>  <dbl>
1 Female   2582   2465.
2 Male     7085. 11501.

Visualising the Data

exp_modified %>% 
  gf_boxplot(Gender~Total_Expenditure_Last_Week, fill =~ Gender,orientation = "y") %>% 
  gf_labs(
    title="Last Week Expenditure Girls vs Guys",
    x ="Expense Last Week",
    y = "Gender") %>% 
  gf_theme(theme_minimal())

exp_mod2 <-exp_modified %>% 
  slice_max(n = 15, order_by = Total_Expenditure_Last_Week)
  
exp_mod2 %>%  
  gf_col(reorder(Name, Total_Expenditure_Last_Week)~Total_Expenditure_Last_Week, fill = "steelblue") %>% 
  gf_labs(
    title = "Top 15 highest expenses",
    x="Expenses",
    y = "Names"
  ) %>% 
  gf_theme(theme_minimal)

avg <- exp_mod2 %>% 
  group_by(Gender) %>% 
  summarise(
    avg_expense = mean(Total_Expenditure_Last_Week)
  )

avg %>% 
  gf_col(avg_expense~Gender, fill = ~ Gender) %>% 
  gf_labs(
    title = "Average expense of Girls & Boys",
    x = "Gender",
    y = "Average Expense"
  ) %>% 
  gf_refine(scale_fill_brewer(palette = "Set1")) %>% 
  gf_theme(theme_minimal)

exp_modified %>%
  gf_density(
    ~Total_Expenditure_Last_Week,
    fill = ~Gender, colour = ~Gender,
    alpha = 0.5,
    title = "Expenditure Last Week Densities",
    subtitle = "Boys vs Girls"
  ) %>% 
  gf_theme(theme_minimal)

Null Hypothesis & Alternative Hypothesis

H0: Weekly expenditure is independent of gender (There is no difference between girls expenditure & boys expenditure)
H1: Weekly expenditure is dependent on gender.

1. Check for Normality

exp_modified %>%
    gf_density(~Total_Expenditure_Last_Week,
    fill = ~Gender,
    alpha = 0.5,
    title = "Weekly Expenditure for boys and girls"
  ) %>%
  gf_facet_grid(~Gender) %>%
  gf_fitdistr(dist = "dnorm") %>%
  gf_theme(theme(legend.position = "none"))

exp_modified %>%
  gf_qqline(~Total_Expenditure_Last_Week,
    color = ~Gender,
    title = "Weekly Expenditure",
    subtitle = "Are they Normally Distributed?"
  ) %>%
  gf_qq() %>%
  gf_facet_wrap(~Gender) %>% 

  gf_theme(theme(legend.position = "none"))

boys_expense <- exp_modified %>%
  filter(Gender == "Male") %>%
  select(Total_Expenditure_Last_Week)

girls_expense <- exp_modified %>%
  filter(Gender == "Female") %>%
  select(Total_Expenditure_Last_Week)

shapiro.test(boys_expense$Total_Expenditure_Last_Week)


    Shapiro-Wilk normality test

data:  boys_expense$Total_Expenditure_Last_Week
W = 0.56351, p-value = 1.261e-06

shapiro.test(girls_expense$Total_Expenditure_Last_Week)


    Shapiro-Wilk normality test

data:  girls_expense$Total_Expenditure_Last_Week
W = 0.78317, p-value = 0.0004899

P value for both boys & girls is less than 0.05, therefore its not normally distributed. This means that the distributions of weekly expenditure for both boys and girls deviate significantly from a normal distribution.

2. Check for Variances

exp_modified %>%
  dplyr::select(Total_Expenditure_Last_Week, Gender) %>%
  dplyr::group_by(Gender) %>%
  dplyr::summarize(variance = var(Total_Expenditure_Last_Week), n = n())

# A tibble: 2 × 3
  Gender   variance     n
  <fct>       <dbl> <int>
1 Female   6074125.    20
2 Male   132268687.    20

var.test(Total_Expenditure_Last_Week ~ Gender,
  data = exp_modified,
  conf.int = TRUE, conf.level = 0.95
) %>%
  broom::tidy()

Multiple parameters; naming those columns num.df and den.df.

# A tibble: 1 × 9
  estimate num.df den.df statistic p.value conf.low conf.high method alternative
     <dbl>  <int>  <int>     <dbl>   <dbl>    <dbl>     <dbl> <chr>  <chr>      
1   0.0459     19     19    0.0459 8.53e-9   0.0182     0.116 F tes… two.sided

##
qf(0.975, 275, 322)

[1] 1.254823

The variance is way off with a p value 8.53 * 10^-9 which is much less than 0.05, so therefore the variance for the variables is significantly different.

Inference

The two variables are not normally distributed.
The two variances are also significantly different.

This means that its Non-Parametric so the Mann-Whitney Test & Permutation Test must be done since its not normally distributed and the variances is different.

1. Mann-Whitney Test

H0: The rank sums for girls and boys do not differ significantly.
H1: The rank sums for girls and boys do differ significantly.

library(ggprism)
library(ggtext)
library(glue)
library(latex2exp)

exp_modified %>%
  gf_jitter(rank(Total_Expenditure_Last_Week) ~ Gender,
    color = ~Gender,
    show.legend = FALSE,
    width = 0.05, alpha = 0.25,
    ylab = "Ranks of Total Expenditure",
    title = "Ranked Expenditure for Boys and Girls"
  ) %>%
  gf_summary(
    group = ~1, 
    fun = "mean", geom = "line", colour = "lightblue",
    lty = 1, linewidth = 2
  ) %>%
  gf_summary(fun = "mean", colour = "firebrick", size = 4, geom = "point") %>%
  gf_refine(scale_x_discrete(
    breaks = c("Male", "Female"),
    labels = c("Male", "Female"),
    guide = "prism_bracket"
  )) %>%
  gf_annotate("label",
    label = "Mean Rank \nBoys Scores",
    y = 300, x = 0.75, inherit = FALSE
  ) %>%
  gf_annotate("label",
    label = "Mean Rank \nGirls Scores",
    y = 320, x = 2.25, inherit = FALSE
  )

Warning in ggplot2::annotate(geom = geom, x = x, y = y, xmin = xmin, xmax = xmax, : Ignoring unknown parameters: `inherit`
Ignoring unknown parameters: `inherit`

Warning: The S3 guide system was deprecated in ggplot2 3.5.0.
ℹ It has been replaced by a ggproto system that can be extended.

wilcox.test(Total_Expenditure_Last_Week ~ Gender,
  data = exp_modified,
  conf.int = TRUE
) %>%
  broom::tidy()

Warning in wilcox.test.default(x = DATA[[1L]], y = DATA[[2L]], ...): cannot
compute exact p-value with ties

Warning in wilcox.test.default(x = DATA[[1L]], y = DATA[[2L]], ...): cannot
compute exact confidence intervals with ties

# A tibble: 1 × 7
  estimate statistic p.value conf.low conf.high method               alternative
     <dbl>     <dbl>   <dbl>    <dbl>     <dbl> <chr>                <chr>      
1   -1100.      132.  0.0657   -3075.      60.0 Wilcoxon rank sum t… two.sided

Here, we see that the p value = 0.06 which is >0.05, so we fail to reject the Null Hypothesis. This means there is no statistically significant difference in weekly expenditure between boys and girls.

mosaic::t_test(Total_Expenditure_Last_Week ~ Gender,
  var.equal = FALSE, # Welch Correction
  data = exp_modified
) %>%
  broom::tidy()

# A tibble: 1 × 10
  estimate estimate1 estimate2 statistic p.value parameter conf.low conf.high
     <dbl>     <dbl>     <dbl>     <dbl>   <dbl>     <dbl>    <dbl>     <dbl>
1   -4503.      2582     7085.     -1.71   0.102      20.7   -9976.      971.
# ℹ 2 more variables: method <chr>, alternative <chr>

Inferences:-

The p-value was 0.06, which is slightly greater than the 0.05 threshold.
We fail to reject the null hypothesis, which means that there is no statistically significant difference in the distribution of weekly expenditure between boys and girls.

2. Permutations Test

obs_diff_infer <- exp_modified %>% 
  infer::specify(Total_Expenditure_Last_Week ~ Gender) %>%
  infer::calculate(
    stat = "diff in means",
    order = c("Male", "Female")
  )
obs_diff_infer

Response: Total_Expenditure_Last_Week (numeric)
Explanatory: Gender (factor)
# A tibble: 1 × 1
   stat
  <dbl>
1 4503.

On average, males spent Rs. 4502.75 more than females last week. Is this difference due to random chance or is it statistically significant?

null_dist_infer <- exp_modified %>% 
  specify(Total_Expenditure_Last_Week ~ Gender) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 4999, type = "permute") %>% 
  calculate(
    stat = "diff in means",
    order = c("Male", "Female") 
  )
null_dist_infer

Response: Total_Expenditure_Last_Week (numeric)
Explanatory: Gender (factor)
...
# A tibble: 4,999 × 2
   replicate   stat
       <int>  <dbl>
 1         1 -3680.
 2         2  3696.
 3         3 -2997.
 4         4  2828.
 5         5 -2935.
 6         6  3881.
 7         7  1753.
 8         8  3548.
 9         9 -2063.
10        10  -155.
# ℹ 4,989 more rows

null_dist_infer %>%
  visualise() + 
  shade_p_value(obs_diff_infer,
    direction = "two-sided"
  ) +
  scale_y_continuous(expand = c(0, 0))

null_dist_infer %>%
  get_p_value(obs_stat = obs_diff_infer, direction = "two-sided")

# A tibble: 1 × 1
  p_value
    <dbl>
1  0.0556

Inferences:-

The p value = 0.059 (5.9%), which is greater than the 0.05 threshold.
So we fail to reject the Null Hypothesis that weekly expenditure is independent of gender. So the observed difference of Rs. 4502.75 has occurred due to random chance.

Conclusion

To conclude, while boys appear to have spent a bit more on average than girls, statistical tests show that this difference is not significant. Both the Mann Whitney and permutation tests indicate that gender does not play a major role in determining weekly spending. Overall, weekly expenditure patterns are fairly similar between boys and girls.