Math Anxiety Data set

Math, math, math… for some an absolute nightmare, for others the holy grail of existence. Let’s break it down.

Data Cleaning

This section handles loading the dataset, cleaning missing values, and initial transformations like converting variables to factors.

Load necessary libraries for data manipulation, visualization, and interactive elements.

Setting up R Packages

library(ggformula)

library(janitor)


library(mosaic)

library(naniar)
library(skimr)


library(tidyverse)

library(tinytable)


library(visdat)
library(crosstable)


library(RColorBrewer)

Read Data

meth <- readr::read_delim("MathAnxiety.csv",
                          delim = ";",
                          locale = locale(decimal_mark = ",")) %>% 
  janitor::clean_names("snake")

Rows: 599 Columns: 6
── Column specification ────────────────────────────────────────────────────────
Delimiter: ";"
chr (2): Gender, Grade
dbl (4): Age, AMAS, RCMAS, Arith

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

meth

# A tibble: 599 × 6
     age gender grade      amas rcmas arith
   <dbl> <chr>  <chr>     <dbl> <dbl> <dbl>
 1  138. Boy    Secondary     9    20     6
 2  141. Boy    Secondary    18     8     6
 3  138. Girl   Secondary    23    26     5
 4  143. Girl   Secondary    19    18     7
 5  136. Boy    Secondary    23    20     1
 6  135  Girl   Secondary    27    33     1
 7  134. Boy    Secondary    22    23     4
 8  139. Boy    Secondary    17    11     7
 9  132. Girl   Secondary    28    32     2
10  135. Boy    Secondary    20    30     6
# ℹ 589 more rows

Data Dictionary

Variable	Description
Age	The age of the child in months.
Gender	The gender of the child (Boy or Girl).
Grade	The educational level of the child (Primary or Secondary).
AMAS	The score on the Abbreviated Math Anxiety Scale, where a higher score indicates greater math anxiety.
RCMAS	The score on the Revised Children’s Manifest Anxiety Scale, measuring general anxiety.
Arith	The score on an arithmetic test.

Examine Data

This section inspects the dataset structure, summaries, counts, and missing values through various diagnostic functions.

summary(meth)

      age           gender             grade                amas      
 Min.   :  3.7   Length:599         Length:599         Min.   : 4.00  
 1st Qu.:106.2   Class :character   Class :character   1st Qu.:18.00  
 Median :120.8   Mode  :character   Mode  :character   Median :22.00  
 Mean   :124.6                                         Mean   :21.98  
 3rd Qu.:141.8                                         3rd Qu.:26.50  
 Max.   :187.5                                         Max.   :45.00  
     rcmas           arith      
 Min.   : 1.00   Min.   :0.000  
 1st Qu.:14.00   1st Qu.:4.000  
 Median :19.00   Median :6.000  
 Mean   :19.24   Mean   :5.302  
 3rd Qu.:25.00   3rd Qu.:7.000  
 Max.   :41.00   Max.   :8.000

skimr::skim(meth)

Data summary
Name	meth
Number of rows	599
Number of columns	6
_______________________
Column type frequency:
character	2
numeric	4
________________________
Group variables	None

Variable type: character

skim_variable	n_missing	complete_rate	min	max	empty	n_unique	whitespace
gender	0	1	3	4	0	2	0
grade	0	1	7	9	0	2	0

Variable type: numeric

skim_variable	complete_rate	mean	sd	p0	p25	p50	p75	p100	hist
age	1	124.65	22.31	3.7	106.15	120.8	141.85	187.5	▁▁▇▇▃
amas	1	21.98	6.60	4.0	18.00	22.0	26.50	45.0	▂▆▇▃▁
rcmas	1	19.24	7.57	1.0	14.00	19.0	25.00	41.0	▂▇▇▅▁
arith	1	5.30	2.11	0.0	4.00	6.0	7.00	8.0	▂▃▃▇▇

Show the structure of the dataset including data types.

str(meth)

spc_tbl_ [599 × 6] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ age   : num [1:599] 138 141 138 143 136 ...
 $ gender: chr [1:599] "Boy" "Boy" "Girl" "Girl" ...
 $ grade : chr [1:599] "Secondary" "Secondary" "Secondary" "Secondary" ...
 $ amas  : num [1:599] 9 18 23 19 23 27 22 17 28 20 ...
 $ rcmas : num [1:599] 20 8 26 18 20 33 23 11 32 30 ...
 $ arith : num [1:599] 6 6 5 7 1 1 4 7 2 6 ...
 - attr(*, "spec")=
  .. cols(
  ..   Age = col_double(),
  ..   Gender = col_character(),
  ..   Grade = col_character(),
  ..   AMAS = col_double(),
  ..   RCMAS = col_double(),
  ..   Arith = col_double()
  .. )
 - attr(*, "problems")=<externalptr>

Count occurrences by gender.

count(meth, gender)

# A tibble: 2 × 2
  gender     n
  <chr>  <int>
1 Boy      323
2 Girl     276

Return the dimensions of the dataset.

dim(meth)

[1] 599   6

List the column names of the dataset.

base::names(meth)

[1] "age"    "gender" "grade"  "amas"   "rcmas"  "arith"

Provide a glimpse of the dataset showing types and sample values.

dplyr::glimpse(meth)

Rows: 599
Columns: 6
$ age    <dbl> 137.8, 140.7, 137.9, 142.8, 135.6, 135.0, 133.6, 139.3, 131.7, …
$ gender <chr> "Boy", "Boy", "Girl", "Girl", "Boy", "Girl", "Boy", "Boy", "Gir…
$ grade  <chr> "Secondary", "Secondary", "Secondary", "Secondary", "Secondary"…
$ amas   <dbl> 9, 18, 23, 19, 23, 27, 22, 17, 28, 20, 16, 20, 21, 36, 16, 27, …
$ rcmas  <dbl> 20, 8, 26, 18, 20, 33, 23, 11, 32, 30, 10, 4, 23, 26, 24, 21, 3…
$ arith  <dbl> 6, 6, 5, 7, 1, 1, 4, 7, 2, 6, 2, 5, 2, 6, 2, 7, 2, 4, 7, 3, 8, …

Replace common NA representations with actual NA values in the dataset.

meth_modified <- meth %>%
  naniar::replace_with_na_all(data = ., condition = ~ .x %in% common_na_numbers) %>%
  naniar::replace_with_na_all(data = ., condition = ~ .x %in% common_na_strings)
glimpse(meth_modified)

Rows: 599
Columns: 6
$ age    <dbl> 137.8, 140.7, 137.9, 142.8, 135.6, 135.0, 133.6, 139.3, 131.7, …
$ gender <chr> "Boy", "Boy", "Girl", "Girl", "Boy", "Girl", "Boy", "Boy", "Gir…
$ grade  <chr> "Secondary", "Secondary", "Secondary", "Secondary", "Secondary"…
$ amas   <dbl> 9, 18, 23, 19, 23, 27, 22, 17, 28, 20, 16, 20, 21, 36, 16, 27, …
$ rcmas  <dbl> 20, 8, 26, 18, 20, 33, 23, 11, 32, 30, 10, 4, 23, 26, 24, 21, 3…
$ arith  <dbl> 6, 6, 5, 7, 1, 1, 4, 7, 2, 6, 2, 5, 2, 6, 2, 7, 2, 4, 7, 3, 8, …
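Blanket replacement with `common_na_numbers` can silently turn legitimate values into NA, because that vector (in the naniar versions I am aware of) contains ordinary-looking numbers such as 66, 77 and 88 alongside sentinels like -99. Here the glimpse before and after looks the same, but it is worth counting matches before replacing. A minimal sketch, assuming a hypothetical helper `count_sentinels()` that is not part of naniar:

```r
library(dplyr)

# Hypothetical helper (not part of naniar): for every numeric column, count
# how many cells match a sentinel value that would be converted to NA.
count_sentinels <- function(data, sentinels = naniar::common_na_numbers) {
  data %>%
    summarise(across(where(is.numeric), ~ sum(.x %in% sentinels)))
}

# Usage with the data frame from this document (uncomment to run):
# count_sentinels(meth)
```

A row of zeros means the replacement step leaves the numeric columns untouched; any non-zero count deserves a look before those cells are overwritten.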

Viewing Missing Data

visdat::vis_miss(meth_modified)

visdat::vis_dat(meth_modified)

Munging

This section performs further data wrangling: converting the categorical variables to factors and reordering the columns.

Convert gender and grade to factors and relocate them before age for better organization.

meth_modified <- meth_modified %>%
  mutate(
    gender = as.factor(gender),
    grade = as.factor(grade),
  ) %>%
  dplyr::relocate(where(is.factor), .before = age)
glimpse(meth_modified)

Rows: 599
Columns: 6
$ gender <fct> Boy, Boy, Girl, Girl, Boy, Girl, Boy, Boy, Girl, Boy, Boy, Boy,…
$ grade  <fct> Secondary, Secondary, Secondary, Secondary, Secondary, Secondar…
$ age    <dbl> 137.8, 140.7, 137.9, 142.8, 135.6, 135.0, 133.6, 139.3, 131.7, …
$ amas   <dbl> 9, 18, 23, 19, 23, 27, 22, 17, 28, 20, 16, 20, 21, 36, 16, 27, …
$ rcmas  <dbl> 20, 8, 26, 18, 20, 33, 23, 11, 32, 30, 10, 4, 23, 26, 24, 21, 3…
$ arith  <dbl> 6, 6, 5, 7, 1, 1, 4, 7, 2, 6, 2, 5, 2, 6, 2, 7, 2, 4, 7, 3, 8, …

meth_modified %>%
  head(10) %>%
  dplyr::rename(
    "Gender" = gender,
    "Grade" = grade,
    "Age (months)" = age,
    "AMAS (Math Anxiety)" = amas,
    "RCMAS (General Anxiety)" = rcmas,
    "Arithmetic Score" = arith
  ) %>%
  tt()

Gender	Grade	Age (months)	AMAS (Math Anxiety)	RCMAS (General Anxiety)	Arithmetic Score
Boy	Secondary	137.8	9	20	6
Boy	Secondary	140.7	18	8	6
Girl	Secondary	137.9	23	26	5
Girl	Secondary	142.8	19	18	7
Boy	Secondary	135.6	23	20	1
Girl	Secondary	135.0	27	33	1
Boy	Secondary	133.6	22	23	4
Boy	Secondary	139.3	17	11	7
Girl	Secondary	131.7	28	32	2
Boy	Secondary	134.8	20	30	6

Summaries: Examining the Data

meth_modified %>% dplyr::count(across(.cols = c(gender, grade)))

# A tibble: 4 × 3
  gender grade         n
  <fct>  <fct>     <int>
1 Boy    Primary     199
2 Boy    Secondary   124
3 Girl   Primary     202
4 Girl   Secondary    74

meth_modified %>%
  dplyr::summarise(
    mean_amas = mean(amas, na.rm = T),
    sd_amas = sd(amas, na.rm = T),
    min_amas = min(amas, na.rm = T),
    max_amas = max(amas, na.rm = T)
  )

# A tibble: 1 × 4
  mean_amas sd_amas min_amas max_amas
      <dbl>   <dbl>    <dbl>    <dbl>
1      22.0    6.60        4       45

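meth_modified %>%
  dplyr::summarise(across(
    .cols = c(amas, rcmas),
    # Pass na.rm = TRUE to every summary so missing values are handled consistently
    .fns = list(
      mean = ~ mean(.x, na.rm = TRUE),
      sd = ~ sd(.x, na.rm = TRUE),
      min = ~ min(.x, na.rm = TRUE),
      max = ~ max(.x, na.rm = TRUE)
    )
  ))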

# A tibble: 1 × 8
  amas_mean amas_sd amas_min amas_max rcmas_mean rcmas_sd rcmas_min rcmas_max
      <dbl>   <dbl>    <dbl>    <dbl>      <dbl>    <dbl>     <dbl>     <dbl>
1      22.0    6.60        4       45       19.2     7.57         1        41

meth_modified %>%
  dplyr::summarise(
    mean_arith = mean(arith, na.rm = T),
    sd_arith = sd(arith, na.rm = T),
    min_arith = min(arith, na.rm = T),
    max_arith = max(arith, na.rm = T)
  )

# A tibble: 1 × 4
  mean_arith sd_arith min_arith max_arith
       <dbl>    <dbl>     <dbl>     <dbl>
1       5.30     2.11         0         8

meth_modified %>%
  dplyr::summarise(
    mean_age = mean(age, na.rm = T),
    sd_age = sd(age, na.rm = T),
    min_age = min(age, na.rm = T),
    max_age = max(age, na.rm = T)
  )

# A tibble: 1 × 4
  mean_age sd_age min_age max_age
     <dbl>  <dbl>   <dbl>   <dbl>
1     125.   22.3     3.7    188.
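The minimum age of 3.7 months cannot belong to a school-age child and is most likely a data-entry error. A minimal sketch for listing such rows, assuming a hypothetical helper `flag_implausible_age()` and illustrative cut-offs of 60 and 240 months (neither comes from the dataset's documentation):

```r
library(dplyr)

# Hypothetical helper: return rows whose age (in months) falls outside a
# plausible range. The default cut-offs are assumptions for illustration only.
flag_implausible_age <- function(data, min_months = 60, max_months = 240) {
  data %>%
    dplyr::filter(age < min_months | age > max_months)
}

# Usage with the data frame from this document (uncomment to run):
# flag_implausible_age(meth_modified)
```

Whether to drop, correct, or keep a flagged row is a judgment call; the point is only to see such rows before they distort minimums and plots.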

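meth_modified %>%
  group_by(gender) %>%
  dplyr::summarise(across(
    .cols = c(age, amas, rcmas, arith),
    # Pass na.rm = TRUE to every summary so missing values are handled consistently
    .fns = list(
      mean = ~ mean(.x, na.rm = TRUE),
      sd = ~ sd(.x, na.rm = TRUE),
      min = ~ min(.x, na.rm = TRUE),
      max = ~ max(.x, na.rm = TRUE)
    )
  ))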

# A tibble: 2 × 17
  gender age_mean age_sd age_min age_max amas_mean amas_sd amas_min amas_max
  <fct>     <dbl>  <dbl>   <dbl>   <dbl>     <dbl>   <dbl>    <dbl>    <dbl>
1 Boy        128.   22.9    93.1    188.      21.2    6.51        4       45
2 Girl       121.   21.1     3.7    180.      22.9    6.59        9       40
# ℹ 8 more variables: rcmas_mean <dbl>, rcmas_sd <dbl>, rcmas_min <dbl>,
#   rcmas_max <dbl>, arith_mean <dbl>, arith_sd <dbl>, arith_min <dbl>,
#   arith_max <dbl>

crosstable(age + rcmas + amas + arith ~ gender,
  data = meth_modified
) %>%
  crosstable::as_flextable()

label	variable	gender
label	variable	Boy	Girl
age	Min / Max	93.1 / 187.5	3.7 / 180.3
	Med [IQR]	124.9 [107.4;147.2]	117.8 [105.8;133.4]
	Mean (std)	127.6 (22.9)	121.1 (21.1)
	N (NA)	323 (0)	276 (0)
rcmas	Min / Max	1.0 / 41.0	3.0 / 38.0
	Med [IQR]	18.0 [13.0;23.0]	20.0 [15.0;26.0]
	Mean (std)	18.1 (7.5)	20.6 (7.4)
	N (NA)	323 (0)	276 (0)
amas	Min / Max	4.0 / 45.0	9.0 / 40.0
	Med [IQR]	21.0 [17.0;26.0]	23.0 [19.0;28.0]
	Mean (std)	21.2 (6.5)	22.9 (6.6)
	N (NA)	323 (0)	276 (0)
arith	Min / Max	0 / 8.0	0 / 8.0
	Med [IQR]	6.0 [4.0;7.0]	6.0 [4.0;7.0]
	Mean (std)	5.3 (2.1)	5.3 (2.1)
	N (NA)	323 (0)	276 (0)

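meth_modified %>%
  group_by(grade) %>%
  dplyr::summarise(across(
    .cols = c(age, amas, rcmas, arith),
    # Pass na.rm = TRUE to every summary so missing values are handled consistently
    .fns = list(
      mean = ~ mean(.x, na.rm = TRUE),
      sd = ~ sd(.x, na.rm = TRUE),
      min = ~ min(.x, na.rm = TRUE),
      max = ~ max(.x, na.rm = TRUE)
    )
  ))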

# A tibble: 2 × 17
  grade     age_mean age_sd age_min age_max amas_mean amas_sd amas_min amas_max
  <fct>        <dbl>  <dbl>   <dbl>   <dbl>     <dbl>   <dbl>    <dbl>    <dbl>
1 Primary       111.   12.1     3.7    135.      21.8    6.53        4       38
2 Secondary     151.   12.1   132.     188.      22.3    6.75        9       45
# ℹ 8 more variables: rcmas_mean <dbl>, rcmas_sd <dbl>, rcmas_min <dbl>,
#   rcmas_max <dbl>, arith_mean <dbl>, arith_sd <dbl>, arith_min <dbl>,
#   arith_max <dbl>

gf_histogram(~age, data = meth_modified) %>%
  gf_labs(title = "Histogram of Age")

`stat_bin()` using `bins = 30`. Pick better value `binwidth`.

gf_histogram(~amas, data = meth_modified) %>%
  gf_labs(title = "Histogram of AMAS")

`stat_bin()` using `bins = 30`. Pick better value `binwidth`.

gf_histogram(~rcmas, data = meth_modified) %>%
  gf_labs(title = "Histogram of RCMAS")

`stat_bin()` using `bins = 30`. Pick better value `binwidth`.

gf_bar(~grade, data = meth_modified) %>%
  gf_labs(title = "Bar Plot of Grade")

gf_bar(~gender, data = meth_modified) %>%
  gf_labs(title = "Bar Plot of Gender")

Visualizing the data

Grade Count for both Genders

meth_modified %>%
  gf_bar(~grade,
         fill=~gender,
         position = "dodge") +
  labs(title= "Grade count for both genders", subtitle = "Dodged Bar Chart", x ="Grade", y ="Count") +
  scale_fill_brewer(palette = "Set2")

Why does the count drop in the Secondary grade? One plausible explanation is that fewer Secondary students than Primary students were enrolled in the study.

meth_modified %>%
  gf_bar(~grade,
         fill=~gender,
         position = "stack") +
  labs(title= "Grade count for both genders", subtitle = "Stacked Bar Chart", x ="Grade", y ="Count") +
  scale_fill_brewer(palette = "Set2")

Comparing Gender Proportions within Each Grade

meth_modified %>%
  gf_bar(~grade,
         fill=~gender,
         position = "fill") +
  labs(title= "Gender proportions within each grade", subtitle = "Filled Bar Chart", x ="Grade", y ="Proportion") +
  scale_fill_brewer(palette = "Set2")
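The filled bar chart shows shares rather than counts, and the same shares can be computed directly. From the counts reported earlier, boys make up 199 of 401 Primary students (about 50%) and 124 of 198 Secondary students (about 63%). A minimal sketch, assuming a hypothetical helper `gender_share_by_grade()`:

```r
library(dplyr)

# Hypothetical helper: share of each gender within each grade.
# Expects columns named `grade` and `gender`, as in meth_modified.
gender_share_by_grade <- function(data) {
  data %>%
    dplyr::count(grade, gender) %>%
    dplyr::group_by(grade) %>%
    dplyr::mutate(prop = n / sum(n)) %>%
    dplyr::ungroup()
}

# Usage with the data frame from this document (uncomment to run):
# gender_share_by_grade(meth_modified)
```

Within each grade the `prop` column sums to 1, which is exactly what `position = "fill"` draws.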

Age Count for both Genders

meth_modified %>%
  gf_bar(~age,
         fill=~gender,
         position = "stack") +
  labs(title= "Age count for both genders", subtitle = "Stacked Bar Chart", x ="Age", y ="Count") +
  scale_fill_brewer(palette = "Set2")

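Inferences:

After about 150 months (12.5 years) the counts drop sharply. This matches the smaller Secondary sample (198 of the 599 students) and most likely reflects who was sampled; it does not by itself show that fewer older students enroll in math.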

AMAS Scores based on Gender & Grade

meth_modified %>% 
  gf_histogram(~amas| grade~gender,
               bins = 5,
               fill = "steelblue",
               color="white") %>% 
  gf_labs(title="Histogram of AMAS Scores",
          subtitle ="Faceted by Grade and Gender",
          x="AMAS Score",
          y="Count")

meth_modified %>% 
  gf_histogram(~amas | grade,
               fill=~gender,
               colour="black") %>%
  gf_labs(
    title = "AMAS Filled by Gender and Faceted by Grade")

`stat_bin()` using `bins = 30`. Pick better value `binwidth`.

meth_modified %>% 
  gf_boxplot(amas~gender | grade,
             fill=~gender,
             orientation = "x") %>% 
  gf_labs(title = "Boxplots of AMAS Scores by Gender",
       subtitle = "Faceted by Grade",
       x = "Gender",
       y = "AMAS Score")

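Inferences:

Girls appear to face more math anxiety: the box plots show higher medians for girls in both the Primary and Secondary grades. The overall summary agrees (median 23 vs. 21, mean 22.9 vs. 21.2), while the spread is similar for both genders (IQR 19 to 28 for girls, 17 to 26 for boys).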

RCMAS Scores based on Gender & Grade

meth_modified %>% 
  gf_histogram(~rcmas| grade~gender,
               bins = 5,
               fill = "steelblue",
               color="white") %>% 
  gf_labs(title="Histogram of RCMAS Scores",
          subtitle ="Faceted by Grade and Gender",
          x="RCMAS Score",
          y="Count")

meth_modified %>% 
  gf_histogram(~rcmas | grade,
               fill=~gender,
               colour="black") %>%
  gf_labs(
    title = "RCMAS Filled by Gender and Faceted by Grade")

`stat_bin()` using `bins = 30`. Pick better value `binwidth`.

meth_modified %>% 
  gf_boxplot(rcmas~gender | grade,
             fill=~gender,
             orientation = "x") %>% 
  gf_labs(title = "Boxplots of RCMAS Scores by Gender",
       subtitle = "Faceted by Grade",
       x = "Gender",
       y = "RCMAS Score")

Overall Inferences

Anxiety appears to be higher for Girls than for Boys
Anxiety appears to be highest in Primary Girls
Anxiety appears to be lowest in Secondary Boys

Arith Scores based on Gender & Grade

meth_modified %>%
  gf_bar(~arith | grade,
               fill = ~gender,
               position="dodge",
               color="black") %>% 
  gf_labs(title = "Dodged Bar Graph of Arith Scores",
       subtitle = "Faceted by Grade",
       x = "Arith Score",
       y = "Count") %>% 
  # Bars are filled by gender, so refine the fill scale; the palette name is "Set2" (no space)
  gf_refine(scale_fill_brewer(palette = "Set2"))

Inferences:

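At first glance girls appear to have lower arith scores than boys, but the overall summary gives both genders a mean of 5.3 and a median of 6.
In the Primary grade the difference between girls' and boys' counts is smaller than in the Secondary grade, where the gap is bigger. The Secondary sample has 124 boys and 74 girls, so part of that gap reflects group size and not performance.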

meth_modified %>%
  gf_bar(~arith | grade,
               fill = ~gender,
               position="fill",
               color="black") %>% 
  gf_labs(title = "Filled Bar Graph of Arith Scores",
       subtitle = "Faceted by Grade",
       x = "Arith Score",
       y = "Proportion") %>% 
  # Bars are filled by gender, so refine the fill scale; the palette name is "Set2" (no space)
  gf_refine(scale_fill_brewer(palette = "Set2"))

meth_modified %>% 
  gf_bar(~arith | grade,
               fill=~grade,
               colour="black") %>%
  gf_labs(
    title = "Arith Filled and Faceted by Grade",
    x="Arith Scores",
    y="Count")

Inferences:

Primary grade students have higher arith scores than Secondary grade students.

meth_modified %>% 
  gf_boxplot(arith ~gender | grade,
             fill=~gender,
             orientation = "x") %>% 
  gf_labs(title = "Boxplots of Arith Scores by Gender",
       subtitle = "Faceted by Grade",
       x = "Gender",
       y = "Arith Score")

Inferences:

In the Primary grade most scores fall between 5 and 7 with a median of 6, while in the Secondary grade most fall between 3 and 6 with a median of 4.
Secondary students lag in arithmetic: the box plots show lower medians and wider ranges for the Secondary grade.

Arith Scores based on Age, Grade & Gender

meth_modified %>%
  gf_point(arith ~ age | gender, color = ~ grade) %>%
  gf_labs(
    title = "Scatter Plot of Age vs Arithmetic Score",
    subtitle = "Faceted by Gender, Colored by Grade",
    x = "Age",
    y = "Arithmetic Score",
    color = "Grade"
  ) %>%
  gf_refine(scale_color_brewer(palette = "Spectral")) %>% 
  gf_theme(theme_dark())

Inferences:

The scatter plots show similar Arith trends across both genders, but Primary students tend to have higher Arith scores.

meth_modified %>% 
  gf_boxplot(arith ~ grade | gender, orientation = "x", fill=~grade, color = "black") %>%
 gf_labs(
    x = "Grade",
    y = "Arithmetic Score",
    title = "Arithmetic Scores by Grade",
    subtitle = "Faceted by Gender"
  ) %>%
  gf_refine(scale_fill_brewer(palette = "Set3")) %>%
  gf_theme(theme_dark())

Inferences:

Primary grade students (both genders) have higher arithmetic scores compared to secondary grade.
Secondary grade students have a wider range of arithmetic scores compared to Primary grade students.
Within each grade, there is no visible difference in scores between genders.

Comparing AMAS & Arith Score

Hypothesis: Based on the previous visualizations, we suspect that math anxiety (AMAS) is associated with lower arithmetic scores.

meth_modified %>%
  gf_boxplot(arith ~ amas | gender ~ grade, fill = ~ gender, orientation = "x") %>%
  gf_labs(
    title = "Relationship between Math Anxiety and Arithmetic Performance",
    subtitle = "Across Gender and Grade",
    x = "Math Anxiety (AMAS)",
    y = "Arithmetic Performance (Arith)"
  ) %>%
  gf_refine(scale_fill_brewer(palette = "Set2")) %>%
  gf_theme(theme_light())

Inferences:

As math anxiety (AMAS) increases, arithmetic performance tends to go down for both genders.
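Any gap between boys and girls within a grade is small: the overall summary gives both genders a mean arithmetic score of 5.3.
Primary grade students have higher arithmetic scores than Secondary grade students.
The Secondary grade has slightly higher math anxiety (mean AMAS 22.3 vs. 21.8), which may contribute to the lower arithmetic scores, although these plots cannot establish cause.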
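The box plots give only a rough picture of how AMAS and Arith move together. A rank correlation per gender and grade group puts a number on the relationship. A minimal sketch, assuming a hypothetical helper `cor_by_group()` and choosing Spearman's method because both scores are bounded, discrete scales (that choice is mine, not the original analysis):

```r
library(dplyr)

# Hypothetical helper: Spearman correlation between math anxiety (amas) and
# arithmetic score (arith) within each gender-by-grade group.
cor_by_group <- function(data) {
  data %>%
    dplyr::group_by(gender, grade) %>%
    dplyr::summarise(
      n = dplyr::n(),
      r_amas_arith = stats::cor(amas, arith,
                                method = "spearman",
                                use = "complete.obs"),
      .groups = "drop"
    )
}

# Usage with the data frame from this document (uncomment to run):
# cor_by_group(meth_modified)
```

Negative values of `r_amas_arith` would support the hypothesis that higher anxiety goes with lower arithmetic scores; values near zero would not.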

Density Plot based on Age, Gender & Grade

meth_modified %>%
  gf_density(~ age | gender, fill = ~ grade, color="black") %>%
  gf_labs(
    title = "Density Plots of Age",
    subtitle = "Faceted by Gender, Filled by Grade",
    x = "Age",
    y = "Density",
    fill = "Grade"
  ) %>% 
  # Grade is mapped to fill, so label and refine the fill scale; the palette name is "Set3" (no space)
  gf_refine(scale_fill_brewer(palette = "Set3")) %>% 
  gf_theme(theme_light())

Comparing AMAS & RCMAS

meth_modified %>%
  gf_point(rcmas ~ amas, color = ~ gender, shape = ~ gender) %>%
  gf_labs(
    title = "Scatter Plot of Amas vs Rcmas",
    subtitle = "Colored by Gender",
    x = "Amas Score",
    y = "Rcmas Score"
  ) %>%
  gf_refine(scale_color_brewer(palette = "Set1")) %>% 
  gf_theme(theme_light())

Inferences:

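In general, both genders score higher on AMAS than on RCMAS (overall means 22.0 vs. 19.2). These are different instruments, so the raw scores are not directly comparable.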

meth_modified %>%
  gf_boxplot(rcmas ~ amas | gender~grade, fill =~gender, orientation = "y") %>%
  gf_labs(
    title = "Box Plot of Amas vs Rcmas",
    subtitle = "Filled by Gender, Faceted by Gender and Grade",
    x = "Amas Score",
    y = "Rcmas Score"
  ) %>%
  # Gender is mapped to fill, so the fill scale is the one to refine
  gf_refine(scale_fill_brewer(palette = "Set2")) %>% 
  gf_theme(theme_light())

Inferences:

As seen in the box plots, boys tend to have a wider range of RCMAS than girls.
Comparing Primary and Secondary Boys:
- Primary Boys have lower general anxiety (rcmas) and higher math anxiety (amas).
- Secondary Boys have higher general anxiety (rcmas) and lower math anxiety (amas).
Comparing Primary and Secondary Girls:
- Primary Girls have higher general anxiety (rcmas) and lower math anxiety (amas).
- Secondary Girls have lower general anxiety (rcmas) and higher math anxiety (amas).
Comparing Primary Boys and Girls:
- Boys have a lower median but a wider range for math anxiety (amas).
- Girls have a higher median but a narrower range for math anxiety (amas).
- Boys and Girls have the same range for general anxiety (rcmas).
Comparing Secondary Boys and Girls:
- Boys have a lower median and a narrower range for math anxiety (amas).
- Girls have a higher median and a wider range for math anxiety (amas).
- Boys have a wider range of general anxiety (rcmas) compared to girls.

Conclusion

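Most visualizations suggest that girls report higher anxiety (both math and general) than boys. Arithmetic scores, however, are nearly identical for the two genders in the summary table (mean 5.3 for both), so the data do not show a clear advantage for boys. Why girls report more anxiety is not something this dataset can answer; a difference in support systems is one untested possibility.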
Many visualizations confirm: primary grade students do better in Math,