Math Anxiety Dataset

A1
Author

Diya Bijoy, Swetha KV, Abhinav R, Aanya Pandith

Published

September 28, 2025

Math Anxiety Dataset

Math, math, math… for some an absolute nightmare, for others the holy grail of existence. Let’s break it down.

Data Cleaning

This section handles loading the dataset, cleaning missing values, and initial transformations like converting variables to factors.

Load necessary libraries for data manipulation, visualization, and interactive elements.

Setting up R Packages

library(ggformula)
Loading required package: ggplot2
Loading required package: scales
Loading required package: ggridges

New to ggformula?  Try the tutorials: 
    learnr::run_tutorial("introduction", package = "ggformula")
    learnr::run_tutorial("refining", package = "ggformula")
library(janitor)

Attaching package: 'janitor'
The following objects are masked from 'package:stats':

    chisq.test, fisher.test
library(mosaic)
Registered S3 method overwritten by 'mosaic':
  method                           from   
  fortify.SpatialPolygonsDataFrame ggplot2

The 'mosaic' package masks several functions from core packages in order to add 
additional features.  The original behavior of these functions should not be affected by this.

Attaching package: 'mosaic'
The following objects are masked from 'package:dplyr':

    count, do, tally
The following object is masked from 'package:Matrix':

    mean
The following object is masked from 'package:scales':

    rescale
The following object is masked from 'package:ggplot2':

    stat
The following objects are masked from 'package:stats':

    binom.test, cor, cor.test, cov, fivenum, IQR, median, prop.test,
    quantile, sd, t.test, var
The following objects are masked from 'package:base':

    max, mean, min, prod, range, sample, sum
library(naniar)
library(skimr)

Attaching package: 'skimr'
The following object is masked from 'package:naniar':

    n_complete
The following object is masked from 'package:mosaic':

    n_missing
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ lubridate 1.9.4     ✔ tibble    3.3.0
✔ purrr     1.0.4     ✔ tidyr     1.3.1
✔ readr     2.1.5     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ readr::col_factor() masks scales::col_factor()
✖ mosaic::count()     masks dplyr::count()
✖ purrr::cross()      masks mosaic::cross()
✖ purrr::discard()    masks scales::discard()
✖ mosaic::do()        masks dplyr::do()
✖ tidyr::expand()     masks Matrix::expand()
✖ dplyr::filter()     masks stats::filter()
✖ dplyr::lag()        masks stats::lag()
✖ tidyr::pack()       masks Matrix::pack()
✖ mosaic::stat()      masks ggplot2::stat()
✖ mosaic::tally()     masks dplyr::tally()
✖ tidyr::unpack()     masks Matrix::unpack()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(tinytable)

Attaching package: 'tinytable'

The following object is masked from 'package:ggplot2':

    theme_void
library(visdat)
library(crosstable)

Attaching package: 'crosstable'

The following object is masked from 'package:purrr':

    compact
library(RColorBrewer)

Read Data

meth <- readr::read_delim("MathAnxiety.csv",
                          delim = ";",
                          locale = locale(decimal_mark = ",")) %>% 
  janitor::clean_names("snake")
Rows: 599 Columns: 6
── Column specification ────────────────────────────────────────────────────────
Delimiter: ";"
chr (2): Gender, Grade
dbl (4): Age, AMAS, RCMAS, Arith

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
meth
# A tibble: 599 × 6
     age gender grade      amas rcmas arith
   <dbl> <chr>  <chr>     <dbl> <dbl> <dbl>
 1  138. Boy    Secondary     9    20     6
 2  141. Boy    Secondary    18     8     6
 3  138. Girl   Secondary    23    26     5
 4  143. Girl   Secondary    19    18     7
 5  136. Boy    Secondary    23    20     1
 6  135  Girl   Secondary    27    33     1
 7  134. Boy    Secondary    22    23     4
 8  139. Boy    Secondary    17    11     7
 9  132. Girl   Secondary    28    32     2
10  135. Boy    Secondary    20    30     6
# ℹ 589 more rows
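The import step above relies on two details: the file uses `;` as the field separator and `,` as the decimal mark. The sketch below reproduces that on a two-row text snippet. The snippet is a reconstruction (an assumption about how the raw file looks, built from the first two rows printed above), not a copy of `MathAnxiety.csv`.

```r
library(readr)
library(janitor)

# Hypothetical raw text: semicolon-delimited, comma as decimal mark.
raw <- I("Age;Gender;Grade;AMAS;RCMAS;Arith\n137,8;Boy;Secondary;9;20;6\n140,7;Boy;Secondary;18;8;6\n")

demo <- read_delim(raw,
                   delim = ";",
                   locale = locale(decimal_mark = ","),
                   show_col_types = FALSE)

# clean_names() converts the headers to lower snake_case.
demo <- clean_names(demo, "snake")

names(demo)
demo$age
```

Without the `locale()` argument, `137,8` would not be read as the number 137.8, so this setting is worth checking whenever a CSV comes from a region that writes decimals with commas.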

Data Dictionary

Variable   Description
Age        The age of the child in months.
Gender     The gender of the child (Boy or Girl).
Grade      The educational level of the child (Primary or Secondary).
AMAS       The score on the Abbreviated Math Anxiety Scale, where a higher score indicates greater math anxiety.
RCMAS      The score on the Revised Children’s Manifest Anxiety Scale, measuring general anxiety.
Arith      The score on an arithmetic test.

Examine Data

This section inspects the dataset structure, summaries, counts, and missing values through various diagnostic functions.

summary(meth)
      age           gender             grade                amas      
 Min.   :  3.7   Length:599         Length:599         Min.   : 4.00  
 1st Qu.:106.2   Class :character   Class :character   1st Qu.:18.00  
 Median :120.8   Mode  :character   Mode  :character   Median :22.00  
 Mean   :124.6                                         Mean   :21.98  
 3rd Qu.:141.8                                         3rd Qu.:26.50  
 Max.   :187.5                                         Max.   :45.00  
     rcmas           arith      
 Min.   : 1.00   Min.   :0.000  
 1st Qu.:14.00   1st Qu.:4.000  
 Median :19.00   Median :6.000  
 Mean   :19.24   Mean   :5.302  
 3rd Qu.:25.00   3rd Qu.:7.000  
 Max.   :41.00   Max.   :8.000  
skimr::skim(meth)
Data summary
Name meth
Number of rows 599
Number of columns 6
_______________________
Column type frequency:
character 2
numeric 4
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
gender 0 1 3 4 0 2 0
grade 0 1 7 9 0 2 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
age 0 1 124.65 22.31 3.7 106.15 120.8 141.85 187.5 ▁▁▇▇▃
amas 0 1 21.98 6.60 4.0 18.00 22.0 26.50 45.0 ▂▆▇▃▁
rcmas 0 1 19.24 7.57 1.0 14.00 19.0 25.00 41.0 ▂▇▇▅▁
arith 0 1 5.30 2.11 0.0 4.00 6.0 7.00 8.0 ▂▃▃▇▇

Show the structure of the dataset including data types.

str(meth)
spc_tbl_ [599 × 6] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ age   : num [1:599] 138 141 138 143 136 ...
 $ gender: chr [1:599] "Boy" "Boy" "Girl" "Girl" ...
 $ grade : chr [1:599] "Secondary" "Secondary" "Secondary" "Secondary" ...
 $ amas  : num [1:599] 9 18 23 19 23 27 22 17 28 20 ...
 $ rcmas : num [1:599] 20 8 26 18 20 33 23 11 32 30 ...
 $ arith : num [1:599] 6 6 5 7 1 1 4 7 2 6 ...
 - attr(*, "spec")=
  .. cols(
  ..   Age = col_double(),
  ..   Gender = col_character(),
  ..   Grade = col_character(),
  ..   AMAS = col_double(),
  ..   RCMAS = col_double(),
  ..   Arith = col_double()
  .. )
 - attr(*, "problems")=<externalptr> 

Count occurrences by gender.

count(meth, gender)
# A tibble: 2 × 2
  gender     n
  <chr>  <int>
1 Boy      323
2 Girl     276

Return the dimensions of the dataset.

dim(meth)
[1] 599   6

List the column names of the dataset.

base::names(meth)
[1] "age"    "gender" "grade"  "amas"   "rcmas"  "arith" 

Provide a glimpse of the dataset showing types and sample values.

dplyr::glimpse(meth)
Rows: 599
Columns: 6
$ age    <dbl> 137.8, 140.7, 137.9, 142.8, 135.6, 135.0, 133.6, 139.3, 131.7, …
$ gender <chr> "Boy", "Boy", "Girl", "Girl", "Boy", "Girl", "Boy", "Boy", "Gir…
$ grade  <chr> "Secondary", "Secondary", "Secondary", "Secondary", "Secondary"…
$ amas   <dbl> 9, 18, 23, 19, 23, 27, 22, 17, 28, 20, 16, 20, 21, 36, 16, 27, …
$ rcmas  <dbl> 20, 8, 26, 18, 20, 33, 23, 11, 32, 30, 10, 4, 23, 26, 24, 21, 3…
$ arith  <dbl> 6, 6, 5, 7, 1, 1, 4, 7, 2, 6, 2, 5, 2, 6, 2, 7, 2, 4, 7, 3, 8, …

Replace common NA representations with actual NA values in the dataset.

meth_modified <- meth %>%
  naniar::replace_with_na_all(data = ., condition = ~ .x %in% common_na_numbers) %>%
  naniar::replace_with_na_all(data = ., condition = ~ .x %in% common_na_strings)
glimpse(meth_modified)
Rows: 599
Columns: 6
$ age    <dbl> 137.8, 140.7, 137.9, 142.8, 135.6, 135.0, 133.6, 139.3, 131.7, …
$ gender <chr> "Boy", "Boy", "Girl", "Girl", "Boy", "Girl", "Boy", "Boy", "Gir…
$ grade  <chr> "Secondary", "Secondary", "Secondary", "Secondary", "Secondary"…
$ amas   <dbl> 9, 18, 23, 19, 23, 27, 22, 17, 28, 20, 16, 20, 21, 36, 16, 27, …
$ rcmas  <dbl> 20, 8, 26, 18, 20, 33, 23, 11, 32, 30, 10, 4, 23, 26, 24, 21, 3…
$ arith  <dbl> 6, 6, 5, 7, 1, 1, 4, 7, 2, 6, 2, 5, 2, 6, 2, 7, 2, 4, 7, 3, 8, …
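The output above is unchanged because this dataset contains no placeholder values. To show what the replacement step does when placeholders are present, here is a minimal sketch on a toy tibble. The toy values and the two sentinel vectors (`na_numbers`, `na_strings`) are made up for illustration; the report itself uses the broader built-in vectors `common_na_numbers` and `common_na_strings` from naniar.

```r
library(tibble)
library(naniar)

# Toy data with one numeric and one string placeholder (hypothetical).
toy <- tibble(
  amas  = c(9, -99, 23),
  grade = c("Secondary", "Primary", "N/A")
)

# Hypothetical sentinel values, kept short on purpose.
na_numbers <- c(-99, -999)
na_strings <- c("N/A", "missing")

# Same two-pass pattern as in the report: numbers first, then strings.
cleaned <- replace_with_na_all(toy, condition = ~ .x %in% na_numbers)
cleaned <- replace_with_na_all(cleaned, condition = ~ .x %in% na_strings)

# Count of missing values per column after the replacement.
colSums(is.na(cleaned))
```

Because the condition is applied to every column, it is worth confirming that no legitimate score happens to equal one of the sentinel values before running this on real data.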

Viewing Missing Data

visdat::vis_miss(meth_modified)

visdat::vis_dat(meth_modified)

Munging

This section performs further data wrangling: converting the categorical variables to factors and reordering the columns.

Convert gender and grade to factors and relocate them before age for better organization.

meth_modified <- meth_modified %>%
  mutate(
    gender = as.factor(gender),
    grade = as.factor(grade),
  ) %>%
  dplyr::relocate(where(is.factor), .before = age)
glimpse(meth_modified)
Rows: 599
Columns: 6
$ gender <fct> Boy, Boy, Girl, Girl, Boy, Girl, Boy, Boy, Girl, Boy, Boy, Boy,…
$ grade  <fct> Secondary, Secondary, Secondary, Secondary, Secondary, Secondar…
$ age    <dbl> 137.8, 140.7, 137.9, 142.8, 135.6, 135.0, 133.6, 139.3, 131.7, …
$ amas   <dbl> 9, 18, 23, 19, 23, 27, 22, 17, 28, 20, 16, 20, 21, 36, 16, 27, …
$ rcmas  <dbl> 20, 8, 26, 18, 20, 33, 23, 11, 32, 30, 10, 4, 23, 26, 24, 21, 3…
$ arith  <dbl> 6, 6, 5, 7, 1, 1, 4, 7, 2, 6, 2, 5, 2, 6, 2, 7, 2, 4, 7, 3, 8, …
meth_modified %>%
  head(10) %>%
  dplyr::rename(
    "Gender" = gender,
    "Grade" = grade,
    "Age (months)" = age,
    "AMAS (Math Anxiety)" = amas,
    "RCMAS (General Anxiety)" = rcmas,
    "Arithmetic Score" = arith
  ) %>%
  tt()
Gender Grade Age (months) AMAS (Math Anxiety) RCMAS (General Anxiety) Arithmetic Score
Boy Secondary 137.8 9 20 6
Boy Secondary 140.7 18 8 6
Girl Secondary 137.9 23 26 5
Girl Secondary 142.8 19 18 7
Boy Secondary 135.6 23 20 1
Girl Secondary 135.0 27 33 1
Boy Secondary 133.6 22 23 4
Boy Secondary 139.3 17 11 7
Girl Secondary 131.7 28 32 2
Boy Secondary 134.8 20 30 6
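The factor conversion and column reordering above can be checked on a small example. This is a sketch on three rows taken from the table above; the object names `toy` and `toy_fct` are illustrative only.

```r
library(dplyr)
library(tibble)

toy <- tibble(
  age    = c(137.8, 140.7, 137.9),
  gender = c("Boy", "Boy", "Girl"),
  grade  = c("Secondary", "Secondary", "Secondary")
)

toy_fct <- toy %>%
  mutate(
    gender = as.factor(gender),
    grade  = as.factor(grade)
  ) %>%
  # Move every factor column in front of age.
  relocate(where(is.factor), .before = age)

levels(toy_fct$gender)  # factor levels are sorted alphabetically by default
names(toy_fct)
```

Storing `gender` and `grade` as factors makes the grouping variables explicit and gives plots and tables a fixed level order.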

Summaries: Examining the Data

meth_modified %>% dplyr::count(across(.cols = c(gender, grade)))
# A tibble: 4 × 3
  gender grade         n
  <fct>  <fct>     <int>
1 Boy    Primary     199
2 Boy    Secondary   124
3 Girl   Primary     202
4 Girl   Secondary    74
meth_modified %>%
  dplyr::summarise(
    mean_amas = mean(amas, na.rm = T),
    sd_amas = sd(amas, na.rm = T),
    min_amas = min(amas, na.rm = T),
    max_amas = max(amas, na.rm = T)
  )
# A tibble: 1 × 4
  mean_amas sd_amas min_amas max_amas
      <dbl>   <dbl>    <dbl>    <dbl>
1      22.0    6.60        4       45
meth_modified %>%
  dplyr::summarise(across(
    .cols = c(amas, rcmas), 

    .fns = list(
      mean = ~ mean(., na.rm = T),
      sd = sd,
      min = min, max = max
    )
  ))
# A tibble: 1 × 8
  amas_mean amas_sd amas_min amas_max rcmas_mean rcmas_sd rcmas_min rcmas_max
      <dbl>   <dbl>    <dbl>    <dbl>      <dbl>    <dbl>     <dbl>     <dbl>
1      22.0    6.60        4       45       19.2     7.57         1        41
meth_modified %>%
  dplyr::summarise(
    mean_arith = mean(arith, na.rm = T),
    sd_arith = sd(arith, na.rm = T),
    min_arith = min(arith, na.rm = T),
    max_arith = max(arith, na.rm = T)
  )
# A tibble: 1 × 4
  mean_arith sd_arith min_arith max_arith
       <dbl>    <dbl>     <dbl>     <dbl>
1       5.30     2.11         0         8
meth_modified %>%
  dplyr::summarise(
    mean_age = mean(age, na.rm = T),
    sd_age = sd(age, na.rm = T),
    min_age = min(age, na.rm = T),
    max_age = max(age, na.rm = T)
  )
# A tibble: 1 × 4
  mean_age sd_age min_age max_age
     <dbl>  <dbl>   <dbl>   <dbl>
1     125.   22.3     3.7    188.
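The minimum age of 3.7 months stands out, since the other ages are those of school children. It may be a data-entry error, although the source does not say. A minimal sketch for flagging such rows before deciding what to do with them follows; the cutoff `min_plausible_age` and the `row_id` column are assumptions made for illustration, and the four ages are values that appear in the outputs above.

```r
library(dplyr)
library(tibble)

ages <- tibble(
  row_id = 1:4,                       # illustrative identifier
  age    = c(137.8, 140.7, 3.7, 187.5)
)

# Hypothetical lower bound (in months) for a school-age child.
min_plausible_age <- 60

flagged <- ages %>%
  mutate(age_suspect = age < min_plausible_age)

# Rows that deserve a manual check.
filter(flagged, age_suspect)
```

Flagging rather than deleting keeps the decision visible: the row can be corrected, excluded, or kept once the cause is known.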
meth_modified %>%
  group_by(gender) %>%
  dplyr::summarise(across(
    .cols = c(age, amas, rcmas, arith),

    .fns = list(
      mean = ~ mean(., na.rm = T),
      sd = sd,
      min = min, max = max
    )
  ))
# A tibble: 2 × 17
  gender age_mean age_sd age_min age_max amas_mean amas_sd amas_min amas_max
  <fct>     <dbl>  <dbl>   <dbl>   <dbl>     <dbl>   <dbl>    <dbl>    <dbl>
1 Boy        128.   22.9    93.1    188.      21.2    6.51        4       45
2 Girl       121.   21.1     3.7    180.      22.9    6.59        9       40
# ℹ 8 more variables: rcmas_mean <dbl>, rcmas_sd <dbl>, rcmas_min <dbl>,
#   rcmas_max <dbl>, arith_mean <dbl>, arith_sd <dbl>, arith_min <dbl>,
#   arith_max <dbl>
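The `across()` call above names each result as `<column>_<function>`. The sketch below shows the same pattern on the ten rows printed earlier in the report, so the group means are for those ten rows only and do not match the full-data values. Unlike the call above, it passes `na.rm = TRUE` to every function, which is a design suggestion rather than something the report does.

```r
library(dplyr)
library(tibble)

# First ten rows of the dataset, as printed earlier in the report.
sample_rows <- tibble(
  gender = c("Boy", "Boy", "Girl", "Girl", "Boy",
             "Girl", "Boy", "Boy", "Girl", "Boy"),
  amas   = c(9, 18, 23, 19, 23, 27, 22, 17, 28, 20),
  arith  = c(6, 6, 5, 7, 1, 1, 4, 7, 2, 6)
)

by_gender <- sample_rows %>%
  group_by(gender) %>%
  summarise(across(
    .cols = c(amas, arith),
    .fns = list(
      mean = ~ mean(.x, na.rm = TRUE),
      sd   = ~ sd(.x, na.rm = TRUE)
    )
  ))

names(by_gender)
by_gender
```

Passing `na.rm = TRUE` to each function keeps the summaries consistent if missing values ever appear; with bare `sd`, `min`, and `max`, a single `NA` would turn those columns into `NA` while the mean stayed numeric.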
crosstable(age + rcmas + amas + arith ~ gender,
  data = meth_modified
) %>%
  crosstable::as_flextable()

label   variable     gender
                     Boy                   Girl
age     Min / Max    93.1 / 187.5          3.7 / 180.3
        Med [IQR]    124.9 [107.4;147.2]   117.8 [105.8;133.4]
        Mean (std)   127.6 (22.9)          121.1 (21.1)
        N (NA)       323 (0)               276 (0)
rcmas   Min / Max    1.0 / 41.0            3.0 / 38.0
        Med [IQR]    18.0 [13.0;23.0]      20.0 [15.0;26.0]
        Mean (std)   18.1 (7.5)            20.6 (7.4)
        N (NA)       323 (0)               276 (0)
amas    Min / Max    4.0 / 45.0            9.0 / 40.0
        Med [IQR]    21.0 [17.0;26.0]      23.0 [19.0;28.0]
        Mean (std)   21.2 (6.5)            22.9 (6.6)
        N (NA)       323 (0)               276 (0)
arith   Min / Max    0 / 8.0               0 / 8.0
        Med [IQR]    6.0 [4.0;7.0]         6.0 [4.0;7.0]
        Mean (std)   5.3 (2.1)             5.3 (2.1)
        N (NA)       323 (0)               276 (0)

meth_modified %>%
  group_by(grade) %>%
  dplyr::summarise(across(
    .cols = c(age, amas, rcmas, arith),

    .fns = list(
      mean = ~ mean(., na.rm = T),
      sd = sd,
      min = min, max = max
    )
  ))
# A tibble: 2 × 17
  grade     age_mean age_sd age_min age_max amas_mean amas_sd amas_min amas_max
  <fct>        <dbl>  <dbl>   <dbl>   <dbl>     <dbl>   <dbl>    <dbl>    <dbl>
1 Primary       111.   12.1     3.7    135.      21.8    6.53        4       38
2 Secondary     151.   12.1   132.     188.      22.3    6.75        9       45
# ℹ 8 more variables: rcmas_mean <dbl>, rcmas_sd <dbl>, rcmas_min <dbl>,
#   rcmas_max <dbl>, arith_mean <dbl>, arith_sd <dbl>, arith_min <dbl>,
#   arith_max <dbl>
gf_histogram(~age, data = meth_modified) %>%
  gf_labs(title = "Histogram of Age")
`stat_bin()` using `bins = 30`. Pick better value `binwidth`.

gf_histogram(~amas, data = meth_modified) %>%
  gf_labs(title = "Histogram of AMAS")
`stat_bin()` using `bins = 30`. Pick better value `binwidth`.

gf_histogram(~rcmas, data = meth_modified) %>%
  gf_labs(title = "Histogram of RCMAS")
`stat_bin()` using `bins = 30`. Pick better value `binwidth`.
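The `stat_bin()` message appears because no bin width was given, so the default of 30 bins is used. A minimal sketch of setting it explicitly follows. The value `binwidth = 2` is an assumption chosen for illustration, not a recommendation from the source, and the ten AMAS scores are the ones printed earlier in the report.

```r
library(ggformula)

# AMAS scores from the first ten rows of the dataset.
amas_sample <- data.frame(
  amas = c(9, 18, 23, 19, 23, 27, 22, 17, 28, 20)
)

# Each bar now covers 2 score points (hypothetical choice).
p <- gf_histogram(~amas,
                  data = amas_sample,
                  binwidth = 2,
                  fill = "steelblue",
                  color = "white") |>
  gf_labs(title = "Histogram of AMAS (binwidth = 2)")

p
```

For integer-valued scores such as AMAS, a bin width that is a whole number keeps each bar aligned with a fixed set of score values, which tends to be easier to read than 30 arbitrary bins.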

gf_bar(~grade, data = meth_modified) %>%
  gf_labs(title = "Bar Plot of Grade")

gf_bar(~gender, data = meth_modified) %>%
  gf_labs(title = "Bar Plot of Gender")

Visualizing the data

Grade Count for both Genders

meth_modified %>%
  gf_bar(~grade,
         fill=~gender,
         position = "dodge") +
  labs(title= "Grade count for both genders", subtitle = "Dodged Bar Chart", x ="Grade", y ="Count") +
  scale_fill_brewer(palette = "Set2")

Why does the count drop in Secondary grade? We assume that fewer students were enrolled in Secondary grade than in Primary grade.

meth_modified %>%
  gf_bar(~grade,
         fill=~gender,
         position = "stack") +
  labs(title= "Grade count for both genders", subtitle = "Stacked Bar Chart", x ="Grade", y ="Count") +
  scale_fill_brewer(palette = "Set2")

Comparing Grade Proportions for both Genders

meth_modified %>%
  gf_bar(~grade,
         fill=~gender,
         position = "fill") +
  labs(title= "Comparing the gender split within each grade", subtitle = "Filled Bar Chart", x ="Grade", y ="Proportion") +
  scale_fill_brewer(palette = "Set2")
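With `position = "fill"`, each bar is rescaled to a height of 1, so the chart shows the share of each gender within a grade rather than a count. The numbers behind it can be computed directly from the gender-by-grade counts reported earlier; the sketch below does that, with `counts` and `props` as illustrative names.

```r
library(dplyr)
library(tibble)

# Counts taken from the gender-by-grade table earlier in the report.
counts <- tribble(
  ~gender, ~grade,      ~n,
  "Boy",   "Primary",   199,
  "Boy",   "Secondary", 124,
  "Girl",  "Primary",   202,
  "Girl",  "Secondary",  74
)

props <- counts %>%
  group_by(grade) %>%
  mutate(prop = n / sum(n)) %>%   # share within each grade
  ungroup()

props
```

Girls make up about half of the Primary group (202 of 401) but a little over a third of the Secondary group (74 of 198), which is the difference the filled bar chart makes visible.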

Age Count for both Genders

meth_modified %>%
  gf_bar(~age,
         fill=~gender,
         position = "stack") +
  labs(title= "Age count for both genders", subtitle = "Stacked Bar Chart", x ="Age", y ="Count") +
  scale_fill_brewer(palette = "Set2")

Inferences:

  1. After about 150 months (12.5 years), the count drops markedly. This suggests that older students are less represented in the sample, possibly because fewer of them are enrolled, although the counts alone cannot confirm the reason.
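Since age is recorded in months, converting to years is a division by 12. A small sketch, using two ages from the data and the 150-month threshold mentioned above:

```r
# Ages in months: two values from the dataset plus the 150-month threshold.
age_months <- c(137.8, 140.7, 150)

# Convert to years.
age_years <- age_months / 12

round(age_years, 2)
```

A derived `age_years` column can make axis labels easier to read, while the original months column keeps the full precision.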

AMAS Scores based on Gender & Grade

meth_modified %>% 
  gf_histogram(~amas| grade~gender,
               bins = 5,
               fill = "steelblue",
               color="white") %>% 
  gf_labs(title="Histogram of AMAS Scores",
          subtitle ="Faceted by Grade and Gender",
          x="AMAS Score",
          y="Count")

meth_modified %>% 
  gf_histogram(~amas | grade,
               fill=~gender,
               colour="black") %>%
  gf_labs(
    title = "AMAS Filled by Gender and Faceted by Grade")
`stat_bin()` using `bins = 30`. Pick better value `binwidth`.

meth_modified %>% 
  gf_boxplot(amas~gender | grade,
             fill=~gender,
             orientation = "x") %>% 
  gf_labs(title = "Boxplots of AMAS Scores by Grade",
       subtitle = "Faceted by Grade",
       x = "Gender",
       y = "AMAS Score")

Inferences:

  1. Girls report more math anxiety: the box plots show higher medians and wider ranges for girls, and girls' average scores are higher in both Primary and Secondary grades.

RCMAS Scores based on Gender & Grade

meth_modified %>% 
  gf_histogram(~rcmas| grade~gender,
               bins = 5,
               fill = "steelblue",
               color="white") %>% 
  gf_labs(title="Histogram of RCMAS Scores",
          subtitle ="Faceted by Grade and Gender",
          x="RCMAS Score",
          y="Count")

meth_modified %>% 
  gf_histogram(~rcmas | grade,
               fill=~gender,
               colour="black") %>%
  gf_labs(
    title = "RCMAS Filled by Gender and Faceted by Grade")
`stat_bin()` using `bins = 30`. Pick better value `binwidth`.

meth_modified %>% 
  gf_boxplot(rcmas~gender | grade,
             fill=~gender,
             orientation = "x") %>% 
  gf_labs(title = "Boxplots of RCMAS Scores by Grade",
       subtitle = "Faceted by Grade",
       x = "Gender",
       y = "RCMAS Score")

Overall Inferences

  • Anxiety appears to be higher for Girls than for Boys

  • Anxiety appears to be highest in Primary Girls

  • Anxiety appears to be lowest in Secondary Boys

Arith Scores based on Gender & Grade

meth_modified %>%
  gf_bar(~arith | grade,
         fill = ~gender,
         position = "dodge",
         color = "black") %>% 
  gf_labs(title = "Dodged Bar Graph of Arith Scores",
          subtitle = "Faceted by Grade",
          x = "Arith Score",
          y = "Count") %>% 
  gf_refine(scale_fill_brewer(palette = "Set2"))
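ColorBrewer palette names are written without spaces, so `"Set2"` is valid while `"Set 2"` is not. A quick way to check a name before using it in a scale is to look it up in `brewer.pal.info`, whose row names are the valid palettes:

```r
library(RColorBrewer)

# Row names of brewer.pal.info are the valid palette names.
valid_palettes <- rownames(brewer.pal.info)

"Set2" %in% valid_palettes    # valid name
"Set 2" %in% valid_palettes   # name with a space is not recognised

# Three colours from the valid palette.
brewer.pal(3, "Set2")
```

Note also that bars coloured through the `fill` aesthetic are controlled by `scale_fill_brewer()`; `scale_color_brewer()` only affects the `color` (outline) aesthetic.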

Inferences:

  1. In general, girls have lower arith scores than boys.
  2. The gap between girls' and boys' scores is smaller in Primary grade and larger in Secondary grade.

meth_modified %>%
  gf_bar(~arith | grade,
         fill = ~gender,
         position = "fill",
         color = "black") %>% 
  gf_labs(title = "Filled Bar Graph of Arith Scores",
          subtitle = "Faceted by Grade",
          x = "Arith Score",
          y = "Proportion") %>% 
  gf_refine(scale_fill_brewer(palette = "Set2"))

meth_modified %>% 
  gf_bar(~arith | grade,
               fill=~grade,
               colour="black") %>%
  gf_labs(
    title = "Arith Scores Filled and Faceted by Grade",
    x="Arith Scores",
    y="Count")

Inferences:

Primary grade students have higher arith scores than Secondary grade students.

meth_modified %>% 
  gf_boxplot(arith ~gender | grade,
             fill=~gender,
             orientation = "x") %>% 
  gf_labs(title = "Boxplots of Arith Scores by Grade",
       subtitle = "Faceted by Grade",
       x = "Gender",
       y = "Arith Score")

Inferences:

  1. In Primary grade, scores mostly range from 5 to 7 with a median of 6, while in Secondary grade they mostly range from 3 to 6 with a median of 4.
  2. Secondary students lag in arithmetic: the box plots show lower medians and wider ranges for Secondary grade.

Arith Scores based on Age, Grade & Gender

meth_modified %>%
  gf_point(arith ~ age | gender, color = ~ grade) %>%
  gf_labs(
    title = "Scatter Plot of Age vs Arithmetic Score",
    subtitle = "Faceted by Gender, Colored by Grade",
    x = "Age",
    y = "Arithmetic Score",
    color = "Grade"
  ) %>%
  gf_refine(scale_color_brewer(palette = "Spectral")) %>% 
  gf_theme(theme_dark())

Inferences:

  1. The scatter plots show similar Arith trends across both genders, but Primary students tend to have higher Arith scores.

meth_modified %>% 
  gf_boxplot(arith ~ grade | gender, orientation = "x", fill=~grade, color = "black") %>%
 gf_labs(
    x = "Grade",
    y = "Arithmetic Score",
    title = "Arithmetic Scores by Grade",
    subtitle = "Faceted by Gender"
  ) %>%
  gf_refine(scale_fill_brewer(palette = "Set3")) %>%
  gf_theme(theme_dark())

Inferences:

  1. Primary grade students (both genders) have higher arithmetic scores compared to secondary grade.
  2. Secondary grade students have wider range for arithmetic scores compared to Primary grade.
  3. Between genders within each grade, there is no visible difference in scores.

Comparing AMAS & Arith Score

  • Hypothesis: From the previous visualizations, we think that math anxiety (AMAS) affects arithmetic scores.

meth_modified %>%
  gf_boxplot(arith ~ amas | gender ~ grade, fill = ~ gender, orientation = "x") %>%
  gf_labs(
    title = "Relationship between Math Anxiety and Arithmetic Performance",
    subtitle = "Across Gender and Grade",
    x = "Math Anxiety (AMAS)",
    y = "Arithmetic Performance (Arith)"
  ) %>%
  gf_refine(scale_fill_brewer(palette = "Set2")) %>%
  gf_theme(theme_light())

Inferences:

  1. As math anxiety (AMAS) increases, arithmetic performance tends to go down for both genders.
  2. In both Primary and Secondary grades, boys perform better than girls.
  3. Primary grade students have higher arithmetic scores than Secondary grade students.
  4. The Secondary grade has more math anxiety, and this may affect their arithmetic scores.
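A scatter plot with a fitted line, or a correlation coefficient, is a more direct way to examine a trend between two numeric variables such as AMAS and Arith. The sketch below uses only the ten rows printed earlier in the report, so it illustrates the method and says nothing reliable about the full dataset. The names `sample_rows`, `r`, and `p` are illustrative.

```r
library(ggformula)

# AMAS and Arith from the first ten rows of the dataset.
sample_rows <- data.frame(
  amas  = c(9, 18, 23, 19, 23, 27, 22, 17, 28, 20),
  arith = c(6, 6, 5, 7, 1, 1, 4, 7, 2, 6)
)

# Pearson correlation; a negative value means higher anxiety goes
# with lower arithmetic scores in these rows.
r <- stats::cor(sample_rows$amas, sample_rows$arith)
r

# Scatter plot with a least-squares line.
p <- gf_point(arith ~ amas, data = sample_rows) |>
  gf_lm()
p
```

On the full data, the same call with `data = meth_modified`, optionally faceted with `| gender ~ grade`, would show whether the downward trend holds within each group. A correlation describes association only and does not show that anxiety causes lower scores.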

Density Plot based on Age, Gender & Grade

meth_modified %>%
  gf_density(~ age | gender, fill = ~ grade, color="black") %>%
  gf_labs(
    title = "Density Plots of Age",
    subtitle = "Faceted by Gender, Filled by Grade",
    x = "Age",
    y = "Density",
    fill = "Grade"
  ) %>% 
  gf_refine(scale_fill_brewer(palette = "Set3")) %>% 
  gf_theme(theme_light())

Comparing AMAS & RCMAS

meth_modified %>%
  gf_point(rcmas ~ amas, color = ~ gender, shape = ~ gender) %>%
  gf_labs(
    title = "Scatter Plot of AMAS vs RCMAS",
    subtitle = "Colored by Gender",
    x = "AMAS Score",
    y = "RCMAS Score"
  ) %>%
  gf_refine(scale_color_brewer(palette = "Set1")) %>% 
  gf_theme(theme_light())

Inferences:

  1. In general, both genders score higher on AMAS than on RCMAS (overall means of about 22.0 and 19.2).

meth_modified %>%
  gf_boxplot(rcmas ~ amas | gender~grade, fill =~gender, orientation = "y") %>%
  gf_labs(
    title = "Box Plot of AMAS vs RCMAS",
    subtitle = "Faceted by Gender and Grade",
    x = "AMAS Score",
    y = "RCMAS Score"
  ) %>%
  gf_refine(scale_fill_brewer(palette = "Set2")) %>% 
  gf_theme(theme_light())

Inferences:

  1. As seen in the box plots, Boys tend to have a wider range of RCMAS scores than Girls.
  2. Comparing Primary and Secondary Boys:
    • Primary Boys have lower general anxiety (rcmas) and higher math anxiety (amas).
    • Secondary Boys have higher general anxiety (rcmas) and lower math anxiety (amas).
  3. Comparing Primary and Secondary Girls:
    • Primary Girls have higher general anxiety (rcmas) and lower math anxiety (amas).
    • Secondary Girls have lower general anxiety (rcmas) and higher math anxiety (amas).
  4. Comparing Primary Boys and Girls:
    • Boys have a lower median but a wider range for math anxiety (amas).
    • Girls have a higher median but a narrower range for math anxiety (amas).
    • Boys and Girls have the same range for general anxiety (rcmas).
  5. Comparing Secondary Boys and Girls:
    • Boys have a lower median and a narrower range for math anxiety (amas).
    • Girls have a higher median and a wider range for math anxiety (amas).
    • Boys have a wider range of general anxiety (rcmas) than Girls.

Conclusion

  1. Most visualizations show that girls have higher anxiety (both math and general), while boys have higher arithmetic scores. One possible explanation, which this dataset cannot test, is that boys have a better support system.

  2. Many visualizations confirm: primary grade students do better in Math,