Skills Lab 07: Correlation and Chi-square
Google doc: bit.ly/skills-lab-07
Setup
Packages and data
Load the necessary packages:
library(tidyverse)
library(ggrain)
Data
Load the data:
games_tib <- readr::read_csv("data/video_games_data.csv")
Variables in the dataset:
- id: Participant’s ID
- age: Participant’s age
- game: Name of the video game
- game_type: Game classification as “Shooter”, “Sports game”, “RPG” or “Animal crossing”
- affect: Level of emotional affect measured from -6 (most negative) to +6 (most positive)
- affect_cat: Categorical version of the affect variable, with values “Negative” or “Positive”
- life_sat: Life satisfaction
- experience: Experience of playing video games (0-100)
- hours: Hours spent playing video games per week
Correlation
Task 1: Create a correlation matrix
A good first step in an analysis is to explore the associations between variables.
Select the continuous (numeric) variables in the dataset and generate a correlation matrix. Save this selection of columns in a new object called games_tib_cor.
Create a visualisation of the correlations
games_tib_cor <- games_tib |>
  dplyr::select(affect, life_sat, experience, age, hours)
games_tib_cor |> GGally::ggscatmat()
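Task 1 asks for a correlation matrix as well as a plot. A minimal sketch of the numeric matrix with base R's cor() is shown below. It uses a small simulated stand-in (toy_cor, a hypothetical object with made-up values) so that it runs without the lab data file; with the real data you would pass games_tib_cor instead.

```r
# Simulated stand-in for games_tib_cor (hypothetical values, not the lab data)
set.seed(7)
toy_cor <- data.frame(
  affect   = rnorm(200),
  life_sat = rnorm(200),
  hours    = rexp(200, rate = 0.1)
)

# Pearson correlations for every pair of columns;
# "pairwise.complete.obs" drops missing values pair by pair
cor_mat <- cor(toy_cor, use = "pairwise.complete.obs")
round(cor_mat, 2)
```

The diagonal is always 1 (each variable with itself) and the matrix is symmetric, so only the values above or below the diagonal need to be read.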
Task 2: Run correlation tests
Run correlation tests on the numeric variables
Find the variables relevant to the hypothesis below
What is the relationship between these two variables?
Is the relationship strong?
Is it statistically significant?
Can we reject the null hypothesis?
There will be a negative relationship between time spent playing video games and emotional affect.
correlation::correlation(games_tib_cor)
# Correlation Matrix (pearson-method)
Parameter1 | Parameter2 |        r |        95% CI | t(16977) |         p
--------------------------------------------------------------------------
affect     | life_sat   |     0.63 | [ 0.62, 0.64] |   106.24 | < .001***
affect     | experience |     0.04 | [ 0.03, 0.06] |     5.38 | < .001***
affect     | age        |     0.15 | [ 0.13, 0.16] |    19.29 | < .001***
affect     | hours      |     0.02 | [ 0.00, 0.03] |     2.42 | 0.047*
life_sat   | experience |     0.04 | [ 0.02, 0.05] |     4.75 | < .001***
life_sat   | age        |     0.11 | [ 0.10, 0.13] |    14.65 | < .001***
life_sat   | hours      | 5.61e-03 | [-0.01, 0.02] |     0.73 | 0.464
experience | age        |     0.49 | [ 0.48, 0.50] |    73.44 | < .001***
experience | hours      |    -0.01 | [-0.03, 0.00] |    -1.63 | 0.207
age        | hours      |     0.17 | [ 0.16, 0.19] |    22.64 | < .001***
p-value adjustment method: Holm (1979)
Observations: 16979
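The output notes that the p-values are Holm-adjusted. As a rough illustration of what that means, the sketch below recomputes two-sided p-values from the t statistics printed in the table and adjusts them with base R's p.adjust(). The t values are rounded to two decimals, so the results only approximate the printed ones; the vector names are labels chosen here for readability.

```r
# t statistics copied from the correlation output above (rounded to 2 d.p.)
t_vals <- c(
  affect_life_sat     = 106.24,
  affect_experience   = 5.38,
  affect_age          = 19.29,
  affect_hours        = 2.42,
  life_sat_experience = 4.75,
  life_sat_age        = 14.65,
  life_sat_hours      = 0.73,
  experience_age      = 73.44,
  experience_hours    = -1.63,
  age_hours           = 22.64
)
df <- 16977

# Unadjusted two-sided p-values from the t distribution
p_raw <- 2 * pt(-abs(t_vals), df = df)

# Holm's step-down adjustment across all 10 tests
p_holm <- p.adjust(p_raw, method = "holm")

round(cbind(p_raw, p_holm), 3)
```

For affect and hours, the unadjusted p-value is about .016. It is the eighth smallest of the ten, so Holm's method multiplies it by 3, which gives roughly the .047 shown in the table. With nearly 17,000 observations, even a correlation as small as r = 0.02 can reach significance, so the size of r matters more than the p-value when judging the strength of the relationship.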
Chi-square
There will be an association between type of game and experiences of positive or negative affect.
Task 3: Quick data cleaning
We’re interested in comparing the game “Animal crossing” against games classified as “Sports game”. Keep only the rows for these two game types and save the new dataset into an object called games_tib_chi.
Make a prediction! Who do you think is going to be more likely to experience positive affect? Players of Animal Crossing or players of sports games (car racing)?
games_tib_chi <- games_tib |>
  dplyr::filter(game_type %in% c("Animal crossing", "Sports game"))
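After filtering, it is worth checking that only the two intended game types remain. A minimal sketch of that check, using a toy data frame (toy_games, hypothetical rows rather than the lab data) so that it runs on its own:

```r
library(dplyr)

# Toy stand-in for games_tib (hypothetical rows, not the lab data)
toy_games <- data.frame(
  id        = 1:6,
  game_type = c("Shooter", "Sports game", "RPG",
                "Animal crossing", "Sports game", "Animal crossing")
)

# Keep only the two game types of interest
toy_chi <- toy_games |>
  filter(game_type %in% c("Animal crossing", "Sports game"))

# Count rows per game type: only the two requested types should appear
toy_chi |>
  count(game_type)
```

The same count() call on games_tib_chi should list exactly two game types.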
Task 4: Plotting!
Create a bar plot showing the counts of participants across the two game types split by affect valence
Change the default colours
Adjust axis labels
Interpret the plot - does it support your prediction?
games_tib_chi |>
  ggplot2::ggplot(aes(x = game_type, fill = affect_cat)) +
  geom_bar(position = "dodge", alpha = 0.6) +
  scale_fill_manual(values = c("darkmagenta", "lightseagreen")) +
  labs(x = "Game type", y = "Frequency", fill = "Affect") +
  theme_light()
# If time allows, a proportional version is also useful
# (position = "fill" plots proportions within each game type, not counts):
games_tib_chi |>
  ggplot2::ggplot(aes(x = game_type, fill = affect_cat)) +
  geom_bar(position = "fill", alpha = 0.75) +
  scale_fill_manual(values = c("darkmagenta", "lightseagreen")) +
  labs(x = "Game type", y = "Proportion", fill = "Affect") +
  theme_light()
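The proportional bar plot shows, for each game type, the share of players in each affect category. A minimal sketch of how the same proportions could be computed as a table with dplyr, using a toy data frame (toy_chi, hypothetical rows rather than the lab data):

```r
library(dplyr)

# Toy stand-in for games_tib_chi (hypothetical rows, not the lab data)
toy_chi <- data.frame(
  game_type  = c(rep("Animal crossing", 5), rep("Sports game", 5)),
  affect_cat = c("Negative", "Positive", "Positive", "Positive", "Positive",
                 "Negative", "Negative", "Positive", "Positive", "Positive")
)

# Count each game type / affect combination, then convert the counts
# to proportions within each game type
toy_props <- toy_chi |>
  count(game_type, affect_cat) |>
  group_by(game_type) |>
  mutate(prop = n / sum(n)) |>
  ungroup()

toy_props
```

Within each game type the proportions sum to 1, which is why every bar in the proportional plot has the same height.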
Task 5: Run Chi-square test
Run the test of association between type of game and affect category
Interpret the results - does the statistical test support your prediction?
Can we reject the null hypothesis?
chi_test <- chisq.test(games_tib_chi$game_type, games_tib_chi$affect_cat)
chi_test
Pearson's Chi-squared test with Yates' continuity correction
data: games_tib_chi$game_type and games_tib_chi$affect_cat
X-squared = 147.94, df = 1, p-value < 2.2e-16
chi_test$expected
                       games_tib_chi$affect_cat
games_tib_chi$game_type  Negative Positive
  Animal crossing        950.5018 5580.498
  Sports game           1361.4982 7993.502
chi_test$observed
                       games_tib_chi$affect_cat
games_tib_chi$game_type Negative Positive
  Animal crossing           1217     5314
  Sports game               1095     8260
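To see where the expected counts and the test statistic come from, the sketch below rebuilds the 2 x 2 table from the observed counts printed above and reruns the test on it. The odds ratio at the end is an addition for interpretation and is not part of the original output; the object names are chosen here for illustration.

```r
# Observed counts copied from the chi_test$observed output above
observed <- matrix(
  c(1217, 5314,
    1095, 8260),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    game_type  = c("Animal crossing", "Sports game"),
    affect_cat = c("Negative", "Positive")
  )
)

# Expected counts under independence: row total * column total / grand total
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
expected

# Re-run the test on the table; chisq.test() applies Yates' continuity
# correction to 2 x 2 tables by default
chi_from_counts <- chisq.test(observed)
chi_from_counts$statistic

# Odds of positive affect within each game type, and their ratio
odds_positive <- observed[, "Positive"] / observed[, "Negative"]
odds_ratio <- odds_positive[["Animal crossing"]] / odds_positive[["Sports game"]]
odds_ratio
```

The expected counts match chi_test$expected and the statistic matches the reported X-squared of 147.94. The odds ratio is roughly 0.58, so in this sample the odds of positive affect were lower for Animal Crossing players than for sports game players, even though most players in both groups reported positive affect.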
References:
Videogames and well-being pre-print paper (source of the dataset):
- https://osf.io/preprints/psyarxiv/8cxyh