Getting started
Many people use the jsPsych JavaScript framework to run their online experiments. Regardless of whether you store your results locally or on a dedicated server (e.g., JATOS), you end up with results stored in JSON format. Although it is possible to work with JSON files in R, the jspsychread package provides a simplified interface to smooth the import and subsequent data manipulation.
The package helps you convert the JSON file into the tibble format used in dplyr and many other packages of the tidyverse.
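For comparison, here is a minimal sketch of what reading jsPsych-style JSON looks like with jsonlite alone. The JSON string is a made-up two-trial example (not the package's demo file), and the use of jsonlite is an assumption made for illustration only.

```r
library(jsonlite)

# hypothetical two-trial jsPsych-style output (made up for illustration)
txt <- '[
  {"trial_type": "preload", "trial_index": 0, "time_elapsed": 3},
  {"rt": 612, "response": "j", "trial_type": "image-keyboard-response",
   "trial_index": 1, "time_elapsed": 5215}
]'

# simplifyVector = FALSE keeps every trial as a named list,
# because different plugins store different fields
trials <- fromJSON(txt, simplifyVector = FALSE)

# pull one field that every trial has
types <- vapply(trials, function(x) x$trial_type, character(1))
types
```

Every further field would have to be extracted by hand in the same way, which is the kind of bookkeeping a dedicated import function can take over.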
Example data
Let’s imagine you collected the data of one person in a simple reaction time experiment. You can try the experiment yourself in the jsPsych tutorial.
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(tidyr)
# or library(tidyverse)
library(jspsychread)
# example file included in the package
filename <- demo_file("demo-simple-rt-task.json")
First, let’s have a look at the data.
readLines(filename, n = 12) %>% cat(sep = "\n")
#> [
#> {
#> "success": true,
#> "timeout": false,
#> "failed_images": [],
#> "failed_audio": [],
#> "failed_video": [],
#> "trial_type": "preload",
#> "trial_index": 0,
#> "time_elapsed": 3,
#> "internal_node_id": "0.0-0.0"
#> },
We can import the content into R with the read_jspsych command, which returns the following tibble.
As you can see below, the tibble contains 6 columns. The record column holds a different number for each experiment stored in the file. It is always 1 for a local JSON file (so it is a good idea to change it after the import), or it is an increasing sequence of integers (1, 2, ...) if you are reading the data from a JATOS server (where you should specify the additional parameter single = F).
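Relabelling record after the import could look like the sketch below. The toy tibble stands in for an imported local file, and the participant label "P01" is a hypothetical value chosen for illustration.

```r
library(dplyr)

# toy stand-in for an imported local file: record is 1 in every row
d_toy <- tibble::tibble(
  record = c(1, 1, 1),
  trial_index = 0:2
)

# replace the constant record with a participant label of your choice
# ("P01" is hypothetical)
d_labelled <- d_toy %>%
  mutate(record = "P01")
d_labelled
```

A distinct label per file keeps participants apart once several imported files are bound together.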
The next four columns, trial_type, trial_index, time_elapsed and internal_node_id, hold the compulsory data stored for every plugin/trial of a jsPsych experiment. We will use trial_type to distinguish between the experiment trials and everything else (e.g., instructions). The values of trial_index can also be handy when you want to work with a particular subset of your trials (training vs experiment, block 1 vs 2). The time in milliseconds from the start of the experiment to the end of each trial is stored in time_elapsed; differences between consecutive values may be a useful indicator of long waiting times (even if you don’t collect a response time for the particular trial or plugin). The text stored in internal_node_id refers to the hierarchical structure of the experiment.
Although these columns may already be of interest, the most interesting information is stored as a list in the final column, called raw.
d <- read_jspsych(filename)
d
#> # A tibble: 24 × 6
#> record trial_type trial_index time_elapsed intern…¹ raw
#> <dbl> <chr> <int> <int> <chr> <list>
#> 1 1 preload 0 3 0.0-0.0 <named list>
#> 2 1 html-keyboard-response 1 5215 0.0-1.0 <named list>
#> 3 1 html-keyboard-response 2 13909 0.0-2.0 <named list>
#> 4 1 html-keyboard-response 3 17423 0.0-3.0… <named list>
#> 5 1 image-keyboard-response 4 18047 0.0-3.0… <named list>
#> 6 1 html-keyboard-response 5 18550 0.0-3.0… <named list>
#> 7 1 image-keyboard-response 6 19217 0.0-3.0… <named list>
#> 8 1 html-keyboard-response 7 19974 0.0-3.0… <named list>
#> 9 1 image-keyboard-response 8 20499 0.0-3.0… <named list>
#> 10 1 html-keyboard-response 9 21751 0.0-3.0… <named list>
#> # … with 14 more rows, and abbreviated variable name ¹internal_node_id
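The time_elapsed values in the output above grow from row to row, so they read as a running clock rather than as per-trial durations. Assuming that reading is right, a rough duration of each trial can be sketched as the difference between consecutive values. The toy tibble reuses the first five values printed above; the column name duration is made up for this example.

```r
library(dplyr)

# first five time_elapsed values from the printed tibble
d_toy <- tibble::tibble(
  trial_index = 0:4,
  time_elapsed = c(3L, 5215L, 13909L, 17423L, 18047L)
)

# difference to the previous row; the first trial is measured from 0
d_dur <- d_toy %>%
  mutate(duration = time_elapsed - lag(time_elapsed, default = 0L))
d_dur
```

With more than one record in the tibble, you would probably want to add group_by(record) before the mutate so that differences are not taken across participants.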
Expanding the raw data
The data from each plugin differ, so it makes sense to expand them only within a tibble of trials/plugins of the same type.
As you can see, our experiment contains one preload plugin, 13 html-keyboard-response plugins for instructions and fixation crosses, and finally 10 image-keyboard-response plugins for the experiment trials.
d %>% count(trial_type)
#> # A tibble: 3 × 2
#> trial_type n
#> <chr> <int>
#> 1 html-keyboard-response 13
#> 2 image-keyboard-response 10
#> 3 preload 1
The data in raw are stored as a list, which we can inspect. Later you will see how to convert these lists directly into tibbles.
d %>% slice_head(n = 1) %>% pull(raw)
#> [[1]]
#> [[1]]$success
#> [1] TRUE
#>
#> [[1]]$timeout
#> [1] FALSE
#>
#> [[1]]$failed_images
#> list()
#>
#> [[1]]$failed_audio
#> list()
#>
#> [[1]]$failed_video
#> list()
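If you only need one or two fields from raw, tidyr's hoist() can lift them into regular columns. This is a sketch on a toy list column whose field names mimic the preload output above; it is a general tidyr technique and not part of the jspsychread workflow described here.

```r
library(tidyr)

# toy tibble with a list column shaped like `raw`
d_toy <- tibble::tibble(
  trial_index = 0:1,
  raw = list(
    list(success = TRUE, timeout = FALSE),
    list(success = FALSE, timeout = TRUE)
  )
)

# lift the `success` field out of every list into its own column
d_hoisted <- hoist(d_toy, raw, success = "success")
d_hoisted
```

The remaining fields stay in raw, so nothing is lost if you decide to parse them later.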
Let’s go directly to the experiment trials. For the filter command, you can use a string constant, i.e. trial_type == "image-keyboard-response", or use the predefined constants in the trial_types list and rely on auto-complete. This means: type trial_types$im, press TAB and choose from the auto-complete drop-down menu.
de <-
  d %>%
  filter(trial_type == trial_types$image_keyboard_response)
de
#> # A tibble: 10 × 6
#> record trial_type trial_index time_elapsed intern…¹ raw
#> <dbl> <chr> <int> <int> <chr> <list>
#> 1 1 image-keyboard-response 4 18047 0.0-3.0… <named list>
#> 2 1 image-keyboard-response 6 19217 0.0-3.0… <named list>
#> 3 1 image-keyboard-response 8 20499 0.0-3.0… <named list>
#> 4 1 image-keyboard-response 10 22256 0.0-3.0… <named list>
#> 5 1 image-keyboard-response 12 23268 0.0-3.0… <named list>
#> 6 1 image-keyboard-response 14 25451 0.0-3.0… <named list>
#> 7 1 image-keyboard-response 16 26217 0.0-3.0… <named list>
#> 8 1 image-keyboard-response 18 27520 0.0-3.0… <named list>
#> 9 1 image-keyboard-response 20 29771 0.0-3.0… <named list>
#> 10 1 image-keyboard-response 22 31166 0.0-3.0… <named list>
#> # … with abbreviated variable name ¹internal_node_id
For the conversion to a tibble, we use process_records. Currently, there is a set of dedicated parsers (all starting with parse_), which you pass via the .using argument.
dep <-
  de %>%
  # limit the columns to make it more readable
  select(record, trial_index, raw) %>%
  process_records(.using = parse_image_keyboard_response)
dep
#> # A tibble: 10 × 4
#> record trial_index raw processed
#> <dbl> <int> <list> <list>
#> 1 1 4 <named list [6]> <tibble [1 × 3]>
#> 2 1 6 <named list [6]> <tibble [1 × 3]>
#> 3 1 8 <named list [6]> <tibble [1 × 3]>
#> 4 1 10 <named list [6]> <tibble [1 × 3]>
#> 5 1 12 <named list [6]> <tibble [1 × 3]>
#> 6 1 14 <named list [6]> <tibble [1 × 3]>
#> 7 1 16 <named list [6]> <tibble [1 × 3]>
#> 8 1 18 <named list [6]> <tibble [1 × 3]>
#> 9 1 20 <named list [6]> <tibble [1 × 3]>
#> 10 1 22 <named list [6]> <tibble [1 × 3]>
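Conceptually, a parser turns the named list of one trial into a one-row tibble. The sketch below imitates that idea with a hand-written function on toy data; to_row is a hypothetical helper and not the package's implementation of parse_image_keyboard_response.

```r
library(dplyr)
library(purrr)

# toy data with a list column shaped like `raw`
de_toy <- tibble::tibble(
  trial_index = c(4L, 6L),
  raw = list(
    list(rt = 612L, response = "j", stimulus = "img/orange.png"),
    list(rt = 665L, response = "f", stimulus = "img/blue.png")
  )
)

# hypothetical helper: one named list in, one-row tibble out
to_row <- function(x) {
  tibble::tibble(rt = x$rt, response = x$response, stimulus = x$stimulus)
}

# apply the helper to every element of the list column
dep_toy <- de_toy %>%
  mutate(processed = map(raw, to_row))
dep_toy
```

Keeping the result in a list column means that every row still carries its own small tibble, which can be expanded later in a single step.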
This creates a list column, processed, which you can either keep as a list column or unnest.
def <-
  dep %>%
  unnest(processed)
def
#> # A tibble: 10 × 6
#> record trial_index raw rt response stimulus
#> <dbl> <int> <list> <int> <chr> <chr>
#> 1 1 4 <named list [6]> 612 j img/orange.png
#> 2 1 6 <named list [6]> 665 f img/blue.png
#> 3 1 8 <named list [6]> 523 f img/blue.png
#> 4 1 10 <named list [6]> 502 j img/orange.png
#> 5 1 12 <named list [6]> 505 j img/orange.png
#> 6 1 14 <named list [6]> 425 f img/blue.png
#> 7 1 16 <named list [6]> 509 j img/orange.png
#> 8 1 18 <named list [6]> 545 f img/blue.png
#> 9 1 20 <named list [6]> 495 j img/orange.png
#> 10 1 22 <named list [6]> 387 f img/blue.png
Finally, we have our data!
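sumtable <-
  def %>%
  mutate(
    # correct if "j" was pressed exactly when the stimulus was orange
    correct = ((response == "j") == (stimulus == "img/orange.png")),
    # strip the folder and the extension to get the colour name
    colour = gsub("(img/|\\.png)", "", stimulus)
  ) %>%
  # keep correct trials only
  filter(correct) %>%
  group_by(colour) %>%
  summarise(mean_rt = mean(rt), sd = sd(rt), n = n())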
sumtable
#> # A tibble: 2 × 4
#> colour mean_rt sd n
#> <chr> <dbl> <dbl> <int>
#> 1 blue 509 109. 5
#> 2 orange 525. 49.1 5