
Getting started

Many people use the jsPsych JavaScript framework to run their online experiments. Whether you store your results locally or on a dedicated server (e.g., JATOS), you end up with results in JSON format. Although it is possible to work with JSON files in R directly, the jspsychread package provides a simplified interface that smooths the import and the subsequent data manipulation.

The package helps you convert the JSON file into the tibble format used in dplyr and many other tidyverse packages.

Example data

Let’s imagine you collected the data of one person in a simple reaction time experiment. You can try the experiment yourself in the jsPsych tutorial.

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tidyr)
# or library(tidyverse)
library(jspsychread)

# example file included in the package
filename <- demo_file("demo-simple-rt-task.json")

First, let’s have a look at the data.

readLines(filename, n = 12) %>% cat(sep = "\n")
#> [
#>  {
#>      "success": true,
#>      "timeout": false,
#>      "failed_images": [],
#>      "failed_audio": [],
#>      "failed_video": [],
#>      "trial_type": "preload",
#>      "trial_index": 0,
#>      "time_elapsed": 3,
#>      "internal_node_id": "0.0-0.0"
#>  },
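For comparison, a record like the one previewed above can also be parsed without the package. The sketch below uses jsonlite, which is an assumption made for illustration: the source does not say which JSON parser the package relies on, and the inline string is a shortened, hand-typed copy of the first record.

```r
library(jsonlite)

# One record, shortened from the preview above (typed by hand for illustration)
txt <- '[{"success": true, "failed_images": [], "trial_type": "preload", "trial_index": 0, "time_elapsed": 3}]'

# simplifyVector = FALSE keeps every record as a named list
records <- fromJSON(txt, simplifyVector = FALSE)
rec <- records[[1]]

rec$trial_type    # "preload"
rec$time_elapsed  # 3
names(rec)        # field names of the record
```

Working with such nested lists by hand quickly becomes tedious, which is the gap the package is meant to fill.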

We can import the content into R with the read_jspsych command, which returns the following tibble.

As you can see below, the tibble contains 6 columns. The record column holds a different number for each experiment stored in the file. It is always 1 for a local JSON file (so it is a good idea to change it after the import), or an increasing sequence of integers (1, 2, ...) if you are reading the data from a JATOS server (in which case you should specify the additional parameter single = FALSE).

The next four columns, trial_type, trial_index, time_elapsed and internal_node_id, hold the compulsory data stored for every plugin/trial of a jsPsych experiment. We will use trial_type to distinguish between the experiment trials and everything else (e.g., instructions). The values of trial_index are handy when you want to work with a particular subset of your trials (training vs experiment, block 1 vs 2). The time_elapsed column stores the time in milliseconds since the start of the experiment, recorded when each trial ends; the difference between consecutive values can be a useful indicator of long waiting times (even if you don’t collect a response time for the particular trial or plugin). The text stored in internal_node_id refers to the hierarchical structure of the experiment.

Although these columns may already be of interest, the most interesting information is stored as a list in the final column, called raw.

d <- read_jspsych(filename)

d
#> # A tibble: 24 × 6
#>    record trial_type              trial_index time_elapsed intern…¹ raw         
#>     <dbl> <chr>                         <int>        <int> <chr>    <list>      
#>  1      1 preload                           0            3 0.0-0.0  <named list>
#>  2      1 html-keyboard-response            1         5215 0.0-1.0  <named list>
#>  3      1 html-keyboard-response            2        13909 0.0-2.0  <named list>
#>  4      1 html-keyboard-response            3        17423 0.0-3.0… <named list>
#>  5      1 image-keyboard-response           4        18047 0.0-3.0… <named list>
#>  6      1 html-keyboard-response            5        18550 0.0-3.0… <named list>
#>  7      1 image-keyboard-response           6        19217 0.0-3.0… <named list>
#>  8      1 html-keyboard-response            7        19974 0.0-3.0… <named list>
#>  9      1 image-keyboard-response           8        20499 0.0-3.0… <named list>
#> 10      1 html-keyboard-response            9        21751 0.0-3.0… <named list>
#> # … with 14 more rows, and abbreviated variable name ¹​internal_node_id
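The time_elapsed values in the output above grow from row to row, so an approximate per-trial duration can be obtained by differencing. The following is a minimal sketch in base R on a hand-built copy of the first five rows (the data frame `toy` and the column `duration` are names introduced here for illustration); the differences also include any gap between trials, so treat them as rough estimates.

```r
# First five rows of the tibble above, rebuilt by hand as a plain data frame
toy <- data.frame(
  trial_index  = 0:4,
  time_elapsed = c(3L, 5215L, 13909L, 17423L, 18047L)
)

# Difference to the previous row; the first trial is measured from time 0
toy$duration <- diff(c(0L, toy$time_elapsed))
toy$duration
# 3 5212 8694 3514 624

# trial_index can be used to keep only a subset of trials
later_trials <- toy[toy$trial_index >= 3, ]
```

With dplyr, the same idea could be written using mutate() and lag().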

Expanding the raw data

The data from each plugin differ, so it makes sense to expand them only within a tibble of trials/plugins of the same type.

As you can see, our experiment contains one preload plugin, 13 html-keyboard-response plugins for instructions and fixation crosses and finally 10 image-keyboard-response plugins for experiment trials.

d %>% count(trial_type)
#> # A tibble: 3 × 2
#>   trial_type                  n
#>   <chr>                   <int>
#> 1 html-keyboard-response     13
#> 2 image-keyboard-response    10
#> 3 preload                     1

The data in raw are stored as a list, which we can inspect. Later you will see how to convert these lists directly into tibbles.

d %>% slice_head(n = 1) %>% pull(raw)
#> [[1]]
#> [[1]]$success
#> [1] TRUE
#> 
#> [[1]]$timeout
#> [1] FALSE
#> 
#> [[1]]$failed_images
#> list()
#> 
#> [[1]]$failed_audio
#> list()
#> 
#> [[1]]$failed_video
#> list()
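Because each element of raw is an ordinary named list, individual fields can be reached with the usual list tools. The sketch below uses two hand-made records rather than the package data; the field names rt, response and stimulus are taken from the processed output shown later in this article, and everything else is illustrative.

```r
# Two hand-made records mimicking the structure of the raw column
raw <- list(
  list(rt = 612L, response = "j", stimulus = "img/orange.png"),
  list(rt = 665L, response = "f", stimulus = "img/blue.png")
)

# Access a single field of a single record
first_response <- raw[[1]]$response

# Pull one field out of every record; vapply checks the type of each value
rts <- vapply(raw, function(x) x$rt, integer(1))
rts
# 612 665
```

This works for a field or two, but it becomes repetitive when many fields are needed, which is where the dedicated parsers come in.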

Let’s go directly to the experiment trials. In the filter command, you can use a string constant, i.e. trial_type == "image-keyboard-response", or use the predefined constants in the trial_types list and rely on auto-complete: type trial_types$im, press TAB and choose from the auto-complete drop-down menu.

de <- 
  d %>% 
  filter(trial_type == trial_types$image_keyboard_response)

de
#> # A tibble: 10 × 6
#>    record trial_type              trial_index time_elapsed intern…¹ raw         
#>     <dbl> <chr>                         <int>        <int> <chr>    <list>      
#>  1      1 image-keyboard-response           4        18047 0.0-3.0… <named list>
#>  2      1 image-keyboard-response           6        19217 0.0-3.0… <named list>
#>  3      1 image-keyboard-response           8        20499 0.0-3.0… <named list>
#>  4      1 image-keyboard-response          10        22256 0.0-3.0… <named list>
#>  5      1 image-keyboard-response          12        23268 0.0-3.0… <named list>
#>  6      1 image-keyboard-response          14        25451 0.0-3.0… <named list>
#>  7      1 image-keyboard-response          16        26217 0.0-3.0… <named list>
#>  8      1 image-keyboard-response          18        27520 0.0-3.0… <named list>
#>  9      1 image-keyboard-response          20        29771 0.0-3.0… <named list>
#> 10      1 image-keyboard-response          22        31166 0.0-3.0… <named list>
#> # … with abbreviated variable name ¹​internal_node_id
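Both ways of writing the filter condition select the same rows. The sketch below illustrates this in base R with a hypothetical one-entry stand-in for the package's trial_types list (`trial_types_demo` is not part of the package) and a small hand-built data frame.

```r
# Hypothetical stand-in for the package's trial_types list (one entry only)
trial_types_demo <- list(image_keyboard_response = "image-keyboard-response")

toy <- data.frame(
  trial_index = 3:6,
  trial_type  = c("html-keyboard-response", "image-keyboard-response",
                  "html-keyboard-response", "image-keyboard-response"),
  stringsAsFactors = FALSE
)

# Filtering by string constant and by named constant
by_string   <- toy[toy$trial_type == "image-keyboard-response", ]
by_constant <- toy[toy$trial_type == trial_types_demo$image_keyboard_response, ]

identical(by_string, by_constant)
# TRUE
```

The named constant mainly protects against typos in the plugin name, since a misspelled string silently matches nothing.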

For the conversion to a tibble, we use process_records. Currently, there is a set of dedicated parsers (all starting with parse_), which you pass via the .using argument.

dep <-
  de %>% 
  # limit the columns to make it more readable
  select(record, trial_index, raw) %>%
  process_records(.using = parse_image_keyboard_response) 

dep
#> # A tibble: 10 × 4
#>    record trial_index raw              processed       
#>     <dbl>       <int> <list>           <list>          
#>  1      1           4 <named list [6]> <tibble [1 × 3]>
#>  2      1           6 <named list [6]> <tibble [1 × 3]>
#>  3      1           8 <named list [6]> <tibble [1 × 3]>
#>  4      1          10 <named list [6]> <tibble [1 × 3]>
#>  5      1          12 <named list [6]> <tibble [1 × 3]>
#>  6      1          14 <named list [6]> <tibble [1 × 3]>
#>  7      1          16 <named list [6]> <tibble [1 × 3]>
#>  8      1          18 <named list [6]> <tibble [1 × 3]>
#>  9      1          20 <named list [6]> <tibble [1 × 3]>
#> 10      1          22 <named list [6]> <tibble [1 × 3]>

This creates a list column called processed, which you can either keep as a list column or unnest.

def <- 
  dep %>%
  unnest(processed)

def
#> # A tibble: 10 × 6
#>    record trial_index raw                 rt response stimulus      
#>     <dbl>       <int> <list>           <int> <chr>    <chr>         
#>  1      1           4 <named list [6]>   612 j        img/orange.png
#>  2      1           6 <named list [6]>   665 f        img/blue.png  
#>  3      1           8 <named list [6]>   523 f        img/blue.png  
#>  4      1          10 <named list [6]>   502 j        img/orange.png
#>  5      1          12 <named list [6]>   505 j        img/orange.png
#>  6      1          14 <named list [6]>   425 f        img/blue.png  
#>  7      1          16 <named list [6]>   509 j        img/orange.png
#>  8      1          18 <named list [6]>   545 f        img/blue.png  
#>  9      1          20 <named list [6]>   495 j        img/orange.png
#> 10      1          22 <named list [6]>   387 f        img/blue.png
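To make the two steps above less opaque, here is a rough base R sketch of the general idea: turn each record into a one-row table, then stack the tables next to the identifying columns. It is not the package's implementation; `toy_parser` is a hypothetical function and the records are typed by hand.

```r
# Hypothetical parser: keeps three fields of one record as a one-row data frame
toy_parser <- function(x) {
  data.frame(rt = x$rt, response = x$response, stimulus = x$stimulus,
             stringsAsFactors = FALSE)
}

raw <- list(
  list(rt = 612L, response = "j", stimulus = "img/orange.png",
       trial_type = "image-keyboard-response"),
  list(rt = 665L, response = "f", stimulus = "img/blue.png",
       trial_type = "image-keyboard-response")
)

# One small table per record, comparable to the processed list column
processed <- lapply(raw, toy_parser)

# Stack the tables and add the identifying column, comparable to unnest()
flat <- cbind(trial_index = c(4L, 6L), do.call(rbind, processed))
flat
```

Keeping the parsed tables in a list column first, as the package does, lets you inspect them before flattening.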

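Finally, we have our data! As a quick example, we can keep the correct responses and summarise the response times by stimulus colour.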

sumtable <-
  def %>% 
  mutate(
    correct = ((response == "j") == (stimulus == "img/orange.png")),
    colour = gsub("(img/|\\.png)", "", stimulus)
  ) %>% 
  filter(correct) %>% 
  group_by(colour) %>% 
  summarise(mean_rt = mean(rt), sd = sd(rt), n = n())

sumtable
#> # A tibble: 2 × 4
#>   colour mean_rt    sd     n
#>   <chr>    <dbl> <dbl> <int>
#> 1 blue      509  109.      5
#> 2 orange    525.  49.1     5
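The correct column in the summary above compares two logical vectors: a response counts as correct when "the key was j" and "the stimulus was orange" are either both true or both false. A small base R sketch with made-up values (two correct and two incorrect responses) shows the logic in isolation, together with the gsub call that extracts the colour name.

```r
# Made-up responses and stimuli covering all four combinations
response <- c("j", "f", "j", "f")
stimulus <- c("img/orange.png", "img/blue.png", "img/blue.png", "img/orange.png")

# Correct when both comparisons agree (j for orange, any other key for blue)
correct <- (response == "j") == (stimulus == "img/orange.png")
correct
# TRUE TRUE FALSE FALSE

# Strip the folder prefix and the file extension to get the colour name
colour <- gsub("(img/|\\.png)", "", stimulus)
colour
# "orange" "blue" "blue" "orange"
```

Note that this rule treats any key other than j as the "blue" answer, which is fine when only f and j are accepted as responses.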