Type: Package
Title: Extract 'REDCap' Databases into Tidy 'Tibble's
Version: 1.2.3
Description: Convert 'REDCap' exports into tidy tables for easy handling of 'REDCap' repeat instruments and event arms.
License: MIT + file LICENSE
URL: https://chop-cgtinformatics.github.io/REDCapTidieR/, https://github.com/CHOP-CGTInformatics/REDCapTidieR
BugReports: https://github.com/CHOP-CGTInformatics/REDCapTidieR/issues
Depends: R (≥ 3.5.0)
Imports: checkmate, cli, dplyr, glue, lobstr, lubridate, purrr, REDCapR (≥ 1.2.0), rlang, stringi, stringr, tibble, tidyr, tidyselect, formattable, pillar, vctrs, readr, stats, forcats
Suggests: covr, knitr, labelled, lintr, openxlsx2 (≥ 0.8), prettyunits, rmarkdown, skimr, testthat (≥ 3.0.0), withr
VignetteBuilder: knitr
Config/testthat/edition: 3
Encoding: UTF-8
Language: en-US
LazyData: true
RoxygenNote: 7.3.2
NeedsCompilation: no
Packaged: 2025-06-06 15:31:54 UTC; porterej
Author: Richard Hanna [aut, cre], Stephan Kadauke [aut], Ezra Porter [aut]
Maintainer: Richard Hanna <hannar1@chop.edu>
Repository: CRAN
Date/Publication: 2025-06-06 16:20:05 UTC

REDCapTidieR: Extract 'REDCap' Databases into Tidy 'Tibble's

Description

Convert 'REDCap' exports into tidy tables for easy handling of 'REDCap' repeat instruments and event arms.

Author(s)

Maintainer: Richard Hanna hannar1@chop.edu (ORCID)

Authors:

Stephan Kadauke
Ezra Porter

See Also

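Useful links:

https://chop-cgtinformatics.github.io/REDCapTidieR/
https://github.com/CHOP-CGTInformatics/REDCapTidieR
Report bugs at https://github.com/CHOP-CGTInformatics/REDCapTidieR/issues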


Supplement a supertibble from a longitudinal database with information about the events associated with each instrument

Description

Supplement a supertibble from a longitudinal database with information about the events associated with each instrument

Usage

add_event_mapping(supertbl, linked_arms, repeat_event_types)

Arguments

supertbl

a supertibble object to supplement with metadata

linked_arms

the tibble with event mappings created by link_arms()

repeat_event_types

a dataframe output from get_repeat_event_types() which specifies NR, RS, and RT types for events

Value

The original supertibble with a redcap_events list column containing the arms and events associated with each instrument


Add labelled features to write_redcap_xlsx

Description

Helper function that adds labelled aesthetics to the XLSX supertibble output

Usage

add_labelled_xlsx_features(
  supertbl,
  supertbl_meta,
  wb,
  sheet_vals,
  include_toc_sheet = TRUE,
  include_metadata_sheet = TRUE,
  supertbl_toc = NULL
)

Arguments

supertbl

a supertibble generated using read_redcap()

supertbl_meta

supertibble metadata generated by bind_supertbl_metadata()

wb

An openxlsx2 workbook object

sheet_vals

Helper argument passed from write_redcap_xlsx to determine and assign sheet values.

include_toc_sheet

Include a sheet capturing the supertibble output. Default TRUE.

include_metadata_sheet

Include a sheet capturing the combined output of the supertibble redcap_metadata. Default TRUE.

supertbl_toc

The table of contents supertibble defined in the parent function. Default NULL.


Supplement a supertibble with additional metadata fields

Description

Supplement a supertibble with additional metadata fields

Usage

add_metadata(
  supertbl,
  db_metadata,
  redcap_uri,
  token,
  suppress_redcapr_messages
)

Arguments

supertbl

a supertibble object to supplement with metadata

db_metadata

a REDCap metadata tibble

redcap_uri

The URI/URL of the REDCap server (e.g., "https://server.org/apps/redcap/api/"). Required.

token

The user-specific string that serves as the password for a project. Required.

suppress_redcapr_messages

A logical to control whether to suppress messages from REDCapR API calls. Default TRUE.

Details

This function assumes that db_metadata has been processed to include a row for each option of each multiselection field, i.e. with update_field_names()

Value

The original supertibble with additional fields:


Add the metadata sheet

Description

Internal helper function. Adds appropriate elements to wb object. Returns a dataframe.

Usage

add_metadata_sheet(
  supertbl,
  supertbl_meta,
  wb,
  add_labelled_column_headers,
  table_style,
  column_width,
  na_replace
)

Arguments

supertbl

a supertibble generated using read_redcap()

supertbl_meta

an unnest-ed metadata tibble from the supertibble

wb

An openxlsx2 workbook object

add_labelled_column_headers

Whether or not to include labelled outputs.

table_style

Any excel table style name or "none" (see "formatting" in openxlsx2::wb_add_data_table()). Default "tableStyleLight8".

column_width

Width to set columns across the workbook. Default "auto", otherwise a numeric value. Standard Excel is 8.43.

na_replace

The value used to replace NA values in supertbl. The default is "".

Value

A dataframe


Add partial key helper variables to dataframes

Description

Make helper variables redcap_event and redcap_arm available as branches from var for later use.

Usage

add_partial_keys(db_data, var = NULL)

Arguments

db_data

The REDCap database output defined by REDCapR::redcap_read_oneshot()$data

var

The unquoted name of the field containing event and arm identifiers. Default NULL.

Value

The input data with two columns, redcap_event and redcap_arm, appended to the end for use in read_redcap output tibbles.
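As a rough illustration of the idea, the sketch below splits REDCap-style event names of the form <event>_arm_<n> into the two helper columns using base R. The function name split_event_arm(), the sample data, the regular expressions, and the use of a quoted column name (the documented var argument is unquoted) are all assumptions made for illustration; the package's internal implementation may differ.

```r
# Hypothetical helper: derive redcap_event and redcap_arm from a column of
# REDCap event names such as "baseline_arm_1".
split_event_arm <- function(db_data, var = "redcap_event_name") {
  event_names <- db_data[[var]]
  # Everything before the trailing "_arm_<n>" is treated as the event name
  db_data$redcap_event <- sub("_arm_[0-9]+$", "", event_names)
  # The trailing digits are treated as the arm number
  db_data$redcap_arm <- as.integer(sub("^.*_arm_([0-9]+)$", "\\1", event_names))
  db_data
}

# Made-up export with two arms
db_data <- data.frame(
  record_id = c(1, 1, 2),
  redcap_event_name = c("baseline_arm_1", "visit_2_arm_1", "baseline_arm_2"),
  stringsAsFactors = FALSE
)

keyed <- split_event_arm(db_data)
print(keyed)
```

The two new columns are appended after the existing ones, so downstream code can select them by name without disturbing the original fields.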


Add skimr metrics to a supertibble's metadata

Description

Add default skim metrics to the redcap_data list elements of a supertibble output from read_redcap().

Usage

add_skimr_metadata(supertbl)

Arguments

supertbl

a supertibble generated using read_redcap()

Details

For more information on the default metrics provided, see the skimr::get_default_skimmer_names() documentation.

Value

A supertibble with skimr metadata metrics

Examples

superheroes_supertbl

add_skimr_metadata(superheroes_supertbl)

## Not run: 
redcap_uri <- Sys.getenv("REDCAP_URI")
token <- Sys.getenv("REDCAP_TOKEN")

supertbl <- read_redcap(redcap_uri, token)
add_skimr_metadata(supertbl)

## End(Not run)


Add the supertbl table of contents sheet

Description

Internal helper function. Adds appropriate elements to wb object. Returns a dataframe.

Usage

add_supertbl_toc(
  wb,
  supertbl,
  include_metadata_sheet,
  add_labelled_column_headers,
  table_style,
  column_width,
  na_replace
)

Arguments

wb

An openxlsx2 workbook object

supertbl

a supertibble generated using read_redcap()

include_metadata_sheet

Include a sheet capturing the combined output of the supertibble redcap_metadata.

add_labelled_column_headers

Whether or not to include labelled outputs.

table_style

Any excel table style name or "none" (see "formatting" in openxlsx2::wb_add_data_table()). Default "tableStyleLight8".

column_width

Width to set columns across the workbook. Default "auto", otherwise a numeric value. Standard Excel is 8.43.

na_replace

The value used to replace NA values in supertbl. The default is "".

Value

A dataframe


Apply factor labels to a vector

Description

Apply factor labels to a vector

Usage

apply_labs_factor(x, labels, ...)

Arguments

x

a vector to label

labels

a named vector of labels in the format c(value = label)

...

unused, needed to ignore extra arguments that may be passed

Details

The dots are needed to ignore the ptype argument, which may be passed because apply_labs_haven() accepts it.

Value

factor
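The c(value = label) format can be illustrated with base R. The sketch below is an assumption about how such a labelling step might look, not the package's code: label_as_factor() is a hypothetical name, and the extra ptype argument is passed only to show how the dots swallow arguments the function does not use.

```r
# Hypothetical stand-in for the labelling step: the names of `labels` are the
# raw values and the elements are the display labels.
label_as_factor <- function(x, labels, ...) {
  factor(x, levels = names(labels), labels = unname(labels))
}

labels <- c("1" = "Red", "2" = "Yellow", "3" = "Blue")
x <- c(2, 1, 3, NA, 1)

# `ptype` is absorbed by `...` and has no effect here
labelled_x <- label_as_factor(x, labels, ptype = character())
print(labelled_x)
```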


Apply haven value labels to a vector

Description

Apply haven value labels to a vector

Usage

apply_labs_haven(x, labels, ptype, ...)

Arguments

x

a vector to label

labels

a named vector of labels in the format c(value = label)

ptype

vector to serve as prototype for label values

...

unused, needed to ignore extra arguments that may be passed

Details

Assumes a check_installed() has been run for labelled. Since haven preserves the underlying data values we need to make sure the data type of the value options in the metadata matches the data type of the values in the actual data. This function accepts a prototype, usually a column from db_data, and uses force_cast() to do a best-effort casting of the value options in the metadata to the same data type as ptype. The fallback is to convert x and the value labels to character.

Value

haven_labelled vector


Add supertbl S3 class

Description

Add supertbl S3 class

Usage

as_supertbl(x)

Arguments

x

an object to class

Value

The object with the redcap_supertbl S3 class


Bind supertbl metadata

Description

Simple helper function for binding supertbl metadata into one table. This supports creating the metadata XLSX sheet as well as supertbl_recode.

Usage

bind_supertbl_metadata(supertbl)

Arguments

supertbl

A supertibble generated using read_redcap()


Extract data tibbles from a REDCapTidieR supertibble and bind them to an environment

Description

Take a supertibble generated with read_redcap() and bind its data tibbles (i.e. the tibbles in the redcap_data column) to an environment. The default is the global environment.

Usage

bind_tibbles(supertbl, environment = global_env(), tbls = NULL)

Arguments

supertbl

A supertibble generated by read_redcap(). Required.

environment

The environment to bind the tibbles to. Default is rlang::global_env().

tbls

A vector of the redcap_form_names of the data tibbles to bind to the environment. Default is NULL which binds all data tibbles.

Value

This function returns nothing as it's used solely for its side effect of modifying an environment.
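The side effect can be pictured with base R alone. The sketch below is an analogy rather than the package's implementation: it binds the elements of a plain named list (standing in for the redcap_data column) into a fresh environment with list2env(). The list contents and names are made up for illustration.

```r
# Stand-in for the data tibbles of a supertibble, keyed by form name
# (hypothetical example data)
data_tibbles <- list(
  heroes_information = data.frame(name = c("A-Bomb", "Abe Sapien")),
  super_hero_powers  = data.frame(power = c("Agility", "Stamina"))
)

# Bind each list element to a new environment under its own name
my_env <- new.env()
list2env(data_tibbles, envir = my_env)

bound_names <- sort(ls(my_env))
print(bound_names)
```

Binding to a dedicated environment rather than the global one keeps the workspace uncluttered and makes the bound objects easy to discard together.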

Examples

## Not run: 
# Create an empty environment
my_env <- new.env()

ls(my_env)

superheroes_supertbl

bind_tibbles(superheroes_supertbl, my_env)

ls(my_env)

## End(Not run)


Utility function to calculate summary for each tibble in a supertibble

Description

Utility function to calculate summary for each tibble in a supertibble

Usage

calc_metadata_stats(data)

Arguments

data

a tibble of redcap data stored in the redcap_data column of a supertibble

Value

A list containing:


Check requested data argument exists in REDCap data

Description

Provide an error message when an argument is requested, but is not found in any read_redcap() redcap_data output.

Usage

check_data_arg_exists(db_data, col, arg, call = caller_env())

Arguments

db_data

The REDCap database output generated by REDCapR::redcap_read_oneshot()$data

col

The column to check for in redcap_data

arg

The argument used for the column check

call

The calling environment to use in the error message

Details

Currently used for the following arguments:

Value

An error message saying the requested data does not exist


Check equal distinct values between two columns

Description

Takes a dataframe and two columns and checks whether the values of the second column are unique (as measured by n_distinct) within each group of the first column.

Usage

check_equal_col_summaries(data, col1, col2, call = caller_env())

Arguments

data

a dataframe

col1

a column to group by

col2

a column to check for uniqueness
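A minimal base R sketch of this kind of check follows. It assumes the description means that every group of col1 must map to exactly one distinct value of col2; that reading, the function name has_single_value_per_group(), and the sample data are assumptions, not the package's code.

```r
# Hypothetical check: does every group of `col1` contain exactly one distinct
# value of `col2`?
has_single_value_per_group <- function(data, col1, col2) {
  n_per_group <- tapply(
    data[[col2]], data[[col1]],
    function(v) length(unique(v))
  )
  all(n_per_group == 1)
}

# Each field group belongs to a single form
consistent <- data.frame(
  field_group = c("allergy", "allergy", "symptom"),
  form_name   = c("intake", "intake", "followup")
)

# The "allergy" group is split across two forms
inconsistent <- data.frame(
  field_group = c("allergy", "allergy", "symptom"),
  form_name   = c("intake", "followup", "followup")
)

res_consistent <- has_single_value_per_group(consistent, "field_group", "form_name")
res_inconsistent <- has_single_value_per_group(inconsistent, "field_group", "form_name")
print(c(consistent = res_consistent, inconsistent = res_inconsistent))
```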


Check data field for field values not in metadata

Description

Check data field for field values not in metadata

Usage

check_extra_field_values(x, values)

Arguments

x

data field

values

expected field values
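One way to picture this check is a set difference between observed and expected values. The sketch below uses base R only; find_extra_values() is a hypothetical name, and ignoring NA and comparing as character are assumptions made for illustration.

```r
# Hypothetical helper: return observed values that are not among the
# expected values, ignoring missing data.
find_extra_values <- function(x, values) {
  observed <- unique(x[!is.na(x)])
  setdiff(as.character(observed), as.character(values))
}

x <- c("1", "2", "9", NA, "2", "7")
extra <- find_extra_values(x, values = c("1", "2", "3"))
print(extra)
```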


Parse logical field and compile data for warning if parsing errors occurred

Description

Parse logical field and compile data for warning if parsing errors occurred

Usage

check_field_is_logical(x)

Arguments

x

vector to parse


Check fields are of checkbox field type

Description

Check fields are of checkbox field type

Usage

check_fields_are_checkboxes(metadata_tbl, call = caller_env())

Arguments

metadata_tbl

A metadata tibble from a supertibble

call

The calling environment to use in the error message
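The check can be sketched in base R as a filter on the field_type column. The function name non_checkbox_fields() and the sample metadata are hypothetical; the sketch reports offending fields with message() rather than raising the package's error.

```r
# Hypothetical helper: list fields whose type is not "checkbox"
non_checkbox_fields <- function(metadata_tbl) {
  metadata_tbl$field_name[metadata_tbl$field_type != "checkbox"]
}

# Made-up metadata in which one selected field is a text field
metadata_tbl <- data.frame(
  field_name = c("multi___1", "multi___2", "study_id"),
  field_type = c("checkbox", "checkbox", "text"),
  stringsAsFactors = FALSE
)

offenders <- non_checkbox_fields(metadata_tbl)
if (length(offenders) > 0) {
  message("Non-checkbox fields selected: ", paste(offenders, collapse = ", "))
}
```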


Check fields exist for checkbox combination

Description

Check fields exist for checkbox combination

Usage

check_fields_exist(fields, expr, call = caller_env())

Arguments

fields

Vector of character strings to check the length of

expr

An expression

call

The calling environment to use in the error message


Check if file already exists

Description

Provide an error message when a file is declared for writing that already exists.

Usage

check_file_exists(file, overwrite, call = caller_env())

Arguments

file

The file that is being checked

overwrite

Whether the file was declared to be overwritten

call

The calling environment to use in the error message

Details

In the case of write_redcap_xlsx(), this should only error when a file already exists and is not declared for overwrite.

Value

An error message saying the requested file already exists
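The rule described in Details can be sketched in a few lines of base R. stop_if_file_exists() and its message text are hypothetical; the package raises its own formatted error.

```r
# Hypothetical guard: error only when the file exists and overwrite is FALSE
stop_if_file_exists <- function(file, overwrite) {
  if (file.exists(file) && !overwrite) {
    stop("File '", file, "' already exists. Set overwrite = TRUE to replace it.",
         call. = FALSE)
  }
  invisible(TRUE)
}

# Create a temporary file to stand in for an existing workbook
existing <- tempfile(fileext = ".xlsx")
file.create(existing)

# Without overwrite the guard errors; capture the message for inspection
result <- tryCatch(
  stop_if_file_exists(existing, overwrite = FALSE),
  error = function(e) conditionMessage(e)
)
print(result)

# With overwrite = TRUE the guard passes
ok <- stop_if_file_exists(existing, overwrite = TRUE)

unlink(existing)
```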


Check that all requested instruments are in REDCap project metadata

Description

Provide an error message when any instrument names are passed to read_redcap() that do not exist in the project metadata.

Usage

check_forms_exist(db_metadata, forms, call = caller_env())

Arguments

db_metadata

The metadata file read by REDCapR::redcap_metadata_read()

forms

The character vector of instrument names passed to read_redcap()

call

the calling environment to use in the error message

Value

An error message listing the requested instruments that don't exist
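Conceptually this is a set difference between the requested instruments and the form names found in the metadata. The base R sketch below uses a hypothetical missing_forms() helper and made-up metadata; it only computes the missing names and leaves error formatting aside.

```r
# Hypothetical helper: requested instruments that are absent from the metadata
missing_forms <- function(db_metadata, forms) {
  setdiff(forms, unique(db_metadata$form_name))
}

db_metadata <- data.frame(
  field_name = c("record_id", "age", "hgb"),
  form_name  = c("demographics", "demographics", "labs"),
  stringsAsFactors = FALSE
)

not_found <- missing_forms(db_metadata, forms = c("labs", "imaging"))
print(not_found)
```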


Check if labelled

Description

Checks if a supplied supertibble is labelled and throws an error if it is not but add_labelled_column_headers is set to TRUE

Usage

check_labelled(supertbl, add_labelled_column_headers, call = caller_env())

Arguments

supertbl

a supertibble generated using read_redcap()

add_labelled_column_headers

Whether or not to include labelled outputs

call

the calling environment to use in the warning message

Value

A boolean


Check metadata fields exist for checkbox combination

Description

Similar to check_fields_exist(), but instead of verifying fields that exist in the data tibble this seeks to verify their existence under the metadata tibble field_names.

Usage

check_metadata_fields_exist(metadata_tbl, cols, call = caller_env())

Arguments

metadata_tbl

A metadata tibble from the supertibble generated by read_redcap().

cols

Selected columns identified for combine_checkboxes() to be cross-checked against metadata_tbl$field_name

call

The calling environment to use in the error message


Check that parsed labels are not duplicated

Description

Check that parsed labels are not duplicated

Usage

check_parsed_labels(
  parsed_labels_output,
  field_name,
  warn_stripped_text = FALSE,
  call = caller_env(n = 2)
)

Arguments

parsed_labels_output

a vector of parsed labels produced by parse_labels()

field_name

the name of the field associated with the labels to use in the warning message

warn_stripped_text

logical for whether to include a note about HTML tag stripping in the message

call

the calling environment to use in the error message. Defaults to the parent of the calling environment because this check usually occurs 2 frames below the relevant context for the user

Value

a warning message specifying the duplicate labels and the REDCap field affected
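A duplicate-label check of this kind can be sketched with duplicated(). The function warn_duplicate_labels(), its message wording, and the sample labels are hypothetical and only illustrate the idea.

```r
# Hypothetical helper: warn when a field's parsed labels contain duplicates
warn_duplicate_labels <- function(parsed_labels_output, field_name) {
  dups <- unique(parsed_labels_output[duplicated(parsed_labels_output)])
  if (length(dups) > 0) {
    warning(
      "Field '", field_name, "' has duplicate labels: ",
      paste(dups, collapse = ", "),
      call. = FALSE
    )
  }
  invisible(dups)
}

# Two raw values share the label "Yes"
labels <- c("1" = "Yes", "2" = "No", "3" = "Yes")

# Capture the warning text instead of printing it
msg <- tryCatch(
  warn_duplicate_labels(labels, "consent"),
  warning = function(w) conditionMessage(w)
)
print(msg)
```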


Check that a supplied REDCap database is populated

Description

Check for potential outputs where metadata is present, but nrow and ncol equal 0. This causes multi_choice_to_labels to fail, but a helpful error message should be provided.

Usage

check_redcap_populated(db_data, call = caller_env())

Arguments

db_data

The REDCap database output generated by REDCapR::redcap_read_oneshot()$data

call

the calling environment to use in the error message

Value

A helpful error message alerting the user to check their API privileges.


Check for instruments that have both repeating and non-repeating structure

Description

Check for potential instruments that are given both repeating and nonrepeating structure. REDCapTidieR does not support database structures built this way.

Usage

check_repeat_and_nonrepeat(db_data, call = caller_env())

Arguments

db_data

The REDCap database output generated by REDCapR::redcap_read_oneshot()$data

call

the calling environment to use in the error message

Value

A helpful error message alerting the user to existence of an instrument that was designated both as repeating and non-repeating.


Check that all metadata tibbles within a supertibble contain field_name and field_label columns

Description

Check that all metadata tibbles within a supertibble contain field_name and field_label columns

Usage

check_req_labelled_metadata_fields(supertbl, call = caller_env())

Arguments

supertbl

a supertibble containing a redcap_metadata column

call

the calling environment to use in the error message

Value

an error message alerting that instrument metadata is incomplete
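The requirement can be sketched as a per-instrument test on the column names of each metadata table. In the base R sketch below, a data frame with a list column stands in for a supertibble, and instruments_missing_label_fields() is a hypothetical name.

```r
# Hypothetical helper: names of instruments whose metadata lacks
# field_name or field_label
instruments_missing_label_fields <- function(supertbl) {
  required <- c("field_name", "field_label")
  complete <- vapply(
    supertbl$redcap_metadata,
    function(meta) all(required %in% names(meta)),
    logical(1)
  )
  supertbl$redcap_form_name[!complete]
}

# Stand-in supertibble: the "labs" metadata has no field_label column
supertbl <- data.frame(
  redcap_form_name = c("demographics", "labs"),
  stringsAsFactors = FALSE
)
supertbl$redcap_metadata <- list(
  data.frame(field_name = "age", field_label = "Age"),
  data.frame(field_name = "hgb")
)

incomplete <- instruments_missing_label_fields(supertbl)
print(incomplete)
```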


Check for possible API user privilege issues

Description

Check for potential user access privilege issues and provide an appropriate warning message. This can occur when metadata forms/field names do not appear in a database export.

Usage

check_user_rights(db_data, db_metadata, call = caller_env())

Arguments

db_data

The REDCap database output generated by REDCapR::redcap_read_oneshot()$data

db_metadata

The REDCap metadata output generated by REDCapR::redcap_metadata_read()$data

call

the calling environment to use in the warning message

Value

A helpful warning message alerting the user to check their API privileges.
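The underlying comparison can be sketched as a set difference between metadata field names and exported column names. The sketch below is a simplification under stated assumptions: fields_missing_from_export() is a hypothetical name, the data are made up, and checkbox fields (whose export columns carry option suffixes) are not handled.

```r
# Hypothetical helper: metadata fields that do not appear in the export
fields_missing_from_export <- function(db_data, db_metadata) {
  setdiff(db_metadata$field_name, names(db_data))
}

db_metadata <- data.frame(
  field_name = c("record_id", "age", "ssn"),
  form_name  = c("demographics", "demographics", "identifiers"),
  stringsAsFactors = FALSE
)

# The export lacks the "ssn" field, e.g. because of restricted user rights
db_data <- data.frame(record_id = 1:2, age = c(34, 51))

missing_fields <- fields_missing_from_export(db_data, db_metadata)
if (length(missing_fields) > 0) {
  message("Fields in metadata but not in data: ",
          paste(missing_fields, collapse = ", "))
}
```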


Check an argument with checkmate

Description

Check an argument with checkmate

Usage

check_arg_is_supertbl(
  x,
  req_cols = c("redcap_data", "redcap_metadata"),
  arg = caller_arg(x),
  call = caller_env()
)

check_arg_is_env(x, ..., arg = caller_arg(x), call = caller_env())

check_arg_is_character(x, ..., arg = caller_arg(x), call = caller_env())

check_arg_is_logical(x, ..., arg = caller_arg(x), call = caller_env())

check_arg_choices(x, ..., arg = caller_arg(x), call = caller_env())

check_arg_is_valid_token(x, arg = caller_arg(x), call = caller_env())

check_arg_is_valid_extension(
  x,
  valid_extensions,
  arg = caller_arg(x),
  call = caller_env()
)

Arguments

x

An object to check

req_cols

required fields for check_arg_is_supertbl()

arg

The name of the argument to include in an error message. Captured by rlang::caller_arg() by default

call

the calling environment to use in the error message

...

additional arguments passed on to checkmate

Value

TRUE if x passes the checkmate check. An error otherwise with the name of the checkmate function as a class


Extract non-longitudinal REDCap databases into tidy tibbles

Description

Helper function internal to read_redcap responsible for extraction and final processing of a tidy tibble to the user from a non-longitudinal REDCap database.

Usage

clean_redcap(db_data, db_metadata)

Arguments

db_data

The REDCap database output defined by REDCapR::redcap_read_oneshot()$data

db_metadata

The REDCap metadata output defined by REDCapR::redcap_metadata_read()$data

Value

Returns a tibble with list elements containing tidy dataframes. Users can access dataframes under the redcap_data column with reference to form_name and structure column details.


Extract longitudinal REDCap databases into tidy tibbles

Description

Helper function internal to read_redcap responsible for extraction and final processing of a tidy tibble to the user from a longitudinal REDCap database.

Usage

clean_redcap_long(
  db_data_long,
  db_metadata_long,
  linked_arms,
  allow_mixed_structure = FALSE
)

Arguments

db_data_long

The longitudinal REDCap database output defined by REDCapR::redcap_read_oneshot()$data

db_metadata_long

The longitudinal REDCap metadata output defined by REDCapR::redcap_metadata_read()$data

linked_arms

Output of link_arms, linking instruments to REDCap events/arms

allow_mixed_structure

A logical to allow for support of mixed repeating/non-repeating instruments. Setting to TRUE will treat the mixed instrument's non-repeating versions as repeating instruments with a single instance. Applies to longitudinal projects only. Default FALSE.

Value

Returns a tibble with list elements containing tidy dataframes. Users can access dataframes under the redcap_data column with reference to form_name and structure column details.


Combine checkbox fields with respect to repaired outputs

Description

This function seeks to preserve the original data columns and types from the originally supplied data_tbl and add on the new columns from data_tbl_mod.

If names_repair presents a repair strategy, the output columns will be captured and updated here while dropping the original columns.

Usage

combine_and_repair_tbls(data_tbl, data_tbl_mod, new_cols, names_repair)

Arguments

data_tbl

The original data table given to combine_checkboxes()

data_tbl_mod

A modified data table from data_tbl

new_cols

The new columns created for checkbox combination

names_repair

What happens if the output has invalid column names? The default, "check_unique", is to error if the columns are duplicated. Use "minimal" to allow duplicates in the output, or "unique" to de-duplicate by adding numeric suffixes. See vctrs::vec_as_names() for more options.

Value

a tibble


Combine Checkbox Fields into a Single Column

Description

combine_checkboxes() consolidates multiple checkbox fields in a REDCap data tibble into a single column. This transformation simplifies analysis by merging several binary columns into one labeled factor column, making the data more interpretable and easier to analyze.

Usage

combine_checkboxes(
  supertbl,
  tbl,
  cols,
  names_prefix = "",
  names_sep = "_",
  names_glue = NULL,
  names_repair = "check_unique",
  multi_value_label = "Multiple",
  values_fill = NA,
  raw_or_label = "label",
  keep = TRUE
)

Arguments

supertbl

A supertibble generated by read_redcap(). Required.

tbl

The redcap_form_name of the data tibble to extract. Required.

cols

Checkbox columns to combine to single column. Required.

names_prefix

String added to the start of every variable name.

names_sep

String to separate new column names from names_prefix.

names_glue

Instead of names_sep and names_prefix, you can supply a glue specification and the unique .value to create custom column names.

names_repair

What happens if the output has invalid column names? The default, "check_unique", is to error if the columns are duplicated. Use "minimal" to allow duplicates in the output, or "unique" to de-duplicate by adding numeric suffixes. See vctrs::vec_as_names() for more options.

multi_value_label

A string specifying the value to be used when multiple checkbox fields are selected. Default "Multiple".

values_fill

Value to use when no checkboxes are selected. Default NA.

raw_or_label

Either 'raw' or 'label' to specify whether to use raw coded values or labels for the options. Default 'label'.

keep

Logical indicating whether to keep the original checkbox fields in the output. Default TRUE.

Details

combine_checkboxes() operates on the data and metadata tibbles produced by the read_redcap() function. Since it relies on the checkbox field naming conventions used by REDCap, changes to the checkbox variable names or their associated metadata field_names could lead to errors.

REDCap checkbox fields are typically expanded into separate variables for each checkbox option, with names formatted as checkbox_var___1, checkbox_var___2, etc. combine_checkboxes() detects these variables and combines them into a single column. If the expected variables are not found, an error is returned.
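The naming convention can be made concrete with a small base R sketch that splits expanded checkbox column names on the triple underscore. The column names are made up, and this is only an illustration of the convention, not how combine_checkboxes() detects fields internally.

```r
# Made-up expanded checkbox columns following the <field>___<option> pattern
checkbox_cols <- c("multi___1", "multi___2", "multi___3", "other___a")

# Split each name into the parent field and the option code
parts <- strsplit(checkbox_cols, "___", fixed = TRUE)

parsed <- data.frame(
  column = checkbox_cols,
  field  = vapply(parts, `[`, character(1), 1),
  option = vapply(parts, `[`, character(1), 2),
  stringsAsFactors = FALSE
)
print(parsed)
```

Renaming a column so that it no longer contains the triple underscore would break this kind of parsing, which is why changes to checkbox variable names can lead to errors.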

Value

A modified supertibble.

Examples

library(dplyr)
# Set up sample data tibble
data_tbl <- tibble::tribble(
  ~"study_id", ~"multi___1", ~"multi___2", ~"multi___3",
  1, TRUE, FALSE, FALSE,
  2, TRUE, TRUE, FALSE,
  3, FALSE, FALSE, FALSE
)

# Set up sample metadata tibble
metadata_tbl <- tibble::tribble(
  ~"field_name", ~"field_type", ~"select_choices_or_calculations",
  "study_id", "text", NA,
  "multi___1", "checkbox", "1, Red | 2, Yellow | 3, Blue",
  "multi___2", "checkbox", "1, Red | 2, Yellow | 3, Blue",
  "multi___3", "checkbox", "1, Red | 2, Yellow | 3, Blue"
)

# Create sample supertibble
supertbl <- tibble::tribble(
  ~"redcap_form_name", ~"redcap_data", ~"redcap_metadata",
  "tbl", data_tbl, metadata_tbl
)

class(supertbl) <- c("redcap_supertbl", class(supertbl))

# Combine checkboxes under column "multi"
combine_checkboxes(
  supertbl = supertbl,
  tbl = "tbl",
  cols = starts_with("multi")
) %>%
  dplyr::pull(redcap_data) %>%
  dplyr::first()

## Not run: 

redcap_uri <- Sys.getenv("REDCAP_URI")
token <- Sys.getenv("REDCAP_TOKEN")

supertbl <- read_redcap(redcap_uri, token)
combine_checkboxes(
  supertbl = supertbl,
  tbl = "tbl",
  cols = starts_with("col"),
  multi_value_label = "Multiple",
  values_fill = NA
)

## End(Not run)


Convert a new checkbox column's values

Description

This function takes a single column of data and converts the values based on the overall data tibble cross referenced with a nested section of the metadata tibble.

case_when logic helps determine whether the value is a coalesced singular value or a user-specified one via multi_value_label or values_fill.

Usage

convert_checkbox_vals(
  metadata,
  .new_value,
  data_tbl,
  raw_or_label,
  multi_value_label,
  values_fill
)

Arguments

metadata

A nested portion of the overall metadata tibble

.new_value

The new column values made by combine_checkboxes()

data_tbl

The data tibble from the original supertibble

raw_or_label

Either 'raw' or 'label' to specify whether to use raw coded values or labels for the options. Default 'label'.

multi_value_label

A string specifying the value to be used when multiple checkbox fields are selected. Default "Multiple".

values_fill

Value to use when no checkboxes are selected. Default NA.

Details

This function is used in conjunction with pmap().
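The three-way decision described above (no box checked, exactly one checked, several checked) can be sketched in base R. This is an illustrative assumption about the logic, written with ifelse() instead of case_when; collapse_checkboxes() is a hypothetical name and the data mirror the combine_checkboxes() example.

```r
# Hypothetical helper: collapse a logical matrix of checkbox columns into one
# factor using values_fill, the single selected label, or multi_value_label.
collapse_checkboxes <- function(checked, option_labels,
                                multi_value_label = "Multiple",
                                values_fill = NA_character_) {
  n_checked <- rowSums(checked)
  # Position of the first checked option per row (NA when none is checked)
  first_checked <- unname(apply(checked, 1, function(r) which(r)[1]))
  out <- ifelse(
    n_checked == 0, values_fill,
    ifelse(n_checked == 1, option_labels[first_checked], multi_value_label)
  )
  factor(out, levels = c(option_labels, multi_value_label))
}

checked <- as.matrix(data.frame(
  multi___1 = c(TRUE, TRUE, FALSE),
  multi___2 = c(FALSE, TRUE, FALSE),
  multi___3 = c(FALSE, FALSE, FALSE)
))

combined <- collapse_checkboxes(checked, option_labels = c("Red", "Yellow", "Blue"))
print(combined)
```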


Convert Mixed Structure Instruments to Repeating Instruments

Description

For longitudinal projects where users set allow_mixed_structure to TRUE, this function will handle the process of setting the nonrepeating parts of the instrument to repeating ones with a single instance.

Usage

convert_mixed_instrument(db_data_long, mixed_structure_ref)

Arguments

db_data_long

The longitudinal REDCap database output defined by REDCapR::redcap_read_oneshot()$data

mixed_structure_ref

Reference dataframe containing mixed structure fields and forms.

Value

Returns a tibble with list elements containing tidy dataframes. Users can access dataframes under the redcap_data column with reference to form_name and structure column details.


Utility function to convert redcap repeat instance columns into appropriate form and event columns

Description

Utility function to convert redcap repeat instance columns into appropriate form and event columns

Usage

create_repeat_instance_vars(db_data)

Arguments

db_data

The REDCap database output generated by REDCapR::redcap_read_oneshot()$data

Details

A standard REDCap export with repeating forms and/or events uses redcap_repeat_instance in combination with redcap_repeat_instrument, and its interpretation depends on whether data exists in both. This function instead renames and separates redcap_repeat_instance into redcap_form_instance and redcap_event_instance.

Value

A dataframe.
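The separation can be sketched in base R under one assumption: an instance number accompanied by an instrument name refers to a repeating form, while an instance number with no instrument name refers to a repeating event. That rule, the function name split_repeat_instance(), and the sample data are illustrative and may not match the package's exact logic.

```r
# Hypothetical helper: split redcap_repeat_instance into form and event
# instance columns
split_repeat_instance <- function(db_data) {
  instance <- as.integer(db_data$redcap_repeat_instance)
  is_form_repeat <- !is.na(db_data$redcap_repeat_instrument)

  # An instance attached to an instrument is read as a repeating form instance
  db_data$redcap_form_instance <- ifelse(is_form_repeat, instance, NA_integer_)
  # An instance with no instrument is read as a repeating event instance
  db_data$redcap_event_instance <- ifelse(!is_form_repeat, instance, NA_integer_)

  # The original combined column is no longer needed
  db_data$redcap_repeat_instance <- NULL
  db_data
}

db_data <- data.frame(
  record_id = c(1, 1, 1, 2),
  redcap_repeat_instrument = c(NA, "labs", "labs", NA),
  redcap_repeat_instance = c(NA, 1, 2, 2),
  stringsAsFactors = FALSE
)

separated <- split_repeat_instance(db_data)
print(separated)
```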


Check whether a REDCap database has repeat forms

Description

Simple utility function checking for the existence of repeat forms in a REDCap database.

Usage

db_has_repeat_forms(db_data)

Arguments

db_data

A REDCap dataframe.

Value

A boolean.
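A plausible form of this check is a test for the repeat-instance column that REDCap adds to exports containing repeating structures. The sketch below assumes that is the signal used; has_repeat_forms() is a hypothetical name and the data are made up.

```r
# Hypothetical check: treat the presence of redcap_repeat_instance as the
# indicator of repeating structure
has_repeat_forms <- function(db_data) {
  "redcap_repeat_instance" %in% names(db_data)
}

classic <- data.frame(record_id = 1:2)
repeating <- data.frame(
  record_id = 1:2,
  redcap_repeat_instrument = c(NA, "labs"),
  redcap_repeat_instance = c(NA, 1),
  stringsAsFactors = FALSE
)

res_classic <- has_repeat_forms(classic)
res_repeating <- has_repeat_forms(repeating)
print(c(classic = res_classic, repeating = res_repeating))
```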


Extract non-repeat tables from non-longitudinal REDCap databases

Description

Sub-helper function to clean_redcap for single nonrepeat table extraction.

Usage

distill_nonrepeat_table(form_name, db_data, db_metadata)

Arguments

form_name

The form_name described in the named column from the REDCap metadata.

db_data

The REDCap database output defined by REDCapR::redcap_read_oneshot()$data

db_metadata

The REDCap metadata output defined by REDCapR::redcap_metadata_read()$data

Value

A subset tibble of all data related to a specified form_name


Extract non-repeat tables from longitudinal REDCap databases

Description

Sub-helper function to clean_redcap_long for single nonrepeat table extraction.

Usage

distill_nonrepeat_table_long(
  form_name,
  db_data_long,
  db_metadata_long,
  linked_arms
)

Arguments

form_name

The form_name described in the named column from the REDCap metadata.

db_data_long

The REDCap database output defined by REDCapR::redcap_read_oneshot()$data

db_metadata_long

The REDCap metadata output defined by REDCapR::redcap_metadata_read()$data

linked_arms

Output of link_arms, linking instruments to REDCap events/arms

Value

A tibble of all data related to a specified form_name


Extract repeat tables from non-longitudinal REDCap databases

Description

Sub-helper function to clean_redcap for single repeat table extraction.

Usage

distill_repeat_table(form_name, db_data, db_metadata)

Arguments

form_name

The form_name described in the named column from the REDCap metadata.

db_data

The non-longitudinal REDCap database output defined by REDCapR::redcap_read_oneshot()$data

db_metadata

The non-longitudinal REDCap metadata output defined by REDCapR::redcap_metadata_read()$data

Value

A subset tibble of all data related to a specified form_name


Extract repeat tables from longitudinal REDCap databases

Description

Sub-helper function to clean_redcap_long for single repeat table extraction.

Usage

distill_repeat_table_long(
  form_name,
  db_data_long,
  db_metadata_long,
  linked_arms,
  has_mixed_structure_forms = FALSE,
  mixed_structure_ref = NULL
)

Arguments

form_name

The form_name described in the named column from the REDCap metadata.

db_data_long

The REDCap database output defined by REDCapR::redcap_read_oneshot()$data

db_metadata_long

The REDCap metadata output defined by REDCapR::redcap_metadata_read()$data

linked_arms

Output of link_arms, linking instruments to REDCap events/arms

has_mixed_structure_forms

Whether the instrument under evaluation has a mixed structure. Default FALSE.

mixed_structure_ref

A mixed structure reference dataframe supplied by get_mixed_structure_fields().

Value

A tibble of all data related to a specified form_name


Extract a specific metadata tibble from a supertibble

Description

Utility function to extract a specific metadata tibble from a supertibble given a redcap_form_name

Usage

extract_metadata_tibble(supertbl, redcap_form_name)

Arguments

supertbl

A supertibble generated by read_redcap().

redcap_form_name

A character string identifying the redcap_form_name the metadata tibble is associated with.

Value

A tibble


Extract a single data tibble from a REDCapTidieR supertibble

Description

Take a supertibble generated with read_redcap() and return one of its data tibbles.

Usage

extract_tibble(supertbl, tbl)

Arguments

supertbl

A supertibble generated by read_redcap(). Required.

tbl

The redcap_form_name of the data tibble to extract. Required.

Details

This function makes it easy to extract a single instrument's data from a REDCapTidieR supertibble.

Value

A tibble.

Examples

superheroes_supertbl

extract_tibble(superheroes_supertbl, "heroes_information")


Extract data tibbles from a REDCapTidieR supertibble into a list

Description

Take a supertibble generated with read_redcap() and return a named list of data tibbles.

Usage

extract_tibbles(supertbl, tbls = everything())

Arguments

supertbl

A supertibble generated by read_redcap(). Required.

tbls

A vector of form_names or a tidyselect helper. Default is dplyr::everything().

Details

This function makes it easy to extract multiple instruments' data from a REDCapTidieR supertibble into a named list. Specifying instruments using tidyselect helper functions such as dplyr::starts_with() or dplyr::ends_with() is supported.

Value

A named list of tibbles

Examples

superheroes_supertbl

# Extract all data tibbles
extract_tibbles(superheroes_supertbl)

# Only extract data tibbles starting with "heroes"
extract_tibbles(superheroes_supertbl, starts_with("heroes"))


Format REDCap variable labels

Description

Use these functions with the format_labels argument of make_labelled() to define how variable labels should be formatted before being applied to the data columns of redcap_data. These functions are helpful to create pretty variable labels from REDCap field labels.

Usage

fmt_strip_whitespace(x)

fmt_strip_trailing_colon(x)

fmt_strip_trailing_punct(x)

fmt_strip_html(x)

fmt_strip_field_embedding(x)

Arguments

x

a character vector

Value

a modified character vector

Examples


fmt_strip_whitespace("Poorly Spaced   Label ")

fmt_strip_trailing_colon("Label:")

fmt_strip_trailing_punct("Label-")

fmt_strip_html("<b>Bold Label</b>")

fmt_strip_field_embedding("Label{another_field}")

superheroes_supertbl

make_labelled(superheroes_supertbl, format_labels = fmt_strip_trailing_colon)


Format value for error message

Description

Format value for error message

Usage

format_error_val(x)

Arguments

x

value to format

Value

If x is atomic, x with cli formatting to truncate to 5 values. Otherwise, a string summarizing x produced by as_label


Determine fields included in REDCapR::redcap_read_oneshot output that should be dropped from results of read_redcap

Description

Determine fields included in REDCapR::redcap_read_oneshot output that should be dropped from results of read_redcap

Usage

get_fields_to_drop(db_metadata, form)

Arguments

db_metadata

metadata tibble created by REDCapR::redcap_metadata_read

form

the name of the instrument containing identifiers

Details

This function applies rules to determine which fields are included in the results of REDCapR::redcap_read_oneshot because the user didn't request the instrument containing identifiers

Value

A character vector of extra field names that can be used to filter the results of REDCapR::redcap_read_oneshot


Get metadata specification table

Description

Get metadata specification table

Usage

get_metadata_spec(
  metadata_tbl,
  selected_cols,
  names_prefix,
  names_sep,
  names_glue
)

Arguments

metadata_tbl

A metadata tibble from the supertibble generated by read_redcap().

selected_cols

Character string vector of field names for checkbox combination

names_prefix

String added to the start of every variable name.

names_sep

String to separate new column names from names_prefix.

names_glue

Instead of names_sep and names_prefix, you can supply a glue specification and the unique .value to create custom column names.

Value

a tibble


Get Mixed Structure Instrument List

Description

Define fields in a given project that are used in both a repeating and nonrepeating manner.

Usage

get_mixed_structure_fields(db_data)

Arguments

db_data

The REDCap database output generated by REDCapR::redcap_read_oneshot()$data

Value

a dataframe


Utility function to extract the name of the project identifier field for a tibble of REDCap data

Description

Utility function to extract the name of the project identifier field for a tibble of REDCap data

Usage

get_record_id_field(data)

Arguments

data

a tibble of REDCap data

Details

The current implementation assumes that the first field in the data is the project identifier

Value

The name of the identifier field in the data
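The assumption stated in Details can be illustrated in one line of base R. This is a sketch of the documented behaviour only; first_field_name() and toy_data are hypothetical names, and the package's own function may include extra checks.

```r
# Hypothetical re-statement of the documented assumption:
# the first column of the data is the project identifier.
first_field_name <- function(data) {
  names(data)[1]
}

toy_data <- data.frame(subject_id = 1:3, age = c(34, 51, 28))
id_field <- first_field_name(toy_data)
id_field
# "subject_id"
```

Because the lookup is purely positional, reordering columns before calling such a helper would change its answer.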


Add identification for repeat event types

Description

To correctly assign repeat event types a few assumptions must be made:

Usage

get_repeat_event_types(data)

Arguments

data

the REDCap data

Value

A dataframe with unique event names mapped to their corresponding repeat types


Swap vector names for values

Description

Swap vector names for values

Usage

invert_vec(x)

Arguments

x

a vector

Value

Vector with names and values reversed
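A minimal base R sketch of the same idea, assuming a named character vector as input. The helper name swap_names_and_values() is hypothetical and this is not necessarily how invert_vec() is written.

```r
# Hypothetical helper: the old values become names, the old names become values.
swap_names_and_values <- function(x) {
  stats::setNames(names(x), unname(x))
}

choices <- c("1" = "Yes", "0" = "No")
swapped <- swap_names_and_values(choices)
swapped
```

Here swapped is a character vector with names "Yes" and "No" and values "1" and "0".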


Determine if an object is labelled

Description

An internal utility function used to inform other processes of whether or not a given object has been labelled (i.e. with make_labelled()).

Usage

is_labelled(obj)

Arguments

obj

An object to be tested for "label" attributes

Details

An object is considered labelled if it has "label" attributes.

Value

A boolean
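The rule in Details can be sketched in base R as follows. The helper has_label_attr() is hypothetical; in particular, treating a data frame as labelled when any of its columns carries a "label" attribute is an assumption made for illustration, not a statement about the package's internals.

```r
# Hypothetical check for "label" attributes.
# exact = TRUE avoids partial matching against e.g. a "labels" attribute.
has_label_attr <- function(obj) {
  if (is.data.frame(obj)) {
    any(vapply(
      obj,
      function(col) !is.null(attr(col, "label", exact = TRUE)),
      logical(1)
    ))
  } else {
    !is.null(attr(obj, "label", exact = TRUE))
  }
}

age <- c(34, 51)
before <- has_label_attr(age)   # no label attribute yet

attr(age, "label") <- "Age in years"
after <- has_label_attr(age)    # label attribute now present

c(before = before, after = after)
```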


Description

For REDCap databases containing arms and events, it is necessary to determine how these are linked and what variables belong to them.

Usage

link_arms(redcap_uri, token, suppress_redcapr_messages = TRUE)

Arguments

redcap_uri

The REDCap URI

token

The REDCap API token

suppress_redcapr_messages

A logical to control whether to suppress messages from REDCapR API calls. Default TRUE.

Value

Returns a tibble of redcap_event_names with list elements containing a vector of associated instruments.


Apply variable labels to a REDCapTidieR supertibble

Description

Take a supertibble and use the labelled package to apply variable labels to the columns of the supertibble as well as to each tibble in the redcap_data, redcap_metadata, and redcap_events columns of that supertibble.

Usage

make_labelled(supertbl, format_labels = NULL)

Arguments

supertbl

a supertibble generated using read_redcap()

format_labels

one or multiple optional label formatting functions. A label formatting function is a function that takes a character vector and returns a modified character vector of the same length. This function is applied to field labels before attaching them to variables. One of:

  • NULL to apply no additional formatting. Default.

  • A label formatting function.

  • A character with the name of a label formatting function.

  • A vector or list of label formatting functions or function names to be applied in order. Note that ordering may affect results.

Details

The variable labels for the data tibbles are derived from the field_label column of the metadata tibble.

Value

A labelled supertibble.

Examples

superheroes_supertbl

make_labelled(superheroes_supertbl)

make_labelled(superheroes_supertbl, format_labels = tolower)

## Not run: 
redcap_uri <- Sys.getenv("REDCAP_URI")
token <- Sys.getenv("REDCAP_TOKEN")

supertbl <- read_redcap(redcap_uri, token)
make_labelled(supertbl)

## End(Not run)
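The note on format_labels that ordering may affect results can be seen in a small base R sketch. The formatters strip_colon() and to_sentence_case() and the helper apply_in_order() are hypothetical stand-ins written for this illustration; they are not package functions, and the package may compose formatters differently.

```r
# Two hypothetical label formatters: character vector in, same length out.
strip_colon <- function(x) sub(":$", "", x)
to_sentence_case <- function(x) {
  paste0(toupper(substring(x, 1, 1)), substring(x, 2))
}

# Apply a list of formatters one after another, left to right.
apply_in_order <- function(x, formatters) {
  Reduce(function(labels, f) f(labels), formatters, init = x)
}

labels <- c("age at enrollment:", "weight (kg): ")

# Trimming first exposes the trailing colon of the second label ...
trim_first <- apply_in_order(labels, list(trimws, strip_colon, to_sentence_case))

# ... whereas stripping the colon first misses it, because of the trailing space.
colon_first <- apply_in_order(labels, list(strip_colon, trimws, to_sentence_case))

trim_first
colon_first
```

With these inputs, trim_first yields "Weight (kg)" for the second label while colon_first leaves "Weight (kg):".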

Make skimr labels from default skimr outputs

Description

A simple helper function that returns all default skimr names as a formatted character vector for use in make_labelled().

Usage

make_skimr_labels()

Details

All labels supplied are manually created and agreed upon as human-readable

Value

A character vector


Update multiple choice fields with label data

Description

Update REDCap variables with multi-choice types to standard form labels taken from REDCap metadata.

Usage

multi_choice_to_labels(
  db_data,
  db_metadata,
  raw_or_label = "label",
  call = caller_env()
)

Arguments

db_data

A REDCap database object

db_metadata

A REDCap metadata object

raw_or_label

A string (either 'raw', 'label', or 'haven') that specifies whether to export the raw coded values or the labels for the options of categorical fields. Default is 'label'. If 'haven' is supplied, categorical fields are converted to haven_labelled vectors.

call

call for conditions

Details

Coerce variables of field_type "truefalse", "yesno", and "checkbox" to logical. Introduce form_status_complete column and append to end of tibble outputs. Ensure field_types "dropdown" and "radio" are converted appropriately since label appendings are important and unique to these.


Parse labels from REDCap metadata into usable formats

Description

Takes a string separated by ,s and/or |s (i.e. comma- and/or pipe-separated values) containing key-value pairs (raw and label) and returns a tidy tibble.

Usage

parse_labels(string, return_vector = FALSE, return_stripped_text_flag = FALSE)

Arguments

string

A db_metadata$select_choices_or_calculations field pre-filtered for checkbox field_type

return_vector

logical for whether to return result as a vector

return_stripped_text_flag

logical for whether to return a flag indicating whether or not text was stripped from labels

Details

The associated string comes from metadata outputs.

Value

If return_vector = FALSE (the default), a tidy tibble, built from a matrix, giving raw and label outputs to be used in later functions. Otherwise, a vector in c(raw = label) format for use with dplyr::recode.
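To make the input format concrete, here is a base R sketch that parses a choice string of the form "raw, label | raw, label". The helper parse_choices() is hypothetical and returns a plain data frame rather than a tibble; it does not reproduce the package's handling of malformed strings or of stripped text.

```r
# Hypothetical parser for REDCap-style choice strings.
parse_choices <- function(string) {
  # Split into "raw, label" pairs on the pipe character
  pairs <- trimws(strsplit(string, "|", fixed = TRUE)[[1]])
  # Split each pair on its first comma only, so labels may contain commas
  comma_pos <- regexpr(",", pairs, fixed = TRUE)
  data.frame(
    raw = trimws(substr(pairs, 1, comma_pos - 1)),
    label = trimws(substr(pairs, comma_pos + 1, nchar(pairs))),
    stringsAsFactors = FALSE
  )
}

choices <- parse_choices("1, Red | 2, Green | 3, Blue, navy")
choices

# The same information as a named vector in c(raw = label) form
choice_vec <- stats::setNames(choices$label, choices$raw)
choice_vec
```

Splitting on the first comma only is a design choice of this sketch: it keeps a label such as "Blue, navy" intact.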


Convert yesno, truefalse, and checkbox fields to logical

Description

Convert yesno, truefalse, and checkbox fields to logical

Usage

parse_logical_cols(db_data, db_metadata, call = caller_env())

Arguments

db_data

A REDCap database object

db_metadata

A REDCap metadata object

call

call for conditions
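As an illustration of the conversion, the base R sketch below coerces columns whose metadata field_type is yesno, truefalse, or checkbox to logical. It assumes raw values coded as 0/1 and metadata field names that match the data column names exactly; to_logical_cols() and the toy objects are hypothetical, and the package function additionally reports conditions through call.

```r
# Hypothetical helper: coerce 0/1-coded fields to logical based on metadata.
to_logical_cols <- function(db_data, db_metadata) {
  logical_types <- c("yesno", "truefalse", "checkbox")
  fields <- db_metadata$field_name[db_metadata$field_type %in% logical_types]
  # Only touch fields that are actually present in the data
  for (f in intersect(fields, names(db_data))) {
    db_data[[f]] <- as.logical(as.integer(db_data[[f]]))
  }
  db_data
}

toy_data <- data.frame(
  record_id = 1:3,
  smoker = c(1, 0, NA),
  consent = c("1", "1", "0"),
  stringsAsFactors = FALSE
)
toy_metadata <- data.frame(
  field_name = c("record_id", "smoker", "consent"),
  field_type = c("text", "yesno", "truefalse"),
  stringsAsFactors = FALSE
)

converted <- to_logical_cols(toy_data, toy_metadata)
str(converted)
```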


Import a REDCap database into a tidy supertibble

Description

Query the REDCap API to retrieve data and metadata about a project, and transform the output into a "supertibble" that contains data and metadata organized into tibbles, broken down by instrument.

Usage

read_redcap(
  redcap_uri,
  token,
  raw_or_label = "label",
  forms = NULL,
  export_survey_fields = NULL,
  export_data_access_groups = NULL,
  suppress_redcapr_messages = TRUE,
  guess_max = Inf,
  allow_mixed_structure = getOption("redcaptidier.allow.mixed.structure", FALSE)
)

Arguments

redcap_uri

The URI/URL of the REDCap server (e.g., "https://server.org/apps/redcap/api/"). Required.

token

The user-specific string that serves as the password for a project. Required.

raw_or_label

A string (either 'raw', 'label', or 'haven') that specifies whether to export the raw coded values or the labels for the options of categorical fields. Default is 'label'. If 'haven' is supplied, categorical fields are converted to haven_labelled vectors.

forms

A character vector of REDCap instrument names that specifies which instruments to import. Default is NULL which imports all instruments in the project.

export_survey_fields

A logical that specifies whether to export survey identifier and timestamp fields. The default, NULL, tries to determine if survey fields exist and returns them if available.

export_data_access_groups

A logical that specifies whether to export the data access group field. The default, NULL, tries to determine if a data access group field exists and returns it if available.

suppress_redcapr_messages

A logical to control whether to suppress messages from REDCapR API calls. Default TRUE.

guess_max

A positive base::numeric value passed to readr::read_csv() that specifies the maximum number of records to use for guessing column types. Default Inf.

allow_mixed_structure

A logical to allow for support of mixed repeating/non-repeating instruments. Setting to TRUE will treat the mixed instrument's non-repeating versions as repeating instruments with a single instance. Applies to longitudinal projects only. Default FALSE. Can be set globally with options(redcaptidier.allow.mixed.structure = TRUE).

Details

This function uses the REDCapR package to query the REDCap API. The REDCap API returns a block matrix that mashes data from all data collection instruments together. The read_redcap() function deconstructs the block matrix and splices the data into individual tibbles, where one tibble represents the data from one instrument.
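The splitting step can be pictured with a toy block matrix in base R. This is only a sketch of the idea: the field-to-instrument mapping is hard-coded here, the instrument and field names are made up, and read_redcap() derives all of this from project metadata instead.

```r
# Toy "block matrix": one wide table mixing fields from two instruments.
block <- data.frame(
  record_id = c(1, 1, 1, 2),
  redcap_repeat_instrument = c(NA, "visits", "visits", NA),
  redcap_repeat_instance = c(NA, 1, 2, NA),
  age = c(34, NA, NA, 51),                 # belongs to a non-repeating form
  visit_weight = c(NA, 70.2, 69.8, NA),    # belongs to the repeating form
  stringsAsFactors = FALSE
)

# Non-repeating instrument: rows without a repeat instrument, its own fields only
demographics <- block[is.na(block$redcap_repeat_instrument),
                      c("record_id", "age")]

# Repeating instrument: rows tagged "visits", keyed by record and instance
visits <- block[which(block$redcap_repeat_instrument == "visits"),
                c("record_id", "redcap_repeat_instance", "visit_weight")]

demographics
visits
```

Each resulting table has one row per record (or per record and instance) and no columns that are structurally empty for that instrument.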

Value

A tibble in which each row represents a REDCap instrument. It contains the following columns:

Examples

## Not run: 
redcap_uri <- Sys.getenv("REDCAP_URI")
token <- Sys.getenv("REDCAP_TOKEN")

read_redcap(
  redcap_uri,
  token,
  raw_or_label = "label"
)

## End(Not run)


Additional release questions

Description

Additional release questions to be added when using devtools::release() during CRAN submissions.

Usage

release_questions()

Details

This follows the documentation provided in devtools::release().

Value

A series of character string questions


Remove rows with empty data

Description

Remove rows that are empty in all associated data columns (those derived from fields in REDCap). Such rows arise when one form in an event is filled out but other forms are not: regardless of a form's status, all forms in an event are included in the output so long as any form in the event contains data.

This only applies to longitudinal REDCap databases containing events.

Usage

remove_empty_rows(data, my_record_id)

Arguments

data

A REDCap dataframe from a longitudinal database, pre-processed within a distill_* function.

my_record_id

The record ID defined in the project.

Value

A dataframe.
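The filtering rule can be sketched in base R as below. The helper drop_empty_rows() is hypothetical and takes an explicit vector of identifier columns, which differs from the my_record_id argument documented above; it also treats only NA as empty, which is an assumption of this sketch.

```r
# Hypothetical helper: drop rows in which every non-identifier column is NA.
drop_empty_rows <- function(data, id_cols) {
  data_cols <- setdiff(names(data), id_cols)
  all_empty <- rowSums(!is.na(data[data_cols])) == 0
  data[!all_empty, , drop = FALSE]
}

toy <- data.frame(
  record_id = c(1, 1, 2),
  redcap_event = c("baseline", "followup", "baseline"),
  weight = c(70, NA, 82),
  height = c(175, NA, NA),
  stringsAsFactors = FALSE
)

# The follow-up row for record 1 has no data for this form and is removed;
# record 2 is kept because at least one data column is filled in.
kept <- drop_empty_rows(toy, id_cols = c("record_id", "redcap_event"))
kept
```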


Replace checkbox TRUEs with raw_or_label values

Description

Simple utility function for replacing checkbox field values.

Usage

replace_true(col, col_name, metadata, raw_or_label)

Arguments

col

A vector

col_name

A string

metadata

A metadata tibble from the original supertibble

raw_or_label

Either 'raw' or 'label' to specify whether to use raw coded values or labels for the options. Default 'label'.

Value

A character string


Convert user input into label formatting function

Description

Convert user input into label formatting function

Usage

resolve_formatter(format_labels, env = caller_env(n = 2), call = caller_env())

Arguments

format_labels

argument passed to make_labelled

env

the environment in which to look up functions if format_labels contains character elements. The default, caller_env(n = 2), uses the environment from which the user called make_labelled()

call

the calling environment to use in the error message

Value

a function


Safely set variable labels

Description

A utility function for setting labels of a tibble from a named vector while accounting for labels that may not be present in the data.

Usage

safe_set_variable_labels(data, labs)

Value

A tibble


Apply applicable skimmers to data

Description

A helper function for add_skimr_metadata() which applies applicable skimmers to a given dataframe.

Usage

skim_data(redcap_data, redcap_metadata, is_labelled)

Value

A dataframe


Remove html tags and field embedding logic from a string

Description

Remove html tags and field embedding logic from a string

Usage

strip_html_field_embedding(x)

Arguments

x

vector of strings to format

Value

vector of strings with html tags, field embedding logic, and extra whitespace removed
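A rough regular-expression sketch of this kind of cleanup, in base R. The helper strip_tags_and_embedding() and its patterns are assumptions made for illustration; the package's own patterns may differ, for instance in how they treat nested or malformed markup.

```r
# Hypothetical cleanup helper using simple regular expressions.
strip_tags_and_embedding <- function(x) {
  x <- gsub("<[^>]+>", "", x, perl = TRUE)       # drop HTML tags such as <b>, </b>
  x <- gsub("\\{[^}]+\\}", "", x, perl = TRUE)   # drop embedding such as {other_field}
  x <- gsub("\\s+", " ", x, perl = TRUE)         # collapse runs of whitespace
  trimws(x)
}

cleaned <- strip_tags_and_embedding("<b>Weight</b>  (kg) {weight_units}")
cleaned
# "Weight (kg)"
```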


Superheroes Data

Description

A dataset of superheroes in a REDCapTidieR supertbl object

Usage

superheroes_supertbl

Format

heroes_information

A tibble with 734 rows and 12 columns:

record_id

REDCap record ID

name

Hero name

gender

Gender

eye_color

Eye color

race

Race

hair_color

Hair color

height

Height

weight

Weight

publisher

Publisher

skin_color

Skin color

alignment

Alignment

form_status_complete

REDCap instrument completed?

super_hero_powers

A tibble with 5,966 rows and 4 columns:

record_id

REDCap record ID

redcap_form_instance

REDCap repeat instance

power

Super power

form_status_complete

REDCap instrument completed?

Source

https://www.superherodb.com/


Recode fields using supertbl metadata

Description

This utility function helps to map metadata field types in order to apply changes in supertbl tables.

Usage

supertbl_recode(supertbl, supertbl_meta, add_labelled_column_headers)

Arguments

supertbl

A supertibble generated using read_redcap()

supertbl_meta

an unnest-ed metadata tibble from the supertibble

add_labelled_column_headers

Whether or not to include labelled outputs


Provide a succinct summary of an object

Description

tbl_sum() gives a brief textual description of a table-like object, which should include the dimensions and the data source in the first element, and additional information in the other elements (such as grouping for dplyr). The default implementation forwards to obj_sum().

Usage

## S3 method for class 'redcap_supertbl'
tbl_sum(x)

Arguments

x

Object to summarise.

Value

A named character vector, describing the dimensions in the first element and the data source in the name of the first element.


Make a REDCapR API call with custom error handling

Description

Make a REDCapR API call with custom error handling

Usage

try_redcapr(expr, call = caller_env())

Arguments

expr

an expression making a REDCapR API call

call

the calling environment to use in the warning message

Value

If successful, the data element of the REDCapR result. Otherwise an error


Implement REDCapR DAG Data into Supertibble

Description

This helper function uses output from REDCapR::redcap_dag_read and applies the necessary raw/label values to the redcap_data_access_group column.

This is done because REDCapTidieR retrieves raw data by default, then merges labels from the metadata. However, some columns like redcap_data_access_group are not in the metadata and so there is nothing by default to reference.

Usage

update_dag_cols(data, dag_data, raw_or_label)

Arguments

data

the REDCap data

dag_data

a DAG dataset exported from REDCapR::redcap_dag_read

raw_or_label

A string (either 'raw', 'label', or 'haven') that specifies whether to export the raw coded values or the labels for the options of categorical fields. Default is 'label'. If 'haven' is supplied, categorical fields are converted to haven_labelled vectors.


Correctly label variables belonging to checkboxes with minus signs

Description

Using db_data and db_metadata, temporarily create a conversion column that reverts automatic REDCap behavior where database column names have "-"s converted to "_"s.

Usage

update_data_col_names(db_data, db_metadata)

Arguments

db_data

The REDCap database output defined by REDCapR::redcap_read_oneshot()$data

db_metadata

The REDCap metadata output defined by REDCapR::redcap_metadata_read()$data

Details

This is an issue with checkbox fields since analysts should be able to verify checkbox variable suffixes with their label counterparts.

Value

Updated db_data column names for checkboxes where "-"s were replaced by "_"s.


Update metadata field names for checkbox handling

Description

Takes a db_metadata object and:

Usage

update_field_names(db_metadata)

Arguments

db_metadata

The REDCap metadata output defined by REDCapR::redcap_metadata_read()$data

Details

Assumes db_metadata:

Value

db_metadata with a field_name_updated column appended and field_label updated for new rows corresponding to checkbox options


Vector type as a string

Description

vec_ptype_full() displays the full type of the vector. vec_ptype_abbr() provides an abbreviated summary suitable for use in a column heading.

Usage

## S3 method for class 'redcap_supertbl'
vec_ptype_abbr(x, ..., prefix_named, suffix_shape)

Arguments

x

A vector.

...

These dots are for future extensions and must be empty.

prefix_named

If TRUE, add a prefix for named vectors.

suffix_shape

If TRUE (the default), append the shape of the vector.

Value

A string.


Write Supertibbles to XLSX

Description

Transform a supertibble into an XLSX file, with each REDCap data tibble in a separate sheet.

Usage

write_redcap_xlsx(
  supertbl,
  file,
  add_labelled_column_headers = NULL,
  use_labels_for_sheet_names = TRUE,
  include_toc_sheet = TRUE,
  include_metadata_sheet = TRUE,
  table_style = "tableStyleLight8",
  column_width = "auto",
  recode_logical = TRUE,
  na_replace = "",
  overwrite = FALSE
)

Arguments

supertbl

A supertibble generated using read_redcap().

file

The name of the file to which the output will be written.

add_labelled_column_headers

If TRUE, the first row of each sheet will contain variable labels, with variable names in the second row. If FALSE, variable names will be in the first row. The default value, NULL, tries to determine if supertbl contains variable labels and, if present, includes them in the first row. The labelled package must be installed if add_labelled_column_headers is TRUE.

use_labels_for_sheet_names

If FALSE, sheet names will come from the REDCap instrument names. If TRUE, sheet names will come from instrument labels. The default is TRUE.

include_toc_sheet

If TRUE, the first sheet in the XLSX output will be a table of contents, providing information about each data tibble in the workbook. The default is TRUE.

include_metadata_sheet

If TRUE, the final sheet in the XLSX output will contain metadata about each variable, combining the content of supertbl$redcap_metadata. The default is TRUE.

table_style

Any Excel table style name or "none". For more details, see the "formatting" vignette of the openxlsx package. The default is "tableStyleLight8".

column_width

Sets the width of columns throughout the workbook. The default is "auto", but you can specify a numeric value.

recode_logical

If TRUE, fields with "yesno" field type are recoded to "yes"/"no" and fields with a "checkbox" field type are recoded to "Checked"/"Unchecked". The default is TRUE.

na_replace

The value used to replace NA values in supertbl. The default is "".

overwrite

If FALSE, will not overwrite file when it exists. The default is FALSE.

Value

An openxlsx2 workbook object, invisibly

Examples

## Not run: 
redcap_uri <- Sys.getenv("REDCAP_URI")
token <- Sys.getenv("REDCAP_TOKEN")

supertbl <- read_redcap(redcap_uri, token)

supertbl %>%
  write_redcap_xlsx(file = "supertibble.xlsx")

# Add variable labels

library(labelled)

supertbl %>%
  make_labelled() %>%
  write_redcap_xlsx(file = "supertibble.xlsx", add_labelled_column_headers = TRUE)

## End(Not run)