Sanitize to adhere to REDCap character encoding requirements
Source: R/redcap-column-sanitize.R, redcap_column_sanitize.Rd
Replace non-ASCII characters with legal characters that won't cause problems when writing to a REDCap project.
Usage
redcap_column_sanitize(
  d,
  column_names = colnames(d),
  encoding_initial = "latin1",
  substitution_character = "?"
)
Arguments
- d
The base::data.frame() or tibble::tibble() containing the dataset used to update the REDCap project. Required.
- column_names
An array of character values indicating the names of the variables to sanitize. Optional.
- encoding_initial
A character value giving the encoding of the input text; the default is "latin1". Optional.
- substitution_character
The character value that replaces any character that could not be matched appropriately.
Details
Letters like an accented 'A' are replaced with a plain 'A'.
This is a thin wrapper around base::iconv().
The ASCII//TRANSLIT option does the actual transliteration work. As of
R 3.1.0, operating systems use similar, but not identical, implementations
to convert the characters. Keep this in mind if you notice OS-dependent
differences in the sanitized output.
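The transliteration step described above can be sketched with base R alone. This is a minimal sketch and not the package's actual source: it assumes the wrapper calls base::iconv() once per selected column, passing encoding_initial as `from` and substitution_character as `sub`. The helper name `sanitize_sketch` is hypothetical, and skipping non-character columns is a choice made for this sketch only.

```r
# Hypothetical helper: an assumption about how a thin iconv() wrapper might look.
sanitize_sketch <- function(d,
                            column_names = colnames(d),
                            encoding_initial = "latin1",
                            substitution_character = "?") {
  for (column in column_names) {
    # Sketch-only choice: leave non-character columns untouched.
    if (is.character(d[[column]])) {
      d[[column]] <- base::iconv(
        x    = d[[column]],
        from = encoding_initial,
        to   = "ASCII//TRANSLIT",       # transliterate to the closest ASCII form
        sub  = substitution_character   # used for bytes that cannot be converted
      )
    }
  }
  d
}

dirty <- data.frame(
  id    = 1:3,
  names = c("Ekstr\xf8m", "J\xf6reskog", "bi\xdfchen Z\xfcrcher"),
  stringsAsFactors = FALSE
)

# The exact replacement characters depend on the operating system's iconv.
clean <- sanitize_sketch(dirty)
```

Because the transliteration table belongs to the platform's iconv, the same input can produce slightly different ASCII text on different systems.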
Examples
# Typical examples are not shown because they require non-ASCII encoding,
# which makes the package documentation less portable.
dirty <- data.frame(
  id = 1:3,
  names = c("Ekstr\xf8m", "J\xf6reskog", "bi\xdfchen Z\xfcrcher")
)
REDCapR::redcap_column_sanitize(dirty)
#>   id             names
#> 1  1           Ekstrom
#> 2  2         J"oreskog
#> 3  3 bisschen Z"urcher
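Before sanitizing, it can help to see which columns actually contain non-ASCII bytes, for example to pass a narrower column_names. The following sketch uses only base R for the detection step. The helper name `has_non_ascii` is hypothetical and not part of REDCapR, and the final call is guarded so the sketch still runs when REDCapR is not installed.

```r
# Hypothetical helper: TRUE when any element of a character vector
# contains a byte outside the 7-bit ASCII range.
has_non_ascii <- function(x) {
  if (!is.character(x)) {
    return(FALSE)
  }
  any(vapply(
    x[!is.na(x)],
    function(s) any(as.integer(charToRaw(s)) > 127L),
    logical(1),
    USE.NAMES = FALSE
  ))
}

dirty <- data.frame(
  id    = 1:3,
  names = c("Ekstr\xf8m", "J\xf6reskog", "bi\xdfchen Z\xfcrcher"),
  stringsAsFactors = FALSE
)

# Names of the columns that contain at least one non-ASCII byte.
columns_to_sanitize <- names(dirty)[vapply(dirty, has_non_ascii, logical(1))]

# Sanitize only those columns, if the package is available.
if (requireNamespace("REDCapR", quietly = TRUE)) {
  clean <- REDCapR::redcap_column_sanitize(
    dirty,
    column_names = columns_to_sanitize
  )
}
```

Restricting column_names this way leaves the other columns exactly as they were.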