Sanitize to adhere to REDCap character encoding requirements
Source: R/redcap-column-sanitize.R
Replace non-ASCII characters with legal characters that won't cause problems when writing to a REDCap project.
Usage
redcap_column_sanitize(
  d,
  column_names = colnames(d),
  encoding_initial = "latin1",
  substitution_character = "?"
)
Arguments
- d
The base::data.frame() or tibble::tibble() containing the dataset used to update the REDCap project. Required.
- column_names
An array of character values indicating the names of the variables to sanitize. Optional; defaults to every column in d.
- encoding_initial
A character value naming the encoding the columns are currently in, before sanitizing. Optional; defaults to "latin1".
- substitution_character
The character value that replaces characters that could not be appropriately matched. Optional; defaults to "?".
Details
Letters like an accented 'A' are replaced with a plain 'A'.
This is a thin wrapper around base::iconv(). The ASCII//TRANSLIT option does the actual transliteration work. As of R 3.1.0, the OSes use similar, but different, versions to convert the characters. Be aware of this in case you notice OS-dependent differences.
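The wrapping described above can be sketched in base R. This is a minimal illustration, not the package's source: the helper name sanitize_columns_sketch is hypothetical, and the is.character() guard is an assumption added here so that non-text columns pass through unchanged. No expected output is shown because the transliteration differs between operating systems.

```r
# Hypothetical helper for illustration only; not REDCapR's implementation.
sanitize_columns_sketch <- function(
  d,
  column_names           = colnames(d),
  encoding_initial       = "latin1",
  substitution_character = "?"
) {
  for (column in column_names) {
    # Assumption of this sketch: only character columns are converted.
    if (is.character(d[[column]])) {
      d[[column]] <- base::iconv(
        x    = d[[column]],
        from = encoding_initial,       # encoding the text is currently in
        to   = "ASCII//TRANSLIT",      # transliterate to the closest ASCII
        sub  = substitution_character  # fallback for unconvertible input
      )
    }
  }
  d
}

dirty <- data.frame(
  id    = 1:3,
  names = c("Ekstr\xf8m", "J\xf6reskog", "bi\xdfchen Z\xfcrcher"),
  stringsAsFactors = FALSE
)

# Restrict the conversion to a single column through column_names.
clean <- sanitize_columns_sketch(dirty, column_names = "names")
```

Because the result depends on the platform's iconv implementation, checking that every byte is ASCII is a more portable test than comparing against a fixed string.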
Examples
# Typical examples are not shown because they require non-ASCII encoding,
# which makes the package documentation less portable.
dirty <- data.frame(
  id = 1:3,
  names = c("Ekstr\xf8m", "J\xf6reskog", "bi\xdfchen Z\xfcrcher")
)
REDCapR::redcap_column_sanitize(dirty)
#>   id             names
#> 1  1           Ekstrom
#> 2  2         J"oreskog
#> 3  3 bisschen Z"urcher