Creates a dataset that helps batch long-running reads and writes
Source: R/create-batch-glossary.R
The function returns a base::data.frame() that other functions use to separate long-running read and write REDCap calls into multiple, smaller REDCap calls. The goal is to (1) reduce the chance of time-outs, and (2) introduce little breaks between batches so that the server isn't continually tied up.
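As a minimal sketch of the batching idea described above, the loop below walks the glossary one row at a time and pauses between batches. The helper fetch_rows() is hypothetical and only stands in for a single REDCap call; the half-second pause is likewise an arbitrary assumption, not a value taken from the package.

```r
# Hypothetical stand-in for one REDCap call; it simply returns the requested rows.
fetch_rows <- function(d, start, stop) {
  d[start:stop, , drop = FALSE]
}

d <- data.frame(record_id = 1:100, dv = rnorm(100))
glossary <- REDCapR::create_batch_glossary(nrow(d), batch_size = 40)

pieces <- vector("list", nrow(glossary))
for (i in seq_len(nrow(glossary))) {
  # Each glossary row gives the first and last row of one batch.
  pieces[[i]] <- fetch_rows(d, glossary$start_index[i], glossary$stop_index[i])

  # Pause between batches (not after the last one); 0.5 seconds is an assumed value.
  if (i < nrow(glossary)) Sys.sleep(0.5)
}

# Stack the batches back into a single data frame.
result <- do.call(rbind, pieces)
```

Each batch is a small, independent request, so a failure or time-out affects one slice of rows rather than the whole operation.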
Value
Currently, a base::data.frame() is returned with the following columns:
id: an integer that uniquely identifies the batch, starting at 1.
start_index: the index of the first row in the batch. integer.
stop_index: the index of the last row in the batch. integer.
index_pretty: a character representation of id, but padded with zeros.
start_index_pretty: a character representation of start_index, but padded with zeros.
stop_index_pretty: a character representation of stop_index, but padded with zeros.
label: a character concatenation of index_pretty, start_index_pretty, and stop_index_pretty.
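To make the column definitions concrete, here is a base-R sketch of how such values could be derived for 100 rows in batches of 40. It is an illustration under stated assumptions, not the package's implementation: the helper pad() and the object names are hypothetical, the sketch assumes at least one row, and label is left out because its separator is not described here.

```r
row_count  <- 100L
batch_size <- 40L

# First row of each batch.
start_index <- seq.int(from = 1L, to = row_count, by = batch_size)
# Last row of each batch, capped at the total row count.
stop_index  <- pmin(start_index + batch_size - 1L, row_count)
# Sequential batch identifier, starting at 1.
id          <- seq_along(start_index)

# Hypothetical helper: zero-pad integers to a fixed width.
pad <- function(x, width) formatC(x, width = width, flag = "0")
id_width    <- nchar(max(id))          # width of the largest batch id
index_width <- nchar(max(stop_index))  # width of the largest row index

sketch <- data.frame(
  id                 = id,
  start_index        = start_index,
  stop_index         = stop_index,
  index_pretty       = pad(id, id_width),
  start_index_pretty = pad(start_index, index_width),
  stop_index_pretty  = pad(stop_index, index_width)
)
```

The last batch is shorter than the others whenever the row count is not a multiple of the batch size.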
Details
This function can also assist splitting and saving a large data frame to disk as smaller files (such as a .csv). The padded columns allow the OS to sort the batches/files in sequential order.
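The splitting use case might look like the sketch below, which writes each batch of a data frame to its own .csv file. The output directory and the "batch-" file-name prefix are assumptions made for illustration.

```r
d <- data.frame(record_id = 1:100, dv = rnorm(100))
glossary <- REDCapR::create_batch_glossary(nrow(d), batch_size = 40)

# Hypothetical output location inside the session's temporary directory.
out_dir <- file.path(tempdir(), "batches")
dir.create(out_dir, showWarnings = FALSE)

for (i in seq_len(nrow(glossary))) {
  rows <- glossary$start_index[i]:glossary$stop_index[i]

  # Build the file name from the zero-padded batch number.
  path <- file.path(out_dir, paste0("batch-", glossary$index_pretty[i], ".csv"))
  utils::write.csv(d[rows, , drop = FALSE], path, row.names = FALSE)
}

# One file per batch, listed in sorted order.
files <- sort(list.files(out_dir, pattern = "^batch-.*\\.csv$"))
```

With ten or more batches the padding starts to matter: without it, a plain text sort would place batch 10 before batch 2.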
See also
See redcap_read() for a function that uses create_batch_glossary().
Examples
REDCapR::create_batch_glossary(100, 50)
#> # A tibble: 2 × 7
#>      id start_index stop_index index_pretty start_index_pretty stop_index_pretty
#>   <int>       <int>      <int> <chr>        <chr>              <chr>
#> 1     1           1         50 1            001                050
#> 2     2          51        100 2            051                100
#> # ℹ 1 more variable: label <chr>
REDCapR::create_batch_glossary(100, 25)
#> # A tibble: 4 × 7
#>      id start_index stop_index index_pretty start_index_pretty stop_index_pretty
#>   <int>       <int>      <int> <chr>        <chr>              <chr>
#> 1     1           1         25 1            001                025
#> 2     2          26         50 2            026                050
#> 3     3          51         75 3            051                075
#> 4     4          76        100 4            076                100
#> # ℹ 1 more variable: label <chr>
REDCapR::create_batch_glossary(100, 3)
#> # A tibble: 34 × 7
#>       id start_index stop_index index_pretty start_index_pretty
#>    <int>       <int>      <int> <chr>        <chr>
#>  1     1           1          3 01           001
#>  2     2           4          6 02           004
#>  3     3           7          9 03           007
#>  4     4          10         12 04           010
#>  5     5          13         15 05           013
#>  6     6          16         18 06           016
#>  7     7          19         21 07           019
#>  8     8          22         24 08           022
#>  9     9          25         27 09           025
#> 10    10          28         30 10           028
#> # ℹ 24 more rows
#> # ℹ 2 more variables: stop_index_pretty <chr>, label <chr>
REDCapR::create_batch_glossary( 0, 3)
#> # A tibble: 0 × 7
#> # ℹ 7 variables: id <int>, start_index <int>, stop_index <int>,
#> # index_pretty <chr>, start_index_pretty <chr>, stop_index_pretty <chr>,
#> # label <chr>
d <- data.frame(
  record_id = 1:100,
  iv        = sample(x=4, size=100, replace=TRUE),
  dv        = rnorm(n=100)
)
REDCapR::create_batch_glossary(nrow(d), batch_size=40)
#> # A tibble: 3 × 7
#>      id start_index stop_index index_pretty start_index_pretty stop_index_pretty
#>   <int>       <int>      <int> <chr>        <chr>              <chr>
#> 1     1           1         40 1            001                040
#> 2     2          41         80 2            041                080
#> 3     3          81        100 3            081                100
#> # ℹ 1 more variable: label <chr>