Convert the selected columns of the data frame into either dummy logical columns or membership degrees of fuzzy sets, while leaving the remaining columns untouched. Each column selected for transformation typically yields multiple columns in the output.
Usage
partition(
.data,
.what = everything(),
...,
.breaks = NULL,
.labels = NULL,
.na = TRUE,
.keep = FALSE,
.method = "crisp",
.right = TRUE
)
Arguments
- .data
the data frame to be processed
- .what
a tidyselect expression (see tidyselect syntax) specifying the columns to be transformed
- ...
optional other tidyselect expressions selecting additional columns to be processed
- .breaks
for numeric columns, this has to be either an integer scalar or a numeric vector. If .breaks is an integer scalar, it specifies the number of resulting intervals to break the numeric column into (for .method="crisp") or the number of target fuzzy sets (for .method="triangle" or .method="raisedcos"). If .breaks is a vector, the values specify the borders of intervals (for .method="crisp") or the breaking points of fuzzy sets.
- .labels
character vector specifying the names used to construct the newly created column names. If NULL, the labels are generated automatically.
- .na
if TRUE, an additional logical column is created for each source column that contains NA values. For a column named x, the newly created column's name will be x=NA.
- .keep
if TRUE, the original columns being transformed remain present in the resulting data frame.
- .method
the method of transformation for numeric columns. Either "crisp", "triangle", or "raisedcos" is required.
- .right
if .method="crisp", this argument specifies whether the intervals should be closed on the right (and open on the left) or vice versa.
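The effect of a break vector and of the closed side of the intervals can be illustrated with base R's cut(). This is only a sketch under the assumption that the crisp method behaves analogously; the input values are made up, and the interval labels produced by cut() differ from the column names this function generates.

```r
# Illustrative values only; not taken from the documentation.
x <- c(1, 2, 3, 4, NA)

# A vector of breaks gives the interval borders directly.
# right = TRUE: intervals are (a, b]; right = FALSE: intervals are [a, b).
right_closed <- cut(x, breaks = c(-Inf, 2, Inf), right = TRUE)
left_closed  <- cut(x, breaks = c(-Inf, 2, Inf), right = FALSE)

# The border value 2 lands in a different interval depending on `right`.
as.integer(right_closed)  # interval index per value
as.integer(left_closed)

# One logical column per interval, plus an indicator of missing values
# (analogous to the x=NA column described for `.na`).
crisp <- sapply(levels(right_closed), function(l) right_closed == l)
x_is_na <- is.na(x)
```

With right-closed intervals the value 2 belongs to the first interval; with left-closed intervals it belongs to the second.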
Details
Transformations performed by this function are typically useful as a preprocessing step before using the dig() function or some of its derivatives (dig_correlations(), dig_paired_baseline_contrasts(), dig_associations()).
The transformation of selected columns differs based on the column type. Concretely:
- a logical column x is transformed into a pair of logical columns, x=TRUE and x=FALSE;
- a factor column x, which has levels l1, l2, and l3, is transformed into three logical columns named x=l1, x=l2, and x=l3;
- a numeric column x is transformed according to the .method argument:
  - if .method="crisp", the column is first transformed into a factor with intervals as factor levels and then processed as a factor (see above);
  - for other .method values (triangle or raisedcos), several new columns are created, where each column has numeric values from the interval \([0,1]\) and represents a certain fuzzy set (either triangular or raised-cosine).
Details of the transformation of numeric columns can be specified with additional arguments (.breaks, .labels, .right).
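The fuzzy transformation can be sketched in base R as follows. The helpers triangle_membership() and raisedcos_membership() are hypothetical and are not the package's implementation. The sketch assumes that a fuzzy set labelled (a;b;c) peaks at b and drops to 0 at a and c, that an infinite border yields a plateau at 1, and that the raised-cosine shape is the usual cosine smoothing of the triangle.

```r
# Hypothetical helper, not the package's implementation.
# Membership rises linearly from a to the peak b and falls linearly from b to c.
# An infinite a or c turns the corresponding side into a plateau at 1.
triangle_membership <- function(x, a, b, c) {
  left  <- if (is.infinite(a)) rep(1, length(x)) else (x - a) / (b - a)
  right <- if (is.infinite(c)) rep(1, length(x)) else (c - x) / (c - b)
  pmax(0, pmin(left, right, 1))
}

# Common raised-cosine smoothing of the triangular shape (assumed definition).
raisedcos_membership <- function(x, a, b, c) {
  0.5 * (1 - cos(pi * triangle_membership(x, a, b, c)))
}

# Distinct conc values of the CO2 dataset; 547.5 is assumed to be the
# unrounded middle break point that the Examples display as 548.
conc <- c(95, 175, 250, 350, 500, 675, 1000)
low    <- triangle_membership(conc, -Inf, 95, 547.5)
middle <- triangle_membership(conc, 95, 547.5, 1000)
round(low, 3)
round(middle, 3)
```

Under these assumptions, the rounded memberships agree with the first rows of the triangle example in the Examples section, which suggests (but does not prove) that an integer .breaks places the peaks at equal distances across the range of the column.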
Examples
# transform logical columns and factors
d <- data.frame(a = c(TRUE, TRUE, FALSE),
                b = factor(c("A", "B", "A")),
                c = c(1, 2, 3))
partition(d, a, b)
#> # A tibble: 3 × 5
#> c `a=T` `a=F` `b=A` `b=B`
#> <dbl> <lgl> <lgl> <lgl> <lgl>
#> 1 1 TRUE FALSE TRUE FALSE
#> 2 2 TRUE FALSE FALSE TRUE
#> 3 3 FALSE TRUE TRUE FALSE
# transform numeric columns to logical columns (crisp transformation)
partition(CO2, conc:uptake, .method = "crisp", .breaks = 3)
#> # A tibble: 84 × 9
#> Plant Type Treatment `conc=(-Inf;397]` `conc=(397;698]` `conc=(698;Inf]`
#> <ord> <fct> <fct> <lgl> <lgl> <lgl>
#> 1 Qn1 Quebec nonchilled TRUE FALSE FALSE
#> 2 Qn1 Quebec nonchilled TRUE FALSE FALSE
#> 3 Qn1 Quebec nonchilled TRUE FALSE FALSE
#> 4 Qn1 Quebec nonchilled TRUE FALSE FALSE
#> 5 Qn1 Quebec nonchilled FALSE TRUE FALSE
#> 6 Qn1 Quebec nonchilled FALSE TRUE FALSE
#> 7 Qn1 Quebec nonchilled FALSE FALSE TRUE
#> 8 Qn2 Quebec nonchilled TRUE FALSE FALSE
#> 9 Qn2 Quebec nonchilled TRUE FALSE FALSE
#> 10 Qn2 Quebec nonchilled TRUE FALSE FALSE
#> # ℹ 74 more rows
#> # ℹ 3 more variables: `uptake=(-Inf;20.3]` <lgl>, `uptake=(20.3;32.9]` <lgl>,
#> # `uptake=(32.9;Inf]` <lgl>
# transform numeric columns to fuzzy sets (triangle transformation)
partition(CO2, conc:uptake, .method = "triangle", .breaks = 3)
#> # A tibble: 84 × 9
#> Plant Type Treatment `conc=(-Inf;95;548)` `conc=(95;548;1000)`
#> <ord> <fct> <fct> <dbl> <dbl>
#> 1 Qn1 Quebec nonchilled 1 0
#> 2 Qn1 Quebec nonchilled 0.823 0.177
#> 3 Qn1 Quebec nonchilled 0.657 0.343
#> 4 Qn1 Quebec nonchilled 0.436 0.564
#> 5 Qn1 Quebec nonchilled 0.105 0.895
#> 6 Qn1 Quebec nonchilled 0 0.718
#> 7 Qn1 Quebec nonchilled 0 0
#> 8 Qn2 Quebec nonchilled 1 0
#> 9 Qn2 Quebec nonchilled 0.823 0.177
#> 10 Qn2 Quebec nonchilled 0.657 0.343
#> # ℹ 74 more rows
#> # ℹ 4 more variables: `conc=(548;1000;Inf)` <dbl>,
#> # `uptake=(-Inf;7.7;26.6)` <dbl>, `uptake=(7.7;26.6;45.5)` <dbl>,
#> # `uptake=(26.6;45.5;Inf)` <dbl>
# complex transformation with different settings for each column
CO2 |>
  partition(Plant:Treatment) |>
  partition(conc,
            .method = "raisedcos",
            .breaks = c(-Inf, 95, 175, 350, 675, 1000, Inf)) |>
  partition(uptake,
            .method = "triangle",
            .breaks = c(-Inf, 7.7, 28.3, 45.5, Inf),
            .labels = c("low", "medium", "high"))
#> # A tibble: 84 × 24
#> `Plant=Qn1` `Plant=Qn2` `Plant=Qn3` `Plant=Qc1` `Plant=Qc3` `Plant=Qc2`
#> <lgl> <lgl> <lgl> <lgl> <lgl> <lgl>
#> 1 TRUE FALSE FALSE FALSE FALSE FALSE
#> 2 TRUE FALSE FALSE FALSE FALSE FALSE
#> 3 TRUE FALSE FALSE FALSE FALSE FALSE
#> 4 TRUE FALSE FALSE FALSE FALSE FALSE
#> 5 TRUE FALSE FALSE FALSE FALSE FALSE
#> 6 TRUE FALSE FALSE FALSE FALSE FALSE
#> 7 TRUE FALSE FALSE FALSE FALSE FALSE
#> 8 FALSE TRUE FALSE FALSE FALSE FALSE
#> 9 FALSE TRUE FALSE FALSE FALSE FALSE
#> 10 FALSE TRUE FALSE FALSE FALSE FALSE
#> # ℹ 74 more rows
#> # ℹ 18 more variables: `Plant=Mn3` <lgl>, `Plant=Mn2` <lgl>, `Plant=Mn1` <lgl>,
#> # `Plant=Mc2` <lgl>, `Plant=Mc3` <lgl>, `Plant=Mc1` <lgl>,
#> # `Type=Quebec` <lgl>, `Type=Mississippi` <lgl>,
#> # `Treatment=nonchilled` <lgl>, `Treatment=chilled` <lgl>,
#> # `conc=(-Inf;95;175)` <dbl>, `conc=(95;175;350)` <dbl>,
#> # `conc=(175;350;675)` <dbl>, `conc=(350;675;1000)` <dbl>, …