Selector that wraps two other Selectors given during construction and uses both for selection proportionally. Each of the resulting n_select individuals is chosen either from $selector, or from $selector_not.

This makes it possible to implement selection methods such as random interleaving, where only a fraction p of individuals is selected by a criterion, while the others are taken randomly.

Algorithm

To perform selection, n_selector_in rows of values are given to $selector, and the remaining nrow(values) - n_selector_in rows are given to $selector_not. Both selectors are used to generate a subset of selected individuals: $selector generates n_selector_out individuals, and $selector_not generates n_select - n_selector_out individuals.

n_selector_in is either set to round(nrow(values) * p_in) when proportion_in is "exact", or to rbinom(1, nrow(values), p_in) when proportion_in is "random".

n_selector_out is set to round(n_select * p_out) when proportion_out is "exact", or to rbinom(1, n_select, p_out) when proportion_out is "random".
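
The two rules above can be sketched in base R. This is only an illustration of the arithmetic, not the package's implementation; the helper name n_for_selector and the example sizes are assumptions made for this sketch.

```r
# Hypothetical helper (not part of the package): number of rows or outputs
# assigned to $selector, given a total count and a probability / fraction.
n_for_selector <- function(total, p, proportion = c("exact", "random")) {
  proportion <- match.arg(proportion)
  if (proportion == "exact") {
    round(total * p)      # deterministic share
  } else {
    rbinom(1, total, p)   # one binomial draw
  }
}

n_values <- 10  # stands in for nrow(values)
n_select <- 4

n_selector_in  <- n_for_selector(n_values, p = 0.3, proportion = "exact")
n_selector_out <- n_for_selector(n_select, p = 0.3, proportion = "exact")

# What is left over goes to, or is produced by, $selector_not.
n_not_in  <- n_values - n_selector_in
n_not_out <- n_select - n_selector_out
```

With these numbers, $selector receives 3 rows and produces 1 individual, and $selector_not receives 7 rows and produces 3. Note that round() in R rounds halves to the even digit, so round(5 * 0.5) is 2, not 3.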

When odds_correction is TRUE, then p_out is adjusted depending on the used n_selector_in value before being applied. Let odds(p) = p/(1-p). Then the effective p_out is set such that odds(effective p_out) = odds(p_out) * n_selector_in / (nrow(values) - n_selector_in) / odds(p_in). This corrects for the discrepancy between the chosen p_in and the effective proportion of n_selector_in / nrow(values) caused either by rounding errors or when proportion_in is "random".
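
As a worked example of the odds relation, the following base R sketch computes the effective p_out for one case. The helper names odds and effective_p_out are made up for this illustration, and the sketch does not handle the boundary cases where n_selector_in is 0 or nrow(values).

```r
odds <- function(p) p / (1 - p)

# Hypothetical helper: effective p_out derived from the odds relation above.
effective_p_out <- function(p_in, p_out, n_selector_in, n_values) {
  target_odds <- odds(p_out) * n_selector_in / (n_values - n_selector_in) / odds(p_in)
  target_odds / (1 + target_odds)  # invert odds(p) = p / (1 - p)
}

# p_in = 0.5 on 5 rows: round(5 * 0.5) gives 2 rows, so the realised
# input proportion is 2 / 5 instead of 0.5.
p_eff <- effective_p_out(p_in = 0.5, p_out = 0.5, n_selector_in = 2, n_values = 5)

# The same value from the closed form listed under odds_correction.
p_closed <- 1 / (1 + ((5 - 2) * 0.5 * (1 - 0.5)) / (2 * 0.5 * (1 - 0.5)))
stopifnot(isTRUE(all.equal(p_eff, p_closed)))
```

Here p_eff is 0.4, which equals 2 / 5: when p_out and p_in are equal, the correction makes the output proportion follow the realised input proportion.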

When p_in is exactly 1 or exactly 0, and p_out is not equal to p_in, then an error is given.

If nrow(values) is 1, then this individual is returned and $selector / $selector_not are not called.

If try_unique is TRUE, then n_selector_out is set to at most n_selector_in and at least n_select - nrow(values) + n_selector_in, and an error is generated when nrow(values) is less than n_select.
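
The bounds on n_selector_out can be written as a simple clamp. The following is a minimal base R sketch of that rule under the stated limits; clamp_n_selector_out is a hypothetical name, and the error message is illustrative only.

```r
# Hypothetical helper: restrict n_selector_out to the range described above.
clamp_n_selector_out <- function(n_selector_out, n_selector_in, n_values, n_select) {
  if (n_values < n_select) {
    stop("try_unique requires nrow(values) >= n_select")
  }
  # $selector_not must not be asked for more outputs than it has rows
  lower <- n_select - n_values + n_selector_in
  # $selector must not be asked for more outputs than it has rows
  upper <- n_selector_in
  min(max(n_selector_out, lower), upper)
}

# 10 rows, 6 to select, $selector gets only 2 rows but was asked for 4:
few_in <- clamp_n_selector_out(4, n_selector_in = 2, n_values = 10, n_select = 6)

# 10 rows, 6 to select, $selector gets 9 rows but was asked for only 3:
many_in <- clamp_n_selector_out(3, n_selector_in = 9, n_values = 10, n_select = 6)
```

In the first case the count is lowered to 2, so $selector_not produces 4 individuals from its 8 rows. In the second case it is raised to 5, so $selector_not produces 1 individual from its single row.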

If try_unique is FALSE and odds_correction is TRUE and n_selector_in is either 0 or nrow(values), then the effective p_out is set to either 0 or 1, respectively.

If try_unique is FALSE and odds_correction is FALSE and n_selector_in is either 0 or nrow(values), and n_selector_out is not equal to 0 or n_select, respectively, then n_selector_in is increased to 1 or decreased to nrow(values) - 1, respectively, so that the selector that has to produce output has at least one individual to choose from. While this behaviour may seem pathological, it ensures continuity with sampled values of n_selector_in that are close to 0 or nrow(values).
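
The boundary adjustment can be illustrated with a few lines of base R. This is a sketch of the rule as described, assuming nrow(values) is at least 2; adjust_n_selector_in is a hypothetical name, not a function of the package.

```r
# Hypothetical helper: move n_selector_in off the boundary when the selector
# that has to produce output would otherwise receive no rows.
adjust_n_selector_in <- function(n_selector_in, n_selector_out, n_values, n_select) {
  if (n_selector_in == 0 && n_selector_out != 0) {
    1               # $selector must produce output, so it gets one row
  } else if (n_selector_in == n_values && n_selector_out != n_select) {
    n_values - 1    # $selector_not must produce output, so it gets one row
  } else {
    n_selector_in   # nothing to adjust
  }
}

adjust_n_selector_in(0, 2, n_values = 10, n_select = 4)   # $selector had no rows
adjust_n_selector_in(10, 2, n_values = 10, n_select = 4)  # $selector_not had no rows
```

The first call returns 1 and the second returns 9. Without try_unique, a selector may then be asked to produce more individuals than it was given rows.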

If n_selector_out is n_select or 0, or if n_selector_in is nrow(values) - 1 or 1, then only $selector / $selector_not is executed, respectively; possibly with a subset of values if n_selector_in differs from nrow(values) / 0.

Configuration Parameters

This operator has the configuration parameters of the Selectors that it wraps: The configuration parameters of the operator given to the selector construction argument are prefixed with "maybe.", the configuration parameters of the operator given to the selector_not construction argument are prefixed with "maybe_not.".

Additional configuration parameters:

  • p_in :: numeric(1)
    Probability per individual (when proportion_in is "random"), or fraction of individuals (when proportion_in is "exact"), that are given to $selector instead of $selector_not. This may be overridden when try_unique is TRUE, in which case at least as many rows are given to $selector and $selector_not as they are generating output values respectively. When this is exactly 1 or exactly 0, then p_out must be equal to p_in. Must be set by the user.

  • p_out :: numeric(1)
    Probability per output value (when proportion_out is "random"), or fraction of output values (when proportion_out is "exact"), that are generated by $selector instead of $selector_not. When this value is not given, it defaults to p_in.

  • shuffle_input :: logical(1)
    Whether to distribute input values randomly to $selector / $selector_not. If FALSE, then the first part of values is given to $selector. This only randomizes which lines of values are given to $selector / $selector_not, but it does not necessarily reorder the lines of values given to each. In particular, if p_out is 0 or 1, then no shuffling takes place. Initialized to TRUE.

  • proportion_in :: character(1)
    When set to "random", sample the number of individuals given to $selector according to rbinom(1, nrow(values), p_in). When set to "exact", give $selector round(nrow(values) * p_in) individuals. Initialized to "exact".

  • proportion_out :: character(1)
    When set to "random", sample the number of individuals generated by $selector according to rbinom(1, n_select, p_out). When set to "exact", have $selector generate round(n_select * p_out) individuals.

  • odds_correction :: logical(1)
    When set, the effectively used value of p_out is set to 1 / (1 + ((nrow(values) - n_selector_in) * p_in * (1 - p_out)) / (n_selector_in * p_out * (1 - p_in))), see the Algorithm section. Initialized to FALSE.

  • try_unique :: logical(1)
    Whether to give at least as many rows of values to each of $selector and $selector_not as they are generating output values. This should be set to TRUE whenever SelectorMaybe is used to select unique values, and can be set to FALSE when selecting values multiple times is acceptable. When this is TRUE, then having n_select > nrow(values) generates an error. Initialized to TRUE.
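
To illustrate what shuffle_input controls, the following base R sketch splits row indices between the two selectors. It is an assumption-laden illustration, not the package's code: split_rows is a hypothetical helper, and keeping the original row order within each group is one reading of the description above.

```r
# Hypothetical helper: which row indices go to $selector and $selector_not.
split_rows <- function(n_values, n_selector_in, shuffle_input = TRUE) {
  if (shuffle_input) {
    # random membership, original row order kept within the group
    to_selector <- sort(sample.int(n_values, n_selector_in))
  } else {
    # the first part of values goes to $selector
    to_selector <- seq_len(n_selector_in)
  }
  list(
    selector = to_selector,
    selector_not = setdiff(seq_len(n_values), to_selector)
  )
}

fixed <- split_rows(10, 3, shuffle_input = FALSE)
set.seed(1)
shuffled <- split_rows(10, 3, shuffle_input = TRUE)
```

With shuffle_input = FALSE, rows 1 to 3 go to $selector and rows 4 to 10 to $selector_not. With shuffle_input = TRUE, the three rows for $selector are drawn at random, and every row still ends up in exactly one of the two groups.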

Supported Operand Types

Supported Param classes are the set intersection of supported classes of selector and selector_not.

Dictionary

This Selector can be created with the short access form sel() (sels() to get a list), or through the dictionary dict_selectors in the following way:

# preferred:
sel("maybe", <selector> [, <selector_not>])
sels("maybe", <selector> [, <selector_not>])  # takes vector IDs, returns list of Selectors

# long form:
dict_selectors$get("maybe", <selector> [, <selector_not>])

Super classes

miesmuschel::MiesOperator -> miesmuschel::Selector -> SelectorMaybe

Active bindings

selector

(Selector)
Selector being wrapped. This operator gets run with probability / proportion p_in and generates output with probability / proportion p_out (configuration parameters).

selector_not

(Selector)
Alternative Selector being wrapped. This operator gets run with probability / proportion 1 - p_in and generates output with probability / proportion 1 - p_out (configuration parameters).

Methods


Method new()

Initialize the SelectorMaybe object.

Usage

SelectorMaybe$new(selector, selector_not = SelectorRandom$new())

Arguments

selector

(Selector)
Selector to wrap. This operator gets run with probability / fraction p_in (Configuration parameter).
The constructed object gets a clone of this argument. The $selector field will reflect this value.

selector_not

(Selector)
Another Selector to wrap. This operator runs when selector is not chosen. By default, this is SelectorRandom, i.e. selecting randomly.
The constructed object gets a clone of this argument. The $selector_not field will reflect this value.


Method prime()

See MiesOperator method. Primes both this operator, as well as the wrapped operators given to selector and selector_not during construction.

Usage

SelectorMaybe$prime(param_set)

Arguments

param_set

(ParamSet)
Passed to MiesOperator$prime().

Returns

invisible self.


Method clone()

The objects of this class are cloneable with this method.

Usage

SelectorMaybe$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.