Extract features from a dataset

Create scalar-valued summary features for a dataset from feature functions.

features(.tbl, .var, features, ...)

features_at(.tbl, .vars, features, ...)

features_all(.tbl, features, ...)

features_if(.tbl, .predicate, features, ...)

Arguments

.tbl: A dataset
.var: An expression that produces a vector from which the features are computed.
features: A list of functions (or lambda expressions) for the features to compute. feature_set() is a useful helper for building sets of features.
...: Additional arguments to be passed to each feature. These arguments are only passed to features that name them in their formal arguments (base::formals()), and not to features that would only accept them via their own .... For example, passing na.rm = TRUE works for stats::var(), but not for base::mean(), whose formals are x and .... To pass inputs to each function more precisely, use lambdas in the list of features (~ mean(., na.rm = TRUE)).
.vars: A tidyselect compatible selection of the column(s) to compute features on.
.predicate: A predicate function (or lambda expression) to be applied to the columns, or a logical vector. The variables for which .predicate is or returns TRUE are selected.
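
The formals-based matching described for ... can be checked in base R without any feature packages. The sketch below only inspects function signatures; the helper name accepts_arg is hypothetical and is not part of any package.

```r
# Hypothetical helper (not part of any package): does `fn` list `arg`
# among its formal arguments?
accepts_arg <- function(fn, arg) {
  arg %in% names(formals(fn))
}

accepts_arg(stats::var, "na.rm")  # TRUE: var() declares na.rm explicitly
accepts_arg(base::mean, "na.rm")  # FALSE: mean()'s formals are only x and ...

# Wrapping the call fixes the argument regardless of the wrapped function's
# formals; the formula lambda ~ mean(., na.rm = TRUE) serves the same purpose.
mean_na_rm <- function(x) mean(x, na.rm = TRUE)
mean_na_rm(c(1, NA, 3))
```

This is why a wrapper or lambda is the more reliable way to set na.rm for mean(): the argument is fixed inside the function body instead of depending on how ... is forwarded.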

Details

Lists of available features can be found in the following pages:

Examples

# Provide a set of functions as a named list to features.
library(fabletools)  # provides features()
library(tsibble)     # provides the tourism dataset
tourism %>% 
  features(Trips, features = list(mean = mean, sd = sd))
#> # A tibble: 304 × 5
#>    Region         State              Purpose    mean    sd
#>    <chr>          <chr>              <chr>     <dbl> <dbl>
#>  1 Adelaide       South Australia    Business 156.   35.6 
#>  2 Adelaide       South Australia    Holiday  157.   27.1 
#>  3 Adelaide       South Australia    Other     56.6  17.3 
#>  4 Adelaide       South Australia    Visiting 205.   32.5 
#>  5 Adelaide Hills South Australia    Business   2.66  4.30
#>  6 Adelaide Hills South Australia    Holiday   10.5   6.37
#>  7 Adelaide Hills South Australia    Other      1.40  1.65
#>  8 Adelaide Hills South Australia    Visiting  14.2  10.7 
#>  9 Alice Springs  Northern Territory Business  14.6   7.20
#> 10 Alice Springs  Northern Territory Holiday   31.9  18.1 
#> # ℹ 294 more rows

# Search for and use relevant features with `feature_set()`.
library(feasts)
tourism %>% 
  features(Trips, features = feature_set(tags = "autocorrelation"))
#> # A tibble: 304 × 14
#>    Region         State Purpose     acf1 acf10 diff1_acf1 diff1_acf10 diff2_acf1
#>    <chr>          <chr> <chr>      <dbl> <dbl>      <dbl>       <dbl>      <dbl>
#>  1 Adelaide       Sout… Busine…  0.0333  0.131     -0.520       0.463     -0.676
#>  2 Adelaide       Sout… Holiday  0.0456  0.372     -0.343       0.614     -0.487
#>  3 Adelaide       Sout… Other    0.517   1.15      -0.409       0.383     -0.675
#>  4 Adelaide       Sout… Visiti…  0.0684  0.294     -0.394       0.452     -0.518
#>  5 Adelaide Hills Sout… Busine…  0.0709  0.134     -0.580       0.415     -0.750
#>  6 Adelaide Hills Sout… Holiday  0.131   0.313     -0.536       0.500     -0.716
#>  7 Adelaide Hills Sout… Other    0.261   0.330     -0.253       0.317     -0.457
#>  8 Adelaide Hills Sout… Visiti…  0.139   0.117     -0.472       0.239     -0.626
#>  9 Alice Springs  Nort… Busine…  0.217   0.367     -0.500       0.381     -0.658
#> 10 Alice Springs  Nort… Holiday -0.00660 2.11      -0.153       2.11      -0.274
#> # ℹ 294 more rows
#> # ℹ 6 more variables: diff2_acf10 <dbl>, season_acf1 <dbl>, pacf5 <dbl>,
#> #   diff1_pacf5 <dbl>, diff2_pacf5 <dbl>, season_pacf <dbl>

# Best practice is to use anonymous functions for additional arguments
tourism %>% 
  features(Trips, list(~ quantile(., probs = seq(0, 1, by = 0.2))))
#> # A tibble: 304 × 9
#>    Region         State      Purpose    `0%`  `20%`   `40%`  `60%`  `80%` `100%`
#>    <chr>          <chr>      <chr>     <dbl>  <dbl>   <dbl>  <dbl>  <dbl>  <dbl>
#>  1 Adelaide       South Aus… Busine…  68.7   127.   145.    160.   182.   242.  
#>  2 Adelaide       South Aus… Holiday 108.    133.   148.    160.   181.   224.  
#>  3 Adelaide       South Aus… Other    25.9    42.3   50.6    55.9   70.4  107.  
#>  4 Adelaide       South Aus… Visiti… 137.    177.   194.    214.   237.   270.  
#>  5 Adelaide Hills South Aus… Busine…   0       0      0.763   1.79   4.56  28.6 
#>  6 Adelaide Hills South Aus… Holiday   0       5.29   7.51   11.3   15.5   35.8 
#>  7 Adelaide Hills South Aus… Other     0       0      0.685   1.18   2.44   8.95
#>  8 Adelaide Hills South Aus… Visiti…   0.778   8.15  10.2    14.1   19.3   81.1 
#>  9 Alice Springs  Northern … Busine…   1.01    8.41  11.3    15.8   21.5   34.1 
#> 10 Alice Springs  Northern … Holiday   2.81   14.6   24.1    36.2   46.6   76.5 
#> # ℹ 294 more rows
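
The scoped variants listed in the usage section have no example above. The following is a minimal sketch, assuming fabletools and tsibble are installed and that Trips is the only measured variable in tourism; the names feature_list, all_feats and numeric_feats are illustrative only, and the exact output column names are not shown because they are not confirmed here.

```r
library(fabletools)  # features_all(), features_if()
library(tsibble)     # tourism dataset

# Illustrative named list of feature functions.
feature_list <- list(mean = mean, sd = sd)

# Compute the features on every measured variable of the dataset.
all_feats <- features_all(tourism, features = feature_list)

# Compute the features only on measured variables for which the
# predicate returns TRUE (here: numeric columns).
numeric_feats <- features_if(tourism, is.numeric, features = feature_list)

all_feats
numeric_feats
```

Under these assumptions, both calls should return one row per series, as in the features() examples above.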
