| Categorical, ID | Index | na_strategy | "separate" | "zero", "separate", "most_frequent" | When set to "zero", embeddings for missing values are represented as zero vectors. When set to "separate", missing values are treated as a distinct category. When set to "most_frequent", missing values are assigned to the most prevalent category. |
| Categorical, ID | Index | min_occ | 1 | positive integer | The minimum number of occurrences required for each category. If a category occurs fewer than min_occ times, Kumo treats it as N/A. |
| Categorical, ID | Hash | na_strategy | "separate" | "zero", "separate", "most_frequent" | When set to "zero", embeddings for missing values are represented as zero vectors. When set to "separate", missing values are treated as a distinct category. When set to "most_frequent", missing values are assigned to the most prevalent category. |
| Categorical, ID | Hash | num_components | Depends on cardinality of the column | positive integer | The capacity of the hash table. |
| Categorical, ID | Hash | min_occ | Depends on cardinality of the column | positive integer | The minimum number of occurrences required for each category. If a category occurs fewer than min_occ times, Kumo treats it as N/A. |
| Categorical, ID | Hash | na_strategy | "zero" | "zero", "separate", "most_frequent" | When set to "zero", embeddings for missing values are represented as zero vectors. When set to "separate", missing values are treated as a distinct category. When set to "most_frequent", missing values are assigned to the most prevalent category. |
| Multicategorical | MultiCategorical | min_occ | 1 | positive integer | The minimum number of occurrences required for each category. If a category occurs fewer than min_occ times, Kumo treats it as N/A. |
| Multicategorical | MultiCategorical | sep | Inferred by Kumo | string | The separator to use. |
| Numerical | Numerical | scaler | None | None, "standard", "minmax", "robust" | When set to None, no transformation is applied to the column values. When set to "standard", the column values are transformed to have zero mean and unit variance. When set to "minmax", the values are scaled to fall within the range [0, 1]. When set to "robust", the feature’s median is subtracted from each value and the result is divided by the interquartile range. |
| Numerical | Numerical | na_strategy | "mean" | "mean", "zero" | If "mean", N/A values are replaced with the mean value of the column. If "zero", N/A values are replaced with zero. |
| Numerical | MaxLogNumerical | na_strategy | "mean" | "mean", "zero" | If "mean", N/A values are replaced with the mean value of the column. If "zero", N/A values are replaced with zero. |
| Numerical | MinLogNumerical | na_strategy | "mean" | "mean", "zero" | If "mean", N/A values are replaced with the mean value of the column. If "zero", N/A values are replaced with zero. |
| Embedding | NumericalList | na_strategy | "zero" | "zero" | If "zero", N/A values are replaced with zero. |
| Timestamp | Datetime | include_minute | true | true, false | Whether to include minute. |
| Timestamp | Datetime | include_hour | true | true, false | Whether to include hour. |
| Timestamp | Datetime | include_day_of_week | true | true, false | Whether to include day of week. |
| Timestamp | Datetime | include_day_of_month | true | true, false | Whether to include day of month. |
| Timestamp | Datetime | include_day_of_year | true | true, false | Whether to include day of year. |
| Timestamp | Datetime | include_year | true | true, false | Whether to include year. |
| Timestamp | Datetime | num_year_periods | Depends on the difference between the min and max year in the column | positive integer or None | The number of periods to use when encoding years. For example, with num_year_periods=4, the year is encoded as year % i for each i in {2, 4, 8, 16}. If set to None, the value is inferred from dataset statistics. |
| Text | GloVe | model_name | "glove.6B" | "glove.6B", "glove.42B", "glove.840B", "glove_twitter.27B" | The pretrained model name. |
| Text | GloVe | embedding_dim | 50 | 25, 50, 100, 200, 300 | The embedding dimension of the pretrained model. Note that not every model supports every embedding dimension. See the GloVe Argument Combinations table below. |
| Any type | Null | n/a | n/a | n/a | If Null is specified for a column, Kumo ignores the column completely. |
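
The `scaler` options in the Numerical row above can be illustrated with a small standalone sketch. This is not Kumo's implementation: the function name `scale` is hypothetical, and the use of the population standard deviation and of inclusive quartiles are assumptions made only so the example is concrete.

```python
import statistics


def scale(values, scaler=None):
    """Illustrative re-implementation of the `scaler` options (not Kumo's code)."""
    if scaler is None:
        # No transformation is applied.
        return list(values)
    if scaler == "standard":
        # Zero mean, unit variance.
        mean = statistics.fmean(values)
        std = statistics.pstdev(values)  # assumption: population standard deviation
        return [(v - mean) / std for v in values]
    if scaler == "minmax":
        # Rescale into the range [0, 1].
        lo, hi = min(values), max(values)
        return [(v - lo) / (hi - lo) for v in values]
    if scaler == "robust":
        # Subtract the median, then divide by the interquartile range.
        median = statistics.median(values)
        # assumption: quartiles computed with the "inclusive" method
        q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
        return [(v - median) / (q3 - q1) for v in values]
    raise ValueError(f"unknown scaler: {scaler!r}")


values = [1.0, 2.0, 3.0, 4.0, 5.0]
print(scale(values, "minmax"))  # [0.0, 0.25, 0.5, 0.75, 1.0]
print(scale(values, "robust"))  # [-1.0, -0.5, 0.0, 0.5, 1.0]
```

The sketch does not guard against constant columns, where the standard deviation, range, or interquartile range is zero.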
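
The year encoding described for `num_year_periods` in the Datetime rows above can be sketched as follows. The table only gives the periods {2, 4, 8, 16} for `num_year_periods=4`; generalizing to successive powers of two for other values is an assumption, and `encode_year` is a hypothetical helper, not part of Kumo.

```python
def encode_year(year, num_year_periods):
    """Map each period to year % period (illustrative, not Kumo's code)."""
    # assumption: periods are successive powers of two, starting at 2
    periods = [2 ** k for k in range(1, num_year_periods + 1)]
    return {period: year % period for period in periods}


print(encode_year(2023, 4))  # {2: 1, 4: 3, 8: 7, 16: 7}
```

Longer periods distinguish years that are further apart, which is consistent with the default depending on the span between the minimum and maximum year in the column.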