# Data Types

> Column data types and semantic types supported by KumoRFM

KumoRFM uses two complementary type systems: **Data Types (Dtype)** for physical storage and **Semantic Types (Stype)** for semantic meaning.

## Data Types (Dtype)

Data types represent how data is physically stored and processed:

```python
from kumoai import Dtype

# Numerical types
Dtype.bool      # Boolean values
Dtype.int       # Integer values
Dtype.float     # Floating point values

# String types
Dtype.string    # Text data
Dtype.binary    # Binary data

# Temporal types
Dtype.date      # Date/timestamp
Dtype.time      # Time values
Dtype.timedelta # Time differences

# List types
Dtype.floatlist   # Lists of floats (embeddings/sequences)
Dtype.intlist     # Lists of integers
Dtype.stringlist  # Lists of strings
```

### Dtype Mapping

When you construct a `LocalTable`, each `pandas` dtype is automatically mapped to the corresponding Kumo dtype. A column's dtype in a `LocalTable` is read-only: you can inspect it, but you cannot modify it. To change a data type, convert the column in the underlying `pandas.DataFrame` before creating the table:

```python
import pandas as pd
import kumoai.rfm as rfm

df = pd.DataFrame({'user_id': [1, 2, 3]})
table = rfm.LocalTable(df, name="users")

print(table["user_id"].dtype)
```
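
Because dtypes are fixed once the table exists, any conversion has to happen on the `pandas` side first. The following is a minimal sketch of that step using standard `pandas` calls only. The `signup_date` and `age` columns are hypothetical and exist purely for illustration, and the `LocalTable` call is left as a comment so the snippet runs without the Kumo SDK installed:

```python
import pandas as pd

# Hypothetical raw data: dates and ages arrive as strings.
df = pd.DataFrame({
    "user_id": [1, 2, 3],
    "signup_date": ["2024-01-05", "2024-02-11", "2024-03-20"],
    "age": ["31", "45", "27"],
})

# Parse ISO date strings into datetime values.
df["signup_date"] = pd.to_datetime(df["signup_date"])

# Convert numeric strings into 64-bit integers.
df["age"] = df["age"].astype("int64")

# Inspect the pandas dtypes that the table constructor will see.
print(df.dtypes)

# The converted frame can then be passed on as in the example above:
# table = rfm.LocalTable(df, name="users")
```

Converting up front keeps the automatic mapping predictable, since the table only ever sees the dtypes you intended.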

## Semantic Types (Stype)

Semantic types define the meaning of data and determine column-level data processing within the model:

```python
from kumoai import Stype

# Core semantic types
Stype.numerical        # Numerical values for mathematical operations
Stype.categorical      # Discrete categories with limited cardinality
Stype.multicategorical # Multiple categories in single field
Stype.ID               # Unique identifiers
Stype.text             # Natural language text
Stype.timestamp        # Date/time information
Stype.sequence         # Embeddings or sequential data
```

### Dtype-Stype Compatibility

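Not all dtype-stype combinations are valid. Check compatibility with `supports_dtype`, or look up the default semantic type for a data type with `default_stype`: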

```python
from kumoai import Dtype, Stype

# Check if a semantic type supports a data type
stype = Stype.categorical
dtype = Dtype.string

is_compatible = stype.supports_dtype(dtype)
print(f"{stype} supports {dtype}: {is_compatible}")

# Get default semantic type for a data type
default_stype = dtype.default_stype
print(f"Default stype for {dtype}: {default_stype}")
```

### Stype Assignment and Encoding

Semantic types can be modified after table creation, provided they are compatible with the underlying data type. The semantic type determines how values are encoded and processed by the foundation model:

```python
# Valid stype modifications (compatible with the underlying dtype).
# Assumes the table also has `category` and `description` string columns.
table['user_id'].stype = 'ID'           # Dtype.int -> Stype.ID
table['category'].stype = 'categorical' # Dtype.string -> Stype.categorical
table['description'].stype = 'text'     # Dtype.string -> Stype.text

# Invalid: integer columns do not support the text semantic type
table['user_id'].stype = 'text'
```
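
One way to avoid attempting an invalid assignment is to check compatibility first. The helper below is a hypothetical sketch, not part of the Kumo SDK. It assumes only what the examples on this page show: a column object exposes `dtype` and a settable `stype`, and a semantic type offers `supports_dtype`. Assigning an `Stype` member rather than a string such as `'ID'` is also an assumption.

```python
def set_stype_if_supported(column, stype):
    """Assign `stype` to `column` only if it supports the column's dtype.

    Hypothetical helper: returns True when the assignment was made,
    and False (leaving the column untouched) otherwise.
    """
    if not stype.supports_dtype(column.dtype):
        return False
    column.stype = stype
    return True


# Hypothetical usage with a table like the one created earlier:
# from kumoai import Stype
# set_stype_if_supported(table["user_id"], Stype.ID)
# set_stype_if_supported(table["user_id"], Stype.text)
```

The return value lets calling code decide how to handle an unsupported combination, for example by logging it or keeping the column's current semantic type.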

For details on how each semantic type affects column preprocessing and encoding, see the [Column Preprocessing Guide](https://docs.kumo.ai/docs/column-preprocessing).
