cascade.utils.tables
- class cascade.utils.tables.CSVDataset(csv_file_path: str, *args: Any, **kwargs: Any)
Wrapper for .csv files.
- class cascade.utils.tables.FeatureTable(table: TableDataset | DataFrame, *args: Any, **kwargs: Any)
- __init__(table: TableDataset | DataFrame, *args: Any, **kwargs: Any) None
Table dataset that makes it easy to define and compute features.
Example
```python
>>> import pandas as pd
>>> from cascade.utils.tables import FeatureTable
>>> df = pd.read_csv(r'data .csv', index_col=0)
>>> df
   id  count   name
0   0      1    aaa
1   1      5    bbb
2   2      0    ccc
>>> ft = FeatureTable(df)
>>> ft.get_features()
['id', 'count', ' name']
>>> ft.add_feature('square', lambda df: df['count'] * df['count'])
>>> def counts(df):
...     return df['count'] * 2, df['count'] * 3
>>> ft.add_feature(('count_2', 'count_3'), counts)
>>> ft.get_features()
['id', 'count', ' name', 'square', ('count_2', 'count_3')]
>>> ft.get_table(['count', ('count_2', 'count_3')])
   count  count_2  count_3
0      1        2        3
1      5       10       15
2      0        0        0
```
- Parameters:
table (Union[TableDataset, pd.DataFrame]) – The table to wrap
- add_feature(name: str | Tuple[str], func: Callable[[DataFrame], Series | Tuple[str]], *args: Any, **kwargs: Any) None
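This reference gives no description for `add_feature`. Judging from the signature and the class example, `func` receives the underlying DataFrame and returns a single Series when `name` is a string, or a tuple of Series when `name` is a tuple of names. The sketch below illustrates that contract with pandas only. It is not the library's implementation, and `compute_feature` is a hypothetical helper introduced purely for illustration.

```python
import pandas as pd

df = pd.DataFrame({"id": [0, 1, 2], "count": [1, 5, 0]})


def square(df):
    # Single feature: one Series for one name.
    return df["count"] * df["count"]


def counts(df):
    # Multi-output feature: a tuple of Series, one per name.
    return df["count"] * 2, df["count"] * 3


def compute_feature(df, name, func):
    """Hypothetical helper: evaluate func on df and label the result with name(s)."""
    result = func(df)
    if isinstance(name, tuple):
        # Pair each name with its Series, preserving order.
        return pd.DataFrame(dict(zip(name, result)), index=df.index)
    return result.to_frame(name)


square_frame = compute_feature(df, "square", square)
counts_frame = compute_feature(df, ("count_2", "count_3"), counts)
print(square_frame["square"].tolist())
print(counts_frame.columns.tolist())
```

Under this reading, a tuple name groups several output columns that come from a single function call, which would explain why `get_features()` reports `('count_2', 'count_3')` as one entry.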
- get_features() List[str | Tuple[str]]
Returns the list of feature names, including every computed feature added so far.
- Returns:
List of feature names
- Return type:
List[str | Tuple[str]]
- class cascade.utils.tables.TableDataset(*args: Any, t: DataFrame | TableDataset | None = None, **kwargs: Any)
Wrapper for ``pd.DataFrame`` objects that manages metadata and performs validation.
- __init__(*args: Any, t: DataFrame | TableDataset | None = None, **kwargs: Any) None
- Parameters:
t (optional) – pd.DataFrame or TableDataset to be set as table
- class cascade.utils.tables.TableFilter(dataset: TableDataset, mask: List[bool], *args: Any, **kwargs: Any)
Filter for table values
- __init__(dataset: TableDataset, mask: List[bool], *args: Any, **kwargs: Any) None
- Parameters:
dataset (TableDataset) – Dataset to be filtered.
mask (List[bool]) – Boolean mask that selects values from the table.
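How the mask is applied is not spelled out here. One plausible reading, sketched below with pandas only rather than with the library's actual code, is that the mask holds one boolean per row and rows marked `True` are kept. The DataFrame contents and mask values are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"count": [1, 5, 0], "name": ["aaa", "bbb", "ccc"]})

# One boolean per row; True keeps the row (assumed semantics).
mask = [True, False, True]
if len(mask) != len(df):
    raise ValueError("mask length must match the number of rows")

# Boolean indexing keeps the original index labels of the selected rows.
filtered = df.loc[mask]
print(filtered["name"].tolist())
```

Checking the mask length up front gives a clearer error than the one pandas raises on mismatched boolean indexers.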