fabricatio_plot.toolboxes.dataframe_curd
CRUD Data Operations Toolbox Module.
This module provides focused tools for core data operations following CRUD principles:
- Create: Generate new data structures/columns
- Read: Extract and inspect data
- Update: Transform and modify existing data
- Delete: Remove data elements
Designed for clear separation of concerns with minimal dependencies.
Attributes
- data_crud_toolbox
Functions
- create_empty_dataframe: Create an empty DataFrame with specified columns and optional data types.
- add_computed_column: Create a new column by evaluating an expression on existing columns.
- get_row_by_index: Retrieve a single row by its index value.
- fill_missing_values: Update missing values in a single column using specified strategy.
- transform_column: Apply mathematical transformation to a numeric column.
- drop_columns: Delete specified columns from DataFrame.
- drop_rows_by_condition: Delete rows that meet a specified condition.
- rename_column: Atomically rename a single column.
- set_index_from_column: Set DataFrame index from an existing column.
Module Contents
- fabricatio_plot.toolboxes.dataframe_curd.data_crud_toolbox
- fabricatio_plot.toolboxes.dataframe_curd.create_empty_dataframe(columns: List[str], dtypes: List[str] | None = None) → pandas.DataFrame
Create an empty DataFrame with specified columns and optional data types.
- Parameters:
columns – List of column names
dtypes – Optional list of data types corresponding to columns (e.g., ['int', 'float', 'str'])
- Returns:
Empty DataFrame with schema defined
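As a rough illustration of the documented behavior, the sketch below builds an empty, typed DataFrame with plain pandas. The helper name `create_empty_dataframe_sketch` and the length check are assumptions made for this example; the package's actual implementation may differ.

```python
from typing import List, Optional

import pandas as pd


def create_empty_dataframe_sketch(
    columns: List[str], dtypes: Optional[List[str]] = None
) -> pd.DataFrame:
    """Hypothetical stand-in: build a zero-row DataFrame with a fixed schema."""
    if dtypes is None:
        # No dtypes given: let pandas pick its default column dtype.
        return pd.DataFrame(columns=columns)
    if len(dtypes) != len(columns):
        # Assumption: a length mismatch is treated as an error here.
        raise ValueError("dtypes must have one entry per column")
    # One empty, typed Series per column defines the schema.
    return pd.DataFrame(
        {name: pd.Series(dtype=dtype) for name, dtype in zip(columns, dtypes)}
    )


schema = create_empty_dataframe_sketch(["id", "price"], ["int", "float"])
```

The result has zero rows, but each column already carries its dtype, so rows appended later can be checked against the schema.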
- fabricatio_plot.toolboxes.dataframe_curd.add_computed_column(df: pandas.DataFrame, new_column: str, expression: str, dtype: str | None = None) → pandas.DataFrame
Create a new column by evaluating an expression on existing columns.
- Parameters:
df – Input DataFrame
new_column – Name for the new column
expression – Python expression using existing columns (e.g., "price * quantity")
dtype – Optional data type for the new column
- Returns:
DataFrame with new computed column added
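One plausible way to evaluate a string expression such as "price * quantity" over existing columns is `DataFrame.eval`. The sketch below assumes that approach and assumes the input is copied rather than modified in place; neither is confirmed by the reference above, and `add_computed_column_sketch` is a hypothetical name.

```python
from typing import Optional

import pandas as pd


def add_computed_column_sketch(
    df: pd.DataFrame, new_column: str, expression: str, dtype: Optional[str] = None
) -> pd.DataFrame:
    """Hypothetical stand-in: add a column computed from a string expression."""
    result = df.copy()  # assumption: leave the caller's DataFrame untouched
    # DataFrame.eval resolves bare names in the expression to column names.
    result[new_column] = result.eval(expression)
    if dtype is not None:
        result[new_column] = result[new_column].astype(dtype)
    return result


orders = pd.DataFrame({"price": [2.5, 4.0], "quantity": [2, 3]})
with_total = add_computed_column_sketch(orders, "total", "price * quantity")
```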
- fabricatio_plot.toolboxes.dataframe_curd.get_row_by_index(df: pandas.DataFrame, index_value: Any) → pandas.Series
Retrieve a single row by its index value.
- Parameters:
df – Input DataFrame
index_value – Value of the index to retrieve
- Returns:
Row as a pandas Series
- Raises:
KeyError – If index_value not found in index
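The documented behavior matches label-based lookup with `DataFrame.loc`, which returns a Series for a unique label and raises `KeyError` for a missing one. The example below uses plain pandas and made-up data; whether the toolbox function is built on `.loc` is an assumption.

```python
import pandas as pd

# Illustrative data with string index labels.
scores = pd.DataFrame(
    {"name": ["ada", "bob"], "score": [91, 78]}, index=["u1", "u2"]
)

# A unique label yields a single row as a Series keyed by column name.
row = scores.loc["u1"]

# A label that is not in the index raises KeyError.
try:
    scores.loc["u9"]
    missing_raised = False
except KeyError:
    missing_raised = True
```

Note that if the index contains duplicate labels, `.loc` returns a DataFrame instead of a Series, so a unique index is assumed here.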
- fabricatio_plot.toolboxes.dataframe_curd.fill_missing_values(df: pandas.DataFrame, column: str, strategy: Literal['mean', 'median', 'mode', 'constant'] = 'mean', constant_value: Any | None = None) → pandas.DataFrame
Update missing values in a single column using specified strategy.
- Parameters:
df – Input DataFrame
column – Column name to process
strategy – Filling strategy ('mean', 'median', 'mode', 'constant')
constant_value – Value to use when strategy='constant'
- Returns:
DataFrame with missing values filled in specified column
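The four strategies can be sketched in plain pandas as follows. This is an approximation under stated assumptions: the helper name is hypothetical, ties in 'mode' are resolved by taking the first mode, and a missing `constant_value` is treated as an error. The package may handle these cases differently.

```python
from typing import Any, Optional

import pandas as pd


def fill_missing_values_sketch(
    df: pd.DataFrame,
    column: str,
    strategy: str = "mean",
    constant_value: Optional[Any] = None,
) -> pd.DataFrame:
    """Hypothetical stand-in: fill NaNs in one column using a named strategy."""
    result = df.copy()
    if strategy == "mean":
        fill = result[column].mean()
    elif strategy == "median":
        fill = result[column].median()
    elif strategy == "mode":
        # mode() can return several values; assumption: use the first one.
        fill = result[column].mode().iloc[0]
    elif strategy == "constant":
        if constant_value is None:
            # Assumption: a constant fill without a value is an error.
            raise ValueError("constant_value is required for strategy='constant'")
        fill = constant_value
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    result[column] = result[column].fillna(fill)
    return result


ages = pd.DataFrame({"age": [20.0, float("nan"), 40.0]})
filled = fill_missing_values_sketch(ages, "age")  # default strategy: mean
zeroed = fill_missing_values_sketch(ages, "age", "constant", constant_value=0)
```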
- fabricatio_plot.toolboxes.dataframe_curd.transform_column(df: pandas.DataFrame, column: str, transformation: Literal['log', 'sqrt', 'square', 'normalize']) → pandas.DataFrame
Apply mathematical transformation to a numeric column.
- Parameters:
df – Input DataFrame
column – Column name to transform
transformation – Type of transformation ('log', 'sqrt', 'square', 'normalize')
- Returns:
DataFrame with transformed column
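The reference does not say which logarithm 'log' uses or how 'normalize' is defined. The sketch below assumes the natural logarithm and min-max scaling to the range [0, 1]; both are guesses made for illustration, and `transform_column_sketch` is a hypothetical name.

```python
import numpy as np
import pandas as pd


def transform_column_sketch(
    df: pd.DataFrame, column: str, transformation: str
) -> pd.DataFrame:
    """Hypothetical stand-in: apply a named numeric transformation to one column."""
    result = df.copy()
    values = result[column]
    if transformation == "log":
        result[column] = np.log(values)  # assumption: natural logarithm
    elif transformation == "sqrt":
        result[column] = np.sqrt(values)
    elif transformation == "square":
        result[column] = values ** 2
    elif transformation == "normalize":
        # Assumption: min-max scaling; a z-score would be another reading.
        result[column] = (values - values.min()) / (values.max() - values.min())
    else:
        raise ValueError(f"unknown transformation: {transformation}")
    return result


data = pd.DataFrame({"x": [1.0, 4.0, 9.0]})
roots = transform_column_sketch(data, "x", "sqrt")
scaled = transform_column_sketch(data, "x", "normalize")
```

Under these assumptions, 'log' and 'sqrt' produce NaN for negative inputs, and 'normalize' divides by zero when all values in the column are equal.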
- fabricatio_plot.toolboxes.dataframe_curd.drop_columns(df: pandas.DataFrame, columns: List[str]) → pandas.DataFrame
Delete specified columns from DataFrame.
- Parameters:
df – Input DataFrame
columns – List of column names to drop
- Returns:
DataFrame with columns removed
- Raises:
KeyError – If any column in ‘columns’ doesn’t exist in DataFrame
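The documented `KeyError` matches the default behavior of `DataFrame.drop`, which raises when any requested label is missing. The plain-pandas example below shows both outcomes on made-up data; that the toolbox function delegates to `DataFrame.drop` is an assumption.

```python
import pandas as pd

table = pd.DataFrame({"a": [1], "b": [2], "c": [3]})

# Dropping existing columns returns a new DataFrame without them.
trimmed = table.drop(columns=["b"])

# With the default errors="raise", one unknown name fails the whole call.
try:
    table.drop(columns=["b", "zzz"])
    drop_raised = False
except KeyError:
    drop_raised = True
```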
- fabricatio_plot.toolboxes.dataframe_curd.drop_rows_by_condition(df: pandas.DataFrame, condition: str) → pandas.DataFrame
Delete rows that meet a specified condition.
- Parameters:
df – Input DataFrame
condition – Query string for rows to drop (e.g., "price < 0", "category.isna()")
- Returns:
DataFrame with matching rows removed
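Note that the condition selects the rows to remove, not the rows to keep. A minimal sketch of that inversion, assuming the condition is a `DataFrame.query` string and the index labels are unique (the helper name is hypothetical):

```python
import pandas as pd


def drop_rows_by_condition_sketch(df: pd.DataFrame, condition: str) -> pd.DataFrame:
    """Hypothetical stand-in: remove the rows for which `condition` holds."""
    # query() returns the matching rows; their index labels are what we drop.
    matching = df.query(condition).index
    # Assumption: index labels are unique, so only matching rows are removed.
    return df.drop(index=matching)


prices = pd.DataFrame({"price": [10, -5, 3]})
cleaned = drop_rows_by_condition_sketch(prices, "price < 0")
```

The surviving rows keep their original index labels (0 and 2 in this example); no re-indexing is performed in the sketch.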
- fabricatio_plot.toolboxes.dataframe_curd.rename_column(df: pandas.DataFrame, old_name: str, new_name: str) → pandas.DataFrame
Atomically rename a single column.
- Parameters:
df – Input DataFrame
old_name – Current column name
new_name – New column name
- Returns:
DataFrame with column renamed
- Raises:
KeyError – If old_name not found in columns
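By default, `DataFrame.rename` silently ignores names that are not present, so the documented `KeyError` implies some explicit validation. The sketch below assumes a membership check before renaming; the helper name and the error message are illustrative only.

```python
import pandas as pd


def rename_column_sketch(df: pd.DataFrame, old_name: str, new_name: str) -> pd.DataFrame:
    """Hypothetical stand-in: rename one column, failing if it does not exist."""
    if old_name not in df.columns:
        # rename() would ignore a missing name by default, so check first.
        raise KeyError(f"column not found: {old_name}")
    return df.rename(columns={old_name: new_name})


sales = pd.DataFrame({"amt": [1, 2], "region": ["n", "s"]})
renamed = rename_column_sketch(sales, "amt", "amount")

try:
    rename_column_sketch(sales, "missing", "x")
    rename_raised = False
except KeyError:
    rename_raised = True
```

Passing `errors="raise"` to `DataFrame.rename` is an alternative way to get the same failure behavior.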
- fabricatio_plot.toolboxes.dataframe_curd.set_index_from_column(df: pandas.DataFrame, column: str, drop: bool = True) → pandas.DataFrame
Set DataFrame index from an existing column.
- Parameters:
df – Input DataFrame
column – Column name to use as index
drop – Whether to drop the column after setting index (default: True)
- Returns:
DataFrame with new index set
- Raises:
KeyError – If column not found in DataFrame
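The `drop` flag mirrors the parameter of the same name on `DataFrame.set_index`. The plain-pandas example below illustrates its effect on made-up data; that the toolbox function wraps `set_index` directly is an assumption.

```python
import pandas as pd

users = pd.DataFrame({"id": [1, 2], "name": ["ada", "bob"]})

# drop=True (the default): "id" becomes the index and leaves the columns.
indexed = users.set_index("id", drop=True)

# drop=False: "id" becomes the index and also stays as a regular column.
indexed_keep = users.set_index("id", drop=False)

# A column name that does not exist raises KeyError.
try:
    users.set_index("missing")
    index_raised = False
except KeyError:
    index_raised = True
```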