Pandas DataFrame Pipeline Builder
Constructs data transformation pipelines using Pandas and the pipe() method chain pattern. Integrates with SQLAlchemy for database I/O and PyArrow for high-performance Parquet operations.
Constructs data transformation pipelines using Pandas and the pipe() method chain pattern. Integrates with SQLAlchemy for database I/O and PyArrow for high-performance Parquet operations.
npx skills add agentskillexchange/skills --skill pandas-dataframe-pipeline-builder
Data transformation pipeline construction agent using the Pandas library. Builds composable transformation pipelines using the DataFrame.pipe() method chain pattern for readable and testable data processing. Implements common transformations including type casting, null handling, deduplication, pivoting, and window functions. Integrates with SQLAlchemy for reading from and writing to relational databases including PostgreSQL, MySQL, and BigQuery via the pandas-gbq connector. Uses PyArrow as the computation backend for high-performance Parquet file operations and columnar data processing. Supports schema validation using Pandera with automatic data quality checks at each pipeline stage. Implements chunked processing for datasets exceeding available memory using Dask DataFrame as a drop-in replacement. Generates pipeline documentation with input/output schema descriptions and transformation logs for data lineage tracking.