⚠ Pre-alpha. Beta release approaching. The public surface (decorator names, config keys, CLI flags) may shift between now and the beta tag — pin the exact version if you’re trying it out.
Core install
pip install "ematix-flow==0.3.0"
That gives you every backend, the flow CLI binary, and the
run_pipeline / run_streaming_pipeline Python entrypoints.
No extra services to operate.
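Because the public surface may shift before the beta tag, it can help to confirm that the environment really holds the pinned release. The following is a minimal sketch using only the standard library's `importlib.metadata`; the helper name `check_pin` is hypothetical and not part of ematix-flow, and the distribution name and version are taken from the install command above.

```python
from importlib.metadata import PackageNotFoundError, version

# Version pinned in the install command above.
PINNED = "0.3.0"


def check_pin(dist_name: str = "ematix-flow", pinned: str = PINNED) -> str:
    """Return 'ok', 'mismatch', or 'missing' for the installed distribution.

    Hypothetical helper: it only reads package metadata and never imports
    the package itself.
    """
    try:
        installed = version(dist_name)
    except PackageNotFoundError:
        return "missing"
    return "ok" if installed == pinned else "mismatch"


if __name__ == "__main__":
    print(check_pin())
```

A result of `mismatch` usually means another requirement or a later unpinned install replaced the pinned release.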
Optional extras
| Extra | What it adds | Install |
|---|---|---|
| df | DataFrame interop helpers (polars / pandas) for to_polars() / to_pandas() materialization. | pip install "ematix-flow[df]", then pip install polars (or pandas). |
| spark | PySpark interop helpers (to_pyspark() / from_pyspark()). Heavy: pulls in PySpark + JDBC. | pip install "ematix-flow[spark]" |
| pyarrow | Required for streaming-backend pyclass wrappers when iterating batch-by-batch in Python. | pip install pyarrow |
The flow binary, run_pipeline, and the typed-Python streaming API work
without any extras.
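To see which of the optional libraries from the table are present in the current environment, a small standard-library check is enough. This is a sketch, not an ematix-flow API: the function name `available_optional_modules` is hypothetical, and the module names are the usual import names of the third-party packages listed above.

```python
from importlib.util import find_spec

# Import names of the third-party libraries mentioned in the extras table.
OPTIONAL_MODULES = ("polars", "pandas", "pyspark", "pyarrow")


def available_optional_modules(names=OPTIONAL_MODULES):
    """Map each top-level module name to True if it can be located.

    find_spec() locates a top-level module without importing it, so the
    check stays cheap even for heavy packages such as PySpark.
    """
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, present in available_optional_modules().items():
        print(f"{name}: {'installed' if present else 'not installed'}")
```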
Verify
flow --version
flow connections list
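The same check can be scripted, for example in a CI step. The sketch below assumes only that the `flow` binary is on `PATH` and accepts `--version` as shown above; the helper name `cli_version` is hypothetical, and the format of the version string is not assumed.

```python
import shutil
import subprocess


def cli_version(binary: str = "flow"):
    """Return the output of `<binary> --version`, or None if it is not on PATH."""
    path = shutil.which(binary)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"],
        capture_output=True,
        text=True,
        check=False,  # report whatever the tool prints instead of raising
    )
    # Some tools print their version to stderr, so fall back to it.
    return result.stdout.strip() or result.stderr.strip()


if __name__ == "__main__":
    found = cli_version()
    print(found if found is not None else "flow binary not found on PATH")
```

A `None` result typically means the virtual environment that received the install is not the active one.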
Next: Connections.