Overview
A modern async Python ORM for ClickHouse using Pydantic.
Features
- Async HTTP client using
clickhouse_connect - Model data using
pydanticfor tables with auto data types - Query builder using
prqlto create composable queries - Admin to create and manage tables/views and migrations
- Writer to validate and insert data in batches to tables
- Reader to query and stream results with validation
Installation
Quick Start
from pydantic import BaseModel
from pyclickhouse import Admin, Client, F, Query, Reader, Table, Writer
async def main() -> None:
# create new client
client = Client()
# define your model
class Event(BaseModel):
name: str
value: int
# define a table
table = Table(Event)
async with client:
# create table
admin = Admin(client)
await admin.create_table(table)
# insert event data
writer = Writer(client, table)
async with writer:
await writer.insert(Event(name="first", value=1))
await writer.insert(Event(name="second", value=2))
# stream event data
reader = Reader(client, table, model=Event)
results = await reader.stream()
async for row in results:
print(row)
# Event(name="first", value=1)
# Event(name="second", value=2)
# aggregate using query builder
query = Query(table).aggregate(res=F.sum(table.value))
# query event analytics
reader = Reader(client, query)
results = await reader.query()
print(results)
# [{"res": 3}]
if __name__ == "__main__":
import asyncio
asyncio.run(main())
Core Concepts
Learn the core concepts and how to use PyClickHouse:
-
Client — Connect to ClickHouse and manage the async connection lifecycle. Learn how to initialize clients, execute queries directly, and manage database connections.
-
Table — Define ClickHouse tables from Pydantic models. Map Python types to ClickHouse columns, customize table names and types, manage table lifecycle and auto-registration with the registry.
-
Query — Build type-safe, composable queries using PRQL. Filter, select, aggregate, group, join, and apply window functions. Use parameterized queries for dynamic SQL generation.
-
View — Create simple views (virtual queries) and materialized views (persisted results). Understand view lifecycle, auto-registration, and patterns for incremental data loading and denormalization.
-
Reader — Execute queries and retrieve results. Query with parameters, stream results asynchronously, and iterate over rows with automatic validation.
-
Writer — Insert validated data in batches. Create writers for efficient bulk inserts, manage batch sizing and flushing, and track written rows.
-
Admin — Manage the database schema. Create, alter, and drop tables/views, perform migrations with lifecycle awareness, introspect existing tables, and bulk operations.
Use Cases
I want to query data
Start with Client to understand how to connect, then move to Query to learn how to build queries, and finally Reader to execute them and get results.
I want to insert data
Create a Pydantic model and Table definition, then use Writer to insert data in batches.
I want to manage schema
Use Admin to create tables from Pydantic models, perform migrations with lifecycle controls, and manage views. See Table and View for schema design patterns.
I want to transform data
Use View to create materialized views that compute and store results, combined with Query to define the transformation logic.
Examples
Define a Table
from pydantic import BaseModel
from pyclickhouse import Table
class User(BaseModel):
id: int
name: str
email: str
users = Table(model=User, name="users")
Build a Query
from pyclickhouse import Query, F
query = Query(users).filter(
users.id > 10
).select(
users.name,
users.email
)
Insert Data
from pyclickhouse import Writer
async with client:
writer = Writer(client, users)
writer.write(User(id=1, name="Alice", email="alice@example.com"))
await writer.flush()
Execute a Query
from pyclickhouse import Reader
async with client:
reader = Reader(client, query)
results = await reader.query()
for row in results:
print(row)
Create a Table
from pyclickhouse import Admin
async with client:
admin = Admin(client)
await admin.create_table(users)
Create a Materialized View
from pyclickhouse import View
high_value_sales = View(
name="high_value_sales",
query=Query(sales).filter(sales.amount > 1000),
table=Table(model=SaleSummary, name="high_value_sales_storage")
)
async with client:
admin = Admin(client)
await admin.create_view(high_value_sales)
Key Features
Query Builder
- PRQL-based — Compose queries using pipeline operations, compiling to ClickHouse SQL
- Type-safe — Column references are typed via Pydantic models
- Composable — Chain operations like filter, select, aggregate, group, sort, take
- Parameterized — Support runtime parameters with automatic type inference
- Functions — Access 1000+ ClickHouse functions via the
Fobject - Advanced — Join tables and compute window functions (planned features with documented APIs)
Table & Schema
- Auto-typed — Python types automatically mapped to ClickHouse types
- Customizable — Override column names and types; add defaults, comments, codecs
- Lifecycle-aware — Mark tables as
managed,protected, orexternalto control migrations - Auto-registered — Tables register with a global registry by default
- Reflection — Introspect existing tables from ClickHouse and generate Pydantic models
Views
- Simple views — Virtual tables executing queries on-demand
- Materialized views — Persist computed results with automatic updates from source tables
- Integration — Kafka and PostgreSQL engines for streaming and federated data
- Patterns — Built-in support for incremental loading, denormalization, and aggregation
Admin & Migrations
- Create/Alter/Drop — Full DDL support for tables and views
- Lifecycle-aware — Respect
managed/protected/externallifecycle tags during migrations - Bulk operations —
create_all()andmigrate_all()for multiple tables at once - Introspection —
get_table()to reflect existing ClickHouse tables as Pydantic models
Reader & Writer
- Async streaming — Non-blocking iteration over query results
- Validation — Automatic Pydantic validation of data in and out
- Batching — Efficient bulk inserts with configurable batch sizes
- Parameters — Pass runtime values to parameterized queries
Explore Further
- Examples — Working code samples for common patterns
- API References — Detailed API documentation for all classes and methods
- GitHub — Source code, issues, and contributions
Development
Local Setup
To set up the development environment:
# Clone the repository
git clone https://github.com/anadahalli/pyclickhouse.git
cd pyclickhouse
# Install development dependencies
uv sync --group dev
Running Tests
Execute the test suite with:
# Run all tests
uv run pytest
# Run tests with coverage
uv run pytest --cov=pyclickhouse
# Run a specific test file
uv run pytest tests/test_client.py
Contributing
We welcome contributions! Please follow these steps:
- Fork the repository
- Create a feature branch (
git checkout -b feature/amazing-feature) - Make your changes
- Commit your changes (
git commit -m 'Add amazing feature') - Push to the branch (
git push origin feature/amazing-feature) - Open a Pull Request
Please ensure your code follows the project's style and includes tests for new functionality.
Issues and Feedback
- Bug Reports: Please open an issue on GitHub Issues with a clear description and steps to reproduce
- Feature Requests: Submit feature requests through GitHub Issues
- Questions: Use GitHub Discussions or open an issue for general questions
- Feedback: We appreciate any feedback! Feel free to reach out through GitHub Issues