Python Library

trinops includes TrinoProgress, a library for adding real-time progress monitoring to Python scripts that run queries through the trino-python-client. It polls query statistics in a background thread and renders progress to one or more display backends.
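The background-polling pattern described above can be sketched with the standard library alone. This is not the trinops implementation: PollerSketch, fake_fetch, and record are hypothetical names, and the statistics come from a fake source rather than a Trino coordinator. It only illustrates polling on a thread and handing each result to every registered display.

```python
import itertools
import threading
from typing import Callable, Dict, List

Stats = Dict[str, object]


class PollerSketch:
    """Hypothetical stand-in: polls `fetch` on a daemon thread and fans results out."""

    def __init__(
        self,
        fetch: Callable[[], Stats],
        displays: List[Callable[[Stats], None]],
        interval: float = 1.0,
    ) -> None:
        self._fetch = fetch
        self._displays = displays
        self._interval = interval
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def stop(self) -> None:
        # Signal the loop to exit, then wait for the thread to finish.
        self._stop.set()
        self._thread.join()

    def _run(self) -> None:
        while not self._stop.is_set():
            stats = self._fetch()
            for display in self._displays:
                display(stats)
            # Event.wait is an interruptible sleep: stop() wakes it immediately.
            self._stop.wait(self._interval)


# Demo with a fake statistics source instead of a Trino coordinator.
poll_numbers = itertools.count(1)
seen: List[Stats] = []
three_polls = threading.Event()


def fake_fetch() -> Stats:
    return {"state": "RUNNING", "poll": next(poll_numbers)}


def record(stats: Stats) -> None:
    seen.append(stats)
    if len(seen) >= 3:
        three_polls.set()


poller = PollerSketch(fake_fetch, [record], interval=0.01)
poller.start()
three_polls.wait(timeout=5)
poller.stop()
```

Using an Event for both the stop signal and the sleep keeps shutdown prompt even when the polling interval is long.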

Wrap a trino cursor with TrinoProgress and use it as a drop-in replacement. The progress display starts automatically when you call execute() and stops when the context manager exits.

import trino
from trinops.progress import TrinoProgress

conn = trino.dbapi.connect(host="trino.example.com", port=443, user="alice")
cursor = conn.cursor()

# The progress display starts on execute() and stops when the with block exits.
with TrinoProgress(cursor) as progress:
    progress.execute("SELECT * FROM hive.analytics.events WHERE ds = '2024-03-15'")
    rows = progress.fetchall()

print(f"Got {len(rows)} rows")

TrinoProgress proxies fetchone(), fetchall(), fetchmany(), description, and iteration, so you can use it anywhere you would use a cursor.
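To make the proxying idea concrete, here is a minimal sketch of a wrapper that forwards the same cursor surface to an underlying object. FakeCursor and CursorProxySketch are hypothetical stand-ins written for this example; they do not reflect how TrinoProgress is implemented internally.

```python
from typing import Any, Iterator, List, Optional


class FakeCursor:
    """Hypothetical stand-in for a DB-API cursor, used only in this sketch."""

    def __init__(self, rows: List[tuple]) -> None:
        self._rows = list(rows)
        self.description = [("n", "bigint", None, None, None, None, None)]

    def execute(self, sql: str) -> "FakeCursor":
        return self

    def fetchone(self) -> Optional[tuple]:
        return self._rows.pop(0) if self._rows else None

    def fetchmany(self, size: int = 1) -> List[tuple]:
        batch, self._rows = self._rows[:size], self._rows[size:]
        return batch

    def fetchall(self) -> List[tuple]:
        rows, self._rows = self._rows, []
        return rows


class CursorProxySketch:
    """Forwards fetch methods, description, and iteration to the wrapped cursor."""

    def __init__(self, cursor: Any) -> None:
        self._cursor = cursor

    def execute(self, sql: str) -> "CursorProxySketch":
        # A real wrapper would start progress monitoring at this point.
        self._cursor.execute(sql)
        return self

    def fetchone(self) -> Optional[tuple]:
        return self._cursor.fetchone()

    def fetchmany(self, size: int = 1) -> List[tuple]:
        return self._cursor.fetchmany(size)

    def fetchall(self) -> List[tuple]:
        return self._cursor.fetchall()

    @property
    def description(self) -> Any:
        return self._cursor.description

    def __iter__(self) -> Iterator[tuple]:
        # Call fetchone() repeatedly until it returns None.
        return iter(self._cursor.fetchone, None)


proxy = CursorProxySketch(FakeCursor([(1,), (2,), (3,), (4,)]))
proxy.execute("SELECT n FROM t")
first = proxy.fetchone()
batch = proxy.fetchmany(2)
rest = list(proxy)
```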

If you already have a query running and just want to monitor it, pass a connection and a query ID:

import trino
from trinops.progress import TrinoProgress

conn = trino.dbapi.connect(host="trino.example.com", port=443, user="alice")

# Standalone mode: monitor an existing query by its ID.
with TrinoProgress(conn, query_id="20240315_123456_00001_abcde") as progress:
    progress.start()
    progress.wait()
    stats = progress.last_stats
    print(f"Final state: {stats.state}, rows: {stats.processed_rows}")

The display parameter controls where progress is rendered. It accepts a string name, a list of names, or a Display instance.

| Name | Description |
| --- | --- |
| "auto" | Uses tqdm if installed, falls back to stderr. This is the default. |
| "stderr" | Single-line progress written to stderr with carriage returns. |
| "tqdm" | tqdm progress bar. Requires pip install trinops[tqdm]. |
| "web" | Starts a local HTTP server with a live-updating dashboard. |

The stderr backend is the simplest. It prints a single updating line to stderr showing state, splits, rows, bytes, elapsed time, and CPU time:

with TrinoProgress(cursor, display="stderr") as progress:
    progress.execute("SELECT count(*) FROM large_table")
    progress.fetchall()
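The single-line effect comes from writing a carriage return instead of a newline. The sketch below shows the general technique against an in-memory stream; format_line and write_in_place are hypothetical helpers, the line shows only a subset of the statistics, and the exact text trinops prints may differ.

```python
import io
from typing import TextIO


def format_line(state: str, completed: int, total: int, rows: int, elapsed_s: float) -> str:
    """Build one status line from a subset of the statistics."""
    return f"{state} splits {completed}/{total} rows {rows:,} elapsed {elapsed_s:.1f}s"


def write_in_place(line: str, stream: TextIO, width: int = 80) -> None:
    # "\r" moves the cursor to column 0 without starting a new line, so each
    # write overwrites the previous one. Padding clears leftover characters.
    stream.write("\r" + line.ljust(width))
    stream.flush()


# Demo against an in-memory stream; pass sys.stderr to see it in a terminal.
buffer = io.StringIO()
write_in_place(format_line("RUNNING", 12, 48, 1500, 2.5), buffer)
write_in_place(format_line("FINISHED", 48, 48, 6000, 9.0), buffer)
```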

The tqdm backend renders a tqdm progress bar that tracks split completion, with rows, CPU time, and memory as postfix stats:

with TrinoProgress(cursor, display="tqdm") as progress:
    progress.execute("SELECT count(*) FROM large_table")
    progress.fetchall()

Install the tqdm extra: pip install trinops[tqdm].
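Whether tqdm is importable is also what the "auto" setting keys on. One way such a check could be written with the standard library is sketched here; resolve_display_name is a hypothetical helper, not part of trinops, and the library's actual detection logic may differ.

```python
import importlib.util


def resolve_display_name(name: str = "auto") -> str:
    """Map "auto" to a concrete backend name; pass other names through."""
    if name != "auto":
        return name
    # find_spec returns None when the package cannot be found.
    has_tqdm = importlib.util.find_spec("tqdm") is not None
    return "tqdm" if has_tqdm else "stderr"


chosen = resolve_display_name()
```

Probing with find_spec avoids importing tqdm just to learn whether it exists.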

The web backend starts a local HTTP server that serves a live-updating HTML dashboard. The URL is printed to stderr on startup:

with TrinoProgress(cursor, display="web", web_port=8000) as progress:
    progress.execute("SELECT count(*) FROM large_table")
    progress.fetchall()

Set web_port=0 (the default) to let the OS pick an available port. The dashboard shows state, a progress bar, split/row/byte/time statistics, and a stage-by-stage breakdown.
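For readers unfamiliar with port 0, the following sketch shows the operating-system behavior using Python's http.server. PlaceholderHandler is a hypothetical handler invented for this example; it says nothing about how the trinops dashboard is served.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class PlaceholderHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that serves a fixed page in place of a dashboard."""

    def do_GET(self) -> None:
        body = b"<html><body>progress placeholder</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


# Port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), PlaceholderHandler)
# The bound address carries the port that was actually assigned.
assigned_port = server.server_address[1]
url = f"http://127.0.0.1:{assigned_port}/"
# A real program would now call server.serve_forever(), usually on a thread.
server.server_close()
```

Because the port is only known after binding, the URL has to be reported at startup, which matches the behavior described above.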

Pass a list to render to several backends simultaneously:

with TrinoProgress(cursor, display=["stderr", "web"]) as progress:
    progress.execute("SELECT count(*) FROM large_table")
    progress.fetchall()

TrinoProgress accepts the following parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| cursor_or_connection | cursor or connection | (required) | A trino cursor (cursor mode) or connection (standalone mode) |
| query_id | str \| None | None | Query ID for standalone mode |
| display | str \| list \| Display | "auto" | Display backend(s) |
| interval | float | 1.0 | Polling interval in seconds |
| max_failures | int | 5 | Max consecutive poll failures before stopping (standalone mode) |
| web_port | int | 0 | Port for the web display; 0 for auto-assign |
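The interaction between interval and max_failures can be illustrated with a plain loop that counts consecutive failures and resets the count on any success. This is a sketch under assumptions: poll_until_done and scripted are hypothetical helpers, and whether trinops stops on the Nth failure or after exceeding N is not stated in this page, so the sketch stops on the Nth.

```python
import time
from typing import Callable, Iterable, Optional


def poll_until_done(
    fetch: Callable[[], dict],
    is_terminal: Callable[[dict], bool],
    max_failures: int = 5,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Poll until a terminal result or until max_failures polls fail in a row."""
    failures = 0
    last: Optional[dict] = None
    while True:
        try:
            last = fetch()
        except Exception:
            # Broad catch for brevity; only consecutive failures count.
            failures += 1
            if failures >= max_failures:
                break
        else:
            failures = 0  # any successful poll resets the counter
            if is_terminal(last):
                break
        sleep(interval)
    return last


def scripted(responses: Iterable[object]) -> Callable[[], dict]:
    """Build a fake fetch that replays results and raises stored exceptions."""
    it = iter(responses)

    def fetch() -> dict:
        item = next(it)
        if isinstance(item, Exception):
            raise item
        return item  # type: ignore[return-value]

    return fetch


def finished(stats: dict) -> bool:
    return stats["state"] in ("FINISHED", "FAILED")


flaky = scripted([
    ConnectionError(), ConnectionError(), {"state": "RUNNING"},
    ConnectionError(), {"state": "FINISHED"},
])
recovered = poll_until_done(flaky, finished, max_failures=3, sleep=lambda s: None)

dead = scripted([ConnectionError(), ConnectionError(), ConnectionError()])
gave_up = poll_until_done(dead, finished, max_failures=3, sleep=lambda s: None)
```

In the first run the two early failures never reach the limit because the successful poll resets the counter; in the second, three failures in a row end the loop with no statistics.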

Each poll produces a QueryStats dataclass with these fields:

| Field | Type | Description |
| --- | --- | --- |
| state | str | Query state (QUEUED, PLANNING, STARTING, RUNNING, FINISHING, FINISHED, FAILED) |
| queued | bool | Whether the query is queued |
| scheduled | bool | Whether the query is scheduled |
| nodes | int | Number of worker nodes |
| total_splits | int | Total number of splits |
| queued_splits | int | Splits waiting to run |
| running_splits | int | Splits currently executing |
| completed_splits | int | Splits finished |
| cpu_time_millis | int | Cumulative CPU time |
| wall_time_millis | int | Cumulative wall time |
| queued_time_millis | int | Time spent queued |
| elapsed_time_millis | int | Wall-clock elapsed time |
| processed_rows | int | Rows processed |
| processed_bytes | int | Bytes processed |
| physical_input_bytes | int | Physical bytes read |
| peak_memory_bytes | int | Peak memory usage |
| spilled_bytes | int | Bytes spilled to disk |
| progress_percentage | float \| None | Overall progress percentage |
| root_stage | StageStats \| None | Root of the stage tree |
| error | dict \| None | Error details for failed queries |

The is_terminal property returns True when the state is FINISHED or FAILED.
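As an illustration of how these fields might be consumed, the sketch below defines a small stand-in dataclass with a handful of the fields and derives two values from them. StatsSketch, split_fraction, and rows_per_second are hypothetical and exist only for this example; real code would read the same attributes from the QueryStats objects the library produces.

```python
from dataclasses import dataclass
from typing import Optional

TERMINAL_STATES = frozenset({"FINISHED", "FAILED"})


@dataclass
class StatsSketch:
    """Hypothetical stand-in carrying a subset of the fields listed above."""

    state: str
    total_splits: int
    completed_splits: int
    elapsed_time_millis: int
    processed_rows: int

    @property
    def is_terminal(self) -> bool:
        # Mirrors the rule stated above: FINISHED or FAILED.
        return self.state in TERMINAL_STATES


def split_fraction(stats: StatsSketch) -> Optional[float]:
    """Fraction of splits completed, or None before any splits are planned."""
    if stats.total_splits <= 0:
        return None
    return stats.completed_splits / stats.total_splits


def rows_per_second(stats: StatsSketch) -> float:
    """Average throughput over the elapsed wall-clock time."""
    if stats.elapsed_time_millis <= 0:
        return 0.0
    return stats.processed_rows / (stats.elapsed_time_millis / 1000)


running = StatsSketch("RUNNING", 48, 12, 4000, 2_000_000)
done = StatsSketch("FINISHED", 48, 48, 9000, 6_000_000)
planning = StatsSketch("PLANNING", 0, 0, 0, 0)
```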

Each stage in the query plan is represented as a StageStats dataclass:

| Field | Type | Description |
| --- | --- | --- |
| stage_id | str | Stage identifier |
| state | str | Stage state |
| done | bool | Whether the stage is complete |
| nodes | int | Number of nodes running this stage |
| total_splits | int | Total splits in this stage |
| queued_splits | int | Queued splits |
| running_splits | int | Running splits |
| completed_splits | int | Completed splits |
| cpu_time_millis | int | CPU time for this stage |
| wall_time_millis | int | Wall time for this stage |
| processed_rows | int | Rows processed by this stage |
| processed_bytes | int | Bytes processed by this stage |
| failed_tasks | int | Number of failed tasks |
| sub_stages | tuple[StageStats, ...] | Child stages |
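Because sub_stages nests stages inside stages, a recursive walk is the natural way to visit the whole tree. The sketch below uses a hypothetical StageSketch dataclass with a subset of the fields above; walk and format_tree are illustrative helpers, not library functions.

```python
from dataclasses import dataclass
from typing import Iterator, Tuple


@dataclass(frozen=True)
class StageSketch:
    """Hypothetical stand-in with a subset of the stage fields."""

    stage_id: str
    state: str
    completed_splits: int
    total_splits: int
    failed_tasks: int = 0
    sub_stages: Tuple["StageSketch", ...] = ()


def walk(stage: StageSketch, depth: int = 0) -> Iterator[Tuple[int, StageSketch]]:
    """Yield (depth, stage) pairs in pre-order: parent first, then children."""
    yield depth, stage
    for child in stage.sub_stages:
        yield from walk(child, depth + 1)


def format_tree(root: StageSketch) -> str:
    """Render one indented line per stage."""
    return "\n".join(
        f"{'  ' * depth}{s.stage_id} {s.state} {s.completed_splits}/{s.total_splits}"
        for depth, s in walk(root)
    )


root = StageSketch(
    "0", "RUNNING", 1, 4,
    sub_stages=(
        StageSketch(
            "1", "RUNNING", 2, 8, failed_tasks=1,
            sub_stages=(StageSketch("2", "FINISHED", 16, 16),),
        ),
        StageSketch("3", "FINISHED", 4, 4, failed_tasks=2),
    ),
)
total_failed = sum(stage.failed_tasks for _, stage in walk(root))
tree_text = format_tree(root)
```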

Implement the Display protocol to build your own backend:

from trinops.progress import TrinoProgress
from trinops.progress.display import Display
from trinops.progress.stats import QueryStats

class MyDisplay:
    # Called on each poll with the latest statistics.
    def on_stats(self, stats: QueryStats) -> None:
        print(f"{stats.state}: {stats.completed_splits}/{stats.total_splits} splits")

    # Called once when monitoring ends.
    def close(self) -> None:
        pass

with TrinoProgress(cursor, display=MyDisplay()) as progress:
    progress.execute("SELECT count(*) FROM large_table")
    progress.fetchall()

The Display protocol requires two methods: on_stats(stats: QueryStats) -> None, called on each poll, and close() -> None, called when monitoring ends.
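A two-method protocol like this is easy to exercise without a cluster. The sketch below restates the shape as a typing.Protocol and drives a recording display with fake statistics objects. DisplaySketch and RecordingDisplay are hypothetical names for this example; the real Display type lives in trinops.progress.display and may be declared differently.

```python
from types import SimpleNamespace
from typing import Any, List, Protocol, runtime_checkable


@runtime_checkable
class DisplaySketch(Protocol):
    """Hypothetical restatement of the two-method shape described above."""

    def on_stats(self, stats: Any) -> None: ...

    def close(self) -> None: ...


class RecordingDisplay:
    """Collects states in memory, which is handy for tests."""

    def __init__(self) -> None:
        self.states: List[str] = []
        self.closed = False

    def on_stats(self, stats: Any) -> None:
        self.states.append(stats.state)

    def close(self) -> None:
        self.closed = True


# Drive the display by hand with fake statistics objects.
display = RecordingDisplay()
for state in ("QUEUED", "RUNNING", "FINISHED"):
    display.on_stats(SimpleNamespace(state=state))
display.close()

# Structural typing: no inheritance is needed to satisfy the protocol.
conforms = isinstance(display, DisplaySketch)
```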