The data catalog

The catalog is a NautilusTrader ParquetDataCatalog on local disk (set by ATS_CATALOG_DIR, default ./data/catalog). It is the single system of record for market bars: every backtest reads from the catalog and nothing else, which is what makes runs reproducible and free of look-ahead.
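
A minimal sketch of how the catalog location could be resolved from the environment. Only the variable name ATS_CATALOG_DIR and the ./data/catalog default come from the text above; the helper name `resolve_catalog_dir` is hypothetical and the real lookup in the codebase may differ.

```python
import os
from pathlib import Path

DEFAULT_CATALOG_DIR = "./data/catalog"  # default stated in the documentation


def resolve_catalog_dir(env=None) -> Path:
    """Return the catalog directory from ATS_CATALOG_DIR, or the default.

    Hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    # An unset or empty variable falls back to the default location.
    raw = env.get("ATS_CATALOG_DIR") or DEFAULT_CATALOG_DIR
    return Path(raw).expanduser()
```

The resulting path is what would be handed to the ParquetDataCatalog constructor, so pointing ATS_CATALOG_DIR at a different directory switches every reader to a different data set at once.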

An instrument is a ticker plus a venue, written TICKER.VENUE:

  • Equities / ETFs — e.g. SPY.XNAS. The text before the dot is the ticker; after it, the venue.
  • Crypto — e.g. BTCUSD.CRYPTO.

The UI usually lets you type a bare ticker (e.g. SPY) for equities and resolves the venue for you; crypto uses the full id.
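
The TICKER.VENUE convention and the bare-ticker shortcut can be sketched as below. This is an illustration, not the UI's actual resolver: `ParsedInstrument`, `parse_instrument`, and the fixed `XNAS` fallback are assumptions, and the real system may look the venue up per ticker.

```python
from typing import NamedTuple


class ParsedInstrument(NamedTuple):
    ticker: str
    venue: str


# Assumption: a single fallback venue for bare equity tickers.
DEFAULT_EQUITY_VENUE = "XNAS"


def parse_instrument(text: str, default_venue: str = DEFAULT_EQUITY_VENUE) -> ParsedInstrument:
    """Split 'TICKER.VENUE' into its parts; treat a bare ticker as an equity."""
    text = text.strip().upper()
    if not text:
        raise ValueError("empty instrument id")
    # Split on the last dot: everything before it is the ticker, after it the venue.
    ticker, sep, venue = text.rpartition(".")
    if not sep:
        # No dot at all: a bare ticker such as "SPY".
        return ParsedInstrument(text, default_venue)
    if not ticker or not venue:
        raise ValueError(f"malformed instrument id: {text!r}")
    return ParsedInstrument(ticker, venue)
```

Note that a bare ticker which itself contains a dot would be misread as TICKER.VENUE by this sketch, which is one reason to prefer the full id whenever there is doubt.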

The asset class (equity or crypto) is derived from the instrument and changes three things:

  • Fee model — IB-style commissions for equities; a crypto fee model for crypto.
  • Sizing — equities trade whole shares; crypto supports fractional sizing.
  • Annualization — 252 trading days/year for equities, 365 for crypto (crypto trades every day). See metrics.
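
As a rough sketch of how the asset class could drive sizing and annualization: the rule "venue CRYPTO means crypto, anything else is an equity" is an assumption made for illustration, as are the function names. Scaling a daily standard deviation by the square root of the number of periods per year is the standard annualization convention; the metrics page describes what the system actually computes.

```python
import math

# Periods per year, as stated in the documentation.
ANNUALIZATION_DAYS = {"equity": 252, "crypto": 365}


def asset_class(instrument_id: str) -> str:
    """Assumed rule: the CRYPTO venue marks crypto; everything else is an equity."""
    venue = instrument_id.rpartition(".")[2].upper()
    return "crypto" if venue == "CRYPTO" else "equity"


def size_position(raw_quantity: float, instrument_id: str) -> float:
    """Equities are rounded down to whole shares; crypto keeps fractional size."""
    if asset_class(instrument_id) == "equity":
        return float(math.floor(raw_quantity))
    return raw_quantity


def annualized_volatility(daily_std: float, instrument_id: str) -> float:
    """Scale a daily standard deviation of returns to an annual figure."""
    days = ANNUALIZATION_DAYS[asset_class(instrument_id)]
    return daily_std * math.sqrt(days)
```

The same daily volatility therefore annualizes to a larger number for crypto than for an equity, simply because there are more trading days in the year.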

A bar spec names a cadence, e.g. 1-DAY-LAST (daily) or 1-MINUTE-LAST (intraday). The catalog stores a base cadence per instrument:

  • Daily bars come from Tiingo.
  • Intraday 1-minute bars are built from raw trades by the Databento ingestion, which also produces a VWAP sidecar and a measured spread used for realistic intraday cost modeling.
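
A bar spec string such as 1-DAY-LAST has three dash-separated parts: a step, an aggregation unit, and a price type. The parser below is a small illustration of that structure; `BarSpec` and `parse_bar_spec` are hypothetical names and stand in for whatever NautilusTrader type the codebase really uses.

```python
from dataclasses import dataclass

# Aggregation units treated as intraday in this sketch (an assumption).
INTRADAY_UNITS = {"SECOND", "MINUTE", "HOUR"}


@dataclass(frozen=True)
class BarSpec:
    step: int          # e.g. 1 or 15
    aggregation: str   # e.g. "MINUTE" or "DAY"
    price_type: str    # e.g. "LAST"

    @property
    def is_intraday(self) -> bool:
        return self.aggregation in INTRADAY_UNITS


def parse_bar_spec(text: str) -> BarSpec:
    """Parse 'STEP-AGGREGATION-PRICETYPE', e.g. '1-MINUTE-LAST'."""
    parts = text.strip().upper().split("-")
    if len(parts) != 3:
        raise ValueError(f"expected STEP-AGGREGATION-PRICETYPE, got {text!r}")
    step_text, aggregation, price_type = parts
    if not step_text.isdecimal() or int(step_text) < 1:
        raise ValueError(f"step must be a positive integer, got {step_text!r}")
    return BarSpec(int(step_text), aggregation, price_type)
```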

Higher timeframes are resampled on the fly from a finer base cadence: the resampler lives in ats/data/resample.py and the shared timeframe registry in ats/data/timeframes.py. A strategy can therefore request, say, 15-minute bars built from the stored 1-minute data.
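
The idea behind resampling can be sketched with the standard library alone. This is not the code in ats/data/resample.py: the `Bar` type and `resample` function are hypothetical, timestamps are assumed to mark the bar open in UTC, and the target step is assumed to divide 60 evenly.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Bar:
    ts: datetime  # assumed to be the bar's open time, in UTC
    open: float
    high: float
    low: float
    close: float
    volume: float


def resample(bars: list[Bar], minutes: int) -> list[Bar]:
    """Aggregate 1-minute bars into `minutes`-minute OHLCV bars."""
    if not 1 <= minutes <= 60 or 60 % minutes != 0:
        raise ValueError("this sketch only supports steps that divide 60")
    buckets: dict[datetime, list[Bar]] = {}
    for bar in sorted(bars, key=lambda b: b.ts):
        # Floor the timestamp to the start of its bucket.
        floored = bar.ts.replace(
            minute=bar.ts.minute - bar.ts.minute % minutes, second=0, microsecond=0
        )
        buckets.setdefault(floored, []).append(bar)
    out = []
    for start in sorted(buckets):
        group = buckets[start]
        out.append(Bar(
            ts=start,
            open=group[0].open,                 # first open in the bucket
            high=max(b.high for b in group),    # highest high
            low=min(b.low for b in group),      # lowest low
            close=group[-1].close,              # last close
            volume=sum(b.volume for b in group),
        ))
    return out


# Example: thirty synthetic 1-minute bars collapse into two 15-minute bars.
start = datetime(2024, 1, 5, 14, 30, tzinfo=timezone.utc)
minute_bars = [
    Bar(start + timedelta(minutes=i), 100.0 + i, 100.5 + i, 99.5 + i, 100.25 + i, 10.0)
    for i in range(30)
]
quarter_hour_bars = resample(minute_bars, 15)
```

Because only the 1-minute base is stored, any coarser intraday timeframe can be derived this way without a second ingestion.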

The Data tab lists every instrument in the catalog with its date range, bar count, and an estimated storage size per row. Daily history is tiny; intraday 1-minute history is much larger (and Databento charges for it), which is why intraday ingestion has a cost-estimate confirm step. See ingestion.
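
To give a feel for the size gap between daily and 1-minute history, the sketch below counts the maximum number of bars per year. It assumes equities are counted over the regular US session only (09:30 to 16:00, 390 minutes) and crypto over a full 24-hour day; it says nothing about bytes per bar or about Databento pricing, and `bars_per_year` is a hypothetical helper.

```python
TRADING_DAYS = {"equity": 252, "crypto": 365}
# Assumption: regular US session for equities, round-the-clock for crypto.
MINUTES_PER_DAY = {"equity": 390, "crypto": 1440}


def bars_per_year(asset_class: str, bar_spec: str) -> int:
    """Upper bound on bars per year; minutes with no trades may produce no bar."""
    days = TRADING_DAYS[asset_class]
    if bar_spec == "1-DAY-LAST":
        return days
    if bar_spec == "1-MINUTE-LAST":
        return days * MINUTES_PER_DAY[asset_class]
    raise ValueError(f"unsupported bar spec in this sketch: {bar_spec!r}")
```

Under these assumptions a year of 1-minute equity bars is up to 390 times as many rows as a year of daily bars, which is the reason the size column and the cost-estimate step matter for intraday data.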

Three parts of the system read from the catalog:

  • Backtests (ats/workers/backtest_runner.py) load bars for the requested instrument and bar spec via the catalog, run the strategy on a BacktestEngine, and never touch a live API.
  • The Charts tab reads the catalog to render candlesticks and overlays.
  • The multi-instrument engines load an aligned grid of closes across a universe from the catalog.
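
An aligned grid of closes can be pictured as one shared list of dates with one close per instrument per date. The sketch below aligns by keeping only dates present for every instrument; that inner-join rule is an assumption (the real engines might forward-fill instead), and `align_closes` is a hypothetical name. Dates are ISO strings, which sort chronologically.

```python
def align_closes(
    closes_by_instrument: dict[str, dict[str, float]],
) -> tuple[list[str], dict[str, list[float]]]:
    """Return (dates, grid) where every instrument has a close on every date.

    `closes_by_instrument` maps instrument id -> {ISO date -> close}.
    """
    if not closes_by_instrument:
        return [], {}
    # Keep only the dates that every instrument has a bar for.
    common = set.intersection(*(set(series) for series in closes_by_instrument.values()))
    dates = sorted(common)
    grid = {
        instrument: [series[d] for d in dates]
        for instrument, series in closes_by_instrument.items()
    }
    return dates, grid
```

With this rule, mixing an equity and a crypto instrument drops the weekend crypto bars, since the equity has no bar on those dates.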