Model Runner

Overview

The Model Runner schedules, manages, and executes models on blockchain data.

  • Runtime: Python 3.7
  • Persistence: Postgres, Redis, RabbitMQ, Kafka
  • Third-party Frameworks: Celery (task management), FastAPI (webserver)
  • Internal libraries: engine (internal utilities), python-models (model code)

Architecture

Core Components

  • scheduler - periodically schedules and expires model runs
  • api - handles submission of model runs
  • worker - handles processing of model runs

Worker Layout

Model Run State

Ownership

Downstream Dependencies

Service Dependencies

Upstream Dependencies

  • Chainwalkers - produces block data
  • Relayer - relays block data to the Indexer
  • Indexer - persists block data to Postgres

Core Components

  • Scheduler - Terminates and schedules model runs
  • Workers - Execute models and forward results downstream
  • Enqueuer - Fans out messages to specific worker queues
  • API - Web server that handles model run CRUD operations

Where is the Code?

Code is deployed via GitHub Actions on commits to main.

What DNS addresses are supported?

Networking

Under construction 🚧.

Persistence

Databases

| Database | Name | Description | Host |
|---|---|---|---|
| postgres | door | model run state and results | prod-door.cluster-cik7nbaqdhks.us-east-1.rds.amazonaws.com |
| redis | db0 | model run progress and ABI cache | door-v2.jlfd3y.0001.use1.cache.amazonaws.com |
| rabbitMQ | data-platform-prod | pending run messages | b-f1aa7ceb-4ffa-450c-a17e-acf0e1963296.mq.us-east-1.amazonaws.com:6379 |
| kafka | prod | modeled result messages | pkc-4nym6.us-east-1.aws.confluent.cloud:9092 |
| snowflake | flipside_prod_db | stores modeled results | vna27887.us-east-1.snowflakecomputing.com |

RabbitMQ Queues

Queues follow the convention: <blockchain>_<realtime|backfill>_<reads|nonreads>. The current queues are:

  • ethereum_realtime_reads - handles ethereum realtime read models
  • ethereum_realtime_nonreads - handles ethereum realtime nonread models
  • ethereum_backfill_reads - handles ethereum backfill read models
  • ethereum_backfill_nonreads - handles ethereum backfill nonread models
  • matic_realtime_reads - handles matic realtime read models
  • matic_realtime_nonreads - handles matic realtime nonread models
  • matic_backfill_reads - handles matic backfill read models
  • matic_backfill_nonreads - handles matic backfill nonread models
  • backfill_1 - handles adhoc runs
  • backfill_2 - handles adhoc runs
  • backfill_3 - handles adhoc runs
  • backfill_4 - handles adhoc runs
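
The naming convention above can be sketched in a few lines of Python. This is an illustrative helper only, not code from the Model Runner itself: the names `queue_name`, `convention_queues`, and the three constant tuples are hypothetical, and the allowed values are taken from the queue list above.

```python
from itertools import product

# Values taken from the queue list above; the constant names are hypothetical.
BLOCKCHAINS = ("ethereum", "matic")
MODES = ("realtime", "backfill")
KINDS = ("reads", "nonreads")


def queue_name(blockchain, mode, kind):
    """Build a name of the form <blockchain>_<realtime|backfill>_<reads|nonreads>."""
    if blockchain not in BLOCKCHAINS:
        raise ValueError("unknown blockchain: %r" % (blockchain,))
    if mode not in MODES:
        raise ValueError("unknown mode: %r" % (mode,))
    if kind not in KINDS:
        raise ValueError("unknown kind: %r" % (kind,))
    return "{}_{}_{}".format(blockchain, mode, kind)


def convention_queues():
    """Return every queue name that the convention generates."""
    return [queue_name(b, m, k) for b, m, k in product(BLOCKCHAINS, MODES, KINDS)]


if __name__ == "__main__":
    for name in convention_queues():
        print(name)
```

The adhoc queues (backfill_1 through backfill_4) do not follow this pattern, so a helper like this would not produce them; they would need to be listed separately.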

Kafka Topics

Sink topics follow the convention: <env>-<blockchain>-sink.

  • prod-ethereum-sink - stores ethereum messages to be written to Snowflake
  • prod-matic-sink - stores polygon messages to be written to Snowflake
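
As a rough illustration of the sink topic convention, the sketch below builds and parses topic names. The function names `sink_topic` and `parse_sink_topic` are hypothetical, and the parser assumes that neither the environment nor the blockchain identifier contains a hyphen. Note that Polygon topics use the `matic` identifier.

```python
def sink_topic(env, blockchain):
    """Build a topic name of the form <env>-<blockchain>-sink."""
    return "{}-{}-sink".format(env, blockchain)


def parse_sink_topic(topic):
    """Split a sink topic name into (env, blockchain).

    Assumes env and blockchain contain no hyphens.
    """
    parts = topic.split("-")
    if len(parts) != 3 or parts[2] != "sink" or not parts[0] or not parts[1]:
        raise ValueError("not a sink topic: %r" % (topic,))
    return parts[0], parts[1]


if __name__ == "__main__":
    print(sink_topic("prod", "ethereum"))
    print(parse_sink_topic("prod-matic-sink"))
```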

Snowflake Tables

  • flipside_prod_db.bronze.prod_ethereum_sink_407559501 - ethereum modeled data
  • flipside_prod_db.prod_matic_sink_510901820 - polygon modeled data

Logging

| Component | Log Source | Link |
|---|---|---|
| scheduler | Airflow | link |
| api | Datadog | link |
| ethereum-realtime | Datadog | link |
| ethereum-backfill | Datadog | link |
| matic-realtime | Datadog | link |
| matic-backfill | Datadog | link |
| backfill-1 | Datadog | link |
| backfill-2 | Datadog | link |
| backfill-3 | Datadog | link |
| backfill-4 | Datadog | link |

Monitoring

| Resource | Metrics | Link |
|---|---|---|
| Model Runner Monitoring Dashboard | blocks modeled, message durations, time in queue | link |
| RabbitMQ Management | queue size, ack rates, delivery rates, consumer counts | link |
| Confluent Cloud | topic production and consumption rates | link |
| AWS RDS | read and write IOPs | link |
| Airflow | scheduler DAG execution | link |
| APM Tracing | model execution tracing and error rates | link |

Error Reporting

Errors are reported to Sentry.

Where does it Run?

| Component | Resource | Region |
|---|---|---|
| Scheduler | AWS EKS - data-platform-prod/python-api | us-east-1 |
| Workers | AWS EKS - data-platform-prod/python-api | us-east-1 |
| Enqueuer | AWS EKS - data-platform-prod/python-api | us-east-1 |
| API | AWS EKS - data-platform-prod/python-api | us-east-1 |

Troubleshooting

Diagnostics