
Indexing and Querying: The Graph, Subgraphs and GraphQL (Complete Guide)

Indexing and querying with The Graph, subgraphs, and GraphQL is how serious Web3 products turn raw on-chain activity into fast dashboards, analytics, and clean APIs. This complete guide explains the full indexing pipeline: how to design entities that match product questions, how to write deterministic mappings that survive reorgs, how to query GraphQL without performance traps, and how to run indexing infrastructure like real production software.

TL;DR

  • A subgraph is an event-driven indexing program that transforms contract events into structured entities exposed through GraphQL.
  • The three core components are: schema (entities), manifest (data sources and handlers), and mappings (deterministic transforms).
  • Determinism is the law: no randomness, no external calls, no “current time,” and IDs must be reproducible during replay.
  • Chain reorgs are normal. Your indexing logic must be reorg-safe and idempotent.
  • Performance is mainly modeling: precompute snapshots, avoid query-time aggregation, and use cursor pagination instead of large skip.
  • Production readiness means testing, monitoring, versioning, and migration discipline, not just “it compiles.”
  • If you need foundations first, start with Blockchain Technology Guides, then deepen your systems view in Blockchain Advanced Guides.

Prerequisite reading: think in events, then think in product questions

Indexing is easiest when you already understand how transactions create logs and how dApps read state. If you want a clean base for logs, events, and the “why” behind deterministic execution, start with Blockchain Technology Guides. Then come back here and treat this article like backend engineering for on-chain data.

Indexing in human terms: what is actually happening

Blockchains store facts, not answers. A chain can tell you that a swap happened, but it will not directly answer “what is the daily volume per pool” or “which wallets are net buyers this week” or “how many unique traders interacted with this contract.” Those questions require computation, relationships, and time-series modeling.

Most developers start by calling RPC methods or scanning logs in the browser. That works until it does not. The moment you have a real product, you need consistent responses, pagination, and query patterns that do not time out. You need a read-optimized model.

The Graph is one of the most widely used patterns for this. It lets you define a subgraph, which is basically an indexing program: listen to events, process them deterministically, store structured entities, and expose them via GraphQL. Your front end stops scanning logs and starts querying entities. That is the shift.

  • Raw chain (facts in logs): correct, but not query-friendly for product needs.
  • Subgraph (structured entities): deterministic transforms into read-optimized tables.
  • GraphQL (precise queries): fetch exactly what the UI needs, with pagination and filters.

Where indexing lives in the Web3 stack

A typical Web3 product stack has three “data planes”:

  • Chain state: what the chain knows right now, accessible through RPC calls and contract reads.
  • Event history: what happened over time, accessible via logs and block scanning.
  • Product model: what your app needs, expressed as entities like trades, positions, rewards, users, snapshots, and alerts.

The Graph bridges the event history plane to the product model plane. That is why it is so powerful. But it also means your modeling decisions become product decisions. If you model poorly, your app feels slow or inaccurate. If you model well, your app feels instant and reliable.

The Graph data pipeline, in practical view: events become entities, and entities become a GraphQL API that your UI can query.

  • Blockchain: blocks, transactions, logs, contract calls.
  • Indexing node: reorg-aware replay, deterministic handlers, entity writes.
  • Entity store: read-optimized model (Users, Trades, Pools, Snapshots).
  • GraphQL API: filter, paginate, and fetch exactly what the UI needs.

Your front end stops scanning logs and starts querying entities.

What is The Graph and what is a subgraph?

The Graph is an indexing and query layer for on-chain data. A subgraph is a specific indexing definition, like a program that describes what to index and how to shape it.

A subgraph typically includes:

  • Schema: your database-like entity model defined in GraphQL types.
  • Manifest: the list of contracts and handlers, including network, start blocks, and ABI details.
  • Mappings: deterministic handler code that transforms events and calls into entity writes.

The big idea is simple: events are your input stream, entities are your output model, and GraphQL is your query interface.

Why products fail without indexing discipline

Many Web3 products ship a UI first and treat data as an afterthought. In practice the priority should be reversed: data quality drives trust, and trust drives retention.

Without indexing discipline, you typically see:

  • inconsistent numbers (volume differs per page load)
  • slow dashboards (queries are doing aggregation on the fly)
  • missing relationships (no clean way to join users, pools, and trades)
  • reorg bugs (duplicate entities, wrong totals after chain reorganizations)
  • front-end hacks (client-side state that tries to patch data gaps)

The fix is to design your subgraph as if it is a backend service. Because it is.

Subgraph anatomy: the folder layout that stays sane

A clean repository layout reduces future pain. Your goal is to keep indexing logic predictable and testable.

./abis/
  Exchange.json
  ERC20.json
./schema.graphql
./subgraph.yaml
./src/
  mappings/
    exchange.ts
    token.ts
  helpers/
    decimals.ts
    ids.ts
    time.ts
./tests/
  exchange.test.ts
  fixtures.ts
./package.json
./tsconfig.json

A practical rule: mapping handlers should be short and boring. Push complexity into helpers where you can test and reason about it. Handlers should look like “load entity, update fields, save.”
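That "load entity, update fields, save" shape can be sketched with a toy in-memory store. This is illustrative only: real mappings use graph-ts entity classes and the indexer's store, not a Map.

```typescript
// Toy sketch of the "load entity, apply delta, save" handler shape.
type Entity = { id: string; txCount: number };

const store = new Map<string, Entity>();

function loadOrCreate(id: string): Entity {
  // load the entity if it exists, otherwise create it with defaults
  return store.get(id) ?? { id, txCount: 0 };
}

function handleEvent(id: string): void {
  const e = loadOrCreate(id); // load
  e.txCount += 1;             // apply the delta
  store.set(e.id, e);         // save
}

handleEvent("0xpool");
handleEvent("0xpool");
console.log(store.get("0xpool")!.txCount); // 2
```

Everything that is not a load, a delta, or a save belongs in a helper.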

Schema design: model for product questions, not for logs

The schema is where most subgraphs either become powerful or become painful. A log-first schema mirrors event structure. A product-first schema mirrors how users interact with data.

Start with questions:

  • What does the UI need on the dashboard?
  • What filters will the user apply?
  • What lists need pagination?
  • What aggregates should load instantly?
  • What historical charts do we show?

Then model entities that answer those questions without heavy query-time computation.

Entities and relationships that feel natural

Common entities across DeFi and analytics:

  • User: wallet address plus derived fields like trade count, total volume, first seen timestamp.
  • Token: address, symbol, decimals, and derived metrics like total volume and holders count (if you track it).
  • Pool: trading pair, fee tier, and rolling aggregates.
  • Trade: single swap with deterministic ID tied to transaction hash and log index.
  • Snapshot: daily or hourly entity storing precomputed metrics for charts.

Entity model intuition, modeled for reads: Users trade in Pools using Tokens, and Snapshots power charts.

  • User: id = address
  • Trade: id = txHash-logIndex; links to user, pool, and tokens
  • Pool: id = pool address
  • Token: id = token address
  • Snapshot: id = poolId-dayStartTimestamp; stores volume, txCount, fees, and liquidity stats
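As a sketch, entities like these could be declared in schema.graphql roughly as follows. Field names are illustrative assumptions, not taken from any specific protocol:

```graphql
type User @entity {
  id: ID!                     # lowercased wallet address
  tradeCount: BigInt!
  totalVolume: BigDecimal!
  firstSeen: BigInt!
}

type Token @entity {
  id: ID!                     # token address
  symbol: String!
  decimals: Int!
}

type Pool @entity {
  id: ID!                     # pool address
  token0: Token!
  token1: Token!
  txCount: BigInt!
  trades: [Trade!]! @derivedFrom(field: "pool")
}

type Trade @entity {
  id: ID!                     # txHash-logIndex
  pool: Pool!
  user: User!
  amount0: BigDecimal!
  amount1: BigDecimal!
  timestamp: BigInt!
}

type PoolDaySnapshot @entity {
  id: ID!                     # poolId-dayStartTimestamp
  pool: Pool!
  dayStartTimestamp: BigInt!
  volume0: BigDecimal!
  volume1: BigDecimal!
  txCount: BigInt!
}
```

The `@derivedFrom` field gives you reverse lookups (a pool's trades) without storing a growing trade list on the Pool entity itself.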

Entity IDs: the non-negotiable rules

ID strategy is one of the biggest sources of bugs. In subgraphs, IDs must be stable and reproducible. If you generate IDs differently across replays, reorg rollbacks will create duplicates or corrupt totals.

| Entity | Good ID strategy | Why it works | Bad strategy | Failure mode |
| --- | --- | --- | --- | --- |
| Trade / Swap | txHash + "-" + logIndex | Unique per event, deterministic | timestamp + counter | Duplicates during reorg or replay |
| User | lowercased address | Stable identity | firstSeenBlock + address | Identity changes if startBlock changes |
| Snapshot | poolId + "-" + dayStartTimestamp | Stable time bucket | blockNumber bucket | Chart flicker and inconsistent aggregation |
| Position | protocolPositionId | Matches protocol identity | derived hash without full inputs | Collisions and hard-to-debug merges |
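The good ID strategies above boil down to pure functions of event and protocol data. A minimal sketch (illustrative helper names, not a library API):

```typescript
// Deterministic id builders: same inputs always produce the same id.
function tradeId(txHash: string, logIndex: number): string {
  return txHash + "-" + logIndex.toString(); // unique per log, replay-safe
}

function userId(address: string): string {
  return address.toLowerCase(); // normalize casing so one wallet is one entity
}

function snapshotId(poolId: string, dayStartTimestamp: number): string {
  return poolId + "-" + dayStartTimestamp.toString(); // stable time bucket
}

console.log(tradeId("0xabc", 3)); // "0xabc-3"
console.log(userId("0xAbC") === userId("0xabc")); // true
```

Nothing here depends on wall-clock time, counters, or configuration, which is exactly what makes the ids survive rollback and replay.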

Manifest design: define what you index and when you start

The manifest is where you declare the data sources your indexer will follow. It includes the network, contract address, ABI, start block, and the handlers you want to run.

The start block matters more than people think. Starting too early can increase indexing time by days. Starting too late can break derived metrics and “first seen” logic. A good approach is:

  • start at the deployment block of the contract you care about
  • include the earliest block where events you rely on exist
  • use templates for dynamic contracts created later (factory patterns)

specVersion: 0.0.5
schema:
  file: ./schema.graphql

dataSources:
  - kind: ethereum/contract
    name: Factory
    network: mainnet
    source:
      address: "0xFactory..."
      abi: Factory
      startBlock: 12345678
    mapping:
      kind: ethereum/events
      apiVersion: 0.0.7
      language: wasm/assemblyscript
      entities:
        - Pool
        - Token
      abis:
        - name: Factory
          file: ./abis/Factory.json
        - name: Pool
          file: ./abis/Pool.json
        - name: ERC20
          file: ./abis/ERC20.json
      eventHandlers:
        - event: PoolCreated(indexed address,indexed address,address)
          handler: handlePoolCreated
      file: ./src/mappings/factory.ts

templates:
  - kind: ethereum/contract
    name: PoolTemplate
    network: mainnet
    source:
      abi: Pool
    mapping:
      kind: ethereum/events
      apiVersion: 0.0.7
      language: wasm/assemblyscript
      entities:
        - Pool
        - Trade
        - Snapshot
      abis:
        - name: Pool
          file: ./abis/Pool.json
      eventHandlers:
        - event: Swap(indexed address,uint256,uint256,uint256,uint256)
          handler: handleSwap
        - event: Mint(indexed address,uint256,uint256)
          handler: handleMint
        - event: Burn(indexed address,uint256,uint256)
          handler: handleBurn
      file: ./src/mappings/pool.ts

Mappings and determinism: what you can and cannot do

Mappings are where events become entities. In The Graph model, mapping code runs in a deterministic environment. That constraint is the whole point. It guarantees that if the indexer replays blocks, you get the same result.

That means:

  • You cannot call external APIs.
  • You cannot use randomness.
  • You cannot read local files or system time.
  • You should not rely on “current block” as a global mutable variable.

The only truth is in the event and block data you are processing.
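The contrast is easiest to see side by side. A hypothetical sketch, using a plain TypeScript shape for block data:

```typescript
// Determinism in practice: derive values from event/block data only.
type Block = { timestamp: number; number: number };

// BAD: replaying the same block later returns a different value,
// so the indexer cannot reproduce the entity on replay.
function badFirstSeen(): number {
  return Date.now(); // non-deterministic environment read
}

// GOOD: the same block input always yields the same output.
function goodFirstSeen(block: Block): number {
  return block.timestamp;
}

console.log(goodFirstSeen({ timestamp: 1699920000, number: 18500000 })); // 1699920000
```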

A clean handler pattern

A handler should read like a receipt. Load the minimal set of entities, apply the delta, then save. If the logic is complex, move calculations into helpers.

import { BigInt, BigDecimal } from "@graphprotocol/graph-ts"
import { Swap as SwapEvent } from "../generated/templates/PoolTemplate/Pool"
import { Pool, Trade, PoolDaySnapshot } from "../generated/schema"
import { toDecimal } from "../helpers/decimals"
import { dayStartTimestamp } from "../helpers/time"

export function handleSwap(event: SwapEvent): void {
  // Deterministic unique id
  const tradeId = event.transaction.hash.toHex() + "-" + event.logIndex.toString()

  // Load or create pool
  let pool = Pool.load(event.address.toHex())
  if (pool == null) {
    pool = new Pool(event.address.toHex())
    pool.txCount = BigInt.fromI32(0)
    pool.volume0 = BigDecimal.zero()
    pool.volume1 = BigDecimal.zero()
  }

  // Create trade entity
  const trade = new Trade(tradeId)
  trade.pool = pool.id
  trade.sender = event.params.sender
  // NOTE: 18 decimals assumed for brevity; real mappings should use cached token decimals
  trade.amount0 = toDecimal(event.params.amount0, 18)
  trade.amount1 = toDecimal(event.params.amount1, 18)
  trade.timestamp = event.block.timestamp
  trade.blockNumber = event.block.number
  trade.txHash = event.transaction.hash

  // Update aggregates
  pool.txCount = pool.txCount.plus(BigInt.fromI32(1))
  pool.volume0 = pool.volume0.plus(trade.amount0.abs())
  pool.volume1 = pool.volume1.plus(trade.amount1.abs())

  // Update daily snapshot
  const dayStart = dayStartTimestamp(event.block.timestamp)
  const snapId = pool.id + "-" + dayStart.toString()
  let snap = PoolDaySnapshot.load(snapId)
  if (snap == null) {
    snap = new PoolDaySnapshot(snapId)
    snap.pool = pool.id
    snap.dayStartTimestamp = dayStart
    snap.volume0 = BigDecimal.zero()
    snap.volume1 = BigDecimal.zero()
    snap.txCount = BigInt.fromI32(0)
  }
  snap.volume0 = snap.volume0.plus(trade.amount0.abs())
  snap.volume1 = snap.volume1.plus(trade.amount1.abs())
  snap.txCount = snap.txCount.plus(BigInt.fromI32(1))

  // Save in a predictable order
  trade.save()
  snap.save()
  pool.save()
}

Reorg safety: why your totals break and how to stop it

Reorgs are not an edge case. They are a normal property of distributed consensus. A subgraph must behave correctly if the chain replaces the last N blocks.

In practice, reorg-safe indexing is about two habits:

  • Idempotent writes: creating and updating entities in a way that replays cleanly without duplicates.
  • Deterministic identity: entity IDs that do not change across replay.

If you use txHash-logIndex IDs for event entities, then when a block is replayed the event will recreate the same entity ID. If the event disappears due to reorg, the indexer can roll back the write. That is what you want.

Reorg handling is rollback then replay: your mapping logic must produce the same entities when blocks are replayed.

  • Canonical chain head: ... B100, B101, B102, B103.
  • Reorg happens: B102 and B103 are replaced, and their entity writes are rolled back.
  • Replay on new blocks: ... B100, B101, B102', B103'.
  • Success condition: IDs and aggregates remain consistent after rollback and replay. No duplicates, no drift, no "double counted" volume.

Handler types: event handlers, call handlers, block handlers

Most subgraphs are event-driven, and that is usually correct. Events are stable, cheap to process, and designed for consumers. But there are cases where you need call handlers or block handlers.

| Handler type | Best for | Common use | Risk | Mitigation |
| --- | --- | --- | --- | --- |
| Event handlers | Most indexing | Swaps, mints, burns, transfers | Missing data if protocol forgets events | Cross-check with calls in rare cases |
| Call handlers | Non-event state changes | Configuration reads at init, fee tier logic | More complexity, can be missed if calls revert | Use selectively, keep logic minimal |
| Block handlers | Periodic snapshots | Hourly metrics, moving averages | High cost if too frequent | Use interval, store only what you need |

Dynamic data sources: factory patterns and templates

Many protocols create new contracts over time, like pools created by a factory. If you hardcode every pool address in the manifest, you lose.

Templates solve this. You listen to the factory event, then create a new data source instance for the new contract address. From that point, your mapping handlers can index events from the newly created contract.

import { BigInt } from "@graphprotocol/graph-ts"
import { PoolCreated } from "../generated/Factory/Factory"
import { Pool } from "../generated/schema"
import { PoolTemplate } from "../generated/templates"

export function handlePoolCreated(event: PoolCreated): void {
  const poolAddress = event.params.pool
  // Create a new data source for the pool
  PoolTemplate.create(poolAddress)

  // Persist pool entity metadata
  const pool = new Pool(poolAddress.toHex())
  pool.token0 = event.params.token0
  pool.token1 = event.params.token1
  pool.createdAt = event.block.timestamp
  pool.createdBlock = event.block.number
  pool.txCount = BigInt.fromI32(0)
  pool.save()
}

Decimals and math: avoid subtle errors that destroy trust

Most dashboard trust issues are not advanced. They are decimals. Token amounts in events are usually integers in the smallest unit. If you forget to scale by decimals, your volume is wrong by orders of magnitude.

Practical approach:

  • store raw amounts when you need exact precision
  • store derived decimals for display and analytics
  • cache token decimals on first encounter

import { BigInt, BigDecimal } from "@graphprotocol/graph-ts"

export function toDecimal(value: BigInt, decimals: i32): BigDecimal {
  const precision = BigInt.fromI32(10).pow(u8(decimals))
  return value.toBigDecimal().div(precision.toBigDecimal())
}
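For intuition, here is a hypothetical TypeScript analogue of that helper you can run locally. Real mappings should stay in graph-ts BigDecimal; this version converts to a JS number at the end, which loses precision for very large amounts:

```typescript
// Illustrative analogue of toDecimal: scale a raw integer amount by 10^decimals.
function toDecimal(raw: bigint, decimals: number): number {
  let precision = BigInt(1);
  for (let i = 0; i < decimals; i++) {
    precision *= BigInt(10); // build 10^decimals without exponentiation
  }
  const whole = raw / precision; // integer part (truncates toward zero)
  const frac = raw % precision;  // remainder in smallest units (same sign as raw)
  return Number(whole) + Number(frac) / Number(precision);
}

// A USDC-style amount with 6 decimals: 1,500,000 raw units is 1.5 tokens
console.log(toDecimal(BigInt(1500000), 6)); // 1.5
```

Forgetting this step on a 6-decimal token inflates every number by a factor of a million, which is exactly the "volume wrong by orders of magnitude" failure above.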

GraphQL querying: how to query without burning yourself

GraphQL is powerful because the client requests exactly what it needs. But it can still be misused. Querying strategy determines UI latency.

The most common performance trap is large pagination using skip. skip works, but it becomes slow on large datasets because the system must walk over skipped rows.

Cursor pagination: the default for scalable lists

Cursor pagination in The Graph usually means ordering by a stable field (often id) and using id_gt to fetch the next page.

query Trades($pool: String!, $cursor: String!) {
  trades(
    where: { pool: $pool, id_gt: $cursor }
    orderBy: id
    orderDirection: asc
    first: 200
  ) {
    id
    amount0
    amount1
    timestamp
    txHash
  }
}
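On the client side, that query becomes a loop: fetch a page, remember the last id, ask for everything greater than it. A sketch with an in-memory stand-in for the GraphQL request (the `fetchPage` function is a mock, not a client library API):

```typescript
// Cursor pagination loop over subgraph-style data, simulated in memory.
type Trade = { id: string; amount0: string };

const ALL_TRADES: Trade[] = [
  { id: "0xaaa-0", amount0: "1.0" },
  { id: "0xaaa-1", amount0: "2.0" },
  { id: "0xbbb-0", amount0: "3.0" },
];

// Simulates: trades(where: { id_gt: $cursor }, orderBy: id, first: $first)
function fetchPage(cursor: string, first: number): Trade[] {
  return ALL_TRADES
    .filter((t) => t.id > cursor)
    .sort((a, b) => (a.id < b.id ? -1 : 1))
    .slice(0, first);
}

function fetchAll(pageSize: number): Trade[] {
  const out: Trade[] = [];
  let cursor = ""; // empty string sorts before every id
  for (;;) {
    const page = fetchPage(cursor, pageSize);
    out.push(...page);
    if (page.length < pageSize) break;   // short page means we reached the end
    cursor = page[page.length - 1].id;   // advance cursor to the last id seen
  }
  return out;
}

console.log(fetchAll(2).length); // 3
```

Unlike skip-based paging, each request does the same amount of work no matter how deep into the list you are.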

Filtering: pick fields that match how you search

Your schema should expose fields that support filters your UI will use. For example:

  • filter trades by pool
  • filter trades by user
  • filter snapshots by pool and dayStartTimestamp
  • filter users by firstSeen timestamp range (if you store it)

GraphQL anti-patterns that create slow pages

Query anti-patterns to avoid

  • deep nested relationships on large lists, especially when each item pulls its own nested list
  • using large skip values as “page number” pagination
  • query-time aggregation like summing a list of trades to compute volume
  • fetching huge fields you do not render
  • sorting by non-stable fields when you need consistent pagination

Snapshots: the secret weapon for fast dashboards

If your dashboard displays volume charts, transaction counts, fees, or unique traders per day, do not compute those at query time. Compute them during indexing and store them as snapshot entities.

Snapshot design is straightforward:

  • choose a time bucket (hour, day)
  • derive a deterministic bucket key (day start timestamp)
  • store the aggregate metrics you want to render
  • update them in event handlers

This transforms “chart needs 30 days of volume” into “query 30 snapshot entities,” which is stable and fast.
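The bucket key itself is just integer math on the block timestamp. A minimal sketch of the day-bucket derivation:

```typescript
// Deterministic day buckets: floor the block timestamp to UTC midnight.
const SECONDS_PER_DAY = 86400;

function dayStartTimestamp(ts: number): number {
  return ts - (ts % SECONDS_PER_DAY);
}

function snapshotId(poolId: string, ts: number): string {
  return poolId + "-" + dayStartTimestamp(ts).toString();
}

// Two events in the same UTC day share one bucket, so they update one snapshot
const a = dayStartTimestamp(1699920050);
const b = dayStartTimestamp(1699999999);
console.log(a === b); // true
```

Because the key depends only on the timestamp, replayed blocks hit the same snapshot entity, keeping charts stable across reorgs.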

Production is not optional: monitoring, testing, and incident playbooks

If a subgraph backs a real product, you must treat it like a service. That means you need visibility into:

  • sync height (how far behind head you are)
  • handler error rates and exceptions
  • query error rate and latency
  • reindex time (how long it takes to rebuild)
  • data quality checks (sanity checks for totals and counts)

Testing: matchstick and deterministic fixtures

Testing matters because subtle logic errors can persist silently. A good test suite:

  • builds mock events with realistic inputs
  • runs handlers
  • asserts entity fields and aggregate changes
  • tests edge cases like “first event creates entity”
  • tests idempotency by replaying events
// Test idea (conceptual)
1) create PoolCreated mock event
2) call handlePoolCreated
3) assert Pool entity exists with correct token fields
4) create Swap mock event
5) call handleSwap
6) assert Trade entity created with txHash-logIndex id
7) assert Pool aggregates updated
8) replay Swap event
9) assert no duplicate and aggregates behave as expected

Ops checklist: what you should have before calling it production

| Area | What "good" looks like | What breaks if ignored |
| --- | --- | --- |
| Schema | Entities match product questions, snapshots exist for charts | Slow UI, inconsistent metrics |
| ID strategy | txHash-logIndex for events, stable bucket keys for snapshots | Duplicates and drift after reorgs |
| Mappings | Small handlers, helper functions, deterministic logic | Hard-to-debug data corruption |
| Pagination | Cursor-based, stable ordering fields | Timeouts on large datasets |
| Testing | Fixtures for key events and edge cases, idempotency checks | Silent regressions after refactors |
| Monitoring | Sync lag, handler exceptions, query latency dashboards | Outages without visibility |
| Runbooks | Steps for reindex, stuck sync, contract upgrade, migration | Panic during incidents |

Migrations and upgrades: how to evolve a subgraph without breaking users

Protocols evolve. Contracts upgrade. Event signatures change. You will need to version your subgraph and migrate schemas. The most important mindset is to treat schema changes like API changes.

Practical migration strategies:

  • Additive changes first: add new fields and entities without removing old ones.
  • Backfill via reindex: when fields require historical data, expect a reindex or grafting strategy.
  • Versioned endpoints: keep old subgraph versions alive until clients migrate.
  • Explicit deprecation: mark fields as deprecated in documentation and UI, then remove later.
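For the grafting strategy, the subgraph manifest supports building a new version on top of an existing deployment's data up to a given block. A hedged sketch, where the base deployment ID and block are placeholders:

```yaml
# Sketch: graft the new version onto a previous deployment, then index
# only blocks after `block` with the new mapping logic.
specVersion: 0.0.5
features:
  - grafting
graft:
  base: QmPreviousDeploymentId   # placeholder: deployment id of the old subgraph
  block: 15000000                # placeholder: copy entities up to this block
```

Grafting trades a full reindex for faster iteration, but it also copies forward any data bugs in the base, so reserve it for additive changes.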

Security and data integrity: your index can lie if you let it

Indexing is not only a performance layer. It is also an integrity layer. When users see numbers, they believe them. If your index drifts, trust drops fast.

Integrity hazards:

  • incorrect decimals scaling
  • missing events due to wrong ABI or handler mismatch
  • double counting during reorg issues
  • incomplete indexing due to bad startBlock
  • logic errors in aggregation

Integrity guardrails you should implement

  • Sanity checks: volume should not jump by 1000x without explanation
  • Consistency checks: sum of snapshot volumes should match total volume
  • Replay tests: rerun handlers on fixtures to confirm idempotency
  • Schema constraints: avoid nullable fields for required facts
  • Data quality dashboards: track known invariants
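The consistency check above is cheap to automate. A sketch of the invariant as a standalone function (names are illustrative, and the tolerance is an assumption for decimal rounding):

```typescript
// Data-quality invariant: snapshot volumes must sum to the pool's running total.
type Snapshot = { volume: number };

function checkPoolInvariant(totalVolume: number, snaps: Snapshot[]): boolean {
  const summed = snaps.reduce((acc, s) => acc + s.volume, 0);
  return Math.abs(summed - totalVolume) < 1e-9; // small tolerance for rounding
}

console.log(checkPoolInvariant(6, [{ volume: 1 }, { volume: 2 }, { volume: 3 }])); // true
```

Run checks like this on a schedule against the live GraphQL endpoint; a failing invariant is an early warning that totals have drifted.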

Real scenarios: how to model common Web3 products

The easiest way to learn subgraph design is to attach it to real product scenarios. Below are common patterns and how to model them without painting yourself into a corner.

Scenario A: DEX analytics dashboard

Product questions:

  • What is volume per pool per day?
  • Who are the top traders?
  • What is fee revenue estimate?
  • What are the latest swaps?

Model:

  • Pool entity with rolling totals and metadata
  • Trade entity per swap event
  • PoolDaySnapshot entity for charts
  • User entity with derived totals

Critical performance move: compute snapshots in handlers. Do not query all trades to render charts.

Scenario B: NFT marketplace activity

NFT marketplaces often have complex event graphs: listings, bids, cancels, purchases, transfers. The trick is to normalize identities and represent lifecycle states.

  • Listing entity keyed by order hash or marketplace ID
  • Bid entity keyed similarly
  • Sale entity keyed by txHash-logIndex
  • NFT entity keyed by contract + tokenId
  • User entity with buy and sell stats

Lifecycle modeling rule: keep “stateful” entities like Listing up to date, but store immutable facts like Sale as separate entities.

Scenario C: lending positions and health factors

Lending protocols generate events for deposits, borrows, repays, liquidations. The index should represent positions and their changes over time.

  • Position entity keyed by user + market
  • PositionEvent entities for timeline (deposit, borrow, repay)
  • Market entity for aggregated totals
  • Snapshots for market charts

A warning: health factor is often derived from on-chain prices. If you need near-real-time health factor, you may need to read oracle state. Treat that carefully and do not over-index expensive reads.

How TokenToolHub can use subgraphs in a “scan first” workflow

Indexing is a force multiplier for safety tooling. If you already believe in “scan first,” a subgraph can help answer questions that raw contract scans alone cannot answer, such as:

  • How concentrated is trading volume across wallets?
  • Are there repeated patterns of buys followed by immediate sells?
  • Does volume spike correlate with a small cluster of addresses?
  • How fast does liquidity move after launches?
  • Are approvals or interactions coming from suspicious sources?

A subgraph does not replace a contract analyzer. It complements it by turning event history into searchable structure. If you want the “scan first” habit for contracts, start with Token Safety Checker. For deeper ecosystem knowledge, your guides can route users from basics to advanced strategies without overwhelming them.

A practical playbook: build a subgraph like a professional

If you want one mental model to follow, treat your subgraph as a pipeline with guardrails. Here is the process as pseudocode.

// Subgraph build playbook (mental pseudocode)

define_product_questions()
design_entities_for_reads()

for each required_contract:
  identify_events()
  set_start_block()

choose_id_strategies()
add_snapshots_for_charts()

implement_handlers():
  load_or_create_entities()
  apply_deltas()
  write_snapshots()
  save_in_stable_order()

add_tests():
  create_mock_events()
  run_handlers()
  assert_entities()
  replay_and_check_idempotency()

validate_queries():
  use_cursor_pagination()
  avoid_skip_for_large_lists()
  avoid_query_time_aggregation()

deploy():
  monitor_sync_lag()
  alert_on_handler_errors()
  document_runbooks()
  version_for_upgrades()

Common mistakes and how to avoid them

Most subgraph failures follow a predictable pattern. Fixing them early saves weeks.

  • Modeling logs instead of product: you end up with a schema nobody wants to query.
  • No snapshots: your charts require expensive queries and feel slow.
  • Bad IDs: duplicates and drift show up after reorgs.
  • Large skip pagination: the UI becomes slow as dataset grows.
  • Decimals ignored: volume and prices become meaningless.
  • No tests: refactors break totals without anyone noticing.
  • No monitoring: a stuck index becomes a silent outage.

Do you always need The Graph?

Not always. If your product is small, a simple backend that scans logs might be enough. If you need complex analytics, multi-entity relationships, and stable GraphQL queries, a subgraph model is often worth it.

A practical decision rule:

  • If your UI is mostly “read state now,” RPC reads may be enough.
  • If your UI is “show history, rankings, charts,” you need indexing.
  • If you need both, you often use both: RPC reads for live state and subgraph for history and aggregates.

FAQs

What is a subgraph in simple terms?

A subgraph is an indexing definition that listens to blockchain events and transforms them into structured entities that you can query via GraphQL.

Why is txHash-logIndex the best ID for events?

Because it is deterministic and unique per log. Replays and reorgs can recreate the same entity identity without duplicates.

Why are snapshots important?

Snapshots precompute aggregates like daily volume and transaction count, so charts and dashboards can load fast without scanning trades at query time.

What is the biggest GraphQL mistake in subgraph-based apps?

Using skip-based pagination at scale and doing aggregation in queries. Cursor pagination and snapshot entities solve most performance pain.

How do I keep a subgraph reliable over time?

Versioning, testing, monitoring, and disciplined migrations. Treat it like a backend service with runbooks and data quality checks.


Closing reminder: indexing is not optional plumbing. It is product infrastructure. If you want dashboards that feel instant and numbers that users trust, model for reads, enforce determinism, add snapshots, paginate with cursors, and operate your subgraph like a real backend service.

About the author: Wisdom Uche Ijika
Founder @TokenToolHub | Web3 Technical Researcher, Token Security & On-Chain Intelligence | Helping traders and investors identify smart contract risks before interacting with tokens