RPCs and Nodes (Clients, Reliability, Security)

RPCs and Nodes: Accessing the Chain Reliably and Safely

Node types, JSON RPC patterns, reliability architectures, and production grade security practices for real dapps.

TL;DR:
Pick deliberately between public endpoints, hosted providers, and self run nodes (full or archival). Build resiliency with multi provider failover or quorum RPC. Build safety with chain checks, simulation, finalized block tags, and private transaction routes. Monitor head lag, error codes, and reorg depth. Keep a runbook for desyncs, pruning, and rapid client upgrades.

1) Node types and roles

An RPC endpoint is your window into a blockchain. Under the hood sits a node that synchronizes blocks, verifies state transitions, and exposes APIs like JSON RPC and WebSocket. Choosing the right node class determines what answers you can serve and how much it costs to run.

  • Full node serves the current state at or near head. It verifies blocks and transactions and can answer most reads, event subscriptions, and normal write flows. On Ethereum style networks post Merge you run an execution client plus a consensus client.
  • Pruned node is like a full node that discards some historical state to save disk. It can still answer head reads and historical logs via receipts but cannot reconstruct arbitrary storage for very old blocks.
  • Archival node retains all historical state diffs, allowing point in time storage queries such as “what did this mapping look like at block N”. Essential for indexers, research, and deep debugging. Costs a lot more in disk and IOPS.
  • Light client verifies headers and proofs with very low resource use. Great for mobile and embedded contexts and for L2s that provide succinct validity proofs. Light clients reduce trust without full synchronization.
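A quick way to check which class an endpoint actually is: ask for state at a very old block. This is a hedged sketch, assuming that a pruned or full node returns an error such as "missing trie node" for deep historical state while an archival node answers; the endpoint URL and zero address are placeholders.

```typescript
// Probe for archival capability: request a balance at block 1. Pruned and
// full nodes typically error on deep historical state; archival nodes answer.
const ZERO_ADDR = "0x0000000000000000000000000000000000000000"

function archivalProbePayload(address: string, blockHex: string) {
  return { jsonrpc: "2.0", id: 1, method: "eth_getBalance", params: [address, blockHex] }
}

async function isArchival(rpcUrl: string): Promise<boolean> {
  const res = await fetch(rpcUrl, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify(archivalProbePayload(ZERO_ADDR, "0x1")), // block 1
  })
  const json = await res.json()
  return json.error === undefined // an error here usually means pruned state
}
```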

Sync strategies. Fast or snap sync brings a node online quickly by checkpointing recent state and then backfilling. Full archival sync replays the entire chain. Plan disk growth, IOPS, and snapshot or restore procedures from day one. If you operate multiple networks, track per chain retention and add monitoring for disk watermarks.

Endpoint forms. Most providers expose HTTP JSON RPC for request or response, WebSocket for subscriptions, and sometimes GraphQL for structured reads. Many add private transaction routes for MEV protection. Always treat provider specific features as optional and design a graceful fallback route.

Client diversity: When self running, prefer two different client implementations where possible. Diversity reduces common mode failure from client bugs and improves quorum designs.

Hardware and topology notes

  • Execution clients like Geth and Nethermind respond quickly to head reads. Erigon is popular for archival nodes thanks to a storage layout tuned for fast historical queries.
  • Consensus clients like Lighthouse and Prysm handle beacon duties such as block gossip and attestations. You pair one execution client with one consensus client post Merge, even if you are not running a validator.
  • Disk matters. Use NVMe with high write endurance for archival nodes. Pruned nodes can tolerate cheaper disks but still benefit from NVMe during sync and compaction.
  • Networking matters. If you serve global users, place stateless RPC frontends in multiple regions and route to the nearest healthy backend. Keep a warm standby in another region for failover.

2) JSON RPC usage patterns

JSON RPC is a simple protocol: send a method and params, receive a result or an error. Correct usage avoids subtle bugs during congestion and reorgs. The patterns below cover reads, writes, gas, filters, and subscriptions.

  • Read safely with block tags. Prefer explicit blockTag arguments where supported. latest is fast but can reorg. safe and finalized (on chains that support them) trade latency for certainty. For balances and storage use eth_getBalance and eth_call with a tag.
  • Pre flight writes. Simulate with eth_call using the same nonce, gas, and value you intend to broadcast. Check revert reasons. Compare simulation across more than one provider when value at risk is high.
  • Gas estimation. eth_estimateGas returns a minimal bound under current state and mempool conditions. Add a safety buffer and always set EIP 1559 fields maxFeePerGas and maxPriorityFeePerGas. Cap retries with backoff during spikes.
  • Batch reads. Batch small groups of homogeneous calls to reduce latency and rate pressure. Avoid mega batches that lead to single packet failures. Chunk by 50 to 200 calls depending on provider guidance.
  • Filters and logs. eth_newFilter and eth_getFilterChanges stream new logs but are ephemeral. eth_getLogs with block ranges is more portable for backfills. For large windows, page by block spans and indexed topics and respect provider caps.
  • Network identity. Verify eth_chainId on connect and sign with chain aware transactions to avoid accidental broadcasts on the wrong network.
  • WebSocket versus HTTP. WebSocket shines for subscriptions like newHeads and logs, but you need reconnect and replay logic. For mission critical writes, consider private routes over HTTP that bypass the public mempool.
# Read a balance at a finalized block with curl (the result is hex encoded wei)
curl -s https://rpc.example.org \
  -H 'content-type: application/json' \
  -d '{
    "jsonrpc":"2.0",
    "id":1,
    "method":"eth_getBalance",
    "params":["0xYourAddress", "finalized"]
  }'
# Simulate a call with the same nonce, fees, and calldata you plan to sign
# (the payload must be valid JSON, so keep comments out of the body)
curl -s https://rpc.example.org \
  -H 'content-type: application/json' \
  -d '{
    "jsonrpc":"2.0",
    "id":1,
    "method":"eth_call",
    "params":[
      {
        "from":"0xSender",
        "to":"0xContract",
        "data":"0xa9059cbb...",
        "nonce":"0x1a",
        "maxFeePerGas":"0x59682f00",
        "maxPriorityFeePerGas":"0x3b9aca00"
      },
      "latest"
    ]
  }'
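The batching advice above can be sketched as a plain JSON RPC batch: send an array of request objects and match responses by id, since the spec does not guarantee response order. The endpoint URL and addresses are placeholders.

```typescript
// JSON RPC batch: an array of requests in, an array of responses out,
// matched by "id" rather than by position.
type RpcRequest = { jsonrpc: "2.0"; id: number; method: string; params: unknown[] }

function buildBalanceBatch(addresses: string[], tag = "finalized"): RpcRequest[] {
  return addresses.map((addr, i) => ({
    jsonrpc: "2.0", id: i, method: "eth_getBalance", params: [addr, tag],
  }))
}

async function batchCall(url: string, reqs: RpcRequest[]) {
  const res = await fetch(url, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify(reqs),
  })
  const out = await res.json() as { id: number; result?: string; error?: unknown }[]
  // Re-order by id so results line up with the input addresses
  return [...out].sort((a, b) => a.id - b.id)
}
```

Keep batches in the 50 to 200 range the text suggests; one oversized batch turns a single dropped packet into total failure.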

For event scans, design your own pagination. Fetch logs in block windows and expand or shrink window size based on response payload and provider limits. De duplicate by transaction hash and log index and persist a cursor so you can resume mid scan after a crash.

# getLogs with block window pagination: fromBlock and toBlock bound the
# window, and the first topic filters for the Swap event signature
{
  "jsonrpc": "2.0",
  "id": 42,
  "method": "eth_getLogs",
  "params": [{
    "fromBlock": "0xB71B00",
    "toBlock":   "0xB71FFF",
    "address": "0xPoolAddress",
    "topics": [
      "0xd78ad95fa46c994b6551d0da85fc275fe613ce37657fb8d5e3e..."
    ]
  }]
}
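The de duplication and cursor advice can be sketched with an in memory store; a real scanner would persist both `seen` and `cursor` durably so it can resume after a crash.

```typescript
// De-duplicate logs by (transactionHash, logIndex) and track a resumable
// cursor at the highest block fully processed.
type Log = { transactionHash: string; logIndex: string; blockNumber: string }

const seen = new Set<string>()
let cursor = 0 // last fully processed block number

function ingest(logs: Log[]): Log[] {
  const fresh: Log[] = []
  for (const log of logs) {
    const key = `${log.transactionHash}:${log.logIndex}`
    if (seen.has(key)) continue // skip duplicates from overlapping windows
    seen.add(key)
    fresh.push(log)
    cursor = Math.max(cursor, parseInt(log.blockNumber, 16))
  }
  return fresh
}
```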

3) Reliability and performance

Treat your RPC layer like a database dependency. Plan for failover, cold starts, and provider quirks. Two patterns dominate: multi provider failover and quorum reads.

  • Multi provider design. Keep credentials for at least two providers plus your own node if possible. Implement health probes based on eth_blockNumber drift, error rate, and latency. Fail fast to a healthy peer and return to the primary after a cooldown.
  • Quorum reads. For critical reads, query N providers and require M of N agreement within a tolerance such as one block or small numeric deltas. This guards against transient forks and stale caches.
  • Caching. Cache block scoped reads keyed by method, params, and block number. Invalidate when a new head arrives. Do not cache mempool queries.
  • Head and history split. Use a fast hosted endpoint for head reads and writes, and your own archival stack for historical scans and analytics. This keeps hot paths snappy while controlling cost.
  • Idempotency for writes. If you retry eth_sendRawTransaction, resend the exact same signed payload. For replacement transactions, bump fees with the same nonce deliberately and monitor acceptance to avoid accidental nonce gaps.
  • Region awareness. Place RPC frontends near your users or sequencers for L2s. Cross region round trips can exceed block time and increase inclusion variance.
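The block scoped caching pattern above might look like this minimal in memory sketch: key by method, params, and block number, and clear everything when a new head arrives.

```typescript
// Block-scoped read cache: entries are only valid for the head they were
// fetched at, so a new head invalidates the whole cache.
const cache = new Map<string, unknown>()
let cachedHead = 0

function cacheKey(method: string, params: unknown[], block: number) {
  return `${block}:${method}:${JSON.stringify(params)}`
}

function onNewHead(head: number) {
  if (head > cachedHead) { cache.clear(); cachedHead = head }
}

function getCached(method: string, params: unknown[]) {
  return cache.get(cacheKey(method, params, cachedHead))
}

function putCached(method: string, params: unknown[], value: unknown) {
  cache.set(cacheKey(method, params, cachedHead), value)
}
```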
# Pseudocode for quorum reads
async function quorumCall(request, providers, agreeBy) {
  const outs = await Promise.allSettled(providers.map(p => p(request)))
  const ok = outs.filter(o => o.status === "fulfilled").map(o => o.value)
  // Compare by blockNumber or value difference within agreeBy tolerance
  // Return median or majority; mark outliers for health checks
}
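The multi provider failover pattern can be sketched more concretely: probe each endpoint's head via eth_blockNumber and route to the first one within a lag tolerance. The provider shape and MAX_LAG threshold are illustrative, not a standard API.

```typescript
// Failover by head lag: a provider that errors or trails the best-known
// head by more than MAX_LAG blocks is skipped.
type Provider = { url: string; call: (method: string, params: unknown[]) => Promise<unknown> }

const MAX_LAG = 3 // blocks a provider may trail the best-known head

async function pickHealthy(providers: Provider[]): Promise<Provider> {
  const heads = await Promise.all(providers.map(async p => {
    try { return parseInt(await p.call("eth_blockNumber", []) as string, 16) }
    catch { return -1 } // a failed probe disqualifies the provider
  }))
  const best = Math.max(...heads)
  const idx = heads.findIndex(h => h >= 0 && best - h <= MAX_LAG)
  if (idx < 0) throw new Error("no healthy provider")
  return providers[idx]
}
```

Run this probe on a timer, fail fast to the picked peer, and return to your primary after a cooldown as described above.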
Pagination hygiene: For large historical scans, use adaptive windows. Start with a span of about 4,000 blocks, shrink when responses are big or slow, and expand when small. Persist your last successful end block to resume quickly after restarts.

4) Security and MEV considerations

RPC mistakes cause real losses. Harden both key handling and transaction flow. Think in layers: identity, simulation, broadcast path, and post trade checks.

  • Identity and chain checks. On app startup verify eth_chainId and probe a known canary contract or storage slot. Refuse to operate if anything mismatches.
  • Simulation before broadcast. Simulate with the exact nonce and gas you will use. Capture revert strings and surface them to users. Compare simulation across providers when funds at risk are high.
  • Private transactions. Reduce frontrunning by sending through private or MEV protected routes where available. These bypass the public mempool and deliver directly to block builders or relays. Expect tradeoffs in transparency and inclusion guarantees.
  • Key hygiene. Never expose private keys in a browser. Sign locally with hardware or a trusted signer and transmit only signed payloads. If you operate a backend signer, place it behind strict access controls and consider quorum approvals for large transfers.
  • RPC credential controls. Lock provider API keys to referrer domains and IP ranges. Rotate frequently. Monitor for unexpected origins and burst patterns, which are symptoms of leaked keys.
  • Finality awareness. Label pre final states as pending. For balances and fills, recalc after a safe depth. Trigger compensation logic if fills are reorged out.
Replacement rules: When replacing a pending transaction, reuse the same nonce and raise maxFeePerGas and maxPriorityFeePerGas above the previous values by a sufficient delta. Track acceptance and stop once one hash is in a block.
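The replacement rule reduces to a small fee calculation. Geth's default policy requires roughly a 10 percent bump on both fields before it accepts a same nonce replacement; the 12.5 percent cushion in this sketch is an illustrative choice, not a protocol constant.

```typescript
// Compute replacement fees for a same-nonce bump: raise both
// maxFeePerGas and maxPriorityFeePerGas above the previous attempt.
function bumpFees(prevMaxFee: bigint, prevPriority: bigint) {
  const bump = (v: bigint) => v + (v * 125n) / 1000n // +12.5% cushion
  return { maxFeePerGas: bump(prevMaxFee), maxPriorityFeePerGas: bump(prevPriority) }
}
```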

5) Operations and monitoring

Observability converts unknown risk into managed risk. Decide what “healthy” means in numbers and alert before users notice issues.

  • Health signals: head lag relative to a trusted reference, error code mix such as throttles and timeouts, method latency percentiles, WebSocket subscription drop rate, and resource pressure on self run nodes.
  • Reorg and fork alerts: track reorg depth and frequency. Investigate spikes. Protect user visible state with safe or finalized reads where possible.
  • Logs and indexing: chunk block ranges, checkpoint progress, and handle provider page limits gracefully. Persist cursors so scans can resume mid window after a crash.
  • Runbooks: desync recovery steps, disk growth mitigation with pruning or snapshot restore, database repair, and client upgrade procedures with rollback. Practice with game day drills.
  • Capacity planning: measure read and write QPS, peak concurrency, and typical batch sizes. Scale horizontally with more stateless frontends and vertically with faster disks for archival backends. Keep a warm standby region.
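The head lag signal above can be computed with two eth_blockNumber calls, one against a trusted reference and one against the endpoint under test. URLs and the alert threshold are placeholders.

```typescript
// Head-lag health check: how far does a target endpoint trail a trusted
// reference, and is that within budget?
async function headNumber(url: string): Promise<number> {
  const res = await fetch(url, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ jsonrpc: "2.0", id: 1, method: "eth_blockNumber", params: [] }),
  })
  const { result } = await res.json()
  return parseInt(result, 16)
}

function lagStatus(referenceHead: number, targetHead: number, threshold = 5) {
  const lag = referenceHead - targetHead
  return { lag, healthy: lag <= threshold } // negative lag: target is ahead
}
```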

Self run basics: sample service files and commands

# Example geth full node flags (concept)
geth \
  --http --http.addr 0.0.0.0 --http.vhosts="*" --http.api eth,net,web3 \
  --ws --ws.addr 0.0.0.0 --ws.origins="*" --ws.api eth,net,web3 \
  --syncmode snap \
  --cache 4096 \
  --authrpc.addr 127.0.0.1 \
  --authrpc.port 8551 \
  --metrics

# Erigon archival example (concept). Erigon runs as an archive node by
# default; adding --prune flags would discard history, so omit them here.
erigon \
  --chain mainnet \
  --http --http.addr 0.0.0.0 --http.api eth,debug,trace,erigon \
  --ws \
  --torrent.download.rate=0 \
  --metrics
Exposure caution: Never expose admin or debug namespaces to the public internet. Bind them to localhost or a private network segment and protect with access control lists.

6) Recipes and code snippets

The following short recipes illustrate safe patterns for common tasks: simulate and send, EIP 1559 fee selection, WebSocket subscriptions with backoff, logs scanning, and quorum RPC.

Simulate and send with EIP 1559 using TypeScript

import { JsonRpcProvider, Wallet, parseUnits } from "ethers"

if (!process.env.RPC_URL || !process.env.PRIVATE_KEY) throw new Error("Set RPC_URL and PRIVATE_KEY")

const provider = new JsonRpcProvider(process.env.RPC_URL)
const wallet = new Wallet(process.env.PRIVATE_KEY, provider)

async function sendSafe(tx) {
  // Check chain identity
  const chainId = (await provider.getNetwork()).chainId
  if (chainId !== BigInt(1)) throw new Error("Unexpected chain")

  // Build the transaction
  const nonce = await provider.getTransactionCount(wallet.address, "latest")
  const base = await provider.getBlock("latest")
  const maxPriorityFeePerGas = parseUnits("1.5", "gwei")
  const maxFeePerGas = base && base.baseFeePerGas
    ? base.baseFeePerGas * BigInt(2) + maxPriorityFeePerGas
    : parseUnits("30", "gwei")

  const req = {
    to: tx.to,
    data: tx.data,
    value: tx.value ?? 0n,
    nonce,
    maxPriorityFeePerGas,
    maxFeePerGas,
    type: 2
  }

  // Pre flight simulation
  await provider.call({ ...req, from: wallet.address })

  // Send
  const sent = await wallet.sendTransaction(req)
  console.log("Submitted", sent.hash)

  // Wait for a confirmation; use more for user visible accounting
  const receipt = await sent.wait(1)
  if (!receipt) throw new Error("Transaction dropped or replaced")
  console.log("Mined in block", receipt.blockNumber)
  return receipt
}

Python logs scanner with adaptive windows

from web3 import Web3
w3 = Web3(Web3.HTTPProvider("https://rpc.example.org"))

POOL = Web3.to_checksum_address("0xPool")
TOPIC_SWAP = "0xd78ad95fa46c994b6551d0da85fc275fe613ce37657fb8d5e3e..."

import time

def scan(start, end, step=4000):
    cur = start
    retries = 0
    while cur <= end:
        to = min(cur + step - 1, end)  # inclusive window of at most `step` blocks
        try:
            logs = w3.eth.get_logs({
                "fromBlock": cur,
                "toBlock": to,
                "address": POOL,
                "topics": [TOPIC_SWAP]
            })
            for log in logs:
                # process log
                pass
            cur = to + 1
            retries = 0
            if len(logs) < 100:
                step = min(step * 2, 20000)  # grow the window when results are sparse
        except Exception:
            # shrink the window on provider limits, back off, and eventually give up
            step = max(256, step // 2)
            retries += 1
            if retries > 10:
                raise
            time.sleep(1)
    print("done")

WebSocket subscription with reconnect and replay

import WebSocket from "isomorphic-ws"

function subscribeLogs(wsUrl, params) {
  let ws, subId, lastBlock = 0

  function connect() {
    ws = new WebSocket(wsUrl)
    ws.onopen = () => {
      ws.send(JSON.stringify({ jsonrpc: "2.0", id: 1, method: "eth_subscribe", params: ["logs", params] }))
      // Replay: after a reconnect, backfill the gap with eth_getLogs from
      // lastBlock + 1 before trusting the live stream again
    }
    ws.onmessage = (evt) => {
      const msg = JSON.parse(evt.data)
      if (msg.id === 1) { subId = msg.result; return }
      if (msg.method === "eth_subscription" && msg.params) {
        const log = msg.params.result
        lastBlock = parseInt(log.blockNumber, 16)
        // process log
      }
    }
    ws.onclose = () => setTimeout(connect, 1500 + Math.random() * 1000)
    ws.onerror = () => { try { ws.close() } catch {} }
  }
  connect()
}

Quorum RPC facade in TypeScript

async function rpc(url, body) {
  const r = await fetch(url, { method: "POST", headers: { "content-type": "application/json" }, body: JSON.stringify(body) })
  const j = await r.json()
  if (j.error) throw new Error(j.error.message)
  return j.result
}

async function quorumCall(method, params, providers, minAgree = 2) {
  const payload = { jsonrpc:"2.0", id:1, method, params }
  const outs = await Promise.allSettled(providers.map(p => rpc(p, payload)))
  const oks = outs.filter(x => x.status === "fulfilled").map(x => x.value)
  if (oks.length === 0) throw new Error("All providers failed")
  // Majority vote on the serialized result; for numeric reads near head,
  // compare within a block tolerance instead of exact equality
  const counts = new Map()
  for (const v of oks) { const k = JSON.stringify(v); counts.set(k, (counts.get(k) ?? 0) + 1) }
  const [winner, n] = [...counts.entries()].sort((a, b) => b[1] - a[1])[0]
  if (n < minAgree) throw new Error("No quorum among providers")
  return JSON.parse(winner)
}

7) Edge cases and troubleshooting

Real systems hit corner cases. These are the ones you will see most often, along with defensive responses.

  • Nonce too low. You signed with a nonce that is already used. Fetch the current account nonce at the intended block tag and rebuild the transaction. If you have an in flight replacement, wait for inclusion or cancel with a zero value send to self at higher fees using the same nonce.
  • Replacement underpriced. Your fee bump was too small. Bump maxPriorityFeePerGas and potentially maxFeePerGas beyond your previous attempt by a meaningful delta. Respect chain and client rules for replacement thresholds.
  • Intrinsic gas too low. Your gas limit is below the intrinsic cost. Re estimate and add a buffer. Remember that dynamic cost can change mildly between simulation and inclusion.
  • Chain reorgs. A transaction landed and then disappeared from history. Requery state after a safe depth before crediting balances or emitting downstream side effects.
  • Provider drift. Two providers disagree on a state read. Treat it like a soft failure and either fence the operation or use a quorum read before proceeding.
  • Filter not found. You used a long lived eth_newFilter without heartbeats. Recreate the filter or switch to polling with eth_getLogs and block windows.
  • Head stalls. Your node stops advancing. Check peers, disk, and database logs. Restart with safe flags, resync if needed, or fail over to a healthy peer while you repair.
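For the nonce recovery cases above, a cancel transaction can be built as a zero value send to self with the same nonce and bumped fees. This is a sketch of the transaction shape only; signing and broadcast are left to your existing wallet code, and the 12.5 percent bump is an illustrative cushion over typical client replacement thresholds.

```typescript
// Build a cancel transaction for a stuck nonce: zero-value to self, same
// nonce, fees raised above the pending attempt so the replacement is accepted.
function buildCancelTx(self: string, stuckNonce: number,
                       prevMaxFee: bigint, prevPriority: bigint) {
  const bump = (v: bigint) => v + (v * 125n) / 1000n // ~12.5% over the stuck tx
  return {
    to: self, from: self, value: 0n, nonce: stuckNonce, type: 2,
    maxFeePerGas: bump(prevMaxFee),
    maxPriorityFeePerGas: bump(prevPriority),
    gasLimit: 21000n, // plain value transfer
  }
}
```

Track both hashes and stop as soon as either the original or the cancel lands in a block.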

Labeling and user communication

In user interfaces, show pending and confirmed states, include the last synchronized block number, and display a small banner during provider outages. These small details reduce confusion during volatile periods.

Common foot guns: assuming latest is final, retrying a send with a different signed payload, mixing up gwei and wei, trusting a single provider for risk critical reads, and exposing admin RPC namespaces on public interfaces.
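To avoid the unit mixing foot gun in particular, keep fees in bigint wei and convert explicitly; 1 gwei is 10^9 wei.

```typescript
// Explicit gwei/wei conversions with bigint, so no float drift sneaks in.
const GWEI = 1_000_000_000n

const gweiToWei = (g: bigint) => g * GWEI
const weiToGweiFloor = (w: bigint) => w / GWEI // integer floor

const maxFeeWei = gweiToWei(30n) // a 30 gwei max fee expressed in wei
```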

Quick check

  1. When do you need an archival node instead of a pruned or full node?
  2. Why add a chainId check on connect and prefer finalized block tags for some reads?
  3. Name one way to reduce frontrunning risk when submitting transactions.
  4. What is a quorum read and when would you use it?
  5. How would you paginate a large eth_getLogs backfill safely?
Answers:
  • When you must reconstruct historical storage state at past blocks or run deep analytics and backfills beyond logs.
  • ChainId prevents signing for the wrong network and finalized reads reduce reorg risk for user visible accounting.
  • Send through a private transaction route or MEV protected relay so your transaction does not sit in the public mempool.
  • A quorum read queries multiple providers and requires a majority agreement within a small tolerance. Use it for risk critical decisions like liquidation triggers or price reads.
  • Page by block windows with adaptive size, filter by topics, de duplicate by transaction hash and log index, and persist a cursor to resume after interruptions.

Go deeper

  • Concepts: reorg handling and finality on L1 and L2, mempool policies and replacement rules, block building pipelines and proposer or builder separation.
  • Design patterns: quorum RPC for critical reads, head and history split architecture, transaction simulation gates with M of N agreement, chain aware feature flags.
  • Operations: snapshotting and fast restore, cross provider drift dashboards, alert budgets for eth_getLogs latency, and disaster recovery drills for client regressions.

Next: decentralized storage with IPFS, Arweave, and Filecoin.

