AI in Finance & Crypto
Risk & fraud models, time-series forecasting, portfolio thinking, sentiment & news pipelines, on-chain analytics, DeFi risk, governance summarization, and the guardrails you need to avoid costly mistakes.
Where AI fits in finance & crypto
Finance produces structured (tabular) data with clear labels (default/no-default, fraud/no-fraud) and vast time series (prices, volumes, order books). Crypto adds transparent ledgers (on-chain transactions, smart-contract events) and decentralized governance. AI helps in four broad ways: (1) classify & score risk, (2) detect anomalies, (3) forecast & plan, and (4) digest information (summarize, extract, prioritize). The best systems combine ML/DL with rules, controls, and human review.
Risk & fraud models (tabular + graph)
Fraud detection: Classify transactions or flows as risky using features like velocity (frequency/amount spikes), device fingerprint, merchant type, IP/geo, and graph features (shared wallets, counterparties). Techniques include gradient-boosted trees for tabular features and graph neural networks for relational structure (e.g., suspicious clusters). Because fraud is rare, focus on precision-recall trade-offs, cost-based metrics (false positives are expensive), and analyst tooling for triage.
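The tabular side of this can be sketched with a gradient-boosted classifier evaluated by PR-AUC on a rare positive class. This is a toy illustration on synthetic data; the feature names are stand-ins, not a real feature set.

```python
# Hypothetical sketch: score synthetic transactions with a gradient-boosted
# classifier and evaluate with PR-AUC, since fraud labels are rare.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import average_precision_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5000
# Toy features standing in for velocity, amount, geo risk, graph degree.
X = rng.normal(size=(n, 4))
# Rare positive class driven by the first two features.
logits = 1.5 * X[:, 0] + 1.0 * X[:, 1] - 4.0
y = (rng.random(n) < 1 / (1 + np.exp(-logits))).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
clf = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
scores = clf.predict_proba(X_te)[:, 1]
pr_auc = average_precision_score(y_te, scores)  # PR-AUC (average precision)
print(round(pr_auc, 3))
```

Note that a useless model's PR-AUC sits near the base fraud rate, not near 0.5, which is why the base rate is the right reference point here.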
Credit & counterparty risk: Predict probability of default (PD), loss given default (LGD), and exposure at default (EAD). In crypto markets, “counterparty” may be a centralized exchange, market maker, or a smart contract’s solvency (oracle/lending risk). Simpler, explainable models (logistic regression, GBMs with reason codes) are often preferred by auditors and regulators. Provide cohort metrics and calibration plots, not just a single AUC.
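A minimal calibration check, on made-up data, looks like this: bucket borrowers by predicted PD and compare the mean prediction to the observed default rate in each bucket. The features here are invented stand-ins.

```python
# Hypothetical sketch: a logistic-regression PD model with a simple
# calibration check (predicted vs. observed default rate per decile).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 4000
X = rng.normal(size=(n, 3))  # toy stand-ins for leverage, income, utilization
p_true = 1 / (1 + np.exp(-(1.2 * X[:, 0] - 0.8 * X[:, 1] - 2.0)))
y = (rng.random(n) < p_true).astype(int)

model = LogisticRegression().fit(X, y)
pd_hat = model.predict_proba(X)[:, 1]

# Bucket by predicted PD and compare mean prediction to observed rate.
deciles = np.quantile(pd_hat, np.linspace(0, 1, 11))
for lo, hi in zip(deciles[:-1], deciles[1:]):
    mask = (pd_hat >= lo) & (pd_hat <= hi)
    if mask.any():
        print(f"pred={pd_hat[mask].mean():.3f} obs={y[mask].mean():.3f}")
```

A well-calibrated model shows predicted and observed rates tracking each other across buckets; a single AUC number cannot reveal this.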
Forecasting & time series
Market data are sequences: prices, volumes, spreads, funding rates, implied volatility, liquidations. The baseline approach is to engineer features (returns, moving averages, RSI, order-flow imbalance) and train a model to forecast short-horizon returns or classify regimes (trend, chop, shock). Tree-based models handle tabular features well; sequence models (temporal CNNs, transformers) model long-range dependencies.
- Targets: Direction (up/down), buckets (top/bottom tercile of returns), or continuous next-period return.
- Horizon: Short horizons (minutes/hours) are noisy; aggregate to reduce variance. Longer horizons face regime shifts.
- Features: Price/volume, order-book depth, realized volatility, funding/borrow rates, perp basis, on-chain flows, social/news sentiment.
- Evaluation: Instead of only RMSE, measure information coefficient (correlation with future returns), hit-rate, Sharpe under a simple trading rule, and drawdown.
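The two simplest of these metrics can be computed in a few lines. This sketch uses a synthetic signal with a deliberately weak relationship to future returns, which is realistic for short horizons.

```python
# Hypothetical sketch: evaluate a toy signal with information coefficient
# (rank correlation with next-period returns) and hit-rate, not RMSE.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(2)
n = 1000
signal = rng.normal(size=n)
# Future returns: mostly noise plus a weak dependence on the signal.
future_ret = 0.05 * signal + rng.normal(scale=1.0, size=n)

ic = spearmanr(signal, future_ret).correlation               # information coefficient
hit_rate = np.mean(np.sign(signal) == np.sign(future_ret))   # directional accuracy
print(f"IC={ic:.3f} hit-rate={hit_rate:.3f}")
```

In practice even an IC of a few percent can be tradable if it is stable across time and assets, which is why RMSE alone is a poor yardstick.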
Regime detection: Cluster periods by volatility/liquidity to adapt models or risk. A model that works in calm markets may fail during stress. Detecting “non-stationarity” (the world changes) is as important as predicting a number.
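One common recipe, sketched here on synthetic returns, is to cluster rolling volatility with k-means and treat the higher-volatility cluster as the "stressed" regime.

```python
# Hypothetical sketch: label calm vs. stressed regimes by clustering
# rolling volatility with k-means (k=2) on synthetic returns.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)
# A calm stretch followed by a high-volatility stretch.
returns = np.concatenate([rng.normal(0, 0.01, 500), rng.normal(0, 0.05, 500)])
window = 20
roll_vol = np.array(
    [returns[i - window:i].std() for i in range(window, len(returns))]
)

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(
    roll_vol.reshape(-1, 1)
)
# Map cluster labels so that True = the higher-volatility regime.
high = labels == labels[np.argmax(roll_vol)]
print(f"share flagged as stressed: {high.mean():.2f}")
```

The regime label can then gate model usage: down-weight or pause forecasts trained on calm data when the stressed regime is active.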
Portfolio & execution (practical lens)
Forecasts only matter if you can turn them into decisions. A simple rule is often best: go to cash when model confidence is low; size trades conservatively; cap leverage; set stop-loss and maximum-drawdown limits. Transaction costs, slippage, and fees will erase fragile signals, so build them into backtests. For portfolios, diversify across uncorrelated signals and markets; regularize weights and limit turnover.
- Execution: Use volume-weighted rules or basic participation strategies to avoid excessive market impact on thin pairs.
- Risk: Track VaR/expected shortfall; use circuit-breakers (pause trading) on anomalous inputs or latency spikes.
- Robustness: Stress test with historical crises and synthetic shocks (oracle outages, exchange halts, volatility explosions).
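The point about costs erasing fragile signals can be demonstrated in a naive backtest. This sketch uses random returns and a random long/flat signal (all numbers invented), charging a fee on every position change and tracking maximum drawdown.

```python
# Hypothetical sketch: a naive long/flat backtest that charges fees per
# position change and tracks max drawdown.
import numpy as np

rng = np.random.default_rng(4)
returns = rng.normal(0.0002, 0.01, 1000)      # toy daily returns
signal = np.sign(rng.normal(size=1000))       # toy +1/-1 forecast
position = np.clip(signal, 0, 1)              # long/flat only

fee = 0.001                                   # 10 bps per unit of turnover
turnover = np.abs(np.diff(position, prepend=0.0))
pnl = position * returns - fee * turnover

equity = np.cumprod(1 + pnl)
drawdown = 1 - equity / np.maximum.accumulate(equity)
print(f"final equity={equity[-1]:.3f} max drawdown={drawdown.max():.3f}")
```

Rerunning with `fee = 0.0` shows how much of the gross return the costs consume; a signal that only survives at zero cost is not a signal.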
Sentiment, news & research copilots
Language models can monitor news, reports, and social streams to surface actionable items: earnings changes, regulation updates, exploits, and governance proposals. Build a retrieval index over trusted sources; for each alert, retrieve top passages and generate a grounded summary with links. Track precision (usefulness of alerts), latency, and coverage by topic.
- Entity linking: Map mentions (tickers, token symbols, project names) to canonical entities to avoid symbol collisions.
- Sentiment: Supervised classifiers over finance-specific data often beat generic sentiment. Calibrate to avoid overreacting to sarcasm or memes.
- Policy risk: Summarize proposed laws/regulatory actions and tag affected assets, exchanges, or wallet types.
On-chain analytics & DeFi risk
Public ledgers enable powerful analytics:
- Wallet clustering: Graph clustering and heuristics to infer entities and behavior patterns (market makers, bridges, mixers). Provide confidence scores and allow user overrides.
- Flow monitoring: Detect large/informed flows (bridge inflows, whale transfers, concentrated LP adds/removes). Combine with market context.
- MEV & sandwich risk: Identify patterns around swaps and liquidations; adjust slippage and routing policies in your app.
- Protocol risk scoring: Assess exposure to oracles, collateral concentration, code review status, upgradeability, admin keys, audits, and historical incidents. A simple weighted score, paired with explanations, helps users make informed choices.
- Stablecoin health: Track peg deviations, liquidity depth, and collateral transparency; alert on persistent deviations and oracle lags.
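The "simple weighted score with explanations" idea above can be sketched directly. All factor names, weights, and ratings here are made up for illustration; the point is that the scoring stays transparent and auditable.

```python
# Hypothetical sketch: a transparent weighted protocol-risk score with
# per-factor explanations (factor names and weights are invented).
FACTORS = {
    "oracle_diversity": 0.25,        # multiple independent price feeds?
    "collateral_concentration": 0.25,
    "audit_coverage": 0.20,
    "admin_key_controls": 0.15,      # timelocks, multisig on upgrades
    "incident_history": 0.15,
}

def risk_score(ratings: dict) -> tuple:
    """Combine 0-1 factor ratings (1 = safest) into a score plus reasons."""
    score = sum(FACTORS[k] * ratings[k] for k in FACTORS)
    reasons = [f"{k}: {ratings[k]:.2f} (weight {w})" for k, w in FACTORS.items()]
    return score, reasons

score, reasons = risk_score({
    "oracle_diversity": 0.9,
    "collateral_concentration": 0.4,
    "audit_coverage": 0.8,
    "admin_key_controls": 0.6,
    "incident_history": 1.0,
})
print(f"score={score:.2f}")
for r in reasons:
    print(" ", r)
```

Because every factor and weight is visible, users and auditors can challenge individual inputs rather than trusting an opaque number.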
DAO & governance intelligence
Governance moves fast and is often fragmented across forums and chains. Retrieval-augmented generation (RAG) can unify discussions: index proposals, comments, and vote histories; generate weekly digests with pros/cons, notable voters, and timelines. Add stance detection (support/oppose/neutral) for delegates and projects. For transparency, always include citations to the exact threads and transactions.
Backtesting, leakage & evaluation pitfalls
- Look-ahead bias: Accidentally using information that wasn’t available at decision time (e.g., final OHLC for the bar you’re predicting), or using future on-chain states. Enforce strict time cuts.
- Survivorship bias: Testing only assets that still exist; include delisted/abandoned tokens for realism.
- Data snooping: Over-tuning to the backtest period. Keep a final untouched test set and consider walk-forward evaluation.
- Transaction costs: Simulate fees, slippage, and market impact; add latency; consider halted markets.
- Multiple hypothesis testing: If you try 100 combinations and keep the best, you almost certainly overfit. Penalize model complexity and require out-of-sample confirmation.
- Drift & regime shifts: Monitor live performance; pause or down-weight models during structural breaks (forks, regulatory shocks, oracle incidents).
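The strict time cuts and walk-forward evaluation described above can be enforced mechanically. This sketch generates forward-rolling splits where each test fold only ever follows its training fold in time; the sizes are arbitrary.

```python
# Hypothetical sketch: walk-forward splits with a strict time cut, so each
# test fold is scored by a model fit only on strictly earlier data.
import numpy as np

def walk_forward_splits(n, train_size, test_size):
    """Yield (train_idx, test_idx) pairs moving forward through time."""
    start = 0
    while start + train_size + test_size <= n:
        train = np.arange(start, start + train_size)
        test = np.arange(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size

splits = list(walk_forward_splits(n=100, train_size=40, test_size=20))
for train, test in splits:
    assert train.max() < test.min()  # no look-ahead: train precedes test
print(f"{len(splits)} folds")
```

Running the model separately on each fold and inspecting the spread of fold-level performance also exposes regime sensitivity that a single backtest number hides.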
Compliance, safety & ethics
Even if you’re not a bank, you handle sensitive data and influence financial outcomes. Apply the same discipline as our AI Ethics & Risks chapter: data minimization, explainability, cohort evaluation, human oversight, and incident playbooks. Log decisions. For sanctions/AML screening, treat models as assistive: require human adjudication and retain evidence. Make disclaimers clear in user-facing products; never present model outputs as certainties.
Hands-on projects & roadmaps
- Fraud triage MVP: Build a gradient-boosted model with features from transactions (amount, device, merchant, geo, velocity). UI shows risk score, top features, and related wallet graph. Measure PR-AUC and analyst time saved.
- Signal lab: Create a small pipeline that computes 10 simple signals (momentum, volatility, funding skew, on-chain netflows). Rank assets daily; report hit-rate and drawdown under a naive long-only rule with costs.
- RAG research bot: Index reputable sources (docs, forums, audits, governance). For each query, return a grounded summary with citations and a “confidence” rubric your team defines.
- DeFi risk dashboard: Combine protocol metrics (TVL concentration, oracle sources, upgradeable proxies) into a transparent score with links to evidence and an audit trail.