Lesson 14: LLM Applications in Quant

LLM is the most powerful research assistant, but the worst trader. Let it do what it's good at.


The GPT Trader Fantasy vs Reality

In 2023, a team ran an experiment: let GPT-4 make trading decisions directly.

They gave GPT-4:

  • Past 20 days of price data
  • Latest financial news
  • Current positions

Then asked: "Should we buy, sell, or hold?"

Results after 3 months:

MetricGPT-4 StrategyRandom StrategyS&P 500
Return-12%-3%+8%
Sharpe Ratio-0.8-0.20.5
Max Drawdown25%18%10%

GPT-4 even underperformed a random strategy.

Why did it fail?

  1. Hallucination problem: GPT-4 makes up non-existent "market patterns"
  2. Weak math ability: Cannot accurately calculate returns, volatility
  3. No memory: Each call is independent, cannot accumulate experience
  4. High latency: One inference takes seconds, misses fast-moving markets
  5. Not reproducible: Same input may produce different output

But during the same period, teams using GPT-4 for earnings analysis reported:

  • Earnings interpretation quality improved 3x
  • Research report generation time dropped from 4 hours to 30 minutes
  • News sentiment analysis accuracy 85%

Conclusion: LLM isn't suitable for direct trading, but is powerful for research and analysis.


14.1 The Right Role for LLM

What Can LLM Do?

CapabilityQuant ApplicationValue Rating
Text understandingEarnings interpretation, news analysis5/5
Information extractionExtract key metrics from 10-K5/5
Report generationStrategy diagnostic reports, postmortem summaries4/5
Code generationAssist writing backtest code4/5
Knowledge Q&AExplain financial concepts3/5

What Can't LLM Do?

AttemptWhy It FailsAlternative
Direct price predictionCannot access real-time data, inaccurate calculationsTraditional ML models
Real-time trading decisionsToo much latency, not reproducibleRules + ML systems
Precise calculationsHigh error rate on large numbersDedicated calculation modules
Long-term memoryLimited context, cannot accumulateExternal knowledge bases

Core Principle: LLM as Enhancement Layer

LLM Architecture Positioning

14.2 News and Sentiment Analysis

Application Scenarios

Extract trading-relevant information from financial news:

News TypeExtracted ContentQuant Value
Earnings releasesBeat/miss expectations, guidance changesEvent-driven signals
Product newsNew launches, recalls, partnershipsFundamental changes
Macro policyRate decisions, regulatory changesSystemic risk
Analyst opinionsRating changes, price target adjustmentsSentiment indicators

Sentiment Analysis Example

Input News:

Apple reported quarterly revenue of $123 billion, beating Wall Street expectations of $118 billion. However, iPhone sales in China declined 15% year-over-year due to increased competition from Huawei.

LLM Extraction:

{
  "company": "AAPL",
  "event_type": "earnings",
  "sentiment_overall": 0.6,
  "key_points": [
    {"topic": "revenue", "sentiment": 0.8, "value": "beat by 4%"},
    {"topic": "china_iphone", "sentiment": -0.7, "value": "-15% YoY"}
  ],
  "trading_signal": "mixed",
  "confidence": 0.7
}

How to Use:

  • Don't trade directly, use as one input feature for signals
  • Combine with price momentum, technical indicators
  • Use historical data to validate LLM sentiment's predictive power (backtest)

Prompt Engineering Key Points

Bad prompt:

Is this news bullish or bearish?

Good prompt:

You are a financial analyst. Please analyze the following news:
[News content]

Please output in the following format:
1. Company/asset involved (ticker symbol)
2. Event type (earnings/product/policy/other)
3. Overall sentiment score (-1 to 1, -1 extremely negative, 1 extremely positive)
4. 3 key information points, each with topic, sentiment score, specific value
5. Possible short-term price impact (up/down/neutral)
6. Confidence (0 to 1)

Output JSON only, no explanation.

14.3 Earnings Report Automation

10-K/10-Q Information Extraction

US public company annual reports (10-K) and quarterly reports (10-Q) contain massive information, but typically exceed 100 pages.

LLM can automatically extract:

Extracted ContentLocationUse
Revenue growthFinancial StatementsFundamental screening
Gross margin changeMD&AProfitability trend
Risk factor changesRisk FactorsRisk warning
Management discussion toneMD&ASentiment indicator
Litigation progressLegal ProceedingsEvent risk

Structured Extraction Process

+-------------------+
|  Download 10-K PDF |
+--------+----------+
         |
         v
+-------------------+
|   PDF -> Text     |
+--------+----------+
         |
         v
+-------------------+
| Segment (by section) |
+--------+----------+
         |
         v
+-------------------+
| LLM extract per segment |  <- Process each segment separately to avoid context overflow
+--------+----------+
         |
         v
+-------------------+
| Store structured data | -> Database for backtesting
+-------------------+

Earnings Tone Change Detection

Compare MD&A tone changes between consecutive earnings reports:

Previous StatementCurrent StatementSignal
"We expect growth""We remain cautiously optimistic"Tone weakening
"Facing challenges""Facing significant challenges"Risk intensifying
"Exploring""Have initiated"Progress accelerating

Note: Tone change is a relative indicator - needs historical comparison to be meaningful.


14.4 Research Report Generation

Strategy Diagnostic Report

Let LLM help write postmortem reports:

Input:

Strategy name: Momentum Strategy v2.3
Time range: 2024-01-01 to 2024-03-31
Performance data:
  - Return: +8.5%
  - Benchmark return: +12.0%
  - Max drawdown: -6.2%
  - Sharpe ratio: 1.2
  - Turnover: 420%

Trade details: (attach CSV)

LLM Output:

# Momentum Strategy v2.3 Q1 2024 Postmortem

## Summary
Strategy underperformed benchmark by 3.5% in Q1, mainly due to consecutive
stop-losses during February's ranging period.

## Core Problem Diagnosis
1. **High turnover cost**: 420% turnover means ~2.1% annualized trading cost
2. **February anomaly**: February return -4.3%, 70% of total drawdown
3. **Sector concentration**: Tech stocks 65% of holdings, lacking diversification

## Improvement Recommendations
1. Add holding period filter to reduce short-term noise trades
2. Reduce position or pause strategy when ADX < 20
3. Add sector diversification constraint, single sector < 30%

## Next Steps
- [ ] Backtest performance after adding ADX filter
- [ ] Test effect of minimum holding period = 3 days

Report Quality Control

LLM-generated reports need verification:

Check ItemRiskResponse
Data accuracyLLM may fabricate dataExtract key data from source, don't let LLM calculate
Causal errorsCorrelation mistaken for causationHuman review key conclusions
Missing important infoOnly sees partial inputCheck if all input dimensions covered

14.5 LLM Safety Constraints

Why Need Hard Constraints?

LLM output is unreliable. If used directly for trading:

RiskScenarioConsequence
Hallucination"Based on data, recommend full position buy"Actual data doesn't support
Calculation error"Return 15%" is actually 1.5%Wrong decision
InconsistencySame question, different answersNot reproducible

Hard Constraint Design

LLM Constraint Layer Architecture

Audit Trail

Every LLM call must be logged:

{
  "timestamp": "2024-01-15 09:30:00",
  "input_prompt": "Analyze AAPL earnings...",
  "output_raw": "...",
  "output_parsed": {"sentiment": 0.6, ...},
  "model": "gpt-4-0125-preview",
  "temperature": 0.1,
  "action_taken": "Used as feature input to signal model",
  "final_decision": "No trade executed"
}

Purposes:

  • Postmortem whether LLM misled decisions
  • Detect LLM performance degradation for timely adjustment
  • Regulatory compliance

14.6 LLM Agent Architecture

Position in Multi-Agent System

+-------------------------------------------------------------+
|                      Meta Agent                              |
+----------------------------+--------------------------------+
                             |
    +------------------------+------------------------+
    |                        |                        |
    v                        v                        v
+---------+            +---------+              +---------+
|  LLM    |            | Signal  |              |  Risk   |
|Research |  --------> |  Agent  | -----------> |  Agent  |
|  Agent  |   Input    +---------+              +---------+
+---------+  Sentiment
             Features

LLM Agent Position:
  - Not on critical decision path
  - Only provides "reference information"
  - Downstream Agents decide whether to adopt

LLM Agent Responsibility Boundaries

Can DoCannot Do
Analyze news sentimentDirectly generate buy/sell signals
Extract earnings dataCalculate position sizes
Generate report draftsModify risk control parameters
Explain strategy logicExecute orders
Flag anomalous eventsDisable stop-losses

LLM Capability Boundaries LLMs are excellent research analysts but should never directly participate in trading decisions. They can parse annual reports, detect tone changes in earnings calls, flag accounting irregularities, and generate factor hypotheses — but they cannot generate orders, calculate position sizes, modify risk parameters, or execute trades. This boundary must be enforced at the architecture level: LLM outputs can only serve as signal inputs, never reaching the execution or risk control paths directly. See Lesson 22 for the complete boundary table.


14.7 Practical Considerations

Cost Control

ScenarioCall FrequencyCost Control Strategy
Real-time news analysisPer news itemPre-filter irrelevant news with rules
Earnings analysisQuarterlyCost acceptable
Daily report generationDailyUse caching to avoid duplicate calls
Real-time Q&AOn-demandLimit daily call count

Latency Management

ApplicationAcceptable LatencyStrategy
Event-driven trading< 1 secondDon't use LLM, use rules
Daily strategiesMinutesCan use LLM, batch processing
Research analysisHoursUse freely

Model Selection

TaskRecommended ModelReason
Sentiment analysisGPT-3.5 / Claude HaikuSimple task, low cost
Earnings analysisGPT-4 / Claude OpusNeeds deep understanding
Code generationClaude Opus / GPT-4Needs precision
Simple classificationFine-tuned small modelLowest cost, minimal latency

Lesson Deliverables

After completing this lesson, you will have:

  1. Clear understanding of LLM's role in quant - Know what it can and cannot do
  2. Sentiment analysis capability - Can design news/earnings LLM analysis workflows
  3. Safety constraint awareness - Understand why hard constraints are necessary
  4. Architecture integration method - Know how LLM Agent fits into multi-agent systems

Acceptance Criteria

CheckpointAcceptance StandardSelf-Test Method
Role understandingCan name 3 things LLM can and cannot do eachList without notes
Prompt designCan write structured sentiment analysis promptGiven news, design prompt
Safety constraintsCan explain why LLM shouldn't trade directlyList 3 risks
Architecture positioningCan mark LLM position in multi-Agent architecture diagramDraw diagram

Design Exercise:

Design an LLM news analysis workflow with requirements:

  1. Process 500 financial news items daily
  2. Latency requirement < 1 hour to complete all
  3. Cost budget < $50/day
  4. Output sentiment features for daily strategies
Click to see reference solution

Workflow Design:

  1. Pre-filtering (rules):

    • Keyword filter, keep only news about held stocks
    • Expected to filter down to 100 items
  2. Batch processing:

    • Use GPT-3.5-turbo ($0.002/1K tokens)
    • Each news ~500 tokens input + 200 tokens output
    • 100 items x 700 tokens x $0.002/1K = $0.14
  3. Parallel calls:

    • 10 concurrent calls
    • ~3 seconds each, 100 items ~30 seconds total
  4. Output format:

    {
      "symbol": "AAPL",
      "sentiment": 0.6,
      "event_type": "earnings",
      "key_point": "beat expectations"
    }
  5. Post-processing:

    • Aggregate multiple news items per stock
    • Calculate weighted average sentiment score
    • Store in feature database

Cost Estimate:

  • $0.14/day x 30 days = $4.2/month
  • Well below $50/day budget

Lesson Summary

  • Understand LLM's correct role in quant: research assistant, not trader
  • Master news sentiment analysis and earnings information extraction applications
  • Recognize LLM risks: hallucination, calculation errors, not reproducible
  • Learn to design safety constraints: hard constraint layer + audit trail
  • Understand LLM Agent's position in multi-agent architecture

Further Reading


Next Lesson Preview

Lesson 15: Risk Control and Money Management

Even the best strategy loses money - the key is "how much loss is acceptable." Risk Agent has veto power, it's the last line of defense for protecting capital. Next lesson we dive deep into the core logic of risk control.

Cite this chapter
Zhang, Wayland (2026). Lesson 14: LLM Applications in Quant. In AI Quantitative Trading: From Zero to One. https://waylandz.com/quant-book-en/Lesson-14-LLM-Applications-in-Quant
@incollection{zhang2026quant_Lesson_14_LLM_Applications_in_Quant,
  author = {Zhang, Wayland},
  title = {Lesson 14: LLM Applications in Quant},
  booktitle = {AI Quantitative Trading: From Zero to One},
  year = {2026},
  url = {https://waylandz.com/quant-book-en/Lesson-14-LLM-Applications-in-Quant}
}