COT Backtests
This page documents the backtesting research behind TradeAnon's COT-based contrarian signals. The COT Watchlist is built on validated signals with historical performance data.
Research Objective
The COT backtest research aimed to answer:
- Do positioning extremes predict reversals? — When speculators are extremely positioned, do markets tend to reverse?
- Which threshold works best? — What z-score level identifies actionable extremes?
- Which markets show strongest signals? — Are some futures more responsive than others?
- What is realistic performance? — What returns and risk should we expect?
Methodology
Signal Definition
A contrarian signal is generated when:
Signal = fade extreme speculator positioning
Long Signal:
- Speculator net position z-score < -threshold (extremely bearish)
- Signal: Go long (fade the bears)
Short Signal:
- Speculator net position z-score > +threshold (extremely bullish)
- Signal: Go short (fade the bulls)
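The rule above can be sketched in a few lines of Python. This is a minimal illustration, not the platform's implementation: the function names, the sample positions, and the choice to compute the z-score against the prior history only (excluding the current reading) are all assumptions, since the lookback window is not stated here.

```python
from statistics import mean, stdev

def zscore(history, current):
    """Z-score of `current` relative to the values in `history`."""
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation
    if sigma == 0:
        return 0.0  # flat history: no meaningful extreme
    return (current - mu) / sigma

def contrarian_signal(z, threshold=2.0):
    """Return 'long', 'short', or None by fading the speculator extreme."""
    if z < -threshold:
        return "long"   # speculators extremely bearish: fade the bears
    if z > threshold:
        return "short"  # speculators extremely bullish: fade the bulls
    return None

# Hypothetical weekly speculator net positions (contracts), oldest first.
net_positions = [12_000, 15_000, 9_000, 11_000, 14_000, 10_000, 13_000, 6_000]
z = zscore(net_positions[:-1], net_positions[-1])
signal = contrarian_signal(z, threshold=2.0)
print(f"z={z:.2f} signal={signal}")
```

Passing `threshold=1.5` or `threshold=2.5` switches the same function between the aggressive and conservative levels.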
Threshold Levels Tested
| Level | Z-Score | Description |
|---|---|---|
| Conservative | ±2.5σ | Very extreme readings |
| Standard | ±2.0σ | Moderate extremes |
| Aggressive | ±1.5σ | More frequent signals |
Holding Period
Signals were tested with four holding periods:
- 1 week
- 2 weeks
- 4 weeks
- 8 weeks
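One way to evaluate a signal over each holding period is to measure the direction-adjusted return from entry to a fixed exit. The sketch below assumes entry and exit at weekly closes and ignores costs, margin, and contract rolls; the helper name and the price series are hypothetical.

```python
HOLDING_PERIODS_WEEKS = (1, 2, 4, 8)

def holding_period_return(closes, entry_idx, weeks, direction):
    """Direction-adjusted simple return from entry to exit.

    Returns None when the exit would fall past the end of the series.
    """
    exit_idx = entry_idx + weeks
    if exit_idx >= len(closes):
        return None
    raw = closes[exit_idx] / closes[entry_idx] - 1.0
    # Simplification: a short earns the negative of the long return.
    return raw if direction == "long" else -raw

# Hypothetical weekly closes, oldest first.
closes = [100.0, 101.0, 99.0, 102.0, 104.0, 103.0, 105.0, 107.0, 110.0]
results = {
    weeks: holding_period_return(closes, 0, weeks, "long")
    for weeks in HOLDING_PERIODS_WEEKS
}
for weeks, ret in results.items():
    print(f"{weeks} week(s): {ret:+.2%}")
```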
Universe
Signals were tested across the major futures categories:
- Equity indices (ES, NQ, RTY)
- Energy (CL, NG)
- Metals (GC, SI, HG)
- Currencies (6E, 6J, 6B)
- Interest rates (ZN, ZB, ZF)
- Agriculture (ZC, ZS, ZW)
Key Findings
Overall Results
Contrarian signals show positive expectancy:
- Win rates: 52-60% depending on market and threshold
- Average win typically exceeds average loss
- Sharpe ratios: 0.4-0.8 range
- Works better in some markets than others
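Win rate, average win, and average loss can all be derived from a list of per-trade returns. The sketch below is illustrative only: the function name and sample returns are hypothetical, a zero return is counted as a non-win by assumption, and the Sharpe ratio is left out because the annualization convention is not stated here.

```python
from statistics import mean

def summarize_trades(returns):
    """Win rate, average win, and average loss (as a magnitude) for a trade list."""
    if not returns:
        raise ValueError("need at least one trade")
    wins = [r for r in returns if r > 0]
    losses = [abs(r) for r in returns if r <= 0]  # assumption: flat trades are non-wins
    return {
        "n": len(returns),
        "win_rate": len(wins) / len(returns),
        "avg_win": mean(wins) if wins else 0.0,
        "avg_loss": mean(losses) if losses else 0.0,
    }

# Hypothetical per-trade returns as fractions (0.03 = 3%).
stats = summarize_trades([0.03, -0.02, 0.05, -0.01, 0.02])
print(stats)
```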
Threshold Analysis
| Threshold | Win Rate | Signals per Year | Avg Return |
|---|---|---|---|
| ±2.5σ | 58% | 2-3 | Higher |
| ±2.0σ | 55% | 4-6 | Moderate |
| ±1.5σ | 52% | 8-12 | Lower |
Tradeoff: More conservative thresholds have higher win rates but fewer signals.
Market-Specific Results
Strongest Signals:
- Gold (GC): Consistent contrarian edge
- Crude Oil (CL): Strong reversal patterns
- S&P 500 (ES): Reliable at extremes
Moderate Signals:
- Currencies: Variable by pair
- Grains: Seasonal factors complicate the signal
Weakest Signals:
- Natural Gas (NG): High volatility reduces reliability
- Some agricultural commodities
Holding Period Analysis
| Holding Period | Win Rate | Avg Return | Notes |
|---|---|---|---|
| 1 week | 51% | Lower | Often too short |
| 2 weeks | 54% | Moderate | Good balance |
| 4 weeks | 56% | Higher | Allows the reversal to develop |
| 8 weeks | 55% | Variable | Diminishing edge |
Conclusion: 2-4 week holding periods show the best risk-adjusted results.
Sample Backtest: Gold (GC)
Parameters
- Period: 2015-2024
- Threshold: ±2.0σ
- Holding: 4 weeks
- Position: Long only (fade bearish extremes)
Results
| Metric | Value |
|---|---|
| Total Signals | 47 |
| Win Rate | 59.6% |
| Avg Win | 4.2% |
| Avg Loss | 2.8% |
| Sharpe Ratio | 0.72 |
| Max Drawdown | 12.4% |
| Total Return | 89.3% |
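As a quick sanity check on the table above, the per-trade expectancy and the breakeven win rate follow directly from the reported win rate, average win, and average loss. The calculation is a simple average-based approximation and ignores compounding and costs.

```python
win_rate = 0.596  # reported win rate
avg_win = 4.2     # reported average win, in percent
avg_loss = 2.8    # reported average loss, in percent (magnitude)

# Expected return per trade: wins weighted by win rate minus losses weighted by loss rate.
expectancy = win_rate * avg_win - (1 - win_rate) * avg_loss

# Win rate at which expectancy is exactly zero for this win/loss ratio.
breakeven_win_rate = avg_loss / (avg_win + avg_loss)

print(f"Expectancy: {expectancy:.2f}% per trade")
print(f"Breakeven win rate: {breakeven_win_rate:.0%}")
```

With these inputs the expectancy is about 1.37% per trade, and the signal would stay profitable on average down to a 40% win rate as long as the win/loss ratio held.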
Equity Curve Characteristics
- Steady upward trend with expected drawdowns
- Outperformed buy-and-hold during test period
- Larger gains during periods of sentiment extremes
Watchlist Construction
The COT Watchlist surfaces signals that meet validation criteria:
Inclusion Criteria
- Positive historical expectancy — Backtested signal shows positive returns
- Statistical significance — Sufficient sample size (>20 signals)
- Current extreme — Z-score beyond threshold
- Recent data — COT report within last week
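The four criteria can be read as a single boolean filter. The sketch below is one possible encoding: the `Candidate` fields, the use of average backtest return as the expectancy measure, and the 7-day freshness window measured from the release date are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class Candidate:
    symbol: str
    z_score: float
    threshold: float
    backtest_avg_return: float  # assumed proxy for historical expectancy
    sample_size: int            # number of historical signals
    report_release_date: date   # assumed: date the COT report was published

def passes_inclusion(c, today, min_signals=20, max_age_days=7):
    """True when a candidate meets all four inclusion criteria."""
    return (
        c.backtest_avg_return > 0                 # positive historical expectancy
        and c.sample_size > min_signals           # more than 20 historical signals
        and abs(c.z_score) > c.threshold          # current extreme
        and today - c.report_release_date <= timedelta(days=max_age_days)  # recent data
    )

# Hypothetical candidates, for illustration only.
today = date(2024, 6, 10)
ok = Candidate("GC", -2.3, 2.0, 0.012, 47, date(2024, 6, 7))
thin = Candidate("ZW", -2.6, 2.5, 0.020, 15, date(2024, 6, 7))
print(passes_inclusion(ok, today), passes_inclusion(thin, today))
```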
Displayed Metrics
For each watchlist signal:
| Metric | Purpose |
|---|---|
| Symbol | Market identifier |
| Direction | Long or short signal |
| Z-Score | Current extreme reading |
| Threshold | Which level triggered |
| Historical Win Rate | Backtest performance |
| Historical Sharpe | Risk-adjusted performance |
| Sample Size | Number of historical signals |
Ranking
Signals are ranked by:
- Sharpe ratio (primary)
- Win rate (secondary)
- Recency of extreme (tertiary)
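A three-level ranking like this maps naturally onto a tuple sort key. The sketch below assumes that higher Sharpe, higher win rate, and a more recent extreme all rank first; the symbols and values are placeholders, not backtest results.

```python
from datetime import date

# Placeholder signals for illustration only.
signals = [
    {"symbol": "AAA", "sharpe": 0.60, "win_rate": 0.55, "extreme_date": date(2024, 6, 4)},
    {"symbol": "BBB", "sharpe": 0.72, "win_rate": 0.56, "extreme_date": date(2024, 5, 28)},
    {"symbol": "CCC", "sharpe": 0.72, "win_rate": 0.58, "extreme_date": date(2024, 5, 21)},
    {"symbol": "DDD", "sharpe": 0.72, "win_rate": 0.58, "extreme_date": date(2024, 6, 4)},
]

# Tuples compare element by element, so ties on Sharpe fall through to
# win rate, and ties on both fall through to the date of the extreme.
ranked = sorted(
    signals,
    key=lambda s: (s["sharpe"], s["win_rate"], s["extreme_date"]),
    reverse=True,
)
order = [s["symbol"] for s in ranked]
print(order)
```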
Limitations
Data Limitations
COT Delay:
- Data as of Tuesday, released Friday
- 3-day lag between positions and report
- Signal may be stale by the time it can be traded
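The lag can be made concrete with a small date calculation. The sketch below assumes the regular Tuesday-to-Friday schedule described above and does not model weeks in which the release is delayed; the function names are hypothetical.

```python
from datetime import date, timedelta

TUESDAY = 1  # date.weekday(): Monday is 0

def scheduled_release(as_of):
    """Friday release date for positions reported as of a Tuesday."""
    if as_of.weekday() != TUESDAY:
        raise ValueError("COT positions are reported as of a Tuesday")
    return as_of + timedelta(days=3)

def data_age_days(as_of, trade_date):
    """Calendar days between the position snapshot and the trade date."""
    return (trade_date - as_of).days

as_of = date(2024, 6, 4)            # a Tuesday
release = scheduled_release(as_of)  # the following Friday
first_session = date(2024, 6, 10)   # assumed entry on the next Monday
age = data_age_days(as_of, first_session)
print(release, age)
```

Under these assumptions, a trader who enters on the Monday after release is acting on positions that are already six calendar days old.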
History Depth:
- Reliable data since ~2006
- Limited sample for some markets
- Regime changes may affect future performance
Methodological Limitations
No Look-Ahead:
- Signals use only data that was available at the time they fired
- Holding-period results are known only ex-post, so parameters chosen from them carry hindsight bias
Single Entry:
- Backtest assumes a single entry when the signal fires
- Real trading might scale in or out
No Stops:
- Backtest holds each position for the full holding period with no stop-loss
- Live trading would use risk management
Market Limitations
Capacity:
- Individual traders have minimal impact
- Institutional size may move markets
- Most signals remain actionable
Correlation:
- Multiple signals may fire simultaneously
- Diversification benefits are limited when they do
- Portfolio-level risk management is crucial
Using These Results
For Traders
- Don't trade every signal — Focus on highest conviction
- Use proper sizing — Don't overweight any single position
- Apply risk management — Use stops despite backtest assumptions
- Expect variance — Individual results will vary
Realistic Expectations
What to Expect:
- Win rate near 55% (not 80%)
- Regular losing trades
- Periods of drawdown
- Modest but consistent edge
What NOT to Expect:
- Every signal to win
- Immediate results
- Outperformance every period
- Risk-free returns
Research Updates
Backtests are periodically re-run to:
- Incorporate new data
- Validate continued effectiveness
- Adjust thresholds if needed
- Remove signals that stopped working
Last updated: Check platform for current date.