
Backtesting Methodology

Introduction

Backtesting is the process of testing a trading strategy against historical data to evaluate its performance. While backtesting can't guarantee future results, sound methodology helps separate strategies with a genuine edge from those that simply got lucky. This guide covers backtesting concepts, common pitfalls, and best practices.

What is Backtesting?

Definition: Simulating a trading strategy on historical data to see how it would have performed.

Purpose:

  • Validate strategy logic
  • Estimate potential returns
  • Identify weaknesses
  • Optimize parameters
  • Build confidence before live trading

What Backtesting Can Do:

  • ✅ Test if strategy logic works
  • ✅ Identify optimal parameters
  • ✅ Estimate risk metrics
  • ✅ Compare different approaches
  • ✅ Find edge in historical data

What Backtesting Cannot Do:

  • ❌ Guarantee future performance
  • ❌ Account for all market conditions
  • ❌ Predict black swan events
  • ❌ Replace live trading experience
  • ❌ Eliminate all risk

Backtesting Process

1. Define Strategy

Clear Rules:

Entry: Price > EMA(50) AND RSI > 50
Exit: Price < EMA(50) OR Stop Loss hit
Position Size: 2% of capital
Stop Loss: 3% below entry price

Must Be Objective:

  • No discretion
  • No "if it looks good"
  • Programmable rules
  • Repeatable logic
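
The rules above can be written as pure functions so that no discretion creeps in. This is a minimal sketch, assuming the indicator values (EMA(50), RSI) are computed elsewhere and that the stop is checked against the closing price; the function and field names are hypothetical, not part of any platform API.

```javascript
// Hypothetical encoding of the example rules above.
// `bar` is the latest completed candle: { close, ema50, rsi }.
const STOP_LOSS_PCT = 3;     // Stop Loss: 3% (assumed to mean 3% below entry)
const POSITION_SIZE_PCT = 2; // Position Size: 2% of capital

function shouldEnter(bar) {
  // Entry: Price > EMA(50) AND RSI > 50
  return bar.close > bar.ema50 && bar.rsi > 50;
}

function shouldExit(bar, entryPrice) {
  // Exit: Price < EMA(50) OR Stop Loss hit
  const stopPrice = entryPrice * (1 - STOP_LOSS_PCT / 100);
  return bar.close < bar.ema50 || bar.close <= stopPrice;
}

function positionValue(capital) {
  // Amount of capital committed to one position
  return capital * (POSITION_SIZE_PCT / 100);
}
```

Because each rule is a function of known inputs, the same candle always produces the same decision, which is what makes the backtest repeatable.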

2. Select Data

Data Requirements:

  • Quality: Accurate, clean data
  • Quantity: Sufficient history (2+ years)
  • Granularity: Match trading timeframe
  • Completeness: No gaps or missing data

Data Period:

Minimum: 1 year (exploratory runs only; conclusions need 2+ years)
Recommended: 2-3 years
Ideal: 5+ years (multiple market cycles)

Timeframe Selection:

Scalping: Tick or 1-minute data
Day Trading: 1-5 minute data
Swing Trading: 15-minute to hourly data
Position Trading: Daily data

3. Run Backtest

Configuration:

{
  "startDate": "2021-01-01",
  "endDate": "2023-12-31",
  "initialBalance": 100000,
  "symbols": ["NSE:RELIANCE", "NSE:TCS"],
  "slippage": 0.1,    // 0.1% slippage
  "commission": 0.05  // 0.05% commission
}

Execution:

  • Process each candle sequentially
  • Apply entry/exit rules
  • Track positions and P&L
  • Record all trades
  • Calculate metrics
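
The execution steps above can be sketched as a simple loop. This is an illustrative long-only example, not the platform's engine: it assumes signals are computed from each completed candle and filled at the next candle's open, goes all-in on each trade for brevity, and treats `slippage` and `commission` as percentages as in the configuration above. `runBacktest` and its data shapes are hypothetical.

```javascript
// candles: [{ open, close }]
// signals: one entry per candle, "buy" | "sell" | null, based on that
//          candle's completed data.
function runBacktest(candles, signals, config) {
  const { initialBalance, slippage, commission } = config; // percentages
  let cash = initialBalance;
  let qty = 0;
  let entryCost = 0; // cost per unit, including slippage and commission
  const trades = [];

  for (let i = 1; i < candles.length; i++) {
    const signal = signals[i - 1]; // decided on the previous candle
    const open = candles[i].open;  // executed at this candle's open

    if (signal === "buy" && qty === 0) {
      const fill = open * (1 + slippage / 100);   // buy slightly higher
      entryCost = fill * (1 + commission / 100);
      qty = cash / entryCost;
      cash = 0;
    } else if (signal === "sell" && qty > 0) {
      const fill = open * (1 - slippage / 100);   // sell slightly lower
      const exitValue = fill * (1 - commission / 100);
      cash = qty * exitValue;
      trades.push({ returnPct: (exitValue / entryCost - 1) * 100 });
      qty = 0;
    }
  }

  // Any open position is marked at the last close.
  const lastClose = candles[candles.length - 1].close;
  return { finalEquity: cash + qty * lastClose, trades };
}
```

Reading the signal from candle `i - 1` and filling on candle `i` keeps the loop free of look-ahead bias by construction.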

4. Analyze Results

Key Metrics:

  • Total Return
  • Win Rate
  • Profit Factor
  • Sharpe Ratio
  • Maximum Drawdown
  • Average Win/Loss
  • Number of Trades
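
Most of these metrics follow directly from the list of closed trades. The sketch below assumes a non-empty array of per-trade profit and loss values in account currency and builds the equity curve trade by trade; `summarize` is a hypothetical helper, and the Sharpe ratio is left out because it needs periodic returns rather than per-trade P&L.

```javascript
function summarize(tradePnls, initialBalance) {
  const wins = tradePnls.filter((p) => p > 0);
  const losses = tradePnls.filter((p) => p < 0);
  const grossProfit = wins.reduce((a, b) => a + b, 0);
  const grossLoss = Math.abs(losses.reduce((a, b) => a + b, 0));

  // Maximum drawdown: largest peak-to-trough fall of the equity curve.
  let equity = initialBalance;
  let peak = initialBalance;
  let maxDrawdownPct = 0;
  for (const pnl of tradePnls) {
    equity += pnl;
    peak = Math.max(peak, equity);
    maxDrawdownPct = Math.max(maxDrawdownPct, ((peak - equity) / peak) * 100);
  }

  return {
    totalReturnPct: ((equity - initialBalance) / initialBalance) * 100,
    winRatePct: (wins.length / tradePnls.length) * 100,
    profitFactor: grossLoss === 0 ? Infinity : grossProfit / grossLoss,
    avgWin: wins.length ? grossProfit / wins.length : 0,
    avgLoss: losses.length ? grossLoss / losses.length : 0,
    maxDrawdownPct,
    numberOfTrades: tradePnls.length,
  };
}
```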

Performance Evaluation:

Good Strategy:
- Win Rate: 45-60%
- Profit Factor: >1.5
- Sharpe Ratio: >1.0
- Max Drawdown: <20%
- Consistent across periods

5. Validate

Out-of-Sample Testing:

In-Sample: 2021-2022 (optimize)
Out-of-Sample: 2023 (validate)

If OOS performance is similar to IS → Good
If OOS performance is much worse → Overfitted

Walk-Forward Analysis:

  • Optimize on period 1
  • Test on period 2
  • Repeat rolling forward
  • Ensures robustness

Common Pitfalls

1. Look-Ahead Bias

Problem: Using future information not available at trade time

Examples:

// ❌ BAD: Acting on today's close while today's candle is still forming
if (close[0] > sma[0]) {
  // close[0] isn't known until the session ends, so this order
  // could not have been placed at today's prices
}

// ✅ GOOD: Using the previous candle's close
if (close[1] > sma[1]) {
  // Uses yesterday's data, which is available at today's open
}

How to Avoid:

  • Only use data available at decision time
  • Be careful with indicators that "repaint"
  • Use previous candle for signals
  • Test with realistic execution timing

2. Survivorship Bias

Problem: Only testing on stocks that still exist today

Example:

Testing on current Nifty 50 stocks
Excludes companies that failed or delisted
Inflates historical returns

How to Avoid:

  • Use point-in-time universe
  • Include delisted stocks
  • Test on index constituents as of each date
  • Account for corporate actions
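
A point-in-time universe can be sketched as a membership table with start and end dates, queried for each backtest date. The table layout, the `universeAsOf` helper, and the placeholder symbols below are all hypothetical; real constituent history has to come from a data vendor or the exchange.

```javascript
// Each row says when a symbol was part of the universe.
// `to: null` means it is still a member. Dates are ISO strings (YYYY-MM-DD),
// which compare correctly as plain strings.
const membership = [
  { symbol: "AAA", from: "2015-01-01", to: null },
  { symbol: "BBB", from: "2015-01-01", to: "2021-06-30" }, // later removed or delisted
  { symbol: "CCC", from: "2021-06-30", to: null },         // added later
];

function universeAsOf(table, date) {
  return table
    .filter((m) => m.from <= date && (m.to === null || date < m.to))
    .map((m) => m.symbol);
}
```

Querying with a 2020 date returns the symbol that was later removed, which is exactly the stock a "current constituents" test would silently drop.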

3. Curve Fitting (Over-Optimization)

Problem: Finding parameters that work perfectly on historical data but fail in live trading

Example:

Testing RSI periods from 2 to 50
Finding RSI(17) gives best results
But RSI(16) and RSI(18) perform poorly
→ Likely curve-fitted

Signs of Curve Fitting:

  • Too many parameters
  • Very specific parameter values
  • Performance degrades with small changes
  • Perfect backtest results
  • Complex rules

How to Avoid:

  • Use standard parameters (14, 20, 50, 200)
  • Limit optimization variables
  • Test parameter robustness
  • Prefer simple strategies
  • Validate out-of-sample

4. Data Snooping

Problem: Testing multiple strategies until one works

Example:

Test 100 different strategies
One shows 50% annual return
But it's just random luck

How to Avoid:

  • Define strategy before testing
  • Limit strategy variations
  • Use statistical significance tests
  • Validate with Monte Carlo
  • Be skeptical of amazing results
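
A quick calculation shows why testing many strategies is dangerous. Assuming the tests are independent and each has a 5% chance of looking significant by pure luck, the chance that at least one of N strategies passes is 1 - (1 - 0.05)^N. The helper names are hypothetical, and the Bonferroni correction shown is one common, conservative adjustment rather than the only option.

```javascript
// Probability that at least one of `numStrategies` worthless strategies
// passes a significance test at level `alpha` (independence assumed).
function chanceOfFalsePositive(numStrategies, alpha = 0.05) {
  return 1 - Math.pow(1 - alpha, numStrategies);
}

// Bonferroni correction: the stricter p-value threshold each strategy
// should meet when `numStrategies` were tried.
function bonferroniThreshold(numStrategies, alpha = 0.05) {
  return alpha / numStrategies;
}
```

With 100 strategies the chance of at least one lucky "winner" is above 99%, so a single standout result out of 100 attempts is weak evidence on its own.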

5. Ignoring Transaction Costs

Problem: Not accounting for slippage, commissions, taxes

Reality:

Backtest: 30% annual return
After costs:
- Commissions: -2%
- Slippage: -3%
- Taxes: -5%
Actual: 20% annual return

How to Avoid:

  • Include realistic commissions
  • Add slippage (0.1-0.5%)
  • Account for bid-ask spread
  • Consider market impact
  • Factor in taxes
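
Per-trade costs look small but add up with trade frequency. The sketch below is a simple, non-compounded approximation that assumes commission and slippage are charged as a percentage of position value on both entry and exit; the function names and the example trade count are hypothetical.

```javascript
// Cost of one round trip (entry + exit) as a percentage of position value.
function roundTripCostPct({ commission, slippage }) {
  return 2 * (commission + slippage);
}

// Approximate yearly drag on total capital, in percent.
// positionSizePct: share of capital used per trade.
function annualCostDragPct(costs, roundTripsPerYear, positionSizePct) {
  return roundTripCostPct(costs) * roundTripsPerYear * (positionSizePct / 100);
}
```

For example, with 0.05% commission and 0.1% slippage per side, each round trip costs about 0.3% of the position; 100 round trips a year at 10% of capital each would cost roughly 3% of capital per year before taxes.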

6. Unrealistic Assumptions

Problem: Assuming perfect execution, infinite liquidity

Unrealistic:

  • All orders filled at exact price
  • No slippage on large orders
  • Instant execution
  • Trading any size

Realistic:

  • Slippage on market orders
  • Partial fills possible
  • Execution delays
  • Position size limits

Validation Techniques

1. Out-of-Sample Testing

Method:

Total Data: 2020-2023 (4 years)

Split:
In-Sample: 2020-2022 (3 years) - Optimize
Out-of-Sample: 2023 (1 year) - Validate

Never optimize on OOS data!

Evaluation:

IS Performance: 25% return, 15% drawdown
OOS Performance: 22% return, 18% drawdown

Similar results → Strategy is robust
OOS much worse → Strategy is overfitted

2. Walk-Forward Analysis

Method:

Period 1 (2020): Optimize → Test on 2021
Period 2 (2021): Optimize → Test on 2022
Period 3 (2022): Optimize → Test on 2023

Calculate Walk-Forward Efficiency (WFE)

WFE Calculation:

WFE = OOS Performance / IS Performance

>0.7: Good (OOS is 70%+ of IS)
0.5-0.7: Acceptable
<0.5: Poor (significant degradation)
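
The rolling windows and the WFE rating can be sketched as follows. This assumes yearly windows as in the example above and that IS and OOS performance are measured on a comparable basis (for example, annualized return) with a positive IS value; the helper names are hypothetical.

```javascript
// Build rolling optimize/test windows from an ordered list of periods.
function walkForwardWindows(periods, trainSize = 1, testSize = 1) {
  const windows = [];
  for (let i = 0; i + trainSize + testSize <= periods.length; i += testSize) {
    windows.push({
      optimizeOn: periods.slice(i, i + trainSize),
      testOn: periods.slice(i + trainSize, i + trainSize + testSize),
    });
  }
  return windows;
}

// WFE = OOS Performance / IS Performance (IS must be positive).
function walkForwardEfficiency(isPerformance, oosPerformance) {
  if (isPerformance <= 0) throw new RangeError("IS performance must be positive");
  return oosPerformance / isPerformance;
}

// Thresholds taken from the table above.
function rateWfe(wfe) {
  if (wfe > 0.7) return "Good";
  if (wfe >= 0.5) return "Acceptable";
  return "Poor";
}
```

Calling `walkForwardWindows([2020, 2021, 2022, 2023])` yields the three optimize/test pairs listed above, and an IS return of 25% with an OOS return of 22% gives a WFE of 0.88, which rates as "Good".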

3. Monte Carlo Simulation

Method:

  • Take actual trades from backtest
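  • Resample the trades with replacement, or shuffle their order (shuffling alone changes the drawdown path but not the compounded total return)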
  • Run 1000+ simulations
  • Analyze distribution of results

Purpose:

  • Assess luck vs skill
  • Calculate confidence intervals
  • Identify robustness
  • Estimate risk of ruin

Interpretation:

Original: 30% return
Monte Carlo: 25% median, 15-40% range

If original is in top 10% → Likely lucky
If original is near median → Robust strategy
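
A minimal sketch of the simulation follows. Note one deliberate choice: it resamples trades with replacement (a bootstrap) rather than only shuffling their order, because reordering a fixed set of trades leaves the compounded total return unchanged and so cannot produce a range of returns. It assumes per-trade returns in percent applied to the whole account, and uses a small seeded random generator so runs are repeatable; all names are hypothetical.

```javascript
// Small seeded pseudo-random generator so results are reproducible.
function seededRandom(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // value in [0, 1)
  };
}

function monteCarlo(tradeReturnsPct, runs = 1000, seed = 42) {
  const rand = seededRandom(seed);
  const n = tradeReturnsPct.length;
  const totals = [];

  for (let r = 0; r < runs; r++) {
    let equity = 1;
    for (let i = 0; i < n; i++) {
      // Draw a random trade from the backtest, with replacement.
      const pick = tradeReturnsPct[Math.floor(rand() * n)];
      equity *= 1 + pick / 100;
    }
    totals.push((equity - 1) * 100);
  }

  totals.sort((x, y) => x - y);
  const at = (q) => totals[Math.min(runs - 1, Math.floor(q * runs))];
  return { p5: at(0.05), median: at(0.5), p95: at(0.95) };
}
```

Comparing the original backtest return with `median` and the `p5` to `p95` range gives the "near median" versus "top 10%" reading described above.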

4. Different Market Conditions

Test Across:

  • Bull markets (2020-2021)
  • Bear markets (2022)
  • Sideways markets (2019)
  • High volatility (2020 crash)
  • Low volatility (2017)

Consistent Performance:

Bull: 25% return
Bear: -5% return (better than market)
Sideways: 10% return

Strategy works in multiple conditions ✓

5. Multiple Instruments

Test On:

  • Different stocks
  • Different sectors
  • Different market caps
  • Different exchanges

Robust Strategy:

Works on:
- Large caps (RELIANCE, TCS)
- Mid caps (DIXON, POLYCAB)
- Different sectors (Tech, Finance, Energy)

Not just one lucky stock

Best Practices

1. Realistic Assumptions

Include:

{
  "slippage": 0.1,        // 0.1% per trade
  "commission": 0.05,     // 0.05% per trade
  "minPrice": 10,         // Avoid penny stocks
  "maxPositionSize": 10,  // Max 10% per position
  "executionDelay": 1     // 1 candle delay
}

2. Sufficient Data

Minimum Requirements:

  • 2+ years of data
  • 100+ trades
  • Multiple market conditions
  • Complete data (no gaps)

More Data = Better:

  • 5+ years ideal
  • 500+ trades preferred
  • Full market cycles
  • Various volatility regimes

3. Simple Strategies

Prefer:

  • 2-3 indicators maximum
  • Standard parameters
  • Clear logic
  • Few rules

Avoid:

  • 10+ conditions
  • Highly specific parameters
  • Complex combinations
  • Many exceptions

4. Parameter Robustness

Test:

RSI(14): 25% return
RSI(13): 23% return
RSI(15): 24% return

Robust! Small changes don't break strategy.

vs.

RSI(14): 25% return
RSI(13): 5% return
RSI(15): 3% return

Curve-fitted! Only works with exact parameter.
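
The comparison above can be automated by checking that neighbouring parameter values keep most of the chosen value's return. In this sketch the 50% tolerance is an arbitrary assumption rather than an established threshold, the chosen parameter's return is assumed to be positive, and `isRobust` is a hypothetical helper.

```javascript
// resultsByParam: { parameterValue: returnPct }
// Returns true when both neighbours (param - 1, param + 1) that were tested
// retain at least `tolerance` of the chosen parameter's return.
function isRobust(resultsByParam, param, tolerance = 0.5) {
  const center = resultsByParam[param];
  const neighbours = [param - 1, param + 1].filter((p) => p in resultsByParam);
  return (
    neighbours.length > 0 &&
    neighbours.every((p) => resultsByParam[p] >= center * tolerance)
  );
}
```

Applied to the two cases above, `isRobust({ 13: 23, 14: 25, 15: 24 }, 14)` is true and `isRobust({ 13: 5, 14: 25, 15: 3 }, 14)` is false.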

5. Statistical Significance

Minimum Requirements:

  • 100+ trades
  • 2+ years
  • Positive expectancy
  • Consistent across periods

T-Test:

Test if returns are statistically different from zero
p-value < 0.05 → Statistically significant
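
The test statistic for a one-sample t-test is the mean trade return divided by its standard error. The sketch below computes only the statistic and assumes trades are independent; converting it to an exact p-value needs the t-distribution, which is not in the JavaScript standard library. As a rule of thumb, with about 100 trades a two-sided p-value below 0.05 corresponds to |t| above roughly 1.98. `tStatistic` is a hypothetical helper.

```javascript
// One-sample t-statistic for "mean return differs from zero".
function tStatistic(returns) {
  const n = returns.length;
  if (n < 2) throw new RangeError("need at least two returns");
  const mean = returns.reduce((a, b) => a + b, 0) / n;
  // Sample variance (divides by n - 1).
  const variance = returns.reduce((a, r) => a + (r - mean) ** 2, 0) / (n - 1);
  const standardError = Math.sqrt(variance / n);
  return mean / standardError;
}
```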

6. Risk-Adjusted Returns

Don't Just Look at Returns:

Strategy A: 40% return, 30% drawdown
Strategy B: 25% return, 10% drawdown

Strategy B is better on a risk-adjusted basis (return/drawdown of 2.5 vs 1.3)

Use Sharpe Ratio:

Sharpe = (Return - Risk-Free Rate) / Std Deviation

>1.0: Good
>2.0: Excellent
>3.0: Outstanding
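
The formula can be computed from a series of periodic returns. This sketch assumes returns and the risk-free rate are expressed per period (for example, per trading day) and annualizes by the square root of the number of periods per year, which is a common convention rather than something this guide specifies; the default of 252 trading days and the function name are assumptions.

```javascript
// Annualized Sharpe ratio from periodic returns (as decimals, e.g. 0.01 = 1%).
function sharpeRatio(periodReturns, riskFreePerPeriod = 0, periodsPerYear = 252) {
  const n = periodReturns.length;
  if (n < 2) throw new RangeError("need at least two returns");
  const excess = periodReturns.map((r) => r - riskFreePerPeriod);
  const mean = excess.reduce((a, b) => a + b, 0) / n;
  // Sample standard deviation of excess returns.
  const variance = excess.reduce((a, r) => a + (r - mean) ** 2, 0) / (n - 1);
  return (mean / Math.sqrt(variance)) * Math.sqrt(periodsPerYear);
}
```

The thresholds above are normally applied to the annualized figure, so the choice of `periodsPerYear` should match the data frequency.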

7. Document Everything

Record:

  • Strategy logic
  • Parameters tested
  • Optimization process
  • Results and metrics
  • Assumptions made
  • Validation methods

Why:

  • Reproducibility
  • Learning from failures
  • Avoiding repeated mistakes
  • Compliance and auditing

Interpreting Results

Good Backtest Results

Characteristics:

  • Consistent returns across periods
  • Reasonable drawdowns (<20%)
  • Sufficient number of trades (100+)
  • Works on multiple instruments
  • Robust to parameter changes
  • Similar IS and OOS performance
  • Positive risk-adjusted returns

Example:

Period: 2020-2023
Total Return: 80% (20% per year simple average, about 16% compounded)
Win Rate: 52%
Profit Factor: 1.8
Sharpe Ratio: 1.5
Max Drawdown: 15%
Number of Trades: 250
WFE: 0.75
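
When turning a multi-year total return into an annual figure, compounding matters: dividing by the number of years overstates the annual rate. A minimal sketch, assuming the total return is in percent and the period is measured in years; `annualizedReturnPct` is a hypothetical helper.

```javascript
// Compound annual growth rate (CAGR) in percent.
function annualizedReturnPct(totalReturnPct, years) {
  return (Math.pow(1 + totalReturnPct / 100, 1 / years) - 1) * 100;
}
```

For an 80% total return over four years this gives roughly 15.8% per year, compared with 20% from simple division.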

Warning Signs

Red Flags:

  • Too good to be true (>100% annual)
  • Perfect or near-perfect win rate (>80%)
  • Very few trades (<50)
  • Works on only one stock
  • Sensitive to parameters
  • Large IS/OOS performance gap
  • Inconsistent across periods

Example:

Period: 2020-2023
Total Return: 500% (125% per year simple average, about 57% compounded) ⚠️
Win Rate: 95% ⚠️
Profit Factor: 10.0 ⚠️
Number of Trades: 15 ⚠️
WFE: 0.2 ⚠️

Likely overfitted or data snooped!

Realistic Expectations

Annual Returns:

  • Excellent: 30-50%
  • Good: 20-30%
  • Acceptable: 15-20%
  • Poor: <15%

Win Rate:

  • High: 55-65%
  • Medium: 45-55%
  • Low: 35-45%

Sharpe Ratio:

  • Excellent: >2.0
  • Good: 1.0-2.0
  • Acceptable: 0.5-1.0
  • Poor: <0.5

From Backtest to Live Trading

1. Paper Trading

Before Going Live:

  • Run strategy in paper mode
  • Monitor for 1-2 weeks minimum
  • Verify execution logic
  • Check for bugs
  • Confirm performance matches backtest

2. Small Position Sizes

Start Conservative:

Backtest: 2% risk per trade
Live Start: 0.5% risk per trade

Gradually increase as confidence builds
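
Risk per trade translates into a share quantity through the distance between entry and stop. The sketch below assumes a long position and whole-share quantities; `positionQuantity` and the example prices are hypothetical.

```javascript
// Number of shares such that hitting the stop loses about `riskPct` of capital.
function positionQuantity(capital, riskPct, entryPrice, stopPrice) {
  const riskAmount = capital * (riskPct / 100);
  const riskPerShare = entryPrice - stopPrice;
  if (riskPerShare <= 0) {
    throw new RangeError("stop must be below entry for a long position");
  }
  return Math.floor(riskAmount / riskPerShare); // round down to whole shares
}
```

With 100,000 of capital, an entry at 1,000 and a stop at 970, risking 2% allows 66 shares while the more conservative 0.5% allows 16.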

3. Monitor Closely

Track:

  • Actual vs expected performance
  • Execution quality
  • Slippage amounts
  • Any unexpected behavior

4. Accept Variance

Understand:

  • Live results will differ from backtest
  • Short-term variance is normal
  • Judge over 50+ trades minimum
  • Some drawdown is expected

5. When to Stop

Stop If:

  • Drawdown exceeds the backtest's maximum drawdown by 50% (e.g., 22.5% live vs 15% in the backtest)
  • Strategy logic appears broken
  • Market conditions fundamentally changed
  • Consistent underperformance (100+ trades)

Summary

Key Principles:

  1. Realistic Assumptions: Include costs, slippage, delays
  2. Avoid Biases: Look-ahead, survivorship, curve-fitting
  3. Validate Thoroughly: OOS, walk-forward, Monte Carlo
  4. Keep It Simple: Fewer parameters, standard values
  5. Test Robustness: Multiple periods, instruments, conditions
  6. Statistical Significance: 100+ trades, 2+ years
  7. Risk-Adjusted: Focus on Sharpe ratio, not just returns
  8. Document Process: Record everything for learning
  9. Be Skeptical: If it looks too good, it probably is
  10. Paper Trade First: Validate before risking real money

Backtesting Checklist:

  • Strategy rules clearly defined
  • Sufficient historical data (2+ years)
  • Realistic costs included
  • No look-ahead bias
  • Out-of-sample validation
  • Walk-forward analysis
  • Monte Carlo simulation
  • Multiple instruments tested
  • Parameter robustness checked
  • Results documented
  • Paper trading completed