
How to Validate with Monte Carlo

Problem

Your backtest shows good results, but you're not sure if the performance is due to skill or just lucky trade sequencing. You want to assess how robust your strategy is under different trade orders.

Prerequisites

  • Completed backtest with trade history
  • Minimum 30 trades (preferably 50+)
  • Understanding of probability and confidence intervals
  • Realistic expectations about strategy performance

What is Monte Carlo Simulation?

Monte Carlo simulation randomizes the order of your historical trades thousands of times to answer:

  • Was I lucky? Did I happen to get winning trades at the right time?
  • How robust is my strategy? Does it work under different trade sequences?
  • What's the worst case? What's the maximum drawdown I might experience?
  • What's the probability of ruin? What are the chances of losing 50% of my account?
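Under the hood, each simulation is just a reshuffle-and-replay of the historical trade list. A minimal, API-independent sketch of that core loop (function names are illustrative, not part of the platform):

```javascript
// Shuffle a copy of the per-trade returns (Fisher-Yates)
function shuffled(trades) {
  const t = trades.slice()
  for (let i = t.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1))
    const tmp = t[i]; t[i] = t[j]; t[j] = tmp
  }
  return t
}

// Replay an ordering of trade returns (%) and measure its max drawdown (%)
function maxDrawdownPct(tradeReturnsPct) {
  let equity = 100, peak = 100, maxDd = 0
  for (const r of tradeReturnsPct) {
    equity *= 1 + r / 100
    peak = Math.max(peak, equity)
    maxDd = Math.max(maxDd, (peak - equity) / peak * 100)
  }
  return maxDd
}

// One Monte Carlo run = one shuffled replay; repeat numSimulations times
function simulate(tradeReturnsPct, numSimulations) {
  const drawdowns = []
  for (let i = 0; i < numSimulations; i++) {
    drawdowns.push(maxDrawdownPct(shuffled(tradeReturnsPct)))
  }
  return drawdowns
}
```

The resulting array of drawdowns (and, analogously, total returns) is what the statistics in the following steps are computed from.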

Solution

Step 1: Configure Monte Carlo Settings

Set up the simulation parameters:

{
  "monteCarloConfig": {
    "enabled": true,
    "numSimulations": 5000,
    "confidenceLevel": 95,
    "ruinThreshold": -50
  }
}

Parameter Guidelines:

Parameter        Recommended    Description
numSimulations   1000-10000     More = more accurate, but slower
confidenceLevel  90, 95, or 99  Higher = wider confidence intervals
ruinThreshold    -30 to -50     Drawdown % considered "ruin"

Step 2: Run Monte Carlo Simulation

Execute the simulation on your backtest results:

// Run Monte Carlo simulation
const res = await fetch('/api/backtest/monte-carlo', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': `Bearer ${token}`
  },
  body: JSON.stringify({
    backtestId: "backtest_123",
    monteCarloConfig: {
      enabled: true,
      numSimulations: 5000,
      confidenceLevel: 95,
      ruinThreshold: -50
    }
  })
})
const simulation = await res.json()

console.log('Monte Carlo simulation started:', simulation.id)

Execution Time:

  • 1000 simulations: 10-30 seconds
  • 5000 simulations: 30-90 seconds
  • 10000 simulations: 1-3 minutes
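Because the run takes time, the POST above only starts the job. Assuming a status endpoint such as `GET /api/backtest/monte-carlo/:id` exists (hypothetical; check your API's actual route and response shape), completion can be awaited with a small polling helper:

```javascript
// Poll until the simulation completes. The status endpoint and the
// `status` field values are assumptions for illustration.
// fetchFn is injectable so the helper can be tested without a server.
async function waitForSimulation(id, fetchFn, { intervalMs = 2000, maxAttempts = 90 } = {}) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const res = await fetchFn(`/api/backtest/monte-carlo/${id}`)
    const body = await res.json()
    if (body.status === 'completed') return body
    if (body.status === 'failed') throw new Error('Simulation failed')
    await new Promise(resolve => setTimeout(resolve, intervalMs))
  }
  throw new Error('Timed out waiting for Monte Carlo simulation')
}
```

For a 10000-simulation run, pair a longer interval with a generous `maxAttempts` so the helper does not time out before the job finishes.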

Step 3: Understand Confidence Intervals

The simulation provides confidence intervals for key metrics:

{
  "monteCarloResults": {
    "totalReturn": {
      "original": 20.5,
      "mean": 18.2,
      "median": 17.8,
      "stdDev": 8.5,
      "confidenceInterval": {
        "lower": 5.2,
        "upper": 31.8
      },
      "percentileRank": 68
    },
    "maxDrawdown": {
      "original": 15.5,
      "mean": 18.3,
      "median": 17.2,
      "stdDev": 6.2,
      "confidenceInterval": {
        "lower": 8.5,
        "upper": 28.9
      },
      "percentileRank": 35
    },
    "sharpeRatio": {
      "original": 1.85,
      "mean": 1.62,
      "median": 1.58,
      "stdDev": 0.45,
      "confidenceInterval": {
        "lower": 0.85,
        "upper": 2.42
      },
      "percentileRank": 72
    },
    "profitFactor": {
      "original": 1.8,
      "mean": 1.65,
      "median": 1.62,
      "stdDev": 0.38,
      "confidenceInterval": {
        "lower": 1.05,
        "upper": 2.28
      },
      "percentileRank": 65
    }
  }
}

Interpreting Confidence Intervals:

95% confidence interval means:

  • 95% of simulations fell within this range
  • Reordering the same trades produces a result in this range about 95% of the time
  • Wider intervals = more uncertainty

Example:

Total Return: 20.5% (original)
95% CI: [5.2%, 31.8%]

Interpretation:
- Your actual return was 20.5%
- 95% of simulations returned between 5.2% and 31.8%
- 95% of reshuffled sequences landed in this range; this quantifies sequencing risk, not a guarantee about future trades
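These intervals are empirical: for a 95% confidence level they are simply the 2.5th and 97.5th percentiles of the simulated outcomes. A sketch of the computation (helper names are illustrative):

```javascript
// Linear-interpolated percentile of a sorted array
function percentile(sortedValues, p) {
  const idx = (p / 100) * (sortedValues.length - 1)
  const lo = Math.floor(idx), hi = Math.ceil(idx)
  const w = idx - lo
  return sortedValues[lo] * (1 - w) + sortedValues[hi] * w
}

// Empirical CI: sort the per-simulation outcomes and take the
// (tail) and (100 - tail) percentiles, where tail = (100 - level) / 2
function confidenceInterval(values, level = 95) {
  const sorted = values.slice().sort((a, b) => a - b)
  const tail = (100 - level) / 2
  return {
    lower: percentile(sorted, tail),
    upper: percentile(sorted, 100 - tail)
  }
}
```

A higher `confidenceLevel` takes values further out in the tails, which is why it produces wider intervals.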

Step 4: Check Percentile Rank

Percentile rank shows where your original results fall compared to simulations:

function interpretPercentileRank(rank, metric) {
  if (rank > 75) {
    return {
      interpretation: 'Lucky',
      message: `Your ${metric} was better than ${rank}% of simulations. You may have been lucky with trade timing.`,
      concern: 'high'
    }
  } else if (rank >= 45 && rank <= 55) {
    return {
      interpretation: 'Typical',
      message: `Your ${metric} was average. Results are likely due to strategy skill, not luck.`,
      concern: 'low'
    }
  } else if (rank < 25) {
    return {
      interpretation: 'Unlucky',
      message: `Your ${metric} was worse than ${100 - rank}% of simulations. You may have been unlucky.`,
      concern: 'medium'
    }
  } else {
    return {
      interpretation: 'Normal',
      message: `Your ${metric} was within normal range.`,
      concern: 'low'
    }
  }
}

// Examples (ranks of 68 and 35 both fall in the "Normal" branch)
const returnRank = interpretPercentileRank(68, 'total return')
// { interpretation: 'Normal', message: 'Your total return was within normal range.', concern: 'low' }

const drawdownRank = interpretPercentileRank(35, 'max drawdown')
// { interpretation: 'Normal', message: 'Your max drawdown was within normal range.', concern: 'low' }

Percentile Rank Guidelines:

  • > 75%: You were lucky (results better than most simulations)
  • 45-55%: Typical (results near average)
  • < 25%: You were unlucky (results worse than most simulations)
  • 25-45% and 55-75%: Normal variation, no strong evidence either way
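Percentile rank itself is straightforward to compute: the share of simulated outcomes that the original result beat. A minimal sketch:

```javascript
// Percentile rank = % of simulated outcomes strictly below the original result
function percentileRank(simulatedValues, original) {
  const below = simulatedValues.filter(v => v < original).length
  return Math.round((below / simulatedValues.length) * 100)
}
```

Note that for drawdown a low rank is good news (your realized drawdown was smaller than most simulated orderings), whereas for return a very high rank is the luck warning sign.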

Step 5: Assess Robustness

Check the robustness rating:

{
  "robustness": {
    "rating": "high",
    "score": 82,
    "factors": {
      "consistentProfitability": true,
      "narrowConfidenceIntervals": true,
      "lowProbabilityOfRuin": true,
      "stablePerformance": true
    }
  }
}

Robustness Ratings:

  • High (80-100): Strategy is very robust, performs well under different sequences
  • Moderate (60-79): Strategy is reasonably robust, some variability
  • Low (< 60): Strategy is fragile, highly dependent on trade sequence

How the rating can be derived:

function assessRobustness(results) {
  const factors = {
    // Factor 1: Profitable even at the lower confidence bound
    consistentProfitability: results.totalReturn.confidenceInterval.lower > 0,

    // Factor 2: Narrow confidence interval relative to the mean
    narrowConfidenceIntervals: (
      (results.totalReturn.confidenceInterval.upper -
       results.totalReturn.confidenceInterval.lower) /
      results.totalReturn.mean
    ) < 2.0,

    // Factor 3: Low probability of ruin (< 5%)
    lowProbabilityOfRuin: results.probabilityOfRuin < 5,

    // Factor 4: Stable performance (coefficient of variation < 0.5)
    stablePerformance: results.totalReturn.stdDev / results.totalReturn.mean < 0.5
  }

  const score = Object.values(factors).filter(Boolean).length / 4 * 100

  return {
    rating: score >= 80 ? 'high' : score >= 60 ? 'moderate' : 'low',
    score: Math.round(score),
    factors
  }
}

Step 6: Check Probability of Ruin

Assess the risk of catastrophic loss:

{
  "probabilityOfRuin": 2.5,
  "ruinThreshold": -50,
  "worstCaseDrawdown": -42.3
}

Interpretation:

  • < 1%: Excellent (very low risk of ruin)
  • 1-5%: Good (acceptable risk)
  • 5-10%: Moderate (higher risk, manageable)
  • > 10%: High (significant risk of ruin)

Example assessment:

function assessRuinRisk(probabilityOfRuin, threshold) {
  return {
    probability: probabilityOfRuin + '%',
    threshold: threshold + '%',
    interpretation: probabilityOfRuin < 1 ? 'Excellent - Very low risk' :
      probabilityOfRuin < 5 ? 'Good - Acceptable risk' :
      probabilityOfRuin < 10 ? 'Moderate - Higher risk' :
      'High - Significant risk of ruin',
    recommendation: probabilityOfRuin < 5 ? 'Safe to trade with current position sizing' :
      probabilityOfRuin < 10 ? 'Consider reducing position size' :
      'Reduce position size significantly or improve strategy'
  }
}

// Example: 2.5% probability of -50% drawdown
// "Good - Acceptable risk. Safe to trade with current position sizing."
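The probability figure itself is derived by counting: the share of simulations whose max drawdown breached the ruin threshold. A sketch (assumes drawdowns are reported as positive percentages and the threshold is negative, as in the config above):

```javascript
// Probability of ruin (%) = share of simulated orderings whose max
// drawdown reached or exceeded the ruin threshold
function probabilityOfRuin(maxDrawdowns, ruinThreshold) {
  const limit = Math.abs(ruinThreshold)
  const breaches = maxDrawdowns.filter(dd => dd >= limit).length
  return (breaches / maxDrawdowns.length) * 100
}
```

Re-running this with several thresholds (-30, -40, -50) gives a fuller picture of tail risk than any single number.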

Step 7: Analyze Distribution

Review the distribution of outcomes:

{
  "distribution": {
    "profitable": 4750,
    "unprofitable": 250,
    "profitablePercentage": 95.0,
    "bestCase": 45.2,
    "worstCase": -12.8,
    "percentiles": {
      "p10": 8.5,
      "p25": 12.3,
      "p50": 17.8,
      "p75": 24.1,
      "p90": 29.6
    }
  }
}
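A distribution block like the one above can be summarized directly from the raw per-simulation results; a sketch assuming `outcomes` holds one total-return percentage per simulation:

```javascript
// Summarize simulated total returns (%) into a distribution object
function summarizeOutcomes(outcomes) {
  const sorted = outcomes.slice().sort((a, b) => a - b)
  // Nearest-rank percentile over the sorted outcomes
  const at = p => sorted[Math.min(sorted.length - 1, Math.floor((p / 100) * sorted.length))]
  const profitable = outcomes.filter(r => r > 0).length
  return {
    profitable,
    unprofitable: outcomes.length - profitable,
    profitablePercentage: (profitable / outcomes.length) * 100,
    bestCase: sorted[sorted.length - 1],
    worstCase: sorted[0],
    percentiles: { p10: at(10), p25: at(25), p50: at(50), p75: at(75), p90: at(90) }
  }
}
```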

Distribution Analysis:

function analyzeDistribution(distribution) {
  return {
    profitableRate: distribution.profitablePercentage + '%',
    range: `${distribution.worstCase}% to ${distribution.bestCase}%`,
    medianReturn: distribution.percentiles.p50 + '%',

    // Downside risk (10th percentile)
    downsideRisk: distribution.percentiles.p10 + '%',

    // Upside potential (90th percentile)
    upsidePotential: distribution.percentiles.p90 + '%',

    // Symmetry check: median close to the midpoint of the interquartile range
    isSymmetric: Math.abs(
      distribution.percentiles.p50 -
      (distribution.percentiles.p25 + distribution.percentiles.p75) / 2
    ) < 2,

    interpretation: distribution.profitablePercentage >= 90 ?
      'Highly consistent - profitable in 90%+ simulations' :
      distribution.profitablePercentage >= 75 ?
      'Reasonably consistent - profitable in 75%+ simulations' :
      'Inconsistent - significant variability in outcomes'
  }
}

Complete Monte Carlo Analysis

// runMonteCarloSimulation and getMonteCarloResults are thin wrappers
// around the API calls shown in Step 2
async function validateWithMonteCarlo(backtestId) {
  // 1. Run simulation
  console.log('Running Monte Carlo simulation...')
  const simulation = await runMonteCarloSimulation(backtestId, {
    numSimulations: 5000,
    confidenceLevel: 95,
    ruinThreshold: -50
  })

  // 2. Get results
  const results = await getMonteCarloResults(simulation.id)

  // 3. Analyze results
  const analysis = {
    // Check if lucky
    isLucky: results.totalReturn.percentileRank > 75,
    isUnlucky: results.totalReturn.percentileRank < 25,

    // Assess robustness
    robustness: assessRobustness(results),

    // Check ruin risk
    ruinRisk: assessRuinRisk(
      results.probabilityOfRuin,
      results.ruinThreshold
    ),

    // Analyze distribution
    distribution: analyzeDistribution(results.distribution),

    // Overall confidence intervals
    expectedReturn: {
      lower: results.totalReturn.confidenceInterval.lower,
      upper: results.totalReturn.confidenceInterval.upper,
      median: results.totalReturn.median
    },

    expectedDrawdown: {
      lower: results.maxDrawdown.confidenceInterval.lower,
      upper: results.maxDrawdown.confidenceInterval.upper,
      median: results.maxDrawdown.median
    }
  }

  // 4. Make decision
  const decision = makeValidationDecision(analysis)

  console.log('Monte Carlo Analysis Complete')
  console.log('Decision:', decision.verdict)
  console.log('Confidence:', decision.confidence)

  return { analysis, decision }
}

function makeValidationDecision(analysis) {
  const concerns = []

  // Check for luck
  if (analysis.isLucky) {
    concerns.push('Results may be due to lucky trade sequencing')
  }

  // Check ruin risk (probability is formatted as a string like '2.5%')
  if (parseFloat(analysis.ruinRisk.probability) > 10) {
    concerns.push('High probability of catastrophic loss')
  }

  // Check robustness
  if (analysis.robustness.rating === 'low') {
    concerns.push('Strategy is not robust to different trade sequences')
  }

  // Check consistency (profitableRate is formatted as a string like '95%')
  if (parseFloat(analysis.distribution.profitableRate) < 75) {
    concerns.push('Strategy is unprofitable in 25%+ of simulations')
  }

  // Make decision
  if (concerns.length === 0) {
    return {
      verdict: 'APPROVED',
      confidence: 'high',
      message: 'Strategy passed Monte Carlo validation. Ready for live trading.',
      concerns: []
    }
  } else if (concerns.length <= 1) {
    return {
      verdict: 'APPROVED WITH CAUTION',
      confidence: 'moderate',
      message: 'Strategy passed with minor concerns. Proceed carefully.',
      concerns
    }
  } else {
    return {
      verdict: 'NOT APPROVED',
      confidence: 'low',
      message: 'Strategy failed Monte Carlo validation. Needs improvement.',
      concerns
    }
  }
}

Interpreting Results

Scenario 1: Robust Strategy

{
  "totalReturn": {
    "original": 18.5,
    "percentileRank": 52,
    "confidenceInterval": [12.3, 24.8]
  },
  "robustness": { "rating": "high", "score": 85 },
  "probabilityOfRuin": 1.2,
  "profitablePercentage": 94
}

Interpretation:

  • ✓ Percentile rank near 50% (not lucky)
  • ✓ High robustness score
  • ✓ Low probability of ruin
  • ✓ Profitable in 94% of simulations
  • Verdict: Strategy is robust and ready for live trading

Scenario 2: Lucky Strategy

{
  "totalReturn": {
    "original": 35.2,
    "percentileRank": 88,
    "confidenceInterval": [5.8, 42.1]
  },
  "robustness": { "rating": "moderate", "score": 65 },
  "probabilityOfRuin": 8.5,
  "profitablePercentage": 78
}

Interpretation:

  • ⚠ Percentile rank 88% (lucky)
  • ⚠ Moderate robustness
  • ⚠ Higher probability of ruin
  • ⚠ Wide confidence interval
  • Verdict: Results may be due to luck. Proceed with caution or improve strategy.

Scenario 3: Fragile Strategy

{
  "totalReturn": {
    "original": 22.5,
    "percentileRank": 45,
    "confidenceInterval": [-8.2, 48.3]
  },
  "robustness": { "rating": "low", "score": 42 },
  "probabilityOfRuin": 15.2,
  "profitablePercentage": 62
}

Interpretation:

  • ✗ Very wide confidence interval
  • ✗ Low robustness score
  • ✗ High probability of ruin
  • ✗ Unprofitable in 38% of simulations
  • Verdict: Strategy is fragile. Do not trade live. Needs significant improvement.

Best Practices

  1. Run Sufficient Simulations: Use 5000-10000 for reliable results
  2. Check Multiple Metrics: Don't rely on just one metric
  3. Look for Consistency: Strategy should be profitable in 80%+ simulations
  4. Assess Luck Factor: Percentile rank should be 40-60%
  5. Verify Robustness: Robustness score should be 70+
  6. Check Ruin Risk: Probability of ruin should be < 5%
  7. Review Confidence Intervals: Narrower is better
  8. Compare to Baseline: Monte Carlo results should match backtest expectations
  9. Test Different Thresholds: Try -30%, -40%, -50% ruin thresholds
  10. Document Results: Keep records for future reference

Troubleshooting

Problem: Wide Confidence Intervals

Causes:

  • High variability in trade results
  • Few large winners/losers
  • Inconsistent strategy performance

Solutions:

  • Improve strategy consistency
  • Add filters to reduce variability
  • Increase sample size (more trades)

Problem: High Probability of Ruin

Causes:

  • Position sizes too large
  • Stop losses too wide
  • Insufficient risk management

Solutions:

  • Reduce position size
  • Tighten stop losses
  • Implement better risk management

Problem: Lucky Results (High Percentile Rank)

Causes:

  • Favorable trade sequencing
  • Overfitted parameters
  • Small sample size

Solutions:

  • Run more simulations
  • Test on different time periods
  • Validate with walk-forward optimization