@swapp1990
Created February 26, 2026 19:33
Discussion reply from discuss-1772092674498

3 Strategies for a Reasoning Model

I picked the best-performing cycle with sound trading logic from each of three runs (Runs 1, 2, and 4). Each represents a different approach, ranging from pure binary rules to multi-condition reasoning.


Strategy 1: "Mechanical Momentum Executioner"
Source: Run 2, Cycle 8 (+$41.86, best single cycle across all runs)

You are a Mechanical Momentum Executioner.
GOAL: Capture 1% moves on breakout stocks.
CAPITAL: $4,000 per trade. qty = int(4000 / current_price).
DATA: opening_range, relative_strength, ohlcv.

ENTRY RULES (Binary Logic Only):
1. MOMENTUM: relative_strength > 0
2. BREAKOUT: current_price > opening_range['high']
IF BOTH ARE TRUE: BUY IMMEDIATELY. Ignore chart patterns. Just trade the level.

EXIT RULES (Check ohlcv High/Low):
1. WIN: IF ohlcv['high'] >= (Entry_Price * 1.01) -> SELL at (Entry * 1.01). Log 'TP HIT'.
2. LOSS: IF ohlcv['low'] <= (Entry_Price * 0.99) -> SELL at (Entry * 0.99). Log 'SL HIT'.
3. TIME: Close all positions at Minute 350.

CRITICAL: If the High touched your TP, it is a WIN. You must bank the profit.

Why it works: It removes all subjective analysis: two binary entry conditions, three exit rules, no discretionary judgment. The reflection literally said "Ignore chart patterns. Just trade the level." A reasoning model should execute this reliably, and the tight, symmetric 1% take-profit / 1% stop-loss means it can capture the edge from the breakout signal without needing any further directional prediction.
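To make the binary rules concrete, here is a minimal Python sketch of Strategy 1's sizing, entry, and exit checks. It is only an illustration of the prompt above, not the harness used in the runs. The function names, the `"TIME EXIT"` label, and closing at the bar's `close` on the time exit are assumptions. Note that the take-profit check runs first, so a bar that touches both levels is booked as a win, which mirrors the CRITICAL rule but is optimistic.

```python
from typing import Optional, Tuple

CAPITAL_PER_TRADE = 4000.0   # from the prompt: $4,000 per trade
TAKE_PROFIT_MULT = 1.01      # +1% target
STOP_LOSS_MULT = 0.99        # -1% stop
TIME_EXIT_MINUTE = 350       # close everything at minute 350


def position_size(current_price: float) -> int:
    """qty = int(4000 / current_price), exactly as the prompt states."""
    return int(CAPITAL_PER_TRADE / current_price)


def should_enter(relative_strength: float, current_price: float,
                 opening_range: dict) -> bool:
    """Both binary conditions must hold: momentum AND breakout."""
    return relative_strength > 0 and current_price > opening_range["high"]


def check_exit(entry_price: float, bar: dict,
               minute: int) -> Optional[Tuple[float, str]]:
    """Return (fill_price, log_label) if an exit rule fires, else None.

    `bar` is one ohlcv record with 'high', 'low', and 'close' keys.
    """
    take_profit = entry_price * TAKE_PROFIT_MULT
    stop_loss = entry_price * STOP_LOSS_MULT
    if bar["high"] >= take_profit:      # rule 1: WIN is checked first
        return take_profit, "TP HIT"
    if bar["low"] <= stop_loss:         # rule 2: LOSS
        return stop_loss, "SL HIT"
    if minute >= TIME_EXIT_MINUTE:      # rule 3: TIME (label is assumed)
        return bar["close"], "TIME EXIT"
    return None
```

For example, with an entry at 100.0, a bar whose high reaches 101.5 exits at roughly 101.0 with `'TP HIT'`, even if the same bar's low also dipped below 99.0.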


Strategy 2: "Disciplined Momentum Breakout"
Source: Run 1, Cycle 5 (+$7.78, beat random, 8 trades). This was our starting prompt for Run 2 (Grade B).

You are a Momentum Breakout Trader.
GOAL: Capture the bulk of trend moves by entering EARLY on breakout confirmation.

SIZING: qty = floor((available_cash * 0.40) / price). ALWAYS use this formula.

ENTRY RULES:
1. SIGNAL: Current price > opening_range High.
2. FILTER: relative_strength > 0 (Stock must be outperforming SPY).
3. SELECTION: Enter the FIRST symbol that meets these rules. Do NOT wait for 'highest intraday return'. If multiple qualify simultaneously, pick the one with highest relative_strength.

EXIT RULES:
1. STOP LOSS: Hard stop at -2.0% from entry price.
2. TIME EXIT: Sell ALL positions at minute 355. Do not sell early for profit taking.
3. ONE SHOT: If stopped out, do NOT re-enter that symbol today.

DATA: Use opening_range for levels and volume to confirm activity.

Why it works: This is the original opening range breakout (ORB) framework that was the foundation of our best run. The logic is sound: buy the first valid breakout (not the "best"), hold for the full move, and don't take early profit. The "first valid signal" rule prevents analysis paralysis. A reasoning model should handle the sizing math correctly (the #1 bug across all runs) and follow the one-shot rule without re-entering a stopped-out symbol.
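The sizing formula, the selection rule, and the one-shot rule are the parts most likely to be misapplied, so a rough Python sketch may help. This is an illustration under assumptions, not the actual run harness: the candidate dictionary layout (`symbol`, `price`, `opening_range_high`, `relative_strength`), the `stopped_out` set, and the returned labels are all hypothetical. Calling `pick_entry` once per minute gives the "first valid signal" behaviour, with `relative_strength` used only as the tie-breaker.

```python
import math
from typing import Iterable, Optional, Set

CASH_FRACTION = 0.40     # 40% of available cash per position
STOP_LOSS_PCT = 0.02     # hard stop at -2.0% from entry
TIME_EXIT_MINUTE = 355   # sell all positions at minute 355


def position_size(available_cash: float, price: float) -> int:
    """qty = floor((available_cash * 0.40) / price)."""
    return math.floor((available_cash * CASH_FRACTION) / price)


def pick_entry(candidates: Iterable[dict],
               stopped_out: Set[str]) -> Optional[str]:
    """Return the symbol to buy this minute, or None.

    A symbol qualifies if it broke the opening range high, has positive
    relative strength, and was not stopped out earlier today (ONE SHOT).
    """
    qualifying = [
        c for c in candidates
        if c["symbol"] not in stopped_out
        and c["price"] > c["opening_range_high"]
        and c["relative_strength"] > 0
    ]
    if not qualifying:
        return None
    # Simultaneous signals: take the highest relative_strength.
    return max(qualifying, key=lambda c: c["relative_strength"])["symbol"]


def should_exit(entry_price: float, price: float,
                minute: int) -> Optional[str]:
    """Only two exits exist: the hard stop and the time exit."""
    if price <= entry_price * (1 - STOP_LOSS_PCT):
        return "STOP LOSS"
    if minute >= TIME_EXIT_MINUTE:
        return "TIME EXIT"
    return None  # no profit-taking before minute 355
```

With $10,000 of available cash and a price of $123.45, `position_size` returns 32 shares.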


Strategy 3: "Momentum Surge (Disciplined)"
Source: Run 4, Cycle 8 (+$18.15, beat random). Best cycle from the pro model run.

STRATEGY: MOMENTUM SURGE (DISCIPLINED)

CRITICAL EXECUTION RULES:
1. TIME FILTER: DO NOT trade before Minute 45. If current_time < 45, HOLD CASH.
2. SIZING: shares = FLOOR(1500 / current_price). If VIX > 30, use $800 basis.

MARKET REGIME FILTERS:
- IF SPY IS RED (intraday_return_pct < 0): ONLY buy stocks with intraday_return_pct > 1.5%.
- IF SPY IS GREEN: Buy stocks with intraday_return_pct > 0.
- NEWS: No buys if headlines contain 'fraud', 'investigation', 'lawsuit'.

ENTRY SIGNAL:
1. BUY if Price > opening_range High.
2. CHASE PROTECTION: Do not buy if Price > (opening_range High * 1.04).

EXIT RULES:
1. FALSE BREAKOUT: SELL IMMEDIATELY if Price closes BELOW opening_range High.
2. PROFIT: Sell at +4% gain.
3. TIME EXIT: Sell all at Minute 385.
4. STOP LOSS: -2.5% fixed.

DATA: opening_range, intraday_return_pct, vix_level.

Why it works: This is the most sophisticated strategy: multi-condition reasoning with market regime adaptation. The fast model COULDN'T handle it (it bought during selloffs, ignored SPY direction, and failed sizing). But the logic is sound: wait for the opening range to establish (Minute 45), adapt to the market regime (require an intraday return above 1.5% when SPY is red), and protect against chasing extended breakouts (no buys more than 4% above the opening range high). A reasoning model should be able to evaluate all conditions together, neither freezing up nor ignoring any of them.
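Because this strategy stacks several filters, it can help to see them combined in one place. The Python sketch below is one possible reading of the prompt, not the code used in Run 4. The function names, the return labels, and the argument layout are hypothetical. It also assumes that `intraday_return_pct` is expressed in percent units (1.5 means 1.5%), that a flat SPY (exactly 0) counts as green, and that exit rules are evaluated in the order the prompt lists risk exits first; the prompt does not specify any of these.

```python
import math
from typing import Optional, Sequence

BLOCKED_WORDS = ("fraud", "investigation", "lawsuit")
FIRST_TRADE_MINUTE = 45
CHASE_LIMIT_MULT = 1.04      # no buys above opening range high * 1.04


def position_size(current_price: float, vix_level: float) -> int:
    """shares = floor(1500 / price), or an $800 basis if VIX > 30."""
    basis = 800.0 if vix_level > 30 else 1500.0
    return math.floor(basis / current_price)


def may_buy(minute: int, price: float, opening_range_high: float,
            stock_return_pct: float, spy_return_pct: float,
            headlines: Sequence[str]) -> bool:
    """Every filter must pass before a buy is allowed."""
    if minute < FIRST_TRADE_MINUTE:                      # time filter
        return False
    if any(w in h.lower() for h in headlines for w in BLOCKED_WORDS):
        return False                                     # news filter
    required_return = 1.5 if spy_return_pct < 0 else 0.0
    if stock_return_pct <= required_return:              # regime filter
        return False
    if price <= opening_range_high:                      # breakout signal
        return False
    if price > opening_range_high * CHASE_LIMIT_MULT:    # chase protection
        return False
    return True


def exit_reason(entry_price: float, close_price: float,
                opening_range_high: float, minute: int) -> Optional[str]:
    """Return the first exit rule that fires, or None to keep holding."""
    if close_price < opening_range_high:
        return "FALSE BREAKOUT"
    if close_price >= entry_price * 1.04:
        return "PROFIT"
    if close_price <= entry_price * 0.975:
        return "STOP LOSS"
    if minute >= 385:
        return "TIME EXIT"
    return None
```

For instance, with SPY down 0.3% on the day, a stock up 0.8% that has broken its opening range high is still rejected, while the same stock up 2.0% passes.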


Summary:

| # | Name | Approach | Risk/Reward | Best For |
|---|------|----------|-------------|----------|
| 1 | Mechanical Executioner | Binary scalp (1%/1%) | Low risk, consistent | Baseline benchmark |
| 2 | Disciplined Breakout | ORB trend-ride (-2% stop, hold) | Medium risk, bigger moves | Clean execution |
| 3 | Momentum Surge | Multi-filter regime-adaptive | Higher complexity | Tests reasoning ability |

My recommendation: Run all 3 with the reasoning model. Strategy 1 sets the floor (a smart model should nail binary logic). Strategy 2 tests proper execution. Strategy 3 is the real test — if the reasoning model can handle multi-condition logic without paralysis, it'll prove the value of a smarter model.

Note: Position sizing guardrail is now deployed, so all 3 strategies will get proper qty regardless of what the LLM calculates. The mechanical stops, early-sell blocker, buy cutoff, and force-close are also active.
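The deployed guardrail's implementation is not shown here, so the following Python sketch is purely a guess at the general idea: recompute the quantity from the strategy's dollar budget and use that value instead of whatever the LLM produced. The function name `guarded_qty`, its arguments, and the returned "overridden" flag are all hypothetical.

```python
import math
from typing import Tuple


def guarded_qty(llm_qty: int, budget: float, price: float) -> Tuple[int, bool]:
    """Return (qty_to_use, was_overridden).

    The quantity is always recomputed as floor(budget / price); the flag
    only reports whether the LLM's own number disagreed with it.
    """
    if price <= 0:
        raise ValueError("price must be positive")
    expected = math.floor(budget / price)
    return expected, llm_qty != expected
```

With a $4,000 budget and a price of $123.45, an LLM answer of 400 shares would be replaced by 32 and flagged as overridden.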

What do you want to do?
