```python
from olpsbandit.llm import GRPOTrainer, NewsAligner

# 1. Align news tokens with price action
dataset = NewsAligner.sync(tickers=["NVDA", "AMD"], lookback="2y")

# 2. Train the policy on returns with a Sharpe-based reward
trainer = GRPOTrainer(model="qwen-3-8b-fin", reward_func="sharpe")
trainer.train(dataset)  # (training call assumed)

# 3. Forecast quantiles on a live data feed
forecast = trainer.predict(live_stream)
print(forecast.p90_confidence)
# >> 0.87 (HIGH CONVICTION)
```
![Backtest Results](/backtest-preview.png)
![Live Monitor](/live-preview.png)
![Strategy Editor](/editor-preview.png)
Built for Modern Quants
A complete ecosystem for researching, backtesting, and deploying agent-based strategies.
Live Performance
Real-time leaderboard of deployed strategies.
| Strategy | Return | Sharpe | Avg Hold | Trend (1D) |
|---|---|---|---|---|
Price-Aligned News & Quantile Forecasting
Standard sentiment analysis is noisy. We use Reinforcement Learning from Verifiable Rewards (RLVR) to align large language models directly with market returns.
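As a toy illustration of what "verifiable" means here, the reward can be computed mechanically from realized prices rather than from human labels. The function below is a hypothetical sketch, not the trainer's built-in reward:

```python
# Hypothetical sketch of a verifiable reward for RLVR fine-tuning:
# the model's directional call is scored against the return that
# actually materialized, so no human labeling is needed.
def verifiable_reward(predicted_direction: int, realized_return: float) -> float:
    """predicted_direction: +1 (long) or -1 (short); reward is signed PnL."""
    return predicted_direction * realized_return

print(verifiable_reward(+1, 0.013))  # correct long call  -> +0.013
print(verifiable_reward(-1, 0.013))  # wrong short call   -> -0.013
```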
1. Ingest news from 50+ financial sources
2. Output probability distributions (quantiles)
3. Execute on high-confidence P90/P10 divergences (see the sketch below)
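A minimal sketch of step 3, assuming the model forecasts return quantiles; the `p10`/`p90` arguments and the `min_gap` threshold are illustrative, not SDK parameters:

```python
# Hypothetical entry rule on forecast return quantiles: act only when
# the whole P10-P90 band sits clearly on one side of zero.
def signal_from_quantiles(p10: float, p90: float, min_gap: float = 0.002):
    """Return 'long', 'short', or None from forecast return quantiles."""
    if p10 > min_gap:       # even the pessimistic quantile is positive
        return "long"
    if p90 < -min_gap:      # even the optimistic quantile is negative
        return "short"
    return None             # band straddles zero: stay flat

print(signal_from_quantiles(p10=0.004, p90=0.021))  # -> long
```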
```python
# (import paths assumed for illustration)
from olpsbandit.bandits import ContextualBandit
from olpsbandit.strategies import MeanReversion, TrendFollowing, SentimentModel

# Initialize the strategy ensemble: each strategy is one bandit "arm"
bandit = ContextualBandit(
    arms=[
        MeanReversion(lookback=20),
        TrendFollowing(ema_span=50),
        SentimentModel(source="reuters"),
    ],
    policy="LinUCB",
)

# Adaptive allocation: pick the arm expected to perform best on the
# current market context (live_stream: your market data feed)
allocation = bandit.select_arm(live_stream)
```
Real-time weight adjustment, updating every tick.
Solve the Non-Stationarity Problem
What worked yesterday often fails today. Markets shift between regimes of trend, volatility, and liquidity. Instead of relying on a single static strategy, we treat every signal as an "arm" in a Multi-Armed Bandit problem.
Smart Exploitation
The system automatically shifts the bulk of capital to the current "hot hand". Reward functions (e.g., the Sharpe ratio) allow for regime specialization.
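For concreteness, a rolling Sharpe-style reward might look like the sketch below; the window and annualization factor are assumptions, not the SDK's defaults:

```python
import numpy as np

def sharpe_reward(window_returns: np.ndarray, periods_per_year: int = 252) -> float:
    """Annualized Sharpe ratio over a window of per-period returns."""
    sigma = window_returns.std()
    if sigma == 0:
        return 0.0  # no variance in the window: avoid division by zero
    return float(window_returns.mean() / sigma * np.sqrt(periods_per_year))

print(sharpe_reward(np.array([0.002, -0.001, 0.003, 0.001])))  # -> ~13.4
```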
Continuous Exploration
The system minimizes regret by keeping small test allocations on underperforming strategies, so you never miss a regime shift.
Contextual Bandits
Unlike simple bandits, our agents observe market context (volatility, sentiment, volume) to predict which arm will perform best before the move happens.
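To make the mechanism concrete, here is a minimal textbook LinUCB sketch (the disjoint-model variant). It is not the SDK's internals, and the context features are invented for illustration:

```python
import numpy as np

class LinUCBArm:
    """One arm's ridge-regression state for disjoint-model LinUCB."""
    def __init__(self, dim: int, alpha: float = 1.0):
        self.alpha = alpha          # exploration strength
        self.A = np.eye(dim)        # regularized design matrix (X^T X + I)
        self.b = np.zeros(dim)      # reward-weighted features (X^T y)

    def ucb(self, x: np.ndarray) -> float:
        A_inv = np.linalg.inv(self.A)
        theta = A_inv @ self.b                        # ridge estimate
        bonus = self.alpha * np.sqrt(x @ A_inv @ x)   # uncertainty bonus
        return float(theta @ x + bonus)

    def update(self, x: np.ndarray, reward: float) -> None:
        self.A += np.outer(x, x)
        self.b += reward * x

# Context vector, e.g. [realized vol, news sentiment, volume z-score]
context = np.array([0.8, -0.2, 1.5])
arms = {name: LinUCBArm(dim=3) for name in ["mean_rev", "trend", "sentiment"]}
chosen = max(arms, key=lambda name: arms[name].ucb(context))
arms[chosen].update(context, reward=0.01)  # e.g. realized reward for that arm
```

The uncertainty bonus is what drives continuous exploration: arms with little data in the current context get a higher score, so they keep receiving small test allocations.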
Ready to outsmart the market?
Clone the repo, install the SDK, and start training your first agent today.