Autonomous Execution Agent
Autonomous strategy research agent adapted from Microsoft's RD-Agent(Q).
Overview
The autonomous agent is the challenger. Adapted from Microsoft's RD-Agent(Q), it runs around the clock without supervision: it generates hypotheses, implements factors, backtests, and iterates, attempting to beat the human baseline.
Both systems log to the same trading-tracker for direct comparison.
Server Details
| Property | Value |
|---|---|
| IP Address | 64.23.179.43 |
| OS | Ubuntu 24.04 LTS |
| Dashboard | http://64.23.179.43:8080 |
| SSH Port | 22 |
| Auth | SSH key only |
Access
ssh your-username@64.23.179.43
Request access the same way as for the collab server.
Pipeline
Five-stage closed loop running continuously:
Specification → Synthesis → Implementation → Validation → Analysis
      ↑                                                      │
      └──────────────────────────────────────────────────────┘

- Specification: Research goal and constraints (e.g., "Sharpe > 1.5, max DD < 15%, US mid-cap").
- Synthesis: The LLM generates hypotheses from domain priors and prior results.
- Implementation: The Co-STEER code agent translates each hypothesis into an executable Qlib factor or model.
- Validation: Walk-forward backtest via Qlib + Papermill. Metrics: IC, Sharpe, drawdown, turnover.
- Analysis: A multi-armed bandit decides the next focus (factor vs. model). Results are persisted.
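The Analysis stage's choice between factor and model work can be illustrated with a small bandit. The source does not say which bandit algorithm the agent uses, so the sketch below assumes plain Beta-Bernoulli Thompson sampling with a binary "did this iteration improve results" reward. The class name, the reward definition, and the simulated success rates are all hypothetical.

```python
import random


class TwoArmBandit:
    """Beta-Bernoulli Thompson sampling over the arms 'factor' and 'model'."""

    def __init__(self, arms=("factor", "model"), seed=None):
        self.rng = random.Random(seed)
        # One [alpha, beta] pair per arm, starting from a uniform Beta(1, 1) prior.
        self.params = {arm: [1, 1] for arm in arms}

    def choose(self):
        # Sample a plausible success rate for each arm and pick the largest draw.
        draws = {arm: self.rng.betavariate(a, b) for arm, (a, b) in self.params.items()}
        return max(draws, key=draws.get)

    def update(self, arm, improved):
        # 'improved' is True when the iteration beat the previous best result.
        self.params[arm][0 if improved else 1] += 1


bandit = TwoArmBandit(seed=0)
# Hypothetical environment: factor iterations improve results 60% of the time,
# model iterations 20% of the time.
true_rates = {"factor": 0.6, "model": 0.2}
env = random.Random(1)
picks = {"factor": 0, "model": 0}
for _ in range(200):
    arm = bandit.choose()
    picks[arm] += 1
    bandit.update(arm, env.random() < true_rates[arm])
print(picks)
```

In this simulation the bandit shifts most of its iterations toward the arm that pays off more often, while still occasionally sampling the other arm.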
Adaptations from Original RD-Agent(Q)
| Aspect | Original | Ours |
|---|---|---|
| Market | Chinese A-shares | US equities (Alpaca) |
| Factors | Alpha101 | Custom from team research |
| Evaluation | Internal only | Same Alpaca + trading-tracker as human baseline |
| Human gate | Optional | Required for live promotion |
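To make the "Human gate" row concrete, the following sketch combines the example constraints from the Specification stage (Sharpe > 1.5, max drawdown < 15%) with an explicit approval flag. It is an illustration under stated assumptions, not the agent's actual promotion logic: the function names, the 252-day annualization factor, and the use of daily simple returns are all assumed here.

```python
import math
import statistics

TRADING_DAYS = 252  # assumed annualization factor for daily returns


def annualized_sharpe(daily_returns, risk_free_daily=0.0):
    """Annualized Sharpe ratio; needs at least two observations."""
    excess = [r - risk_free_daily for r in daily_returns]
    sd = statistics.stdev(excess)
    if sd == 0:
        return 0.0
    return statistics.mean(excess) / sd * math.sqrt(TRADING_DAYS)


def max_drawdown(daily_returns):
    """Largest peak-to-trough loss of the compounded equity curve, as a fraction."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in daily_returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst


def eligible_for_live(daily_returns, human_approved, min_sharpe=1.5, max_dd=0.15):
    """A strategy is promoted only if it meets the spec AND a human signs off."""
    meets_spec = (
        annualized_sharpe(daily_returns) > min_sharpe
        and max_drawdown(daily_returns) < max_dd
    )
    return meets_spec and human_approved


# Synthetic returns for illustration only.
returns = [0.01, -0.005, 0.008, -0.002, 0.006] * 20
print(eligible_for_live(returns, human_approved=False))
print(eligible_for_live(returns, human_approved=True))
```

Keeping the approval flag as a separate argument means good backtest metrics alone can never promote a strategy to live trading.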
Directory Structure
/opt/rd-agent/
├── config/ # spec.yaml, scheduler.yaml, model_config.yaml
├── workspace/
│ ├── factors/ # Generated factor code
│ ├── models/ # Generated model code
│ └── results/ # Per-iteration backtest results
├── knowledge/ # hypotheses.json, code_patterns.json, factor_registry.json
├── data/ # market/ and qlib_data/
└── logs/iterations/ # Per-iteration logs
Operations
# Status
systemctl status rd-agent
# Live logs
journalctl -u rd-agent -f
# Current iteration
jq '.iteration' /opt/rd-agent/workspace/results/latest/iteration.json
# View specific iteration
jq '.' /opt/rd-agent/workspace/results/iter_042/hypothesis.json
# Pause / resume
sudo systemctl stop rd-agent
sudo systemctl start rd-agent
# Update research goal
nano /opt/rd-agent/config/spec.yaml
sudo systemctl restart rd-agent
Compare Against Human Baseline
psql -d trading_tracker -c "
  SELECT source,
         COUNT(*) AS trades,
         ROUND(SUM(pnl)::numeric, 2) AS total_pnl,
         ROUND(AVG(pnl)::numeric, 4) AS avg_pnl
  FROM trades
  WHERE timestamp > NOW() - INTERVAL '30 days'
  GROUP BY source
  ORDER BY total_pnl DESC;
"
Performance Reference (Original RD-Agent(Q))
- ~2× annualized returns vs Alpha101 factor library.
- 70% fewer factors needed.
- < $10 LLM API cost per full research cycle.
Our results will differ based on market and configuration.
Troubleshooting
Stuck on implementation: inspect the end of the log with tail -n 100 /opt/rd-agent/logs/iterations/latest/implementation.log, then run sudo systemctl restart rd-agent.
Backtest failures: run python -c "import qlib; qlib.init(); print('OK')" to confirm that Qlib initializes, then check that /opt/rd-agent/data/qlib_data/ is populated.
High memory: check usage with free -h, then reduce the backtest range in spec.yaml or add swap.
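For the backtest-failure case, a quick directory check can complement the Qlib import test. The sketch below assumes /opt/rd-agent/data/qlib_data is itself the Qlib provider directory and that it follows Qlib's usual file layout with calendars, instruments, and features sub-directories. If the data sits one level deeper (for example in a per-region folder), the path would need adjusting. The helper name is hypothetical.

```python
from pathlib import Path

# Sub-directories that Qlib's file-based data provider normally contains.
EXPECTED_SUBDIRS = ("calendars", "instruments", "features")


def missing_qlib_dirs(provider_dir):
    """Return the expected sub-directories that are absent or empty."""
    root = Path(provider_dir)
    missing = []
    for name in EXPECTED_SUBDIRS:
        sub = root / name
        if not sub.is_dir() or not any(sub.iterdir()):
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumed provider path; adjust if the data sits in a sub-folder.
    problems = missing_qlib_dirs("/opt/rd-agent/data/qlib_data")
    print("OK" if not problems else f"Missing or empty: {', '.join(problems)}")
```

An empty or missing sub-directory usually points to an incomplete data download or conversion rather than a bug in the generated factor code.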