Autonomous Execution Agent

Overview

The autonomous agent is the challenger. Adapted from Microsoft's RD-Agent(Q), it runs around the clock without supervision: it generates hypotheses, implements factors, backtests them, and iterates, with the goal of beating the human baseline.

Both systems log to the same trading-tracker for direct comparison.

Server Details

Property	Value
IP Address	`64.23.179.43`
OS	Ubuntu 24.04 LTS
Dashboard	`http://64.23.179.43:8080`
SSH Port	`22`
Auth	SSH key only

Access

ssh your-username@64.23.179.43

Request access the same way as for the collab server.

Pipeline

Five-stage closed loop running continuously:

Specification → Synthesis → Implementation → Validation → Analysis
      ↑                                                      │
      └──────────────────────────────────────────────────────┘

Specification — Research goal and constraints (e.g., "Sharpe > 1.5, max DD < 15%, US mid-cap").
Synthesis — LLM generates hypotheses from domain priors and prior results.
Implementation — Co-STEER code agent translates hypothesis to executable Qlib factor/model.
Validation — Walk-forward backtest via Qlib + Papermill. Metrics: IC, Sharpe, drawdown, turnover.
Analysis — Multi-armed bandit decides next focus (factor vs model). Results persisted.
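
The Analysis stage's choice between factor work and model work can be illustrated with a small Thompson-sampling bandit. This is a minimal sketch under stated assumptions, not the scheduler the agent actually runs: the names `BetaArm` and `FocusBandit`, the uniform Beta(1, 1) prior, and the binary "did this iteration improve on the previous best" reward are all hypothetical.

```python
import random
from dataclasses import dataclass, field


@dataclass
class BetaArm:
    """Beta posterior over the chance that working on this arm improves results."""
    successes: int = 1  # Beta alpha, starting from a uniform prior (assumption)
    failures: int = 1   # Beta beta, starting from a uniform prior (assumption)

    def sample(self, rng: random.Random) -> float:
        return rng.betavariate(self.successes, self.failures)


@dataclass
class FocusBandit:
    """Two-armed Thompson sampler choosing between factor and model work."""
    arms: dict = field(default_factory=lambda: {"factor": BetaArm(), "model": BetaArm()})
    rng: random.Random = field(default_factory=random.Random)

    def choose(self) -> str:
        # Draw one sample per arm and pick the arm with the largest draw.
        return max(self.arms, key=lambda name: self.arms[name].sample(self.rng))

    def update(self, name: str, improved: bool) -> None:
        # "improved" is a hypothetical reward: the iteration beat the previous best.
        arm = self.arms[name]
        if improved:
            arm.successes += 1
        else:
            arm.failures += 1


if __name__ == "__main__":
    bandit = FocusBandit(rng=random.Random(0))
    # Made-up success rates, used only to drive the simulation.
    true_rate = {"factor": 0.6, "model": 0.3}
    sim = random.Random(1)
    for _ in range(200):
        focus = bandit.choose()
        bandit.update(focus, sim.random() < true_rate[focus])
    print({name: (arm.successes, arm.failures) for name, arm in bandit.arms.items()})
```

Sampling from the posterior instead of always picking the best-looking arm keeps some exploration alive, so a focus area that has been unlucky early on still gets revisited occasionally.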

Adaptations from Original RD-Agent(Q)

Aspect	Original	Ours
Market	Chinese A-shares	US equities (Alpaca)
Factors	Alpha101	Custom from team research
Evaluation	Internal only	Same Alpaca + trading-tracker as human baseline
Human gate	Optional	Required for live promotion
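
The required human gate can be pictured as a check that combines backtest metrics with an explicit sign-off. The sketch below is illustrative only: `Spec`, `may_promote`, and the way approval is passed in are hypothetical, the thresholds reuse the example goal from the Specification stage (Sharpe > 1.5, max DD < 15%), and the Sharpe calculation assumes a zero risk-free rate and 252 trading days per year.

```python
import math
from dataclasses import dataclass


def annualized_sharpe(daily_returns, periods_per_year=252):
    """Annualized Sharpe ratio of daily returns, risk-free rate taken as zero."""
    n = len(daily_returns)
    if n < 2:
        raise ValueError("need at least two returns")
    mean = sum(daily_returns) / n
    var = sum((r - mean) ** 2 for r in daily_returns) / (n - 1)
    if var == 0:
        raise ValueError("returns have zero variance")
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)


def max_drawdown(daily_returns):
    """Largest peak-to-trough loss of the compounded equity curve, as a positive fraction."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in daily_returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst


@dataclass(frozen=True)
class Spec:
    min_sharpe: float = 1.5     # example threshold from the Specification stage
    max_drawdown: float = 0.15  # example threshold from the Specification stage


def may_promote(daily_returns, spec, human_approved):
    """True only if the metrics meet the spec AND a human has signed off."""
    meets_spec = (
        annualized_sharpe(daily_returns) > spec.min_sharpe
        and max_drawdown(daily_returns) < spec.max_drawdown
    )
    return meets_spec and human_approved
```

Keeping `human_approved` as a separate argument means a strategy that clears every metric still cannot reach live trading on metrics alone.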

Directory Structure

/opt/rd-agent/
├── config/          # spec.yaml, scheduler.yaml, model_config.yaml
├── workspace/
│   ├── factors/     # Generated factor code
│   ├── models/      # Generated model code
│   └── results/     # Per-iteration backtest results
├── knowledge/       # hypotheses.json, code_patterns.json, factor_registry.json
├── data/            # market/ and qlib_data/
└── logs/iterations/ # Per-iteration logs

Operations

# Status
systemctl status rd-agent

# Live logs
journalctl -u rd-agent -f

# Current iteration
jq '.iteration' /opt/rd-agent/workspace/results/latest/iteration.json

# View a specific iteration
jq '.' /opt/rd-agent/workspace/results/iter_042/hypothesis.json

# Pause / resume
sudo systemctl stop rd-agent
sudo systemctl start rd-agent

# Update research goal
nano /opt/rd-agent/config/spec.yaml
sudo systemctl restart rd-agent
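
The result lookups above can also be scripted when several iterations need to be inspected at once. The following is a rough sketch: the function names are hypothetical, and it assumes every iteration directory follows the zero-padded `iter_NNN` pattern suggested by `iter_042`.

```python
import json
from pathlib import Path

RESULTS_DIR = Path("/opt/rd-agent/workspace/results")


def iteration_dir(number, results_dir=RESULTS_DIR):
    """Path of one iteration, assuming zero-padded iter_NNN directory names."""
    return Path(results_dir) / f"iter_{number:03d}"


def load_hypothesis(number, results_dir=RESULTS_DIR):
    """Parse hypothesis.json for the given iteration."""
    path = iteration_dir(number, results_dir) / "hypothesis.json"
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def list_iterations(results_dir=RESULTS_DIR):
    """Sorted iteration numbers that have a directory under results/."""
    numbers = []
    for path in Path(results_dir).glob("iter_*"):
        suffix = path.name[len("iter_"):]
        if path.is_dir() and suffix.isdigit():
            numbers.append(int(suffix))
    return sorted(numbers)
```

For example, `load_hypothesis(42)` reads the same file as the `jq` command for `iter_042`, and `list_iterations()` skips the `latest` entry because its suffix is not numeric.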

Compare Against Human Baseline

psql -d trading_tracker -c "
  SELECT source,
         COUNT(*) as trades,
         ROUND(SUM(pnl)::numeric, 2) as total_pnl,
         ROUND(AVG(pnl)::numeric, 4) as avg_pnl
  FROM trades
  WHERE timestamp > NOW() - INTERVAL '30 days'
  GROUP BY source
  ORDER BY total_pnl DESC;
"

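To sanity-check what the comparison query computes without touching the production database, the same aggregation can be run against an in-memory SQLite table. This is only a sketch: the `trades` columns are inferred from the query, the `human` and `agent` source labels and all sample trades are made up, and the SQL is a SQLite translation of the PostgreSQL original, not the query itself.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# SQLite translation of the PostgreSQL query: NOW() - INTERVAL '30 days'
# becomes datetime('now', '-30 days'), and the ::numeric casts are dropped.
SUMMARY_SQL = """
SELECT source,
       COUNT(*)           AS trades,
       ROUND(SUM(pnl), 2) AS total_pnl,
       ROUND(AVG(pnl), 4) AS avg_pnl
FROM trades
WHERE timestamp > datetime('now', '-30 days')
GROUP BY source
ORDER BY total_pnl DESC
"""


def summarize(rows):
    """rows: iterable of (source, pnl, timestamp) tuples with UTC datetimes."""
    conn = sqlite3.connect(":memory:")
    try:
        # Assumed schema: only the three columns the query refers to.
        conn.execute("CREATE TABLE trades (source TEXT, pnl REAL, timestamp TEXT)")
        conn.executemany(
            "INSERT INTO trades VALUES (?, ?, ?)",
            [(s, p, t.strftime("%Y-%m-%d %H:%M:%S")) for s, p, t in rows],
        )
        return conn.execute(SUMMARY_SQL).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    sample = [  # made-up trades, for illustration only
        ("human", 120.0, now - timedelta(days=1)),
        ("human", -40.0, now - timedelta(days=2)),
        ("agent", 75.5, now - timedelta(days=3)),
        ("agent", 300.0, now - timedelta(days=45)),  # outside the 30-day window
    ]
    for row in summarize(sample):
        print(row)
```

Each returned row is `(source, trades, total_pnl, avg_pnl)`, ordered by total P&L, and trades older than 30 days are excluded just as in the PostgreSQL version.
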
Performance Reference (Original RD-Agent(Q))

~2× annualized returns vs Alpha101 factor library.
70% fewer factors needed.
< $10 LLM API cost per full research cycle.

Our results will differ based on market and configuration.

Troubleshooting

Stuck on implementation: check the last log lines with `tail -100 /opt/rd-agent/logs/iterations/latest/implementation.log`, then run `sudo systemctl restart rd-agent`.

Backtest failures: run `python -c "import qlib; qlib.init(); print('OK')"` to confirm that Qlib imports and initializes, then check that `/opt/rd-agent/data/qlib_data/` is populated.

High memory: check usage with `free -h`. Reduce the backtest range in `spec.yaml` or add swap.
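
For the backtest-failure case, the data directory check can be made a little more concrete. A standard Qlib provider directory contains `calendars`, `instruments`, and `features` sub-directories; the sketch below reports which of them are missing or empty. The helper name `missing_qlib_parts` is hypothetical, and it is an assumption that the provider data sits directly in `/opt/rd-agent/data/qlib_data` and not in a nested folder.

```python
from pathlib import Path

# Sub-directories of a standard Qlib provider directory.
REQUIRED_SUBDIRS = ("calendars", "instruments", "features")


def missing_qlib_parts(provider_dir):
    """Return the required sub-directories that are absent or empty."""
    root = Path(provider_dir)
    missing = []
    for name in REQUIRED_SUBDIRS:
        sub = root / name
        if not sub.is_dir() or not any(sub.iterdir()):
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumption: the provider data sits directly in this directory.
    problems = missing_qlib_parts("/opt/rd-agent/data/qlib_data")
    print("OK" if not problems else f"missing or empty: {', '.join(problems)}")
```

An empty list means the layout looks plausible; it does not prove the data itself is complete or up to date.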
