How It Works

Rather than sampling every candidate in a real environment, BioDreamer learns a latent simulator and plans multi-step interventions inside it.

Observation → Encoder → Latent State (z_t) → JEPA Predictor (z_t, action → ż_{t+1}) → Reward Model (z_t → fitness) → Policy (RL agent plans in imagination) → Proposed Candidates
01

Observe

Encode biological state into latent space

The domain-specific encoder (ESM-2 for proteins, SE(3)-GNN for molecules, scVI-VAE for cells) maps raw biological data such as sequences, coordinates, and gene expression into a compact latent representation z_t.
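The encode step can be sketched with a toy stand-in: a one-hot sequence embedding followed by a fixed linear projection. This is purely illustrative (it is not ESM-2 or any BioDreamer component; the alphabet, latent dimension, and class names are assumptions), but it shows the contract of the step: raw biological data in, fixed-size latent z_t out.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 canonical residues (illustrative)
LATENT_DIM = 8                        # toy latent size, not BioDreamer's

def one_hot(sequence: str) -> np.ndarray:
    """Encode a protein sequence as a (length, 20) one-hot matrix."""
    idx = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    out = np.zeros((len(sequence), len(AMINO_ACIDS)))
    for pos, aa in enumerate(sequence):
        out[pos, idx[aa]] = 1.0
    return out

class ToyEncoder:
    """Stand-in for a domain encoder (e.g. ESM-2): mean-pool the
    one-hot sequence, then project to a compact latent state."""
    def __init__(self, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((len(AMINO_ACIDS), LATENT_DIM)) * 0.1

    def encode(self, sequence: str) -> np.ndarray:
        """Map a raw sequence to a fixed-size latent state z_t."""
        return one_hot(sequence).mean(axis=0) @ self.W

z_t = ToyEncoder().encode("MKTAYIAKQR")
print(z_t.shape)  # → (8,)
```

A real encoder would be pretrained and far higher-dimensional, but downstream components only ever see the latent vector, which is what makes the rest of the loop domain-agnostic.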

02

Dream

Simulate outcomes in imagination

The JEPA predictor estimates the next latent state directly. Given the current state z_t and a proposed action (mutation, force change, gene knockout), it produces ż_{t+1} entirely in latent space without ever reconstructing observations.
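A minimal sketch of latent-only prediction, assuming a residual update rule and toy dimensions (none of this is BioDreamer's actual predictor): the model maps (z_t, action) straight to the next latent state, so multi-step "dreaming" is just repeated application, with no decoder in the loop.

```python
import numpy as np

LATENT_DIM, ACTION_DIM = 8, 4  # illustrative sizes

class ToyJEPAPredictor:
    """Latent-to-latent transition model: predicts the next latent
    state from (z_t, action) without reconstructing observations."""
    def __init__(self, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((LATENT_DIM + ACTION_DIM, LATENT_DIM)) * 0.1

    def step(self, z_t: np.ndarray, action: np.ndarray) -> np.ndarray:
        # Residual update: the predictor outputs a bounded latent delta.
        x = np.concatenate([z_t, action])
        return z_t + np.tanh(x @ self.W)

    def rollout(self, z0: np.ndarray, actions: np.ndarray) -> np.ndarray:
        """Dream a multi-step trajectory entirely in latent space."""
        traj = [z0]
        for a in actions:
            traj.append(self.step(traj[-1], a))
        return np.stack(traj)

traj = ToyJEPAPredictor().rollout(np.zeros(LATENT_DIM), np.zeros((3, ACTION_DIM)))
print(traj.shape)  # → (4, 8): initial state plus three dreamed steps
```

Skipping reconstruction is the key JEPA design choice: the predictor only has to get the latent right, not every pixel-level (or residue-level) detail of the observation.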

03

Evaluate

Score fitness and uncertainty

The reward head predicts target properties (ΔΔG, binding affinity, cell state distance) from the dreamed state. The uncertainty module estimates model confidence, and high uncertainty signals promising regions to explore.
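One common way to get both a score and a confidence estimate is a small ensemble of reward heads; this sketch assumes that design (linear heads, toy dimensions, all names illustrative). The member mean is the predicted fitness and the spread across members is a cheap epistemic-uncertainty signal.

```python
import numpy as np

LATENT_DIM, N_MEMBERS = 8, 5  # illustrative sizes

class RewardEnsemble:
    """Ensemble of linear reward heads over a dreamed latent state.
    Mean across members = predicted fitness; standard deviation
    across members = epistemic uncertainty (disagreement)."""
    def __init__(self, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.heads = rng.standard_normal((N_MEMBERS, LATENT_DIM))

    def score(self, z: np.ndarray) -> tuple[float, float]:
        preds = self.heads @ z  # one prediction per ensemble member
        return float(preds.mean()), float(preds.std())

mu, sigma = RewardEnsemble().score(np.ones(LATENT_DIM))
print(f"fitness={mu:.3f}  uncertainty={sigma:.3f}")
```

High disagreement means the model has not seen similar latent states before, which is exactly the "promising region to explore" signal the planner consumes in the next step.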

04

Plan

Select optimal interventions via Active Inference

The policy rolls out multi-step trajectories inside the world model and selects actions that minimise expected free energy, balancing high fitness (exploitation) with reducing model uncertainty (exploration).
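The planning step can be sketched as random-shooting model-predictive control under a surrogate expected-free-energy objective. Everything here is an illustrative assumption (the inline dynamics and reward stubs, the β trade-off weight, the candidate-sampling scheme); it shows the structure of the trade-off, not BioDreamer's actual policy.

```python
import numpy as np

LATENT_DIM, ACTION_DIM, HORIZON = 8, 4, 3  # illustrative sizes
rng = np.random.default_rng(0)
W_dyn = rng.standard_normal((LATENT_DIM + ACTION_DIM, LATENT_DIM)) * 0.1
W_rew = rng.standard_normal((5, LATENT_DIM))  # 5-member toy reward ensemble

def dream_step(z, a):
    """Toy latent transition (stands in for the JEPA predictor)."""
    return z + np.tanh(np.concatenate([z, a]) @ W_dyn)

def score(z):
    """Toy reward ensemble: (mean fitness, epistemic uncertainty)."""
    preds = W_rew @ z
    return preds.mean(), preds.std()

def expected_free_energy(mu, sigma, beta=0.5):
    # Surrogate EFE: negative pragmatic value (fitness) minus an
    # information-gain bonus, so minimising it balances exploitation
    # (high mu) against exploration (high sigma).
    return -mu - beta * sigma

def plan(z0, n_candidates=64):
    """Sample candidate action sequences, dream each trajectory in the
    world model, and keep the one with the lowest cumulative EFE."""
    best_actions, best_efe = None, np.inf
    for _ in range(n_candidates):
        actions = rng.standard_normal((HORIZON, ACTION_DIM))
        z, efe = z0, 0.0
        for a in actions:
            z = dream_step(z, a)
            efe += expected_free_energy(*score(z))
        if efe < best_efe:
            best_actions, best_efe = actions, efe
    return best_actions, best_efe

actions, efe = plan(np.zeros(LATENT_DIM))
print(actions.shape, np.isfinite(efe))
```

A learned policy or cross-entropy-method planner would replace the random shooting, but the objective keeps the same shape: fitness plus an uncertainty-seeking term, minimised over imagined trajectories.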

05

Act & Update

Validate and refine the model

Top-ranked candidates are evaluated by a real oracle (MD simulation, ESMFold, or wet-lab assay). The results update the world model, improving its predictions for the next round and closing the active learning loop.
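The outer active-learning loop above can be sketched with a toy oracle and a linear reward head refit by least squares after each round. The oracle, batch sizes, and refitting scheme are all assumptions for illustration; the point is the loop structure: propose, validate a top-ranked batch, fold the labels back into the model.

```python
import numpy as np

LATENT_DIM = 8
rng = np.random.default_rng(0)
true_w = rng.standard_normal(LATENT_DIM)  # hidden weights the toy oracle uses

def oracle(z: np.ndarray) -> float:
    """Stand-in for the real evaluator (MD simulation, ESMFold, wet lab)."""
    return float(true_w @ z)

# Active-learning loop: propose, validate top candidates, refit reward head.
data_z, data_y = [], []
w_hat = np.zeros(LATENT_DIM)  # current (initially uninformed) reward model
for round_ in range(5):
    candidates = rng.standard_normal((16, LATENT_DIM))
    ranked = sorted(candidates, key=lambda z: -(w_hat @ z))  # exploit model
    for z in ranked[:4]:                                     # validate top-4
        data_z.append(z)
        data_y.append(oracle(z))
    # Refit the reward head on every oracle label seen so far.
    Z, y = np.stack(data_z), np.array(data_y)
    w_hat, *_ = np.linalg.lstsq(Z, y, rcond=None)

err = float(np.linalg.norm(w_hat - true_w))
print(f"reward-model error after 5 rounds: {err:.2e}")
```

With a noiseless linear oracle the refit recovers the true reward exactly once enough labels accumulate; in practice each round only shrinks the model's error in the regions the oracle has actually probed, which is why closing the loop matters.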