Frequently Asked Questions¶
OpenWorm general¶
Why C. elegans?¶
The tiny worm C. elegans is by far the most understood and studied animal with a brain in all of biology: it was the first multicellular organism to have its genome sequenced, the adult hermaphrodite has exactly 302 neurons with a complete connectome, and three Nobel prizes have been awarded for work on it. When making a complex computer model, it is important to start where the data are the most complete. For the full story, see the Background page.
What does the real worm do?¶
It has all sorts of behaviors! Some include:
- It finds food and mates
- It avoids toxins and predators
- It lays eggs
- It crawls, using a number of different crawling motions
Do you simulate all that?¶
That is the goal! Today we simulate crawling (302 neurons + 95 muscles + body physics, validated against Schafer lab kinematics). Our roadmap adds cell-type specialization, sensory responses, organ systems (pharynx, intestine, egg-laying), and ultimately all 959 somatic cells over 18 months. The main point is that we want the worm's overall behavior to emerge from the behavior of each of its cells put together. Each behavior is formally specified in a Design Document with quantitative validation targets. See the Implementation Roadmap for the complete phase-by-phase plan.
So say the virtual organism lays eggs. Are the eggs intended to be new, viable OpenWorms, or is fertilization not a goal?¶
Egg-laying is specified in DD018 (Egg-Laying System Architecture): a 28-cell circuit (including 2 HSN serotonergic neurons, 6 VC cholinergic neurons, and 16 sex muscles) that produces the characteristic two-state pattern (~20 min inactive, ~2 min active bursts). Implementation is Phase 3 work.
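To make the two-state pattern concrete, here is a minimal sketch that alternates between an inactive and an active state using the ~20 min and ~2 min figures quoted above. It is not the DD018 circuit model: the function name `simulate_egg_laying` is hypothetical, and drawing dwell times from exponential distributions is our assumption for illustration.

```python
import random


def simulate_egg_laying(n_cycles=5000, mean_inactive_min=20.0,
                        mean_active_min=2.0, seed=1):
    """Alternate inactive/active states; return the fraction of time spent active."""
    rng = random.Random(seed)
    inactive_total = 0.0
    active_total = 0.0
    for _ in range(n_cycles):
        # Assumption: dwell times are exponentially distributed around the
        # stated means. expovariate takes a rate, i.e. 1 / mean.
        inactive_total += rng.expovariate(1.0 / mean_inactive_min)
        active_total += rng.expovariate(1.0 / mean_active_min)
    return active_total / (inactive_total + active_total)


if __name__ == "__main__":
    print(f"fraction of time in active state: {simulate_egg_laying():.3f}")
```

With these means, the worm would spend roughly 2 / (20 + 2), or about 9%, of its time in the active state.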
Developmental modeling (embryo to L1 to L4 to adult) is Phase 6 work in our roadmap, using the Witvliet developmental connectome series (8 stages). C. elegans has the best known developmental history of any organism, making it a fascinating future direction.
Does it need to know how to be a worm to act like a worm?¶
The "logic" comes from the dynamics of the neurons interacting with each other. It is a little unintuitive, but those dynamics are what make up how the worm "thinks". So we simulate those dynamics as well as we can, rather than instructing the worm what to do when. This is formalized in DD001 (Neural Circuit Architecture), which uses Hodgkin-Huxley equations to model each neuron's electrical dynamics.
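As a rough illustration of what "simulating the dynamics" means, the sketch below integrates a single Hodgkin-Huxley-style compartment with a leak current and one voltage-gated potassium-like conductance. It is not the DD001 or c302 model: every parameter value and the function names `steady_state` and `simulate` are made up for this example.

```python
import math


def steady_state(v_mv, v_half_mv=-20.0, slope_mv=8.0):
    """Sigmoidal steady-state activation of a voltage-gated channel."""
    return 1.0 / (1.0 + math.exp(-(v_mv - v_half_mv) / slope_mv))


def simulate(i_inj_pa, t_end_ms=500.0, dt_ms=0.01):
    """Forward-Euler integration of
    C dV/dt = I_inj - g_leak (V - E_leak) - g_k n (V - E_k).
    Returns the membrane potential (mV) at the end of the run."""
    # Illustrative parameters only, not taken from DD001 or c302.
    c_pf = 5.0          # membrane capacitance (pF)
    g_leak_ns = 0.2     # leak conductance (nS)
    e_leak_mv = -60.0   # leak reversal potential (mV)
    g_k_ns = 1.0        # maximal gated conductance (nS)
    e_k_mv = -80.0      # reversal potential of the gated current (mV)
    tau_n_ms = 10.0     # gating time constant (ms)

    v = e_leak_mv
    n = steady_state(v)
    for _ in range(int(t_end_ms / dt_ms)):
        # nS * mV = pA, and pA / pF = mV / ms, so the units are consistent.
        i_ion = g_leak_ns * (v - e_leak_mv) + g_k_ns * n * (v - e_k_mv)
        v += dt_ms * (i_inj_pa - i_ion) / c_pf
        n += dt_ms * (steady_state(v) - n) / tau_n_ms
    return v


if __name__ == "__main__":
    for current in (0.0, 5.0, 10.0):
        print(f"{current:4.1f} pA -> {simulate(current):6.1f} mV")
```

Nothing in the code tells the cell how to respond; a larger injected current simply settles at a more depolarized voltage because of the equations. Network behavior arises the same way once many such cells are coupled by synapses.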
Given all that we DON'T know about C. elegans (all the various synaptic strengths, dynamics, gap junction rectification, long-range neuromodulation, etc.), how do you know the model you eventually make truly recapitulates reality?¶
All models are wrong, some models are useful :) We must have the model make a prediction and then test it. Based on how well the model fits the available data, we can quantify how well the model recapitulates reality.
We now have a formal 3-tier validation framework (DD010):
- Tier 1: Single-cell electrophysiology (patch clamp comparison)
- Tier 2: Circuit-level functional connectivity (must correlate r > 0.5 with Randi 2023 whole-brain imaging)
- Tier 3: Behavioral kinematics (speed, wavelength, frequency within ±15% of Schafer lab database)
Tiers 2 and 3 are blocking — code cannot merge if validation regresses. See the Validation page for details.
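The two blocking criteria can be sketched as simple pass/fail checks. This is only an illustration of the thresholds quoted above (r > 0.5 and 15% tolerance), not the DD010 test harness; the function names, metric keys, and example numbers are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def tier2_passes(simulated, measured, r_min=0.5):
    """Tier 2: simulated vs. measured functional connectivity must correlate."""
    return pearson_r(simulated, measured) > r_min


def tier3_passes(sim_metrics, ref_metrics, tolerance=0.15):
    """Tier 3: every kinematic metric must lie within the relative tolerance."""
    return all(
        abs(sim_metrics[name] - ref) / abs(ref) <= tolerance
        for name, ref in ref_metrics.items()
    )


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    print(tier2_passes([0.1, 0.4, 0.35, 0.8], [0.2, 0.5, 0.3, 0.9]))
    print(tier3_passes({"speed": 0.22}, {"speed": 0.20}))
```

In a CI pipeline, a check like this would fail the build whenever a change pushes a metric outside its target.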
Is there only one solution to all those variables in the connectome that will make a virtual C. elegans that resembles a real one, or are there multiple?¶
It is very likely to be multiple, given what we know about the variability of neuronal networks in general (Prinz, Bucher & Marder 2004). One technique to deal with this is to generate multiple models that work (Marder & Taylor 2011) and analyze them under different conditions. What we are after is the solution space that works (Achard & De Schutter 2006, see Fig 6 for an example), rather than a single solution.
DD017 (Hybrid Mechanistic-ML Framework) now specifies automated approaches: differentiable simulation with gradient descent for parameter fitting, plus foundation model predictions (ESM3/AlphaFold) for channel kinetics.
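The idea of a "solution space" can be shown with a toy example. Below, a made-up model output depends only on the product of two conductances, so many different parameter pairs reproduce the same target. Random sampling stands in for the fitting methods named above (it is not gradient descent and not the DD017 pipeline), and `toy_output` and `sample_solution_space` are hypothetical names.

```python
import random


def toy_output(g_a, g_b):
    """Toy stand-in for a simulated feature; many (g_a, g_b) pairs give the same value."""
    return g_a * g_b


def sample_solution_space(target, tol=0.05, n_samples=20000, seed=0):
    """Return every sampled parameter pair whose output is within tol of the target."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_samples):
        g_a = rng.uniform(0.1, 10.0)
        g_b = rng.uniform(0.1, 10.0)
        if abs(toy_output(g_a, g_b) - target) <= tol * target:
            accepted.append((g_a, g_b))
    return accepted


if __name__ == "__main__":
    solutions = sample_solution_space(target=4.0)
    g_a_values = [g_a for g_a, _ in solutions]
    print(f"{len(solutions)} parameter sets match the target")
    print(f"g_a ranges from {min(g_a_values):.2f} to {max(g_a_values):.2f}")
```

The accepted sets span a wide range of individual parameter values while producing essentially the same output, which is why we analyze the whole family of working models rather than a single best fit.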
Why not start with simulating something simpler? Are nematodes too complex for a first go at whole organism simulation?¶
Nematodes have been studied far more than simpler multi-cellular organisms, and therefore more data exist that we can use to build our model. For a different organism we would need to obtain, for example, another connectome and another anatomical 3D map, whereas for C. elegans they already exist. The community of scientists using C. elegans as their model organism is much larger than the communities studying simpler multi-cellular organisms, so community size also weighed in on the decision.
When do you think the simulation will be "complete", and which behaviors would that include?¶
"Complete" is relative -- biology is infinitely complex. Our target is a 959-cell organism with all major organ systems (pharynx, intestine, reproductive), validated against experimental kinematics and organ-specific metrics, delivered over ~18 months in 4 phases. We define completion at each phase as meeting all DD010 validation criteria. Beyond Phase 4, future work includes intracellular signaling, developmental modeling, and male-specific systems. See the Implementation Roadmap for the full phase-by-phase timeline.
Currently, what are your biggest problems or needs?¶
To make this project move faster, we'd love more help from motivated folks, both programmers and experimentalists. We have a lot we want to do and not enough hands to do it.
Current priorities:
- Infrastructure bootstrap: Docker stack (DD013), toolbox revival (DD021), CI/CD pipeline
- Phase 1 science: CeNGEN cell-type specialization (DD005), functional connectivity validation (DD010 Tier 2)
- Integration + Validation maintainers: Two critical L4 roles are currently vacant
Read more about ways to help on our website or check the contributor guide.
Where can I read about your "to-dos"?¶
Our work is organized through Design Documents, which define the complete roadmap from 302 neurons to 959 cells. Each DD contains deliverables, testing procedures, and integration contracts.
We also have GitHub issues across all our repositories for specific programming tasks.
How do I know which issues are safe to work on? How do I know I won't be stepping on any toes of work already going on?¶
We primarily use Slack for coordination. If you are interested in helping with an issue but don't know if others are working on it, ask in the relevant Slack channel. You can also comment on the GitHub issue directly. All contributors are advised to announce their intent on Slack or GitHub as soon as they start working on a task.
In general, you won't step on any toes though -- multiple people doing the same thing can still be helpful as different individuals bring different perspectives to the table.
For a structured approach, see the DD contribution workflow and the contributor progression model (Observer to Senior Contributor, L0-L5).
Do you all ever meet up somewhere physically?¶
The core OpenWorm team has met in person several times, including early meetings in Paris (July 2014) and London (Fall 2014). We use Slack and video calls to meet face to face on a regular basis.
OpenWorm simulation and modeling¶
What is the level of granularity of these models (e.g. cells, subcellular), and how does that play out in terms of computational requirements?¶
We model at five scales simultaneously (detailed on the modeling approach page):
| Scale | Design Documents | Computational Cost |
|---|---|---|
| Molecular | DD017 | Low (parameter lookup) |
| Channel | DD001, DD005 | Moderate (HH equations per cell) |
| Cellular | DD001, DD002, DD007-DD009 | Moderate-High (302-959 cells) |
| Tissue | DD003, DD004 | High (~100K SPH particles) |
| Organism | DD010, DD019 | Validation overhead |
In order to make this work we make use of abstraction, so something that is less complex today can be swapped in for something more complex tomorrow. DD017 specifies neural surrogates that can provide 1000x speedup for body physics.
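A minimal sketch of that kind of abstraction: two components of different complexity expose the same `step` interface, so the calling code does not change when one is swapped for the other. The class names and parameters are hypothetical and are not taken from any Design Document.

```python
from typing import List, Protocol, Sequence


class NeuronModel(Protocol):
    """Anything with this step() signature can be plugged into run()."""

    def step(self, i_inj: float, dt: float) -> float: ...


class SteadyStateLookup:
    """Cheap stand-in: jumps straight to the steady-state voltage."""

    def __init__(self, v_rest: float = -60.0, r_in: float = 1.0):
        self.v_rest = v_rest
        self.r_in = r_in

    def step(self, i_inj: float, dt: float) -> float:
        return self.v_rest + self.r_in * i_inj


class LeakyIntegrator:
    """More detailed: relaxes toward the same steady state with time constant tau."""

    def __init__(self, tau: float = 10.0, v_rest: float = -60.0, r_in: float = 1.0):
        self.tau = tau
        self.v_rest = v_rest
        self.r_in = r_in
        self.v = v_rest

    def step(self, i_inj: float, dt: float) -> float:
        self.v += dt * (-(self.v - self.v_rest) + self.r_in * i_inj) / self.tau
        return self.v


def run(model: NeuronModel, currents: Sequence[float], dt: float = 0.1) -> List[float]:
    """Drive any NeuronModel with a sequence of input currents."""
    return [model.step(i_inj, dt) for i_inj in currents]


if __name__ == "__main__":
    print(run(SteadyStateLookup(), [0.0, 10.0]))
    print(round(run(LeakyIntegrator(), [10.0] * 5000)[-1], 3))
```

Because `run` depends only on the interface, a faster surrogate or a more detailed model can be dropped in without touching the rest of the pipeline.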
What's the data source for your computer simulation of the living worm?¶
There is no single data source for our simulation; in fact, one of our unique challenges is working out how to integrate multiple data sets. DD008 (Data Integration Pipeline) specifies the formal approach. Key datasets include:
- The Virtual Worm (3D atlas of C. elegans anatomy)
- The C. elegans connectome — accessed via ConnectomeToolbox (cect) per DD020
- CeNGEN single-cell transcriptomics — drives cell-type differentiation
- Randi 2023 whole-brain calcium imaging — Tier 2 validation target for functional connectivity
- Ripoll-Sanchez 2023 neuropeptide connectome — 31,479 interactions feeding the neuropeptide model
- Schafer lab WCON behavioral database — Tier 3 validation target for behavioral kinematics
Has there been previous modeling work on various subsystems illustrating what level of simulation is necessary to produce observed behaviors?¶
There have been other modeling efforts on C. elegans and its subsystems, many of them published in academic journals. However, "what level of simulation is necessary" to produce observed behaviors remains an open question. Each Design Document includes an "Existing Code Resources" section identifying reusable repos from the OpenWorm GitHub org (15+ repos with production-ready data APIs, validated models, and infrastructure tools).
How are neurons simulated today?¶
Our neural models are specified in DD001 (Neural Circuit Architecture) and implemented in the c302 framework. c302 generates NeuroML2 networks at multiple levels of biophysical detail:
| Level | Cell Type | Synapses | Use Case |
|---|---|---|---|
| A | Integrate-and-Fire | Event-driven | Topology testing |
| B | IAF + Activity | Event-driven | Community extensions |
| C | HH conductance-based | Event-driven | Working |
| C1 | HH + graded synapses | Graded | Recommended default |
| D | Multicompartmental HH | Event-driven | Specialized studies |
Level C1 is the default because C. elegans neurons communicate via graded potentials (not action potentials), and graded synapses are essential for coupling with the body physics (Sibernetic).
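To show what "graded" means here, the sketch below uses a common textbook formulation in which synaptic conductance is a smooth sigmoid of presynaptic voltage, so transmission varies continuously instead of being triggered by discrete spike events. The parameter values and function names are illustrative assumptions and may differ from the graded synapse actually used by c302.

```python
import math


def graded_conductance(v_pre_mv, g_max_ns=1.0, v_th_mv=-35.0, delta_mv=5.0):
    """Conductance rises smoothly with presynaptic voltage (no spike needed)."""
    return g_max_ns / (1.0 + math.exp((v_th_mv - v_pre_mv) / delta_mv))


def synaptic_current(v_pre_mv, v_post_mv, e_rev_mv=0.0):
    """Postsynaptic current in pA (nS * mV); positive values depolarize the cell."""
    return graded_conductance(v_pre_mv) * (e_rev_mv - v_post_mv)


if __name__ == "__main__":
    for v_pre in (-60.0, -45.0, -35.0, -25.0):
        print(f"V_pre = {v_pre:6.1f} mV -> g = {graded_conductance(v_pre):.3f} nS")
```

An event-driven synapse would stay silent until the presynaptic cell crossed a spike threshold. With the graded form, even small voltage changes in a motor neuron are passed on continuously to its targets.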
What is the connection between the basic properties of C. elegans neurons and human neurons?¶
C. elegans neurons do not spike (i.e. have action potentials), which makes them different from human neurons. However, the same mathematics that describe the action potential (known as the Hodgkin-Huxley model, used in DD001) also describe the dynamics of neurons that do not exhibit action potentials. The biophysics of the neurons from either species are still similar in that they both have chemical synapses, both have excitable cell membranes, and both use voltage sensitive ion channels to modify the electrical potential across their cell membranes.
What is the level of detail of the wiring diagram for the non-neuron elements?¶
There is a map between motor neurons and muscle cells in the published wiring diagram. Beyond that, DD020 (Connectome Data Access) specifies the ConnectomeToolbox (cect) as the canonical API for all connectivity data. The Witvliet developmental series (8 stages) and Ripoll-Sanchez neuropeptide connectome provide additional non-synaptic interaction data.
What is SPH?¶
Smoothed Particle Hydrodynamics — a mesh-free method for simulating fluid and solid mechanics using particles. More information is available online.
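The core SPH idea is that a field quantity at a particle is a kernel-weighted sum over its neighbors. The sketch below estimates density that way using the poly6 kernel, a common choice in the SPH literature. It is a brute-force illustration only; we are not claiming this is the kernel or the neighbor search that Sibernetic uses.

```python
import math


def poly6_kernel(r, h):
    """3D poly6 smoothing kernel with support radius h."""
    if r >= h:
        return 0.0
    return 315.0 / (64.0 * math.pi * h ** 9) * (h * h - r * r) ** 3


def densities(positions, mass, h):
    """Density at each particle: sum of mass * W(distance) over all particles.

    Brute-force O(N^2) loop; real solvers use spatial hashing to find neighbors.
    """
    return [
        sum(mass * poly6_kernel(math.dist(p, q), h) for q in positions)
        for p in positions
    ]


if __name__ == "__main__":
    particles = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
    for position, rho in zip(particles, densities(particles, mass=1.0, h=1.0)):
        print(position, round(rho, 4))
```

Particles closer together than `h` raise each other's density, while the distant third particle sees only its own contribution. A pressure solver then pushes particles apart wherever the density is too high.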
What are you doing with SPH?¶
We are building the body of the worm using particles that are being driven by SPH. This is formally specified in DD003 (Body Physics Architecture), which defines the PCISPH pressure solver, ~100K particles (liquid, elastic, boundary types), and validated body mechanics. This allows for physical interactions between the body of the worm and its environment.
OpenWorm code reuse¶
What are LEMS and jLEMS?¶
LEMS (Low Entropy Model Specification) is a compact model specification language that allows mathematical models to be defined in a transparent, machine-readable way. NeuroML 2.0 is built on top of LEMS and defines component types useful for describing neural systems (e.g. ion channels, synapses). jLEMS is the Java library that reads, validates, and provides basic solving for LEMS. A utility, jNeuroML, bundles jLEMS; it can execute any LEMS or NeuroML 2 model, validate NeuroML 2 files, and convert LEMS/NeuroML 2 models to multiple simulator languages (e.g. NEURON, Brian) and to other formats.
What about Geppetto, OSGi, Spring, Tomcat, Virgo, and Maven?¶
These were core technologies for the Geppetto simulation platform, which served as our primary visualization and simulation environment from 2014-2020. Geppetto has been superseded by DD014 (Dynamic Visualization), which specifies a lighter Python-native approach using Trame (Phase 1-2) and Three.js + WebGPU (Phase 3).
See Archived Projects for the full historical context.
OpenWorm links and resources¶
Do you have a website?¶
Where can I send my inquiries about the project?¶
Where can I find the "worm browser"?¶
How do I join the community?¶
We primarily use Slack for day-to-day communication. Fill out our volunteer application form to get an invite.