{
    "componentChunkName": "component---src-templates-article-page-js",
    "path": "/journals/biology/micropub-biology-002192",
    "result": {"data":{"article":{"manuscript":{"id":"67cadf6b-eb36-4529-aa78-e206875dcf4a","submissionTypes":["methodology"],"citations":[],"doi":"10.17912/micropub.biology.002192","dbReferenceId":"WBPaper00069715","pmcId":"","pmId":"","proteopedia":"","reviewPanel":"","species":["c. elegans"],"integrations":[],"corrections":null,"history":{"received":"2026-05-08T20:15:46.540Z","revisionReceived":"2026-05-19T15:53:41.749Z","accepted":"2026-05-22T14:51:38.972Z","published":"2026-05-23T21:04:59.291Z","indexed":"2026-06-06T21:04:59.291Z"},"versions":[{"id":"a495bb8a-1ad5-477a-9945-dab40bad07c6","decision":"edit","abstract":"<p>The ability to empirically identify every cell type in C. elegans based on their lineage history has been a powerful scientific resource, yet robust lineage-based identification typically requires live imaging and manual or automated cell tracking. Recent methods have been proposed to automate cell identification on the basis of spatiotemporal atlases, taking advantage of the availability of the large number of manually curated datasets the field has generated. These approaches enable robust cell identification across developmental stages. Here we present a complementary tool called EmbAlign, a fully automated 3D registration framework that determines lineage identities based solely on the 3D position of cell nuclei within an embryo snapshot. By anchoring its label search on the observed cell count, EmbAlign retrieves biologically valid reference templates from a spatiotemporal atlas built from timelapse observations and refines them via an iterative Sinkhorn alignment procedure. This approach robustly accommodates positional variability and arbitrary embryo orientations, allowing for automated cell labeling in uncompressed live or fixed embryos. In cross-validation, EmbAlign achieves 96.6% labeling accuracy up to the 190-cell stage, with slight performance fluctuations restricted to active division windows. 
EmbAlign includes a diagnostic layer that outputs a continuous confidence score (AUPRC = 0.525), ensuring interpretable assignments even in challenging cases, and provides an interactive report for each alignment providing a visualization of registration performance, label assignments, and label confidence. EmbAlign provides a complementary solution for converting raw 3D spatial data into annotated, lineage-aware datasets.</p>","acknowledgements":"<p>&nbsp;Strain JIM113 was provided by the CGC, which is funded by NIH Office of Research Infrastructure Programs (P40 OD010440).</p>","authors":[{"affiliations":["University of California Los Angeles"],"departments":["Bioinformatics Interdepartmental Program"],"credit":["investigation","methodology","software","validation","visualization","writing_originalDraft"],"email":"mptran@g.ucla.edu","firstName":"Miles","lastName":"Tran","submittingAuthor":false,"correspondingAuthor":false,"equalContribution":false,"WBId":null,"orcid":"0009-0005-9330-9297"},{"affiliations":["University of California Los Angeles"],"departments":["Molecular Biology Interdepartmental Program"],"credit":["dataCuration","validation"],"email":"neilpeinado@g.ucla.edu","firstName":"Neil","lastName":"Peinado","submittingAuthor":false,"correspondingAuthor":false,"equalContribution":false,"WBId":null,"orcid":null},{"affiliations":["Hong Kong Baptist University"],"departments":["Department of Biology"],"credit":["dataCuration","writing_reviewEditing"],"email":"zyzhao@hkbu.edu.hk","firstName":"Zhongying","lastName":"Zhao","submittingAuthor":false,"correspondingAuthor":false,"equalContribution":false,"WBId":null,"orcid":null},{"affiliations":["University of California Los Angeles"],"departments":["Molecular, Cell and Developmental 
Biology"],"credit":["conceptualization","dataCuration","methodology","project","fundingAcquisition","supervision","validation","writing_reviewEditing"],"email":"pavak@ucla.edu","firstName":"Pavak","lastName":"Shah","submittingAuthor":true,"correspondingAuthor":true,"equalContribution":false,"WBId":null,"orcid":"0000-0002-2603-5995"}],"awards":[{"awardId":"R35GM151199","funderName":"National Institutes of Health (United States)","awardRecipient":"PKS"}],"conflictsOfInterest":"<p>The authors declare that there are no conflicts of interest present.</p>","dataTable":{"url":null},"extendedData":[],"funding":"","image":{"url":"https://portal.micropublication.org/uploads/51bad455a31e292a4e958d5721d2f352.png"},"imageCaption":"<pre><code>A. Overview of the EmbAlign framework. Input Centroids: Raw 3D unoriented input centroids (top) and candidate reference templates retrieved from the spatiotemporal atlas (bottom). Coarse Alignment: Superposition of observed centroids with candidate reference template (top). A discrete rotational sweep around the principal component axis and the resulting cost landscape (bottom). Sinkhorn Refinement: Iterative Sinkhorn refinement mapping the spatial correspondence between observed centroids and the candidate reference (top) along with the resulting cost landscape (bottom). Label Assignment: The final discrete lineage identity assignments (top) and the resulting distribution of per-cell confidence scores (bottom) generated by the diagnostic layer. \nB. Cross-validated, per-frame alignment accuracy (N=11, mean = 96.9%) aligned to a canonical time axis. Rolling mean frame accuracy (solid black) with bootstrapped 95% confidence intervals (shaded).\nC. Per-frame alignment accuracy for two independently generated light sheet microscopy datasets.\nD. Distribution of assignment accuracies across all training embryos for the 30 cell types with the lowest overall mean accuracy. \nE. Cross-validated AUPRC curves (mean AUC = 0.546). 
Mean interpolated precision (solid blue) and bootstrapped 95% confidence intervals (shaded). Naive baseline (0.05, dashed).  \nF. Normalized confusion matrix. Values indicate the proportion of alignments correctly classified within each true class. \nG. Side-by-side comparison of ground truth label assignment accuracy (left) and the predicted confidence score (right).</code></pre>","imageTitle":"<pre><code>EmbAlign enables fully automated cell labeling in C. elegans embryo snapshots.</code></pre>","methods":"<p><b>Spatiotemporal Atlas Construction</b>&nbsp;</p><p>Embryos were first aligned to a canonical time axis via dynamic time warping <a href=\"https://paperpile.com/c/eA17rr/yIyn\">(Sakoe &amp; Chiba, 1978)</a>. A continuous spatial reference was constructed by fitting independent 1D Gaussian Process regressors (RBF and White kernels) to the spatiotemporal trajectories of each cell type’s 3D coordinates. Total positional variance at time t was calculated by aggregating the GP regression uncertainty with empirical cell-specific variance.&nbsp;</p><p>To define the database of biologically valid cell combinations (“slices”), we used only empirically observed states. Slices were extracted directly from the training frames by aggregating valid cell label combinations grouped by frame cell count (N). Finally, discrete slices were inflated into functional 3D reference frames. For each observed slice, we calculated the overlapping temporal existence window (the canonical time window within which all cells in the slice simultaneously exist) and queried the continuous GP atlas at the window’s midpoint (t<sub>med</sub>) to retrieve the expected 3D coordinates and covariance matrices required for alignment. 
To provide a temporal baseline for downstream diagnostic validation, an empirical growth curve was also constructed by calculating the mean and standard deviation of total cell counts across canonical time bins in the training data.</p><p><b>EmbAlign 3D Alignment and Label Transfer</b></p><p>Observed 3D nuclei centroids were first centered and scaled by their median pairwise distance. For each candidate atlas slice matching the observed cell count N, we aligned both the positive and negative primary principal component (±PC1) axes of the observation to the reference PC1 axis to account for PCA sign ambiguity. We then rotated the observation around the PC1 axis in discrete angular steps, scoring each orientation based on the Sinkhorn-weighted squared Euclidean distance between observed and reference centroids. To escape geometric symmetry traps, we identified the top k unique angular valleys (separated by ≥30°) to seed the refinement phase.&nbsp;</p><p>Each of the k initializations entered an Iterative Closest Point (ICP)-like soft refinement loop <a href=\"https://paperpile.com/c/eA17rr/2OtJ\">(Bergström &amp; Edlund, 2014)</a>. To handle biological noise more robustly, we replaced the standard hard-assignment matching of the ICP algorithm with optimal transport. We computed a soft correspondence probability matrix using Sinkhorn entropic regularization. The rigid transformation (rotation and translation) was then iteratively updated using a weighted Kabsch algorithm <a href=\"https://paperpile.com/c/eA17rr/nHpW\">(Kabsch, 1976)</a>, where the influence of each cell pairing was governed by its Sinkhorn probability.&nbsp;</p><p>Alignments were scored by calculating the Mahalanobis distance between the aligned observations and the atlas 3D Gaussian distributions. The slice and orientation minimizing this global cost were selected as the winner. 
Final discrete cell identities were assigned by executing a linear sum assignment algorithm <a href=\"https://paperpile.com/c/eA17rr/wnwg\">(Kuhn, 1955)</a> directly on the winning Mahalanobis distance matrix.</p><p><b>Confidence Scoring</b></p><p>A Random Forest <a href=\"https://paperpile.com/c/eA17rr/tppS\">(Breiman, 2001)</a> classifier (n_estimators = 200, max_depth = 10) was trained to predict cell-level assignment correctness using geometric and biological alignment features. Input variables included Sinkhorn assignment entropy, Mahalanobis distance, frame cell count, frame inferred time, and a normalized division delta (t<sub>med</sub> - t<sub>birth</sub>)/(t<sub>division</sub> - t<sub>birth</sub>) representing cell life progress. The model outputs continuous cell-level probabilities, which are also averaged to generate an aggregate frame-level confidence score.&nbsp;</p><p>Pipeline accuracy and diagnostic accuracy were evaluated using a Leave-One-Out cross-validation strategy. For each fold, the continuous spatial GP atlas, slice atlas, and diagnostic classifier were fitted entirely on the training embryos prior to evaluating the withheld test embryo.</p><p><b>QC Report Generation</b></p><p>To facilitate rapid visual assessment of alignment quality, the pipeline generates an interactive HTML diagnostic dashboard. To trace the optimization landscape, the alignment engine records the Sinkhorn-weighted sum of squared Euclidean distances at each discrete step of the coarse angular sweep. Furthermore, for the top k angular initializations selected for the refinement tournament, this same soft-weighted cost is sequentially tracked across all Sinkhorn refinement loops. 
These convergence traces are packaged alongside the empirical population growth curve, which plots the unannotated embryo’s observed cell count and t<sub>med</sub> against the 95% confidence intervals of the training population.&nbsp;</p><p><b>Data Acquisition&nbsp;</b></p><p>For validation, JIM113 embryos were isolated from gravid hermaphrodites and mounted on a coverslip using a diSPIM sample chamber. Images were acquired with 0.75 micron z-spacing on an ASI diSPIM <a href=\"https://paperpile.com/c/eA17rr/yCU1\">(Wu et al., 2013)</a> run in single view acquisition mode controlled by the diSPIM control plugin in micro-manager <a href=\"https://paperpile.com/c/eA17rr/8UCO+aiBr\">(A. Edelstein et al., 2010; A. D. Edelstein et al., 2014)</a>. Images were cropped in ImageJ <a href=\"https://paperpile.com/c/eA17rr/8Vct\">(Schneider et al., 2012)</a> and processed using StarryNite. Lineage tracing results from StarryNite were then manually curated using AceTree.</p><p><b>Data and Code Availability</b></p><p>The EmbAlign source code is available as a public repository at https://github.com/shahlab-ucla/EmbAlign. This repository also contains executable scripts, configuration files, and pretrained models required to reproduce the analyses and figures presented in this study. All processed datasets used for model training and out-of-sample validation are hosted within the repository and permanently archived at https://doi.org/10.5281/zenodo.20089241. The repository also contains core usage vignettes outlining execution of the EmbAlign pipeline for lineage inference in unlabeled 3D nuclei coordinates, as well as instructions for fitting the spatiotemporal atlases on raw point cloud data.</p>","reagents":"<p></p>","patternDescription":"<p><b>Introduction:</b></p><p>During the initial mapping of the <i>C. elegans </i>embryonic lineage <a href=\"https://paperpile.com/c/eA17rr/YgjH\">(Sulston et al., 1983)</a>, the spatial stereotypy of the embryo was noted. 
More recent implementations of automated tracking-based lineage tracing <a href=\"https://paperpile.com/c/eA17rr/Zsa7+8q2s+L00y\">(Bao et al., 2006; Santella et al., 2010, 2014)</a> enabled quantitative assessment of the global consistency of cell positioning within the embryo <a href=\"https://paperpile.com/c/eA17rr/vOpc+SI9r+n1tz\">(Li et al., 2019; Moore et al., 2013; Schnabel et al., 2006)</a>, a feature that has led to recent advances <a href=\"https://paperpile.com/c/eA17rr/HA8b+kTnY\">(Haus et al., 2025; Ntemos et al., 2025)</a> in using spatial atlases built by live imaging to automate cell identity determination from static snapshots of <i>C. elegans </i>embryos using timelapse recordings of embryos imaged under gentle compression. Compression has been extensively used in prior live imaging experiments <a href=\"https://paperpile.com/c/eA17rr/YgjH\">(Sulston et al., 1983)</a> as it forces embryos into stereotypical orientations and limits their axial extent, allowing for 3D coverage with fewer focal planes in confocal imaging <a href=\"https://paperpile.com/c/eA17rr/EDNV\">(Bao &amp; Murray, 2011)</a>. While current approaches achieve high accuracy in the instantaneous inference of lineage identity from static snapshots of cell position, both of these tools were trained using compressed data. 
We set out to build EmbAlign to fill this gap by enabling robust lineage identity determination using cell position in uncompressed embryos, such as would be generated by many smFISH <a href=\"https://paperpile.com/c/eA17rr/Ypca\">(Parker et al., 2021)</a> and immunofluorescence<a href=\"https://paperpile.com/c/eA17rr/Ypca+vnCY\">(Duerr, 2006; Parker et al., 2021)</a> sample preparation approaches, and by snapshots from uncompressed live imaging, for example by lightsheet microscopy <a href=\"https://paperpile.com/c/eA17rr/KPdD\">(Wu et al., 2011)</a>.</p><p><b>Pipeline Overview:</b></p><p>The EmbAlign algorithm automates the transformation of unoriented 3D nuclei centroids from uncompressed C. elegans embryos onto their canonical lineage identities (<b>Fig. 1A</b>). Initially, raw 3D centroids are centered and scaled to standardize physical size variations while preserving relative spatial topology. Using the total observed cell count, the algorithm selects from a library of reference templates derived from empirically observed lineage identity configurations. To resolve arbitrary orientations, the algorithm aligns the observed centroids’ primary principal component (±PC1) axes to the reference and executes a discrete rotational sweep to identify the top k unique angular orientations. Each initialization is then refined via an iterative Sinkhorn alignment procedure. This stage utilizes entropically regularized optimal transport to compute a soft correspondence probability matrix and iteratively updates the best fit rigid transformation. Following refinement, EmbAlign resolves a final, discrete mapping of identities to centroids by minimizing the Sinkhorn-weighted Mahalanobis distance, enforcing the one-to-one correspondence required by the embryo’s invariant body plan. 
This strict one-to-one correspondence does introduce a key limitation for EmbAlign in that it is extremely sensitive to detection errors, requiring careful validation of cell detection in test data prior to alignment. Finally, a Random Forest diagnostic layer evaluates the assignment, utilizing geometric and biological features to output a continuous confidence score for every predicted cell identity.&nbsp;</p><p><b>Label Transfer Performance:</b></p><p>EmbAlign’s performance was rigorously evaluated using a Leave-One-Out Cross-Validation (LOOCV) strategy across a dataset of uncompressed C. elegans embryos from two different labs acquired using distinct imaging modalities (embryo n = 11, frame n = 1,215), where ground truth cell identities were established using the AceTree <a href=\"https://paperpile.com/c/eA17rr/wnW2+hC5r\">(Boyle et al., 2006; Katzman et al., 2018)</a>/StarryNite <a href=\"https://paperpile.com/c/eA17rr/Zsa7+L00y+8q2s\">(Bao et al., 2006; Santella et al., 2010, 2014)</a> lineage tracking pipeline. Position data for uncompressed embryos were sourced from previously published lightsheet microscopy lineage reconstructions. The algorithm achieved a sustained, high average frame accuracy of 96.6% from the 6-cell up to the 190-cell stage (<b>Fig. 1B</b>). Despite this overall robustness, performance exhibited predictable transient declines corresponding to major waves of synchronous cell division. Nuclei captured within or immediately adjacent to these division windows are inherently more difficult to classify, as rapid physical displacement during cytokinesis briefly deforms the expected spatial topology and maximizes the distance of these cells from their canonical atlas positions. Finding an exact reference template match is inherently challenging during these rapid waves of cell division. 
Remarkably, EmbAlign maintains an 86.6% assignment accuracy even when aligning against an imperfect reference template (n = 270), compared to a near-perfect 99.4% accuracy when at least one exact reference template match is available (n = 945). This suggests the EmbAlign framework successfully buffers against missing reference templates, and increasing training dataset size to more robustly cover gaps in the current reference atlas should further improve overall performance.</p><p>To verify that the pipeline does not overfit to the spatial properties of the training data, we evaluated EmbAlign on two fully independent, out-of-sample (OOS) datasets acquired via single-view selective plane illumination microscopy (ASI diSPIM operated in single-view acquisition mode). These OOS embryos were annotated with ground truth labels using the AceTree/StarryNite pipeline up to the 100-cell stage. The algorithm successfully mapped these embryos, maintaining comparable high-fidelity frame accuracies (96.3% and 96.1%) without any dataset-specific recalibration (<b>Fig. 1C</b>). This OOS performance demonstrates robust generalizability, confirming that EmbAlign effectively captures the invariant spatial dynamics of <i>C. elegans</i> embryogenesis rather than batch-specific imaging artifacts.&nbsp;</p><p>To investigate cell-type-specific drops in performance, we examined the 30 lowest-performing cell types. We observed that these challenging types primarily consist of later-stage cells that emerge toward the end of our atlas window and correspond to a wave of cell divisions. Additionally, we correlated our global cell-type accuracies with empirical positional variance measurements from prior studies <a href=\"https://paperpile.com/c/eA17rr/SI9r+TBy8\">(Guan et al., 2025; Li et al., 2019)</a> in uncompressed embryos. 
We found a small but significant negative correlation (Spearman’s ρ = -0.11, p = 0.03), suggesting the natural spatial variability of these cells is likely a minor contributing factor to EmbAlign’s difficulty in resolving their identities.&nbsp;&nbsp;</p><p><b>Diagnostic Layer Performance:</b></p><p>To enable reliable application in experimental settings lacking ground truth, we integrated a decoupled Random Forest diagnostic layer that provides a single-cell assessment of alignment quality in real time. Because the pipeline maintains a high baseline accuracy, the diagnostic task is imbalanced, requiring the classifier to identify rare misassignment events within a vast majority of correct labels. To rigorously evaluate performance under these conditions, we compared the precision (the fraction of positive predictions that are correct) and recall (the fraction of true positives that are recovered) of the model. The diagnostic layer achieved an AUPRC of 0.525, a greater than 10-fold improvement over the naive baseline of 0.05 (<b>Fig. 1E</b>), and successfully identified 78.5% of true assignment errors and 90.2% of true assignment successes (<b>Fig. 1F</b>), demonstrating a robust ability to capture specific geometric and biological features of assignment failures.</p><p>To make these diagnostics accessible to the end user, we developed an interactive HTML alignment report that packages the pipeline’s outputs into a comprehensive dashboard. Along with a frame-level confidence estimate, the tool projects the aligned embryo onto an empirical population growth curve to estimate its canonical time, a feature that allows users to flag datasets that are likely to have been captured during transient, error-prone division windows. The dashboard also summarizes the entire search landscape, enabling users to compare alternative alignments across multiple local minima and track their respective optimization convergence traces. 
Finally, predicted labels and alignment confidence scores are mapped directly onto the embryo’s 3D spatial topology, resulting in interactive 3D spatial label assignment and confidence plots that facilitate rapid verification of alignment quality (<b>Fig. 1A, 1G</b>).&nbsp;&nbsp;</p><p><b>Conclusions:</b></p><p>In summary, EmbAlign provides a generalizable solution for automated cell identity inference in static 3D snapshots of uncompressed <i>C. elegans</i> embryos. By aligning 3D nuclei centroid coordinates to live-imaging derived spatiotemporal atlases of early <i>C. elegans</i> development, EmbAlign achieves &gt;96% alignment accuracy up to the 190 cell stage that generalizes across independently generated datasets. Additionally, we observe that performance lapses correspond to transient waves of synchronous cell division, where natural positional variance temporarily disrupts spatial stereotypy. Supported by a predictive diagnostic classifier and interactive alignment reports, EmbAlign provides a complementary tool for transforming raw spatial coordinates into lineage-aware datasets.</p>","references":[{"reference":"Bergstrom P, Edlund O. 2014. Robust registration of point sets using iteratively reweighted least. Computational Optimization and Applications. 58: 543-561.","pubmedId":"","doi":"10.1007/s10589-014-9643-2"},{"reference":"Cuturi M. 2013. Sinkhorn Distances: Lightspeed Computation of Optimal Transportation.","pubmedId":"","doi":""},{"reference":"Edelstein A, Amodaj N, Hoover K, Vale R, Stuurman N. 2010. Computer Control of Microscopes Using µManager. Current Protocols in Molecular Biology. 92: 14.20.1-14.20.17.","pubmedId":"","doi":"10.1002/0471142727.mb1420s92"},{"reference":"Schneider CA, Rasband WS, Eliceiri KW. 2012. NIH Image to ImageJ: 25 years of image analysis. Nature Methods. 9: 671-675.","pubmedId":"","doi":"10.1038/nmeth.2089"},{"reference":"Santella A, Du Z, Nowotschin S, Hadjantonakis AK, Bao Z. 2010. 
A hybrid blob-slice model for accurate and efficient detection of. BMC Bioinformatics. 11: 580.","pubmedId":"","doi":"10.1186/1471-2105-11-580"},{"reference":"Bao Z, Murray JI. 2011. Mounting Caenorhabditis elegans embryos for live imaging of embryogenesis. Cold Spring Harb Protoc. 2011","pubmedId":"","doi":"10.1101/pdb.prot065599"},{"reference":"Ntemos K, Xu F, Bazzi NZ, Fucile G, Maretic HP, Dokmanic I, Mango SE, Sawh AN. 2025. Rapid canalization of chromosome conformation-transcription fingerprints. bioRxiv","pubmedId":"","doi":"10.1101/2025.10.22.684035"},{"reference":"Wu Y, Ghitani A, Christensen R, Santella A, Du Z, Rondeau G, et al., Shroff H. 2011. Inverted selective plane illumination microscopy (iSPIM) enables coupled. Proceedings of the National Academy of Sciences. 108: 17708-17713.","pubmedId":"","doi":"10.1073/pnas.1108494108"},{"reference":"Santella A, Du Z, Bao Z. 2014. A semi-local neighborhood-based framework for probabilistic cell lineage. BMC Bioinformatics. 15: 217.","pubmedId":"","doi":"10.1186/1471-2105-15-217"},{"reference":"Li X, Zhao Z, Xu W, Fan R, Xiao L, Ma X, Du Z. 2019. Systems Properties and Spatiotemporal Regulation of Cell Position. Cell Rep. 26: 313-321.e7.","pubmedId":"","doi":"10.1016/j.celrep.2018.12.052"},{"reference":"Guan G, Li Z, Ma Y, Ye P, Cao J, Wong MK, et al., Zhao Z. 2025. Cell lineage-resolved embryonic morphological map reveals signaling. Nat Commun. 16: 3700.","pubmedId":"","doi":"10.1038/s41467-025-58878-0"},{"reference":"Sulston JE, Schierenberg E, White JG, Thomson JN. 1983. The embryonic cell lineage of the nematode Caenorhabditis elegans. Dev Biol. 100: 64-119.","pubmedId":"","doi":"10.1016/0012-1606(83)90201-4"},{"reference":"Parker DM, Winkenbach LP, Parker A, Boyson S, Nishimura EO. 2021. Improved Methods for Single-Molecule Fluorescence In Situ Hybridization. Curr Protoc. 1: e299.","pubmedId":"","doi":"10.1002/cpz1.299"},{"reference":"Bao Z, Murray JI, Boyle T, Ooi SL, Sandel MJ, Waterston RH. 2006. 
Automated cell lineage tracing in Caenorhabditis elegans. Proceedings of the National Academy of Sciences. 103: 2707-2712.","pubmedId":"","doi":"10.1073/pnas.0511111103"},{"reference":"Edelstein AD, Tsuchida MA, Amodaj N, Pinkard H, Vale RD, Stuurman N. 2014. Advanced methods of microscope control using μManager software. POL Scientific. 1: 1.","pubmedId":"","doi":"10.14440/jbm.2014.36"},{"reference":". 2019. Reprint of: Mahalanobis, P.C. (1936) \"On the Generalised Distance in. Sankhya A. 80: 1-7.","pubmedId":"","doi":"10.1007/s13171-019-00164-5"},{"reference":"Katzman B, Tang D, Santella A, Bao Z. 2018. AceTree: a major update and case study in the long term maintenance of. BMC Bioinformatics. 19: 121.","pubmedId":"","doi":"10.1186/s12859-018-2127-0"},{"reference":". . Sakoe, H. and Chiba, S. (1978) Dynamic Programming Algorithm Optimization.","pubmedId":"","doi":""},{"reference":"Haus E, Santella A, Xu Y, Ren R, Wang D, Bao Z. 2025. A Single-cell Spatiotemporal Manifold of Tissue Morphology and Dynamics. bioRxiv","pubmedId":"","doi":"10.1101/2025.10.22.683950"},{"reference":"Schnabel R, Bischoff M, Hintze A, Schulz AK, Hejnol A, Meinhardt H, Hutter H. 2006. Global cell sorting in the C. elegans embryo defines a new mechanism for. Dev Biol. 294: 418-431.","pubmedId":"","doi":"10.1016/j.ydbio.2006.03.004"},{"reference":"Kabsch W. 1976. A solution for the best rotation to relate two sets of vectors. Acta Crystallographica Section A. 32: 922-923.","pubmedId":"","doi":"10.1107/S0567739476001873"},{"reference":"Hadwiger G, Dour S, Arur S, Fox P, Nonet ML. 2010. A Monoclonal Antibody Toolkit for C. elegans. PLOS ONE. 5: e10161.","pubmedId":"","doi":"10.1371/journal.pone.0010161"},{"reference":"Breiman L. 2001. Random Forests. Machine Learning. 45: 5-32.","pubmedId":"","doi":"10.1023/A:1010933404324"},{"reference":"Moore JL, Du Z, Bao Z. 2013. Systematic quantification of developmental phenotypes at single-cell. Development. 
140: 3266-3274.","pubmedId":"","doi":"10.1242/dev.096040"},{"reference":"Duerr JS. 2006. Immunohistochemistry. WormBook: 1-61.","pubmedId":"","doi":"10.1895/wormbook.1.105.1"},{"reference":"Boyle TJ, Bao Z, Murray JI, Araya CL, Waterston RH. 2006. AceTree: a tool for visual analysis of Caenorhabditis elegans. BMC Bioinformatics. 7: 275.","pubmedId":"","doi":"10.1186/1471-2105-7-275"},{"reference":"Kuhn HW. 1955. The Hungarian method for the assignment problem. Naval Research Logistics Quarterly. 2: 83-97.","pubmedId":"","doi":"10.1002/nav.3800020109"},{"reference":"Wu Y, Wawrzusin P, Senseney J, Fischer RS, Christensen R, Santella A, et al., Shroff H. 2013. Spatially isotropic four-dimensional imaging with dual-view plane. Nature Biotechnology. 31: 1032-1038.","pubmedId":"","doi":"10.1038/nbt.2713"},{"reference":". . Dynamic programming algorithm optimization for spoken word recognition.","pubmedId":"","doi":"10.5555/108235.108244"}],"title":"<p>Inference of Lineage-Resolved Cell Identities in Uncompressed <i>C. elegans </i>Embryos</p>","reviews":[],"curatorReviews":[{"curator":{"displayName":"Gary Craig Schindelman"},"openAcknowledgement":false,"submitted":null}]},{"id":"78706527-5eec-46a1-b250-73baee0286c2","decision":"revise","abstract":"<p>The ability to empirically identify every cell type in C. elegans based on their lineage history has been a powerful scientific resource, yet robust lineage-based identification typically requires live imaging and manual or automated cell tracking. Recent methods have been proposed to automate cell identification on the basis of spatiotemporal atlases, taking advantage of the availability of the large number of manually curated datasets the field has generated. These approaches enable robust cell identification across developmental stages. 
Here we present a complementary tool called EmbAlign, a fully automated 3D registration framework that determines lineage identities based solely on the 3D position of cell nuclei within an embryo snapshot. By anchoring its label search on the observed cell count, EmbAlign retrieves biologically valid reference templates from a spatiotemporal atlas built from timelapse observations and refines them via an iterative Sinkhorn alignment procedure. This approach robustly accommodates positional variability and arbitrary embryo orientations, allowing for automated cell labeling in uncompressed live or fixed embryos. In cross-validation, EmbAlign achieves 96.6% labeling accuracy up to the 190-cell stage, with slight performance fluctuations restricted to active division windows. EmbAlign includes a diagnostic layer that outputs a continuous confidence score (AUPRC = 0.525), ensuring interpretable assignments even in challenging cases, and provides an interactive report for each alignment providing a visualization of registration performance, label assignments, and label confidence. 
EmbAlign provides a complementary solution for converting raw 3D spatial data into annotated, lineage-aware datasets.</p>","acknowledgements":"<p>&nbsp;Strain JIM113 was provided by the CGC, which is funded by NIH Office of Research Infrastructure Programs (P40 OD010440).</p>","authors":[{"affiliations":["University of California Los Angeles"],"departments":["Bioinformatics Interdepartmental Program"],"credit":["investigation","methodology","software","validation","visualization","writing_originalDraft"],"email":"mptran@g.ucla.edu","firstName":"Miles","lastName":"Tran","submittingAuthor":false,"correspondingAuthor":false,"equalContribution":false,"WBId":null,"orcid":"0009-0005-9330-9297"},{"affiliations":["University of California Los Angeles"],"departments":["Molecular Biology Interdepartmental Program"],"credit":["dataCuration","validation"],"email":"neilpeinado@g.ucla.edu","firstName":"Neil","lastName":"Peinado","submittingAuthor":false,"correspondingAuthor":false,"equalContribution":false,"WBId":null,"orcid":null},{"affiliations":["Hong Kong Baptist University"],"departments":["Department of Biology"],"credit":["dataCuration","writing_reviewEditing"],"email":"zyzhao@hkbu.edu.hk","firstName":"Zhongying","lastName":"Zhao","submittingAuthor":false,"correspondingAuthor":false,"equalContribution":false,"WBId":null,"orcid":null},{"affiliations":["University of California Los Angeles"],"departments":["Molecular, Cell and Developmental Biology"],"credit":["conceptualization","dataCuration","methodology","project","fundingAcquisition","supervision","validation","writing_reviewEditing"],"email":"pavak@ucla.edu","firstName":"Pavak","lastName":"Shah","submittingAuthor":true,"correspondingAuthor":true,"equalContribution":false,"WBId":null,"orcid":"0000-0002-2603-5995"}],"awards":[{"awardId":"R35GM151199","funderName":"National Institutes of Health (United States)","awardRecipient":"PKS"}],"conflictsOfInterest":"<p>The authors declare that there are no conflicts of interest 
present.</p>","dataTable":{"url":null},"extendedData":[],"funding":"","image":{"url":"https://portal.micropublication.org/uploads/51bad455a31e292a4e958d5721d2f352.png"},"imageCaption":"A. Overview of the EmbAlign framework. Input Centroids: Raw 3D unoriented input centroids (top) and candidate reference templates retrieved from the spatiotemporal atlas (bottom). Coarse Alignment: Superposition of observed centroids with candidate reference template (top). A discrete rotational sweep around the principal component axis and the resulting cost landscape (bottom). Sinkhorn Refinement: Iterative Sinkhorn refinement mapping the spatial correspondence between observed centroids and the candidate reference (top) along with the resulting cost landscape (bottom). Label Assignment: The final discrete lineage identity assignments (top) and the resulting distribution of per-cell confidence scores (bottom) generated by the diagnostic layer. \nB. Cross-validated, per-frame alignment accuracy (N=11, mean = 96.9%) aligned to a canonical time axis. Rolling mean frame accuracy (solid black) with bootstrapped 95% confidence intervals (shaded).\nC. Per-frame alignment accuracy for two independently generated light sheet microscopy datasets.\nD. Distribution of assignment accuracies across all training embryos for the 30 cell types with the lowest overall mean accuracy. \nE. Cross-validated AUPRC curves (mean AUC = 0.546). Mean interpolated precision (solid blue) and bootstrapped 95% confidence intervals (shaded). Naive baseline (0.05, dashed).  \nF. Normalized confusion matrix. Values indicate the proportion of alignments correctly classified within each true class. \nG. Side-by-side comparison of ground truth label assignment accuracy (left) and the predicted confidence score (right).","imageTitle":"EmbAlign enables fully automated cell labeling in C. 
elegans embryo snapshots.","methods":"<p><b>Spatiotemporal Atlas Construction</b>&nbsp;</p><p>Embryos were first aligned to a canonical time axis via dynamic time warping <a href=\"https://paperpile.com/c/eA17rr/yIyn\">(</a><i><a href=\"https://paperpile.com/c/eA17rr/yIyn\">Dynamic Programming Algorithm Optimization for Spoken Word Recognition</a></i><a href=\"https://paperpile.com/c/eA17rr/yIyn\">, n.d.)</a>. A continuous spatial reference was constructed by fitting independent 1D Gaussian Process regressors (RBF and White kernels) to the spatiotemporal trajectories of each cell type’s 3D coordinates. Total positional variance at time t was calculated by aggregating the GP regression uncertainty with empirical cell-specific variance.&nbsp;</p><p>To define the database of biologically valid cell combinations (“slices”), we utilized strictly empirically observed states. Slices were extracted directly from the training frames by aggregating valid cell label combinations grouped by frame cell count (N). Finally, discrete slices were inflated into functional 3D reference frames. We calculated the overlapping temporal existence window, the canonical time window within which all cells in a given slice simultaneously exist, for each observed slice and queried the continuous GP atlas at the window’s midpoint (t<sub>med</sub>) to retrieve the expected 3D coordinates and covariance matrices required for alignment. To provide a temporal baseline for downstream diagnostic validation, an empirical growth curve was also constructed by calculating the mean and standard deviation of total cell counts across canonical time bins in the training data.</p><p><b>EmbAlign 3D Alignment and Label Transfer</b></p><p>Observed 3D nuclei centroids were first centered and scaled by their median pairwise distance. 
For each candidate atlas slice matching the observed cell count N, we aligned both the positive and negative primary principal component (±PC1) axes of the observation to the reference PC1 axis to account for PCA sign ambiguity. We then rotated the observation around the PC1 axis in discrete angular steps, scoring each orientation based on the Sinkhorn-weighted squared Euclidean distance between observed and reference centroids. To escape geometric symmetry traps, we identified the top k unique angular valleys (separated by ≥30°) to seed the refinement phase.&nbsp;</p><p>Each of the k initializations entered an Iterative Closest Point (ICP)-like soft refinement loop <a href=\"https://paperpile.com/c/eA17rr/2OtJ\">(Bergström &amp; Edlund, 2014)</a>. To more robustly handle biological noise, we replaced the standard hard-assignment matching of the ICP algorithm with optimal transport. We computed a soft correspondence probability matrix using Sinkhorn entropic regularization. The rigid transformation (rotation and translation) was then iteratively updated using a weighted Kabsch algorithm <a href=\"https://paperpile.com/c/eA17rr/nHpW\">(Kabsch, 1976)</a>, where the influence of each cell pairing was governed by its Sinkhorn probability.&nbsp;</p><p>Alignments were scored by calculating the Mahalanobis distance between the aligned observations and the atlas 3D Gaussian distributions. The slice and orientation minimizing this global cost were selected as the winner. Final discrete cell identities were assigned by executing a linear sum assignment algorithm <a href=\"https://paperpile.com/c/eA17rr/wnwg\">(Kuhn, 1955)</a> directly on the winning Mahalanobis distance matrix.</p><p><b>Confidence Scoring</b></p><p>A Random Forest <a href=\"https://paperpile.com/c/eA17rr/tppS\">(Breiman, 2001)</a> classifier (n_estimators = 200, max_depth = 10) was trained to predict cell-level assignment correctness using geometric and biological alignment features. 
Input variables included Sinkhorn assignment entropy, Mahalanobis distance, frame cell count, frame inferred time, and a normalized division delta (t<sub>med</sub> - t<sub>birth</sub>)/(t<sub>division</sub> - t<sub>birth</sub>) representing cell life progress. The model outputs continuous cell-level probabilities, which are also averaged to generate an aggregate frame-level confidence score.&nbsp;</p><p>Pipeline accuracy and diagnostic accuracy were evaluated using a Leave-One-Out cross-validation strategy. For each fold, the continuous spatial GP atlas, slice atlas, and diagnostic classifier were fitted entirely on the training embryos prior to evaluating the withheld test embryo.</p><p><b>QC Report Generation</b></p><p>To facilitate rapid visual assessment of alignment quality, the pipeline generates an interactive HTML diagnostic dashboard. To trace the optimization landscape, the alignment engine records the Sinkhorn-weighted sum of squared Euclidean distances at each discrete step of the coarse angular sweep. Furthermore, for the top k angular initializations selected for the refinement tournament, this same soft-weighted cost is sequentially tracked across all Sinkhorn refinement loops. These convergence traces are packaged alongside the empirical population growth curve, which plots the unannotated embryo’s observed cell count and t<sub>med</sub> against the 95% confidence intervals of the training population.&nbsp;</p><p><b>Data Acquisition&nbsp;</b></p><p>For validation, JIM113 embryos were isolated from gravid hermaphrodites and mounted on a coverslip using a diSPIM sample chamber. Images were acquired with 0.75 micron z-spacing on an ASI diSPIM <a href=\"https://paperpile.com/c/eA17rr/yCU1\">(Wu et al., 2013)</a> run in single view acquisition mode controlled by the diSPIM control plugin in Micro-Manager <a href=\"https://paperpile.com/c/eA17rr/8UCO+aiBr\">(A. Edelstein et al., 2010; A. D. Edelstein et al., 2014)</a>. 
Images were cropped in ImageJ <a href=\"https://paperpile.com/c/eA17rr/8Vct\">(Schneider et al., 2012)</a> and processed using StarryNite. Lineage tracing results from StarryNite were then manually curated using AceTree.</p><p><b>Data and Code Availability</b></p><p>The EmbAlign source code is available as a public repository at https://github.com/shahlab-ucla/EmbAlign. This repository also contains executable scripts, configuration files, and pretrained models required to reproduce the analyses and figures presented in this study. All processed datasets used for model training and out-of-sample validation are hosted within the repository and permanently archived at https://doi.org/10.5281/zenodo.20089241. The repository also contains core usage vignettes outlining execution of the EmbAlign pipeline for lineage inference in unlabeled 3D nuclei coordinates, as well as instructions for fitting the spatiotemporal atlases on raw point cloud data.</p>","reagents":"<p></p>","patternDescription":"<p><b>Introduction:</b></p><p>During the initial mapping of the <i>C. elegans </i>embryonic lineage <a href=\"https://paperpile.com/c/eA17rr/YgjH\">(Sulston et al., 1983)</a>, the spatial stereotypy of the embryo was noted. More recent implementations of automated tracking-based lineage tracing <a href=\"https://paperpile.com/c/eA17rr/Zsa7+8q2s+L00y\">(Bao et al., 2006; Santella et al., 2010, 2014)</a> enabled quantitative assessment of the global consistency of cell positioning within the embryo <a href=\"https://paperpile.com/c/eA17rr/vOpc+SI9r+n1tz\">(Li et al., 2019; Moore et al., 2013; Schnabel et al., 2006)</a>, a feature that has led to recent advances <a href=\"https://paperpile.com/c/eA17rr/HA8b+kTnY\">(Haus et al., 2025; Ntemos et al., 2025)</a> in using spatial atlases built by live imaging to automate cell identity determination from static snapshots of <i>C. elegans </i>embryos using timelapse recordings of embryos imaged under gentle compression. 
Compression has been extensively used in prior live imaging experiments <a href=\"https://paperpile.com/c/eA17rr/YgjH\">(Sulston et al., 1983)</a> as it forces embryos into stereotypical orientations and limits their axial extent, allowing for 3D coverage with fewer focal planes in confocal imaging <a href=\"https://paperpile.com/c/eA17rr/EDNV\">(Bao &amp; Murray, 2011)</a>. While current approaches achieve high accuracy in the instantaneous inference of lineage identity from static snapshots of cell position, both of these tools were trained using compressed data. We set out to build EmbAlign to fill this gap by enabling robust lineage identity determination using cell position in uncompressed embryos, such as would be generated by many smFISH <a href=\"https://paperpile.com/c/eA17rr/Ypca\">(Parker et al., 2021)</a> and immunofluorescence<a href=\"https://paperpile.com/c/eA17rr/Ypca+vnCY\">(Duerr, 2006; Parker et al., 2021)</a> sample preparation approaches, and by snapshots from uncompressed live imaging, for example by lightsheet microscopy <a href=\"https://paperpile.com/c/eA17rr/KPdD\">(Wu et al., 2011)</a>.</p><p><b>Pipeline Overview:</b></p><p>The EmbAlign algorithm automates the transformation of unoriented 3D nuclei centroids from uncompressed C. elegans embryos onto their canonical lineage identities (<b>Fig. 1A</b>). Initially, raw 3D centroids are centered and scaled to standardize physical size variations while preserving relative spatial topology. Using the total observed cell count, the algorithm selects from a library of reference templates derived from empirically observed lineage identity configurations. To resolve arbitrary orientations, the algorithm aligns the observed centroids’ primary principal component (±PC1) axes to the reference and executes a discrete rotational sweep to identify the top k unique angular orientations. Each initialization is then refined via an iterative Sinkhorn alignment procedure. 
This stage utilizes entropically regularized optimal transport to compute a soft correspondence probability matrix and iteratively updates the best fit rigid transformation. Following refinement, EmbAlign resolves a final, discrete mapping of identities to centroids by minimizing the Sinkhorn-weighted Mahalanobis distance, enforcing the one-to-one correspondence required by the embryo’s invariant body plan. This strict one-to-one correspondence does introduce a key limitation for EmbAlign in that it is extremely sensitive to detection errors, requiring careful validation of cell detection in test data prior to alignment. Finally, a Random Forest diagnostic layer evaluates the assignment, utilizing geometric and biological features to output a continuous confidence score for every predicted cell identity.&nbsp;</p><p><b>Label Transfer Performance:</b></p><p>EmbAlign’s performance was rigorously evaluated using a Leave-One-Out Cross-Validation (LOOCV) strategy across a dataset of uncompressed C. elegans embryos from two different labs acquired using distinct imaging modalities (embryo n = 11, frame n = 1,215), where ground truth cell identities were established using the AceTree <a href=\"https://paperpile.com/c/eA17rr/wnW2+hC5r\">(Boyle et al., 2006; Katzman et al., 2018)</a>/StarryNite <a href=\"https://paperpile.com/c/eA17rr/Zsa7+L00y+8q2s\">(Bao et al., 2006; Santella et al., 2010, 2014)</a> lineage tracking pipeline. Position data for uncompressed embryos were sourced from previously published lightsheet microscopy lineage reconstructions. The algorithm achieved a sustained, high average frame accuracy of 96.6% from the 6-cell up to the 190-cell stage (<b>Fig. 1B</b>). Despite this overall robustness, performance exhibited predictable transient declines corresponding to major waves of synchronous cell division. 
Nuclei captured within or immediately adjacent to these division windows are inherently more difficult to classify, as rapid physical displacement during cytokinesis briefly deforms the expected spatial topology and maximizes the distance of these cells from their canonical atlas positions. Finding an exact reference template match is inherently challenging during these rapid waves of cell division. Remarkably, EmbAlign maintains an 86.6% assignment accuracy even when aligning against an imperfect reference template (n = 270), compared to a near-perfect 99.4% accuracy when at least one exact reference template match is available (n = 945). This suggests the EmbAlign framework successfully buffers against missing reference templates, and increasing training dataset size to more robustly cover gaps in the current reference atlas should further improve overall performance.</p><p>To verify that the pipeline does not overfit to the spatial properties of the training data, we evaluated EmbAlign on two fully independent, out-of-sample (OOS) datasets acquired via single view selective plane illumination microscopy (ASI diSPIM operated in single view acquisition mode). These OOS embryos were annotated with ground truth labels using the AceTree and StarryNite pipeline up to the 100-cell stage. The algorithm successfully mapped these embryos, maintaining comparable high-fidelity frame accuracies (96.3% and 96.1%) without any dataset-specific recalibration (<b>Fig. 1C</b>). This OOS performance demonstrates robust generalizability, confirming that EmbAlign effectively captures the invariant spatial dynamics of <i>C. elegans</i> embryogenesis rather than batch-specific imaging artifacts.&nbsp;</p><p>To investigate cell type specific drops in performance, we examined the 30 lowest-performing cell types. We observed that these challenging types primarily consist of later-stage cells that emerge toward the end of our atlas window and correspond to a wave of cell divisions. 
Additionally, we correlated our global cell-type accuracies with empirical positional variance measurements from prior studies <a href=\"https://paperpile.com/c/eA17rr/SI9r+TBy8\">(Guan et al., 2025; Li et al., 2019)</a> in uncompressed embryos. We found a small but significant negative correlation (Spearman’s ρ = -0.11, p = 0.03), suggesting the natural spatial variability of these cells is likely a minor contributing factor to EmbAlign’s difficulty in resolving their identities.&nbsp;</p><p><b>Diagnostic Layer Performance:</b></p><p>To enable reliable application in experimental settings lacking ground truth, we integrated a decoupled Random Forest diagnostic layer that provides a single-cell assessment of alignment quality in real time. Because the pipeline maintains a high baseline accuracy, the diagnostic task is imbalanced, requiring the classifier to identify rare misassignment events within a vast majority of correct labels. To rigorously evaluate performance under those conditions, we compared the precision (a measure of the false positive rate) and recall (a measure of the false negative rate) of the model. The diagnostic layer achieved an AUPRC of 0.525, a greater than 10-fold improvement over the naive baseline of 0.05 (<b>Fig. 1E</b>), and successfully identified 78.5% of true assignment errors and 90.2% of true assignment successes (<b>Fig. 1F</b>), demonstrating a robust ability to capture specific geometric and biological features of assignment failures.</p><p>To make these diagnostics accessible to the end user, we developed an interactive HTML alignment report that packages the pipeline’s outputs into a comprehensive dashboard. Along with a frame-level confidence estimate, the tool projects the aligned embryo onto an empirical population growth curve to estimate its canonical time, a feature that allows users to flag datasets that are likely to have been captured during transient, error-prone division windows. 
The dashboard also summarizes the entire search landscape, enabling users to compare alternative alignments across multiple local minima and track their respective optimization convergence traces. Finally, predicted labels and alignment confidence scores are mapped directly onto the embryo’s 3D spatial topology, resulting in interactive 3D spatial label assignment and confidence plots that facilitate rapid verification of alignment quality (<b>Fig. 1A, 1G</b>).&nbsp;&nbsp;</p><p><b>Conclusions:</b></p><p>In summary, EmbAlign provides a generalizable solution for automated cell identity inference in static 3D snapshots of uncompressed <i>C. elegans</i> embryos. By aligning 3D nuclei centroid coordinates to live-imaging derived spatiotemporal atlases of early <i>C. elegans</i> development, EmbAlign achieves &gt;96% alignment accuracy up to the 190 cell stage that generalizes across independently generated datasets. Additionally, we observe that performance lapses correspond to transient waves of synchronous cell division, where natural positional variance temporarily disrupts spatial stereotypy. Supported by a predictive diagnostic classifier and interactive alignment reports, EmbAlign provides a complementary tool for transforming raw spatial coordinates into lineage-aware datasets.</p>","references":[{"reference":"Bergstrom P, Edlund O. 2014. Robust registration of point sets using iteratively reweighted least. Computational Optimization and Applications. 58: 543-561.","pubmedId":"","doi":"10.1007/s10589-014-9643-2"},{"reference":"Cuturi M. 2013. Sinkhorn Distances: Lightspeed Computation of Optimal Transportation.","pubmedId":"","doi":""},{"reference":"Edelstein A, Amodaj N, Hoover K, Vale R, Stuurman N. 2010. Computer Control of Microscopes Using µManager. Current Protocols in Molecular Biology. 92: 14.20.1-14.20.17.","pubmedId":"","doi":"10.1002/0471142727.mb1420s92"},{"reference":"Schneider CA, Rasband WS, Eliceiri KW. 2012. 
NIH Image to ImageJ: 25 years of image analysis. Nature Methods. 9: 671-675.","pubmedId":"","doi":"10.1038/nmeth.2089"},{"reference":"Santella A, Du Z, Nowotschin S, Hadjantonakis AK, Bao Z. 2010. A hybrid blob-slice model for accurate and efficient detection of. BMC Bioinformatics. 11: 580.","pubmedId":"","doi":"10.1186/1471-2105-11-580"},{"reference":"Bao Z, Murray JI. 2011. Mounting Caenorhabditis elegans embryos for live imaging of embryogenesis. Cold Spring Harb Protoc. 2011","pubmedId":"","doi":"10.1101/pdb.prot065599"},{"reference":"Ntemos K, Xu F, Bazzi NZ, Fucile G, Maretic HP, Dokmanic I, Mango SE, Sawh AN. 2025. Rapid canalization of chromosome conformation-transcription fingerprints. bioRxiv","pubmedId":"","doi":"10.1101/2025.10.22.684035"},{"reference":"Wu Y, Ghitani A, Christensen R, Santella A, Du Z, Rondeau G, et al., Shroff H. 2011. Inverted selective plane illumination microscopy (iSPIM) enables coupled. Proceedings of the National Academy of Sciences. 108: 17708-17713.","pubmedId":"","doi":"10.1073/pnas.1108494108"},{"reference":"Santella A, Du Z, Bao Z. 2014. A semi-local neighborhood-based framework for probabilistic cell lineage. BMC Bioinformatics. 15: 217.","pubmedId":"","doi":"10.1186/1471-2105-15-217"},{"reference":"Li X, Zhao Z, Xu W, Fan R, Xiao L, Ma X, Du Z. 2019. Systems Properties and Spatiotemporal Regulation of Cell Position. Cell Rep. 26: 313-321.e7.","pubmedId":"","doi":"10.1016/j.celrep.2018.12.052"},{"reference":"Guan G, Li Z, Ma Y, Ye P, Cao J, Wong MK, et al., Zhao Z. 2025. Cell lineage-resolved embryonic morphological map reveals signaling. Nat Commun. 16: 3700.","pubmedId":"","doi":"10.1038/s41467-025-58878-0"},{"reference":"Sulston JE, Schierenberg E, White JG, Thomson JN. 1983. The embryonic cell lineage of the nematode Caenorhabditis elegans. Dev Biol. 100: 64-119.","pubmedId":"","doi":"10.1016/0012-1606(83)90201-4"},{"reference":"Parker DM, Winkenbach LP, Parker A, Boyson S, Nishimura EO. 2021. 
Improved Methods for Single-Molecule Fluorescence In Situ Hybridization. Curr Protoc. 1: e299.","pubmedId":"","doi":"10.1002/cpz1.299"},{"reference":"Bao Z, Murray JI, Boyle T, Ooi SL, Sandel MJ, Waterston RH. 2006. Automated cell lineage tracing in Caenorhabditis elegans. Proceedings of the National Academy of Sciences. 103: 2707-2712.","pubmedId":"","doi":"10.1073/pnas.0511111103"},{"reference":"Edelstein AD, Tsuchida MA, Amodaj N, Pinkard H, Vale RD, Stuurman N. 2014. Advanced methods of microscope control using μManager software. POL Scientific. 1: 1.","pubmedId":"","doi":"10.14440/jbm.2014.36"},{"reference":". 2019. Reprint of: Mahalanobis, P.C. (1936) \"On the Generalised Distance in. Sankhya A. 80: 1-7.","pubmedId":"","doi":"10.1007/s13171-019-00164-5"},{"reference":"Katzman B, Tang D, Santella A, Bao Z. 2018. AceTree: a major update and case study in the long term maintenance of. BMC Bioinformatics. 19: 121.","pubmedId":"","doi":"10.1186/s12859-018-2127-0"},{"reference":". . Sakoe, H. and Chiba, S. (1978) Dynamic Programming Algorithm Optimization.","pubmedId":"","doi":""},{"reference":"Haus E, Santella A, Xu Y, Ren R, Wang D, Bao Z. 2025. A Single-cell Spatiotemporal Manifold of Tissue Morphology and Dynamics. bioRxiv","pubmedId":"","doi":"10.1101/2025.10.22.683950"},{"reference":"Schnabel R, Bischoff M, Hintze A, Schulz AK, Hejnol A, Meinhardt H, Hutter H. 2006. Global cell sorting in the C. elegans embryo defines a new mechanism for. Dev Biol. 294: 418-431.","pubmedId":"","doi":"10.1016/j.ydbio.2006.03.004"},{"reference":"Kabsch W. 1976. A solution for the best rotation to relate two sets of vectors. Acta Crystallographica Section A. 32: 922-923.","pubmedId":"","doi":"10.1107/S0567739476001873"},{"reference":"Hadwiger G, Dour S, Arur S, Fox P, Nonet ML. 2010. A Monoclonal Antibody Toolkit for C. elegans. PLOS ONE. 5: e10161.","pubmedId":"","doi":"10.1371/journal.pone.0010161"},{"reference":"Breiman L. 2001. Random Forests. Machine Learning. 
45: 5-32.","pubmedId":"","doi":"10.1023/A:1010933404324"},{"reference":"Moore JL, Du Z, Bao Z. 2013. Systematic quantification of developmental phenotypes at single-cell. Development. 140: 3266-3274.","pubmedId":"","doi":"10.1242/dev.096040"},{"reference":"Duerr JS. 2006. Immunohistochemistry. WormBook: 1-61.","pubmedId":"","doi":"10.1895/wormbook.1.105.1"},{"reference":"Boyle TJ, Bao Z, Murray JI, Araya CL, Waterston RH. 2006. AceTree: a tool for visual analysis of Caenorhabditis elegans. BMC Bioinformatics. 7: 275.","pubmedId":"","doi":"10.1186/1471-2105-7-275"},{"reference":"Kuhn HW. 1955. The Hungarian method for the assignment problem. Naval Research Logistics Quarterly. 2: 83-97.","pubmedId":"","doi":"10.1002/nav.3800020109"},{"reference":"Wu Y, Wawrzusin P, Senseney J, Fischer RS, Christensen R, Santella A, et al., Shroff H. 2013. Spatially isotropic four-dimensional imaging with dual-view plane. Nature Biotechnology. 31: 1032-1038.","pubmedId":"","doi":"10.1038/nbt.2713"},{"reference":". . Dynamic programming algorithm optimization for spoken word recognition.","pubmedId":"","doi":"10.5555/108235.108244"}],"title":"<p>Inference of Lineage-Resolved Cell Identities in Uncompressed <i>C. elegans </i>Embryos</p>","reviews":[{"reviewer":{"displayName":"Zhuo Du"},"openAcknowledgement":true,"status":{"submitted":true}}],"curatorReviews":[{"curator":{"displayName":"Gary Craig Schindelman"},"openAcknowledgement":false,"submitted":null}]},{"id":"4d54cde6-5fe3-43f7-91d8-414762106181","decision":"accept","abstract":"<p>Lineage-based cell identification in <i><a href=\"https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&amp;id=6239\" id=\"fb60fcff-20ec-4f86-9581-49dca66bb720\">C. elegans</a></i> typically requires intensive live imaging and tracking. Recent methods attempt to automate cell identification on the basis of spatiotemporal atlases. 
We present EmbAlign, an automated 3D registration framework that determines lineage identities from single embryo snapshots. EmbAlign retrieves reference templates from a spatiotemporal atlas and refines assignments using an iterative Sinkhorn alignment procedure, robustly handling positional variability and arbitrary orientations in uncompressed embryos. EmbAlign achieves 96.9% accuracy up to the 190-cell stage and includes a diagnostic layer for continuous scoring (AUPRC = 0.546), converting raw spatial data into lineage aware datasets.</p>","acknowledgements":"<p>&nbsp;Strain JIM113 was provided by the CGC, which is funded by NIH Office of Research Infrastructure Programs (P40 OD010440).</p>","authors":[{"affiliations":["University of California Los Angeles"],"departments":["Bioinformatics Interdepartmental Program"],"credit":["investigation","methodology","software","validation","visualization","writing_originalDraft"],"email":"mptran@g.ucla.edu","firstName":"Miles","lastName":"Tran","submittingAuthor":false,"correspondingAuthor":false,"equalContribution":false,"WBId":null,"orcid":"0009-0005-9330-9297"},{"affiliations":["University of California Los Angeles"],"departments":["Molecular Biology Interdepartmental Program"],"credit":["dataCuration","validation"],"email":"neilpeinado@g.ucla.edu","firstName":"Neil","lastName":"Peinado","submittingAuthor":false,"correspondingAuthor":false,"equalContribution":false,"WBId":null,"orcid":null},{"affiliations":["Hong Kong Baptist University"],"departments":["Department of Biology"],"credit":["dataCuration","writing_reviewEditing"],"email":"zyzhao@hkbu.edu.hk","firstName":"Zhongying","lastName":"Zhao","submittingAuthor":false,"correspondingAuthor":false,"equalContribution":false,"WBId":null,"orcid":null},{"affiliations":["University of California Los Angeles"],"departments":["Molecular, Cell and Developmental 
Biology"],"credit":["conceptualization","dataCuration","methodology","project","fundingAcquisition","supervision","validation","writing_reviewEditing"],"email":"pavak@ucla.edu","firstName":"Pavak","lastName":"Shah","submittingAuthor":true,"correspondingAuthor":true,"equalContribution":false,"WBId":null,"orcid":"0000-0002-2603-5995"}],"awards":[{"awardId":"R35GM151199","funderName":"National Institutes of Health (United States)","awardRecipient":"PKS"}],"conflictsOfInterest":"<p>The authors declare that there are no conflicts of interest present.</p>","dataTable":{"url":null},"extendedData":[{"description":"<p>v1 release of EmbAlign source code, also indexed at https://doi.org/10.5281/zenodo.20089241 provided under the MIT license.</p>","doi":null,"resourceType":"Software","name":"EmbAlign.zip","url":"https://portal.micropublication.org/uploads/960450e5612070355778b8f05e56421d.zip"}],"funding":"","image":{"url":"https://portal.micropublication.org/uploads/51bad455a31e292a4e958d5721d2f352.png"},"imageCaption":"A. Overview of the EmbAlign framework. Input Centroids: Raw 3D unoriented input centroids (top) and candidate reference templates retrieved from the spatiotemporal atlas (bottom). Coarse Alignment: Superposition of observed centroids with candidate reference template (top). A discrete rotational sweep around the principal component axis and the resulting cost landscape (bottom). Sinkhorn Refinement: Iterative Sinkhorn refinement mapping the spatial correspondence between observed centroids and the candidate reference (top) along with the resulting cost landscape (bottom). Label Assignment: The final discrete lineage identity assignments (top) and the resulting distribution of per-cell confidence scores (bottom) generated by the diagnostic layer. \nB. Cross-validated, per-frame alignment accuracy (N=11, mean = 96.9%) aligned to a canonical time axis. Rolling mean frame accuracy (solid black) with bootstrapped 95% confidence intervals (shaded).\nC. 
Per-frame alignment accuracy for two independently generated light sheet microscopy datasets.\nD. Distribution of assignment accuracies across all training embryos for the 30 cell types with the lowest overall mean accuracy. \nE. Cross-validated AUPRC curves (mean AUC = 0.546). Mean interpolated precision (solid blue) and bootstrapped 95% confidence intervals (shaded). Naive baseline (0.05, dashed).  \nF. Normalized confusion matrix. Values indicate the proportion of alignments correctly classified within each true class. \nG. Side-by-side comparison of ground truth label assignment accuracy (left) and the predicted confidence score (right).","imageTitle":"EmbAlign enables fully automated cell labeling in C. elegans embryo snapshots.","methods":"<p><b>Spatiotemporal Atlas Construction</b> </p><p>Embryos were first aligned to a canonical time axis via dynamic time warping <a href=\"https://paperpile.com/c/eA17rr/yIyn\">(</a><i><a href=\"https://paperpile.com/c/eA17rr/yIyn\">Dynamic Programming Algorithm Optimization for Spoken Word Recognition</a></i><a href=\"https://paperpile.com/c/eA17rr/yIyn\">, n.d.)</a>. A continuous spatial reference was constructed by fitting independent 1D Gaussian Process regressors (RBF and White kernels) to the spatiotemporal trajectories of each cell type's 3D coordinates. Total positional variance at time t was calculated by aggregating the GP regression uncertainty with empirical cell-specific variance. </p><p>To define the database of biologically valid cell combinations (“slices”), we utilized strictly empirically observed states. Slices were extracted directly from the training frames by aggregating valid cell label combinations grouped by frame cell count (N). Finally, discrete slices were inflated into functional 3D reference frames. 
We calculated the overlapping temporal existence window, the canonical time window within which all cells in a given slice simultaneously exist, for each observed slice and queried the continuous GP atlas at the window's midpoint (t<sub>med</sub>) to retrieve the expected 3D coordinates and covariance matrices required for alignment. To provide a temporal baseline for downstream diagnostic validation, an empirical growth curve was also constructed by calculating the mean and standard deviation of total cell counts across canonical time bins in the training data.</p><p><b>EmbAlign 3D Alignment and Label Transfer</b></p><p>Observed 3D nuclei centroids were first centered and scaled by their median pairwise distance. For each candidate atlas slice matching the observed cell count N, we aligned both the positive and negative primary principal component (±<a>PC1</a>) axes of the observation to the reference <a>PC1</a> axis to account for PCA sign ambiguity. We then rotated the observation around the <a>PC1</a> axis in discrete angular steps, scoring each orientation based on the Sinkhorn-weighted squared Euclidean distance between observed and reference centroids. To escape geometric symmetry traps, we identified the top k unique angular valleys (separated by ≥30°) to seed the refinement phase. </p><p>Each of the k initializations entered an Iterative Closest Point (ICP)-like soft refinement loop <a href=\"https://paperpile.com/c/eA17rr/2OtJ\">(Bergström &amp; Edlund, 2014)</a>. To more robustly handle biological noise, we replaced the standard hard-assignment matching of the ICP algorithm with optimal transport. We computed a soft correspondence probability matrix using Sinkhorn entropic regularization. 
The rigid transformation (rotation and translation) was then iteratively updated using a weighted Kabsch algorithm <a href=\"https://paperpile.com/c/eA17rr/nHpW\">(Kabsch, 1976)</a>, where the influence of each cell pairing was governed by its Sinkhorn probability. </p><p>Alignments were scored by calculating the Mahalanobis distance between the aligned observations and the atlas 3D Gaussian distributions. The slice and orientation minimizing this global cost were selected as the winner. Final discrete cell identities were assigned by executing a linear sum assignment algorithm <a href=\"https://paperpile.com/c/eA17rr/wnwg\">(Kuhn, 1955)</a> directly on the winning Mahalanobis distance matrix.</p><p><b>Confidence Scoring</b></p><p>A Random Forest <a href=\"https://paperpile.com/c/eA17rr/tppS\">(Breiman, 2001)</a> classifier (n_estimators = 200, max_depth = 10) was trained to predict cell-level assignment correctness using geometric and biological alignment features. Input variables included Sinkhorn assignment entropy, Mahalanobis distance, frame cell count, frame inferred time, and a normalized division delta (t<sub>med</sub> - t<sub>birth</sub>)/(t<sub>division</sub> - t<sub>birth</sub>) representing cell life progress. The model outputs continuous cell-level probabilities, which are also averaged to generate an aggregate frame-level confidence score. </p><p>Pipeline accuracy and diagnostic accuracy were evaluated using a Leave-One-Out cross-validation strategy. For each fold, the continuous spatial GP atlas, slice atlas, and diagnostic classifier were fitted entirely on the training embryos prior to evaluating the withheld test embryo.</p><p><b>QC Report Generation</b></p><p>To facilitate rapid visual assessment of alignment quality, the pipeline generates an interactive HTML diagnostic dashboard. To trace the optimization landscape, the alignment engine records the Sinkhorn-weighted sum of squared Euclidean distances at each discrete step of the coarse angular sweep. 
Furthermore, for the top k angular initializations selected for the refinement tournament, this same soft-weighted cost is sequentially tracked across all Sinkhorn refinement loops. These convergence traces are packaged alongside the empirical population growth curve, which plots the unannotated embryo's observed cell count and t<sub>med</sub> against the 95% confidence intervals of the training population. </p><p><b>Data Acquisition </b></p><p>For validation, <a href=\"http://www.wormbase.org/db/get?name=WBStrain00022462;class=Strain\" id=\"12769628-6fd9-47c3-a863-a04f9fc2c107\">JIM113</a> embryos were isolated from gravid hermaphrodites and mounted on a coverslip using a diSPIM sample chamber. Images were acquired with 0.75 micron z-spacing on an ASI diSPIM <a href=\"https://paperpile.com/c/eA17rr/yCU1\">(Wu et al., 2013)</a> run in single view acquisition mode controlled by the diSPIM control plugin in micro-manager <a href=\"https://paperpile.com/c/eA17rr/8UCO+aiBr\">(A. Edelstein et al., 2010; A. D. Edelstein et al., 2014)</a>. Images were cropped in ImageJ <a href=\"https://paperpile.com/c/eA17rr/8Vct\">(Schneider et al., 2012)</a> and processed using StarryNite. Lineage tracing results from StarryNite were then manually curated using AceTree.</p><p><b>Data and Code Availability</b></p><p>The EmbAlign source code is available as a public repository at https://github.com/shahlab-ucla/EmbAlign. This repository also contains executable scripts, configuration files, and pretrained models required to reproduce the analyses and figures presented in this study. All processed datasets used for model training and out-of-sample validation are hosted within the repository and permanently archived at https://doi.org/10.5281/zenodo.20089241. 
The repository also contains core usage vignettes outlining execution of the EmbAlign pipeline for lineage inference in unlabeled 3D nuclei coordinates, as well as instructions for fitting the spatiotemporal atlases on raw point cloud data.</p>","reagents":"<p></p>","patternDescription":"<p>During the mapping of the <i><a href=\"https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&amp;id=6239\" id=\"71c01161-cbe9-4059-832b-1f2606c5c85d\">C. elegans</a> </i>embryonic lineage <a href=\"https://paperpile.com/c/eA17rr/YgjH\">(Sulston et al., 1983)</a>, the spatial stereotypy of the embryo was noted. More recent automated tracking-based lineage tracing <a href=\"https://paperpile.com/c/eA17rr/Zsa7+8q2s+L00y\">(Bao et al., 2006; Santella et al., 2010, 2014)</a> enabled quantitative assessment of the consistency of cell positioning within the embryo <a href=\"https://paperpile.com/c/eA17rr/vOpc+SI9r+n1tz\">(Li et al., 2019; Moore et al., 2013; Schnabel et al., 2006)</a>, a feature that has led to recent advances <a href=\"https://paperpile.com/c/eA17rr/HA8b+kTnY\">(Haus et al., 2025; Ntemos et al., 2025)</a> in using spatial atlases to automate cell identity determination from static snapshots of <i><a href=\"https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&amp;id=6239\" id=\"1fb7a9f7-6522-47a6-9a10-ea5d211738d4\">C. elegans</a> </i>embryos using timelapse recordings of embryos imaged under gentle compression. Compression has been extensively used in prior live imaging experiments <a href=\"https://paperpile.com/c/eA17rr/YgjH\">(Sulston et al., 1983)</a> as it forces embryos into stereotypical orientations and limits their axial extent, allowing for 3D coverage with fewer focal planes in confocal imaging <a href=\"https://paperpile.com/c/eA17rr/EDNV\">(Bao &amp; Murray, 2011)</a>. 
While current approaches achieve high accuracy in the inference of lineage identity from static snapshots of cell position, both of these tools were trained using compressed data. Because this mechanical compression physically flattens the Z-axis, it alters the relative spatial topology of nuclei. Consequently, tools trained on these compressed datasets may struggle to map the 3D geometry of uncompressed embryos, which are free to rotate in space. We built EmbAlign to fill this gap by enabling robust lineage identity determination using cell position in uncompressed embryos, such as would be generated by many smFISH <a href=\"https://paperpile.com/c/eA17rr/Ypca\">(Parker et al., 2021)</a> and immunofluorescence <a href=\"https://paperpile.com/c/eA17rr/Ypca+vnCY\">(Duerr, 2006; Parker et al., 2021)</a> sample preparation approaches, and by snapshots from uncompressed live imaging, for example by lightsheet microscopy <a href=\"https://paperpile.com/c/eA17rr/KPdD\">(Wu et al., 2011)</a>. In the case of assays like smFISH or immunofluorescence, embryos can be fixed in suspension, preserving their native 3D architecture while inherently precluding any live imaging. EmbAlign provides the geometric inference to map molecular profiles to lineage identities in these preparations.</p><p>EmbAlign automates the transformation of 3D nuclei centroids from uncompressed <a href=\"https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&amp;id=6239\" id=\"190812f1-2718-4ba2-9d84-81fd41d4b676\">C. elegans</a> embryos onto their canonical lineage identities (<b>Fig. 1A</b>). While users can generate these coordinates using various 3D segmentation strategies, ranging from classical Laplacian of Gaussian blob detection <a href=\"https://paperpile.com/c/eA17rr/X8o2\">(Kong et al., 2013)</a> to deep-learning models like Cellpose <a href=\"https://paperpile.com/c/eA17rr/xqSl\">(Stringer et al., 2021)</a>, robust detection is critical. 
Before executing EmbAlign, we recommend visually verifying segmented centroids against raw fluorescence images as a standard practice. Once these validated 3D centroids are obtained, they are centered and scaled to standardize physical size variations while preserving relative spatial topology. Using the total observed cell count, the algorithm selects from a set of reference templates derived from empirically observed lineage identity configurations. To resolve arbitrary orientations, the algorithm aligns the observed centroids' primary principal component (±<a>PC1</a>) axes to the reference and executes a discrete rotational sweep to identify the top k unique angular orientations. Each initialization is then refined via an iterative Sinkhorn alignment procedure. This stage utilizes entropically regularized optimal transport to compute a soft correspondence probability matrix and iteratively updates the best-fit rigid transformation. Following refinement, EmbAlign resolves a final, discrete mapping of identities to centroids by minimizing the Sinkhorn-weighted Mahalanobis distance, enforcing the one-to-one correspondence required by the embryo's invariant body plan. Finally, a Random Forest diagnostic layer evaluates the assignment, utilizing geometric and biological features to output a continuous confidence score for every prediction. </p><p>EmbAlign's performance was evaluated using a Leave-One-Out Cross-Validation (LOOCV) strategy across a dataset of uncompressed <a href=\"https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&amp;id=6239\" id=\"24b6d164-1164-4087-856c-06497c3ae47e\">C. 
elegans</a> embryos from two different labs acquired using distinct imaging modalities (embryo n = 11, frame n = 1,215), where ground truth cell identities were established using the AceTree <a href=\"https://paperpile.com/c/eA17rr/wnW2+hC5r\">(Boyle et al., 2006; Katzman et al., 2018)</a>/StarryNite <a href=\"https://paperpile.com/c/eA17rr/Zsa7+L00y+8q2s\">(Bao et al., 2006; Santella et al., 2010, 2014)</a> lineage tracking pipeline. Position data for uncompressed embryos were sourced from previously published lightsheet microscopy lineage reconstructions. The algorithm achieved a high average frame accuracy of 96.6% from the 6-cell up to the 190-cell stage (<b>Fig. 1B</b>). Despite this overall robustness, performance exhibited predictable transient declines corresponding to waves of synchronous cell division. Nuclei captured within or immediately adjacent to these division windows are inherently more difficult to classify, as rapid physical displacement during cytokinesis briefly deforms the expected spatial topology and maximizes the distance of these cells from their canonical atlas positions. Finding an exact reference template match is inherently challenging during these rapid waves of cell division. Remarkably, EmbAlign maintains an 86.6% assignment accuracy even when aligning against an imperfect reference template (n = 270), compared to a near-perfect 99.4% accuracy when at least one exact reference template match is available (n = 945). 
This suggests the EmbAlign framework successfully buffers against missing reference templates, and increasing training dataset size to more robustly cover gaps in the current reference atlas should further improve overall performance.</p><p>To verify that the pipeline does not overfit to the spatial properties of the training data, we evaluated EmbAlign on two fully independent, out-of-sample (OOS) datasets acquired via single view selective plane illumination microscopy (ASI diSPIM operated in single view acquisition mode). These OOS embryos were annotated with ground truth labels using the AceTree/StarryNite pipeline up to the 100-cell stage. The algorithm successfully mapped these embryos, maintaining comparable high-fidelity frame accuracies (96.3% and 96.1%) without any dataset-specific recalibration (<b>Fig. 1C</b>). This OOS performance demonstrates robust generalizability, confirming that EmbAlign effectively captures the invariant spatial dynamics of <i><a href=\"https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&amp;id=6239\" id=\"fa53dbde-6be0-4695-82f7-a2c04fc80ee1\">C. elegans</a></i> embryogenesis rather than batch-specific imaging artifacts. </p><p>To investigate cell-type-specific drops in performance, we examined the 30 lowest-performing cell types. We observed that these challenging types primarily consist of later-stage cells that emerge toward the end of our atlas window and correspond to a wave of cell divisions. Additionally, we correlated our global cell-type accuracies with empirical positional variance measurements from a prior study <a href=\"https://paperpile.com/c/eA17rr/SI9r+TBy8\">(Guan et al., 2025; Li et al., 2019)</a> in uncompressed embryos. We found a small but significant negative correlation (Spearman's correlation = -0.11, p = 0.03), suggesting the natural spatial variability of these cells is likely a minor contributing factor to EmbAlign's difficulty in resolving their identities.  
</p><p>To enable reliable application in experimental settings lacking ground truth, we integrated a decoupled Random Forest diagnostic layer that provides a single-cell assessment of alignment quality in real time. Because the pipeline maintains a high baseline accuracy, the diagnostic task is imbalanced, requiring the classifier to identify rare misassignment events within a vast majority of correct labels. To rigorously evaluate performance under those conditions, we compared the precision (which penalizes false positives) and recall (which penalizes false negatives) of the model. The diagnostic layer achieved an AUPRC of 0.525, a greater than 10-fold improvement over the naive baseline of 0.05 (<b>Fig. 1E</b>), and successfully identified 78.5% of true assignment errors and 90.2% of true assignment successes (<b>Fig. 1F</b>), demonstrating a robust ability to capture specific geometric and biological features of assignment failures.</p><p>To make these diagnostics accessible to the end user, we developed an interactive HTML alignment report that packages the pipeline's outputs into a comprehensive dashboard. Along with a frame-level confidence estimate, the tool projects the aligned embryo onto an empirical population growth curve to estimate its canonical time, a feature that allows users to flag datasets that are likely to have been captured during transient, error-prone division windows. The dashboard also summarizes the entire search landscape, enabling users to compare alternative alignments across multiple local minima and track their respective optimization convergence traces. Finally, predicted labels and alignment confidence scores are mapped directly onto the embryo's 3D spatial topology, resulting in interactive 3D spatial label assignment and confidence plots that facilitate rapid verification of alignment quality (<b>Fig. 1A, 1G</b>).  
</p><p>While EmbAlign provides a robust framework for identity inference in uncompressed embryos, its current implementation has specific boundaries defined by training data availability and input data quality. The pipeline is validated up to the 190-cell stage, a limit dictated by the increased difficulty of generating ground truth training data for uncompressed embryos using compression-optimized tracking tools, rather than an inherent algorithmic constraint. Furthermore, because the algorithm relies on a strict one-to-one correspondence between observed nuclei and those in the candidate atlas template, EmbAlign is very sensitive to detection errors, requiring careful validation of cell detection in test data prior to alignment. However, the framework's reasonable performance with imperfect reference templates suggests that missing or extra nuclei primarily trigger localized mapping failures rather than catastrophic global misalignment. This behavior underscores the critical utility of the diagnostic layer and interactive alignment report, which projects alignment confidence scores spatially to help users visually flag and isolate these localized, artifact-driven misassignments.</p><p>In summary, EmbAlign provides a generalizable solution for automated cell identity inference in static 3D snapshots of uncompressed <i><a href=\"https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&amp;id=6239\" id=\"eefdb80b-5a83-4f8f-a5d7-d4b3e7de6241\">C. elegans</a></i> embryos. By aligning 3D nuclei centroid coordinates to live-imaging derived spatiotemporal atlases of early <i><a href=\"https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&amp;id=6239\" id=\"21f77647-3d85-4f62-8759-bf1f3698a07c\">C. elegans</a></i> development, EmbAlign achieves &gt;96% alignment accuracy up to the 190-cell stage that generalizes across independently generated datasets. 
Additionally, we observe that performance lapses correspond to transient waves of synchronous cell division, where natural positional variance temporarily disrupts spatial stereotypy. Supported by a predictive diagnostic classifier and interactive alignment reports, EmbAlign provides a complementary tool for transforming raw spatial coordinates into lineage-aware datasets.</p>","references":[{"reference":"Bao Z, Murray JI, Boyle T, Ooi SL, Sandel MJ, Waterston RH. 2006. Automated cell lineage tracing in Caenorhabditis elegans. Proceedings of the National Academy of Sciences. 103: 2707-2712.","pubmedId":"","doi":"10.1073/pnas.0511111103"},{"reference":"Bao Z, Murray JI. 2011. Mounting Caenorhabditis elegans embryos for live imaging of embryogenesis. Cold Spring Harb Protoc. 2011","pubmedId":"","doi":"10.1101/pdb.prot065599"},{"reference":"Bergstrom P, Edlund O. 2014. Robust registration of point sets using iteratively reweighted least. Computational Optimization and Applications. 58: 543-561.","pubmedId":"","doi":"10.1007/s10589-014-9643-2"},{"reference":"Boyle TJ, Bao Z, Murray JI, Araya CL, Waterston RH. 2006. AceTree: a tool for visual analysis of Caenorhabditis elegans. BMC Bioinformatics. 7: 275.","pubmedId":"","doi":"10.1186/1471-2105-7-275"},{"reference":"Breiman L. 2001. Random Forests. Machine Learning. 45: 5-32.","pubmedId":"","doi":"10.1023/A:1010933404324"},{"reference":"Cuturi M. 2013. Sinkhorn Distances: Lightspeed Computation of Optimal Transportation.","pubmedId":"","doi":""},{"reference":"Duerr JS. 2006. Immunohistochemistry. WormBook: 1-61.","pubmedId":"","doi":"10.1895/wormbook.1.105.1"},{"reference":"Edelstein A, Amodaj N, Hoover K, Vale R, Stuurman N. 2010. Computer Control of Microscopes Using µManager. Current Protocols in Molecular Biology. 92: 14.20.1-14.20.17.","pubmedId":"","doi":"10.1002/0471142727.mb1420s92"},{"reference":"Edelstein AD, Tsuchida MA, Amodaj N, Pinkard H, Vale RD, Stuurman N. 2014. 
Advanced methods of microscope control using μManager software. POL Scientific. 1: 1.","pubmedId":"","doi":"10.14440/jbm.2014.36"},{"reference":"Guan G, Li Z, Ma Y, Ye P, Cao J, Wong MK, et al., Zhao Z. 2025. Cell lineage-resolved embryonic morphological map reveals signaling. Nat Commun. 16: 3700.","pubmedId":"","doi":"10.1038/s41467-025-58878-0"},{"reference":"Hadwiger G, Dour S, Arur S, Fox P, Nonet ML. 2010. A Monoclonal Antibody Toolkit for C. elegans. PLOS ONE. 5: e10161.","pubmedId":"","doi":"10.1371/journal.pone.0010161"},{"reference":"Haus E, Santella A, Xu Y, Ren R, Wang D, Bao Z. 2025. A Single-cell Spatiotemporal Manifold of Tissue Morphology and Dynamics. bioRxiv","pubmedId":"","doi":"10.1101/2025.10.22.683950"},{"reference":"Kabsch W. 1976. A solution for the best rotation to relate two sets of vectors. Acta Crystallographica Section A. 32: 922-923.","pubmedId":"","doi":"10.1107/S0567739476001873"},{"reference":"Katzman B, Tang D, Santella A, Bao Z. 2018. AceTree: a major update and case study in the long term maintenance of. BMC Bioinformatics. 19: 121.","pubmedId":"","doi":"10.1186/s12859-018-2127-0"},{"reference":"Kong H, Akakin HC, Sarma SE. 2013. A generalized Laplacian of Gaussian filter for blob detection and its. IEEE Trans Cybern. 43: 1719-1733.","pubmedId":"","doi":"10.1109/TSMCB.2012.2228639"},{"reference":"Kuhn HW. 1955. The Hungarian method for the assignment problem. Naval Research Logistics Quarterly. 2: 83-97.","pubmedId":"","doi":"10.1002/nav.3800020109"},{"reference":"Li X, Zhao Z, Xu W, Fan R, Xiao L, Ma X, Du Z. 2019. Systems Properties and Spatiotemporal Regulation of Cell Position. Cell Rep. 26: 313-321.e7.","pubmedId":"","doi":"10.1016/j.celrep.2018.12.052"},{"reference":"<p>Mahalanobis, P.C. Reprint of:(1936) \"On the Generalised Distance in Statistics.\". <i>Sankhya A</i> <b>80</b> (Suppl 1), 1–7 (2018)</p>","pubmedId":"","doi":"10.1007/s13171-019-00164-5"},{"reference":"Moore JL, Du Z, Bao Z. 2013. 
Systematic quantification of developmental phenotypes at single-cell. Development. 140: 3266-3274.","pubmedId":"","doi":"10.1242/dev.096040"},{"reference":"Ntemos K, Xu F, Bazzi NZ, Fucile G, Maretic HP, Dokmanic I, Mango SE, Sawh AN. 2025. Rapid canalization of chromosome conformation-transcription fingerprints. bioRxiv","pubmedId":"","doi":"10.1101/2025.10.22.684035"},{"reference":"Parker DM, Winkenbach LP, Parker A, Boyson S, Nishimura EO. 2021. Improved Methods for Single-Molecule Fluorescence In Situ Hybridization. Curr Protoc. 1: e299.","pubmedId":"","doi":"10.1002/cpz1.299"},{"reference":"<p>Sakoe, H. and Chiba, S. (1990) Dynamic Programming Algorithm Optimization. Readings in speech recognition p159-165</p>","pubmedId":"","doi":"10.5555/108235.108244"},{"reference":"Santella A, Du Z, Bao Z. 2014. A semi-local neighborhood-based framework for probabilistic cell lineage. BMC Bioinformatics. 15: 217.","pubmedId":"","doi":"10.1186/1471-2105-15-217"},{"reference":"Santella A, Du Z, Nowotschin S, Hadjantonakis AK, Bao Z. 2010. A hybrid blob-slice model for accurate and efficient detection of. BMC Bioinformatics. 11: 580.","pubmedId":"","doi":"10.1186/1471-2105-11-580"},{"reference":"Schnabel R, Bischoff M, Hintze A, Schulz AK, Hejnol A, Meinhardt H, Hutter H. 2006. Global cell sorting in the C. elegans embryo defines a new mechanism for. Dev Biol. 294: 418-431.","pubmedId":"","doi":"10.1016/j.ydbio.2006.03.004"},{"reference":"Schneider CA, Rasband WS, Eliceiri KW. 2012. NIH Image to ImageJ: 25 years of image analysis. Nature Methods. 9: 671-675.","pubmedId":"","doi":"10.1038/nmeth.2089"},{"reference":"<p>Stringer C, Wang T, Michaelos M, Pachitariu M. 2020. Cellpose: a generalist algorithm for cellular segmentation. Nature Methods 18: 100-106.</p>","pubmedId":"","doi":"10.1038/s41592-020-01018-x"},{"reference":"Sulston JE, Schierenberg E, White JG, Thomson JN. 1983. The embryonic cell lineage of the nematode Caenorhabditis elegans. Dev Biol. 
100: 64-119.","pubmedId":"","doi":"10.1016/0012-1606(83)90201-4"},{"reference":"Wu Y, Ghitani A, Christensen R, Santella A, Du Z, Rondeau G, et al., Shroff H. 2011. Inverted selective plane illumination microscopy (iSPIM) enables coupled. Proceedings of the National Academy of Sciences. 108: 17708-17713.","pubmedId":"","doi":"10.1073/pnas.1108494108"},{"reference":"Wu Y, Wawrzusin P, Senseney J, Fischer RS, Christensen R, Santella A, et al., Shroff H. 2013. Spatially isotropic four-dimensional imaging with dual-view plane. Nature Biotechnology. 31: 1032-1038.","pubmedId":"","doi":"10.1038/nbt.2713"}],"title":"<p>Inference of Lineage-Resolved Cell Identities in Uncompressed <i>C. elegans </i>Embryos</p>","reviews":[{"reviewer":{"displayName":"Zhuo Du"},"openAcknowledgement":true,"status":{"submitted":true}}],"curatorReviews":[{"curator":{"displayName":"Gary Craig Schindelman"},"openAcknowledgement":false,"submitted":"1779461223426"}]},{"id":"54895972-a90e-4e67-b469-581558c49b8a","decision":"publish","abstract":"<p>Lineage-based cell identification in <i><a href=\"https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&amp;id=6239\" id=\"fb60fcff-20ec-4f86-9581-49dca66bb720\">C. elegans</a></i> typically requires intensive live imaging and tracking. Recent methods attempt to automate cell identification on the basis of spatiotemporal atlases. We present EmbAlign, an automated 3D registration framework that determines lineage identities from single embryo snapshots. EmbAlign retrieves reference templates from a spatiotemporal atlas and refines assignments using an iterative Sinkhorn alignment procedure, robustly handling positional variability and arbitrary orientations in uncompressed embryos. 
EmbAlign achieves 96.9% accuracy up to the 190-cell stage and includes a diagnostic layer for continuous scoring (AUPRC = 0.546), converting raw spatial data into lineage aware datasets.</p>","acknowledgements":"<p>&nbsp;Strain JIM113 was provided by the CGC, which is funded by NIH Office of Research Infrastructure Programs (P40 OD010440).</p>","authors":[{"affiliations":["University of California Los Angeles"],"departments":["Bioinformatics Interdepartmental Program"],"credit":["investigation","methodology","software","validation","visualization","writing_originalDraft"],"email":"mptran@g.ucla.edu","firstName":"Miles P","lastName":"Tran","submittingAuthor":false,"correspondingAuthor":false,"equalContribution":false,"WBId":null,"orcid":"0009-0005-9330-9297"},{"affiliations":["University of California Los Angeles"],"departments":["Molecular Biology Interdepartmental Program"],"credit":["dataCuration","validation"],"email":"neilpeinado@g.ucla.edu","firstName":"Neil","lastName":"Peinado","submittingAuthor":false,"correspondingAuthor":false,"equalContribution":false,"WBId":null,"orcid":null},{"affiliations":["Hong Kong Baptist University"],"departments":["Department of Biology"],"credit":["dataCuration","writing_reviewEditing"],"email":"zyzhao@hkbu.edu.hk","firstName":"Zhongying","lastName":"Zhao","submittingAuthor":false,"correspondingAuthor":false,"equalContribution":false,"WBId":null,"orcid":null},{"affiliations":["University of California Los Angeles"],"departments":["Molecular, Cell and Developmental Biology"],"credit":["conceptualization","dataCuration","methodology","project","fundingAcquisition","supervision","validation","writing_reviewEditing"],"email":"pavak@ucla.edu","firstName":"Pavak K","lastName":"Shah","submittingAuthor":true,"correspondingAuthor":true,"equalContribution":false,"WBId":null,"orcid":"0000-0002-2603-5995"}],"awards":[{"awardId":"R35GM151199","funderName":"National Institutes of Health (United 
States)","awardRecipient":"PKS"}],"conflictsOfInterest":"<p>The authors declare that there are no conflicts of interest present.</p>","dataTable":{"url":null},"extendedData":[{"description":"<p>v1 release of EmbAlign source code, also indexed at https://doi.org/10.5281/zenodo.20089241 provided under the MIT license.</p>","doi":"10.22002/w3xfh-10x37","resourceType":"Software","name":"EmbAlign.zip","url":"https://portal.micropublication.org/uploads/960450e5612070355778b8f05e56421d.zip"}],"funding":"","image":{"url":"https://portal.micropublication.org/uploads/51bad455a31e292a4e958d5721d2f352.png"},"imageCaption":"A. Overview of the EmbAlign framework. Input Centroids: Raw 3D unoriented input centroids (top) and candidate reference templates retrieved from the spatiotemporal atlas (bottom). Coarse Alignment: Superposition of observed centroids with candidate reference template (top). A discrete rotational sweep around the principal component axis and the resulting cost landscape (bottom). Sinkhorn Refinement: Iterative Sinkhorn refinement mapping the spatial correspondence between observed centroids and the candidate reference (top) along with the resulting cost landscape (bottom). Label Assignment: The final discrete lineage identity assignments (top) and the resulting distribution of per-cell confidence scores (bottom) generated by the diagnostic layer. \nB. Cross-validated, per-frame alignment accuracy (N=11, mean = 96.9%) aligned to a canonical time axis. Rolling mean frame accuracy (solid black) with bootstrapped 95% confidence intervals (shaded).\nC. Per-frame alignment accuracy for two independently generated light sheet microscopy datasets.\nD. Distribution of assignment accuracies across all training embryos for the 30 cell types with the lowest overall mean accuracy. \nE. Cross-validated AUPRC curves (mean AUC = 0.546). Mean interpolated precision (solid blue) and bootstrapped 95% confidence intervals (shaded). Naive baseline (0.05, dashed).  \nF. 
Normalized confusion matrix. Values indicate the proportion of alignments correctly classified within each true class. \nG. Side-by-side comparison of ground truth label assignment accuracy (left) and the predicted confidence score (right).","imageTitle":"<p>EmbAlign enables fully automated cell labeling in C. elegans embryo snapshots</p>","methods":"<p><b>Spatiotemporal Atlas Construction</b> </p><p>Embryos were first aligned to a canonical time axis via dynamic time warping <a href=\"https://paperpile.com/c/eA17rr/yIyn\">(Sakoe &amp; Chiba, 1990)</a>. A continuous spatial reference was constructed by fitting independent 1D Gaussian Process regressors (RBF and White kernels) to the spatiotemporal trajectories of each cell type's 3D coordinates. Total positional variance at time t was calculated by aggregating the GP regression uncertainty with empirical cell-specific variance. </p><p>To define the database of biologically valid cell combinations (“slices”), we used only empirically observed states. Slices were extracted directly from the training frames by aggregating valid cell label combinations grouped by frame cell count (N). Finally, discrete slices were inflated into functional 3D reference frames. For each observed slice, we calculated the overlapping temporal existence window (the canonical time window within which all cells in the slice simultaneously exist) and queried the continuous GP atlas at the window's midpoint (t<sub>med</sub>) to retrieve the expected 3D coordinates and covariance matrices required for alignment. 
To provide a temporal baseline for downstream diagnostic validation, an empirical growth curve was also constructed by calculating the mean and standard deviation of total cell counts across canonical time bins in the training data.</p><p><b>EmbAlign 3D Alignment and Label Transfer</b></p><p>Observed 3D nuclei centroids were first centered and scaled by their median pairwise distance. For each candidate atlas slice matching the observed cell count N, we aligned both the positive and negative primary principal component (±<a>PC1</a>) axes of the observation to the reference <a>PC1</a> axis to account for PCA sign ambiguity. We then rotated the observation around the <a>PC1</a> axis in discrete angular steps, scoring each orientation based on the Sinkhorn-weighted squared Euclidean distance between observed and reference centroids. To escape geometric symmetry traps, we identified the top k unique angular valleys (separated by ≥30°) to seed the refinement phase. </p><p>Each of the k initializations entered an Iterative Closest Point (ICP)-like soft refinement loop <a href=\"https://paperpile.com/c/eA17rr/2OtJ\">(Bergström &amp; Edlund, 2014)</a>. To more robustly handle biological noise, we replaced the standard hard-assignment matching of the ICP algorithm with optimal transport. We computed a soft correspondence probability matrix using Sinkhorn entropic regularization. The rigid transformation (rotation and translation) was then iteratively updated using a weighted Kabsch algorithm <a href=\"https://paperpile.com/c/eA17rr/nHpW\">(Kabsch, 1976)</a>, where the influence of each cell pairing was governed by its Sinkhorn probability. </p><p>Alignments were scored by calculating the Mahalanobis distance between the aligned observations and the atlas 3D Gaussian distributions. The slice and orientation minimizing this global cost were selected as the winner. 
Final discrete cell identities were assigned by executing a linear sum assignment algorithm <a href=\"https://paperpile.com/c/eA17rr/wnwg\">(Kuhn, 1955)</a> directly on the winning Mahalanobis distance matrix.</p><p><b>Confidence Scoring</b></p><p>A Random Forest <a href=\"https://paperpile.com/c/eA17rr/tppS\">(Breiman, 2001)</a> classifier (n_estimators = 200, max_depth = 10) was trained to predict cell-level assignment correctness using geometric and biological alignment features. Input variables included Sinkhorn assignment entropy, Mahalanobis distance, frame cell count, frame inferred time, and a normalized division delta (t<sub>med</sub> - t<sub>birth</sub>)/(t<sub>division</sub> - t<sub>birth</sub>) representing cell life progress. The model outputs continuous cell-level probabilities, which are also averaged to generate an aggregate frame-level confidence score. </p><p>Pipeline accuracy and diagnostic accuracy were evaluated using a Leave-One-Out cross-validation strategy. For each fold, the continuous spatial GP atlas, slice atlas, and diagnostic classifier were fitted entirely on the training embryos prior to evaluating the withheld test embryo.</p><p><b>QC Report Generation</b></p><p>To facilitate rapid visual assessment of alignment quality, the pipeline generates an interactive HTML diagnostic dashboard. To trace the optimization landscape, the alignment engine records the Sinkhorn-weighted sum of squared Euclidean distances at each discrete step of the coarse angular sweep. Furthermore, for the top k angular initializations selected for the refinement tournament, this same soft-weighted cost is sequentially tracked across all Sinkhorn refinement loops. These convergence traces are packaged alongside the empirical population growth curve, which plots the unannotated embryo's observed cell count and t<sub>med</sub> against the 95% confidence intervals of the training population. 
</p><p><b>Data Acquisition </b></p><p>For validation, <a href=\"http://www.wormbase.org/db/get?name=WBStrain00022462;class=Strain\" id=\"12769628-6fd9-47c3-a863-a04f9fc2c107\">JIM113</a> embryos were isolated from gravid hermaphrodites and mounted on a coverslip using a diSPIM sample chamber. Images were acquired with 0.75 µm z-spacing on an ASI diSPIM <a href=\"https://paperpile.com/c/eA17rr/yCU1\">(Wu et al., 2013)</a> run in single-view acquisition mode, controlled by the diSPIM control plugin in Micro-Manager <a href=\"https://paperpile.com/c/eA17rr/8UCO+aiBr\">(A. Edelstein et al., 2010; A. D. Edelstein et al., 2014)</a>. Images were cropped in ImageJ <a href=\"https://paperpile.com/c/eA17rr/8Vct\">(Schneider et al., 2012)</a> and processed using StarryNite. Lineage tracing results from StarryNite were then manually curated using AceTree.</p><p><b>Data and Code Availability</b></p><p>The EmbAlign source code is available as a public repository at https://github.com/shahlab-ucla/EmbAlign. This repository also contains the executable scripts, configuration files, and pretrained models required to reproduce the analyses and figures presented in this study. All processed datasets used for model training and out-of-sample validation are hosted within the repository and permanently archived at https://doi.org/10.5281/zenodo.20089241. The repository also contains core usage vignettes outlining execution of the EmbAlign pipeline for lineage inference from unlabeled 3D nuclei coordinates, as well as instructions for fitting the spatiotemporal atlases on raw point cloud data.</p>","reagents":"<p></p>","patternDescription":"<p>During the mapping of the <i><a href=\"https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&amp;id=6239\" id=\"71c01161-cbe9-4059-832b-1f2606c5c85d\">C. elegans</a> </i>embryonic lineage <a href=\"https://paperpile.com/c/eA17rr/YgjH\">(Sulston et al., 1983)</a>, the spatial stereotypy of the embryo was noted. 
More recent automated tracking-based lineage tracing <a href=\"https://paperpile.com/c/eA17rr/Zsa7+8q2s+L00y\">(Bao et al., 2006; Santella et al., 2010, 2014)</a> enabled quantitative assessment of the consistency of cell positioning within the embryo <a href=\"https://paperpile.com/c/eA17rr/vOpc+SI9r+n1tz\">(Li et al., 2019; Moore et al., 2013; Schnabel et al., 2006)</a>, a feature that has led to recent advances <a href=\"https://paperpile.com/c/eA17rr/HA8b+kTnY\">(Haus et al., 2025; Ntemos et al., 2025)</a> in using spatial atlases to automate cell identity determination from static snapshots of <i><a href=\"https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&amp;id=6239\" id=\"1fb7a9f7-6522-47a6-9a10-ea5d211738d4\">C. elegans</a> </i>embryos using timelapse recordings of embryos imaged under gentle compression. Compression has been extensively used in prior live imaging experiments <a href=\"https://paperpile.com/c/eA17rr/YgjH\">(Sulston et al., 1983)</a> as it forces embryos into stereotypical orientations and limits their axial extent, allowing for 3D coverage with fewer focal planes in confocal imaging <a href=\"https://paperpile.com/c/eA17rr/EDNV\">(Bao &amp; Murray, 2011)</a>. While current approaches achieve high accuracy in the inference of lineage identity from static snapshots of cell position, both of these tools were trained using compressed data. Because this mechanical compression physically flattens the Z-axis, it alters the relative spatial topology of nuclei. Consequently, tools trained on these compressed datasets may struggle to map the 3D geometry of uncompressed embryos, which are free to rotate in space. 
We built EmbAlign to fill this gap by enabling robust lineage identity determination using cell position in uncompressed embryos, such as would be generated by many smFISH <a href=\"https://paperpile.com/c/eA17rr/Ypca\">(Parker et al., 2021)</a> and immunofluorescence <a href=\"https://paperpile.com/c/eA17rr/Ypca+vnCY\">(Duerr, 2006; Parker et al., 2021)</a> sample preparation approaches, and by snapshots from uncompressed live imaging, for example by lightsheet microscopy <a href=\"https://paperpile.com/c/eA17rr/KPdD\">(Wu et al., 2011)</a>. In the case of assays like smFISH or immunofluorescence, embryos can be fixed in suspension, preserving their native 3D architecture while inherently precluding any live imaging. EmbAlign provides the geometric inference to map molecular profiles to lineage identities in these preparations.</p><p>EmbAlign automates the transformation of 3D nuclei centroids from uncompressed <a href=\"https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&amp;id=6239\" id=\"190812f1-2718-4ba2-9d84-81fd41d4b676\">C. elegans</a> embryos onto their canonical lineage identities (<b>Fig. 1A</b>). While users can generate these coordinates using various 3D segmentation strategies, ranging from classical Laplacian of Gaussian blob detection <a href=\"https://paperpile.com/c/eA17rr/X8o2\">(Kong et al., 2013)</a> to deep-learning models like Cellpose <a href=\"https://paperpile.com/c/eA17rr/xqSl\">(Stringer et al., 2021)</a>, robust detection is critical. Before executing EmbAlign, we recommend visually verifying segmented centroids against raw fluorescence images as a standard practice. Once these validated 3D centroids are obtained, they are centered and scaled to standardize physical size variations while preserving relative spatial topology. Using the total observed cell count, the algorithm selects from reference templates derived from empirically observed lineage identity configurations. 
To resolve arbitrary orientations, the algorithm aligns the observed centroids' primary principal component (±<a>PC1</a>) axes to the reference and executes a discrete rotational sweep to identify the top k unique angular orientations. Each initialization is then refined via an iterative Sinkhorn alignment procedure. This stage utilizes entropically regularized optimal transport to compute a soft correspondence probability matrix and iteratively updates the best fit rigid transformation. Following refinement, EmbAlign resolves a final, discrete mapping of identities to centroids by minimizing the Sinkhorn-weighted Mahalanobis distance, enforcing the one-to-one correspondence required by the embryo's invariant body plan. Finally, a Random Forest diagnostic layer evaluates the assignment, utilizing geometric and biological features to output a continuous confidence score for every prediction. </p><p>EmbAlign's performance was evaluated using a Leave-One-Out Cross-Validation (LOOCV) strategy across a dataset of uncompressed <a href=\"https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&amp;id=6239\" id=\"24b6d164-1164-4087-856c-06497c3ae47e\">C. elegans</a> embryos from two different labs acquired using distinct imaging modalities (embryo n = 11, frame n = 1,215), where ground truth cell identities were established using the AceTree <a href=\"https://paperpile.com/c/eA17rr/wnW2+hC5r\">(Boyle et al., 2006; Katzman et al., 2018)</a>/StarryNite <a href=\"https://paperpile.com/c/eA17rr/Zsa7+L00y+8q2s\">(Bao et al., 2006; Santella et al., 2010, 2014)</a> lineage tracking pipeline. Position data for uncompressed embryos were sourced from previously published lightsheet microscopy lineage reconstructions. The algorithm achieved a high average frame accuracy of 96.6% from the 6-cell up to the 190-cell stage (<b>Fig. 1B</b>). Despite this overall robustness, performance exhibited predictable transient declines corresponding to waves of synchronous cell division. 
Nuclei captured within or immediately adjacent to these division windows are inherently more difficult to classify, as rapid physical displacement during cytokinesis briefly deforms the expected spatial topology and maximizes the distance of these cells from their canonical atlas positions. Finding an exact reference template match is inherently challenging during these rapid waves of cell division. Remarkably, EmbAlign maintains an 86.6% assignment accuracy even when aligning against an imperfect reference template (n = 270), compared to a near-perfect 99.4% accuracy when at least one exact reference template match is available (n = 945). This suggests that the EmbAlign framework successfully buffers against missing reference templates, and that increasing the training dataset size to more robustly cover gaps in the current reference atlas should further improve overall performance.</p><p>To verify that the pipeline does not overfit to the spatial properties of the training data, we evaluated EmbAlign on two fully independent, out-of-sample (OOS) datasets acquired via single-view selective plane illumination microscopy (ASI diSPIM operated in single-view acquisition mode). These OOS embryos were annotated with ground truth labels using the AceTree/StarryNite pipeline up to the 100-cell stage. The algorithm successfully mapped these embryos, maintaining comparable high-fidelity frame accuracies (96.3% and 96.1%) without any dataset-specific recalibration (<b>Fig. 1C</b>). This OOS performance demonstrates robust generalizability, confirming that EmbAlign effectively captures the invariant spatial dynamics of <i><a href=\"https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&amp;id=6239\" id=\"fa53dbde-6be0-4695-82f7-a2c04fc80ee1\">C. elegans</a></i> embryogenesis rather than batch-specific imaging artifacts. </p><p>To investigate cell type specific drops in performance, we examined the 30 lowest-performing cell types. 
We observed that these challenging types primarily consist of later-stage cells that emerge toward the end of our atlas window and correspond to a wave of cell divisions. Additionally, we correlated our global cell-type accuracies with empirical positional variance measurements from prior studies <a href=\"https://paperpile.com/c/eA17rr/SI9r+TBy8\">(Guan et al., 2025; Li et al., 2019)</a> in uncompressed embryos. We found a small but significant negative correlation (Spearman's correlation coefficient = -0.11, p = 0.03), suggesting the natural spatial variability of these cells is likely a minor contributing factor to EmbAlign's difficulty in resolving their identities. </p><p>To enable reliable application in experimental settings lacking ground truth, we integrated a decoupled Random Forest diagnostic layer that provides a single-cell assessment of alignment quality in real time. Because the pipeline maintains a high baseline accuracy, the diagnostic task is imbalanced, requiring the classifier to identify rare misassignment events within a vast majority of correct labels. To rigorously evaluate performance under those conditions, we compared the precision (a measure of the false positive rate) and recall (a measure of the false negative rate) of the model. The diagnostic layer achieved an AUPRC of 0.525, a greater than 10-fold improvement over the naive baseline of 0.05 (<b>Fig. 1E</b>), and successfully identified 78.5% of true assignment errors and 90.2% of true assignment successes (<b>Fig. 1F</b>), demonstrating a robust ability to capture specific geometric and biological features of assignment failures.</p><p>To make these diagnostics accessible to the end user, we developed an interactive HTML alignment report that packages the pipeline's outputs into a comprehensive dashboard. 
Along with a frame-level confidence estimate, the tool projects the aligned embryo onto an empirical population growth curve to estimate its canonical time, a feature that allows users to flag datasets that are likely to have been captured during transient, error-prone division windows. The dashboard also summarizes the entire search landscape, enabling users to compare alternative alignments across multiple local minima and track their respective optimization convergence traces. Finally, predicted labels and alignment confidence scores are mapped directly onto the embryo's 3D spatial topology, resulting in interactive 3D spatial label assignment and confidence plots that facilitate rapid verification of alignment quality (<b>Fig. 1A, 1G</b>). </p><p>While EmbAlign provides a robust framework for identity inference in uncompressed embryos, its current implementation has specific boundaries defined by training data availability and input data quality. The pipeline is validated up to the 190-cell stage, a limit dictated by the increased difficulty of generating ground truth training data for uncompressed embryos using compression-optimized tracking tools rather than by an inherent algorithmic constraint. Furthermore, because the algorithm relies on a strict one-to-one correspondence between observed nuclei and those in the candidate atlas template, EmbAlign is very sensitive to detection errors, requiring careful validation of cell detection in test data prior to alignment. However, the framework's reasonable performance on imperfect reference templates suggests that missing or extra nuclei primarily trigger localized mapping failures rather than catastrophic global misalignment. 
This behavior underscores the critical utility of the diagnostic layer and interactive alignment report, which projects alignment confidence scores spatially to help users visually flag and isolate these localized, artifact-driven misassignments.</p><p>In summary, EmbAlign provides a generalizable solution for automated cell identity inference in static 3D snapshots of uncompressed <i><a href=\"https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&amp;id=6239\" id=\"eefdb80b-5a83-4f8f-a5d7-d4b3e7de6241\">C. elegans</a></i> embryos. By aligning 3D nuclei centroid coordinates to live-imaging derived spatiotemporal atlases of early <i><a href=\"https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&amp;id=6239\" id=\"21f77647-3d85-4f62-8759-bf1f3698a07c\">C. elegans</a></i> development, EmbAlign achieves &gt;96% alignment accuracy up to the 190-cell stage, and this accuracy generalizes across independently generated datasets. Additionally, we observe that performance lapses correspond to transient waves of synchronous cell division, where natural positional variance temporarily disrupts spatial stereotypy. Supported by a predictive diagnostic classifier and interactive alignment reports, EmbAlign provides a complementary tool for transforming raw spatial coordinates into lineage-aware datasets.</p>","references":[{"reference":"Bao Z, Murray JI, Boyle T, Ooi SL, Sandel MJ, Waterston RH. 2006. Automated cell lineage tracing in Caenorhabditis elegans. Proceedings of the National Academy of Sciences. 103: 2707-2712.","pubmedId":"","doi":"10.1073/pnas.0511111103"},{"reference":"Bao Z, Murray JI. 2011. Mounting Caenorhabditis elegans embryos for live imaging of embryogenesis. Cold Spring Harb Protoc. 2011","pubmedId":"","doi":"10.1101/pdb.prot065599"},{"reference":"Bergstrom P, Edlund O. 2014. Robust registration of point sets using iteratively reweighted least. Computational Optimization and Applications. 
58: 543-561.","pubmedId":"","doi":"10.1007/s10589-014-9643-2"},{"reference":"Boyle TJ, Bao Z, Murray JI, Araya CL, Waterston RH. 2006. AceTree: a tool for visual analysis of Caenorhabditis elegans. BMC Bioinformatics. 7: 275.","pubmedId":"","doi":"10.1186/1471-2105-7-275"},{"reference":"Breiman L. 2001. Random Forests. Machine Learning. 45: 5-32.","pubmedId":"","doi":"10.1023/A:1010933404324"},{"reference":"Cuturi M. 2013. Sinkhorn Distances: Lightspeed Computation of Optimal Transportation.","pubmedId":"","doi":""},{"reference":"Duerr JS. 2006. Immunohistochemistry. WormBook: 1-61.","pubmedId":"","doi":"10.1895/wormbook.1.105.1"},{"reference":"Edelstein A, Amodaj N, Hoover K, Vale R, Stuurman N. 2010. Computer Control of Microscopes Using µManager. Current Protocols in Molecular Biology. 92: 14.20.1-14.20.17.","pubmedId":"","doi":"10.1002/0471142727.mb1420s92"},{"reference":"Edelstein AD, Tsuchida MA, Amodaj N, Pinkard H, Vale RD, Stuurman N. 2014. Advanced methods of microscope control using μManager software. POL Scientific. 1: 1.","pubmedId":"","doi":"10.14440/jbm.2014.36"},{"reference":"Guan G, Li Z, Ma Y, Ye P, Cao J, Wong MK, et al., Zhao Z. 2025. Cell lineage-resolved embryonic morphological map reveals signaling. Nat Commun. 16: 3700.","pubmedId":"","doi":"10.1038/s41467-025-58878-0"},{"reference":"Hadwiger G, Dour S, Arur S, Fox P, Nonet ML. 2010. A Monoclonal Antibody Toolkit for C. elegans. PLOS ONE. 5: e10161.","pubmedId":"","doi":"10.1371/journal.pone.0010161"},{"reference":"Haus E, Santella A, Xu Y, Ren R, Wang D, Bao Z. 2025. A Single-cell Spatiotemporal Manifold of Tissue Morphology and Dynamics. bioRxiv","pubmedId":"","doi":"10.1101/2025.10.22.683950"},{"reference":"Kabsch W. 1976. A solution for the best rotation to relate two sets of vectors. Acta Crystallographica Section A. 32: 922-923.","pubmedId":"","doi":"10.1107/S0567739476001873"},{"reference":"Katzman B, Tang D, Santella A, Bao Z. 2018. 
AceTree: a major update and case study in the long term maintenance of. BMC Bioinformatics. 19: 121.","pubmedId":"","doi":"10.1186/s12859-018-2127-0"},{"reference":"Kong H, Akakin HC, Sarma SE. 2013. A generalized Laplacian of Gaussian filter for blob detection and its. IEEE Trans Cybern. 43: 1719-1733.","pubmedId":"","doi":"10.1109/TSMCB.2012.2228639"},{"reference":"Kuhn HW. 1955. The Hungarian method for the assignment problem. Naval Research Logistics Quarterly. 2: 83-97.","pubmedId":"","doi":"10.1002/nav.3800020109"},{"reference":"Li X, Zhao Z, Xu W, Fan R, Xiao L, Ma X, Du Z. 2019. Systems Properties and Spatiotemporal Regulation of Cell Position. Cell Rep. 26: 313-321.e7.","pubmedId":"","doi":"10.1016/j.celrep.2018.12.052"},{"reference":"<p>Mahalanobis, P.C. Reprint of:(1936) \"On the Generalised Distance in Statistics.\". <i>Sankhya A</i> <b>80</b> (Suppl 1), 1–7 (2018)</p>","pubmedId":"","doi":"10.1007/s13171-019-00164-5"},{"reference":"Moore JL, Du Z, Bao Z. 2013. Systematic quantification of developmental phenotypes at single-cell. Development. 140: 3266-3274.","pubmedId":"","doi":"10.1242/dev.096040"},{"reference":"Ntemos K, Xu F, Bazzi NZ, Fucile G, Maretic HP, Dokmanic I, Mango SE, Sawh AN. 2025. Rapid canalization of chromosome conformation-transcription fingerprints. bioRxiv","pubmedId":"","doi":"10.1101/2025.10.22.684035"},{"reference":"Parker DM, Winkenbach LP, Parker A, Boyson S, Nishimura EO. 2021. Improved Methods for Single-Molecule Fluorescence In Situ Hybridization. Curr Protoc. 1: e299.","pubmedId":"","doi":"10.1002/cpz1.299"},{"reference":"<p>Sakoe, H. and Chiba, S. (1990) Dynamic Programming Algorithm Optimization. Readings in speech recognition p159-165</p>","pubmedId":"","doi":"10.5555/108235.108244"},{"reference":"Santella A, Du Z, Bao Z. 2014. A semi-local neighborhood-based framework for probabilistic cell lineage. BMC Bioinformatics. 
15: 217.","pubmedId":"","doi":"10.1186/1471-2105-15-217"},{"reference":"Santella A, Du Z, Nowotschin S, Hadjantonakis AK, Bao Z. 2010. A hybrid blob-slice model for accurate and efficient detection of. BMC Bioinformatics. 11: 580.","pubmedId":"","doi":"10.1186/1471-2105-11-580"},{"reference":"Schnabel R, Bischoff M, Hintze A, Schulz AK, Hejnol A, Meinhardt H, Hutter H. 2006. Global cell sorting in the C. elegans embryo defines a new mechanism for. Dev Biol. 294: 418-431.","pubmedId":"","doi":"10.1016/j.ydbio.2006.03.004"},{"reference":"Schneider CA, Rasband WS, Eliceiri KW. 2012. NIH Image to ImageJ: 25 years of image analysis. Nature Methods. 9: 671-675.","pubmedId":"","doi":"10.1038/nmeth.2089"},{"reference":"<p>Stringer C, Wang T, Michaelos M, Pachitariu M. 2020. Cellpose: a generalist algorithm for cellular segmentation. Nature Methods 18: 100-106.</p>","pubmedId":"","doi":"10.1038/s41592-020-01018-x"},{"reference":"Sulston JE, Schierenberg E, White JG, Thomson JN. 1983. The embryonic cell lineage of the nematode Caenorhabditis elegans. Dev Biol. 100: 64-119.","pubmedId":"","doi":"10.1016/0012-1606(83)90201-4"},{"reference":"Wu Y, Ghitani A, Christensen R, Santella A, Du Z, Rondeau G, et al., Shroff H. 2011. Inverted selective plane illumination microscopy (iSPIM) enables coupled. Proceedings of the National Academy of Sciences. 108: 17708-17713.","pubmedId":"","doi":"10.1073/pnas.1108494108"},{"reference":"Wu Y, Wawrzusin P, Senseney J, Fischer RS, Christensen R, Santella A, et al., Shroff H. 2013. Spatially isotropic four-dimensional imaging with dual-view plane. Nature Biotechnology. 31: 1032-1038.","pubmedId":"","doi":"10.1038/nbt.2713"}],"title":"<p>Inference of Lineage-Resolved Cell Identities in Uncompressed <i>C. 
elegans </i>Embryos</p>","reviews":[],"curatorReviews":[{"curator":{"displayName":"Gary Craig Schindelman"},"openAcknowledgement":false,"submitted":null}]}]}},"species":{"species":[{"value":"acer saccharum","label":"Acer saccharum","imageSrc":"","imageAlt":"","mod":"TreeGenes","modLink":"https://treegenesdb.org","linkVariable":""},{"value":"achillea millefolium","label":"Achillea millefolium","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"acinetobacter baylyi","label":"Acinetobacter baylyi","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"actinobacteria bacterium","label":"Actinobacteria bacterium","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"adelges tsugae","label":"Adelges tsugae","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"adenocaulon chilense","label":"Adenocaulon chilense","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"aedes japonicus","label":"Aedes japonicus","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"aegorhinus vitulus","label":"Aegorhinus vitulus","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"alaimidae","label":"Alaimidae","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"allobates femoralis","label":"Allobates femoralis","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"alnus glutinosa","label":"Alnus glutinosa","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"alosa aestivalis","label":"Alosa aestivalis","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"alosa pseudoharengus","label":"Alosa pseudoharengus","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"alternaria alternata","label":"Alternaria 
alternata","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"amynthas agrestis","label":"Amynthas Agrestis","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"ancylostoma caninum","label":"Ancylostoma caninum","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"ancylostoma ceylanicum","label":"Ancylostoma ceylanicum","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"anemone multifida","label":"Anemone multifida","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"anguilla rostrata","label":"Anguilla rostrata","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"anisakis simplex","label":"Anisakis simplex","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"anomala albopilosa","label":"Anomala albopilosa","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"anthomyiidae sp","label":"Anthomyiidae sp","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"anthomyiidae sp","label":"Anthomyiidae sp","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"arabidopsis","label":"Arabidopsis","imageSrc":"arabidopsis.png","imageAlt":"Arabidopsis graphic by Zoe Zorn CC BY 4.0","mod":"TAIR","modLink":"https://arabidopsis.org","linkVariable":""},{"value":"architeuthis dux","label":"Architeuthis dux","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"arion vulgaris","label":"Arion vulgaris","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"armeria","label":"Armeria","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"artemia","label":"Artemia","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"arthrobacter sp.","label":"Arthrobacter 
sp.","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"ascaridia","label":"Ascaridia","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"ascaridia galli","label":"Ascaridia galli","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"asparagopsis taxiformis","label":"Asparagopsis taxiformis","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"astatotilapia burtoni","label":"Astatotilapia burtoni","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"avena sativa","label":"Avena sativa","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"aves","label":"Aves","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"bacillus","label":"Bacillus (firmicutes)","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"bacillus cereus","label":"Bacillus cereus","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"bacillus mycoides","label":"Bacillus mycoides","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"bacillus subtilis","label":"Bacillus subtilis","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"bacillus thuringiensis","label":"Bacillus thuringiensis","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"bacillus toyonensis","label":"Bacillus toyonensis","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"bacillus wiedmannii","label":"Bacillus wiedmannii","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"bacteria","label":"Bacteria","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"bacteriophage","label":"Bacteriophage","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"bactrocera","label":"Bactrocera 
sp.","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"batrachospermum gelatinosum","label":"Batrachospermum gelatinosum","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"betula lenta","label":"Betula lenta","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"betula nigra","label":"Betula nigra","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"bombus dahlbohmii","label":"Bombus dahlbohmii","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"bombus terrestris","label":"Bombus terrestris","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"bombyx mori","label":"Bombyx mori","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"bos taurus","label":"Bos Taurus","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"brachygobius doriae","label":"Brachygobius doriae","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"brassica oleracea","label":"Brassica oleracea","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"brassica rapa","label":"Brassica rapa","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"brugia malayi","label":"Brugia malayi","imageSrc":"","imageAlt":"","mod":"WormBase","modLink":"www.wormbase.org","linkVariable":""},{"value":"burkholderia thailandensis","label":"Burkholderia thailandensis","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"buttiauxella","label":"Buttiauxella","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"caenorhabditis brenneri","label":"Caenorhabditis brenneri","imageSrc":"","imageAlt":"","mod":"WormBase","modLink":"www.wormbase.org","linkVariable":""},{"value":"caenorhabditis briggsae","label":"Caenorhabditis 
briggsae","imageSrc":"","imageAlt":"","mod":"WormBase","modLink":"www.wormbase.org","linkVariable":""},{"value":"c. elegans","label":"Caenorhabditis elegans","imageSrc":"c-elegans.jpg","imageAlt":"C. elegans graphic by Zoe Zorn CC BY 4.0","mod":"WormBase","modLink":"https://wormbase.org","linkVariable":""},{"value":"caenorhabditis inopinata","label":"Caenorhabditis inopinata","imageSrc":"","imageAlt":"","mod":"WormBase","modLink":"www.wormbase.org","linkVariable":""},{"value":"caenorhabditis japonica","label":"Caenorhabditis japonica","imageSrc":"","imageAlt":"","mod":"WormBase","modLink":"www.wormbase.org","linkVariable":""},{"value":"caenorhabditis nigoni","label":"Caenorhabditis nigoni","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"caenorhabditis remanei","label":"Caenorhabditis remanei","imageSrc":"","imageAlt":"","mod":"WormBase","modLink":"www.wormbase.org","linkVariable":""},{"value":"caenorhabditis tropicalis","label":"Caenorhabditis tropicalis","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"calidifontibacillus","label":"Calidifontibacillus","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"calidifontibacillus erzuremensis","label":"Calidifontibacillus erzuremensis","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"calliphora sp","label":"Calliphora sp","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"caltha sagittata","label":"Caltha sagittata","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"cambarus latimanus","label":"Cambarus latimanus","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"candida albicans","label":"Candida albicans","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"canis familiaris","label":"Canis familiaris","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"cannabis 
sativa","label":"Cannabis sativa","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"caretta caretta","label":"Caretta caretta","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"cassiopea xamachana","label":"Cassiopea xamachana","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"caulobacter vibrioides","label":"Caulobacter vibrioides","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"cephalopods","label":"Cephalopoda","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"cerastium arvense","label":"Cerastium arvense","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"ceriodaphnia","label":"Ceriodaphnia","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"ceroglossus suturalis","label":"Ceroglossus suturalis","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"chaetoceros","label":"Chaetoceros","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"chamaecrista fasciculata","label":"Chamaecrista fasciculata","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"chilicola chalcidiformis","label":"Chilicola chalcidiformis","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"chitinimonas","label":"Chitinimonas","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"chlamydomonas reinhardtii","label":"Chlamydomonas reinhardtii","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"chromobacterium","label":"Chromobacterium","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"chrysemys picta","label":"Chrysemys picta","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"chrysoperla rufilabris","label":"Chrysoperla 
rufilabris","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"citrus","label":"Citrus","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"clavibacter sp.","label":"Clavibacter sp.","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"colinus virginianus","label":"Colinus virginianus","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"crassostrea virginica","label":"Crassostrea virginica","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"crithidia fasciculata","label":"Crithidia fasciculata","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"cutibacterium acnes","label":"Cutibacterium acnes","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"cyanobacteria","label":"Cyanobacteria","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"daphnia","label":"Daphnia","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"daphnia pulex","label":"Daphnia pulex","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"diabrotica virgifera","label":"Diabrotica virgifera","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"diabrotica virgifera virgifera virus 1","label":"Diabrotica virgifera virgifera virus 1","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"d. discoideum","label":"Dictyostelium discoideum","imageSrc":"dicty.png","imageAlt":"D. 
discoideum","mod":"dictyBase","modLink":"http://dictybase.org","linkVariable":""},{"value":"diptera","label":"Diptera","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"dotocryptus bellicosus","label":"Dotocryptus bellicosus","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"drechmeria coniospora","label":"Drechmeria coniospora","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"drosophila","label":"Drosophila","imageSrc":"drosophila.png","imageAlt":"Drosophila graphic by Zoe Zorn CC BY 4.0","mod":"FlyBase","modLink":"https://flybase.org/doi/","linkVariable":"doi"},{"value":"dryopteris campyloptera","label":"Dryopteris campyloptera","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"dryopteris expansa","label":"Dryopteris expansa","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"dryopteris intermedia","label":"Dryopteris intermedia","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"dugesia dorotocephala","label":"Dugesia dorotocephala","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"elasmobranchii","label":"Elasmobranchii","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"embryophyta","label":"Embryophyta","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"enoploteuthis chunii","label":"Enoploteuthis chunii","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"enterobacter aerogenes","label":"Enterobacter aerogenes","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"enterococcus raffinosus","label":"Enterococcus raffinosus","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"epichloë coenophiala","label":"Epichloë coenophiala","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"equus 
caballus","label":"Equus caballus","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"erigeron sp","label":"Erigeron sp","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"eristalis","label":"Eristalis","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"eruca vesicaria","label":"Eruca vesicaria","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"erwinia carotovora","label":"Erwinia carotovora","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"erythronium americanum","label":"Erythronium americanum","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"escherichia coli","label":"Escherichia coli","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"eukaryota","label":"Eukaryotes","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"felis catus","label":"Felis catus","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"francisella novicida","label":"Francisella novicida","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"francisella tularensis","label":"Francisella tularensis","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"fraxinus americana","label":"Fraxinus americana","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"fucus distichus","label":"Fucus distichus","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"fungi","label":"Fungi","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"gasteropelecus sp.","label":"Gasteropelecus sp.","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"geranium sp","label":"Geranium 
sp","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"girardia","label":"Girardia","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"glaucomys volans","label":"Glaucomys volans","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"glycine max","label":"Glycine max","imageSrc":"","imageAlt":"","mod":"Soybase","modLink":"https://soybase.org","linkVariable":""},{"value":"glyptemys insculpta","label":"Glyptemys insculpta","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"gossypium hirsutum","label":"Gossypium hirsutum","imageSrc":"","imageAlt":"","mod":"CottonGen","modLink":"https://www.cottongen.org/","linkVariable":""},{"value":"gromphadorhina portentosa","label":"Gromphadorhina portentosa","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"gryllodes sigillatus","label":"Gryllodes sigillatus","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"haliotis rufescens","label":"Haliotis rufescens","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"hepacivirus hominis","label":"Hepatitis C Virus","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"herpes simplex virus type 1","label":"Herpes simplex virus type 1","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"human","label":"Human","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"human coronavirus oc43","label":"Human coronavirus OC43","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"hydra vulgaris","label":"Hydra vulgaris","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"hydropsyche sp","label":"Hydropsyche sp","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"hymenoptera","label":"Hymenoptera","imageSrc":"","imageAlt":"","mod":"Hymenoptera Genome 
Database","modLink":"https://hymenoptera.elsiklab.missouri.edu/","linkVariable":""},{"value":"hypochaeris radicata","label":"Hypochaeris radicata","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"hypodynerus vespiformis","label":"Hypodynerus vespiformis","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"iflaviridae","label":"Iflaviridae","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"iflavuris","label":"Iflavirus","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"ipomoea hederacea","label":"Ipomoea hederacea","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"ischnomera","label":"Ischnomera","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"ischnomera ruficollis","label":"Ischnomera ruficollis","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"julidochromis marlieri","label":"Julidochromis marlieri","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"juniperus virginiana","label":"Juniperus virginiana","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"kluyveromyces marxianus","label":"Kluyveromyces marxianus","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"l. casei","label":"L. 
casei","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"lacticaseibacillus casei","label":"Lacticaseibacillus casei","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"larentiinae sp","label":"Larentiinae sp","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"laurus nobilis","label":"Laurus nobilis","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"lepidoptera","label":"Lepidoptera","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"leucanthemum vulgare","label":"Leucanthemum vulgare","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"linepithema humile","label":"Linepithema humile","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"liometopum occidentale","label":"Liometopum occidentale","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"lolium arundinaceum","label":"Lolium arundinaceum","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"lumbriculus variegatus","label":"Lumbriculus variegatus","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"lumbricus terrestris","label":"Lumbricus terrestris","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"lupinus polyphyllus","label":"Lupinus polyphyllus","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"lycorma delicatula","label":"Lycorma delicatula","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"lynx rufus","label":"Lynx rufus","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"magnaporthe oryzae","label":"Magnaporthe oryzae","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"mammalia","label":"Mammalia","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"manihot 
esculenta","label":"Manihot esculenta","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"medicago lupulina","label":"Medicago lupulina","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"meloidogyne","label":"Meloidogyne","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"mimus polyglottos","label":"Mimus polyglottos","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"bryophyta","label":"Mosses","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"mouse","label":"Mouse","imageSrc":"","imageAlt":"","mod":"MGI","modLink":"https://informatics.jax.org","linkVariable":""},{"value":"m. minutoides","label":"Mus minutoides","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"mycobacterium smegmatis","label":"Mycobacterium smegmatis","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"nakaseomyces glabratus","label":"Nakaseomyces glabratus","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"nauphoeta cinerea","label":"Nauphoeta cinerea","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"neurospora","label":"Neurospora","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"n. 
benthamiana","label":"Nicotiana benthamiana","imageSrc":"","imageAlt":"","mod":"Solgenomics Network","modLink":"https://solgenomics.net/organism/Nicotiana_benthamiana/genome","linkVariable":""},{"value":"nicotiana tabacum","label":"Nicotiana tabacum","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"noctuidae","label":"Noctuidae","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"noctuidae sp","label":"Noctuidae sp","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"nothobranchius furzeri","label":"Nothobranchius furzeri","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"onchocerca volvulus","label":"Onchocerca volvulus","imageSrc":"","imageAlt":"","mod":"WormBase","modLink":"www.wormbase.org","linkVariable":""},{"value":"orconectes virilis","label":"Orconectes virilis","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"ormia ochracea","label":"Ormia ochracea","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"o. 
sativa","label":"Oryza sativa","imageSrc":"","imageAlt":"","mod":"Gramene","modLink":"https://www.gramene.org/","linkVariable":""},{"value":"other","label":"Other","imageSrc":"","imageAlt":"","mod":null,"modLink":null,"linkVariable":null},{"value":"oxalis enneaphylla","label":"Oxalis enneaphylla","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"paenarthrobacter nicotinovorans","label":"Paenarthrobacter nicotinovorans","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"pantoea","label":"Pantoea","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"pantoea agglomerans","label":"Pantoea agglomerans","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"papaver sp","label":"Papaver sp","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"paramecium bursaria","label":"Paramecium bursaria","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"partitiviridae","label":"Partitiviridae","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"pelodiscus sinensis","label":"Pelodiscus sinensis","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"perezia recurvata","label":"Perezia recurvata","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"petromyzon marinus","label":"Petromyzon marinus","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"photinus pyralis","label":"Photinus pyralis","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"photinus pyralis associated partiti-like virus","label":"Photinus pyralis associated partiti-like virus","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"photinus pyralis 
iflavirus 1","label":"Photinus pyralis iflavirus 1","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"physcomitrium patens","label":"Physcomitrium patens","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"pinus strobus","label":"Pinus strobus","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"pinus taeda","label":"Pinus taeda","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"platycheirus","label":"Platycheirus","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"plectus sambesii","label":"Plectus sambesii","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"pogonomyrmex occidentalis","label":"Pogonomyrmex occidentalis","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"poncirus trifoliata","label":"Poncirus trifoliata","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"populus deltoides","label":"Populus deltoides","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"potato virus y","label":"Potato virus Y","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"primula magellanica","label":"Primula magellanica","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"pristionchus pacificus","label":"Pristionchus pacificus","imageSrc":"","imageAlt":"","mod":"WormBase","modLink":"www.wormbase.org","linkVariable":""},{"value":"prunus persica","label":"Prunus persica","imageSrc":"","imageAlt":"","mod":"Genome Database for Rosaceae","modLink":"https://www.rosaceae.org/","linkVariable":""},{"value":"psalmopoeus iriminia","label":"Psalmopoeus iriminia","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"pseudanabaena sp.","label":"Pseudanabaena 
sp.","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"pseudomonas","label":"Pseudomonas","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"pseudomonas aeruginosa","label":"Pseudomonas aeruginosa","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"pseudomonas glycinae","label":"Pseudomonas glycinae","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"pseudomonas putida","label":"Pseudomonas putida","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"pseudomonas syringae","label":"Pseudomonas syringae","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"pterophyllum scalare","label":"Pterophyllum scalare","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"python regius","label":"Python regius","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"quercus macrocarpa","label":"Quercus macrocarpa","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"ralstonia solanacearum","label":"Ralstonia solanacearum","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"ranitomeya imitator","label":"Ranitomeya imitator","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"ranunculus peduncularis","label":"Ranunculus peduncularis","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"rat","label":"Rat","imageSrc":"","imageAlt":"","mod":"RGD","modLink":"https://rgd.mcw.edu","linkVariable":""},{"value":"rheinheimera","label":"Rheinheimera","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"ribes rubrum","label":"Ribes rubrum","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"sars-cov-2","label":"SARS-CoV-2","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"s. 
cerevisiae","label":"Saccharomyces cerevisiae","imageSrc":"yeast.png","imageAlt":"Yeast graphic by Zoe Zorn CC BY 4.0","mod":"SGD","modLink":"https://yeastgenome.org","linkVariable":""},{"value":"saccharomyces paradoxus","label":"Saccharomyces paradoxus","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"s. uvarum","label":"Saccharomyces uvarum","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"schistosoma","label":"Schistosoma","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"schizosaccharomyces japonicus","label":"Schizosaccharomyces japonicus","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"s. pombe","label":"Schizosaccharomyces pombe","imageSrc":"pombe.png","imageAlt":"Pombe graphic by Zoe Zorn © Caltech","mod":"PomBase","modLink":"https://www.pombase.org/reference/PMID:","linkVariable":"pmId"},{"value":"schmidtea mediterranea","label":"Schmidtea mediterranea","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"senecio sp","label":"Senecio sp","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"simocephalus","label":"Simocephalus","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"siraitia grosvenorii","label":"Siraitia grosvenorii","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"solanum lycopersicum","label":"Solanum lycopersicum","imageSrc":"","imageAlt":"","mod":"Solgenomics Network","modLink":"https://solgenomics.net/organism/1/view/","linkVariable":""},{"value":"sorghum","label":"Sorghum","imageSrc":"","imageAlt":"","mod":"SorghumBase","modLink":"https://www.sorghumbase.org","linkVariable":""},{"value":"spiroplasma eriocheiris","label":"Spiroplasma eriocheiris","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"staphylococcus aureus","label":"Staphylococcus 
aureus","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"staphylococcus epidermidis","label":"Staphylococcus epidermidis","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"steinernema carpocapsae","label":"Steinernema carpocapsae","imageSrc":"","imageAlt":"","mod":"WormBase","modLink":"https://wormbase.org","linkVariable":""},{"value":"steinernema hermaphroditum","label":"Steinernema hermaphroditum","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"stenotrophomonas geniculata","label":"Stenotrophomonas geniculata","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"streptococcus gordonii ","label":"Streptococcus gordonii","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"streptococcus mutans","label":"Streptococcus mutans","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":" streptococcus pneumoniae","label":"Streptococcus pneumoniae","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"s. 
purpuratus","label":"Strongylocentrotus purpuratus","imageSrc":"","imageAlt":"","mod":"Echinobase","modLink":"https://www.echinobase.org","linkVariable":""},{"value":"strongyloides ratti","label":"Strongyloides ratti","imageSrc":"","imageAlt":"","mod":"WormBase","modLink":"www.wormbase.org","linkVariable":""},{"value":"sulfolobus","label":"Sulfolobus","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"symphoricarpos albus","label":"Symphoricarpos albus","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"syncirsodes","label":"Syncirsodes","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"synechococcus elongatus","label":"Synechococcus elongatus","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"syrphidae","label":"Syrphidae","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"tarantobelus jeffdanielsi","label":"Tarantobelus jeffdanielsi","imageSrc":"","imageAlt":"","mod":"WormBase","modLink":"www.wormbase.org","linkVariable":""},{"value":"taraxacum officinale","label":"Taraxacum officinale","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"tatochila theodice","label":"Tatochila theodice","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"tetrahymena","label":"Tetrahymena","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"tetramorium immigrans","label":"Tetramorium immigrans","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"tomato brown rugose fruit virus","label":"ToBRFV","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"trachemys scripta","label":"Trachemys scripta","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"tribolium castaneum","label":"Tribolium 
castaneum","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"trichoptera","label":"Trichoptera","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"trichuris muris","label":"Trichuris muris","imageSrc":"","imageAlt":"","mod":"WormBase","modLink":"www.wormbase.org","linkVariable":""},{"value":"trifolium repens","label":"Trifolium repens","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"trypoxylus dichotomus","label":"Trypoxylus dichotomus","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"tsuga canadensis","label":"Tsuga canadensis","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"ulva expansa","label":"Ulva expansa","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"universal","label":"Universal","imageSrc":"","imageAlt":"","mod":null,"modLink":null,"linkVariable":null},{"value":"vargula hilgendorfii","label":"Vargula hilgendorfii","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"vespula vulgaris","label":"Vespula vulgaris","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"virus","label":"Virus","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"watasenia scintillans","label":"Watasenia scintillans","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"wolbachia pipientis","label":"Wolbachia pipientis","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"xenopus","label":"Xenopus","imageSrc":"xenopus.png","imageAlt":"Xenopus graphic by Zoe Zorn CC BY 4.0","mod":"XenBase","modLink":"https://xenbase.org","linkVariable":""},{"value":"xenorhabdus griffiniae","label":"Xenorhabdus griffiniae","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"yramea cytheris","label":"Yramea 
cytheris","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"zaprionus indianus","label":"Zaprionus indianus","imageSrc":"","imageAlt":"","mod":"","modLink":"","linkVariable":""},{"value":"zea mays","label":"Zea mays","imageSrc":"","imageAlt":"","mod":"MaizeGDB","modLink":"https://www.maizegdb.org","linkVariable":""},{"value":"zebrafish","label":"Zebrafish","imageSrc":"zebrafish.png","imageAlt":"Zebrafish graphic by Zoe Zorn CC BY 4.0","mod":"ZFIN","modLink":"https://zfin.org","linkVariable":""}]}},"pageContext":{"id":"67cadf6b-eb36-4529-aa78-e206875dcf4a","citedBy":[],"parsedCsv":{"csvHeader":[],"csvData":[]}}},
    "staticQueryHashes": ["2114697108"]}