Explore very large trees in the browser

Overview

Taxodium

Taxodium is a client-side JavaScript tool for exploring extremely large trees. It is currently used for Cov2Tree, a display of the global SARS-CoV-2 phylogeny: 🌳 http://cov2tree.org

The data and tree displayed in Cov2Tree are collated by the UShER team: http://hgdownload.soe.ucsc.edu/goldenPath/wuhCor1/UShER_SARS-CoV-2// . The sequences contained represent the work of thousands of researchers across the world.

Most of this repository is a client-side React app that displays the tree. It loads the public/nodelist.pb file, which contains a pre-processed form of the data. The Python scripts that build this file are in data_processing, along with a little documentation.

Using it for your own data

Right now you would need to create your own version of the data_processing directory, adapted to produce the nodelist.pb file from your own datasets. We will try to streamline this. Feel free to raise an Issue requesting help!

Development instructions

nvm use 14
yarn install
yarn start

Comments
  • Treenome Browser v1 integration

    Here's a PR where we can iron out the final details of the treenome browser.

    A couple recent additions:

    • added nt mutations (now the mutations displayed in the browser match those selected in the cog menu). Mutations also load progressively in "chunks" rather than all at once.
    • added a band that appears at the selected node when zoomed in far enough, highlighting that node's genome

    I think the last main thing to figure out is, as you mentioned, handling non-SARS-CoV-2 trees.

    Right now it really only works for the cov2tree tree. The two main things needed to build the browser for an arbitrary tree are (1) the reference sequence and (2) coordinates of genes/CDS to compute nt positions of aa mutations. (2) is only necessary for aa mutations, so just displaying nt mutations could work without it.

    I currently compute the reference through the fake "mutations" at the root node -- which I realize aren't guaranteed to be there (but I think they should be if a .gb file was used to make the tree)
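The idea of recovering the reference from the root node's pseudo-mutations can be sketched as follows. This is a minimal illustration, not the browser's actual code: the record fields `type`, `residue_pos` and `new_residue` are assumed names for a mutation entry, and positions with no root entry are filled with `N`.

```python
def reference_from_root_mutations(root_mutations, genome_length=None):
    """Rebuild a reference sequence from nt pseudo-mutations at the root.

    Each record is assumed (hypothetically) to look like
    {"type": "nt", "residue_pos": 1, "new_residue": "A"} with 1-based positions.
    Returns None when the root carries no nt pseudo-mutations at all.
    """
    nt = {
        m["residue_pos"]: m["new_residue"]
        for m in root_mutations
        if m.get("type") == "nt"
    }
    if not nt:
        return None
    length = genome_length or max(nt)
    # Positions with no root entry are unknown, so mark them as N.
    return "".join(nt.get(pos, "N") for pos in range(1, length + 1))
```

Returning `None` makes the "not guaranteed to be there" case explicit, so a caller can fall back to another source for the reference.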

    (2) is hard-coded for SARS-CoV-2. But if we stored the CDS data from a .gb file, I could use those.
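For illustration, the nt positions of an aa mutation could be derived from stored CDS coordinates roughly like this. It is a sketch under stated assumptions: the function name is hypothetical, a CDS is given as 1-based inclusive `(start, end)` segments as in a GenBank `join(...)`, and only forward-strand features are handled.

```python
def codon_nt_positions(cds_segments, aa_pos):
    """Return the three genome positions (1-based) encoding residue aa_pos.

    cds_segments: list of (start, end) tuples, 1-based inclusive, forward strand.
    Multi-segment features are concatenated in order, so a position shared by
    two segments (as in a ribosomal-slippage join) is counted twice.
    """
    coding = [p for start, end in cds_segments for p in range(start, end + 1)]
    i = 3 * (aa_pos - 1)
    if aa_pos < 1 or i + 3 > len(coding):
        raise ValueError(f"aa position {aa_pos} is outside the CDS")
    return tuple(coding[i:i + 3])


# The SARS-CoV-2 spike CDS spans 21563..25384 in the Wuhan-Hu-1 reference,
# so S:501 maps to the codon starting at 23063 (N501Y is A23063T).
print(codon_nt_positions([(21563, 25384)], 501))  # (23063, 23064, 23065)
```

With segments read from a .gb file in place of the hard-coded SARS-CoV-2 values, the same lookup would apply to other trees.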

    What do you think about storing the genbank CDS features (potentially also the fasta sequences) in the jsonl file, and only displaying the treenome toggle button if they are present?
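A rough sketch of the presence check proposed here, assuming the jsonl file begins with a JSON header object. The key name `cds_features` is purely hypothetical, as nothing in this thread fixes how the features would be stored.

```python
import json


def has_treenome_data(first_jsonl_line, required=("cds_features",)):
    """Return True if the header line carries every (hypothetical) required key.

    A missing or empty value counts as absent, so the treenome toggle would
    stay hidden for trees built without GenBank annotations.
    """
    header = json.loads(first_jsonl_line)
    return all(header.get(key) for key in required)
```

Passing the required keys as a parameter would let the same check also cover optional extras, such as the fasta sequences, if those are stored later.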

    opened by amkram 31
  • taxoniumtools `Error: EISDIR: illegal operation on a directory`

    I tried out using taxoniumtools on my M1 mac and got the following error when opening http://localhost:8000/?backend=http://localhost:3453

    Node.js v17.9.1
    (node:1) ExperimentalWarning: Fetch is an experimental feature. This feature could change at any time
    (Use `node --trace-warnings ...` to show where the warning was created)
    imported importing
    importing is  {
    processJsonl: [AsyncFunction: processJsonl],
    generateConfig: [Function: generateConfig]
    }
    imported filtering
    imported exporting
    node:fs:728
    handleErrorFromBinding(ctx);
    ^
    
    Error: EISDIR: illegal operation on a directory, read
    at Object.readSync (node:fs:728:3)
    at tryReadSync (node:fs:434:20)
    at Object.readFileSync (node:fs:480:19)
    at loadData (/app/taxonium_backend/server.js:486:26) {
    errno: -21,
    syscall: 'read',
    code: 'EISDIR'
    }
    
    opened by corneliusroemer 17
  • V2 launch: are you having problems?

    I've just launched Taxonium V2, a ground-up rebuild of Taxonium that supports a server-side backend. (The backend is not running on the live site yet, but will be in a couple of days.)

    While I have tried to keep things largely backwards compatible, the change may have broken some people's workflows. If it has broken yours, do say so and we can try to resolve it.

    The old version of Taxonium is still available at https://cov2tree-git-v1-theosanderson.vercel.app/ for now.

    opened by theosanderson 17
  • uncaught exception: out of memory on Firefox

    Please could you have a look at the following issue, which also occurred on a larger machine (more than 128 GB RAM) and with Chromium too. Thank you.

    Viewing the tree from a loaded .jsonl file (e.g. public-2022-05-25.all.masked.taxonium.jsonl) fails in Firefox with an out-of-memory error. This was Firefox 100.0.2 (64-bit) on Ubuntu 18.04.6 with 15 GB of system RAM. The same machine was able to display the tree from the .pb using https://cov2tree-git-v1-theosanderson.vercel.app/. The error only occurred with the jsonl.

    Firefox console log:

    11:39:38.569 Loading failed for the