RIVET Results Table
Each section below describes one column of RIVET's results table of inferred recombinant ancestors.
Recombinant Node ID
- UShER assigned node id for inferred recombinant node
Donor Node ID
- UShER assigned node id for donor (recombinant parental node)
Acceptor Node ID
- UShER assigned node id for acceptor (recombinant parental node)
Breakpoint 1 Interval
- RIPPLES inferred breakpoint interval 1
Breakpoint 2 Interval
- RIPPLES inferred breakpoint interval 2
Info
For more information on the RIPPLES algorithm, please see: Pandemic-scale phylogenomics reveals the SARS-CoV-2 recombination landscape
Recombinant Clade
- Recombinant clade classification as assigned by Nextstrain
Recombinant Lineage
- Recombinant lineage designation as assigned by Pangolin
Donor Clade
- Donor clade classification as assigned by Nextstrain
Donor Lineage
- Donor lineage designation as assigned by Pangolin
Acceptor Clade
- Acceptor clade classification as assigned by Nextstrain
Acceptor Lineage
- Acceptor lineage designation as assigned by Pangolin
Chronumental-inferred origin date
- Inferred first emergence of the recombinant ancestor sequence using the Chronumental method, which runs automatically as part of the RIVET pipeline. In short, Chronumental is an accurate and scalable time-tree estimation method that uses stochastic gradient descent to estimate lengths of time for tree branches under a probabilistic model. For more information on this method, please see the Chronumental paper.
Recombinant Ranking Score
- The ranking score is a growth score that we compute for each inferred recombinant. It is designed to prioritize recently emerged recombinants and recombinants with many circulating descendant sequences.
- By default, we order the main RIVET results table by descending ranking score, which places the recombinants of highest concern at the top of the list.
The recombinant growth metric, G(R), for a recombinant node R with a set of descendants S is defined below:
In the equation above, m(R) and m(s) correspond to the number of months (30-day intervals) elapsed since the recombinant node R was inferred to have originated and since its descendant sequence s was sampled, respectively. The growth score, G(R), is computed for each detected recombinant R, and the final recombinant list is ranked by descending growth score.
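As a rough illustration of how such a score behaves, the sketch below assumes the growth metric takes the exponentially decaying form G(R) = 2^(-m(R)) * sum over s in S of 2^(-m(s)). That form, the floor-division month count, and the names `months_elapsed` and `growth_score` are assumptions made for this example, not RIVET's confirmed implementation.

```python
from __future__ import annotations

from datetime import date, timedelta


def months_elapsed(earlier: date, today: date) -> int:
    """Number of whole 30-day intervals between two dates (assumed rounding)."""
    return (today - earlier).days // 30


def growth_score(origin: date, sample_dates: list[date], today: date) -> float:
    """ASSUMED form: G(R) = 2^(-m(R)) * sum_{s in S} 2^(-m(s))."""
    # Older recombinant nodes are down-weighted by a factor of 2 per month.
    node_weight = 2.0 ** -months_elapsed(origin, today)
    # Each descendant contributes less the longer ago it was sampled.
    descendant_sum = sum(2.0 ** -months_elapsed(d, today) for d in sample_dates)
    return node_weight * descendant_sum


today = date(2023, 3, 1)
# Hypothetical recombinant: originated 30 days ago, with two sampled descendants.
score = growth_score(
    origin=today - timedelta(days=30),
    sample_dates=[today, today - timedelta(days=30)],
    today=today,
)
print(score)
```

Under these assumptions, a recombinant that originated recently and has many recently sampled descendants scores highest, which matches the stated goal of the ranking.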
Representative Descendant
- The selected sample is the descendant with the fewest additional mutations compared to its recombinant ancestor.
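The selection rule can be sketched in a few lines. The input mapping, the sample names, and the tie-break by sample name are hypothetical and only serve the example; how RIVET resolves ties is not stated here.

```python
def representative_descendant(extra_mutations: dict) -> str:
    """Pick the sample with the fewest mutations beyond its recombinant ancestor.

    extra_mutations maps a sample name to its count of additional mutations.
    Ties are broken by sample name (an assumption made for determinism).
    """
    return min(extra_mutations, key=lambda name: (extra_mutations[name], name))


# Hypothetical descendants of one recombinant node.
print(representative_descendant({"sample_1": 3, "sample_2": 1, "sample_3": 1}))
```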
Informative Site Sequence
- The informative site sequence is a binary string of A and B for each trio, where an A is assigned if the recombinant node allele at the site matches only the donor node allele at that site, or a B if it matches only the acceptor node allele.
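A minimal sketch of this labeling rule is shown below. It assumes the three sequences are already aligned and given as equal-length allele strings, and that sites where the recombinant matches both parents or neither are skipped. The function name and inputs are hypothetical.

```python
def informative_site_sequence(recombinant: str, donor: str, acceptor: str) -> str:
    """Build the A/B string for one recombinant/donor/acceptor trio."""
    labels = []
    for r, d, a in zip(recombinant, donor, acceptor):
        if r == d and r != a:
            labels.append("A")  # recombinant matches only the donor
        elif r == a and r != d:
            labels.append("B")  # recombinant matches only the acceptor
        # Sites matching both or neither parent carry no label (assumption).
    return "".join(labels)


# Hypothetical alleles at five aligned sites.
print(informative_site_sequence("ACGTT", "ACGAA", "TTGTT"))
```

With these inputs the first two sites match only the donor, the third matches both parents and is skipped, and the last two match only the acceptor.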
3SEQ (M, N, K)
- 3SEQ M, N, K values used to look up individual p-values in a pre-generated 3SEQ p-value table.
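In the general 3SEQ formulation, the informative sites are read as a walk of up and down steps, and the statistic is the maximum descent of that walk. The sketch below derives the three values from an A/B string under that reading. Treating A as the up step and B as the down step is an assumption, as is the function name; RIPPLES may orient or count the sites differently.

```python
def three_seq_mnk(site_sequence: str) -> tuple:
    """Return (M, N, K) for an A/B informative site string.

    M = number of up steps (A), N = number of down steps (B),
    K = maximum descent of the +1/-1 walk (assumed orientation).
    """
    m = site_sequence.count("A")
    n = site_sequence.count("B")
    height = 0       # current height of the walk, starting at 0
    peak = 0         # highest point reached so far
    max_descent = 0  # largest drop from any earlier peak
    for label in site_sequence:
        height += 1 if label == "A" else -1
        peak = max(peak, height)
        max_descent = max(max_descent, peak - height)
    return m, n, max_descent


print(three_seq_mnk("AABB"))
print(three_seq_mnk("ABAB"))
```

A clean switch from one parent to the other, as in the first call, gives a larger maximum descent than an interleaved pattern with the same counts, which is what makes the statistic sensitive to mosaic structure.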
3SEQ P-Value
Info
For more information on the 3SEQ method and its use in RIPPLES, please see Improved Algorithmic Complexity for the 3SEQ Recombination Detection Algorithm and the Supplementary Section of Pandemic-scale phylogenomics reveals the SARS-CoV-2 recombination landscape
Original Parsimony Score
- The original parsimony placement score on the global phylogeny.
Parsimony Score Improvement
- Highest parsimony score improvement relative to original parsimony score.
Quality Control (QC) Flags
- This column lists the quality control (QC) or filtration checks that were flagged. A flagged recombinant is not high-confidence and could be a false positive resulting from bioinformatic errors, contamination, or other sequencing errors.
Info
For a detailed description of each quality control and filtration check performed in RIVET's backend pipeline, see the Quality Control and Filtration Checks page.
Common sources of false positive errors in RIVET’s pipeline include, but are not limited to:
- Contamination, sequencing, or assembly errors in the recombinant or parent sequences
- Missing sequences resulting in artificially long branches in the UCSC public tree
- Misalignments or phylogenetic inconsistencies
Common sources of false negative errors in RIVET’s pipeline include, but are not limited to:
- Too few recombination-informative sites in the recombinant
- More than two breakpoints are required to explain the recombinant
- Too few descendants of the recombinant or its parent in the UCSC public tree