Welcome to DIPPER Wiki

Introduction
Overview
DIPPER (DIstance-based Phylogenetic PlacER) is an ultrafast tool designed to reconstruct ultralarge phylogenies.

Key Features
TBA
Installation Methods
NOTE: DIPPER is currently supported on systems with NVIDIA GPUs only. Support for additional platforms, including AMD GPUs and CPU-only options for x86-64 and ARM64 architectures, will be added soon. Stay tuned!
1. Using Docker Image
To run DIPPER in a Docker container, create the container from the prebuilt Docker image by following these steps:
i. Dependencies
ii. Pull and run the DIPPER Docker image from Docker Hub
## Note: If the Docker image already exists locally, make sure to pull the latest version using
## docker pull swalia14/dipper:latest
## If the Docker image does not exist locally, the following command will pull and run the latest version
docker run -it --gpus all swalia14/dipper:latest
iii. Run DIPPER
2. Using Dockerfile
A Docker container with DIPPER preinstalled can also be built from a Dockerfile by following these steps.
i. Dependencies
ii. Clone the repository and build a docker image
iii. Build and run the docker container
iv. Run DIPPER
3. Using installation script (requires sudo access)
Users without sudo access are advised to install DIPPER via Docker Image or Dockerfile.
Step 1: Clone the repository
Step 2: Install dependencies (requires sudo access)
DIPPER depends on the following common system libraries, which are typically pre-installed on most development environments:
For Ubuntu users with sudo access, if any of the required libraries are missing, you can install them with:
Step 3: Build DIPPER
Step 4: The DIPPER executable is located in the bin directory and can be run as follows:
Run DIPPER
Functionalities
Option | Description |
---|---|
-i, --input-format | Input format (required): d - distance matrix; r - raw sequences; m - MSA |
-o, --output-format | Output format: t - phylogenetic tree in Newick format (default); d - distance matrix in PHYLIP format (coming soon) |
-I, --input-file | Input file path (required): PHYLIP for distance matrix, FASTA for MSA or raw sequences |
-O, --output-file | Output file path (required) |
-m, --algorithm | Algorithm selection: 0 - auto (default); 1 - force placement; 2 - force NJ; 3 - divide-and-conquer |
-p, --placement-mode | Placement mode: 0 - exact; 1 - k-closest (default) |
-k, --kmer-size | K-mer size (valid range: 2–15, default: 15) |
-s, --sketch-size | Sketch size (default: 1000) |
-d, --distance-type | Distance type: 1 - uncorrected; 2 - JC (default); 3 - Tajima-Nei; 4 - K2P; 5 - Tamura; 6 - Jin-Nei |
-a, --add | Add query to a backbone tree using k-closest placement |
-t, --input-tree | Input backbone tree in Newick format (required with --add option) |
-h, --help | Show help message |
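The range limits in the table can be checked before launching a long run. The sketch below validates a k-mer size against the documented range of 2 to 15; `valid_kmer_size` is a hypothetical helper written for illustration and is not part of DIPPER.

```shell
# Hypothetical helper (not part of DIPPER): succeeds only when $1 is an
# integer inside the documented -k range of 2 to 15.
valid_kmer_size() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
    esac
    [ "$1" -ge 2 ] && [ "$1" -le 15 ]
}

k=15   # documented default
if valid_kmer_size "$k"; then
    echo "k-mer size $k is within the documented range"
else
    echo "k-mer size $k is outside the documented range (2-15)" >&2
fi
```

A wrapper script could call this check before passing the value to `-k`.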
Note
All the files in the examples below can be found in DIPPER/dataset.
Enter the build directory (assuming $DIPPER_HOME points to the DIPPER repository directory):
De-novo phylogeny construction
DIPPER supports de-novo construction of phylogenies from unaligned or aligned sequences in FASTA format and from a distance matrix in PHYLIP format.
Default mode
In default mode, DIPPER constructs the phylogeny using:
1. Conventional NJ for sequences/tips < 30,000
2. Placement technique for sequences/tips >= 30,000 and < 1,000,000
3. Divide-and-conquer technique for sequences/tips >= 1,000,000
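To see which technique the default mode would be expected to pick for a given dataset, the documented thresholds can be mirrored in a small script. This is only a sketch of the rule as stated above, not DIPPER's internal logic; `default_technique` and `count_fasta_records` are hypothetical helper names, and the printed labels are illustrative.

```shell
# Hypothetical helper: print the technique that the documented default-mode
# thresholds imply for a given number of sequences/tips.
default_technique() {
    n=$1
    if [ "$n" -lt 30000 ]; then
        echo "nj"
    elif [ "$n" -lt 1000000 ]; then
        echo "placement"
    else
        echo "divide-and-conquer"
    fi
}

# Hypothetical helper: count sequences in a FASTA file by counting header lines.
count_fasta_records() {
    grep -c '^>' "$1"
}

default_technique 25000   # prints the label for a 25,000-tip dataset
```

For a real input, the two helpers can be combined as `default_technique "$(count_fasta_records sequences.fa)"`, where `sequences.fa` stands for your own file.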
From unaligned sequences
Usage syntax
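A minimal sketch of the call, assembled from the options table (`-i r` for raw sequences, `-o t` for a Newick tree) and the `./dipper` invocation form used elsewhere on this page. The file names are hypothetical placeholders.

```shell
# Hypothetical file names; replace with your own paths.
input_fasta="unaligned.fa"
output_tree="tree.nwk"

# -i r: raw (unaligned) sequences, -o t: Newick tree output.
cmd="./dipper -i r -o t -I $input_fasta -O $output_tree"
echo "$cmd"   # dry run: prints the command instead of executing it
```

Once the paths are correct, run the printed line from the directory that contains the `dipper` executable.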
Example
From aligned sequences
Usage syntax (using JC model)
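A sketch of the corresponding call for an MSA, again built from the options table: `-i m` selects MSA input and `-d 2` selects the JC distance (which the table also lists as the default). The file names are hypothetical.

```shell
# Hypothetical file names; replace with your own paths.
input_msa="aligned.fa"
output_tree="tree.nwk"

# -i m: MSA input, -d 2: JC distance, -o t: Newick tree output.
cmd="./dipper -i m -o t -d 2 -I $input_msa -O $output_tree"
echo "$cmd"   # dry run: prints the command instead of executing it
```

Another distance model can be chosen by changing the value passed to `-d` to one of the numbers listed in the table.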
Example
From distance matrix
Usage syntax
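For a precomputed distance matrix, the options table indicates `-i d` with a PHYLIP file passed to `-I`. The sketch below assumes hypothetical file names.

```shell
# Hypothetical file names; replace with your own paths.
input_matrix="distances.phy"
output_tree="tree.nwk"

# -i d: distance matrix input (PHYLIP), -o t: Newick tree output.
cmd="./dipper -i d -o t -I $input_matrix -O $output_tree"
echo "$cmd"   # dry run: prints the command instead of executing it
```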
Example
Construct phylogeny using placement technique
DIPPER allows users to construct a phylogeny using the forced placement technique by setting the -m option to 1. Below we provide a syntax and an example for unaligned input sequences, but DIPPER also supports aligned sequences and a distance matrix as input.
Usage syntax
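A sketch of a forced-placement run on unaligned sequences, combining `-m 1` with the input and output options from the table. The file names are hypothetical.

```shell
# Hypothetical file names; replace with your own paths.
input_fasta="unaligned.fa"
output_tree="tree.nwk"

# -m 1: force the placement technique regardless of dataset size.
cmd="./dipper -i r -o t -m 1 -I $input_fasta -O $output_tree"
echo "$cmd"   # dry run: prints the command instead of executing it
```

The table also lists `-p` for choosing between exact (0) and k-closest (1, default) placement, which could be appended to the same command.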
Construct phylogeny using divide-and-conquer technique
DIPPER allows users to construct a phylogeny using the forced divide-and-conquer technique by setting the -m option to 3. Below we provide a syntax and an example for unaligned input sequences, but DIPPER also supports aligned sequences and a distance matrix as input.
Usage syntax
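A sketch of a forced divide-and-conquer run on unaligned sequences; the only change from a default run is `-m 3`. The file names are hypothetical.

```shell
# Hypothetical file names; replace with your own paths.
input_fasta="unaligned.fa"
output_tree="tree.nwk"

# -m 3: force the divide-and-conquer technique regardless of dataset size.
cmd="./dipper -i r -o t -m 3 -I $input_fasta -O $output_tree"
echo "$cmd"   # dry run: prints the command instead of executing it
```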
Adding tips (sequences) to a backbone tree
DIPPER allows users to add tips to an existing backbone tree using the placement technique. It requires tip sequences from the backbone tree and input query sequences to be provided in a single file (FASTA format), along with the input tree in Newick format.
Usage syntax
./dipper -i r -o t -m 1 --add -I <path to unaligned/aligned sequences FASTA file (containing backbone tree tip sequences and query sequences)> -O <path to output file> -t <path to input tree>
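Because the backbone tip sequences and the query sequences must be supplied in a single FASTA file, it can help to build that file explicitly and confirm that no sequence name occurs twice. The sketch below is one way to do this; `combine_fasta` is a hypothetical helper and not part of DIPPER, and it assumes each input file ends with a newline.

```shell
# Hypothetical helper (not part of DIPPER): concatenate backbone and query
# FASTA files into one file and fail if any header line appears twice.
# Assumes each input file ends with a newline so records do not run together.
combine_fasta() {
    backbone=$1
    query=$2
    combined=$3
    cat "$backbone" "$query" > "$combined" || return 1
    dups=$(grep '^>' "$combined" | sort | uniq -d)
    if [ -n "$dups" ]; then
        printf 'duplicate headers:\n%s\n' "$dups" >&2
        return 1
    fi
}
```

A call such as `combine_fasta backbone_tips.fa queries.fa combined.fa` (file names are placeholders) produces the file to pass to `-I` in the command above.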
Contributions
We welcome contributions from the community to enhance the capabilities of DIPPER. If you encounter any issues or have suggestions for improvement, please open an issue on the DIPPER GitHub page. For general inquiries and support, reach out to our team.
Citing DIPPER
TBA.