GraphBin Tutorial

This tutorial walks through the steps and commands used to set up GraphBin, prepare results for input, run GraphBin and visualise the final results.

Prerequisites

Make sure you have installed the following.

Step 1 - Installing GraphBin

Let's create a new conda environment and install GraphBin from bioconda using the following command.

conda create -n graphbin -c bioconda graphbin

Activate the conda environment using,

conda activate graphbin

We can check if GraphBin is working properly using the following command.

graphbin -h

Now we can clone the GraphBin repository to our local machine.

git clone https://github.com/Vini2/GraphBin.git

Make sure you go into the GraphBin folder using the cd command.

cd GraphBin/

Step 2 - Preprocessing

Let's set the path to our data as follows. You can use the path to our test data in tests/data.

mypath=/path/to/data/folder

Step 2a - Assembly

We can assemble our reads into contigs using any metagenomic assembler. For this purpose, we will use metaSPAdes (available from SPAdes) as follows.

spades --meta -1 $mypath/Reads_1.fastq -2 $mypath/Reads_2.fastq -o $mypath/ -t 8

Step 2b - Obtain the initial binning result

Any contig binning tool can be used to get an initial binning result. We will be using MaxBin 2 in this example.

Step 2c - Prepare the initial binning result

prepResult.py is a support script that allows you to format an initial binning result into the .csv format with contig identifiers and bin ID. Contigs are named according to their original identifier and bins are numbered according to the fasta file name. We can run prepResult.py as follows.

python support/prepResult.py --binned $mypath/maxbin_bins --output $mypath/

Step 3 - Using GraphBin

We can run the metaSPAdes version of GraphBin as follows.

graphbin --assembler spades --graph $mypath/assembly_graph_with_scaffolds.gfa --contigs $mypath/contigs.fasta --paths $mypath/contigs.paths --binned $mypath/initial_contig_bins.csv --output $mypath/