Welcome to ngs-preprocess pipeline documentation
About
ngs-preprocess is a pipeline designed to provide an easy-to-use framework for preprocessing sequencing reads from Illumina, PacBio and Oxford Nanopore platforms. It is developed with Nextflow and Docker.
Workflow
The pipeline wraps up the following tools and analyses:
Software | Analysis |
---|---|
sra-tools & entrez-direct | Interaction with SRA database for fetching fastqs and metadata |
fastp | Fast all-in-one preprocessing for FastQ files |
porechop | ONT reads trimming and demultiplexing |
porechop ABI | Ab initio version of porechop |
pycoQC | ONT reads QC |
NanoPack | Long reads QC and filter |
bax2bam | Convert PacBio bax files to bam |
bam2fastx | Extract reads from PacBio bam files |
lima | PacBio reads demultiplexing |
pacbio ccs | Generate PacBio Highly Accurate Single-Molecule Consensus Reads |
About porechop
Although discontinued since 2018, porechop is included for legacy compatibility with old nanopore runs, old sequencing kit libraries and old sequencer versions.
However, the newest versions of MinKNOW are able to output trimmed and demultiplexed fastq data, meaning this step is no longer required.
Finally, it is also acceptable to leave adapters in the reads, as some assemblers may be aware of these sequences and even benefit from them.
Quickstart
A quickstart is available so you can quickly get the gist of the pipeline's capabilities.
Usage
The pipeline's common usage is very simple as shown below:
# usual command-line
nextflow run fmalmeida/ngs-preprocess \
--sra_ids "list_of_sra.txt" \
--lreads_min_length 750 \
--output "./preprocessed_data" \
...
Some parameters are required, some are not. Please read the pipeline's manual reference to understand each parameter.
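As a minimal sketch of preparing the input for `--sra_ids`, the snippet below writes a small list file and checks its contents before launching the pipeline. It assumes the file holds one SRA run accession per line (please confirm this in the manual); the accessions `SRR0000001` and `SRR0000002` are hypothetical placeholders, and the `SRR`/`ERR`/`DRR` pattern is only a rough sanity check, not something the pipeline itself enforces.

```shell
# Hypothetical run accessions, used only for illustration (assumed format: one per line).
printf '%s\n' SRR0000001 SRR0000002 > list_of_sra.txt

# Count lines that do NOT look like run accessions (SRR/ERR/DRR followed by digits).
# grep exits non-zero when nothing is selected, so "|| true" keeps strict shells happy.
bad_lines=$(grep -Evc '^[SED]RR[0-9]+$' list_of_sra.txt || true)
echo "malformed lines: ${bad_lines}"
```

If no malformed lines are reported, the file can be passed to the pipeline with `--sra_ids "list_of_sra.txt"` as in the command above.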
Citation
In order to cite this pipeline, please refer to:
Almeida FMd, Campos TAd and Pappas Jr GJ. Scalable and versatile container-based pipelines for de novo genome assembly and bacterial annotation. F1000Research 2023, 12:1205 (https://doi.org/10.12688/f1000research.139488.1)
Support contact
If you have any questions, feel free to contact me via GitHub issues.