Welcome to ngs-preprocess pipeline documentation
ngs-preprocess is a pipeline that provides an easy-to-use framework for preprocessing sequencing reads from Illumina, PacBio and Oxford Nanopore platforms. It is developed with Nextflow and Docker.
The pipeline wraps up the following tools and analyses:
|sra-tools & entrez-direct |Interaction with SRA database for fetching fastqs and metadata
|Fast all-in-one preprocessing for FastQ files
|ONT reads trimming and demultiplexing
|ONT reads QC
|Long reads QC and filter
|Convert PacBio bax files to bam
|Extract reads from PacBio bam files
|PacBio reads demultiplexing
|Generate PacBio Highly Accurate Single-Molecule Consensus Reads
Although discontinued since 2018, porechop is included for legacy compatibility with old nanopore runs, old sequencing kit libraries and old sequencer versions.
However, the newest versions of MinKNOW are able to output trimmed and demultiplexed fastq data, meaning this step is no longer required.
Finally, it is also acceptable to leave adapters in the reads, as some assemblers are aware of adapter sequences and may even benefit from them.
A quickstart is available so you can quickly get the gist of the pipeline's capabilities.
The pipeline's common usage is very simple as shown below:
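As a minimal sketch of what such an invocation could look like, the snippet below assembles a Nextflow command and prints it without running anything by default. The repository slug `fmalmeida/ngs-preprocess`, the `docker` profile name and the pipeline-level `--help` parameter are assumptions made for illustration and are not confirmed by this page; only `nextflow run` and `-profile` are core Nextflow options. Check the manual for the actual parameters.

```shell
#!/usr/bin/env sh
set -eu

# ASSUMPTIONS (not stated on this page): repository slug and profile name.
PIPELINE="fmalmeida/ngs-preprocess"
PROFILE="docker"

# `nextflow run` and `-profile` are core Nextflow options.
# `--help` is assumed to be a pipeline-level parameter that prints usage.
CMD="nextflow run $PIPELINE -profile $PROFILE --help"

# Print the command so it can be inspected before running it.
echo "$CMD"

# Execute only when explicitly requested (RUN=1) and Nextflow is installed.
if [ "${RUN:-0}" = "1" ] && command -v nextflow >/dev/null 2>&1; then
  $CMD
fi
```

Running the script as is only prints the command; setting `RUN=1` executes it, which keeps the sketch safe to try on a machine without Nextflow.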
Some parameters are required, some are not. Please read the pipeline's manual reference to understand each parameter.
In order to cite this pipeline, please refer to:
Almeida FMd, Campos TAd and Pappas Jr GJ. Scalable and versatile container-based pipelines for de novo genome assembly and bacterial annotation. F1000Research 2023, 12:1205 (https://doi.org/10.12688/f1000research.139488.1)
If you have any questions, feel free to contact me via GitHub issues.