Quality Control

This section looks at how to view and check the quality of raw high-throughput sequencing data.

FastQC is a tool used to check the quality of sequence data. It can be downloaded onto your computer and run through a graphical interface. It can also be run from the command line, where it produces an HTML report and a .zip file containing the plots and report data for each input file. It is best to have conda installed and to install FastQC through conda.
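
As a rough sketch of the command-line usage described above, the Python snippet below assembles a FastQC call without running it. The file names and the qc_reports directory are hypothetical, and the helper fastqc_command is illustrative only. --outdir and --threads are FastQC options, and the output directory needs to exist before FastQC is run.

```python
import shlex


def fastqc_command(fastq_files, outdir, threads=1):
    """Build the argument list for a FastQC run (nothing is executed here)."""
    return [
        "fastqc",
        "--outdir", str(outdir),    # must already exist; FastQC will not create it
        "--threads", str(threads),  # number of files processed in parallel
        *[str(f) for f in fastq_files],
    ]


# Hypothetical paired-end sample files.
cmd = fastqc_command(["sample_R1.fastq.gz", "sample_R2.fastq.gz"], "qc_reports", threads=2)
print(shlex.join(cmd))
```

Once FastQC is installed, the list could be passed to subprocess.run(cmd, check=True) to run the analysis.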

MultiQC is a command-line tool which collates the results of all reports found in a chosen directory into a single summary report.
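
A minimal sketch of how this might be called, assuming the FastQC output sits in a hypothetical qc_reports directory. The helper multiqc_command and the multiqc_out directory name are illustrative only; --outdir is a MultiQC option.

```python
import shlex


def multiqc_command(report_dir, outdir=None):
    """Build the argument list for a MultiQC run (nothing is executed here)."""
    cmd = ["multiqc", str(report_dir)]  # directory that MultiQC scans for reports
    if outdir is not None:
        cmd += ["--outdir", str(outdir)]  # where the combined report is written
    return cmd


# Hypothetical directory names.
cmd = multiqc_command("qc_reports", outdir="multiqc_out")
print(shlex.join(cmd))
```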

fastp is an ultra-fast preprocessing tool that can perform quality control, adapter trimming, quality filtering, per-read quality pruning and more.

This tool is around 2–5 times faster than other tools such as Trimmomatic or Cutadapt, even though it performs many more operations than these tools. Installing it through conda is recommended.
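
The snippet below is a sketch of a typical fastp call for single-end or paired-end data, built in Python without being executed. The helper fastp_command, the file names and the report prefix are hypothetical. The options used (--in1, --out1, --in2, --out2, --html, --json, --thread) are fastp options.

```python
import shlex


def fastp_command(in1, out1, in2=None, out2=None, report_prefix="fastp", threads=1):
    """Build the argument list for a fastp run (nothing is executed here)."""
    if (in2 is None) != (out2 is None):
        raise ValueError("in2 and out2 must be given together for paired-end data")
    cmd = ["fastp", "--in1", str(in1), "--out1", str(out1)]
    if in2 is not None:
        # Second read file and its trimmed output, paired-end data only.
        cmd += ["--in2", str(in2), "--out2", str(out2)]
    cmd += [
        "--html", f"{report_prefix}.html",  # QC report for viewing in a browser
        "--json", f"{report_prefix}.json",  # machine-readable version of the report
        "--thread", str(threads),
    ]
    return cmd


# Hypothetical paired-end sample files.
cmd = fastp_command(
    "sample_R1.fastq.gz", "trimmed_R1.fastq.gz",
    in2="sample_R2.fastq.gz", out2="trimmed_R2.fastq.gz",
    report_prefix="sample_fastp", threads=4,
)
print(shlex.join(cmd))
```

Leaving out in2 and out2 gives the single-end form of the command.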

After trimming it is best to check the quality again.

Trimmomatic performs a range of trimming functions for Illumina paired-end and single-end data. It is a command-line tool, and it requires slightly different approaches depending on whether the data is single-end or paired-end.
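
To illustrate the difference between the two modes, the sketch below builds a Trimmomatic command in Python without running it. It assumes the trimmomatic wrapper script provided by the conda package. The helper trimmomatic_command, the file names, the adapter file path and the step settings are hypothetical examples rather than recommended values. In SE mode there is one input and one output; in PE mode there are two inputs and four outputs (paired and unpaired reads for each input).

```python
import shlex


def trimmomatic_command(inputs, out_prefix, steps, threads=1):
    """Build the argument list for a Trimmomatic run (nothing is executed here)."""
    inputs = [str(p) for p in inputs]
    if len(inputs) == 1:
        mode = "SE"
        outputs = [f"{out_prefix}_trimmed.fastq.gz"]
    elif len(inputs) == 2:
        mode = "PE"
        # Order expected in PE mode: R1 paired, R1 unpaired, R2 paired, R2 unpaired.
        outputs = [
            f"{out_prefix}_R1_paired.fastq.gz",
            f"{out_prefix}_R1_unpaired.fastq.gz",
            f"{out_prefix}_R2_paired.fastq.gz",
            f"{out_prefix}_R2_unpaired.fastq.gz",
        ]
    else:
        raise ValueError("expected one (single-end) or two (paired-end) input files")
    return ["trimmomatic", mode, "-threads", str(threads), "-phred33",
            *inputs, *outputs, *steps]


# Example trimming steps; the adapter file and thresholds are assumptions.
steps = ["ILLUMINACLIP:adapters.fa:2:30:10", "SLIDINGWINDOW:4:15", "MINLEN:36"]
pe_cmd = trimmomatic_command(["sample_R1.fastq.gz", "sample_R2.fastq.gz"], "sample", steps)
se_cmd = trimmomatic_command(["sample.fastq.gz"], "sample", steps)
print(shlex.join(pe_cmd))
print(shlex.join(se_cmd))
```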

After trimming, it is best to check the quality again and confirm that the trimming did what was expected.

QuasR is an R package which provides a framework for the quantification and analysis of short reads. It allows for the complete pre-processing of raw reads, including alignment, quality control plots and the quantification of genomic regions of interest.

ShortRead is an R package that provides sampling, iteration and input of FASTQ files. It enables filtering, trimming and quality assessment of reads. The data are represented as DNAStringSet-derived objects, which can be easily manipulated.