About read alignment
After preprocessing, the raw reads are mapped to the reference genome. Once mapping is done, check its quality, in particular the percentage of mapped reads: this is an indicator of overall sequencing accuracy, and a low value can point to contaminating DNA.
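As a rough illustration of this check, the percentage of mapped reads can be computed from the FLAG column of a SAM file. The sketch below assumes plain SAM text as input; the function name `mapping_rate` and the example records are made up for illustration. It counts each primary record once, so for paired-end data each mate is counted separately.

```python
def mapping_rate(sam_lines):
    """Return the percentage of primary SAM records that are mapped."""
    total = 0
    mapped = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # column 2 of a SAM record is the bitwise FLAG
        if flag & 0x100 or flag & 0x800:
            continue  # skip secondary and supplementary alignments
        total += 1
        if not flag & 0x4:  # bit 0x4 is set when the read is unmapped
            mapped += 1
    return 100.0 * mapped / total if total else 0.0


# Hypothetical records: two mapped reads, one unmapped read,
# and one secondary alignment that is ignored.
example = [
    "@HD\tVN:1.6\tSO:unsorted",
    "r1\t0\tchr1\t100\t60\t50M\t*\t0\t0\t*\t*",
    "r2\t16\tchr1\t200\t60\t50M\t*\t0\t0\t*\t*",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    "r1\t256\tchr2\t300\t0\t50M\t*\t0\t0\t*\t*",
]
print(f"{mapping_rate(example):.1f}%")  # 66.7%
```

For a real file, the lines can be passed in directly, for example `mapping_rate(open("aligned.sam"))`, where `aligned.sam` is a placeholder file name.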
TopHat performs the alignment of short reads in two steps.
- Reads are first mapped unspliced to the genome with Bowtie, which locates the exons.
- Reads that remain unmapped are split into shorter segments, and the segments are aligned independently to identify exon junctions.
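The two steps above can be sketched with a toy example. This is not TopHat's actual algorithm: exact string matching stands in for Bowtie, the read is split at a single point instead of into fixed-length segments, and the sequences and the function name `find_junction` are invented for illustration.

```python
def find_junction(read, genome, min_anchor=4):
    """Toy spliced alignment.

    Returns (donor_end, acceptor_start) for a read that spans a junction,
    or None if the read maps unspliced or cannot be placed.
    """
    # Step 1: try to map the whole read without splicing.
    if genome.find(read) != -1:
        return None  # unspliced hit: the read supports an exon, not a junction

    # Step 2: split the read and align the two pieces independently,
    # keeping at least min_anchor bases on each side of the split.
    for split in range(min_anchor, len(read) - min_anchor + 1):
        left, right = read[:split], read[split:]
        left_pos = genome.find(left)
        if left_pos == -1:
            continue
        left_end = left_pos + len(left)
        # The right piece must align downstream of the left piece.
        right_pos = genome.find(right, left_end + 1)
        if right_pos != -1:
            return left_end, right_pos
    return None


# Hypothetical mini-genome: two exons separated by a GT...AG intron.
exon1 = "ACGTACGGTC"
intron = "GTAAGTTTTTTCAG"
exon2 = "TTGACCATGA"
genome = exon1 + intron + exon2

# A read covering the last 6 bases of exon1 and the first 6 bases of exon2.
read = exon1[-6:] + exon2[:6]
print(find_junction(read, genome))  # (10, 24)
```

The returned coordinates delimit the intron: `genome[10:24]` is the sequence skipped between the two aligned pieces.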
HISAT2 (hierarchical indexing for spliced alignment of transcripts 2) is a fast and sensitive aligner that is reported to give highly accurate results. It indexes the reference genome with a graph-based approach (a graph FM index), and its alignment implementation builds on Bowtie2.
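A minimal sketch of how indexing and alignment with HISAT2 might be driven from Python is shown below. It assumes `hisat2-build` and `hisat2` are installed and on the PATH, and that the reads are paired-end; the helper names and all file names are placeholders, not part of HISAT2.

```python
import shlex
import subprocess


def hisat2_build_cmd(reference_fasta, index_base):
    """Command that builds a HISAT2 index from a reference FASTA file."""
    return ["hisat2-build", reference_fasta, index_base]


def hisat2_align_cmd(index_base, reads_1, reads_2, out_sam, threads=4):
    """Command that aligns paired-end reads and writes SAM output."""
    return [
        "hisat2",
        "-p", str(threads),  # number of alignment threads
        "-x", index_base,    # basename of the index built above
        "-1", reads_1,       # file with mate 1 reads
        "-2", reads_2,       # file with mate 2 reads
        "-S", out_sam,       # output SAM file
    ]


def run(cmd):
    """Run a command and raise an error if it exits with a non-zero status."""
    subprocess.run(cmd, check=True)


# Placeholder file names; the commands are only printed here, not executed.
build = hisat2_build_cmd("genome.fa", "genome_index")
align = hisat2_align_cmd("genome_index", "sample_R1.fastq", "sample_R2.fastq", "sample.sam")
print(shlex.join(build))
print(shlex.join(align))
```

Calling `run(build)` followed by `run(align)` would execute the two steps. HISAT2 prints an alignment summary, including the overall alignment rate, to standard error when it finishes, which gives a first look at the percentage of mapped reads.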