HISAT2 no properly paired alignments
3.1 years ago
cengiz • 0

Hi All!

I'm a wet-lab guy quite new to data analysis and would appreciate some help if possible!

I'm slowly getting into the command line and understanding some of the workflow behind the analysis, but I've hit a bit of a wall. After running hisat2-build on the human genome (hg38), I run the following command and get the following output:

hisat2 -x genome/homo/homo.GRCm38 -U raw_data/NGS_211_553_fastq/_1_Cengiz.fastq.gz -S rna/lmsmc_ca_1.sam -p 3 -t

Time loading forward index: 00:00:09
Time loading reference: 00:00:05
Multiseed full-index search: 00:05:09
31656871 reads; of these:
  31656871 (100.00%) were unpaired; of these:
    375631 (1.19%) aligned 0 times
    29871475 (94.36%) aligned exactly 1 time
    1409765 (4.45%) aligned >1 times
98.81% overall alignment rate
Time searching: 00:05:17
Overall time: 00:05:26

Sounds great, right? To me this reads like I'm doing OK! Then I run samtools view sam > bam

then

samtools sort bam > sorted.bam

When I run samtools flagstat on the sorted file, I get:

samtools flagstat rna/lmsmc_ca_1.sorted.bam

34190647 + 0 in total (QC-passed reads + QC-failed reads)
2533776 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
33815016 + 0 mapped (98.90% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)

So I'm a little confused as to why I have no properly paired sequences. I'm fairly sure I have done something wrong along the way! Any advice would be appreciated.

Many thanks, Cengiz

3.1 years ago
GenoMax 147k

Proper pairing of sequences is only applicable when you have paired-end reads. If the two reads of a pair (which sample the same library fragment) align within a certain expected distance of each other, they are considered properly paired. Since you have single-end reads (you aligned with -U), pairing does not apply, and all the pairing-related counts in flagstat are expected to be 0.
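
To make this concrete, here is a minimal sketch of how the pairing information is stored in the FLAG column of each SAM record. The bit values (0x1 paired, 0x2 proper pair, 0x4 unmapped, 0x10 reverse strand, 0x100 secondary) come from the SAM specification; the function name describe_flag is only illustrative and is not part of samtools. Single-end reads never have bit 0x1 set, so flagstat cannot count them as "paired in sequencing" or "properly paired".

```shell
#!/usr/bin/env bash
# Decode a few bits of a SAM FLAG value (bit meanings from the SAM spec).
# describe_flag is a hypothetical helper written for this example only.
describe_flag() {
  local flag=$1
  local out=""
  if (( flag & 0x1 )); then out+="paired "; else out+="unpaired "; fi
  if (( flag & 0x2 )); then out+="proper_pair "; fi   # only meaningful if paired
  if (( flag & 0x4 )); then out+="unmapped "; fi
  if (( flag & 0x10 )); then out+="reverse "; fi
  if (( flag & 0x100 )); then out+="secondary "; fi
  echo "${out% }"   # strip the trailing space
}

describe_flag 0     # unpaired
describe_flag 16    # unpaired reverse
describe_flag 256   # unpaired secondary
describe_flag 99    # paired proper_pair
```

Flags such as 0, 16 and 256 are what you would typically see in a single-end alignment; a value like 99 only shows up when paired-end data were aligned as pairs.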


Oooohhh, that's what it meant by 'properly paired'. I thought it meant properly paired against my reference genome. So, assuming there is nothing to worry about, I will continue!

Thanks!


In your case, the following lines are the important ones. While you have some secondary alignments, your data is well aligned to the reference.

34190647 + 0 in total (QC-passed reads + QC-failed reads)
2533776 + 0 secondary
33815016 + 0 mapped (98.90% : N/A)
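
As a sanity check, the flagstat numbers can be reconciled with the HISAT2 summary using plain shell arithmetic. This is only a sketch: it assumes flagstat counts every SAM record (so secondary alignments inflate the total beyond the number of input reads), which is consistent with the numbers posted above. The variable names are made up for the example.

```shell
#!/usr/bin/env bash
# Numbers copied from the HISAT2 summary and the flagstat output in this thread.
reads=31656871        # input reads reported by hisat2
unaligned=375631      # reads that aligned 0 times
secondary=2533776     # secondary alignment records reported by flagstat

# Assumption: flagstat's total is one record per read plus the secondary records.
total=$(( reads + secondary ))
mapped=$(( total - unaligned ))
primary_mapped=$(( mapped - secondary ))

echo "total records:   $total"
echo "mapped records:  $mapped"
awk -v m="$mapped" -v t="$total" \
  'BEGIN { printf "flagstat mapped: %.2f%%\n", 100 * m / t }'
awk -v m="$primary_mapped" -v r="$reads" \
  'BEGIN { printf "hisat2 rate:     %.2f%%\n", 100 * m / r }'
```

The totals come out to 34190647 and 33815016, matching flagstat, and the two percentages (98.90% and 98.81%) differ only because flagstat's denominator includes the secondary records.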

I see! Thanks for the explanation :)
