How to know if my insert size distribution is normal?
2.8 years ago
ashoo-geno • 0

Dear all,

I have obtained the insert sizes of my whole-genome sequencing data (human blood samples sequenced on a NovaSeq 6000) using the "samtools stats" command. Now I would like to know whether the distribution of my insert sizes is normal.

What statistical tests or programs do you recommend I use?

Any help would be highly appreciated,

Thank you

WGS insert-size • 1.3k views

It won't pass any test for normality: there will always be reads with unusual insert distances. Just make a plot and check whether it looks bell-shaped (it will likely be skewed). If it does not look too crazy, it is OK.


Thanks a lot for your message. There are around 12,000 samples, so making plots is troublesome. Anyway, it is very good to know that it won't pass any tests; I posted my question here after being disappointed by the results of the statistical tests! I thought I was doing something wrong.
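
With this many samples, one option is to replace the visual check with a few numbers per sample. Below is a minimal sketch, not a definitive method. It assumes each sample's "samtools stats" output contains IS rows whose first two data columns are the insert size and the total pair count. The function names are made up for illustration, and what counts as an unusual median, MAD, or skewness has to be decided for your own library prep.

```python
import math


def read_insert_histogram(lines):
    """Collect (insert_size, pair_count) tuples from the IS rows.

    Assumes tab-separated rows of the form: IS, insert size, pairs total, ...
    """
    hist = []
    for line in lines:
        if line.startswith("IS\t"):
            fields = line.rstrip("\n").split("\t")
            hist.append((int(fields[1]), int(fields[2])))
    return hist


def weighted_quantile(hist, q):
    """Smallest value whose cumulative count reaches fraction q of the total."""
    total = sum(count for _, count in hist)
    if total == 0:
        raise ValueError("histogram contains no pairs")
    running = 0
    for value, count in sorted(hist):
        running += count
        if running >= q * total:
            return value
    return max(value for value, _ in hist)


def summarize(hist):
    """Median, MAD, mean, SD and skewness of a count-weighted histogram."""
    total = sum(count for _, count in hist)
    if total == 0:
        raise ValueError("histogram contains no pairs")
    mean = sum(size * count for size, count in hist) / total
    var = sum(count * (size - mean) ** 2 for size, count in hist) / total
    sd = math.sqrt(var)
    # Moment-based skewness; 0 for a symmetric histogram.
    if sd > 0:
        skew = sum(count * (size - mean) ** 3 for size, count in hist) / (total * sd ** 3)
    else:
        skew = 0.0
    median = weighted_quantile(hist, 0.5)
    # Median absolute deviation: median of |size - median|, weighted by count.
    mad = weighted_quantile([(abs(size - median), count) for size, count in hist], 0.5)
    return {"n_pairs": total, "median": median, "mad": mad,
            "mean": mean, "sd": sd, "skewness": skew}
```

For one sample this could be called as summarize(read_insert_histogram(open(path))), where path is that sample's saved stats file. Collecting the results for all samples into one table lets you sort by median or skewness and plot only the samples that stand out.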

2.8 years ago

You can map the data to a reference genome and use Qualimap: http://qualimap.conesalab.org/

Talk to your lab workers about what is "normal" for your samples; it depends on the kit used, the design, etc.


The samples are already mapped to hg38. I had looked at Qualimap before, but with around 12,000 samples, making plots with it looks troublesome. Also, my data is in CRAM format, and Qualimap accepts BAM or SAM (based on its manual). I want to call structural variants in these samples, and all I know is that the insert size distribution should be normal.

Thank you for your help.
