Dear all,
I have RNA-seq data and need to estimate the insert size for multiple samples. For this I am using the Picard tool CollectInsertSizeMetrics, which takes a BAM file (reads mapped to a reference) as input. I am unsure whether I should preprocess the data before estimating the insert size or work directly from the raw data. Which of the following approaches is appropriate?
- Map all the reads and then calculate the insert size?
- Perform quality filtering and adapter removal, map the reads, and then estimate the insert size?
- Perform only adapter removal and then estimate the insert size?
Each of these steps will change the result. I just want to follow the most accurate way to estimate the insert size.
Waiting for your reply.
Thank you in advance
You might want to take a look at this blog post: https://mikelove.wordpress.com/2016/09/26/rna-seq-fragment-sequence-bias/ and at the alpine Bioconductor package.
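If it helps to see what an insert-size estimate boils down to, here is a minimal Python sketch that reads SAM-formatted text and summarizes the TLEN column for properly paired reads. It is not how Picard or alpine computes anything; the function names `insert_sizes_from_sam` and `summarize` are made up for illustration, and the filtering rules (proper pairs only, first mate only, no secondary or supplementary alignments) are my own assumptions. Also keep in mind that for RNA-seq reads aligned to a genome, TLEN spans any introns between the mates, so the values can be much larger than the actual fragment length.

```python
import statistics

# SAM flag bits used below (from the SAM specification).
PAIRED = 0x1
PROPER_PAIR = 0x2
UNMAPPED = 0x4
MATE_UNMAPPED = 0x8
FIRST_IN_PAIR = 0x40
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def insert_sizes_from_sam(lines):
    """Collect |TLEN| once per properly paired fragment (hypothetical helper)."""
    sizes = []
    for line in lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:  # not a complete alignment record
            continue
        flag = int(fields[1])
        tlen = int(fields[8])
        # Assumption: keep only paired reads flagged as a proper pair.
        if not (flag & PAIRED and flag & PROPER_PAIR):
            continue
        # Drop unmapped reads/mates and non-primary alignments.
        if flag & (UNMAPPED | MATE_UNMAPPED | SECONDARY | SUPPLEMENTARY):
            continue
        # Count each fragment once by using the first mate only.
        if not flag & FIRST_IN_PAIR:
            continue
        if tlen == 0:  # TLEN of 0 means the length is not available
            continue
        sizes.append(abs(tlen))
    return sizes


def summarize(sizes):
    """Return the count, median, and mean of the collected insert sizes."""
    return {
        "n": len(sizes),
        "median": statistics.median(sizes),
        "mean": statistics.mean(sizes),
    }
```

One way to use it would be to pipe the output of `samtools view sample.bam` into a small script that passes `sys.stdin` to `insert_sizes_from_sam`. Running it on BAM files produced with and without trimming would show directly how much each preprocessing choice shifts the estimate for your data.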