With the aim to know the number of read depth that I have in each polymorphisms of my RNAseq data:
I have done an alignment using bowtie to alignm reads against a reference sequence:
bowtie2 -f -x scaffold_filt_PV014_OD_DB.fa PV002_cmv3.fa -S scaffold_nofilt_OD_PV002.sam
I change sam format to bam format:
samtools view -bS scaffold_nofilt_OD_PV002.sam > scaffold_nofilt_OD_PV002.bam
I removed unpaired reads: samtools view -F 0x04 -b scaffold_nofilt_OD_PV002.bam >scaffold_nofilt_OD_rm_PV002.bam
I select those with a map quality higher than 10
samtools view -q 10 -b scaffold_nofilt_OD_rm_PV002.bam > scaffold_nofilt_OD_MapQ10_PV002.bam
I sort and I index it:
samtools sort scaffold_nofilt_OD_MapQ10_PV002.bam scaffold_nofilt_OD_PV002_index
samtools index scaffold_nofilt_OD_PV002_index.bam
After to do this I get the coverage of each base pair using bedtools:
bedtools genomecov -ibam scaffold_nofilt_OD_PV002_index.bam -bga > coverage_PV002.txt
Moreover I got the number of polymorphisms using these functions:
samtools mpileup -g -f scaffold_filt_PV014_OD_DB.fa scaffold_nofilt_OD_PV002_index.bam > scaffold_nofilt_OD_PV002_index.bcf
bcftools view -v -c -g scaffold_nofilt_OD_PV002_index.bcf > scaffold_nofilt_OD_PV002_index.vcf
When I compare the read depth bedtools values against the bcftools values (of DP4 column) I got a big difference:
- With bcftools I get this type of read depth: 245,222,8,171,249,etc
- With bedtools I got read depths of 9248,9265,49971,50076.
I don't know what read depth value I should take
What is the difference?
Thanks in advantage.
Can you tell us how many reads did you lose after this step?
If you are purely looking to see how many reads cover a position then
samtools depth -b <BED list of positions or regions>
ormosdepth
(https://github.com/brentp/mosdepth ) will work well.I don't remove any read with that function. But I will used samtools depth. Thanks in advantage.