Entering edit mode
6.4 years ago
blur
▴
280
Hi,
I am trying to extract full genes from mpileup, but if I have a gene that is 100bp long and a few positions are uncovered by reads, mpileup just skips them. I tried to use a bed file (-l), but still some positions seem to be missing. Is there a flag/command to make mpileup report every position inside the bed file? I couldn't find it in the manual.
Thanks,
1) mpileup has moved to bcftools
2) what are you trying to do ?
Sorry, I didn't understand what you mean by " mpileup has moved to bcftools" - you mean the manual?
I want to get the whole gene, for downstream analysis. So, If the gene is 4 bp long I would like to see
chrX 1 A 4 ....
chrX 2 T 3 ..C
chrX 3 C 0
ChrX 4 G 5 ..CC.
but now I see only:
chrX 1 A 4 ....
chrX 2 T 3 ..C
ChrX 4 G 5 ..CC.
What is the next downstream step? I am just wondering whether the 0-coverage positions are really needed to be present or could be inferred by whatever tool/script/step happens next.
I want to create a graph of each gene and coverage along it (among other things). The pipeline cuts each file automatically so that each gene (by name, an addition made after the pileup step), is inserted into a separate file that is then run through an R script that creates graphs for each file. The graph needs to include the entire sequence, and shows other information (not just coverage) but as some positions are missing, it skews my position count and everything else that relies on it.
it's now
bcftools mpileup
in the recent versions of samtools/bcftoolshow about using
samtools depth
with option-a
?Thanks for the information on mpileup moving to bcftools. It now has a multithreading option, that is great news. Did you happen to test if the multithreaded output is identical to the SAMtools single-threaded version?