Entering edit mode
19 months ago
Chris
▴
340
Hi all,
I modified my output as a vcf file but not bcf as the instruction. Is that OK or the output is not correct? Thank you so much!
bcftools mpileup -f Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa finalBamFile2.bam | bcftools call -mv -Ob -o variants.vcf.gz
bcftools mpileup -f reference.fa alignments.bam | bcftools call -mv -Ob -o calls.bcf
https://samtools.github.io/bcftools/howtos/variant-calling.html
Thank you! So I don't have to run again which took me about half a day. I got a suggestion that uploads bam file to IGV to verify the result. I am checking mutation in a gene of interest. By looking at the location of this gene on IGV using the bam file, I did not find any variant. Would you please have a suggestion to find mutations in a gene of interest?
poor qualities at this location.
Thanks for the comment! Are these good qualities?
no. blank reads == poor quality.
Is this RNA or DNA? It looks to me as if what Pierre is interpreting as mapping quality 0 reads may be split reads. (the .junctions file also may indicate that this is the case).
You should be able to load the VCF into IGV as well (after using tabix to index the .vcf.gz)
This is the whole genome so DNA. When I load the vcf into IGV, I got this:
n index file for /variants2_unzip.vcf could not be located. An index is recommended to view files of this size. Click "Go" to create one now or "Cancel to proceed without an index. Maybe I need to use tabix as your suggestion.
Thank you! So could you give me a sample of good qualities? So white space means blank reads?
White-space blank reads mean mapping quality score of 0, which means there are other region(s) of the genome where this read maps equally well so there is no confidence that the mapping is correct.
So you mean grey is good quality and white blank between read in the same row means mapping quality score from tool such STAR, BWA is 0? A good quality sample should have all grey?
Grey does not necessarily mean quality. Grey means mapping quality (MQ) > 0, which for most aligners mean the mapping is unique. For DNAseq, in most data sets over 90% of reads have maximum mapping quality score (e.g., bwa MQ of 60), so a MQ of 10 is still poor. For a region with lots of MQ0 reads, the chances are many of those grey reads have low MQ as well.