24 months ago
BQ
Hi, my goal is to make a VCF file combining 30 samples. I trimmed all 30 samples, converted them to SAM/BAM files, and sorted them, each sample separately. Then I combined them, together with the reference genome, into a BCF and then a VCF file. Then I realized that the QUAL value on every line is low (probably all 0).
My question: which step did I get wrong? Or is this normal, or is it just a problem with the samples?
I used this pipeline: https://www.ebi.ac.uk/sites/ebi.ac.uk/files/content.ebi.ac.uk/materials/2014/140217_AgriOmics/dan_bolser_snp_calling.pdf
Here is what I got:
bwa aln? This is an old pipeline unless you're working with short reads (length < 50-70).
Did you run any QC on the FASTQ files? On the BAMs? What is the mean depth of coverage?
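For the depth-of-coverage question, here is a minimal sketch, assuming the per-position depth has already been written out with `samtools depth` (three tab-separated columns: reference name, position, depth). The function name `mean_depth` and the example lines are hypothetical and only for illustration.

```python
def mean_depth(lines):
    """Mean depth over the positions listed in `samtools depth` output lines."""
    total = 0
    positions = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Columns: reference name, 1-based position, depth at that position
        _chrom, _pos, depth = line.split("\t")[:3]
        total += int(depth)
        positions += 1
    return total / positions if positions else 0.0


# Made-up example lines, not real data from this thread
example = ["chr1\t1\t98", "chr1\t2\t101", "chr1\t3\t101"]
print(mean_depth(example))  # 100.0
```

By default `samtools depth` omits positions with zero coverage, so this average covers only the reported positions. Running it with `-a` reports every position and gives a genome-wide mean instead.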
I see, the BAM depth of coverage is ~1, which is probably not right. So I decided to use the BAM files I created earlier with bwa mem, which have a depth of coverage of ~100. However, after I combined all the samples (BAM) to call variants, the resulting VCF does not show the samples, i.e. there is no "sample1", "sample2" after "FILTER", "INFO", "FORMAT", only "bar". The QUAL and POS values seem fine this time. What did I do wrong and how should I fix it? Or is this correct?
My goal is to produce a vcf file for this group of samples, and do the same thing for the other group. Then I want to combine the two vcf files for a GWAS in plink. Am I doing it correctly?
This is the code I used:
1) Why a screenshot when you can copy-and-paste the text?
2) What is the use of `bcftools convert` here?
1) Sorry, I am new here. I thought a screenshot would be more visual.
2) To convert BCF to VCF.
Do you or anyone else know why there is only the one column "bar"? I would appreciate any help so that my project can proceed.
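One way to narrow this down is to compare the sample columns in the VCF header with the `SM` tags on the `@RG` header lines of each BAM, since bcftools takes sample names from the read groups when they are present. The sketch below assumes the headers have been dumped to text (for example with `samtools view -H` and `bcftools view -h`). That every BAM carries the same `SM:bar` tag is only a guess; nothing in this thread confirms it. The function names and header strings are hypothetical.

```python
def vcf_samples(header_text):
    """Return the sample names: the columns after FORMAT on the #CHROM line."""
    for line in header_text.splitlines():
        if line.startswith("#CHROM"):
            cols = line.split("\t")
            return cols[9:]  # the first 9 columns (CHROM..FORMAT) are fixed
    return []


def bam_sample_tags(sam_header_text):
    """Return the set of SM values found on the @RG lines of a SAM/BAM header."""
    samples = set()
    for line in sam_header_text.splitlines():
        if not line.startswith("@RG"):
            continue
        for field in line.split("\t")[1:]:
            if field.startswith("SM:"):
                samples.add(field[3:])
    return samples


# Made-up headers illustrating the suspected situation (an assumption)
vcf_header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tbar"
bam1_header = "@HD\tVN:1.6\tSO:coordinate\n@RG\tID:foo\tSM:bar"
bam2_header = "@HD\tVN:1.6\tSO:coordinate\n@RG\tID:foo\tSM:bar"

print(vcf_samples(vcf_header))
print(bam_sample_tags(bam1_header) | bam_sample_tags(bam2_header))
```

If every BAM reports the same `SM` value, the caller would treat all reads as one sample. In that case, giving each BAM its own read group (at alignment time with `bwa mem -R`, or afterwards with `samtools addreplacerg`) is one possible fix.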