4.3 years ago
evelyn
Hello,
I am using freebayes for variant calling using:
list=list.txt
freebayes -f transcripts.fasta -L "$list" \
--ploidy 2 \
--min-alternate-fraction 0.2 \
--min-alternate-count 2 \
--min-coverage 3 \
--min-mapping-quality 1 \
--min-base-quality 20 \
--report-genotype-likelihood-max \
--throw-away-complex-obs \
--throw-away-mnps-obs \
--throw-away-indel-obs > result.vcf
But it gives this error:
ERROR(freebayes): multiple samples (SM) map to the same read group (RG)
I have used picard to add read groups to the BAM files. Thank you!
As the error mentions, your BAM files list multiple samples under the same read group, which is not allowed. You need a different read group for each sample.
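To make that condition concrete, here is a rough sketch of how one might look for it. It assumes the `@RG` header lines follow the usual SAM layout (tab-separated `ID:` and `SM:` tags). The function name `find_shared_rg_ids` is made up for illustration, and this is only my reading of what the error means, not freebayes' own check.

```shell
# Read SAM header text on stdin and print every read-group ID that is
# attached to more than one sample name (SM), as: ID <tab> SM1,SM2,...
# find_shared_rg_ids is a hypothetical helper name, not part of any tool.
find_shared_rg_ids() {
    awk -F'\t' '
        $1 == "@RG" {
            id = ""; sm = ""
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^ID:/) id = substr($i, 4)
                if ($i ~ /^SM:/) sm = substr($i, 4)
            }
            if (id == "" || sm == "") next
            key = id SUBSEP sm
            if (!(key in seen)) {
                # first time this ID/SM pair is seen: record it
                seen[key] = 1
                count[id]++
                if (id in samples) samples[id] = samples[id] "," sm
                else samples[id] = sm
            }
        }
        END {
            for (id in count)
                if (count[id] > 1) print id "\t" samples[id]
        }
    '
}
```

With samtools installed, the headers of all BAM files in the list could be fed in at once, for example `while read -r bam; do samtools view -H "$bam"; done < list.txt | find_shared_rg_ids`. No output would mean no ID is shared between samples.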
I checked the read groups for the two samples that trigger the error using:
It shows different names for the two files, based on their respective sample names.
SM is sample, not read group.
Thank you! How can I add the read group information to BAM files using picard? Where can I find the read group information for the data I have?
OP is going off of the command I mentioned in their last post (the command was for a totally different purpose).
You should have different ID: entries for each SM: entry. Did you check that? The file name doesn't matter.
What is samtools view -H sample.bam | grep "RG:" showing?
It shows nothing for read group.
It would be @RG, not RG:.
I understand I need to add read group information, but I could not find where to look for it. How can I check what RGID, RGLB, RGPL, and RGPU are for my samples in order to add this information using picard?
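For the header check discussed above, a small sketch that matches on the `@RG` record type instead of grepping for `RG:` may help. The helper name `list_read_groups` is hypothetical. It assumes standard tab-separated SAM header lines and prints `.` for any tag that is absent.

```shell
# Read SAM header text on stdin and print one line per @RG record with
# the tags ID, SM, LB, PL, PU (tab-separated, "." when a tag is missing).
# Returns a non-zero status when the header has no @RG line at all.
# list_read_groups is a hypothetical helper name.
list_read_groups() {
    awk -F'\t' '
        function get(k) { return (k in tag) ? tag[k] : "." }
        $1 == "@RG" {
            split("", tag)    # reset the tag table for this record
            for (i = 2; i <= NF; i++)
                tag[substr($i, 1, 2)] = substr($i, 4)
            print get("ID") "\t" get("SM") "\t" get("LB") "\t" get("PL") "\t" get("PU")
            n++
        }
        END { exit (n == 0) }
    '
}
```

Typical use would be `samtools view -H sample.bam | list_read_groups || echo "no read groups in header"`. An empty result with the message would suggest the BAM file has no read groups yet.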
Here is some information about what those variables mean. You need to come up with those yourself from information you know about your data. It should be straightforward to do. Ask if you have questions.
Here is the picard command line example of how you would actually add them to your files.
Thank you! I got the fastq files from a collaborator. I don't have information about these variables. What is the best way to find this information?
Check the first link. Most of that information is going to come from fastq headers/file names etc.
I have checked the fastq file header:
The fastq file names were based on sample names when I got those.
You are going to have to make up some of this information since the fastq headers appear to have been rewritten. Make sure you do it right and use reasonable values. Some would be obvious: SM=Real_SampleName, PL=Illumina etc.
I have changed the SM but I am not sure how to change the ID, which freebayes is complaining about.
Give it a random value, or even set it so ID=SM. The requirement is that multiple SMs don't share the same ID.
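As a sketch of the ID=SM suggestion, the helper below prints (but does not run) a Picard `AddOrReplaceReadGroups` command for one BAM file. It is built on assumptions that are not from this thread: the sample name is taken from the file name, `lib1` and `unit1` are placeholder values for RGLB and RGPU, `picard.jar` sits in the current directory, and paths contain no spaces. The name `print_rg_command` is made up for illustration.

```shell
# Print a Picard AddOrReplaceReadGroups command for one BAM file, using
# the file's base name as both RGID and RGSM so that no two samples share
# an ID. RGLB=lib1 and RGPU=unit1 are placeholders, not real metadata.
# print_rg_command is a hypothetical helper; it only prints the command.
print_rg_command() {
    bam=$1
    sample=$(basename "$bam" .bam)   # assumes the file is named <sample>.bam
    out="${bam%.bam}.rg.bam"         # write the result next to the input
    printf 'java -jar picard.jar AddOrReplaceReadGroups I=%s O=%s RGID=%s RGSM=%s RGLB=%s RGPL=ILLUMINA RGPU=%s\n' \
        "$bam" "$out" "$sample" "$sample" "lib1" "unit1"
}
```

Something like `while read -r bam; do print_rg_command "$bam"; done < list.txt` would print one command per file for review before running anything. The list passed to freebayes would then need to point at the new `.rg.bam` files, and those usually need indexing first.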
evelyn, you have deleted multiple posts in the past, after you've received comments on them. That is bad etiquette. If that behavior is seen again, your account will be suspended.
You also have a lot of posts that have answers (some of which I am sure work and solve your problem), but none of them have been accepted. You either abandon your posts or delete them - that is an abuse of the site. Please go back and provide feedback on them. Pierre asked you to do this before, but you ignored that. Please consider this your final warning.