Question

How to find SNPs among samples

0

6.8 years ago

Kritika ▴ 270

Hi, I have WGS data for 6 samples. I have been asked to call SNPs among the samples instead of comparing them with a reference genome. I want to find out how similar the samples are. Usually we call SNPs by comparing with a reference (if available), but here the comparison has to be made among the samples. Do I need to go for de novo assembly? If yes, what would the next steps be?

Also, I have 11 GB RAM and 1 TB of disk space, so how much time will be required for 30X WGS data of a 2.0 GB genome?

SNP • 2.5k views

ADD COMMENT • link 6.8 years ago by Kritika ▴ 270

0

I am asked to do ...

Clarify with the person asking you to do things what they mean by their query, what the ultimate aim is, etc. Gain a better understanding of why you're doing what you do.

It sounds like de novo assembly, but try and find out what exactly and more importantly, why.

ADD REPLY • link 6.8 years ago by Ram 44k

0

Hi @Ram. Actually, the study is a WGS SNP analysis, but they are interested in only particular chromosomes. What they are asking for is a sample-by-sample comparison instead of a sample-by-reference comparison, so any SNPs they get should be relative to a sample, not to the reference. For example, I would treat sample 1 as the reference and sample 2 as the query, and then compare the SNPs obtained from this with the other samples.

ADD REPLY • link 6.8 years ago by Kritika ▴ 270

1

The closest I can think of is either tumor-normal analysis or de-novo/transmission analysis, both of which are analyses performed on top of calling SNVs. Using one sample as the ref is not advisable, as sequencing errors can cause a lot of noise. Ref sequences are reference for a reason - they have been well validated.

Also,

particular chromosomes

Use an interval list
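To illustrate what restricting to an interval list amounts to, here is a minimal Python sketch that keeps only the VCF records falling inside a set of intervals. The chromosome names and coordinates in `INTERVALS` are made up for the example, and the function names are hypothetical; in practice the variant caller or bcftools would normally apply the intervals for you.

```python
# Hypothetical interval list: (chromosome, start, end), 1-based and inclusive.
# Names and coordinates are placeholders, not real rice coordinates.
INTERVALS = [("chr1", 1, 1000), ("chr4", 1000, 2000)]


def in_intervals(chrom, pos, intervals):
    """Return True if the 1-based position lies inside any interval."""
    return any(c == chrom and start <= pos <= end for c, start, end in intervals)


def filter_vcf_lines(lines, intervals):
    """Keep header lines plus records whose CHROM/POS fall in the intervals."""
    kept = []
    for line in lines:
        if line.startswith("#"):
            # Header and meta lines are passed through unchanged.
            kept.append(line)
            continue
        fields = line.rstrip("\n").split("\t")
        if in_intervals(fields[0], int(fields[1]), intervals):
            kept.append(line)
    return kept
```

This scans every interval for every record, which is fine for a handful of chromosomes but would need an index for a long interval list.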

sample by sample comparison instead of sample by ref

That was my question - why?

ADD REPLY • link 6.8 years ago by Ram 44k

0

Actually, this is a rice WGS SNP analysis. My client wants to compare the sequences of sample 1 and sample 4 (without mapping them to the reference), then take those SNPs and compare them with sample 2, sample 3, sample 5 and sample 6, which have been mapped to the reference.

ADD REPLY • link 6.8 years ago by Kritika ▴ 270

0

You're telling me the what, I'm asking you about the why. Have you discussed with your client why they want to do this (their ultimate aim, as I mentioned earlier)? Unless this is common practice in plant bioinformatics, I do not think this approach makes sense.

ADD REPLY • link 6.8 years ago by Ram 44k

0

They want to see the differences among the samples, i.e. how much variation there is among them.

ADD REPLY • link 6.8 years ago by Kritika ▴ 270

0

And why can you not get to that by comparing each to the standard reference sequence?

ADD REPLY • link 6.8 years ago by Ram 44k

0

They want both a comparison with the reference and a sample vs sample comparison. My reference-based SNP comparison for all samples has already been given to them, but now they want a sample vs sample comparison.

ADD REPLY • link 6.8 years ago by Kritika ▴ 270

0

Explore using bcftools. The closest analysis that you can do, that would make sense here, is variants(X) - variants(A), where X and A are two samples, and variants(X) and variants(A) are the sets of all variants found in X and A, respectively.
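As a rough illustration of that set difference, here is a small Python sketch that treats each variant as a (CHROM, POS, REF, ALT) key and subtracts one sample's set from the other's. It assumes both samples were called against the same reference and that variants are represented consistently (for example, normalized indels). The function names are hypothetical, and for real data `bcftools isec` is the tool built for this kind of set operation.

```python
def variant_keys(vcf_lines):
    """Collect (chrom, pos, ref, alt) keys from VCF record lines."""
    keys = set()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header, meta and blank lines
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        # Split multi-allelic records so each ALT allele is compared separately.
        for allele in alt.split(","):
            keys.add((chrom, int(pos), ref, allele))
    return keys


def private_variants(x_lines, a_lines):
    """Return variants(X) - variants(A): variants seen in X but not in A."""
    return variant_keys(x_lines) - variant_keys(a_lines)
```

Swapping the arguments gives the variants private to A, and the intersection of the two key sets gives the shared ones. Note that this compares sites and alleles only; genotypes (heterozygous vs homozygous) are ignored in this sketch.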

If I were you, I'd talk to them and tell them how their request doesn't make sense, as comparing to something you built just reinforces errors and introduces biases.

ADD REPLY • link 6.8 years ago by Ram 44k

0

They believe that this sample could differ from the standard reference.

ADD REPLY • link 6.8 years ago by Kritika ▴ 270

1

a. Do not add an answer unless you're answering your original question.

b. They are free to believe what they want to, but unless they are bioinformatics experts as well as clients, they cannot tell you both what to do _and_ how to do it. Every sample analyzed ever differs from the standard reference, so comparing samples = comparing the way the samples differ from the reference, not aligning one sample from your dataset to another. That makes absolutely no sense.
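To make "comparing the way the samples differ from the reference" concrete, here is a minimal Python sketch that scores every pair of samples by the Jaccard similarity of their reference-based variant sets. Jaccard is just one possible choice of similarity measure, picked here for illustration, and the function names and sample labels are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Shared variants divided by all variants seen in either sample."""
    union = a | b
    # Two samples with no variants at all are treated as identical.
    return len(a & b) / len(union) if union else 1.0


def pairwise_similarity(variants_by_sample):
    """Map each unordered sample pair to the Jaccard similarity of its variant sets."""
    return {
        (s1, s2): jaccard(variants_by_sample[s1], variants_by_sample[s2])
        for s1, s2 in combinations(sorted(variants_by_sample), 2)
    }
```

With 6 samples this yields 15 pairwise scores, all derived from the calls already made against the standard reference, so no sample ever has to serve as a reference for another.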

ADD REPLY • link 6.8 years ago by Ram 44k