9.4 years ago
yuanhuang2012
Dear all,
I am having trouble with the input data format for the LDhat program. I'd like to generate sites and locs files for full sequence data, but I don't understand how to make a locs file. The example lpl_fn.locs is as follows:
61 9.73 L
0.106
0.11
0.145
0.325
0.479
0.736
1.216
1.22
1.286
1.547
1.571
1.828
1.939
2.131
2.5
2.619
2.987
2.996
3.022
3.248
3.609
3.723
3.843
4.016
4.343
4.346
4.418
4.426
4.509
4.576
4.872
4.935
5.085
5.168
5.441
5.554
5.56
5.687
6.25
6.595
6.678
6.718
6.772
6.863
7.315
7.344
7.36
7.413
7.754
8.089
8.285
8.292
8.393
8.533
8.537
8.644
8.755
8.852
9.402
9.712
9.721
61 is the number of sites and L specifies the model. What does 9.73 stand for?
Could you please explain what the column of numbers stands for?
Thank you very much!
Best regards,
Yuan
Thanks.
The sites file is as follows,
How can I know the total length of the region analyzed? I thought the total length was the length of the sequence, 61 bp. So I am still confused about what the numbers listed for each site stand for. Thank you very much.
You seem to misunderstand what sites are. Sites are the positions in your alignment where there are SNPs, so in your example you don't have 61 sites but 46 (I counted quickly, so this may be off). For example, there is no SNP at the 2nd position of your alignment (all bases are A), so there is no need to include this position in your input files. However, so that this information is not lost, the total length of your alignment (61) must still be written, and that is what the total length of the analyzed region is for. Moreover, you can give the site positions in bp or in kb.
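As a rough illustration of the point above, here is a minimal Python sketch that finds the variable columns of an alignment and writes them out in the same layout as the lpl_fn.locs example (number of sites, total length, model flag, then one position per line). The function names `segregating_sites` and `locs_text` and the toy alignment are made up for illustration and are not part of LDhat. It assumes all sequences are already aligned to the same length, and it does not treat gaps or ambiguous bases specially.

```python
def segregating_sites(seqs):
    """Return 1-based alignment columns where more than one base is observed."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same aligned length")
    # A column is a "site" only if the sequences differ there.
    return [i + 1 for i in range(length) if len({s[i] for s in seqs}) > 1]


def locs_text(positions, total_length, model="L"):
    """Build locs-style text: header line, then one position per line."""
    # Header layout copied from the example in this thread:
    # <number of sites> <total length of region> <model flag>
    header = f"{len(positions)} {total_length} {model}"
    return "\n".join([header] + [str(p) for p in positions]) + "\n"


# Toy alignment (hypothetical data): columns 2 and 4 are identical in all
# sequences, so they are not sites and are left out of the locs file.
alignment = ["AACGT", "AATGT", "GACGA"]
positions = segregating_sites(alignment)
print(locs_text(positions, len(alignment[0])), end="")
```

Here the positions are in bp and the total length is the full alignment length (5), not the number of variable columns (3). If you prefer kb, divide both the positions and the total length by 1000 so they stay on the same scale.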
Thank you so much! I did misunderstand what sites are. I totally get it now. I am going to make the sites file by filtering out the homozygous positions. Thanks for your patience!