3.3 years ago
순연
Hi. I have a problem: I want to compare the rs numbers in two VCF files and check which of those rs numbers are in the top 10 percent. I don't know how to do this. Could you tell me whether there are tools for it, or a blog post I should look at?
If you still need help, could you elaborate on what you mean by "which of the Rs numbers are in the top 10 percent"? Top 10% of what exactly? Do you simply want to know which rs IDs are shared between two VCF files?
Yes, I still need help...
After finding which rs IDs are shared (which is the easy part), how do you want to rank them? Do you mean to rank them by their allele frequency across all the samples in the two VCF files (i.e. prevalence)? Also, how big are your VCF files? BTW, I can read Korean, so if you prefer, you are more than welcome to comment in Korean :)
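I have been googling hard, but even comparing the two files is not easy, so I posted this question. I would appreciate help on which tools or blog posts I should look at in order to compare two files and find the rs numbers they have in common. Thank you.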
You still haven't answered my questions: 1) how you want to rank those common rs IDs and 2) how big the two files are. Therefore, here I will just show how you can find the rs IDs that are common between any two VCF files using Python and the pyvcf submodule I wrote:
The above script will list the common rs IDs between your VCF files:
Note that you will need to install the fuc package to use the above script (run $ conda install -c bioconda fuc).
Thank you, sir, and happy Korean Thanksgiving Day. 1) How I want to rank the common rs IDs: using the code you showed me, I want to find which rs IDs are duplicated in the two files, and then see which of those rs numbers fall in the top 10 percent.
In fact, I can't think of any way to do that other than Excel. Is there a better alternative?
2) How big the two files are: 1. The sample data is 2.45 GB in total and is split into per-chromosome files (chr1, chr2, and so on). chr1 is 200 MB, chr2 is 202 MB, and chr3 is 199 MB. 2. The file I want to compare against is a gnomAD file.
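As an alternative to Excel, the two steps discussed in this thread (find the shared rs IDs, then keep the top 10 percent) can be sketched in plain Python with only the standard library. This is not the fuc-based script from the answer above, just a minimal sketch under several assumptions: the rs IDs sit in the ID column of both files, the gnomAD file carries an AF= entry in its INFO column, and "top 10 percent" means the highest allele frequency, which was suggested earlier in the thread but never confirmed. The function names and file names are hypothetical.

```python
import gzip


def open_vcf(path):
    """Open a plain-text or gzip-compressed VCF file in text mode."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path, "r")


def iter_records(path):
    """Yield (rs_ids, info) for every data line, reading one line at a time."""
    with open_vcf(path) as handle:
        for line in handle:
            if line.startswith("#"):  # skip meta and header lines
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 8:  # not a complete VCF record
                continue
            # The ID column may hold several IDs separated by ';' or just '.'.
            rs_ids = [i for i in fields[2].split(";") if i.startswith("rs")]
            yield rs_ids, fields[7]


def read_rs_ids(path):
    """Collect every rs ID found in the ID column of a VCF file."""
    ids = set()
    for rs_ids, _info in iter_records(path):
        ids.update(rs_ids)
    return ids


def allele_frequency(info):
    """Return the largest value of the AF entry in an INFO string, or None.

    Assumption: for multi-allelic sites (AF=0.1,0.9) the maximum is used.
    """
    for entry in info.split(";"):
        if entry.startswith("AF="):
            values = []
            for value in entry[3:].split(","):
                try:
                    values.append(float(value))
                except ValueError:
                    pass  # e.g. AF=.
            return max(values) if values else None
    return None


def top_fraction_shared(sample_vcf, reference_vcf, fraction=0.10):
    """Rank shared rs IDs by the reference file's AF and keep the top fraction.

    Shared IDs whose reference record has no usable AF are left out.
    """
    sample_ids = read_rs_ids(sample_vcf)
    af_by_id = {}
    for rs_ids, info in iter_records(reference_vcf):
        af = allele_frequency(info)
        if af is None:
            continue
        for rs in rs_ids:
            if rs in sample_ids:
                af_by_id[rs] = max(af, af_by_id.get(rs, 0.0))
    ranked = sorted(af_by_id.items(), key=lambda kv: (-kv[1], kv[0]))
    keep = max(1, int(len(ranked) * fraction)) if ranked else 0
    return ranked[:keep]


# Example usage (hypothetical file names):
# shared = read_rs_ids("sample.chr1.vcf.gz") & read_rs_ids("gnomad.chr1.vcf.gz")
# top = top_fraction_shared("sample.chr1.vcf.gz", "gnomad.chr1.vcf.gz")
```

Because the files are split by chromosome, the functions can be run once per chromosome pair. Only the set of rs IDs from the sample file is held in memory, while the larger gnomAD file is streamed line by line, so files of a few hundred MB should be manageable.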