Hi,
I have called copy number and I have 2 files (I have shared a link to my files). One contains some ranges:
> head(cndata[,c(2,3,4)])
     Chromosome   Start      End
4470          1   51479   817980
4471          1  818499  1136753
4472          1 1138735  2558308
4473          1 2558740  5724264
4474          1 5724940  5749083
4475          1 5749226 12529544
I have another file like the one below:
           Chromosome Position
rs62635286          1    51479
rs75454623          1   114930
rs806731            1    30923
How can I count the number of SNPs in each range? For example, based on the second table, how many SNPs are in the range 51479 to 817980?
I'm not so comfortable with R, but I would probably do this with either bedtools (if you have bed files of ranges and positions) or with bcftools (if you have a vcf file for your SNPs and a bed file of ranges).
I don't have a bed file :(
There are a few filetypes in bioinformatics which are used A LOT, such as bed, vcf, sam/bam, gff, fasta/fastq. Always try to work with these formats, since there are so many options and tools which you can use to get your stuff done. Rolling your own solution in a scripting language like R/Python is definitely possible and you'll learn some coding while you're doing that, but it is unlikely to be the fastest way to get your problem solved.
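If the only obstacle is not having a bed file, both tables can be turned into BED lines with very little code. Below is a rough sketch in Python; the helper names `range_to_bed` and `snp_to_bed` are made up for illustration. BED coordinates are 0-based and half-open, so a 1-based position p becomes start p-1 and end p. Treating the Start/End columns from the question as 1-based and inclusive is an assumption that should be checked against the tool that produced them.

```python
def range_to_bed(chrom, start, end):
    """Format one range as a BED line.

    Assumes start/end are 1-based inclusive (an assumption, not stated
    in the thread), so only the start is shifted down by one.
    """
    return f"{chrom}\t{start - 1}\t{end}"


def snp_to_bed(snp_id, chrom, position):
    """Format one SNP as a 1 bp BED interval with the SNP id as name."""
    return f"{chrom}\t{position - 1}\t{position}\t{snp_id}"


# Rows copied from the question; chromosome kept as a string.
ranges = [("1", 51479, 817980), ("1", 818499, 1136753)]
snps = [("rs62635286", "1", 51479), ("rs75454623", "1", 114930)]

range_lines = [range_to_bed(*r) for r in ranges]
snp_lines = [snp_to_bed(*s) for s in snps]

print("\n".join(range_lines))
print("\n".join(snp_lines))
```

Once those lines are saved to two files (say `ranges.bed` and `snps.bed`, hypothetical names), something like `bedtools intersect -a ranges.bed -b snps.bed -c` should report the number of overlapping SNPs per range.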
We need to "merge with overlap/range" then "group by count". Searching for these terms should help you solve the problem.
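To make the "overlap, then count per group" idea concrete, here is a minimal sketch in plain Python using only the standard library. The function name `count_snps_per_range` is made up for illustration, and treating both Start and End as inclusive is an assumption; the rows are copied from the question.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


def count_snps_per_range(ranges, snps):
    """Count SNPs falling inside each (chrom, start, end) range.

    ranges: iterable of (chrom, start, end); both ends inclusive (assumption).
    snps:   iterable of (snp_id, chrom, position).
    Returns a list of (chrom, start, end, count) in input order.
    """
    # Group SNP positions by chromosome and sort them once.
    positions = defaultdict(list)
    for _snp_id, chrom, pos in snps:
        positions[chrom].append(pos)
    for chrom in positions:
        positions[chrom].sort()

    counts = []
    for chrom, start, end in ranges:
        chrom_positions = positions.get(chrom, [])
        # Binary search for the first position >= start and the
        # first position > end; the difference is the count.
        lo = bisect_left(chrom_positions, start)
        hi = bisect_right(chrom_positions, end)
        counts.append((chrom, start, end, hi - lo))
    return counts


ranges = [
    ("1", 51479, 817980),
    ("1", 818499, 1136753),
    ("1", 1138735, 2558308),
]
snps = [
    ("rs62635286", "1", 51479),
    ("rs75454623", "1", 114930),
    ("rs806731", "1", 30923),
]

result = count_snps_per_range(ranges, snps)
for row in result:
    print(*row, sep="\t")
```

With the three SNPs shown in the question, the first range gets a count of 2 (positions 51479 and 114930), because 30923 lies before its start; the other two ranges get 0.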
Hello F!
It appears that your post has been cross-posted to another site: https://bioinformatics.stackexchange.com/questions/8531/extracting-the-number-snp-in-each-range
This is typically not recommended as it runs the risk of annoying people in both communities.
Instead of tracing my activities in different forums, you guys please help me :(
You do not help yourself. I have lost count of the number of questions you have asked in recent days/weeks which show absolutely no attempt at the task, nor willingness to learn.
We are not here to write code for you.
We did help you. There are solutions in this thread. What's wrong with those?
My brain is not working nowadays :( The solutions are too complicated.
We're glad to put you on the track to a solution, but we're not going to do your work. I estimate that the solution described here would take you 15 minutes of concentration at most. By the way, come say hi on Slack if you have the time.
Soooo...you just expect others to do your work for you?