3.9 years ago
zizigolu
Hi
I have copy number segments from exome sequencing, like this:
> head(CN)
# A tibble: 6 x 6
file Chromosome Start End Total_CN Minor_CN
<chr> <chr> <int> <int> <int> <int>
1 sample1 51479 817980 2 0
How can I find out how many exons fall within each Start-End range?
If you are looking at intervals the answer is always bedtools.
I need the number of exons in each range.
Then bedtools intersect -c?
Search the forum to find out how to query for a list of exons and their genomic locations. Once you have that, you can use bedtools and R to get to your answer. You've been on the site long enough to know better than to ask for tailor-made solutions.
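For anyone unsure what the per-segment count from bedtools intersect -c means, here is a minimal pure-Python sketch of the same idea: for each segment, count the exons that overlap it. It is not bedtools itself. The function names (build_index, count_overlaps) and the toy coordinates are made up for illustration, and intervals are assumed to be 0-based, half-open (BED style).

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


def build_index(exons):
    """Index exons per chromosome.

    exons: iterable of (chrom, start, end), assumed 0-based half-open (BED style).
    Returns two dicts of sorted start and end positions, keyed by chromosome.
    """
    starts, ends = defaultdict(list), defaultdict(list)
    for chrom, start, end in exons:
        starts[chrom].append(start)
        ends[chrom].append(end)
    for chrom in starts:
        starts[chrom].sort()
        ends[chrom].sort()
    return starts, ends


def count_overlaps(index, chrom, seg_start, seg_end):
    """Count exons overlapping [seg_start, seg_end) on chrom."""
    starts, ends = index
    if chrom not in starts:
        return 0
    # Exons that start before the segment ends ...
    begun = bisect_left(starts[chrom], seg_end)
    # ... minus exons that end at or before the segment start.
    finished = bisect_right(ends[chrom], seg_start)
    return begun - finished


# Toy data, invented for illustration only.
toy_exons = [
    ("chr1", 100, 200),
    ("chr1", 150, 300),
    ("chr1", 400, 500),
    ("chr2", 100, 200),
]
toy_index = build_index(toy_exons)
print(count_overlaps(toy_index, "chr1", 180, 420))
print(count_overlaps(toy_index, "chr1", 200, 400))
```

Two things to watch for with real data: chromosome names must match between the two files ("chr1" versus "1"), and an exon shared by several transcripts is counted once per copy unless the exon list is de-duplicated first.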
Does this code give me human exon coordinates?
Are you expecting me to run the code and tell you or magically know the content of a file and how the code would alter it and predict that accurately?
You can run a few spot checks, right? Asking us for help is fine, but relying on us to do your job is just irresponsible.
No, I already ran the code and obtained a file like
I am asking whether these are human exon coordinates or not.
Please open a genome browser on NCBI/EnsEMBL/UCSC, go to one of the coordinates and find the gene there, look at its exons and see if things match up.
Now, do the same thing for 2-3 genes in different regions. If everything looks OK, your dataset is fine. If not, it's not fine.
The above needs to be done by you, me, anyone if they wish to verify the dataset. Why would you rather we do it than you?
Read the tableSchema to know what you're actually downloading. As for whether the code is actually grabbing what you want, you are capable of verifying that.
As an aside, using exact code you got from elsewhere without understanding what it does is a recipe for mistakes. Always verify your output manually.
Another option is to use the GENCODE GTF file directly. That might be easier to reproduce than the query based approach. Either way, you are perfectly capable of finding these solutions without asking us for help.
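As a rough illustration of the GTF route, the sketch below pulls exon coordinates out of GTF-formatted lines. It assumes the standard nine tab-separated GTF columns with 1-based inclusive coordinates and converts them to 0-based half-open (BED style). The function name exons_from_gtf and the sample lines are made up for illustration and are not taken from a real annotation.

```python
def exons_from_gtf(lines):
    """Return sorted, de-duplicated (chrom, start, end) tuples for exon features.

    lines: iterable of GTF text lines. GTF coordinates are 1-based inclusive,
    so start is shifted by one to give 0-based half-open intervals.
    """
    exons = set()
    for line in lines:
        if line.startswith("#"):
            continue  # header/comment line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue  # keep only exon features
        chrom, start, end = fields[0], int(fields[3]), int(fields[4])
        exons.add((chrom, start - 1, end))
    return sorted(exons)


# Invented sample lines, for illustration only.
sample_gtf = [
    "##description: toy example\n",
    'chr1\tTOY\tgene\t101\t500\t.\t+\t.\tgene_id "G1";\n',
    'chr1\tTOY\texon\t101\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
    'chr1\tTOY\texon\t101\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T2";\n',
    'chr1\tTOY\texon\t401\t500\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
]
toy_gtf_exons = exons_from_gtf(sample_gtf)
print(toy_gtf_exons)
```

For a real, gzip-compressed GTF the same function can be fed a file handle opened with gzip.open(path, "rt"). Spot-checking a few of the resulting intervals in a genome browser is still worthwhile.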