Hi,
To run the ActiveDriverWGS software I need the coding or non-coding parts of the genome in BED12 format. I have found the coding part of the genome (in txt format, though), but I don't know how to find the non-coding part of the genome in BED12 format, or transcription factor binding sites in BED4 format. I have contacted the developer but got no response. Any suggestions, please?
Are you sure that's what you want? The documentation says: "Regions of interest can be coding or noncoding should be in a BED12 format", so you basically need a BED file of the regions for which you want to do the analysis.
I also did not get the impression that TF binding sites are required; they might be nice to have, but for that you would have to identify the TF of interest first (and search e.g. ENCODE for the respective binding sites).
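For reference, here is a minimal Python sketch of what a BED12 record for one contiguous region of interest could look like. The 12 columns follow the standard BED layout (chrom, chromStart, chromEnd, name, score, strand, thickStart, thickEnd, itemRgb, blockCount, blockSizes, blockStarts). The helper name `to_bed12` and the example region are hypothetical, and whether ActiveDriverWGS accepts single-block records written this way is an assumption to check against its documentation.

```python
def to_bed12(chrom, start, end, name, strand="."):
    """Format one contiguous region as a single-block BED12 line.

    Coordinates are 0-based and half-open, as in any BED file.
    This is an illustrative helper, not part of ActiveDriverWGS.
    """
    if not 0 <= start < end:
        raise ValueError("BED needs 0 <= start < end")
    fields = [
        chrom, start, end, name,
        0,                  # score (placeholder)
        strand,             # "." when the strand is unknown
        start, end,         # thickStart, thickEnd span the whole region
        "0",                # itemRgb (placeholder)
        1,                  # blockCount: a single block
        f"{end - start},",  # blockSizes: the block covers the region
        "0,",               # blockStarts: relative to chromStart
    ]
    return "\t".join(str(f) for f in fields)


# Hypothetical region of interest
print(to_bed12("chr1", 1000, 1500, "region_1"))
```

A region made of several exons would need `blockCount` greater than 1, with matching comma-separated `blockSizes` and `blockStarts`.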
I don't know a lot about this software, but it appears to take "regions of interest" rather than whole-genome information: https://github.com/reimandlab/ActiveDriverWGS
I would also recommend using the UCSC Table Browser to get BED output for this information.
If you mean this GitHub issue #10, it looks like the developer responded?
Thank you. How could I find the non-coding part of the genome?
For example, when downloading this software we get the coding part of the genome (although in txt format),
but I don't know where I could find the non-coding part of the genome.
The non-coding part of the genome is everything which is not... coding. So it would essentially be the complement of the BED file of the coding sequences. But that is unlikely to be what you need for your tool. See also Friederike's comment: you just need regions of interest.
Thank you, but I have already calculated driver genes for the coding part of the genome with another software. Now I need to do the same for the non-coding part of the genome, for which I need a file containing the non-coding regions of the human genome, and I don't know how to get that.
No, it is unlikely that your tool just expects a BED file of all non-coding regions in the human genome. But anyway, if you insist, the answer is bedtools complement.
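To make the idea concrete: bedtools complement takes a sorted BED file (`-i`) and a genome file of chromosome sizes (`-g`) and reports every interval not covered by the BED file. The Python sketch below only illustrates that computation; it is not how bedtools is implemented, and the function name, chromosome sizes, and intervals are made-up examples.

```python
def complement(intervals, chrom_sizes):
    """Return the regions of each chromosome NOT covered by `intervals`.

    intervals   : iterable of (chrom, start, end), 0-based half-open
    chrom_sizes : dict mapping chromosome name to its length
    Illustrative only; the names and data here are hypothetical.
    """
    by_chrom = {}
    for chrom, start, end in intervals:
        by_chrom.setdefault(chrom, []).append((start, end))

    gaps = []
    for chrom in sorted(chrom_sizes):
        cursor = 0  # end of the covered stretch seen so far
        for start, end in sorted(by_chrom.get(chrom, [])):
            if start > cursor:
                gaps.append((chrom, cursor, start))
            cursor = max(cursor, end)  # handles overlapping intervals
        if cursor < chrom_sizes[chrom]:
            gaps.append((chrom, cursor, chrom_sizes[chrom]))
    return gaps


# Toy example: "coding" intervals on a 1000 bp chromosome
coding = [("chr1", 100, 200), ("chr1", 150, 300), ("chr1", 500, 600)]
for region in complement(coding, {"chr1": 1000}):
    print(*region, sep="\t")
```

Note that the result is plain three-column intervals; turning them into BED12 would be a separate formatting step.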
Sorry, what is the input here when the expected output is the non-coding regions in BED12?
Spend some time reading our comments here and the documentation of bedtools complement. I'm not coming to sit next to you and do your work.
:(
The same story
You only once sat next to me and did my work, when I was in Germany for an interview
you and Genomax
Thank you
Well, I'm sure you can figure this out :-)
Sorry,
The coding and non-coding regions of the human genome are likely here:
https://www.gencodegenes.org/human/release_19.html
I have converted the GTF to BED with bedops,
so I have this
How could I extract the information below from this BED file, for example going from the first line to the form shown below, for the whole file?
I asked my question in another forum, but they closed my post :(
I tried reading my BED file as a txt file to extract what I want, but I got an error.
You are mixing up terminology.
'Coding' and 'non-coding' are confusing terms, because they can mean multiple things. In transcriptomics, people subgroup transcripts into coding and non-coding transcripts, meaning "do these RNA molecules get translated to a protein?". Here, a non-coding transcript is any transcript that does not lead to a protein (as far as we know!).
In genomics, however, regions of the DNA are subgrouped into coding and non-coding, roughly meaning "does this sequence get transcribed to an RNA molecule?". Here, a non-coding fragment is any piece of DNA that does not lead to a transcript (as far as we know!).
I'd suggest being complete with regard to what you are looking for. I don't like the term "non-coding transcript". For me it is a "non-protein-coding transcript". The transcript is coding (= has a functional product), but it just doesn't create a protein.
It seems to me you are looking for non-coding DNA regions, while what you found on Gencode are non-protein-coding transcripts.
(Note that my comment here ignores biological noise: random transcription without function. The extent of this phenomenon is an open debate.)
I second everything Wouter wrote. I think we need to clarify what types of regions you actually want to look at using the tool (not what the tool says it needs, tell us what the goal of your analysis is).
Thank you
Is "Long non-coding RNA gene annotation" non-coding DNA regions?
Wouter has addressed precisely that question.
DNA: