Hello
If I have a list of genes, how can I get their sequences in FASTA format, like below?
>BC200
GGCCGGGCGCGGTGGCTCACGCCTGTAATCCCAGCTCTCAGGGAGGCTAAGAGGCGGGAGGATAGCTTGAGCCCAGGAGTTCGAGACCTGCCTGGGCAATATAGCGAGACT
>NPPA-AS1
TGCTGGTCAGAGGTCCTGGGGGTGGTTTTGAACCATCAGAGCTTGGACTTTTCTGACTTCCCCAGCAAGGATCTTCCCACTTCCTGCTCCCTGTGTTCCCACCC
Have you tried the following?
bedtools getfasta -fi <genome_fasta> -bed <your_bed>
Where in the code are you defining the list of genes?
You have been on this forum long enough to know that you need to ask far more detailed questions than this.
What list of genes? What identifiers? Where do you want to get the sequences from? What organism? Do you know the genome?
From your comment, it seems you understand that you need a tool which can take the gene list - have you tried looking for such tools?
Yup, I definitely have tools that I can tell you about but I would want to know the things Joe has listed above, plus:
sequence type (cDNA/genomic/CDS)
number of IDs
Thank you so much
I have a list of long non-coding RNAs (lncRNAs) and their possible target genes for the human genome build hg19.
There is a web tool, http://www.cuilab.cn/lnctar, which takes the FASTA sequences of lncRNAs and their candidate targets and reports which gene is most likely the target of a given lncRNA.
I have the list of lncRNAs and even their genomic coordinates. I am not sure whether they are coding or not, but I need their sequences in FASTA format for the prediction tool mentioned above.
I have something like below in my hand
Thank you for any help
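Since genomic coordinates are available, here is a rough Python sketch of what a coordinate-to-FASTA step does, in case it helps to see the logic spelled out. It assumes a tab-separated BED-like file (chrom, start, end, name, score, strand) and a genome FASTA small enough to hold in memory. The function names `read_fasta` and `bed_to_fasta` and the toy data are made up for illustration. For a real hg19 genome, `bedtools getfasta` with its `-name` and `-s` options is probably the better choice.

```python
from typing import Dict, Iterable, List

# Complement table used to reverse-complement minus-strand features.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA lines into a {sequence_name: sequence} dictionary."""
    chunks: Dict[str, List[str]] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Keep only the first word of the header as the sequence name.
            name = line[1:].split()[0]
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line)
    return {key: "".join(parts) for key, parts in chunks.items()}


def bed_to_fasta(genome: Dict[str, str], bed_lines: Iterable[str], width: int = 60) -> str:
    """Return FASTA text for each BED interval (0-based start, end excluded)."""
    records = []
    for line in bed_lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        # Fall back to the coordinates when no name column is present.
        name = fields[3] if len(fields) > 3 else f"{chrom}:{start}-{end}"
        strand = fields[5] if len(fields) > 5 else "+"
        seq = genome[chrom][start:end]
        if strand == "-":
            seq = seq.translate(_COMPLEMENT)[::-1]
        wrapped = "\n".join(seq[i:i + width] for i in range(0, len(seq), width))
        records.append(f">{name}\n{wrapped}\n")
    return "".join(records)


# Toy data only: a 10 bp "chromosome" and two invented features.
toy_genome = read_fasta([">chr1 toy", "ACGTA", "CGTAC"])
toy_bed = ["chr1\t2\t6\tgeneA\t0\t+", "chr1\t0\t3\tgeneB\t0\t-"]
print(bed_to_fasta(toy_genome, toy_bed), end="")
```

Note that this returns the genomic sequence across the whole interval, introns included. If the prediction tool expects spliced transcript sequences, the input would need exon blocks (for example BED12) rather than a single start and end per gene.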
At least this much information should have been in your opening post. Please don't make us tell you again.
How many lines in this file, approximately?
Thank you @Emily_Ensembl. In total I have 8000 lncRNAs, but only 200 of those were differentially expressed between two groups of patients, and those are the ones for which I want to predict target genes.
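To cut the 8000 entries down to the 200 of interest before fetching any sequences, something like the sketch below might be enough. It assumes the coordinates are in a tab-separated file with the lncRNA name in the fourth column and the differentially expressed IDs are a plain list of names. The function name `subset_bed_by_ids` is hypothetical and the coordinates shown are placeholders, not real hg19 positions.

```python
from typing import Iterable, List, Tuple


def subset_bed_by_ids(bed_lines: Iterable[str], wanted_ids: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Keep BED lines whose name column (4th field) is in wanted_ids.

    Returns the kept lines and a sorted list of requested IDs that were
    never found, so naming mismatches are easy to spot.
    """
    wanted = {identifier.strip() for identifier in wanted_ids if identifier.strip()}
    kept: List[str] = []
    seen = set()
    for line in bed_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) > 3 and fields[3] in wanted:
            kept.append(line.rstrip("\n"))
            seen.add(fields[3])
    missing = sorted(wanted - seen)
    return kept, missing


# Placeholder coordinates for illustration only.
all_lncrnas = [
    "chrA\t100\t300\tBC200\t0\t+",
    "chrB\t500\t900\tNPPA-AS1\t0\t-",
    "chrC\t50\t80\tOTHER_LNC\t0\t+",
]
de_ids = ["BC200", "NPPA-AS1", "NOT_IN_FILE"]

kept_lines, missing_ids = subset_bed_by_ids(all_lncrnas, de_ids)
print("\n".join(kept_lines))
print("IDs not found:", missing_ids)
```

Checking the "not found" list is worth doing because lncRNA names often differ between annotation sources.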