3.2 years ago
asalimih
Hi, I have a BED file containing the exons of genes. The name field holds the gene name (like ENSG***). When I run bedtools getfasta, I get the sequence of each exon separately. Is there a standard way to concatenate the sequences that share the same gene name, or should I write a script to do this manually on the FASTA files?
While reading the bedtools documentation I found the -split switch, but it only applies to the BED12 file format (link), and my BED files are not BED12.
Thanks in advance
You might try something like this with AGAT
This produced an empty file. I assume file.fasta is the genome. Here is a demonstration of my BED file:
OK, that is because the first command creates gene features only, and the second removes gene features that have no sub-feature such as mRNA, transcript, or exon. So it should work like this:
Yes, file.fasta is the genome from which the sequences will be extracted.
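If you would rather stay with the per-exon output of bedtools getfasta, the manual route from the question can be sketched in a few lines of Python. This is only a minimal sketch, not a standard tool: it assumes the FASTA headers begin with the gene name (as when getfasta is run with -name), optionally followed by "::" plus coordinates or a strand suffix like "(+)". The function names read_fasta, gene_name, concat_by_gene and main are made up for this example.

```python
import sys


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def gene_name(header):
    """Assumption: the gene name is everything before '::' or '('."""
    return header.split("::")[0].split("(")[0].strip()


def concat_by_gene(lines):
    """Join exon sequences that share a gene name, in file order."""
    genes = {}  # dicts keep insertion order, so genes stay in file order
    for header, seq in read_fasta(lines):
        genes.setdefault(gene_name(header), []).append(seq)
    return {gene: "".join(parts) for gene, parts in genes.items()}


def main(in_path, out_path):
    """Read per-exon FASTA from in_path, write one record per gene."""
    with open(in_path) as handle:
        merged = concat_by_gene(handle)
    with open(out_path, "w") as out:
        for gene, seq in merged.items():
            out.write(f">{gene}\n{seq}\n")


if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv[1], sys.argv[2])
```

Exons are joined in the order they appear in the FASTA, so the BED file needs to be sorted in transcript order beforehand. For minus-strand genes extracted with -s, that means descending coordinates. If the BED file mixes exons from several transcripts of the same gene, overlapping exons would be concatenated more than once, so merging them first may be necessary.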