6.6 years ago
alireza346
I am trying to get the sequence on both sides of the coding sequence (CDS) stop site for all genes, to use for alignment: 40 nt before the stop site (in the CDS) and 50 nt downstream of the stop site (in the 3' UTR). Do you know how I can do that correctly?
It's unclear which data format you have.
I have RNA-seq data and want to align it to this part of the transcriptome.
That is important information you should have mentioned in your question. Why do you think this is a good idea?
Anyway, to get this done you would first need a GTF/GFF of your organism and isolate the stop sites. Then you would make a BED file with -40 nt and +50 nt intervals for each stop site. Then you could use bedtools getfasta to get the nucleotide sequence of those intervals from the reference genome.
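The interval step above could look roughly like the Python sketch below. It is only an illustration under some assumptions that are not in the original post: the annotation is a GTF with explicit stop_codon features (as in Ensembl/GENCODE-style files), the window includes the stop codon itself (40 nt upstream of it plus 50 nt downstream of it), and the function name stop_windows is made up for this example. It also works purely in genomic coordinates, so an intron close to the stop codon would end up inside the window, and a stop codon split across two exons would produce two separate windows.

```python
def stop_windows(gtf_lines, upstream=40, downstream=50):
    """Yield BED6 tuples around each stop_codon feature found in GTF lines.

    GTF coordinates are 1-based and inclusive; BED is 0-based and half-open.
    """
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "stop_codon":
            continue
        chrom, strand = fields[0], fields[6]
        start, end = int(fields[3]), int(fields[4])
        if strand == "+":
            # CDS lies to the left of the stop codon, 3' UTR to the right
            bed_start = start - 1 - upstream
            bed_end = end + downstream
        elif strand == "-":
            # On the minus strand the directions are mirrored
            bed_start = start - 1 - downstream
            bed_end = end + upstream
        else:
            continue  # unstranded features cannot be oriented
        name = f"{chrom}:{start}-{end}({strand})"
        # Clamp at 0 so windows near the contig start stay valid
        yield (chrom, max(bed_start, 0), bed_end, name, 0, strand)


if __name__ == "__main__":
    # Two made-up GTF records, for demonstration only
    demo = [
        'chr1\tdemo\tstop_codon\t1001\t1003\t.\t+\t0\tgene_id "geneA";\n',
        'chr1\tdemo\tstop_codon\t2001\t2003\t.\t-\t0\tgene_id "geneB";\n',
    ]
    for row in stop_windows(demo):
        print("\t".join(str(x) for x in row))
```

If the printed lines are saved as, say, stops.bed, something like bedtools getfasta -fi genome.fa -bed stops.bed -s -fo stops.fa should extract the sequences (file names are placeholders). The -s option matters here: it reverse-complements minus-strand intervals, so every sequence reads in the CDS-to-UTR direction.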