Generating GTF files with replaced features
4.2 years ago
abiuma ▴ 30

I need to extract the rows of a tab-delimited GTF annotation whose feature type (column 3) is transcript, and change that feature type from transcript to exon.

Although the command runs without any error, it does not replace transcript with exon. Can anyone tell me what mistake I am making here?

awk 'BEGIN{FS="\t"; OFS="\t"} $3 == "transcript"{ print; $3="exon"; $9 = gensub("(transcript_id\\s\"{0,1})([^;\"]+)(\"{0,1});", "\\1\\2_premrna\\3;", "g", $9); print; next}{print}'  refdata-cellranger-GRCh38-1.2.0/genes/genes.gtf > GRCh38-1.2.0.premrna.gtf

It looks like 10x has changed the awk command in their example and, as you found out, the new version does not seem to work.

Can you try (this was what was there before):

awk 'BEGIN{FS="\t"; OFS="\t"} $3 == "transcript"{ $3="exon"; print}'  genes.gtf > GRCh38-1.2.0.premrna.gtf

If this works, can you email 10x support and let them know that their current example does not seem to work? Please update this thread when you hear back from them.
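
To see what the shorter command does before running it on the full annotation, it can be tried on a few made-up rows. The sketch below is only an illustration: the function names `relabel_transcripts` and `sample_gtf` and the three sample records are invented for this example and are not taken from the 10x file.

```shell
#!/bin/sh
# Hypothetical helper: keep only rows whose feature type (column 3) is
# "transcript" and relabel them as "exon". Reads GTF from stdin.
relabel_transcripts() {
    awk 'BEGIN{FS="\t"; OFS="\t"} $3 == "transcript"{ $3="exon"; print }'
}

# Three made-up GTF rows (gene, transcript, exon) for a single locus.
sample_gtf() {
    printf 'chr1\ttoy\tgene\t100\t900\t.\t+\t.\tgene_id "G1";\n'
    printf 'chr1\ttoy\ttranscript\t100\t900\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n'
    printf 'chr1\ttoy\texon\t100\t300\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n'
}

sample_gtf | relabel_transcripts
```

On this toy input only the transcript row is printed, now labelled exon and still spanning 100 to 900. The gene row and the original exon row are not written, so the output is expected to have fewer lines than the input.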


Thanks! The command partially worked. The feature type was replaced from transcript to exon, but the file size has decreased. The original file was 923313 KB and the file obtained after running the command is 52000 KB. The original file has 1780460 lines and the output file has 118158 lines.
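
A drop in size is what a filter of this shape would produce, since the pattern `$3 == "transcript"` prints only transcript rows and discards everything else. One way to check is to compare the number of transcript rows in the input with the number of lines in the output. The sketch below assumes the file names used earlier in the thread and wraps the count in a hypothetical helper, `count_feature`; it uses only plain awk, with no GNU-specific functions.

```shell
#!/bin/sh
# Hypothetical helper: count the rows of a GTF file ($1) whose feature
# type in column 3 equals $2. Prints 0 when there are none.
count_feature() {
    awk -F'\t' -v feature="$2" '$3 == feature { n++ } END { print n + 0 }' "$1"
}

# File names assumed from the thread; adjust to your own paths.
in_gtf=genes.gtf
out_gtf=GRCh38-1.2.0.premrna.gtf

if [ -f "$in_gtf" ] && [ -f "$out_gtf" ]; then
    echo "transcript rows in input: $(count_feature "$in_gtf" transcript)"
    echo "exon rows in output:      $(count_feature "$out_gtf" exon)"
    echo "total lines in output:    $(wc -l < "$out_gtf")"
fi
```

If the three numbers agree, that would suggest the smaller file simply contains one relabelled row per transcript and that nothing was lost beyond the rows the filter was meant to drop.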
