4.2 years ago
abiuma
I need to extract GTF annotation rows for transcripts based on the feature type transcript (column 3) of the original tab-delimited GTF and replace the feature type from transcript to exon.
Although I don't get any error when executing the command, it does not replace transcript with exon. Can anyone tell me what mistake I am making here?
awk 'BEGIN{FS="\t"; OFS="\t"} $3 == "transcript"{ print; $3="exon"; $9 = gensub("(transcript_id\\s\"{0,1})([^;\"]+)(\"{0,1});", "\\1\\2_premrna\\3;", "g", $9); print; next}{print}' refdata-cellranger-GRCh38-1.2.0/genes/genes.gtf > GRCh38-1.2.0.premrna.gtf
It looks like 10x has changed the awk command, and as you found out, it does not seem to work. Can you try this (it is what was there before):
If this works, can you email 10x support and let them know that their current example does not seem to work? Update this thread when you hear back from them.
Thanks! The command partially worked. The feature type was replaced from transcript to exon, but the file size has decreased. The original file was 923313 KB and the file obtained after execution is 52000 KB. The original file has 1780460 lines, while the file produced by the command has 118158 lines.
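One way to check whether the smaller output is expected is to count how many rows of each feature type the original GTF has. The sketch below assumes the command that worked keeps only the rows where column 3 is transcript and relabels them as exon (that command is not shown in this thread, so this is a guess at its behaviour). The four-line sample GTF is made up for illustration. If the assumption holds, the output should have exactly as many lines as the input has transcript rows, so a drop in line count would not by itself mean data was lost.

```shell
# Tiny made-up GTF (tab-delimited) standing in for genes.gtf
gtf=$(mktemp)
printf 'chr1\tsrc\tgene\t1\t100\t.\t+\t.\tgene_id "g1";\n' > "$gtf"
printf 'chr1\tsrc\ttranscript\t1\t100\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n' >> "$gtf"
printf 'chr1\tsrc\texon\t1\t50\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n' >> "$gtf"
printf 'chr1\tsrc\texon\t60\t100\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n' >> "$gtf"

# Count rows per feature type (column 3), skipping '#' header lines
counts=$(awk 'BEGIN{FS="\t"} !/^#/ {n[$3]++} END{for (f in n) print f "\t" n[f]}' "$gtf" | sort)
echo "$counts"

# Assumed behaviour of the filter: keep only transcript rows, relabel as exon
out=$(awk 'BEGIN{FS="\t"; OFS="\t"} $3 == "transcript" {$3 = "exon"; print}' "$gtf")
echo "$out"
```

To try it on the real data, point the counting command at refdata-cellranger-GRCh38-1.2.0/genes/genes.gtf and compare the transcript count with the 118158 lines in the output file.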