3.2 years ago
Shraddha
Hi all,
I'm trying to generate feature count files for the DESeq2 pipeline, but I've run into an issue while using featureCounts.
I see that the gene IDs I need aren't in the same format as the rest of the attributes; they sit inside the Dbxref section instead. How can I extract just the gene ID so that featureCounts will produce an output?
Thanks and kind regards
Thanks for your response! I tried using 'gene' with the -g flag, but it gave me unsatisfactory results (no features were found for any of my samples). I would hypothesize that the gene ID should be just the number, without the LOC. I was doing a long-winded series of awk commands to execute your second alternative, but this is far neater. Thanks again!
Yes, if you come across any LOC-style identifiers, you can be sure that the numeric suffix is the GeneID.
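For anyone landing here later, this is a rough Python sketch of the extraction discussed above, in case the awk route gets unwieldy. It assumes a tab-separated GFF3 file whose ninth column holds `key=value` pairs separated by semicolons, with an entry such as `Dbxref=GeneID:<number>,...`. The function names, the `gene_id` attribute name, and the example line are my own inventions for illustration, not something taken from the original annotation.

```python
def extract_gene_id(attributes):
    """Return the numeric GeneID found in the Dbxref entry, or None."""
    for field in attributes.strip().split(";"):
        key, sep, value = field.strip().partition("=")
        if key == "Dbxref" and sep:
            # Dbxref can hold several comma-separated cross-references.
            for xref in value.split(","):
                db, _, accession = xref.partition(":")
                if db == "GeneID":
                    return accession
    return None


def add_gene_id(line):
    """Append gene_id=<GeneID> to column 9 of a GFF3 feature line.

    Comment lines, malformed lines, and lines without a GeneID are
    returned unchanged.
    """
    if line.startswith("#"):
        return line
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        return line
    gene_id = extract_gene_id(cols[8])
    if gene_id is None:
        return line
    # "gene_id" is an assumed attribute name; pick whatever suits you.
    cols[8] = cols[8].rstrip(";") + ";gene_id=" + gene_id
    return "\t".join(cols) + "\n"


if __name__ == "__main__":
    # Made-up example line, not from a real annotation file.
    example = (
        "chr1\tGnomon\tgene\t100\t900\t.\t+\t.\t"
        "ID=gene-LOC100001;Dbxref=GeneID:100001;Name=LOC100001\n"
    )
    print(add_gene_id(example), end="")
```

If this works on your file, the added attribute name (`gene_id` in this sketch) is what you would then give to the -g flag. I have not tested it against your annotation, so check a few lines of the output before running the full count.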