Hi!
I have quant.sf files generated by Salmon by mapping 100 bp single-end Illumina libraries against primary transcripts. The genome assembly for this species is incomplete and is missing some genes of interest, so I am using both transcriptome and genome data for this RNA-seq analysis.
Before passing this quantification data downstream, I needed to run tximport, so I generated a table mapping transcript IDs to gene IDs from a GFF3 file based on the genome annotation. I then realized that many of the primary transcripts were missing from the genome.
I was going to add each missing transcript ID to the table, paired with an arbitrary gene ID. Is this okay? What would be the standard protocol for dealing with this?
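To make the idea concrete, here is a minimal sketch of that padding step in Python (not tximport itself, and not a claim about what the standard protocol is). It assumes the GFF3 stores transcripts as `mRNA` or `transcript` features with plain `ID=` and `Parent=` attributes, and that the transcript IDs match the `Name` column of quant.sf exactly. The function names, the `UNANNOTATED_` prefix, and the sample records are all made up for illustration.

```python
import csv
import io


def tx2gene_from_gff3(gff3_text):
    """Map transcript ID -> gene ID using mRNA/transcript features in a GFF3."""
    mapping = {}
    for line in gff3_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and directives/comments
        cols = line.split("\t")
        if len(cols) != 9 or cols[2] not in ("mRNA", "transcript"):
            continue
        # Column 9 holds semicolon-separated key=value attributes.
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        if "ID" in attrs and "Parent" in attrs:
            mapping[attrs["ID"]] = attrs["Parent"]
    return mapping


def quant_names(quant_sf_text):
    """Return the transcript names from the Name column of a quant.sf file."""
    reader = csv.DictReader(io.StringIO(quant_sf_text), delimiter="\t")
    return [row["Name"] for row in reader]


def pad_missing(mapping, names, prefix="UNANNOTATED_"):
    """Give every transcript absent from the mapping its own placeholder gene ID."""
    padded = dict(mapping)
    for name in names:
        if name not in padded:
            # Hypothetical convention: one placeholder gene per missing transcript.
            padded[name] = prefix + name
    return padded


# Made-up example data: tx1 is annotated in the genome, tx2 is not.
GFF3 = (
    "##gff-version 3\n"
    "chr1\tsrc\tgene\t1\t1000\t.\t+\t.\tID=gene1\n"
    "chr1\tsrc\tmRNA\t1\t1000\t.\t+\t.\tID=tx1;Parent=gene1\n"
)
QUANT = (
    "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
    "tx1\t1000\t850.0\t10.0\t100.0\n"
    "tx2\t800\t650.0\t5.0\t40.0\n"
)

tx2gene = pad_missing(tx2gene_from_gff3(GFF3), quant_names(QUANT))
for tx_id, gene_id in tx2gene.items():
    print(tx_id, gene_id, sep="\t")
```

With this convention each unannotated transcript ends up as its own single-transcript gene, so its counts are never summed with those of any annotated gene.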
Thanks!
How are you avoiding double-counting entities that are shared between the transcriptome and the genome?