6 months ago
jain72744
I have RNA-seq data from which I extracted all lncRNAs, using the biomaRt package in R to annotate them with ENST (transcript) IDs. Now I have another dataset labelled with ENSG (gene) IDs. If I convert those to ENST IDs, the rows get duplicated, because each gene has multiple transcripts. Is there a more appropriate way to do this, or should I stick to ENSG IDs for lncRNAs?
Another question: I have GenBank accession IDs for a third dataset. Is it possible to convert them to ENST/ENSG IDs?
RNA-seq is typically analyzed at the gene level rather than the transcript level. Are your counts per gene or per transcript?
My data are gene-level counts, and the corresponding gene IDs are given, so should I work with ENSG IDs?
lncRNA_transcripts <- getBM(attributes = c("ensembl_transcript_id_version", "ensembl_gene_id_version", "external_gene_name"), filters = "biotype", values = "lncRNA", mart = ensembl)
I used this code to filter lncRNAs in my data.
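If the counts are per gene, one way to avoid the duplication is to collapse the transcript-level table to one row per gene and match on ENSG IDs. The following is only a rough sketch in base R: the data frame stands in for the output of the getBM() call above, and every ID, gene name, and count in it is made up for illustration. The helper strip_version() and the counts matrix are hypothetical names, not part of biomaRt.

```r
# Toy stand-in for the table returned by getBM(); all IDs below are made up.
lncRNA_transcripts <- data.frame(
  ensembl_transcript_id_version = c("ENST00000000001.1", "ENST00000000002.3",
                                    "ENST00000000003.2"),
  ensembl_gene_id_version = c("ENSG00000000001.5", "ENSG00000000001.5",
                              "ENSG00000000002.7"),
  external_gene_name = c("LNC_A", "LNC_A", "LNC_B"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: drop the ".N" version suffix so that IDs also match
# datasets that report unversioned Ensembl IDs.
strip_version <- function(ids) sub("\\.[0-9]+$", "", ids)

# One row per gene: leave out the transcript column, then deduplicate.
lncRNA_genes <- unique(data.frame(
  ensembl_gene_id = strip_version(lncRNA_transcripts$ensembl_gene_id_version),
  external_gene_name = lncRNA_transcripts$external_gene_name,
  stringsAsFactors = FALSE
))

# Hypothetical gene-level count matrix with ENSG IDs as row names.
counts <- matrix(
  c(10, 0, 5, 20, 3, 8),
  nrow = 3,
  dimnames = list(
    c("ENSG00000000001.5", "ENSG00000000002.7", "ENSG00000000003.1"),
    c("sample1", "sample2")
  )
)

# Keep only the rows whose gene ID appears in the lncRNA gene list.
is_lncRNA <- strip_version(rownames(counts)) %in% lncRNA_genes$ensembl_gene_id
lncRNA_counts <- counts[is_lncRNA, , drop = FALSE]
```

Matching on unversioned IDs is a design choice: it tolerates datasets annotated against different Ensembl releases, at the cost of ignoring version changes between them.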