How to differentiate between prokaryotic and eukaryotic transcripts in RNA-seq data
8.0 years ago
amoltej ▴ 100

Hello everyone,

I have transcriptome data from an insect (eukaryotic) tissue that hosts a symbiotic (prokaryotic) organism. The RNA sequencing was done with poly-A selection, so ideally the dataset should contain no bacterial transcripts. But that is not always the case.

So my question is: if I find a transcript that shows strong expression in the tissue, how can I confirm that the signal comes from the eukaryotic cells and not from prokaryotic contamination? Unfortunately, genome data are not available for both organisms.

Thank you Amol

RNA-Seq • 3.7k views
8.0 years ago
EVR ▴ 610

Hi Tej,

Before mapping the RNA-seq reads to the eukaryotic organism, first map them to a database for the prokaryotic organism. If you know the name of the prokaryote, download the corresponding database of FASTA sequences and map the reads to it. Then take only the reads that did not align to the prokaryote, map them to the eukaryotic database, and do your downstream analysis. This way you remove the reads that can be attributed to the prokaryote, but only if the database includes all the genes of the symbiotic organism.
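The filtering step described above can be sketched in Python. This is a minimal illustration, not a full pipeline: it assumes you already have the set of read names that aligned to the prokaryotic database (for example, extracted from the aligner's output), and the names `parse_fastq`, `read_id`, and `drop_prokaryotic_reads` are hypothetical helpers. Many aligners can also write unaligned reads directly (Bowtie 2 has an `--un` option, for instance), which makes a separate filter unnecessary.

```python
from typing import Iterable, Iterator, List, Set, Tuple

FastqRecord = Tuple[str, str, str, str]  # header, sequence, '+' line, qualities


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield 4-line FASTQ records; assumes unwrapped, well-formed input."""
    cleaned = [ln.rstrip("\n") for ln in lines if ln.strip()]
    for i in range(0, len(cleaned) - 3, 4):
        yield tuple(cleaned[i:i + 4])


def read_id(header: str) -> str:
    """Read name: the text after '@' up to the first whitespace."""
    return header[1:].split()[0]


def drop_prokaryotic_reads(records: Iterable[FastqRecord],
                           prokaryote_hits: Set[str]) -> List[FastqRecord]:
    """Keep only the reads that did NOT align to the prokaryotic database."""
    return [rec for rec in records if read_id(rec[0]) not in prokaryote_hits]


# Toy input: three reads, one of which (read2) aligned to the symbiont.
example_fastq = """@read1 1:N:0
ACGT
+
IIII
@read2 1:N:0
TTGA
+
IIII
@read3 1:N:0
GGCC
+
IIII
""".splitlines()

prokaryote_hits = {"read2"}  # assumed to come from the first mapping step
kept = drop_prokaryotic_reads(parse_fastq(example_fastq), prokaryote_hits)
kept_ids = [read_id(rec[0]) for rec in kept]
print(kept_ids)
```

The reads in `kept` would then go to the second mapping step against the eukaryotic database.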

8.0 years ago

It might make sense to do a de novo assembly (I recommend Trinity) and then BLAST all the resulting sequences against either UniProt (blastx, slow) or close relatives of both your organisms (blastn against CDS/cDNA, blastx at the protein level).

How do you plan to do quantification without genomic sequences? I believe genome-free methods like Kallisto and Salmon also require a very good annotation or transcript set.
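As a rough sketch of the assembly-then-BLAST idea, the following Python reads BLAST tabular output (`-outfmt 6`, where column 1 is the query, column 2 the subject, and column 12 the bit score) and labels each contig by its best-scoring hit. The contig names, subject IDs, and the `subject_group` mapping are made up for illustration. A bacterial best hit only flags a candidate; it cannot by itself separate contamination from a horizontally transferred gene.

```python
from typing import Dict, Iterable, Tuple


def best_hits(blast_tab: str) -> Dict[str, Tuple[str, float]]:
    """Return {query: (subject, bitscore)} for the highest-scoring hit per query."""
    best: Dict[str, Tuple[str, float]] = {}
    for line in blast_tab.strip().splitlines():
        fields = line.split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


def classify(best: Dict[str, Tuple[str, float]],
             subject_group: Dict[str, str],
             contigs: Iterable[str]) -> Dict[str, str]:
    """Label each contig by the group of its best hit, or 'no_hit'."""
    labels = {}
    for contig in contigs:
        if contig in best:
            labels[contig] = subject_group.get(best[contig][0], "unknown")
        else:
            labels[contig] = "no_hit"
    return labels


# Hypothetical outfmt 6 rows (12 tab-separated columns each).
rows = [
    ["TRINITY_DN1_c0_g1_i1", "insect_prot_1", "92.0", "300", "24", "0",
     "1", "900", "1", "300", "1e-150", "520"],
    ["TRINITY_DN1_c0_g1_i1", "bact_prot_7", "45.0", "280", "154", "2",
     "10", "850", "5", "284", "1e-40", "150"],
    ["TRINITY_DN2_c0_g1_i1", "bact_prot_3", "88.0", "250", "30", "0",
     "1", "750", "1", "250", "1e-120", "430"],
]
blast_tab = "\n".join("\t".join(row) for row in rows)

# Assumed mapping from reference sequence ID to the organism group it came from.
subject_group = {
    "insect_prot_1": "eukaryote",
    "bact_prot_7": "prokaryote",
    "bact_prot_3": "prokaryote",
}
contigs = ["TRINITY_DN1_c0_g1_i1", "TRINITY_DN2_c0_g1_i1", "TRINITY_DN3_c0_g1_i1"]

labels = classify(best_hits(blast_tab), subject_group, contigs)
for contig, label in labels.items():
    print(contig, label)
```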

8.0 years ago
amoltej ▴ 100

Hi Thamizh and Colindaven,

Thank you for the replies. I cannot simply use sequence similarity to tell them apart, because the two organisms share metabolic pathways. If a gene was horizontally transferred, removing its sequence would be wrong. I am considering this possibility because the gene I am currently looking at is very significantly differentially expressed in the tissue and shows similarity to the bacterial genome. Hence it cannot be dismissed just by looking at sequence similarity to the genomes of closely related species.

I am looking for a database or software that can differentiate between transcripts based on their features, e.g., a long poly-A tail is characteristic of eukaryotic mRNA. I cannot use this particular feature, though, because in RNA-seq data the poly-A tail could be missing. I hope you understand my problem.

Amol
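One sequence feature that can be computed without any reference is GC content. This is not something proposed in the thread; it is a sketch under the assumption that the symbiont's base composition differs from the host's (many obligate insect endosymbionts are strongly AT-rich). The helper names and the toy contigs below are hypothetical, and a shifted GC value is only a hint, not proof of origin.

```python
from typing import Dict


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {name: sequence}, joining wrapped sequence lines."""
    records: Dict[str, str] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


# Toy assembled contigs; real input would be the de novo assembly FASTA.
example_fasta = """>contig_a
ATGCATGC
>contig_b
AATTAATTAT
>contig_c
GGCCGGCC
AT
"""

gc_by_contig = {name: gc_content(seq)
                for name, seq in parse_fasta(example_fasta).items()}
for name, gc in sorted(gc_by_contig.items()):
    print(f"{name}\t{gc:.2f}")
```

Plotting the distribution of these values across all contigs may show whether a candidate transcript sits with the bulk of the host contigs or with a separate cluster.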


Please don't append comments as answers!!! Please use 'add comment' to posts you wish to reply to.
