17 months ago
firefox91
Hello,
I would like to quantify the expression of a specific transcript (24 nt) in a large number of RNA-seq datasets (transcriptomes) from the SRA database. Which tools are the easiest to use for this? (I can't install Salmon.)
Thanks !
You would not be able to search against all of SRA, but if you have specific datasets you want to look into, you can use the SRA BLAST available via NCBI web BLAST.
I used a 24 bp example for searching, and this is what you will see (the link will only stay valid for a couple of days): https://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Get&RID=8SV7GGWE016
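For runs you have already downloaded as FASTQ, a rough alternative to web BLAST is to count reads that contain the 24-mer directly. The following is only a minimal sketch, not a replacement for a proper quantifier: it assumes plain 4-line FASTQ records, exact matches only (no mismatches), and checks both strands. The function names (`count_reads_with_kmer`, `hits_per_million`, `open_fastq`) are made up for this example.

```python
import gzip

# Complement table for DNA; N and other symbols are left unchanged.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def count_reads_with_kmer(fastq_lines, query):
    """Count reads containing `query` or its reverse complement.

    Assumes 4-line FASTQ records, so the sequence is every line
    with index 1 modulo 4. Returns (hits, total_reads).
    """
    query = query.upper().replace("U", "T")
    rc = reverse_complement(query)
    hits = total = 0
    for i, line in enumerate(fastq_lines):
        if i % 4 != 1:
            continue  # skip header, '+' separator and quality lines
        total += 1
        read = line.strip().upper()
        if query in read or rc in read:
            hits += 1
    return hits, total


def hits_per_million(hits, total):
    """Normalise the hit count to reads per million sequenced reads."""
    return 1e6 * hits / total if total else 0.0
```

Usage would look like `with open_fastq("run.fastq.gz") as fh: hits, total = count_reads_with_kmer(fh, my_24mer)`, where the file name and `my_24mer` are placeholders. Normalising by total reads makes counts roughly comparable between runs, but it does not correct for differences in library preparation.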
I forgot to mention that the transcriptomes are not annotated.
I think we are missing information. Is the transcript 24nt or are you looking for a motif that is 24nt? Is this polyA RNA-seq or some sort of small RNA-seq? Do you have a reference genome or transcriptome?
The transcript I am looking for is 24 nt long, and I am searching for it in human embryo transcriptomes.
I am going to collect a lot of transcriptomes from the SRA database, so the RNA-seq method varies from one experiment to another.
I have a reference genome and transcriptome for Homo sapiens.
I think the issue you are going to have is that most RNA-seq protocols treat 24-mers as junk and filter such small sequences away. You are going to have to research the library prep protocols to see which ones would properly preserve your target transcript.
There are almost no reliable annotated transcripts in human/GENCODE (see bottom) that are that short. Small RNAs such as miRNAs are posttranscriptionally processed and trimmed to that size, but you will not find these in standard RNA-seq, as others have mentioned. My recommendation is to answer the essentials first:
How did you come to the idea that this is a real transcript, and why do you think it really exists? This is probably the core of your research, and confidentiality might forbid you from telling it here, but it's the most important question, as a 24 nt transcript, unless it is a dedicated small RNA, seems almost certainly like an artifact.
Is it even polyadenylated, so that you would see it in typical (small) RNA-seq? Or do you need ribodepletion protocols?
Have you aligned this sequence to the genome to see whether it even exists as a DNA template in the human genome?
Is 24 nt the actual transcribed size, or is the transcript longer and there are posttranscriptional modifications that make it shorter?
You should find all of this out, either with strangers on the internet or, much better, with a local, experienced person who knows RNA-seq inside and out. Based on that, you can decide whether there is a realistic chance of answering your question or whether you're chasing ghosts.
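The genome check from the list above (does the 24-mer exist as a DNA template at all?) can be sketched in a few lines if you have the reference as a FASTA file. This is a simplified illustration, assuming exact matches only and enough memory to hold one chromosome at a time; BLAST would also report near-matches, which this does not. The names `read_fasta` and `find_exact_matches` are invented for the example.

```python
# Complement table for DNA; N and other symbols are left unchanged.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the record name.
            name, chunks = (line[1:].split() or [""])[0], []
        else:
            chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)


def find_exact_matches(fasta_lines, query):
    """Return (record, 0-based start, strand) for every exact match.

    For '-' strand hits, the start is where the reverse complement
    of the query begins on the forward strand.
    """
    query = query.upper().replace("U", "T")
    rc = reverse_complement(query)
    patterns = [("+", query)]
    if rc != query:  # avoid double-reporting palindromic queries
        patterns.append(("-", rc))
    hits = []
    for name, seq in read_fasta(fasta_lines):
        for strand, pattern in patterns:
            start = seq.find(pattern)
            while start != -1:
                hits.append((name, start, strand))
                start = seq.find(pattern, start + 1)
    return hits
```

Called as `find_exact_matches(open("genome.fa"), my_24mer)` (both names are placeholders), an empty result means there is no perfect match on either strand. Many hits would suggest the sequence is repetitive, which would make any read count hard to interpret.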