Is there a good way of identifying alternative promoters at specific genomic locations by analyzing single-end RNA-seq data?
I currently have a dataset of 50 bp single-end reads from an Illumina HiSeq 2500.
Unless your libraries were constructed specifically to capture the 5' ends of transcripts, probably not. The best you could hope for is that the 5' UTR reads map to distinct loci (A or B) and that there's evidence for alternative splicing to downstream exons (A-C vs. B-C). But 50 bp single-end reads are usually too short to span most splice junctions. You may also attempt to draw inferences based on the relative read depths of A and B, but transcript ends are typically under-represented, so only cases where A >> B would be potentially valid.
Can I ask what you mean by A >> B? Do you mean upstream, or more highly expressed? If transcript ends are under-represented, wouldn't a downstream alternative TSS stick out relative to its (upstream?) surroundings, since it is a transcript start and not an end? Perhaps I misunderstood something.
I do agree that this data type is not very suitable for the task. Something that might work is taking already known alternative promoter sites and checking the expression levels at each of the possible TSSs, so that you include some prior knowledge.
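As a rough illustration of checking known alternative TSSs, here is a minimal Python sketch that computes a length-normalised read count for each annotated first exon. The function name `reads_per_kb`, the tuple layout, and the toy coordinates are all hypothetical; in practice the reads would come from your aligned BAM file and the intervals from an annotation, neither of which is handled here.

```python
def reads_per_kb(reads, first_exons):
    """Length-normalised read counts for each annotated first exon.

    reads:       list of (chrom, start, end) tuples, 0-based half-open
                 (assumed layout, not tied to any particular aligner)
    first_exons: dict mapping a label to a (chrom, start, end) interval
    """
    density = {}
    for name, (chrom, start, end) in first_exons.items():
        # Count reads on the same chromosome that overlap the exon by >= 1 bp.
        n = sum(
            1
            for r_chrom, r_start, r_end in reads
            if r_chrom == chrom and r_start < end and r_end > start
        )
        density[name] = 1000.0 * n / (end - start)
    return density


# Toy data only: three 50 bp reads and two hypothetical first exons.
reads = [("chr1", 100, 150), ("chr1", 120, 170), ("chr1", 1000, 1050)]
first_exons = {"A": ("chr1", 100, 300), "B": ("chr1", 1000, 1100)}
density = reads_per_kb(reads, first_exons)
```

Normalising by exon length matters because alternative first exons often differ a lot in size, so raw counts alone would be misleading.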
By A >> B, I meant that the upstream locus contains a significantly greater read depth than the downstream locus. If the reverse is true, it could merely reflect the relative lack of 5' ends that is typically observed.
By transcript ends, I meant both 5' and 3' (start and stop). So downstream TSSs would not stick out, since the ends of those transcripts would also be under-represented.
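To make the asymmetry of the A >> B rule concrete, here is a small Python sketch. The function name and both thresholds (`min_fold`, `min_depth`) are arbitrary assumptions chosen for illustration, not established cutoffs. It only ever reports support for the upstream locus, because an excess of reads downstream could simply reflect the usual loss of coverage at transcript ends.

```python
def upstream_promoter_supported(depth_a, depth_b, min_fold=4.0, min_depth=10.0):
    """Return True only when upstream locus A has much greater depth than B.

    depth_a:   length-normalised read depth at the upstream first exon (A)
    depth_b:   length-normalised read depth at the downstream first exon (B)
    min_fold:  assumed fold excess of A over B required to call "A >> B"
    min_depth: assumed minimum depth at A, to avoid calls from a few reads
    """
    if depth_a < min_depth:
        return False
    return depth_a >= min_fold * depth_b


clear_case = upstream_promoter_supported(40.0, 5.0)    # A >> B
reverse_case = upstream_promoter_supported(5.0, 40.0)  # B > A: uninformative
```

A `False` result here means "no conclusion", not "the downstream promoter is used".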