Hello,
I'd like some general advice on where to start a project. I am interested in the expression of a particular type of satellite DNA in disease states. Normally, this satellite DNA is heterochromatic and silenced, but it does seem to be abnormally expressed as a lncRNA in various disease states, notably some cancers.
My goal is to see if this satellite lncRNA is expressed in certain types of cancer.
This RNA is not included in any reference transcriptome, and it is a simple pentamer repeat. So I would need to access the raw reads from sequencing experiments and align them to some idealized repeat sequence.
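To make the idea concrete, here is a rough sketch of the kind of screen I have in mind. It is not a real aligner: it swaps alignment for exact string matching and simply counts reads that contain a tandem run of the pentamer, in any phase and on either strand. The unit `GGAAT` is only a placeholder, and the function names and the `min_copies` threshold are made up for illustration.

```python
import re

def revcomp(seq):
    """Reverse complement of a DNA string (A, C, G, T, N only)."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]

def rotations(unit):
    """All cyclic rotations of the repeat unit, so a run is found in any phase."""
    return {unit[i:] + unit[:i] for i in range(len(unit))}

def longest_run(seq, unit):
    """Largest number of back-to-back exact copies of the unit found in seq."""
    motifs = rotations(unit) | rotations(revcomp(unit))
    best = 0
    for motif in motifs:
        for hit in re.finditer(f"(?:{motif})+", seq):
            best = max(best, len(hit.group()) // len(motif))
    return best

def count_repeat_reads(fastq_lines, unit, min_copies=4):
    """Return (reads with >= min_copies tandem copies, total reads).

    Assumes plain 4-line FASTQ records; min_copies=4 is an arbitrary cutoff.
    """
    hits = total = 0
    for i, line in enumerate(fastq_lines):
        if i % 4 != 1:  # the sequence is the second line of each record
            continue
        total += 1
        if longest_run(line.strip().upper(), unit) >= min_copies:
            hits += 1
    return hits, total

# Toy FASTQ records; "GGAAT" is a placeholder unit, not the real satellite.
demo = [
    "@read1", "GGAAT" * 5, "+", "I" * 25,
    "@read2", "ACGTACGTACGTACGTACGTACGTA", "+", "I" * 25,
    "@read3", "TT" + "CATTC" * 4 + "A", "+", "I" * 23,
]
print(count_repeat_reads(demo, "GGAAT"))
```

For a real file, the same function should accept an open handle, for example `gzip.open(path, "rt")` for a gzipped FASTQ. Exact matching will miss reads with mismatches or sequencing errors in the repeat, so a proper aligner would presumably still be needed for anything quantitative.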
I see there are many massive databases that could be useful here. Has anyone needed to access and process raw reads from these databases? I'm wondering which approach would give the best return for the time spent.
Any help, experience, or intuition would be very valuable!