7.3 years ago
Naresh D J
Hi, I am analyzing MPRA (massively parallel reporter assay) sequencing data to validate enhancer function. In this experiment we included around 1000 scrambled sequences as controls. I can align the FASTQ reads to the hg19 reference and quantify enhancer activity, but how do I map/align reads to my scrambled sequences (the controls in the experiment) and count them? Any suggestions or help would be appreciated. Thank you.
Best Regards, Naresh D J
What do you use to map to hg19? Assuming MPRA sequencing is RNAseq (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4540663/), I'd say you build a reference from your decoys and map to that with exactly the same tool.
Hi, thanks for your response. I was thinking the same, i.e. building my own reference and mapping to it. I am not sure how to count the reads mapping to those scrambled sequences. I have been using htseq-count or featureCounts from the Subread package, and I am not sure whether these tools can quantify my reads. Do you know of any other tools for this purpose?
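If the reference contains only the scrambled sequences, as suggested above, each control is its own reference name, so one possible way to count is to tally the reference-name column (RNAME, column 3) of the SAM output. The sketch below is a minimal illustration, not a replacement for a dedicated counting tool: the function name `count_reads_per_reference` and the control names in the example are hypothetical, it assumes uncompressed SAM text, and it counts each aligned record, so paired-end mates would be counted separately.

```python
from collections import Counter
from typing import Iterable

# SAM flag bits to skip: unmapped (0x4), secondary (0x100), supplementary (0x800).
SKIP_FLAGS = 0x4 | 0x100 | 0x800


def count_reads_per_reference(sam_lines: Iterable[str]) -> Counter:
    """Count primary mapped alignments per reference name in SAM text."""
    counts: Counter = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        rname = fields[2]
        if flag & SKIP_FLAGS or rname == "*":
            continue
        counts[rname] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical records; in practice, iterate over an opened SAM file.
    example = [
        "@SQ\tSN:ctrl_1\tLN:6",
        "r1\t0\tctrl_1\t1\t60\t6M\t*\t0\t0\tACGTAC\tIIIIII",
        "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGTAC\tIIIIII",
    ]
    for name, n in count_reads_per_reference(example).items():
        print(name, n)
```

Controls with zero reads do not appear in the result, so they would need to be filled in from the list of control names if a complete table is required.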
Are these sequences unrelated to human? If so, you could add them as additional "chromosomes" to the standard hg19 build. Create a new index and map. You will also need to make a new GTF file that contains these "chromosomes" so you can count using featureCounts.
Thanks. These sequences are randomly generated and not related to human. Yes, I will try your suggestion.
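As a rough sketch of the GTF step suggested above: since each scrambled sequence becomes its own "chromosome", one option is to write a single feature per control spanning its full length. featureCounts by default counts `exon` features and groups them by the `gene_id` attribute, so the sketch uses those. The function names, the `scrambled_control` source label, the `+` strand, and the example sequences are all assumptions for illustration; the resulting lines would be appended to the hg19 annotation, and the control FASTA records to the hg19 FASTA, before building the new index.

```python
from typing import Dict, Iterable, List


def read_fasta_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Return {sequence name: length} for FASTA text, in file order."""
    lengths: Dict[str, int] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the name,
            # which is how most aligners name references.
            name = line[1:].split()[0]
            if name in lengths:
                raise ValueError(f"duplicate sequence name: {name}")
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def control_gtf_lines(lengths: Dict[str, int],
                      source: str = "scrambled_control") -> List[str]:
    """Build one full-length 'exon' GTF line per control sequence."""
    out = []
    for name, length in lengths.items():
        attrs = f'gene_id "{name}"; transcript_id "{name}";'
        # GTF columns: seqname, source, feature, start, end,
        # score, strand, frame, attributes (1-based, inclusive).
        out.append("\t".join(
            [name, source, "exon", "1", str(length), ".", "+", ".", attrs]
        ))
    return out


if __name__ == "__main__":
    # Hypothetical control sequences; in practice, pass an opened FASTA file.
    fasta = [">ctrl_1", "ACGTAC", ">ctrl_2", "GGGTTTAA"]
    for gtf_line in control_gtf_lines(read_fasta_lengths(fasta)):
        print(gtf_line)
```

The strand is fixed to `+` here, which does not matter for unstranded counting (the featureCounts default) but would need checking if strand-specific counting is used.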