Question

How to map/align fastq sequences to scrambled sequences and quantify them


7.3 years ago

Naresh D J ▴ 110

Hi, I am analyzing MPRA (massively parallel reporter assay) sequencing data to validate enhancer function. In this experiment, we included around 1000 scrambled sequences as controls. I can align the fastq reads to the hg19 reference and quantify enhancer activity, but how do I map/align the reads to my scrambled sequences (the controls in the experiment) and count them? Any suggestions or help would be appreciated. Thank you.

Best Regards, Naresh D J

alignment next-gen sequence • 1.9k views

ADD COMMENT • link 7.3 years ago by Naresh D J ▴ 110


What do you use to map to hg19? Assuming MPRA sequencing is RNA-seq (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4540663/), I'd suggest building a reference from your decoys and mapping to that with exactly the same tool.
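With a decoy-only reference, each reference sequence is one control, so counting could be as simple as tallying the reference-name column of the aligner's SAM output. Below is a minimal sketch of that idea, not a tested pipeline. The function name `count_by_reference`, the `min_mapq` parameter, and the example records (`scr_0001`, `scr_0002`) are hypothetical; the only thing assumed about the aligner is that it writes standard SAM text.

```python
import io
from collections import Counter

def count_by_reference(sam_handle, min_mapq=0):
    """Count primary, mapped alignments per reference sequence name."""
    counts = Counter()
    for line in sam_handle:
        if line.startswith("@"):
            continue  # SAM header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag, rname, mapq = int(fields[1]), fields[2], int(fields[4])
        # Skip unmapped (0x4), secondary (0x100) and supplementary (0x800)
        # records so that each read is counted at most once.
        if flag & 0x4 or flag & 0x100 or flag & 0x800:
            continue
        if rname == "*" or mapq < min_mapq:
            continue
        counts[rname] += 1
    return counts

# Hypothetical SAM records, for illustration only.
example_sam = io.StringIO(
    "@SQ\tSN:scr_0001\tLN:14\n"
    "r1\t0\tscr_0001\t1\t60\t4M\t*\t0\t0\tACGT\tIIII\n"
    "r2\t16\tscr_0001\t3\t60\t4M\t*\t0\t0\tGTAC\tIIII\n"
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tTTTT\tIIII\n"
    "r4\t256\tscr_0002\t1\t0\t4M\t*\t0\t0\tTTGA\tIIII\n"
    "r5\t0\tscr_0002\t1\t60\t4M\t*\t0\t0\tTTGA\tIIII\n"
)
counts = count_by_reference(example_sam)
for name, n in sorted(counts.items()):
    print(name, n, sep="\t")
```

For paired-end data this sketch counts each mate separately, so the totals would need halving or a mate-aware filter.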

ADD REPLY • link 7.3 years ago by cschu181 ★ 2.8k


Hi, thanks for your response. I was thinking the same, i.e. building my own reference and mapping to it. I am not sure how to count the reads mapping to those scrambled sequences. I have been using htseq-count or featureCounts from the Subread package, and I am not sure whether these tools can quantify my reads. Do you know of any other tools for this purpose?

ADD REPLY • link 7.3 years ago by Naresh D J ▴ 110


"my scrambled sequences (controls in the experiment)"

Are these sequences unrelated to the human genome? If so, you could add them as additional "chromosomes" to the standard hg19 build, then create a new index and map against it. You will also need a new GTF file that contains these "chromosomes" so you can count with featureCounts.
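The GTF step could be sketched roughly as follows: one `exon` record per scrambled sequence, spanning its full length, since featureCounts by default counts `exon` features grouped by `gene_id`. This is a minimal sketch under stated assumptions, not a definitive implementation. The helper names `fasta_lengths` and `gtf_lines`, the `scrambled` source label, the `+` strand, and the example sequences are all hypothetical; the sketch also assumes the aligner names each reference by the first word of its FASTA header.

```python
import io

def fasta_lengths(handle):
    """Yield (name, length) for each record in a FASTA file handle."""
    name, length = None, 0
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, length
            # Assumption: the reference name is the first whitespace-delimited
            # token of the header line.
            name, length = line[1:].split()[0], 0
        else:
            length += len(line)
    if name is not None:
        yield name, length

def gtf_lines(handle, source="scrambled"):
    """Build one full-length exon record per control sequence."""
    for name, length in fasta_lengths(handle):
        attrs = f'gene_id "{name}"; transcript_id "{name}";'
        # GTF columns: seqname, source, feature, start, end, score,
        # strand, frame, attributes (coordinates are 1-based, inclusive).
        yield "\t".join(
            [name, source, "exon", "1", str(length), ".", "+", ".", attrs]
        )

# Hypothetical control sequences; real ones would be read from a FASTA file.
example_fasta = io.StringIO(
    ">scr_0001\nACGTACGTAC\nGGTT\n>scr_0002 random control\nTTGACA\n"
)
for record in gtf_lines(example_fasta):
    print(record)
```

The resulting lines could be appended to the hg19 annotation, and the same FASTA appended to the hg19 sequence before indexing, so that sequence names match between the index and the GTF.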

ADD REPLY • link 7.3 years ago by GenoMax 147k


Thanks. These sequences are randomly generated and not related to human. Yes, I will try your suggestion.

ADD REPLY • link 7.3 years ago by Naresh D J ▴ 110