Question

Conserved regions around mutations

5.0 years ago

Gene_MMP8 ▴ 240

I have a list of mutations of interest from the coding region in an experiment that I am performing. For each mutation I have the position, the base substitution type (C>T, A>G, etc.), the chromosome, and the gene name as input data. I was curious to explore the sequences surrounding those mutation positions. To do that, I extracted the raw nucleotide sequences 10 bases upstream and downstream of each mutation position and plotted a sequence logo from them. The resulting image is shown below.

One thing to note from this image is that C and G nucleotides are highly conserved at the majority of the positions. How do I build a background model for this and argue that what I am seeing is significant and not due to chance?
I was also thinking about extracting motifs from the flanking nucleotides and seeing whether certain sequence motifs are overrepresented around the mutations. Given that I am new to this field, is there a systematic way to do that?

[Image: Sequence_logo]

next-gen sequencing • 1.1k views

ADD COMMENT • link 5.0 years ago by Gene_MMP8 ▴ 240

Your seqlogo image shows the same proportion of each nucleotide at every location. If you want to get a 'conservation' score out of your region, you need to give it other species' sequences for context. That's going to be tricky to define. Why not just download a public conservation track for the region?
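To check whether any position really differs from the overall base composition, a minimal sketch along these lines could be a starting point. It assumes the flanking windows are already available as equal-length strings; the function names are hypothetical, and the pooled background is only a crude stand-in for a proper one (for example, windows around randomly chosen coding positions, which is an assumption and not something described in this thread).

```python
from collections import Counter

BASES = "ACGT"

def position_counts(windows):
    """Count each base at each column of equal-length sequence windows."""
    length = len(windows[0])
    if any(len(w) != length for w in windows):
        raise ValueError("all windows must have the same length")
    return [Counter(w[i] for w in windows) for i in range(length)]

def pooled_frequencies(windows):
    """Overall A/C/G/T frequencies across all windows (a crude background)."""
    total = Counter("".join(windows))
    n = sum(total[b] for b in BASES)
    return {b: total[b] / n for b in BASES}

def chi_square_per_position(windows, background):
    """Chi-square goodness-of-fit statistic for each column vs. the background."""
    stats = []
    for counts in position_counts(windows):
        n = sum(counts[b] for b in BASES)
        stat = 0.0
        for b in BASES:
            expected = n * background[b]
            if expected > 0:  # skip bases the background says never occur
                stat += (counts[b] - expected) ** 2 / expected
        stats.append(stat)
    return stats

# Toy input (made up for illustration, not real data).
toy_windows = ["ACGT", "ACGA", "ACGC", "ACGG"]
toy_stats = chi_square_per_position(toy_windows, pooled_frequencies(toy_windows))
```

With four bases each statistic has 3 degrees of freedom, so a value above about 7.815 corresponds to p < 0.05 for a single position; with 20 flanking positions a multiple-testing correction would be needed. The central column holds the mutated base itself and will be skewed by the substitution types, so it is probably best left out of the comparison.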

ADD REPLY • link 5.0 years ago by karl.stamm 4.1k

I understand your point. Can you tell me a bit more about downloading a "public conservation track for the region"? Where can I find this?

ADD REPLY • link 5.0 years ago by Gene_MMP8 ▴ 240

I don't know your experimental design or species of interest, but in this UCSC Genome Browser view you'll see that this exon is missing in chimpanzee: https://genome.ucsc.edu/cgi-bin/hgTracks?db=hg38&lastVirtModeType=default&lastVirtModeExtraState=&virtModeType=default&virtMode=0&nonVirtPosition=&position=chr12%3A110912989%2D110913241&hgsid=879063881_SpUoTYb5r8APW0GPAMijO7goYoUs
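If you would rather pull conservation scores programmatically than browse them, something like the sketch below might work. It assumes the UCSC REST API endpoint `getData/track` and the hg38 track name `phyloP100way`; both are assumptions to verify against the UCSC API documentation, and the function names are hypothetical. The coordinates are the ones from the browser link above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the UCSC REST API; check the UCSC docs before relying on it.
API_BASE = "https://api.genome.ucsc.edu/getData/track"

def conservation_url(chrom, start_1based, end_1based,
                     genome="hg38", track="phyloP100way"):
    """Build a request URL for a conservation track over one region.

    Browser positions are 1-based and inclusive; the API is assumed to take
    a 0-based start and an exclusive end, so only the start is shifted by one.
    """
    params = {
        "genome": genome,
        "track": track,  # assumed track name for 100-way phyloP on hg38
        "chrom": chrom,
        "start": start_1based - 1,
        "end": end_1based,
    }
    return f"{API_BASE}?{urlencode(params)}"

def fetch_conservation(chrom, start_1based, end_1based, **kwargs):
    """Download and parse the JSON response (needs network access)."""
    url = conservation_url(chrom, start_1based, end_1based, **kwargs)
    with urlopen(url, timeout=30) as response:
        return json.load(response)

# Region taken from the browser link: chr12:110912989-110913241
example_url = conservation_url("chr12", 110912989, 110913241)
```

Calling `fetch_conservation("chr12", 110912989, 110913241)` should return the parsed JSON, but the layout of the response is not shown here, so inspect it before writing code that depends on it.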

ADD REPLY • link 5.0 years ago by karl.stamm 4.1k