Is there a way I can upload a reference sequence to Clustal Omega to get aligned protein sequences, or is there a different way of getting the sequences?
6.3 years ago
vellryba • 0

Hello.

My aim is to find correlated mutations within single read pairs. For example, I need to know whether the read with sequence ID X, which has a mutation at position, let's say, 800, also has a mutation at position 1100. So I managed to get BAM and SAM files containing only reads that span the regions I am interested in. I have the FASTA sequences and I used TranslatorX to translate those into protein FASTA.

I know what I was expecting to get back, but when I loaded these into Clustal Omega to get an alignment, it didn't work that well. There are gaps and sequences that were just badly translated. I looked at the badly translated sequences in the FASTA file I got from TranslatorX, and the errors are already there. When I looked at the nucleotide FASTA, the sequences are fine. Is there a way I can feed my reference sequence into an alignment tool so I can get the protein sequences translated and aligned correctly?

Does anybody have any experience with this type of analysis?

alignment sequencing • 1.4k views
ADD COMMENT

I don't fully understand your question.

If you have a reference sequence and your reads completely cover the region you are interested in, why is there a need to look at protein translations?

ADD REPLY

Hi, I know there is a mutation present (sometimes) in some of the reads. I also know that there is a mutation (again, sometimes) a bit further down the genome. I want to see if that second mutation is only present when the first one is present. In other words, these mutations are hierarchical. I have the SAM and BAM files that only contain the reads that span both of the regions.

Now I just want to somehow count either nucleotide (or protein) variants in those reads. Something like this:

1st position A, 2nd position C - 1200
1st position A, 2nd position T - 800

etc.

I am just not sure how to go about it.

ADD REPLY

Use bam-readcount to get this information.

ADD REPLY

Hi,

this only gives me a count at each position. I need to see if they are correlated. Like this:

first position 800   second position 1000 count: 
AT 1000
CT 800
AG 600

etc.
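
A joint table like the one above could be built directly from the SAM file by reading the base at both positions for every read pair and counting the combinations. The following is only a minimal sketch, assuming plain Python with no extra libraries; `base_at` and `count_pairs` are hypothetical helper names, positions are 1-based reference coordinates, and base qualities are ignored.

```python
import re
from collections import Counter, defaultdict

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def base_at(pos, start, cigar, seq):
    """Return the read base aligned to 1-based reference position `pos`,
    '-' if `pos` falls inside a deletion, or None if the read does not cover it."""
    ref, qry = start, 0  # next reference position, next index into seq
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op in "M=X":                      # consumes reference and read
            if ref <= pos < ref + length:
                return seq[qry + pos - ref]
            ref += length
            qry += length
        elif op in "DN":                     # consumes reference only
            if op == "D" and ref <= pos < ref + length:
                return "-"
            ref += length
        elif op in "IS":                     # consumes read only
            qry += length
        # H and P consume neither reference nor read
    return None

def count_pairs(sam_lines, pos1, pos2):
    """Count base combinations at pos1 and pos2, one vote per read name."""
    calls = defaultdict(dict)                # read name -> {position: base}
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue                         # skip blank and header lines
        f = line.rstrip("\n").split("\t")
        name, flag, start, cigar, seq = f[0], int(f[1]), int(f[3]), f[5], f[9]
        # 0x904 = unmapped (0x4) + secondary (0x100) + supplementary (0x800)
        if flag & 0x904 or cigar == "*" or seq == "*":
            continue
        for pos in (pos1, pos2):
            base = base_at(pos, start, cigar, seq)
            if base is not None:
                # if both mates cover a position, the first one seen is kept
                calls[name].setdefault(pos, base)
    return Counter(c[pos1] + c[pos2] for c in calls.values()
                   if pos1 in c and pos2 in c)

# Usage (file name is a placeholder):
# with open("spanning_reads.sam") as fh:
#     for combo, n in count_pairs(fh, 800, 1000).most_common():
#         print(combo, n)
```

Grouping by read name means the two positions may be covered by the same read or by the two mates of a pair. Read names that cover only one of the two positions are left out of the table.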

ADD REPLY

Sorry to bother you, but do you have any other suggestions? This one won't work due to the reasons below.

ADD REPLY

You can probably do LD/correlation analysis using PLINK (not my area of strength). This is only a pointer for you to consider.
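
For only two sites, the usual pairwise LD statistics can also be worked out by hand from a table of per-read-pair base combinations. This is a rough sketch and not PLINK's own procedure (as far as I know PLINK expects genotype data across samples rather than read-level counts); `pairwise_ld` is a hypothetical helper, and `counts` is assumed to map two-letter strings (base at site 1 followed by base at site 2) to read-pair counts.

```python
def pairwise_ld(counts, allele1, allele2):
    """Return (D, r^2) for allele1 at site 1 and allele2 at site 2.

    D   = P(allele1 and allele2) - P(allele1) * P(allele2)
    r^2 = D^2 / (pA * (1 - pA) * pB * (1 - pB))
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no read pairs to count")
    # marginal frequency of each allele at its own site
    p_a = sum(n for combo, n in counts.items() if combo[0] == allele1) / total
    p_b = sum(n for combo, n in counts.items() if combo[1] == allele2) / total
    # frequency of read pairs carrying both alleles together
    p_ab = counts.get(allele1 + allele2, 0) / total
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    # r^2 is undefined when either site shows no variation
    r2 = d * d / denom if denom > 0 else float("nan")
    return d, r2

# Example call with the combination counts quoted in this thread:
# d, r2 = pairwise_ld({"AT": 1000, "CT": 800, "AG": 600}, "A", "T")
```

An r² near 1 means the two alleles almost always occur together, and a value near 0 means they look independent. For the "second mutation only when the first is present" question, it is also worth looking directly at the count of read pairs that carry the second mutation without the first.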

ADD REPLY

Do you specifically want to find reads which contain multiple mutations, or are you just interested in co-localised mutations?

ADD REPLY

Hi, I need to know that the mutations came from a single paired read. There are particular regions I have in mind.

ADD REPLY

If the pair of reads you are looking at flanks the regions of interest, then they represent a fragment that spans the region. Unless you have reads that go through the region of interest, you have no way of confirming that a particular mutation is present in those fragments.

You will need to use Sanger sequencing on the original sample to confirm that the mutation exists.

ADD REPLY
