How to map the first part of a read and ignore the rest of it?
4.9 years ago

Hi all,

I have a read that consists of two parts as shown in the picture.

[Figure: Read sequence]

The first part is a viral sequence (from either virus A or virus B) and the second part (H) is a human DNA sequence. We also know that only part of the virus is inserted into the DNA, and the length of this part varies across cells. I know the exact start point of the virus sequence, but not its end. The whole read (A+H) is about 500 bp long. The exact sequences of virus A and B are also available, and I treat them as references. I want to see which type of virus is present in my DNA sequence. How can I use BWA to align my reads to the references and detect the type of virus?

Thank you in advance.

Shiva.

BWA targeted sequencing alignment sequencing • 1.1k views

If you use virus A or B as the sole reference, bwa should align the part of the read that matches the virus and soft-clip the rest. You could then look for soft-clipped reads whose aligned part is a full match to the virus.
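As a rough sketch of how the soft-clipped reads could be picked out, the CIGAR column of the SAM output (for example from `bwa mem`) can be parsed to see how many read bases aligned and how many were clipped at each end. The function names here are made up for illustration, and the example CIGAR is invented, not taken from real data.

```python
import re

# One CIGAR element: a length followed by an operation code (SAM spec).
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def parse_cigar(cigar):
    """Return the CIGAR string as a list of (length, op) tuples."""
    return [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]

def clip_summary(cigar):
    """Return (leading_soft_clip, aligned_read_bases, trailing_soft_clip)."""
    # Hard clips (H) are dropped so that S is seen at the ends.
    ops = [(n, op) for n, op in parse_cigar(cigar) if op != "H"]
    lead = ops[0][0] if ops and ops[0][1] == "S" else 0
    trail = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    # M, I, = and X are the operations that consume aligned read bases.
    aligned = sum(n for n, op in ops if op in "MI=X")
    return lead, aligned, trail

# Hypothetical 500 bp read: 120 bases match the virus, 380 are clipped.
print(clip_summary("120M380S"))  # (0, 120, 380)
```

A read with a large trailing soft clip and a clean aligned block at the start would be a candidate virus-human junction read. Whether the aligned block is a "full match" would still need a check of the NM tag or the MD tag, which this sketch does not do.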


Is it of interest where the sequence aligns? Or is the question just "Do I have virus A and/or virus B sequence in my reads?"

For the latter, bbduk might be helpful. See the usage example about "Kmer filtering".
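To illustrate the general idea behind k-mer matching (this is a toy version in Python, not how BBDuk is implemented), one can count how many k-mers of a read also occur in each viral reference. The function names and the default `k=21` are assumptions for the sketch, and it ignores reverse complements and sequencing errors.

```python
def kmers(seq, k):
    """Return the set of all k-length substrings of seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def count_shared_kmers(read, references, k=21):
    """Map each reference name to the number of distinct read k-mers
    that also occur in that reference (forward strand only)."""
    read_kmers = kmers(read, k)
    return {name: len(read_kmers & kmers(ref, k))
            for name, ref in references.items()}
```

With `references = {"A": seq_a, "B": seq_b}`, the reference with the higher count for a given read would be the more likely source of its viral part. The human part of the read contributes k-mers that match neither virus, so it should not affect the comparison much.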


Thank you for your reply. BBDuk seems very useful!

My final goal is to find the location of the "Y" sequence in the genome. This sequence is important to me because it was followed by the "X" sequence. My plan was to first align the whole read to A and B individually to detect the type of virus, then cut the "X" part from the read and align only the "Y" part to the human reference genome to find its location. Do you think this approach is suitable? Do you know of any tools that can do this analysis in one step?
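The two-step plan could be sketched roughly as follows, assuming each read has already been aligned to virus A and to virus B separately and the CIGAR string from each alignment is at hand. It further assumes the viral part sits at the start of the read and that the read aligned on the forward strand (SAM flag 0x10 not set), so the non-viral part is the trailing soft clip. All function names are hypothetical.

```python
import re

# One CIGAR element: a length followed by an operation code (SAM spec).
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def aligned_and_tail(cigar):
    """Return (aligned_read_bases, trailing_soft_clip) for one alignment.
    An unmapped read (CIGAR '*') yields (0, 0)."""
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar) if op != "H"]
    aligned = sum(n for n, op in ops if op in "MI=X")
    tail = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return aligned, tail

def split_read(seq, cigars_by_virus):
    """Pick the virus with the most aligned read bases and return
    (virus_name, remaining_part), where remaining_part is the
    soft-clipped tail of the read (assumption: forward-strand alignment)."""
    best = max(cigars_by_virus,
               key=lambda v: aligned_and_tail(cigars_by_virus[v])[0])
    _, tail = aligned_and_tail(cigars_by_virus[best])
    remaining = seq[len(seq) - tail:] if tail else ""
    return best, remaining
```

The returned tail could then be written to a FASTA or FASTQ file and aligned to the human reference in a second run. Choosing the virus by aligned length alone is a simplification; mismatches (the NM tag) and mapping quality would be worth checking as well.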
