8.6 years ago
pradyumna Jayaram
I have a query sequence of 1935 bp and a reference sequence of 2817 bp. My query aligns fully with the reference sequence. I need to extract the regions of the reference sequence (2817 bp) that are not covered by my query, i.e. approximately 882 bp. I need to do this for many such files (nearly 200). Please help me with a script if possible; I couldn't find any online tool for this. Kindly help!
Look at the samtools faidx solution: Extract User Defined Region From An Fasta File
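For readers new to coordinate-based extraction, here is a minimal Python sketch of the idea, assuming the 1-based, inclusive start and end convention that samtools faidx region strings (name:start-end) use. The function name `fetch_region` and the toy sequence are hypothetical, chosen only for illustration; this is not the samtools implementation.

```python
def fetch_region(seq, start, end):
    """Return seq[start..end] using 1-based, inclusive coordinates."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    # Python slices are 0-based and half-open, so shift only the start.
    return seq[start - 1:end]


# Toy sequence (hypothetical), positions 2 through 4.
print(fetch_region("ACGTACGT", 2, 4))
```

This only helps once the coordinates are known; finding them is a separate step.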
What if I do not know the coordinates? As I mentioned, I have to do this nearly 200 times, so it would be impractical to look up, for each file, the coordinates of the reference that do not match my query. Can you suggest a solution that extracts the unaligned region of the reference directly from a SAM/BAM file?
Extract the regions with 0 coverage. See: bedtools: extracting no coverage regions
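To illustrate what "regions with 0 coverage" means, here is a rough Python sketch that takes the reference positions covered by the alignment and returns everything else. It is not the bedtools command from the linked thread. The function names and the aligned interval (500 to 2435) are hypothetical; in practice the intervals would come from your SAM/BAM file. Coordinates are assumed to be 0-based and half-open, as in BED.

```python
def uncovered_regions(ref_len, aligned):
    """Return 0-based, half-open reference intervals with no alignment."""
    gaps = []
    pos = 0  # first reference position not yet known to be covered
    for start, end in sorted(aligned):
        if start > pos:
            gaps.append((pos, start))  # gap before this aligned block
        pos = max(pos, end)            # also handles overlapping blocks
    if pos < ref_len:
        gaps.append((pos, ref_len))    # trailing gap up to the reference end
    return gaps


def extract_regions(ref_seq, regions):
    """Slice the reference sequence at each (start, end) interval."""
    return [ref_seq[start:end] for start, end in regions]


# Hypothetical case: a 1935 bp query aligned to positions 500..2435
# of a 2817 bp reference.
gaps = uncovered_regions(2817, [(500, 2435)])
print(gaps)
print(sum(end - start for start, end in gaps))
```

With these made-up coordinates the two flanking gaps add up to 882 bp, matching the difference between the reference and query lengths. For about 200 files, the same two functions could be called in a loop over each reference and alignment pair.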
Can you please explain in detail? I am a beginner.
The link that Pierre provided gives the exact command. What additional explanation do you require?