Question

Finding variants within a subset of a BAM file


15 months ago

ramiro.barrantes ▴ 60

I downloaded BAM "slices" from a database (TCGA). Each slice is a subset of the full alignment covering a small set of genes. Now I would like to find variants in those regions. I am currently using Strelka and it works fine. However, I am wondering if there is a more efficient way to do it given that I know exactly the region of interest (those few genes)? Would you recommend subsetting the human genome to only those genes? What tool should I use to do that?

variant calling • 943 views

ADD COMMENT • link 14 months ago by ramiro.barrantes ▴ 60


Don't forget to follow up on your threads. If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted. You can accept more than one answer if they all work. If an answer was not really helpful or did not work, provide detailed feedback so others know not to use that answer.


ADD REPLY • link 15 months ago by Pierre Lindenbaum 166k


Actually, I found a solution: downloading "BAM slices" from TCGA (https://docs.gdc.cancer.gov/API/Users_Guide/BAM_Slicing/), which is another way of addressing the issue. Thank you!

ADD REPLY • link 14 months ago by ramiro.barrantes ▴ 60

score 2 · Accepted Answer · 2024-04-10

However, I am wondering if there is a more efficient way to do it given that I know exactly the region of interest (those few genes)?

https://github.com/Illumina/strelka/blob/v2.9.x/docs/userGuide/README.md

Strelka calls the entire genome by default, however variant calling may be restricted to an arbitrary subset of the genome by providing a region file in BED format with the --callRegions configuration option.
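
In practice this means preparing a BED file for the genes of interest and passing it at configuration time. Below is a rough sketch in Python, not a tested pipeline: the interval coordinates, file names and padding value are placeholders I made up for illustration, and I am assuming a tumor/normal (somatic) run. For germline calling the configure script and BAM options differ, so check the user guide. The helper pads, sorts and merges the intervals, writes BED text, and builds the configure command without running it.

```python
from typing import List, Tuple

# (chrom, 0-based start, end-exclusive), the BED coordinate convention
Interval = Tuple[str, int, int]


def merge_intervals(intervals: List[Interval], pad: int = 0) -> List[Interval]:
    """Pad each interval, sort by chromosome and start, and merge overlaps."""
    padded = sorted((c, max(0, s - pad), e + pad) for c, s, e in intervals)
    merged: List[Interval] = []
    for chrom, start, end in padded:
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


def to_bed(intervals: List[Interval]) -> str:
    """Render intervals as tab-separated BED lines."""
    return "".join(f"{c}\t{s}\t{e}\n" for c, s, e in intervals)


def strelka_configure_cmd(normal_bam: str, tumor_bam: str, ref_fasta: str,
                          bed_gz: str, run_dir: str) -> List[str]:
    """Build (but do not run) a somatic configure command with --callRegions."""
    return [
        "configureStrelkaSomaticWorkflow.py",
        "--normalBam", normal_bam,
        "--tumorBam", tumor_bam,
        "--referenceFasta", ref_fasta,
        "--callRegions", bed_gz,  # expected to be bgzip-compressed and tabix-indexed
        "--runDir", run_dir,
    ]


if __name__ == "__main__":
    # Placeholder coordinates: replace with the real gene intervals
    # from your annotation, on the same reference build as the BAMs.
    genes = [("chr2", 100, 200), ("chr1", 500, 600), ("chr2", 150, 300)]
    with open("regions.bed", "w") as handle:
        handle.write(to_bed(merge_intervals(genes, pad=10)))
    # Hypothetical file names, shown only to illustrate the command shape.
    print(" ".join(strelka_configure_cmd(
        "normal.bam", "tumor.bam", "genome.fa", "regions.bed.gz", "strelka_run")))
```

The BED file then needs compressing and indexing before Strelka will accept it, e.g. `bgzip regions.bed` followed by `tabix -p bed regions.bed.gz`. The chromosome names in the BED have to match those in the BAM header and the reference FASTA. Note that the reference itself is not subset here: the calling is restricted while the full genome FASTA is still supplied.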