Bait interval list for Picard CollectHsMetrics
6.6 years ago

Dear all,

I have some exome sequencing samples, and I need to check the exome capture efficiency for each sample using Picard CollectHsMetrics. I have the target interval list, but I have no idea about the bait interval list. Should I create it from the BAM file (BAM file -> BED file -> interval list)? Or what else should I do to get this bait interval list? I just want to be sure the method is accurate before doing it.

I would appreciate any suggestions.

Thank you

Archana

picard alignment exome • 7.0k views
6.6 years ago
igor 13k

Target intervals are the regions or exons you are trying to capture. You don't necessarily expect to pick up all the exons, since baits/probes can't be designed for every region.

Bait intervals are the regions that correspond to the capture probes. In an ideal situation, all probes should work.

For the purpose of overall QC stats, the two should be roughly similar, but there are differences.
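
If the bait or target regions are only available as BED files, they have to be converted to Picard's interval_list format before CollectHsMetrics can read them. Picard ships a BedToIntervalList tool for this; the snippet below is only a minimal sketch of what the conversion involves, not that tool's implementation. The function name and the fallback interval names are made up for illustration, and the header lines are assumed to come from the reference sequence dictionary.

```python
def bed_to_interval_list(bed_lines, header_lines):
    """Convert BED records to interval_list lines (illustrative sketch).

    header_lines: SAM-style header lines (@HD, @SQ ...), assumed to be
    copied from the reference sequence dictionary; passed through as-is.
    """
    out = [h.rstrip("\n") for h in header_lines]
    for line in bed_lines:
        line = line.rstrip("\n")
        # Skip blank lines and BED comment/track/browser lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        # Use the BED name column if present, else a hypothetical fallback name.
        name = fields[3] if len(fields) > 3 else f"{chrom}:{start + 1}-{end}"
        # Use the BED strand column if present and valid, else default to "+".
        strand = fields[5] if len(fields) > 5 and fields[5] in ("+", "-") else "+"
        # BED is 0-based, half-open; interval_list is 1-based, closed.
        out.append("\t".join([chrom, str(start + 1), str(end), strand, name]))
    return out
```

The point to notice is the coordinate shift: the BED start is incremented by one while the end stays the same.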

Hi

Thank you for the explanation.

Can you cite any references about this? I have noticed that the terminology around regions, targets, baits, and "covered" regions differs between kit and tool providers and has much room for improvement. I think BaitDesigner might give some insight into this; it also supports your answer.

6.6 years ago

Dear all

I am new to exome analysis.

I have just found this detail about the bait and target files for CollectHsMetrics:

Bait interval = "All tracks" BED file (given by the sequencing team)
Target interval = "covered" BED file (from the BAM file)

Can anybody tell me whether this is right or wrong? I just want to be sure my workflow is correct before proceeding with the analysis.

Thank you in advance

6.6 years ago

Hello archie,

it is fine to use the same file for bait and target intervals.
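
As a rough sketch of what that looks like in practice, the helper below builds a CollectHsMetrics command line that passes one interval list to both BAIT_INTERVALS and TARGET_INTERVALS. The helper name, the picard.jar location and the file names are placeholders, not anything from this thread.

```python
def collect_hs_metrics_cmd(bam, out, intervals, picard_jar="picard.jar"):
    """Build a CollectHsMetrics command using one file for baits and targets.

    All paths are hypothetical placeholders supplied by the caller.
    """
    return [
        "java", "-jar", picard_jar, "CollectHsMetrics",
        f"INPUT={bam}",
        f"OUTPUT={out}",
        # The same interval list is given for both arguments.
        f"BAIT_INTERVALS={intervals}",
        f"TARGET_INTERVALS={intervals}",
    ]


cmd = collect_hs_metrics_cmd("sample.bam", "sample.hs_metrics.txt",
                             "exome.interval_list")
print(" ".join(cmd))
```

The list can be passed to subprocess.run(cmd, check=True) once the paths are real. Newer Picard releases also accept the dashed form of the same arguments.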

fin swimmer

Hi

Thank you. I am going to use the same file and will check the result. I have another question: should duplicates already be removed from the input BAM, or will CollectHsMetrics automatically detect the duplicates and count them separately? If not, can I give it the output of MarkDuplicates, where the duplicates are removed from the BAM file during processing?

Hello,

there is no need to remove duplicates from the BAM file if you've already marked them using MarkDuplicates. CollectHsMetrics recognizes them, and you will also get metrics about the level of duplication.
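
For anyone who wants to check that duplicates really are marked in their file: marking means the duplicate bit (0x400) is set in the SAM FLAG field of each duplicate read, and the reads themselves stay in the file. The snippet below is a small sketch that counts flagged records in SAM text, for example the output of samtools view -h. The function name is made up for illustration.

```python
DUPLICATE_FLAG = 0x400  # SAM FLAG bit for PCR or optical duplicates


def count_marked_duplicates(sam_lines):
    """Return (total_records, duplicate_records) for SAM text lines."""
    total = dup = 0
    for line in sam_lines:
        # Skip header lines and blank lines.
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        total += 1
        if flag & DUPLICATE_FLAG:
            dup += 1
    return total, dup
```

If the second number is zero on a deep exome BAM, duplicates have probably not been marked yet.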

fin swimmer

2.0 years ago
Arton ▴ 20

Hello, if we need to calculate the coverage of the coding regions of a single gene from exome data, should we change the target interval file to that gene's coding positions?
