4.7 years ago
ijlal.hyder2012
Hello everyone,
I need some help. I have panel sequencing data (107 genes) from 18000 individuals, already aligned to the hg38 reference genome (one BAM file per individual). I have to merge this data with WES data, but I do not have the target region information. Is there a way or tool to extract the covered regions from those 18000 BAM files into a single file that can then be used to pull the same regions out of the WES data?
I would really appreciate your help/suggestions.
Regards,
For the entire genome? Take a look at mosdepth.
Hi,
I actually want to get the regions out of those 18000 BAM files into a single BED file. Based on that file, I want to extract exactly the same regions from the WES data, so that I can combine both datasets and do the downstream analysis.
An example bed file can be found in the link below
wget -nd biobank.ndph.ox.ac.uk/showcase/showcase/auxdata/GRCh38_alt_mapping_noCHR.sorted.merged.bed
I was just wondering if it is possible to produce something like this from those 18000 BAM files.
Regards,
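One possible way to build such a BED file, as a rough sketch only: compute per-base depth for each BAM, collapse runs of covered positions into intervals, and then merge the intervals from all samples. The helper names `depth_to_bed` and `merge_bed`, the depth cutoff, and the file names are hypothetical. The input layout assumed here is the three-column `chrom pos depth` output that `samtools depth` prints; mosdepth or bedtools could fill the same role.

```shell
# Hypothetical helper: collapse "chrom pos depth" rows (1-based positions,
# sorted by position within each chromosome) into 0-based, half-open BED
# intervals wherever depth is at least the cutoff given as $1 (default 1).
depth_to_bed() {
    min="${1:-1}"
    awk -v min="$min" 'BEGIN { OFS = "\t" }
        $3 + 0 >= min + 0 {
            # Extend the current run if this position directly follows it.
            if ($1 == chrom && $2 + 0 == end + 1) { end = $2 + 0; next }
            if (chrom != "") print chrom, start, end
            chrom = $1; start = $2 - 1; end = $2 + 0
        }
        END { if (chrom != "") print chrom, start, end }'
}

# Hypothetical helper: sort BED intervals and merge the ones that overlap
# or touch, so that all samples end up in one non-redundant region list.
merge_bed() {
    LC_ALL=C sort -k1,1 -k2,2n |
    awk 'BEGIN { OFS = "\t" }
        $1 == chrom && $2 + 0 <= end { if ($3 + 0 > end) end = $3 + 0; next }
        { if (chrom != "") print chrom, start, end
          chrom = $1; start = $2; end = $3 + 0 }
        END { if (chrom != "") print chrom, start, end }'
}

# Example usage (assumes samtools is installed; cutoff and names are placeholders):
#   for bam in *.bam; do samtools depth "$bam" | depth_to_bed 10; done |
#       merge_bed > panel_regions.bed
```

The depth cutoff matters here: off-target reads in panel data will otherwise show up as many tiny intervals, so the threshold (and possibly a minimum interval length) would need tuning on a few samples before running all 18000.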
Do you want the depth for each base or for each BED record?
If it is interval-wise, then featureCounts can be a fast option. You will need to convert the BED file to SAF format, like
awk 'BEGIN{OFS="\t"} {print $1"_"$2"_"$3, $1, $2, $3, "."}' in.bed > out.saf
and then count reads over these regions like
featureCounts -a out.saf -T ${Cores} -F SAF -o out.counts *.bam
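One detail worth checking in the conversion: BED starts are 0-based, while the SAF format is, as far as I know from the Subread documentation, 1-based and inclusive like GTF. Under that assumption, a variant of the conversion that shifts the start by one could look like the sketch below. The function name `bed_to_saf` is made up for illustration.

```shell
# Hypothetical helper: convert tab-separated BED (0-based, half-open) read
# from stdin into SAF rows. ASSUMPTION: SAF coordinates are 1-based and
# inclusive, so the start is shifted by +1 and the end is left unchanged.
bed_to_saf() {
    awk 'BEGIN { FS = OFS = "\t" }
         # The region ID keeps the original BED coordinates for traceability.
         { print $1 "_" $2 "_" $3, $1, $2 + 1, $3, "." }'
}

# Example usage (file names are placeholders):
#   bed_to_saf < in.bed > out.saf
```

For read counting over exon-sized intervals the one-base difference rarely changes the numbers much, but it keeps the intervals consistent if the SAF is later compared against other 1-based annotations.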
For per-base-pair depth, use mosdepth, as genomac suggested.
ATpoint: Hi, I want to produce coverage plots for multiple .bam files together for a single .bed target file. I browsed through the mosdepth manual but couldn't find any direct command to produce coverage plots across multiple samples. Do you suggest using the outputs from the individual files in R, as shown in the blog post here? Or are there any tools that produce such plots directly?
ijlal.hyder2012 : Please do not delete posts when they have received comments/answers.