I am trying to use cn.mops to make CNV calls on my BWA-aligned BAM files, the getReadCountsFromBAM()
function has an option for seqRefName
, and the explanation for this field is:
Name of the reference sequence that should be analyzed. The name must appear in the header of the BAM file. If it is not given the function will select the first reference sequence that appears in the header of the BAM files.
I used samtools view to view the header of my bam files, and here is what I see:
@HD VN:1.4 SO:coordinate
@SQ SN:chr1 LN:249250621
@SQ SN:chr10 LN:135534747
@SQ SN:chr11 LN:135006516
@SQ SN:chr11_gl000202_random LN:40103
@SQ SN:chr12 LN:133851895
@SQ SN:chr13 LN:115169878
@SQ SN:chr14 LN:107349540
@SQ SN:chr15 LN:102531392
@SQ SN:chr16 LN:90354753
@SQ SN:chr17 LN:81195210
@SQ SN:chr17_ctg5_hap1 LN:1680828
@SQ SN:chr17_gl000203_random LN:37498
@SQ SN:chr17_gl000204_random LN:81310
@SQ SN:chr17_gl000205_random LN:174588
@SQ SN:chr17_gl000206_random LN:41001
@SQ SN:chr18 LN:78077248
@SQ SN:chr18_gl000207_random LN:4262
@SQ SN:chr19 LN:59128983
@SQ SN:chr19_gl000208_random LN:92689
@SQ SN:chr19_gl000209_random LN:159169
@SQ SN:chr1_gl000191_random LN:106433
@SQ SN:chr1_gl000192_random LN:547496
@SQ SN:chr2 LN:243199373
@SQ SN:chr20 LN:63025520
@SQ SN:chr21 LN:48129895
@SQ SN:chr21_gl000210_random LN:27682
@SQ SN:chr22 LN:51304566
@SQ SN:chr3 LN:198022430
@SQ SN:chr4 LN:191154276
@SQ SN:chr4_ctg9_hap1 LN:590426
@SQ SN:chr4_gl000193_random LN:189789
@SQ SN:chr4_gl000194_random LN:191469
@SQ SN:chr5 LN:180915260
@SQ SN:chr6 LN:171115067
@SQ SN:chr6_apd_hap1 LN:4622290
@SQ SN:chr6_cox_hap2 LN:4795371
For the BWA alignment, I used hg19 as my reference, however, I do not see hg19 in the BAM headers, so what does it mean by seqRefName in cn.mop
One thing I've noticed is that getReadCountsFromBAM()
seems using "chr1" as the default seqRefName when I don't specify it. So should I use a vector of chr(1..22) for seqRefName?
Yes, you need to use vector of all chr names if you would to use all chromosomes in the same time. Example is:
Thanks. This was very helpful. I finally used
I have the same problem, but I donĀ“t understand your solution.
I just tried to copy and past, but nothing happen...
Can someone explain me, what I have to do? pls!
Though I have never used cn.mops but based on whatever you posted it seems that by "reference sequence" cn.mops means "chromosomes" or "contigs" that were part of the reference fasta file against which the reads were aligned.