CNVnator questions, help please!
12.6 years ago
madkitty ▴ 690
  1. I ran a test with a single BAM file and chrX on step A) Extract read mapping from BAM/SAM files. When I type -genome mm9, it outputs a 17Mb root file. When I don't specify any genome name, it outputs a 30Mb root file. Which one is the right one?

  2. We have 4 BAM files per sample. For each of steps A, B, C, D and E, can I give the paths of all 4 BAM files on the command line in one go? Like this: [whatever command step] /mybamfile/file1.bam /mybamfile/file2.bam /mybamfile/file3.bam /mybamfile/file4.bam

  3. GENERATING HISTOGRAM: the README file says "Files with chromosome sequences are required and should reside in running directory or directory specified by -d option. Files should be named as: chr1.fa, chr2.fa, etc."

3.1. Are those files from the mm9 reference genome? (That's all we have.)

3.2. Is file.root my previous out.root from step A?

3.3. If I do generate the histogram, what does it tell me? Am I supposed to get a number out of it to use later?

3.4. After -his we have to give the bin_size. I have no clue where I am supposed to find that number or what it represents.

4. Step C) CALCULATING STATISTICS: is file.root the same file as the out.root from step A, so that we re-use the same output file all the time? I tried it with out.root, without really knowing, and it says the following:

Zero value of GC average.
Bin 1083251 with center 1.08325e+08 is not corrected.   (says that about 20 times)

Then it says:
Making statistics for chrX after GC correction ...
Warning in <Fit>: Fit data is empty
Warning in <Fit>: Fit data is empty
Average RD per bin (1-22) is 0 +- 0 (after GC correction)
Average RD per bin (X,Y)  is 3.42284 +- 3.19117 (after GC correction)

What are all those numbers?

12.6 years ago

Well, those are a lot of questions in one go! Let's see if we can help you there:

1- Have you checked their content? This usually helps... As explained in the README:

Chromosome names and lengths are parsed from sam/bam file header. Using -genome option one can overwrite this default.

So it depends on how you want this information to be parsed.
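To see which chromosome names and lengths a header actually declares, one can look at its @SQ lines. A minimal sketch, assuming a SAM-format header such as the one printed by `samtools view -H`; the header text below is made-up sample data, not taken from the files discussed here, and `list_chromosomes` is just an illustrative helper name:

```shell
# Hypothetical SAM header, standing in for the output of `samtools view -H file.bam`.
header=$(printf '@HD\tVN:1.0\tSO:coordinate\n@SQ\tSN:chrX\tLN:1000\n@SQ\tSN:chrY\tLN:500\n')

# Print the name (SN) and length (LN) of every reference sequence declared in @SQ lines.
list_chromosomes() {
    awk -F'\t' '$1 == "@SQ" {
        name = ""; len = ""
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^SN:/) name = substr($i, 4)
            if ($i ~ /^LN:/) len  = substr($i, 4)
        }
        print name "\t" len
    }'
}

chromosomes=$(printf '%s\n' "$header" | list_chromosomes)
printf '%s\n' "$chromosomes"
```

Comparing this list with the names you expect for mm9 is one way to judge whether overriding the header with -genome makes sense for your data.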

2- No, you will probably have to make some kind of loop. In bash, for instance:

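for i in 1 2 3 4 ; do
    # Keep "&" outside the quotes: inside them it becomes part of the file name instead of backgrounding the job
    [whatever command step] "/mybamfile/file${i}.bam" &
done
wait  # block until all four background jobs have finished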

3.1- This is quite explicit: the files should be from the reference genome you are working on, so if you are working on mm9, the answer is yes.
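If the reference comes as one multi-FASTA file rather than one file per chromosome, it can be split with a short awk command. This is only a sketch: it assumes the sequence names in the FASTA headers already follow the chr1, chr2, ... naming, and `genome.fa` with its contents is a tiny made-up stand-in for a real mm9 file.

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Tiny made-up multi-FASTA standing in for the real reference (hypothetical content).
printf '>chr1\nACGT\nACGT\n>chrX some description\nGGCC\n' > genome.fa

# At each header line, switch to a new output file named after the first word of the header;
# every line (header included) is then written to the current output file.
awk '/^>/ { if (out) close(out); out = substr($1, 2) ".fa" } { print > out }' genome.fa

ls chr*.fa
```

The resulting directory is then what one would run in, or point -d at.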

3.2- Sounds logical, doesn't it?

3.3- Have you read the initial Nature paper and the CNVnator paper? These should give you the answer you are looking for. Anyway, this step seems optional.

3.4- This is the size of the bins you want for your histogram. You should get an idea of what this number should be by reading the two papers mentioned above.

4- For now, you have run step A only on chrX, so no, you should not be using this .root for all the other steps, but only for the steps concerning chrX. You should therefore run this on each chromosome (generating as many .root files as you have chromosomes). There also seems to be a problem with your .root file, so you should check your step A carefully.
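The per-chromosome approach can be sketched as a dry run that only prints the commands, so they can be inspected before anything is executed. The flag spellings (-root, -chrom, -tree, -his, -stat, -d) are as I remember them from the CNVnator README, so check them against your version. The FASTA directory, the chromosome list, the `out.<chrom>.root` naming and the bin size of 100 are placeholders, not recommendations.

```shell
# Dry run: print the per-chromosome commands instead of executing them.
# FASTA_DIR, BIN_SIZE and the chromosome list are placeholders (assumptions).
BAM=/mybamfile/file1.bam
FASTA_DIR=/path/to/mm9_fasta
BIN_SIZE=100

# Print steps A to C for one chromosome, each using that chromosome's own .root file.
print_steps() {
    chrom=$1
    root="out.${chrom}.root"
    echo "cnvnator -root $root -chrom $chrom -tree $BAM"
    echo "cnvnator -root $root -chrom $chrom -his $BIN_SIZE -d $FASTA_DIR"
    echo "cnvnator -root $root -chrom $chrom -stat $BIN_SIZE"
}

for chrom in chr1 chr2 chrX; do
    print_steps "$chrom"
done
```

Once the printed lines look right, the `echo` calls can be replaced by the real invocations.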


That was a very useful answer, thanks a lot! I think we don't have the same README file: the one I have fits in 150 lines and says nothing about chromosome names. Which version do you use?


I just downloaded the current CNVnator version to help you out, and so I might have a more up-to-date version than you.


Alright, thanks a lot! :)
