I'm using VariationHunter to analyze CNVs. When I used the hg18 reference genome some time ago, the output usually looked like this:
IL5_286:1:311:10:40 chr1 142188880 142188915 R 141736787 141736822 F deletion 3 23.930555 0.00257801939733326435
IL5_286:1:317:831:353 chr16 28530262 28530297 R 28517264 28517299 F deletion 0 26.263889 1.00000000000000000000
The second column gives the chromosome on which the potential CNV lies.
Now that I have switched to humang1kv37.fasta (one version of hg19), I get output like this:
HWI-ST150_0129:3:47:9776:140941 1 dna:chromosome chromosome:GRCh37:1:1:249250621:1 144308983 144309081 F 143689458 143689556 F V 4 60 2.254486e-12
HWI-ST150_0129:3:4:4776:183027 GL000225.1 dna:supercontig supercontig::GL000225.1:1:211173:1 6803 6901 F 102087 102185 F V5 64 1.502623e-12
The difference in format is that the single "chr#" column has been replaced by three columns, e.g. "1 dna:chromosome chromosome:GRCh37:1:1:249250621:1".
So what is this "1 dna:chromosome chromosome:GRCh37:1:1:249250621:1"? Is it a header indicating the chromosome number, or something else? I can grep this "header" from humang1kv37.fasta, but human_hg18.fastq doesn't seem to contain any header-like lines of this kind.
Also, "GL000225.1 dna:supercontig supercontig::GL000225.1:1:211173:1" is quite confusing to me. What does "GL000225.1" stand for? Some unusual part of a chromosome (like the mitochondrial chromosome)?
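In case it helps anyone reproduce this, here is a minimal sketch of how the header lines can be pulled out of a FASTA file and split into fields. It assumes the lines I grepped start with ">" in the file and that the first whitespace-delimited token after ">" is the sequence name, with everything after it being free-text description (the usual FASTA convention, as far as I understand). The function names and the short sequences in the sample are made up for illustration.

```python
def split_fasta_header(line):
    """Split one FASTA header line into (name, description).

    Assumption: the name is the first whitespace-delimited token
    after '>', and the rest of the line is a description.
    """
    if not line.startswith(">"):
        raise ValueError("not a FASTA header line")
    parts = line[1:].strip().split(None, 1)
    name = parts[0] if parts else ""
    description = parts[1] if len(parts) > 1 else ""
    return name, description


def list_fasta_headers(lines):
    """Return (name, description) for every header line in an iterable of lines."""
    return [split_fasta_header(line) for line in lines if line.startswith(">")]


# Hypothetical sample: header text copied from my output, sequences invented.
sample = [
    ">1 dna:chromosome chromosome:GRCh37:1:1:249250621:1\n",
    "NNNNNNNNNN\n",
    ">GL000225.1 dna:supercontig supercontig::GL000225.1:1:211173:1\n",
    "ACGTACGT\n",
]

for name, description in list_fasta_headers(sample):
    print(name, "|", description)
```

If that assumption holds, the three columns in my new output would simply be the whole header line split on spaces, with the first of them playing the role that "chr#" played before.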
Thanks a lot.