11.1 years ago
Tonyzeng
Hi, I have a VCF file, and I need to change the order of the contig IDs in its header from
##contig=<ID=1,length=195471971>
##contig=<ID=10,length=130694993>
##contig=<ID=11,length=122082543>
##contig=<ID=12,length=120129022>
##contig=<ID=13,length=120421639>
##contig=<ID=14,length=124902244>
##contig=<ID=15,length=104043685>
##contig=<ID=16,length=98207768>
##contig=<ID=17,length=94987271>
##contig=<ID=18,length=90702639>
##contig=<ID=19,length=61431566>
##contig=<ID=2,length=182113224>
##contig=<ID=3,length=160039680>
##contig=<ID=4,length=156508116>
##contig=<ID=5,length=151834684>
##contig=<ID=6,length=149736546>
##contig=<ID=7,length=145441459>
##contig=<ID=8,length=129401213>
##contig=<ID=9,length=124595110>
##contig=<ID=X,length=171031299>
How can I change it to
##contig=<ID=10,length=130694993>
##contig=<ID=11,length=122082543>
##contig=<ID=12,length=120129022>
##contig=<ID=13,length=120421639>
##contig=<ID=14,length=124902244>
##contig=<ID=15,length=104043685>
##contig=<ID=16,length=98207768>
##contig=<ID=17,length=94987271>
##contig=<ID=18,length=90702639>
##contig=<ID=19,length=61431566>
##contig=<ID=1,length=195471971>
##contig=<ID=2,length=182113224>
##contig=<ID=3,length=160039680>
##contig=<ID=4,length=156508116>
##contig=<ID=5,length=151834684>
##contig=<ID=6,length=149736546>
##contig=<ID=7,length=145441459>
##contig=<ID=8,length=129401213>
##contig=<ID=9,length=124595110>
##contig=<ID=X,length=171031299>
That's not a BAM header. Do you mean VCF?
Thank you for the reminder, Dpryan. I corrected it.
Do you need to reorder the whole file, or just the header lines? It's unclear from your question.
I just need to reorder the header lines, because the order of the read lines has already been fixed. Thank you!
Huh! I just wrote code for you to order the read lines. Anyway, it's high time for you to learn vi commands (http://www.cs.colostate.edu/helpdocs/vi.html). Use Unix tools to edit the file if it is too big for a Windows application like Notepad++.
Thanks, Ashutoshmits. I am sorry I did not make it clear: I generated a VCF file with the correct chromosome order for the read lines, but not for the header lines. As for the header of the VCF file, I still need to reorder ##contig=<ID=number. I assumed that the following code you posted works for ordering the read lines but not for the header lines.
Ashutoshmits, I have finished running base recalibration in GATK without modifying the order of the ##contig= lines, and it completed without any problem. So I do not need to sort the header anymore.
Cool. It means GATK doesn't care about the contig order in the header of a VCF file.
Oh yeah! Thank you so much for your help anyway, Ashutoshmits
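For anyone who does still need to reorder only the ##contig header lines, here is a minimal Python sketch. It is not from the thread: the function names, the DESIRED_ORDER list (taken from the target order in the question) and the file paths are all assumptions for illustration, and it assumes an uncompressed, plain-text VCF. The ##contig lines are rewritten into the slots they already occupy, so every other header line and all record lines stay where they are.

```python
import re
import sys

# Assumed target order, copied from the question: 10-19, then 1-9, then X.
DESIRED_ORDER = [str(n) for n in range(10, 20)] + [str(n) for n in range(1, 10)] + ["X"]

# Captures the contig ID from a line such as ##contig=<ID=10,length=130694993>
CONTIG_ID = re.compile(r"^##contig=<ID=([^,>]+)")


def reorder_contig_lines(lines, order=DESIRED_ORDER):
    """Return a copy of `lines` with the ##contig lines arranged to follow `order`.

    Non-contig lines are left untouched. Contigs whose ID is not in `order`
    are placed after the listed ones, keeping their original relative order.
    """
    rank = {contig_id: i for i, contig_id in enumerate(order)}
    slots = []    # positions of the ##contig lines in the input
    contigs = []  # (rank, line) pairs for those lines
    for idx, line in enumerate(lines):
        match = CONTIG_ID.match(line)
        if match:
            slots.append(idx)
            contigs.append((rank.get(match.group(1), len(order)), line))
    contigs.sort(key=lambda pair: pair[0])  # stable sort keeps ties in input order
    out = list(lines)
    for idx, (_, line) in zip(slots, contigs):
        out[idx] = line
    return out


def rewrite_vcf(in_path, out_path, order=DESIRED_ORDER):
    """Copy a plain-text VCF, reordering only the ##contig header lines."""
    with open(in_path) as src, open(out_path, "w") as dst:
        header = []
        in_header = True
        for line in src:
            if in_header and not line.startswith("#"):
                # First record line reached: flush the reordered header.
                dst.writelines(reorder_contig_lines(header, order))
                in_header = False
            if in_header:
                header.append(line)
            else:
                dst.write(line)
        if in_header:  # file contained header lines only
            dst.writelines(reorder_contig_lines(header, order))


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python reorder_contigs.py in.vcf out.vcf
    rewrite_vcf(sys.argv[1], sys.argv[2])
```

Only the header is held in memory and the records are streamed line by line, so this should also cope with files too large for a desktop editor. As the thread notes, the GATK run here completed without reordering the header, so check whether your tool actually needs it first.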