Hi,
I want to use the samtools command
samtools faidx <ref.fasta> [region1 [...]]
My question: where can I get ref.fasta, or how can I create it with some command? Suppose I already have a BAM file.
Thanks.
Get it here:
http://hgdownload.cse.ucsc.edu/goldenPath/hg18/bigZips/
Download hg18.2bit from there; this is hg18 in a binary (2bit) encoding. You can convert it to FASTA format using twoBitToFa. You can download this tool here:
http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/
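A minimal sketch of the conversion and indexing steps, assuming twoBitToFa and samtools are both installed and on your PATH. The function name build_reference is made up for illustration; the output name is simply the input name with .2bit replaced by .fa.

```shell
# Hypothetical helper: convert a .2bit file to FASTA and index it.
# Assumes twoBitToFa (UCSC) and samtools are installed and on PATH.
build_reference() {
    twobit="$1"                   # e.g. hg18.2bit
    fasta="${twobit%.2bit}.fa"    # e.g. hg18.fa
    twoBitToFa "$twobit" "$fasta" || return 1
    samtools faidx "$fasta" || return 1   # writes "$fasta.fai" next to the FASTA
    echo "$fasta"
}

# Usage:
#   ref=$(build_reference hg18.2bit)
#   samtools faidx "$ref" chr1:1000-2000
```

The last usage line is the command from the question: once the index exists, samtools faidx with a region prints just that slice of the reference.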
//Edit:
Ok, then you can download the single chromosomes here:
http://hgdownload.cse.ucsc.edu/goldenPath/hg18/chromosomes/
Skip the ones with "random" in the name and take just chr1.fa.gz to chrY.fa.gz (it depends on what you need; maybe ask the person who did the alignment).
After that, unzip them and merge them into a single file.
.fa is the same as .fasta.
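The unzip-and-merge step could look roughly like this. It is only a sketch: merge_chromosomes is a hypothetical helper, it assumes the chr*.fa.gz files are already downloaded into one directory, and the default order (chr1 to chr22, then chrX, chrY) is an assumption. Ideally the order should follow the @SQ lines in your BAM header, since some downstream tools are picky about that.

```shell
# Hypothetical helper: merge per-chromosome files into one reference FASTA.
# Usage: merge_chromosomes DIR OUT [NAME...]
# Assumes DIR contains chrNAME.fa.gz files (e.g. from the UCSC chromosomes directory).
merge_chromosomes() {
    dir="$1"; out="$2"; shift 2
    # Default order: chr1..chr22, chrX, chrY (the *_random files are never touched).
    [ "$#" -gt 0 ] || set -- 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X Y
    : > "$out"                       # start with an empty output file
    for name in "$@"; do
        f="$dir/chr$name.fa.gz"
        if [ ! -f "$f" ]; then
            echo "missing: $f" >&2
            return 1
        fi
        gunzip -c "$f" >> "$out"     # decompress and append, keeping the .gz intact
    done
}

# Usage:
#   merge_chromosomes ./chromosomes hg18.fa
#   samtools faidx hg18.fa
```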
The BAM file will not contain the reference genome (if that is what you are asking). Check the header:
samtools view -H my.bam
You may find some information about the exact version that was used to align the data. If you can't find anything I'd suggest you contact the person that generated the alignments.
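If the header is long, something like the following can condense the @SQ lines into one row per sequence. This is a sketch only; summarize_sq is a made-up name, and it assumes the header is tab-separated as in a real SAM header (forum posts often show the tabs as spaces).

```shell
# Hypothetical helper: for each @SQ header line read from stdin, print
# sequence name, length, assembly (AS) and source URI (UR), tab-separated.
# Missing tags are shown as ".".
summarize_sq() {
    awk -F '\t' '$1 == "@SQ" {
        name = "."; len = "."; asm = "."; uri = "."
        for (i = 2; i <= NF; i++) {
            tag = substr($i, 1, 2)   # two-letter tag, e.g. SN
            val = substr($i, 4)      # everything after "XX:"
            if (tag == "SN") name = val
            else if (tag == "LN") len = val
            else if (tag == "AS") asm = val
            else if (tag == "UR") uri = val
        }
        print name "\t" len "\t" asm "\t" uri
    }'
}

# Usage:
#   samtools view -H my.bam | summarize_sq
```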
There is a header. I found something like this:
@SQ SN:chr1_random LN:1663265 AS:HG18 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly18.fasta M5:cc05cb1554258add2eb62e88c0746394 SP:Homo sapiens
So should I download this file as the reference FASTA?
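One way to check whether a downloaded FASTA really is the file named in UR is to compare checksums against the M5 tag. As far as I understand the SAM specification, M5 is the MD5 of the sequence with whitespace removed and all letters uppercased. The sketch below follows that reading; seq_m5 is a hypothetical helper, and it assumes md5sum from GNU coreutils is available.

```shell
# Hypothetical helper: print the MD5 of one sequence in a FASTA file,
# computed over the uppercased sequence with all whitespace stripped.
# Usage: seq_m5 FASTA NAME
# Note: if NAME is not in the file, this prints the MD5 of the empty string.
seq_m5() {
    fasta="$1"; name="$2"
    awk -v want="$name" '
        /^>/ { keep = (substr($1, 2) == want); next }   # header line: switch output on/off
        keep { print toupper($0) }                      # sequence lines of the wanted record
    ' "$fasta" | tr -d ' \t\r\n' | md5sum | cut -d ' ' -f 1
}

# Usage:
#   seq_m5 Homo_sapiens_assembly18.fasta chr1_random
```

If the printed value equals the M5 in the header line (cc05cb1554258add2eb62e88c0746394 for chr1_random here), that sequence in the file matches the one the reads were aligned to.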
Which genome was used to create your BAM file? By that I mean, to which genome were the reads aligned?
Human genome. dbGaP phenotype release
Which human genome? hg18, hg19, or another one? Normally you can download hgXX as single chromosomes and merge them into hgXX.fasta, which is your ref.fasta.
hg18 genome. Where can I download it?
Just a warning: if you already have a BAM file, the reads have already been mapped, so the reference file must already exist somewhere. You should try to retrieve that exact reference file, because if you download a different one you could end up with nomenclature or position errors that won't be easy to deal with.
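A quick sanity check for this kind of mismatch is to compare the sequence names and lengths in the BAM header against the .fai index of the candidate reference. The following is a rough sketch; compare_contigs is a made-up name, and it only checks names and lengths, not the sequence content itself.

```shell
# Hypothetical helper: report @SQ entries of a SAM header that are missing
# from, or differ in length from, a FASTA index (.fai).
# Usage: compare_contigs HEADER.sam REF.fai
# Exit status is 0 when everything matches, 1 otherwise.
compare_contigs() {
    awk -F '\t' '
        BEGIN { bad = 0 }
        NR == FNR { fai[$1] = $2; next }     # first file (.fai): name -> length
        $1 == "@SQ" {
            name = ""; len = ""
            for (i = 2; i <= NF; i++) {
                if (substr($i, 1, 3) == "SN:") name = substr($i, 4)
                if (substr($i, 1, 3) == "LN:") len = substr($i, 4)
            }
            if (!(name in fai)) {
                print "missing in reference: " name; bad = 1
            } else if (fai[name] != len) {
                print "length mismatch: " name " (BAM " len ", reference " fai[name] ")"; bad = 1
            }
        }
        END { exit bad }
    ' "$2" "$1"
}

# Usage:
#   samtools view -H my.bam > header.sam
#   samtools faidx ref.fasta          # creates ref.fasta.fai
#   compare_contigs header.sam ref.fasta.fai && echo "names and lengths agree"
```

A typical failure is a reference that uses "1" where the BAM uses "chr1"; that shows up here as "missing in reference".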