New mouse reference genome (GRCm39/mm39)
15 months ago
sarahmanderni ▴ 120

Hi,

I am planning to align ChIP-seq data with bowtie2 against the new mouse reference genome mm39, available from UCSC: https://hgdownload.soe.ucsc.edu/goldenPath/mm39/bigZips/. Did I understand correctly that mm39.2bit and mm39.fa.gz contain the same sequence, so that I can use either of them for the alignment (converting the 2bit file to FASTA first)? And one practical question: have you noticed any significant overall differences compared to alignment against mm10?

reference-genome mm39 • 1.9k views

Did I understand correctly that both mm39.2bit and mm39.fa.gz are the same and I can use either of them (2bit file first convert to fasta) for the alignment?

Yes. 2bit is a format that stores DNA sequences compactly (I believe it even retains the masked regions), but I wouldn't go through the hassle of converting it to FASTA when a FASTA file is already available.

15 months ago

Did I understand correctly that both mm39.2bit and mm39.fa.gz are the same

mm39.fa.gz is a FASTA text file compressed with gzip.

.2bit is a UCSC-specific binary format in which each nucleotide is packed into two bits (four bases per byte); it is mostly used by blat.
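
To illustrate what that packing means, here is a minimal Python sketch. It is not the UCSC implementation: it only mimics how the DNA portion of a record is encoded (T=00, C=01, A=10, G=11, first base in the most significant bits). The function names `pack` and `unpack` are made up for this example, and the header, index, N blocks and mask blocks of a real .2bit file are not handled.

```python
# Illustrative only: 2-bit packing of an ACGT string, four bases per byte.
# A real .2bit file also has a header, an index, and separate lists of
# N blocks and soft-mask blocks; none of that is modelled here.
CODES = {"T": 0, "C": 1, "A": 2, "G": 3}
BASES = "TCAG"  # position in this string equals the 2-bit code


def pack(seq):
    """Pack an ACGT string into bytes (case is discarded)."""
    seq = seq.upper()
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODES[base]
        # Pad a short final chunk with zero bits on the right.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


def unpack(data, length):
    """Recover the first `length` bases from packed bytes (uppercase)."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 3])
    return "".join(bases[:length])
```

Because two bits can only distinguish four letters, case cannot live in the packed bases themselves, which is why the format keeps soft-masking (and runs of N) as separate coordinate lists.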

bowtie2 cannot align against either file directly; you have to build its own index from the FASTA with bowtie2-build.
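
A rough sketch of that step, driven from Python, could look like the following. It assumes `bowtie2-build` is installed and on your PATH; the helper names, the file names, and the thread count are placeholders I picked for illustration. The FASTA is decompressed first to stay on the safe side with older bowtie2 versions.

```python
import gzip
import shutil
import subprocess
from pathlib import Path


def decompress_fasta(gz_path, fasta_path):
    """Write the gunzipped contents of gz_path to fasta_path."""
    with gzip.open(gz_path, "rb") as src, open(fasta_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return Path(fasta_path)


def bowtie2_build_command(fasta_path, index_prefix, threads=4):
    """Return the bowtie2-build argument list: reference in, index prefix out."""
    return [
        "bowtie2-build",
        "--threads", str(threads),
        str(fasta_path),
        str(index_prefix),
    ]


def build_index(gz_path, fasta_path, index_prefix, threads=4):
    """Decompress the FASTA, then run bowtie2-build (raises if it fails)."""
    fasta = decompress_fasta(gz_path, fasta_path)
    cmd = bowtie2_build_command(fasta, index_prefix, threads)
    subprocess.run(cmd, check=True)
    return cmd
```

For example, `build_index("mm39.fa.gz", "mm39.fa", "mm39")` would write the index files with the prefix `mm39`, which is then what you pass to `bowtie2 -x`. If you only had the 2bit file, UCSC's `twoBitToFa mm39.2bit mm39.fa` would give you the FASTA to start from.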


Thanks for the response. So the content of mm39.2bit and mm39.fa.gz is the same regardless of format, and I can choose which one to use. I just wanted to make sure, because the UCSC readme.txt describes mm39.fa.gz as "Soft-masked" but does not use that term for mm39.2bit (which it calls the "complete" reference).
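
One way to check this directly, rather than relying on the readme wording, is to compare per-chromosome checksums of the downloaded FASTA and a FASTA converted from the 2bit file. The sketch below is only an illustration: `fasta_digests` is a made-up helper, and the file names in the usage note are placeholders. Soft-masking is just lowercase letters in FASTA, so hashing with `ignore_case=True` compares the underlying sequence, while `ignore_case=False` also compares the masking.

```python
import gzip
import hashlib


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    path = str(path)
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def fasta_digests(path, ignore_case=True):
    """Return {sequence name: MD5 of its bases}, independent of line wrapping."""
    digests = {}
    name, hasher = None, None
    with open_text(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if name is not None:
                    digests[name] = hasher.hexdigest()
                # Use the first word of the header as the sequence name.
                name = line[1:].split()[0]
                hasher = hashlib.md5()
            elif hasher is not None:
                seq = line.upper() if ignore_case else line
                hasher.update(seq.encode("ascii"))
    if name is not None:
        digests[name] = hasher.hexdigest()
    return digests
```

Comparing `fasta_digests("mm39.fa.gz")` with `fasta_digests("mm39.from2bit.fa")` should then show whether the two files carry the same bases, and repeating the comparison with `ignore_case=False` would show whether the masking matches as well.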
