Mapping Solid Reads To Hg19 Through Bioscope
14.0 years ago

This question is conceptually similar to a previous one posted by Pierre Lindenbaum, but adapted to SOLiD data. Our group is planning to map SOLiD results to hg19 using the proprietary software BioScope (free so far, though). Although many of the NGS groups we know are sticking to hg18, none of them has been able to convince us not to use hg19: we focus on human variation and always want to use the most up-to-date version of dbSNP (among other databases). Any thoughts on this matter?

By the way, is anyone here using hg19 with BioScope? The default installation comes with hg18 files only, and surprisingly it doesn't seem straightforward to upgrade to hg19. Besides hg19.fa, which anyone can get from, for instance, the UCSC genome browser, other files are needed that we haven't been able to find anywhere. Of course LifeTech is "working on it", but I was wondering whether any of you has already solved this issue.

solid next-gen sequencing mapping hg hg • 4.4k views

Which files are missing? BioScope can be accessed via command line (SSH) so perhaps you could use that instead of the web interface.


Sure, I almost always access the offline cluster by ssh. The default BioScope installation gives me a concise folder structure for hg18 at etc/files, although only the .fa and .cmap files are needed for the targeted resequencing module we are using. I took the hg19.2bit file from the UCSC genome browser, converted it to .fa, and generated the .cmap following the instructions I found in a cmap folder of the default installation, but things did not work. I was wondering whether any other researcher has solved this, and where I could download those files from.

14.0 years ago

It is possible to align to hg19 using Bioscope.

Bioscope wants a multi-fasta file, and a cmap file pointing to the per-chromosome files. You'll also need a dbSNP source compatible with hg19 for annotation of SNP calls.

The 2bit file you converted may not be in a properly line wrapped fasta format. Make sure the file is compatible with samtools faidx first.
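A quick way to spot wrapping problems before handing the file to samtools faidx is sketched below. It is only an illustration, not part of Bioscope or samtools: the function name `badly_wrapped_records` is made up here, and the rule it applies (within each record every sequence line has the same length, except the last, which may be shorter but not empty) is a conservative reading of what faidx expects.

```python
from typing import Iterable, List


def badly_wrapped_records(lines: Iterable[str]) -> List[str]:
    """Return the names of FASTA records whose sequence lines are not
    uniformly wrapped (hypothetical helper, conservative check)."""
    bad: List[str] = []
    name = None
    lengths: List[int] = []

    def flush() -> None:
        # Judge the record collected so far, if there is one.
        if name is None or not lengths:
            return
        width = lengths[0]
        body_ok = all(n == width for n in lengths[:-1])
        last_ok = 0 < lengths[-1] <= width
        if not (body_ok and last_ok):
            bad.append(name)

    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(">"):
            flush()
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths = []
        else:
            # Blank lines count as length 0 and get the record flagged.
            lengths.append(len(line))
    flush()
    return bad


if __name__ == "__main__":
    import sys

    # Usage: python check_wrap.py hg19.fa
    with open(sys.argv[1]) as handle:
        for record in badly_wrapped_records(handle):
            print("inconsistent line wrapping in", record)
```

If this reports any record, re-wrapping the sequence to a fixed width (or simply using the per-chromosome FASTA files) is probably easier than debugging the converted file.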

You should create a working multi-fasta reference by fetching the per-chromosome files from UCSC or NCBI, and concatenating them in a sensible order (chr1, chr2, chr3, ... rather than strictly alphabetical). See: Where Can I Download Human Reference Genome In Fasta Format? Hgref.Fa File
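As a rough sketch of the concatenation step, the snippet below sorts per-chromosome files numerically and joins them into one multi-fasta. Several things here are assumptions rather than Bioscope requirements: the files are named like `chr1.fa` (as in the UCSC downloads), chrM is placed after chrY, any other contigs go last in alphabetical order, and the function names are invented for this example.

```python
import re
import shutil
from pathlib import Path
from typing import List, Tuple

# Assumed ranks for the non-numeric chromosomes.
_SPECIAL = {"X": 23, "Y": 24, "M": 25, "MT": 25}


def chrom_sort_key(name: str) -> Tuple[int, int, str]:
    """Sort key giving chr1, chr2, ..., chr22, chrX, chrY, chrM, then the rest."""
    match = re.fullmatch(r"chr(\d+|X|Y|M|MT)", name)
    if match is None:
        # Random/haplotype contigs: after the main chromosomes, alphabetical.
        return (1, 0, name)
    tag = match.group(1)
    rank = int(tag) if tag.isdigit() else _SPECIAL[tag]
    return (0, rank, name)


def concatenate_chromosomes(fasta_dir: Path, out_path: Path) -> List[str]:
    """Write all chr*.fa files from fasta_dir into out_path, in sorted order.

    out_path should not itself match chr*.fa inside fasta_dir.
    Each input file is assumed to end with a newline.
    """
    files = sorted(fasta_dir.glob("chr*.fa"), key=lambda p: chrom_sort_key(p.stem))
    with out_path.open("wb") as out:
        for path in files:
            with path.open("rb") as src:
                shutil.copyfileobj(src, out)
    return [p.stem for p in files]
```

For example, `concatenate_chromosomes(Path("hg19/chroms"), Path("hg19/hg19.fa"))` would return the chromosome names in the order they were written (both paths are placeholders), which is also the order to follow when listing the per-chromosome files in the cmap.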

You can construct a cmap file for small indel calling by looking at the existing file and updating it to hg19 file locations. (A cmap file is just a lookup table for the per-chromosome files.) Include (or exclude) the random contigs to match your multi-fasta reference.

For SNP annotation, you'll need to fetch and uncompress these 3 files from the NCBI FTP folder:

  • b132_SNPChrPosOnRef_37_1.bcp.gz
  • b132_SNPContigLocusId_37_1.bcp.gz
  • b132_SNPContigLoc_37_1.bcp.gz

Install these to a hg19/dbSNP folder, and update the annotation parameters accordingly.
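The unpacking step could look something like the following. It assumes the three .gz files have already been downloaded into a local directory; the function name `unpack_dbsnp` and both directory arguments are placeholders for this example, and the exact folder Bioscope expects depends on your annotation parameters.

```python
import gzip
import shutil
from pathlib import Path
from typing import List

# File names as listed above (dbSNP build 132, genome build 37.1).
DBSNP_FILES = [
    "b132_SNPChrPosOnRef_37_1.bcp.gz",
    "b132_SNPContigLocusId_37_1.bcp.gz",
    "b132_SNPContigLoc_37_1.bcp.gz",
]


def unpack_dbsnp(download_dir: Path, target_dir: Path) -> List[Path]:
    """Decompress the three dbSNP dumps from download_dir into target_dir."""
    target_dir.mkdir(parents=True, exist_ok=True)
    unpacked: List[Path] = []
    for name in DBSNP_FILES:
        src = download_dir / name
        dest = target_dir / name[: -len(".gz")]
        # Stream the data so large dumps are never held in memory at once.
        with gzip.open(src, "rb") as fin, dest.open("wb") as fout:
            shutil.copyfileobj(fin, fout)
        unpacked.append(dest)
    return unpacked
```

Calling `unpack_dbsnp(Path("downloads"), Path("hg19/dbSNP"))` leaves the original .gz files untouched and returns the paths of the uncompressed .bcp files.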

Some of the other modules will not function (CNV needs mappability computed for the new genome) - these will have to wait until LifeTech releases support for hg19. For targeted resequencing, you should be fine with the files listed. (Make sure your targets have hg19 coordinates, too!)


Found out that the memory requirements for BioScope with hg19 and dbSNP132 are 24GB, rather than the 16GB stated when we bought the cluster nodes half a year ago :(


thanks jmanning for such a concise answer. I'm testing all this right away!


Thanks jmanning for such a concise answer. Today we received an updated version of the BioScope draft manual, which now has an "add annotations" section that describes a process very similar to the one you mention. I'm testing both procedures right away!
