Hello, we work with a mouse strain different from the one used to generate the standard mouse reference genome (mm9 or mm10). The Sanger Institute has done NGS on the strain we work with and has a BAM file available on its website.
How can I use that BAM file to assemble a transcriptome that I can use as a reference for analyzing RNA-Seq data from this particular strain? I only care about protein-coding ORFs, so I do not need to do de novo genome assembly.
Thanks.
What kind of BAM file is it? If it is from WGS, then you can't use it directly to assemble a transcriptome. If it is from RNA-seq data, then you could use one of the options mentioned below by @grant after extracting the reads from that BAM file.
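If you are not sure which kind it is, the @PG lines in the BAM header usually record the aligner that produced the file. A rough sketch of that check follows. The function name guess_assay and the file name strain.bam are placeholders, and the aligner list is only a heuristic (a splice-aware aligner suggests RNA-seq, bwa or bowtie suggests DNA sequencing), so treat the answer as a hint rather than proof.

```shell
#!/usr/bin/env bash
# Heuristic sketch: guess the assay type from the @PG lines of a SAM/BAM header.
# Reads header text on stdin and prints one of: rnaseq, dna, unknown.
guess_assay() {
    local pg
    # Keep only the @PG lines, lowercased; tolerate headers with no @PG line.
    pg=$(grep '^@PG' | tr '[:upper:]' '[:lower:]' || true)
    # Check splice-aware aligners first (TopHat runs bowtie internally).
    if printf '%s\n' "$pg" | grep -Eq '(id|pn):(star|hisat|tophat)'; then
        echo rnaseq
    elif printf '%s\n' "$pg" | grep -Eq '(id|pn):(bwa|bowtie)'; then
        echo dna
    else
        echo unknown
    fi
}

# Usage (requires samtools; strain.bam is a placeholder name):
#   samtools view -H strain.bam | guess_assay
#
# If the BAM does turn out to be RNA-seq, paired reads can be pulled back out
# as FASTQ, for example:
#   samtools collate -u -O strain.bam | \
#       samtools fastq -1 reads_1.fq -2 reads_2.fq -0 /dev/null -s /dev/null -n
```

If the result is unknown, looking at the full output of samtools view -H by eye is still the most reliable option.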
Hello mmfansler, thanks for suggesting mmseq. I found a genome for my mouse strain and tried opening it in IGV. I had to gunzip it and make a .genome file as described on the IGV website (https://software.broadinstitute.org/software/igv/LoadGenome). IGV does not read the FASTA genome file. Do you have any advice? Thanks.
Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized. This comment belongs under @mmfansler's answer.
Are you getting an error? Does the file contain only FASTA sequences of transcripts?
Just to test, I downloaded a transcriptome from there, unzipped it, then indexed it (igvtools index transcriptome.fa). This loaded fine in IGV. As mentioned by @genomax, we'll need more details to help further.
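In the meantime, a quick sanity check on the FASTA before loading it may help narrow things down. This is only a sketch: check_fasta and genome.fa are placeholder names, and the check only confirms that the file is non-empty, starts with a '>' header line, and reports how many sequences it contains. It does not validate the sequence content.

```shell
#!/usr/bin/env bash
# Sketch: basic sanity check of a FASTA file before indexing or loading in IGV.
# Prints the number of sequences on success; returns non-zero on failure.
check_fasta() {
    local fa=$1
    # The file must exist and be non-empty.
    [ -s "$fa" ] || { echo "missing or empty: $fa" >&2; return 1; }
    # A FASTA file starts with '>'; a still-gzipped file does not.
    if [ "$(head -c 1 "$fa")" != ">" ]; then
        echo "first byte is not '>': is $fa still compressed, or not FASTA?" >&2
        return 1
    fi
    # Count header lines, i.e. the number of sequences.
    grep -c '^>' "$fa"
}

# Usage (genome.fa is a placeholder name):
#   check_fasta genome.fa && igvtools index genome.fa
# samtools faidx genome.fa is another way to build a .fai index.
```

If the check passes and IGV still refuses the file, posting the exact error message and the first few header lines (head genome.fa) would give us something to work with.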