I very much like IGB and its features. While I have been able to make good use of it, I have run into a problem that I can't solve however much I try. I am trying to view TopHat output (mapped.bam and junction files) from RNA-seq data aligned to the A. lyrata reference genome. When I load the lyrata genome in IGB, I can see the genome coordinates and the TAIRmRNA database (the annotated .gff file). But after I load a mapped.bam and junction file, I cannot see the aligned reads together with the reference and the annotation.
However, I noticed that the mapped.bam and junction files create their own set of scaffolds, listed below the default set (a one-to-one copy of the default set, but I am not sure why). If I select one of the scaffolds that the mapped.bam file created, I can see the mapped reads and the junctions, but then I cannot see the coordinate bases or the annotations. With the A. thaliana genome there is no such problem: I can view the mapped output and junctions from RNA-seq data along with the genome coordinates and bases, the TAIR10 mRNA database, and several other databases from other labs.
Also, I see that an updated version of the Phytozome data is available (V10.2). Is the data for A. lyrata available in IGB (V7) the same as V10.2?
Thanks,
Bishwa K.
If you performed the alignment yourself, it might be a good idea to load the FASTA file together with the GTF file you used for the alignment and visualize the mapping in IGB. Most of the time, this kind of problem comes down to naming: the sequence names in the BAM file do not exactly match the names in the reference. And just in case, you might want to read this if you want to know how to index your FASTA file.
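One way to check for a naming mismatch is to compare the sequence names in the reference FASTA with the names in the BAM header. The sketch below is only an illustration, not part of IGB. It assumes you have the FASTA lines and the header text that `samtools view -H mapped.bam` prints; the function names and the sample data are made up for the example.

```python
def fasta_names(fasta_lines):
    """Return sequence names from FASTA header lines.

    The name is taken as the text between '>' and the first whitespace.
    """
    names = []
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.append(fields[0])
    return names


def sam_header_names(header_lines):
    """Return sequence names from the SN: tag of @SQ header lines."""
    names = []
    for line in header_lines:
        if line.startswith("@SQ"):
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("SN:"):
                    names.append(field[3:])
    return names


def compare_names(reference_names, alignment_names):
    """Return (names only in the alignment, names only in the reference)."""
    ref = set(reference_names)
    aln = set(alignment_names)
    return sorted(aln - ref), sorted(ref - aln)


# Made-up example data: the names differ only in capitalization.
example_fasta = [">scaffold_1 some description", "ACGT", ">scaffold_2", "GG"]
example_header = [
    "@HD\tVN:1.0\tSO:coordinate",
    "@SQ\tSN:Scaffold_1\tLN:4",
    "@SQ\tSN:scaffold_2\tLN:2",
]

only_in_bam, only_in_reference = compare_names(
    fasta_names(example_fasta), sam_header_names(example_header)
)
print("Only in BAM header:", only_in_bam)
print("Only in reference:", only_in_reference)
```

Any name that shows up in only one of the two lists would be a candidate explanation for a second, duplicate-looking set of scaffolds.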
This also works. To open your fasta file in IGB and use it as the reference, select File > Open Genome from File. (Or click the blue and red DNA icon in the toolbar.)
IGB will then display a window that lets you select a fasta or 2bit format file to use as the reference sequence. (Better to use 2bit - it's much faster to read :-)
You can also enter a genome version and species names. It's optional, but if you do that, then IGB will display the names you selected in the Species and Genome Version menus of the Current Genome tab. Otherwise IGB will assign a default name.
Then, click OK.
What happens at that point is that IGB will scan your reference sequence file, make a list of all the chromosomes and their sizes, and then list them in the Sequence table in the Current Genome tab.
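If you want to see ahead of time what that list of chromosomes and sizes should look like, you can compute it yourself from the FASTA file. This is a minimal sketch of the idea in Python, not IGB's actual code; the function name and the sample sequences are made up for the example.

```python
def chromosome_sizes(fasta_lines):
    """Return a dict mapping each sequence name to its length in bases.

    Names are taken as the header text up to the first whitespace.
    Insertion order follows the order of sequences in the file.
    """
    sizes = {}
    current = None
    for raw_line in fasta_lines:
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            fields = line[1:].split()
            current = fields[0] if fields else None
            if current is not None:
                sizes[current] = 0
        elif current is not None:
            sizes[current] += len(line)
    return sizes


# Made-up example data; for a real file, pass an open file handle instead,
# e.g. chromosome_sizes(open("genome.fa")) with your own file name.
example = [">chr1 example description", "ACGT", "AC", "", ">chr2", "GGG"]
for name, size in chromosome_sizes(example).items():
    print(name, size, sep="\t")
```

The names printed here are the ones your BAM, GTF, and BED files need to use for the data to line up with the reference.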
At that point, you can open your files as you would normally, including your GTF file. IGB can read GTF files produced by Cufflinks.
It can also read some GFF3 files. However, GFF3 files are sometimes not read correctly because different groups interpret the GFF3 specification differently and it's hard to make sure that all GFF3 files will work with IGB. For this reason, we recommend using BED or BED-Detail to represent gene models in IGB.
If you use BED-Detail, make a regular BED file. Then add a column 13 with whatever you want the gene title to be (e.g., TP53) and add a column 14 with whatever descriptive text you'd like to see in the Selection Info tab when you click on the gene. For column 4, insert the name of the gene model, e.g., AT1G07350.1 if it's Arabidopsis. For examples, see the "bed" files on the QuickLoad site - there are many examples from many different species. The text you insert into columns 4, 13, and 14 will be available for searching under the Advanced Search tab, so it's useful to add text you think will be helpful for search, like gene name and gene function.
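To make the column layout concrete, here is a small Python sketch that appends the two extra columns to a regular 12-column BED record. The function name is made up for the example, and the coordinates, gene title, and description are invented placeholder values, not real annotation for AT1G07350.1.

```python
def bed_detail_line(bed12_fields, title, description):
    """Build one BED-Detail line: 12 regular BED columns plus
    column 13 (gene title) and column 14 (descriptive text)."""
    if len(bed12_fields) != 12:
        raise ValueError("expected 12 BED fields, got %d" % len(bed12_fields))
    for text in (title, description):
        if "\t" in text or "\n" in text:
            raise ValueError("title and description must not contain tabs or newlines")
    return "\t".join([str(field) for field in bed12_fields] + [title, description])


# Placeholder record: every value below is invented for illustration.
example_bed12 = [
    "Chr1",         # 1: sequence name (must match the reference)
    1000,           # 2: start
    2000,           # 3: end
    "AT1G07350.1",  # 4: name of the gene model
    0,              # 5: score
    "+",            # 6: strand
    1100,           # 7: thickStart
    1900,           # 8: thickEnd
    "0",            # 9: itemRgb
    2,              # 10: blockCount
    "300,400,",     # 11: blockSizes
    "0,600,",       # 12: blockStarts
]

line = bed_detail_line(example_bed12, "EXAMPLE1", "placeholder description of gene function")
print(line)
```

Writing one such line per gene model gives you a file where columns 4, 13, and 14 carry the searchable text.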