How to use my own genome file in IGV?
6.1 years ago

Hello,

I would like to use this reference genome from Ensembl together with its GFF3 annotation file in IGV for visualization.

  • I uncompressed the FASTA file and indexed it with samtools faidx.
  • I uncompressed the gff3 file, sorted it by position, bgzip'ed and tabix indexed it.
  • In IGV I select Genomes -> Create .genome File and give the path to the reference sequence and the annotation file.
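The "sorted it by position" step above can be sketched in Python. This is only an illustration, assuming a plain tab-separated GFF3 file without an embedded ##FASTA section; the function name `sort_gff3_lines` is made up for this example. It orders features by sequence name and start coordinate, which is the order tabix expects.

```python
def sort_gff3_lines(lines):
    """Return GFF3 lines with header/comment lines first,
    followed by feature lines sorted by (seqid, start)."""
    headers = []
    features = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip empty lines
        if line.startswith("#"):
            headers.append(line)  # directives and comments are moved to the top
        else:
            fields = line.split("\t")
            # column 1 = sequence name, column 4 = start position
            features.append((fields[0], int(fields[3]), line))
    # Sequence names are compared as text, start positions numerically.
    features.sort(key=lambda item: (item[0], item[1]))
    return headers + [line for _, _, line in features]
```

The sorted output can then be compressed with `bgzip` and indexed with `tabix -p gff`, as described in the list above.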

What I observe now is:

  • It takes very long until anything is loaded.
  • After finishing there is no annotation available.

What I expected:

I would like to enter a gene name or transcript ID into the search field to jump directly to the corresponding position, just as with the predefined genomes.

How is this supposed to work? What am I missing?

fin swimmer

igv annotation genome • 15k views

I just tried a bacterial genome from GenBank. Just the .fna file and .gff file. No indexing done. I am able to create a .genome file and type in gene names from GFF file to select them directly in IGV.


I did the indexing because IGV complains that the GFF3 file is too large when I load it via the File menu, and tells me to index it.

The strange thing is that if I load the indexed file via the File menu, it loads quickly and is displayed correctly, but the search bar does not work. I guess that is why I have to create the .genome file. But if I do so, I get the behavior I described above.

Will investigate more on it tomorrow.

6.0 years ago

Good morning everyone.

I've found out what's going on here.

  • That I cannot use the search bar with an indexed file loaded via the File menu is due to the compression and indexing. The index only provides random access by position, so there is no way for IGV to load any other information (such as feature names) beforehand. (See also https://github.com/igvteam/igv/issues/244)
  • That the annotation file is not loaded (or IGV gives up) when it is stored in a .genome file might be a bug. The annotation file itself is packed into the .genome file and referenced in the settings file, but the index file is neither packed nor referenced, so IGV cannot access the data.
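One way to check the second point on your own file is to look inside the .genome file. The following is a minimal sketch that assumes the .genome file is an ordinary ZIP archive, which fits the observation that files are "packed" into it but is an assumption here. The function names and the rule "a .gz member needs a matching .tbi member" are illustrative, not part of IGV.

```python
import zipfile


def list_genome_archive(path):
    """Return the member names of a .genome file (assumed to be a ZIP archive)."""
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()


def find_unindexed_members(names):
    """Return the .gz members that have no '<name>.tbi' member next to them."""
    present = set(names)
    return [
        name for name in names
        if name.endswith(".gz") and name + ".tbi" not in present
    ]
```

If `find_unindexed_members(list_genome_archive("my.genome"))` returns the compressed annotation file, the archive contains the data but not its tabix index, which would match the behavior described above. The path `my.genome` is a placeholder.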

So the first point could be a feature request: provide a way to search for gene names/transcript IDs in compressed and indexed files (maybe via something like a NoSQL store?).
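Until something like that exists, a name lookup can be built outside IGV. The sketch below is not an IGV feature. It reads a GFF3 file (plain or gzip/bgzip-compressed) once and maps each `ID` and `Name` attribute to a `chr:start-end` string that can be pasted into the IGV search box. All function names are made up for this example, and which attribute keys hold the gene names depends on the annotation source.

```python
import gzip


def parse_attributes(column):
    """Parse a GFF3 attribute column ('key=value;key=value') into a dict."""
    attrs = {}
    for pair in column.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attrs[key] = value
    return attrs


def build_locus_index(lines, keys=("ID", "Name")):
    """Map attribute values (e.g. gene names) to 'seqid:start-end' strings."""
    index = {}
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments and empty lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete feature line
        locus = f"{fields[0]}:{fields[3]}-{fields[4]}"
        attrs = parse_attributes(fields[8])
        for key in keys:
            if key in attrs:
                # keep the first feature seen for each name
                index.setdefault(attrs[key], locus)
    return index


def load_locus_index(path):
    """Build the index from a GFF3 file; '.gz' files are read via gzip."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return build_locus_index(handle)
```

For example, `load_locus_index("annotation.gff3.gz").get("geneA")` would return the locus of a feature named `geneA`, if one exists. Both the file name and the gene name are placeholders.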

The second point is a bug in my opinion. I will create an issue on GitHub and post the link here later.

The workaround I've found is to use UCSC's Table Browser to export "All GENCODE V28" with the output format "all fields from selected table" and save the result as refGene.txt (the filename is important!). This file is much smaller, so I can use it without compression and indexing. I guess some information is missing, but at the moment that seems to be OK in my case.

fin swimmer

EDIT: Link to igv issue

6.1 years ago

Try creating a sequence dictionary with Picard CreateSequenceDictionary (https://broadinstitute.github.io/picard/)?


Unfortunately, this doesn't improve anything.

6.0 years ago

I'm guessing you want to create the .genome file because you want to automatically load the annotations?

If you are OK with loading the annotations separately (and/or you want to check for potential issues with each file separately), you should be able to use Genomes --> Load Genome from File... to just load the FASTA reference without annotations.
