Question

How to use own genome file in IGV?

6.8 years ago

finswimmer 16k

Hello,

let's assume I would like to use this reference genome from Ensembl, together with its GFF annotation file, for visualization in IGV.

I uncompressed the FASTA file and indexed it with samtools faidx.
I uncompressed the GFF3 file, sorted it by position, compressed it with bgzip, and indexed it with tabix.
In IGV I selected Genomes -> Create .genome File and gave the paths to the reference sequence and the annotation file.
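The sorting step above can be sketched in Python. This is only an illustration under stated assumptions: the function name `sort_gff3_lines` and the example records are hypothetical, the file is assumed to have no embedded `##FASTA` section, and all `#` lines are simply moved to the top. The surrounding steps would typically be run on the command line with `samtools faidx` for the FASTA file and `bgzip` followed by `tabix -p gff` for the sorted GFF3 file.

```python
def sort_gff3_lines(lines):
    """Return GFF3 lines with '#' lines first, then features sorted by (seqid, start).

    Assumption: no embedded ##FASTA section; every non-comment line has 9 tab-separated columns.
    """
    header, features = [], []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        if line.startswith("#"):
            header.append(line)  # directives and comments are kept at the top
        else:
            features.append(line)

    def sort_key(line):
        cols = line.split("\t")
        # Column 1 is the sequence ID (compared as text), column 4 the 1-based start.
        return (cols[0], int(cols[3]))

    return header + sorted(features, key=sort_key)


# Hypothetical example records, not taken from a real annotation.
example = [
    "##gff-version 3",
    "\t".join(["1", "src", "gene", "500", "900", ".", "+", ".", "ID=gene:B"]),
    "\t".join(["1", "src", "gene", "100", "400", ".", "-", ".", "ID=gene:A"]),
    "\t".join(["2", "src", "gene", "50", "80", ".", "+", ".", "ID=gene:C"]),
]

for record in sort_gff3_lines(example):
    print(record)
```

Sequence IDs are compared as text here, so "10" sorts before "2"; that still keeps all records of one sequence together and ordered by start position.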

What I observe now is:

It takes very long until anything is loaded.
After finishing there is no annotation available.

What I expected:

I would like to enter a gene name or transcript ID into the search field to jump directly to the corresponding position, just as with the predefined genomes.

How is this done correctly? What am I missing?

fin swimmer

igv annotation genome • 19k views

I just tried a bacterial genome from GenBank. Just the .fna file and .gff file. No indexing done. I am able to create a .genome file and type in gene names from GFF file to select them directly in IGV.

6.8 years ago by GenoMax 152k

I went to the trouble of indexing because, when I load the GFF3 file via the File menu, IGV complains that it is too large and that I have to index it.

The strange thing is that if I load the indexed file via the File menu, it loads quickly and is displayed correctly, but the search bar does not work. I guess that is why I must create the .genome file. But if I do so, I get the behavior I described above.

Will investigate more on it tomorrow.

6.8 years ago by finswimmer 16k

6.8 years ago

Pierre Lindenbaum 166k

Try creating a sequence dictionary with Picard's CreateSequenceDictionary (https://broadinstitute.github.io/picard/)?

Unfortunately, this doesn't improve anything.

6.8 years ago by finswimmer 16k

6.8 years ago

Charles Warden 8.3k

I'm guessing you want to create the .genome file because you want to automatically load the annotations?

If you are OK with loading the annotations separately (and/or you want to check for potential issues with each file separately), you should be able to use Genomes --> Load Genome from File... to just load the FASTA reference without annotations.

score 1 · Accepted Answer · 2018-10-16

Good morning everyone.

I've found out what's going on here.

The search bar does not work for an indexed file loaded via the File menu because of the compression. The index only provides random access by position, so IGV has no way to load other information, such as feature names, beforehand. (See also https://github.com/igvteam/igv/issues/244)
That the annotation file is not loaded (or IGV gives up) when it is stored in a .genome file might be a bug. The annotation file itself is packed into the .genome file and referenced in the settings file, but the index file is neither packed nor referenced, so IGV cannot access the data.

So the first point could be a feature request: provide a way to search for gene names/transcript IDs in compressed and indexed files (maybe via something like NoSQL?).
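To illustrate the first point: a tabix index maps genomic positions to offsets in the compressed file, so a search by name needs a separate lookup table that can only be built by reading every record. A minimal Python sketch of such a table, assuming GFF3-style `key=value` attributes in column 9; the function names and the example records are hypothetical, and this is not a description of how IGV implements its search.

```python
def parse_attributes(field):
    """Parse a GFF3 attribute column ('ID=...;Name=...') into a dict."""
    attrs = {}
    for pair in field.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def build_name_index(lines, keys=("ID", "Name")):
    """Map each ID/Name value to (seqid, start, end) by scanning all records."""
    index = {}
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments, and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a complete feature line
        locus = (cols[0], int(cols[3]), int(cols[4]))
        attrs = parse_attributes(cols[8])
        for key in keys:
            if key in attrs:
                # Keep the first locus seen for a given name.
                index.setdefault(attrs[key], locus)
    return index


# Hypothetical records for illustration only.
records = [
    "##gff-version 3",
    "\t".join(["1", "src", "gene", "100", "400", ".", "+", ".", "ID=gene:G0001;Name=abcA"]),
    "\t".join(["1", "src", "mRNA", "100", "400", ".", "+", ".", "ID=transcript:T0001;Parent=gene:G0001"]),
]

name_index = build_name_index(records)
seqid, start, end = name_index["abcA"]
print(f"{seqid}:{start}-{end}")
```

The resulting `chr:start-end` string is the kind of locus that can be typed into the IGV search field directly.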

The second point is a bug in my opinion. I will create an issue on GitHub and post the link here later.

The workaround I've found is to use UCSC's Table Browser to export All GENCODE V28 with the output format "all fields from selected table" and save the result as refGene.txt (the filename is important!). This file is much smaller, and I can use it without compression and indexing. I guess some information is missing, but at the moment this seems to be OK for my case.

fin swimmer

EDIT: Link to igv issue