Integrated Genome Browser
Visualization for genome-scale data

Frequently Asked Questions

What is the Integrated Genome Browser?

The Integrated Genome Browser (IGB - pronounced "ig-bee") is an open source, freely-available desktop genome viewer implemented in the Java programming language. It differs from other desktop genome viewers in that it adapts advanced visualization techniques that information visualization researchers have shown can aid comprehension and exploration of large data sets.

For example, IGB implements animated, one-dimensional semantic zooming; edge-matching on overlapping features; floating genome graphs; a zoom stripe graphic that helps maintain context; and more. IGB is also truly integrated: through a simple Web services approach, it lets you view your data alongside publicly available genome annotations and other labs' data sets. IGB also makes it easy to share your own data, either with the public or privately with collaborators.

IGB is the flagship product of the open source Genoviz project, which develops visualization software for bioinformatics and genomics. IGB is based on a library of visualization "widgets" called the Genoviz toolkit, the newest version of the Neomorphic Genome Software Developers Kit. Neomorphic was a bioinformatics software company started by graduate students from the Berkeley Drosophila Genome Project in the late 1990s.

In early 2000, Affymetrix purchased Neomorphic. In 2004, Affymetrix released the Genoviz toolkit and IGB as open source, free software.

For more information about visualization techniques for genomic data, see Loraine and Helt, 2002. To read a short paper describing IGB, see Nicol et al, 2009.

What happens when I download IGB using Java Web Start?

When you click the Start with Java Web Start button, IGB is downloaded from our site and run on your computer.

This is done through a mechanism called Java Web Start, which allows software applications to be downloaded and started by your Web browser. If your computer is relatively up-to-date, the software you need to launch IGB using Java Web Start is probably already installed and ready to work.

Behind the scenes, when you click the button, your browser downloads a JNLP file (JNLP is short for Java Network Launch Protocol). JNLP files contain instructions telling your computer how to obtain and launch a particular software program, such as IGB, over the Web.

The Java Web Start mechanism built into your browser reads the JNLP file, determines where to get IGB, and then starts the download. Once the download finishes, IGB will launch.

Because it's inefficient to download the entire software program each time you want to use it, Java Web Start saves the software it downloads in a special directory (a cache) dedicated to this purpose. Each time you launch IGB, Java Web Start checks our site to see whether a newer version is available for download. If there is, it downloads and launches the newer version. As a result, the software on your local computer stays up-to-date with the current release of IGB.

See also: Java Web Start page in Wikipedia.

What file formats are supported?

Dozens of file formats are supported. See the Bioviz wiki for more details.

How do I load annotations data?

The Integrated Genome Browser can load annotations data from three different kinds of data sources:

When you download and launch IGB from our site, the program will use default and user preferences to get a list of available data servers. It will then query those servers to determine the genomic data sets they host, and the species and genome versions they support.

To view a genome and its annotations in IGB:

Click the Data Access tab in the lower part of the IGB screen (this tab is selected by default).

Choose a species from the drop-down box (e.g., Arabidopsis thaliana).

Choose a genome version from the available versions listed in the Genome Version drop-down menu box.

When you pick a species and genome version, a list of assembled sequences (e.g., chromosomes or contigs) will appear in the Current Sequence table and a list of data sources will appear in the Data Sources table at the lower left of the display.

Click a Data Source folder to find out which data sets it provides. Some data sources may organize data sets into sub-folders.

Click a Data Set checkbox to start loading a data set. When you click a Data Set checkbox, the data set will disappear from the Data Sources table and re-appear in the Loaded Data Sets table in the middle of the Data Access tab.

Choose a Load Mode for the data set. To load data for a small range, choose Region In View. The Refresh Data button will become enabled.

Use the sliders to zoom and scroll to the region you want, and then click Refresh Data to load data for that region. Each time you click Refresh Data, IGB will query its data source servers and fetch the data for every data set whose Load Mode is set to Region In View.

To view a different chromosome or see the entire genome, click the corresponding row in the Current Sequence table.

IGB Data Access tab, in lower window of the display

How do I set up a QuickLoad site or directory?

Please see the GenoViz wiki for more details.

How do I load tiling array data?

IGB can display positional, numerical data in the form of graphs, where the x-axis is the sequence and the y-axis represents numerical data, like probe intensity values.

One easy way to view tiling array data in IGB is to write out your data to simple graph (.gr) format flat files. Graph files have two columns: the first column should contain the genomic position of your tiling array probes (either the start or the midpoint), and the second column should contain an intensity value corresponding to the probe position in column one. An individual file should contain data from just one chromosome or assembly, because IGB will assume that all the data are from the same coordinate system. To specify genomic positions, you should use an interbase coordinate system.
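As an illustration of the two-column layout described above, here is a minimal Python sketch that writes probe data to a graph file. The helper names (to_interbase_start, write_gr), the tab delimiter, the sorting by position, and the sample probe values are all assumptions made for this example and are not taken from the IGB documentation. The conversion assumes your probe starts are 1-based, in which case the interbase start is one less.

```python
import os
import tempfile


def to_interbase_start(one_based_start):
    """Convert a 1-based probe start to an interbase (0-based) start."""
    return one_based_start - 1


def write_gr(path, probes):
    """Write (position, intensity) pairs as a two-column graph file.

    Columns are tab-separated here; that delimiter is an assumption.
    Sorting by position is a convenience, not a stated requirement.
    """
    with open(path, "w") as handle:
        for position, intensity in sorted(probes):
            handle.write(f"{position}\t{intensity}\n")


# Hypothetical probes from a single chromosome: (1-based start, intensity).
probes = [(1201, 85.5), (1001, 310.0), (1101, 42.0)]
interbase_probes = [(to_interbase_start(start), value) for start, value in probes]

# One file per chromosome, since IGB assumes a single coordinate system.
out_path = os.path.join(tempfile.gettempdir(), "chr1_example.gr")
write_gr(out_path, interbase_probes)
```

If your data cover several chromosomes, call write_gr once per chromosome so that each file stays within a single coordinate system.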

To view your file in IGB, use the File->Open menu. But first, make sure IGB is showing the chromosome or assembly corresponding to the graph data file you are trying to open.

Once the data are in the view, use the Graph Adjuster Tab to change how the data are displayed. The IGB User’s Guide contains some useful guidelines on how to view and work with tiling array data, and we plan to add some additional documentation here, as well.

You are welcome to try this using some tiling array data from Yamada et al, originally published in 2003. We re-formatted the data as graph files, which you can download from this directory.

To get started working with this data, you need to know some basic things about how the experiments were done as well as how to view the data in IGB.

  1. Data file names containing a "C" indicate that the data are from probes selected from the Crick (reverse or bottom) strand of the chromosome. Conversely, data file names containing a "W" are from the Watson (forward or top) strand. The target or sample preparation protocol used was the same as the target preparation protocol used with regular Affymetrix chips. This means that labeled cRNA is from the antisense strand of the target mRNA. Thus, Crick strand probes will hybridize to mRNAs expressed from bottom strand genes in IGB, and Watson strand probes will hybridize to mRNAs expressed from the top strand genes.
  2. When you open a graph file, make sure you are showing the same chromosome that the graph is from.
  3. The probe positions reported in the graph file were originally created using an earlier version of the Arabidopsis genome. We have not re-mapped the probes onto the TAIR8 genome as yet, but since the TAIR8 genome is not dramatically different from the earlier versions, we expect that the positions of most probes relative to gene annotations have not changed much.

Data files are named after the target chromosome, the sample, and the target strand:

[chromosome].[sample].[strand].gr.gz

The files are also compressed. (You don't have to uncompress them before opening them in IGB.)
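If you want to inspect these files outside IGB, the naming pattern and the compression can be handled with a short script. The following Python sketch is only an illustration: the function names and the example file name are hypothetical, it assumes that none of the three name fields contains a period, and it assumes the columns are separated by whitespace.

```python
import gzip
import os

# Strand codes as described in the list above.
STRANDS = {"W": "Watson (forward or top)", "C": "Crick (reverse or bottom)"}


def parse_graph_name(path):
    """Split '[chromosome].[sample].[strand].gr.gz' into its three fields."""
    name = os.path.basename(path)
    suffix = ".gr.gz"
    if not name.endswith(suffix):
        raise ValueError(f"not a compressed graph file: {name}")
    fields = name[: -len(suffix)].split(".")
    if len(fields) != 3 or fields[2] not in STRANDS:
        raise ValueError(f"unexpected file name layout: {name}")
    chromosome, sample, strand = fields
    return chromosome, sample, STRANDS[strand]


def read_gr(path):
    """Read (position, intensity) pairs from a plain or gzipped graph file."""
    opener = gzip.open if path.endswith(".gz") else open
    points = []
    with opener(path, "rt") as handle:
        for line in handle:
            fields = line.split()
            if len(fields) < 2:
                continue  # skip blank or malformed lines
            points.append((int(fields[0]), float(fields[1])))
    return points
```

For a hypothetical file named chr2.anther.W.gr.gz, parse_graph_name would report chromosome "chr2", sample "anther", and the Watson strand.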

Sample types include:

To adjust how the graph looks, use the Graph Adjuster tab. This tab contains many functions useful for statistical manipulation of expression data within the viewer. For more information about IGB's statistical capabilities, read the IGB User's Guide sections related to displaying and manipulating graphs.

Once you open a graph file, it may be placed into a separate tier, or layer, in the main map window. (The default behavior may depend on IGB's default settings -- see the User's Guide for details.)

To compare the graph to known annotations and find expressed genes, it helps to drag it over the tier of annotations you would like to examine. To turn the graph tier into a draggable graph, click the graph to select it and then click the Graph Adjuster tab. Then click the Floating checkbox. To see a vertical scale showing the range of values in the graph, click the Y axis checkbox in the Advanced section.

It is also helpful to adjust the scale of values the graph shows. Expression values are usually very unevenly distributed. That is, there are a few values that are very large and many values that are much smaller. If IGB must show the entire range of values in the graph, it will be hard to see expression values in the lower ranges. The extremely large values are outliers and are (usually) not very interesting; to view a more informative range of values, use the Y-axis Scale box to adjust the visible range. To do so, click the graph to select it and then click-drag the sliders or just type new values into the Min and Max boxes. This changes how large and small values are shown. For instance, if you set the Min and Max values to the 15th and 90th percentiles, then all values below and above these thresholds will be shown at the minimum and maximum heights (or color intensities) in the graph.
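The effect of setting Min and Max to percentiles can be sketched outside IGB as a simple clamping step. The Python below is an illustration only: the function names are hypothetical, and the nearest-rank percentile used here is an assumption, since this FAQ does not say how IGB computes percentiles.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (pct from 0 to 100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def clamp_to_percentiles(values, low_pct=15, high_pct=90):
    """Pull values outside the two percentiles in to the nearest threshold."""
    low = percentile(values, low_pct)
    high = percentile(values, high_pct)
    return [min(max(value, low), high) for value in values]


# Hypothetical intensities: mostly small values plus one large outlier.
intensities = [12, 15, 9, 14, 11, 13, 10, 16, 8, 950]
visible = clamp_to_percentiles(intensities)
```

After clamping, the single outlier no longer stretches the scale, so differences among the smaller values take up more of the visible range.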

Use the Style box to change graph style. Note how the image below uses the heatmap option, which uses lighter shades of gray to indicate higher-intensity values.

Example of using graph style to change the display of the graph. This example uses the 'heatmap' option which uses different shades of gray to indicate different values

This image shows anther-enriched expression of the AT2G19110 locus. The top graph shows anther data, the middle one shows data from light-grown seedlings, and the bottom graph shows the difference between them, which was calculated using the A-B button in the Graph Adjuster panel. If you follow this link to the TAIR Web site, you will find that experiments using the ATH1 array confirm that this gene is expressed in developing flowers. Based on homology data, it appears to encode a cadmium-transporting ATPase and is localized to the membrane.
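The kind of difference graph produced by the A-B button can be approximated in a few lines of Python. This is a sketch under stated assumptions, not a description of IGB's implementation: the function name and sample values are hypothetical, and it keeps only positions present in both graphs because this FAQ does not describe how IGB treats positions that appear in just one graph.

```python
def subtract_graphs(graph_a, graph_b):
    """Return (position, a - b) for positions present in both graphs."""
    b_values = dict(graph_b)
    return [
        (position, a_value - b_values[position])
        for position, a_value in graph_a
        if position in b_values
    ]


# Hypothetical (position, intensity) pairs for two samples.
anther = [(100, 5.0), (200, 9.0), (300, 1.0)]
seedling = [(100, 2.0), (200, 9.5)]
difference = subtract_graphs(anther, seedling)
```

Positive values in the result mark probes with higher intensity in the first sample than in the second.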

Anther-rich expression of AT2G19110, a putative cadmium-transporting ATPase, with predicted membrane localization

Who do I contact?

Please contact us to share comments and ideas. We welcome your suggestions!