This article gives a short overview of each format: which data files are supported and the general structure they follow. If you experience problems when importing or exporting files, please let us know so that we can complete this documentation.

Import

Gephi can import the following standard graph file formats. Each format's article contains documentation, samples and implementation details, and helps outline the differences between formats.

* GEXF

* GDF

* GML

* GraphML

* Pajek NET

* GraphViz DOT

* CSV

* UCINET DL

* Tulip TPL

* Netdraw VNA

* Spreadsheet

Compare

The following table will help you choose the format in which to encode your data. If you plan to work only with Gephi, we recommend GEXF. The table criteria do not cover every feature of each format but concentrate on those supported by Gephi.

Some details:

* Visualization attributes: Only the GraphML, GDF and GEXF importers recognize node position, color and size attributes. Node positions in Pajek NET files are also read.
* Hierarchical graphs: Implemented since Gephi 0.7.
* Dynamic: See the details on the GEXF specification page.
* Spreadsheet: Node tables and edge tables can be loaded in the Data Laboratory only.

The following decision tree will help you select an encoding format for your data, regardless of technical constraints. Formats at the top can express more features than those at the bottom.

Concepts

Data-centered

What essential data does Gephi look for in a graph file? We distinguish three types of data: nodes, edges and attributes. An edge always connects two nodes, and attributes are data associated with nodes or edges, such as string or integer values.

The structure formed by nodes and edges is called the network topology, while attributes are called network data. The topology is required for any graph, whereas data are optional. Nevertheless, network analysis today often focuses on data attributes, since data are everywhere.
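The split between topology and data can be sketched in a few lines of Python. This is only an illustration of the concept, not how Gephi stores graphs internally. The node ids, the `weight` attribute and the `dangling_edges` helper are hypothetical names chosen for the example.

```python
# Topology (required): nodes and the edges that connect them.
nodes = {"a", "b", "c"}
edges = [("a", "b"), ("b", "c")]  # each edge links exactly two nodes

# Data (optional): attributes attached to nodes or to edges.
node_attributes = {"a": {"url": "http://example.org", "valid": True}}
edge_attributes = {("a", "b"): {"weight": 2.5}}  # 'weight' is a made-up attribute


def dangling_edges(nodes, edges):
    """Return the edges whose source or target is not a declared node."""
    return [(s, t) for s, t in edges if s not in nodes or t not in nodes]


# A graph without any attribute is still a complete graph.
print(dangling_edges(nodes, edges))
```

Removing both attribute dictionaries leaves a valid graph, whereas removing `nodes` or `edges` leaves nothing to analyze.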

Attributes

A wide variety of data can be stored in attributes. The concept is the same as a column in a data table or an SQL table. An attribute has a title (name) and a value. The attribute’s title must be declared once for the whole graph; it could be, for instance, ‘degree’, ‘valid’ or ‘url’. Besides the name of the attribute, a column also holds its type. See the developer manual for further details about the attribute system.

Attributes are easier to understand when looking at this node definition. Besides the native fields (id, label), values are set for three attributes.

<node id="0" label="Hello world">
  <attvalues>
    <attvalue for="0" value="samplevalue"/>
    <attvalue for="1" value="1831"/>
    <attvalue for="2" value="true"/>
  </attvalues>
</node>
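As a rough sketch of how the `for` references relate to the graph-wide declarations, the Python snippet below resolves each value against an attribute declaration and casts it to the declared type. Several things here are assumptions made for illustration: the titles given to attributes 0, 1 and 2, the omission of the GEXF XML namespace, and the small `CASTS` table, which covers only three types. It is not Gephi's importer.

```python
import xml.etree.ElementTree as ET

# Hypothetical, namespace-free GEXF fragment: the declarations are assumed,
# the node is the one shown above.
GEXF = """<gexf>
  <graph>
    <attributes class="node">
      <attribute id="0" title="url" type="string"/>
      <attribute id="1" title="degree" type="integer"/>
      <attribute id="2" title="valid" type="boolean"/>
    </attributes>
    <nodes>
      <node id="0" label="Hello world">
        <attvalues>
          <attvalue for="0" value="samplevalue"/>
          <attvalue for="1" value="1831"/>
          <attvalue for="2" value="true"/>
        </attvalues>
      </node>
    </nodes>
  </graph>
</gexf>"""

# Only the three types used in this example are handled.
CASTS = {
    "string": str,
    "integer": int,
    "boolean": lambda text: text == "true",
}


def read_node_attributes(xml_text):
    """Map each node id to its label and its typed attribute values."""
    root = ET.fromstring(xml_text)
    # Declarations are made once for the whole graph: id -> (title, type).
    declared = {
        a.get("id"): (a.get("title"), a.get("type"))
        for a in root.iter("attribute")
    }
    result = {}
    for node in root.iter("node"):
        values = {}
        for av in node.iter("attvalue"):
            title, type_name = declared[av.get("for")]
            values[title] = CASTS[type_name](av.get("value"))
        result[node.get("id")] = {"label": node.get("label"), "attributes": values}
    return result


parsed = read_node_attributes(GEXF)
print(parsed["0"])
```

The node itself carries only ids and raw strings; the name and type of each value come from the declaration, which is why the declaration has to exist for the whole graph.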