[gephi-dev] Is there a way to cut a dynamical graph and update the current workspace?

Sébastien Heymann sebastien.heymann at gephi.org
Thu Dec 9 00:56:02 CET 2010


Hi @all,

I'm still having trouble using snapshots to "purge" old nodes from the main
graph.

In this test, I import 200 snapshots corresponding to the evolution of the
same graph over successive rounds, with a low maximum heap size (600M). I
only compute a metric on the last #window rounds, and I know that some old
nodes and edges will never appear again, so it makes sense to remove them
regularly. Inside the loop that imports the snapshots, I use the following
code to purge the graph:


> Graph graph = graphModel.getGraph();
> DynamicGraph dynamicGraph = dynamicModel.createDynamicGraph(graph);
>
> // Inside the import loop:
>
> //Remove the previous rounds
> //WARNING: metrics results must not be saved in attributes!
> Interval currentInterval = new Interval(dynamicModel.getMax() - window + 1,
>         dynamicModel.getMax(), false, true);
> GraphView snapshot =
>         dynamicGraph.getSnapshotGraph(currentInterval).getView();
>
> graph.writeLock();
> for (Node n : graph.getNodes().toArray()) {
>    //The node n is not in current snapshot
>    if (n.getNodeData().getNode(snapshot.getViewId()) == null) {
>        graph.removeNode(n);
>    }
> }
> graph.writeUnlock();
> snapshot = null;
>
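
For illustration only, the purge step can be modelled outside Gephi with plain
Java collections. The sketch below does not use the Gephi API: the class
WindowPurge, the adjacency map and the lastSeen index are hypothetical
stand-ins. It keeps the rounds [max - window + 1, max] and drops every node
last seen before that window, together with its edges. It also shows that a
removed node only becomes collectable once every structure that still
references it (here the lastSeen index) has dropped it as well; whether Gephi
holds comparable side references is not established in this thread.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the graph: node id -> ids of its out-neighbours.
final class WindowPurge {

    /** First round kept for a window of `window` rounds ending at `maxRound`. */
    static int windowStart(int maxRound, int window) {
        return maxRound - window + 1;
    }

    /**
     * Removes every node whose last round seen lies before the window,
     * together with all edges touching it. Returns the number of removed nodes.
     */
    static int purge(Map<String, Set<String>> adjacency,
                     Map<String, Integer> lastSeen,
                     int maxRound, int window) {
        int low = windowStart(maxRound, window);

        // Collect stale nodes first so the map is not modified while iterating.
        Set<String> stale = new HashSet<>();
        for (String node : adjacency.keySet()) {
            if (lastSeen.getOrDefault(node, Integer.MIN_VALUE) < low) {
                stale.add(node);
            }
        }

        // Drop the nodes, then the edges that still point at them.
        adjacency.keySet().removeAll(stale);
        for (Set<String> targets : adjacency.values()) {
            targets.removeAll(stale);
        }

        // Side indexes must forget the nodes too, or they stay reachable.
        lastSeen.keySet().removeAll(stale);
        return stale.size();
    }

    public static void main(String[] args) {
        Map<String, Set<String>> adjacency = new HashMap<>();
        adjacency.put("a", new HashSet<>(Set.of("b")));
        adjacency.put("b", new HashSet<>(Set.of("a", "c")));
        adjacency.put("c", new HashSet<>());
        Map<String, Integer> lastSeen =
                new HashMap<>(Map.of("a", 2, "b", 12, "c", 11));

        int removed = purge(adjacency, lastSeen, 12, 10);
        System.out.println(removed + " node(s) removed, remaining: "
                + adjacency.keySet());
    }
}
```

Collecting the stale ids before mutating the maps avoids a
ConcurrentModificationException, which is the same reason the code above
iterates over graph.getNodes().toArray() rather than the live node iterable.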


I don't understand what I'm doing wrong: the Profiler charts of memory usage
& gc clearly show that executing this code doesn't free more memory for
re-allocation, even though the graph is highly dynamic, with tons of edges
appearing and disappearing and a regular turnover of nodes. And I checked:
nodes and edges are correctly removed from the graph. By round #30, the graph
already holds 20K more edges when this code is not executed.

Yet removing them doesn't lead to more available memory. The opposite
happens: the congestion appears earlier, as shown in the charts.
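
As a cross-check on the profiler charts, used heap can also be logged from
inside the import loop with the standard java.lang.Runtime calls. This is a
minimal sketch, not part of the original test: the class name HeapProbe and
its methods are hypothetical, and System.gc() is only a hint that the JVM is
free to ignore, so the numbers are indicative rather than exact.

```java
final class HeapProbe {

    /** Bytes of heap currently in use, as reported by the JVM. */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Requests a collection first; System.gc() is only a hint to the JVM. */
    static long usedHeapAfterGcHint() {
        System.gc();
        return usedHeapBytes();
    }

    public static void main(String[] args) {
        long before = usedHeapAfterGcHint();

        // Hold 64 MB, measure, then release the only reference to it.
        byte[][] blocks = new byte[64][];
        for (int i = 0; i < blocks.length; i++) {
            blocks[i] = new byte[1 << 20];
        }
        long held = usedHeapBytes();
        blocks = null;

        long after = usedHeapAfterGcHint();
        System.out.printf("before=%d MB, held=%d MB, after=%d MB%n",
                before >> 20, held >> 20, after >> 20);
    }
}
```

Calling usedHeapAfterGcHint() once per round, with and without the purge,
gives a per-round series that can be compared against the charts.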

Any explanation?

Seb


2010/12/6 Mathieu Bastian <mathieu.bastian at gmail.com>:
>
>
> On Thu, Dec 2, 2010 at 11:22 AM, Sébastien Heymann
> <sebastien.heymann at gephi.org> wrote:
>>
>> Thanks!
>>
>> My problem is to process a graph of more than 100K nodes and 5M edges. So
>> as I try to minimize the memory used, I incrementally import graphs and
>> run metrics on a sliding time window.
>> For old intervals containing nodes no longer present in the current
>> working window, I want to delete these nodes (and their edges) to let
>> the garbage collector do its work.
>>
>> Sure, I'd like to take the snapshot and overwrite the current graph with
>> it, but I don't know how to set a new graph for an existing workspace.
>> Maybe this is just not possible, Mathieu?
>> And I don't want to instantiate a new workspace at each step...
>
> The snapshot is a view, a subset of the complete graph so overwriting
> doesn't really make sense. You have several possibilities:
> - You can use a NOT filter to get the nodes and the edges that are not in
> your snapshot. You don't have to delete edges first, as deleting nodes
> will do it for you.
> - You can manually remove the nodes not in your snapshot, as you can get
> the view of your snapshot.
>
> Graph mainGraph = graphController.getModel(workspace).getGraph();
> Graph snapshot = ...;
> GraphView snapshotView = snapshot.getView();
> for (Node n : mainGraph.getNodes().toArray()) {
>    if (n.getNodeData().getNode(snapshotView.getViewId()) == null) {
>        //The node n is not in snapshot
>       mainGraph.removeNode(n);
>    }
> }
>
> - Why not use several workspaces? GraphModel has a convenient
> "pushFrom(Graph graph)" method for you. Call this on an empty workspace
> to copy your snapshot only.
>>
>> Btw @Mathieu, what is the behavior of the file importer when a node
>> imported by the snapshot T doesn't exist in the snapshot T+1?
>
> You mean with the Time Range importer? The node's time interval is defined
> as below:
> - If a node A appeared at T and is missing at T+1, its interval will be
> [T, T+1), for example [2005, 2006)
>
>>
>> I tried to export a GEXF file to check things but it crashed, trying to
>> write:
>>
>>> <attributes class="node" mode="dynamic">
>>>       <attribute id="time_interval" title="Time Interval"
>>
>> With the following exception:
>>>
>>> java.lang.NullPointerException
>>>         at org.gephi.io.exporter.plugin.ExporterGEXF.writeAttributes(ExporterGEXF.java:324)
>>>         at org.gephi.io.exporter.plugin.ExporterGEXF.writeAttributes(ExporterGEXF.java:296)
>>>         at org.gephi.io.exporter.plugin.ExporterGEXF.writeGraph(ExporterGEXF.java:251)
>>>         at org.gephi.io.exporter.plugin.ExporterGEXF.execute(ExporterGEXF.java:213)
>>>         at org.gephi.io.exporter.impl.ExportControllerImpl.exportFile(ExportControllerImpl.java:110)
>>>         at org.gephi.io.exporter.impl.ExportControllerImpl.exportFile(ExportControllerImpl.java:67)
>
> Please file a bug report.
>
>>
>> It occurs when exporting the graph produced by
>> http://wiki.gephi.org/index.php/Toolkit_-_Import_Dynamic_Network_from_multiple_static_files
>> through an ExportController.
>>
>> Thanks again for your insights,
>> Seb
>>
>>
>> 2010/12/2 Cezary Bartosiak <cezary.bartosiak at gmail.com>
>>>>
>>>> System.out.println("Window: [" + dynamicGraph.getLow() + "," +
>>>> dynamicGraph.getHigh() + "]");
>>>
>>> Use dynamicModel.getMin() and dynamicModel.getMax(). It should work as
>>> expected. getLow() and getHigh() are (probably) unnecessary since they
>>> are now used only for checking whether the intervals given to methods
>>> like getSnapshotGraph are valid. I need to think about it.
>>> As for deleting nodes: as far as I understand you, it is enough to get a
>>> snapshot graph. It should not contain any elements (nodes/edges) that
>>> are not present in the given time interval... So, what does not work?
>>> I'm a bit confused :P
>>> On 2 December 2010 18:30, Sébastien Heymann <sebastien.heymann at gephi.org>
>>> wrote:
>>>>
>>>> I'm using the toolkit and need to load static files incrementally, and
>>>> will have to compute metrics on a sliding window:
>>>> T0: [1..10]
>>>> T1: [2..11]
>>>> T2: [3..12]
>>>>
>>>> So basically, I initialize the algorithm with the first window - 1
>>>> rounds. Then at the beginning of each loop iteration I add a new round,
>>>> and at the end I remove the oldest one.
>>>>
>>>>
>>>> Currently I have this code in a loop:
>>>>>
>>>>> Graph graph;
>>>>>
>>>>> {
>>>>> // Import a new static graph with a DynamicProcessor
>>>>> ...
>>>>> //Set date for this file
>>>>> dynamicProcessor.setDate("" + i);
>>>>> ..
>>>>> graph = graphModel.getGraph();
>>>>> ...
>>>>> DynamicGraph dynamicGraph = dynamicModel.createDynamicGraph(graph);
>>>>> System.out.println("Window: [" + dynamicGraph.getLow() + "," +
>>>>> dynamicGraph.getHigh() + "]");
>>>>> ...
>>>>> // Compute metrics..
>>>>> ...
>>>>> //Remove the first round
>>>>> double newLow = dynamicGraph.getLow() + 1;
>>>>> double high = dynamicGraph.getHigh();
>>>>> graph = (DirectedGraph) dynamicGraph.getSnapshotGraph(newLow, high);
>>>>> dynamicGraph.setInterval(newLow, high); //necessary?
>>>>> }
>>>>
>>>>
>>>> But it doesn't work. :-(
>>>>
>>>>
>>>> I also see that dynamicGraph.getLow() / getHigh() return -Inf and +Inf
>>>> respectively.
>>>> Is setting the Interval safe? I see in the code that there is no check
>>>> against the dates inside the current graph, so I don't know how or when
>>>> to use this method.
>>>>
>>>> Is there a way to do this? More generally, the need is to perform
>>>> operators like Union, Intersection and Exclusion based on time.
>>>>
>>>>
>>>> Thanks,
>>>> Seb
>>>>
>>>> _______________________________________________
>>>> gephi-dev mailing list
>>>> gephi-dev at lists.gephi.org
>>>> http://gephi.org/mailman/listinfo/gephi-dev
-------------- next part --------------
A non-text attachment was scrubbed...
Name: heap without purge m600.png
Type: image/png
Size: 77284 bytes
Desc: not available
URL: <http://gephi.org/pipermail/gephi-dev/attachments/20101209/7cd728f0/attachment-0004.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: heap with purge m600.png
Type: image/png
Size: 66083 bytes
Desc: not available
URL: <http://gephi.org/pipermail/gephi-dev/attachments/20101209/7cd728f0/attachment-0005.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gc without purge m600.png
Type: image/png
Size: 77302 bytes
Desc: not available
URL: <http://gephi.org/pipermail/gephi-dev/attachments/20101209/7cd728f0/attachment-0006.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gc with purge m600.png
Type: image/png
Size: 72635 bytes
Desc: not available
URL: <http://gephi.org/pipermail/gephi-dev/attachments/20101209/7cd728f0/attachment-0007.png>

