RNSC is a cost-based local search algorithm that works to minimize a
cost function in the solution space. The basic principle underlying
local search is to start from an initial candidate solution and then,
iteratively, to make moves from one candidate solution to another of
the candidate solutions from its direct neighbourhood. The moves are
based on local information, and continue until a termination condition
is met. Starting from an initial random or user input clustering, RNSC
searches a better clustering using a simple integer-valued cost
function, the naive cost function. This naive cost function
calculates the sum for each vertex i :
- The number of neighbours of i that are not in the same cluster
- The number of vertices that are in the same cluster than i but not connected to it.
This sum is divided by 2. For an optimal clustering, these values
have to be low for all vertices. When a user-defined number of moves
has been done without decreasing the naive cost function, the program shifts
to the real-value scaled cost function which also takes into account the size of the cluster during the calculations.
At each iteration, one node is moved from one cluster to another. If this move decreased the naive or the scaled cost function, this move (considered as an intensification move) is conserved.
In order to avoid local minima, a user-defined number of nodes are moved to random clusters according to a user-defined frequency.
In order to avoid cycling back to the previously explored partitioning, a list of vertices that may not move is maintained. The size of this list and the number of time a vertex must be in the list before being considered as non movable are user-defined.
The RNSC program was developed by Andrew King.
(King, 2004, M.Sc. thesis;
King et al, 2004) and is available upon request.
The input / output parsers, the webservices and the web interface were developped by Sylvain Brohée.
This manual is freely adapted from the help of the command line tool written by Andrew King.
Input and output formats
The accepted input formats are GML, tab-delimited and adjacency matrix.
For more explanations about these, refer to the manual of convert-graph.
A tab-delimited text file with 2 columns. The first column
indicates the element, the second column the class (cluster).
Column specifications (only for tab-delimited format)
Source and target column. Columns containing the source and target nodes.
Maximal number of clusters
If set to num, allow no more than num clusters. num must be between 2 and n, where n is the number of vertices in the graph. If this value is not specified or invalid, n clusters are used.
If set to num, set the tabu length to num. Default value is 1. Note that when the tabu list tolerance is used, vertices can appear on the tabu list more than once and moving them is only forbidden when they are on the tabu list more than TabuTol times, where TabuTol is the tabu list tolerance value.
Tabu list tolerance
If set to num, set the tabu list tolerance to num. The tabu
list tolerance is the number of times a vertex must appear on the tabu
list before moving it is forbidden.
Naive stopping tolerance
Number of steps that the naive scheme will continue without improving the best cost. If you run the scaled scheme, using a higher
naive stopping tolerance isn't likely to improve your results. (Default : 5)
Scaled stopping tolerance
This is the number of steps that the scaled scheme will continue without
improving the best cost. Setting the tolerance to 0 will cause the
algorithm to skip the scaled scheme. (Default : 5)
If set to num, set the diversification frequency to num. Without this option, no diversification will be performed. If the shuffling diversification length option is also used, then num is the shuffling diversification frequency. If the shuffling diversification length option is
not used, then num is the destructive diversification frequency. However, it
is recommended to use the shuffling diversification length option, because destructive diversification isn't much help.
Shuffling diversification length
If set to num, set the shuffling diversification length to num. That means that the
last num moves in the diversification period will be diversification
moves. Don't set this to be higher than the diversification frequency.