Tag Archives: coreness

K-cores Decomposition of a Random Graph

We described what cores are and it is now time to see how a network looks like, once we performed the k-cores decomposition. There is a very interesting visualization tool for graphs called LaNet-vi and it is based on the k-cores decomposition, described in previous posts here and here. This visualization tool provides very nice images of large scale networks on a two-dimensional layout.

Let’s consider first a random graph with N vertices and M edges. A random graph is obtained by a stochastic process that starts with N vertices and no edges, and at each step adds one new edge chosen uniformly with probability p from the set of missing edges. The number of possible edges is N (N – 1), that is the number of edges of a complete graph. A random graph with N vertices and M edges has an average degree c = 2 M / N, and the degree distribution follows a Poisson probability distribution. As we saw before, the k-core decomposition depends strongly on how the edges are distributed among the vertices.

The theory of random graphs says that there is a critical threshold c(k): if c ≥ c(k) then the k-core exists and it is formed by a finite group of vertices. I generated several random graphs for c = 4, 7, 10 and I performed the k-cores decompositions, writing some code in C programming language. In the plot below we can see the number of edges (in blue) and the number of vertices (in red) that belong to a k-shell (the vertices and edges that belongs to the k-core but not to the (k+1)-core ) with a value of coreness k. The green small squares represent the theoretical values for random graphs, given c and N. If you are interested the code is available in my repository here under the KCores folder.

grafoRandom
Random Graph with 10^5 vertices

How does this k-cores decomposition look like? And how is this useful? We notice that most of the vertices and edges belong to cores with high value of k. And also that the number of elements in a shell grows with k. And when c grows, new cores with larger k appear, as we expected from random graph theory. In the figure below these characteristics become even more clear. This very nice image is obtained with the LaNet-vi tool, mentioned above. This random graph has 1000 vertices and an average degree c = 10. The size of a vertex represents its degree and the color represents its coreness value (or the shell where the vertex is). This graph has clearly an homogeneous topology, there is no evidence of hierarchy, most of the vertices share the same role in the network (in red in the picture below). Often we read about the strenght of a network, that is how a network resists to damages or attacks. In this case, with an homogeneous structure, if we remove a random vertex or edge, we probably would not cause much trouble to this network, because most vertices are connected to each other and share the same role. This means that a random graph has a strong structure. And, obviously, this is the exact opposite of what we have in real networks.  But we will see what happens in real life in the next post!

k-cores decomposition visualization by LaNet-Vi
k-cores decomposition visualization by LaNet-Vi

K-cores Decomposition of a Graph

The Graph is a really beautiful and useful mathematical object, with many properties. Before introducing the k-cores, let’s describe shortly some basic principles. In general a graph is defined by a set of vertices and edges that connect pairs of vertices. Usually N represents the total number of vertices and M the number of edges. The degree is the fundamental characteristic. Every vertex has a degree, that is the number of edges that connect that vertex  with the others. With the same N and M we can build very different graphs, depending on the degree distribution, that represents how the edges are distributed among vertices.

Moreover graphs can be trees, which means without loops, or closed paths. Otherwise they are cyclic graphs, if they have at least one loop. The path from a vertex v to another vertex w is a sequence of connected vertices, where no edges or vertices are repeated. A loop is simply a path that starts and ends with the same vertex.

Tree graph, no closed paths or loops
Example of a tree

complete graph or fully connected graph, is a graph where every vertex is connected with all the others. If we have N total vertices, we will have all vertex with degree N – 1. Which means that we can go from everywhere to everywhere in just one step. Real networks usually do not have this super property, but they may have some highly connected parts. A clique in a graph is indeed a subset of vertices where every vertex has the maximum possible degree, in other words a complete graph inside another graph. In the example below in the 2-clique all vertices have degree 2, in the 3-clique all vertices have degree 3 and so on.

Examples of n-clique
Example of n-clique

Let’s go further. A graph is k-connected if every pair of vertices is connected through at least k distinct paths, that do not share edges. This property is related to the strength of a network, in fact a k-connected network remains connected whenever fewer than k links are removed or broken.

Now we can finally introduce the k-core, also related to the strength of a network. The k-core is a subset of a graph where every vertex has degree ≥ k. Cores are nested, a graph can have a 2-core that includes a 3-core that includes a 4-core and so on. Every vertex of a graph has its coreness value, that represents the maximum k-core that contains it. In other words a vertex has coreness 3 if it belongs to the 3-core but not to the 4-core.  The k-shell is the subset of vertices with the same value of coreness. We can see now that cliques are always also cores. The opposite is not true, cores are not necessarily cliques.

In the example below we have a graph with its cores decomposition.  The maximum core has k = 3 even though the maximum degree of the graph is 6 (red vertex, top right corner of the square). Then we have the 2-core in yellow and red. The minimum core is the whole graph and it has k = 1, because there are some leaves, that are vertices with degree 1 (most blue vertices). It is now clear how cores are nested.

In the next post we will discuss and analyze the k-pruning algorithm, a recursive algorithm that allows us to break down systematically a graph in its cores.