Networks come in all shapes and sizes. Some are quite simple, while others are more complex. Knowing the right metrics is therefore important for understanding what is going on in a network.
This blog post features some of the simplest yet most important graph-based metrics for understanding network structure.
To put it simply, the degree represents the number of connections/edges of a node. If the graph is directed, the degree is split into two parts: indegree (incoming edges) and outdegree (outgoing edges). More often, this metric is reported for the whole network as a distribution using the degree sequence, i.e. the list of the degrees of all nodes.
Knowing the degree sequence of a network (or the degree of a node) is useful for understanding how many or how few connections are made between nodes. It is quite common for networks (especially social networks) to have a degree distribution that peaks around one, as a node needs at least one edge to be included in the network.
To calculate the degree of a node using networkx, we can use the following...
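As a minimal sketch, assuming a small hypothetical undirected graph (the name `G_example` and its edges are made up for illustration and are not the graph used elsewhere in this post), the degree of a single node can be looked up like this:

```python
import networkx as nx

# Hypothetical undirected example graph (an assumption for illustration only).
G_example = nx.Graph()
G_example.add_edges_from([("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")])

# Passing a single node to G.degree returns that node's degree as an integer.
degree_of_a = G_example.degree("A")
print(degree_of_a)  # A is connected to B, C and D
```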
For directed graphs, this would look like...
>>> G.in_degree('A')
3
>>> G.out_degree('A')
4
Or if you want to know the degree for all nodes just use the following without any parameters...
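A short sketch of this, again on a hypothetical graph (`G_example` is an assumption, not the post's own graph): calling `degree()` with no arguments gives a view of (node, degree) pairs, which can be turned into a dictionary or a sorted degree sequence.

```python
import networkx as nx

# Hypothetical undirected example graph (an assumption for illustration only).
G_example = nx.Graph()
G_example.add_edges_from([("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")])

# With no arguments, degree() yields (node, degree) pairs for every node.
all_degrees = dict(G_example.degree())

# The degree sequence is simply the list of all degrees, here sorted high to low.
degree_sequence = sorted((d for _, d in G_example.degree()), reverse=True)

print(all_degrees)
print(degree_sequence)
```

On a directed graph, `in_degree()` and `out_degree()` can be called without arguments in the same way.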
The density is a metric used to determine how fully connected a network is (it should not be confused with the clustering coefficient, which measures how often a node's neighbours are connected to each other). It is the ratio of the number of edges in the network to the maximum number of edges the network could have.
This metric is mainly used to give an overview of what share of the possible edges is occupied within the network. In large complex networks, this number is usually relatively low. Small networks tend to have a relatively high density, since they have far fewer possible edges and the chances of most of them being occupied are much greater.
Finding the density of a network using networkx is really easy...
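A minimal sketch, assuming a hypothetical undirected graph `G_example` (not the post's graph). For an undirected graph with n nodes and m edges the maximum number of edges is n(n-1)/2, so the density is 2m / (n(n-1)); the manual calculation is included only as a cross-check.

```python
import networkx as nx

# Hypothetical undirected example graph (an assumption for illustration only).
G_example = nx.Graph()
G_example.add_edges_from([("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")])

# Built-in density calculation.
density = nx.density(G_example)

# Manual cross-check for an undirected graph: 2m / (n * (n - 1)).
n = G_example.number_of_nodes()
m = G_example.number_of_edges()
manual_density = 2 * m / (n * (n - 1))

print(density, manual_density)
```

For a directed graph the maximum is n(n-1) edges, so the factor of 2 drops out; `nx.density` handles both cases.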
In social network analysis, one way of discovering meaningful ties is to consider reciprocity. Reciprocated edges can only be found in directed networks, as these ties depend on bidirectional edges. In other words, an edge that goes in both directions.
Why use it?
Reciprocity is incredibly important for understanding the fundamentals of user interaction. For example, in a social setting, a reciprocated edge may reflect a user returning a reply or response to a user interacting with them. This may suggest that these users would prefer to interact with each other as opposed to others.
Using networkx, this can be achieved with the following function, where the value returned represents the number of reciprocated edges divided by the total number of edges. The higher the value, the larger the share of ties in the network that are reciprocated. In our example, the reciprocity is...
>>> nx.reciprocity(G)
0.5
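To see where such a value comes from, here is a small sketch on a hypothetical directed graph (`D_example` and its edges are assumptions for illustration, not the graph used above). Two of its four edges form a reciprocated pair, so half of the edges point in both directions.

```python
import networkx as nx

# Hypothetical directed example graph (an assumption for illustration only).
D_example = nx.DiGraph()
D_example.add_edges_from([
    ("A", "B"), ("B", "A"),  # a reciprocated pair
    ("A", "C"),              # one-way edge
    ("C", "D"),              # one-way edge
])

# Share of edges whose reverse edge also exists: 2 of the 4 edges here.
reciprocity = nx.reciprocity(D_example)
print(reciprocity)
```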
In network analysis, a triad (a group containing three nodes) is considered a fundamental building block of a network. Transitivity can be thought of as the probability that two nodes sharing a common neighbour are themselves connected, that is, the fraction of connected triples that close into triangles.
Social interactions such as indirect reciprocity ("the enemy of my enemy is my friend") depend on triadic interactions in which a total of three members are present. Much like density, transitivity can also be used to get an idea of the completeness of a network.
Using networkx, this is implemented as...
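A minimal sketch, assuming a hypothetical undirected graph `G_example` (not the post's graph). The graph contains one triangle (A, B, C) and five connected triples, so three times the number of triangles divided by the number of triples gives 0.6.

```python
import networkx as nx

# Hypothetical undirected example graph (an assumption for illustration only).
G_example = nx.Graph()
G_example.add_edges_from([("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")])

# Transitivity = 3 * (number of triangles) / (number of connected triples).
transitivity = nx.transitivity(G_example)
print(transitivity)
```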
Modularity is a metric for understanding how well a network can be partitioned into separate clusters. As a general rule, the greater the modularity of a partition, the more clearly the network splits into densely connected groups joined by only sparse edges.
This has also been used in another blog post as a simple metric to roughly gauge the number of communities present within the network. While there are other (and more elaborate) community detection techniques available, modularity serves as a convenient way for understanding if a network features connected communities of users.
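The modularity of a partition of the nodes can be calculated using the following. Note that the modularity function expects a partition (a list of node sets) rather than a flat list of nodes, so this sketch assumes the communities are first found with greedy_modularity_communities; any other partition of the nodes could be passed instead ...
>>> import networkx.algorithms.community as nx_comm
>>> communities = nx_comm.greedy_modularity_communities(G)  # assumed partition
>>> nx_comm.modularity(G, communities)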
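As a rough illustration of how the score depends on the partition, the sketch below uses a hypothetical graph made of two triangles joined by a single bridge edge (the graph, the name `M_example` and both partitions are assumptions for illustration). Splitting the nodes along the two triangles scores well, while lumping every node into one community scores zero.

```python
import networkx as nx
from networkx.algorithms.community import modularity

# Hypothetical graph: two triangles joined by one bridge edge (illustration only).
M_example = nx.Graph()
M_example.add_edges_from([
    (0, 1), (1, 2), (0, 2),  # first triangle
    (3, 4), (4, 5), (3, 5),  # second triangle
    (2, 3),                  # bridge between the two groups
])

# A partition is a list of disjoint node sets that together cover every node.
two_groups = [{0, 1, 2}, {3, 4, 5}]
one_group = [set(M_example.nodes())]

q_two_groups = modularity(M_example, two_groups)
q_one_group = modularity(M_example, one_group)

print(q_two_groups)  # 5/14, roughly 0.357
print(q_one_group)   # a single all-inclusive community scores 0
```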
Final Thoughts and Conclusions
In this blog post, we went through some of the techniques used to analyse an entire network using simple metrics. This list is by no means comprehensive and there are many other metrics available; however, these are among the most commonly used.