2: Basic Usage and API#

In this tutorial, we will demonstrate how to use RepKit to load and inspect a set of networks. We will use the RepKit rnn module to load a set of networks from a directory, and then inspect the shape of each network in the set.

After covering how to load the data, we will measure the similarity between the networks and plot them in a low-dimensional space.

For this, we will use the model_space module from the RepKit.space package to measure the pairwise cosine similarity between the networks and visualize their 2D embedding using multidimensional scaling.

Understanding RepKitDataset#

All datasets used in RepKit are subclasses of RepKitDatasetClass, which in turn is a subclass of torch.utils.data.Dataset.
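To make the inheritance chain more concrete, here is a minimal sketch of the map-style protocol (`__len__` and `__getitem__`) that subclasses of torch.utils.data.Dataset implement. The class `ToyNetworkDataset` and its `records` field are hypothetical and are not part of RepKit. The fallback to `object` is there only so the sketch runs where PyTorch is not installed.

```python
try:
    from torch.utils.data import Dataset  # real base class when PyTorch is installed
except ImportError:
    Dataset = object  # fallback so the sketch still runs without PyTorch


class ToyNetworkDataset(Dataset):
    """Hypothetical map-style dataset holding one (inputs, targets) pair per network."""

    def __init__(self, records):
        # records: iterable of (inputs, targets) pairs, one per network
        self.records = list(records)

    def __len__(self):
        # Number of networks in the set
        return len(self.records)

    def __getitem__(self, idx):
        # Return the (inputs, targets) pair for one network
        return self.records[idx]


toy = ToyNetworkDataset([([0.1, 0.2], [1]), ([0.3, 0.4], [0])])
for idx in range(len(toy)):
    x, y = toy[idx]
    print("Network:", idx, len(x), len(y))
```

Any object that follows this protocol can be wrapped in a PyTorch DataLoader, which is presumably what RepKit's own dataset classes rely on.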

Importing the necessary modules#

First, we need to import the necessary modules. In this case, we need the rnn module, which is one of several pre-defined datasets in RepKit.dataset, and the model_space module from RepKit.space:

from RepKit.dataset import rnn
from RepKit.space import model_space

Loading the networks#

Next, we can load the networks using the rnn.flow() function and create a PyTorch DataLoader object for the networks using the get_dataloader() function:

networks = rnn.flow("data/baseline/").get_dataloader(batch_size=None, shuffle=False, num_workers=0)

Inspecting the networks#

We can now iterate over the networks and print out the shape of each network:

print(f"Found {len(networks)} networks")

# Checking shape of the networks
for idx, (x, y) in enumerate(networks):
    print("Network:", idx, x.shape, y.shape)

Measuring the Cosine similarity#

Next, we can use the model_space module to measure the pairwise cosine similarity between the networks and plot the resulting distance matrix:

space = model_space()

# Measure pairwise cosine similarity
space.measure(networks, "cosine")

# Plot the distance matrix
space.plot_distance()
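The pairwise measurement above can be sketched in plain Python. This is an illustration of the cosine metric itself, not RepKit's implementation. It assumes each network has already been flattened into a vector of equal length; the `vectors` values are made up, and how RepKit derives such vectors is not shown here. The distance is taken as 1 minus the cosine similarity.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pairwise_cosine_distance(vectors):
    """Symmetric matrix of (1 - cosine similarity) for every pair of vectors."""
    n = len(vectors)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = 1.0 - cosine_similarity(vectors[i], vectors[j])
            dist[i][j] = dist[j][i] = d  # fill both halves of the matrix
    return dist


# Hypothetical flattened representations of three networks
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
distance_matrix = pairwise_cosine_distance(vectors)
for row in distance_matrix:
    print([round(d, 3) for d in row])
```

Orthogonal vectors end up at distance 1, and vectors pointing in the same direction at distance 0, regardless of their magnitudes.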

Decomposing and visualizing 2D embedding#

Finally, we can decompose the distance matrix into a 2D embedding using multidimensional scaling and plot the 2D embedding with labels:

# Compute the 2D embedding using multidimensional scaling
space.decompose(components=2, engine="mds")

# Plot the 2D embedding with labels
space.plot_embedding(labels=[1, 5, 10, 0])
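For intuition about what the decomposition step does, the sketch below implements classical (Torgerson) multidimensional scaling in plain Python. It is an assumption that this resembles what the "mds" engine computes; many libraries use an iterative variant instead. The function `classical_mds` and the rectangle example are hypothetical and not part of RepKit.

```python
import math


def classical_mds(dist, components=2, iterations=200):
    """Embed points from a symmetric distance matrix with classical MDS."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    total_mean = sum(row_mean) / n
    # Double centering turns squared distances into a Gram (inner-product) matrix.
    gram = [
        [-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean) for j in range(n)]
        for i in range(n)
    ]
    coords = [[0.0] * components for _ in range(n)]
    for c in range(components):
        # Power iteration converges to the eigenvector whose eigenvalue has the
        # largest magnitude (a positive one for Euclidean-like distances).
        vec = [1.0 / (i + 1 + c) for i in range(n)]
        for _ in range(iterations):
            nxt = [sum(gram[i][j] * vec[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(v * v for v in nxt))
            if norm == 0.0:
                break
            vec = [v / norm for v in nxt]
        eigval = sum(vec[i] * gram[i][j] * vec[j] for i in range(n) for j in range(n))
        scale = math.sqrt(max(eigval, 0.0))  # negative eigenvalues contribute nothing
        for i in range(n):
            coords[i][c] = vec[i] * scale
        # Deflation: remove the component just found before the next pass.
        for i in range(n):
            for j in range(n):
                gram[i][j] -= eigval * vec[i] * vec[j]
    return coords


# Hypothetical example: the four corners of a 2-by-1 rectangle
points = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 1.0)]
dist = [[math.dist(p, q) for q in points] for p in points]
embedding = classical_mds(dist, components=2)
for label, (x, y) in zip([1, 5, 10, 0], embedding):
    print(label, round(x, 3), round(y, 3))
```

The recovered coordinates match the original points only up to rotation, reflection, and translation, so the pairwise distances are what should be compared, not the raw values. Cosine distances are not guaranteed to be Euclidean, in which case a 2D embedding is an approximation.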

Notes:#

You can use any of the supported metrics (instead of cosine) to measure the similarity matrix. The supported metrics can be found in space.metric.registered_metrics. Similarly, you can use any of the supported decomposers when generating the embeddings.
Because the space object keeps track of its state changes, you can chain multiple functions together. For example, in a single line of code: space.measure(networks, "cosine").plot_distance().decompose(components=2, engine="mds").plot_embedding(labels=[1, 5, 10, 0])
Furthermore, you can also pass in your custom metric or decomposer to the measure() and decompose() functions respectively. For more information, please refer to the RepKit.space module or tutorial 5.
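The chaining described in the notes above works when each method updates the object's state and then returns the object itself. The sketch below shows that pattern with a hypothetical `ChainableSpace` class. It is a stand-in for illustration only; it does not reproduce RepKit's model_space, and its methods merely record that they were called.

```python
class ChainableSpace:
    """Hypothetical stateful object whose methods return self to allow chaining."""

    def __init__(self):
        self.history = []  # records each step so the state changes are visible

    def measure(self, data, metric):
        self.history.append(f"measure({metric})")
        return self  # returning self is what makes chaining possible

    def decompose(self, components, engine):
        self.history.append(f"decompose({components}, {engine})")
        return self

    def plot_embedding(self, labels):
        self.history.append(f"plot_embedding({len(labels)} labels)")
        return self


toy_space = ChainableSpace()
result = (
    toy_space.measure([], "cosine")
    .decompose(components=2, engine="mds")
    .plot_embedding(labels=[1, 5, 10, 0])
)
print(result.history)
```

Since every call returns the same object, the chained form and the step-by-step form shown earlier in this tutorial leave the object in the same state.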