# 13.1: Introduction to Measures of Similarity and Structural Equivalence

In this rather lengthy chapter we are going to do three things.

First, we will focus on how to measure the similarity of actors in a network based on their relations to other actors. The idea of "equivalence" that we discussed in the last chapter is an effort to understand the pattern of relationships in a graph by creating classes, or groups, of actors who are "equivalent" in one sense or another. All of the methods for identifying such groupings first measure the similarity or dissimilarity of actors, and then search for patterns and simplifications. We will begin by reviewing the most common approaches to indexing the similarity of actors based on their relations with other actors.

Second, we will look briefly at two tools that are commonly used for visualizing patterns of similarity and dissimilarity (or distance) among actors: multi-dimensional scaling and hierarchical cluster analysis. Both are widely used for network and non-network data alike. They are particularly helpful for visualizing the similarity or distance among cases, and for identifying classes of similar cases.

Third, we will examine the most commonly used approaches for finding structural equivalence classes, that is, methods for identifying groups of nodes that are similar in their patterns of ties to all other nodes. These methods (and those for other kinds of "equivalence" in the next two chapters) take the similarity or distance between actors as their starting point, and most often use clustering and scaling to visualize the results. The "block model" is also commonly used to describe structural similarity classes.
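
To make the first step concrete before turning to the details, here is a minimal sketch of measuring how similar two actors are from their ties to other actors. Everything in it is an illustrative assumption rather than a method taken from this chapter: the four-actor network is made up, the helper names `tie_profile` and `profile_distance` are hypothetical, Euclidean distance is just one of several possible indexes, and leaving out the ties between the two actors being compared is only one convention among several.

```python
import math

# Hypothetical directed network: adj[i][j] == 1 means actor i sends a tie to actor j.
names = ["A", "B", "C", "D"]
adj = [
    [0, 0, 1, 1],  # A sends ties to C and D
    [0, 0, 1, 1],  # B sends ties to C and D
    [1, 1, 0, 0],  # C sends ties to A and B
    [0, 0, 0, 0],  # D sends no ties
]


def tie_profile(adj, i, skip):
    """Ties sent and received by actor i, ignoring the actors listed in skip."""
    others = [k for k in range(len(adj)) if k not in skip]
    sent = [adj[i][k] for k in others]
    received = [adj[k][i] for k in others]
    return sent + received


def profile_distance(adj, i, j):
    """Euclidean distance between the tie profiles of i and j to third parties.

    Assumption: ties between i and j themselves (and self-ties) are left out,
    so only relations with the remaining actors are compared.
    """
    p = tie_profile(adj, i, skip=(i, j))
    q = tie_profile(adj, j, skip=(i, j))
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Print the distance for every pair of actors.
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        print(names[i], names[j], round(profile_distance(adj, i, j), 3))
```

In this toy network A and B have identical ties to C and D, so their distance is zero; under this particular index they would fall into the same class. C and D receive the same ties but send different ones, so their distance is greater than zero. A full matrix of such pairwise distances is the kind of input that clustering and scaling tools work from.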