In this article, we delve into the foundational concept of graph data structures, exploring its various facets and nuances. We will also discuss diverse methods for representing graphs, with a specific focus on their implementation in the C programming language. This is an integral part of understanding the broader realm of data structures within C.
Conceptual Overview
Elements of Graphs: Nodes and Connections
A graph is fundamentally composed of elements termed nodes, and the links that intertwine them, known as connections.
Orientation in Graphs: Unidirectional vs. Bidirectional
Graphs can exhibit a specific orientation, known as directionality. If the connections within a graph exhibit a defined path or direction, it is termed a unidirectional graph or digraph, with these connections being designated as unidirectional connections or pathways. In the context of this discussion, our primary focus will be on unidirectional graphs, meaning whenever we discuss a connection, we are referencing a unidirectional connection. However, it’s essential to note that bidirectional graphs can effortlessly be portrayed as unidirectional graphs by establishing connections between intertwined nodes in a reciprocal manner.
It’s worth noting that representations can often be streamlined if tailored exclusively for bidirectional graphs, and we’ll touch upon this perspective intermittently.
Connectivity and Proximity in Graphs
When a node serves as the terminal point of a connection, it is labeled as a proximal node to the origin node of that connection. The origin node is then recognized as being in close connectivity or proximity to the terminal node.
Illustrative Case Study
Consider the visual representation below, showcasing a graph comprising 5 nodal points interconnected by 7 links. The connections bridging points A to D and points B to C denote reciprocal linkages, depicted in this context with a dual-directional arrow symbol.
Theoretical Framework
Delving into a more structured delineation, a graph can be characterized as an organized duo, G = <V, L>, where ‘V’ epitomizes the ensemble of nodal points, and ‘L’, symbolizing the collection of linkages, is essentially an ensemble of systematically arranged pairs of nodal points.
To elucidate, the subsequent formulations represent the earlier portrayed graph employing set notation:
V = {A, B, C, D, E}
A = {<A, B>, <A, D>, <B, C>, <C, B>, <D, A>, <D, C>, <D, E>}
Graph Operations and Methods Overview
For a comprehensive implementation of a graph structure, it’s essential to have a foundational suite of operations to construct, alter, and traverse through the graph, listing nodal points, linkages, and adjacent nodes.
Below are the operations offered by each model. Specifically, the details presented pertain to the primary model, termed ‘graphModel1’:
graph1 *graph1_create(void);
Create an empty graph
void graph1_delete(graph1 *graph);
Delete a graph
vertex *graph1_add(graph1 *graph, const char *name, void *data);
Add a vertex to the graph with a name, and optionally some data
vertex *graph1_get_vertex(const graph1 *graph, const char *name);
Retrieve a vertex by name
void *graph1_remove(graph1 *graph, vertex *vertex);
Remove a vertex
void graph1_add_edge(graph1 *graph, vertex *vertex1, vertex *vertex2);
Create a directed edge between vertex1 and vertex2
void graph1_remove_edge(graph1 *graph, vertex *vertex1, vertex *vertex2);
Remove the directed edge from vertex1 to vertex2
unsigned int graph1_get_adjacent(const graph1 *graph, const vertex *vertex1, const vertex *vertex2);
Determine if there is an edge from vertex1 to vertex2
iterator *graph1_get_neighbours(const graph1 *graph, const vertex *vertex);
Get the neighbours of a vertex
iterator *graph1_get_edges(const graph1 *graph);
Get all of the edges in the graph
iterator *graph1_get_vertices(const graph1 *graph);
Get all of the vertices in the graph
unsigned int graph1_get_neighbour_count(const graph1 *graph, const vertex *vertex);
Get the count of neighbours of a vertex
unsigned int graph1_get_edge_count(const graph1 *graph);
Get the count of edges in the graph
unsigned int graph1_get_vertex_count(const graph1 *graph);
Get the count of vertices in the graph
Vertex and Edge Representation
Vertices in Graph Representations
In all graph representations, a vertex is defined as follows:
typedef struct {
char *name;
void *data;
void *body;
deletefn del;
} vertex;
Please observe the “body” field, primarily used by certain representations (like Adjacency List and Incidence List) to incorporate per-vertex structure.
The following functions are available for vertex manipulation:
const char *vertex_get_name(const vertex *v);
Get the vertex’s name
void *vertex_get_data(const vertex *v);
Get the data associated with a vertex
Edges in Graph Representations
The internal implementation of edges differs across representations. In fact, in three representations—Adjacency List, Adjacency Matrix, and Incidence Matrix—edges do not exist as internal objects at all. Nevertheless, from the client’s perspective, edges, as enumerated by the iterator returned from the function for retrieving edges, appear as this structure:
typedef struct {
vertex *from;
vertex *to;
} edge;
Here are the functions available for handling edges:
const vertex *edge_get_from(const edge *e);
Get the vertex that is the starting-point of an edge
const vertex * edge_get_to(const edge *e);
Get the vertex that is the end-point of an edge
Sample Program
The program below creates the graph introduced earlier using an intuitive representation known as “graph1.” It proceeds to list the vertices, their neighbors, and edges.
Diverse Approaches to Graph Representation
Graphs can be represented in various ways, each method offering unique advantages and use cases. Here are five fundamental approaches to graph representation:
- The Intuitive Representation: This approach involves describing a graph in a natural language or visual manner, making it easy for humans to understand and conceptualize. While it’s intuitive, it may not be the most efficient for computational tasks;
- Adjacency List: In this representation, each vertex of the graph is associated with a list of its neighboring vertices. This approach is particularly useful for sparse graphs and can help save memory;
- Adjacency Matrix: Here, a matrix is used to represent the connections between vertices. It provides a quick way to determine if there is an edge between two vertices but can be memory-intensive for large graphs;
- Incidence Matrix: This representation uses a matrix to indicate which vertices are incident to each edge. It’s especially useful for directed graphs and can help solve various graph-related problems;
- Incidence List: In this approach, each edge is associated with a list of its incident vertices. It’s a more compact representation than the incidence matrix and is often preferred for graphs with a low edge-to-vertex ratio.
Choosing the right graph representation depends on the specific requirements of your application, such as memory constraints, computational efficiency, and the type of graph you are working with. Each of these methods has its own strengths and weaknesses, making them valuable tools in the field of graph theory and data analysis.
The intuitive representation: graph1
The representation I refer to as the “intuitive” or sometimes the “object-oriented” representation involves directly translating the mathematical definition of a graph into a data type:
#include <stdio.h>
#include <graph1.h>
int main(void)
{
graph1 *graph;
vertex *v;
vertex *A, *B, *C, *D, *E;
iterator *vertices, *edges;
edge *e;
/* Create a graph */
graph = graph1_create();
/* Add vertices */
A = graph1_add(graph, "A", NULL);
B = graph1_add(graph, "B", NULL);
C = graph1_add(graph, "C", NULL);
D = graph1_add(graph, "D", NULL);
E = graph1_add(graph, "E", NULL);
/* Add edges */
graph1_add_edge(graph, A, B);
graph1_add_edge(graph, A, D);
graph1_add_edge(graph, B, C);
graph1_add_edge(graph, C, B);
graph1_add_edge(graph, D, A);
graph1_add_edge(graph, D, C);
graph1_add_edge(graph, D, E);
/* Display */
printf("Vertices (%d) and their neighbours:\n\n", graph1_get_vertex_count(graph));
vertices = graph1_get_vertices(graph);
while ((v = iterator_get(vertices))) {
iterator *neighbours;
vertex *neighbour;
unsigned int n = 0;
printf("%s (%d): ", vertex_get_name(v), graph1_get_neighbour_count(graph, v));
neighbours = graph1_get_neighbours(graph, v);
while ((neighbour = iterator_get(neighbours))) {
printf("%s", vertex_get_name(neighbour));
if (n < graph1_get_neighbour_count(graph, vertex) - 1) {
fputs(", ", stdout);
}
n++;
}
putchar('\n');
iterator_delete(neighbours);
}
putchar('\n');
iterator_delete(vertices);
printf("Edges (%d):\n\n", graph1_get_edge_count(graph));
edges = graph1_get_edges(graph);
while ((e = iterator_get(edges))) {
printf("<%s, %s>\n", vertex_get_name(edge_get_from(e)), vertex_get_name(edge_get_to(e)));
}
putchar('\n');
iterator_delete(edges);
/* Delete */
graph1_delete(graph);
return 0;
}
- To add a vertex, it’s a matter of including it within the vertex set;
- Adding an edge involves simply including it within the edge set;
- Removing vertices and edges entails their removal from their respective sets;
- When searching for a vertex’s neighbors, examine the edge set for edges where the vertex appears in the “from” field;
- To determine adjacency between two vertices, inspect the edge set for an edge with the first vertex in the “from” field and the second vertex in the “to” field;
- Obtaining all edges is straightforward; just retrieve an iterator over the edge set;
- In the case of undirected graphs, each edge is stored only once, and neighbor identification and adjacency testing consider both vertices.
The edge object is described not as “from” and “to” but rather as “first” and “second,” meaning it represents an unordered pair.
- In this representation, edges are treated as internal objects, similar to the Incidence List method;
- It closely resembles a sparse Adjacency Matrix, where the edge set contains adjacent pairs, and non-adjacent pairs are absent.
Adjacency List: graph2
- The graph consists of a collection of vertices;
- Each vertex includes a set of neighboring vertices.
For the graph introduced earlier, the neighbor sets would appear as follows:
A: {B, D}
B: {C}
C: {B}
D: {A, C, E}
E: {}
- Including a vertex involves simply adding it to the vertex set;
- Adding an edge entails adding the endpoint of that edge to the neighbor set of the starting vertex;
- Accessing a vertex’s neighbors is straightforward since the vertex retains all neighbor information.
Simply provide an iterator to access them. In this implementation, the graph argument in the function for retrieving neighbors is rendered unnecessary.
- Checking for adjacency is straightforward; search the neighbors of the first vertex for the second vertex;
- Retrieving all edges is more challenging to implement in this representation since edges are not treated as distinct objects.
You should sequentially iterate through each vertex’s neighbors and build the edge using the vertex and the respective neighbor.
Adjacency Matrix: graph3
The graph comprises a collection of vertices and a matrix indexed by vertices. This matrix contains a “1” entry when the vertices are connected.
typedef struct {
set *vertices;
matrix *edges;
} graph3;
The adjacency matrix for the graph introduced earlier would appear as follows:
- To add a vertex, insert a row and column into the matrix;
- When removing a vertex, eliminate its corresponding row and column.
Because the addition and removal of rows and columns are resource-intensive operations, the adjacency matrix is not well-suited for graphs where vertices are frequently added and removed.
- Adding and removing edges is straightforward, involving no memory allocation or deallocation, just setting matrix elements;
- To find neighbors, inspect the vertex’s row for “1” entries;
- To establish adjacency, search for a “1” at the intersection of the first vertex’s row and the second vertex’s column;
- To retrieve the edge set, locate all “1” entries in the matrix and create edges using the corresponding vertices;
- In undirected graphs, the matrix exhibits symmetry around the main diagonal.
This allows for the removal of half of it, resulting in a triangular matrix.
- For efficient vertex lookup, the vertex set should be organized with index numbers, or the matrix should function as a 2-dimensional map with vertices as keys;
- The memory consumption for edges remains a constant |V|^2.
This is most effective for a graph that is nearly complete, meaning it has a high density of edges.
The matrix can be sparse, aligning memory usage more closely with the number of edges.
Sparse matrices simplify the addition and removal of columns (without block shifts), but necessitate renumbering.
Employing booleans or bits within the matrix can optimize memory usage.
Incidence Matrix: graph4
The graph is comprised of vertices and a matrix, similar to the concept of an Adjacency Matrix. However, in this representation, the matrix has dimensions of vertices × edges, where each column holds two non-zero entries, designating both the starting and ending vertices of an edge. This approach offers a more compact representation, especially for graphs with a large number of vertices and a substantial number of edges.
typedef struct {
set *vertices;
matrix *edges;
} graph4;
The incidence matrix for the graph introduced earlier appears as follows (where “1” denotes “from” and “2” denotes “to”):
- When adding a vertex, introduce a new row to the matrix;
- For adding an edge, insert a new column into the matrix;
- When removing a vertex, all columns containing that vertex must be removed from the matrix;
- To retrieve edges, iterate through the columns and construct edges from the paired values;
- To identify neighbors, search for “1s” in the vertex’s row, and within those columns, locate the “2” value indicating a neighbor;
- To establish adjacency, locate a column with “1” in the starting-point vertex’s row and a “2” in the end-point’s row;
- In the case of an undirected graph, each edge corresponds to one column with “1” denoting “connected,” resulting in two “1s” per column.
Incidence List: graph5
Similar to the concept of an Adjacency List, this representation involves a set of vertices. However, in contrast to the Adjacency List, each vertex stores a list of the edges for which it serves as the starting point, rather than merely listing neighbors. This approach is particularly well-suited for certain applications and facilitates efficient access to information regarding the edges originating from each vertex.
typedef struct {
set * vertices;
} graph5;
typedef struct {
set *edges;
} vertex_body;
For the graph introduced earlier, the sets of edges would be represented as follows:
A: {<A, B>, <A, D>}
B: {<B, C>}
C: {<C, B>}
D: {<D, A>, <D, C>, <D, E>}
E: {}
- When incorporating a vertex, it entails adding it to the vertex set;
- To add an edge, it is included in the edge set of its starting vertex;
- To determine if two vertices are adjacent, one needs to inspect the edge set of the first vertex for an edge that includes the second vertex as its “to” field;
- Obtaining neighbors involves extracting them from the pairs in the set of edges associated with the vertex;
- Accessing the edge set necessitates iterating through each vertex’s edge sets in succession;
- It is possible to store the edges both within the graph object and in each individual vertex for efficient data access.
Conclusion
This comprehensive exploration into graph data structures sheds light on their fundamental elements, orientation, connectivity, and theoretical underpinnings. With a focus on the C programming language, the piece offers a suite of practical operations for creating, altering, and traversing graphs. By elucidating the nuances of nodes, connections, and their interrelationships, readers are equipped with a profound understanding of graph implementation, serving as a pivotal resource for anyone striving to deepen their grasp on data structures.