Basic Graph Algorithms for Data Scientists
This live repo aims to cover basic graph algorithms implemented in C++ (for performance reasons) used in data science. It’s meant for educational purposes and will continue to evole.
Introduction
This repo covers basic graph algorithms for directed and undirected graphs with/without weights on edges. Graph description is read from a file with ascii format.
Base graph data structure is targeted for sparse graphs efficiency. For example, it maintains adjancy list and pointrs. Here are big-O time and space complexities.
RAM | node-add | edge-add | node-remove | edge-remove | query | |
---|---|---|---|---|---|---|
Adjacency List | [n]+[e] | 1 | 1 | [n]+[e] | [e] | [e] |
Incident List | [n]+[e] | 1 | 1 | [e] | [e] | [e] |
Legend:
- Two vertices are called adjacent if they are connected by an edge.
- Two edges are called incident, if they share a vertex.
Basic Graph
namespace basicGraph {
...
class bNode {
private:
string name_; // node name
set<const bEdge*, edgeCompare> edgelist_; // incident list
...
}/
class bEdge {
private:
const bNode* n1_; // from node for directd graphs
const bNode* n2_; // to node for directed graphs
...
};
...
class bGraph {
private:
bool isDirected_;
set<const bNode*, nodeCompare> nodeset_;
set<const bEdge*, edgeCompare> edgeset_;
...
};
}
For more details, refer to graph.h.
Algorithms Covered
- DFS
- Transpose
- Topological Sort
- Strongly Connected Components
- Prim’s Mimimal Spanning Tree
- Kruskal’s Mimimal Spanning Tree
- Dijkstra’s Single Source Shortest Path to All Nodes
- A* (aka Astar) Source-Destination Pair Shortest Path Finder Algorithm
Graph File Format
graph (un)directed`
<node-x> <node-y> [edge-weight]
Example Graph File
# Comment, ignored by the graph reader
graph directed
n1 n2 2
n1 n3 4
n2 n3 1
n2 n4 4
n2 n5 2
n3 n5 3
n4 n6 2
n5 n4 3
n5 n6 2
Application
How to Compile
cd src
<c++-compiler> *.cpp -o bgaMain.exe
How to Run
bgaMain.exe <graph-file>
Example run
$ src/bgaMain.exe test/dgraph5.txt
created graph with 6 nodes and 9 edges.
type 'help' for more options
help
help print search <root_node> sort mst [prim|kruskal] path <root_node> quit
graph directed n2 n3 1 n1 n2 2 ...
path n1
nd dist_from_src edge == ============= ==== n1 0 [none n1 0] n2 2 [n1 n2 2] n3 3 [n2 n3 1]
Disclaimer
This is a quick and dirty code produced over weekends. Main objective behind this repository is purely educational. It has quite a bit of room for improvement. Feel free to reach out if you spot weakness, bug, enancement or simply a suggestion.