Future Work
Graph Partitioning
Implement METIS-style node partitioning for billion-edge graphs. This would require:
- Partition assignment per node (e.g., round-robin, METIS, streaming)
- Multi-round MPI communication for cross-partition edges (BFS frontier exchange, Dijkstra relaxation broadcast)
- Result merging across partitions
This is the most impactful scalability improvement. The current replicated model works well up to ~100K edges per process memory budget.
Streaming File Input
Accept graph files larger than available RAM by streaming edge list parsing with a sliding-window approach:
- Stream file in chunks
- Emit edges as they're parsed (no full-file buffering)
- Algorithms would need to support incremental graph construction
Pairs with graph partitioning for true large-scale support.
Additional Algorithms
| Algorithm | Notes |
|---|---|
| Betweenness Centrality | Expensive (O(VE)), highly parallelizable across source nodes |
| Floyd-Warshall | All-pairs shortest path, O(V³) — practical only for small graphs |
| Louvain Community Detection | Modularity-based clustering, iterative |
| HITS (Hubs & Authorities) | Complements PageRank for link analysis |
| Bipartite Matching | Hungarian algorithm, Hopcroft-Karp |
WASM Shared Memory
Migrate the WASM API boundary from JSON strings to SharedArrayBuffer for zero-copy data transfer between the worker and main thread:
Current: Worker → JSON.stringify → postMessage(string) → JSON.parse → Main
Proposed: Worker → SharedArrayBuffer ← Main (zero copy, direct read)
Would reduce data transfer overhead for large graphs (currently <2ms, but visible for 10K+ node graphs).
GPU-Accelerated Force Layout
Integrate a GPU compute-shader force-directed layout algorithm for real-time layout of 100K+ node graphs:
- Current: CPU force simulation via Reagraph's d3-force
- Proposed: WebGPU compute shaders for O(N²) → O(N log N) Barnes-Hut on GPU
- Would enable smooth interactive layout for orders of magnitude more nodes
Graph Database Backend
Add a graph database backend (e.g., Dgraph, TigerGraph, or a custom Rust solution) to:
- Persist graphs without re-uploading on every session
- Support incremental updates (add/remove nodes and edges)
- Enable graph queries (find neighbors, subgraph extraction) without full re-processing
- Store algorithm result history
Authentication & Multi-User
For a hosted deployment:
- User accounts with per-user graph storage
- Shared graphs with collaboration (multiple users viewing the same graph)
- Job queue for long-running algorithms (don't block HTTP handler)
- WebSocket progress streaming for real-time algorithm execution feedback
Additional File Formats
| Format | Description |
|---|---|
| GraphML (XML) | Widely used in graph tools (yEd, Gephi) |
| GML | Graph Modelling Language |
| GEXF | Gephi's native format with rich metadata |
| CSV with headers | source,target,weight header row |
| Parquet | Columnar format for very large graphs |