Skip to main content

Future Work

Graph Partitioning

Implement METIS-style node partitioning for billion-edge graphs. This would require:

  • Partition assignment per node (e.g., round-robin, METIS, streaming)
  • Multi-round MPI communication for cross-partition edges (BFS frontier exchange, Dijkstra relaxation broadcast)
  • Result merging across partitions

This is the most impactful scalability improvement. The current replicated model works well up to ~100K edges per process memory budget.


Streaming File Input

Accept graph files larger than available RAM by streaming edge list parsing with a sliding-window approach:

  • Stream file in chunks
  • Emit edges as they're parsed (no full-file buffering)
  • Algorithms would need to support incremental graph construction

Pairs with graph partitioning for true large-scale support.


Additional Algorithms

AlgorithmNotes
Betweenness CentralityExpensive (O(VE)), highly parallelizable across source nodes
Floyd-WarshallAll-pairs shortest path, O(V³) — practical only for small graphs
Louvain Community DetectionModularity-based clustering, iterative
HITS (Hubs & Authorities)Complements PageRank for link analysis
Bipartite MatchingHungarian algorithm, Hopcroft-Karp

WASM Shared Memory

Migrate the WASM API boundary from JSON strings to SharedArrayBuffer for zero-copy data transfer between the worker and main thread:

Current:  Worker → JSON.stringify → postMessage(string) → JSON.parse → Main
Proposed: Worker → SharedArrayBuffer ← Main (zero copy, direct read)

Would reduce data transfer overhead for large graphs (currently <2ms, but visible for 10K+ node graphs).


GPU-Accelerated Force Layout

Integrate a GPU compute-shader force-directed layout algorithm for real-time layout of 100K+ node graphs:

  • Current: CPU force simulation via Reagraph's d3-force
  • Proposed: WebGPU compute shaders for O(N²) → O(N log N) Barnes-Hut on GPU
  • Would enable smooth interactive layout for orders of magnitude more nodes

Graph Database Backend

Add a graph database backend (e.g., Dgraph, TigerGraph, or a custom Rust solution) to:

  • Persist graphs without re-uploading on every session
  • Support incremental updates (add/remove nodes and edges)
  • Enable graph queries (find neighbors, subgraph extraction) without full re-processing
  • Store algorithm result history

Authentication & Multi-User

For a hosted deployment:

  • User accounts with per-user graph storage
  • Shared graphs with collaboration (multiple users viewing the same graph)
  • Job queue for long-running algorithms (don't block HTTP handler)
  • WebSocket progress streaming for real-time algorithm execution feedback

Additional File Formats

FormatDescription
GraphML (XML)Widely used in graph tools (yEd, Gephi)
GMLGraph Modelling Language
GEXFGephi's native format with rich metadata
CSV with headerssource,target,weight header row
ParquetColumnar format for very large graphs