this project is a panoramic, research-oriented pipeline for evaluating graph reduction algorithms. it is designed to support thesis-level experimentation by letting the user compare different algorithms that reduce graph complexity and study how they influence the properties of large-scale networks.
the core goal of this framework is to serve as a tool for the systematic study of reduction techniques and, eventually, to evolve into an automated laboratory for graph-theory research.
note: the project grew out of my bachelor's thesis. it is still very much a work in progress (hence the demo) but is constantly evolving and will hopefully become part of my postgraduate research at Jagiellonian University.
- multi-source graph ingestion: loading graphs from various formats (edge lists, in-memory objects) via a unified gateway. additionally, the framework uses lazy loading to boost performance when handling massive datasets
- polymorphic transformations: support for different types of graph reduction
  - sparsification: selecting significant nodes/edges and discarding the rest
  - coarsening: aggregating similar nodes/edges to construct a smaller graph
  - condensation (work in progress): learning a synthetic graph from scratch
- automated metric registry: calculating structural properties of both the original and the modified graph on the fly
- experiment management: orchestrating full experiments from start to finish, where a graph is imported, transformed, analyzed, and the results are persisted as an audit trail
- efficiency benchmarking: automatic tracking of wall-clock time for both transformation and metric phases to evaluate theoretical vs. empirical complexity
- visualization: basic metric value plots and graph figures
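the efficiency benchmarking above can be sketched as a simple wall-clock wrapper around a pipeline phase; `timed` is a hypothetical helper for illustration, not the framework's actual instrumentation:

```python
import time

def timed(fn, *args, **kwargs):
    """run one pipeline phase, returning (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

# `sorted` stands in here for a transformation or metric phase
result, elapsed = timed(sorted, [3, 1, 2])
```

the same wrapper can time both the transformation and the metric phases, which is what makes comparing empirical timings against theoretical complexity straightforward.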
the framework is built using domain-driven design principles and organized into distinct layers to ensure that experimental logic remains decoupled from infrastructure concerns. the plugin-driven architecture ensures flexibility and makes the program open for future extension.
- domain layer: contains the core truth of the system, such as the `Graph` model, `Metric` definitions, and the abstract `GraphTransform` logic; it is purely focused on graph theory and the mathematics behind the graph algorithms
- application layer: orchestrates the workflow via the `ExperimentService`, handling the "business logic" of running a research job without needing to know how graphs are stored or where they come from
- infrastructure layer: takes care of details like reading graphs (`GraphGateway`), persisting results (`Repository`), and managing database transactions (`UnitOfWork`)
- interface layer: provides entry points for the user, including a CLI for automated batches and an API for potential integration with web dashboards
the program follows a defined lifecycle for every experiment.
- import: the `GraphGateway` reads raw data and creates a lazy `Graph` object
- orchestration: the `ExperimentService` receives a command via the CLI/API and fetches the original graph from the `Repository`
- transformation: the system looks up the requested algorithm in a `Registry` and executes it, producing a new `Graph` object while keeping the original graph intact
- analysis: a series of `Metric`s is run against the graph, returning detailed dictionaries of values and metadata from the calculation
- commitment: the `UnitOfWork` ensures that both the new graph and the experiment results are saved to storage together, preventing dirty data resulting from errors
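the lifecycle can be sketched end to end as follows; everything here is a minimal stand-in (the stub repository and unit of work, the string-keyed registry, the toy edge-set graph), not the project's actual API:

```python
# self-contained sketch of the experiment lifecycle (illustrative names)
class InMemoryRepository:
    def __init__(self):
        self.graphs, self.results = {}, {}
    def get(self, gid): return self.graphs[gid]
    def add(self, gid, graph): self.graphs[gid] = graph
    def save_results(self, gid, res): self.results[gid] = res

class UnitOfWork:
    """context manager standing in for transactional commit/rollback"""
    def __enter__(self): return self
    def __exit__(self, exc_type, exc, tb): return False  # re-raise errors

class ExperimentService:
    def __init__(self, repository, registry, metrics, uow):
        self.repository, self.registry = repository, registry
        self.metrics, self.uow = metrics, uow

    def run(self, graph_id, transform_name):
        original = self.repository.get(graph_id)           # orchestration
        reduced = self.registry[transform_name](original)  # transformation; original intact
        results = {name: fn(reduced) for name, fn in self.metrics.items()}  # analysis
        with self.uow:                                     # commitment: save both together
            self.repository.add(graph_id + "-reduced", reduced)
            self.repository.save_results(graph_id, results)
        return results

# demo: a graph as an edge set, one toy sparsifier, one metric
repo = InMemoryRepository()
repo.add("karate", {(0, 1), (1, 2), (2, 0), (2, 3)})
registry = {"keep-chain-edges": lambda g: {e for e in g if abs(e[0] - e[1]) == 1}}
metrics = {"edge_count": len}
service = ExperimentService(repo, registry, metrics, UnitOfWork())
out = service.run("karate", "keep-chain-edges")  # → {"edge_count": 3}
```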
experiments are managed through the `ExperimentService`, which coordinates the lifecycle of importing, transforming, and measuring graphs.
- clone or download the repository to your local machine
- open the terminal and navigate to the project's root directory
- run the following demo that showcases the pipeline from ingestion to visualization:
```sh
python src/demo.py
```

this script runs a small selection of scenarios and outputs comparative metric results along with some basic graph visualizations.
it makes no difference whether the operation removes edges, merges nodes, or reweights the entire graph; as long as the algorithm follows the `GraphTransform` interface, it can be plugged into the pipeline without changing a single line of core code.
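a minimal sketch of such a plug-in, assuming `GraphTransform` exposes a single `apply` method (the real interface in the project may differ):

```python
from abc import ABC, abstractmethod

class GraphTransform(ABC):
    """hypothetical shape of the transform interface described above"""
    @abstractmethod
    def apply(self, graph):
        """return a new, reduced graph; must not mutate the input"""

class KeepEveryOtherEdge(GraphTransform):
    """toy sparsifier: keep every second edge (purely illustrative)"""
    def apply(self, graph):
        return set(list(graph)[::2])

reduced = KeepEveryOtherEdge().apply({(0, 1), (1, 2), (2, 3), (3, 0)})
```

because the pipeline only ever calls `apply`, a coarsening or reweighting transform plugs in the same way.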
adding a new metric requires only creating a single new file; the system automatically discovers new metrics at runtime using a scanning decorator. this makes it trivial to add new evaluation criteria as research needs evolve.
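the scanning-decorator pattern can be sketched as a module-level registry that fills itself at import time; the decorator and registry names here are illustrative, not the project's actual ones:

```python
METRIC_REGISTRY = {}

def metric(name):
    """register a metric function under `name` when its module is imported"""
    def register(fn):
        METRIC_REGISTRY[name] = fn
        return fn
    return register

@metric("density")
def density(edges, n_nodes):
    # undirected density: |E| divided by n-choose-2
    return 2 * len(edges) / (n_nodes * (n_nodes - 1))

# the pipeline can now look metrics up by name without importing them directly
value = METRIC_REGISTRY["density"]({(0, 1), (1, 2)}, 4)
```

dropping a new metric file into the scanned package is then enough for it to appear in the registry.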
the current system uses in-memory storage for speed and demo purposes, but the architecture is decoupled, so swapping to a SQL database or a specialized graph database (like Neo4j) requires only changing the `Repository` implementation while the rest of the logic remains untouched.
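the swap hinges on callers depending only on an abstract `Repository`; a sketch under that assumption (method names are illustrative):

```python
from abc import ABC, abstractmethod

class Repository(ABC):
    """abstract storage boundary the rest of the pipeline depends on"""
    @abstractmethod
    def add(self, key, graph): ...
    @abstractmethod
    def get(self, key): ...

class InMemoryRepository(Repository):
    """the demo's fast, non-persistent implementation"""
    def __init__(self):
        self._store = {}
    def add(self, key, graph):
        self._store[key] = graph
    def get(self, key):
        return self._store[key]

# a SQL- or Neo4j-backed class would subclass Repository the same way;
# callers are typed against the abstraction, not the implementation
repo: Repository = InMemoryRepository()
repo.add("g1", {(0, 1)})
```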
this project is designed to eventually support adaptive reduction strategies for dynamic graphs and automated, topology-aware selection of reduction techniques, paving the way for machine learning integration.