Case Study: Congressional Cosponsorship (end-to-end)
This walkthrough takes a real, public hypergraph from raw files all the way to a trained hypergraph neural network, exercising every layer of HyperMesh in a single script. It doubles as a reference for how the pieces fit together.
The full, runnable script lives at
examples/case_study_congress_bills.py.

    pip install "hypermesh[analytics,interop,ml]" pandas
    python examples/case_study_congress_bills.py                 # downloads the dataset
    python examples/case_study_congress_bills.py --limit 20000   # faster subset

The dataset
We use congress-bills from the Austin R. Benson data collection: a temporal higher-order network where
- nodes are US Congresspersons (1,718 of them),
- hyperedges are legislative bills — the set of a bill’s sponsor and co-sponsors (260,851 timestamped bills),
- timestamps are the day each bill was introduced.
This is a textbook hypergraph: a bill naturally connects many legislators at
once, which a plain graph can only approximate with cliques. It ships in the
standard Benson format (-nverts.txt, -simplices.txt, -times.txt,
-node-labels.txt), which the script downloads and parses.
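
As a rough illustration of that layout, the sketch below parses the three core files into per-bill member lists and timestamps. It assumes the usual Benson convention: one integer per line, with -nverts.txt giving each simplex's size, -simplices.txt the concatenated node IDs, and -times.txt one timestamp per simplex. The function name `parse_benson` and its arguments are hypothetical and are not taken from the case-study script.

```python
from itertools import islice
from typing import Iterable, List, Tuple


def parse_benson(
    nverts_lines: Iterable[str],
    simplices_lines: Iterable[str],
    times_lines: Iterable[str],
) -> Tuple[List[List[int]], List[int]]:
    """Rebuild variable-size simplices from the flattened Benson files."""
    sizes = [int(line) for line in nverts_lines if line.strip()]
    times = [int(line) for line in times_lines if line.strip()]
    if len(sizes) != len(times):
        raise ValueError("nverts and times must have one line per simplex")

    # Consume the flat node-ID stream in chunks of the recorded sizes.
    node_ids = (int(line) for line in simplices_lines if line.strip())
    simplices = [list(islice(node_ids, size)) for size in sizes]

    if any(len(s) != n for s, n in zip(simplices, sizes)):
        raise ValueError("simplices file is shorter than nverts implies")
    return simplices, times


# Tiny in-memory example: two bills, with 3 and 2 cosponsors.
simplices, times = parse_benson(
    ["3", "2"],
    ["1", "2", "3", "2", "4"],
    ["10", "25"],
)
print(simplices)  # [[1, 2, 3], [2, 4]]
print(times)      # [10, 25]
```

Because the arguments are plain iterables of lines, open file handles can be passed in directly.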
1. Ingest — bulk-load timestamped hyperedges
Each bill becomes one hyperedge. We hand copy_from_df a frame with event_ts and a members list per row, which is the fast bulk path (no row-by-row inserts).
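
    import pandas as pd

    # times and simplices come from the parsed Benson files; db is the open HyperMesh connection.
    he_df = pd.DataFrame({
        "event_ts": times,
        "members": [list(s) for s in simplices],  # variable-size sets
        "weight": [1.0] * len(simplices),
    })
    # Timestamps here are days, so a bucket of 365 covers roughly one year.
    db.execute("CREATE HYPEREDGE TABLE Cosponsorship () BUCKET_SECONDS 365")
    db.copy_from_df(he_df, "Cosponsorship")

We also compute each legislator’s tenure (first/last active day) on the way in; it is used later as model features.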
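
The tenure computation itself is not shown in this excerpt. One way it might be done, assuming tenure simply means the earliest and latest event_ts among the bills a legislator appears on, is a single pass over the hyperedges. The helper name `compute_tenure` is hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def compute_tenure(
    simplices: Iterable[List[int]],
    times: Iterable[int],
) -> Dict[int, Tuple[int, int]]:
    """Map each node ID to (first_active, last_active) timestamps."""
    tenure: Dict[int, Tuple[int, int]] = {}
    for members, ts in zip(simplices, times):
        for node in members:
            # Default to (ts, ts) the first time a node is seen.
            first, last = tenure.get(node, (ts, ts))
            tenure[node] = (min(first, ts), max(last, ts))
    return tenure


tenure = compute_tenure([[1, 2, 3], [2, 4], [1, 4]], [10, 25, 40])
print(tenure[1])  # (10, 40)
print(tenure[2])  # (10, 25)
print(tenure[4])  # (25, 40)
```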
2. Query — temporal Cypher
The hyperedges are immediately queryable, including time windows and pagination:

    db.execute(
        "MATCH HYPEREDGE (he:Cosponsorship) "
        "WHERE he.event_ts >= 2000 AND he.event_ts <= 4000 RETURN * LIMIT 1000"
    )
    db.execute("MATCH HYPEREDGE (he:Cosponsorship) RETURN * SKIP 5 LIMIT 3")

3. Analytics — who matters, and how it’s structured
The analytics engine runs directly on the stored hypergraph: degree, PageRank influence, density, spectral gap, and spectral communities.

    an = db.analytics("Cosponsorship")
    an.density()
    an.spectral_gap()
    degree = an.node_degree()   # {node_id: #bills}
    pr = an.pagerank()          # influence ranking
    an.zhou_clustering()        # spectral communities

We then derive a reproducible supervised label for the modeling stage:
legislators in the top quartile of cosponsorship degree are tagged "high"
influence, everyone else "low". This label is written into a node table:

    db.execute(
        "CREATE NODE TABLE Congressperson ("
        " node_id INTEGER PRIMARY KEY, name TEXT, "
        " first_q INTEGER, last_q INTEGER, tier TEXT)"
    )
    db.copy_from_df(nodes_df, "Congressperson")  # name + tenure + tier

4. Interop — export for visualisation
HyperMesh doesn’t render graphs itself, but interop bridges the hypergraph to the standard ecosystem. Here we project to NetworkX and write GraphML for Gephi/Cytoscape:

    hg = db.to_hypergraph("Cosponsorship")
    g = hm.interop.to_networkx(hg, kind="clique")
    hm.interop.to_graphml(hg, "congress.graphml", kind="clique")

5. Modeling — train an owned HGNN in one call
Now the payoff. We predict each legislator’s influence tier from their tenure features plus the cosponsorship structure, using the modeling layer. Tenure does not directly encode the degree-derived label (it is non-leaky), so the model has to learn from the hypergraph itself.

    fhg = hm.nn.featurize(
        db, "Cosponsorship",
        node_table="Congressperson",
        node_features=["first_q", "last_q"],
        label="tier",
    )

    # Framework-native tensors, if you'd rather bring your own model:
    tensors = hm.nn.prepare(
        db, "Cosponsorship", framework="torch",
        node_table="Congressperson",
        node_features=["first_q", "last_q"],
        label="tier",
    )

    model = hm.nn.fit(fhg, epochs=150)
    model.evaluate(fhg.y, "val")["accuracy"]
    model.embed()  # node embeddings for downstream use

hm.nn.fit builds the spectral propagation operator from the incidence matrix, standardises features, splits train/val, and returns a FittedHGNN that exposes predict, predict_proba, embed, and evaluate.
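
The source does not spell out which operator hm.nn.fit builds. For intuition, the sketch below computes one widely used choice, the symmetric normalization Θ = D_v^{-1/2} H W D_e^{-1} H^T D_v^{-1/2} from the hypergraph learning literature, directly from a list of hyperedges. Treating this as HyperMesh's operator is an assumption, and `propagation_operator` is a hypothetical helper, not part of the library.

```python
import math
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def propagation_operator(
    hyperedges: List[List[int]],
    weights: Optional[List[float]] = None,
) -> Dict[Tuple[int, int], float]:
    """Return the nonzero entries of D_v^-1/2 H W D_e^-1 H^T D_v^-1/2."""
    if weights is None:
        weights = [1.0] * len(hyperedges)

    # Node degree d(v): total weight of the hyperedges containing v.
    degree: Dict[int, float] = defaultdict(float)
    for members, w in zip(hyperedges, weights):
        for v in set(members):
            degree[v] += w

    theta: Dict[Tuple[int, int], float] = defaultdict(float)
    for members, w in zip(hyperedges, weights):
        nodes = sorted(set(members))
        if not nodes:
            continue  # an empty hyperedge contributes nothing
        share = w / len(nodes)  # w(e) / delta(e), with delta(e) the edge size
        for i in nodes:
            for j in nodes:
                theta[(i, j)] += share / math.sqrt(degree[i] * degree[j])
    return dict(theta)


theta = propagation_operator([[0, 1, 2], [1, 2]])
print(round(theta[(0, 0)], 4))  # 0.3333
print(round(theta[(1, 1)], 4))  # 0.4167
```

The dictionary form keeps the example dependency-free. At the scale of this dataset, a sparse matrix library would be the more practical choice, since the inner loops are quadratic in hyperedge size.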
6. Temporal — reservoir computing (LSM)
Finally, we treat each legislator as a time series of yearly activity and classify their influence tier with a reservoir (liquid-state) model, with no backpropagation through time:

    seq = hm.nn.temporal_features(db, "Cosponsorship", window_seconds=365)

    clf = hm.nn.ReservoirClassifier(n_reservoir=128, seed=0)
    clf.fit(per_legislator_sequences, tiers)
    clf.score(per_legislator_sequences, tiers)  # scored on the same data used to fit

What this demonstrates
In one script, with one connect():
| Layer | API | What it did |
|---|---|---|
| Ingest | copy_from_df | 260k variable-size bills as hyperedges |
| Query | Cypher MATCH HYPEREDGE | temporal windows + pagination |
| Analytics | db.analytics(...) | influence, density, communities, spectral gap |
| Interop | hm.interop.* | NetworkX / GraphML export |
| Modeling | hm.nn.fit | owned spectral HGNN, trained on the DB |
| Temporal | hm.nn.ReservoirClassifier | activity-trajectory classification |
The same APIs run unchanged on the full dataset (--limit 0) and against a
remote HyperMesh server.