Modeling (HGNN / LSM)

hypermesh.nn is the modeling layer: it turns a stored hypergraph into model-ready tensors and trains hypergraph neural networks directly on the database, behind a single high-level call. You go from raw data to a trained model without hand-writing data loaders, incidence matrices, or feature joins.

pip install "hypermesh[ml]"       # torch-only: owned HGNN + tensor bridge
pip install "hypermesh[ml-pyg]"   # + PyTorch Geometric bridge
pip install "hypermesh[ml-dgl]"   # + Deep Graph Library bridge
pip install "hypermesh[ml-dhg]"   # + DeepHypergraph bridge

There are two stages, feature construction (Stage 1) and tensor bridges (Stage 2), each with a one-call front door.

Train a model in one call

hm.nn.fit(...) trains hypermesh's own spectral hypergraph neural network (torch-only; no PyG/DGL required) and returns a ready-to-score model:

import hypermesh as hm

db = hm.connect("/var/lib/hypermesh/data")

model = hm.nn.fit(
    db, "CoProximity",        # hyperedge table  → graph structure
    node_table="Patient",     # node table       → node features + label
    label="label",            # supervision target (a node-table column)
)

model.predict_dict()          # {node_id: predicted_class}
model.predict_proba()         # (n_nodes, n_classes) softmax probabilities
model.embed()                 # (n_nodes, hidden_dim) node embeddings

fit() automatically picks classification for categorical labels and regression for numeric ones (override with task=). It standardises features, holds out a validation split, and restores the best-validation weights when training finishes.
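The preprocessing steps described above can be sketched in plain NumPy. This is a minimal illustration under stated assumptions, not hypermesh's implementation: the helper names (infer_task, zscore, holdout_split) are hypothetical, and the rule "strings and booleans are categorical, everything else is numeric" is an assumed reading of how task="auto" might decide.

```python
import numpy as np

def infer_task(labels):
    """Guess the task from label values (assumed rule, not hypermesh's)."""
    # Strings or booleans are treated as categories; anything else as numeric.
    if all(isinstance(v, (str, bool)) for v in labels):
        return "classification"
    return "regression"

def zscore(X, eps=1e-12):
    """Standardise each feature column to zero mean and unit variance."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    # Constant columns have std 0; divide by 1 so they become all zeros.
    std = np.where(std < eps, 1.0, std)
    return (X - mean) / std

def holdout_split(n_labelled, val_fraction=0.2, seed=0):
    """Return (train_idx, val_idx) as positions into the labelled nodes."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_labelled)
    n_val = max(1, int(round(val_fraction * n_labelled)))
    return order[n_val:], order[:n_val]

print(infer_task(["A", "B", "A"]))
print(zscore([[1.0, 10.0], [3.0, 10.0]]))
print(holdout_split(10))
```

Restoring the best-validation weights then amounts to keeping a copy of the parameters from the epoch with the lowest validation loss and loading that copy after the final epoch.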

Parameter	Default	Meaning
`hidden_dim`	`32`	Hidden width of the HGNN.
`num_layers`	`2`	Number of Θ-propagation layers.
`dropout`	`0.5`	Dropout between layers.
`epochs`	`200`	Training iterations (full-batch / transductive).
`lr`, `weight_decay`	`0.01`, `5e-4`	Adam optimiser settings.
`val_fraction`	`0.2`	Fraction of labelled nodes held out for validation.
`task`	`"auto"`	`"classification"`, `"regression"`, or `"auto"`.
`standardize`	`True`	Z-score node features before training.
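To make "Θ-propagation layer" concrete, here is a NumPy sketch of one layer. It assumes the widely used spectral HGNN convolution, relu(Dv^-1/2 H W De^-1 H^T Dv^-1/2 X Θ); the source does not state which operator hypermesh uses, and hgnn_propagate is a hypothetical name.

```python
import numpy as np

def hgnn_propagate(X, H, Theta, edge_weight=None):
    """One spectral hypergraph convolution with a ReLU activation.

    X: (n_nodes, f_in) features, H: (n_nodes, n_edges) 0/1 incidence matrix,
    Theta: (f_in, f_out) learnable weights, edge_weight: (n_edges,) or None.
    """
    X = np.asarray(X, dtype=float)
    H = np.asarray(H, dtype=float)
    n_edges = H.shape[1]
    w = np.ones(n_edges) if edge_weight is None else np.asarray(edge_weight, dtype=float)

    dv = H @ w            # weighted node degrees
    de = H.sum(axis=0)    # hyperedge sizes

    # Invert degrees only where they are positive (isolated nodes, empty edges).
    dv_inv_sqrt = np.zeros_like(dv)
    dv_inv_sqrt[dv > 0] = dv[dv > 0] ** -0.5
    de_inv = np.zeros_like(de)
    de_inv[de > 0] = 1.0 / de[de > 0]

    # Normalised node-to-node operator: Dv^-1/2 H W De^-1 H^T Dv^-1/2
    left = dv_inv_sqrt[:, None] * H * (w * de_inv)[None, :]
    right = H.T * dv_inv_sqrt[None, :]
    A = left @ right

    return np.maximum(A @ X @ Theta, 0.0)

# Three nodes, two hyperedges: {0, 1} and {1, 2}.
H = np.array([[1, 0], [1, 1], [0, 1]])
print(hgnn_propagate(np.eye(3), H, np.eye(3)))
```

Under this reading, num_layers stacks that many such steps, and hidden_dim is the output width of Θ in the intermediate layers.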

The `FittedHGNN`

Method	Returns
`predict()`	Per-node predictions aligned to `node_ids` (class labels or values).
`predict_dict()`	`{node_id: prediction}`.
`predict_proba()`	Class probabilities `(n_nodes, n_classes)` (classification).
`embed()`	Penultimate-layer node embeddings.
`evaluate(y, split="val")`	`{"accuracy": ...}` or `{"mse": ..., "mae": ...}`.

Stage 1 — features (`featurize`)

For full control, build a FeaturedHypergraph and inspect it before training:

fhg = hm.nn.featurize(db, "CoProximity", node_table="Patient", label="label")

fhg.X                     # node features  (n_nodes, f_node)
fhg.E                     # edge features  (n_edges, f_edge)
fhg.y                     # labels         (n_nodes,)
fhg.node_feature_names    # e.g. ['age', 'disease=A', 'disease=B']
fhg.classes_              # label classes for a categorical target

Node features are joined from a node table. Numeric columns pass through; TEXT columns are one-hot encoded by default, or ordinal-encoded with categorical="ordinal".
Nodes present in the hypergraph but missing from the node table are imputed (no error). With no node table at all, featurize falls back to structural features (degree, weighted_degree).
Edge features are drawn from weight, size, event_ts.
Pass an explicit x=<array> to override the join entirely.
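The join rules above can be illustrated with a small pure-Python sketch. The function name featurize_nodes is hypothetical, and the imputation strategy (column mean for numeric columns, an all-zero one-hot block for text columns) is an assumption; the source only says that missing nodes are imputed.

```python
def featurize_nodes(node_ids, rows, numeric_cols, text_cols):
    """Join node-table rows onto node_ids and build a feature matrix.

    rows maps node_id -> {column: value}. Returns (X, feature_names).
    """
    # Sorted categories per text column keep the column order deterministic.
    categories = {
        col: sorted({r[col] for r in rows.values() if r.get(col) is not None})
        for col in text_cols
    }
    names = list(numeric_cols) + [
        f"{col}={cat}" for col in text_cols for cat in categories[col]
    ]

    # Column means, used to impute numeric values for missing nodes (assumed).
    means = {}
    for col in numeric_cols:
        vals = [r[col] for r in rows.values() if r.get(col) is not None]
        means[col] = sum(vals) / len(vals) if vals else 0.0

    X = []
    for nid in node_ids:
        r = rows.get(nid, {})  # empty dict for nodes absent from the table
        vec = [
            float(r[col]) if r.get(col) is not None else means[col]
            for col in numeric_cols
        ]
        for col in text_cols:
            vec += [1.0 if r.get(col) == cat else 0.0 for cat in categories[col]]
        X.append(vec)
    return X, names

table = {1: {"age": 30, "disease": "A"}, 2: {"age": 50, "disease": "B"}}
X, names = featurize_nodes([1, 2, 3], table, ["age"], ["disease"])
print(names)
print(X)
```

Node 3 appears in the hypergraph but not in the table, so it receives the mean age and zeros in both disease columns rather than raising an error.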

Stage 2 — tensor bridges

prepare() is the one-call path from database to a framework-native container:

data = hm.nn.prepare(db, "CoProximity", framework="pyg",
                     node_table="Patient", label="label")
# -> torch_geometric.data.Data(x=..., hyperedge_index=..., y=...)

Or convert an existing FeaturedHypergraph with .to(framework):

`framework`	Returns	Encoding
`"torch"`	`dict` of tensors	sparse `incidence` + `hyperedge_index` `[2, nnz]`.
`"pyg"`	`torch_geometric.data.Data`	`hyperedge_index` / `hyperedge_weight` (the `HypergraphConv` convention).
`"dgl"`	DGL heterograph	`node` / `hyperedge` node types with `in` / `has` relations.
`"dhg"`	`dict` with a `dhg.Hypergraph`	0-based vertex positions + feature/label tensors.

fhg  = hm.nn.featurize(db, "CoProximity", node_table="Patient")
torch_data = fhg.to("torch")        # {"incidence", "hyperedge_index", "x", ...}
dgl_graph  = fhg.to("dgl")
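For readers unfamiliar with the `[2, nnz]` encoding, the sketch below builds it from a list of hyperedges. It follows the PyTorch Geometric HypergraphConv convention (row 0 holds node positions, row 1 holds hyperedge positions); to_hyperedge_index is a hypothetical helper, not part of hypermesh.

```python
def to_hyperedge_index(hyperedges, node_ids):
    """Encode hyperedges as a 2 x nnz index: [node positions, edge positions]."""
    pos = {nid: i for i, nid in enumerate(node_ids)}  # node id -> 0-based position
    node_row, edge_row = [], []
    for e, members in enumerate(hyperedges):
        for nid in members:
            node_row.append(pos[nid])
            edge_row.append(e)
    return [node_row, edge_row]

def to_dense_incidence(hyperedge_index, n_nodes, n_edges):
    """Expand the index into a dense n_nodes x n_edges 0/1 incidence matrix."""
    H = [[0] * n_edges for _ in range(n_nodes)]
    for v, e in zip(*hyperedge_index):
        H[v][e] = 1
    return H

edges = [["a", "b", "c"], ["b", "c", "d"]]
index = to_hyperedge_index(edges, ["a", "b", "c", "d"])
print(index)
print(to_dense_incidence(index, n_nodes=4, n_edges=2))
```

Each column of the index is one non-zero entry of the incidence matrix, so nnz equals the sum of the hyperedge sizes.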

Each bridge imports its backend lazily and raises a clear ImportError naming the install extra if the library is missing.
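The lazy-import behaviour can be pictured roughly as follows. This is a sketch of the general pattern, not hypermesh's code: load_backend and the registry layout are hypothetical, while the extras names come from the install commands at the top of this page.

```python
import importlib

# framework -> (importable module name, install extra); layout is an assumption
_BACKENDS = {
    "torch": ("torch", "ml"),
    "pyg": ("torch_geometric", "ml-pyg"),
    "dgl": ("dgl", "ml-dgl"),
    "dhg": ("dhg", "ml-dhg"),
}

def load_backend(framework, registry=_BACKENDS):
    """Import a backend only when it is requested, with an actionable error."""
    if framework not in registry:
        raise ValueError(f"unknown framework {framework!r}")
    module_name, extra = registry[framework]
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"framework={framework!r} requires '{module_name}'. "
            f'Install it with: pip install "hypermesh[{extra}]"'
        ) from exc
```

Because the import happens inside the function, a torch-only install never touches PyG, DGL, or DeepHypergraph unless one of those frameworks is requested.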

Temporal models (reservoir / LSM)

For time-evolving hypergraphs, ReservoirClassifier is an Echo State Network (a reservoir-computing model in the same family as Liquid State Machines) with a cheap ridge-regression readout. It is NumPy-only; no torch is required. temporal_features is the bridge that turns a time range into a per-window feature sequence:

# one feature sequence per time range (n_windows × n_features)
seq = hm.nn.temporal_features(db, "CoProximity", window_seconds=3600)

clf = hm.nn.ReservoirClassifier(n_reservoir=200, seed=0)
# seq_a, seq_b, ... are sequences built as above, one per labelled time range
clf.fit([seq_a, seq_b, ...], [0, 1, ...])    # many labelled sequences
clf.predict([new_seq])
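For intuition about what a reservoir classifier with a ridge readout does, here is a compact NumPy sketch. TinyESN is a hypothetical teaching class, not ReservoirClassifier: the weight ranges, the spectral-radius rescaling, and the use of the final reservoir state as the sequence feature are all assumptions.

```python
import numpy as np

class TinyESN:
    """Fixed random recurrent reservoir plus a closed-form ridge readout."""

    def __init__(self, n_inputs, n_reservoir=50, spectral_radius=0.9,
                 ridge=1e-2, seed=0):
        rng = np.random.default_rng(seed)
        self.W_in = rng.uniform(-0.5, 0.5, size=(n_reservoir, n_inputs))
        W = rng.uniform(-0.5, 0.5, size=(n_reservoir, n_reservoir))
        # Rescale so the largest eigenvalue magnitude equals spectral_radius.
        W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
        self.W = W
        self.ridge = ridge

    def _final_state(self, seq):
        # Drive the reservoir with the sequence; the weights are never trained.
        h = np.zeros(self.W.shape[0])
        for x in np.asarray(seq, dtype=float):
            h = np.tanh(self.W_in @ x + self.W @ h)
        return h

    def fit(self, sequences, labels):
        S = np.stack([self._final_state(s) for s in sequences])
        self.classes_ = np.unique(labels)
        # One-hot targets, one column per class.
        Y = (np.asarray(labels)[:, None] == self.classes_[None, :]).astype(float)
        # Ridge regression in closed form: (S^T S + ridge * I) W_out = S^T Y
        A = S.T @ S + self.ridge * np.eye(S.shape[1])
        self.W_out = np.linalg.solve(A, S.T @ Y)
        return self

    def predict(self, sequences):
        S = np.stack([self._final_state(s) for s in sequences])
        return self.classes_[np.argmax(S @ self.W_out, axis=1)]

neg = np.full((6, 2), -1.0)   # toy sequence: 6 windows, 2 features
pos = np.full((6, 2), 1.0)
esn = TinyESN(n_inputs=2, n_reservoir=20).fit([neg, pos, neg, pos], [0, 1, 0, 1])
print(esn.predict([pos, neg]))
```

Only the readout is fitted, and it has a closed-form solution, which is why training is cheap compared with backpropagation through time.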

Window features include n_edges, mean_size, mean_weight, sum_weight and n_active_nodes. Empty windows become all-zero rows so the sequence is dense and evenly spaced in time.
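The windowing described above can be sketched in pure Python. The function name window_features and the (timestamp, members, weight) event layout are hypothetical; the five columns follow the order listed above.

```python
import math

def window_features(events, start, end, window_seconds):
    """Bucket hyperedge events into fixed windows; return one row per window.

    events: iterable of (timestamp, members, weight).
    Columns: n_edges, mean_size, mean_weight, sum_weight, n_active_nodes.
    """
    n_windows = max(1, math.ceil((end - start) / window_seconds))
    buckets = [[] for _ in range(n_windows)]
    for ts, members, weight in events:
        if start <= ts < end:
            buckets[int((ts - start) // window_seconds)].append((members, weight))

    rows = []
    for bucket in buckets:
        if not bucket:
            rows.append([0.0] * 5)  # empty window -> all-zero row
            continue
        sizes = [len(m) for m, _ in bucket]
        weights = [w for _, w in bucket]
        active = set().union(*(m for m, _ in bucket))  # distinct nodes seen
        rows.append([
            float(len(bucket)),
            sum(sizes) / len(sizes),
            sum(weights) / len(weights),
            float(sum(weights)),
            float(len(active)),
        ])
    return rows

events = [(0, ["a", "b"], 1.0), (10, ["b", "c", "d"], 3.0), (7300, ["a"], 2.0)]
for row in window_features(events, start=0, end=10800, window_seconds=3600):
    print(row)
```

The middle hour contains no events, so it yields a zero row; this keeps every sequence the same length for a given time range and spacing.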

Remote clients

fit / featurize / prepare build the hypergraph in-process, so they need an embedded Connection. From a remote Client, fetch rows and build the snapshot yourself, then featurize from it:

import hypermesh as hm
from hypermesh import build_hypergraph

# `client` is an already-connected remote Client
rows = list(client.execute("MATCH HYPEREDGE (he:CoProximity) RETURN *"))
hg   = build_hypergraph(rows)
fhg  = hm.nn.featurize(hg)            # structural features (no node table)