neighbor_algorithm module

Full Documentation for hippynn.layers.pairs.csr_pairs.neighbor_algorithm module. Click here for a summary page.

This file was written with assistance from an LLM.

Neighbor list construction over triclinic periodic cells using PyTorch + CSR.

This module builds neighbor pairs for batched molecular/atomistic systems with triclinic cells. It uses a light CSR “algebra” (see csrtable) to stage row-wise joins and filtering:

Pipeline (high level)

  1. build_initial_data() — wrap raw inputs into CSRs for systems/atoms.

  2. normalize_atoms() — wrap positions to fractional [0,1)^3 and record integer image offsets.

  3. build_image_offsets() — compute per-system image-shift stencils large enough for cutoff.

  4. build_image_atoms() — cartesian-expand primary atoms × image shifts.

  5. voxelize_images() — voxel grid per system with edge ≈ cutoff and assign image-atoms to voxels.

  6. voxel_adjacency() — build voxel–voxel edges via a fixed stencil (includes self-edges).

  7. expand_pairs() — expand voxel edges into candidate atom pairs.

  8. prune_pairs() — filter by cutoff; drop same-atom self-edges for primary images.

  9. calc_neighbors() — orchestration entry point that returns final pairs.

All tensor operations are batched and device/dtype agnostic. Index tensors are torch.long. Distances/displacements remain differentiable w.r.t. input positions.

build_image_atoms(atomCSR: CSRTable, imageCSR: CSRTable, systemCSR: CSRTable) CSRTable[source]

Expand primary atoms across image offsets (row-wise outer join).

Parameters

atomCSRCSRTable

Rows = systems; requires fields "positions":[N,3], "atom_gid":[N], "offsets":[N,3] (base offsets).

imageCSRCSRTable

Rows = systems; requires fields "image_offsets":[I,3], "shift":[I,3], "is_primary":[I].

systemCSRCSRTable

Rows = systems; requires "cells" and "cutoff" (used for culling).

Returns

image_atomCSRCSRTable

Rows = systems; entries = atoms×images. Carries: "positions":[M,3] (shifted), "atom_gid":[M], "offsets":[M,3] (total offset for that image), "is_primary":[M], "system":[M] (system id per entry).

Pruning

A coarse AABB check removes far-away images that cannot generate neighbors within cutoff (per system).

Complexity

O(N*I) per system in the worst case before pruning; typically much less after AABB culling.

build_image_offsets(systemCSR: CSRTable, use_full_stencil=True) CSRTable[source]

Compute per-system periodic image offsets sufficient for cutoff.

The stencil size per axis is n_rep[k] = ceil(cutoff * ||(H^{-1})_{k,:}||_2) with a symmetric range {-n_rep[k], ..., 0, ..., +n_rep[k]}. The Cartesian shift for an integer offset k is -k @ H (so that adding it moves an image back into the primary cell).

Parameters

systemCSRCSRTable

Rows = systems; requires "cells" : [S,3,3] and "cutoff" : [S]. If present, "cells_inv" is reused; otherwise it is computed.

use_full_stencilbool, default True

If True use the full symmetric 3D product. (Half-stencils or sparse variants can be added later.)

Returns

imageCSRCSRTable

Rows = systems; entries = image shifts for that system. Carries: "image_offsets" : [nnz,3] (long offsets), "shift" : [nnz,3] (Cartesian shift to apply), "is_primary" : [nnz] (True for k == (0,0,0)).

Complexity

O(S) to compute per-axis replication counts; O(total images) to materialize.

build_initial_data(positions, nonblank, cells, cutoff)[source]

Package raw inputs (positions/mask/cells/cutoff) into CSR containers.

Parameters

positionsTensor, shape [S, A, 3], float

Cartesian positions per system (padded to a common A). Can be float32/float64. Unused (masked) atoms are ignored via nonblank.

nonblankBoolTensor, shape [S, A]

Mask of real atoms per system. True entries become CSR rows’ entries.

cellsTensor, shape [S, 3, 3], float

Triclinic cell matrices (row- or column-major consistent with einsum ops).

cutoffTensor [S] or float

Radial cutoff per system; a scalar is broadcast to all systems.

Returns

atomCSRCSRTable

Rows = systems; entries = real atoms. Carries at least: "raw_positions" : [nnz,3], "atom_gid" : [nnz] (global id in the padded array), and any fields required downstream.

systemCSRCSRTable

One entry per system (counts = 1). Carries: "cells" : [S,3,3] and "cutoff" : [S].

Notes

  • atom_gid preserves a mapping back to the original padded indices.

  • No heavy validation is performed here.

Complexity

O(S*A) to build the mask-based CSR.

calc_neighbors(positions: Tensor, nonblank: Tensor, cells: Tensor, cutoff: float, use_full_stencil: bool = True, return_displacements=True) Tuple[Tensor, Tensor, Tensor, Tensor][source]

Full neighbor-list pipeline (batched triclinic PBC).

Pipeline Structure:
  1. normalize_atoms -> wrapped positions + base image offsets (integer)

  2. build_image_offsets -> per-system dynamic image CSR with (“image_offsets”,”shift”,”is_primary”)

  3. build_image_atoms -> CSR over systems with (“positions”,”atom_gid”,”offsets”,”is_primary”,”system”)

  4. voxelize_images -> (voxel_atomCSR, voxelCSR, per_system_grid_shape)

  5. voxel_adjancency -> (first_voxel_id, second_voxel_id), includes (0,0,0)

  6. expand edges -> pairs via CSRTable.expand_cartesian_pairings on voxel_atomCSR

  7. prune: distance < cutoff, drop self-connections

Parameters

positionsTensor, shape [S, A, 3]

Cartesian positions per system, padded to A.

nonblankBoolTensor, shape [S, A]

Mask indicating valid atoms.

cellsTensor, shape [S, 3, 3]

Triclinic cell matrices per system.

cutoffTensor [S] or float

Per-system radial cutoff; a scalar is broadcast.

use_full_stencilbool, default True

Controls the image offset stencil in build_image_offsets().

return_displacements: bool, default True

If true, return distances and displacments as well as discrete indices.

Returns

idALongTensor, shape [K]

Source atom global ids (within the padded input).

idBLongTensor, shape [K]

Target atom global ids.

rel_kLongTensor, shape [K, 3]

Integer image offset for B relative to A.

distTensor, shape [K]

Euclidean distances ||posB - posA||.

dispTensor, shape [K, 3]

Displacements posB - posA in Cartesian coordinates.

Guarantees

  • Deterministic ordering given fixed inputs and a fixed CSR ordering policy.

  • Includes cross-image neighbors due to explicit image expansion.

  • Excludes trivial self-edges as described in prune_pairs().

Complexity

Roughly linear in the number of candidate pairs after voxel pruning; memory peaks during pair expansion and can be controlled by voxel granularity.

Examples

>>> idA, idB, k, dist, disp = calc_neighbors(positions, nonblank, cells, cutoff)
calculate_aabb(positions: Tensor, system_id: Tensor, n_systems: int)[source]

Compute per-system axis-aligned bounding boxes (AABB).

Parameters

positionsTensor, shape [M, 3] (subset of atoms/images)

Cartesian coordinates to bound.

system_idLongTensor, shape [M]

Owning system id for each position.

n_systemsint

Total number of systems.

Returns

minsTensor, shape [S, 3]

Per-system minima; systems without entries return zeros.

maxsTensor, shape [S, 3]

Per-system maxima; systems without entries return zeros.

Notes

Uses scatter_reduce_ with amin/amax. Handles empty systems by emitting [0,0,0] for both min/max.

expand_pairs(voxel_atomCSR, first_vox, second_vox)[source]

Expand voxel edges to candidate atom pairs.

Parameters

voxel_atomCSRCSRTable

Rows = global voxel ids; contains per-entry fields for image-atoms.

first_voxLongTensor, shape [E]

Global voxel ids (left endpoints).

second_voxLongTensor, shape [E]

Global voxel ids (right endpoints).

Returns

pairsCSRTable

Rows = edges (one row per (first_vox[i], second_vox[i])), entries = Cartesian product of atoms from the two voxels. Carries at least: "idA","idB","posA","posB","rel_k","is_primary_B" and any additional payload required for pruning.

Notes

Implemented via CSRTable.expand_pairings()/outer-style joins for the specified voxel pairings.

normalize_atoms(atomCSR: CSRTable, systemCSR: CSRTable) tuple[Tensor, Tensor][source]

Wrap atoms into the primary triclinic cell and record image offsets.

For each atom with Cartesian position x and cell H, compute fractional f = H^{-1} x, integer image k = floor(f), wrapped fractional f' = f - k in [0,1)^3, and wrapped Cartesian x' = H f'. Store these plus H^{-1} for later use.

Parameters

atomCSRCSRTable

Must carry "raw_positions" : [nnz,3] and rows aligned to systems.

systemCSRCSRTable

Must carry "cells" : [S,3,3]. cells are inverted in-place and stored as "cells_inv".

Returns

atomCSRCSRTable

With added fields: "positions" : [nnz,3] (wrapped Cartesian), "offsets" : [nnz,3] (integer images, long).

systemCSRCSRTable

With added field "cells_inv" : [S,3,3].

Determinism

Purely functional given inputs; no randomness.

Autograd

positions remain differentiable w.r.t. input positions; offsets/cells_inv are non-differentiable.

prune_pairs(pairs)[source]

Apply geometric/pruning predicates to candidate pairs.

Predicates

  1. Radial cutoff: ||posB - posA|| < cutoff (per-system).

  2. Self-edge removal: when idA == idB, drop if is_primary_B is True (i.e., the primary image would create a trivial self-connection).

Parameters

pairsCSRTable

Must carry fields: "idA","idB","posA","posB","rel_k","cutoff","is_primary_B" (names may vary slightly depending on upstream ops, but semantics match above).

Returns

idA : LongTensor, shape [K] idB : LongTensor, shape [K] rel_k : LongTensor, shape [K,3] dist : Tensor, shape [K] disp : Tensor, shape [K,3]

Autograd

dist and disp are differentiable; discrete tensors are not.

Complexity

O(E_pairs) with simple boolean masking and a single norm.

voxel_adjacency(voxelCSR: CSRTable, systemCSR: CSRTable) Tuple[Tensor, Tensor][source]

Build voxel–voxel edges using a fixed stencil (includes self-edges).

Assumptions

  • PBC are handled by explicit image expansion; only in-bounds checks are needed.

  • Keep an edge if at least one endpoint voxel is primary (cheap pruning).

  • The stencil can be full (e.g., 3×3×3) or reduced; current code emits a symmetric set including (0,0,0).

Parameters

voxelCSRCSRTable

Rows = systems; entries = voxels with fields "v","m","s","voxel_gid","is_primary_voxel".

systemCSRCSRTable

Provides per-system voxel grid shapes.

Returns

first_globalLongTensor, shape [E]

Global voxel ids for the first endpoint.

second_globalLongTensor, shape [E]

Global voxel ids for the second endpoint.

Complexity

O(V * stencil_size), typically small constant factor per voxel.

voxelize_images(image_atomCSR: CSRTable, systemCSR: CSRTable) Tuple[CSRTable, CSRTable, Tensor][source]

Partition image-atoms into near-cubic voxels of edge ≈ cutoff.

Parameters

image_atomCSRCSRTable

Must carry (at least) "positions","atom_gid","offsets","is_primary","system".

systemCSRCSRTable

Must carry per-system "cutoff". Outputs grid metadata into this CSR.

Returns

voxel_atomCSRCSRTable

Rows = global voxel ids; entries = image-atoms in that voxel. Carries the same fields, plus a per-entry "voxel_gid" if helpful downstream.

voxelCSRCSRTable

Rows = systems; one entry per voxel. Carries: "v":[V,3] (local voxel coords), "m":[V,3] (per-system voxel grid shape), "s":[V,3] (local strides), "voxel_gid":[V], "is_primary_voxel":[V] (voxel contains at least one primary image-atom).

systemCSRCSRTable

Augmented with per-system fields, e.g.: "voxel_grid_shape":[S,3], "voxel_origin":[S,3].

Notes

  • A small skin factor inflates the voxel edge slightly to avoid numerical misses on boundaries.

  • Local voxel ids are unraveled/raveled with per-system strides.

Complexity

O(M) to assign atoms to voxels; O(V) to materialize per-system voxel metadata.