Functional Attention: From Pairwise Affinities to Functional Correspondences

Jiefang Xiao¹,², Maolin Gao¹,²,*, Simon Weber³,*, Guandao Yang⁴, Daniel Cremers¹,²

¹TU Munich ²MCML ³PIXL, University of Oxford ⁴ECE, UT Austin
*Corresponding author
ICML 2026

Functional Attention (FuncAttn) reinterprets attention as a functional correspondence between learned bases. Queries, keys, and values are projected into a compact spectral space, where a regularized least-squares solve yields a k×k linear operator that transports information between the two spaces, reducing complexity from O(n²) to O(k²) with k ≪ n.
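The pipeline described above (project, solve, transport) can be sketched in a few lines of NumPy. This is only one plausible reading of the description, not the authors' implementation: the function name `functional_attention_sketch`, the ridge weight `lam`, the use of a single shared orthonormal basis, and the choice to fit the operator from key coefficients to query coefficients are all assumptions made for illustration.

```python
import numpy as np

def functional_attention_sketch(Q, K, V, basis, lam=1e-3):
    """Hypothetical illustration, not the authors' implementation.

    Q, K, V : (n, d) token features.
    basis   : (n, k) matrix with orthonormal columns (assumed; stands in
              for a learned basis).
    lam     : ridge regularization weight (assumed name and default).
    """
    k = basis.shape[1]

    # Project features onto the basis: spectral coefficients of shape (k, d).
    q_hat = basis.T @ Q
    k_hat = basis.T @ K
    v_hat = basis.T @ V

    # Regularized least squares for a k x k operator C:
    #   C = argmin_C ||C k_hat - q_hat||_F^2 + lam * ||C||_F^2
    # Normal equations: C (k_hat k_hat^T + lam I) = q_hat k_hat^T.
    gram = k_hat @ k_hat.T + lam * np.eye(k)
    rhs = q_hat @ k_hat.T
    # gram is symmetric, so solving gram C^T = rhs^T gives C.
    C = np.linalg.solve(gram, rhs.T).T

    # Transport the value coefficients with C and map back to the n points.
    out = basis @ (C @ v_hat)
    return out, C

rng = np.random.default_rng(0)
n, d, k = 256, 16, 8
# Random orthonormal basis as a stand-in for a learned one.
basis, _ = np.linalg.qr(rng.standard_normal((n, k)))
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))

out, C = functional_attention_sketch(Q, K, V, basis)
print(out.shape, C.shape)
```

In this sketch the operator C has k×k entries regardless of n; only the projections onto and out of the basis touch all n points.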

Abstract

Learning mappings between infinite-dimensional function spaces, or operator learning, is essential for many machine learning applications. Although transformer-based operators are popular, they often rely on token-wise attention. These methods treat continuous fields as discrete tokens and usually ignore the global functional structure. We introduce Functional Attention, which reinterprets attention as a functional correspondence between adaptive bases. Inspired by geometric functional maps, our method replaces softmax affinities with structured linear operators. This yields a compact, generalizable, resolution-invariant representation that explicitly captures global dependencies. Experiments demonstrate that Functional Attention can match state-of-the-art performance in many operator learning tasks, including solving PDEs, 3D segmentation, and regression, while remaining robust to varying discretizations.

Ground truth and error maps for the Elasticity and Darcy benchmarks (relative L₂ error ×100).

OOD generalization on the airfoil dataset. Our method generalizes to unseen Reynolds numbers while maintaining smooth and accurate predictions.

Qualitative comparison of RNA surface segmentation. Red circles highlight regions where our method more faithfully recovers the ground-truth segmentation than Transolver.

Evaluation

PDE Benchmarks

We report quantitative results on PDE benchmarks as the relative L₂ error with respect to the ground truth (×100, lower is better). The best results are in bold and the second best are underlined; “/” indicates that the method is not applicable. Our method, FuncAttn, reaches state-of-the-art results and outperforms prior methods on almost all datasets.
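For readers unfamiliar with the metric, the following is a minimal sketch of a relative L₂ error computed per sample and then averaged. The page does not state the exact averaging convention, so the per-sample mean used here is an assumption, and `relative_l2` is a hypothetical helper name.

```python
import numpy as np

def relative_l2(pred, target):
    """Mean over samples of ||pred - target||_2 / ||target||_2.

    pred, target: arrays of shape (num_samples, ...); trailing axes are
    flattened per sample. Averaging per sample is an assumed convention.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    pred = pred.reshape(pred.shape[0], -1)
    target = target.reshape(target.shape[0], -1)
    num = np.linalg.norm(pred - target, axis=1)
    den = np.linalg.norm(target, axis=1)
    return float(np.mean(num / den))

# Two toy samples: the first is predicted exactly, the second is off by 1.
target = [[3.0, 4.0], [0.0, 2.0]]
pred = [[3.0, 4.0], [0.0, 1.0]]
score = 100 * relative_l2(pred, target)  # scaled by 100, as in the tables
print(score)
```

Here the per-sample errors are 0 and 0.5, so the scaled score is 25.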

RNA Point Cloud Segmentation

We evaluate on RNA point cloud segmentation, where “xyz” and “hks” indicate whether the network input is the xyz coordinates or heat kernel signatures. Our method achieves the best segmentation accuracy.

OOD Generalization on AirfRANS

We report out-of-distribution (OOD) generalization on AirfRANS, measured by the relative error of the lift coefficient (C_L, %) and Spearman’s rank correlation (ρ_L, %), with all values scaled by 100. Our method achieves the best generalization performance in both the OOD Reynolds and OOD Angles settings.
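As an illustration of the two metrics, the sketch below computes a mean relative error and a Spearman rank correlation on made-up lift coefficients. The helper names and toy values are hypothetical, and the rank computation assumes no tied values (a library routine such as `scipy.stats.spearmanr` handles ties with average ranks).

```python
import numpy as np

def mean_relative_error(pred, true):
    """Mean of |pred - true| / |true| over all cases."""
    pred = np.asarray(pred, dtype=float)
    true = np.asarray(true, dtype=float)
    return float(np.mean(np.abs(pred - true) / np.abs(true)))

def _ranks(x):
    """Ranks 0..n-1 by ascending value; assumes no ties."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x)
    ranks = np.empty(len(x))
    ranks[order] = np.arange(len(x))
    return ranks

def spearman_no_ties(a, b):
    """Spearman correlation as the Pearson correlation of the ranks."""
    return float(np.corrcoef(_ranks(a), _ranks(b))[0, 1])

# Toy lift coefficients (illustrative values, not from the paper).
cl_true = [0.5, 1.0, 1.5, 2.0]
cl_pred = [0.55, 0.9, 1.5, 2.2]

rel_err_pct = 100 * mean_relative_error(cl_pred, cl_true)
rho_pct = 100 * spearman_no_ties(cl_pred, cl_true)
print(rel_err_pct, rho_pct)
```

In this toy case the predictions preserve the ordering of the true values, so the rank correlation is at its maximum even though the relative error is nonzero, which is why the two numbers are reported side by side.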

2D Darcy Flow with a Triangular Notch Domain

We further evaluate on 2D Darcy flow with a triangular notch domain. Relative L₂ error (%, ↓) is reported; † denotes our reproduction using the released code under a comparable parameter budget. Our method achieves the best performance on this singular-domain task.

BibTeX

@misc{xiao2026functionalattentionpairwiseaffinities,
      title={Functional Attention: From Pairwise Affinities to Functional Correspondences}, 
      author={Jiefang Xiao and Maolin Gao and Simon Weber and Guandao Yang and Daniel Cremers},
      year={2026},
      eprint={2605.31559},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2605.31559}, 
}
