
PoM

Official implementation of the Polynomial Mixer (CVPR'26 Findings)


Install

Build the package locally and install it with the following commands:


python setup.py sdist bdist_wheel
pip install .

Usage

PoM is a drop-in replacement for Multi-Head Attention. Its key hyperparameters are the polynomial degree $d$ and the expansion factor $k$ of each polynomial. If the original features have dimension $D$, the internal state representation has dimension $dkD$. Choose these values according to your compute/memory budget, knowing that it is empirically better to increase $kD$ than to increase $d$.
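As a quick sanity check on the budget arithmetic (plain Python, independent of the pom package; the helper name is illustrative):

```python
# Illustrative helper, not part of the pom API: internal state size is d*k*D.
def pom_state_dim(D: int, d: int, k: int) -> int:
    return d * k * D

# Two configurations with the same state budget d*k*D = 6144 for D = 768:
# (d=2, k=4) gives kD = 3072, (d=4, k=2) gives kD = 1536.
# Per the guideline above, the first (higher kD, lower d) is preferred.
print(pom_state_dim(768, 2, 4))  # 6144
print(pom_state_dim(768, 4, 2))  # 6144
```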

The code to use a PoM layer is simple:

from pom import PoM

pom = PoM(dimension, degree, expansion)

# residual self attention on token sequence X
X = X + pom(X)
# adding a residual feed-forward network as in transformers 
X = X + ffw(X)
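In a full model, PoM typically sits where attention would in a transformer block. Below is a minimal sketch, assuming a PoM layer maps (batch, tokens, dim) to (batch, tokens, dim) as in the call above; `MixerBlock`, the pre-norm placement, and the feed-forward sizing are illustrative choices, not part of the pom package:

```python
import torch
import torch.nn as nn


class MixerBlock(nn.Module):
    """Pre-norm residual block: a token mixer (e.g. a PoM layer) + feed-forward."""

    def __init__(self, mixer: nn.Module, dim: int, ffw_mult: int = 4):
        super().__init__()
        self.mixer = mixer  # any (B, N, D) -> (B, N, D) module, e.g. PoM
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.ffw = nn.Sequential(
            nn.Linear(dim, ffw_mult * dim),
            nn.GELU(),
            nn.Linear(ffw_mult * dim, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # residual token mixing, then residual feed-forward, as in the snippet above
        x = x + self.mixer(self.norm1(x))
        x = x + self.ffw(self.norm2(x))
        return x
```

A block would then be built as something like `MixerBlock(PoM(dimension, degree, expansion), dimension)`.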

Causal inference

If you have a block-causal mask, you can run iterative inference with a constant memory cost. This is the case, for example, in video generation, where the tokens of one frame depend only on the tokens of previous frames. The past can then be encoded in a hidden state carried from one frame to the next, as in the following example:

# forward pass: one hidden state per layer, carried across frames
state = [None] * n_layers
out = []
for f in range(n_frames):
    # tokens of frame f
    xf = x[:, f * n_tokens_p_frame:(f + 1) * n_tokens_p_frame, :]
    for l in range(n_layers):
        # self-attention replacement: PoM reads and updates the carried state
        x_sa, state[l] = pom.state_forward(xf, xf, state=state[l])
        xf = xf + x_sa
        # residual feed-forward network
        xf = xf + ffw(xf)
    out.append(xf)
x = torch.cat(out, dim=1)

Citing us

@inproceedings{picard24pom,
  title={{PoM}: {E}fficient Image and Video Generation with the Polynomial Mixer},
  author={David Picard and Nicolas Dufour and Lucas Degeorge and Arijit Ghosh and Davide Allegro and Tom Ravaud and Yohann Perron and Corentin Sautier and Zeynep Sonat Baltaci and Fei Meng and Syrine Kalleli and Marta López-Rauhut and Thibaut Loiseau and Ségolène Albouy and Raphael Baena and Elliot Vincent and Loic Landrieu},
  year={2026},
  booktitle={CVPR Findings},
}
