GitHub - yuhanghe01/XShapeEnc: Training-free 2D Shape Encoding

Training-free Spatially Grounded Geometric Shape Encoding (Technical Report)

Yuhang He
Microsoft Research

Usage from PyPI

Install the package from PyPI:
```
pip install xshapeenc
```

Shape Geometry Encoding: Encodes a 2-D shape mask into a fixed-length Zernike coefficient vector that captures the shape's geometry.

```python
import numpy as np
import xshapeenc

# Build a simple circular mask in polar coordinates (res × res)
res = 300
rho   = np.linspace(0, 1, res)
theta = np.linspace(0, 2 * np.pi, res)
r, t  = np.meshgrid(rho, theta)
circle_mask = (r <= 1.0).astype(np.float64)

# Encode: returns a list of `encode_len` floats
geo_vec = xshapeenc.encode_geometry(circle_mask, n_max=5, res=res, lam=0.6, encode_len=512)
print(f"Geometry encoding length: {len(geo_vec)}")  # 512

# Decode back to a mask and measure reconstruction error
encoder = xshapeenc.ShapeGeometryEncoder(n_max=5, res=res, lam=0.6, encode_len=512)
recon   = encoder.decode(encoder.encode(circle_mask))
mse     = float(np.mean((circle_mask - recon) ** 2))
print(f"Reconstruction MSE: {mse:.6f}")
```

If your mask is a regular Euclidean image (e.g. loaded from a PNG), pass mask_in_euclidean=True and the library handles the polar re-sampling automatically:

```python
from PIL import Image
import numpy as np
import xshapeenc

img  = np.array(Image.open("my_shape.png").convert("L")) / 255.0  # H × W float
vec  = xshapeenc.encode_geometry(img, encode_len=512, mask_in_euclidean=True)
```
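How the library performs this polar re-sampling is not described here. As a rough illustration only, the sketch below shows one way to do it with NumPy: nearest-neighbour sampling over the largest disk centred on the image centre. The helper `euclidean_to_polar` is hypothetical and is not part of xshapeenc; its interpolation and centring choices are assumptions.

```python
import numpy as np

def euclidean_to_polar(img, res=300):
    """Resample an H x W image onto a (theta, rho) grid of shape (res, res).

    Hypothetical helper: nearest-neighbour sampling over the largest disk
    centred on the image centre. Not the library's implementation.
    """
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    radius = min(cy, cx)

    rho = np.linspace(0.0, 1.0, res)
    theta = np.linspace(0.0, 2.0 * np.pi, res)
    r, t = np.meshgrid(rho, theta)  # rows index theta, columns index rho

    # Polar -> Cartesian pixel coordinates, rounded to the nearest pixel
    x = np.clip(np.rint(cx + radius * r * np.cos(t)).astype(int), 0, w - 1)
    y = np.clip(np.rint(cy + radius * r * np.sin(t)).astype(int), 0, h - 1)
    return img[y, x]

# A centred square (half-width 20 pixels) inside a 101 x 101 image
img = np.zeros((101, 101))
img[30:71, 30:71] = 1.0
polar = euclidean_to_polar(img, res=64)
print(polar.shape)
```

The output uses the same (theta, rho) layout as the hand-built circular mask in the geometry example, so small radii (first columns) fall inside the square and the outermost radius (last column) falls outside it.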

Pose encoding: Encodes a 1-D spatial pose vector (e.g. x/y coordinates, scale, orientation) into a set of Zernike radial coefficients.

```python
import xshapeenc

pose_vec   = [0.2, 0.5, 0.7, 0.9, 0.4]   # arbitrary length
pose_coeffs = xshapeenc.encode_pose(pose_vec, encode_len=128)
print(f"Pose encoding length: {len(pose_coeffs)}")  # 128
```

Joint geometry-pose encoding: Fuses shape geometry and spatial pose into a single vector. The beta parameter controls the emphasis:

beta=0 → pure pose
beta=1 → equal weight (default)
beta=2 → pure geometry

```python
import numpy as np
import xshapeenc

res = 300
rho   = np.linspace(0, 1, res)
theta = np.linspace(0, 2 * np.pi, res)
r, _  = np.meshgrid(rho, theta)
circle_mask = (r <= 1.0).astype(np.float64)

pose_vec  = [0.2, 0.5, 0.7, 0.9, 0.4]
joint_vec = xshapeenc.encode_geopose(circle_mask, pose_vec, encode_len=512, beta=1.0)
print(f"Joint encoding length: {len(joint_vec)}")  # 512
```
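The fusion rule behind beta is not spelled out in this section. Purely to illustrate the three anchor points listed for beta, here is a linear weighting that would satisfy them, assuming beta is restricted to [0, 2]. The helper `blend_weights` is hypothetical and says nothing about how xshapeenc actually combines the two encodings.

```python
def blend_weights(beta):
    """Map beta in [0, 2] to (geometry_weight, pose_weight).

    Hypothetical linear rule, chosen only to match the documented anchors:
    beta=0 -> pure pose, beta=1 -> equal weight, beta=2 -> pure geometry.
    """
    if not 0.0 <= beta <= 2.0:
        raise ValueError("beta must lie in [0, 2]")
    geometry_weight = beta / 2.0
    return geometry_weight, 1.0 - geometry_weight

for beta in (0.0, 1.0, 2.0):
    print(beta, blend_weights(beta))
```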

Quick Test

To set up the environment, run pip install -r requirements.txt.

Run quick_test.py to try shape geometry encoding, shape pose encoding and their joint encoding.

```
# Run all tests
python quick_test.py
# Run a specific subset with debug output
python quick_test.py --tests pose geometry --verbose
```

For XShapeCorpus dataset generation, see its README.md.

XShapeEnc Summary

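| Method | Arbitrary shape? | High frequency? | Training-free? | Task-agnostic? | Spatial context? |
|---|---|---|---|---|---|
| AngularSweep | ✗ | ✗ | ✓ | ✗ | ✓ |
| Poly2Vec | ✗ | ✗ | ✗ | ✗ | ✓ |
| Space2Vec | ✗ | ✗ | ✗ | ✗ | ✓ |
| DeepSDF | ✓ | ✓ | ✗ | ✓ | ✗ |
| 2DPE | ✗ | ✗ | ✗ | ✓ | ✓ |
| ShapeEmbed | ✓ | ✓ | ✗ | ✓ | ✗ |
| ShapeDist | ✓ | ✗ | ✓ | ✓ | ✗ |
| XShapeEnc (Ours) | ✓ | ✓ | ✓ | ✓ | ✓ |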

As shown in the table above, XShapeEnc encodes an arbitrary 2D geometric shape together with its spatial pose (e.g., x and y coordinates, scale) within a unified framework. It is training-free, task-agnostic and frequency-rich, and it lets the user control the relative emphasis on shape geometry versus shape pose. In summary, it supports the following encoding options:

| Option | Supported by XShapeEnc? |
|---|---|
| Rotation-invariant shape feature only | Yes |
| Rotation-variant shape feature only | Yes |
| Shape geometry encoding only | Yes |
| Shape pose encoding only | Yes |
| Joint shape geometry and pose encoding | Yes |

XShapeEnc Pipeline

The XShapeEnc encoding pipeline is shown in the figure above. It can encode shape geometry alone, shape pose alone, or geometry and pose jointly. The whole encoding framework is built on the Zernike basis. The shape pose must first be converted into a harmonic pose field so that it can be encoded with the Zernike basis.
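For readers unfamiliar with the Zernike basis, the sketch below evaluates the standard textbook form on the unit disk in plain Python: a radial polynomial R_n^m(rho) multiplied by cos(m·theta) for m ≥ 0 or sin(|m|·theta) for m < 0. The function names are illustrative, the functions are left unnormalised, and the normalisation and mode ordering used inside XShapeEnc may differ.

```python
from math import cos, factorial, sin

def zernike_radial(n, m, rho):
    """Radial polynomial R_n^m(rho) for 0 <= m <= n with n - m even."""
    if m < 0 or m > n or (n - m) % 2:
        raise ValueError("need 0 <= m <= n and n - m even")
    total = 0.0
    for k in range((n - m) // 2 + 1):
        coeff = ((-1) ** k * factorial(n - k)
                 / (factorial(k)
                    * factorial((n + m) // 2 - k)
                    * factorial((n - m) // 2 - k)))
        total += coeff * rho ** (n - 2 * k)
    return total

def zernike(n, m, rho, theta):
    """Unnormalised real Zernike function Z_n^m at polar point (rho, theta)."""
    radial = zernike_radial(n, abs(m), rho)
    return radial * (cos(m * theta) if m >= 0 else sin(-m * theta))

print(zernike_radial(2, 0, 0.5))  # R_2^0(rho) = 2*rho^2 - 1, so -0.5
print(zernike_radial(4, 0, 1.0))  # every R_n^m equals 1 at rho = 1, so 1.0
```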

Experiment

Inter-Shape Polygon-Polygon Topological Relation Classification

We run an experiment on spatially grounded topological relation classification for polygon pairs. As shown in the figure above, we classify five main relations: Disjoint, Within, Overlap, Touch and Equal. The polygons come from bird's-eye-view (BEV) building maps of Singapore and New York. The results are shown in the table below.
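To make the five relation labels concrete, here is a minimal sketch that assigns them to pairs of axis-aligned rectangles rather than arbitrary polygons. It is only an illustration of the label definitions, not the classifier or the labelling procedure used in the experiment. The helper `rect_relation` is hypothetical, and the case where the first rectangle strictly contains the second is folded into Overlap because the five-way scheme lists no separate label for it.

```python
def rect_relation(a, b):
    """Topological relation of rectangle a to rectangle b.

    Each rectangle is (xmin, ymin, xmax, ymax) with xmin < xmax and ymin < ymax.
    """
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    if tuple(a) == tuple(b):
        return "Equal"
    # No shared point at all
    if ax1 < bx0 or bx1 < ax0 or ay1 < by0 or by1 < ay0:
        return "Disjoint"
    # Only boundary points are shared: the interiors do not intersect
    if ax1 == bx0 or bx1 == ax0 or ay1 == by0 or by1 == ay0:
        return "Touch"
    # a lies entirely inside b
    if bx0 <= ax0 and ax1 <= bx1 and by0 <= ay0 and ay1 <= by1:
        return "Within"
    # Interiors intersect, and a is not inside b
    return "Overlap"

pairs = [
    ((0, 0, 1, 1), (2, 2, 3, 3)),
    ((0, 0, 1, 1), (1, 0, 2, 1)),
    ((1, 1, 2, 2), (0, 0, 3, 3)),
    ((0, 0, 2, 2), (1, 1, 3, 3)),
    ((0, 0, 1, 1), (0, 0, 1, 1)),
]
for a, b in pairs:
    print(a, b, rect_relation(a, b))
```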

| Method | Singapore | New York |
|---|---|---|
| PointSet | 0.670 | 0.564 |
| ShapeContexts | 0.581 | 0.525 |
| AngularSweep | 0.606 | 0.546 |
| Space2Vec | 0.706 | 0.632 |
| ResNet18 | 0.674 | 0.753 |
| ViT | 0.669 | 0.752 |
| CLIP | 0.700 | 0.779 |
| Poly2Vec | 0.702 | 0.684 |
| XShapeEnc (Ours) | 0.760 | 0.768 |

From this table, XShapeEnc achieves the best score on Singapore (0.760, ahead of Space2Vec at 0.706) and the second-best score on New York (0.768, behind CLIP at 0.779).

Invertibility Visualization

We test XShapeEnc's invertibility by inverting encodings back to shapes at several commonly used encoding lengths: 64, 128, 256, 512, 1024, 2048 and 4096. The result is shown in the figure above: a larger encoding length leads to higher-fidelity shape recovery.
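The same trend can be reproduced in miniature with a generic Zernike fit. The sketch below is not XShapeEnc's encoder: it assumes a plain least-squares projection onto unnormalised Zernike modes, and it uses the truncation order n_max as a stand-in for the encoding length. All function names are illustrative.

```python
from math import factorial

import numpy as np

def radial(n, m, rho):
    """Zernike radial polynomial R_n^m evaluated on an array of radii."""
    return sum(
        (-1) ** k * factorial(n - k)
        / (factorial(k) * factorial((n + m) // 2 - k) * factorial((n - m) // 2 - k))
        * rho ** (n - 2 * k)
        for k in range((n - m) // 2 + 1)
    )

def zernike_design_matrix(n_max, rho, theta):
    """One column per real Zernike mode with order n <= n_max."""
    cols = []
    for n in range(n_max + 1):
        for m in range(-n, n + 1, 2):
            ang = np.cos(m * theta) if m >= 0 else np.sin(-m * theta)
            cols.append(radial(n, abs(m), rho) * ang)
    return np.stack(cols, axis=1)

# Sample points of a Cartesian grid that fall inside the unit disk
xs = np.linspace(-1.0, 1.0, 65)
x, y = np.meshgrid(xs, xs)
inside = x**2 + y**2 <= 1.0
rho, theta = np.hypot(x, y)[inside], np.arctan2(y, x)[inside]

# Target shape: an off-centre square, restricted to the in-disk samples
mask = ((np.abs(x - 0.2) <= 0.4) & (np.abs(y) <= 0.4))[inside].astype(float)

errors = {}
for n_max in (2, 6, 10, 14):
    A = zernike_design_matrix(n_max, rho, theta)
    coeffs, *_ = np.linalg.lstsq(A, mask, rcond=None)  # "encode"
    recon = A @ coeffs                                  # "decode"
    errors[n_max] = float(np.mean((mask - recon) ** 2))
    print(f"n_max={n_max:2d}  coefficients={A.shape[1]:3d}  MSE={errors[n_max]:.4f}")
```

Because each larger basis contains the smaller ones, the least-squares reconstruction error cannot increase as more coefficients are kept.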

Shape Geometry Clustering Visualization

We test the inter- and intra-geometry discriminability of XShapeEnc's shape geometry encoding by running t-SNE on augmented versions of 4 complex shapes. As shown in the figure above, XShapeEnc maintains this discriminability, while most of the compared baselines lose it.

Cite XShapeEnc

```
@inproceedings{yuhheXShapeEnc2026,
  title={Training-free Spatially Grounded Geometric Shape Encoding (Technical Report)},
  author={He, Yuhang},
  booktitle={arXiv:2604.07522},
  year={2026}
}
```

Contact

Email: yuhanghe[at]microsoft.com

