Gaussian GRBM initialization #71
Conversation
@jquetzalcoatl IIRC, Hinton's recommendation pertains to zero-one-valued RBMs (bipartite, with hidden units). Would it make sense to translate the

@kevinchern The REM reference is for spin models, i.e., {-1,1}. Ultimately, the initialization pertains to whether the model is ergodic; in this sense, the support only sets an energy offset. I believe the main motivation for initializing with 0.01 in Hinton's guide is to start in a paramagnetic phase, which ties nicely with the REM/SK spin-glass model.
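For illustration, the 1/sqrt(N) scaling under discussion can be sketched as follows. This is a minimal PyTorch sketch under stated assumptions, not the PR's actual code; the names `init_grbm_weights`, `n_nodes`, and `n_edges` are hypothetical:

```python
import torch

def init_grbm_weights(n_nodes: int, n_edges: int, seed: int = 1234) -> torch.Tensor:
    """Sample edge weights from a zero-mean Gaussian with std 1/sqrt(N).

    Variance 1/N keeps the energy extensive in the number of nodes and
    starts the model in a paramagnetic regime (cf. the SK-model scaling
    discussed above). Hypothetical helper, not the repo's API.
    """
    gen = torch.Generator().manual_seed(seed)  # local generator for reproducibility
    std = 1.0 / n_nodes ** 0.5
    return torch.randn(n_edges, generator=gen) * std

w = init_grbm_weights(n_nodes=100, n_edges=450)
# For N = 100 the empirical std should be close to 1/sqrt(100) = 0.1.
```

With Hinton's fixed std of 0.01 the initial coupling scale is independent of the graph size; the 1/sqrt(N) choice instead ties it to the node count.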
Co-authored-by: Kevin Chern <32395608+kevinchern@users.noreply.github.com>
added release note
Tests are failing but otherwise LGTM. Thanks for the much-needed PR @jquetzalcoatl !!
@VolodyaCO offered to take a look at the tests
Any updates on this?
The reason for this test failing is very strange. Essentially, it is making sure that the DVAE forward (which does encode -> latent to discrete -> decode) matches encode -> latent_to_discrete -> decode, i.e., this is a pretty simple unit test:

```python
expected_latents = self.encoders[n_latent_dims](self.data)
expected_discretes = self.dvaes[n_latent_dims].latent_to_discrete(
    expected_latents, n_samples
)
expected_reconstructed_x = self.decoders[n_latent_dims](expected_discretes)
latents, discretes, reconstructed_x = self.dvaes[n_latent_dims].forward(
    x=self.data, n_samples=n_samples
)
assert torch.equal(reconstructed_x, expected_reconstructed_x)
assert torch.equal(discretes, expected_discretes)
assert torch.equal(latents, expected_latents)
```

Moreover, the fixtures are built as

```python
self.encoders = {i: Encoder(i) for i in latent_dims_list}
self.decoders = {i: Decoder(latent_features, input_features) for i in latent_dims_list}
self.dvaes = {i: DVAE(self.encoders[i], self.decoders[i]) for i in latent_dims_list}
```

so even if the encoders/decoders are updated in other tests (because of training), there should be a permanent tracking of the encoders/decoders in the dvaes.
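The "permanent tracking" works because Python stores object references: the DVAE holds the very same encoder/decoder objects as the fixture dicts, so training one updates the other. A minimal sketch with hypothetical stand-in classes (the real `Encoder`/`Decoder`/`DVAE` live in the repo):

```python
import torch
from torch import nn

# Hypothetical stand-ins, only to illustrate shared references.
class Encoder(nn.Module):
    def __init__(self, latent_dim):
        super().__init__()
        self.lin = nn.Linear(4, latent_dim)

    def forward(self, x):
        return self.lin(x)

class DVAE(nn.Module):
    def __init__(self, encoder):
        super().__init__()
        self.encoder = encoder  # stores a reference, not a copy

enc = Encoder(2)
dvae = DVAE(enc)

# Mutating the fixture's encoder is visible through the DVAE, because
# both names point at the same object:
with torch.no_grad():
    enc.lin.weight.zero_()

print(dvae.encoder is enc)                         # True
print(bool((dvae.encoder.lin.weight == 0).all()))  # True
```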
Found the issue and fixed it in a PR to @jquetzalcoatl's repo: jquetzalcoatl#1. Please approve, Javi; this would update the current PR and solve the issue. Took me a while to get the error!
Fix failing forward method unit tests
VolodyaCO
left a comment
I have definitely had to manually change the initialisation of GRBM weights whenever I use the GRBM. Thanks for this PR. I think it looks good to merge.
kevinchern
left a comment
@jquetzalcoatl I added a couple of typo fixes; can you accept them?
The remaining questions/comments are for @VolodyaCO and should be good to merge after.
> `Hinton's practical guide for RBM training<https://www.cs.toronto.edu/~hinton/absps/guideTR.pdf>`_, which recommends sampling
> weights from a Gaussian distribution with mean 0 and standard deviation 0.01 (for zero-one-valued RBMs).
> The scaling factor of :math:`1/\sqrt(N)` ensures that the energy functional remains extensive
> and initializes the GRBM in a paramagnetic regime, consistent with the `Sherrington-Kirkpatrick model<https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.35.1792>`_.
```diff
-and initializes the GRBM in a paramagnetic regime, consistent with the `Sherrington-Kirkpatrick model<https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.35.1792>`_.
+and initializes the GRBM in a paramagnetic regime, consistent with the `Sherrington-Kirkpatrick model <https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.35.1792>`_.
```
> features:
>   - |
>     Initialize ``GraphRestrictedBoltzmannMachine`` weights using Gaussian
>     random variables with standard deviation equal to :math:`1/\sqrt(N)`, where N
```diff
-random variables with standard deviation equal to :math:`1/\sqrt(N)`, where N
+random variables with standard deviation equal to :math:`1/\sqrt(N)`, where :math:`N`
```
> torch.manual_seed(1234)  # Set seed again to ensure that the sampling in the forward method
>                          # is the same as in the expected_discretes
> latents, discretes, reconstructed_x = self.dvaes[n_latent_dims].forward(
Sorry if I asked this in the first review for DVAE and forgot, but why does this test call the `forward` method explicitly? Calling the model directly is the recommended practice, as it runs several hooks on top of `forward`. @VolodyaCO
(this question/comment is unrelated to this PR)
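For context, the difference between `model(x)` and `model.forward(x)` can be illustrated with a generic PyTorch sketch (unrelated to the PR's actual models):

```python
import torch
from torch import nn

class Toy(nn.Module):
    """Tiny module used only to demonstrate __call__ vs. forward."""
    def __init__(self):
        super().__init__()
        self.lin = nn.Linear(3, 2)

    def forward(self, x):
        return self.lin(x)

model = Toy()
calls = []
# Forward hooks run inside nn.Module.__call__, after forward finishes.
model.register_forward_hook(lambda mod, inp, out: calls.append("hook"))

x = torch.randn(4, 3)
model(x)          # __call__: runs forward plus the registered hooks
model.forward(x)  # bypasses __call__, so the hook does not fire
print(len(calls)) # 1
```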
> torch.testing.assert_close(discretes, expected_discretes)
> torch.testing.assert_close(reconstructed_x, expected_reconstructed_x)

> assert torch.equal(reconstructed_x, expected_reconstructed_x)
@VolodyaCO was this the fix for the failing tests? Are these tests sensitive to the seed?
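A small sketch of why the re-seeding in the fix matters for bitwise comparisons of stochastic outputs (generic PyTorch, not the PR's test code):

```python
import torch

torch.manual_seed(1234)
a = torch.bernoulli(torch.full((16,), 0.5))  # stochastic draw consumes RNG state

b = torch.bernoulli(torch.full((16,), 0.5))  # a second draw sees fresh RNG state

torch.manual_seed(1234)                      # reset to the original state
c = torch.bernoulli(torch.full((16,), 0.5))  # replays the first draw exactly

print(torch.equal(a, c))  # True: same seed and op order give bitwise-equal samples
```

This is why `torch.equal` can pass only when the expected and actual paths start from identical RNG state; `torch.testing.assert_close` additionally tolerates small numerical differences, but not different random draws.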
> @@ -0,0 +1,8 @@
> ---
> features:
More of an upgrade than a feature, no?

```diff
-features:
+upgrade:
```
> - |
>   Initialize ``GraphRestrictedBoltzmannMachine`` weights using Gaussian
>   random variables with standard deviation equal to :math:`1/\sqrt(N)`, where N
>   denotes the number of nodes in the GRBM. The weight-initialization strategy is grounded in `Hinton's practical guide for RBM training <https://www.cs.toronto.edu/~hinton/absps/guideTR.pdf>`_, which recommends sampling weights from a Gaussian distribution with mean 0 and standard deviation 0.01 (for zero-one-valued RBMs). The scaling factor of :math:`1/\sqrt(N)` ensures that the energy functional remains extensive and initializes the GRBM in a paramagnetic regime, consistent with the `Sherrington-Kirkpatrick model<https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.35.1792>`_.
Better add some line breaks here, splitting the full paragraph on several lines.
grbm weights and biases initialization set to Gaussian N(0,1/number of nodes)
Hinton guide suggests 0.01 as standard deviation. See https://www.cs.toronto.edu/~hinton/absps/guideTR.pdf
Moreover, having it set to a Gaussian with this dependence on the number of nodes makes the energy extensive and initializes the GRBM in a paramagnetic phase similar to that described in the Random Energy Model paper:
https://journals.aps.org/prb/abstract/10.1103/PhysRevB.24.2613
See #48