**Is your feature request related to a problem? Please describe.**
This follows from the discussion of moving from Keras 2 to Keras 3; see #36 (comment).
**Describe the solution you'd like**
I have begun converting the code to PyTorch, but I have some questions about your original implementation. Feedback on this conversion would be much appreciated. My implementation can be found here:
https://github.com/aweaver1fandm/fetch/tree/pytorch
The implementation is not yet complete, but it has reached a point where I can't proceed further until I'm sure it's moving in the right direction. My specific questions:
- Is the transfer-training procedure (found in `fetch/transfer_train.py`) correct? I was unsure about the number of epochs to train for once a layer is unfrozen. I tried to implement what you wrote in the paper, but it was a bit vague about the unfreezing schedule; my current reading is sketched after this list.
- Does the data preparation (found in `_data_from_h5` in `fetch/puslar_data.py`) match what you originally did? A sketch of my reading also follows this list.
- In the paper you talk about unfreezing layers, but based on the figure there, the term "layer" is ambiguous and depends on the architecture of the CNN; it would mean something different for the DenseNet architectures than for the VGG architectures, for example.
  a. Is that a correct interpretation of what you meant?
  b. Do the custom unfreeze functions in `fetch/model.py` capture that meaning?
- Based on your code, it seems that Gaussian noise is only added to the freq data. Is that correct? (See the noise sketch below.)
- Do the model setups for transfer-training an individual CNN (see `TorchVisionModel` in `fetch/model.py`) look correct in terms of matching how you modified them?
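
To make the first and third questions concrete, here is a minimal sketch of the progressive-unfreezing schedule I implemented for the DenseNet case. Treating each dense block as one "layer" and training two epochs per unfreezing stage are my assumptions, not something I could pin down in the paper:

```python
import torch
import torch.nn as nn
from torchvision import models

model = models.densenet121(weights=models.DenseNet121_Weights.DEFAULT)
model.classifier = nn.Linear(model.classifier.in_features, 1)  # single pulsar logit

# Start with the whole backbone frozen so only the new head trains.
for p in model.features.parameters():
    p.requires_grad = False

# Assumption: for DenseNet, one "layer" = one dense block (with its
# transition), unfrozen from the top of the network downward.
blocks = ["denseblock4", "transition3", "denseblock3", "transition2"]

EPOCHS_PER_STAGE = 2  # assumption: the paper does not state this number

for name in blocks:
    for p in getattr(model.features, name).parameters():
        p.requires_grad = True  # unfreeze the next block
    optimizer = torch.optim.Adam(
        (p for p in model.parameters() if p.requires_grad), lr=1e-4
    )
    for epoch in range(EPOCHS_PER_STAGE):
        ...  # ordinary training pass over the training loader
```

In particular: is rebuilding the optimizer after each unfreeze what you did, or did you keep a single optimizer for the whole run?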
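
And this is roughly what my `_data_from_h5` does per candidate. The `data_freq_time` / `data_dm_time` dataset names are what I found in the published candidate files, but the per-image standardisation is my reading of your Keras code, so please correct me if it differs:

```python
import h5py
import numpy as np

def prepare_candidate(path: str) -> tuple[np.ndarray, np.ndarray]:
    """Load one candidate h5 file and standardise both inputs."""
    with h5py.File(path, "r") as f:
        freq_time = np.array(f["data_freq_time"], dtype=np.float32)
        dm_time = np.array(f["data_dm_time"], dtype=np.float32)

    def standardise(x: np.ndarray) -> np.ndarray:
        # Zero mean, unit variance per image (my assumption).
        x = x - x.mean()
        std = x.std()
        return x / std if std > 0 else x

    return standardise(freq_time), standardise(dm_time)
```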
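
For the Gaussian-noise question, this is the augmentation as I currently have it: noise on the freq-time input only, never on the DM-time input. The `sigma` value is a placeholder, since I could not find the value you used:

```python
import torch

def add_freq_noise(freq_time: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    """Additive zero-mean Gaussian noise, applied at training time only."""
    return freq_time + sigma * torch.randn_like(freq_time)
```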
For transfer training I purposely used an output layer with one neuron instead of the two you used. That seems like the right choice, because at this point it is a binary classification problem and the single output can be interpreted as the probability that the candidate is a pulsar.
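
Concretely, the head change described above amounts to the following; the helper names are mine, but the single-logit-plus-`BCEWithLogitsLoss` pattern is the standard PyTorch way to do binary classification:

```python
import torch
import torch.nn as nn

criterion = nn.BCEWithLogitsLoss()

def training_loss(model: nn.Module, batch: torch.Tensor,
                  targets: torch.Tensor) -> torch.Tensor:
    logits = model(batch).squeeze(1)           # (N, 1) -> (N,)
    return criterion(logits, targets.float())  # targets are 0/1 labels

def pulsar_probability(model: nn.Module, batch: torch.Tensor) -> torch.Tensor:
    # Sigmoid only at inference; training works on raw logits.
    return torch.sigmoid(model(batch).squeeze(1))
```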
I've pulled the training and test data from http://astro.phys.wvu.edu/fetch/. I do an 85/15 random split of the training data into training and validation sets, and use the test data for a single evaluation once the model is trained.
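
The split itself is nothing special; a minimal sketch with a placeholder dataset and an arbitrary fixed seed:

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Placeholder standing in for the FETCH training candidates.
full_train_set = TensorDataset(torch.randn(1000, 1, 256, 256),
                               torch.randint(0, 2, (1000,)))

n_train = int(0.85 * len(full_train_set))
train_set, val_set = random_split(
    full_train_set,
    [n_train, len(full_train_set) - n_train],
    generator=torch.Generator().manual_seed(42),  # any fixed seed
)
```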