Describe the bug(问题描述)
I am training the DIEN model on a dataset with around 20 categorical features and 5 user behavior columns, all of which are strings. I can save the model with `keras.save_model` in `.h5` format, but loading it with `keras.load_model` throws the following error:
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/keras/utils/generic_utils.py", line 668, in deserialize_keras_object
deserialized_obj = cls.from_config(
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/keras/engine/functional.py", line 670, in from_config
input_tensors, output_tensors, created_layers = reconstruct_from_config(
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/keras/engine/functional.py", line 1298, in reconstruct_from_config
process_node(layer, node_data)
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/keras/engine/functional.py", line 1244, in process_node
output_tensors = layer(input_tensors, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/keras/engine/base_layer_v1.py", line 764, in __call__
self._maybe_build(inputs)
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/keras/engine/base_layer_v1.py", line 2086, in _maybe_build
self.build(input_shapes)
File "/usr/local/lib/python3.8/dist-packages/deepctr/layers/sequence.py", line 255, in build
raise ValueError('A `AttentionSequencePoolingLayer` layer requires '
ValueError: A `AttentionSequencePoolingLayer` layer requires inputs of a 3 tensor with shape (None,1,embedding_size),(None,T,embedding_size) and (None,1) Got different shapes: [TensorShape([None, 15, 35]), TensorShape([None, 1, 35]), TensorShape([None, 1])]
This appears to be an issue in the model reconstruction performed by `load_model()`. More details in the Additional context section.
To Reproduce(复现步骤)
Model:
```python
model = DIEN(
    feature_columns,
    behavior_feat_list,
    dnn_hidden_units=[256, 128, 64],
    dnn_dropout=0.5,
    gru_type='AUGRU',
    use_negsampling=False,
    att_activation='sigmoid',
)
model.compile(Adam(learning_rate=1e-5), 'binary_crossentropy',
              metrics=['binary_crossentropy'])
```
Train and save model:
```python
history = model.fit(train_inputs,
                    train_labels,  # array of 'click' targets
                    verbose=True,
                    epochs=1,
                    batch_size=32,
                    validation_split=0.1)
save_model(
    model,
    'dien.h5',
    save_format='h5',
)
```
Load model (the part that raises the exception):
```python
from deepctr.layers import custom_objects
loaded_model = load_model('dien.h5', custom_objects)
```
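For completeness, a weights-only round trip should sidestep the reconstruction path entirely, since `load_weights` never calls `reconstruct_from_config`. A minimal sketch, assuming the architecture can be rebuilt identically in code (same `feature_columns` / `behavior_feat_list` as above):
```python
# Hedged sketch of a possible mitigation: rebuild the architecture in code and
# restore weights only, skipping load_model()'s graph reconstruction.
model.save_weights('dien_weights.h5')

rebuilt = DIEN(feature_columns, behavior_feat_list,
               dnn_hidden_units=[256, 128, 64], dnn_dropout=0.5,
               gru_type='AUGRU', use_negsampling=False,
               att_activation='sigmoid')
rebuilt.load_weights('dien_weights.h5')
```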
Operating environment(运行环境):
- python version: 3.8
- tensorflow version: 2.2-2.5 (TF >= 2.6 runs into numpy/TensorFlow compatibility issues)
- deepctr version: 0.9.3
- CUDA version: 11.7
- NVIDIA driver version: 515.65.01
- base docker image: `tensorflow/tensorflow:2.5.1-gpu`
Additional context
I could not try TensorFlow versions older than 2.2 due to driver compatibility issues, and DeepCTR also doesn't work with 2.6 <= TF <= 2.11.
My model has the following structure (from `model.summary()`):
```
genre (InputLayer) [(None, 1)] 0
__________________________________________________________________________________________________
hist_genre (InputLayer) [(None, 15)] 0
__________________________________________________________________________________________________
...
hash_28 (Hash) (None, 1) 0 genre[0][0]
__________________________________________________________________________________________________
hash_15 (Hash) (None, 1) 0 genre[0][0]
__________________________________________________________________________________________________
hash_3 (Hash) (None, 15) 0 hist_genre[0][0]
__________________________________________________________________________________________________
...
sparse_seq_emb_hist_genre (Embe multiple 404 hash_3[0][0]
hash_15[0][0]
hash_28[0][0]
__________________________________________________________________________________________________
concat (Concat) (None, 15, 35) 0 sparse_seq_emb_hist_category[0][0
sparse_seq_emb_hist_channel[0][0]
sparse_seq_emb_hist_episode[0][0]
sparse_seq_emb_hist_genre[0][0]
sparse_seq_emb_hist_part[0][0]
sparse_seq_emb_hist_feature0[0][
__________________________________________________________________________________________________
seq_length (InputLayer) [(None, 1)] 0
__________________________________________________________________________________________________
gru1 (DynamicGRU) (None, 15, 35) 7455 concat[0][0]
seq_length[0][0]
__________________________________________________________________________________________________
concat_2 (Concat) (None, 1, 35) 0 sparse_seq_emb_hist_category[2][0
sparse_seq_emb_hist_channel[2][0]
sparse_seq_emb_hist_episode[2][0]
sparse_seq_emb_hist_genre[2][0]
sparse_seq_emb_hist_part[2][0]
sparse_seq_emb_hist_feature0[2][
__________________________________________________________________________________________________
attention_sequence_pooling_laye (None, 1, 15) 10081 concat_2[0][0]
gru1[0][0]
seq_length[0][0]
...
```
Following the stack trace and a lot of extra debug messages, I believe `load_model` does not feed the inputs to the embedding layers in the same order as the original model when reconstructing it. Specifically, in `tensorflow/python/keras/engine/functional.py`, `reconstruct_from_config(config, custom_objects, created_layers)` builds each node as soon as all of its inputs are ready. As a result, a shared embedding layer in the reconstructed model, such as `sparse_seq_emb_hist_genre`, can end up emitting the embedded historical behavior sequence (input shape `(None, 15)`) before the embedded sparse feature (input shape `(None, 1)`), i.e. `output[0]` is the embedded behavior sequence instead of `output[1]`.
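To illustrate the node-ordering issue in isolation (a standalone sketch, not DeepCTR code; the layer name and sizes are made up):
```python
import tensorflow as tf

# A shared layer records one node per call, in call order, so the tensor at
# node index 0 belongs to whichever input was connected first. If
# deserialization replays these calls in a different order, the indices swap.
emb = tf.keras.layers.Embedding(100, 4, name='shared_emb')
hist_in = tf.keras.Input(shape=(15,), name='hist_genre')  # behavior sequence
sparse_in = tf.keras.Input(shape=(1,), name='genre')      # sparse feature

hist_out = emb(hist_in)      # first call  -> node 0, shape (None, 15, 4)
sparse_out = emb(sparse_in)  # second call -> node 1, shape (None, 1, 4)
assert emb.get_output_at(0) is hist_out
assert emb.get_output_at(1) is sparse_out
```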
The model also creates multiple hash layers for the same input when it initializes the key and query embeddings for the attention layer, due to the lack of a sharing mechanism. This likely does not cause a real issue, since the two hashes should be identical.
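For what it's worth, the underlying hash op is deterministic, which is why the duplicated layers should agree. A tiny sketch using the TF op that, as far as I can tell, DeepCTR's `Hash` layer wraps:
```python
import tensorflow as tf

# tf.strings.to_hash_bucket_fast is a pure function of the input and bucket
# count, so two hash layers with identical config yield identical buckets.
x = tf.constant([['comedy'], ['drama']])
a = tf.strings.to_hash_bucket_fast(x, num_buckets=1000)
b = tf.strings.to_hash_bucket_fast(x, num_buckets=1000)
tf.debugging.assert_equal(a, b)
```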
I was able to work around this by changing the order of the embedding look-up initialization in the DIEN model, `deepctr/models/sequence/dien.py`:
```python
keys_emb_list = embedding_lookup(embedding_dict, features, history_feature_columns,
                                 return_feat_list=history_fc_names, to_list=True)
dnn_input_emb_list = embedding_lookup(embedding_dict, features, sparse_feature_columns,
                                      mask_feat_list=history_feature_list, to_list=True)
# Move the query embeddings from being initialized first to being initialized last.
query_emb_list = embedding_lookup(embedding_dict, features, sparse_feature_columns,
                                  return_feat_list=history_feature_list, to_list=True)
```
This modification is definitely not safe. Please let me know if anyone has a better solution. Thank you in advance.
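If anyone tries the same patch, this is the sanity check I would suggest before trusting it (`sample_inputs` here is hypothetical; any batch in the model's input format works):
```python
import numpy as np

# Compare original and reloaded models on one batch; if load_model had wired
# the query/key embeddings in the wrong order, the predictions would diverge.
orig_preds = model.predict(sample_inputs, batch_size=256)
loaded_preds = loaded_model.predict(sample_inputs, batch_size=256)
assert np.allclose(orig_preds, loaded_preds, atol=1e-6)
```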