I am building a simple binary classification model that takes 30 time steps with 5 features each and should return the probability of a certain class.
I've run into the problem of the model's loss not decreasing over epochs. Looking at the model's summary and output, I found that instead of producing a single output number (the probability of a class), it produces an array of 30 probabilities, which probably prevents it from learning.
The model code is as follows:
import tensorflow as tf

print(train['inputs'].shape)  # (3511, 30, 5)
print(train['labels'].shape)  # (3511, 1)

lstm_model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(256),
    tf.keras.layers.Activation('relu'),
    tf.keras.layers.Dense(256),
    tf.keras.layers.Activation('relu'),
    tf.keras.layers.Dense(256),
    tf.keras.layers.Activation('relu'),
    tf.keras.layers.Dense(256),
    tf.keras.layers.Activation('relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

lstm_model.compile(
    loss="binary_crossentropy",
    optimizer=tf.optimizers.Adam(learning_rate=0.0001),
    metrics=["accuracy"])

history = lstm_model.fit(x=train['inputs'], y=train['labels'], epochs=1,
                         validation_data=(val['inputs'], val['labels']))
The number of layers doesn't seem to affect the issue (I added this many while trying to overfit the model).
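For what it's worth, the shape behavior reproduces with a single Dense layer in isolation (a minimal sketch, separate from my actual pipeline):

import numpy as np
import tensorflow as tf

# a lone Dense layer fed a (1, 30, 5) input already yields a (1, 30, 1) output
x = np.zeros((1, 30, 5), dtype=np.float32)
y = tf.keras.layers.Dense(1)(x)
print(y.shape)  # (1, 30, 1)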
The summary of the model is as follows:
Model: "sequential_108"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_297 (Dense) (1, 30, 256) 1536
_________________________________________________________________
activation_128 (Activation) (1, 30, 256) 0
_________________________________________________________________
dense_298 (Dense) (1, 30, 256) 65792
_________________________________________________________________
activation_129 (Activation) (1, 30, 256) 0
_________________________________________________________________
dense_299 (Dense) (1, 30, 256) 65792
_________________________________________________________________
activation_130 (Activation) (1, 30, 256) 0
_________________________________________________________________
dense_300 (Dense) (1, 30, 256) 65792
_________________________________________________________________
activation_131 (Activation) (1, 30, 256) 0
_________________________________________________________________
dense_301 (Dense) (1, 30, 1) 257
=================================================================
Total params: 199,169
Trainable params: 199,169
Non-trainable params: 0
As you can see, the output layer returns an array of shape (30, 1) per sample, and the same happens when I make actual predictions with the model.
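Concretely, predicting on a single sample looks like this (an illustrative call; the shape matches what I observe):

pred = lstm_model.predict(train['inputs'][:1])
print(pred.shape)  # (1, 30, 1), where I expected (1, 1)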
I've also tried reshaping the labels to (3511,) and (3511, 1, 1), but neither fixed the issue.
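The reshapes were along these lines (a sketch of what I tried, assuming the labels are NumPy arrays):

import numpy as np

labels_flat = np.reshape(train['labels'], (3511,))      # shape (3511,)
labels_3d = np.reshape(train['labels'], (3511, 1, 1))   # shape (3511, 1, 1)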
What could be causing this behavior?