Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Tensorflow classification model returns incorrect output shape

I am building a simple binary classification model that takes 30 timesteps with 5 features each and should return the probability of a certain class.

I've run into the problem of the model's loss not decreasing over epochs. Looking at the model's summary and output, I found that instead of producing a single number (the class probability), it produces an array of 30 probabilities, which probably prevents it from learning.

The model code is as follows:

import tensorflow as tf

print(train['inputs'].shape)  # (3511, 30, 5)
print(train['labels'].shape)  # (3511, 1)

lstm_model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(256),
    tf.keras.layers.Activation('relu'),
    tf.keras.layers.Dense(256),
    tf.keras.layers.Activation('relu'),
    tf.keras.layers.Dense(256),
    tf.keras.layers.Activation('relu'),
    tf.keras.layers.Dense(256),
    tf.keras.layers.Activation('relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])


lstm_model.compile(
    loss="binary_crossentropy",
    optimizer=tf.optimizers.Adam(learning_rate=0.0001),
    metrics=["accuracy"])

history = lstm_model.fit(x=train['inputs'], y=train['labels'], epochs=1,
                         validation_data=(val['inputs'], val['labels']))

The number of layers doesn't seem to affect the issue (I added this many while trying to overfit the model).

The summary of the model is as follows:

Model: "sequential_108"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_297 (Dense)            (1, 30, 256)              1536      
_________________________________________________________________
activation_128 (Activation)  (1, 30, 256)              0         
_________________________________________________________________
dense_298 (Dense)            (1, 30, 256)              65792     
_________________________________________________________________
activation_129 (Activation)  (1, 30, 256)              0         
_________________________________________________________________
dense_299 (Dense)            (1, 30, 256)              65792     
_________________________________________________________________
activation_130 (Activation)  (1, 30, 256)              0         
_________________________________________________________________
dense_300 (Dense)            (1, 30, 256)              65792     
_________________________________________________________________
activation_131 (Activation)  (1, 30, 256)              0         
_________________________________________________________________
dense_301 (Dense)            (1, 30, 1)                257       
=================================================================
Total params: 199,169
Trainable params: 199,169
Non-trainable params: 0

As you can see, the output layer returns an array of shape (30, 1) per sample, and the same happens when making actual predictions with the model.

I've also tried reshaping the labels to (3511,) and (3511, 1, 1), but neither fixed the issue.

What could be causing this behavior?

question from:https://stackoverflow.com/questions/65926784/tensorflow-classification-model-returns-incorrect-output-shape


1 Answer


I assume you want to use LSTM layers, since you are working with 3-dimensional timestep input.
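
A minimal sketch (input shape taken from the question, everything else illustrative) shows why the all-Dense model keeps the timestep axis: a Dense layer only transforms the last axis of its input, so a (30, 5) sample comes out as (30, 1) instead of a single probability.

```python
import tensorflow as tf

# Dense only mixes the last (feature) axis; the 30-timestep axis passes through.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(30, 5)),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])

print(model.output_shape)  # (None, 30, 1) -- one probability per timestep
```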

All you need to do is set return_sequences=False in the last LSTM layer, so only the final timestep's output is passed on to the Dense layers, for example:

lstm_model = tf.keras.models.Sequential([
    tf.keras.layers.LSTM(5, return_sequences=True, dropout=0.2, recurrent_dropout=0.2),
    tf.keras.layers.LSTM(10, return_sequences=True, activation='relu'),
    tf.keras.layers.LSTM(64, return_sequences=False, activation='relu'),
    tf.keras.layers.Dense(256),
    tf.keras.layers.Activation('relu'),
    tf.keras.layers.Dense(256),
    tf.keras.layers.Activation('relu'),
    tf.keras.layers.Dense(256),
    tf.keras.layers.Activation('relu'),
    tf.keras.layers.Dense(256),
    tf.keras.layers.Activation('relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
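
To see how this collapses the extra axis, here's a stripped-down sketch (the layer size is illustrative, not the one above): with return_sequences=False the LSTM returns only its final timestep's output, so the model produces one probability per sample.

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(30, 5)),
    # return_sequences=False -> only the final timestep's output is returned,
    # dropping the 30-step axis before the Dense head.
    tf.keras.layers.LSTM(64, return_sequences=False),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])

print(model.output_shape)  # (None, 1) -- one probability per sample
```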

Some explanation of how shapes work in LSTM layers is provided, e.g., in this question:

How to stack multiple lstm in keras?

