The setup
I am using Python 3.6 and TF 2.4.0 on an Azure DSVM STANDARD_NC6 (6 cores, 56 GB RAM, 380 GB disk) with 1 GPU.
The parameters/model
I have training data xtrain with shape (4599, 124, 124, 3) (checked via print(xtrain.shape)), and ytrain / yval are one-hot encoded (categorical).
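For reference, the labels were one-hot encoded along these lines (ytrain_raw / yval_raw are my placeholder names for the integer class labels):

from tensorflow.keras.utils import to_categorical

# two classes -> arrays of shape (n_samples, 2)
ytrain = to_categorical(ytrain_raw, num_classes=2)
yval = to_categorical(yval_raw, num_classes=2)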
I use a standard ImageDataGenerator:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    zoom_range=0.1,
    rotation_range=25,
    width_shift_range=0.1,
    height_shift_range=0.1,
    shear_range=0.15,
    horizontal_flip=True,
    fill_mode="nearest",
)
datagen.fit(xtrain)
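Pulling one batch from the generator shows that it yields images at the original 124×124 size (flow() augments but never resizes):

# batches keep the input's spatial size
xb, yb = next(datagen.flow(xtrain, ytrain, batch_size=32))
print(xb.shape, yb.shape)  # (32, 124, 124, 3) (32, 2)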
And my model is the MobileNetV2 base with my own head:
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.layers import AveragePooling2D, Dense, Dropout, Flatten, Input
from tensorflow.keras.models import Model

baseModel = MobileNetV2(
    weights="imagenet",
    include_top=False,
    input_tensor=Input(shape=(224, 224, 3)),
    # input_shape=(224, 224, 3),
)

headModel = baseModel.output
headModel = AveragePooling2D(pool_size=(7, 7))(headModel)
headModel = Flatten(name="flatten")(headModel)
headModel = Dense(64, activation="relu")(headModel)
headModel = Dropout(0.5)(headModel)
headModel = Dense(2, activation="softmax")(headModel)
model = Model(inputs=baseModel.input, outputs=headModel)

# freeze the base so only the head is trained
for layer in baseModel.layers:
    layer.trainable = False

model.compile(loss="mse", optimizer="adam", metrics=["accuracy"])
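The base network is built on a 224×224 input tensor, which the model also reports:

print(model.input_shape)  # (None, 224, 224, 3)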
When I now fit the model:

Batch_Size = 1
h = model.fit(
    datagen.flow(xtrain, ytrain, batch_size=Batch_Size),
    steps_per_epoch=len(xtrain) // Batch_Size,
    validation_data=(xval, yval),
    validation_steps=len(xval) // Batch_Size,
    epochs=EPOCHS,
    callbacks=[model_checkpoint_callback, Board],
)
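EPOCHS and the callbacks are defined earlier, roughly like this (the exact value, path, and settings are placeholders):

from tensorflow.keras.callbacks import ModelCheckpoint, TensorBoard

EPOCHS = 25  # placeholder value
model_checkpoint_callback = ModelCheckpoint("best_model.h5", save_best_only=True)
Board = TensorBoard(log_dir="./logs")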
The error
I get errors that are essentially the same but change with batch size and loss function.

With batch_size=1 and loss="mse", "categorical_crossentropy", or others, the model trains but throws the following error at the end of the epoch:
ValueError: Input 0 is incompatible with layer model_2: expected shape=(None, 224, 224, 3), found shape=(1, 124, 124, 3)
If I use a batch size above 1, e.g., 32, with loss="categorical_crossentropy", the error is thrown before training starts:
InvalidArgumentError: Incompatible shapes: [32] vs. [0]
[[node Equal (defined at :12) ]] [Op:__inference_train_function_65107]
With loss="mse" instead:
InvalidArgumentError: Incompatible shapes: [0,2] vs. [32,2]
[[node gradient_tape/mean_squared_error/BroadcastGradientArgs (defined at :12) ]]
[Op:__inference_train_function_81958]
If I change the number of hidden units in the last Dense layer, the error changes to match, e.g.
...
headModel = Dense(5, activation="softmax")(headModel)
results in
InvalidArgumentError: Incompatible shapes: [0,5] vs. [32,2]
Apparently the correct input shape gets lost somewhere, in particular the batch size (the second dimension follows the Dense layer's hidden units). Does anyone have an idea?

Thanks!
I checked many answers in this old thread on GitHub: https://github.com/kuza55/keras-extras/issues/7 but could not find a solution there.