Model.fit() error. Someone please help me fix this error. I am not able to figure it out #20444

Open
Israh-Abdul opened this issue Nov 4, 2024 · 2 comments

@Israh-Abdul

I'm building a capsule network in TensorFlow for binary classification using a custom CapsuleLayer. My model and associated components are as follows:

class CapsuleLayer(layers.Layer):
    def __init__(self, num_capsule, dim_capsule, routings=3, **kwargs):
        super(CapsuleLayer, self).__init__(**kwargs)
        self.num_capsule = num_capsule
        self.dim_capsule = dim_capsule
        self.routings = routings

    def build(self, input_shape):
        self.kernel = self.add_weight(name='capsule_kernel',
                                      shape=(input_shape[-1], self.num_capsule * self.dim_capsule),
                                      initializer='glorot_uniform',
                                      trainable=True)

    def call(self, inputs):
        inputs_hat = K.dot(inputs, self.kernel)
        inputs_hat = K.reshape(inputs_hat, (-1, self.num_capsule, self.dim_capsule))
        b = K.zeros_like(inputs_hat[:, :, 0])

        for i in range(self.routings):
            c = tf.nn.softmax(b, axis=1)
            o = squash(tf.reduce_sum(c[..., None] * inputs_hat, 1))
            if i < self.routings - 1:
                b += tf.reduce_sum(inputs_hat * o[:, None, :], -1)
        return o

def squash(vectors, axis=-1):
    s_squared_norm = K.sum(K.square(vectors), axis, keepdims=True)
    scale = s_squared_norm / (1 + s_squared_norm) / K.sqrt(s_squared_norm + K.epsilon())
    return scale * vectors

# Network architecture and margin loss
def CapsNet(input_shape):
    inputs = Input(shape=input_shape)
    x = Conv2D(64, (9, 9), strides=1, activation='relu', padding='valid')(inputs)
    x = Conv2D(128, (9, 9), strides=2, activation='relu', padding='valid')(x)
    x = Reshape((-1, 8))(x)
    primary_caps = CapsuleLayer(num_capsule=10, dim_capsule=8, routings=3)(x)
    digit_caps = CapsuleLayer(num_capsule=2, dim_capsule=16, routings=3)(primary_caps)
    out_caps = Lambda(lambda z: K.sqrt(K.sum(K.square(z), -1)))(digit_caps)
    return models.Model(inputs, out_caps)

def margin_loss(y_true, y_pred):
    m_plus, m_minus, lambda_val = 0.9, 0.1, 0.5
    left = tf.square(tf.maximum(0., m_plus - y_pred))
    right = tf.square(tf.maximum(0., y_pred - m_minus))
    return tf.reduce_mean(tf.reduce_sum(y_true * left + lambda_val * (1 - y_true) * right, axis=-1))

When training, I receive this error:
ValueError: Cannot squeeze axis=-1, because the dimension is not 1.

I've set class_mode='categorical' in the ImageDataGenerator flow:
train_generator = train_datagen.flow_from_directory(train_dir, target_size=(224, 224),
color_mode='grayscale', batch_size=64, class_mode='categorical')
I am using this model to classify an image dataset into 2 classes. Please help!
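One way to narrow this down is to trace the tensor shapes by hand. The sketch below is plain Python arithmetic, not a run of the model. It assumes 224x224 grayscale inputs, a batch of 64, and that the reshape to `(-1, num_capsule, dim_capsule)` inside `CapsuleLayer.call` folds every leading axis, including the batch axis, into the first dimension. The helper names `conv_out` and `trace_shapes` are hypothetical and exist only for this illustration.

```python
def conv_out(size, kernel, stride):
    """Spatial output size of a Conv2D layer with padding='valid'."""
    return (size - kernel) // stride + 1


def trace_shapes(batch=64, height=224, width=224):
    """Follow the shapes through CapsNet by hand (assumed, not measured)."""
    # Two 9x9 'valid' convolutions with strides 1 and 2.
    h = conv_out(conv_out(height, 9, 1), 9, 2)
    w = conv_out(conv_out(width, 9, 1), 9, 2)
    channels = 128

    # Reshape((-1, 8)) keeps the batch axis and groups features into 8-vectors.
    n_vectors = h * w * channels // 8
    after_reshape = (batch, n_vectors, 8)

    # In CapsuleLayer.call, the -1 in the reshape absorbs batch * n_vectors,
    # so the batch axis is no longer separate from here on.
    rows = batch * n_vectors
    primary_caps = (rows, 8)    # `o` returned by the first capsule layer
    digit_caps = (rows, 16)     # `o` returned by the second capsule layer
    model_output = (rows,)      # vector norm over the last axis

    return {
        "conv_output": (batch, h, w, channels),
        "after_reshape": after_reshape,
        "primary_caps": primary_caps,
        "digit_caps": digit_caps,
        "model_output": model_output,
        "labels": (batch, 2),   # one-hot labels from class_mode='categorical'
    }


if __name__ == "__main__":
    for name, shape in trace_shapes().items():
        print(f"{name:>14}: {shape}")
```

Under these assumptions the model output is rank 1 and its length is no longer the batch size, while the labels have shape (64, 2). That mismatch would be consistent with a failed squeeze on the last axis, but the full traceback is still needed to confirm where the error is raised.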

@VadisettyRahul

Hi @Israh-Abdul @mehtamansi29

Some possibilities are:

This can happen if the network output does not have the expected shape for binary classification.

The model output needs one unit per class, so two units here. Adding a Dense layer with softmax activation at the end of the network would give an output compatible with class_mode='categorical' in the ImageDataGenerator flow.

The error can also arise from the margin_loss function. Check that it computes the loss over both classes and that y_true and y_pred have the same shape.
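The shape requirement on the loss can be made explicit with a small check. The following is a minimal sketch in plain Python on nested lists, not the TensorFlow implementation from the issue: it mirrors the arithmetic of `margin_loss` (sum over classes, mean over the batch) and raises a readable error when the labels and predictions disagree in shape. The name `margin_loss_checked` is hypothetical.

```python
def margin_loss_checked(y_true, y_pred, m_plus=0.9, m_minus=0.1, lambda_val=0.5):
    """Margin loss over (batch, num_classes) nested lists with shape checks."""
    if len(y_true) != len(y_pred):
        raise ValueError(
            f"batch size mismatch: {len(y_true)} labels vs {len(y_pred)} predictions"
        )

    total = 0.0
    for t_row, p_row in zip(y_true, y_pred):
        if len(t_row) != len(p_row):
            raise ValueError(
                f"class count mismatch: {len(t_row)} labels vs {len(p_row)} predictions"
            )
        for t, p in zip(t_row, p_row):
            left = max(0.0, m_plus - p) ** 2    # penalty when the true class is weak
            right = max(0.0, p - m_minus) ** 2  # penalty when a wrong class is strong
            total += t * left + lambda_val * (1 - t) * right

    # Sum over classes happened in the inner loop; average over the batch here.
    return total / len(y_true)


if __name__ == "__main__":
    labels = [[1, 0], [0, 1]]
    predictions = [[0.5, 0.5], [0.1, 0.9]]
    print(margin_loss_checked(labels, predictions))
```

With one-hot labels of shape (batch, 2), the predictions passed to the loss need the same (batch, 2) shape; anything else fails the check before any arithmetic runs.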

@dhantule added the keras-team-review-pending label (Pending review by a Keras team member) on Nov 18, 2024
@hertschuh
Collaborator

Hi @Israh-Abdul ,

Thanks for the report. In order for us to be able to debug this, we'll need a few more details:

  • Can you provide the full traceback in addition to the ValueError? Also, please add keras.config.disable_traceback_filtering() at the beginning.
  • Can you provide training data (even fake data generated with np.random) so that we can see the shapes?

Ideally, you would put the whole code in a colab or gist that runs and reproduces the error.
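As a starting point for such a reproduction, fake data with the same shapes as the generator output might look like the sketch below. It assumes NumPy is installed and that `flow_from_directory` with `color_mode='grayscale'` and `class_mode='categorical'` yields images of shape (batch, 224, 224, 1) and one-hot labels of shape (batch, 2). The function name `make_fake_batch` is hypothetical.

```python
import numpy as np

# In the real script, call keras.config.disable_traceback_filtering() first
# so that the full traceback is shown.


def make_fake_batch(batch_size=64, image_size=224, num_classes=2, seed=0):
    """Random images and one-hot labels shaped like the generator output."""
    rng = np.random.default_rng(seed)

    # Grayscale images with values in [0, 1): (batch, height, width, 1).
    images = rng.random((batch_size, image_size, image_size, 1), dtype=np.float32)

    # Random class ids turned into one-hot rows: (batch, num_classes).
    class_ids = rng.integers(0, num_classes, size=batch_size)
    labels = np.eye(num_classes, dtype=np.float32)[class_ids]
    return images, labels


if __name__ == "__main__":
    x, y = make_fake_batch()
    print("images:", x.shape, x.dtype)
    print("labels:", y.shape, y.dtype)
```

Passing such arrays to `model.fit(x, y)` in place of the generator should show whether the error depends on the data pipeline or on the model itself.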

My hunch is that the labels don't have the right shape.

Thanks!

@hertschuh self-assigned this on Nov 20, 2024
@hertschuh removed the keras-team-review-pending label (Pending review by a Keras team member) on Nov 20, 2024