
The loss becomes negative from positive values during the training loop #19638

Open
yijianSU22 opened this issue Apr 29, 2024 · 5 comments
Assignees
Labels
stale stat:awaiting response from contributor type:support User is asking for help / asking an implementation question. Stackoverflow would be better suited.

Comments

@yijianSU22

Hi, I just ran a U-Net model on a training set, using a combined Dice and cross-entropy loss as the loss function, but I found that the loss value is not normal: it gradually became negative, as below:
2024-04-27 22:54:02.697477: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2617/2617 [==============================] - 9485s 4s/step - loss: 0.3995 - accuracy: 0.0302 - val_loss: 0.3482 - val_accuracy: 0.0182
Epoch 2/20
2617/2617 [==============================] - 9453s 4s/step - loss: 0.1805 - accuracy: 0.2205 - val_loss: 0.1516 - val_accuracy: 0.9400
Epoch 3/20
2617/2617 [==============================] - 9428s 4s/step - loss: 0.0435 - accuracy: 0.9362 - val_loss: 0.1033 - val_accuracy: 0.9482
Epoch 4/20
2617/2617 [==============================] - 9412s 4s/step - loss: -0.0293 - accuracy: 0.9398 - val_loss: 0.0141 - val_accuracy: 0.9459
Epoch 5/20
2617/2617 [==============================] - 9444s 4s/step - loss: -0.0844 - accuracy: 0.9420 - val_loss: -0.0150 - val_accuracy: 0.9548
Epoch 6/20
2617/2617 [==============================] - 9436s 4s/step - loss: -0.1212 - accuracy: 0.9440 - val_loss: -0.0363 - val_accuracy: 0.9599
Epoch 7/20
2617/2617 [==============================] - 9397s 4s/step - loss: -0.1537 - accuracy: 0.9457 - val_loss: -0.0193 - val_accuracy: 0.9538
Epoch 8/20
2617/2617 [==============================] - 9305s 4s/step - loss: -0.1777 - accuracy: 0.9467 - val_loss: -0.0149 - val_accuracy: 0.9526
Epoch 9/20
2617/2617 [==============================] - 8968s 3s/step - loss: -0.2004 - accuracy: 0.9473 - val_loss: -0.0841 - val_accuracy: 0.9576
Epoch 10/20
2617/2617 [==============================] - 8787s 3s/step - loss: -0.2210 - accuracy: 0.9480 - val_loss: -0.0822 - val_accuracy: 0.9571
Epoch 11/20
2617/2617 [==============================] - 8794s 3s/step - loss: -0.2337 - accuracy: 0.9486 - val_loss: -0.0837 - val_accuracy: 0.9566
Epoch 12/20
2617/2617 [==============================] - 8809s 3s/step - loss: -0.2521 - accuracy: 0.9492 - val_loss: -0.0856 - val_accuracy: 0.9615
Epoch 13/20
2617/2617 [==============================] - 8804s 3s/step - loss: -0.2688 - accuracy: 0.9500 - val_loss: -0.1012 - val_accuracy: 0.9594
Epoch 14/20
2617/2617 [==============================] - 8807s 3s/step - loss: -0.2867 - accuracy: 0.9508 - val_loss: -0.0994 - val_accuracy: 0.9599
Epoch 15/20
2617/2617 [==============================] - 8721s 3s/step - loss: -0.2949 - accuracy: 0.9511 - val_loss: -0.1008 - val_accuracy: 0.9605
Epoch 16/20
2617/2617 [==============================] - 8684s 3s/step - loss: -0.3071 - accuracy: 0.9515 - val_loss: -0.0705 - val_accuracy: 0.9564
Epoch 17/20
349/2617 [===>..........................] - ETA: 37:27 - loss: -0.0398 - accuracy: 0.9501

and this is my loss function:
class categorical_dicePcrossentropy_weight(tf.keras.losses.Loss):
    def __init__(self, class_weight, lamda=0.5):
        super().__init__()
        self.lamda = lamda
        self.weight = class_weight

    def call(self, y_true, y_pred):
        smooth = tf.constant(1.e-5, tf.float32)

        y_true = tf.cast(y_true, tf.float32)
        y_pred = tf.cast(y_pred, tf.float32)

        intersection = tf.math.reduce_sum(y_pred * y_true, axis=(1, 2, 3))
        union = tf.math.reduce_sum(y_pred + y_true, axis=(1, 2, 3))
        dice_coef = tf.math.reduce_sum(2 * (intersection + smooth) / (union + smooth), axis=0)

        loss1 = tf.math.reduce_mean(self.weight * dice_coef)

        epsilon = 1.e-5
        output = y_pred / tf.math.reduce_sum(y_pred, axis=-1, keepdims=True)
        output = tf.clip_by_value(output, epsilon, 1 - epsilon)

        loss = y_true * tf.math.log(output)

        loss = tf.math.reduce_mean(loss, axis=(1, 2, 3))
        loss = tf.math.reduce_mean(loss, axis=0)
        loss2 = tf.math.reduce_mean(self.weight * loss)

        total_loss = (1 - self.lamda) * (1 - loss1) + self.lamda * loss2

        return total_loss

I don't know why. Is there a way to resolve it?

@td-jakubl
Contributor

  1. For small values of output, tf.math.log(output) is negative, so y_true * tf.math.log(output) without a leading minus sign drives the loss negative.
  2. tf.clip_by_value() does not handle NaN: if output contains NaN, then tf.clip_by_value(output, epsilon, 1 - epsilon) also contains NaN, if I'm not mistaken.
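Both points can be illustrated with plain NumPy, which (as far as I can tell) mirrors the behavior of tf.math.log and tf.clip_by_value for these cases:

```python
import numpy as np

# 1. log of a probability in (0, 1) is negative, so y_true * log(output)
#    without a leading minus sign produces a negative loss term.
probs = np.array([0.1, 0.5, 0.9])
print(np.log(probs))  # all entries are negative

# 2. Clipping does not repair NaNs: comparisons with NaN are False,
#    so NaN passes through the clip unchanged.
x = np.array([np.nan, 0.5, 2.0])
clipped = np.clip(x, 1e-5, 1 - 1e-5)
print(clipped)  # first entry is still nan
```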

@yijianSU22
Author

  1. For small values of output, tf.math.log(output) is negative, so y_true * tf.math.log(output) without a leading minus sign drives the loss negative.
  2. tf.clip_by_value() does not handle NaN: if output contains NaN, then tf.clip_by_value(output, epsilon, 1 - epsilon) also contains NaN, if I'm not mistaken.

Thanks very much, yes, you're right. It should be -y_true * tf.math.log(output) here.
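With that sign fix, the cross-entropy term stays nonnegative. A minimal NumPy sketch (cross_entropy_term is a hypothetical helper mirroring the normalize/clip/log steps of the custom loss, without the class weights):

```python
import numpy as np

def cross_entropy_term(y_true, y_pred, epsilon=1e-5):
    # Normalize and clip as in the custom loss, then apply the
    # corrected sign: categorical cross-entropy is -mean(y * log(p)).
    output = y_pred / np.sum(y_pred, axis=-1, keepdims=True)
    output = np.clip(output, epsilon, 1 - epsilon)
    return np.mean(-y_true * np.log(output))

y_true = np.array([[0.0, 1.0], [1.0, 0.0]])
y_pred = np.array([[0.2, 0.8], [0.9, 0.1]])
print(cross_entropy_term(y_true, y_pred))  # positive value
```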

@yijianSU22
Author


Hi, sorry to bother you again. I don't know why, but when I used tf.keras.losses.CategoricalCrossentropy() to compute the CE, the loss value still becomes negative during the training loop.

@SuryanarayanaY
Collaborator

Hi @yijianSU22 ,

The op tf.math.log(x) outputs -inf if x is 0 and nan if x < 0. You can clip -inf values to a value you want using tf.clip_by_value, but for nan, clip_by_value also returns nan. Since this is a custom loss function, maybe you need to recheck it.
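A quick NumPy sketch of this behavior (np.log and np.clip appear to match their TensorFlow counterparts here). Note that clipping the *input* to log, rather than its output, avoids both the -inf and the nan:

```python
import numpy as np

with np.errstate(divide="ignore", invalid="ignore"):
    print(np.log(0.0))   # -inf
    print(np.log(-1.0))  # nan

# Clip before taking the log so every input is a small positive number.
x = np.array([-1.0, 0.0, 0.5])
safe = np.log(np.clip(x, 1e-5, None))
print(safe)  # finite everywhere
```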

@SuryanarayanaY SuryanarayanaY added type:support User is asking for help / asking an implementation question. Stackoverflow would be better suited. stat:awaiting response from contributor labels May 6, 2024

This issue is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.

@github-actions github-actions bot added the stale label May 21, 2024

3 participants