
[BUG Report + fix]: Earlystopping does not stop according to "_min_delta" and "_baseline" parameters #1178

Closed
GaijinOtohp opened this issue Sep 20, 2023 · 4 comments

Comments

@GaijinOtohp
Contributor

Description

  1. [BUG Report]: EarlyStopping should trigger stopping of training when the loss does not improve by at least _min_delta, but currently it never does.
  2. [Suggestion]: I also suggest that training should stop once the loss reaches _baseline, which should serve as the threshold for the desired loss value.

Reproduction Steps

            var layers = tf.keras.layers;

            Tensor input = tf.keras.Input((-1, 1), name: "input_place_holder");
            Tensor hiddenLayer1 = layers.Dense(7, activation: "linear").Apply(input);
            Tensor output = layers.Dense(2, activation: "linear").Apply(hiddenLayer1);
            var model = tf.keras.Model(input, output);

            Tensorflow.NumPy.NDArray inputs = new Tensorflow.NumPy.NDArray(new float[,] { { -5 }, { -4 }, { -3 }, { -2 }, { -1 }, { 0 }, { 1 }, { 2 }, { 3 }, { 4 }, { 5 } });
            Tensorflow.NumPy.NDArray outputs = new Tensorflow.NumPy.NDArray(new float[,] { { 0, 2 }, { 1, 4 }, { 2, 6 }, { 3, 8 }, { 4, 10 }, { 5, 12 }, { 6, 14 }, { 7, 16 }, { 8, 18 }, { 9, 20 }, { 10, 22 } });

            List<Tensorflow.NumPy.NDArray> weights = model.get_weights();

            var callbackPars = new Tensorflow.Keras.Callbacks.CallbackParams() { Model = model };
            var earlystoppingCB = new Tensorflow.Keras.Callbacks.EarlyStopping(parameters: callbackPars,
                                                                    monitor: "loss",
                                                                    patience: 3,
                                                                    mode: "min",
                                                                    restore_best_weights: true,
                                                                    //baseline: 0.001f,
                                                                    min_delta: 0.0001f);

            Tensorflow.Keras.Engine.ICallback historyCB;
            float learningRate = 0.1f;
            for (int i = 0; i < 100; i++)
            {
                model.compile(optimizer: tf.keras.optimizers.SGD(learningRate), loss: tf.keras.losses.MeanSquaredError());

                historyCB = model.fit(inputs, outputs, epochs: 1000, callbacks: new List<Tensorflow.Keras.Engine.ICallback>() { earlystoppingCB });

                if (historyCB.history["loss"][historyCB.history["loss"].Count - 1] < 0.5f)
                    break;

                model.set_weights(weights);
                learningRate /= 10f;
            }

            Tensor results0 = model.predict(tf.constant(new float[,] { { -8 } }));

Known Workarounds

Suggested fix to the bug

  1. The fix for the bug is at line 134 of the file Tensorflow.Keras.Callbacks.EarlyStopping.cs:
    bool less_op = (monitor_value - _min_delta) < reference_value;
    ---> bool less_op = (monitor_value + _min_delta) < reference_value;

Why:

Since _min_delta is always positive, and monitor_value is only a slight improvement over reference_value, (monitor_value - _min_delta) < reference_value is true even when the improvement is smaller than _min_delta. The condition therefore keeps resetting _wait to 0, so patience never runs out and training never stops early.
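The effect of the sign flip can be checked with a quick standalone sketch (Python for brevity; `improved_buggy` and `improved_fixed` are illustrative names, not the library's API):

```python
MIN_DELTA = 0.0001  # min_delta is always stored as a positive value

def improved_buggy(monitor_value, reference_value, min_delta=MIN_DELTA):
    # Original condition: subtracting min_delta only makes monitor_value
    # smaller, so this is true even when the loss barely moved.
    return (monitor_value - min_delta) < reference_value

def improved_fixed(monitor_value, reference_value, min_delta=MIN_DELTA):
    # Fixed condition: the loss must beat the reference by at least min_delta.
    return (monitor_value + min_delta) < reference_value

# Loss shrank by only 1e-5, far less than min_delta:
print(improved_buggy(0.99999, 1.0))  # True  -> _wait is wrongly reset
print(improved_fixed(0.99999, 1.0))  # False -> _wait keeps counting
print(improved_fixed(0.9, 1.0))      # True  -> a real improvement still counts
```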

The suggestion to earlystopping using _baseline

  1. The fix to the suggestion is at line 88 of the file Tensorflow.Keras.Callbacks.EarlyStopping.cs:
    if (_baseline == 0f || _is_improvement(current, _baseline))
    ---> if (_baseline == 0f || _is_improvement(_baseline, current))

Why:

We should check whether the current (improved) loss has dropped below _baseline. If it has, _wait should no longer be reset to 0, so training can stop once patience runs out.
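Under that reading, the reset check can be sketched as follows (Python; `should_reset_wait` and its helper are illustrative names for the logic described above, not the actual TensorFlow.NET code):

```python
def is_improvement(monitor_value, reference_value, min_delta=0.0001):
    # "min" mode with the corrected sign: better means lower by at least min_delta.
    return (monitor_value + min_delta) < reference_value

def should_reset_wait(current, baseline):
    # Suggested logic: keep resetting _wait only while the loss has NOT yet
    # reached _baseline; once current drops below it, let patience run out.
    return baseline == 0.0 or is_improvement(baseline, current)

print(should_reset_wait(0.5, 0.001))     # True  -> still above baseline, keep training
print(should_reset_wait(0.0005, 0.001))  # False -> baseline reached, stop resetting _wait
```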

Configuration and Other Information

No response

@Wanglongzhi2001
Collaborator

Wanglongzhi2001 commented Sep 20, 2023

Hello, thanks a lot for your suggestion. TensorFlow.NET uses a single bool expression to represent self.monitor_op, rather than the np.greater / np.less functions used in TensorFlow. The reason this bug is triggered seems to be that the port is missing the part corresponding to the following TensorFlow code:

if self.monitor_op == np.greater:
    self.min_delta *= 1
else:
    self.min_delta *= -1

I will fix it; meanwhile you can refer to the implementation in TensorFlow:
https://github.com/tensorflow/tensorflow/blob/879a5e37eb350f556debec3cd5c9b6dcb8809318/tensorflow/python/keras/callbacks.py#L1702
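The upstream trick can be sketched in isolation (a simplified illustration whose names mirror the Keras callback, not the full class): flipping the sign of min_delta once at setup lets a single monitor_op(current - min_delta, reference) comparison cover both "min" and "max" modes.

```python
import numpy as np

class EarlyStoppingSketch:
    def __init__(self, mode="min", min_delta=0.0001):
        # Pick the comparison once, as Keras does.
        self.monitor_op = np.less if mode == "min" else np.greater
        self.min_delta = abs(min_delta)
        # The sign flip from the snippet quoted above:
        if self.monitor_op == np.greater:
            self.min_delta *= 1
        else:
            self.min_delta *= -1

    def is_improvement(self, current, reference):
        # In "min" mode this becomes (current + |min_delta|) < reference,
        # i.e. exactly the corrected condition from the bug report.
        return bool(self.monitor_op(current - self.min_delta, reference))

es_min = EarlyStoppingSketch(mode="min")
print(es_min.is_improvement(0.99999, 1.0))  # False: change smaller than min_delta
print(es_min.is_improvement(0.9, 1.0))      # True: genuine improvement

es_max = EarlyStoppingSketch(mode="max")    # e.g. monitoring accuracy
print(es_max.is_improvement(0.91, 0.9))     # True
```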

@Wanglongzhi2001
Collaborator

Hello, I have committed a PR to solve this problem: #1180
If there are still any problems, please tell me. ^_^

@Wanglongzhi2001
Collaborator

Can this issue be closed?

@GaijinOtohp
Contributor Author

Yes. The bug is fixed.
It can be closed.
