Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there any "standard" way to calculate the numerical gradient?

I am trying to calculate the numerical gradient of a smooth function in c++. And the parameter value could vary from zero to a very large number(maybe 1e10 to 1e20?)

I used the function f(x,y) = 10*x^3 + y^3 as a testbench, but I found that if x or y is too large, I can't get correct gradient.

Here is my code to calculate the graidient:

#include <iostream>
#include <cmath>
#include <cassert>
using namespace std;
double f(double x, double y)
{
    // black box expensive function
    return 10 * pow(x, 3) + pow(y, 3);
}
int main()
{
    // double x = -5897182590.8347721;
    // double y = 269857217.0017581;
    double x = 1.13041e+19;
    double y = -5.49756e+14;
    const double epsi = 1e-4;

    double f1 = f(x, y);
    double f2 = f(x, y+epsi);
    double f3 = f(x, y-epsi);
    cout << f1 << endl;
    cout << f2 << endl;
    cout << f3 << endl;
    cout << f1 - f2 << endl; // 0
    cout << f2 - f3 << endl; // 0
    return 0;
}

If I use the above code to calculate the gradient, the gradient would be zero!

The testbench function, 10*x^3 + y^3, is just a demo, the real problem I need to solve is actually a black box function.

So, is there any "standard" way to calculate the numerical gradient?

like image 375
Alaya Avatar asked Oct 22 '25 06:10

Alaya


1 Answers

In the first place, you should use the central difference scheme, which is more accurate (by cancellation of one more term of the Taylor develoment).

(f(x + h) - f(x - h)) / 2h

rather than

(f(x + h) - f(x)) / h

Then the choice of h is critical and using a fixed constant is the worst thing you can do. Because for small x, h will be too large so that the approximation formula no more works, and for large x, h will be too small, resulting in severe truncation error.

A much better choice is to take a relative value, h = x√ε, where ε is the machine epsilon (1 ulp), which gives a good tradeoff.

(f(x(1 + √ε)) - f(x(1 - √ε))) / 2x√ε

Beware that when x = 0, a relative value cannot work and you need to fall back to a constant. But then, nothing tells you which to use !