
When using TensorFlow to solve algebra, all variables become nan after training

I am trying to factor Ax^2+Bx+C into (ax+b)(cx+d), where A, B, C are known, and solve for the values of a, b, c, d. Here is the code:

import tensorflow as tf
a = tf.Variable([.5])
b = tf.Variable([.5])
c = tf.Variable([.5])
d = tf.Variable([.5])
x = tf.placeholder(tf.float32)
y = tf.placeholder(tf.float32)
fn1 = 2*x**2+3*x+4 #A=2,B=3,C=4
fn2 = (a*x+b)*(c*x+d)
x_train = [1,2,3,4]
y_train = [9,18,31,48]
loss = tf.reduce_sum(tf.square(fn2-y))
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)

init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

for i in range(1000):
  sess.run(train, {x:x_train, y:y_train})

print(sess.run([a,b,c,d]))

The result shows nan for all of a, b, c and d. How can I fix that? Did I miss something? Thanks for the help.

omg asked Dec 09 '25 01:12


1 Answer

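Your cost function fails to converge at a learning rate of 0.01: the gradient descent updates overshoot, the loss grows instead of shrinking, and the variables eventually overflow to nan. Set the learning rate to 0.0001 (or lower) and the cost function begins to converge.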

optimizer = tf.train.GradientDescentOptimizer(0.0001)
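
To see the effect of the learning rate without TensorFlow, here is a minimal plain-Python sketch of the same gradient descent on the same data. It is a stand-in for the optimizer, not the TensorFlow implementation; the function name fit and its parameters are made up for illustration, and the gradients are written out by hand with the chain rule.

```python
import math

X_TRAIN = [1.0, 2.0, 3.0, 4.0]
Y_TRAIN = [9.0, 18.0, 31.0, 48.0]  # 2*x**2 + 3*x + 4 at each x


def fit(learning_rate, steps=1000, start=0.5):
    """Plain gradient descent on sum(((a*x+b)*(c*x+d) - y)**2).

    Hypothetical helper: mirrors the loss in the question, all four
    variables start at the same value, as in the original code.
    """
    a = b = c = d = start
    for _ in range(steps):
        ga = gb = gc = gd = 0.0
        for x, y in zip(X_TRAIN, Y_TRAIN):
            u = a * x + b  # first factor
            v = c * x + d  # second factor
            r = u * v - y  # residual for this sample
            # Chain rule on r*r: d/da = 2*r*x*v, d/db = 2*r*v, etc.
            ga += 2.0 * r * x * v
            gb += 2.0 * r * v
            gc += 2.0 * r * x * u
            gd += 2.0 * r * u
        a -= learning_rate * ga
        b -= learning_rate * gb
        c -= learning_rate * gc
        d -= learning_rate * gd
    return [a, b, c, d]


diverged = fit(0.01)
stable = fit(0.0001)
print("lr=0.01   ->", diverged)
print("lr=0.0001 ->", stable)
print("nan at lr=0.01:", any(math.isnan(p) for p in diverged))
```

With the larger step the very first update already moves a from 0.5 to about 13, after which each update is larger than the last until the values overflow.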

Also, if you change fn2 to a * x ** 2 + b * x + c, the fitted coefficients will come out closer to the A, B, C of your Ax^2+Bx+C. With (ax+b)(cx+d), you might instead get a different solution, one that merely fits the small training dataset x = [1,2,3,4].
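
One likely reason the factored form lands on a different solution here: expanding (ax+b)(cx+d) gives A = ac, B = ad+bc, C = bd, and a quadratic only splits into real linear factors when B^2 - 4AC >= 0. A quick check, as a sketch with made-up helper names, suggests that 2x^2+3x+4 does not meet that condition, so real a, b, c, d can only approximate it.

```python
def expand(a, b, c, d):
    """Coefficients (A, B, C) of (a*x + b)*(c*x + d) = A*x**2 + B*x + C."""
    return (a * c, a * d + b * c, b * d)


def discriminant(A, B, C):
    """B^2 - 4AC for A*x**2 + B*x + C."""
    return B * B - 4 * A * C


def has_real_factorization(A, B, C):
    """True if A*x**2 + B*x + C (A != 0) splits into real linear factors."""
    return discriminant(A, B, C) >= 0


print(discriminant(2, 3, 4))            # -23
print(has_real_factorization(2, 3, 4))  # False
print(expand(1, 2, 1, 3))               # (1, 5, 6), i.e. (x+2)(x+3)
```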

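Another small tip: do not initialize all the variables to the same value (0.5 in your case). With identical starting values, a and c receive identical gradients, as do b and d, so the two factors stay equal throughout training. Initialize each variable randomly between -1.0 and 1.0 instead.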
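
The symmetry problem can be checked with a small plain-Python sketch (again a stand-in for the TensorFlow graph, with hypothetical names train and random_init). Starting all four values at 0.5 leaves a equal to c and b equal to d after training, while random.uniform gives each variable its own start. In TensorFlow 1.x the analogous initial value would presumably come from something like tf.random_uniform([1], -1.0, 1.0).

```python
import random


def random_init(rng, low=-1.0, high=1.0):
    """Four independent starting values for a, b, c, d."""
    return [rng.uniform(low, high) for _ in range(4)]


def train(params, learning_rate=0.0001, steps=1000):
    """Gradient descent on sum(((a*x+b)*(c*x+d) - y)**2), in plain Python."""
    a, b, c, d = params
    xs, ys = [1.0, 2.0, 3.0, 4.0], [9.0, 18.0, 31.0, 48.0]
    for _ in range(steps):
        ga = gb = gc = gd = 0.0
        for x, y in zip(xs, ys):
            u, v = a * x + b, c * x + d  # the two linear factors
            r = u * v - y                # residual for this sample
            ga += 2.0 * r * x * v
            gb += 2.0 * r * v
            gc += 2.0 * r * x * u
            gd += 2.0 * r * u
        a, b = a - learning_rate * ga, b - learning_rate * gb
        c, d = c - learning_rate * gc, d - learning_rate * gd
    return [a, b, c, d]


same = train([0.5, 0.5, 0.5, 0.5])
print("identical start:", same, "a == c:", same[0] == same[2])

rng = random.Random(0)  # fixed seed only so the run is repeatable
print("random start:   ", train(random_init(rng)))
```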

Prasad answered Dec 11 '25 16:12