
TensorBoard Metadata UnicodeDecodeError

I have a TensorBoard embedding that is created and saved like this from a gensim Doc2Vec model:

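import os

import tensorflow as tf
from tensorflow.contrib.tensorboard.plugins import projector

# model is an already-trained gensim Doc2Vec model
embedding = model.docvecs.vectors_docs

tf.reset_default_graph()
sess = tf.InteractiveSession()

# Create a dummy variable, then assign the real matrix to it;
# validate_shape=False lets the assign change the variable's shape.
X = tf.Variable([0.0], name='embedding')
place = tf.placeholder(tf.float32, shape=embedding.shape)
set_x = tf.assign(X, place, validate_shape=False)

sess.run(tf.global_variables_initializer())
sess.run(set_x, feed_dict={place: embedding})

summary_writer = tf.summary.FileWriter('log', sess.graph)

# Point the projector at the embedding tensor and its metadata file
config = projector.ProjectorConfig()
embedding_conf = config.embeddings.add()
embedding_conf.tensor_name = 'embedding:0'
embedding_conf.metadata_path = os.path.join('log','metadata.tsv')

projector.visualize_embeddings(summary_writer, config)

# Save only the embedding variable to a checkpoint (global step 1)
saver = tf.train.Saver([X])
saver.save(sess, os.path.join('log', 'model.ckpt'), 1)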

TensorBoard then reports this error:

TensorBoard 1.5.1 at http://COMPUTER_NAME:6006 (Press CTRL+C to quit)
E0222 17:15:20.231085 Thread-1 _internal.py:88] Error on request:
Traceback (most recent call last):
  File "c:\users\user\_installed\anaconda\envs\docsim\lib\site-packages\werkzeug\serving.py", line 270, in run_wsgi
    execute(self.server.app)
  File "c:\users\user\_installed\anaconda\envs\docsim\lib\site-packages\werkzeug\serving.py", line 258, in execute
    application_iter = app(environ, start_response)
  File "c:\users\user\_installed\anaconda\envs\docsim\lib\site-packages\tensorboard\backend\application.py", line 271, in __call__
    return self.data_applications[clean_path](environ, start_response)
  File "c:\users\user\_installed\anaconda\envs\docsim\lib\site-packages\werkzeug\wrappers.py", line 308, in application
    resp = f(*args[:-2] + (request,))
  File "c:\users\user\_installed\anaconda\envs\docsim\lib\site-packages\tensorboard\plugins\projector\projector_plugin.py", line 514, in _serve_metadata
    for line in f:
  File "c:\users\user\_installed\anaconda\envs\docsim\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 214, in __next__
    return self.next()
  File "c:\users\user\_installed\anaconda\envs\docsim\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 208, in next
    retval = self.readline()
  File "c:\users\user\_installed\anaconda\envs\docsim\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 178, in readline
    return self._prepare_value(self._read_buf.ReadLineAsString())
  File "c:\users\user\_installed\anaconda\envs\docsim\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 94, in _prepare_value
    return compat.as_str_any(val)
  File "c:\users\user\_installed\anaconda\envs\docsim\lib\site-packages\tensorflow\python\util\compat.py", line 106, in as_str_any
    return as_str(value)
  File "c:\users\user\_installed\anaconda\envs\docsim\lib\site-packages\tensorflow\python\util\compat.py", line 84, in as_text
    return bytes_or_text.decode(encoding)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x92 in position 84: invalid start byte

When I remove the embedding_conf.metadata_path = os.path.join('log','metadata.tsv') line, no error occurs. In addition, I can then select the very same metadata file that TensorFlow was attempting to bind to the embedding when the error occurred.

Why is this error occurring?

OverflowingTheGlass asked Jan 25 '26 23:01



1 Answer

Is it possible that the data inside your metadata.tsv file isn't UTF-8 encoded?
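One way to check this is sketched below. The byte 0x92 from the traceback is the right single quotation mark (’) in Windows-1252, so the sketch assumes the file was written in that encoding, for example by Python's open() on Windows without an explicit encoding argument. That is a guess, not something the question confirms. The helper names find_non_utf8_lines and reencode_to_utf8 and the default cp1252 source encoding are illustrative, not part of TensorBoard or gensim.

```python
def find_non_utf8_lines(path):
    """Return (line_number, raw_bytes) for each line that is not valid UTF-8."""
    bad = []
    # Read in binary mode so Python does not try to decode the file itself.
    with open(path, 'rb') as f:
        for lineno, raw in enumerate(f, start=1):
            try:
                raw.decode('utf-8')
            except UnicodeDecodeError:
                bad.append((lineno, raw))
    return bad


def reencode_to_utf8(src_path, dst_path, src_encoding='cp1252'):
    """Rewrite src_path as UTF-8, assuming it is currently in src_encoding."""
    # newline='' keeps the original line endings untouched in both directions.
    with open(src_path, 'r', encoding=src_encoding, newline='') as src:
        text = src.read()
    with open(dst_path, 'w', encoding='utf-8', newline='') as dst:
        dst.write(text)


if __name__ == '__main__':
    import os

    # Paths mirror the question's layout; the output file name is hypothetical.
    metadata = os.path.join('log', 'metadata.tsv')
    for lineno, raw in find_non_utf8_lines(metadata):
        print(lineno, raw)
    reencode_to_utf8(metadata, os.path.join('log', 'metadata_utf8.tsv'))
```

If the first function reports offending lines, re-encoding the file (or regenerating it with open(..., 'w', encoding='utf-8')) and pointing metadata_path at the UTF-8 copy should let TensorBoard read it. If the file is in some other encoding, pass that encoding as src_encoding instead.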

Justine Tunney answered Jan 27 '26 11:01
