How to tokenize input text in Android Studio for processing in an NLP model?

Question

When I created my NLP model, I used the Keras tokenizer to tokenize the training data, so every word in the training data has a number associated with it. Now I want to run the model in an Android app, so I converted it to the tflite format. When the user gives me a text input in the app, I need to convert it into an array of numbers using the same tokens I used for the training data. I am unable to do so because the tflite file contains only the model and not the tokenizer. How can I do this?

Shubham Panchal · Accepted Answer

You need to migrate the vocabulary of tokenized words from Python to Android. Use the word_index attribute of your tf.keras.preprocessing.text.Tokenizer object. It is a dict mapping each word to its integer index, which you need to export as a JSON file.

import json

# Export the tokenizer's word -> index mapping so the Android app can load it.
with open('android/word_dict.json', 'w') as file:
    json.dump(tokenizer.word_index, file)

Now, we parse the JSON file in Android and create a HashMap<String, Integer>. Then:

1. Take the input String from the user and tokenize it the same way the Keras Tokenizer did during training (by default it lowercases the text, strips punctuation and splits on spaces).
2. Look up the index of each word in the HashMap.
3. Store these Integers in an int[], which is the input for our model.
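The steps above can be sketched in plain Java roughly as follows. This is only an illustration under some assumptions, not the code from the blog post. The class name TextEncoder and the maxLen parameter are hypothetical. The map is assumed to be already filled from word_dict.json (on Android, for example, by iterating over the keys of an org.json.JSONObject). The tokenizer is assumed to have been created with its default settings and no oov_token, so unknown words are simply skipped. Padding with zeros at the end is also an assumption; it has to match whatever padding was used during training (Keras pad_sequences pads at the front by default).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: turns user text into the int[] expected by the model.
final class TextEncoder {
    // Characters the Keras Tokenizer strips by default (its `filters` argument).
    private static final String FILTER_REGEX =
            "[!\"#$%&()*+,\\-./:;<=>?@\\[\\\\\\]^_`{|}~\\t\\n]";

    private final Map<String, Integer> wordIndex; // loaded from word_dict.json
    private final int maxLen;                     // assumed fixed input length of the model

    TextEncoder(Map<String, Integer> wordIndex, int maxLen) {
        this.wordIndex = wordIndex;
        this.maxLen = maxLen;
    }

    int[] encode(String text) {
        // Step 1: lowercase, replace filtered characters with spaces, split on whitespace.
        String cleaned = text.toLowerCase(Locale.ROOT).replaceAll(FILTER_REGEX, " ");
        List<Integer> ids = new ArrayList<>();
        for (String word : cleaned.trim().split("\\s+")) {
            // Step 2: look up each word; words missing from the vocabulary are skipped.
            Integer id = wordIndex.get(word);
            if (id != null) {
                ids.add(id);
            }
        }
        // Step 3: copy into a fixed-size array; unused positions stay 0 (post-padding),
        // and anything beyond maxLen is dropped.
        int[] out = new int[maxLen];
        int n = Math.min(ids.size(), maxLen);
        for (int i = 0; i < n; i++) {
            out[i] = ids.get(i);
        }
        return out;
    }
}
```

If the tokenizer was trained with an oov_token, custom filters, or lower=False, the lookup and cleaning steps would need to be adjusted to match.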

I have discussed the whole process in this blog -> Text Classification in Android with TensorFlow

abhishek · Answer

I found a new Keras layer called tensorflow.keras.layers.experimental.preprocessing.TextVectorization.

This layer performs the text tokenization.

It can be added to the model itself, so the tokenization is exported and imported together with the model. It was used in the NLP model program presented at the TensorFlow Dev Summit 2020.

Link to the talk: https://www.youtube.com/watch?v=aNrqaOAt5P4&list=LLyOAs3oTHjtkbQ9pqG0MYIQ&index=5&t=616s
