
How to tokenize input text in Android Studio for processing by an NLP model?

When I created my NLP model, I used the Keras Tokenizer to tokenize the training data, so every word in the training data has a number associated with it. Now I want to run the model in an Android app, so I converted it to TFLite format. When the user gives me text input in the app, I need to convert it into an array of numbers using the same tokens I used for the training data. I am unable to do so because the TFLite file only contains the model and not the tokenizer. How can I do this?

abhishek asked Oct 24 '25 22:10


2 Answers

You need to migrate the vocabulary of tokenized words from Python to Android. Use the word_index attribute of your tf.keras.preprocessing.text.Tokenizer instance. It is a dict mapping each word to its integer index, which you need to export as a JSON file.

import json

# tokenizer is the fitted Keras Tokenizer that was used on the training data.
with open('android/word_dict.json', 'w') as f:
    json.dump(tokenizer.word_index, f)

Now, we parse the JSON file in Android and create a HashMap<String, Integer>.

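  • Take the input String from the user and tokenize it the same way as in training (by default, the Keras Tokenizer lowercases the text, strips punctuation, and splits on spaces).
  • Next, look up the index of each word in the HashMap.
  • Store these Integers in an int[], which is the input for our model.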
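The steps above can be sketched in Python as a reference to port to Android. This is only a minimal sketch, not the Tokenizer's actual source. It assumes the Tokenizer was created with its default settings (lowercasing, the default punctuation filter, splitting on spaces, no oov_token) and that the training sequences were padded with the pad_sequences defaults (zeros added and extra tokens dropped at the front). The names text_to_sequence and max_len and the tiny vocabulary are hypothetical and only for illustration.

```python
# Default `filters` of the Keras Tokenizer: each of these characters
# is replaced by a space before the text is split into words.
KERAS_DEFAULT_FILTERS = '!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n'


def text_to_sequence(text, word_index, max_len):
    """Turn raw text into a fixed-length list of word indices.

    Assumes default Tokenizer settings and max_len > 0. Words missing
    from word_index are skipped, which is what the Tokenizer does when
    no oov_token is set.
    """
    text = text.lower()
    # Replace every filtered character with a space, then split on whitespace.
    table = str.maketrans(KERAS_DEFAULT_FILTERS, " " * len(KERAS_DEFAULT_FILTERS))
    words = text.translate(table).split()
    ids = [word_index[w] for w in words if w in word_index]
    # Assumed to mirror the pad_sequences defaults: keep the last max_len
    # ids and left-pad with 0 (index 0 is never assigned to a word).
    ids = ids[-max_len:]
    return [0] * (max_len - len(ids)) + ids


# Hypothetical vocabulary, in the same shape as the exported word_dict.json.
word_index = {"the": 1, "movie": 2, "was": 3, "great": 4}
sequence = text_to_sequence("The movie was GREAT, really!", word_index, max_len=6)
print(sequence)
```

If the Tokenizer or the padding was configured differently during training (for example with an oov_token or post-padding), the Android code has to follow those settings instead, otherwise the model receives indices it was not trained on.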

I have discussed the whole process in this blog post: Text Classification in Android with TensorFlow

Shubham Panchal answered Oct 27 '25 13:10


I found a newer Keras layer called tensorflow.keras.layers.experimental.preprocessing.TextVectorization.

This layer performs the text tokenization.

It can be added to the model itself, so it is exported and loaded together with the model. It was used in the NLP model program presented at the TensorFlow Dev Summit 2020.

Link to the talk: https://www.youtube.com/watch?v=aNrqaOAt5P4&list=LLyOAs3oTHjtkbQ9pqG0MYIQ&index=5&t=616s

abhishek answered Oct 27 '25 12:10


