Unsupervised fine-tuning of BERT for embeddings only?

I would like to fine-tune BERT for a specific domain on unlabeled data and then use the output layer to check the similarity between texts. How can I do this? Do I need to first fine-tune on a classification task (or question answering, etc.) and then extract the embeddings? Or can I just take a pre-trained BERT model without any task head and fine-tune it on my own data?
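For the "check similarity" part, here is a minimal sketch of one common approach: mean-pool the last hidden states of a (possibly fine-tuned) BERT model into sentence embeddings and compare them with cosine similarity. The stock `bert-base-uncased` checkpoint and the example sentences are placeholders; mean pooling is one choice among several (CLS token pooling is another).

```python
# A minimal sketch: sentence embeddings via mean pooling over BERT's last
# hidden state, compared with cosine similarity. Assumes the Hugging Face
# `transformers` library and PyTorch; the checkpoint name is a placeholder.
import torch
from transformers import BertTokenizerFast, BertModel

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

def embed(texts):
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state      # (batch, seq_len, dim)
    mask = batch["attention_mask"].unsqueeze(-1)       # zero out padding tokens
    return (hidden * mask).sum(1) / mask.sum(1)        # mean pooling

# Hypothetical example sentences.
a, b = embed(["a sentence about my domain", "another sentence about my domain"])
print(torch.nn.functional.cosine_similarity(a, b, dim=0).item())
```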

Q_Dbk asked Oct 15 '25
1 Answer

There is no need to fine-tune for classification, especially if you do not have a labeled classification dataset.

You should continue training BERT in the same unsupervised way it was originally trained, i.e., continue "pre-training" on your domain data using the masked language model (MLM) objective and next sentence prediction (NSP). Hugging Face's implementation provides the class BertForPreTraining for this; a minimal sketch follows below.
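The answer names BertForPreTraining, which combines both MLM and NSP and requires paired-sentence inputs. For simplicity, the sketch below uses BertForMaskedLM (MLM only), a common shortcut for domain-adaptive pre-training; the corpus file `domain_corpus.txt` (one text per line) and the training hyperparameters are placeholders, not values from the question.

```python
# A minimal sketch of continued pre-training on domain text with the MLM
# objective. Assumes the Hugging Face `transformers` and `datasets` libraries;
# `domain_corpus.txt` is a hypothetical one-text-per-line file.
from transformers import (
    BertTokenizerFast,
    BertForMaskedLM,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from datasets import load_dataset

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

# Load the raw domain corpus and tokenize it.
dataset = load_dataset("text", data_files={"train": "domain_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

# The collator randomly masks 15% of tokens, as in original BERT pre-training.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

training_args = TrainingArguments(
    output_dir="bert-domain-adapted",
    num_train_epochs=1,
    per_device_train_batch_size=16,
)

Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized,
    data_collator=collator,
).train()
```

After training, the saved checkpoint can be loaded with BertModel and used for embedding extraction as in the similarity sketch above.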

Jindřich answered Oct 18 '25

