I would like to fine-tune BERT on unlabeled data from a specific domain and then use the output layer to check the similarity between texts. How can I do this? Do I first need to fine-tune on a supervised task (classification, question answering, etc.) and then extract the embeddings? Or can I take a pre-trained BERT model without any task head and fine-tune it directly on my own data?
There is no need to fine-tune for classification, especially if you do not have any supervised classification dataset.
You should continue training BERT the same unsupervised way it was originally trained, i.e., continue "pre-training" with the masked language modeling and next sentence prediction objectives. Hugging Face's implementation provides the class BertForPreTraining for this.
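As a rough illustration, here is a minimal sketch of a single training step with BertForPreTraining on one pair of domain sentences. The example sentences are placeholders, and the masking is simplified (it always substitutes [MASK] rather than the 80/10/10 scheme used in the original paper); in practice you would batch your corpus and drive this with the Trainer API or the run_mlm.py example script from the transformers repository.

```python
import torch
from transformers import BertTokenizer, BertForPreTraining

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForPreTraining.from_pretrained("bert-base-uncased")

# Two consecutive sentences from your domain corpus (placeholders here).
# next_sentence_label = 0 means "sentence B really follows sentence A";
# 1 would mean B was sampled at random from the corpus.
enc = tokenizer(
    "First sentence from my domain corpus.",
    "The sentence that actually follows it.",
    return_tensors="pt",
)

# Build MLM labels: predict the original token only at masked positions,
# ignore (-100) everywhere else.
labels = enc["input_ids"].clone()
probability_matrix = torch.full(labels.shape, 0.15)
special_tokens_mask = torch.tensor(
    tokenizer.get_special_tokens_mask(labels[0].tolist(), already_has_special_tokens=True)
).bool().unsqueeze(0)
probability_matrix.masked_fill_(special_tokens_mask, 0.0)
masked_indices = torch.bernoulli(probability_matrix).bool()
labels[~masked_indices] = -100

# Simplified masking: replace every selected token with [MASK].
enc["input_ids"][masked_indices] = tokenizer.mask_token_id

outputs = model(**enc, labels=labels, next_sentence_label=torch.tensor([0]))
outputs.loss.backward()  # combined MLM + NSP loss; plug into your optimizer loop
```

Once this continued pre-training converges on your domain text, you can load the resulting checkpoint and take the hidden states as embeddings for your similarity comparisons.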