I am trying to fine-tune a pretrained model (BERT) using PyTorch, but the pretrained model weights are apparently not being accepted.
I see this error:
Weights of BertForMultiLable not initialized from pretrained model: ['classifier.weight', 'classifier.bias']
Weights from pretrained model not used in BertForMultiLable: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias']
Here is the complete log, including the traceback:
Training/evaluation parameters Namespace(adam_epsilon=1e-08, arch='bert', data_name='ICD9', do_data=False, do_lower_case=True, do_test=False, do_train=True, epochs=6, eval_batch_size=8, eval_max_seq_len=256, fp16=False, fp16_opt_level='O1', grad_clip=1.0, gradient_accumulation_steps=1, learning_rate=2e-05, local_rank=-1, loss_scale=0, mode='min', monitor='valid_loss', n_gpu='0', resume_path='', save_best=True, seed=42, sorted=1, train_batch_size=8, train_max_seq_len=256, valid_size=0.2, warmup_proportion=0.1, weight_decay=0.01)
Loading examples from cached file pybert/dataset/cached_train_examples_bert
Loading features from cached file pybert/dataset/cached_train_features_256_bert
sorted data by th length of input
Loading examples from cached file pybert/dataset/cached_valid_examples_bert
Loading features from cached file pybert/dataset/cached_valid_features_256_bert
initializing model
loading configuration file pybert/pretrain/bert/base-uncased/config.json
Model config {
"attention_probs_dropout_prob": 0.1,
"finetuning_task": null,
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 768,
"initializer_range": 0.02,
"intermediate_size": 3072,
"is_decoder": false,
"layer_norm_eps": 1e-12,
"max_position_embeddings": 512,
"num_attention_heads": 12,
"num_hidden_layers": 12,
"num_labels": 19,
"output_attentions": false,
"output_hidden_states": false,
"output_past": true,
"pruned_heads": {},
"torchscript": false,
"type_vocab_size": 2,
"use_bfloat16": false,
"vocab_size": 28996
}
loading weights file pybert/pretrain/bert/base-uncased/pytorch_model.bin
Weights of BertForMultiLable not initialized from pretrained model: ['classifier.weight', 'classifier.bias']
Weights from pretrained model not used in BertForMultiLable: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias']
initializing callbacks
***** Running training *****
Num examples = 21479
Num Epochs = 6
Total train batch size (w. parallel, distributed & accumulation) = 8
Gradient Accumulation steps = 1
Total optimization steps = 16110
Warning: There's no GPU available on this machine, training will be performed on CPU.
Warning: The number of GPU's configured to use is 0, but only 0 are available on this machine.
Traceback (most recent call last):
File "run_bert.py", line 227, in <module>
main()
File "run_bert.py", line 220, in main
run_train(args)
File "run_bert.py", line 125, in run_train
trainer.train(train_data=train_dataloader, valid_data=valid_dataloader, seed=args.seed)
File "/home/aditya_vartak/bert_pytorch/pybert/train/trainer.py", line 168, in train
summary(self.model,*(input_ids, segment_ids,input_mask),show_input=True)
File "/home/aditya_vartak/bert_pytorch/pybert/common/tools.py", line 307, in summary
model(*inputs)
File "/home/aditya_vartak/virtualenvs/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "/home/aditya_vartak/bert_pytorch/pybert/model/nn/bert_for_multi_label.py", line 14, in forward
outputs = self.bert(input_ids, token_type_ids=token_type_ids,attention_mask=attention_mask, head_mask=head_mask)
File "/home/aditya_vartak/virtualenvs/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "/home/aditya_vartak/virtualenvs/anaconda3/envs/pytorch/lib/python3.7/site-packages/transformers/modeling_bert.py", line 722, in forward
embedding_output = self.embeddings(input_ids=input_ids, position_ids=position_ids, token_type_ids=token_type_ids, inputs_embeds=inputs_embeds)
File "/home/aditya_vartak/virtualenvs/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 533, in __call__
result = hook(self, input)
File "/home/aditya_vartak/bert_pytorch/pybert/common/tools.py", line 269, in hook
summary[m_key]["input_shape"] = list(input[0].size())
IndexError: tuple index out of range
Any help would be great. Thanks in advance.
For the error you cited: that is actually only a warning. It states that you are loading into your BertForMultiLable architecture the weights of a pretrained model that was not trained for that specific task, so the task head ('classifier.weight', 'classifier.bias') is freshly initialized, while the checkpoint's pre-training heads ('cls.predictions.*', 'cls.seq_relationship.*') are simply discarded.
There is a similar warning discussion here.
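This is the expected behaviour whenever a checkpoint trained with one head is loaded onto a model with a different head. A minimal sketch with the Hugging Face transformers API (BertForSequenceClassification stands in for your custom BertForMultiLable; num_labels=19 is taken from your config):

```python
from transformers import BertConfig, BertForSequenceClassification

# Loading a plain pretrained BERT checkpoint into a classification model:
# the encoder weights are restored, but the classifier head has no
# counterpart in the checkpoint, so it is randomly initialized and the
# library logs exactly the kind of warning you see.
config = BertConfig.from_pretrained("bert-base-uncased", num_labels=19)
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", config=config
)
# 'cls.predictions.*' / 'cls.seq_relationship.*' in the checkpoint belong
# to the pre-training heads (masked LM / next-sentence prediction) and are
# dropped here; the new 'classifier.*' head is trained during fine-tuning.
```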
The real error is a different one: IndexError: tuple index out of range. To pin that down properly, you should attach some code and more information about what you are doing.
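That said, the traceback itself gives a hint: the failing line is a forward pre-hook inside the summary() helper in pybert/common/tools.py, and modeling_bert.py line 722 (visible in your traceback) calls self.embeddings(input_ids=..., position_ids=..., ...) with keyword arguments only, so the positional input tuple the hook receives is empty and input[0] raises IndexError. If that is indeed the cause, a guard along these lines would avoid the crash (a sketch only, since the traceback shows just one line of tools.py; summary and m_key come from the enclosing function there):

```python
def hook(module, input):
    # `input` holds only the *positional* arguments of the hooked module.
    # BertModel passes everything to self.embeddings as keyword arguments,
    # so this tuple can be empty; guard before indexing into it.
    if len(input) > 0 and hasattr(input[0], "size"):
        summary[m_key]["input_shape"] = list(input[0].size())
    else:
        summary[m_key]["input_shape"] = None  # inputs arrived as kwargs
```

Alternatively, simply skipping the summary(self.model, ...) call in trainer.py should let training proceed, since it is only printing a model overview.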