New posts in huggingface-tokenizers
How does one set the pad token correctly (not to eos) during fine-tuning to avoid model not predicting EOS?
Mar 20, 2026
machine-learning
pytorch
huggingface-transformers
huggingface
huggingface-tokenizers
What is the difference between len(tokenizer) and tokenizer.vocab_size?
Mar 14, 2026
nlp
tokenize
huggingface-transformers
huggingface-tokenizers
How can I make sentence-BERT throw an exception if the text exceeds max_seq_length, and what is the max possible max_seq_length for all-MiniLM-L6-v2?
Mar 13, 2026
nlp
huggingface-transformers
bert-language-model
huggingface-tokenizers
sentence-transformers
Huggingface MarianMT translators lose content, depending on the model
Mar 12, 2026
python
huggingface-transformers
huggingface-tokenizers
machine-translation
How to add new special token to the tokenizer?
Mar 10, 2026
bert-language-model
huggingface-tokenizers
sentencepiece
Tokenizer.from_file() (Hugging Face): Exception: data did not match any variant of untagged enum ModelWrapper
Mar 08, 2026
json
nlp
huggingface-transformers
huggingface-tokenizers
huggingface
Loading checkpoint shards takes too long
Mar 03, 2026
huggingface-transformers
h2o
huggingface
huggingface-tokenizers
llama
What is so special about special tokens?
Feb 28, 2026
nlp
tokenize
huggingface-transformers
bert-language-model
huggingface-tokenizers
pip on Docker image cannot find Rust - even though Rust is installed
Feb 23, 2026
docker
rust
pip
docker-build
huggingface-tokenizers
Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation
Feb 21, 2026
huggingface-transformers
huggingface-tokenizers
HuggingFace BERT sentiment analysis
Jan 22, 2026
python
bert-language-model
huggingface-transformers
huggingface-tokenizers
HuggingFace AutoModelForCausalLM "decoder-only architecture" warning, even after setting padding_side='left'
Dec 20, 2025
python
machine-learning
huggingface-transformers
huggingface-tokenizers
BERT: Is it necessary to add new tokens when training in a domain-specific environment?
Dec 15, 2025
nlp
bert-language-model
huggingface-transformers
huggingface-tokenizers
Strange results with huggingface transformer[marianmt] translation of larger text
Dec 07, 2025
python
translation
huggingface-transformers
huggingface-tokenizers
resize_token_embeddings on a pretrained model with a different embedding size
Dec 03, 2025
pytorch
huggingface-transformers
bert-language-model
word-embedding
huggingface-tokenizers