Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to bypass invalid utf8 character string in mysql

Tags:

mysql

utf-8

I have a large text file containing Arabic text data. When I try to load it into a MySQL table, I get error saying Error code 1300: invalid utf8 character string. This is what I have tried so far:

LOAD DATA INFILE '/var/lib/mysql-files/text_file.txt'
IGNORE INTO TABLE tblTest
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n';

I tried to ignore this error, but it does not work. I have tried LOCAL INFILE but it did not work, too. My database was created using DEFAULT CHAR SET UTF8 and DEFAULT COLLATE utf8_general_ci. The text file is utf-8 encoded.

I do not want the records which contain invalid utf8 characters. So how I can load the data with ignoring the records containing such invalid chars?

Thank in advance!

like image 399
Mohammed Avatar asked Nov 01 '25 06:11

Mohammed


2 Answers

It would help to have the HEX of the naughty character.

A possible approach to reading all the text, then dealing with any bad characters:

  1. Read into a column of type VARBINARY or BLOB.

  2. Loop through the rows, trying to copy to a VARCHAR or TEXT column.

Another plan is to use utf8mb4 instead of utf8. It could be that the bad character is an Emoji or Chinese character that will work in utf8mb4, but not utf8.

Ignore errors

This may let you ignore errors:

SET @save := @@sql_mode;
LOAD DATA ...;
SET @@sql_mode := @save;
like image 136
Rick James Avatar answered Nov 03 '25 22:11

Rick James


I have this problem when try use MySQL 5.7.14, too.

I returned to MySQL 5.6 and this problem is disappeared

like image 39
tecdoc ukr net Avatar answered Nov 03 '25 22:11

tecdoc ukr net



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!