I use Tarantool. Why do I get these strange messages in the log?
2016-03-24 16:19:58.987 [5803] main/493623/http/XXX.XXX.XXX.XXX:57295 txn.cc:214 W> too long WAL write: 0.527 sec
2016-03-24 16:20:09.841 [5803] main/493714/http/XXX.XXX.XXX.XXX:57346 txn.cc:214 W> too long WAL write: 0.605 sec
2016-03-24 16:20:12.988 [5803] main/493716/http/XXX.XXX.XXX.XXX:57347 txn.cc:214 W> too long WAL write: 1.682 sec
2016-03-24 16:20:15.023 [5803] main/493717/http/XXX.XXX.XXX.XXX:37825 txn.cc:214 W> too long WAL write: 3.373 sec
2016-03-24 16:20:35.145 [5803] main/494145/http/
The message "too long WAL write" means that writing updates to the .xlog file took too much time ("too much" here meaning "more than specified in Tarantool's configuration parameter too_long_threshold").
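For reference, the threshold is set through box.cfg. This is a minimal sketch, assuming a Tarantool 1.6-style configuration; the value is only illustrative. Raising it merely hides the warning and does not make the writes any faster.

```lua
-- Illustrative value only: warn when a WAL write takes longer than 0.5 s.
box.cfg{
    too_long_threshold = 0.5,
}
```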
There are two common reasons: 1) a slow disk; 2) problems on the application's side.
To figure out the nature of the problem, launch atop with a 1s interval and check what happened during the "too long" events: high disk util means disk issues; high cpu util means application issues.
The recommended solution for slow disk issues is to write changes to the write-ahead log in batches, where every batch is wrapped in a single transaction. This will give you just one disk write per transaction. You'll need no yields in this case (see the notes about fiber.yield further on).
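The batching advice above can be sketched as follows. This is only a sketch under assumptions: replace_batch is a hypothetical helper name, and the space passed in is assumed to be an existing memtx space. Everything between box.begin() and box.commit() is written to the WAL as one transaction, and the loop itself must not yield.

```lua
-- Hypothetical helper: apply a whole batch of tuples in one transaction,
-- so the batch costs a single WAL write instead of one write per tuple.
local function replace_batch(space, tuples)
    box.begin()
    local ok, err = pcall(function()
        for _, tuple in ipairs(tuples) do
            space:replace(tuple)
        end
    end)
    if not ok then
        -- Undo the partial batch and pass the error on to the caller.
        box.rollback()
        error(err, 0)
    end
    box.commit()
end
```

It would be called as replace_batch(box.space.events, batch), where events is a made-up space name and batch is a Lua array of tuples.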
Typical application issues are as follows:
you launched too many fibers (so, due to successive fiber switches, too much time may elapse before the next WAL write);
you make no yields within time-consuming operations (like a full-scan search, deleting a huge number of records, etc.).
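One way to keep the number of fibers bounded is a fixed pool of worker fibers fed through a fiber.channel, instead of one new fiber per task. This is a minimal sketch under assumptions: start_workers, the pool size, the queue size and the handler are hypothetical and not taken from the original setup.

```lua
local fiber = require('fiber')

-- Start a fixed number of worker fibers that take tasks from a shared channel.
-- Returns the channel; producers call queue:put(task), which yields while the
-- channel is full, so the backlog stays bounded as well.
local function start_workers(worker_count, queue_size, handler)
    local queue = fiber.channel(queue_size)
    for _ = 1, worker_count do
        fiber.create(function()
            while true do
                local task = queue:get()
                if task == nil then
                    break  -- the channel was closed
                end
                -- pcall keeps one failing task from killing the worker.
                local ok, err = pcall(handler, task)
                if not ok then
                    print('task failed: ' .. tostring(err))
                end
            end
        end)
    end
    return queue
end
```

The pool is shut down by calling queue:close(), after which the workers leave their loops.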
Notes on yields:
Call require('fiber') and occasionally yield control with fiber.yield() within your program loop (not too often though; several times per the interval specified in too_long_threshold is quite enough).
As you optimize your application code, remember that one Tarantool instance can utilize only one CPU core, so increasing the number of CPU cores is useless; the only solution is to ensure proper control yields among the fibers.
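The advice on yields can be sketched like this. It is only a sketch under assumptions: scan_with_yields is a hypothetical helper, the space is assumed to have a single-part TREE primary key in field 1, and the page size is an illustrative number to tune against too_long_threshold. Reading page by page and yielding between pages lets other fibers, and their WAL writes, run during a long scan.

```lua
local fiber = require('fiber')

-- Hypothetical helper: visit every tuple of a space in pages of page_size
-- tuples, handing control to other fibers between pages.
local function scan_with_yields(space, page_size, on_tuple)
    local after = {}  -- empty key: start at the first tuple
    while true do
        local page = space:select(after, {iterator = 'GT', limit = page_size})
        for _, tuple in ipairs(page) do
            on_tuple(tuple)
        end
        if #page < page_size then
            return  -- a short page means the scan reached the end
        end
        -- Assumption: the primary key is field 1; resume right after it.
        after = {page[#page][1]}
        fiber.yield()
    end
end
```

Because other fibers run at each yield, the scan can observe changes made while it is in progress; that is the price of not blocking them.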
After direct on-site help and debugging with agent-0007, we found several issues.
Most of them were related to a slow virtual environment (OpenVZ was used), which showed inadequate I/O timings.
This problem is also related to the question "Tarantool sophia make slow selects?".
Additionally, there are recommendations regarding slow disks: if possible, try to place the WAL and the Tarantool snapshots or Sophia storage on separate disks.
See the snap_dir, wal_dir and sophia_dir options: http://tarantool.org/doc/book/configuration/index.html#basic-parameters
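As an illustration, the three options could be set like this. The paths are hypothetical mount points, each assumed to sit on its own physical disk; only the option names come from the recommendation above.

```lua
-- Hypothetical paths: each directory is assumed to be on a separate disk.
box.cfg{
    wal_dir    = '/mnt/disk1/tarantool/wal',
    snap_dir   = '/mnt/disk2/tarantool/snap',
    sophia_dir = '/mnt/disk3/tarantool/sophia',
}
```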
Thanks.