Commit 28844af

MDEV-37662: Binlog Corruption When tmpdir is Full
The binary log could be corrupted when committing a large transaction (i.e. one whose data exceeds binlog_cache_size and spills into a tmp file) with binlog_format=row if the server's --tmpdir is full. The corruption is that only the GTID event of the failed transaction is written to the binary log, without any body or finalizing events.

This happened because the transaction's content was not flushed at the proper time, so its binlog cache data was not yet durable when the server started copying the binlog cache file into the binary log itself. While switching the tmp file from a WRITE_CACHE to a READ_CACHE, the server would see that the cache still held data to flush, and would first try to flush it. That is not a valid time to flush data to the temporary file, because the GTID event has already been written directly to the binary log. If the flush fails at that point, the binary log is left in a corrupted state.

The flush itself is expected to happen in THD::binlog_flush_pending_rows_event(). However, if there is no pending event, the flush is skipped.

This patch fixes the issue by flushing the tmp file to disk in THD::binlog_flush_pending_rows_event() even when there is no pending event.

Reviewed-by: TODO
1 parent 1072b8e commit 28844af
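
The ordering problem described in the commit message can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyCache, ToyBinlog, commit_flush_late and commit_flush_early are hypothetical names invented for illustration and do not exist in the server, and the model ignores everything about the real IO_CACHE except "flush may fail when the tmp directory is full".

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a transaction's binlog cache that has spilled
// into a temporary file. None of these names exist in the server.
struct ToyCache
{
  std::string unflushed;     // bytes still held in memory
  std::string tmp_file;      // bytes already written to the tmp file
  bool tmpdir_full= false;   // when true, flushing fails (simulated ENOSPC)

  // Returns 0 on success and non-zero on failure.
  int flush()
  {
    if (unflushed.empty())
      return 0;
    if (tmpdir_full)
      return 1;
    tmp_file+= unflushed;
    unflushed.clear();
    return 0;
  }
};

// Toy binary log: one string per event.
using ToyBinlog= std::vector<std::string>;

// Ordering described as the bug: the GTID is written to the binary log
// first, and the remaining cache bytes are flushed only afterwards.
int commit_flush_late(ToyCache &cache, ToyBinlog &log)
{
  log.push_back("GTID");
  if (cache.flush())
    return 1;                // too late, the GTID is already in the log
  log.push_back(cache.tmp_file);
  log.push_back("COMMIT");
  return 0;
}

// Ordering after the fix: flush first, so a failure is reported before
// anything is written to the binary log.
int commit_flush_early(ToyCache &cache, ToyBinlog &log)
{
  if (cache.flush())
    return 1;
  log.push_back("GTID");
  log.push_back(cache.tmp_file);
  log.push_back("COMMIT");
  return 0;
}
```

With tmpdir_full set, commit_flush_late() leaves a lone "GTID" entry in the toy log, which mirrors the corruption described above, while commit_flush_early() fails with the log untouched.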

2 files changed, 55 additions and 0 deletions
Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
+include/master-slave.inc
+[connection master]
+connection master;
+set @old_binlog_cache_size= @@global.binlog_cache_size;
+set @@global.binlog_cache_size=4096;
+#
+# Initialize test data
+connection master;
+create table t1 (a int, b longtext default NULL) engine=innodb;
+#
+# Create transaction with cache data larger than the binlog_cache_size
+# so it spills into a tmp file, then simulate ENOSPC while flushing
+# the tmp file.
+#
+set @@session.debug_dbug="+d,simulate_binlog_tmp_file_no_space_left_on_flush";
+insert into t1 values (2, repeat("y", 8192));
+ERROR HY000: Error writing file '/home/brandon/workspace/server_1011/build/mysql-test/var/tmp/mysqld.1/#sql/fd=83' (Errcode: 28 "No space left on device")
+set @@session.debug_dbug="";
+#
+# Create another transaction to make sure the server/replication can
+# continue working normally after the error
+#
+insert into t1 values (3, repeat("z", 8192));
+include/save_master_gtid.inc
+connection slave;
+include/sync_with_master_gtid.inc
+include/diff_tables.inc [master:test.t1,slave:test.t1]
+#
+# Cleanup
+connection master;
+drop table t1;
+include/save_master_gtid.inc
+connection slave;
+include/sync_with_master_gtid.inc
+connection master;
+set @@global.binlog_cache_size= @old_binlog_cache_size;
+include/rpl_end.inc
+# End of rpl_row_binlog_tmp_file_flush_enospc.test

sql/log.cc

Lines changed: 17 additions & 0 deletions
@@ -6566,6 +6566,23 @@ int THD::binlog_flush_pending_rows_event(bool stmt_end, bool is_transactional)
     error= mysql_bin_log.flush_and_set_pending_rows_event(this, 0,
                                                           is_transactional);
   }
+  else
+  {
+    /*
+      There's no pending event, but we still need to flush the cache
+    */
+    binlog_cache_mngr *cache_mngr=
+        (binlog_cache_mngr *) thd_get_ha_data(this, binlog_hton);
+    if (cache_mngr)
+    {
+      DBUG_EXECUTE_IF("simulate_binlog_tmp_file_no_space_left_on_flush",
+                      { DBUG_SET("+d,simulate_file_write_error"); });
+      error=
+          flush_io_cache(cache_mngr->get_binlog_cache_log(is_transactional));
+      DBUG_EXECUTE_IF("simulate_binlog_tmp_file_no_space_left_on_flush",
+                      { DBUG_SET("-d,simulate_file_write_error"); });
+    }
+  }
 
   DBUG_RETURN(error);
 }
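
The new branch wraps the flush in a pair of debug hooks so that the write error is injected only for that one call. The sketch below imitates that shape with a toy keyword registry. It is an illustration under assumptions, not the server's dbug implementation: active_keywords, keyword_on, toy_flush and flush_with_injection are hypothetical names, and only the two keyword strings are taken from the diff.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical registry of active fault-injection keywords. This is a toy
// replacement for the server's dbug facility, not its real API.
std::set<std::string> active_keywords;

bool keyword_on(const std::string &keyword)
{
  return active_keywords.count(keyword) != 0;
}

// Toy write path: fails only while "simulate_file_write_error" is active.
int toy_flush()
{
  return keyword_on("simulate_file_write_error") ? 1 : 0;
}

// Same shape as the patched branch: the test-level keyword switches the
// low-level keyword on just before the flush and off again right after,
// so no other write in the session sees the injected failure.
int flush_with_injection()
{
  const bool inject=
      keyword_on("simulate_binlog_tmp_file_no_space_left_on_flush");
  if (inject)
    active_keywords.insert("simulate_file_write_error");
  int error= toy_flush();
  if (inject)
    active_keywords.erase("simulate_file_write_error");
  return error;
}
```

Scoping the injected failure to a single call is what lets the test above run a second, successful insert immediately after clearing debug_dbug.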
