
Console View


Jan Lindström
MDEV-30732 : wsrep_store_key_val_for_row() may invoke memcpy() on nullptr

Problem was that row_mysql_read_blob_ref can return NULL
in case when blob datatype is used in a key and its real
value is NULL. This NULL pointer is then used in memcpy
function in wsrep_store_key_val_for_row. However,
memcpy is defined so that argument 2 must not be NULL.

Fixed by adding conditions before memcpy functions so
that argument 2 is always non NULL.

Additional fixes after review
- Removed unnecessary copying key data from one buffer to another.
Use original key data buffer as input and temporary buffer as output.
Extra output buffer is needed because strnxfrm might expand input buffer
contents.
- Removed unnecessary initialization of variables and moved
declarations to where they are first needed.
- Removed unnecessary initialization of temporary buffer because
we already keep track of the actual filled length.
- Removed unnecessary extra call to charset->strnxfrm
Andrei Elkin
Revert "MDEV-37686 rpl.create_or_replace_mix2 fails in MDEV-35915 branch"

This reverts commit 3241798214b066d62ba3274ba5dc29549349ca65.
Due to MDEV-38212.
sjaakola
MDEV-35511: Backport fix for Audit log not reporting user in Galera cluster

Setting a name for the THD::security_ctx:user field for wsrep applier threads.
With this, the audit log events related to wsrep applying will be written in the
audit log.

Using user name <wsrep_applier> for wsrep appliers. This is for having identical
look with async replication, which uses: <replication_slave> for audit logging.

Using same approach as async replication to replace the security_ctx user name
with "system user" for processlist output.

The commit also has an mtr test, galera.MDEV-35511, to verify wsrep applier audit logging.
The test does not install/uninstall audit log plugin, but loads the audit log plugin
before the test. This is because uninstalling the audit log plugin gives a warning
saying that plugin is busy and uninstall will be delayed until server shutdown.
This anomaly must be because of the applier thread being an active audit logger.
The same problem with plugin uninstall also happens with async replication workers.
If the plugin remained installed, the post-test sanity check would complain about a
mismatch between the pre-test and post-test states, and the test execution would fail.
Aquila Macedo
MDEV-38046 Make func_regexp_pcre tolerant to PCRE2 offset change

PCRE2 10.47 reports the invalid escape in 'A\q' at offset 3 instead of 2.
Update the expected result and add a --replace_regex in the test so the
suite passes with both older and newer PCRE2 versions.
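
The idea behind the --replace_regex can be sketched outside of mtr as follows.
This is only an illustration in Python: the warning text and the helper name
normalize_offset are assumptions made for the example, not the actual test
contents.

```python
import re


def normalize_offset(line: str) -> str:
    """Mask the numeric offset so that both PCRE2 behaviours compare equal."""
    return re.sub(r"at offset \d+", "at offset N", line)


# Hypothetical warning lines, as an older and a newer PCRE2 might report them.
older = "Regex error 'unrecognized character follows \\ at offset 2'"
newer = "Regex error 'unrecognized character follows \\ at offset 3'"

same_after_masking = normalize_offset(older) == normalize_offset(newer)
```

With the offset masked, a single expected result file can match the output
of either library version.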
Andrei Elkin
MDEV-37541 Race of rolling back and committing transaction to binlog

Two transactions could be binlogged in the opposite order to how they
complete in the Engine. That is, in rare situations the ROLLBACK in the
Engine of the dependency parent transaction could be scheduled by the
transaction before its binlogging. That would let a dependent child
transaction get binlogged ahead of the parent.

To fix this bug it is necessary to ensure that the binlogging phase is
always the first one in the internal one-phase rollback protocol.

The commit combines
1. a code polishing piece over a part of MDEV-21117 that
  made binlog handlerton always commit first in no-2pc cases and
2. the same rule now applies to the rollback.

An added test demonstrates how the child could otherwise reach binlog
before its parent.
Thirunarayanan Balathandayuthapani
MDEV-37138: Innochecksum fails to handle doublewrite buffer and
multiple file tablespace

Problem:
=======
- innochecksum was incorrectly interpreting doublewrite buffer
pages as index pages, causing confusion about stale tables
in the system tablespace.

- innochecksum fails to parse the multi-file system tablespace

Solution:
========
1. Rewriting the checksum is skipped for doublewrite buffer
pages.

2. Introduced the option --tablespace-flags which can be used
to initialize page size. This option can handle the ibdata2,
ibdata3 etc without parsing ibdata1.

This is a cherry-pick of commit 9f8716ab612ee13a4ecd41dfc70fd89c0add81df
Sergei Golubchik
make innodb_gis.gis, innodb_gis.1, main.gis tests more stable

and fix outdated comments
Denis Protivensky
MDEV-34124: Make sequences work with streaming replication

- extend galera_sequences_transaction test with streaming replication
combinations (it demonstrates the exact results compared to the regular
Wsrep replication)
- remove MDEV-28971 test as it's not applicable after fixing the binlog
statement cache replication with Wsrep
Marko Mäkelä
Merge 10.6 into 10.11
Denis Protivensky
MDEV-34124: Test sequences recovery after crash in Galera cluster
Vlad Lesin
MDEV-36845 InnoDB: Failing assertion: tail.trx_no <= last_trx_no

The scenario of the bug is the following. Before killing the server some
transaction A starts undo log writing in some undo segment U of rseg R.
It writes its trx_id into the undo log header. Then new trx_id is assigned
to transaction B, but undo log hasn't been started yet. Then transaction
A commits and writes trx_no into its undo log header. Transaction B
starts writing undo log into the undo segment U. So we have the
following undo logs in the undo segments U:

... undo log 1...
... undo log 2...
      ...
undo log A, trx_id: L, trx_no: M, ...
undo log B, trx_id: N, trx_no: 0, ...

Where L < N < M.

Then server is killed.

On recovery the maximum trx_no is extracted from each rseg, and the
maximum trx_no among all rsegs plus one is considered as a new value
for server-wide transaction id/no counter.

For each undo segment of each rseg we read the last undo log header. If
the last undo log is committed, then we read trx_no from the header,
otherwise we treat trx_id as trx_no. The maximum trx_no from all undo
log segments of the current rseg is treated as the maximum trx_no of the
rseg.

For the above case the undo log of transaction B is not committed and
its trx_no is 0. So we read trx_id and treat it as trx_no. But N < M. If
U is the last modified undo segment in rseg R, and trx_(id/no) N is the
maximum trx_no among all rsegs, then there can be the case when after
recovery some transaction with trx_no_C, such as N < trx_no_C <= M, is
committed.

During purging we store trx_no of the last parsed undo log of a
committed transaction in purge_sys.tail.trx_no. So if the last parsed
undo log is the undo log of transaction A (transaction B was rolled back
on recovery and its undo log was also removed from the undo segment U),
then purge_sys.tail.trx_no = M. Then if some other transaction C with
trx_no_C <= M is being committed and purged, we hit
"tail.trx_no <= last_trx_no" assertion failure in
purge_sys_t::choose_next_log(), because purge queue is min-heap of
(trx_no, trx_sys.rseg_array index) pairs, where the key is trx_no, and it
must not be that trx_no of the last parsed undo log of a committed
transaction is greater than the last trx_no of the rseg at the top of
the queue.

The fix is to read the trx_no of the previous to last undo log in undo
segment, if the last undo log in that undo segment is not committed, and
set trx_no=max(trx_id of the last undo log, trx_no of the previous to
last undo log) during recovery.

We can do this because we need to extract the maximum
value of trx_no or trx_id of the undo log segment, and the maximum value
is either trx_id of the last undo log or trx_no of the previous to
last undo log, because undo segment can be assigned only to the one
transaction at time, and undo logs in the undo segment are ordered by
trx_id.
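
The recovery rule above can be illustrated with a small Python model. It is
a sketch only: UndoLogHeader and segment_max_trx_no are hypothetical names,
and "trx_no == 0" is used as a simplified stand-in for "not committed".

```python
from typing import List, NamedTuple


class UndoLogHeader(NamedTuple):
    trx_id: int
    trx_no: int  # 0 models an undo log whose transaction was not committed


def segment_max_trx_no(headers: List[UndoLogHeader]) -> int:
    """Largest trx_id/trx_no that can be recovered from one undo segment."""
    if not headers:
        return 0
    last = headers[-1]
    if last.trx_no != 0:
        return last.trx_no  # the last undo log is committed
    # Last undo log is not committed: also look at the previous-to-last one.
    prev_trx_no = headers[-2].trx_no if len(headers) > 1 else 0
    return max(last.trx_id, prev_trx_no)


# The scenario from the description, with L < N < M.
L, N, M = 10, 12, 15
segment_u = [UndoLogHeader(trx_id=L, trx_no=M), UndoLogHeader(trx_id=N, trx_no=0)]

before_fix = segment_u[-1].trx_id           # old rule: treat trx_id as trx_no
after_fix = segment_max_trx_no(segment_u)   # new rule
```

Here before_fix is N, which is smaller than the already used trx_no M, while
after_fix is M, so no later transaction can be assigned a trx_no <= M.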

Reviewed by Marko Mäkelä.
Marko Mäkelä
Merge 10.6 into 10.11
Mohammad El-Shennawy
MDEV-36269: improve error handling for source command

- refactor batch_readline_init to mysql.cc for proper error handling
- add unit test and move it to the end of main/mysql_client_test.test
to resolve embedded test failures
- remove unnecessary code and duplicates to clean up implementation
- redirect error message from stderr to stdout in .test file
- use labels to avoid code duplication
- handle windows check for block device
- on Windows, distinguish a file that fails to open because it is a
directory from any other failure, for a clear error message
Yuchen Pei
remove SELECT_CHECK

looks unused
Marko Mäkelä
MDEV-35810 test case fixup

A KILL QUERY of SET GLOBAL innodb_log_file_size would sometimes lead to
ER_QUERY_INTERRUPTED being reported.
Jan Lindström
MDEV-36528 : Test failure on galera.mdev-22543

Test case changes only. Replace sleep with wait_condition
that makes sure that node_1 is still Donor after
node_2 has requested SST and is actually stuck on
debug sync wait.
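
The difference between a fixed sleep and waiting for a condition can be
sketched in Python. This is not the mtr wait_condition include file; the
function below and the node_is_donor predicate are hypothetical stand-ins.

```python
import time


def wait_condition(predicate, timeout=30.0, interval=0.1):
    """Poll predicate() until it is true or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Hypothetical state check that becomes true on the third poll.
state = {"polls": 0}


def node_is_donor():
    state["polls"] += 1
    return state["polls"] >= 3


reached = wait_condition(node_is_donor, timeout=5.0, interval=0.01)
```

Unlike a sleep, the wait returns as soon as the expected state is observed
and reports a failure when it is never reached.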
Marko Mäkelä
MDEV-38026 Recovery of FILE_CREATE fails to create a file

fil_ibd_create(): Add a DEBUG_SYNC point for the test case.

fil_node_open_file_low(): If node->deferred is set, set the
OS_FILE_ON_ERROR_SILENT flag on OS_FILE_OPEN and attempt
OS_FILE_CREATE if needed. If this fails, then InnoDB will
refuse to start up, giving the operator a chance to resolve
the situation, for example by freeing up some space in the
file system.

recv_validate_tablespace(): Invoke deferred_spaces.add() on
any missing tablespace for which we know the LSN of the FILE_CREATE
record. In this way, fil_node_open_file_low() will end up being invoked
on files that are supposed to be created.

fil_name_process(): For FILE_CREATE, remember the create_lsn.

recv_sys_t::parse(): Pass FILE_CREATE to fil_name_process().

Some existing tests have been adjusted for the improved recovery of
file creation.

Reviewed by: Thirunarayanan Balathandayuthapani
Tested by: Saahil Alam
Vlad Lesin
MDEV-37755 fil_space_t::drop() doesn't remove space from fil_system.named_spaces

mtr.commit_file() call in fil_space_t::drop() removes space from
fil_system.named_spaces, but then the space can be inserted into the
container again by another thread while fil_space_t::drop() is
waiting for pending operations to finish.

The fix is to check and remove a space from fil_system.named_spaces
after all pending operations on the space are finished. Also the ut_d()
macro is removed for space->max_lsn=0 assignments to avoid repeated
removal of the space from fil_system.named_spaces.

There is an error in ilist::pop_back(). ilist::end() returns sentinel,
and the pop_back() removes sentinel from the list instead of the last
element. The error is fixed in this commit.
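
The pop_back() problem can be shown on a minimal circular list with a
sentinel node. This Python sketch is not the ilist code; the class and
method names are chosen only for the illustration.

```python
class Node:
    def __init__(self, value=None):
        self.value = value
        self.prev = self
        self.next = self


class IntrusiveList:
    """Circular doubly linked list; the sentinel marks end()."""

    def __init__(self):
        self.sentinel = Node()

    def end(self):
        return self.sentinel

    def push_back(self, value):
        node = Node(value)
        last = self.sentinel.prev
        last.next = node
        node.prev = last
        node.next = self.sentinel
        self.sentinel.prev = node

    def _unlink(self, node):
        node.prev.next = node.next
        node.next.prev = node.prev
        node.prev = node.next = node

    def pop_back(self):
        # Correct: remove the element *before* end(), never end() itself.
        last = self.end().prev
        if last is self.sentinel:
            raise IndexError("pop_back() on an empty list")
        self._unlink(last)
        return last.value

    def values(self):
        out = []
        node = self.sentinel.next
        while node is not self.sentinel:
            out.append(node.value)
            node = node.next
        return out


spaces = IntrusiveList()
for name in ("a", "b", "c"):
    spaces.push_back(name)
popped = spaces.pop_back()
```

Unlinking end() instead of end().prev would leave all elements in place and
detach the sentinel, which is the kind of error described above.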

Reviewed by Marko Mäkelä
Sergei Golubchik
MDEV-38005 Assertion `(yyvsp[-3].simple_string) < (yyvsp[-1].simple_string)' failed

relax assertion to account for a query being killed in the parser
Monty
MDEV-32266 All queries in stored procedures increment empty_queries counter

Fixed by setting server_status SERVER_STATUS_RETURNED_ROW if send_data
is called for stored procedures.

This makes the definition of Empty_queries well defined:
"Empty_queries" is the number of SELECT queries that return 0 rows.
Daniel Black
MDEV-38137 s3.cnf still suggests changing plugin-maturity to alpha

S3 became stable in a49f5525bbe1. Adjust the configuration file
not to require a low plugin-maturity setting.

Thanks Mike Griffin for the bug report.
Sergey Vojtovich
MDEV-13257 - main.kill-2 failed in buildbot

Test output was affected by incompletely closed preceding connections.

Make test agnostic to concurrent connections by querying
information_schema.processlist only for connections that it uses.

Avoid querying for i_s.processlist db column. It is unstable due to
trylock_short(), can be "" if concurrent connection is holding
LOCK_thd_data.
Thirunarayanan Balathandayuthapani
MDEV-38041: MariaBackup fails during rollback of inplace FTS alter table

Problem:
========
When an inplace ALTER operation is rolled back, InnoDB drops
intermediate tables and their associated FTS internal tables.
However, MariaBackup's DDL tracking can incorrectly report
this as a backup failure.

The issue occurs because backup_set_alter_copy_lock() downgrades the
MDL_BACKUP_DDL lock before the inplace phase of ALTER,
allowing FTS internal tables to be dropped during the
later phases of backup when DDL tracking is still active.

Solution:
========
backup_file_op_fail(): Ignore delete operations on FTS internal
tables when not using --no-lock option, preventing false
positive backup failures.
Marko Mäkelä
Merge 10.6 into 10.11
Oleksandr Byelkin
Merge branch '10.11' into mariadb-10.11.15
Vladislav Vaintroub
MDEV-38059 - skip sp-bugs2 test, if compiled without perfschema
Marko Mäkelä
MDEV-38069 Heavy contention on buf_pool.flush_list_mutex

buf_do_flush_list_batch(): Release and reacquire buf_pool.flush_list_mutex
after every 32 iterations, similar to how buf_flush_LRU_list_batch()
releases buf_pool.mutex ever since
commit 27ff972be22880a4046652bc94c2f97fffb456c9 (MDEV-26827 fixup).
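
The pattern of periodically releasing and reacquiring a mutex in a long
batch can be sketched in Python. The function flush_batch and its
parameters are hypothetical; only the batch size of 32 comes from the
description above.

```python
import threading

BATCH = 32  # iterations between lock releases, as in the description


def flush_batch(pages, lock, flush_page):
    """Process pages under the lock, yielding the lock every BATCH items."""
    releases = 0
    lock.acquire()
    try:
        for i, page in enumerate(pages, start=1):
            flush_page(page)
            if i % BATCH == 0:
                # Give waiting threads a chance to take the lock.
                lock.release()
                releases += 1
                lock.acquire()
    finally:
        lock.release()
    return releases


flushed = []
mutex = threading.Lock()
yield_count = flush_batch(range(100), mutex, flushed.append)
```

For 100 pages the lock is released three times inside the loop (after
iterations 32, 64 and 96), so other threads are never blocked for more than
one batch.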

This regression was introduced in
commit 22b62edaedddb1cabd5b855cdd39a5e90a5695a2 (MDEV-25113)
and made more prominent by the recent
commit a7f0d79f8c1fc66005401eae8de258dfe8c0a219 (MDEV-35155).

Reviewed by: Thirunarayanan Balathandayuthapani
Tested by: Saahil Alam
Tested by: Rahul Raj
Denis Protivensky
MDEV-38073: Always use tx_read_only=false for Wsrep system threads

Apparently, there was a separate issue with applier thread variables:
wsrep_plugins_post_init() would overwrite thd->variables for every
applier thread and forget to restore proper read only context.
Then, upon every server transaction termination,
trans_reset_one_shot_statistics() would set thread's read only context
to the one stored in thd->variables, thus spoiling the read only
context value for appliers.
Denis Protivensky
MDEV-34124: Improve sequences replication with Galera

- use shared key for sequence update certification
- employ native replication's code to apply changes for sequences
which handles all corner cases properly
- fix the tests to allow more transactions using sequences to be
accepted

That way the sequence is always updated to the maximum value
independent of the order of updates, and shared certification keys
allow to improve acceptance ratio of concurrent transactions that
use sequences. It's reflected in the test changes.
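
Why the order of updates stops mattering can be seen in a small Python
model. It assumes an ascending sequence whose applied state is simply the
maximum value seen so far; apply_sequence_update is a hypothetical name.

```python
from itertools import permutations


def apply_sequence_update(current: int, replicated_value: int) -> int:
    """Apply a replicated sequence value by keeping the maximum."""
    return max(current, replicated_value)


updates = [103, 101, 102]
final_values = set()
for order in permutations(updates):
    value = 100
    for update in order:
        value = apply_sequence_update(value, update)
    final_values.add(value)
```

All six application orders end at the same value, so concurrent
transactions that only advance the sequence do not need to conflict.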
Jan Lindström
MDEV-37981 : Test failure on galera.galera_temporary_sequences

Test case changes only. If DDL is made using RSU in only one
node, then all objects also need to be dropped using RSU. Additionally,
added wait_conditions in the second test case to verify that DDL and
DML have happened on the second node before accessing them.

Also run MDEV-30764 separately and force restart before test
using --force-restart and different config because it does kill
whole cluster and this could disturb next test case run by same
worker.
Andrei Elkin
MDEV-37541 Race of rolling back and committing transaction to binlog

Two transactions could be binlogged in the opposite order to how they
complete in the Engine. That is, in rare situations the ROLLBACK in the
Engine of the dependency parent transaction could be scheduled by the
transaction before its binlogging. That would let a dependent child
transaction get binlogged ahead of the parent.

To fix this bug it is necessary to ensure that the binlogging phase is
always the first one in the internal one-phase rollback protocol.

The commit makes sure the binlog handlerton always rolls back as the
first handlerton in no-2pc cases.

An added test demonstrates how the child could otherwise reach binlog
before its parent.
Denis Protivensky
MDEV-38073: Unify setting variables for Wsrep schema threads

Use common variable initialization for all Wsrep schema threads.
Simplify the code for replay & recovery of SR transactions.
Andrei Elkin
Merge commit '2fd25d77f031f48f501344b5d77aeea62b42da88' into
bb-10.11-release, which will be replaced by a 10.11 specific one.
Jan Lindström
MDEV-38201 : Assertion `level != Sql_condition::WARN_LEVEL_ERROR' failed in void push_warning(THD*, Sql_state_errno_level::enum_warning_level, uint, const char*)

Problem was that wrong level of Sql_condition was used on
push_warning_printf and error handling of REFRESH_HOSTS
(and similar) was broken.

Fixed warning printing in wsrep_TOI_begin after enter_toi_local
is called. Fixed also error handling after REFRESH_HOSTS (and others)
if TOI begin failed.
Jan Lindström
Galera library 26.4.25 contains gcs protocol change 5-->6
Marko Mäkelä
MDEV-37994 Race condition between checkpoint and .ibd file creation

It was possible that a log checkpoint was completed and the server killed
between the time that fil_ibd_create() durably wrote a FILE_CREATE record,
and the initialization of the tablespace. This could lead to a failure
to start up after the server was killed during TRUNCATE TABLE or any
table-rebuilding operation such as OPTIMIZE TABLE.

In the case of TRUNCATE TABLE, an attempt to rename a file #sql-ibNNN.ibd
(the contents of the table before truncation) to tablename.ibd would fail,
because both files existed and the file tablename.ibd would have been
filled with NUL bytes. It was possible to resume from this error by
deleting the file tablename.ibd and restarting the server.

We will prevent this class of errors by ensuring that both the FILE_CREATE
record and the records written by fsp_header_init() will be part of the
same atomic transaction, which must be durably written before any file
is created or allocated.

NOTE: There is another possible crash recovery problem, which we are not
attempting to solve here and which will be covered by the subsequent
change (MDEV-38026). If fil_ibd_create() fails to create the file and
the server were killed, recovery would not even attempt to create the file
at all.

fil_space_t::create(): Remove the DBUG_EXECUTE_IF fault injection that
was the only cause of return nullptr. This allows us to simplify
several callers.

fil_space_t::set_stopped(), fil_space_t::clear_stopped(): Accessor
functions for fil_ibd_create() for preventing any concurrent access
to an incompletely created tablespace.

fil_ibd_create(): In a single atomic mini-transaction, write the
FILE_CREATE record as well as the log for initializing the tablespace.
After durably writing the log, create the file in the file system.
Only after the file has been successfully created and allocated,
open the tablespace for business. Finally, release the exclusive page
latches so that the header pages may be written to the file.

fil_ibd_open(): Move some fault injection from fil_space_t::create()
to a higher level, to the place where an existing file is being opened.

Reviewed by: Thirunarayanan Balathandayuthapani
Tested by: Saahil Alam
Brandon Nesterenko
MDEV-37662: Binlog Corruption When tmpdir is Full

The binary log could be corrupted when committing a large transaction
(i.e. one whose data exceeds the binlog_cache_size limit and spills
into a tmp file) in binlog_format=row if the server's --tmpdir is
full. The corruption is that only the GTID of the errored
transaction would be written into the binary log, without any
body/finalizing events.  This would happen because the content of the
transaction wasn't flushed at the proper time, and the transaction's
binlog cache data was not durable while trying to copy the content
from the binlog cache file into the binary log itself. While switching
the tmp file from a WRITE_CACHE to a READ_CACHE, the server would see
there is still data to flush in the cache, and first try to flush it.
This is not a valid time to flush that data to the temporary file
though, as:

  1. The GTID event has already been written directly to the binary
    log. So if this flushing fails, it leaves the binary log in a
    corrupted state.

  2. This is done during group commit, and will slow down other
    concurrent transactions, which are otherwise ready to commit.

This patch fixes these issues by ensuring all transaction data is
fully flushed to its temporary file (if used) before starting any
critical paths, i.e. in binlog_flush_cache(). Note that if the binlog
cache is solely in-memory, this flush-to-temporary-file is skipped.
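
The effect of the ordering change can be modelled in a few lines of Python.
This is a toy model, not the server code: TrxCache, TmpDirFull, commit and
the event strings are all hypothetical.

```python
class TmpDirFull(Exception):
    pass


class TrxCache:
    """Toy transaction cache that spills pending events to a temp file."""

    def __init__(self, events, tmp_has_space):
        self.pending = list(events)
        self.spilled = []
        self.tmp_has_space = tmp_has_space

    def flush_to_tmp_file(self):
        if not self.tmp_has_space:
            raise TmpDirFull("no space left in tmpdir")
        self.spilled.extend(self.pending)
        self.pending.clear()


def commit(binlog, gtid, cache, flush_first):
    if flush_first:
        cache.flush_to_tmp_file()  # may fail, but the binlog is untouched
    binlog.append(gtid)
    if not flush_first:
        cache.flush_to_tmp_file()  # a failure here leaves a lone GTID
    binlog.extend(cache.spilled)
    binlog.append("COMMIT")


def try_commit(flush_first, tmp_has_space=False):
    binlog = []
    try:
        commit(binlog, "GTID 0-1-1", TrxCache(["row1", "row2"], tmp_has_space), flush_first)
    except TmpDirFull:
        pass
    return binlog


late_flush = try_commit(flush_first=False)
early_flush = try_commit(flush_first=True)
```

With the late flush the failed transaction leaves only its GTID in the log;
with the early flush the failure happens before anything is written.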

Reviewed-by: Andrei Elkin <[email protected]>
Signed-off-by: Brandon Nesterenko <[email protected]>
Pekka Lampio
MDEV-31517 Wrong variable name in the configuration leads Galera to
          think SST/IST failed, at next restart will request a full SST

This patch fixes an unwanted behavior of a Galera cluster node when
Server startup fails because of an error in configuration file: after
the failure full SST is requested at the next Server startup even
though full SST is not needed (MDEV-31517).

If Server startup fails because of a configuration error, this patch
ensures that Galera state of the failing node remains unchanged. This
avoids full SST at the next Server restart.

This fix consists of three patches for the following components:

1) Server,
2) WSREP library,
3) Galera.
Jan Lindström
MDEV-36909 : Assertion `client_state.transaction().active()' failed in int wsrep_thd_append_key(THD*, const wsrep_key*, int, Wsrep_service_key_type)

Problem was that when a trigger was executed it accessed a
non-transactional table using the Aria engine. To support
simple DML for Aria we use TOI, and before TOI was
started the existing wsrep transaction was rolled back.
In the following operation on a transactional
engine in the same statement the wsrep transaction is not
active anymore, leading to the assertion.

However, rolling back is incorrect if there is an active
wsrep transaction that has done changes. Instead
we should refuse the statement if transactional commit
is not supported.
Denis Protivensky
MDEV-34124: Fix streaming replication offset for binlog stmt cache

As the binlog statement cache is only replicated with the last
fragment, it's safe to pass zero offset instead of the stored
log position, which is used only for the binlog transaction cache.