
Console View


Daniel Black
MDEV-39881 mroonga set, but not used variables: set_cursor_rk

len set but not used. Just remove len.
Daniel Black
MDEV-39881 mroonga set, but not used variables: grn_plugin_path

groonga/lib/plugin.c:118:7: warning: variable 'value_size' set but not used [-Wunused-but-set-variable]
  118 |  int value_size;
      |      ^

value_size would be 0 if the retrieval of the variable
failed. Exit when value_size is 0, which also makes the
variable used.
Thirunarayanan Balathandayuthapani
MDEV-34358  Encryption threads consume CPU and deadlock with DROP TABLE/purge

Problem:
========
1. Encryption threads busy-wait when no work is available:

When reaching fil_system.space_list.end(), fil_crypt_return_iops() is called
with wake=true, causing pthread_cond_broadcast() to wake all threads
unnecessarily, leading to CPU waste.

2. Tablespaces with CLOSING/STOPPING flags skipped during iteration:

Since DDL completion doesn't wake encryption threads, these spaces may never
be encrypted if threads sleep indefinitely.

3. For default_encrypt_list iteration, when spaces exist but none are
acquirable, threads need to wake others for cooperative retry, but this
case was not distinguished from fil_system.space_list.end().

4. IOPS are allocated before searching for tablespaces, wasting resources
during iteration when no I/O occurs.

5. Encryption threads use fil_crypt_threads_cond for two different purposes:
waiting for encryption work and waiting for IOPS allocation. When
fil_crypt_return_iops() or fil_crypt_realloc_iops() broadcasts after
releasing IOPS, it wakes ALL threads including those correctly waiting
for work, causing spurious wakeups and CPU waste.

6. When innodb_encrypt_tables or innodb_encryption_rotate_key_age is changed
during encryption thread iteration, threads continue with stale configuration
values, potentially missing tablespaces that should be encrypted or rotated
under the new settings.

7. The InnoDB encryption thread could deadlock with DROP TABLE and purge
operations in a three-way deadlock scenario:

- DROP TABLE thread holds lock_sys.latch and waits for the tablespace
pending reference count to reach zero before dropping the space.

- Encryption thread holds a tablespace reference and waits to acquire
the tablespace allocation latch in exclusive mode to call
fseg_page_is_allocated() for checking if a page is allocated before
encrypting it.

- Purge coordinator thread holds the tablespace allocation latch
in exclusive mode and waits to acquire lock_sys.latch
in shared mode for record lock operations.

This creates a circular dependency and leads to deadlock.

Solution:
=========
1. Implement timed wait with exponential backoff:

When space == fil_system.space_list.end() (applies to both default_encrypt_list
and space_list iteration when no acquirable spaces are found):
- First timeout: 5 seconds
- Subsequent timeouts: (timed_wait_count + 1) * 5 seconds (10s, 20s, 40s, 60s)
- After 5 consecutive timeouts (~135 seconds total), continue with 60-second
timed waits to ensure threads periodically recheck for state changes
- Timeout counter resets to 0 when woken by signal or when work is found
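
The timeout schedule above can be sketched as a small helper. This is a minimal illustration, not the committed code: `backoff_seconds` is a hypothetical name, and the sketch follows the listed values (5, 10, 20, 40, 60 seconds, which sum to 135) by assuming doubling with a 60-second cap.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: length of the next timed wait in seconds, given the
// number of consecutive timeouts already seen (0 means the first wait).
// Assumption: the wait doubles from 5 seconds and is capped at 60 seconds.
inline unsigned backoff_seconds(uint8_t timed_wait_count)
{
  const unsigned base = 5, cap = 60;
  // Clamp the shift so the doubling can never overflow.
  const unsigned shift = std::min<unsigned>(timed_wait_count, 4);
  return std::min(base << shift, cap);
}
```

A caller would reset its counter to 0 when woken by a signal or when work is found, and otherwise increment it after each timeout.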

2. Move IOPS allocation from before tablespace search to after finding a space
that needs rotation. If allocation fails, set recheck=true to skip waiting
and immediately try next space.

Encryption threads would hold space references while waiting in
fil_crypt_alloc_iops(), blocking DROP TABLE. To prevent this,
encryption threads follow a release-wait-reacquire pattern.

fil_crypt_alloc_iops(): Added nowait parameter (default false). When
nowait=true, returns immediately if IOPS not available instead of waiting.

fil_crypt_thread(): Try non-blocking IOPS allocation first with nowait=true.
Only if IOPS not immediately available:
- Save space ID and release space reference
- Wait for IOPS with nowait=false
- Reacquire space using fil_space_get_by_id() and space->acquire()
- If space dropped or stopping, release IOPS and skip

This ensures encryption threads never hold space references while waiting
for IOPS, allowing DROP TABLE operations to proceed without deadlock.
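
The release-wait-reacquire pattern can be sketched with toy types. Everything here (`IopsPool`, `Space`, `alloc_iops_for_space`) is a hypothetical stand-in rather than the InnoDB code, and the sketch keeps a direct reference to the space instead of looking it up again by ID.

```cpp
#include <condition_variable>
#include <mutex>

// Toy IOPS pool, standing in for the real allocator.
struct IopsPool {
  std::mutex m;
  std::condition_variable cond;
  unsigned available;
  explicit IopsPool(unsigned n) : available(n) {}

  // Mirrors the described nowait parameter: with nowait=true, fail fast.
  bool alloc(bool nowait = false) {
    std::unique_lock<std::mutex> lk(m);
    if (available == 0) {
      if (nowait) return false;
      cond.wait(lk, [this] { return available > 0; });
    }
    --available;
    return true;
  }
  void release() {
    { std::lock_guard<std::mutex> lk(m); ++available; }
    cond.notify_one();
  }
};

// Toy tablespace with a reference count (not thread-safe, for brevity).
struct Space {
  unsigned id = 0;
  int refs = 0;
  bool stopping = false;
  bool acquire() { if (stopping) return false; ++refs; return true; }
  void release() { --refs; }
};

// The caller holds one reference on entry. Returns true when the caller
// ends up holding both a space reference and one unit of IOPS.
inline bool alloc_iops_for_space(IopsPool &pool, Space &space) {
  if (pool.alloc(true)) return true;  // fast path: reference is kept
  space.release();                    // never wait while holding it
  pool.alloc(false);                  // blocking wait for IOPS
  if (!space.acquire()) {             // space is stopping or was dropped
    pool.release();
    return false;
  }
  return true;
}
```

Because the reference is dropped before the blocking wait, a thread waiting for the reference count to reach zero is not held up by a thread that is only waiting for IOPS.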

3. Introduce separate condition variable fil_crypt_iops_cond specifically for
IOPS allocation synchronization to prevent spurious wakeups:

- fil_crypt_threads_cond: Used in wait_for_work() for waiting when no
tablespaces need encryption. Signaled when settings change, new
tablespaces are created, or thread count changes.

- fil_crypt_iops_cond: Used in fil_crypt_alloc_iops() for waiting
when IOPS limit is reached. Signaled when IOPS are returned via
fil_crypt_return_iops(), released via fil_crypt_realloc_iops(), or
when srv_n_fil_crypt_iops is increased.

4. Added atomic version counter fil_crypt_settings_version that is incremented
whenever innodb_encrypt_tables or innodb_encryption_rotate_key_age changes.
Encryption threads capture the version at iteration start and check for
changes during iteration. If config changed, threads immediately restart
iteration from the beginning to ensure complete coverage with new settings.

New fields:
- fil_crypt_settings_version: Atomic counter to track configuration changes
- rotate_thread_t::timed_wait_count (uint8_t): Counts consecutive timeouts
  for exponential backoff
- rotate_thread_t::wait_for_work(): Implements timed/indefinite wait strategy
- rotate_thread_t::settings_version: Compares with fil_crypt_settings_version
  to restart encryption from the beginning
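
The version-counter idea can be sketched as follows. This is an illustration under assumptions: `settings_version`, `on_setting_changed` and `iterate_with_restart` are hypothetical names, and a vector of integers stands in for the tablespace list.

```cpp
#include <atomic>
#include <cstdint>
#include <vector>

// Hypothetical global counter, bumped whenever a relevant setting changes.
static std::atomic<uint64_t> settings_version{0};

inline void on_setting_changed() {
  settings_version.fetch_add(1, std::memory_order_release);
}

// Visits every item. If the version changes during a pass, the pass is
// abandoned and restarted from the beginning. Returns the restart count.
template <typename Visit>
unsigned iterate_with_restart(const std::vector<int> &items, Visit visit) {
  unsigned restarts = 0;
  for (;;) {
    // Capture the version at the start of the pass.
    const uint64_t seen = settings_version.load(std::memory_order_acquire);
    bool changed = false;
    for (int item : items) {
      visit(item);
      if (settings_version.load(std::memory_order_acquire) != seen) {
        changed = true;
        break;
      }
    }
    if (!changed) return restarts;
    ++restarts;
  }
}
```

Comparing a counter is cheaper than comparing every setting, and it also catches a value that was changed and then changed back during the pass.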

5. Fix three-way deadlock with trylock mechanism:
fseg_page_is_allocated(): Added optional caller_mtr parameter. When the
encryption thread passes an mtr, the allocation bitmap page latch is
held in that mtr until the caller commits it.

fil_crypt_get_page_throttle(): Use fil_space_t::x_lock_try() to acquire
the tablespace allocation latch non-blockingly. If the trylock fails,
back off and retry the same page to avoid deadlock. The bitmap page
S-latch is added to the mtr and held throughout the page read and
encryption operations. Release the tablespace exclusive latch immediately
after the allocation check to minimize contention, while the bitmap page
S-latch remains held in the mtr.
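
The trylock-and-back-off idea can be sketched with a standard mutex. This is a sketch only: `check_with_trylock` is a hypothetical helper, `std::mutex` stands in for the tablespace allocation latch, and the 1 ms pause and the attempt limit are assumptions.

```cpp
#include <chrono>
#include <mutex>
#include <thread>

// Hypothetical helper: try to take the latch without blocking. On failure,
// back off briefly and retry, up to max_attempts. Returns true if the
// check ran under the latch.
template <typename Check>
bool check_with_trylock(std::mutex &latch, Check check, int max_attempts) {
  for (int attempt = 0; attempt < max_attempts; ++attempt) {
    {
      std::unique_lock<std::mutex> lk(latch, std::try_to_lock);
      if (lk.owns_lock()) {
        check();      // e.g. the allocation check
        return true;  // latch is released as lk goes out of scope
      }
    }
    // Never block on the latch: sleep, then try again.
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
  }
  return false;
}
```

Since the thread never blocks on the latch, it cannot be the waiting edge that closes a lock cycle.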

fil_crypt_rotate_page(): After 10 retries on page latch acquisition,
release IOPS, sleep 10ms, then try to reacquire IOPS with nowait=true.
If IOPS not immediately available, exit gracefully by setting first=true.
This periodic IOPS release allows other encryption threads working on
tablespaces being dropped to make progress and release their space
references, helping to break the deadlock scenario.

rotate_state.aborted: Set under crypt_data->mutex when a
thread bails out in fil_crypt_rotate_pages(). should_flush in
fil_crypt_complete_rotate_space() is gated on !aborted so that a
partial pass cannot commit min_key_version. The flag is cleared
when the next pass is initialized.
Daniel Black
MDEV-39881: mroonga set, but not used variables - grn_geo_meshes_for_circle

groonga/lib/geo.c:475:15: warning: variable 'n_sub_meshes' set but not used [-Wunused-but-set-variable]
  475 |    int i, j, n_sub_meshes, lat, lat_min, lat_max, lng, lng_min, lng_max;
      |              ^

n_sub_meshes only valid under defined(GEO_DEBUG) so restrict
accordingly.
Daniel Black
MDEV-39881: mroonga set, but not used variables: grn_io_flush

groonga/lib/io.c:1472:16: warning: variable 'nth_file_info' set but not used [-Wunused-but-set-variable]
1472 |      uint32_t nth_file_info;
      |                ^

This is used for the GRN_MSYNC() macro, which only uses its
file handle argument on Windows. Replace inline to avoid
the warning.

Don't include grn_io_compute_nth_file_info except for Windows.
Arcadiy Ivanov
MDEV-37977 InnoDB deadlock report incorrectly reports rolled back transaction number

The "WE ROLL BACK TRANSACTION (N)" message in the deadlock report
referred to the wrong transaction number. The victim selection loop
and the display loop in `Deadlock::report()` traversed the cycle in
the same order (`cycle->wait_trx, ..., cycle`) but used misaligned
position numbering:

- **Victim selection** initialized `victim = cycle` at position 1
  *before* the loop, then started iterating from `cycle->wait_trx`
  at position 2.
- **Display loop** started from `cycle->wait_trx` at label `(1)`,
  with `cycle` displayed last at label `(N)`.

This caused `victim_pos` to be off by one relative to the displayed
transaction labels.

Fix: restructure the victim selection loop to start with `l=0` and
`victim=nullptr`, letting the loop handle all transactions uniformly.
The first iteration unconditionally picks `cycle->wait_trx` as the
initial victim at position 1, matching the display loop. The
`thd_deadlock_victim_preference()` call is guarded with a
`victim != nullptr` check to skip it on the first iteration (where
no prior victim exists to compare against).
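
The restructured selection loop can be sketched with a toy transaction type. This is an illustration, not the `Deadlock::report()` code: `Trx`, `weight` and `pick_victim` are hypothetical, and a plain weight comparison stands in for the real victim preference logic.

```cpp
// Toy transaction: wait_trx points at the next transaction in the cycle.
struct Trx {
  int weight;     // hypothetical cost of rolling this transaction back
  Trx *wait_trx;
};

// Walks cycle->wait_trx, ..., cycle and numbers positions from 1, the same
// way the display loop labels them. Returns the victim's position.
inline unsigned pick_victim(Trx *cycle, Trx **victim_out) {
  Trx *victim = nullptr;
  unsigned victim_pos = 0, l = 0;
  Trx *next = cycle;
  do {
    next = next->wait_trx;
    ++l;
    // The first iteration picks unconditionally because no prior victim
    // exists; later iterations compare against the current victim.
    if (victim == nullptr || next->weight < victim->weight) {
      victim = next;
      victim_pos = l;
    }
  } while (next != cycle);
  *victim_out = victim;
  return victim_pos;
}
```

With a cycle a -> b -> c -> a and `cycle` pointing at a, the positions are b = 1, c = 2 and a = 3, matching a display that prints `cycle` last.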
Jan Lindström
MDEV-38870 : Galera test failure on galera.MDEV-38201

Test changes only. Moved the wait condition to the point where the
node should disconnect from the cluster because it has become
inconsistent. After that, depending on timing, the next FLUSH HOST
could return a not-supported error or a timeout.
Arcadiy Ivanov
Rename `mdev_37977` test to `deadlock_report`

Use a descriptive area-based test name instead of a ticket number.
Reorder `--source` includes before the per-test header comment.
Daniel Black
MDEV-39881 mroonga set, but not used variables: grn_ii_buffer_check

crid unused, so removed.
Oleksandr Byelkin
Fix MDEV-39207 test (skip if ha_example is not built)
Jan Lindström
MDEV-38869 : Galera test failure on galera.galera_sequences_recovery

Stabilize test using wait_conditions and forcing InnoDB checkpoint
before intentionally crashing the server.
Pekka Lampio
MDEV-38386 Fix incomplete cleanup in Galera MTR tests failing under --repeat

A number of Galera MTR tests pass on the first run but fail on a second
--repeat iteration, because server, cluster or filesystem state leaks
across runs and the test does not restore a clean starting state.

Fix the cleanup (or force a fresh cluster) in the affected tests. Each
fix was verified with --repeat=2 --force.

1. Stale async-slave GTID position (11 tests)

  RESET SLAVE [ALL] does not clear gtid_slave_pos. As the master does
  RESET MASTER in cleanup, on the next run the slave considers the
  events already applied and skips them, so the replicated tables never
  appear. Clear the position with SET GLOBAL gtid_slave_pos = "".

2. Leftover binlog GTID state from trailing cleanup (1 test)

  Trailing DROP TABLE / mtr.add_suppression statements ran after the
  .inc's reset master and re-populated node_2's binlog. gtid_binlog_state
  keeps the latest seqno per (domain, server_id) pair, so a stray
  0-2-<n> survived into the next run and broke the state comparison.
  Reorder the cleanup and reset node_2's binlog last.

3. Cluster-global, one-time or time-window state (11 tests)

  The wsrep GTID domain seqno is cluster-global and is not reset by
  reset master (nor by a mid-test SST rejoin); error-log contents,
  warning-flood suppression timers and one-time bootstrap behaviour are
  likewise not restored by in-test cleanup. Force a fresh cluster with
  include/force_restart.inc.

4. Leftover filesystem artifacts (1 test)

  mariabackup refuses to back up into a non-empty target directory, so
  the leftover target dirs from the previous run made the backup fail
  silently and the expected log messages never appeared. Remove the
  target directories in cleanup.
Daniel Black
MDEV-39881 mroonga set, but not used variables: grn_text_otoj

i set but not used.

Looked like a loop variable, but wasn't used.

Replaced with a simpler while construct.
Dave Gosselin
MDEV-39207:  mark test as not_embedded

Test fails on embedded CI because original not_embedded flag was not preserved
Daniel Black
MDEV-39881 mroonga set, but not used variables: command_schema

Stack frame too big. Suppress for now.

groonga/lib/proc/proc_schema.c:1209:1: error: stack frame size (77688) exceeds limit (49152) in 'command_schema' [-Werror,-Wframe-larger-than]
1209 | command_schema(grn_ctx *ctx, int nargs, grn_obj **args, grn_user_data *user_data)
      | ^
1 error generated.
Daniel Black
MDEV-31209 Queries with window functions do not obey KILL / max_statement_time

Window functions run in a loop in
Frame_cursor::compute_values_for_current_row which can include a large
number of rows.

Adjust this function to check whether the current thd has been killed
only every 256 rows, so as not to destroy any CPU pipelining or similar.
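
A periodic kill check of this kind can be sketched as below. This is a minimal illustration: `process_rows` and the `is_killed` callback are hypothetical stand-ins, not the `Frame_cursor` code.

```cpp
#include <cstdint>

// Processes up to n_rows rows and polls is_killed() once every 256 rows.
// Returns the number of rows processed before stopping.
template <typename IsKilled>
uint64_t process_rows(uint64_t n_rows, IsKilled is_killed) {
  uint64_t done = 0;
  for (uint64_t row = 0; row < n_rows; ++row) {
    // The low 8 bits are zero once every 256 rows, so the check stays
    // out of the hot path for the other 255 iterations.
    if ((row & 0xFF) == 0 && is_killed())
      break;
    ++done;  // stand-in for the per-row work
  }
  return done;
}
```

The trade-off is latency: a kill request is noticed at most 255 rows late.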
Daniel Black
MDEV-39881 mroonga set, but not used variables: grn_output_result_set_open_v1

i set but not used.

'i' was used like a loop variable, but without
a terminating condition. Replace with while loop.
Daniel Black
MDEV-39881 mroonga set, but not used variables: grn_ii_buffer_check

size set not used. Remove size.
Andreas Schwab
MDEV-39925: Fix error reporting in create_libaio

The io_setup function in libaio returns a negated errno value on error,
but strerror expects a normal errno value.
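
The convention can be sketched without linking against libaio. `describe_aio_error` is a hypothetical helper, and `ret` is assumed to follow the described convention of 0 on success and a negated errno value on error.

```cpp
#include <cerrno>
#include <cstring>
#include <string>

// ret: a return value in the negated-errno convention described above.
inline std::string describe_aio_error(int ret) {
  // strerror() expects a positive errno value, so negate before the lookup.
  return ret < 0 ? std::strerror(-ret) : "success";
}
```

Passing the negative value straight to `strerror` would typically yield an "unknown error" message instead of the real cause.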
Daniel Black
MDEV-39881 mroonga set, but not used variables: grn_ts_expr_bridge_node_filter

tmp unused, so removed.
Fariha Shaikh
MDEV-39928 Fix GitLab CI centos9 job failure

The centos9 job uses yum-builddep -y mariadb-server to install build
dependencies, but the mariadb-server source package has been removed
from CentOS Stream 9 repositories. Replace with explicit installation of
the required build dependencies.

All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer Amazon Web
Services, Inc.
Daniel Black
MDEV-39881: mroonga set, but not used variables - grn_config_cursor_get_value

storage/mroonga/vendor/groonga/lib/config.c:259:12: warning: variable 'value_size_raw' set but not used [-Wunused-but-set-variable]
  259 |  uint32_t value_size_raw;
      |            ^
1 warning generated.

The return value of grn_config_cursor_get_value isn't checked by
most callers.

The value_size_raw is left unmodified by grn_hash_cursor_get_value
when it returns 0. Add the debug assert.

Review thanks to Sutou Kouhei
Yuchen Pei
[fixup] Move tests requiring an example plugin from sys_vars.session_track_system_variables_basic to a separate test

A follow up to 1f56d9c3feeeca82661cbe57cb628207c8b186f8. This restores
test coverage when the example plugin is not built.
Dave Gosselin
MDEV-39952:  Skip tests that need mariabackup

Skips tests that require mariabackup if mariabackup was not
built (WITH_MARIABACKUP=OFF).

Backport of the same MTR change from 12.3 but applied to
additional tests.
Daniel Black
MDEV-39831 mtr: under rr use LSAN_OPTIONS=report_objects

When we have a rr recording of the process, a memory location
is a useful thing to have in the err log of leaks.

The output comes out like:

Direct leak of 40 byte(s) in 1 object(s) allocated from:
    #0 0x55b08dbccdb8 in malloc (/build/sql/mariadbd+0x1b89db8) (BuildId: 4d72569bed63ce1fbf55d51e4d4a84b610efd87d)
    #1 0x7b3fe387a1b1  (<unknown module>)
    #2 0x7b3fe37c492f  (<unknown module>)
    #3 0x7b3fe3058407  (<unknown module>)
    #4 0x7b3fe2f09493  (<unknown module>)

Objects leaked above:
0x7b7feafe0990 (40 bytes)

Now once the location is resolved we can check that we are resolving
the right object leak.
Dave Gosselin
MDEV-39856: innochecksum help never stops printing whitespace

The help printer wraps each option description at spaces so it fits a
fixed column.  When a single word is wider than that column there is
no space to break on, so it stops advancing and prints blank,
indented lines without end.  Running innochecksum with no arguments
showed this after a long documentation link was placed in an option
description, but any tool with a similar description is affected.

When a word does not fit the column, print it whole on its own line
instead of looping.  If it is the last word in the description, leave
it for the final write so the help does not end on a blank line.
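
The wrapping rule can be sketched as follows. This is not the actual help printer: `wrap` is a hypothetical function that returns lines instead of printing them, and it leaves out the indentation handling.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Splits text at spaces so that each line fits in width columns. A word
// wider than the column is emitted whole on its own line.
inline std::vector<std::string> wrap(const std::string &text,
                                     std::size_t width) {
  std::vector<std::string> lines;
  std::string line;
  std::size_t pos = 0;
  while (pos < text.size()) {
    std::size_t end = text.find(' ', pos);
    if (end == std::string::npos) end = text.size();
    const std::string word = text.substr(pos, end - pos);
    pos = end + 1;               // always advance, so the loop terminates
    if (word.empty()) continue;  // collapse repeated spaces
    if (!line.empty() && line.size() + 1 + word.size() > width) {
      lines.push_back(line);     // current line is full: flush it
      line.clear();
    }
    if (!line.empty()) line += ' ';
    line += word;                // an overlong word lands on its own line
  }
  if (!line.empty()) lines.push_back(line);  // final write, no blank line
  return lines;
}
```

The key property is that `pos` moves forward on every iteration, whether or not the word fits.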
Daniel Black
MDEV-39881 mroonga set, but not used variables grn_hash_delete

groonga/lib/hash.c:2818:18: warning: variable 'm' set but not used [-Wunused-but-set-variable]
2818 |  uint32_t h, i, m, s;
      |                  ^
The assignment of the max_offset to m was unused so removed.
Dave Gosselin
MDEV-38158:  Incorrect query result

When setup_copy_fields() needs to copy a non-aggregate function value,
it doesn't construct an Item_copy directly.  Instead, it calls
Type_handler::create_item_copy, which is a kind of factory.  The base
Type_handler::create_item_copy returns Item_copy_string.  Some type
handlers override it, like timestamp and fixed binary.  However, the
numeric type handlers (e.g., float, double, int, decimal) did not, so
they fell through to that base and got Item_copy_string.

A SELECT that aggregates will copy each non aggregate function value
into a temporary table through an Item_copy object, whose concrete
type is chosen by the create_item_copy method on the value's type
handler.  For numeric types that method returned Item_copy_string,
which stores the value as text.  A FLOAT keeps only FLT_DIG
significant digits as text, too few to reproduce its 24 bit mantissa,
so the copied value differed from the original.  With one row per
group, CAST(c1 AS FLOAT) - MIN(CAST(c1 AS FLOAT)) returned a large
number instead of zero.

Add Item_copy_real with Item_copy_float and Item_copy_double variants
that keep the value as a double, the same way Item_cache_real does, and
let the float and double type handlers create them.  This mirrors the
existing copy items for timestamp and fixed binary types.
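
The precision loss can be reproduced outside the server. This is a sketch under assumptions: `via_text` and `via_double` are hypothetical helpers, and formatting with `%.*g` and FLT_DIG digits only approximates how a string copy would store the value.

```cpp
#include <cfloat>
#include <cstdio>
#include <cstdlib>

// Stores a float as text with FLT_DIG significant digits and parses it back.
inline float via_text(float value) {
  char buf[64];
  std::snprintf(buf, sizeof buf, "%.*g", FLT_DIG, static_cast<double>(value));
  return std::strtof(buf, nullptr);
}

// Stores a float as a double and converts it back. Every float is exactly
// representable as a double, so this round trip is lossless.
inline float via_double(float value) {
  const double stored = value;
  return static_cast<float>(stored);
}
```

For example, `1234567.0f` is exactly representable as a float but needs seven significant digits. With FLT_DIG equal to 6 it comes back from text as `1234570.0f`, so subtracting the copy from the original no longer gives zero.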
Daniel Black
MDEV-39881 mroonga set, but not used variables: expr.c

grn_expr_unpack: unused offset - add attribute unused
so as not to change the macro used elsewhere.

grn_expr_exec: unused ln1/la1 - remove

grn_table_select_index_not_equal: weight unused - remove
Fariha Shaikh
MDEV-39931 Fix main.socket_conflict failure when running as root

The test directly executes $MYSQLD via --exec, bypassing MTR's automatic
--user=root injection. In GitLab CI containers where tests run as root,
mariadbd refuses to start and the test fails.

Skip the test when running as root, matching the existing approach used
by the related main.bad_startup_options test.

All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer Amazon Web
Services, Inc.
Daniel Black
MDEV-39881: mroonga set, but not used variables: grn_geo_select_in_circle

center_{lat,long} unused so removed.
Daniel Black
MDEV-39881 mroonga set, but not used variables: func_snippet

rc set but not used.

Handle snippet errors based on rc value.
Daniel Black
MDEV-39881 mroonga set, but not used variables: grn_time_to_tm

usec unused as it couldn't be used by grn_time_t_to_tm.
Kristian Nielsen
Merge 10.11 to 11.4

Signed-off-by: Kristian Nielsen <[email protected]>
Daniel Black
MDEV-39881 mroonga set, but not used variables: yy_reduce

attribute unused for those in macros.
Kristian Nielsen
Merge 11.4 to 11.8

Signed-off-by: Kristian Nielsen <[email protected]>
Kristian Nielsen
Fix inconsistent terminology

Signed-off-by: Kristian Nielsen <[email protected]>
Arcadiy Ivanov
Delete neighbor coalescing for HEAP free-list blocks

When freeing records to the HEAP delete list, coalesce with the
free-list head if adjacent.  All coalescing logic is unified in
`hp_push_free_block_coalesce`, which accepts count >= 1 (single
records or blocks).  `hp_push_free_record_coalesce` is a thin
wrapper that calls `hp_push_free_block_coalesce(share, pos, 1)`.

**Coalescing cases** (all O(1)):
- Head is single, new entry above/below: form a 2-record block
- Head is block-end, new entry above: extend block upward (old
  end becomes dark, count++)
- Head is block-end, new entry below block-start: extend
  downward (old start becomes dark, count++)
- Block-to-block: merge two adjacent blocks into one
- Block-to-single / single-to-block: merge block with adjacent
  single on the free list
- Combined count capped at `UINT_MAX16`; falls back to
  `hp_push_free_record` (count==1) or `hp_push_free_block`
  (count >= 2) when no adjacency

The unified function handles count=1 correctly because bzero
ranges become zero-length no-ops where no dark records exist,
and full-recbuffer zeroes where they do.
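
The adjacency test behind these cases can be sketched on record indexes. This is a deliberately simplified model, not the HEAP engine code: `FreeRun` and `coalesce` are hypothetical, real blocks are addressed by pointer, and the "dark" record bookkeeping is omitted.

```cpp
#include <cstdint>

// Toy free-list head: a run of count adjacent records beginning at start.
struct FreeRun {
  uint32_t start;
  uint32_t count;
};

// Tries to merge the run [pos, pos + count) into head in O(1).
// Returns false when the runs are not adjacent or the combined count would
// exceed the 16-bit cap. The caller would then push a separate entry.
inline bool coalesce(FreeRun &head, uint32_t pos, uint32_t count) {
  if (head.count == 0 || head.count + count > UINT16_MAX)
    return false;
  if (pos == head.start + head.count) {  // new run sits directly above
    head.count += count;
    return true;
  }
  if (pos + count == head.start) {       // new run sits directly below
    head.start = pos;
    head.count += count;
    return true;
  }
  return false;
}
```

A single record is the `count == 1` case, so one function covers single-to-single, single-to-block and block-to-block merges.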

`heap_delete()` calls the wrapper for single-record deletes.
`hp_free_run_chain()` calls both versions: blob primary records
and continuation chain blocks coalesce into a single free-list
entry.  Two adjacent blob rows deleted sequentially produce one
merged block spanning both.

**Unit tests** (`hp_test_freelist-t.c`): 6 new tests (104
assertions, 252 total): ascending/descending delete coalescing,
non-adjacent gap, coalesced block reuse by blob insert, scan
batch-skip, block-to-block via adjacent blob chains.

Existing block-structure tests updated to reflect coalesced
layout (primary now part of the block, not a separate single).
Dave Gosselin
MDEV-39870: macOS Build Warning on Typecast

On THD::reset, typecast the result of query_start_sec_part() to
tv_usec in a more portable way.  Previously, we assumed that the
left hand side was always long-compatible but that isn't the case
on mac.
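
One portable way to write such a cast is to name the destination type through `decltype` instead of hard-coding `long`. Whether the commit uses this exact form is an assumption. `set_usec` is a hypothetical helper for POSIX systems.

```cpp
#include <sys/time.h>

// Assigns a microsecond value to tv_usec without assuming its exact type.
inline void set_usec(timeval &tv, unsigned long sec_part) {
  // decltype picks up whatever type the platform declares for tv_usec.
  tv.tv_usec = static_cast<decltype(tv.tv_usec)>(sec_part);
}
```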
Daniel Black
MDEV-39831 mtr: extend LSAN_OPTIONS rather than overwrite

It's useful to add one's own LSAN_OPTIONS like report_objects=1,
so let's make sure MTR doesn't overwrite them.
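
Extending rather than overwriting can be sketched as a string join. MTR itself is a Perl script, so this C++ helper only illustrates the idea. `extend_options` is a hypothetical name, and the colon separator is the conventional one for sanitizer option strings.

```cpp
#include <string>

// Appends extra to an existing colon-separated option string and keeps
// whatever the user has already set.
inline std::string extend_options(const std::string &existing,
                                  const std::string &extra) {
  if (existing.empty()) return extra;
  if (extra.empty()) return existing;
  return existing + ":" + extra;
}
```

For instance, a user-supplied `report_objects=1` combined with an extra `print_suppressions=0` yields `report_objects=1:print_suppressions=0`.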