
Console View


Categories: connectors experimental galera main
Monty
Fixed wrong error handling in handler::ha_update_row()

- Error could be wrongly ignored
Monty
MDEV-40029 Add support for bit fields to HEAP

This adds support for BIT_FIELD in record and keys for HEAP tables.

The HEAP engine now has HA_CAN_BIT_FIELD set in table_flags()

Multiple bugs in BIT field handling were found and fixed: some in the
HEAP table code, others affecting the use of BIT fields as keys.
Monty
Removed wrong assert on thd->lex->query_tables != table_list

This was in log_event_server.cc, and I got multiple asserts on this
in valgrind builds when malloc() returned the same address for
different table lists (with free() in between).
Arcadiy Ivanov
Fix CI regressions from MDEV-38975 forward-port to main

Seven code fixes, a new test, and test re-recordings for issues found
by CI on PR #5222.

**NULL dereference in `create_tmp_field()`**: `SYS_REFCURSOR` plugin
returns NULL from `make_new_field()` (cursor values cannot be
materialized). The feature added `result->flags |= FIELD_PART_OF_TMP_UNIQUE`
without a NULL check. Added `if (result)` guard.

**xmltype identity loss and recursive CTE reclength mismatch in
`Item_type_holder::create_tmp_field_ex()`**: the blob_key dispatch now
requires both: (1) `type_handler_for_tmp_table()` returns
`blob_key_type_handler()`, AND (2) `dynamic_cast<Type_handler_blob_common*>`
confirms the original type is a native blob. Condition 1 excludes xmltype
(its override returns itself). Condition 2 excludes VARCHAR types promoted
via `varstring_type_handler()` -> `too_big_for_varchar()` ->
`blob_type_handler()`. Without condition 2, wide VARCHAR in recursive CTEs
(e.g. `cast('...' as varchar(1000))`) was promoted to `Field_blob_key` in
the main UNION DISTINCT table (`part_of_unique_key=true`) but stayed as
`Field_varchar` in the incremental table (`part_of_unique_key=false`),
causing a `reclength` mismatch assertion in
`select_union_recursive::send_data()` (`main.json_equals` crash).

**Spurious `reclength > HA_MAX_REC_LENGTH` in `pick_engine()`**: the
original `choose_engine()` (both 10.11 and upstream/main) never had a
reclength check. MDEV-38975 introduced it when replacing the
`blob_fields` condition. HEAP has no internal reclength limit --
`hp_create.c` stores `uint reclength` and allocates blocks of that size;
`max_supported_record_length()` is only checked in `unireg.cc` during
user-facing CREATE TABLE. I_S tables like SLAVE_STATUS routinely have
reclength ~880KB (13 bare `Varchar()` columns). The check forced them to
Aria where `fill_slave_status()` returned 0 rows. Removed the check and
the unused `reclength` parameter from `pick_engine()`.

**Multi-update `tmp_memory_table_size` override**: the 10.11 feature
overrode `big_tables=FALSE` for multi-update dedup tables. The forward-port
translated this as `tmp_memory_table_size=SIZE_T_MAX` when the variable
was 0. But `big_tables=FALSE` was a soft "don't force disk" hint, while
`tmp_memory_table_size=SIZE_T_MAX` overrides the user's explicit
`tmp_memory_table_size=0` directive. Since main removed `big_tables`
entirely (MDEV-19713), the override is not needed. Removed.

**Zero-length key rejection in `check_tmp_key()`**: reject `key_len == 0`
to prevent useless zero-length keys from being created by `add_tmp_key()`.
Reachable when all key parts are CHAR(0) NOT NULL: `key_length()` returns
0, the field is not nullable (no HA_KEY_NULL_LENGTH) and not
VARCHAR/BLOB/GEOMETRY (no HA_KEY_BLOB_LENGTH), so `fld_store_len` is 0
for every part. Without this guard, `check_tmp_key()` would accept the
key (0 <= max_key_length), and the optimizer would create a ref key that
cannot distinguish any rows. Added `heap.char0_key` test exercising this
via a materialized derived table with CHAR(0) NOT NULL join columns.
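
The arithmetic behind the guard can be sketched as follows. This is a simplified Python model, not the server code: `store_len()` and this `check_tmp_key()` are hypothetical stand-ins, and the values 1 and 2 used for `HA_KEY_NULL_LENGTH` and `HA_KEY_BLOB_LENGTH` are assumptions.

```python
# Assumed values; the real constants are defined in the server headers.
HA_KEY_NULL_LENGTH = 1
HA_KEY_BLOB_LENGTH = 2


def store_len(key_length, nullable, variable_length):
    """Bytes one key part occupies in the key buffer (simplified model)."""
    length = key_length
    if nullable:
        length += HA_KEY_NULL_LENGTH
    if variable_length:
        length += HA_KEY_BLOB_LENGTH
    return length


def check_tmp_key(parts, max_key_length):
    """Accept a key only if its total length is non-zero and within the limit."""
    key_len = sum(store_len(*part) for part in parts)
    return 0 < key_len <= max_key_length


# Each part is (key_length, nullable, variable_length).
# Two CHAR(0) NOT NULL columns: every part contributes 0 bytes.
char0_not_null = [(0, False, False), (0, False, False)]
# The same columns declared nullable contribute one NULL byte each.
char0_nullable = [(0, True, False), (0, True, False)]

rejected = not check_tmp_key(char0_not_null, max_key_length=1000)
accepted = check_tmp_key(char0_nullable, max_key_length=1000)
```

Without the `0 <` half of the comparison, the first key would pass because 0 is within any limit.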

**Non-deterministic `column_compression` test**: HEAP blob support allows
compressed VARCHAR/TEXT temp tables to stay in HEAP instead of falling to
Aria, changing row iteration order. Added `--sorted_result` to the two
MDEV-24726 subqueries that lack `ORDER BY`.

Test changes:
- `spatial_utility_function_collect`: added ORDER BY to window function
  that lacked it (results were engine-row-order-dependent)
- `tmp_space_usage`: removed multi-update override; forced disk for
  MDEV-34016/34060 Aria-specific test sections (blob I_S tables now
  stay in MEMORY)
- `blob_update_overflow`: replaced `SHOW STATUS LIKE 'Created_tmp_%'`
  with targeted I_S query (Created_tmp_files varies on sanitizer builds)
- `column_compression`: added `--sorted_result` for MDEV-24726 queries
- `char0_key` (new): CHAR(0) NOT NULL derived table ref key rejection
- Re-recorded 8 tests for expected "temp table stays in MEMORY" changes
Dave Gosselin
MDEV-37932: Parser support FULL OUTER JOIN syntax

Syntax support for FULL JOIN, FULL OUTER JOIN, NATURAL FULL JOIN, and
NATURAL FULL OUTER JOIN in the parser.

While we accept full join syntax, such joins are not yet supported.
Queries specifying any of the above joins will fail with
ER_NOT_SUPPORTED_YET.

Add the counter LEX::has_full_outer_join so we can see how many FULL JOINs
are present in the query.
Dave Gosselin
MDEV-39914:  Crash in ST_SIMPLIFY used in an IN list

A geometry function writes its binary result into a buffer given by
the caller.  ST_SIMPLIFY did not set the charset of that buffer.  A
constant value in an IN list, though, is stored into the array that IN
builds for its comparisons, and that array has no charset.  Appending
the geometry then read an invalid charset and crashed the server.

Set the result buffer to the binary charset before writing the
geometry, matching what the other geometry functions already do.
Dave Gosselin
MDEV-37995: FULL OUTER JOIN name resolution

Allow FULL OUTER JOIN queries to proceed through name resolution.

Permits limited EXPLAIN EXTENDED support so tests can prove that the
JOIN_TYPE_* table markings are reflected when the query is echoed back by the
server.  This happens in at least two places:  via a Warning message during
EXPLAIN EXTENDED and during VIEW .frm file creation.

While the query plan output is mostly meaningless at this point, this
limited EXPLAIN support improves the SELECT_LEX print function for the new
JOIN types.

TODO: fix PS protocol before end of FULL OUTER JOIN development
Dave Gosselin
MDEV-37933: Rewrite [NATURAL] FULL OUTER to LEFT, RIGHT, or INNER JOIN

Rewrite FULL OUTER JOIN queries as either LEFT, RIGHT, or INNER JOIN
by checking if and how the WHERE clause rejects nulls.

For example, the queries in each of the following pairs are equivalent.
In the first pair, the WHERE condition rejects the rows in which the
left table is NULL-complemented, leaving each left row with its matches
in the right table (or NULL from the right table).  In the second pair,
the WHERE condition rejects NULLs from both tables:

  SELECT * FROM t1 FULL JOIN t2 ON t1.v = t2.v WHERE t1.v IS NOT NULL;
  SELECT * FROM t1 LEFT JOIN t2 ON t1.v = t2.v WHERE t1.v IS NOT NULL;

  SELECT * FROM t1 FULL JOIN t2 ON t1.v = t2.v WHERE t1.a=t2.a;
  SELECT * FROM t1 INNER JOIN t2 ON t1.v = t2.v WHERE t1.a=t2.a;
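
The equivalences can be checked on toy data. The sketch below is a Python model rather than server code: each table is a list of values for a single column v, NULL is None, and all function names are made up for illustration. The WHERE filter is applied to both sides of each comparison, and the second check reuses column v where the example above uses a column a.

```python
from collections import Counter


def sql_eq(a, b):
    """SQL-style equality: a comparison involving NULL never matches."""
    return a is not None and b is not None and a == b


def left_join(t1, t2):
    out = []
    for a in t1:
        matches = [b for b in t2 if sql_eq(a, b)]
        if matches:
            out.extend((a, b) for b in matches)
        else:
            out.append((a, None))  # NULL-complemented right side
    return out


def inner_join(t1, t2):
    return [(a, b) for a in t1 for b in t2 if sql_eq(a, b)]


def full_join(t1, t2):
    out = left_join(t1, t2)
    # Right rows that matched no left row, NULL-complemented on the left.
    out.extend((None, b) for b in t2 if not any(sql_eq(a, b) for a in t1))
    return out


def left_not_null(rows):
    """WHERE t1.v IS NOT NULL"""
    return [row for row in rows if row[0] is not None]


def both_equal(rows):
    """A WHERE predicate comparing one column from each side."""
    return [row for row in rows if sql_eq(row[0], row[1])]


t1 = [1, 2, None]
t2 = [2, 3, None]

same_as_left = (Counter(left_not_null(full_join(t1, t2)))
                == Counter(left_not_null(left_join(t1, t2))))
same_as_inner = (Counter(both_equal(full_join(t1, t2)))
                 == Counter(both_equal(inner_join(t1, t2))))
```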
Dave Gosselin
Update table_elim for FULL JOIN base table check

Two EXPLAIN queries in table_elim place a nested join on the right
side of a FULL JOIN.  Phase 2 supports only base tables there, so
check_full_join_base_tables rejects them with
ER_FULL_JOIN_BASE_TABLES_ONLY.
Dave Gosselin
MDEV-39967:  no bug, but preserve test case
Dave Gosselin
MDEV-39569: Skip FULL JOIN rewrite to inner side of an outer join

Prevent simplify_joins from rewriting a chained FULL JOIN into a query
where a FULL JOIN could end up on the inner side of another outer
join.  Of course, this means that we will have a null complement pass
that the rewritten query would have avoided.  Once we support FULL
JOINs on the inner side of outer joins, in phase 3, then we can relax
this constraint.
Dave Gosselin
MDEV-38692: COALESCE() on NATURAL FULL JOIN result sets

FULL JOIN yields result sets with columns from both tables participating in
the join (for the sake of explanation, assume base tables).  However,
NATURAL FULL JOIN should show unique columns in the output.

Given the following query:
  SELECT * FROM t1 NATURAL FULL JOIN t2;
transform it into:
  SELECT COALESCE(t1.f_1, t2.f_1), ..., COALESCE(t1.f_n, t2.f_n) FROM
    t1 NATURAL FULL JOIN t2;

This change applies only in the case of NATURAL FULL JOIN.  Otherwise,
NATURAL JOINs work as they have in the past, using columns from the
left table for the resulting column set.
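
A small model shows why the common columns need COALESCE(). This is a Python sketch with rows as dicts and NULL as None; `natural_full_join` and `output_row` are hypothetical names and do not correspond to server functions.

```python
def coalesce(*values):
    """Return the first non-NULL (non-None) argument, like SQL COALESCE."""
    for value in values:
        if value is not None:
            return value
    return None


def output_row(left, right, common, left_only, right_only):
    """Build one output row; left or right is None when NULL-complemented."""
    def col(row, name):
        return None if row is None else row[name]

    out = {c: coalesce(col(left, c), col(right, c)) for c in common}
    out.update({c: col(left, c) for c in left_only})
    out.update({c: col(right, c) for c in right_only})
    return out


def natural_full_join(t1, t2, common, left_only, right_only):
    rows, matched = [], set()
    for left in t1:
        hit = False
        for j, right in enumerate(t2):
            if all(left[c] is not None and left[c] == right[c] for c in common):
                hit = True
                matched.add(j)
                rows.append(output_row(left, right, common, left_only, right_only))
        if not hit:
            rows.append(output_row(left, None, common, left_only, right_only))
    for j, right in enumerate(t2):
        if j not in matched:
            rows.append(output_row(None, right, common, left_only, right_only))
    return rows


t1 = [{"f": 1, "x": "a"}, {"f": 2, "x": "b"}]
t2 = [{"f": 2, "y": "B"}, {"f": 3, "y": "C"}]
result = natural_full_join(t1, t2, ["f"], ["x"], ["y"])
```

The last row of `result` comes only from t2. Taking f from the left table alone would yield NULL there, while COALESCE() yields 3.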
Dave Gosselin
MDEV-39911:  Crash in ST_SIMPLIFY of a collection geometry

ST_SIMPLIFY of a multilinestring, polygon, multipolygon, or geometry
collection reserved space for the result header but omitted the four
byte element count that it then appends.  This resulted in a buffer
overrun.

Reserve the full header size, including the count, in each of the four
collection simplify functions, the same fix applied for MDEV-35062 and
MDEV-36042.
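
The size arithmetic can be illustrated with the standard WKB layout, in which a geometry starts with a 1-byte byte order and a 4-byte type, and a collection then adds a 4-byte element count. The sketch below is a Python model, not the server code: `ReservedBuffer` and `write_collection_prefix` are hypothetical names, and the explicit capacity check stands in for what would be an unchecked overrun in the server.

```python
import struct

WKB_HEADER = struct.Struct("<BI")     # byte order (1) + geometry type (4)
ELEMENT_COUNT = struct.Struct("<I")   # number of elements (4)
WKB_MULTILINESTRING = 5               # type code from the WKB standard


class ReservedBuffer:
    """Buffer that refuses writes beyond the space reserved for it."""

    def __init__(self):
        self.data = bytearray()
        self.capacity = 0

    def reserve(self, extra):
        self.capacity = len(self.data) + extra

    def append(self, chunk):
        if len(self.data) + len(chunk) > self.capacity:
            raise OverflowError("write past reserved space")
        self.data += chunk


def write_collection_prefix(buf, reserve_size, n_elements):
    """Reserve space, then write the header followed by the element count."""
    buf.reserve(reserve_size)
    buf.append(WKB_HEADER.pack(1, WKB_MULTILINESTRING))
    buf.append(ELEMENT_COUNT.pack(n_elements))


full_size = WKB_HEADER.size + ELEMENT_COUNT.size   # header plus count

ok = ReservedBuffer()
write_collection_prefix(ok, full_size, n_elements=2)

try:
    # Reserving only the header reproduces the shortfall described above.
    write_collection_prefix(ReservedBuffer(), WKB_HEADER.size, n_elements=2)
    overran = False
except OverflowError:
    overran = True
```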
Sergei Petrunia
Code cleanup (2). Introduce dump_sql_script() function.
Thirunarayanan Balathandayuthapani
MDEV-39963 InnoDB system tablespace autoshrink fails when the tail extent is an empty XDES_FREE_FRAG extent

Problem:
========
- The InnoDB system tablespace fails to autoshrink even when it is
almost entirely free. Defragmentation reports success but no
space is reclaimed, and the log shows the high-water mark pinned
at the end of the file.

fsp_traverse_extents(): when locating the last used extent,
descends from the end of the tablespace and treats an extent
as reclaimable only when it is XDES_FREE, or the descriptor-page
extent (XDES_FREE_FRAG with two used pages).
Every other XDES_FREE_FRAG extent stops the scan.

An XDES_FREE_FRAG extent with zero used pages can legitimately
exist on disk in tablespaces written by server versions between
commit 0b47c126e31 (MDEV-13542) and commit 7737f15f874 (MDEV-31333).
In that window fsp_free_page() evaluated xdes_get_n_used()
before clearing the freed page's XDES_FREE_BIT, so freeing the
last used page of a fragment extent left the empty extent on the
FSP_FREE_FRAG list instead of moving it to FSP_FREE.
7737f15f874 restored the correct ordering, but pre-existing
data files may still carry such extents.
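
The ordering problem can be modelled in a few lines. This is an illustrative Python sketch, not InnoDB code: `free_page` here is a hypothetical function, and the extent is reduced to a set of used page numbers.

```python
def free_page(used, page, count_before_marking_free):
    """Free one page of a fragment extent; return the list the extent ends up on.

    used: set of page numbers in the extent that are currently in use.
    """
    if count_before_marking_free:
        # Ordering described for the affected versions: the page being
        # freed is still counted, so the extent never looks empty here.
        n_used = len(used)
        used.discard(page)
    else:
        # Restored ordering: mark the page free first, then count.
        used.discard(page)
        n_used = len(used)
    return "FSP_FREE" if n_used == 0 else "FSP_FREE_FRAG"


# Freeing the last used page (page 7) of a fragment extent.
old_list = free_page({7}, 7, count_before_marking_free=True)
new_list = free_page({7}, 7, count_before_marking_free=False)
```

With the early count, the now-empty extent stays on FSP_FREE_FRAG; with the restored ordering it moves to FSP_FREE.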

Such an empty extent is logically identical to XDES_FREE, but
fsp_traverse_extents() mistook it for a used extent, pinned
last_used_extent at end-of-file, and the shrink reclaimed nothing.

Solution:
========
fsp_traverse_extents(): Treat an XDES_FREE_FRAG extent with no used
pages (n_used == 0) the same as XDES_FREE
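
A minimal model of the backward scan, before and after the change, might look like the following. It is a Python sketch under simplifying assumptions: the `Extent` class, the `has_descriptor_page` field, and the `treat_empty_frag_as_free` switch are invented for illustration and do not mirror the real data structures.

```python
from dataclasses import dataclass


@dataclass
class Extent:
    state: str                      # e.g. "XDES_FREE", "XDES_FREE_FRAG", "XDES_FSEG"
    n_used: int = 0                 # number of used pages in the extent
    has_descriptor_page: bool = False


def is_reclaimable(ext, treat_empty_frag_as_free):
    if ext.state == "XDES_FREE":
        return True
    if ext.state == "XDES_FREE_FRAG":
        if ext.has_descriptor_page and ext.n_used == 2:
            return True
        # The fix: an empty fragment extent is as good as a free one.
        if treat_empty_frag_as_free and ext.n_used == 0:
            return True
    return False


def last_used_extent(extents, treat_empty_frag_as_free):
    """Walk down from the end; return the index of the last extent in use."""
    for idx in range(len(extents) - 1, -1, -1):
        if not is_reclaimable(extents[idx], treat_empty_frag_as_free):
            return idx
    return -1


extents = [
    Extent("XDES_FSEG", n_used=64),
    Extent("XDES_FREE"),
    Extent("XDES_FREE"),
    Extent("XDES_FREE_FRAG", n_used=0),   # left behind by the old ordering
]

before_fix = last_used_extent(extents, treat_empty_frag_as_free=False)
after_fix = last_used_extent(extents, treat_empty_frag_as_free=True)
```

Before the fix the scan stops at the tail extent (index 3), so nothing can be reclaimed; after it the scan reaches index 0.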
Monty
fixup! remove is_text_key_segment
Arcadiy Ivanov
MDEV-38975: HEAP engine BLOB/TEXT/JSON/GEOMETRY support with indexable blob columns

Remove the HA_NO_BLOBS restriction from the HEAP engine, allowing
the optimizer to keep temporary tables with BLOB/TEXT columns in
memory when they fit within max_heap_table_size / tmp_memory_table_size
limits.  Additionally, advertise HA_CAN_GEOMETRY so explicit
CREATE TABLE ... ENGINE=MEMORY with GEOMETRY columns works.

Unlike other HEAP blob implementations (e.g. Percona), this patch
provides full HASH index support on blob columns, enabling efficient
lookups, GROUP BY, and DISTINCT operations directly in HEAP without
falling back to disk.

Architecture
------------

BLOB data is stored using continuation records -- additional fixed-size
records allocated from the same HP_BLOCK that holds regular rows.  This
reuses existing allocation, free list, and size accounting with minimal
structural change, and avoids per-blob my_malloc() calls.

The existing single-byte visibility flag is extended into a flags byte
with bits for HP_ROW_HAS_CONT, HP_ROW_IS_CONT, HP_ROW_CONT_ZEROCOPY,
HP_ROW_SINGLE_REC, and HP_ROW_MULTIPLE_REC.  Continuation records are
grouped into variable-length runs -- contiguous sequences within a leaf
block.  Only the first record of each run carries a 10-byte header
(next_cont pointer + run_rec_count); inner records are pure payload.

Three storage formats, detected by flag bits via inline predicates:

  Case A (HP_ROW_SINGLE_REC): single record, no header, data at
  offset 0.  Zero-copy read.

  Case B (HP_ROW_CONT_ZEROCOPY): single run, multiple records.
  Header in rec 0, data contiguous in rec 1..N-1.  Zero-copy read
  via chain + recbuffer.

  Case C (HP_ROW_MULTIPLE_REC): one or more runs linked via
  next_cont.  Reassembly into blob_buff required.
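
The flag-based dispatch can be pictured roughly as below. This is a Python sketch: the flag names come from the description above, but the bit values and the helper functions `storage_case` and `is_zero_copy` are assumptions made for illustration.

```python
from enum import IntFlag


class RowFlags(IntFlag):
    # Bit positions are hypothetical; only the names come from the text.
    HP_ROW_HAS_CONT = 0x02
    HP_ROW_IS_CONT = 0x04
    HP_ROW_CONT_ZEROCOPY = 0x08
    HP_ROW_SINGLE_REC = 0x10
    HP_ROW_MULTIPLE_REC = 0x20


def storage_case(flags):
    """Map a flags byte to storage format 'A', 'B' or 'C' (None if no blob data)."""
    if flags & RowFlags.HP_ROW_SINGLE_REC:
        return "A"
    if flags & RowFlags.HP_ROW_CONT_ZEROCOPY:
        return "B"
    if flags & RowFlags.HP_ROW_MULTIPLE_REC:
        return "C"
    return None


def is_zero_copy(flags):
    """Cases A and B can be read in place; case C needs reassembly."""
    return storage_case(flags) in ("A", "B")


single = RowFlags.HP_ROW_IS_CONT | RowFlags.HP_ROW_SINGLE_REC
one_run = RowFlags.HP_ROW_IS_CONT | RowFlags.HP_ROW_CONT_ZEROCOPY
chained = RowFlags.HP_ROW_IS_CONT | RowFlags.HP_ROW_MULTIPLE_REC
```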

Run allocation uses a two-phase strategy: (1) peek-then-unlink walk
of the free list detecting contiguous groups, (2) tail allocation
from HP_BLOCK for remaining data.  As a third step, a scavenge
fallback walks the entire free list when tail allocation fails.
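
The first two phases can be approximated with a toy model. This Python sketch makes strong simplifying assumptions: records are plain slot numbers, run headers are ignored, and the scavenge fallback is left out. `contiguous_groups` and `allocate_runs` are hypothetical names.

```python
def contiguous_groups(free_list):
    """Split the free list, in list order, into groups of adjacent slots."""
    groups = []
    for slot in free_list:
        if groups and abs(slot - groups[-1][-1]) == 1:
            groups[-1].append(slot)
        else:
            groups.append([slot])
    return groups


def allocate_runs(free_list, tail_start, needed):
    """Return the runs (lists of slots) used to store `needed` records."""
    runs = []
    remaining = needed
    # Phase 1: reuse contiguous groups found on the free list.
    for group in contiguous_groups(free_list):
        if remaining == 0:
            break
        take = group[:remaining]
        runs.append(sorted(take))
        remaining -= len(take)
    # Phase 2: take whatever is still missing from the tail of the block.
    if remaining:
        runs.append(list(range(tail_start, tail_start + remaining)))
    return runs


groups = contiguous_groups([7, 6, 5, 2, 9])
runs = allocate_runs([7, 6, 5, 2, 9], tail_start=20, needed=6)
small = allocate_runs([7, 6, 5, 2, 9], tail_start=20, needed=2)
```

In this model a request that fits in the first contiguous group yields a single run, which corresponds to the zero-copy cases described above.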

HP_SHARE::total_records tracks all physical records (primary +
continuation), while HP_SHARE::records remains the logical count
used by hash bucket mapping.

Reassembly buffer (HP_INFO::blob_buff) follows the same pattern as
InnoDB's blob_heap -- allocated once, grown via my_realloc, freed
on heap_reset()/close.  Zero-copy cases (A/B) return pointers
directly into HP_BLOCK with no copy.

Full HASH index key handling for BLOB columns: hp_rec_hashnr(),
hp_rec_key_cmp(), hp_key_cmp(), hp_make_key(), hp_hashnr() are
extended for HA_BLOB_PART segments.  Hash pre-check optimization
skips expensive blob materialization when hashes differ.  PAD SPACE
collation semantics are preserved for blob key comparisons.
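
The idea of the hash pre-check can be sketched as follows. This is a Python illustration only: `StoredBlobKey` and `key_equals` are hypothetical, CRC32 stands in for whatever hash the engine uses, and joining the chunks stands in for blob materialization.

```python
import zlib


class StoredBlobKey:
    """A stored blob value kept in chunks, with its hash computed once."""

    def __init__(self, chunks):
        self.chunks = chunks
        self.hash = zlib.crc32(b"".join(chunks))
        self.materializations = 0

    def materialize(self):
        # Stand-in for the expensive step of reassembling the blob.
        self.materializations += 1
        return b"".join(self.chunks)


def key_equals(stored, probe, probe_hash):
    """Compare hashes first; reassemble the blob only if they agree."""
    if stored.hash != probe_hash:
        return False
    return stored.materialize() == probe


row = StoredBlobKey([b"hello ", b"world"])

miss = key_equals(row, b"hello World", zlib.crc32(b"hello World"))
materializations_after_miss = row.materializations

hit = key_equals(row, b"hello world", zlib.crc32(b"hello world"))
```

The full comparison is still needed when the hashes agree, since equal hashes do not imply equal values.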

Field_blob_key (Monty) produces HEAP-native key format (4-byte length
+ 8-byte data pointer) directly, eliminating key buffer translation
between the SQL layer and HEAP engine.

SQL layer changes
-----------------

pick_engine() (new, extracted from choose_engine()): replaced the
blob_fields check with a reclength > HA_MAX_REC_LENGTH guard.
choose_engine() calls pick_engine() with the real reclength.
pick_engine() is also called early in finalize() with reclength=0
to predict whether the engine will be HEAP, enabling blob-aware
GROUP BY key setup that avoids unnecessary m_using_unique_constraint.

finalize(): HEAP+blob uses fixed-width rows; GROUP BY key setup sets
key_part_flag from field, uses item max_length for blob key sizing.
store_length initialized for all GROUP BY key parts.  key_type uses
field->binary() to determine FIELDFLAG_BINARY vs text collation.
DISTINCT key setup skips null-bits helper for HEAP.

remove_duplicates(): blob check moved before HEAP check to fall
through to remove_dup_with_compare().

Aggregator_distinct::add(): overflow-to-disk conversion via
create_internal_tmp_table_from_heap() for non-dup write errors.

Expression cache disabled for HEAP+blob (key format incompatibility).

FULLTEXT early detection in mysql_derived_prepare(): forces disk
engine via TMP_TABLE_FORCE_MYISAM when outer query uses MATCH.

Deferred blob chain free (MDEV-39732): heap_delete() saves chain
pointers to pending_blob_chains, flushed on next mutation or
heap_reset()/close.  Prevents dangling zero-copy pointers during
binlog_log_row().
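
The deferral can be modelled like this. It is a Python sketch under assumptions: `HeapTableModel` and its methods are invented for illustration, and a "chain" is just a bytes object rather than a linked run of records.

```python
class HeapTableModel:
    """Toy model: deleted blob chains are parked, then freed on the next mutation."""

    def __init__(self):
        self.chains = {}                # row id -> blob chain
        self.pending_blob_chains = []   # deleted, but not yet freed
        self.freed = []                 # chains actually released

    def _flush_pending(self):
        self.freed.extend(self.pending_blob_chains)
        self.pending_blob_chains.clear()

    def write(self, row_id, blob):
        self._flush_pending()
        self.chains[row_id] = blob

    def delete(self, row_id):
        self._flush_pending()
        # Park the chain instead of freeing it: a caller may still hold a
        # zero-copy pointer into it (for example while logging the row).
        self.pending_blob_chains.append(self.chains.pop(row_id))

    def reset(self):
        self._flush_pending()


table = HeapTableModel()
table.write(1, b"blob-1")
table.delete(1)
freed_right_after_delete = list(table.freed)

table.write(2, b"blob-2")       # next mutation flushes the pending chain
freed_after_next_write = list(table.freed)
```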

REPLACE safety (MDEV-39825): HP_SHARE::write_can_replace flag
forces copy mode in hp_read_blobs(), preventing blob data corruption
from freed-then-reused continuation records during REPLACE.

Geometry GROUP_CONCAT fix (MDEV-39761): downgrade Field_geom to
Field_blob for GROUP_CONCAT temp tables in both expression creation
paths.  Type_handler_geometry::type_handler_for_tmp_table() added.

Geometry GROUP BY key fix (MDEV-39871): detect when new_key_field()
produced non-blob Field_varstring for a blob column, replace with
Field_blob_key.

Performance
-----------

Non-blob tables: zero regression.  Every blob-specific code path is
guarded by if (share->blob_count).  No new allocations, no format
changes, no hash function changes for non-blob keys.

Blob tables: eliminates file creation/deletion overhead and page cache
management.  For single-run blobs (common case), the read path is
entirely zero-copy.

Limitations
-----------

1. No BTREE indexes on blob columns (HASH only)
2. No partial-key prefix indexing for blobs
3. 2x memory for Case C reads only (A/B are 1x)
4. No blob compression
5. 65,535 records per run (uint16 cap, auto-split)
6. max_heap_table_size applies to continuation records
7. Expression cache disabled for HEAP+blob
8. FULLTEXT forces disk engine

Linked bugs fixed:

- MDEV-39703: mroonga fulltext test ordering
- MDEV-39723: ER_DUP_ENTRY on GROUP BY with blob column
- MDEV-39724: crash in hp_is_single_rec with GROUP BY
- MDEV-39732: slave crash in hp_free_run_chain on blob replication
- MDEV-39761: Field_geom::store() assertion in GROUP_CONCAT
- MDEV-39782: RBR ER_KEY_NOT_FOUND on HEAP blob UPDATE
- MDEV-39825: blob data corruption on REPLACE into HEAP table
- MDEV-39871: crash in my_hash_sort_bin on GROUP BY with geometry

Reviewed by: Michael Widenius
  Monty reviewed the entire patch. Areas where he suggested changes
  or contributed code:
  - Field_blob_key class (HEAP-native blob key format, 4-byte length +
    data pointer)
  - Duplicate key fix on HEAP-to-Aria conversion
  - hp_blob_key_length() uint32 fix
  - hp_rec_hashnr_stored removal
  - type_handler_for_tmp_table() param cleanup
  - Type_handler_geometry::type_handler_for_tmp_table() virtual
  - blob pointer bzero()
  - find_unique_row() double-materialization fix
  - Tail reclaim review
  - Batch tail allocation review
  - hp_update.c cleanup
  - Field_blob_compressed temp table fix
  - row_pack_length() dedup
  - pack_length_no_ptr() removal
  - Race condition fix in HEAP
  - MDEV-39703 mroonga test fix
  - MDEV-39825 write_can_replace optimization
  - is_text_key_segment removal (field->binary() simplification)
  - Documentation (Docs/internal-temporary-tables.txt)
Contribution by: Alexander Barkov
  Type_handler::make_and_init_table_field_ex() -- refactored temp table
  field creation from inline code in sql_select.cc into type handler
  virtual methods (sql_type.cc, sql_type_geom.cc), enabling clean
  per-type-handler field creation for HEAP blob promotion.
Dave Gosselin
MDEV-38502: FULL OUTER JOIN get correct searchable condition

Move the temporary gate against FULL OUTER JOIN deeper into the
codebase, which causes the FULL OUTER JOIN query plans to have
more relevant information (hence the change).  In some cases, the
join order of nested INNER JOINs within the FULL OUTER JOIN changed.

Small cleanups in get_sargable_cond ahead of the feature work in
the next commit.
Dave Gosselin
Reject a nested join on the right of a rewritten FULL JOIN

check_full_join_base_tables runs before simplify_joins and rejects the
disallowed FULL JOIN shapes that are visible in the parse tree.
simplify_joins can rewrite a FULL JOIN to a LEFT, RIGHT, or INNER
JOIN, so sometimes disallowed queries appear only afterward.

Add check_full_join_after_simplify, called from optimize_inner once
simplify_joins is done, to reject unsupported queries after
rewrite by simplify_joins.
Sergei Petrunia
Code cleanup, introduce get_create_table_stmt().
Dave Gosselin
MDEV-39746: FULL JOIN with a nested join on the right loses rows

The outermost FULL JOIN's right operand can be a nested join rather
than a single base table.  The parser places the nest on the right
when the outermost FULL JOIN's ON is the last one written, because the
parser keeps the outermost FULL JOIN pending until its ON arrives, and
the inner FULL JOINs reduce first into a nest that becomes the right
operand.
alloc_full_join_duplicate_filters allocates the fj_dups filter on a
JOIN_TAB carrying JOIN_TYPE_FULL | JOIN_TYPE_RIGHT, so with the
FULL|RIGHT bits on the nest, which is never a JOIN_TAB, no filter was
allocated and the null complement pass never fired.  The unmatched
rows from the right side were never emitted, producing a result with
missing rows.

Add swap_full_join_sides, called from rewrite_full_outer_joins
when a FULL JOIN survives simplify_joins with a leaf on the left
and a nested join on the right.  FULL JOIN is symmetric on its
operands, so swapping does not change query semantics; after the
swap the leaf carries the FULL|RIGHT bits and the rescan target
is a single base table.
Arcadiy Ivanov
Fix CI regressions from MDEV-38975 forward-port to main

Six code fixes and test re-recordings for issues found by CI on PR #5222.

**NULL dereference in `create_tmp_field()`**: `SYS_REFCURSOR` plugin
returns NULL from `make_new_field()` (cursor values cannot be
materialized). The feature added `result->flags |= FIELD_PART_OF_TMP_UNIQUE`
without a NULL check. Added `if (result)` guard.

**xmltype identity loss and recursive CTE reclength mismatch in
`Item_type_holder::create_tmp_field_ex()`**: the blob_key dispatch now
requires both: (1) `type_handler_for_tmp_table()` returns
`blob_key_type_handler()`, AND (2) `dynamic_cast<Type_handler_blob_common*>`
confirms the original type is a native blob. Condition 1 excludes xmltype
(its override returns itself). Condition 2 excludes VARCHAR types promoted
via `varstring_type_handler()` -> `too_big_for_varchar()` ->
`blob_type_handler()`. Without condition 2, wide VARCHAR in recursive CTEs
(e.g. `cast('...' as varchar(1000))`) was promoted to `Field_blob_key` in
the main UNION DISTINCT table (`part_of_unique_key=true`) but stayed as
`Field_varchar` in the incremental table (`part_of_unique_key=false`),
causing a `reclength` mismatch assertion in
`select_union_recursive::send_data()` (`main.json_equals` crash).

**Spurious `reclength > HA_MAX_REC_LENGTH` in `pick_engine()`**: the
original `choose_engine()` (both 10.11 and upstream/main) never had a
reclength check. MDEV-38975 introduced it when replacing the
`blob_fields` condition. HEAP has no internal reclength limit --
`hp_create.c` stores `uint reclength` and allocates blocks of that size;
`max_supported_record_length()` is only checked in `unireg.cc` during
user-facing CREATE TABLE. I_S tables like SLAVE_STATUS routinely have
reclength ~880KB (13 bare `Varchar()` columns). The check forced them to
Aria where `fill_slave_status()` returned 0 rows. Removed the check and
the unused `reclength` parameter from `pick_engine()`.

**Multi-update `tmp_memory_table_size` override**: the 10.11 feature
overrode `big_tables=FALSE` for multi-update dedup tables. The forward-port
translated this as `tmp_memory_table_size=SIZE_T_MAX` when the variable
was 0. But `big_tables=FALSE` was a soft "don't force disk" hint, while
`tmp_memory_table_size=SIZE_T_MAX` overrides the user's explicit
`tmp_memory_table_size=0` directive. Since main removed `big_tables`
entirely (MDEV-19713), the override is not needed. Removed.

**Zero-length key rejection in `check_tmp_key()`**: defense-in-depth
guard rejecting `key_len == 0` to prevent useless zero-length keys from
being created by `add_tmp_key()`.

**Non-deterministic `column_compression` test**: HEAP blob support allows
compressed VARCHAR/TEXT temp tables to stay in HEAP instead of falling to
Aria, changing row iteration order. Added `--sorted_result` to the two
MDEV-24726 subqueries that lack `ORDER BY`.

Test changes:
- `spatial_utility_function_collect`: added ORDER BY to window function
  that lacked it (results were engine-row-order-dependent)
- `tmp_space_usage`: removed multi-update override; forced disk for
  MDEV-34016/34060 Aria-specific test sections (blob I_S tables now
  stay in MEMORY)
- `blob_update_overflow`: replaced `SHOW STATUS LIKE 'Created_tmp_%'`
  with targeted I_S query (Created_tmp_files varies on sanitizer builds)
- Re-recorded 8 tests for expected "temp table stays in MEMORY" changes
- `column_compression`: added `--sorted_result` for MDEV-24726 queries
Sergei Petrunia
Code cleanup (3).
Monty
Removed wrong assert on thd->lex->query_tables != table_list

This was in log_event_server.cc, and I got multiple asserts on this
in valgrind builds when malloc() returned the same address for
different table lists (with free() in between).
Monty
Added support for bit fields and CHECK TABLE to the memory engine

HEAP engine now has HA_CAN_BIT_FIELD set in table_flags()
CHECK TABLE is supported, but will only return ok or failed.
Still good enough for testing heap table consistency.

Note that one bug for bit field in HEAP tables was fixed as part of
MDEV-38975

Bugs fixed:
Field_bit::get_key_image() returned wrong length for lengths
8,16,32,48,64.
hp_rb_make_key() did not handle bit fields correctly.

Other things:
- Added sorted_order to some tests in type_bit so that they can be run
  with the memory engine
Dave Gosselin
MDEV-38136: Prevent elimination of tables in a FULL OUTER JOIN

Prevent elimination of tables participating in a FULL OUTER JOIN during
eliminate_tables as part of phase one FULL OUTER JOIN development.

Move the functionality gate for FULL JOIN further into the codebase.

Fixes an old bug where, when running the server as a debug build and in
debug mode, a null pointer dereference in
Dep_analysis_context::dbug_print_deps would cause a crash.
Dave Gosselin
MDEV-38508: Constant table detection

If a table that's in a FULL OUTER JOIN is found to be a const
table, then don't allow the constant table optimization to
take place.

Later, when we support FULL OUTER JOIN on the inner side of
other join types then we may be able to relax this restriction.
Sergei Petrunia
More code cleanups.
Dave Gosselin
MDEV-39936:  Defer left side WHERE predicates of a surviving FULL JOIN

A FULL JOIN runs as a LEFT JOIN of its left side over its right side,
followed by a pass that emits the right rows that never matched a left
row.  That second pass is correct only if the first pass records every
left to right match, so it must read every left row and reach the right
side for each match.

A WHERE predicate that references only the left side was applied
directly during the first pass.  It pruned left rows in the nested loop,
and it could build a ref or range access on the left side.  Either way a
left row was dropped before its match was recorded, and the matching
right row then reappeared in the second pass as a right-only row.  In a
FULL JOIN the left side is null complemented in the right-only rows just
as the right side is in the left-only rows, so its predicates are inner
side predicates and must be deferred the same way.

Before access selection, lift the WHERE conjuncts that reference a
surviving FULL JOIN's left side but not its right side out of the WHERE
and hold them on the right side partner.  Removed from the WHERE they
build no access on the left side, so it is read in full.
make_join_select reattaches them to the right partner under the found
match guard, so they apply only after the match is recorded.  A conjunct
that also references the right side stays in place, since the right side
already defers it.
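
The effect of applying a left-only predicate too early can be reproduced in a toy two-pass model. The following is a Python sketch, not the executor: tables are lists of values for one column, NULL is None, and `run_full_join` with its `defer` switch is a hypothetical construction. The predicate accepts NULL, as it must for the FULL JOIN to survive the rewrite to LEFT JOIN.

```python
def left_where(v):
    """WHERE t1.v IS NULL OR t1.v <> 2 (references only the left side)."""
    return v is None or v != 2


def run_full_join(t1, t2, defer):
    matched = set()   # right rows that found a left match
    out = []
    # Pass 1: LEFT JOIN of the left side over the right side.
    for a in t1:
        if not defer and not left_where(a):
            continue  # pruned before any match is recorded
        hit = False
        for j, b in enumerate(t2):
            if a is not None and a == b:
                hit = True
                matched.add(j)
                if left_where(a):      # applied only after recording the match
                    out.append((a, b))
        if not hit and left_where(a):
            out.append((a, None))
    # Pass 2: emit right rows that never matched, left side NULL-complemented.
    for j, b in enumerate(t2):
        if j not in matched and left_where(None):
            out.append((None, b))
    return out


t1 = [1, 2]
t2 = [2, 3]

early = run_full_join(t1, t2, defer=False)
deferred = run_full_join(t1, t2, defer=True)
```

With early pruning, the right row 2 comes back as the right-only row (None, 2) even though it has a left match; with deferral it is correctly absent.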

Also tighten List_iterator::swap_next to assert that it is positioned on
a valid element instead of returning nullptr, since the FULL JOIN
rewrite only calls it in that state.
Dave Gosselin
MDEV-39014: FULL JOIN Phase 2

In phase 1, FULL [OUTER] JOIN was only supported when simplify_joins()
could rewrite it into an equivalent LEFT, RIGHT, or INNER JOIN based
on NULL-rejecting WHERE predicates.  Queries that could not be
rewritten raised ER_NOT_SUPPORTED_YET.  (Phase 1 was not released.)

This commit removes that restriction by adding proper support for FULL
JOIN by executing a 'LEFT JOIN pass' that emits matched rows and left
null-complemented rows, then a second "null-complement" pass which
rescans the right table to emit null-complement rows that were never
matched.

FULL JOIN supports nested joins on the left of the FULL JOIN,
NATURAL FULL JOIN, semi-joins, CTEs / derived tables (kept
materialized when they participate in a FULL JOIN), prepared
statements, stored procedures, and aggregates.  Examples:

  SELECT * FROM (d1 FULL JOIN d2 ON d1.a = d2.a)
              FULL JOIN t3 ON d1.a = t3.a;

  SELECT * FROM t1 NATURAL FULL JOIN t2;

  SELECT * FROM t1 INNER JOIN t2 FULL JOIN t3 ON t1.a = t3.a;

  PREPARE st FROM
    'SELECT COUNT(*) FROM t1 FULL JOIN t2 ON t1.a = t2.a';

Limitations:
  - Statistics and cost estimates for the null-complement pass have
    not been fully implemented; the optimizer may under- or
    over-estimate FULL JOIN costs in plans involving multiple
    FULL JOINs.  A follow-up will optimize the cost calculations.
  - Optimizations for constant tables are not fully supported.
  - Nested tables on the right side of a FULL JOIN are not yet supported.
Arcadiy Ivanov
MDEV-38975: HEAP engine BLOB/TEXT/JSON/GEOMETRY support with indexable blob columns

Remove the HA_NO_BLOBS restriction from the HEAP engine, allowing
the optimizer to keep temporary tables with BLOB/TEXT columns in
memory when they fit within max_heap_table_size / tmp_memory_table_size
limits.  Additionally, advertise HA_CAN_GEOMETRY so explicit
CREATE TABLE ... ENGINE=MEMORY with GEOMETRY columns works.

Unlike other HEAP blob implementations (e.g. Percona), this patch
provides full HASH index support on blob columns, enabling efficient
lookups, GROUP BY, and DISTINCT operations directly in HEAP without
falling back to disk.

Architecture
------------

BLOB data is stored using continuation records -- additional fixed-size
records allocated from the same HP_BLOCK that holds regular rows.  This
reuses existing allocation, free list, and size accounting with minimal
structural change, and avoids per-blob my_malloc() calls.

The existing single-byte visibility flag is extended into a flags byte
with bits for HP_ROW_HAS_CONT, HP_ROW_IS_CONT, HP_ROW_CONT_ZEROCOPY,
HP_ROW_SINGLE_REC, and HP_ROW_MULTIPLE_REC.  Continuation records are
grouped into variable-length runs -- contiguous sequences within a leaf
block.  Only the first record of each run carries a 10-byte header
(next_cont pointer + run_rec_count); inner records are pure payload.

Three storage formats, detected by flag bits via inline predicates:

  Case A (HP_ROW_SINGLE_REC): single record, no header, data at
  offset 0.  Zero-copy read.

  Case B (HP_ROW_CONT_ZEROCOPY): single run, multiple records.
  Header in rec 0, data contiguous in rec 1..N-1.  Zero-copy read
  via chain + recbuffer.

  Case C (HP_ROW_MULTIPLE_REC): one or more runs linked via
  next_cont.  Reassembly into blob_buff required.

Run allocation uses a two-phase strategy: (1) peek-then-unlink walk
of the free list detecting contiguous groups, (2) tail allocation
from HP_BLOCK for remaining data.  As a third step, a scavenge
fallback walks the entire free list when tail allocation fails.

HP_SHARE::total_records tracks all physical records (primary +
continuation), while HP_SHARE::records remains the logical count
used by hash bucket mapping.

Reassembly buffer (HP_INFO::blob_buff) follows the same pattern as
InnoDB's blob_heap -- allocated once, grown via my_realloc, freed
on heap_reset()/close.  Zero-copy cases (A/B) return pointers
directly into HP_BLOCK with no copy.

Full HASH index key handling for BLOB columns: hp_rec_hashnr(),
hp_rec_key_cmp(), hp_key_cmp(), hp_make_key(), hp_hashnr() are
extended for HA_BLOB_PART segments.  Hash pre-check optimization
skips expensive blob materialization when hashes differ.  PAD SPACE
collation semantics are preserved for blob key comparisons.

Field_blob_key (Monty) produces HEAP-native key format (4-byte length
+ 8-byte data pointer) directly, eliminating key buffer translation
between the SQL layer and HEAP engine.

SQL layer changes
-----------------

choose_engine(): removed blob_fields check, added reclength >
HA_MAX_REC_LENGTH.

finalize(): HEAP+blob uses fixed-width rows; GROUP BY key setup sets
key_part_flag from field, uses item max_length for blob key sizing.
store_length initialized for all GROUP BY key parts.  DISTINCT key
setup skips null-bits helper for HEAP.

remove_duplicates(): blob check moved before HEAP check to fall
through to remove_dup_with_compare().

Aggregator_distinct::add(): overflow-to-disk conversion via
create_internal_tmp_table_from_heap() for non-dup write errors.

Expression cache disabled for HEAP+blob (key format incompatibility).

FULLTEXT early detection in mysql_derived_prepare(): forces disk
engine via TMP_TABLE_FORCE_MYISAM when outer query uses MATCH.

Deferred blob chain free (MDEV-39732): heap_delete() saves chain
pointers to pending_blob_chains, flushed on next mutation or
heap_reset()/close.  Prevents dangling zero-copy pointers during
binlog_log_row().
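
The deferred-free idea can be sketched as a pending list that is
flushed later.  pending_chains and its helpers are hypothetical names
chosen for this sketch, and the chains are modelled as opaque
pointers released through a caller-supplied function.

```c
#include <stdlib.h>

/* Hypothetical list of chains whose release has been postponed. */
typedef struct
{
  void **chains;
  size_t count, cap;
} pending_chains;

/* Remember a chain instead of freeing it.  Returns 0 on success. */
static int pending_push(pending_chains *p, void *chain)
{
  if (p->count == p->cap)
  {
    size_t new_cap= p->cap ? p->cap * 2 : 4;
    void **grown= realloc(p->chains, new_cap * sizeof(*grown));
    if (!grown)
      return 1;
    p->chains= grown;
    p->cap= new_cap;
  }
  p->chains[p->count++]= chain;
  return 0;
}

/* Release every remembered chain; returns how many were released. */
static size_t pending_flush(pending_chains *p,
                            void (*free_chain)(void *))
{
  size_t i, n= p->count;
  for (i= 0; i < n; i++)
    free_chain(p->chains[i]);
  p->count= 0;
  return n;
}

/* Flush and release the list itself, as on close. */
static void pending_destroy(pending_chains *p,
                            void (*free_chain)(void *))
{
  pending_flush(p, free_chain);
  free(p->chains);
  p->chains= NULL;
  p->cap= 0;
}
```

Between the push and the flush, pointers into the chain stay valid,
which is the property a zero-copy reader depends on.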

REPLACE safety (MDEV-39825): HP_SHARE::write_can_replace flag
forces copy mode in hp_read_blobs(), preventing blob data corruption
from freed-then-reused continuation records during REPLACE.

Geometry GROUP_CONCAT fix (MDEV-39761): downgrade Field_geom to
Field_blob for GROUP_CONCAT temp tables in both expression creation
paths.  Type_handler_geometry::type_handler_for_tmp_table() added.

Geometry GROUP BY key fix (MDEV-39871): detect when new_key_field()
produced non-blob Field_varstring for a blob column, replace with
Field_blob_key.

Performance
-----------

Non-blob tables: zero regression.  Every blob-specific code path is
guarded by if (share->blob_count).  No new allocations, no format
changes, no hash function changes for non-blob keys.

Blob tables: eliminates file creation/deletion overhead and page cache
management.  For single-run blobs (common case), the read path is
entirely zero-copy.

Limitations
-----------

1. No BTREE indexes on blob columns (HASH only)
2. No partial-key prefix indexing for blobs
3. 2x memory for Case C reads only (A/B are 1x)
4. No blob compression
5. 65,535 records per run (uint16 cap, auto-split)
6. max_heap_table_size applies to continuation records
7. Expression cache disabled for HEAP+blob
8. FULLTEXT forces disk engine
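
The arithmetic behind limitation 5 can be sketched as below.  The
sketch assumes a greedy split in which every run except the last is
filled to the uint16 cap; the commit does not describe the actual
split policy, and runs_needed()/run_length() are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_RECS_PER_RUN ((size_t) UINT16_MAX)   /* 65,535 */

/* Number of runs needed to hold 'records' continuation records. */
static size_t runs_needed(size_t records)
{
  return records / MAX_RECS_PER_RUN +
         (records % MAX_RECS_PER_RUN != 0);
}

/*
  Length of run number 'run_index' (0-based) under a greedy split.
  The caller must pass run_index < runs_needed(records).
*/
static uint16_t run_length(size_t records, size_t run_index)
{
  size_t remaining= records - run_index * MAX_RECS_PER_RUN;
  return (uint16_t) (remaining > MAX_RECS_PER_RUN ?
                     MAX_RECS_PER_RUN : remaining);
}
```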

Linked bugs fixed:

- MDEV-39703: mroonga fulltext test ordering
- MDEV-39723: ER_DUP_ENTRY on GROUP BY with blob column
- MDEV-39724: crash in hp_is_single_rec with GROUP BY
- MDEV-39732: slave crash in hp_free_run_chain on blob replication
- MDEV-39761: Field_geom::store() assertion in GROUP_CONCAT
- MDEV-39782: RBR ER_KEY_NOT_FOUND on HEAP blob UPDATE
- MDEV-39825: blob data corruption on REPLACE into HEAP table
- MDEV-39871: crash in my_hash_sort_bin on GROUP BY with geometry

Reviewed by: Michael Widenius <[email protected]>
  Monty reviewed the entire patch. Areas where he suggested changes
  or contributed code:
  - Field_blob_key class (HEAP-native blob key format, 4-byte length +
    data pointer)
  - Duplicate key fix on HEAP-to-Aria conversion
  - hp_blob_key_length() uint32 fix
  - hp_rec_hashnr_stored removal
  - type_handler_for_tmp_table() param cleanup
  - Type_handler_geometry::type_handler_for_tmp_table() virtual
  - blob pointer bzero()
  - find_unique_row() double-materialization fix
  - Tail reclaim review
  - Batch tail allocation review
  - hp_update.c cleanup
  - Field_blob_compressed temp table fix
  - row_pack_length() dedup
  - pack_length_no_ptr() removal
  - Race condition fix in HEAP
  - MDEV-39703 mroonga test fix
  - MDEV-39825 write_can_replace optimization
  - Documentation (Docs/internal-temporary-tables.txt)
Contribution by: Alexander Barkov <[email protected]>
  Type_handler::make_and_init_table_field_ex() -- refactored temp table
  field creation from inline code in sql_select.cc into type handler
  virtual methods (sql_type.cc, sql_type_geom.cc), enabling clean
  per-type-handler field creation for HEAP blob promotion.
Dave Gosselin
MDEV-39936: Free FULL JOIN duplicate filters on allocation failure

alloc_full_join_duplicate_filters allocates one duplicate filter for
each right side FULL JOIN table in a range of join tabs.  When a later
allocation in the range failed, the filters created earlier in the same
call stayed allocated and leaked.

Free the filters created so far before returning the failure, both when
a recursive call for a bush child fails and when a filter's own
allocation or initialization fails.  A failed call now leaves no filters
allocated.
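
A minimal sketch of the unwind-on-failure pattern described above.
It is not the server code: alloc_filters() and the fault-injecting
limited_alloc() are hypothetical, filters are modelled as plain
malloc() blocks, and the recursive bush-child case is left out.

```c
#include <stdlib.h>

typedef void *(*alloc_fn)(size_t);

/* Illustration-only allocator that fails once its budget is used. */
static size_t allocs_left;
static void *limited_alloc(size_t size)
{
  if (!allocs_left)
    return NULL;
  allocs_left--;
  return malloc(size);
}

/*
  Allocate n filters.  If any allocation fails, free the ones created
  earlier in this call so that a failed call leaves nothing allocated.
  Returns 0 on success, 1 on failure.
*/
static int alloc_filters(void **filters, size_t n, size_t size,
                         alloc_fn alloc)
{
  size_t i;
  for (i= 0; i < n; i++)
  {
    filters[i]= alloc(size);
    if (!filters[i])
    {
      while (i--)           /* unwind entries i-1 .. 0 */
      {
        free(filters[i]);
        filters[i]= NULL;
      }
      return 1;
    }
  }
  return 0;
}
```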
Dave Gosselin
MDEV-39936: Preserve a FULL JOIN when its left nest moves conditions

The rewrite of a FULL JOIN to a RIGHT JOIN lost rows when the left
operand was a nested join.  rewrite_full_outer_joins recurses into the
left nest, and that recursion can move the ON conditions of the nest's
inner join children into the WHERE clause.  Those moved conditions
filter the nest's rows correctly only while the nest stays on the outer
side of the join.

The rewrite to a RIGHT JOIN makes the nest the inner side of the
resulting LEFT JOIN, where the moved conditions reject its
null-complemented rows and drop them.  Detect the move by comparing the WHERE
pointer before and after the recursion, since every move reassigns it.
When a condition moved, skip the rewrite and let the FULL JOIN survive,
and zero not_null_tables so the caller does not turn the surviving FULL
JOIN into an inner join.
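
The pointer comparison can be sketched as follows.  The types and
functions are hypothetical stand-ins (cond for a condition tree,
select_ctx for the query, rewrite_nest() for the recursion); the only
property the sketch relies on is the one stated above, that moving a
condition reassigns the WHERE pointer.

```c
#include <stddef.h>

/* Hypothetical condition node and query context. */
typedef struct cond
{
  const char *text;
  struct cond *left, *right;
} cond;

typedef struct
{
  cond *where;
} select_ctx;

/* Moving an ON condition builds a new AND node and reassigns WHERE. */
static void move_to_where(select_ctx *ctx, cond *on, cond *and_node)
{
  and_node->text= "AND";
  and_node->left= ctx->where;
  and_node->right= on;
  ctx->where= and_node;
}

/* Stand-in for the recursion: moves 'on' only if there is one. */
static void rewrite_nest(select_ctx *ctx, cond *on, cond *and_node)
{
  if (on)
    move_to_where(ctx, on, and_node);
}

/* Returns 1 if the recursion moved a condition into WHERE. */
static int nest_moved_conditions(select_ctx *ctx, cond *on,
                                 cond *and_node)
{
  cond *before= ctx->where;
  rewrite_nest(ctx, on, and_node);
  return ctx->where != before;
}
```

A caller would skip the RIGHT JOIN rewrite whenever
nest_moved_conditions() returns 1.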
Rucha Deodhar
MDEV-39878: JSON_ARRAY_INTERSECT: nested call JSON_ARRAY_INTERSECT(
JSON_ARRAY_INTERSECT(a,b), b) returns NULL instead of intersect result

Analysis:
Evaluating args[0] inside fix_length_and_dec() forces the argument to run
during the query preparation phase. If args[0] is a nested function,
its runtime memory structures are not yet initialized. This causes an
early failure that makes the inner function return NULL before the query
starts executing.

Fix:
Wrap the evaluation in a const_item() check so it only runs during setup
if the argument is a constant. Otherwise, do it at runtime.
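
The shape of the fix can be modelled in a few lines.  This is a
simplified C model, not the Item API: item, func_state, prepare() and
execute() are hypothetical, and the is_const flag plays the role of
const_item().

```c
/* Hypothetical argument: constant, or a function needing runtime state. */
typedef struct
{
  int is_const;        /* plays the role of const_item() */
  int runtime_ready;   /* set once execution has started */
  int value;
} item;

/* Evaluating a non-constant item before execution fails (-1). */
static int item_eval(const item *it, int *out)
{
  if (!it->is_const && !it->runtime_ready)
    return -1;
  *out= it->value;
  return 0;
}

typedef struct
{
  int cached;          /* 1 if the argument was evaluated at setup */
  int cached_value;
} func_state;

/* Setup phase: evaluate the argument only if it is constant. */
static void prepare(func_state *st, const item *arg)
{
  st->cached= 0;
  if (arg->is_const && item_eval(arg, &st->cached_value) == 0)
    st->cached= 1;
}

/* Execution phase: use the cached value or evaluate now. */
static int execute(const func_state *st, item *arg, int *out)
{
  if (st->cached)
  {
    *out= st->cached_value;
    return 0;
  }
  arg->runtime_ready= 1;
  return item_eval(arg, out);
}
```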
Monty
fixup! remove is_text_key_segment
Monty
MDEV-40030 Add support for CHECK TABLE for memory tables

CHECK TABLE is now supported, but will only return ok or fail.
Still good enough for testing heap table consistency.
Dave Gosselin
MDEV-38502: FULL OUTER JOIN get correct searchable condition

Fetches the ON condition from the FULL OUTER JOIN as the searchable condition.
We ignore the WHERE clause here because we don't want accidental conversions
from FULL JOIN to INNER JOIN during, for example, range analysis, as that
would produce wrong results.

GCOV shows that existing FULL OUTER JOIN tests exercise this new codepath.
Dave Gosselin
Address Monty's Phase 2 Review Feedback
Sergei Petrunia
More code cleanups.