[Bug] ORCA fallbacks for collate "C" #717

Open
1 of 2 tasks
my-ship-it opened this issue Nov 19, 2024 · 3 comments
Labels
type: Bug Something isn't working type: Orca only orca has the issue

Comments

@my-ship-it
Contributor

Cloudberry Database version

No response

What happened

Currently, when a table column is declared with collate "C", ORCA falls back to the Postgres query optimizer. We need to support this collation in ORCA as well, because ORCA sometimes produces a better plan.

What you think should happen instead

No response

How to reproduce

postgres=# create table tbl(v text);
CREATE TABLE
postgres=# create table tbl_collate_c(v text collate "C");
CREATE TABLE
postgres=# explain select * from tbl order by v;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..431.00 rows=1 width=8)
   Merge Key: v
   ->  Sort  (cost=0.00..431.00 rows=1 width=8)
         Sort Key: v
         ->  Seq Scan on tbl  (cost=0.00..431.00 rows=1 width=8)
 Optimizer: Pivotal Optimizer (GPORCA)
(6 rows)

postgres=# explain select * from tbl_collate_c order by v;
                                      QUERY PLAN
---------------------------------------------------------------------------------------
 Gather Motion 3:1  (slice1; segments: 3)  (cost=1451.09..2199.09 rows=52800 width=32)
   Merge Key: v
   ->  Sort  (cost=1451.09..1495.09 rows=17600 width=32)
         Sort Key: v COLLATE "C"
         ->  Seq Scan on tbl_collate_c  (cost=0.00..210.00 rows=17600 width=32)
 Optimizer: Postgres query optimizer
(6 rows)

Operating System

Not specific

Anything else

No response

Are you willing to submit PR?

  • Yes, I am willing to submit a PR!

Code of Conduct

@my-ship-it my-ship-it added type: Bug Something isn't working type: Orca only orca has the issue labels Nov 19, 2024
@yjhjstz
Member

yjhjstz commented Dec 10, 2024

// GPDB_91_MERGE_FIXME: collation
	INT non_default_collation = gpdb::CheckCollation((Node *) query);

	if (0 < non_default_collation)
	{
		GPOS_RAISE(gpdxl::ExmaDXL, gpdxl::ExmiQuery2DXLUnsupportedFeature,
				   GPOS_WSZ_LIT("Non-default collation"));
	}

Need to dig into how to solve this.
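To make the effect of that guard concrete, here is a toy sketch of a "non-default collation" check. It assumes the check walks the query tree and reports any collation that is neither unset nor the default one; that reading comes from the error message above, not from the actual `gpdb::CheckCollation` source. `ToyNode` and `ToyCheckCollation` are hypothetical names. The OID values 100 and 950 are PostgreSQL's `default` and `"C"` collations.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

using Oid = std::uint32_t;

constexpr Oid kInvalidOid = 0;             // no collation attached
constexpr Oid kDefaultCollationOid = 100;  // PostgreSQL "default" collation
constexpr Oid kCCollationOid = 950;        // PostgreSQL "C" collation

// Hypothetical, heavily simplified stand-in for a parse-tree node.
struct ToyNode {
    Oid collation = kInvalidOid;
    std::vector<std::unique_ptr<ToyNode>> children;
};

// Assumed behaviour: return 1 as soon as any node in the tree carries a
// collation that is neither unset nor the default; return 0 otherwise.
int ToyCheckCollation(const ToyNode &node) {
    if (node.collation != kInvalidOid &&
        node.collation != kDefaultCollationOid) {
        return 1;
    }
    for (const auto &child : node.children) {
        assert(child != nullptr);
        if (ToyCheckCollation(*child) != 0) {
            return 1;
        }
    }
    return 0;
}
```

Under this model, a single column with collate "C" anywhere in the query is enough to trip the guard, which would explain why the whole query is handed to the Postgres query optimizer in the reproduction above.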

@jiaqizho
Contributor

Each physical expression (e.g. sort) needs to derive its collation, I guess...

I'm researching this part of the logic.

@jiaqizho
Contributor

In my research, I found that it is very difficult to support collate "C" in ORCA, for several reasons:

  1. After the SQL passes through the parser, the resulting tree may contain T_RelabelType or T_CollateExpr nodes (common in subqueries).
  2. ORCA can't handle T_RelabelType or T_CollateExpr in the current version. This also means that we can't handle the collation only in the DXLToPlStmt stage of ORCA.

Therefore, when ORCA receives a query, we need to include the collation in the operator and calculate the collation during the exploration and implementation phases. This will be a big change in ORCA.
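As a rough illustration of what "include the collation in the operator" could mean, the toy model below attaches a collation to a sort key and treats two orderings as interchangeable only when both the column and the collation match. Every name here (`ToyExpr`, `ToySortKey`, `DeriveSortKey`, `Satisfies`) is hypothetical and none of it is ORCA code. It also assumes the outermost RelabelType or CollateExpr wrapper decides the collation of the result.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>

using Oid = std::uint32_t;

constexpr Oid kDefaultCollationOid = 100;  // PostgreSQL "default" collation
constexpr Oid kCCollationOid = 950;        // PostgreSQL "C" collation

enum class ToyKind { ColumnRef, RelabelType, CollateExpr };

// Hypothetical scalar expression: a column, or a wrapper around another expression.
struct ToyExpr {
    ToyKind kind;
    int column_id;                         // meaningful only for ColumnRef
    Oid collation;                         // collation of this node's result
    std::shared_ptr<const ToyExpr> input;  // null for ColumnRef
};

ToyExpr Column(int id, Oid collation) {
    return {ToyKind::ColumnRef, id, collation, nullptr};
}

ToyExpr Wrap(ToyKind kind, Oid collation, ToyExpr in) {
    return {kind, -1, collation, std::make_shared<const ToyExpr>(std::move(in))};
}

// A sort key that carries its collation as part of the order property.
struct ToySortKey {
    int column_id;
    Oid collation;
};

// Look through wrapper nodes to find the column, but keep the collation of
// the outermost node (assumption: the outermost wrapper wins).
ToySortKey DeriveSortKey(const ToyExpr &expr) {
    const Oid collation = expr.collation;
    const ToyExpr *cur = &expr;
    while (cur->kind != ToyKind::ColumnRef) {
        assert(cur->input != nullptr);
        cur = cur->input.get();
    }
    return {cur->column_id, collation};
}

// A delivered order satisfies a required one only if the collation matches too.
bool Satisfies(const ToySortKey &delivered, const ToySortKey &required) {
    return delivered.column_id == required.column_id &&
           delivered.collation == required.collation;
}
```

In this model, a sort on column 1 under the default collation does not satisfy a requirement for column 1 under "C", so an optimizer that tracks only the column could reuse an ordering that is wrong for the query. That is the kind of property the exploration and implementation phases would have to track.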
