[2/2]Cherry-Pick Fix gpcheckcat false alarms for pg_default_acl etc #814
Merged
Conversation
yjhjstz changed the title from "Cherry-Pick Fix gpcheckcat false alarms for pg_default_acl etc" to "[2/2]Cherry-Pick Fix gpcheckcat false alarms for pg_default_acl etc" on Dec 25, 2024
my-ship-it previously approved these changes on Dec 25, 2024
avamingli previously approved these changes on Dec 25, 2024
In ATExecAddColumn(), where we find child tables to apply a flag, we used to open the tables without a lock, because we already hold a lock on the parent and implicitly believed that no one else could alter the child tables. That is still correct, but upstream commit 4b21acf introduced the expectation that every caller of relation_open() has acquired a lock, except in bootstrap mode. It does not seem important to keep using NoLock and work around that assertion in relation_open() somehow, so resolve the FIXME by taking a lock when opening the child tables.
Issue: gpinitsystem does not work with the debug option -D and throws errors like:

```
20220715:11:44:09:048391 gpinitsystem:-[INFO]:-End Function PING_HOST
/usr/local/gpdb6/bin/lib/gp_bash_functions.sh: line 1051: [: 2>/dev/null: integer expression expected
/usr/local/gpdb6/bin/lib/gp_bash_functions.sh: line 1051: [: REMOTE_EXECUTE_AND_GET_OUTPUT: integer expression expected
/usr/local/gpdb6/bin/lib/gp_bash_functions.sh: line 1055: [: too many arguments
/usr/local/gpdb6/bin/lib/gp_bash_functions.sh: line 1060: [: too many arguments
/usr/local/gpdb6/bin/lib/gp_bash_functions.sh: line 1066: [: too many arguments
/usr/local/gpdb6/bin/lib/gp_bash_functions.sh: line 1079: [: too many arguments
20220715:11:44:09:048391 gpinitsystem:-[INFO]:-End Function GET_PG_PID_ACTIVE
```

Cause: When gpinitsystem runs with the debug option, every LOG_MSG statement logs (echoes) messages to both STDOUT (console) and the log file, per the LOG_MSG definition in gpdb/bin/lib/gp_bash_functions.sh. This makes the REMOTE_EXECUTE_AND_GET_OUTPUT function print multiple messages, while its callers expect exactly one string to be returned from it, hence the errors. Without the debug option there is no issue, because LOG_MSG then logs to the log file only.

Fix: Save the initial DEBUG_LEVEL and turn off debugging at the start of the function, then reset DEBUG_LEVEL to INITIAL_DEBUG_LEVEL at the end of the function, so that all LOG_MSG statements inside log to the log file only and do not echo to STDOUT. A behave scenario was added for this change.
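The save-and-restore pattern of the fix can be sketched as follows. This is a minimal standalone sketch, not the actual gp_bash_functions.sh code; the real definitions differ in detail:

```shell
#!/bin/sh
# Sketch of the fix: silence console logging inside a function whose stdout
# is its return value, and restore the caller's debug setting afterwards.
DEBUG_LEVEL=1                 # as if gpinitsystem were run with -D
LOG_FILE="$(mktemp)"

LOG_MSG() {
    # With debugging on, LOG_MSG echoes to both stdout and the log file,
    # which corrupts any "echo the result" style function return value.
    [ "$DEBUG_LEVEL" -eq 1 ] && echo "$1"
    echo "$1" >> "$LOG_FILE"
}

REMOTE_EXECUTE_AND_GET_OUTPUT() {
    INITIAL_DEBUG_LEVEL=$DEBUG_LEVEL
    DEBUG_LEVEL=0                      # log to file only while inside
    LOG_MSG "[INFO]:-Start Function REMOTE_EXECUTE_AND_GET_OUTPUT"
    echo "42"                          # the single value callers parse
    LOG_MSG "[INFO]:-End Function REMOTE_EXECUTE_AND_GET_OUTPUT"
    DEBUG_LEVEL=$INITIAL_DEBUG_LEVEL   # restore caller's debug setting
}

OUTPUT=$(REMOTE_EXECUTE_AND_GET_OUTPUT)
echo "$OUTPUT"   # -> 42, even with DEBUG_LEVEL=1
```

Without the DEBUG_LEVEL=0/restore pair, OUTPUT would also contain the two [INFO] lines, which is exactly the failure the `[: too many arguments` errors point at.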
The LSN output from pg_switch_wal() is commonly used with pg_walfile_name(). However, the LSNs returned by gp_switch_wal() cannot be fed to pg_walfile_name(), because the coordinator and the segments are very likely on different timelines. To give users and developers a WAL segment filename that is guaranteed correct, we should bake it into gp_switch_wal() itself. A separate catalog function would create a window where an HA failover changes the timeline ids and we would have the same problem all over again. We also convert the function into a C function to guarantee that all the function calls in gp_switch_wal() run on the correct segments: the EXECUTE ON SEGMENT/COORDINATOR combination with a UNION was observed to have known issues with segment-specific values (specifically the coordinator's) and redistribute motion plans.

Issue example:

```
postgres=# SELECT gp_segment_id, last_archived_wal FROM gp_stat_archiver ORDER BY gp_segment_id;
 gp_segment_id |    last_archived_wal
---------------+--------------------------
            -1 | 000000070000000300000033
             0 | 000000060000000200000039
             1 | 00000006000000020000003B
             2 | 00000006000000020000003D
(4 rows)

postgres=# create table testtable(a int);
CREATE TABLE

-- the walfilenames all have the coordinator's timeline id in them
postgres=# SELECT gp_segment_id, pg_switch_wal, pg_walfile_name(pg_switch_wal) AS walfilename
           FROM gp_switch_wal() ORDER BY gp_segment_id;
 gp_segment_id | pg_switch_wal |       walfilename
---------------+---------------+--------------------------
            -1 | 3/D0042DB0    | 000000070000000300000034
             0 | 2/E8042230    | 00000007000000020000003A
             1 | 2/F0042230    | 00000007000000020000003C
             2 | 2/F8042230    | 00000007000000020000003E
(4 rows)

-- these are the expected WAL segment filenames with correct timeline id
postgres=# SELECT gp_segment_id, last_archived_wal FROM gp_stat_archiver ORDER BY gp_segment_id;
 gp_segment_id |    last_archived_wal
---------------+--------------------------
            -1 | 000000070000000300000034
             0 | 00000006000000020000003A
             1 | 00000006000000020000003C
             2 | 00000006000000020000003E
(4 rows)
```
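To see why the timeline id alone corrupts the filename, here is a sketch of the computation pg_walfile_name() performs, assuming a 64 MB WAL segment size (the values in the example above are consistent with that size); the function name is hypothetical:

```shell
#!/bin/sh
# Sketch: derive a WAL segment file name from a timeline id and an LSN.
# Assumes 64 MB WAL segments; the file name is TLI, then the high and low
# parts of the segment number, each as 8 hex digits.
walfile_name() {
    tli=$1; lsn=$2                       # e.g. walfile_name 7 3/D0042DB0
    hi=${lsn%/*}; lo=${lsn#*/}
    seg_size=$((64 * 1024 * 1024))       # 64 MB WAL segments (assumption)
    segs_per_id=$((0x100000000 / seg_size))
    # "lsn - 1" so a segment-boundary LSN maps to the just-filled segment
    segno=$(( ( (0x$hi << 32 | 0x$lo) - 1 ) / seg_size ))
    printf '%08X%08X%08X\n' "$tli" $((segno / segs_per_id)) $((segno % segs_per_id))
}

walfile_name 7 3/D0042DB0   # -> 000000070000000300000034 (coordinator tli)
walfile_name 6 2/E8042230   # -> 00000006000000020000003A (segment 0's own tli)
```

Only the first 8 hex digits depend on the timeline, which is why feeding segment LSNs through the coordinator's pg_walfile_name() yields names that differ from the archived files solely in that prefix.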
Considering that the original command for a SplitUpdate is UPDATE, firing INSERT and DELETE triggers may cause the wrong action to be enforced. In addition, triggers in GPDB may require cross-segment data changes. Therefore, disallow INSERT and DELETE triggers on a SplitUpdate.
We used to always dispatch in DefineIndex(), but now we do not when it is part of an ALTER TABLE command execution, so we no longer need to skip DefineIndex() for QEs. Confirmed this and removed the FIXME.
This issue no longer seems relevant on recent versions of clang. I was able to build with the configure flag --with-llvm without any issue on clang 10.0.0.
The test started failing after a change to `gp_switch_wal()` was made; the db_size_functions regress test was not updated accordingly. Fix the test by simply updating the expected output (a matchsub was added to the test as well). GPDB commit reference: https://github.com/greenplum-db/gpdb/commit/d3163467b018082b1b36d70f94fe95e59898b6e0
* Fix a panic case in the postgres_fdw test. It happens in the below query:

```
select foo.f1, loct1.f1 from foo join loct1 on (foo.f1 = loct1.f1) order by foo.f2 offset 10 limit 10;
```

The DDL statements are:

```
CREATE EXTENSION greenplum_fdw;
CREATE SERVER loopback FOREIGN DATA WRAPPER greenplum_fdw
    OPTIONS (host 'localhost', port '15432', dbname 'gpadmin');
CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;
create table foo (f1 int, f2 int);
create table loct1 (f1 int, f2 int, f3 int);
create foreign table foo2 (f3 int) inherits (foo) server loopback
    options (table_name 'loct1', use_remote_estimate 'true');
```

The reason is that set_append_path_locus()->cdbpath_create_motion_path() returns NULL (i.e. it cannot create a motion to reach the target locus requirement), but set_append_path_locus() does not handle this case and panics when accessing subpath->sameslice_relids while subpath is NULL. This happens because in the above query foo has a foreign-table child (plus a local heap-table child), and the append node tries to create a parameterized path (one an index scan path, the other the foreign scan path; the target locus is Entry). It occurs when use_remote_estimate is set on the foreign table: per the postgresGetForeignPaths() code, a parameterized path for the foreign scan is considered only when this option is set.

Fix this by returning a NULL path for the case where set_append_path_locus() returns false. Callers should take care of the NULL-return cases.

Co-authored-by: Huiliang.liu <[email protected]>
Co-authored-by: Paul Guo <[email protected]>
…063) This patch adds support for describing auxiliary tables, e.g.:

```
postgres=# create table foo(i int) with (appendonly=true);
postgres=# create index on foo(i);
CREATE INDEX
postgres=# select blkdirrelid::regclass from pg_appendonly where relid = 'foo'::regclass;
        blkdirrelid
----------------------------
 pg_aoseg.pg_aoblkdir_16411
(1 row)

postgres=# \d+ pg_aoseg.pg_aoblkdir_16411
Appendonly block directory table: "pg_aoseg.pg_aoblkdir_16411"
     Column     |  Type   | Storage
----------------+---------+---------
 segno          | integer | plain
 columngroup_no | integer | plain
 first_row_no   | bigint  | plain
 minipage       | bytea   | plain
Indexes:
    "pg_aoblkdir_16411_index" PRIMARY KEY, btree (segno, columngroup_no, first_row_no)

postgres=# \d+ pg_aoseg.pg_aovisimap_16411
Appendonly visibility map table: "pg_aoseg.pg_aovisimap_16411"
    Column    |  Type   | Storage
--------------+---------+---------
 segno        | integer | plain
 first_row_no | bigint  | plain
 visimap      | bytea   | plain
Indexes:
    "pg_aovisimap_16411_index" PRIMARY KEY, btree (segno, first_row_no)

postgres=# \d+ pg_aoseg.pg_aoseg_16411
Appendonly segment entry table: "pg_aoseg.pg_aoseg_16411"
     Column      |   Type   | Storage
-----------------+----------+---------
 segno           | integer  | plain
 eof             | bigint   | plain
 tupcount        | bigint   | plain
 varblockcount   | bigint   | plain
 eofuncompressed | bigint   | plain
 modcount        | bigint   | plain
 formatversion   | smallint | plain
 state           | smallint | plain
```
In Postgres, multiple recursive self-references are disallowed by the SQL spec, and inlining of multiply-referenced CTEs with outer recursive refs is prevented, so there is only one WorkTable correlated to each RecursiveUnion. But GPDB does not support the CTE scan plan, so if a WITH RECURSIVE query has multiply-referenced CTEs with outer recursive refs, there will be multiple WorkTable scans correlated to one RecursiveUnion. These scans share the same read pointer of the working table, so some data would be lost. Create a read pointer for each WorkTable scan so that they do not influence each other.

In Postgres, CTE SubPlans can be generated for WITH subqueries; if a CTE subplan has outer recursive refs to an outer recursive CTE, a WorkTable will exist in the subplan. State information for subplans is initialized before the main query tree, so there are corner cases where the WorkTable scan gets its init call before the RecursiveUnion does. In GPDB, we have no CTE scan node, so we do not generate CTE subplans for WITH subqueries, and we never call ExecInitWorkTableScan() before ExecInitRecursiveUnion(). Set rustate in the INIT step.

Since a shareinputscan with outer refs is not supported by GPDB, a CTE that contains outer self-references also needs to be inlined. Remove the GPDB_12_MERGE_FIXME in subselect.sql which describes the issue.
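The shared-read-pointer data loss can be illustrated with an everyday analogue: two consumers reading from one open stream split the rows between them, while giving each consumer its own "pointer" (re-opening the source) lets both see every row. This is a toy illustration, not GPDB code:

```shell
#!/bin/sh
# Toy analogue of WorkTable scans and read pointers.
WORKTABLE="$(mktemp)"
printf 'row1\nrow2\n' > "$WORKTABLE"

# Shared read pointer: both reads consume from the same open fd, so each
# "scan" sees only part of the data (rows are split, not duplicated).
{ read -r scan_a; read -r scan_b; } < "$WORKTABLE"
echo "shared: scan_a=$scan_a scan_b=$scan_b"

# Separate read pointers: each scan opens the table itself and sees all rows.
scan1="$(cat "$WORKTABLE")"
scan2="$(cat "$WORKTABLE")"
[ "$scan1" = "$scan2" ] && echo "separate: both scans saw every row"
```

In the fix, the per-scan read pointer plays the role of the second case: every WorkTable scan advances its own position over the RecursiveUnion's working table.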
gpcheckcat currently flags default privilege objects as having missing dependencies. This is because it searches for dependencies of pg_default_acl oids in pg_depend instead of in pg_shdepend. The fix is to recognize pg_default_acl as a table outside the purview of pg_depend.

Minimal repro:

```
postgres=# CREATE ROLE foo;
postgres=# ALTER DEFAULT PRIVILEGES FOR ROLE foo REVOKE EXECUTE ON FUNCTIONS FROM PUBLIC;
postgres=# SELECT oid, * FROM pg_default_acl;
  oid  | defaclrole | defaclnamespace | defaclobjtype |  defaclacl
-------+------------+-----------------+---------------+-------------
 17304 |      17303 |               0 | f             | {foo=X/foo}
(1 row)

postgres=# SELECT * FROM pg_shdepend WHERE objid = 17304;
 dbid  | classid | objid | objsubid | refclassid | refobjid | deptype
-------+---------+-------+----------+------------+----------+---------
 12812 |     826 | 17304 |        0 |       1260 |    17303 | o
(2 rows)

$ gpcheckcat postgres
...
Object oid: 17304
Table name: pg_default_acl
Name of test which found this issue: dependency_pg_default_acl
Table pg_default_acl has a dependency issue on oid 17304 at content -1
Table pg_default_acl has a dependency issue on oid 17304 at content 0
Table pg_default_acl has a dependency issue on oid 17304 at content 1
Table pg_default_acl has a dependency issue on oid 17304 at content 2
```
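The core of the fix is a classification step: catalogs whose dependencies are tracked in the shared catalog pg_shdepend must be excluded from the pg_depend cross-check, otherwise every one of their rows looks like a missing dependency. A minimal sketch of that idea (hypothetical names, not the actual gpcheckcat code):

```shell
#!/bin/sh
# Sketch: pick which dependency catalog to consult for a given table.
# pg_default_acl rows are referenced from pg_shdepend (shared, role-linked),
# so checking them against pg_depend yields false "missing dependency" alarms.
SHARED_DEP_TABLES="pg_default_acl"

dependency_catalog_for() {
    for t in $SHARED_DEP_TABLES; do
        if [ "$1" = "$t" ]; then
            echo pg_shdepend
            return
        fi
    done
    echo pg_depend
}

dependency_catalog_for pg_default_acl   # -> pg_shdepend
dependency_catalog_for pg_class         # -> pg_depend
```

The repro above shows exactly this: oid 17304 has no pg_depend entry, but its role dependency is present in pg_shdepend, so the object is perfectly healthy.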
The report_progress function may take a NULL argument for cluster, but it did not check this before trying to print cluster->major_version_str, resulting in a core dump. Add a check before the fprintf call, which also resolves a FIXME. Also use the --progress flag in the pg_upgrade regression test.
Authored-by: Brent Doil <[email protected]>
avamingli approved these changes on Dec 26, 2024
my-ship-it approved these changes on Dec 26, 2024
Test Plan
make installcheck
make -C src/test installcheck-cbdb-parallel