You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
When importing data, incoming column names that match SQL reserved keywords (e.g., DESC, GROUP, SELECT) currently require manual quoting to avoid parsing errors. This typically throws newcomers or in scenarios where schemas are automatically generated.
Propose adding an optional parameter, quoteReserved=True, to loading methods (e.g., read_csv_auto) to automatically quote such column names during schema creation and to subsequent queries where ingested data tables are involved.
Proposed Solution
Introduce a parameter, quoteReserved (default: False), to functions like read_csv_auto and read_csv.
When == True, this would:
Detect column names that are SQL reserved keywords.
Automatically add quotes around those column names during ingestion.
Pros:
Improved Usability: Simplifies ingestion of datasets with column names that conflict with SQL reserved keywords.
Backward Compatibility: By default, the feature is disabled (quoteReserved=False), ensuring no gotchas for existing codebase.
Ingestion of new/foreign datasets have 1 less snafu to handle.
Considerations
The quoteReserved flag would apply only during schema creation and not affect runtime query behaviour.
Users would still need to be aware of quoted column names when writing queries.
example usage:
input.txt
KEY|DESC|VALUE
1|description|100
2|description|200
con.execute("""
CREATE TABLE my_table AS SELECT * FROM read_csv_auto('input.txt', delim='|', header=True, quoteReserved=True)
""")
here the DESC field is automatically quoted to prevent an exception being thrown
in the following scenario where quoteReserved is True
SELECT KEY, DESC FROM my_table;
becomes
SELECT KEY, "DESC" FROM my_table;
likewise
SELECT A.KEY,A.DESC from my_table a;
becomes
SELECT A.KEY,A."DESC" from my_table a;
The text was updated successfully, but these errors were encountered:
Thanks for the thoughts! Perhaps the normalize_names option already achieves what you're looking for? This prefixes reserved keywords with an underscore:
szarnyasg
changed the title
Add quoteReserved (SQL KEYWORDS) option when ingesting data
Add quoteReserved (SQL KEYWORDS) option when ingesting data from CSV
Dec 11, 2024
For consideration:
When importing data, incoming column names that match SQL reserved keywords (e.g., DESC, GROUP, SELECT) currently require manual quoting to avoid parsing errors. This typically throws newcomers or in scenarios where schemas are automatically generated.
Propose adding an optional parameter, quoteReserved=True, to loading methods (e.g., read_csv_auto) to automatically quote such column names during schema creation and to subsequent queries where ingested data tables are involved.
Proposed Solution
Introduce a parameter, quoteReserved (default: False), to functions like read_csv_auto and read_csv.
When == True, this would:
Detect column names that are SQL reserved keywords.
Automatically add quotes around those column names during ingestion.
Pros:
Considerations
The quoteReserved flag would apply only during schema creation and not affect runtime query behaviour.
Users would still need to be aware of quoted column names when writing queries.
example usage:
The text was updated successfully, but these errors were encountered: