[Bug]: IcebergIO support name-based mapping for a schema #33314

Open
1 of 17 tasks
regadas opened this issue Dec 6, 2024 · 0 comments · May be fixed by #33315


regadas commented Dec 6, 2024

What happened?

When reading an Iceberg table, IcebergIO expects the underlying data files to contain field ID information. However, this information may be missing when the data files are written or managed by external systems.

To support this use case, the Iceberg table spec defines the table property schema.name-mapping.default, whose value is a JSON name mapping containing a list of field mapping objects.
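
The mapping described above can be pictured as a lookup from column name to field ID. Here is a minimal Java sketch, assuming the JSON shape given in the Iceberg spec (objects with `field-id`, `names`, and optional nested `fields`). `NameMappingSketch`, `MappedField`, and the sample values are hypothetical stand-ins for illustration; this is not the Iceberg or Beam API and not the code in PR #33315.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical illustration of a name mapping; not the Iceberg or Beam API. */
final class NameMappingSketch {

  /** One field mapping object: a field ID, its accepted names, and nested mappings. */
  static final class MappedField {
    final int fieldId;
    final List<String> names;
    final List<MappedField> fields;

    MappedField(int fieldId, List<String> names, List<MappedField> fields) {
      this.fieldId = fieldId;
      this.names = names;
      this.fields = fields;
    }
  }

  /**
   * Sample mapping, equivalent to this JSON (assumed values):
   * [ {"field-id": 1, "names": ["id", "record_id"]},
   *   {"field-id": 2, "names": ["data"]},
   *   {"field-id": 3, "names": ["location"],
   *    "fields": [ {"field-id": 4, "names": ["latitude", "lat"]} ]} ]
   */
  static List<MappedField> example() {
    return List.of(
        new MappedField(1, List.of("id", "record_id"), List.of()),
        new MappedField(2, List.of("data"), List.of()),
        new MappedField(3, List.of("location"),
            List.of(new MappedField(4, List.of("latitude", "lat"), List.of()))));
  }

  /** Flattens the mappings into a lookup from dotted column path to field ID. */
  static Map<String, Integer> index(List<MappedField> mappings) {
    Map<String, Integer> byPath = new HashMap<>();
    collect("", mappings, byPath);
    return byPath;
  }

  private static void collect(
      String prefix, List<MappedField> mappings, Map<String, Integer> out) {
    for (MappedField field : mappings) {
      for (String name : field.names) {
        String path = prefix.isEmpty() ? name : prefix + "." + name;
        out.put(path, field.fieldId);
        // Nested fields are reachable under every alias of the parent.
        collect(path, field.fields, out);
      }
    }
  }

  /** Returns the mapped field ID, or empty when the column has no mapping. */
  static Optional<Integer> fieldIdFor(Map<String, Integer> index, String columnPath) {
    return Optional.ofNullable(index.get(columnPath));
  }
}
```

With this sketch, a file column named `record_id` resolves to field ID 1 and `location.lat` to field ID 4, even though the file itself carries no IDs.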

PR #33315 adds NameMapping read support.

I spotted this because I was getting the following error:

org.apache.beam.sdk.util.UserCodeException: java.lang.NullPointerException: Cannot invoke "org.apache.parquet.schema.Type$ID.intValue()" because the return value of "org.apache.parquet.schema.Type.getId()" is null
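
The error message indicates that `Type.getId()` returned null and `intValue()` was then called on it. The following Java sketch reproduces that failure pattern and shows a null-safe lookup that falls back to a name-based map. `FieldIdFallbackSketch`, `FileColumn`, and `resolveId` are hypothetical names standing in for the Parquet types; this is an illustration of the idea only and may differ from what PR #33315 actually does.

```java
import java.util.Map;
import java.util.OptionalInt;

/** Hypothetical illustration; FileColumn stands in for a Parquet schema type. */
final class FieldIdFallbackSketch {

  /** A column as read from a data file; id is null when the writer stored no field ID. */
  static final class FileColumn {
    final String name;
    final Integer id;

    FileColumn(String name, Integer id) {
      this.name = name;
      this.id = id;
    }
  }

  /** Mirrors the failing pattern: dereferences the ID without a null check. */
  static int unsafeId(FileColumn column) {
    return column.id.intValue(); // throws NullPointerException when id is null
  }

  /** Uses the ID from the file when present, otherwise looks the column up by name. */
  static OptionalInt resolveId(FileColumn column, Map<String, Integer> nameToId) {
    if (column.id != null) {
      return OptionalInt.of(column.id);
    }
    Integer mapped = nameToId.get(column.name);
    return mapped == null ? OptionalInt.empty() : OptionalInt.of(mapped);
  }
}
```

Returning an empty `OptionalInt` for unmapped columns leaves it to the caller to decide whether to skip the column or fail with a clearer message.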

Issue Priority

Priority: 2 (default / most bugs should be filed as P2)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam YAML
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Infrastructure
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
@regadas regadas changed the title from "[Bug]:" to "[Bug]: IcebergIO support name-based mapping for a schema" Dec 6, 2024
@regadas regadas linked a pull request Dec 6, 2024 that will close this issue
3 tasks