Question: Is there a plan to expose functionality as library code? #113

Open
thisisaaronland opened this issue Nov 7, 2023 · 4 comments

Comments

@thisisaaronland

thisisaaronland commented Nov 7, 2023

Hi,

I am interested in using gpq to generate GeoParquet files for Who's On First (WOF) data. Ideally I would like to do that by reading and writing data on a per-record basis rather than starting with a single GeoJSON file.

Poking through the code, it appears I can stream data to gpq via STDIN, which would allow me to use an approach similar to how we derive PMTiles from WOF data.

That would solve my immediate problem, but the functionality wrapped by the gpq command, specifically the convert functionality, would be generally useful to have as library code (outside of internal).

@tschaub
Member

tschaub commented Nov 7, 2023

Hi @thisisaaronland - thanks for reaching out about this. Yes, I think it makes sense to expose packages with functions for generating GeoParquet data.

If you have ideas about the ideal API that you'd like to use, maybe you can drop them here and we can discuss. I'm curious in particular about whether you would want to provide a Parquet (or Arrow) schema up front or if you would like this to be derived from the data.

@thisisaaronland
Author

Hi @tschaub

For starters I am not super knowledgeable about Parquet or Arrow but I have been watching the conversations around geoparquet and so this was an exercise to start getting more familiar and to prove that WOF data could be bundled in a new format. (One of the unofficial mottoes of the WOF project is: We don't need to have an opinion about your database :-)

The first thing I'd like to be able to do is write a go-writer-geoparquet package that implements the whosonfirst/go-writer.Writer interface:

https://pkg.go.dev/github.com/whosonfirst/go-writer#Writer

A concrete example of that would be the go-writer-featurecollection package:

https://github.com/whosonfirst/go-writer-featurecollection/blob/main/featurecollection.go

That would allow me to continue to use a common set of interfaces for writing WOF documents to a variety of targets and encapsulate all the Parquet/Arrow-specific details in the constructor and the URI used to create it.

Based on the short amount of time I've spent spelunking through the gpq code it seems like just making the internal/geo* packages public might be enough.

@thisisaaronland
Author

Hi,

Just checking in on this. Has there been any (more) thought about exposing the code in internal as public library code?

@tschaub
Member

tschaub commented Jun 6, 2024

I haven't made any time for this yet unfortunately.
