Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Load config from URL #1252

Open
wants to merge 1 commit into
base: skosmos-2
Choose a base branch
from
Open

Conversation

pulquero
Copy link

@pulquero pulquero commented Dec 1, 2021

Reasons for creating this PR

Enable the creation of a reusable (deployment-agnostic) docker container.

Link to relevant issue(s), if any

  • Closes #

Description of the changes in this PR

  1. Make it easier to manage multiple configs without having to rename files back-and-forth with the introduction of a SKOSMOS_CONFIG environment variable.
  2. Externalise config from the docker container by allowing URLs (incl sparql query URLs) as well as filenames.

Known problems or uncertainties in this PR

Checklist

  • phpUnit tests pass locally with my changes
  • I have added tests that prove my fix is effective or that my feature works (if not, explain why below)
    As there are a large number of outstanding PRs, I'm loath to spend time beyond my immediate needs on stuff that may just end up going nowhere.
  • The PR doesn't introduce unintended code changes (e.g. empty lines or useless reindentation)

@sonarcloud
Copy link

sonarcloud bot commented Dec 1, 2021

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 1 Code Smell

No Coverage information No Coverage information
0.0% 0.0% Duplication

Copy link
Collaborator

@kinow kinow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't test, but change looks good to me. If the env var is specified, we replace the config name. In that case we don't use the cache and parse the config directly. When parsing it, the config is fetched via curl, which is already a requirement of Skosmos.

@@ -107,9 +122,18 @@ private function initializeConfig()
*/
private function parseConfig($filename)
{
if (str_starts_with($filename, self::HTTP_URL_PREFIX) || str_starts_with($filename, self::HTTPS_URL_PREFIX)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could modify parseConfig to pass a flag telling it whether it is a local config, or a remote config? This way we wouldn't have to check if it starts with http or https here and in the caller function ☝️

@osma osma self-assigned this Mar 2, 2022
@osma
Copy link
Member

osma commented Mar 2, 2022

Thanks for the PR and sorry for the slow response. This is a good feature (both introducing the environment variable and allowing URLs to be used instead of filenames) and IIRC something like this has been requested by other users as well in the past although I forget the details.

I'm worried about the performance. Parsing the config file can be a very expensive operation, tens or hundreds of milliseconds, and if there is no caching on the Skosmos side, then both URL retrieval and parsing has to be done each time a page is being generated - sometimes several times when the page performs AJAX style queries or accesses the REST API, for example on the vocabulary home page. The cache seems to be bypassed now in case a URL is given. Although this is not a regression in a strict sense (previously it wasn't possible to use a config URL at all, now it's possible but potentially very slow), I don't think this is an ideal solution.

I can see two possible improvements:

  1. Use the timestamp (Last-Modified probably?) from the HTTP response to implement smarter caching. Or even better, implement full HTTP caching semantics (ETag, If-Modified-Since, Cache-Control etc.). Only re-parse when the configuration on the HTTP server has actually changed. But this could still involve lots of HTTP requests (depending in part on how the HTTP server is configured), even if it saves on parsing time. And it would be very difficult to implement from scratch, but I'm sure there are existing HTTP client libraries (Guzzle, PHP-HTTP...) that could do this if we decide to add a dependency.
  2. Add a simple timeout for the configurations loaded using HTTP. For example, the cached configuration would be valid up to 5 minutes, then it would be fetched and parsed again.

I'd go for 2, since it's a lot simpler. It would still be possible to implement option 1 later, if necessary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Reviewed - further actions needed
Development

Successfully merging this pull request may close these issues.

3 participants