Pluggable storage #142

Open
5 of 7 tasks
timja opened this issue Aug 15, 2020 · 14 comments

Comments

@timja
Member

timja commented Aug 15, 2020

In common CI/CD use cases, test reports consume a lot of disk space. This data is stored within JENKINS_HOME, and the current storage format incurs a large overhead when retrieving statistics and, especially, trends: to display a trend, each report has to be loaded and then processed in memory.

The main purpose of externalising test results is to optimize Jenkins logic by querying the desired data from specialized external storage, e.g. document-based databases like Elasticsearch. According to the current plan, the JUnit Plugin will be extended to support such external storage in its APIs, which are widely used by test reporting plugins.
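
To make the contrast concrete, here is a minimal sketch, not the plugin's actual schema or code. It uses Python's built-in sqlite3 as a stand-in for an external database. The table name caseresults is borrowed from the error message later in this thread; every column name and the trend function are assumptions made for illustration. The idea is that a trend becomes one aggregate query instead of a pass over every stored report.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical schema: one row per test case per build.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS caseresults (
            job      TEXT    NOT NULL,
            build    INTEGER NOT NULL,
            suite    TEXT    NOT NULL,
            testname TEXT    NOT NULL,
            status   TEXT    NOT NULL,  -- 'PASSED', 'FAILED' or 'SKIPPED' (assumed)
            duration REAL    NOT NULL
        )
        """
    )


def trend(conn, job):
    """Return (build, passed, failed, skipped) per build, oldest first."""
    rows = conn.execute(
        """
        SELECT build,
               SUM(CASE WHEN status = 'PASSED'  THEN 1 ELSE 0 END),
               SUM(CASE WHEN status = 'FAILED'  THEN 1 ELSE 0 END),
               SUM(CASE WHEN status = 'SKIPPED' THEN 1 ELSE 0 END)
        FROM caseresults
        WHERE job = ?
        GROUP BY build
        ORDER BY build
        """,
        (job,),
    )
    return [tuple(row) for row in rows]


conn = sqlite3.connect(":memory:")
create_schema(conn)
conn.executemany(
    "INSERT INTO caseresults VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("demo", 1, "s", "a", "PASSED", 0.1),
        ("demo", 1, "s", "b", "FAILED", 0.2),
        ("demo", 2, "s", "a", "PASSED", 0.1),
        ("demo", 2, "s", "b", "PASSED", 0.2),
        ("demo", 2, "s", "c", "SKIPPED", 0.0),
    ],
)
print(trend(conn, "demo"))  # [(1, 1, 1, 0), (2, 2, 0, 1)]
```

No build record has to be deserialized to produce the per-build counts; the database does the grouping.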

Status:

Foundation work started

Prototype API: #110

Continued in: #141

Todo:

  • New trend chart implementation that doesn't rely on loading Runs
  • Rewrite history chart, probably port it to echarts
  • Fix package result loading showing no results when viewing by package
  • Plugin implementing API - most likely a PostgreSQL plugin
  • Benchmarks - show performance improvements for large instances
  • Present at Cloud Native SIG
  • Announcement blog on Jenkins.io

Test strategy:

The JUnit plugin will provide a sample implementation using an H2 database that can be used to verify pluggable storage functionality.

The Postgres implementation will be able to use something like Testcontainers and replicate a similar test inside its own plugin.

Performance tests will be written using the Microbenchmark harness, see https://www.jenkins.io/blog/2019/06/21/performance-testing-jenkins/

Graduation criteria

This API will be released in Beta initially and will likely change as it's developed.

  • At least one production ready pluggable storage implementation
  • Benchmarks showing improvements
  • Blog on Jenkins.io
@timja timja pinned this issue Aug 15, 2020
@timja timja added this to the Pluggable storage milestone Aug 15, 2020
@oleg-nenashev
Member

FTR https://www.jenkins.io/sigs/cloud-native/pluggable-storage/

@timja Thanks for re-starting it! Do you have a particular database in mind for the reference implementation?

@timja
Member Author

timja commented Aug 17, 2020

Postgres: https://github.com/timja/jenkins-junit-postgresql-plugin

@oleg-nenashev
Member

oleg-nenashev commented Aug 17, 2020

I wonder whether we could unify the data serialization logic with https://github.com/jenkinsci/postgresql-fingerprint-storage-plugin. CC @stellargo. Test result storage also has an extensible data structure.

@jglick
Member

jglick commented Aug 17, 2020

> Test result storage also has an extensible data structure

Yes but a very different structure. I do not think it would be productive to mix those.

@jglick
Member

jglick commented Aug 17, 2020

Graduation criteria should include a (draft) JEP. See JEP-202 for an example predecessor.

@timja
Member Author

timja commented Sep 18, 2020

Pretty close to hosting and releasing a beta.

After #164 is merged it pretty much all works.

What's left:

  • Hide the descriptions when using pluggable storage, as I currently have no plan to implement them (unless anyone sees a need). Not doing this; see "Hide description when using pluggable storage" #163 (comment)
  • Port some recent changes over to the postgres plugin and get it hosted
  • Add indexes to postgres plugin
  • Schema management would be great, flyway maybe, thoughts?
  • Some more real-world test cases. Currently I'm just using the initial XML file that Jesse wrote years ago; I need to check that actual tests are handled properly.
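
For the index and schema-management items, a rough sketch of what versioned migrations could look like. This is written in the spirit of a tool like Flyway but does not use Flyway or its API. It uses Python's built-in sqlite3 as a stand-in for Postgres; the schema_version table, the column names, and the index name are all hypothetical.

```python
import sqlite3

# Ordered, append-only list of (version, SQL). All statements are hypothetical.
MIGRATIONS = [
    (1, "CREATE TABLE caseresults (job TEXT, build INTEGER, suite TEXT, "
        "testname TEXT, status TEXT, duration REAL)"),
    # The index is shipped as its own migration so existing databases pick it up.
    (2, "CREATE INDEX caseresults_job_build ON caseresults (job, build)"),
]


def migrate(conn):
    """Apply every migration newer than the recorded version; return the final version."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)"
    )
    current = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM schema_version"
    ).fetchone()[0]
    for version, sql in MIGRATIONS:
        if version > current:
            conn.execute(sql)
            conn.execute("INSERT INTO schema_version VALUES (?)", (version,))
            current = version
    conn.commit()
    return current


conn = sqlite3.connect(":memory:")
migrate(conn)
migrate(conn)  # running it again applies nothing new
```

Running the migration at plugin startup would keep the schema in step with the code, whichever tool ends up doing it.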

@timja
Member Author

timja commented Sep 22, 2020

Update:

I've released v0.2 of the JUnit SQL Storage plugin.

I've been doing some benchmarking and caught one deadlock while doing so. I still need to review the reports and see if there's anything else that needs doing.

After #165 I would say this is pretty much complete, other than announcements / JEP and pending any feedback.

@timja
Member Author

timja commented Sep 22, 2020

I've done some benchmarks. It seems slightly slower for the trend chart, but I expect that's relative to what's in memory; not sure if it's worth trying to figure out how to test that.

It was something like 0.0015s vs 0.0003s.

@jglick
Member

jglick commented Sep 22, 2020

> slightly slower for the trend chart

The relevant comparison would be to a job with thousands of build records, most of which are not in memory, if you disable my hack to avoid displaying results from unloaded builds!
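
A comparison along those lines could be shaped roughly like the sketch below. It does not reproduce Jenkins' lazy loading of build records; it only contrasts parsing one serialized report per build with a single aggregate query. The data is synthetic, the sizes are arbitrary, sqlite3 stands in for the external database, and all names are assumptions. Timings will vary by machine, so none are quoted here.

```python
import json
import sqlite3
import time

BUILDS, CASES = 2000, 20  # arbitrary sizes chosen for the sketch


def make_report(build):
    # Synthetic data: in every fifth build, every tenth case fails.
    return [
        {"name": f"test{i}", "failed": build % 5 == 0 and i % 10 == 0}
        for i in range(CASES)
    ]


# "File-based" storage: one serialized report per build, parsed on demand.
serialized = {b: json.dumps(make_report(b)) for b in range(1, BUILDS + 1)}

# "External" storage: one row per case, summarized with a single query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE caseresults (build INTEGER, name TEXT, failed INTEGER)")
conn.executemany(
    "INSERT INTO caseresults VALUES (?, ?, ?)",
    ((b, c["name"], int(c["failed"])) for b in serialized for c in make_report(b)),
)
conn.execute("CREATE INDEX idx_build ON caseresults (build)")


def trend_by_loading():
    # Parse every report, as a trend chart over unloaded builds would have to.
    return [
        (b, sum(c["failed"] for c in json.loads(s)))
        for b, s in sorted(serialized.items())
    ]


def trend_by_query():
    return conn.execute(
        "SELECT build, SUM(failed) FROM caseresults GROUP BY build ORDER BY build"
    ).fetchall()


def timed(fn):
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


loaded, t_load = timed(trend_by_loading)
queried, t_query = timed(trend_by_query)
assert loaded == queried  # both paths must agree before timings mean anything
print(f"loading every report: {t_load:.4f}s, aggregate query: {t_query:.4f}s")
```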

@mdealer
Contributor

mdealer commented Sep 20, 2022

I tried to set it up while investigating test history performance issues, but got this while running a test pipeline that uploads some JUnit test results:

hudson.remoting.ProxyException: org.postgresql.util.PSQLException: ERROR: relation "caseresults" does not exist
  Position: 13
	at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2553)
	at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2285)
	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:323)
	at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:473)
	at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:393)
	at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:164)
	at org.postgresql.jdbc.PgPreparedStatement.executeUpdate(PgPreparedStatement.java:130)
	at org.apache.commons.dbcp2.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:136)
	at org.apache.commons.dbcp2.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:136)
	at io.jenkins.plugins.junit.storage.database.DatabaseTestResultStorage$RemotePublisherImpl.publish(DatabaseTestResultStorage.java:149)
Also:   hudson.remoting.ProxyException: hudson.remoting.ProxyException: org.postgresql.util.PSQLException: ERROR: relation "caseresults" does not exist
  Position: 13
		at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2553)
		at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2285)
		at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:323)
		at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:473)
		at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:393)
		at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:164)
		at org.postgresql.jdbc.PgPreparedStatement.executeUpdate(PgPreparedStatement.java:130)
		at org.apache.commons.dbcp2.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:136)
		at org.apache.commons.dbcp2.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:136)
		at io.jenkins.plugins.junit.storage.database.DatabaseTestResultStorage$RemotePublisherImpl.publish(DatabaseTestResultStorage.java:149)
	Also:   hudson.remoting.ProxyException: hudson.remoting.Channel$CallSiteStackTrace: Remote call to JNLP4-connect connection from ....
			at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1784)
			at hudson.remoting.UserRequest$ExceptionResponse.retrieve(UserRequest.java:356)
			at hudson.remoting.Channel.call(Channel.java:1000)
			at hudson.FilePath.act(FilePath.java:1194)
			at hudson.FilePath.act(FilePath.java:1183)
			at hudson.tasks.junit.JUnitParser.summarizeResult(JUnitParser.java:127)
			at hudson.tasks.junit.JUnitResultArchiver.parseAndSummarize(JUnitResultArchiver.java:257)
			at hudson.tasks.junit.pipeline.JUnitResultsStepExecution.run(JUnitResultsStepExecution.java:63)
			at hudson.tasks.junit.pipeline.JUnitResultsStepExecution.run(JUnitResultsStepExecution.java:29)
			at org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution.lambda$start$0(SynchronousNonBlockingStepExecution.java:47)
			at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
			at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
			at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
			at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
			at java.base/java.lang.Thread.run(Thread.java:829)

Any ideas?

EDIT: Turns out I had to restart Jenkins for this to fully work.

@mdealer
Contributor

mdealer commented Sep 27, 2022

I got it running using PostgreSQL, but it was still impossibly slow to load anything.

I guess it is OK for retrieving the test results externally, but for our case, where we use Jenkins, it did not improve the situation.

Also, the plugin is bursting at the seams, in part due to other additions as well.

@timja
Member Author

timja commented Sep 27, 2022

What do you mean by this?

> Also, the plugin is bursting at the seams, in part due to other additions as well.

@mdealer
Contributor

mdealer commented Sep 27, 2022

> What do you mean by this?
>
> Also, the plugin is bursting at the seams, in part due to other additions as well.

I spent around a day trying to understand where the performance issue comes from, only to find a lot of unused SQL that doesn't improve performance for our case.

Then I noticed that the charts don't work properly. I saw that they were ported to echarts and that the old behavior of a single chart showing both duration and result was scrapped. The new implementation seems broken on our side.

I was just referring to the general design patterned chaos in the plugin.

@mdealer
Contributor

mdealer commented Sep 29, 2022

On second thought, I think the pluggable storage should have an option to only export data to SQL instead of completely replacing the storage. I think a complete replacement of storage makes it significantly less maintainable, while exporting test results to SQL can still be useful.
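
The export-only idea could be sketched like this. It is only an illustration of the proposal, not anything the plugin offers: the class names, the method names, and the table columns are all hypothetical, and sqlite3 stands in for the SQL database. Reads always go to the existing storage, and the SQL side is a write-only mirror.

```python
import sqlite3


class LocalStorage:
    """Stand-in for the existing built-in storage (hypothetical)."""

    def __init__(self):
        self._results = {}

    def publish(self, job, build, cases):
        self._results[(job, build)] = list(cases)

    def load(self, job, build):
        return self._results[(job, build)]


class SqlExporter:
    """Write-only mirror; in this sketch Jenkins would never read from it."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS caseresults "
            "(job TEXT, build INTEGER, testname TEXT, status TEXT)"
        )

    def export(self, job, build, cases):
        self.conn.executemany(
            "INSERT INTO caseresults VALUES (?, ?, ?, ?)",
            [(job, build, name, status) for name, status in cases],
        )
        self.conn.commit()


class ExportingStorage:
    """Keeps the primary storage authoritative and copies results to SQL."""

    def __init__(self, primary, exporter):
        self.primary = primary
        self.exporter = exporter

    def publish(self, job, build, cases):
        self.primary.publish(job, build, cases)
        try:
            self.exporter.export(job, build, cases)
        except sqlite3.Error as exc:
            # Design choice of this sketch: a failed export never fails the build.
            print(f"export failed, local results kept: {exc}")

    def load(self, job, build):
        return self.primary.load(job, build)


conn = sqlite3.connect(":memory:")
storage = ExportingStorage(LocalStorage(), SqlExporter(conn))
storage.publish("demo", 1, [("a", "PASSED"), ("b", "FAILED")])
```

With this shape, external tools can query the SQL copy while the UI keeps using the storage it already has.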
