http-netty: let RetryingHttpRequesterFilter return responses on failure #3048

Open · wants to merge 16 commits into base: main

Conversation

bryce-anderson (Contributor)

Motivation:

Sometimes people just want the last failed response when the retry
loop ends. However, right now we only yield the exceptions that were
created. Users can't smuggle the response themselves in a generic way via the
HttpResponseException because it could lead to resource leaks.

Modifications:

Let users simply return the last failed response when the retry loop
exits unsuccessfully.
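
A rough usage sketch of what this enables (builder and method names follow the snippets in this PR, assuming the usual ServiceTalk imports; the final flag name, e.g. returnFailedResponses vs. returnOriginalResponses, is still under discussion, so treat this as illustrative only):

// Hypothetical configuration: map 5xx responses to HttpResponseException, retry them a
// few times, and surface the last failed response instead of the exception once the
// retry loop gives up.
RetryingHttpRequesterFilter retryFilter = new RetryingHttpRequesterFilter.Builder()
        .responseMapper(meta -> meta.status().code() >= 500
                ? new HttpResponseException("Retryable server error", meta) : null)
        .retryResponses((requestMetaData, throwable) -> BackOffPolicy.ofImmediate(3))
        .returnFailedResponses(true) // name under discussion in this PR
        .build();

// With the flag enabled, an exhausted retry loop resolves with the last failed response
// (e.g. a 503) instead of failing with an HttpResponseException.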

Comment on lines 798 to 801
public Builder returnFailedResponses(final boolean returnFailedResponses) {
this.returnFailedResponses = returnFailedResponses;
return this;
}
Contributor Author
I'm certain this can have a better name and clearly it needs docs before merging. Name suggestions welcome.

Contributor Author

I also think this API is a bit awkward: first you must turn a response into an HttpResponseException and then it's going to be discarded. Alternatively, we could just have a different lambda along the lines of Function<HttpResponseMetaData, Boolean> shouldRetry.

Member

Right now we don't have RS operators to achieve retries without mapping into exceptions. If we go the route of cleanly retrying on response meta-data without mapping to exceptions, it's possible but will take longer.

The current rationale was that some users want to always map responses to exceptions; that's why we have an independent responseMapper. Then some users may want to retry those, so there is a second method for them, retryResponses. We decided to put them next to each other on the same builder instead of offering two different filters because they are often used together.

I agree that having a third method that works only if the other two are also configured is not intuitive. Alternatively, we can consider adding a retryResponses overload that takes a boolean to decide whether it needs to unwrap the original response or not.
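
A sketch of what such an overload could look like (hypothetical; the parameter name returnOriginalResponses and the Builder field names are placeholders, not part of this PR as written):

// Hypothetical Builder overload: keeps the unwrapping decision co-located with
// retryResponses instead of introducing a third, standalone setter.
public Builder retryResponses(
        final BiFunction<HttpRequestMetaData, HttpResponseException, BackOffPolicy> mapper,
        final boolean returnOriginalResponses) {
    this.retryResponsesMapper = requireNonNull(mapper);      // assumed field name
    this.returnOriginalResponses = returnOriginalResponses;  // assumed field name
    return this;
}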

Contributor

I like the idea of the boolean overload, which would signal that it needs to be configured "together". Alternatively, when building we should at least check whether this value is set to true while the others are still in their default state, and reject the config?
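
A minimal sketch of that build-time guard (hypothetical; assumes the Builder keeps the configured values in fields with these names, and buildFilter() is a stand-in for the existing construction logic):

// Hypothetical validation in Builder.build(): reject a configuration where failed
// responses were requested but nothing maps responses to exceptions, since the flag
// would otherwise silently have no effect.
public RetryingHttpRequesterFilter build() {
    if (returnFailedResponses && responseMapper == null) {
        throw new IllegalStateException(
                "returnFailedResponses(true) requires responseMapper(...) to be configured");
    }
    return buildFilter(); // stand-in for the existing construction logic
}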

bryce-anderson (Contributor Author)

A risk of this PR is that it's very difficult to know for sure that the deferred response is properly drained since it gets smuggled through the error channel.


bryce-anderson force-pushed the bl_anderson/RetryingHttpRequesterCanReturnRepsonses branch from c751f7c to 840fab0 on October 8, 2024 20:35
bryce-anderson force-pushed the bl_anderson/RetryingHttpRequesterCanReturnRepsonses branch from 840fab0 to e80e98e on October 8, 2024 20:47
bryce-anderson (Contributor Author)

@idelpivnitskiy, with the additional constraint of not returning the body this got dramatically simpler, but I'm not certain that having an empty response body is what we wanted.

bryce-anderson marked this pull request as ready for review October 11, 2024 15:29
bryce-anderson requested a review from daschl October 11, 2024 15:29
bryce-anderson marked this pull request as draft October 21, 2024 21:28
bryce-anderson marked this pull request as ready for review November 4, 2024 22:21
result = result.onErrorMap(backoffError -> ThrowableUtils.addSuppressed(t, backoffError))
// If we get cancelled we also need to drain the message body as there is no guarantee
// we'll ever receive a completion event, error or success.
.beforeCancel(() -> drain(response).subscribe())
Contributor

does that retry draining collide/overlap with the draining @idelpivnitskiy added in the other PR?

Contributor Author

I don't think so. That leak should have been self-contained: the problem was that we didn't drain a response that the redirect filter had decided to consume itself.


bryce-anderson requested a review from daschl December 3, 2024 22:08
daschl (Contributor) left a comment

LGTM overall, just some minor comments on docs

idelpivnitskiy (Member) left a comment

Sorry that it took so long to review. I like the approach, just some comments to make it stronger:

// Disable request retrying
.retryRetryableExceptions((requestMetaData, e) -> ofNoRetries())
// Retry only responses marked so
.retryResponses((requestMetaData, throwable) -> ofImmediate(maxTotalRetries - 1))
.retryResponses((requestMetaData, throwable) -> {
if (throwable instanceof HttpResponseException &&
Member

Is this an intermediate change that's no longer needed? I tried locally reverting to the original lambda and it worked.

assertThat("Unexpected exception.", e, instanceOf(HttpResponseException.class));
if (returnFailedResponses) {
HttpResponse response = normalClient.request(normalClient.get("/"));
assertThat(response.status(), is(HttpResponseStatus.OK));
Member

Consider enhancing the server response to also include a payload body, and asserting here that the payload is not drained.

@@ -798,9 +844,28 @@ public Builder maxTotalRetries(final int maxRetries) {
* @param mapper a {@link Function} that maps a {@link HttpResponseMetaData} to an
* {@link HttpResponseException} or returns {@code null} if there is no mapping for response meta-data. The
* mapper should return {@code null} if no retry is needed or if it cannot be determined that a retry is needed.
Member

Feels like this "The mapper should return" statement is here by mistake because this mapper is for mapping only; retries are handled by retryResponses. While you are here, could you please clean it up on both overloads?

* @return {@code this}
*/
public Builder responseMapper(final Function<HttpResponseMetaData, HttpResponseException> mapper,
final boolean returnFailedResponses) {
Member

To me it feels like this boolean belongs with retryResponses rather than responseMapper, because users may decide to map 4xx/5xx to exceptions even if they are not going to retry them. WDYT?

// If we succeed, we need to drain the response body before we continue. If we fail we want to
// surface the original exception and don't worry about draining since it will be returned to
// the user.
result = result.onErrorMap(backoffError -> ThrowableUtils.addSuppressed(t, backoffError))
Member

In what cases is it possible that backoffError != t? Do we really need this mapping?

// the user.
result = result.onErrorMap(backoffError -> ThrowableUtils.addSuppressed(t, backoffError))
// If we get cancelled we also need to drain the message body as there is no guarantee
// we'll ever receive a completion event, error or success.
Member

Thanks for the clarification! Consider adding: "and it's legit to do that because the subscriber is no longer interested in the response".

} else if (LOGGER.isDebugEnabled()) {
if (!(t instanceof HttpResponseException)) {
LOGGER.debug("Couldn't unpack response due to unexpected dynamic types. Required " +
"exception of type HttpResponseException, found {}", t.getClass());
Member

Do we need to log in case t is not HttpResponseException? Any other exception type will fall under this condition and produce logging. Users may come to us worried about it.

+1 for logging when the exception type is HttpResponseException but metaData is not StreamingHttpResponse. It might be even better to increase this to info/warn and add an assertion.
Should we log here or inside single.onErrorResume(HttpResponseException.class, ...) or in both places?
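
A sketch of that tightened condition (hypothetical; the log wording is an assumption, and whether an assertion belongs here is the open question above):

// Hypothetical reshaping: only log (and assert) when we got an HttpResponseException
// whose meta-data is unexpectedly not the original StreamingHttpResponse; any other
// exception type passes through without producing log noise.
if (t instanceof HttpResponseException &&
        !(((HttpResponseException) t).metaData() instanceof StreamingHttpResponse)) {
    LOGGER.warn("Couldn't unpack the original response: expected HttpResponseException " +
            "meta-data to be a StreamingHttpResponse, found {}",
            ((HttpResponseException) t).metaData().getClass());
    assert false : "HttpResponseException did not carry the original StreamingHttpResponse";
}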

* @param mapper a {@link Function} that maps a {@link HttpResponseMetaData} to an
* {@link HttpResponseException} or returns {@code null} if there is no mapping for response meta-data. The
* mapper should return {@code null} if no retry is needed or if it cannot be determined that a retry is needed.
* @param returnFailedResponses whether to unwrap the response defined by the {@link HttpResponseException}
Member

Wdyt about naming it returnOriginalResponses?

@@ -258,19 +291,31 @@ protected Single<StreamingHttpResponse> request(final StreamingHttpRequester del
if (responseMapper != null) {
single = single.flatMap(resp -> {
final HttpResponseException exception = responseMapper.apply(resp);
Member

Unrelated to your changes, but a find: if user-defined responseMapper throws, we leak resp 😢
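
A sketch of a defensive fix (hypothetical; drain(...) refers to the helper already used elsewhere in this filter, and the success path below is a simplified placeholder for the filter's existing behaviour):

// Hypothetical guard around the user-supplied responseMapper: if it throws, drain the
// response payload before surfacing the error so the pending body is not leaked.
single = single.flatMap(resp -> {
    final HttpResponseException exception;
    try {
        exception = responseMapper.apply(resp);
    } catch (Throwable mapperError) {
        // Drain the payload first, then fail with the mapper's error.
        return drain(resp).concat(Single.<StreamingHttpResponse>failed(mapperError));
    }
    // Simplified placeholder: null means no mapping, otherwise the exception feeds the
    // retry logic (no proactive drain when the response may be returned to the user).
    return exception == null ? Single.succeeded(resp)
            : Single.<StreamingHttpResponse>failed(exception);
});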

@@ -109,13 +117,14 @@ public final class RetryingHttpRequesterFilter

RetryingHttpRequesterFilter(
final boolean waitForLb, final boolean ignoreSdErrors, final boolean mayReplayRequestPayload,
final int maxTotalRetries,
final boolean returnFailedResponses, final int maxTotalRetries,
@Nullable final Function<HttpResponseMetaData, HttpResponseException> responseMapper,
final BiFunction<HttpRequestMetaData, Throwable, BackOffPolicy> retryFor,
idelpivnitskiy (Member) commented Dec 20, 2024

It seems like we may leak a "pending response" if any user-defined function under retryFor throws. Scenario:

  1. Server responds with 503
  2. User maps it to HttpResponseException
  3. returnFailedResponses == true, we don't drain proactively
  4. OuterRetryStrategy.apply is invoked
  5. retryFor throws
  6. We won't invoke applyRetryCallbacks and won't drain that response.

Consider adding a try-catch inside apply to make sure we drain the response when t is an instance of HttpResponseException, in case of any unexpected exception.

Adding a test will be highly appreciated.
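
A minimal sketch of that guard (hypothetical; the strategy shape, the retryStrategy delegate, and the drain(...) helper are assumptions based on the snippets in this PR):

// Hypothetical try-catch inside the outer retry strategy: if the user-supplied retryFor
// throws, drain the response smuggled inside an HttpResponseException before propagating
// the unexpected error, so the pending payload is not leaked.
final BiIntFunction<Throwable, Completable> guardedStrategy = (count, t) -> {
    try {
        return retryStrategy.apply(count, t); // existing strategy that consults retryFor
    } catch (Throwable unexpected) {
        Completable failure = Completable.failed(unexpected);
        if (t instanceof HttpResponseException &&
                ((HttpResponseException) t).metaData() instanceof StreamingHttpResponse) {
            // Drain the pending payload first, then surface the unexpected error.
            failure = drain((StreamingHttpResponse) ((HttpResponseException) t).metaData())
                    .concat(failure);
        }
        return failure;
    }
};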
