Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: ignore HTTP response errors by status code #784

Merged
merged 6 commits into from
Sep 30, 2024

Conversation

ctreatma
Copy link
Contributor

@ctreatma ctreatma commented Sep 25, 2024

This was broken out of #782 since it involves a widely-used helper function and will impact more than just the Metal VLAN resource.

Previously, the IgnoreHttpResponseErrors helper function accepted as arguments a number of other, existing helper functions that are meant to evaluate whether a particular API response has a certain HTTP status. Here's an example helper function for checking if the response is a 403:

func HttpForbidden(resp *http.Response, err error) bool {
	if resp != nil && (resp.StatusCode != http.StatusForbidden) {
		return false
	}

	switch err := err.(type) {
	case *ErrorResponse, *packngo.ErrorResponse:
		return IsForbidden(err)
	}

	return false
}

This function immediately returns false if the response status code is not 403; otherwise it tries to convert the error argument to an ErrorResponse or a packngo.ErrorResponse and then use a different helper function to determine if that error represents a 403. This assumes that the error object contains a copy of the response, which is only certain to be the case for code that uses packngo. In practice we almost never did the necessary conversion to ErrorResponse for non-packngo resources and data sources, so in many cases we were probably not successfully ignoring the desired response codes.

This refactors IgnoreHttpResponseErrors to inspect HTTP status codes directly in the response instead of relying on helper functions for specific custom error types.

This also removes the FriendlyErrorForMetalGo function which only existed in order to convert a separate HTTP response and error into an ErrorResponse, which is no longer necessary. Some usage of FriendlyError is also removed in cases where that function had no or minimal effect.

@displague
Copy link
Member

This was an extra flaky CI AccTest run, mostly around network resources but also devices that timed out during provisioning and projects not being cleaned up as a result.

Triggering a retest.


(out of pr scope)
We could avoid these in the future with a test fuzzing (ignore attribute changes within a few seconds).

=== NAME  TestAccMetalDevice_importBasic
    resource_test.go:591: Step 2/2 error running import: ImportStateVerify attributes not equivalent. Difference is shown below. The - symbol indicates attributes missing after import.
        
          map[string]string{
        - 	"updated": "2024-09-25T21:58:29Z",
        + 	"updated": "2024-09-25T21:58:30Z",
          }

@displague
Copy link
Member

Looking at the failures again, they seem relevant to the change:

=== NAME  TestAccMetalGateway_privateIPv4
    resource_test.go:15: Error running post-test destroy, there may be dangling resources: exit status 1
        
        Error: Failed to delete Metal Gateway 00448d03-8c3c-446c-86b2-d8082f0a7c3d
        
        API Error HTTP 404 404 Not Found Not found
--- FAIL: TestAccMetalGateway_privateIPv4 (22.67s)

404s should be considered successful deletes. It wasn't picked up. This happens for a few different resource types.

A device_timeout test fails with a 403, which may be exactly what we would have expected.

@ctreatma
Copy link
Contributor Author

ctreatma commented Sep 26, 2024

404s should be considered successful deletes. It wasn't picked up. This happens for a few different resource types.

This is expected at the moment. The packngo-based error filtering was tied in to state waiters as well, which necessitated converting equinix-sdk-go responses to FriendlyErrors in order to ensure the correct flow through later status checks (and this conversion wasn't done consistently, resulting in some incorrect behavior depending on how quickly things are deleted).

Conversion of error filtering for impacted resources should be complete as of now, and this PR is ready for review.

Copy link
Contributor

@thogarty thogarty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All good from Fabric standpoint. Thanks @ctreatma !

// deletedMarker is a terraform-provider-only value that is used by the waiter
// to indicate that the gateway appears to be deleted successfully based on
// status code
deletedMarker := "tf-marker-for-deleted-gateway"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@ocobles ocobles left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

@ctreatma ctreatma merged commit 33aff72 into main Sep 30, 2024
10 of 13 checks passed
@ctreatma ctreatma deleted the http-check-error-codes branch September 30, 2024 19:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants