-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support running e2e tests on AWS #45
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great work @eaudetcobello, added couple of comments
we should probably get #22 in before this.
@@ -92,14 +92,14 @@ KUSTOMIZE := $(TOOLS_BIN_DIR)/$(KUSTOMIZE_BIN)-$(KUSTOMIZE_VER) | |||
# Ginkgo | |||
TEST_DIR := $(shell pwd)/test | |||
ARTIFACTS ?= $(shell pwd)/_artifacts | |||
GINKGO_FOCUS ?= | |||
GINKGO_FOCUS ?= Workload cluster creation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is a left over from testing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is temporary. I am leaving it like this for now because this test passes on all infras and it's easier to iterate with one test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
by temporary do we mean that it's gonna get removed before this PR is merged? or are we planning to change this in the up coming PRs? if it's the latter, maybe we can add a note or todo.
also I think this will change the scope for both the docker and aws tests, right? I think we only need this for the aws test.
# when bootstrapping with tilt, it will use | ||
# the defaultProviderVersion in https://github.com/kubernetes-sigs/cluster-api/blob/main/hack/tools/internal/tilt-prepare/main.go as | ||
# default version for docker infrastructure provider | ||
# name here should match defaultProviderVersion |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks like it does not belong here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It belongs, but I will clean up the message.
c0609c4
to
9c3c084
Compare
b4b0cb7
to
a4f07d2
Compare
- Change service CIDR block to match our standard range - Create a bastion by default - Copy cluster bootstrap/join overrides to control-plane - Remove unneeded prerunCommands - Remove tolerations that don't apply to our setup (kubernetes.io/master) - Add args from default provider template to aws-cloud-controller-manager args - Restore some RBAC rules from default provider template
no hardcoded defaults, empty variables by default, use new nodeName field to specify node name
support tilt with tilt-provider.json
ad3a016
to
23f5a94
Compare
190c2a5
to
54dddf0
Compare
54dddf0
to
3320f66
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really nice work! Thanks a lot @eaudetcobello! did an initial pass and left some comments.
.github/workflows/e2e-deleteme.yaml
Outdated
strategy: | ||
matrix: | ||
ginkgo_focus: | ||
#- "KCP remediation" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we plan to uncomment these? or are these commented by mistake? maybe we can add a note or todo if this is intentional
.github/workflows/e2e-deleteme.yaml
Outdated
- name: Install requirements | ||
run: | | ||
sudo apt install make | ||
sudo apt install wget |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we can install everything in a single line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit confused about the name of this file, I guess the contents of this file was supposed to be carried over to the e2e.yaml
file or something
@@ -2,6 +2,8 @@ name: E2E Tests | |||
|
|||
on: | |||
pull_request: | |||
branches: | |||
- does-not-exist |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and I guess after moving the content of deleteme
file to this file, we should remove this line
@@ -92,14 +92,14 @@ KUSTOMIZE := $(TOOLS_BIN_DIR)/$(KUSTOMIZE_BIN)-$(KUSTOMIZE_VER) | |||
# Ginkgo | |||
TEST_DIR := $(shell pwd)/test | |||
ARTIFACTS ?= $(shell pwd)/_artifacts | |||
GINKGO_FOCUS ?= | |||
GINKGO_FOCUS ?= Workload cluster creation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
by temporary do we mean that it's gonna get removed before this PR is merged? or are we planning to change this in the up coming PRs? if it's the latter, maybe we can add a note or todo.
also I think this will change the scope for both the docker and aws tests, right? I think we only need this for the aws test.
@@ -0,0 +1,96 @@ | |||
regions: | |||
- us-east-2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we can extract this from this file and also the workflow file. I think this might cause subtle problems if we duplicate it.
.github/workflows/e2e-deleteme.yaml
Outdated
sudo sysctl fs.inotify.max_user_instances=8192 | ||
- name: Run e2e tests | ||
run: | | ||
sudo -E ./hack/ci-e2e-tests.sh true aws v0.1.2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't understand this true
here, would elaborate a bit? I see ${0}
, ${2}
and ${3}
in the script, this should be ${1}
, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
really nice work putting up this script together. it should have been really challenging. but I strongly feel like we need to change this to a (maybe) python script cause it's too complicated for a bash script. not for this task but in the near future.
b9a596d
to
8aeee09
Compare
ci-e2e-tests.sh → my goal is to make this script a one-step solution to run the e2e tests, it launches a lxc container on the local machine and all the necessary tools are installed into it, and then tests are run within. It requires AWS credential configuration variables to be set on the local machine.
e2e-deleteme.yaml → temporary file I created to not interefere with the existing workflows that are defined in e2e.yaml. End goal is that whatever is in e2e-deleteme.yaml should be merged with e2e.yaml and a test matrix should be used that includes each architecture.
aws-nuke-config.yaml → file is still a WIP. It’s the config. to use with aws-nuke. The goal is for this config to cleanup all resources that have a certain tag. I’m still evaluating if this is the best way to go about things. Because of the way aws-nuke is built, we need to pass an exact tag, which means we can’t use a glob to include all capi resources (like tag:sigs.k8s.io/cluster-api/cluster/*). We need to say specifically say tag:sigs.k8s.io/cluster-api/cluster/capick8s-test-2djn3. One way to circumvent this would be to grep the cluster name from the output of the e2e test logs, then use sed to replace the tag in the aws-nuke configuration file.
install-aws-nuke.sh, install-clusterawsadm.sh, install-clusterctl.sh → install these tools into the lxc container's /usr/local/bin
write-provider-config.sh → this writes the CAPI provider configuration to the clusterctl.yaml config. file. We need this because ck8s is not a part of the default providers, so we need to tell clusterctl where to find our provider.
cluster_upgrade_test.go → Here we replace the hardcoded reference to docker infrastructure and instead tell the test suite to resolve `. This is then read from the config. file of the test (I think), which lives at test/config/ck8s-{docker,aws}.yaml
create_test.go → same as above.
test/e2e/data/infrastructure-aws folder → this folder contains the templates that will be applied to the cluster during the test. cluster-template.yaml is the one I have tested “Workload cluster creation” test with. The remediation.yaml templates are pretty much the same as the cluster-template.yaml except that they add a MachineHealthCheck and labels under MachineDeployment.
test/e2e/shared/v1beta1 → this defines the major and minor contract version of the core cluster api components that we use for the tests. (See ck8s-aws.yaml line 25)
test/e2e/shared/v1beta1_aws → this defines the major and minor contract version of the CAPA resources we deploy. (See ck8s-aws.yaml Line 38)
kcp_remediation_test.go, md_remediation_test.go, node_scale_test.go → only changed the hardcoded docker provider to clusterctl.DefaultInfrastructureProvider