
Add compute-starter-kit dt #60

Open
wants to merge 2 commits into
base: main


SeanMooney
Contributor

This change adds a readme describing the compute starter kit
DT and job variants. This DT may be graduated to a VA in the future
and will be used as the basis of the compute content promotion jobs
prior to integration of promoted content.

@SeanMooney SeanMooney force-pushed the compute-starter-kit branch 2 times, most recently from a778b56 to b1ca096 Compare December 22, 2023 17:14
@SeanMooney SeanMooney force-pushed the compute-starter-kit branch 3 times, most recently from dd000d2 to 9cf89d7 Compare January 9, 2024 03:20
@SeanMooney
Contributor Author

@fultonj I have not actually tested this beyond rendering the CRs with kustomize locally, but this is a port of the old nova compute kit CI topology I wrote last year, overlaid on the lib functions you introduced.

https://github.com/openstack-k8s-operators/nova-operator/tree/main/ci/nova-operator-compute-kit

One slight annoyance is that ideally we would start with a control plane with all services disabled and then enable only the ones needed.
To work around that, I have used patches to remove all services that are not needed from the default template provided in the controlplane lib, and then modified the remaining services as needed.
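The "disable everything, then enable only what is needed" idea can be illustrated with a minimal Python sketch. This is not how the patches in this PR work (they are kustomize patches against the controlplane lib template); the dictionary layout, the `enabled` key and the `keep_only` helper are assumptions made purely for illustration.

```python
def keep_only(template_spec: dict, needed: set) -> dict:
    """Return a copy of the spec in which only the `needed` services are enabled.

    `template_spec` is a hypothetical mapping of service name to settings;
    the real control plane CR may be laid out differently.
    """
    result = {}
    for service, settings in template_spec.items():
        updated = dict(settings)  # copy so the input template is not mutated
        updated["enabled"] = service in needed
        result[service] = updated
    return result


if __name__ == "__main__":
    # Illustrative template only; the service list is made up for the example.
    template = {
        "keystone": {"enabled": True},
        "nova": {"enabled": True},
        "cinder": {"enabled": True},
    }
    print(keep_only(template, {"keystone", "nova"}))
```

Starting from an allow list keeps the set of deployed services explicit, which is the behaviour the comment above wishes the default template offered.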

I have a couple of open questions with the current version, mainly around how to correctly expose the REST APIs as routes.

And then obviously I need to extend this to have 2 EDPM nodes in cell 1 before it is actually complete.

Contributor

@fultonj fultonj left a comment

Looks promising. This is just a first pass. I'll do a follow up review.

This is a collection of CR templates that represent a openstack deployment
topology that has the following characteristics:

- Single noe OpenShift cluster (CRC)
Contributor

node


All stages must be executed in the order listed below. Everything is required unless otherwise indicated.

1. [Install the OpenStack K8S operators and their dependencies]../(../../common/)
Contributor

Please correct the markdown link syntax so it renders correctly.

Install the OpenStack K8S operators and their dependencies

```
oc project openstack
```
Change to the hci directory
Contributor

@jamepark4 jamepark4 Jan 10, 2024

Wouldn't this be Change to the compute-starter-kit/edpm directory?

Contributor Author

Yep.
This file is mostly just a copy-paste right now, but I'll correct this when I get to the dataplane part.

This DT will have at least the following phases:

  • deploy OpenStack with nova and cell0
  • deploy cell1
      • first deploy the controlplane part of cell1 and then wait for it to be ready
      • then deploy the dataplane
  • finally perform post-install config (flavor/image creation) and run tests

metallb.universe.tf/loadBalancerIPs: 172.17.0.86

lbServiceType: LoadBalancer
storageClass: crc-csi-hostpath-provisioner
Contributor

Can we define additional nova compute configuration and then use kustomize to copy that into the DataPlaneService/nova? E.g. enabling CPU pinning for the second scenario.

Contributor Author

Probably yes; I have not got to the EDPM part yet.

@SeanMooney
Contributor Author

This is just a rebase to take account of the changes in the repo, and I fixed 2 issues:
rabbit now requests enough memory to avoid being OOM killed, and I set the database user names for nova.

With this, nova and placement deploy properly.

Glance is failing with some pod creation issue related to the storage network; we will have to figure that out. My guess is that the network config in this DT is not compatible with CRC currently and needs to be adjusted, but I have not really had time to figure out how that is working.

prefix-length: 24
iface: enp7s0.21
vlan: 21
base_iface: enp7s0
Contributor Author

My best guess right now is that this was broken because I had not run `make crc_attach_default_interface` before I applied this, so this probably failed, and that is why the glance dbsync pod was failing to spawn.

`base_iface: enp7s0` probably did not exist, so this network probably didn't get created properly:

error configuring pod [openstack/glance-db-sync-9wz75] networking: [openstack/glance-db-sync-9wz75/165e4d20-d6d9-4edc-873e-8a5fa269fe9d:storage]: error adding container to network "storage": Link not found

https://termbin.com/xqbn
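A "Link not found" failure like the one above could be caught before applying the manifests with a small consistency check. The following is a minimal Python sketch, assuming a network entry shaped like the `iface` / `vlan` / `base_iface` keys shown in the diff; the helper names and the idea of passing in the node's interface names by hand are illustrative assumptions, not part of this PR.

```python
from __future__ import annotations


def split_vlan_iface(iface: str) -> tuple[str, int]:
    """Split a name such as 'enp7s0.21' into ('enp7s0', 21)."""
    base, sep, vlan = iface.rpartition(".")
    if not sep or not base or not vlan.isdigit():
        raise ValueError(f"{iface!r} is not of the form <base>.<vlan>")
    return base, int(vlan)


def check_network(entry: dict, node_ifaces: set[str]) -> list[str]:
    """Return a list of problems found in one network entry (empty if none)."""
    problems = []
    base, vlan = split_vlan_iface(entry["iface"])
    if base != entry["base_iface"]:
        problems.append(f"iface {entry['iface']} does not sit on {entry['base_iface']}")
    if vlan != entry["vlan"]:
        problems.append(f"iface {entry['iface']} does not match vlan {entry['vlan']}")
    if entry["base_iface"] not in node_ifaces:
        problems.append(f"base interface {entry['base_iface']} not present on the node")
    return problems


if __name__ == "__main__":
    storage = {"iface": "enp7s0.21", "vlan": 21, "base_iface": "enp7s0"}
    # Interface names as they might be reported on the node; supplied by hand here.
    print(check_network(storage, {"lo", "enp6s0"}))
```

The set of interface names has to come from the node itself (the CRC VM in this case), not from the machine running the check.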

Contributor Author

Yep, for CRC this is enp6s0.

data:
# nodes
node_0:
name: crc
Contributor Author

OK, this needs to be updated to match the name of the CRC VM.

For my current instance this should be

node_0:
  name: crc-n5gv4-master-0
  internalapi_ip: 172.17.0.5
  tenant_ip: 172.19.0.5
  ctlplane_ip: 192.168.122.19
  storage_ip: 172.18.0.5

That is why the network config was broken: it was not finding a node called crc.
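The mismatch described above (a `node_0` name that no cluster node carries) is easy to detect mechanically. Here is a minimal Python sketch, assuming the values data has already been loaded into a dictionary keyed by `node_N` and that the cluster node names are passed in by the caller (for example copied from `oc get nodes` output); the `unknown_nodes` helper is hypothetical.

```python
from __future__ import annotations


def unknown_nodes(values: dict, cluster_nodes: list[str]) -> list[str]:
    """Return node names from the values data that the cluster does not have."""
    known = set(cluster_nodes)
    return [
        node["name"]
        for key, node in sorted(values.items())
        if key.startswith("node_") and node["name"] not in known
    ]


if __name__ == "__main__":
    values = {"node_0": {"name": "crc"}}
    # Node name taken from the comment above; yours will differ per CRC instance.
    print(unknown_nodes(values, ["crc-n5gv4-master-0"]))
```

An empty result means every `node_N` entry refers to a node the cluster actually has.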

version: v1beta1
- path: control-plane/disable.yaml
- path: control-plane/infra.yaml
- path: control-plane/nova.yaml
Contributor Author

Glance and neutron seem to be defaulting to 0 replicas, so I just need to add a patch file for each and they should deploy.

@SeanMooney
Contributor Author

Just a very quick update: this now deploys on CRC correctly.

When using this locally you will have to update the node_0 name to match your CRC instance's hostname.

Steps to use this are:

1. crc start
2. cd /devscript
3. make crc_attach_default_interface
4. oc apply -k examples/common/olm/
5. oc apply -k examples/common/metallb/
6. oc apply -k examples/common/nmstate/
7. wait for those to complete
8. update values.yaml with the hostname of CRC and the IP for enp6s0 on the CRC instance
9. oc apply -k examples/dt/compute/compute-starter-kit/
10. wait for that to complete; at that point you should have an OpenStack control plane with keystone, nova, placement, neutron and glance

Nova will be deployed with cell 0 only.
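The steps above mix commands with manual waits and edits, so one way to keep the order explicit is to hold them as data and print a checklist. This is only a sketch: the commands are copied from the comment as written, while `STEPS`, the `run`/`manual` labels and `render_checklist` are made up for illustration, and nothing is executed.

```python
# Each step is (kind, text): "run" is a command to type, "manual" is a
# wait or an edit that the comment describes in prose.
STEPS = [
    ("run", "crc start"),
    ("run", "cd /devscript"),
    ("run", "make crc_attach_default_interface"),
    ("run", "oc apply -k examples/common/olm/"),
    ("run", "oc apply -k examples/common/metallb/"),
    ("run", "oc apply -k examples/common/nmstate/"),
    ("manual", "wait for those to complete"),
    ("manual", "update values.yaml with the hostname of CRC and the IP for enp6s0"),
    ("run", "oc apply -k examples/dt/compute/compute-starter-kit/"),
    ("manual", "wait for that to complete"),
]


def render_checklist(steps):
    """Render the steps as a numbered checklist without running anything."""
    lines = []
    for number, (kind, text) in enumerate(steps, start=1):
        prefix = "$ " if kind == "run" else "# "
        lines.append(f"{number}. {prefix}{text}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_checklist(STEPS))
```

The commands are deliberately not run from Python: a `cd` in a subprocess would not carry over to later steps, and the waits need a human (or a proper readiness check) in between.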

In a future version I'll add the steps for adding nova-cell1 and adding EDPM nodes to this.
Once that is complete, we can move on to the post-deploy config and running tempest.

For now I'm going to put this on hold for a week or two while I work on nova stuff.

@leifmadsen leifmadsen added the needs-info Information is requested of the reporter or reviewers label Apr 29, 2024
@leifmadsen
Contributor

Discussed this during triage, and we're wondering if this was a good idea for implementation, but maybe we can close this for now, and revisit in the future if the priority bubbles back up?

@SeanMooney
Contributor Author

To me this is required before we can move the CI framework jobs to all be based on the architecture repo.

So this is not a priority for beta or even GA, but I actually think
this should be elevated to a VA. I have just been too busy to work on this due to other work.

@leifmadsen
Contributor

> To me this is required before we can move the CI framework jobs to all be based on the architecture repo.
>
> So this is not a priority for beta or even GA, but I actually think this should be elevated to a VA. I have just been too busy to work on this due to other work.

Perhaps this should be moved to a Jira for tracking, scoping, and backlog prioritization rather than leaving this hanging open. If this is not a target for beta or GA, and is more tied to CIFMW integration, then I think it might be best to offline this and start with planning what needs to be done.

@SeanMooney
Contributor Author

Originally this was motivated by the original request for DTs, which was also communicated as applying to the component pipeline and GitHub jobs.

Later that request was removed, but the desire remained to have a minimal configuration for development and CI,
i.e. one that works with a single OpenShift worker node (CRC, MicroShift, SNO).

This stopped being a GA requirement when having DTs for all jobs, including upstream GitHub and downstream component pipelines, was removed a few months ago, which is why I deprioritised this.

This change adds a readme describing the compute starter kit
DT and job variants. This DT may be graduated to a VA in the future
and will be used as the basis of the compute content promotion jobs
prior to integration of promoted content.

This initial patch also includes a definition of the compute-kit
controlplane containing only keystone, placement, glance, neutron
and nova. It does not contain a dataplane definition;
that will be added in later patches.

This DT currently deploys nova with only cell 0;
cell 1 will be added when adding the dataplane.

openshift-ci bot commented Sep 5, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: SeanMooney
Once this PR has been reviewed and has the lgtm label, please assign raukadah for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Contributor

Merge Failed.

This change or one of its cross-repo dependencies was unable to be automatically merged with the current state of its repository. Please rebase the change and upload a new patchset.
Warning:
Error merging github.com/openstack-k8s-operators/architecture for 60,c960e66a69fd8942d28b19c20ba33e3c89965088


openshift-ci bot commented Sep 5, 2024

@SeanMooney: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/images | c960e66 | link | true | /test images |
| ci/prow/unit | c960e66 | link | true | /test unit |

Full PR test history. Your PR dashboard.


Labels
needs-info Information is requested of the reporter or reviewers needs-rebase

6 participants