ETCD: operator appends random string to endpoint #20

mkania-cisco · 2022-09-11T19:14:02Z

I'm using Metallb as a provider for LoadBalancer.

I've created two services:

mkania@linux-700-2:~$ kubectl get svc -n cnwan-msc-green
NAME                         TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)          AGE
svc-msc-simple-k8s-app-bar   LoadBalancer   10.111.59.63    70.70.70.129   8000:30273/TCP   174m
svc-msc-simple-k8s-app-foo   LoadBalancer   10.103.51.228   70.70.70.128   8000:31586/TCP   174m

which autopopulated endpoints:

mkania@linux-700-2:~$ kubectl get endpoints -n cnwan-msc-green
NAME                         ENDPOINTS                        AGE
svc-msc-simple-k8s-app-bar   10.0.1.56:8000,10.0.4.80:8000    176m
svc-msc-simple-k8s-app-foo   10.0.1.131:8000,10.0.2.32:8000   176m

however, when cn-reader received changes, it came up with an error:

7:00PM INF sending data...	| func=queue.senderWorkQueue.sendData length=1
7:00PM INF received response from the adaptor	| func=services.servicesHandler.logResponseError response="207 - INVALID RESOURCES: Some resources have not been processed successfully. List of failed resources is included." status-code=207
7:00PM WRN adaptor error occurred on resource	| error="Resource 'svc-msc-simple-k8s-app-foo-b7f9137dfd': 400 ENDPOINT NOT FOUND  Cannot process DELETE event: resource  IP 70.70.70.128 and port 8000 does not exist. Ignoring this event." func=services.servicesHandler.logResponseError status-code=207
7:00PM INF events sent successfully	| func=queue.senderWorkQueue.sendData length=1

looking into etcd I see different endpoint name:

/service-registry/namespaces/cnwan-msc-green/services/svc-msc-simple-k8s-app-foo
name: svc-msc-simple-k8s-app-foo
namespaceName: cnwan-msc-green
metadata:
    cnwan.io/traffic-profile: green
    owner: cnwan-operator

/service-registry/namespaces/cnwan-msc-green/services/svc-msc-simple-k8s-app-foo/endpoints/svc-msc-simple-k8s-app-foo-b7f9137dfd
name: svc-msc-simple-k8s-app-foo-b7f9137dfd
serviceName: svc-msc-simple-k8s-app-foo
namespaceName: cnwan-msc-green
metadata:
    owner: cnwan-operator
address: 70.70.70.128
port: 8000
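The key layout shown above can be taken apart mechanically, which helps when comparing etcd contents with the `kubectl` output. The following is a minimal sketch, assuming the layout `/service-registry/namespaces/<namespace>/services/<service>[/endpoints/<endpoint>]` seen in this dump; `RegistryKey` and `parse_registry_key` are hypothetical names for illustration, not part of the operator or the reader.

```python
from typing import NamedTuple, Optional


class RegistryKey(NamedTuple):
    namespace: str
    service: str
    endpoint: Optional[str]  # None when the key points at a service


def parse_registry_key(key: str, prefix: str = "/service-registry/") -> RegistryKey:
    """Split a service-registry etcd key into namespace, service and endpoint."""
    if not key.startswith(prefix):
        raise ValueError(f"key does not start with {prefix!r}: {key!r}")
    parts = key[len(prefix):].strip("/").split("/")
    # Service key: namespaces/<ns>/services/<svc>
    if len(parts) == 4 and parts[0] == "namespaces" and parts[2] == "services":
        return RegistryKey(parts[1], parts[3], None)
    # Endpoint key: namespaces/<ns>/services/<svc>/endpoints/<ep>
    if (
        len(parts) == 6
        and parts[0] == "namespaces"
        and parts[2] == "services"
        and parts[4] == "endpoints"
    ):
        return RegistryKey(parts[1], parts[3], parts[5])
    raise ValueError(f"unrecognised key layout: {key!r}")
```

Note that the endpoint name carries the Service name plus a suffix, so it is not expected to match the name shown by `kubectl get endpoints`.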

I deleted these endpoints manually and re-created them to match the names shown by kubectl, but they still do not propagate to vManage...


asimpleidea · 2022-09-12T06:11:54Z

Hi mkania,

Thank you for posting this. This is expected behavior: the appended string is not random but derived from sha256(address+":"+port), and it is there to prevent name collisions between endpoints of the same Service.
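To illustrate the naming scheme just described, here is a minimal sketch that hashes `address:port` with SHA-256 and appends a shortened hex digest to the Service name. The 10-character truncation and the UTF-8 encoding are assumptions inferred from names such as `svc-msc-simple-k8s-app-foo-b7f9137dfd`; the operator's actual code may differ, so this is not guaranteed to reproduce the exact suffixes seen in etcd. `endpoint_name` is a hypothetical helper.

```python
import hashlib

# Assumption: suffix length inferred from the endpoint names in this thread.
SUFFIX_LENGTH = 10


def endpoint_name(service_name: str, address: str, port: int,
                  suffix_length: int = SUFFIX_LENGTH) -> str:
    """Build a deterministic endpoint name from the Service name and address:port."""
    digest = hashlib.sha256(f"{address}:{port}".encode("utf-8")).hexdigest()
    return f"{service_name}-{digest[:suffix_length]}"
```

Because the suffix depends only on the address and port, the same endpoint always gets the same name, while two endpoints of one Service with different addresses or ports get different names.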

The error that occurred ("Resource 'svc-msc-simple-k8s-app-foo-b7f9137dfd': 400 ENDPOINT NOT FOUND Cannot process DELETE event: resource IP 70.70.70.128 and port 8000 does not exist. Ignoring this event.") means that the cnwan-reader was not able to reach the cnwan-adaptor.

Is the adaptor running properly and reachable?

mkania-cisco · 2022-09-12T07:33:39Z

Thanks @asimpleidea for quick response!

I think it should be reachable, it does not complain on startup (I tried with both docker IP and host):

root@linux-700-2:/home/mkania# docker run  \
>               --name reader \
>               --rm \
>               cnwan/cnwan-reader:v0.8.0 watch etcd \
>               --metadata-keys cnwan.io/traffic-profile \
>               --adaptor-api http://172.17.0.3:8080/cnwan \
>               --endpoints 70.70.72.2:3379 \
>               --prefix /service-registry/ \
>               --interval 5
7:28AM INF getting current state of service registry from etcd...
7:28AM INF watching for changes...
7:28AM INF /service-registry/
7:28AM INF sending data...	| func=queue.senderWorkQueue.sendData length=2
7:28AM INF received response from the adaptor	| func=services.servicesHandler.logResponseError response=<> status-code=204
7:28AM INF events sent successfully	| func=queue.senderWorkQueue.sendData length=2
7:30AM INF detected deleted endpoint key=namespaces/cnwan-msc-green/services/svc-msc-simple-k8s-app-foo/endpoints/svc-msc-simple-k8s-app-foo-b7f9137dfd
getting before the delete
7:30AM INF sending data...	| func=queue.senderWorkQueue.sendData length=1
7:30AM INF detected deleted endpoint key=namespaces/cnwan-msc-green/services/svc-msc-simple-k8s-app-bar/endpoints/svc-msc-simple-k8s-app-bar-5216f6163b
getting before the delete
7:30AM INF received response from the adaptor	| func=services.servicesHandler.logResponseError response="207 - INVALID RESOURCES: Some resources have not been processed successfully. List of failed resources is included." status-code=207
7:30AM WRN adaptor error occurred on resource	| error="Resource 'svc-msc-simple-k8s-app-foo-b7f9137dfd': 400 ENDPOINT NOT FOUND  Cannot process DELETE event: resource  IP 70.70.70.128 and port 8000 does not exist. Ignoring this event." func=services.servicesHandler.logResponseError status-code=207

When I change the adaptor URL to one that is not supposed to work, I get an error at startup:

root@linux-700-2:/home/mkania# docker run  \
>               --name reader \
>               --rm \
>               cnwan/cnwan-reader:v0.8.0 watch etcd \
>               --metadata-keys cnwan.io/traffic-profile \
>               --adaptor-api http://172.17.0.3:1234/cnwan \
>               --endpoints 70.70.72.2:3379 \
>               --prefix /service-registry/ \
>               --interval 5
7:31AM INF getting current state of service registry from etcd...
7:31AM INF sending data...	| func=queue.senderWorkQueue.sendData length=2
7:31AM ERR error while getting response	| error="Post \"http://172.17.0.3:1234/cnwan/events\": dial tcp 172.17.0.3:1234: connect: connection refused" func=services.servicesHandler.Send
7:31AM INF watching for changes...
7:31AM INF /service-registry/
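The two runs above differ only at the TCP level: the wrong port is refused outright, while the right one accepts the connection and answers over HTTP. A minimal sketch of that first check, assuming only that the value passed to `--adaptor-api` is an ordinary HTTP URL; `adaptor_port_open` is a hypothetical helper and does not validate anything beyond the TCP handshake.

```python
import socket
from urllib.parse import urlsplit


def adaptor_port_open(adaptor_api: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parts = urlsplit(adaptor_api)
    # Fall back to the scheme's default port when the URL does not name one.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        with socket.create_connection((parts.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and name-resolution failures.
        return False
```

A `True` result only shows that something is listening. A 207 response with a 400 per-resource error, as in the first run, already implies the adaptor was reached and replied.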

Is there any way to enable any debugging on adaptor side?

asimpleidea · 2022-09-12T07:52:17Z

Judging by the error, it looks like the adaptor could not find endpoint svc-msc-simple-k8s-app-bar-5216f6163b in any policies.

This is an error that the reader is forwarding from the adaptor. Before moving this issue to the adaptor's repo, could you repeat the same steps with --debug added to the reader's command and post the output again, please?

I want to make sure that the reader is sending all events appropriately before moving on investigating the adaptor. Thanks!

mkania-cisco · 2022-09-12T08:22:22Z

root@linux-700-2:/home/mkania# docker run  \
>               --name reader \
>               --rm \
>               cnwan/cnwan-reader:v0.8.0 watch etcd \
>               --metadata-keys cnwan.io/traffic-profile \
>               --adaptor-api http://172.17.0.3:8080/cnwan \
>               --endpoints 70.70.72.2:3379 \
>               --prefix /service-registry/ \
>               --interval 5 \
>               --debug
8:02AM INF getting current state of service registry from etcd...
8:02AM INF watching for changes...
8:02AM INF /service-registry/
8:02AM INF sending data...	| func=queue.senderWorkQueue.sendData length=2
8:02AM INF received response from the adaptor	| func=services.servicesHandler.logResponseError response=<> status-code=204
8:02AM INF events sent successfully	| func=queue.senderWorkQueue.sendData length=2

The --debug flag does not seem to change the logging level.

asimpleidea · 2022-09-12T08:27:23Z

Transferring this to the adaptor's repo.

Could you post here any logs you see from the adaptor?

arnatal · 2022-09-12T08:34:48Z

Hi @mkania-cisco, thanks for posting this! Which version of vManage are you running? Note that the Adaptor has been tested mainly with vManage version 20.3.1 (recommended) and 19.2.1. Thanks!

mkania-cisco · 2022-09-12T08:41:15Z

Hey @arnatal

> Hi @mkania-cisco, thanks for posting this! Which version of vManage are you running? Note that the Adaptor has been tested mainly with vManage version 20.3.1 (recommended) and 19.2.1. Thanks!

Unfortunately a pretty recent one -- 20.7.1.1.

> Transferring this to the adaptor's repo.
>
> Could you post here any logs you see from the adaptor?

root@linux-700-2:/home/mkania# docker logs adaptor
 * Serving Flask app '__main__' (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off

not much in default logging.

mkania-cisco · 2022-09-12T08:48:17Z

Enabling debugging for flask does not help:

docker run -d \
            -p 80:8080 \
            --rm \
            --name adaptor \
            -e SDWAN_IP=*** \
            -e SDWAN_USERNAME=*** \
            -e SDWAN_PASSWORD=*** \
            -e MERGE_POLICY=cnwan_merge \
            -e FLASK_ENV=development \
            ghcr.io/cloudnativesdwan/cnwan-adaptor

root@linux-700-2:/home/mkania# curl -H 'Content-Type: application/json' -X POST -d '{"metadataKey":"cnwan.io/traffic-profile","metadataValue": "green", "policyType": "Data", "policyName": "cnwan_dp" }' http://localhost:80/mappings
{
  "detail": "Config OK"
}

root@linux-700-2:/home/mkania# docker logs adaptor
 * Serving Flask app '__main__' (lazy loading)
 * Environment: development
 * Debug mode: on

asimpleidea · 2022-09-12T08:52:14Z

Ok then this is probably due to a response sent by vManage.

Thank you, will update you asap!

arnatal · 2022-09-12T09:03:30Z

@mkania-cisco If I remember correctly the adaptor keeps the logs inside the docker container. Could you get inside the adaptor container and check "adaptor.log"?

docker exec -it [container-id] /bin/sh
cat adaptor.log

mkania-cisco · 2022-09-12T09:14:32Z

Thanks -- I haven't dug that deep to discover that!

DEBUG:metadata_adaptor.vmanage_functions:AppRoute policy loaded: {'description': 'cnwan_merge',
 'mode': '',
 'name': 'cnwan_merge',
 'optimized': 'false',
 'sequences': [{'actions': [{'parameter': [{'field': 'name',
                                            'ref': 'e68bcf64-a8f9-4fb1-b2f4-803cb1907aba'},
                                           {'field': 'preferredColor',
                                            'value': 'green'},
                                           {'field': 'strict'}],
                             'type': 'slaClass'}],
                'match': {'entries': []},
                'sequenceId': 10,
                'sequenceIpType': 'ipv4',
                'sequenceName': 'App Route',
                'sequenceType': 'appRoute'}],
 'type': 'appRoute'}
DEBUG:metadata_adaptor.core_lib:New merge policy for AppRoute is [{'actions': [{'parameter': [{'field': 'name',
                              'ref': 'e68bcf64-a8f9-4fb1-b2f4-803cb1907aba'},
                             {'field': 'preferredColor', 'value': 'green'},
                             {'field': 'strict'}],
               'type': 'slaClass'}],
  'match': {'entries': []},
  'sequenceId': 10,
  'sequenceIpType': 'ipv4',
  'sequenceName': 'App Route',
  'sequenceType': 'appRoute'}]
DEBUG:metadata_adaptor.vmanage_functions:PUT https://10.62.141.179:8443/dataservice/template/policy/definition/approute/cbce642b-46e1-457f-b5a8-e8541cdbb769
DEBUG:metadata_adaptor.vmanage_functions:Sending this payload: {"name": "cnwan_merge", "type": "appRoute", "description": "cnwan_merge", "sequences": [{"sequenceId": 10, "sequenceName": "App Route", "sequenceType": "appRoute", "sequenceIpType": "ipv4", "match": {"entries": []}, "actions": [{"type": "slaClass", "parameter": [{"field": "name", "ref": "e68bcf64-a8f9-4fb1-b2f4-803cb1907aba"}, {"field": "preferredColor", "value": "green"}, {"field": "strict"}]}]}], "mode": "", "optimized": "false"}
DEBUG:urllib3.connectionpool:https://10.62.141.179:8443 "PUT /dataservice/template/policy/definition/approute/cbce642b-46e1-457f-b5a8-e8541cdbb769 HTTP/1.1" 200 None
DEBUG:metadata_adaptor.vmanage_functions:Status Code:  200
DEBUG:connexion.apis.abstract:Getting data and status code
DEBUG:connexion.apis.abstract:Prepared body and status code (204)
DEBUG:connexion.apis.abstract:Got framework response

From here it looks OK: vManage answers with status code 200.

There seem to be no errors either:

/usr/src/app # cat adaptor.log | grep ERROR
/usr/src/app #
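The log lines above follow Python's default `logging` layout, `LEVEL:logger:message`. As an alternative to `grep ERROR`, which also matches the word inside multi-line DEBUG dumps, here is a minimal sketch that keeps only records whose line starts with the given level. `records_at_level` is a hypothetical helper, and the assumption that every record in `adaptor.log` starts with its level comes only from the excerpts in this thread.

```python
from typing import Iterable, Iterator, Tuple


def records_at_level(lines: Iterable[str], level: str = "ERROR") -> Iterator[Tuple[str, str]]:
    """Yield (logger_name, message) for lines shaped like 'LEVEL:logger:message'."""
    prefix = level + ":"
    for line in lines:
        if not line.startswith(prefix):
            continue  # continuation lines and other levels are skipped
        rest = line[len(prefix):].rstrip("\n")
        logger_name, _, message = rest.partition(":")
        yield logger_name, message
```

Inside the container it could be fed the open file object for `adaptor.log`, since iterating a text file yields one line at a time.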

mkania-cisco · 2022-09-12T09:22:35Z

...and after an etcd refresh I see an ERROR:

ERROR:metadata_adaptor.core_lib:An error ocurred while communicating with the SDWAN controller.
ERROR:metadata_adaptor.core_lib:Details: 'encap'

probably some change in templates..?

Looking here does not give any further ideas aside from checking for BREAKING CHANGES in the vManage APIs.
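One hint about the `Details: 'encap'` line: a bare quoted key is what Python produces when a `KeyError` is converted to a string. The sketch below shows that mechanism. It is only a plausible reading, not something confirmed in this thread: it assumes the adaptor indexes a dict for an `encap` field that the vManage response did not contain and then logs the exception. `read_encap` and the logger name are hypothetical.

```python
import logging

logger = logging.getLogger("keyerror_sketch")  # hypothetical logger name


def read_encap(action: dict):
    """Return action['encap'], logging the exception text if the lookup fails."""
    try:
        return action["encap"]
    except Exception as exc:
        # For a missing key, str(exc) is the repr of the key: 'encap'
        logger.error("Details: %s", exc)
        return None
```

If this reading is right, the place to look is whichever vManage payload the adaptor expects to carry an `encap` field, and whether that field is still present in 20.7.1.1.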

asimpleidea self-assigned this Sep 12, 2022

asimpleidea transferred this issue from CloudNativeSDWAN/cnwan-operator Sep 12, 2022
