NFS example needs updates #44

Closed
kingdonb opened this issue Jul 29, 2017 · 16 comments

Comments

@kingdonb

Is this now the best place to raise kubernetes/kubernetes#48161?

The NFS example is not on the list of supported examples, but if you're looking for maintainer volunteers, I'm interested. I'm pretty sure I'm going to need this NFS example to work for my own clusters, and I plan to keep up with Kubernetes upgrades across several dev/prod environments, so I might be a good fit as a supporting maintainer.

@ahmetb (Member) commented Aug 3, 2017

@kingdonb Thanks for showing interest. Our goal with examples like NFS that currently live in the staging/ directory is to find maintainers for them and, ideally, move them to another repository. If you think you have cycles to maintain this, I recommend suggesting a kubernetes-incubator/kubernetes-nfs repository by discussing it with [email protected]. Alternatively, you can create a personal repository until that happens and then move things over.

Also, please consider creating a Helm chart for easy installation of NFS in multiple environments, if applicable. I see many issues regarding NFS, and a community-maintained Helm chart could help keep everyone trying to run NFS on Kubernetes from going through the same pains. Would this be something you're interested in?

@kingdonb (Author) commented Aug 3, 2017

Yes, I am interested in that.

I'll review your notes on the linked PR and start working on a Helm chart tonight!

@msau42 (Member) commented Aug 4, 2017

/sig storage

@ahmetb (Member) commented Aug 5, 2017

@msau42 I'm afraid we don't have SIG or area labels in this repository yet.

@kingdonb (Author)

I'm working my way back around to this from deis/workflow#856 and deis/workflow#857. These are things I need personally, so I'll have to keep them working anyway, which should make me a good maintainer for these examples.

@ahmetb (Member) commented Sep 13, 2017

Any updates here?

@kingdonb (Author)

Not yet. Sorry, I will make a point to look at it this week.

The goal is still to convert the example into a helm chart, with some parameters. (I'm learning the ins and outs of building helm charts by porting Deis to OpenShift, which has been fun!)

@kingdonb (Author) commented Oct 3, 2017

I still owe a helm chart for this story.

I've actually had a number of inquiries from people who need a PV that several running pods can mount and write to concurrently. I'm trying to avoid writing a helm chart that only solves my own problem; the "engineering rule of three" says you can't make a general solution until you have at least three distinct customers with overlapping problems.

Not abandoned, but I'm actively seeking that third customer. If anyone needs this, please reach out to me.
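For anyone unfamiliar with the terminology, "a volume many pods can write to concurrently" is the ReadWriteMany access mode. A minimal sketch of a claim for such a volume, assuming a statically provisioned NFS-backed PV like the one in this example (the name and size are illustrative):

```yaml
# Illustrative only: a claim asking for a volume that multiple pods can
# mount read-write at the same time, which NFS can satisfy.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs
spec:
  accessModes:
    - ReadWriteMany      # many pods, all read-write
  storageClassName: ""   # bind to a pre-created (static) PV, no dynamic provisioning
  resources:
    requests:
      storage: 1Mi
```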

@kingdonb (Author) commented Oct 8, 2017

OK, I have composed PR #108 in order to avoid "perfect getting in the way of good."

It simply converts the existing structure into a helm chart and does not adjust any documentation yet.

You can `helm install -n nfs examples/staging/volumes/nfs/chart/` and you get three RCs as in the original example: nfs-server, nfs-busybox, and nfs-web.

The NFS server binds itself to a 200GB PV, then exposes itself to the cluster as a service using the service.clusterIP from examples/staging/volumes/nfs/chart/values.yaml. This is the only parameter the user is expected to set. I have not been able to find a way to assign it dynamically, but that is not a regression from the previous non-helm example.
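For reference, a sketch of what that single parameter could look like (the address is an assumption; it has to be a free IP inside your cluster's service CIDR):

```yaml
# examples/staging/volumes/nfs/chart/values.yaml (sketch)
# The NFS PV points at a fixed server address, so the server's Service is
# pinned to a clusterIP chosen up front. 10.0.0.77 is only an example.
service:
  clusterIP: 10.0.0.77
```

Then install with `helm install -n nfs examples/staging/volumes/nfs/chart/` as above.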

I personally think this is a bug in the NFS PV driver, but I don't know how any other similar PV driver works, so maybe it's working as designed.

@kingdonb (Author) commented Oct 8, 2017

The new example from #108 works exactly once per cluster boot (using Minikube). Something does not terminate on `helm delete --purge nfs`, and on the second attempt to `helm install` the chart you get:

Unable to mount volumes for pod "nfs-busybox-3227049434-5dch7_default(9ffcb0cc-ac55-11e7-aaaa-080027f5463e)": timeout expired waiting for volumes to attach/mount for pod "default"/"nfs-busybox-3227049434-5dch7". list of unattached/unmounted volumes=[nfs]
Error syncing pod

Whatever hangs around also prevents a `minikube stop` from completing cleanly. If I terminate the VirtualBox instance and start it again, the service comes up and the busybox/web workers all connect to the nfs-server, successfully binding their PVCs to the ReadWriteMany volume.

There may be many obstacles to overcome to make this example viable. Restarting minikube is akin to doing a down/up on the entire cluster, so I'm going to assume this type of failure would be "very bad" for anyone using this example on a live cluster.

@kingdonb (Author) commented Oct 8, 2017

There's another proposed change in kingdonb#2.

Is there any reason to keep these workloads as ReplicationControllers, and have I done the conversion to Deployment resources correctly? The changes work for me (the example pods still come up and do their job).
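Roughly, the conversion has this shape (a sketch only, using the nfs-server workload; the container details are abbreviated and should be checked against the chart, and the API group for Deployments has since settled on apps/v1):

```yaml
# Sketch of the ReplicationController -> Deployment conversion for nfs-server.
# A Deployment needs an explicit spec.selector that matches the pod template
# labels; the pod template itself carries over from the RC unchanged.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nfs-server
spec:
  replicas: 1
  selector:
    matchLabels:
      role: nfs-server
  template:
    metadata:
      labels:
        role: nfs-server
    spec:
      containers:
        - name: nfs-server
          image: gcr.io/google_containers/volume-nfs:0.8   # as in the example; verify against the chart
          securityContext:
            privileged: true
          ports:
            - name: nfs
              containerPort: 2049   # mountd/rpcbind ports omitted for brevity
```

The practical difference is that Deployments manage pods through ReplicaSets and support rolling updates, whereas RCs are effectively legacy, which is the main argument for the conversion.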

@msau42 (Member) commented Oct 9, 2017

  1. Regarding service name resolution, @jingxu97 should have fixed the issue on GCE/GKE COS images with kubernetes#51645 ("Set up DNS server in containerized mounter path"). The underlying issue is that mounts are done at the host level, so you need to configure the host's resolv.conf to also include the kube-dns server (see the sketch after this list).

  2. Regarding termination, are you terminating everything in the right order? All the NFS clients must be terminated first, and the NFS server last. If you terminate the server first, the client Pods will fail to unmount during shutdown.
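A rough illustration of the host-level part of point 1, assuming a minikube-style node (the kube-dns ClusterIP is an assumption; check your own cluster):

```sh
# The NFS mount is performed by the kubelet on the node, not inside the pod,
# so the node's resolver must know about kube-dns before a name like
# nfs-server.default.svc.cluster.local will resolve for the mount.
kubectl -n kube-system get svc kube-dns                       # note the ClusterIP, commonly 10.96.0.10
echo "nameserver 10.96.0.10" | sudo tee -a /etc/resolv.conf   # assumed IP; substitute your own
```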

@kingdonb (Author) commented Oct 9, 2017

@msau42 That helps, thank you! I will try terminating the clients first next time.

I'm not on GKE COS, though, so I'll seek out a solution for my local OS; that seems like a reasonable adjustment to make on the host. I also have other components that require off-cluster clients to resolve the service the same way on-cluster consumers do. (I'm running a CAS server for federated HTTP authentication.)

Thanks very much!

@kingdonb (Author)

I think the alternative to terminating the clients first is to set the soft mount option: that way the client produces an I/O error when it loses contact with the server instead of hanging forever.

You might want your production deployments to wait for the server to come back rather than erroring out, so this could be one of the parameters to add to values.yaml.
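Something like this on the PV template, assuming the chart exposes the option as a value (the nfs.mountOption value name is hypothetical):

```yaml
# Sketch: surfacing NFS mount behavior as a chart parameter.
# "soft" returns an I/O error to clients when the server is unreachable;
# the default "hard" blocks and retries until the server comes back.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs
spec:
  capacity:
    storage: 1Mi
  accessModes:
    - ReadWriteMany
  mountOptions:
    - "{{ .Values.nfs.mountOption }}"        # hypothetical value: "soft" or "hard"
  nfs:
    server: "{{ .Values.service.clusterIP }}"   # the clusterIP set in values.yaml
    path: "/"
```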

@kingdonb (Author) commented Oct 22, 2017

There is helm/charts#2559 now, which can be a new home for this concern. I also note that examples/staging/nfs was updated and now works fine for me without helm. Thanks!
