Jupyter on Kubernetes - the easy way

Categories: Kubernetes

Introduction

I have been playing a bit more with Python recently. I wanted to test out some algorithms using this awesome tool called Project Jupyter. I also did not want to drain my laptop's battery while doing this, as I have a powerful machine at home running a single-node Kubernetes cluster.

If you are interested in learning how to install such a single-node cluster on your old desktop, you can find it here.

UPDATE 08-05-2018

The minimal notebook image doesn’t have much inside. If you want to use your Jupyter notebook for data science or TensorFlow experiments, I recommend replacing the minimal image jupyter/minimal-notebook with jupyter/tensorflow-notebook.

First iteration

First we need to know what we are running and whether we can run it locally. I have tried different images on Docker Hub, but unfortunately most of them require a non-trivial amount of setup. What I wanted instead was something quick and dirty, just to get a single notebook running on my home bare-metal cluster.

I ended up running “Minimal notebook” in a container.

Let’s first create a namespace for our experiments:

kubectl create ns jupyter

After that we need to create a simple Jupyter deployment, saved as jupyter.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: jupyter-notebook
  labels:
    app: jupyter-notebook
spec:
  replicas: 1
  selector:
    matchLabels:
      app: jupyter-notebook
  template:
    metadata:
      labels:
        app: jupyter-notebook
    spec:
      containers:
      - name: minimal-notebook
        image: jupyter/minimal-notebook:latest
        ports:
        - containerPort: 8888

And apply it:

kubectl apply -f jupyter.yaml --namespace jupyter

Let it download all the layers; this will take a while. The image weighs around 1 GB, and it took me 7 minutes to download on a 40 Mbps network.
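
The pod name used in the commands that follow (jupyter-notebook-57757bf84d-rwl7m) is specific to my cluster; yours will have a different suffix. A minimal sketch for waiting until the deployment is ready and looking up the pod name by its app label, assuming kubectl is pointed at your cluster (the helper names wait_for_notebook and notebook_pod are made up for illustration):

```shell
# Block until the Deployment reports ready, or give up after 15 minutes.
wait_for_notebook() {
  kubectl rollout status deployment/jupyter-notebook -n jupyter --timeout=15m
}

# Print the name of the first pod carrying the app=jupyter-notebook label.
notebook_pod() {
  kubectl get pods -n jupyter -l app=jupyter-notebook \
    -o jsonpath='{.items[0].metadata.name}'
}

# Usage (uncomment to run against a real cluster):
# wait_for_notebook && kubectl logs -n jupyter "$(notebook_pod)"
```

With a single replica there is only one matching pod, so taking the first item is enough here.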

Once the pod is running, we need the container's logs to get the token that will allow us to reach the UI (replace the pod name with your own):

kubectl logs -n jupyter jupyter-notebook-57757bf84d-rwl7m

Result:

/usr/local/bin/start-notebook.sh: ignoring /usr/local/bin/start-notebook.d/*

Container must be run with group root to update passwd file
Executing the command: jupyter notebook
[I 08:55:50.738 NotebookApp] Writing notebook server cookie secret to /home/jovyan/.local/share/jupyter/runtime/notebook_cookie_secret
[W 08:55:50.959 NotebookApp] WARNING: The notebook server is listening on all IP addresses and not using encryption. This is not recommended.
[I 08:55:50.998 NotebookApp] JupyterLab beta preview extension loaded from /opt/conda/lib/python3.6/site-packages/jupyterlab
[I 08:55:50.998 NotebookApp] JupyterLab application directory is /opt/conda/share/jupyter/lab
[I 08:55:51.006 NotebookApp] Serving notebooks from local directory: /home/jovyan
[I 08:55:51.006 NotebookApp] 0 active kernels
[I 08:55:51.006 NotebookApp] The Jupyter Notebook is running at:
[I 08:55:51.006 NotebookApp] http://[all ip addresses on your system]:8888/?token=29aea50d487f7aa3ea52e10393d2e84ef1efef0873c993d9
[I 08:55:51.006 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 08:55:51.007 NotebookApp]

    Copy/paste this URL into your browser when you connect for the first time,
    to login with a token:
        http://localhost:8888/?token=29aea50d487f7aa3ea52e10393d2e84ef1efef0873c993d9
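
If you would rather not copy the token by hand, it can be pulled out of the log text. This is only a sketch that assumes the log format shown above (a token= parameter followed by hex digits); extract_token is a hypothetical helper name:

```shell
# Read Jupyter log text on stdin and print the first login token found.
extract_token() {
  grep -o 'token=[0-9a-f]*' | head -n 1 | cut -d= -f2
}

# Usage (uncomment to run against a real cluster):
# kubectl logs -n jupyter deployment/jupyter-notebook | extract_token
```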

Let's now test our setup. We will forward a port from the container to localhost and try to log in using the token.

kubectl port-forward jupyter-notebook-57757bf84d-rwl7m 8888:8888 -n jupyter

Now open your browser at the following address, replacing the token with the one from your own log output: http://localhost:8888/?token=29aea50d487f7aa3ea52e10393d2e84ef1efef0873c993d9

You should see the Jupyter homepage, as below:

[Image: Jupyter notebook homepage]

Second iteration

Now that we know our installation works, we need to expose it so that we don't need to forward ports every time we want to play with this notebook. Let's create a service and expose it on a high port number on the target machine.

I will save you a few hours: I wasn't able to reach the container we deployed in the first iteration through a NodePort service. For some reason, the notebook requested authentication and did not accept any token. Therefore I had to disable token authentication and leave the notebook open. Remember that this setup is safe only on your local network. In the third iteration we will expose the notebook to the world, and in the fourth we will password-protect the ingress. To disable authentication I had to run the Docker image with its startup script start-notebook.sh and the parameter --NotebookApp.token=''. This is not an issue for us, as the more appropriate place to protect the notebook is probably the ingress anyway.

Now jupyter.yaml should look like this:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: jupyter-notebook
  labels:
    app: jupyter-notebook
spec:
  replicas: 1
  selector:
    matchLabels:
      app: jupyter-notebook
  template:
    metadata:
      labels:
        app: jupyter-notebook
    spec:
      containers:
      - name: minimal-notebook
        image: jupyter/minimal-notebook:latest
        ports:
        - containerPort: 8888
        command: ["start-notebook.sh"]
        args: ["--NotebookApp.token=''"]
---
kind: Service
apiVersion: v1
metadata:
  name: jupyter-notebook
spec:
  type: NodePort
  selector:
    app: jupyter-notebook
  ports:
  - protocol: TCP
    nodePort: 30040
    port: 8888
    targetPort: 8888

Apply the file again with the same kubectl apply command as before. You should then be able to reach the notebook at:

http://[YOUR_CLUSTER_IP]:30040/
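
The address to use here is the IP of a cluster node, not a service IP. As a sketch, assuming a single-node cluster like mine where the first node is the one you want, the URL can be assembled like this (node_ip and notebook_url are hypothetical helper names; 30040 is the nodePort from the file above):

```shell
# Print the InternalIP of the first node in the cluster.
node_ip() {
  kubectl get nodes \
    -o jsonpath='{.items[0].status.addresses[?(@.type=="InternalIP")].address}'
}

# Build the notebook URL from the node IP and the nodePort set in jupyter.yaml.
notebook_url() {
  printf 'http://%s:30040/\n' "$(node_ip)"
}

# Usage (uncomment to run against a real cluster):
# notebook_url
```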

Third iteration

Let's expose our notebook to the public network. This part is heavily dependent on your cluster setup and it might work differently for you.

What you need is:
- a properly configured nginx ingress controller, at least version 0.9
- kube-lego
- a domain pointing at your cluster

You can read about how to install kube-lego in the cluster here.

Let's add an ingress definition to our file and apply it to the cluster again:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: jupyter-notebook
  labels:
    app: jupyter-notebook
spec:
  replicas: 1
  selector:
    matchLabels:
      app: jupyter-notebook
  template:
    metadata:
      labels:
        app: jupyter-notebook
    spec:
      containers:
      - name: minimal-notebook
        image: jupyter/minimal-notebook:latest
        ports:
        - containerPort: 8888
        command: ["start-notebook.sh"]
        args: ["--NotebookApp.token=''"]
---
kind: Service
apiVersion: v1
metadata:
  name: jupyter-notebook
spec:
  type: NodePort
  selector:
    app: jupyter-notebook
  ports:
  - protocol: TCP
    nodePort: 30040
    port: 8888
    targetPort: 8888
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: jupyter-notebook
  namespace: jupyter
  annotations:
    kubernetes.io/tls-acme: "true"
    kubernetes.io/ingress.class: "nginx"
spec:
  rules:
  - host: yourhost.yourdomain.com
    http:
      paths:
      - path: /
        backend:
          serviceName: jupyter-notebook
          servicePort: 8888
  tls:
  - secretName: jupyter-notebook-tls
    hosts:
      - yourhost.yourdomain.com
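
A note for newer clusters: the extensions/v1beta1 Ingress API used above was removed in Kubernetes 1.22. As a sketch, assuming such a cluster, the same Ingress in the networking.k8s.io/v1 shape could look like the manifest below. The print_ingress_v1 helper is hypothetical and only wraps the manifest so it can be piped into kubectl. The annotations are carried over unchanged; whether your ingress controller and certificate tooling still honour them depends on your setup.

```shell
# Print a networking.k8s.io/v1 Ingress for the notebook service.
# $1 is the public host name pointing at the cluster.
print_ingress_v1() {
  host="$1"
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: jupyter-notebook
  annotations:
    kubernetes.io/tls-acme: "true"
    kubernetes.io/ingress.class: "nginx"
spec:
  rules:
  - host: $host
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: jupyter-notebook
            port:
              number: 8888
  tls:
  - secretName: jupyter-notebook-tls
    hosts:
    - $host
EOF
}

# Usage (uncomment to run against a real cluster):
# print_ingress_v1 yourhost.yourdomain.com | kubectl apply -n jupyter -f -
```

The main differences from the older form are the required pathType field and the nested service name and port under backend.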

Fourth iteration

The last thing we have to do is password-protect our publicly available notebook. We need to generate a password for the ingress definition as described here:

https://github.com/kubernetes/contrib/tree/master/ingress/controllers/nginx/examples/auth

Remember that the secret with the password must be created in the jupyter namespace, and its name must match the auth-secret annotation used below (basic-auth). The htpasswd data inside the secret is expected under the key auth.
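
The steps behind that link boil down to generating an htpasswd file and storing it in a secret. A minimal sketch, assuming the htpasswd tool is installed (it ships in apache2-utils on Debian/Ubuntu and httpd-tools on Red Hat-style systems); create_basic_auth_secret is a hypothetical helper name:

```shell
# Create the basic-auth secret in the jupyter namespace for the given user.
# htpasswd prompts for the password interactively.
create_basic_auth_secret() {
  user="$1"
  workdir="$(mktemp -d)" || return 1
  # The file is stored under the key "auth" inside the secret.
  htpasswd -c "$workdir/auth" "$user" &&
    kubectl create secret generic basic-auth \
      --from-file=auth="$workdir/auth" -n jupyter
  status=$?
  # Do not leave the password file lying around on disk.
  rm -rf "$workdir"
  return "$status"
}

# Usage (uncomment to run against a real cluster):
# create_basic_auth_secret myuser
```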

So, assuming that we have a secret named basic-auth in our namespace, we can add the password-protection annotations to our ingress.

jupyter.yaml will now look like this:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: jupyter-notebook
  labels:
    app: jupyter-notebook
spec:
  replicas: 1
  selector:
    matchLabels:
      app: jupyter-notebook
  template:
    metadata:
      labels:
        app: jupyter-notebook
    spec:
      containers:
      - name: minimal-notebook
        image: jupyter/minimal-notebook:latest
        ports:
        - containerPort: 8888
        command: ["start-notebook.sh"]
        args: ["--NotebookApp.token=''"]
---
kind: Service
apiVersion: v1
metadata:
  name: jupyter-notebook
spec:
  type: NodePort
  selector:
    app: jupyter-notebook
  ports:
  - protocol: TCP
    nodePort: 30040
    port: 8888
    targetPort: 8888
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: jupyter-notebook
  namespace: jupyter
  annotations:
    kubernetes.io/tls-acme: "true"
    kubernetes.io/ingress.class: "nginx"
    nginx.ingress.kubernetes.io/auth-type: basic
    # name of the secret that contains the user/password definitions
    nginx.ingress.kubernetes.io/auth-secret: basic-auth
    # message to display with an appropriate context why the authentication is required
    nginx.ingress.kubernetes.io/auth-realm: "Authentication Required - jupyter"
spec:
  rules:
  - host: yourhost.yourdomain.com
    http:
      paths:
      - path: /
        backend:
          serviceName: jupyter-notebook
          servicePort: 8888
  tls:
  - secretName: jupyter-notebook-tls
    hosts:
      - yourhost.yourdomain.com
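
One way to check that the protection is actually in place, as a sketch: a request without credentials should be rejected with HTTP 401. This assumes curl is installed and the certificate for your host has already been issued; unauthenticated_status and is_protected are hypothetical helper names.

```shell
# Print the HTTP status code returned for a request without credentials.
unauthenticated_status() {
  curl -s -o /dev/null -w '%{http_code}' "https://$1/"
}

# Succeed only if the host rejects anonymous access with 401.
is_protected() {
  [ "$(unauthenticated_status "$1")" = "401" ]
}

# Usage (uncomment to run against your own host):
# is_protected yourhost.yourdomain.com && echo "basic auth is enforced"
```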

Now you should be able to reach your site from anywhere in the world and access your password-protected Jupyter notebook over a secure SSL connection. I hope this article will save you some battery on your ultra-portable laptop next time you're in a cafeteria or on a balcony testing your Python scripts ;)
