Linkerd vs Istio: my 2¢

Almost like Vans vs Converse…

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I see people are still reading this post, so I wanted to point out that it was written quite some time ago and the information here is likely out of date. Please do your own research and take the opinions expressed in this article lightly. I haven’t been able to write a follow-up article, as my views have changed over the past year…

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This article was written before Linkerd introduced support for Istio. For more information: check out Linkerd’s Istio documentation and Linkerd’s blog post.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

TL;DR

In short, a service mesh is a layer that manages the communication between apps (or between parts of the same app, e.g. microservices).

Just as applications shouldn’t be writing their own TCP stack, they also shouldn’t be managing their own load balancing logic, or their own service discovery management, or their own retry and timeout logic. -link

Mesh: A group of hosts that coordinate to provide a consistent network topology. In this documentation, an “Envoy mesh” is a group of Envoy proxies that form a message passing substrate for a distributed system comprised of many different services and application platforms. -link

Linkerd

I spent a few weeks experimenting with Linkerd. If you haven’t already, you can check out a few of the articles here: (part 1, part 2, part 3). Overall, I found Linkerd to be very helpful with a few use cases:

Of course, Linkerd has many other features but these were the first few use cases that were important to me.

Envoy

From the looks of it, Envoy fills the same role as Linkerd: both function as a proxy and can report on the services connected to them. From the looks of this issue, Envoy isn’t designed to be a Kubernetes Ingress Controller on its own. Instead, Istio aims to bridge the gap between a service platform (Kubernetes being one implementation) and a service mesh agent (the Istio Proxy).

Istio

Much like a Kubernetes deployment, an Istio deployment has several moving parts. Most of the pages in their documentation contain diagrams of the architectures specific to a part. Here are the pages we are concerned about:

It appears that Kubernetes Services are the “Service” in “Service Registry” (keyword: appears). Linkerd, by contrast, uses identifiers and binding to support multiple service registries. It doesn’t look like we will have much luck plugging in a service registered with Consul/Zookeeper/Etcd/…
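For comparison, here is a minimal sketch of how Linkerd 1.x can be pointed at an external registry via a namer and a dtab; the Consul address and datacenter name are assumptions for illustration:

```yaml
# Hypothetical linkerd config: bind /svc names through Consul
# instead of the Kubernetes API.
namers:
- kind: io.l5d.consul          # Consul-backed namer
  host: consul.internal        # assumed Consul agent address
  port: 8500
routers:
- protocol: http
  dtab: |
    /svc => /#/io.l5d.consul/dc1;   # resolve services in datacenter dc1
  servers:
  - port: 4140
```

Swapping the namer (and the dtab prefix) is all it takes to move between registries, which is exactly the flexibility Istio appears to lack here.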

Installation

After downloading and extracting a release, you should be left with the following:

$ tree istio-0.1.5
istio-0.1.5
├── LICENSE
├── bin
│   └── istioctl
├── install
│   └── kubernetes
│       ├── README.md
│       ├── addons
│       │   ├── grafana.yaml
│       │   ├── prometheus.yaml
│       │   ├── servicegraph.yaml
│       │   └── zipkin.yaml
│       ├── istio-auth.yaml
│       ├── istio-rbac-alpha.yaml
│       ├── istio-rbac-beta.yaml
│       ├── istio.yaml
│       └── templates
│           ├── istio-auth
│           │   ├── istio-auth-with-cluster-ca.yaml
│           │   ├── istio-cluster-ca.yaml
│           │   ├── istio-egress-auth.yaml
│           │   ├── istio-ingress-auth.yaml
│           │   └── istio-namespace-ca.yaml
│           ├── istio-egress.yaml
│           ├── istio-ingress.yaml
│           ├── istio-manager.yaml
│           └── istio-mixer.yaml
├── istio.VERSION
└── samples
    ├── README.md
    └── apps
        ├── bookinfo
        │   ├── README.md
        │   ├── bookinfo.yaml
        │   ├── cleanup.sh
        │   ├── destination-ratings-test-delay.yaml
        │   ├── loadbalancing-policy-reviews.yaml
        │   ├── mixer-rule-additional-telemetry.yaml
        │   ├── mixer-rule-empty-rule.yaml
        │   ├── mixer-rule-ratings-denial.yaml
        │   ├── mixer-rule-ratings-ratelimit.yaml
        │   ├── route-rule-all-v1.yaml
        │   ├── route-rule-delay.yaml
        │   ├── route-rule-reviews-50-v3.yaml
        │   ├── route-rule-reviews-test-v2.yaml
        │   ├── route-rule-reviews-v2-v3.yaml
        │   └── route-rule-reviews-v3.yaml
        ├── httpbin
        │   ├── README.md
        │   └── httpbin.yaml
        └── sleep
            ├── README.md
            └── sleep.yaml

After starting a Kubernetes cluster, apply the istio.yaml to your cluster and you should be all set!

$ kubectl apply -f istio-0.1.5/install/kubernetes/istio.yaml
service "istio-mixer" created
deployment "istio-mixer" created
configmap "istio" created
service "istio-manager" created
serviceaccount "istio-manager-service-account" created
deployment "istio-manager" created
service "istio-ingress" created
serviceaccount "istio-ingress-service-account" created
deployment "istio-ingress" created
service "istio-egress" created
deployment "istio-egress" created

Note: if you are using Minikube, you’ll have to change some Services to type NodePort.
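A minimal sketch of that change, assuming you are editing the istio-ingress Service from istio.yaml (the node port and selector are illustrative):

```yaml
# istio.yaml (excerpt): expose the ingress Service on a node port
# so Minikube can reach it from outside the cluster.
apiVersion: v1
kind: Service
metadata:
  name: istio-ingress
spec:
  type: NodePort          # changed from LoadBalancer
  ports:
  - port: 80
    nodePort: 32001       # assumed port; must fall in 30000-32767
  selector:
    istio: ingress        # assumed selector; match what istio.yaml ships
```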

Istio does not come with a dashboard. Instead, you can use the default Kubernetes dashboard provided with Minikube and browse to Third Party Resources to gain some visibility.

Calculator Service

We will use the calculator service from my previous demos. This service isn’t as fancy as Istio’s BookInfo application, but it will help me highlight the difficulties of integrating this framework with an existing application.

First, we have to make a small modification to our Ingress resources, annotating them with: kubernetes.io/ingress.class: "istio"
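A sketch of what that looks like on an Ingress resource; the host and backend names are assumptions modeled on the gateway used later in this post:

```yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: gateway
  annotations:
    # Hand this Ingress to Istio instead of the NGINX/GCE controllers.
    kubernetes.io/ingress.class: "istio"
spec:
  rules:
  - host: gateway
    http:
      paths:
      - path: /compute
        backend:
          serviceName: gateway    # assumed Service name
          servicePort: 8080
```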

Next, we can generate and apply our resources:

$ ENVIRONMENT=preprod ./generate.sh
...
$ kubectl apply -f templated

Afterwards, we can use the istioctl kube-inject command to simulate what Istio needs in a Deployment.

$ istio-0.1.5/bin/istioctl -v 6 kube-inject -f echoheaders/echoheaders-deployment.yaml
I0529 10:46:47.877381 44254 loader.go:354] Config loaded from file /Users/genslerj/.kube/config
I0529 10:46:47.909208 44254 round_trippers.go:417] GET https://192.168.99.100:8443/api/v1/namespaces/default/configmaps/istio 200 OK in 29 milliseconds
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    app: mydeployment
    version: v1
  name: mydeployment
spec:
  replicas: 2
  strategy: {}
  template:
    metadata:
      annotations:
        alpha.istio.io/sidecar: injected
        alpha.istio.io/version: jenkins@ubuntu-16-04-build-de3bbfab70500-0.1.5-21f4cb4
        pod.beta.kubernetes.io/init-containers: '[{"args":["-p","15001","-u","1337"],"image":"docker.io/istio/init:0.1","imagePullPolicy":"Always","name":"init","securityContext":{"capabilities":{"add":["NET_ADMIN"]}}},{"args":["-c","sysctl -w kernel.core_pattern=/tmp/core.%e.%p.%t \u0026\u0026 ulimit -c unlimited"],"command":["/bin/sh"],"image":"alpine","imagePullPolicy":"Always","name":"enable-core-dump","securityContext":{"privileged":true}}]'
      creationTimestamp: null
      labels:
        app: mydeployment
        version: v1
    spec:
      containers:
      - image: brndnmtthws/nginx-echo-headers:latest
        name: mydeployment
        ports:
        - containerPort: 8080
          name: my-http
        resources: {}
      - args:
        - proxy
        - sidecar
        - -v
        - "2"
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        image: docker.io/istio/proxy_debug:0.1
        imagePullPolicy: Always
        name: proxy
        resources: {}
        securityContext:
          runAsUser: 1337
status: {}
---

The injection of the sidecar proxy is expected. However, the injection of an init container is interesting. Digging into their code, I have found the following comment block:

NOTE: This tool only exists because kubernetes does not support dynamic/out-of-tree admission controller for transparent proxy injection. This file should be removed as soon as a proper kubernetes admission controller is written for istio.

Here is the link to the command line reference with a larger explanation. There is also a whole page of documentation on this topic. Overall, this doesn’t surprise us because we normally set the HTTP_PROXY variable with Linkerd deployments. There is an issue noting the support for HTTP_PROXY variables instead of a transparent proxy.

When you deploy the modified files, you might find that a curl to the Istio Ingress Controller fails:

$ curl -H "Host: gateway" 192.168.99.100:32001/compute -v
* Trying 192.168.99.100...
* Connected to 192.168.99.100 (192.168.99.100) port 32001 (#0)
> GET /compute HTTP/1.1
> Host: gateway
> User-Agent: curl/7.43.0
> Accept: */*
>
< HTTP/1.1 404 Not Found
< date: Sat, 27 May 2017 21:54:46 GMT
< server: envoy
< content-length: 0
<
* Connection #0 to host 192.168.99.100 left intact

This is because our Ingress resources are in separate namespaces. Even after fixing the namespace issue, I found that Istio doesn’t handle port names with dashes. The httpbin example doesn’t exercise any variation of ports (service port different from container port, service port name different from container port name, Ingress targeting a service port by name rather than by number)**. I used echoheaders for debugging because all of my calculator service applications are written in Go (like the Ingress Controller).
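A sketch of the kind of rename that worked around the port-name issue for me; the names here are illustrative, not the exact manifests from my repo:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: mydeployment
spec:
  ports:
  - name: myhttp         # a dashed name like my-http tripped up Istio's routing
    port: 8080
    targetPort: 8080
  selector:
    app: mydeployment
```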

$ curl 192.168.99.100:32001/echo -v
* Trying 192.168.99.100...
* Connected to 192.168.99.100 (192.168.99.100) port 32001 (#0)
> GET /echo HTTP/1.1
> Host: 192.168.99.100:32001
> User-Agent: curl/7.43.0
> Accept: */*
>
< HTTP/1.1 200 OK
< server: envoy
< date: Sun, 28 May 2017 06:00:50 GMT
< content-type: text/plain
< x-envoy-upstream-service-time: 0
< transfer-encoding: chunked
<
GET /echo HTTP/1.1
host: mydeployment.default.svc.cluster.local:8080
user-agent: curl/7.43.0
accept: */*
x-forwarded-proto: http
x-request-id: ec896caf-7ee5-90db-8155-757f20abe838
x-b3-traceid: 0000ba3ddf454303
x-b3-spanid: 0000ba3ddf454303
x-b3-sampled: 1
x-ot-span-context: 0000ba3ddf454303;0000ba3ddf454303;0000000000000000;cs
x-envoy-expected-rq-timeout-ms: 15000
content-length: 0
$ curl 192.168.99.100:32001/echo/wont/match -v
* Trying 192.168.99.100...
* Connected to 192.168.99.100 (192.168.99.100) port 32001 (#0)
> GET /echo/wont/match HTTP/1.1
> Host: 192.168.99.100:32001
> User-Agent: curl/7.43.0
> Accept: */*
>
< HTTP/1.1 404 Not Found
< date: Sun, 28 May 2017 06:06:22 GMT
< server: envoy
< content-length: 0
<
* Connection #0 to host 192.168.99.100 left intact

After solving for the single namespace and port issue, we can finally query our service through the Ingress Controller.

$ curl -H "Host: gateway" "192.168.99.100:32001/compute?equation=1%2B6%2B7%2A2%2F1%2A99%2D72%2F100" -v
* Trying 192.168.99.100...
* Connected to 192.168.99.100 (192.168.99.100) port 32001 (#0)
> GET /compute?equation=1%2B6%2B7%2A2%2F1%2A99%2D72%2F100 HTTP/1.1
> Host: gateway
> User-Agent: curl/7.43.0
> Accept: */*
>
< HTTP/1.1 200 OK
< date: Sun, 28 May 2017 06:47:42 GMT
< content-length: 2
< content-type: text/plain; charset=utf-8
< x-envoy-upstream-service-time: 11
< server: envoy
<
* Connection #0 to host 192.168.99.100 left intact
27%

Getting Stats in Grafana

I had a bit of trouble with either old configuration or, well, old configuration. Turns out the metrics are exposed on port 9093 and not 42422…

:(

One interesting observation I have made is that the Grafana dashboard (source) uses the app label for the word “Service.” You can verify this by querying Prometheus on request counts.

with service.spec.selector.app = reviews-app-v1

Remember, these metrics are exported from the Mixer. This means that if you want your application to show up, you’ll need to perform some sort of registration; having the proxies alone is not enough. If you want statistics on your Grafana page, you will need to create an Istio route-rule.

After a long period without anything working, I found out that you must run istioctl kube-inject for services to be registered with the Mixer. I am not sure why this is the case, because it means your Mixer won’t persist configuration between deploys/failures.

# 1+6+7*2/1*99-72/100
curl -H "Host: gateway" "192.168.99.100:32001/compute?equation=1%2B6%2B7%2A2%2F1%2A99%2D72%2F100"
lines are drawn at 4 ops (division and multiplication get called twice)

Unfortunately, I couldn’t seem to get all of the services into a single Zipkin trace, though I am pretty sure this should be possible. This may be because the Gateway doesn’t forward the same headers (the Zipkin trace/span ids) on requests to the services it integrates with. See the productpage code to inspect how they’ve done it.

Building a Canary

This is pretty straightforward given the route-rule system.

---
type: route-rule
name: multiplication-operator-default
spec:
  destination: multiplication-operator.default.svc.cluster.local
  precedence: 1
  route:
  - tags:
      version: "v1"
    weight: 60
  - tags:
      version: "build-123"
    weight: 40

The recipe consists of a Deployment with:

As stated before, if you route to something with Host: multiplication-operator, you will route to Pods with the label app: multiplication-operator, not to a Service! I think I feel more comfortable with Linkerd’s use of the Ingress host field as a way of describing which host to talk to.

Internal and External Traffic

Seeing as Istio is an Ingress Controller, we can use the same NGINX trick from my previous post and let teams create Ingress resources with a service.external host. Getting traffic out of Kubernetes is likely impossible, given that we need Pods with an app label set (rather than a Service with an ExternalName).
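A sketch of that trick applied here, with hypothetical names throughout (the service.external convention and the NGINX pass-through backend are from my previous post, not anything Istio ships):

```yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: partner-api
  annotations:
    kubernetes.io/ingress.class: "istio"
spec:
  rules:
  - host: partner-api.service.external   # convention marking traffic bound outside the cluster
    http:
      paths:
      - backend:
          serviceName: nginx-egress      # assumed NGINX pass-through Service
          servicePort: 80
```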

Another option would be to use a more complex route-rule system or auth objects so that an NGINX container isn’t authorized to talk with internal services.

Closing Thoughts

At the moment, Istio is only for applications deployed on Kubernetes. There is likely a way to run the Istio proxy outside of a cluster, but it might be more trouble than it is worth. I am excited about the project, but I would caution early adopters to think about what problems this tool will actually help them solve.

While I don’t use Linkerd in production, I would place my bet on Linkerd. They are an incredibly responsive team in their Slack channel and on their Discourse page.

Find the code here:

**

[2017-05-28 06:12:27.951][12][warning][router] rds: fetch failure: JSON at lines 3-9 does not conform to schema.
Invalid schema: #/properties/routes
Schema violation: type
Offending document key: #/routes
[2017-05-28 06:17:41.217][12][warning][router] rds: fetch failure: JSON at lines 3-9 does not conform to schema.
Invalid schema: #/properties/routes
Schema violation: type
Offending document key: #/routes
$ curl 192.168.99.100:32001/echo -v
* Trying 192.168.99.100...
* Connected to 192.168.99.100 (192.168.99.100) port 32001 (#0)
> GET /echo HTTP/1.1
> Host: 192.168.99.100:32001
> User-Agent: curl/7.43.0
> Accept: */*
>
< HTTP/1.1 404 Not Found
< date: Sun, 28 May 2017 06:14:44 GMT
< server: envoy
< content-length: 0
<