The workshop is based on this Red Hat tutorial for Istio: Java (Spring Boot, Vert.x and MicroProfile) + Istio on Kubernetes/OpenShift. A couple of days ago this Istio tutorial was updated to version 1.3. I didn't have the time to update this page, so we will use version 1.1.9 of the tutorial for the time being.
We will use Minishift v1.34 to run Istio on OpenShift 3.11. Please follow the instructions on how to install Istio on Minishift from the tutorial above. Furthermore, deploy the application as described in section 2, Deploy Microservices. You can use the images that I uploaded to https://quay.io/repository/omeyer/istio-tutorial instead of building the images locally. The deployment YAML and the YAML files for the exercises can be found in my GitHub repo https://github.com/olaf-meyer/openshift-talks. Please clone it, because we will use it in the workshop.
Use this command to set up the demo application:
oc apply -f <(istioctl kube-inject -f ../openshift-talks/istio_example_application.yaml) -n tutorial
Hint: You might need to adjust the path to the YAML files to match your environment.
We will skip the section with the routing of requests, because this was part of the first Istio workshop.
In the workshop we are not going to compile changes in the source code; instead, we are going to use the example images that I uploaded to quay.io.
In this section you will use dynamic routing based on certain error cases. If, for example, a timeout happens for a certain pod, then Istio will retry the request against an alternative pod (if possible). If no alternative is available, the error 503 is returned.
Let's start with this exercise: https://redhat-developer-demos.github.io/istio-tutorial/istio-tutorial/1.1.x/5circuit-breaker.html
oc exec -it -n tutorial $(oc get pods -n tutorial|grep recommendation-v2|awk '{ print $1 }'|head -1) -c recommendation /bin/bash
curl localhost:8080/misbehave
exit
The exercise explains that the pod which returns 503s is no longer called. However, it does not explain why this is happening. To get to the bottom of it, we need to dump the Envoy configuration of the preference pod with the following commands:
oc port-forward preference-v1-b56964f5c-fhczd -n tutorial 15000
and, in a different terminal,
curl http://localhost:15000/config_dump > ../openshift-talks/envoy_config_preference.json
If you search the envoy_config_preference.json file for the string outbound|8080||preference.tutorial.svc.cluster.local, you should find a configuration similar to this:
"routes": [
{
"match": {
"prefix": "/"
},
"route": {
"cluster": "outbound|8080||preference.tutorial.svc.cluster.local",
"timeout": "0s",
"retry_policy": {
"retry_on": "connect-failure,refused-stream,unavailable,cancelled,resource-exhausted,retriable-status-codes",
"num_retries": 2,
"retry_host_predicate": [
{
"name": "envoy.retry_host_predicates.previous_hosts"
}
],
"host_selection_retry_max_attempts": "3",
"retriable_status_codes": [
503
]
},
"max_grpc_timeout": "0s"
},
"decorator": {
"operation": "preference.tutorial.svc.cluster.local:8080/*"
},
You can see that, by default, the Envoy proxy is configured to retry requests that returned the status code 503. Details on the configuration can be found here: route.RetryPolicy
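If you want to tune this behavior yourself instead of relying on Envoy's defaults, a retry policy can also be set per route in a VirtualService. A minimal sketch (the host and all values here are only an illustration, not part of the tutorial files):

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: recommendation
spec:
  hosts:
  - recommendation
  http:
  - route:
    - destination:
        host: recommendation
    # retry a failed call up to 3 times, giving each attempt 2 seconds
    retries:
      attempts: 3
      perTryTimeout: 2s
      retryOn: 5xx,connect-failure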
In the next exercise we are going to simulate slow responses of a service/pod. Please use this patch command instead of the command provided in the tutorial:
oc patch deployment recommendation-v2 --type='json' -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/image", "value": "quay.io/omeyer/istio-tutorial:recommendationv3" }]'
Run the rest of the exercise as described here: Timeout
Definition of the timeout setting can be found here https://istio.io/docs/reference/config/networking/v1alpha3/virtual-service/#HTTPRoute and here https://istio.io/docs/concepts/traffic-management/#timeouts-and-retries
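For orientation, the VirtualService used in the Timeout exercise looks roughly like this (a sketch, assuming a 1 second timeout; check the YAML in the tutorial for the exact value):

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: recommendation
spec:
  hosts:
  - recommendation
  http:
  - route:
    - destination:
        host: recommendation
    # fail the request if recommendation does not answer within 1 second
    timeout: 1.0s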
Reset the behavior of recommendation-v2:
oc patch deployment recommendation-v2 --type='json' -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/image", "value": "quay.io/omeyer/istio-tutorial:recommendationv2" }]'
In the first part we will use the attributes http1MaxPendingRequests and maxRequestsPerConnection in the DestinationRule to limit the number of concurrent requests.
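Such a DestinationRule looks roughly like this (a sketch; the exact limits in the destinationrule.yaml from my repo may differ):

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: recommendation
spec:
  host: recommendation
  trafficPolicy:
    connectionPool:
      tcp:
        # allow only one TCP connection to the backend
        maxConnections: 1
      http:
        # queue at most one pending request, one request per connection
        http1MaxPendingRequests: 1
        maxRequestsPerConnection: 1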
Please delete all destination rules and virtual services with the script ./clean.sh of the tutorial.
So that we can see an effect of the connection pool, the requests shouldn't return immediately but should take a bit of processing time. We can achieve this by using the image with the built-in delay:
oc patch deployment recommendation-v2 --type='json' -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/image", "value": "quay.io/omeyer/istio-tutorial:recommendationv3" }]'
Now let's create the virtual service and the destination rule with these commands:
oc create -f istio-workshop-part2/circuitbreaker/connection\ pool/destinationrule.yaml
oc create -f istio-workshop-part2/circuitbreaker/connection\ pool/virtualservice.yaml
Let's put our application under a bit of load (5 concurrent requests) with this command:
siege -r 10 -c 5 -v http://istio-ingressgateway-istio-system.$(minishift ip).nip.io/customer
The result should be that all requests return the status 200.
If we increase the load, we should see that in a lot of cases the status 503 is returned:
siege -r 10 -c 20 -v http://istio-ingressgateway-istio-system.$(minishift ip).nip.io/customer
Question: Do you think the connection pool is defined per calling pod, per called pod, or globally?
The connection pool protects the backend application from too much traffic. However, the drawback is that the client application gets a lot of error messages, which might cause problems as well. Let's check whether the outlier detection setting can help us.
First you need to delete the previous virtual service and destination rule and recreate them by executing these commands:
oc patch deployment recommendation-v2 --type='json' -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/image", "value": "quay.io/omeyer/istio-tutorial:recommendationv2" }]'
oc create -f istio-workshop-part2/circuitbreaker/outlier/destinationrule.yaml
oc create -f istio-workshop-part2/circuitbreaker/outlier/virtualservice.yaml
In the next step, scale up the number of pods to 4 with this command:
oc scale deployment --replicas=4 recommendation-v2
After a moment you should have 4 pods up and running. Next, let's test whether they all return a value by running this command (the location of the script may vary):
istio-tutorial/scripts/run.sh istio-ingressgateway-istio-system.$(minishift ip).nip.io/customer
The output should contain responses from all pods. Now let's convince one pod to misbehave by executing this command:
oc exec recommendation-v2-6bd4d5775c-7plwf -- curl -v localhost:8080/misbehave
If you now execute the command:
istio-tutorial/scripts/run.sh istio-ingressgateway-istio-system.$(minishift ip).nip.io/customer
You should no longer see a response from the misbehaving pod. If you look in the istio-proxy logs of the misbehaving pod, however, you should see one log statement for a response with the status 503. Here you can find details on how to set up outlier detection: Outlier detection
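The outlier part of the destination rule looks roughly like this (a sketch; the thresholds in my destinationrule.yaml may differ):

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: recommendation
spec:
  host: recommendation
  trafficPolicy:
    outlierDetection:
      # eject a pod from the load-balancing pool after one 5xx response
      consecutiveErrors: 1
      # check every second and keep an ejected pod out for 2 minutes
      interval: 1s
      baseEjectionTime: 2m
      maxEjectionPercent: 100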
Hint: With version 1.1.9 I was not able to set up an outlier detection that uses two services in the destination rule. The misbehaving pod is called more often than configured.
You can play around with how to mix outlier detection and the connection pool.
Let's clean up the mess and scale down the application.
In the previous section we had a look at how to configure your application to use the proper routing and connection pool (a better term would be connection limit). One problem remains, however: how can you be sure that the configuration is correct? By testing it, of course ;-)
The first part is to inject random errors. See the exercise here: Fault Injection in the tutorial. However, because of the retry behavior mentioned above, we are going to use a modified version of the exercise that returns a 404 in case of a fault. So clean up the settings from before and execute the following commands:
oc create -f istio-workshop-part2/faultinjection/error/destination-rule-recommendation.yml
oc create -f istio-workshop-part2/faultinjection/error/virtual-service-recommendation-404.yml
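The virtual-service-recommendation-404.yml contains roughly something like this (a sketch; I am assuming a 50 % fault rate, the value in the actual file may differ):

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: recommendation
spec:
  hosts:
  - recommendation
  http:
  - fault:
      abort:
        # abort part of the requests with a 404 instead of a 503
        percentage:
          value: 50.0
        httpStatus: 404
    route:
    - destination:
        host: recommendation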
The documentation for the configuration can be found here: https://istio.io/docs/reference/config/networking/v1alpha3/virtual-service/#HTTPFaultInjection-Abort
To verify that an error occurred, please check the log files of the preference pod or have a look at Jaeger.
Test: Is it possible to return two different error codes? Is it possible to combine the fault setting with other settings?
It is much easier to verify whether the next test was successful, because the response from the recommendation service will be delayed.
As always, clean up the previous exercise and execute the following commands:
oc create -f istio-workshop-part2/faultinjection/timeout/destination-rule-recommendation.yml
oc create -f istio-workshop-part2/faultinjection/timeout/virtual-service-recommendation-timeout.yml
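The virtual-service-recommendation-timeout.yml contains roughly something like this (a sketch; I am assuming a 100 % injection rate with a 5 second fixed delay, which matches the siege output below):

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: recommendation
spec:
  hosts:
  - recommendation
  http:
  - fault:
      delay:
        # delay every request to recommendation by 5 seconds
        percentage:
          value: 100.0
        fixedDelay: 5s
    route:
    - destination:
        host: recommendation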
To test whether the fault injection works, we can use this siege command:
siege -r 10 -c 2 -v http://istio-ingressgateway-istio-system.$(minishift ip).nip.io/customer
The result should look like this. As you can see, every request took about 5 seconds, so our fault injection works:
** SIEGE 4.0.4
** Preparing 2 concurrent users for battle.
The server is now under siege...
HTTP/1.1 200 5.10 secs: 71 bytes ==> GET /customer
HTTP/1.1 200 5.12 secs: 71 bytes ==> GET /customer
HTTP/1.1 200 5.06 secs: 71 bytes ==> GET /customer
HTTP/1.1 200 5.06 secs: 71 bytes ==> GET /customer
HTTP/1.1 200 5.04 secs: 71 bytes ==> GET /customer
HTTP/1.1 200 5.08 secs: 71 bytes ==> GET /customer
HTTP/1.1 200 5.03 secs: 71 bytes ==> GET /customer
HTTP/1.1 200 5.06 secs: 71 bytes ==> GET /customer
HTTP/1.1 200 5.54 secs: 72 bytes ==> GET /customer
HTTP/1.1 200 5.47 secs: 72 bytes ==> GET /customer
HTTP/1.1 200 5.14 secs: 72 bytes ==> GET /customer
HTTP/1.1 200 5.15 secs: 72 bytes ==> GET /customer
HTTP/1.1 200 5.15 secs: 72 bytes ==> GET /customer
HTTP/1.1 200 5.18 secs: 72 bytes ==> GET /customer
HTTP/1.1 200 5.07 secs: 72 bytes ==> GET /customer
HTTP/1.1 200 5.10 secs: 72 bytes ==> GET /customer
HTTP/1.1 200 5.05 secs: 72 bytes ==> GET /customer
HTTP/1.1 200 5.16 secs: 72 bytes ==> GET /customer
HTTP/1.1 200 5.11 secs: 72 bytes ==> GET /customer
HTTP/1.1 200 5.06 secs: 72 bytes ==> GET /customer
Transactions: 20 hits
Availability: 100.00 %
Elapsed time: 51.40 secs
Data transferred: 0.00 MB
Response time: 5.14 secs
Transaction rate: 0.39 trans/sec
Throughput: 0.00 MB/sec
Concurrency: 2.00
Successful transactions: 20
Failed transactions: 0
Longest transaction: 5.54
Shortest transaction: 5.03
The configuration of the delay can be found here: https://istio.io/docs/reference/config/networking/v1alpha3/virtual-service/#HTTPFaultInjection-Delay
Policies and telemetry are handled by the Mixer in Istio. Details can be found here: Mixer configuration model
This is a quote from the Istio documentation regarding the communication between the Mixer and the Envoy sidecar container:
The Envoy sidecar logically calls Mixer before each request to perform precondition checks, and after each request to report telemetry. The sidecar has local caching such that a large percentage of precondition checks can be performed from cache. Additionally, the sidecar buffers outgoing telemetry such that it only calls Mixer infrequently.
The functionality of the Mixer is extended by adapters, which can be configured and extended at runtime.
Let's have a look at the components and resources that are used for telemetry and policies.
Adapters are used to abstract backend functionality from Istio. A list of supported adapters can be found here: Supported adapters.
An attribute is a small bit of data that describes a single property of a specific service request or the environment for the request. For example, an attribute can specify the size of a specific request, the response code for an operation, the IP address where a request came from, etc.
Attribute expressions allow the modification of attributes. For a list of possible attributes see here: Attribute expression
Policies and telemetry are configured by three resources: handlers, instances, and rules.
A template defines a schema for how to map attributes from the request to the adapter input. An adapter can support multiple templates. From what I understand, a template is used in an instance.
To clarify the terms, let's look at some examples.
To set up a blacklist we need to define a handler (adapter), an instance, and a rule.
For our application it looks like this:
apiVersion: "config.istio.io/v1alpha2"
kind: denier
metadata:
name: denycustomerhandler
spec:
status:
code: 7
message: Not allowed
---
apiVersion: "config.istio.io/v1alpha2"
kind: checknothing
metadata:
name: denycustomerrequests
spec:
---
apiVersion: "config.istio.io/v1alpha2"
kind: rule
metadata:
name: denycustomer
spec:
match: destination.labels["app"] == "preference" && source.labels["app"]=="customer"
actions:
- handler: denycustomerhandler.denier
instances: [ denycustomerrequests.checknothing ]
What this policy does is call the handler denier, with the template (or instance) checknothing, for every request from a source with the label app equal to customer to a destination with the label app equal to preference.
The resource denycustomerhandler is an adapter. You can find a description of it here: Denier. The template/instance is denycustomerrequests. Details of this template can be found here: checknothing.
By the way, a more readable form of the YAML would be the following, which was introduced in a later version of Istio.
apiVersion: "config.istio.io/v1alpha2"
kind: handler
metadata:
name: denycustomerhandler
spec:
compiledAdapter: denier
status:
code: 7
message: Not allowed
---
apiVersion: "config.istio.io/v1alpha2"
kind: instance
metadata:
name: denycustomerrequests
spec:
compiledTemplate: checknothing
---
apiVersion: "config.istio.io/v1alpha2"
kind: rule
metadata:
name: denycustomer
spec:
match: destination.labels["app"] == "preference" && source.labels["app"]=="customer"
actions:
- handler: denycustomerhandler
instances: [ denycustomerrequests ]
Anyhow, let's create the blacklisting with this command:
oc create -f istio-workshop-part2/policy/blacklisting/blacklisting1.yaml
To test it run this command:
istio-tutorial/scripts/run.sh istio-ingressgateway-istio-system.192.168.99.107.nip.io/customer
The result should look like this:
customer => Error: 403 - PERMISSION_DENIED:denycustomerhandler.denier.tutorial:Not allowed
customer => Error: 403 - PERMISSION_DENIED:denycustomerhandler.denier.tutorial:Not allowed
customer => Error: 403 - PERMISSION_DENIED:denycustomerhandler.denier.tutorial:Not allowed
customer => Error: 403 - PERMISSION_DENIED:denycustomerhandler.denier.tutorial:Not allowed
customer => Error: 403 - PERMISSION_DENIED:denycustomerhandler.denier.tutorial:Not allowed
The whitelisting uses a listchecker as the adapter and a listentry as the template or instance.
apiVersion: "config.istio.io/v1alpha2"
kind: listchecker
metadata:
name: preferencewhitelist
spec:
overrides: ["customer"]
blacklist: false
---
apiVersion: "config.istio.io/v1alpha2"
kind: listentry
metadata:
name: preferencesource
spec:
value: source.labels["app"]
---
apiVersion: "config.istio.io/v1alpha2"
kind: rule
metadata:
name: checkfromcustomer
spec:
match: destination.labels["app"] == "preference"
actions:
- handler: preferencewhitelist.listchecker
instances:
- preferencesource.listentry
Let's test the whitelisting:
oc create -f istio-workshop-part2/policy/whitelisting/whitelisting.yaml
To test it run this command:
istio-tutorial/scripts/run.sh istio-ingressgateway-istio-system.192.168.99.107.nip.io/customer
The result should look like this:
customer => preference => recommendation v2 from '6bd4d5775c-8ltf8': 60
customer => preference => recommendation v1 from '999958457-8rdg8': 40
customer => preference => recommendation v2 from '6bd4d5775c-8ltf8': 61
customer => preference => recommendation v1 from '999958457-8rdg8': 41
If you try to call the preference service from within the recommendation pod, you get an error:
$ oc rsh recommendation-v1-999958457-8rdg8
Defaulting container name to recommendation.
Use 'oc describe pod/recommendation-v1-999958457-8rdg8 -n tutorial' to see all of the containers in this pod.
sh-4.2$ curl customer:8080
customer => preference => recommendation v2 from '6bd4d5775c-8ltf8': 62
sh-4.2$ curl preference:8080
PERMISSION_DENIED:preferencewhitelist.listchecker.tutorial:recommendation is not whitelisted
sh-4.2$
Let's clean up the created resources with this command:
oc delete -f istio-workshop-part2/policy/whitelisting/whitelisting.yaml
The rate limit is a bit more complex. For this we need to define a policy with an instance, a handler, and a rule in the Mixer, plus a QuotaSpec and a QuotaSpecBinding on the client side in the sidecar container (Envoy proxy).
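To give you an idea of the moving parts, a rate limit definition has roughly the following shape (a sketch along the lines of the Istio rate-limiting sample; the names and values in my ratelimit.yaml may differ):

apiVersion: config.istio.io/v1alpha2
kind: memquota
metadata:
  name: handler
spec:
  quotas:
  # allow at most 5 requests per second for the requestcount quota
  - name: requestcount.quota.tutorial
    maxAmount: 5
    validDuration: 1s
---
apiVersion: config.istio.io/v1alpha2
kind: quota
metadata:
  name: requestcount
spec:
  dimensions:
    source: source.labels["app"] | "unknown"
    destination: destination.labels["app"] | "unknown"
---
apiVersion: config.istio.io/v1alpha2
kind: rule
metadata:
  name: quota
spec:
  actions:
  - handler: handler.memquota
    instances:
    - requestcount.quota
---
apiVersion: config.istio.io/v1alpha2
kind: QuotaSpec
metadata:
  name: request-count
spec:
  rules:
  - quotas:
    - charge: 1
      quota: requestcount
---
apiVersion: config.istio.io/v1alpha2
kind: QuotaSpecBinding
metadata:
  name: request-count
spec:
  quotaSpecs:
  - name: request-count
  services:
  - name: recommendation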
Let's create the rate limit and see if it works. The rate limit can be created with this command:
oc create -f istio-workshop-part2/policy/ratelimit/ratelimit.yaml
If we now run the command
istio-tutorial/scripts/run.sh istio-ingressgateway-istio-system.192.168.99.107.nip.io/customer
or
siege -r 10 -c 10 -v http://istio-ingressgateway-istio-system.$(minishift ip).nip.io/customer
you will not get an error message. This is because of the default retries of the Envoy proxy in the preference pod. Our rate limit applies only to recommendation v2 and not to recommendation v1. In the log of the istio-proxy container of the recommendation-v2-… pod, you should find a statement like this:
[2019-09-16T11:02:44.946Z] "GET / HTTP/1.1" 429 UAEX ""RESOURCE_EXHAUSTED:Quota is exhausted for: requestcount"" 0 55 0 - "-" "Java/1.8.0_191" "ef79ef60-ebdc-913e-9f5e-8fd6156a0ef5" "recommendation:8080" "-" - - 172.17.0.27:8080 172.17.0.28:50832 -
Hint: It might take some time until the rate limits are applied. Exercise: What do you need to set in order to get an error immediately?
The last part of the exercise is to delete the rate limit with this command:
oc delete -f istio-workshop-part2/policy/ratelimit/ratelimit.yaml
A detailed description of the enhanced security that Istio provides can be found here: Istio documentation security. In the workshop we are going to focus on four aspects of it.
We used policies in the previous section. One important part of security in Istio is the definition of authentication policies (which have the kind Policy). Authentication policies can be defined globally (Istio-wide), namespace-wide, or service-specific. The detailed documentation for this policy can be found here: Authentication Policy.
There are two points I'd like to point out at the beginning:
The egress functionality is done by creating ServiceEntries.
The reference documentation for the ServiceEntry can be found here: ServiceEntry
Let’s play with the egress functionality with the exercise ServiceEntry
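For reference, a ServiceEntry that allows egress traffic to an external host looks roughly like this (a sketch; httpbin.org is just a placeholder, the exercise uses its own external service):

apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: httpbin-egress
spec:
  hosts:
  - httpbin.org
  ports:
  - number: 443
    name: https
    protocol: HTTPS
  # the host is outside the mesh and resolved via DNS
  resolution: DNS
  location: MESH_EXTERNAL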
Let me quote the Istio documentation on this subject:
Transport authentication, also known as service-to-service authentication: verifies the direct client making the connection. Istio offers mutual TLS as a full stack solution for transport authentication. You can easily turn on this feature without requiring service code changes. This solution:
- Provides each service with a strong identity representing its role to enable interoperability across clusters and clouds.
- Secures service-to-service communication and end-user-to-service communication.
- Provides a key management system to automate key and certificate generation, distribution, and rotation.
You can get the complete documentation of this topic here: Istio Authentication. Also have a look at the Best practices for deployments with respect to security and certificates.
The default installation of Istio on Minishift sets the system-wide mTLS (mutual TLS) mode to permissive, which means that the sidecar containers accept both plaintext and mutual TLS traffic at the same time. For the workshop we are going to change the mTLS mode for one service.
In the first exercise we are going to use mTLS for the requests between the preference and recommendation pods. To do that, execute the following commands:
oc create -f istio-workshop-part2/security/mtls/virtualservice.yaml
oc create -f istio-workshop-part2/security/mtls/destinationrule.yaml
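The destinationrule.yaml essentially tells the client side to use Istio's mutual TLS for this host; roughly like this (a sketch, the file in my repo may contain more):

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: recommendation
spec:
  host: recommendation
  trafficPolicy:
    tls:
      # use the certificates managed by Istio (Citadel) for this connection
      mode: ISTIO_MUTUAL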
To verify that the connections are using TLS, you can either use tcpdump in one of the pods or use the graph overview in Kiali. To see the mTLS badge you need to enable the Security checkbox in the Display dropdown.
Let's lock down our project with this policy, so that every request in the project needs to use mTLS:
apiVersion: "authentication.istio.io/v1alpha1"
kind: "Policy"
metadata:
name: "default"
namespace: "tutorial"
spec:
peers:
- mtls: {}
Please execute this command:
oc create -f istio-workshop-part2/security/mtls/policy.yaml
After waiting a bit, you should see something like this when you execute a curl against the demo application:
$ istio-tutorial/scripts/run.sh istio-ingressgateway-istio-system.$(minishift ip).nip.io/customer
upstream connect error or disconnect/reset before headers. reset reason: connection terminationupstream connect error or disconnect/reset before headers. reset reason: connection terminationupstream connect error or disconnect/reset before headers. reset reason: connection terminationupstream connect error or disconnect/reset before headers. reset reason: connection terminationupstream connect error or disconnect/reset before headers. reset reason: connection terminationupstream connect error or disconnect/reset before headers. reset reason: connection terminationupstream connect error or disconnect/reset before headers. reset reason: connection terminationupstream connect error or disconnect/reset before headers. reset reason: connection terminationupstream connect error or disconnect/reset before headers.
To fix that we need to create destination rules with mTLS enabled. You can do this with the following commands:
oc create -f istio-workshop-part2/security/mtls/destinationrule_all.yaml
oc create -f istio-workshop-part2/security/mtls/virtualservice_all.yaml
After this the demo application should work again and you should see a padlock on all services in Kiali.
Before we start with the exercise, let's go back to the authentication policy: not only can you define that services must use mTLS, you can also define which authentication method should be used. In our case we will use a Keycloak server that we are going to install on our Minishift VM. The documentation of the authentication method can be found here: Origin Authentication Method.
In the previous version of the tutorial a Keycloak container was deployed and a user was defined in it. If you want to use the Keycloak server instead of the hard-coded values, you can find the example here: Authentication preparation
After this we should be able to do the exercise: End-user authentication with JWT
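For orientation, an authentication policy with a JWT origin looks roughly like this (a sketch; the issuer and jwksUri are hypothetical placeholders for your Keycloak realm):

apiVersion: authentication.istio.io/v1alpha1
kind: Policy
metadata:
  name: customer-jwt
  namespace: tutorial
spec:
  targets:
  - name: customer
  origins:
  - jwt:
      issuer: "https://keycloak.example.com/auth/realms/istio"
      jwksUri: "https://keycloak.example.com/auth/realms/istio/protocol/openid-connect/certs"
  principalBinding: USE_ORIGIN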
There is one more thing that I'd like to mention before we do the exercise: authorization. Istio has a role-based access model at namespace level, service level, and method level. IMHO the Istio documentation describes best how it works and what the purpose of ServiceRole and ServiceRoleBinding is. The documentation can be found here: Istio Authorization
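As a rough idea of how these two resources fit together: a ServiceRole grants access to a service, and a ServiceRoleBinding assigns that role to a caller (a sketch; the names and the service account are hypothetical, and RBAC additionally has to be enabled via a ClusterRbacConfig as described in the linked exercise):

apiVersion: rbac.istio.io/v1alpha1
kind: ServiceRole
metadata:
  name: preference-viewer
  namespace: tutorial
spec:
  rules:
  # allow GET requests to the preference service
  - services: ["preference.tutorial.svc.cluster.local"]
    methods: ["GET"]
---
apiVersion: rbac.istio.io/v1alpha1
kind: ServiceRoleBinding
metadata:
  name: bind-customer-to-preference
  namespace: tutorial
spec:
  subjects:
  # the identity of the customer pod (its service account)
  - user: "cluster.local/ns/tutorial/sa/customer"
  roleRef:
    kind: ServiceRole
    name: preference-viewer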
Now let's do the exercise: Istio Role Based Access Control (RBAC)
Exercise: Let's add a ServiceRole and a ServiceRoleBinding so that only the customer service is allowed to call the preference service. How can we validate that?