by Chris McGrath | Sep 19, 2019
This blog post is the 1st in a 4-part series with the goal of thoroughly explaining how Kubernetes Ingress works:
If you read to the end of this series, you'll gain a deep understanding of the following diagram, Kubernetes networking, the 7 service types, Kubernetes Ingress, and a few other fundamental concepts.
I've seen a project with 2 websites and 2 API groups: externally accessible APIs and internally accessible APIs. The External APIs were to function as publicly reachable backend points of entry for the websites and act as potentially reusable building blocks for future projects. The Internal APIs were named that way to make it obvious that it'd be dangerous to expose them externally, as they were meant to house application middleware logic and backend database logic.
The architecture evolved to a point where both the Internal and External APIs were externally exposed using Kubernetes Ingress, and firewall rules were implemented to limit access to both sets of APIs. The External APIs were put behind an API Gateway as a means of bolting on authentication functionality, and the Internal APIs were firewalled to prevent them from being externally exposed.
It's a valid solution, but I'd like to point out that the Internal APIs should have only been internally reachable, using a ClusterIP service, as this would have been both more secure and less complex. Exposing the Internal APIs over Ingress came about because the team wasn't familiar with the basics of Kubernetes; Ingress was the only way they knew to interact with things in the cluster.
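To make that concrete, here's a minimal sketch of what a ClusterIP service for one of those Internal APIs might have looked like (the name, label, and ports are hypothetical, not taken from the actual project):

apiVersion: v1
kind: Service
metadata:
  name: internal-api          # hypothetical service name
  namespace: default
spec:
  type: ClusterIP             # only reachable from inside the cluster
  selector:
    app: internal-api         # assumed label on the Internal API pods
  ports:
    - port: 80                # port other pods connect to
      targetPort: 8080        # assumed container port

Because a ClusterIP service only gets an Inner Cluster IP and DNS name (internal-api.default.svc.cluster.local), other pods can call it but nothing outside the cluster can, so no Ingress rule or firewall exception is needed for it.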
The point of the story is that trying to implement Kubernetes by relying on how-to guides can cause you to learn the tool equivalent of a hammer and then see every problem as a nail. If you take the time to deeply learn the fundamentals that advanced concepts are built upon, you'll be able to come up with multiple solutions to problems that don't have a perfect how-to guide readily available, and evaluate which solution is best for a given situation. Understanding the how and why of basic concepts improves your ability to do quick, solid evaluations of different tooling solutions, which is critical since no one has time to learn every tool in depth.
I find abstract concepts are easier to understand and follow when you can build on basic facts, tie in prior knowledge, and parallel abstract concepts with concrete examples. In this section I’ll use those techniques to help explain the following concepts:
Routers that do PAT (Port Address Translation, a type of Network Address Translation (NAT)) form a network boundary where it's easy to talk in one direction and hard to talk in the other direction.
Routers connect networks:
Switches create networks; they allow multiple computers on the same network to talk to each other.
Below is a picture of the back of a home router, which is acting as a Router by connecting the Internet network to the LAN, and acting as a Switch by connecting the 4 computers on the LAN to form a network where they can freely talk to each other.
PAT allows 2 things to happen:
Your Home Router is doing the job of several conceptual devices combined into a single unit. The pictures of the back of a home router make it clear that it's a Router and a Switch; home routers are often also DNS/DHCP servers, Wireless Access Points, and sometimes even modems rolled into a single unit. In a similar fashion, Kubernetes Nodes aren't just computers: they act like virtual routers and use PAT to form a network boundary, and they also act like virtual switches and create another network:
A single Kubernetes Cluster often belongs to a topology involving 3 levels of Network Boundaries.
Default network configuration settings make it so computers on the left side can't start conversations with computers on the right side, but they can reply to conversations started by computers on the right side. Computers on the right side are free to start a conversation with any computer on the left side.
So by default:
This offers a secure default traffic flow to start with, and allowing traffic to flow against the secure default flow requires configuration.
It's common for a single Kubernetes Cluster to have access to 3 levels of DNS (Public Internet DNS, LAN DNS, and Inner Cluster DNS). A pod has access to all 3 levels of DNS; it can connect to:
PodBash# curl nginx.default.svc.cluster.local
Note: If you run the following command:
PodBash# cat /etc/resolv.conf
nameserver 100.64.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
You'll realize that pods in the default namespace can shorten the above to:
PodBash# curl nginx
A pod in the ingress namespace could shorten the above to:
PodBash# curl nginx.default
S3 storage on the Internet: https://s3.us-east-1.amazonaws.com/bucket_name/test.png
A Management Laptop won't be able to resolve any Inner Cluster DNS names; it'll only have access to websites defined in LAN DNS and Public Internet DNS.
The Inner Cluster Network is actually a combination of 2 Networks: A Kubernetes Service Network and a Pod Network.
The Kubernetes Service Network:
The Pod Network:
At this point you've come far enough down the rabbit hole that the following statement will make more sense than if I had front-loaded it. While I try to be as accurate as possible in my explanations, the mix-and-match, rapidly evolving nature of Kubernetes makes it impossible to give an explanation that's 100% accurate for all implementations of Kubernetes. The big-picture concepts will be more or less the same, but some nitty-gritty details may vary. Please keep this in mind when you think about these explanations in the context of your environment. Also be aware that the explanations in this series of posts assume an overlay network (like that of the Canal or Cilium Container Network Interface) is used.

Kubernetes is based on the Linux Kernel, which acts as a base upon which various tools are bolted to create various Linux Distributions. Every flavor of Linux has its own quirks and things that are unique to that distro, like Alpine Linux's apk add curl, Debian's apt-get install curl, and RHEL's yum install curl. Yet at the same time they all have similarities, like Bash and supported file system types. In a similar vein there are different Kubernetes distributions and even different flavors of Ingress Controllers; in fact, all of Kubernetes is built to be modular and customizable. The kubernetes controller manager component of the masters, for example, comes in a vanilla flavor and a cloud-provider-specific flavor that knows how to interact with Cloud Provider APIs to provision things like Cloud Load Balancers. K3s, a Rancher Labs Kubernetes Distro, replaces etcd with an SQLite-based implementation.

You may be wondering, as I used to wonder: how the hell can Kubernetes be stable when there's a million different permutations? The short answer to that is API Contracts. The pluggable components that make up Kubernetes conform to standards, usually in the form of an API contract. As long as both modules satisfy the contract, you can usually swap them out and only need a little integration testing of the module against the pieces it touches, instead of having to rely on end-to-end testing of every single possible permutation.
All Kubernetes Service Types:
There are 4 normal service types:
Each of the normal service types has a Static Inner Cluster IP that's persisted in etcd, and kube-proxy/NodePort services can forward traffic directly to any of these service types.
A Kubernetes LoadBalancer Service encapsulates other service types, similar to how a Deployment object will create and encapsulate other nested object types.
After creating a deployment:
LaptopBash# kubectl run nginx --image=nginx
You can run
LaptopBash# kubectl get deploy,rs,pod
…and find a match for the object that was just created.
This is because a deployment creates and manages replicasets, and a replicaset creates and manages pods.
Similarly, when you create a service of type LoadBalancer, the created service will have the same properties that a service of type NodePort has, and when a service of type NodePort is created, it has the same properties as a service of type ClusterIP.
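Here's a hedged sketch of a LoadBalancer service that illustrates the encapsulation (the name, label, and ports are assumptions for illustration):

apiVersion: v1
kind: Service
metadata:
  name: website1              # hypothetical service name
spec:
  type: LoadBalancer          # asks the cloud provider for an external load balancer
  selector:
    app: website1             # assumed label on the website's pods
  ports:
    - port: 80                # port the load balancer and ClusterIP accept traffic on
      targetPort: 8080        # assumed container port

If you describe this service after creating it, you should also see a ClusterIP and a NodePort that Kubernetes allocated automatically; those are the encapsulated ClusterIP and NodePort layers the LoadBalancer service builds on.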
There are 3 Headless service types:
StatefulSet Headless Services will additionally generate a per-pod Inner Cluster DNS name using the convention: <statefulset name>-<#>.<service name>.<namespace>.svc.cluster.local
Headless services don't get a Static Inner Cluster IP; a side effect of this is that kube-proxy/NodePort services can't forward external traffic directly to Headless Services.
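Here's a minimal sketch of a headless service that could back a StatefulSet (the names and port are assumptions):

apiVersion: v1
kind: Service
metadata:
  name: db                    # hypothetical service name
spec:
  clusterIP: None             # "None" is what makes the service headless
  selector:
    app: db                   # assumed label on the StatefulSet's pods
  ports:
    - port: 5432              # assumed container port

A StatefulSet that references this service via spec.serviceName: db would get per-pod DNS names like db-0.db.default.svc.cluster.local, following the convention above.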
An ExternalName service is just a DNS redirector. It can redirect to any DNS name: a cluster-level, LAN-level, or Internet-level DNS name.
One use case for ExternalName services is to work around the inability to externally expose individual pods in a StatefulSet; these can't be directly exposed externally due to a limitation associated with headless services. A NodePort service can point to an ExternalName service, which can then point to the Inner Cluster DNS name of an individual pod of a StatefulSet.
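As a hedged sketch, an ExternalName service that redirects to a single pod of the hypothetical db StatefulSet from the previous example might look like this:

apiVersion: v1
kind: Service
metadata:
  name: db-0-external         # hypothetical service name
spec:
  type: ExternalName
  externalName: db-0.db.default.svc.cluster.local   # per-pod DNS name of one StatefulSet pod

Inside the cluster, looking up db-0-external returns a CNAME pointing at that per-pod name.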
A second use case is implementing an in-cluster blue/green hard cutover between 2 services of type ClusterIP. This can be useful if you have a distributed monolith (several microservices whose versions need to be tightly coupled in order to work, due to a lack of API contracts) that requires several deployments to be updated at the same time, and you want to avoid doing rolling updates of components that are not backwards compatible while live traffic could be coming in. (This also allows you to stage a production deployment and have a 2-second cutover instead of a 10+ minute upgrade window.)
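One hedged way to sketch that cutover (the service names are assumptions): give clients a stable ExternalName service and flip its target between the blue and green ClusterIP services:

apiVersion: v1
kind: Service
metadata:
  name: app                   # the stable name clients use
spec:
  type: ExternalName
  externalName: app-blue.default.svc.cluster.local   # change to app-green.default.svc.cluster.local to cut over

Applying the updated manifest (or patching spec.externalName) swaps new DNS lookups of app over to the green stack in one step, subject to DNS caching on the clients.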
A third use case is to offer a level of consistency between a lower environment and a higher environment: in the diagram below, a ClusterIP service in a Dev Environment could have the same service name as an ExternalName service in a Prod Environment. Consistency usually makes automation and configuration management much easier.
Create predictable, static Inner Cluster DNS Names that make inner cluster communication easier, and act like a highly available inner cluster load balancer.
Makes LAN to Inner Cluster Communication Possible
NodePort services open a consistent port on every node in the cluster and map traffic that comes in on that port to a service inside the cluster. (kube-proxy is what's responsible for redirecting traffic coming in on the NodePort to the service.) The NodePort service is somewhat misnamed; NodesPort would have been a better name, because when this service is created a port is, by default, randomly chosen from the range 30000-32767, and every node in the cluster starts listening on that port. If you deployed a RabbitMQ pod on your cluster, its Management Web GUI/Web API would be available to pods over <servicename>.<namespace>:15672. A management laptop on the same LAN could access the management Web GUI via <NodeIP>:<randomly generated port>, which works but isn't very convenient. Even if you updated LAN DNS to map rabbitmq.lan to the IP of every node, you'd still have to use a randomly generated port, e.g., http://rabbitmq.lan:<random port>
(Note: NodePorts are randomly chosen within the NodePort range by default, but if you want consistency/predictability it is possible to explicitly assign a Node Port in the service YAML, as sketched below. Also, pods can listen directly on ports like 80 and 443; this is an advanced scenario that doesn't use a NodePort Service and will be covered in the 3rd article.)
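A hedged sketch of explicitly pinning a NodePort for the RabbitMQ management UI described above (the service name, label, and chosen node port are assumptions):

apiVersion: v1
kind: Service
metadata:
  name: rabbitmq-mgmt         # hypothetical service name
spec:
  type: NodePort
  selector:
    app: rabbitmq             # assumed label on the RabbitMQ pod
  ports:
    - port: 15672             # inner cluster port
      targetPort: 15672       # RabbitMQ management port on the pod
      nodePort: 31672         # explicitly chosen port within the 30000-32767 range

With this in place, a management laptop on the LAN could reach the UI at <NodeIP>:31672, or at http://rabbitmq.lan:31672 if LAN DNS maps rabbitmq.lan to the node IPs.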
Makes Internet to Inner Cluster communication possible, and can also make LAN to Inner Cluster communication easier.
When a user types in www.website1.com, Internet DNS maps their request to 1.2.3.4, and http:// uses port 80. The highly available cloud load balancer at the top load balances traffic between the nodes; it remaps traffic that came in on port 80 to port 31111, a NodePort that's consistent on every node. Any node that receives traffic on that NodePort knows to forward it to website1's service using kube-proxy. A similar flow occurs for website2.
Recall that a Load Balancer Service builds on the functionality of a NodePort service, and a NodePort service builds on the functionality of a ClusterIP service.
Likewise the Ingress Controller concept builds on the LoadBalancer Service functionality. We’ll cover the Ingress concept in depth in the next article.