The leader election mechanism is a somewhat complex thing to kind of code up for an application. There are various Golang libraries that assist with this but it would be nicer if there were mechanisms within the environment that the application operate in which can help with this. In the case for the Kubernetes ecosystem - we can actual rely on the fact of how Kubernetes would usually etcd that does this leader election dance on our behalf. If we can tap on this mechanism, we can avoid introducing this mess of a complexity within our application.
The whole process of profiling an application is an attempt to identify hotspots within the application which consumes more resources or takes too much time - knowing this would allow us to identify how to further improve the code within the applications that we build in order to build applications that consume less resources or would respond better to external inputs. Profiling of an application is just another aspect to improve observability of application’s performance on top of the common usual tooling such as distributed traces, metrics and logs. Tools such as distributed traces, metrics and logs only can capture part of the picture of how an application performs within an environment but is different for profiling. Profiling would point out what is happening “internally” within the application such as amount of memory being allocated for particular functions, how much CPU time is being taken for a particular function, thereby providing even more visiblity to how the application works.
While playing around with container technologies such as docker and kubernetes, one critical component that kind of comes up over and over again is the whole portion about managing network connections to the containers. If we are to just take an example of Kubernetes - the networking stack is handled by technologies that would interface with CNI as well kube proxy. In this post, we’ll be focusing on the linux feature that kube proxy kind of rely on (one of the modes that it runs on) which is IP Tables.
There is an old adage from security land that we should restrict access to resources/assets as much as we can. Users and applications should only access items that they need to operate themselves. Following this line of thought, that would mean that if we are to deploy application in a Kubernetes Cluster, we should ensure that pods should only accept communication that they’ve explicitly declared as “required”. Is there a way to do so?
This blog post is kind of a blog post that provide some notes of some experimentation that I encountered while playing with Google Cloud Platform. The purpose of this experimentation was to do the following:
There is a trend of images that follow the philosophy of minimizing the size of image by removing almost everything out of image. This helps with getting image downloaded more quickly by kubelet into the nodes as well as possibly reducing the attack surface of the container even further (I suppose it’s harder to do things in a container if utilities like shell or bash don’t exist within it). You would probably see errors such as this for those containers that have somewhat remove the shell/bash:
While dealing with branded links during my course of work, I kind of wondered how it can be tackled if I were to do it in a Google Kubernetes Engine Cluster. The situation I would imagine that would need to solve is this:
This is a quick sample tool to retrieve bus arrivals in Singapore. In order to use it, we would need to find for the Bus Stop ID or Bus Stop Code from where we’re taking the bus from. After keying it, it would fetch the records from LTA Datamall’s real time bus arrival API and present those records in this tool.
Database migration is kind of a critical bit when it comes to running and operating applications. In Golang, it is kind of appealing to rely on ORM (Object Relational Mapping) libraries. It allows one to kind of map structs to tabular structures within the database storage. One such example of an ORM library that I’ve found on the first page of Google is GORM.
There was a change to the Google Slides API that resulted in an inability to upload images from Google Drive into Google Slides programmatically. Refer to the following issue on the rgoogleslides github repo - https://github.com/hairizuanbinnoorazman/rgoogleslides/issues/28.
NOTE: BEFORE READING THIS - ALL SCREENSHOTS BELOW ARE TAKEN SOMETIME IN OCTOBER 2021. THE UI MAY CHANGE IN THE FUTURE - USE THIS AS A ROUGH GUIDE AND NOT AS ABSOLUTE TRUTH
In a previous post, it details some information of how to setup some open source tooling to capture logs, retrieve metrics as well as capture distributed trace information from apps. The previous blog post would cover the setup of logging system which is Loki, distributed tracing system which is Tempo and metrics collection system which is Prometheus. Refer to the link below here.
Generally, most cloud providers come along with all the observability tooling that you need for your apps built-in with the platform. Some of the common observability tools such as logging, monitoring and nowadays, distributed tracing are usually made available and you can easily use said tools by reading up on the various documentation of how to setup each of these tooling. E.g. if your application is inside a virtual machine and if you need collect metrics and logs from the application, you may need to install an agent in the said VM. The agent would collect those information and send it to the centralized observability tooling in the cloud provider where the information would be provided to you via a UI. Most of the time, these tools are charged based on the amount of logs/metrics you generate from the application (so the less logs/metrics you generate, the cheaper it is monitor your application - a very understanable/reasonable situation). In cases where if your application runs in Kubernetes, maybe the cluster comes with agents pre-installed, making it easier to make use of the logging/metrics/distributed tracing that the cloud provider has.
As of now, one of the common and easier way to have services communicate with each other would be over HTTP. In real world use cases, HTTPS is usually used (in order to ensure communications are secure) and this communication is done following some sort of REST framework. This provides some sort of structure of how to standardize such communications for the various software applications out there. It got to the point where entire companies are developing in order to support this: e.g. Apigee, SmartBear
This is definitely not an exhaustive list of items to consider but definitely some of the more obvious features that client side users would look out for and consider when attempting to install such third party apps and operate it on their infrastructure.
NOTE: As software advances, some of the commands shown below may become depreciated/irrelevant. If one encounters errors - check the output logs to see what the issue is (e.g. missing library? missing dependency? wrong folder structure due to being unable to find a file)
This are some notes in the case where one wants to deploy a bunch of python “microservices” to a Google Kubernetes Engine cluster. These notes emphasize on the basics rather than the various nuances of running a “production” grade python application.
Sometime earlier this year (2021), Google Cloud Run started to support websocket support - which is one of the critical components in order to be able to run a R Shiny Dashboard application.
Let’s say we have a set of applications that was designed to be a set of microservices. Each of the applications would generally be designed to be focused on one specific domain and in order to achieve the overall goal of the platform. However,for the platform to work properly, the applications would generally need to work together as one which would involve the application contacting each other.
This blog post is still being updated
Various cloud providers started offering serverless containers as a service. This is a service where developers can just create a container and then, pass that container over to the cloud provider and then forget about it. The cloud provider would deal with the scaling, provisioning of resources to host the applications, deployment, monitoring etc.