
Monitoring Jenkins via Prometheus

Author
Hairizuan Noorazman
Software engineering experiments, implementation notes, and lessons learned.

It is pretty important to understand how our Jenkins jobs are running. We could technically keep querying the Jenkins server via the Jenkins API, but that would mean parsing an ever-changing response, which can be quite painful. Instead, we can simply install 2 Jenkins plugins: metrics and prometheus.

I have a small docker-compose setup to demonstrate this. It brings up the following:

  • Jenkins master in a container
  • Jenkins agent in a container
  • Prometheus (to collect metrics)
  • Grafana (to visualize the data from Prometheus)

Reference to the setup is here: https://github.com/hairizuanbinnoorazman/Go_Programming/tree/master/Environment/jenkins

Important step 1: Install required plugins

There are a few critical steps for monitoring Jenkins via Prometheus. The first is to install the metrics and prometheus plugins. We could do this through the Jenkins UI, but with the setup mentioned above we can do it “automatically” by listing the plugins in the plugins.txt file in the above repo: https://github.com/hairizuanbinnoorazman/Go_Programming/blob/master/Environment/jenkins/plugins.txt

With that, the plugins are installed during the docker build step and are available on the next start of the Jenkins master and agent containers.
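
Before building the image, it can help to confirm that plugins.txt really lists both plugins. Below is a minimal sketch in Python, assuming the file holds one `plugin-id` or `plugin-id:version` entry per line with `#` comments, and that `metrics` and `prometheus` are the plugin IDs. The helper names and the sample content are hypothetical and not taken from the repo.

```python
# Assumption: these are the IDs of the two plugins this post relies on.
REQUIRED_PLUGINS = {"metrics", "prometheus"}


def parse_plugin_ids(text):
    """Return the set of plugin IDs listed in plugins.txt-style text."""
    ids = set()
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        ids.add(line.split(":", 1)[0])  # keep the ID, drop any ":version" suffix
    return ids


def missing_plugins(text, required=REQUIRED_PLUGINS):
    """Return the required plugin IDs that the text does not list, sorted."""
    return sorted(set(required) - parse_plugin_ids(text))


if __name__ == "__main__":
    # Hypothetical file content, only for illustration.
    sample = "git:latest\nmetrics\n# prometheus is not listed yet\n"
    print(missing_plugins(sample))
```

A check like this could run as a quick step before the docker build, so a missing plugin shows up before Jenkins starts without the metrics endpoint.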

Important step 2: Querying of prometheus data

The Prometheus data can be queried at the <host>:<port>/prometheus/ endpoint (it's possible to configure a different path as well). Refer to the following plugin page: https://plugins.jenkins.io/prometheus/
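
To get a feel for what that endpoint returns, here is a rough Python sketch that parses the Prometheus text format into a dictionary. It assumes Jenkins is reachable at http://localhost:8080 (an assumption, adjust to your own host and port). The sample text is hand-written for illustration and the function names are hypothetical. The parser only handles plain sample lines, which is enough for a quick look.

```python
import re
import urllib.request

# Matches sample lines such as: metric_name{label="value"} 12.5
SAMPLE_LINE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{.*\})?\s+(\S+)')


def parse_metrics(text):
    """Map 'name{labels}' to a float value for each sample line."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = SAMPLE_LINE.match(line)
        if match:
            name, labels, value = match.groups()
            samples[name + (labels or "")] = float(value)
    return samples


def fetch_metrics(url="http://localhost:8080/prometheus/"):
    """Download the raw metrics text (the URL is an assumption)."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8")


# Hand-written sample, not real output from a Jenkins server.
SAMPLE_TEXT = """\
# TYPE default_jenkins_builds_duration_milliseconds_summary summary
default_jenkins_builds_duration_milliseconds_summary_count{jenkins_job="firstjob"} 4.0
default_jenkins_builds_duration_milliseconds_summary_sum{jenkins_job="firstjob"} 12000.0
"""

if __name__ == "__main__":
    for key, value in parse_metrics(SAMPLE_TEXT).items():
        print(key, value)
```

Against a live server, `parse_metrics(fetch_metrics())` would do the same with the real response.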

For Prometheus, a static scrape configuration is enough. Since we're running the above setup via docker compose, the Jenkins master can be reached and pinged from the other containers using the jenkins hostname.

global:
  scrape_interval: 1m
  evaluation_interval: 1m

# A list of scrape configurations.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to all metrics scraped from this config.
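  - job_name: 'jenkins'
    # metrics_path is a job-level setting, so it sits next to static_configs
    # rather than inside a target entry.
    metrics_path: "/prometheus"

    static_configs:
      - targets: ['jenkins:8080']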

The Jenkins server is exposed on port 8080. The metrics are served on /prometheus instead of the usual /metrics path, which is why metrics_path has to be set explicitly.
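
Once Prometheus is running with this configuration, its HTTP API can confirm whether the jenkins job is actually being scraped. The sketch below reads the `/api/v1/targets` response and pulls out the health of each target for a job. It assumes Prometheus is reachable at http://localhost:9090 (an assumption, it depends on how the port is published). The helper names and the sample payload are made up for illustration.

```python
import json
import urllib.request


def job_health(targets_payload, job_name):
    """Return (scrape URL, health, last error) for each active target of a job."""
    active = targets_payload.get("data", {}).get("activeTargets", [])
    return [
        (t.get("scrapeUrl"), t.get("health"), t.get("lastError", ""))
        for t in active
        if t.get("labels", {}).get("job") == job_name
    ]


def fetch_targets(prometheus_url="http://localhost:9090"):
    """Fetch the target list from a running Prometheus (URL is an assumption)."""
    with urllib.request.urlopen(prometheus_url + "/api/v1/targets", timeout=10) as resp:
        return json.load(resp)


# Hand-written payload that mimics the shape of the API response.
SAMPLE_PAYLOAD = {
    "status": "success",
    "data": {
        "activeTargets": [
            {
                "labels": {"job": "jenkins", "instance": "jenkins:8080"},
                "scrapeUrl": "http://jenkins:8080/prometheus",
                "health": "up",
                "lastError": "",
            }
        ]
    },
}

if __name__ == "__main__":
    for url, health, error in job_health(SAMPLE_PAYLOAD, "jenkins"):
        print(url, health, error or "no error")
```

A health of "down" together with a non-empty last error usually points at a wrong hostname, port or metrics path.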

Technically this is enough to get started with data collection. The actual visualization of the collected metrics happens in Grafana.

Important step 3: Viewing of jenkins data on grafana

We can then hook the Grafana setup up to the Prometheus server and check that the metrics are being collected correctly. Once the metrics are collected, we can use the following dashboard to get something going: https://grafana.com/grafana/dashboards/9964-jenkins-performance-and-health-overview/

This could be a good PromQL query for getting the average duration of a job in seconds over the last 5 minutes:

increase(default_jenkins_builds_duration_milliseconds_summary_sum{jenkins_job="firstjob"}[5m])/increase(default_jenkins_builds_duration_milliseconds_summary_count{jenkins_job="firstjob"}[5m])/1000
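
The arithmetic behind that query can be sketched in Python: take how much the `_sum` and `_count` series grew over the window, divide one by the other, then convert milliseconds to seconds. This is a simplified illustration with made-up numbers. Prometheus' own `increase()` also extrapolates to the window edges and handles counter resets, which this sketch ignores, and the function name is hypothetical.

```python
def average_build_seconds(sum_start_ms, sum_end_ms, count_start, count_end):
    """Average build duration in seconds between two readings of the summary.

    Returns None when no build finished in the window (PromQL would give NaN).
    """
    delta_ms = sum_end_ms - sum_start_ms      # total build time added in the window
    delta_builds = count_end - count_start    # number of builds finished in the window
    if delta_builds <= 0:
        return None
    return delta_ms / delta_builds / 1000     # milliseconds -> seconds


if __name__ == "__main__":
    # Made-up readings: 3 builds finished and added 30000 ms in total.
    print(average_build_seconds(12000, 42000, 4, 7))
```

Dividing the two increases gives the average over only the builds that finished inside the window, rather than over the whole lifetime of the job.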
