
How to Deploy Kubernetes on AWS the Scalable Way

Kubernetes has become the de facto standard for orchestrating containerized workloads—but deploying it correctly on AWS requires more than just spinning up an EKS cluster. You need to think about scalability, cost-efficiency, security, and high availability from day one.

In this guide, we’ll walk you through how to deploy a scalable, production-grade Kubernetes environment on AWS—step by step.

Why Kubernetes on AWS?

Amazon Web Services offers powerful tools to run Kubernetes at scale, including:

  • Amazon EKS – Fully managed control plane
  • EC2 Auto Scaling Groups – Dynamic compute scaling
  • Elastic Load Balancer (ELB) – Handles incoming traffic
  • IAM Roles for Service Accounts – Fine-grained access control
  • Fargate (Optional) – Run pods without managing servers

Step-by-Step Deployment Plan

1. Plan the Architecture

Your Kubernetes architecture should be:

  • Highly Available (Multi-AZ)
  • Scalable (Auto-scaling groups)
  • Secure (Private networking, IAM roles)
  • Observable (Monitoring, logging)
+---------------------+
|   Route 53 / ALB    |
+----------+----------+
           |
   +-------v--------+
   |  EKS Control   |
   |     Plane      |  <- Managed by AWS
   +-------+--------+
           |
+----------v-----------+
|   EC2 Worker Nodes   |  <- Auto-scaling
|  (in Private Subnet) |
+----------+-----------+
           |
   +-------v--------+
   |   Kubernetes   |
   |   Workloads    |
   +----------------+

2. Provision Infrastructure with IaC (Terraform)

Use Terraform to define your VPC, subnets, security groups, and EKS cluster:

module "eks" {
  source          = "terraform-aws-modules/eks/aws"
  cluster_name    = "my-cluster"
  cluster_version = "1.29"
  subnets         = module.vpc.private_subnets
  vpc_id          = module.vpc.vpc_id
  manage_aws_auth = true

  node_groups = {
    default = {
      desired_capacity = 3
      max_capacity     = 6
      min_capacity     = 1
      instance_type    = "t3.medium"
    }
  }
}

Security Tip: Keep worker nodes in private subnets and expose only your load balancer to the public internet.

3. Set Up Cluster Autoscaler

Install the Kubernetes Cluster Autoscaler to automatically scale your EC2 nodes:

kubectl apply -f cluster-autoscaler-autodiscover.yaml

Ensure the autoscaler has IAM permissions via IRSA (IAM Roles for Service Accounts).
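With eksctl, wiring up IRSA for the autoscaler can be sketched roughly like this (the account ID and policy ARN are placeholders for your own autoscaler policy):

eksctl create iamserviceaccount \
  --cluster my-cluster \
  --namespace kube-system \
  --name cluster-autoscaler \
  --attach-policy-arn arn:aws:iam::123456789012:policy/ClusterAutoscalerPolicy \
  --approve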

4. Use Horizontal Pod Autoscaler

Use HPA to scale pods based on resource usage:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
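The HPA reads CPU utilization from the metrics API, so metrics-server must be running in the cluster. A common way to install it and then apply/verify the autoscaler (assuming you saved the manifest above as myapp-hpa.yaml) looks like:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
kubectl apply -f myapp-hpa.yaml
kubectl get hpa myapp-hpa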

5. Implement CI/CD Pipelines

Use tools like Argo CD, Flux, or GitHub Actions. For example, a GitHub Actions job can authenticate with aws-actions/configure-aws-credentials and then run kubectl (secret names, region, and manifest path are placeholders):

- name: Configure AWS credentials
  uses: aws-actions/configure-aws-credentials@v4
  with:
    aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
    aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
    aws-region: us-east-1

- name: Deploy to EKS
  run: |
    aws eks update-kubeconfig --name my-cluster
    kubectl apply -f k8s/

6. Set Up Observability

Install:

  • Prometheus + Grafana for metrics
  • Fluent Bit or Loki for logging
  • Kube-State-Metrics for cluster state
  • AWS CloudTrail and GuardDuty for security monitoring

7. Optimize Costs

  • Use Spot Instances with on-demand fallback
  • Use EC2 Mixed Instance Policies
  • Try Graviton (ARM) nodes for better cost-performance ratio
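As a rough sketch, a Spot-backed managed node group created with eksctl might look like this (instance types and sizes are illustrative):

eksctl create nodegroup \
  --cluster my-cluster \
  --name spot-workers \
  --spot \
  --instance-types t3.medium,t3a.medium \
  --nodes-min 1 \
  --nodes-max 6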

Bonus: Fargate Profiles for Microservices

For small or bursty workloads, use AWS Fargate to run pods serverlessly:

eksctl create fargateprofile \
  --cluster my-cluster \
  --name fp-default \
  --namespace default

Recap Checklist

  • Multi-AZ VPC with private subnets
  • Terraform-managed EKS cluster
  • Cluster and pod auto-scaling enabled
  • CI/CD pipeline in place
  • Observability stack (metrics/logs/security)
  • Spot instances or Fargate to save costs

Final Thoughts

Deploying Kubernetes on AWS at scale doesn’t have to be complex—but it does need a solid foundation. Use managed services where possible, automate everything, and focus on observability and security from the start.

If you’re looking for a production-grade, scalable deployment, Terraform + EKS + autoscaling is your winning combo.

Fixing Read-Only Mode on eLux Thin Clients

If your eLux device boots into a read-only filesystem or prevents saving changes, it’s usually due to the write filter or system protection settings. Here’s how to identify and fix the issue.

Common Causes

  • Write Filter is enabled (RAM overlay by default)
  • System partition is locked as part of image protection
  • Corrupted overlay from improper shutdown

Fix 1: Temporarily Remount as Read/Write

sudo mount -o remount,rw /

This allows you to make temporary changes. They will be lost after reboot unless you adjust the image or profile settings.

Fix 2: Enable Persistent Mode via the EIS Tool

  1. Open your image project in the EIS Tool
  2. Go to the Settings tab
  3. Locate the write filter or storage persistence section
  4. Set it to Persistent Storage
  5. Export the updated image and redeploy

Fix 3: Enable Persistence via Scout Configuration Profile

  1. Open Scout Enterprise Console
  2. Go to Configuration > Profiles
  3. Edit the assigned profile
  4. Enable options like:
    • Persistent user data
    • Persistent certificate storage
    • Persistent logging
  5. Save and reassign the profile

Fix 4: Reimage the Device

  • If the system is damaged or stuck in read-only permanently, use a USB stick or PXE deployment to reflash the device.
  • Ensure the new image has persistence enabled in the EIS Tool before deploying.

Check Filesystem Mount Status

mount | grep ' / '

If you see (ro) in the output, the system is in read-only mode.
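For example, output along these lines (the device name will vary) indicates a read-only root:

/dev/sda1 on / type ext4 (ro,relatime)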

Final Notes

  • eLux protects system partitions by design — use Scout and EIS Tool to make lasting changes
  • Remounting manually is fine for diagnostics but not a long-term fix
  • Always test changes on a test device before rolling out to production

Elux Image Deployment

How to Create and Deploy a Custom eLux Image at Scale

This guide is intended for Linux/VDI system administrators managing eLux thin clients across enterprise environments. It covers:

  • Part 1: Creating a fresh, customized eLux image
  • Part 2: Deploying the image at scale using Scout Enterprise

Part 1: Creating a Custom eLux Image with Tailored Settings

Step 1: Download Required Files

  1. Go to https://www.myelux.com and log in.
  2. Download the following:
    • Base OS image (e.g., elux-RP6-base.ufi)
    • Module files (.ulc) – Citrix, VMware, Firefox, etc.
    • EIS Tool (eLux Image Stick Tool) for your admin OS

Step 2: Install and Open the EIS Tool

  1. Install the EIS Tool on a Windows or Linux system.
  2. Launch the tool and click New Project.
  3. Select the downloaded .ufi base image.
  4. Name your project (e.g., elux-custom-v1) and confirm.

Step 3: Add or Remove Modules

  1. Go to the Modules tab inside the EIS Tool.
  2. Click Add and import the required .ulc files.
  3. Deselect any modules you don’t need.
  4. Click Apply to save module selections.

Step 4: Modify System Settings (Optional)

  • Set default screen resolution
  • Enable or disable write protection
  • Choose RAM overlay or persistent storage
  • Enable shell access if needed for support
  • Disable unneeded services

Step 5: Export the Image

  • To USB stick:
    Click "Write to USB Stick"
    Select your USB target drive
  • To file for network deployment:
    Click "Export Image"
    Save your customized .ufi (e.g., elux-custom-v1.ufi)

Part 2: Deploying the Custom Image at Scale Using Scout Enterprise

Step 1: Import the Image into Scout

  1. Open Scout Enterprise Console
  2. Navigate to Repository > Images
  3. Right-click → Import Image
  4. Select the .ufi file created earlier

Step 2: Create and Configure a Profile

  1. Go to Configuration > Profiles
  2. Click New Profile
  3. Configure network, session, and UI settings
  4. Save and name the profile (e.g., Citrix-Kiosk-Profile)

Step 3: Assign Image and Profile to Devices or Groups

  1. Navigate to Devices or Groups
  2. Right-click → Assign OS Image
  3. Select your custom .ufi
  4. Right-click → Assign Profile
  5. Select your configuration profile

Step 4: Deploy the Image

Option A: PXE Network Deployment

  • Enable PXE boot on client devices (via BIOS)
  • Ensure PXE services are running (Scout or custom)
  • On reboot, clients auto-deploy image and config

Option B: USB Stick Installation

  • Boot client device from prepared USB stick
  • Follow on-screen instructions to install
  • Device registers and pulls config from Scout

Step 5: Monitor Deployment

  • Use Logs > Job Queue to track installations
  • Search for devices to confirm version and status

Optional Commands

Inspect or Write Images

# Mount .ufi image (read-only)
sudo mount -o loop elux-custom.ufi /mnt/elux

# Write image to USB on Linux
sudo dd if=elux-custom.ufi of=/dev/sdX bs=4M status=progress

Manual PXE Server Setup (Linux)

sudo apt install tftpd-hpa dnsmasq

# Example dnsmasq.conf
port=0
interface=eth0
dhcp-range=192.168.1.100,192.168.1.200,12h
dhcp-boot=pxelinux.0
enable-tftp
tftp-root=/srv/tftp

sudo systemctl restart tftpd-hpa
sudo systemctl restart dnsmasq

Commands on eLux Device Shell

# Switch to shell (Ctrl+Alt+F1), then:
uname -a
df -h
scout showconfig
scout pullconfig

Summary

Task                          Tool
Build custom image            EIS Tool
Add/remove software modules   .ulc files + EIS Tool
Customize settings            EIS Tool + Scout Profile
Deploy to all clients         PXE boot or USB + Scout
Manage and monitor at scale   Scout Enterprise Console

Key Components for Setting Up an HPC Cluster

Head Node (Controller)

 Manages job scheduling and resource allocation.
 Runs Slurm Controller Daemon (`slurmctld`).

Compute Nodes

 Execute computational tasks.
 Run Slurm Node Daemon (`slurmd`).
 Configured for CPUs, GPUs, or specialized hardware.

Networking

 High-speed interconnect like Infiniband or Ethernet.
 Ensures fast communication between nodes.

Storage

 Centralized storage like NFS, Lustre, or BeeGFS.
 Provides shared file access for all nodes.

Authentication

 Use Munge for secure communication between Slurm components.

Scheduler

 Slurm for job scheduling and resource management.
 Configured with partitions and node definitions.

Resource Management

 Use cgroups to control CPU, memory, and GPU usage.
 Optional: ProctrackType=proctrack/cgroup in Slurm (see the sketch below).
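A minimal sketch of the relevant settings, assuming Slurm was built with cgroup support (slurm.conf plus cgroup.conf):

# slurm.conf
ProctrackType=proctrack/cgroup
TaskPlugin=task/cgroup

# cgroup.conf
ConstrainCores=yes
ConstrainRAMSpace=yes
ConstrainDevices=yes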

Parallel File System (Optional)

 High-performance shared storage for parallel workloads.
 Examples: Lustre, GPFS.

Interconnect Libraries

 MPI (Message Passing Interface) for distributed computing.
 Install libraries like OpenMPI or MPICH.
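As an example, a simple Slurm batch script for an MPI job could look like this (the program name and task counts are placeholders):

#!/bin/bash
#SBATCH --job-name=mpi-test
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --time=00:10:00

module load openmpi    # if your site uses environment modules
srun ./my_mpi_program  # srun launches the MPI ranks across the allocation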

Monitoring and Debugging Tools

 Tools like Prometheus, Grafana, or Ganglia for resource monitoring.
 Enable verbose logging in Slurm for debugging.

How to configure Slurm Controller Node on Ubuntu 22.04

How to setup HPC-Slurm Controller Node

Refer to Key Components for Setting Up an HPC Cluster for an overview of which pieces you need to set up.

This guide provides step-by-step instructions for setting up the Slurm controller daemon (`slurmctld`) on Ubuntu 22.04. It also includes common errors encountered during the setup process and how to resolve them.

Step 1: Install Prerequisites

To begin, install the required dependencies for Slurm and its components:

sudo apt update && sudo apt upgrade -y
sudo apt install -y munge libmunge-dev libmunge2 build-essential man-db mariadb-server mariadb-client libmariadb-dev python3 python3-pip chrony

Step 2: Configure Munge (Authentication for Slurm)

Munge is required for authentication within the Slurm cluster.

1. Generate a Munge key on the controller node:
sudo create-munge-key

2. Copy the key to all compute nodes (and fix its permissions there; see the sketch after this list):
scp /etc/munge/munge.key user@node:/etc/munge/

3. Start the Munge service:
sudo systemctl enable --now munge
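On each compute node, Munge will typically refuse to start unless the key has the right ownership and mode; a quick sketch (paths per the default Ubuntu packaging):

sudo chown munge:munge /etc/munge/munge.key
sudo chmod 400 /etc/munge/munge.key
sudo systemctl enable --now munge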

Step 3: Install Slurm

1. Download and compile Slurm:
wget https://download.schedmd.com/slurm/slurm-23.02.4.tar.bz2
tar -xvjf slurm-23.02.4.tar.bz2
cd slurm-23.02.4
./configure --prefix=/usr/local/slurm --sysconfdir=/etc/slurm
make -j$(nproc)
sudo make install

2. Add the Slurm user:
sudo useradd -m slurm

3. Create necessary directories and set permissions:
sudo mkdir -p /etc/slurm /var/spool/slurm /var/log/slurm
sudo chown slurm: /var/spool/slurm /var/log/slurm
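A source build does not install systemd units on its own; the tarball's etc/ directory contains service files that configure generates, so they usually need to be copied in before systemctl can manage the daemons (a sketch, run from the build directory above):

sudo cp etc/slurmctld.service /etc/systemd/system/
sudo cp etc/slurmd.service /etc/systemd/system/   # on compute nodes
sudo systemctl daemon-reload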

Step 4: Configure Slurm (for more complex configurations, contact Nick Tailor)

1. Generate a basic `slurm.conf` using the configurator tool at
https://slurm.schedmd.com/configurator.html. Save the configuration to `/etc/slurm/slurm.conf`.

# Basic Slurm Configuration
ClusterName=my_cluster
ControlMachine=slurmctld            # Replace with your control node's hostname
# BackupController=backup-slurmctld # Uncomment and replace if you have a backup controller

# Authentication
AuthType=auth/munge
CryptoType=crypto/munge

# Logging
SlurmdLogFile=/var/log/slurm/slurmd.log
SlurmctldLogFile=/var/log/slurm/slurmctld.log
SlurmctldDebug=info
SlurmdDebug=info

# Slurm User
SlurmUser=slurm
StateSaveLocation=/var/spool/slurm
SlurmdSpoolDir=/var/spool/slurmd

# Scheduler
SchedulerType=sched/backfill
SchedulerParameters=bf_continue

# Accounting
AccountingStorageType=accounting_storage/none
JobAcctGatherType=jobacct_gather/linux

# Compute Nodes
NodeName=node[1-2] CPUs=4 RealMemory=8192 State=UNKNOWN
PartitionName=debug Nodes=node[1-2] Default=YES MaxTime=INFINITE State=UP

2. Distribute `slurm.conf` to all compute nodes:
scp /etc/slurm/slurm.conf user@node:/etc/slurm/

3. Restart Slurm services:
sudo systemctl restart slurmctld
sudo systemctl restart slurmd

Troubleshooting Common Errors


root@slrmcltd:~# tail /var/log/slurm/slurmctld.log
[2024-12-06T11:57:25.428] error: High latency for 1000 calls to gettimeofday(): 20012 microseconds
[2024-12-06T11:57:25.431] fatal: mkdir(/var/spool/slurm): Permission denied
[2024-12-06T11:58:34.862] error: High latency for 1000 calls to gettimeofday(): 20029 microseconds
[2024-12-06T11:58:34.864] fatal: mkdir(/var/spool/slurm): Permission denied
[2024-12-06T11:59:38.843] error: High latency for 1000 calls to gettimeofday(): 18842 microseconds
[2024-12-06T11:59:38.847] fatal: mkdir(/var/spool/slurm): Permission denied

Error: Permission Denied for /var/spool/slurm

This error occurs when the `slurm` user does not have the correct permissions to access the directory.

Fix:
sudo mkdir -p /var/spool/slurm
sudo chown -R slurm: /var/spool/slurm
sudo chmod -R 755 /var/spool/slurm

Error: Temporary Failure in Name Resolution

Slurm could not resolve the hostname `slurmctld`. This can be fixed by updating `/etc/hosts`:

1. Edit `/etc/hosts` and add the following:
127.0.0.1      slurmctld
192.168.20.8   slurmctld

2. Verify the hostname matches `ControlMachine` in `/etc/slurm/slurm.conf`.

3. Restart networking and test hostname resolution:
sudo systemctl restart systemd-networkd
ping slurmctld

Error: High Latency for gettimeofday()

Dec 06 11:57:25 slrmcltd.home systemd[1]: Started Slurm controller daemon.
Dec 06 11:57:25 slrmcltd.home slurmctld[2619]: slurmctld: error: High latency for 1000 calls to gettimeofday(): 20012 microseconds
Dec 06 11:57:25 slrmcltd.home systemd[1]: slurmctld.service: Main process exited, code=exited, status=1/FAILURE
Dec 06 11:57:25 slrmcltd.home systemd[1]: slurmctld.service: Failed with result 'exit-code'.

This warning typically indicates timing issues in the system.

Fixes:

1. Install and configure `chrony` for time synchronization:
sudo apt install chrony
sudo systemctl enable --now chrony
chronyc tracking
timedatectl

2. For virtualized environments, optimize the clocksource:
echo tsc | sudo tee /sys/devices/system/clocksource/clocksource0/current_clocksource

3. Disable high-precision timing in `slurm.conf` (optional):
HighPrecisionTimer=NO
sudo systemctl restart slurmctld

Step 5: Verify and Test the Setup

1. Validate the configuration:
scontrol reconfigure

No errors means it is working. If this fails, check the connectivity between nodes and update /etc/hosts so that every machine in the cluster has all hosts listed.

2. Check node and partition status:
sinfo

root@slrmcltd:/etc/slurm# sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
debug*       up   infinite      1  idle* node1

3. Monitor logs for errors:
sudo tail -f /var/log/slurm/slurmctld.log


Written By: Nick Tailor

Mastering Podman: A Comprehensive Guide with Detailed Command Examples

Mastering Podman on Ubuntu: A Comprehensive Guide with Detailed Command Examples

Podman has become a popular alternative to Docker due to its flexibility, security, and rootless operation capabilities. This guide will walk you through the installation process and various advanced usage scenarios of Podman on Ubuntu, providing detailed examples for each command.


1. How to Install Podman

To get started with Podman on Ubuntu, follow these steps:

Update Package Index

Before installing any new software, it’s a good idea to update your package index to ensure you’re getting the latest version of Podman:


sudo apt update

Install Podman

With your package index updated, you can now install Podman. This command will download and install Podman and any necessary dependencies:


sudo apt install podman -y

Example Output:


Reading package lists… Done

Building dependency tree

Reading state information… Done

The following additional packages will be installed:

After this operation, X MB of additional disk space will be used.

Do you want to continue? [Y/n] y

Setting up podman (4.0.2) …

Verifying Installation

After installation, verify that Podman is installed correctly:

podman --version

Example Output:


podman version 4.0.2

2. How to Search for Images

Before running a container, you may need to find an appropriate image. Podman allows you to search for images in various registries.

Search Docker Hub

To search for images on Docker Hub:


podman search ubuntu

Example Output:


INDEX NAME DESCRIPTION STARS OFFICIAL AUTOMATED

docker.io docker.io/library/ubuntu Ubuntu is a Debian-based Linux operating sys… 12329 [OK]

docker.io docker.io/ubuntu-upstart Upstart is an event-based replacement for the … 108 [OK]

docker.io docker.io/tutum/ubuntu Ubuntu image with SSH access. For the root p… 39

docker.io docker.io/ansible/ubuntu14.04-ansible Ubuntu 14.04 LTS with ansible 9 [OK]

This command will return a list of Ubuntu images available in Docker Hub.

3. How to Run Rootless Containers

One of the key features of Podman is the ability to run containers without needing root privileges, enhancing security.

Running a Rootless Container

As a non-root user, you can run a container like this:

podman run --rm -it ubuntu

Example Output:


root@d2f56a8d1234:/#

This command runs an Ubuntu container in an interactive shell, without requiring root access on the host system.

Configuring Rootless Environment

Ensure your user is added to the subuid and subgid files for proper UID/GID mapping:

echo "$USER:100000:65536" | sudo tee -a /etc/subuid /etc/subgid

Example Output:


user:100000:65536

user:100000:65536

4. How to Search for Containers

Once you start using containers, you may need to find specific ones.

Listing All Containers

To list all containers (both running and stopped):


podman ps -a

Example Output:


CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES

d13c5bcf30fd docker.io/library/ubuntu:latest 3 minutes ago Exited (0) 2 minutes ago confident_mayer

Filtering Containers

You can filter containers by their status, names, or other attributes. For instance, to find running containers:

podman ps --filter status=running

Example Output:


CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES

No output indicates there are no running containers at the moment.

5. How to Add Ping to Containers

Some minimal Ubuntu images don’t come with ping installed. Here’s how to add it.

Installing Ping in an Ubuntu Container

First, start an Ubuntu container:

podman run -it --cap-add=CAP_NET_RAW ubuntu

Inside the container, install ping (part of the iputils-ping package):


apt update

apt install iputils-ping

Example Output:


Get:1 http://archive.ubuntu.com/ubuntu focal InRelease [265 kB]

Setting up iputils-ping (3:20190709-3) …

Now you can use ping within the container.

6. How to Expose Ports

Exposing ports is crucial for running services that need to be accessible from outside the container.

Exposing a Port

To expose a port, use the -p flag with the podman run command:

podman run -d -p 8080:80 ubuntu bash -c "apt update && apt install -y nginx && nginx -g 'daemon off;'"

Example Output:


54c11dff6a8d9b6f896028f2857c6d74bda60f61ff178165e041e5e2cb0c51c8

This command runs an Ubuntu container, installs Nginx, and exposes port 80 in the container as port 8080 on the host.

Exposing Multiple Ports

You can expose multiple ports by specifying additional -p flags:

podman run -d -p 8080:80 -p 443:443 ubuntu bash -c "apt update && apt install -y nginx && nginx -g 'daemon off;'"

Example Output:


b67f7d89253a4e8f0b5f64dcb9f2f1d542973fbbce73e7cdd6729b35e0d1125c

7. How to Create a Network

Creating a custom network allows you to isolate containers and manage their communication.

Creating a Network

To create a new network:


podman network create mynetwork

Example Output:


mynetwork

This command creates a new network named mynetwork.

Running a Container on a Custom Network

podman run -d --network mynetwork ubuntu bash -c "apt update && apt install -y nginx && nginx -g 'daemon off;'"

Example Output:


1e0d2fdb110c8e3b6f2f4f5462d1c9b99e9c47db2b16da6b2de1e4d9275c2a50

This container will now communicate with others on the mynetwork network.

8. How to Connect a Network Between Pods

Podman allows you to manage pods, which are groups of containers sharing the same network namespace.

Creating a Pod and Adding Containers

podman pod create --name mypod
podman run -dt --pod mypod ubuntu bash -c "apt update && apt install -y nginx && nginx -g 'daemon off;'"
podman run -dt --pod mypod ubuntu bash -c "apt update && apt install -y redis-server && redis-server"

Example Output:


f04d1c28b030f24f3f7b91f9f68d07fe1e6a2d81caeb60c356c64b3f7f7412c7

8cf540eb8e1b0566c65886c684017d5367f2a167d82d7b3b8c3496cbd763d447

4f3402b31e20a07f545dbf69cb4e1f61290591df124bdaf736de64bc3d40d4b1

Both containers now share the same network namespace and can communicate over the mypod network.

Connecting Pods to a Network

To connect a pod to an existing network:

podman pod create --network mynetwork --name mypod

Example Output:


f04d1c28b030f24f3f7b91f9f68d07fe1e6a2d81caeb60c356c64b3f7f7412c7

This pod will use the mynetwork network, allowing communication with other containers on that network.

9. How to Inspect a Network

Inspecting a network provides detailed information about the network configuration and connected containers.

Inspecting a Network

Use the podman network inspect command:


podman network inspect mynetwork

Example Output:


[
  {
    "name": "mynetwork",
    "id": "3c0d6e2eaf3c4f3b98a71c86f7b35d10b9d4f7b749b929a6d758b3f76cd1f8c6",
    "driver": "bridge",
    "network_interface": "cni-podman0",
    "created": "2024-08-12T08:45:24.903716327Z",
    "subnets": [
      {
        "subnet": "10.88.1.0/24",
        "gateway": "10.88.1.1"
      }
    ],
    "ipv6_enabled": false,
    "internal": false,
    "dns_enabled": true,
    "network_dns_servers": [
      "8.8.8.8"
    ]
  }
]

This command will display detailed JSON output, including network interfaces, IP ranges, and connected containers.

10. How to Add a Static Address

Assigning a static IP address can be necessary for consistent network configurations.

Assigning a Static IP

When running a container, you can assign it a static IP address within a custom network:

podman run -d --network mynetwork --ip 10.88.1.100 ubuntu bash -c "apt update && apt install -y nginx && nginx -g 'daemon off;'"

Example Output:


f05c2f18e41b4ef3a76a7b2349db20c10d9f2ff09f8c676eb08e9dc92f87c216

Ensure that the IP address is within the subnet range of your custom network.

11. How to Log On to a Container with Bash

Accessing a container’s shell is often necessary for debugging or managing running applications.

Starting a Container with Bash

If the container image includes bash, you can start it directly:

podman run -it ubuntu bash

Example Output:


root@e87b469f2e45:/#

Accessing a Running Container

To access an already running container:

podman exec -it <container_id> /bin/bash

Replace <container_id> with the actual ID or name of the container.

Example Output:


root@d2f56a8d1234:/#


What the Hell is Helm? (And Why You Should Care)

What the Hell is Helm?

If you’re tired of managing 10+ YAML files for every app, Helm is your new best friend. It’s basically the package manager for Kubernetes — like apt or brew — but for deploying apps in your cluster.

Instead of editing raw YAML over and over for each environment (dev, staging, prod), Helm lets you template it, inject dynamic values, and install with a single command.


Why Use Helm?

Here’s the reality:

  • You don’t want to maintain 3 sets of YAMLs for each environment.
  • You want to roll back fast if something breaks.
  • You want to reuse deployments across projects without rewriting.

Helm fixes all that. It gives you:

  • Templated YAML (no more copy-paste hell)
  • One chart, many environments
  • Version control + rollback support
  • Easy upgrades with helm upgrade
  • Access to thousands of ready-made charts from the community

Real Talk: What’s a Chart?

Think of a chart like a folder of YAML files with variables in it. You install it, pass in your config (values.yaml), and Helm renders the final manifests and applies them to your cluster.

When you install a chart, Helm creates a release — basically, a named instance of the chart running in your cluster.
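Once you have a chart directory (like the myapp one created below), you can preview exactly what Helm will render and list the releases it has created; a quick sketch:

helm template my-release ./myapp   # render the manifests locally, nothing is installed
helm list                          # show releases in the current namespace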


How to Get Started (No BS)

1. Install Helm

brew install helm       # mac  
choco install kubernetes-helm  # windows  
sudo snap install helm  # linux  

2. Create Your First Chart

helm create myapp

Boom — you now have a scaffolded chart in a folder with templates, a values file, and everything else you need.


Folder Breakdown

myapp/
├── Chart.yaml         # Metadata
├── values.yaml        # Config you can override
└── templates/         # All your actual Kubernetes YAMLs (as templates)

Example Template (deployment.yaml)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-app
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      app: {{ .Chart.Name }}
  template:
    metadata:
      labels:
        app: {{ .Chart.Name }}
    spec:
      containers:
      - name: {{ .Chart.Name }}
        image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
        ports:
        - containerPort: 80

Example values.yaml

replicaCount: 3
image:
  repository: nginx
  tag: latest

Change the values, re-deploy, and you’re done.


Deploying to Your Cluster

helm install my-release ./myapp

Upgrading later?

helm upgrade my-release ./myapp -f prod-values.yaml
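For quick one-off overrides you can skip the values file and use --set, for example:

helm upgrade my-release ./myapp --set replicaCount=5 --set image.tag=1.25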

Roll it back?

helm rollback my-release 1

Uninstall it?

helm uninstall my-release

Simple. Clean. Versioned.


Want a Database?

Don’t write your own MySQL config. Just pull it from Bitnami’s chart repo:

helm repo add bitnami https://charts.bitnami.com/bitnami
helm install my-db bitnami/mysql

Done.


Helm lets you:

  • Turn your Kubernetes YAML into reusable templates
  • Manage config per environment without duplicating files
  • Version your deployments and roll back instantly
  • Install apps like MySQL, Redis, etc., with one command

It’s the smart way to scale your Kubernetes setup without losing your mind.

How to re-enable the TempURL in the latest cPanel


As some of you have noticed, the new cPanel ships with a bunch of new default settings that nobody likes.

The FTP server is not configured out of the box.

TempURL is disabled for security reasons. Under certain conditions, a user can attack another user’s account if they access a malicious script through a mod_userdir URL.

So they removed it by default.


They did not provide instructions for people who still need it. You can easily enable it, BUT PHP won't work on the temp URL unless you do the following.

Remove the modules below; by remove, I mean you need to recompile EasyApache 4 with the following changes.

mod_ruid2
mod_passenger
mod_mpm_itk
mod_proxy_fcgi
mod_fcgid

Install:

mod_suexec
mod_suphp

Then go into the Apache mod_userdir Tweak, enable it, and exclude the default host only.
It won't save the setting in the portal, but the configuration is updated. If you go back and look, it will appear as though the settings didn't take. This looks like a bug in cPanel's front end that they need to fix.

Then PHP will work again on the temp URL.


How to integrate VROPS with Ansible

Automating VMware vRealize Operations (vROps) with Ansible

In the world of IT operations, automation is the key to efficiency and consistency. VMware’s vRealize Operations (vROps) provides powerful monitoring and management capabilities for virtualized environments. Integrating vROps with Ansible, an open-source automation tool, can take your infrastructure management to the next level. In this blog post, we’ll explore how to achieve this integration and demonstrate its benefits with a practical example.

What is vRealize Operations (vROps)?

vRealize Operations (vROps) is a comprehensive monitoring and analytics solution from VMware. It helps IT administrators manage the performance, capacity, and overall health of their virtual environments. Key features of vROps include:

 Performance Monitoring: Continuous tracking of VMs, hosts, and other resources.
 Capacity Management: Planning and optimizing resource usage.
 Troubleshooting: Identifying and resolving issues promptly.
 Automated Actions: Responding to specific events with predefined actions.

Why Integrate vROps with Ansible?

Integrating vROps with Ansible allows you to automate routine tasks, enforce consistent configurations, and rapidly respond to changes or issues in your virtual environment. This integration enables you to:

 Automate Monitoring Setup: Configure monitoring for new virtual machines or environments automatically.
 Trigger Remediation Actions: Automate responses to alerts generated by vROps.
 Generate Reports: Automate the creation and distribution of performance and capacity reports.
 Maintain Configuration Compliance: Ensure consistent vROps configurations across environments.

Setting Up the Integration

Prerequisites

Before you start, ensure you have:

1. vROps Environment: A running instance of VMware vRealize Operations.
2. Ansible Installed: Ansible should be installed on your control node.

Step-by-Step Guide

Step 1: Configure API Access in vROps

First, ensure you have the necessary API access in vROps. You’ll need:

 vROps Host: The URL of your vROps instance.
 vROps Username: A user with API access permissions.
 vROps Password: The password for the above user.

Step 2: Install Ansible

If you haven’t installed Ansible yet, you can do so by following these commands:


sudo apt update

sudo apt install ansible

Step 3: Create an Ansible Playbook

Create an Ansible playbook to interact with vROps. Below is an example playbook that retrieves the status of vROps resources.

Note: to use the other API endpoints, you will need to acquire the auth token and set it as a fact to pass to later tasks.

Example

If you want to acquire the auth token:

- name: Authenticate with vROps and Check vROps Status
  hosts: localhost
  vars:
    vrops_host: "your-vrops-host"
    vrops_username: "your-username"
    vrops_password: "your-password"
  tasks:
    - name: Authenticate with vROps
      uri:
        url: "https://{{ vrops_host }}/suite-api/api/auth/token/acquire"
        method: POST
        body_format: json
        body:
          username: "{{ vrops_username }}"
          password: "{{ vrops_password }}"
        headers:
          Content-Type: "application/json"
        validate_certs: no
      register: auth_response

    - name: Fail if authentication failed
      fail:
        msg: "Authentication with vROps failed: {{ auth_response.json }}"
      when: auth_response.status != 200

    - name: Set auth token as fact
      set_fact:
        auth_token: "{{ auth_response.json.token }}"

    - name: Get vROps status
      uri:
        url: "https://{{ vrops_host }}/suite-api/api/resources"
        method: GET
        headers:
          Authorization: "vRealizeOpsToken {{ auth_token }}"
          Content-Type: "application/json"
        validate_certs: no
      register: vrops_response

    - name: Display vROps status
      debug:
        msg: "vROps response: {{ vrops_response.json }}"

Save this playbook to a file, for example, check_vrops_status.yml.

Step 4: Define Variables

Create a variables file to store your vROps credentials and host information.
Save it as vars.yml:

vrops_host: your-vrops-host
vrops_username: your-username
vrops_password: your-password

Step 5: Run the Playbook

Execute the playbook using the following command:


ansible-playbook -e @vars.yml check_vrops_status.yml

The above command runs the playbook and retrieves the status of vROps resources, displaying the results if you used the example above.

Here are some of the key API functions you can use:

Authentication: to use the endpoints listed below, you will first need to acquire the auth token and set it as a fact so it can be passed to the other tasks in Ansible that call the various endpoints.

 Login: Authenticate and get a session token (see the curl sketch after this list).
 Endpoint: POST /suite-api/api/auth/token/acquire
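For reference, acquiring a token outside of Ansible can be sketched with curl along these lines (host and credentials are placeholders; -k skips certificate validation, as in the playbook above):

curl -k -X POST "https://your-vrops-host/suite-api/api/auth/token/acquire" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -d '{"username": "your-username", "password": "your-password"}'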

Resource Management

 Get Resources: Retrieve a list of resources managed by vROps.
 Endpoint: GET /suite-api/api/resources
 Get Resource by ID: Retrieve details of a specific resource.
 Endpoint: GET /suite-api/api/resources/{resourceId}
 Create Resource: Add a new resource to vROps.
 Endpoint: POST /suite-api/api/resources
 Update Resource: Update information for an existing resource.
 Endpoint: PUT /suite-api/api/resources/{resourceId}
 Delete Resource: Remove a resource from vROps.
 Endpoint: DELETE /suite-api/api/resources/{resourceId}

Metrics and Data

 Get Metrics for a Resource: Retrieve metrics for a specific resource.
 Endpoint: GET /suite-api/api/resources/{resourceId}/stats
 Get Metric Definitions: List available metrics for a resource kind.
 Endpoint: GET /suite-api/api/resources/kind/{resourceKindKey}/statkeys
 Get Historical Metrics: Retrieve historical metric data for a resource.
 Endpoint: GET /suite-api/api/resources/{resourceId}/stats/historical

Alerts and Notifications

 Get Alerts: Retrieve a list of alerts.
 Endpoint: GET /suite-api/api/alerts
 Get Alert by ID: Retrieve details of a specific alert.
 Endpoint: GET /suite-api/api/alerts/{alertId}
 Acknowledge Alert: Acknowledge a specific alert.
 Endpoint: POST /suite-api/api/alerts/{alertId}/acknowledge
 Cancel Alert: Cancel a specific alert.
 Endpoint: POST /suite-api/api/alerts/{alertId}/cancel
 Generate Notifications: Send notifications based on specific conditions.
 Endpoint: POST /suite-api/api/notifications

Policies and Configurations

 Get Policies: Retrieve a list of policies.
 Endpoint: GET /suite-api/api/policies
 Get Policy by ID: Retrieve details of a specific policy.
 Endpoint: GET /suite-api/api/policies/{policyId}
 Create Policy: Add a new policy.
 Endpoint: POST /suite-api/api/policies
 Update Policy: Update an existing policy.
 Endpoint: PUT /suite-api/api/policies/{policyId}
 Delete Policy: Remove a policy.
 Endpoint: DELETE /suite-api/api/policies/{policyId}

Dashboards and Reports

 Get Dashboards: Retrieve a list of dashboards.
 Endpoint: GET /suite-api/api/dashboards
 Get Dashboard by ID: Retrieve details of a specific dashboard.
 Endpoint: GET /suite-api/api/dashboards/{dashboardId}
 Create Dashboard: Add a new dashboard.
 Endpoint: POST /suite-api/api/dashboards
 Update Dashboard: Update an existing dashboard.
 Endpoint: PUT /suite-api/api/dashboards/{dashboardId}
 Delete Dashboard: Remove a dashboard.
 Endpoint: DELETE /suite-api/api/dashboards/{dashboardId}
 Get Reports: Retrieve a list of reports.
 Endpoint: GET /suite-api/api/reports
 Generate Report: Generate a new report based on a template.
 Endpoint: POST /suite-api/api/reports/{reportTemplateId}/generate
 Get Report by ID: Retrieve details of a specific report.
 Endpoint: GET /suite-api/api/reports/{reportId}

Capacity and Utilization

 Get Capacity Remaining: Retrieve remaining capacity for a specific resource.
 Endpoint: GET /suite-api/api/resources/{resourceId}/capacity/remaining
 Get Capacity Usage: Retrieve capacity usage for a specific resource.
 Endpoint: GET /suite-api/api/resources/{resourceId}/capacity/usage

Additional Functionalities

 Get Custom Groups: Retrieve a list of custom groups.
 Endpoint: GET /suite-api/api/groups
 Create Custom Group: Add a new custom group.
 Endpoint: POST /suite-api/api/groups
 Update Custom Group: Update an existing custom group.
 Endpoint: PUT /suite-api/api/groups/{groupId}
 Delete Custom Group: Remove a custom group.
 Endpoint: DELETE /suite-api/api/groups/{groupId}
 Get Recommendations: Retrieve a list of recommendations.
 Endpoint: GET /suite-api/api/recommendations
 Get Recommendation by ID: Retrieve details of a specific recommendation.
 Endpoint: GET /suite-api/api/recommendations/{recommendationId}

These are just a few examples of the many functions available through the vROps REST API.


More Cheat Sheet for DevOps Engineers

This guide is focused entirely on the most commonly used Kubernetes YAML examples and why you’d use them in a production or staging environment. These YAML definitions act as the foundation for automating, scaling, and managing containerized workloads.


1. Pod YAML (Basic Unit of Execution)

Use this when you want to run a single container on the cluster.

apiVersion: v1
kind: Pod
metadata:
  name: simple-pod
spec:
  containers:
  - name: nginx
    image: nginx

This is the most basic unit in Kubernetes. Ideal for testing and debugging.
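Assuming the manifest above is saved as simple-pod.yaml, a typical apply-and-inspect loop looks like this:

kubectl apply -f simple-pod.yaml
kubectl get pod simple-pod
kubectl logs simple-pod
kubectl delete pod simple-pod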


2. Deployment YAML (For Scaling and Updates)

Use deployments to manage stateless apps with rolling updates and replicas.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: nginx
        image: nginx:1.21

3. Production-Ready Deployment Example

Use this to deploy a resilient application with health checks and resource limits.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: production-app
  labels:
    app: myapp
spec:
  replicas: 4
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp-container
        image: myorg/myapp:2.1.0
        ports:
        - containerPort: 80
        livenessProbe:
          httpGet:
            path: /healthz
            port: 80
          initialDelaySeconds: 15
          periodSeconds: 20
        readinessProbe:
          httpGet:
            path: /ready
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 10
        resources:
          requests:
            cpu: "250m"
            memory: "512Mi"
          limits:
            cpu: "500m"
            memory: "1Gi"

4. Service YAML (Stable Networking Access)

apiVersion: v1
kind: Service
metadata:
  name: web-service
spec:
  selector:
    app: web
  ports:
  - port: 80
    targetPort: 80
  type: ClusterIP

5. ConfigMap YAML (External Configuration)

apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  LOG_LEVEL: "debug"
  FEATURE_FLAG: "true"
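A ConfigMap does nothing until a workload consumes it; here is a minimal sketch of loading the keys above as environment variables in a pod spec (the pod name, container name, and image are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: config-demo
spec:
  containers:
  - name: app
    image: nginx
    envFrom:
    - configMapRef:
        name: app-config   # the ConfigMap defined above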

6. Secret YAML (Sensitive Information)

apiVersion: v1
kind: Secret
metadata:
  name: app-secret
stringData:
  password: supersecret123

7. PersistentVolumeClaim YAML (For Storage)

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

8. Job YAML (Run Once Tasks)

apiVersion: batch/v1
kind: Job
metadata:
  name: hello-job
spec:
  template:
    spec:
      containers:
      - name: hello
        image: busybox
        command: ["echo", "Hello World"]
      restartPolicy: Never

9. CronJob YAML (Recurring Tasks)

apiVersion: batch/v1
kind: CronJob
metadata:
  name: scheduled-task
spec:
  schedule: "*/5 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: task
            image: busybox
            args: ["/bin/sh", "-c", "echo Scheduled Job"]
          restartPolicy: OnFailure

10. Ingress YAML (Routing External Traffic)

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
  - host: myapp.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web-service
            port:
              number: 80

11. NetworkPolicy YAML (Security Control)

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-nginx
spec:
  podSelector:
    matchLabels:
      app: nginx
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend