Date: November 12, 2024

Deploying Lustre File System with RDMA, Node Maps, and ACLs

Lustre is the de facto standard parallel file system for high-performance computing (HPC) clusters, providing extreme scalability, high throughput, and low-latency access across thousands of nodes. This guide walks through a complete deployment of Lustre using RDMA over InfiniBand for performance, along with Node Maps for client access control and ACLs for fine-grained permissions.


1. Understanding the Lustre Architecture

Lustre separates metadata and data services into distinct roles:

  • MGS (Management Server) – Manages Lustre configuration and coordinates cluster services.
  • MDT (Metadata Target) – Stores file system metadata (names, permissions, directories).
  • OST (Object Storage Target) – Stores file data blocks.
  • Clients – Mount and access the Lustre file system for I/O.

The typical architecture looks like this:

+-------------+        +-------------+
|   Client 1  |        |   Client 2  |
| /mnt/lustre |        | /mnt/lustre |
+------+------+        +------+------+
       |                       |
       +------o2ib RDMA--------+
                |
        +-------+-------+
        |    OSS/OST    |
        |   (Data I/O)  |
        +-------+-------+
                |
        +-------+-------+
        |    MGS/MDT    |
        |  (Metadata)   |
        +---------------+

2. Prerequisites and Environment

Component        Requirement
---------------  ----------------------------------------------------------
OS               RHEL / Rocky / AlmaLinux 8.x or higher
Kernel           Built with Lustre and OFED RDMA modules
Network          InfiniBand fabric (Mellanox or compatible)
Lustre version   2.14 or later
Devices          Separate block devices for the MDT and OST(s); a mount point on each client
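
Before installing anything, it is worth sanity-checking these prerequisites. A minimal sketch; the exact module and device names will vary with your OFED build and hardware:

uname -r                                  # kernel the Lustre modules must match
grep PRETTY_NAME /etc/os-release          # RHEL / Rocky / AlmaLinux 8.x or higher
lsmod | grep -E 'mlx|ib_core'             # RDMA / InfiniBand drivers loaded
lspci | grep -iE 'infiniband|mellanox'    # IB HCA present
lsblk                                     # spare block devices for MDT and OST(s)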

3. Install Lustre Packages

On MGS, MDT, and OSS Nodes:

dnf install -y lustre kmod-lustre lustre-osd-ldiskfs

On Client Nodes:

dnf install -y lustre-client kmod-lustre-client
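
If dnf cannot resolve these packages, a Lustre repository usually needs to be defined first. A sketch of /etc/yum.repos.d/lustre.repo using the Whamcloud community downloads; the baseurl paths are placeholders, so substitute the Lustre release and EL minor version that match your environment:

[lustre-server]
name=Lustre server packages
baseurl=https://downloads.whamcloud.com/public/lustre/<lustre-release>/<el-version>/server/
enabled=1
gpgcheck=0

[lustre-client]
name=Lustre client packages
baseurl=https://downloads.whamcloud.com/public/lustre/<lustre-release>/<el-version>/client/
enabled=1
gpgcheck=0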

4. Configure InfiniBand and RDMA (o2ib)

InfiniBand provides the lowest latency for Lustre communication via RDMA. Configure the o2ib network type for Lustre.

1. Install and verify InfiniBand stack

dnf install -y rdma-core infiniband-diags perftest libibverbs-utils
systemctl enable --now rdma
ibstat

2. Configure IB network

nmcli con add type infiniband ifname ib0 con-name ib0 ip4 10.0.0.1/24
nmcli con up ib0
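
Repeat the same pattern on every node on the fabric; for example, assuming the second host takes 10.0.0.2/24:

# On the peer node
nmcli con add type infiniband ifname ib0 con-name ib0 ip4 10.0.0.2/24
nmcli con up ib0

# Confirm IP reachability over the IB interface
ping -c 3 -I ib0 10.0.0.1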

3. Verify RDMA link

ibv_devinfo
ibv_rc_pingpong -d mlx5_0
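
For a raw RDMA bandwidth number before layering Lustre on top, the perftest tools installed earlier can be used. A minimal sketch, assuming the mlx5_0 device and the 10.0.0.x addresses from the previous step:

# On one node (e.g. the OSS), start the listener
ib_write_bw -d mlx5_0

# On a second node, point at the listener's IB address
ib_write_bw -d mlx5_0 10.0.0.1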

4. Configure LNET for o2ib

Create /etc/modprobe.d/lustre.conf with:

options lnet networks="o2ib(ib0)"
modprobe lnet
lnetctl lnet configure
lnetctl net add --net o2ib --if ib0
lnetctl net show

Expected output:

net:
  - net type: o2ib
    interfaces:
      0: ib0
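
To keep the dynamically applied lnetctl configuration across reboots (in addition to the modprobe.d entry above), the running state can be exported and loaded at boot. A sketch, assuming the lnet systemd unit shipped with the Lustre packages reads /etc/lnet.conf:

lnetctl export > /etc/lnet.conf
systemctl enable --now lnet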

5. Format and Mount Lustre Targets

Metadata Server (MGS + MDT)

mkfs.lustre --fsname=lustrefs --mgs --mdt --index=0 /dev/sdb
mkdir -p /mnt/mdt
mount -t lustre /dev/sdb /mnt/mdt

Object Storage Server (OSS)

mkfs.lustre --fsname=lustrefs --ost --index=0 --mgsnode=<MGS>@o2ib /dev/sdc
mkdir -p /mnt/ost
mount -t lustre /dev/sdc /mnt/ost
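
To bring the targets back automatically after a reboot, the server mounts can also go into /etc/fstab. A sketch using the device and mount-point names above; _netdev delays mounting until the network (and therefore LNET) is available:

# On the MGS/MDT node
/dev/sdb   /mnt/mdt   lustre   defaults,_netdev   0 0

# On the OSS node
/dev/sdc   /mnt/ost   lustre   defaults,_netdev   0 0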

Client Node

mkdir -p /mnt/lustre
mount -t lustre <MGS>@o2ib:/lustrefs /mnt/lustre

For example, with an MGS NID of 172.16.0.10@o2ib:

mount -t lustre \
  172.16.0.10@o2ib:/lustrefs \
  /mnt/lustre

Example without an InfiniBand fabric (LNET over TCP), using a file system named lustre:
[root@vbox ~]# mount -t lustre 172.16.0.10@tcp:/lustre /mnt/lustre-client
[root@vbox ~]# 
[root@vbox ~]# # Verify the mount worked
[root@vbox ~]# df -h /mnt/lustre-client
Filesystem                Size  Used Avail Use% Mounted on
172.16.0.10@tcp:/lustre   12G  2.5M   11G   1% /mnt/lustre-client
[root@vbox ~]# lfs df -h
UUID                       bytes        Used   Available Use% Mounted on
lustre-MDT0000_UUID         4.5G        1.9M        4.1G   1% /mnt/lustre-client[MDT:0]
lustre-OST0000_UUID         7.5G        1.2M        7.0G   1% /mnt/lustre-client[OST:0]
lustre-OST0001_UUID         3.9G        1.2M        3.7G   1% /mnt/lustre-client[OST:1]
filesystem_summary:        11.4G        2.4M       10.7G   1% /mnt/lustre-client
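
The client mount can be persisted the same way. A sketch for the TCP example above; for the RDMA setup, substitute <MGS>@o2ib:/lustrefs and /mnt/lustre:

# /etc/fstab on the client
172.16.0.10@tcp:/lustre   /mnt/lustre-client   lustre   defaults,_netdev   0 0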

6. Configuring Node Maps (Access Control)

Node maps allow administrators to restrict and map Lustre client access based on NID (network identity). All of the nodemap commands below are run on the MGS.

1. View current node maps

lctl nodemap_list

2. Create a new node map for trusted clients

lctl nodemap_add trusted_clients

3. Add an allowed NID range

lctl nodemap_add_range --name trusted_clients --range 10.0.0.[1-254]@o2ib

4. Grant the map admin and trusted status, then activate nodemap enforcement

lctl nodemap_modify --name trusted_clients --property admin --value 1
lctl nodemap_modify --name trusted_clients --property trusted --value 1
lctl nodemap_activate 1

5. Restrict the default map (Lustre 2.9 and later)

lctl nodemap_modify --name default --property deny_unknown --value 1

With the nodemap feature active, clients whose NIDs fall within 10.0.0.[1-254]@o2ib keep their identities, while everything else lands in the default map, where unknown users are squashed and, with deny_unknown set, denied access to the file system.
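
To confirm the configuration on the MGS, the nodemap state can be read back with lctl get_param. A sketch; the exact parameter names can differ slightly between Lustre releases:

# Is the nodemap feature globally active?
lctl get_param nodemap.active

# Properties and NID ranges recorded for the new map
lctl get_param nodemap.trusted_clients.*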


7. Configuring Access Control Lists (ACLs)

Lustre supports standard POSIX ACLs for fine-grained directory and file permissions.

1. Enable ACL support

ACL support is a property of the MDS rather than of the client mount, and on current Lustre releases it is typically enabled by default. To enable it explicitly, mount the MDT with the acl option (or format it with --mountfsoptions=acl):

mount -t lustre -o acl /dev/sdb /mnt/mdt

2. Verify ACL support

On a client, check that the MDC connection advertises ACL support:

lctl get_param -n mdc.*.connect_flags | grep acl

The output should include acl.

3. Set ACLs on directories

setfacl -m u:researcher:rwx /mnt/lustre/projects
setfacl -m g:analysts:rx /mnt/lustre/reports
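
Because standard POSIX ACL semantics apply, default ACLs can also be set so newly created files and subdirectories inherit these permissions automatically. For example:

# New entries under projects/ inherit researcher's rwx access
setfacl -d -m u:researcher:rwx /mnt/lustre/projects
setfacl -d -m g:analysts:rx /mnt/lustre/reports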

4. View ACLs

getfacl /mnt/lustre/projects

Sample output:

# file: projects
# owner: root
# group: root
user::rwx
user:researcher:rwx
group::r-x
group:analysts:r-x
mask::rwx
other::---

8. Verifying Cluster Health

On all nodes:

lctl ping <MGS>@o2ib
lctl dl
lctl list_nids
lnetctl net show

Check LNET traffic statistics (for raw RDMA bandwidth, use the perftest tools from Section 4):

lnetctl stats show

Check file system mount from client:

df -h /mnt/lustre

Optional: Check node map enforcement

Try mounting from an IP outside the allowed range; access should be denied:

mount -t lustre <MGS>@o2ib:/lustrefs /mnt/test
mount.lustre: mount <MGS>@o2ib:/lustrefs at /mnt/test failed: Permission denied

9. Common Issues and Troubleshooting

  • Mount failed: no route to host – IB subnet mismatch or LNET not configured. Verify lnetctl net show and ping -I ib0 between nodes.
  • Permission denied – Node map restriction active. Check lctl nodemap_list and make sure the client's NID range is allowed.
  • Slow performance – RDMA disabled or falling back to TCP. Verify that lctl list_nids shows the @o2ib transport.

10. Final Validation Checklist

  • InfiniBand RDMA verified with ibv_rc_pingpong
  • LNET configured for o2ib(ib0)
  • MGS, MDT, and OST mounted successfully
  • Clients connected via @o2ib
  • Node maps restricting unauthorized hosts
  • ACLs correctly enforcing directory-level access

Summary

With RDMA transport, Lustre achieves near line-rate performance while node maps and ACLs enforce robust security and access control. This combination provides a scalable, high-performance, and policy-driven storage environment ideal for AI, HPC, and research workloads.
