
Blog

Welcome to the LaFabrique.AI blog. Here you'll find articles about artificial intelligence, Kubernetes, DevOps, and technology insights.

My AI Coworker Ships Fast, Breaks Things, and Never Takes a Coffee Break

Three projects. Thirty days. Three AI approaches. This is not a tutorial.


The Setup

Over the past month, I ran three very different projects with AI as my primary collaborator (OpenClaw + Claude Opus 4.6). Each project used a different approach: AI as refactoring assistant, AI as autonomous builder, AI as research partner. Not a tool I prompted occasionally. A coworker I paired with for hours.

The projects:

  1. GIA — refactoring and translating an existing app. AI as assistant.
  2. Chat my Resume — building a chatbot from scratch. AI as autonomous builder.
  3. LLM Bench Lab — benchmarking GPUs and writing a technical blog. AI as research partner.

Three approaches. Three very different results. None of them went smoothly.

Blackwell GPUs for Local LLMs: RTX PRO 6000 vs RTX 5070 Ti

The Benchmark Nobody Asked For

Tested on AMD Ryzen 7 9800X3D with llama.cpp b7966 via localscore-bench, February 2026


TL;DR

We benchmarked eight LLM models (1B to 70B parameters) on two Blackwell GPUs with Vulkan and CUDA 13.1, on both Linux and Windows.

The headlines:

  • The $950 card delivers 4 to 7x more tokens per dollar. For models under 12B, the 5070 Ti is the rational choice.
  • Vulkan and CUDA perform within 5 to 15% on cold hardware. Pick whichever works for your setup.
  • The 5070 Ti just works. The PRO 6000 needs server cooling to get to its full potential. Active fans vs passive heatsink is a real differentiator.
  • OS barely matters. Linux and Windows within 5 to 10%.
  • VRAM is the main reason to buy the PRO 6000. It earns its price at 32B+, not at 12B.
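Tokens per dollar is simple arithmetic: sustained throughput divided by street price. In the sketch below, the throughput numbers and the PRO 6000 price are illustrative placeholders, not the benchmark's results (only the $950 figure comes from the post); they just show how a mid-range card can win the ratio by a wide margin.

```python
# Tokens-per-dollar = sustained decode throughput / card price.
# All figures except the $950 street price are illustrative placeholders.

def tokens_per_dollar(tokens_per_second: float, price_usd: float) -> float:
    """Throughput normalized by hardware cost (tok/s per dollar)."""
    return tokens_per_second / price_usd

# Hypothetical throughputs for a small model on each card:
rtx_5070_ti = tokens_per_dollar(tokens_per_second=60.0, price_usd=950.0)
rtx_pro_6000 = tokens_per_dollar(tokens_per_second=90.0, price_usd=8500.0)

print(f"5070 Ti : {rtx_5070_ti:.4f} tok/s per $")
print(f"PRO 6000: {rtx_pro_6000:.4f} tok/s per $")
print(f"ratio   : {rtx_5070_ti / rtx_pro_6000:.1f}x in favor of the 5070 Ti")
```

Even with the bigger card pushing 50% more raw throughput, the price gap dominates the ratio, which is the whole argument for the cheaper card on sub-12B models.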

Devstral 2 & vibe by Mistral AI: the hidden gems of AI coding agents

For about a year, I have been working daily with various coding assistants, choosing different tools depending on my mood, needs and constraints. My journey has included testing Windsurf and Tabnine professionally, while personally transitioning from being a fervent Copilot user to adopting Claude Code.

During this exploration, I discovered Devstral 2, which ultimately replaced Claude Code in my workflow for several compelling reasons:

  1. Aesthetic Excellence: The tool offers a beautiful user experience.
    From the blog post announcement to the API documentation and vibe itself, the color scheme, visual effects, and overall polish create a distinctly pleasant working environment.

  2. Comparable Performance: In my "me, myself & I" benchmark, Devstral 2's code suggestions are on par with Claude Code's.
    While both occasionally overlook framework documentation, they deliver excellent results overall when refactoring, suggesting commit messages, or tweaking CSS.

  3. Cost-Effective and Open Source: Devstral 2 is significantly more affordable than Claude Code and is open source.
    Users receive 1 million trial tokens, with Devstral Small 2 priced at $0.10/$0.30 beyond the first million.
    With Claude Code, I frequently hit usage limits, even after employing /compact commands and tracking my /usage.
    And even if you hit vibe's usage limits, there is a fallback:

  4. Local Execution Capability: Although vibe's time to first token can be slower than Claude's, Mistral offers a crucial advantage!
    Both Devstral 2 and its Small variant are open source and can run entirely on local machines, providing greater control, privacy, and, if you have the gear, blazing-fast performance ⚡.

The documentation for running it locally is rather sparse, and Devstral 2 Small is still relatively resource-intensive, so some tweaks are needed.

Here are the instructions for running Devstral 2 Small + vibe on Ubuntu 24.04 with an NVIDIA L40S with 24GB VRAM hosted by Scaleway.
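
As a flavor of what the setup looks like, a local launch with llama.cpp's `llama-server` can be sketched like this. The GGUF file name and quantization are assumptions on my part, not the post's actual instructions; adjust the context size to whatever your VRAM allows.

```shell
# Sketch only: the GGUF filename below is a hypothetical local path.
# --n-gpu-layers 99 offloads every layer to the GPU; --host/--port
# expose llama.cpp's OpenAI-compatible HTTP API for the agent to use.
llama-server \
  -m ./Devstral-Small-2-Q4_K_M.gguf \
  --n-gpu-layers 99 \
  --ctx-size 8192 \
  --host 0.0.0.0 --port 8080
```

Once the server is up, any OpenAI-compatible client can point at `http://localhost:8080` instead of a hosted endpoint.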

Welcome to LaFabrique.AI

Welcome to LaFabrique.AI! An evolution of storage-chaos.io. This blog tracks and documents the beginning of a journey through the world of artificial intelligence.

What to Expect

This blog will cover a wide range of AI-related topics (or not!):

  • 🤖 AI Tools – Reviews and tutorials on the latest AI tools
  • 🏗️ AI Infrastructure – Benchmark & architecture best practices
  • ☸️ Kubernetes – Insights on Kubernetes, storage & its use in the context of AI
  • 🏭 Industry Insights – Trends and developments in the AI space

Stay tuned for more content!

CSI for PowerFlex on OpenShift with Multiple Networks

Managing multiple networks for storage workloads on OpenShift is not optional: it is essential for performance and isolation. Dell PowerFlex, with its CSI driver, delivers dynamic storage provisioning, but multi-network setups require proper configuration.

This guide explains how to enable multi-network support for CSI PowerFlex on OpenShift, including prerequisites, network attachment definitions, and best practices for high availability.
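
The network attachment definitions are the Multus side of that setup. A minimal sketch, assuming a macvlan interface on a dedicated storage VLAN with the Whereabouts IPAM plugin; the interface name, subnet, and namespace are placeholders, not values from the guide:

```yaml
# Placeholder names/addresses: align master, range, and namespace with
# your own storage VLAN and the namespace the PowerFlex driver runs in.
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: powerflex-data
  namespace: vxflexos
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "type": "macvlan",
      "master": "ens224",
      "ipam": {
        "type": "whereabouts",
        "range": "192.168.50.0/24"
      }
    }
```

Pods (and driver components) then reference this definition through the `k8s.v1.cni.cncf.io/networks` annotation to get a second interface on the storage network.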

Enable Storage Multi-tenancy on Kubernetes with PowerScale

Dell PowerScale is a scale-out NAS solution designed for high-performance, enterprise-grade file storage. In multi-tenant environments, such as shared Kubernetes clusters, isolating workloads and data access is critical.

PowerScale addresses this need through Access Zones, which logically partition the cluster to enforce authentication boundaries, export rules, and quota policies. The Dell CSI driver maps Kubernetes StorageClass resources to specific Access Zones, providing per-tenant isolation at the storage layer.
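
The StorageClass-to-Access-Zone mapping can be sketched like this. The parameter names follow the Dell CSI PowerScale (Isilon) driver, but the zone name, SmartConnect IP, and path are placeholders for illustration:

```yaml
# One StorageClass per tenant, each pinned to its own Access Zone.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: powerscale-team-a
provisioner: csi-isilon.dellemc.com
parameters:
  AccessZone: team-a            # tenant's Access Zone on the array
  AzServiceIP: "10.0.10.5"      # SmartConnect IP scoped to that zone
  IsiPath: /ifs/team-a/k8s      # base path for provisioned volumes
reclaimPolicy: Delete
allowVolumeExpansion: true
```

Each team then binds PVCs to its own class, and the array enforces the authentication and export boundaries per zone.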

This setup is particularly useful when multiple teams share a common PowerScale backend but require strict separation of data and access controls. This approach proved extremely valuable when building a GPU-as-a-Service AI Factory.

Use Harvester with Dell Storage

Co-authored with Parasar Kodati.

Dell CSI drivers for PowerStore, PowerMax, PowerFlex, and PowerScale have all been tested and are compatible with KubeVirt. This guide provides instructions for installing Dell CSI for PowerMax on Harvester, though the steps are very similar regardless of the storage backend.

Tested on:

  • Harvester v1.3.1
  • CSM v2.11
  • PowerMax protocols: Fibre Channel, iSCSI, and NFS

🌩️🛟 Disaster Recovery for VMs on Kubernetes

Author(s): Pooja Prasannakumar & Florian Coulombel

Kubernetes is no longer just a container orchestrator. As organizations modernize infrastructure, there’s growing interest in using Kubernetes to manage virtual machines (VMs) alongside cloud-native workloads—while still meeting familiar expectations like disaster recovery (DR).

In this post, we’ll walk through a practical, GitOps-friendly DR approach for VMs running on Kubernetes using:

  • KubeVirt to run VMs on Kubernetes
  • Dell Container Storage Modules (CSM) for storage and replication
  • CSM Replication to replicate VM disks across clusters
  • Argo CD + Kustomize to manage deployment and failover via GitOps
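
The replication wiring is typically declared on the StorageClass so that every VM disk provisioned from it is paired with a remote copy. A hedged sketch, assuming the PowerStore provisioner; the cluster ID and remote class name are placeholders, and the exact parameter set should be checked against the CSM Replication documentation for your array:

```yaml
# Replication-enabled StorageClass pattern used by Dell CSM Replication.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: powerstore-replicated
provisioner: csi-powerstore.dellemc.com
parameters:
  replication.storage.dell.com/isReplicationEnabled: "true"
  replication.storage.dell.com/remoteClusterID: "dr-cluster"
  replication.storage.dell.com/remoteStorageClassName: "powerstore-replicated"
```

With the storage side handled declaratively, Argo CD and Kustomize can drive failover by switching which cluster's manifests claim the replicated volumes.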

🔒🧰 Hardening Kubernetes CSI Drivers: Reducing CAP_SYS_ADMIN Without Breaking Storage

Many Kubernetes storage drivers still rely on the powerful—and notoriously over-broad—Linux capability CAP_SYS_ADMIN to perform host-level operations. While it enables critical actions like filesystem mounts, it also substantially expands the attack surface of your cluster.

This post explains why CSI node plugins often end up needing CAP_SYS_ADMIN, what breaks when you remove it, and several concrete hardening strategies using tools like seccomp, AppArmor, SELinux, and controlled privilege elevation.
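
As a rough illustration of the end state for a node-plugin container: drop everything, add back only what mount operations still require, and pin a seccomp profile. Exact needs vary by driver, so treat this as a starting point rather than a drop-in spec:

```yaml
# Container-level securityContext for a CSI node plugin (sketch).
securityContext:
  seccompProfile:
    type: RuntimeDefault      # baseline syscall filtering
  capabilities:
    drop: ["ALL"]             # start from zero...
    add: ["SYS_ADMIN"]        # ...keep only while mount(2) still runs
                              # in-container; remove once mounts are
                              # delegated to a constrained helper
```

The controller plugin usually needs none of this and can run fully unprivileged; the hard part, as the post discusses, is the node plugin's mount path.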