ALBERTO FERRER

Linux Engineer & AI Architect

For contact information, please visit Contact.

⚊

Professional Profile

Results-oriented IT solutions professional with a solid background in project management, regulatory compliance, and process improvement. Excellent communication skills. Experienced in working with teams across several companies to develop and deploy technical solutions. Demonstrated success in identifying areas for improvement and delivering project results that contribute to the company's overall objectives.

⚊

AI & Machine Learning Expertise

AI Frameworks & Inference

LLM Inference Engines:

vLLM (PagedAttention, Continuous Batching)
SGLang (RadixAttention, Prefix Caching)
TensorRT-LLM (NVIDIA Optimizations)
NVIDIA NIM (Containerized Microservices)

AI Frameworks:

TensorFlow / PyTorch
Hugging Face Transformers
vLLM / SGLang
llama.cpp / Unsloth
NVIDIA Triton Server / TensorRT / NIM
OpenAI APIs / Claude APIs
OpenWebUI

Vector Databases & RAG

PostgreSQL + pgvector
ChromaDB
Milvus / Qdrant
FAISS
Elasticsearch Vector
LangChain / LlamaIndex

Agentic AI & Development

Frameworks: CrewAI, LangGraph ReAct Agents, Agno Agents, FastAPI

Projects: Python PoCs, Document Extraction with VLMs, AI System Integrations

Model Development & Training

Fine-tuning & Continual Pretraining
GPT/Transformer Model Creation
Classification Model Development
MLflow / Kubeflow
Ray / Apache Airflow
Custom Model Publishing

AI Model Publishing & Contributions:

Published custom models and datasets on Hugging Face Hub
Developed PoC applications with FastAPI and Python
Created document extraction systems using Vision Language Models (VLMs)
Built agentic AI solutions and multi-agent systems

⚊

AI Infrastructure & Production Systems

Rackspace Private Cloud AI

LLM Inference Optimization:

Production deployment of vLLM, SGLang, TensorRT-LLM
Distributed KV Cache Management (LMCache, Mooncake, NVIDIA NIXL, Redis)
Disaggregated Prefill/Decode Architecture (1p1d pattern)
Performance benchmarking: TTFT (4-500ms), ITL, throughput (up to 12K tokens/sec)
Multi-node inference orchestration with NVIDIA Dynamo and Grove

GPU Optimization & Hardware:

NVIDIA GPU platforms: H100, H200, A100, L40S optimization
GPU memory optimization (90% utilization, FP8/INT8 quantization)
CUDA optimization, Tensor Core utilization
NCCL, RDMA, NVLink for inter-GPU communication
Multi-Instance GPU (MIG), GPU persistence mode

Kubernetes for AI Workloads

NVIDIA GPU Operator management
KEDA auto-scaling for LLM workloads
Helm charts for AI deployments
Custom Resource Definitions (CRDs) for AI infrastructure
LoRA adapter management with AIBrix

System Optimization for AI/ML

OS tuning: Transparent Huge Pages (THP), BBR congestion control
NUMA balancing, CPU governor optimization
Memory management: dirty ratios, swappiness, overcommit
I/O scheduler tuning (NVMe, SSD optimization)
Network stack optimization for distributed inference

Production Architecture Patterns

Single-node multi-GPU with NIXL (<5 µs KV cache latency)
Disaggregated prefill-decode with Mooncake (cost-optimized)
Multi-cloud datacenter with NVIDIA Dynamo (1.2M tokens/sec)
Enterprise AI platform with VMware Private AI Foundation
Development-to-production scaling with vLLM Production Stack

Database Integration for AI

Vector databases: pgvector, MongoDB Atlas Vector Search, RediSearch
Hybrid search: text + vector combination
RAG architecture: document indexing, retrieval, embedding storage
Knowledge graphs: Apache AGE, RedisGraph
Session management and LLM response caching

MLOps & Observability

Model management: validation, versioning, model gallery
LoRA adapter batching and caching
Prometheus/Grafana monitoring for LLM metrics
Auto-scaling based on KV cache, queue depth, latency
CI/CD for model deployment

Enterprise Features

VMware Cloud Foundation integration (VKS, NSX, multi-tenant)
High Availability: control plane HA, disaster recovery
Security: network segmentation, secrets management, model signing
Compliance: model validation, drift prevention, governance

⚊

Technical Skills

Systems & Infrastructure

Systems Management
Database Administration
Technical Support
System Hardening
Kernel Management
Systems Security
Server Administration

Development & Automation

CI/SRE: CI/SRE Generation, Process Documentation, Process Auditing, RPM/DEB

Languages & Tools: *NIX Distribution Creation, Multiple Linux Languages

Web Technologies

Advanced LAMP/LEMP
Nginx Clusters
PHP-FPM
MySQL HA Clusters
Apache Optimization
SSL/TLS Implementation

Cloud & Virtualization

VMware Cloud Foundation (VCF):

VMware vSphere / ESXi administration
VMware vSAN storage management
NSX-T network virtualization
vRealize Suite (Automation, Operations, Log Insight)
vCenter Server management
Tanzu Kubernetes Grid (TKG) / vSphere with Tanzu
VMware Private AI Foundation integration
Multi-tenant isolation and resource pools
Supervisor clusters and vSphere Namespaces
Enterprise HA and disaster recovery

Container Orchestration & Containerization:

Kubernetes (K8s) administration and architecture
Docker containerization and image management
Helm charts development and deployment
Kubernetes operators and Custom Resource Definitions (CRDs)
NVIDIA GPU Operator for AI workloads
KEDA auto-scaling
Service mesh (Istio, Linkerd)
CNI plugins (Calico, Flannel, NSX-T)
Persistent storage (CSI drivers, StatefulSets)
Monitoring stack (Prometheus, Grafana, ELK)

Cloud Platforms:

VMware Cloud on AWS
Microsoft Azure (VMs, AKS, Storage)
Amazon Web Services (EC2, EKS, S3)
Multi-cloud orchestration

Virtualization Technologies:

KVM/QEMU
vSphere / vCenter
Hyper-V
Proxmox

⚊

Professional Experience

RACKSPACE TECHNOLOGY - Product Engineer/AI Architect | 2024 - Present

AI Team Leader - Newly formed AI division

Lead Engineer and SME for Run:ai products
Lead Engineer for the NVIDIA AI platform
Lead Engineer for Applied AI PoC program
Prototyping of Applications and integrations
Customer configuration analysis (SWOT)
Vector Databases implementation
Embeddings and ML workflows
DELL Hardware implementation for NVIDIA
NVIDIA Triton Server, TensorRT, NIM
Kubernetes for AI (GPU & others)
Docker images for AI-related tools
vLLM, SGLang, TensorRT-LLM production deployments
Distributed KV caching with Mooncake, LMCache, Redis
OS tuning for AI workloads (NUMA, THP, RDMA)
Performance benchmarking and optimization
AIBrix orchestration and LoRA management
NVIDIA Dynamo multi-datacenter inference
VMware Private AI Foundation integration

RACKSPACE TECHNOLOGY - Linux Engineer IV (Sr Escalations Team) | 2022 - 2024

Lead Engineer & Trainer at Escalations Team
Technical support ownership for customer base
Advanced troubleshooting and OS-level issue resolution
Customer loyalty through exceptional service delivery
Issue escalation management and resolution
Training and mentoring of Rackers
Collaboration with CSM, Account Managers, and Incident Management
Security remediation via CrowdStrike with malware analysis
Ansible automation application development

Custom Tools Developed:

MRMF: Python-based Malware Scanner for LAMP Stack
Scanware: Rust Application Scanner with plugins
Traffic Analyzer: Python 3 port with enhanced features

RACKSPACE TECHNOLOGY - Linux Customer Success Enterprise Support Engineer | 2020 - 2022

Career Progression: L1 > L2 > L3 > Linux Engineer

Full Stack Linux System Administration
Customized support for Enterprise Level accounts
Account technical expert on Rackspace side
Technical point of contact and liaison
Infrastructure documentation preparation
Project assistance based on customer needs
Infrastructure recommendations and consultancy
Continuous infrastructure supervision and monitoring
Proactive issue identification and resolution

ALM GROUP - Support Manager | 2017

Company support process management
Technological implementations
Documentation writing and maintenance
Software architecture design

DIVALIA S.A. DE C.V. - Founder | 2010 - 2015

Company founding and management
Software development and programming
Server architecture design and implementation
Software architecture and system design

⚊

Open Source & Personal Projects

EMANON LINUX - Creator / Maintainer | 2010 - 2015

Linux distribution development based on IPCop & Red Hat Linux
CentOS variant implementation
Wikipedia-documented project

RHEL TO CENTOS REPOS - Maintainer | 2009 - Present

RPM package updates and maintenance
Private repository creation and management
SRPM tree upgrades and updates

INSTRUCTOR / BLOGGER - Technical Writer | 1999 - Present

Technical article writing (barrahome.org)
Community support on Freenode
Support for multiple Linux distributions
Documentation and knowledge sharing

⚊

Technical Expertise

Operating Systems

Red Hat, CentOS, Debian, Ubuntu, Unix (Solaris), FreeBSD, Windows

Programming Languages

Python, Perl, Bash, PHP, C/C++, Java, PowerShell, Rust

Networking & Security

Cisco Routers and Switches
LAN/WAN Configuration
VPN Implementation
TCP/IP Protocol Suite
Firewall Configuration
Intrusion Detection Systems

Web & Database Technologies

Apache/Nginx (10+ years)
MySQL/MariaDB Clustering
PHP-FPM
cPanel/WHM (10+ years)
SSL/TLS Implementation
DNS Management

⚊

Notable Contributions

Red Hat Security Reports: Contributed several kernel failure reports to Red Hat
Bugzilla Contributions: Bug #455833
Linux Security Forum: Active contributor to gmane.linux.lfs.security
Community Support: 25+ years providing Linux support and documentation

⚊

Additional Skills & Certifications

Specialized Knowledge

High-Performance Computing (HPC)
Distributed Systems Architecture
Performance Engineering and Benchmarking
Cost Optimization for AI Workloads
Enterprise Compliance and Governance
Machine Learning Operations (MLOps)
Site Reliability Engineering (SRE)

Languages

Spanish (Native)
English (Professional Working Proficiency)

Last Updated: February 2026