This bootcamp is currently in Beta — content may be updated as we refine the material
BOOTCAMP · INTERMEDIATE

AI Platform Engineering Bootcamp

A 16-week intensive program designed to transition Platform Engineers into AI Platform Engineering and LLMOps roles. Build production-ready AI systems using LLMs, RAG, agents, and MLOps practices on Kubernetes infrastructure.

16 weeks
200 hours
8 courses

What You'll Master

Integrate LLMs into applications using OpenAI, Claude, and open-source model APIs

Deploy AI gateways with intelligent routing, caching, and cost tracking on Kubernetes

Design and implement RAG systems with vector databases and evaluation pipelines

Build production AI agents using LangChain and LangGraph with safety guardrails

Create MCP servers for standardized AI-infrastructure integration

Implement MLOps pipelines with experiment tracking and workflow orchestration

Deploy models to production using KServe with autoscaling and canary deployments

Monitor AI systems with custom metrics, evaluation frameworks, and drift detection

Implement guardrails and safety for production LLM applications

Apply enterprise security using HashiCorp Vault for secrets management

Who Is This Bootcamp For?

Platform Engineers pivoting to AI Platform Engineering

DevOps Engineers adding AI/ML infrastructure skills

Software Engineers building AI-powered applications

Site Reliability Engineers managing AI workloads

Cloud Engineers implementing MLOps practices

Bootcamp Curriculum

Week 1: AI Foundations for Infrastructure Engineers

Bridge the gap between traditional infrastructure and AI systems. Run your first local LLMs and understand their resource requirements.

Goals:

  • Understand AI workloads from an infrastructure perspective
  • Master essential ML vocabulary and concepts
  • Deploy and interact with local LLMs using Ollama
  • Set up a Python development environment for AI workloads
⭐ Required

Week 2: LLM Integration and API Patterns

Build a production-ready API layer with multi-provider routing, failover, caching, and cost tracking.

Goals:

  • Master LLM API integration patterns with multiple providers
  • Deploy an AI gateway on Kubernetes with intelligent routing
  • Implement prompt engineering for production systems
  • Build cost tracking and optimization dashboards
  • Integrate AWS Bedrock for managed AI services
⭐ Required

Week 3: RAG Architectures and Vector Databases

Connect LLMs to organizational knowledge bases with semantic search and optimized retrieval pipelines.

Goals:

  • Deploy and manage vector databases on Kubernetes
  • Implement document processing and chunking strategies
  • Build complete RAG API services with streaming
  • Evaluate and test RAG system quality
⭐ Required

Week 4: AI Agents and Agentic Workflows

Build autonomous AI agents that can reason, plan, and execute complex tasks with proper safety guardrails.

Goals:

  • Master agent fundamentals and the ReAct pattern
  • Build Platform Engineering agents with Kubernetes tools
  • Implement LangGraph workflows with human-in-the-loop
  • Create MCP servers for standardized tool integration
  • Design multi-agent systems for complex tasks
⭐ Required

Week 5: ML Infrastructure and Experiment Tracking

Implement experiment tracking, model versioning, and ML pipeline orchestration using GitOps principles.

Goals:

  • Deploy MLflow on Kubernetes with S3 artifact storage
  • Track LLM experiments and prompt strategies
  • Build ML pipelines with Argo Workflows
⭐ Required

Week 6: Model Serving and Kubernetes for ML

Deploy models to production on Kubernetes with KServe, autoscaling, and canary deployments.

Goals:

  • Understand GPU scheduling concepts for ML workloads
  • Configure resource management for inference
  • Deploy models with KServe and autoscaling
  • Implement canary deployments for safe rollouts
⭐ Required

Week 7: AI Observability and LLMOps

Implement comprehensive monitoring, evaluation, guardrails, and drift detection for AI systems.

Goals:

  • Deploy an AI observability stack with Prometheus and Grafana
  • Build LLM evaluation pipelines with automated testing
  • Implement production guardrails and safety measures
  • Detect and respond to model drift and degradation
⭐ Required

Week 8: Enterprise AI and Capstone Project

Apply all learned skills to build a production-ready AI-powered Platform Assistant with enterprise security.

Goals:

  • Implement enterprise AI security with Vault integration
  • Understand AI governance and compliance requirements
  • Complete a comprehensive capstone project
  • Present a production-ready Platform Assistant
⭐ Required

Prerequisites

Completion of Platform Engineering Bootcamp or equivalent experience

Strong Kubernetes fundamentals (deployments, services, Helm)

Experience with Terraform and infrastructure as code

CI/CD pipeline experience (GitHub Actions preferred)

Python programming fundamentals (functions, classes, packages)

AWS cloud experience

Basic SQL knowledge (SELECT, JOIN, WHERE, GROUP BY)

Experience with relational databases (PostgreSQL or MySQL)

Technologies Covered

ai-platform-engineering, mlops, llmops, rag, ai-agents, langchain, langgraph, ollama, vector-databases, kubernetes, kserve, mlflow, prometheus, grafana, hashicorp-vault, aws-bedrock, python, openai, claude

Choose your plan

Simple, Transparent Pricing

One price, everything included

Monthly Plan

Access all content

$99/month

Quarterly Plan

Save 16% with quarterly billing

$249/quarter

Everything Included in Your Subscription

Content & Learning

  • Access to all courses and bootcamps
  • Video lessons with closed captions
  • Interactive quizzes and assessments
  • Course completion certificates

Hands-On Labs

  • Browser-based cloud labs
  • Pre-configured VMs ready to use
  • Playgrounds for experiments
  • Multi-VM realistic scenarios

AWS Integration

  • Managed AWS Account included
  • Pre-configured environments
  • Real-world cloud scenarios

Support & Community

  • Priority support
  • Active community forum

No Setup Required

  • Everything runs in your browser
  • No software installation needed
  • Automatic environment provisioning
  • Works on any device

Ready to Transform Your Career?

Join this comprehensive bootcamp and master AI Platform Engineering

Get Access Now