About Groupon

Groupon is pioneering an AI transformation by building a centralized "AI Superhighway" that helps engineering and business teams innovate faster, smarter, and more efficiently. As a founding member of the AIOps team, you will be instrumental in building, operating, and evangelizing shared platforms such as workflow automation tools, LLM gateways, vector stores, and agentic frameworks to democratize AI development.

Role Description

As a Senior AIOps Engineer (AI Platform Engineer), you will be a force multiplier for the entire organization. You will manage existing infrastructure while building, operating, and evangelizing the shared platforms that product teams rely on to develop with AI. This role is ideal for someone passionate about building robust platforms and being at the center of a company's AI transformation.

Responsibilities:

  • Build and Operate the Core AI Platform: Deploy, manage, and scale our centralized AI stack, including LLM gateways (e.g., LiteLLM), vector databases (e.g., Qdrant, Weaviate), and agentic frameworks (e.g., Dify.ai) on Kubernetes.
  • Enable and Empower Engineers: Create the "golden paths" for AI development by providing IaC modules (Terraform/Helm), "how-to" playbooks, and expert-level support to product teams.
  • Govern for Security and Efficiency: Implement and manage solutions for tracking and optimizing AI-related costs (e.g., LLM API spend), and define and enforce security best practices for AI tools and data pipelines.
  • Innovate and Evangelize: Stay on the cutting edge by evaluating, prototyping, and integrating emerging AIOps tools, and act as a key advisor to teams designing AI-powered solutions.

Why You’ll Love It Here:

  • Pioneer an Entire Platform: This is a greenfield opportunity to build the company's central AI platform from the ground up, not just maintain an existing one.
  • Be a Force Multiplier: Your work will directly accelerate innovation across dozens of teams and have a visible impact on the business.
  • Work with a Modern Stack: Get hands-on with a cutting-edge, rapidly evolving AIOps and LLMOps technology stack.
  • Define the Future: You won't just follow a roadmap—you'll help create it, defining the standards for how our company builds with AI.

Groupon is an AI-First Company: We encourage candidates to leverage AI tools during the hiring process where it adds value, and we’re always keen to hear how technology improves the way you work. If you’re passionate about AI or curious to explore how it can elevate your role, you’ll be right at home here.

Core Qualifications:

  • Cloud & Kubernetes Mastery: 5+ years of experience in cloud engineering, container orchestration (Kubernetes, Docker, Helm), and cloud networking/security.
  • Infrastructure as Code (IaC) & SRE Mindset: Deep expertise with IaC tools like Terraform and a solid understanding of SRE principles for building reliable, scalable, and observable systems (Prometheus, Grafana, ELK).
  • Strong Scripting & Automation Skills: Proficiency in Python or Go for building automation, API wrappers, and integration bots.

  • AI-First Mindset: Demonstrated experience building or managing platforms for AI workloads, including a solid understanding of:
      * LLM Gateways: For routing, caching, and managing costs and rate limits.
      * Vector Databases & Semantic Search: Essential for constructing Retrieval-Augmented Generation (RAG) pipelines.
      * Observability for LLMs: Including tracing and metrics for latency and token usage.

Preferred Qualifications:

  • Hands-on experience with specific tools from our target stack: LiteLLM, Langfuse, Dify.ai, n8n, Qdrant/Weaviate, crewAI, Temporal, and Open WebUI.
  • Previous experience in a platform engineering role, building tools and services for other developers.
  • Familiarity with governance and compliance frameworks relevant to GenAI (e.g., EU AI Act, SOC 2).