Multimodal Deep Learning Solution Architect - Vision Language and Action Models

NVIDIA’s Worldwide Field Operations (WWFO) team is seeking a Solutions Architect with expertise in Multimodal Deep Learning, a strong background in Vision-Language Models (VLMs), and a deep understanding of their implications for physical AI, which is redefining industries such as robotics, manufacturing, and healthcare by combining perception and language with decision-making.

In this role, you will operate at the intersection of innovative AI research and real-world applications. You will work as the primary technical specialist for NVIDIA customers, helping drive innovation powered by NVIDIA’s advanced hardware and software platform. You will develop proof-of-concept solutions, demonstrate modern neural network architectures, and advance how customers leverage multimodal reasoning for robotics and autonomous systems. A key part of this role involves close collaboration with a wide range of team members, including developers, data scientists, IT managers, and senior executives. The ideal candidate is an experienced AI specialist with a deep understanding of vision-language-action reasoning, including large-scale pretraining, data curation, and post-training using supervised fine-tuning and reinforcement learning. The candidate should have good knowledge of neural network optimization approaches. Expertise in applying VLMs to Physical AI use cases is a nice-to-have.

What you will be doing:

  • Serve as the primary technical expert between NVIDIA and our customers, understanding their technology and providing the best AI solutions and guidance on the training process in terms of tools and methodology.
  • Build proof-of-concepts and demonstrations that highlight the power of NVIDIA AI platforms for Vision Language Reasoning Models.
  • Partner with developers, researchers, technology specialists, IT professionals, and executives to facilitate the integration of NVIDIA technology.
  • Partner with Engineering, Product, and Sales teams to develop and plan the most suitable solutions for customers. Enable development and growth of product features through customer feedback and proof-of-concept evaluations.

Widely considered to be one of the technology world’s most desirable employers, NVIDIA offers highly competitive salaries and a comprehensive benefits package. As you plan your future, see what we can offer at www.nvidiabenefits.com/

What we need to see:

  • MS/PhD or equivalent experience in Computer Science, Data Science, Electrical/Computer Engineering, Physics, Mathematics, or other Engineering fields.
  • Deep expertise in AI/Deep Learning, with hands-on experience in training or optimizing VLMs for production.
  • Expertise with deep learning frameworks for training VLMs (PyTorch, NeMo), and/or experience with optimization methods and tools for such models (TensorRT and Triton Inference Server).
  • Excellent verbal and written communication and technical presentation skills in English.
  • 5+ years of work or research experience with Python, C++, or other software development.
  • Passion for AI and a growth mindset, with the ability to collaborate effectively with different teams (Engineering, Product, Sales, Marketing) in a rapidly evolving environment while continuously learning and sharing insights.

Ways to stand out from the crowd:

  • Familiarity with Cosmos-Reason and Isaac GR00T.
  • Track record in running large-scale training and customization of VLMs.
  • Track record in neural network inference optimization for Physical AI use cases.