Tenstorrent is a leader in cutting-edge AI technology, aiming to revolutionize performance, ease of use, and cost efficiency. As AI redefines computing, Tenstorrent is unifying innovations in software models, compilers, platforms, networking, and semiconductors. Our team has developed a high-performance RISC-V CPU from scratch and is passionate about building the best AI platform. We value collaboration, curiosity, and solving complex problems, and we are expanding our team with contributors of all seniorities.

We are seeking an experienced engineer to lead AI workload productization and benchmarking for Large Language Models (LLMs). This role involves preparing models for customers, developing benchmarking infrastructure, and ensuring our AI models achieve industry-leading efficiency and scalability.

This is a hybrid role, based in Warsaw or Gdansk, Poland. We welcome candidates at various experience levels; the appropriate level will be determined during the interview process.

Responsibilities:

  • Design and execute comprehensive model testing protocols for robustness and scalability.
  • Develop and execute performance and accuracy benchmarking tests for AI workloads across various computational environments.
  • Analyze and optimize system performance using advanced profiling and tuning techniques.
  • Conduct competitive analysis and positioning to inform strategic decision-making and product development.
  • Collaborate with cross-functional teams to integrate best practices and innovations in AI performance optimization.
  • Integrate LLMs with popular inference server platforms (e.g., vLLM), perform testing and benchmarking, and stay updated on inference server trends.
  • Track AI model accuracy and performance in a CI/CD environment, identify regressions, and drive fixes.

Experience & Qualifications:

  • Bachelor's, Master's, or PhD in Computer Science, Electrical Engineering, Machine Learning, or a related field.
  • Strong background in AI model benchmarking and profiling.
  • Experience with scalable AI infrastructure, including distributed computing environments.
  • Proficiency in Python for AI workload optimization.
  • Familiarity with LLM frameworks, AI accelerators, and performance tuning methodologies.
  • Familiarity with GitHub CI/CD environments is required.
  • Familiarity with LLM inference servers (e.g., vLLM) is a bonus.
  • Ability to interpret and analyze hardware/software interactions to maximize AI model efficiency.

Tenstorrent offers a highly competitive compensation package and benefits.
