What to Expect
As an AI Inference Software Engineer within the Autonomy group, you will have the opportunity to fine-tune, deploy, and optimize large neural networks for efficient inference on heterogeneous edge devices (CPU/GPU/AI ASIC).
The role is multidisciplinary: you will work at the intersection of machine learning and systems.
You will build the frameworks and infrastructure that enable the seamless deployment, integration, and inference of all neural networks that run on Autopilot and the Humanoid Robot.
You will develop system tools to benchmark, characterize, and optimize the latency and throughput of AI workloads on the FSD chip.
What You’ll Do
Build robust AI frameworks to lower PyTorch neural networks to edge devices (see the export sketch after this list)
Build robust AI infrastructure to train and fine-tune networks for Autopilot and the Humanoid Robot on large GPU clusters
Deploy state-of-the-art neural networks on heterogeneous compute, including Tesla’s in-house AI ASIC, with the aim of maximizing network performance while minimizing latency
Collaborate closely with AI scientists and hardware teams to quantize, prune, and run inference in low precision (see the quantization sketch after this list)
Design and implement custom GPU kernels (OpenCL/CUDA) for efficient training and post-processing of network output
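To give a flavor of the lowering work, here is a minimal sketch of one common first step: tracing a PyTorch model into TorchScript so that a device-specific compiler can consume a static graph. The TinyBackbone module, input shape, and file name are hypothetical placeholders for illustration, not Tesla's actual toolchain.

    # Minimal sketch: trace a PyTorch model into TorchScript as a first
    # lowering step toward an edge runtime. TinyBackbone and the input
    # shape are hypothetical placeholders, not a production pipeline.
    import torch
    import torch.nn as nn

    class TinyBackbone(nn.Module):
        def __init__(self):
            super().__init__()
            self.conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)
            self.head = nn.Linear(16, 10)

        def forward(self, x):
            x = torch.relu(self.conv(x))
            x = x.mean(dim=(2, 3))  # global average pool over H, W
            return self.head(x)

    model = TinyBackbone().eval()
    example = torch.randn(1, 3, 224, 224)

    # Trace to a static graph; downstream device compilers consume
    # such graphs rather than eager Python code.
    traced = torch.jit.trace(model, example)
    traced.save("tiny_backbone.pt")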
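And a minimal sketch of the low-precision direction: post-training dynamic quantization of Linear layers to int8 with PyTorch's built-in torch.ao.quantization utilities. Quantization-aware training targeting an AI ASIC is substantially more involved; the model here is a hypothetical stand-in that only illustrates the idea.

    # Minimal sketch: post-training dynamic quantization of Linear
    # layers to int8. The model is a hypothetical stand-in; real
    # quantization-aware training for an ASIC is more involved.
    import torch
    import torch.nn as nn

    fp32_model = nn.Sequential(
        nn.Linear(256, 512),
        nn.ReLU(),
        nn.Linear(512, 10),
    ).eval()

    int8_model = torch.ao.quantization.quantize_dynamic(
        fp32_model,
        {nn.Linear},        # quantize only Linear modules
        dtype=torch.qint8,  # int8 weights, dynamic activation scaling
    )

    x = torch.randn(1, 256)
    print(int8_model(x).shape)  # torch.Size([1, 10])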
What You’ll Bring
Proficiency with Python and C++, including modern C++ (14/17/20)
Experience with PyTorch, TensorFlow, or other machine learning frameworks
Experience with Machine Learning, Deep Learning, and Computer Vision
Experience with Model Fine-Tuning: Quantization-Aware Training, Compression, and Pruning
Experience with training and deploying neural networks for real-world AI
Experience with Computer Systems/Architecture
Experience with CUDA and/or OpenCL