**About the Job:**
AMD is at the forefront of accelerating next-generation computing experiences across AI, data centers, PCs, gaming, and embedded systems. We are seeking a Senior Software Engineer to lead the validation of AI/ML compute stacks and Physical AI SDKs. This role involves end-to-end AI pipeline validation, benchmarking, and performance optimization on Linux platforms, focusing on real customer use cases.
**Key Responsibilities:**
- Validate AI/ML compute stacks on Ubuntu and Yocto.
- Define and execute comprehensive validation plans, including functional, performance, stress, regression, and scalability testing.
- Manage the defect lifecycle, from triage and root cause analysis to closure.
- Validate end-to-end AI pipelines, covering model training, conversion, and optimization (e.g., PyTorch to ONNX).
- Assess kernel execution, memory transfers, and inference accuracy.
- Define and perform AI benchmarking and profiling for training and inference workloads.
- Validate AI frameworks and runtimes such as PyTorch, TensorFlow, and ONNX Runtime.
- Execute workloads on ROCm/HIP, CUDA, OpenCL, and heterogeneous accelerators.
- Develop and maintain Python-based automation for validation, benchmarking, and reporting.
- Mentor junior engineers and clearly communicate quality and performance metrics.
**Technical Qualifications:**
- 4-8 years of experience in AI/ML validation.
- Strong Python scripting and automation skills.
- Solid understanding of ML fundamentals, including deep learning and LLMs.
- Experience with HIP, CUDA, OpenCL, and TensorFlow/PyTorch integrations.
- Hands-on experience with ROCm validation and end-to-end AI pipeline validation.
- Strong Linux expertise (Ubuntu, Yocto).
**Process Qualifications:**
- Performance-driven validation mindset with a focus on release readiness.
- High level of ownership and ability to work independently.
**Soft Skills:**
- Strong analytical and problem-solving skills.
- Clear written and verbal communication.
- Ability to collaborate effectively with global, cross-functional teams.
**Preferred Qualifications:**
- Benchmarking experience on CPU, GPU, and AI accelerators.
- Performance tuning of large models, including memory optimization and mixed precision.
**Education:**
- Bachelor's or Master's degree in AI/ML, Computer Science, Robotics, Electronics, or a related field.