Posted at: 28 May

AI Computing Development Engineer, TensorRT and TensorRT-LLM

Company

NVIDIA Corporation is a Santa Clara-based technology company specializing in designing GPUs and AI solutions for gaming, professional visualization, and cloud services, operating in both B2B and B2C markets globally.

Remote Hiring Policy

NVIDIA supports flexible remote work arrangements and hires from various regions globally, including the Americas, Europe, Asia, and the Middle East, with roles that may require collaboration across time zones.

Job Type

Full-time

Allowed Applicant Locations

China

Apply Here

Job Description

NVIDIA is hiring software engineers for its AI Computing team. Academic and commercial groups around the world are using GPUs to power a revolution in deep learning-powered AI, enabling breakthroughs in areas like generative AI, computer vision, speech recognition, recommender systems, and large-scale language and multimodal models. Join the team building the inferencing software (TensorRT/TensorRT-LLM) that will be used across our product lines. The ability to work in a fast-paced, delivery-focused environment is required, and excellent interpersonal skills are a must.

What you'll be doing:

- Design and develop robust inferencing software (TensorRT/TensorRT-LLM) optimized for functionality and performance across platforms
- Perform performance analysis, optimization, and tuning of deep learning inference workloads
- Track academic and industry advancements in AI and update TensorRT/TensorRT-LLM features accordingly
- Provide feedback into architecture and hardware design and development
- Collaborate across hardware, software, and research teams to shape the direction of machine learning inferencing across NVIDIA platforms
- Own and deliver technical work with scope based on experience, ranging from complex features to substantial parts of larger projects, with increasing independence and technical leadership over time
- Publish key technical results at leading scientific and engineering conferences

What we need to see:

- Master's or higher degree in Computer Engineering, Computer Science, Applied Mathematics, or a related computing-focused field (or equivalent experience)
- Strong C/C++ or Python programming and software design experience, including debugging, performance profiling, and test design
- 2+ years of work experience
- Strong curiosity about artificial intelligence and familiarity with the latest developments in deep learning, including generative models, multimodal systems, and large neural networks
- Experience working with deep learning frameworks such as PyTorch, TensorRT/TensorRT-LLM, NeMo, or vLLM
- Proactive, self-driven, and able to work independently
- Excellent written and verbal communication skills in English
- Demonstrated ability, commensurate with experience, to take technical ownership, solve complex problems, and contribute effectively in cross-functional environments

NVIDIA is widely considered to be one of technology’s most desirable employers. We have some of the most forward-thinking and hardworking people on the planet working for us. Does the idea of contributing to and pushing the boundaries of state-of-the-art AI and compute systems excite you? Interested in getting exposure to the entire deep learning software stack? Come join us and help build the GPU-accelerated AI platform used worldwide.
