Software Engineering Manager, AI Networking
Meta

Menlo Park, California


This job has expired.

Job Info


Meta operates one of the most advanced GPU-based training clusters in the world, powering its AI across Ads, Video, Feed, Reels, and GenAI workloads. The AI Networking group is responsible for designing, developing, and operating software systems that coordinate and transport GPU-to-GPU communication across a large number of machines in large-scale datacenters, leveraging state-of-the-art collective communication libraries such as NCCL. The Software Engineering Manager for this team is expected to bring technical expertise to support the team in design and development activities, project management skills to drive associated programs effectively and efficiently, and people management skills to support team members and collaborate with cross-functional partners.

Software Engineering Manager, AI Networking Responsibilities:

  • Help define the technical roadmap for the team, drive execution of associated tasks, and support the team in resolving dependencies
  • Collaborate effectively with other groups such as product groups, PyTorch, hardware, infrastructure, and operations
  • Interact with external partners as needed to resolve dependencies associated with objectives
  • Guide and help team members develop appropriate skill sets to grow in their careers and, where necessary, address underperformance
  • Communicate cross-functionally and drive engineering efforts


Minimum Qualifications:

  • BS in Computer Science or related technical discipline or equivalent experience
  • 2+ years of experience managing a Software Engineering Team
  • Knowledge of HPC collective communication and parallel computing libraries such as NCCL, RCCL, oneCCL, and MPI
  • Knowledge of high-performance transport stacks such as RoCE and InfiniBand
  • Working knowledge of machine learning frameworks such as PyTorch and TensorFlow
  • Experience recruiting and managing Software Engineers


Preferred Qualifications:

  • Experience with machine learning workloads such as recommendation systems, speech recognition, vision, and large language models
  • Experience with parallel computing platforms such as CUDA, ROCm, and OpenCL




Meta is proud to be an Equal Employment Opportunity and Affirmative Action employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, reproductive health decisions, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, genetic information, political views or activity, or other applicable legally protected characteristics. You may view our Equal Employment Opportunity notice here. We also consider qualified applicants with criminal histories, consistent with applicable federal, state, and local law. We may use your information to maintain the safety and security of Meta, its employees, and others as required or permitted by law. You may view Meta's Pay Transparency Policy, Equal Employment Opportunity is the Law notice, and Notice to Applicants for Employment and Employees by clicking on their corresponding links. Additionally, Meta participates in the E-Verify program in certain locations, as required by law.

