AI Inference Engineer Job at Signify Technology, Santa Clara, CA

  • Signify Technology
  • Santa Clara, CA

Job Description

AI Inference Engineer – Stealth Startup | San Francisco Onsite

Compensation: $200K–$300K + equity

Join a stealth-stage team backed by prominent academic research and successful technical founders, working at the bleeding edge of AI infrastructure. As generative AI continues to scale rapidly, the bottleneck is no longer training—it’s inference. This team is rebuilding the core systems that power inference, from kernel-level GPU optimizations to full-stack distributed deployment.

This role is ideal for engineers who want to go deep: working on quantization, KV caching, attention mechanisms like FlashAttention, and designing new strategies for parallelism across heterogeneous compute. You'll contribute to an integrated software-hardware stack that enables large-scale model deployment with dramatically improved performance, efficiency, and quality—at production scale.

What You’ll Be Doing:

  • Research and implement state-of-the-art techniques to improve AI model inference speed and quality
  • Architect and optimize distributed AI infrastructure across both GPU kernel and software layers
  • Profile, benchmark, and debug system performance across varied hardware environments
  • Drive improvements in model execution through compiler-level tuning, caching, and runtime strategies

What They’re Looking For:

  • Bachelor's degree in Computer Science, Engineering, Applied Math, or a related field
  • Strong experience with performance optimization and systems-level thinking
  • Proficiency in Python, C++, and CUDA
  • Familiarity with AI frameworks and tooling such as PyTorch, TensorFlow, ONNX, or vLLM

Nice to Have:

  • Graduate degree in a technical field
  • Experience with MLIR or other compiler frameworks
  • Hands-on work with large-scale GPU infrastructure or custom kernels

This is a hands-on, foundational role in a fast-moving environment, offering the chance to shape the backbone of the next generation of AI systems.
