Solutions Architect - Generative AI

Job Description

Job Title: Senior Solutions Architect - Generative AI
Location: Pune, India

Responsibilities:

  • Lead enterprise customers through the complete lifecycle of generative AI projects, from use case discovery to deployment and evaluation
  • Design and implement production-grade retrieval-augmented generation (RAG) systems using vector databases and retrieval frameworks
  • Guide customers in selecting, customizing, and fine-tuning foundation models including Llama, Mistral, Qwen, and Gemma
  • Build and deploy agentic AI applications using frameworks such as LangGraph, CrewAI, AutoGen, and LlamaIndex
  • Advise customers on inference optimization techniques including quantization, speculative decoding, and multi-LoRA serving
  • Implement responsible AI practices, including guardrails, red teaming, evaluation frameworks, and observability solutions
  • Support operationalization of AI systems through MLOps practices, model versioning, CI/CD workflows, drift detection, and human feedback mechanisms
  • Provide architectural guidance for large-scale AI infrastructure deployments utilizing DGX, HGX, high-performance networking, and storage systems
  • Design and scale AI clusters for training, fine-tuning, and inference workloads
  • Build orchestration solutions using Kubernetes, GPU Operator, Network Operator, Run:ai, and Slurm
  • Conduct technical workshops, proof-of-concepts, architecture reviews, and customer presentations
  • Collaborate with sales, product, and engineering teams to translate business requirements into deployable architectures
  • Create reusable reference architectures, deployment guides, demonstrations, and technical assets

Qualifications:

  • 5 to 8 years of experience as a Solutions Architect, Field Engineer, ML Platform Engineer, or similar customer-facing technical role
  • Strong experience with accelerated computing infrastructure, GPUs, networking technologies, storage systems, and data center environments
  • Production experience with Kubernetes and workload schedulers such as Run:ai, Slurm, Kubeflow, or Volcano
  • Hands-on experience with large language model training, fine-tuning, RAG architectures, inference optimization, and quantization techniques
  • Experience with enterprise AI platforms or the ability to rapidly learn NVIDIA AI Enterprise technologies
  • Experience addressing enterprise requirements such as security, compliance, data sovereignty, multi-tenancy, and air-gapped environments
  • Strong communication and presentation skills with both executive and technical stakeholders
  • Bachelor's or Master's degree in Computer Science, Electrical Engineering, or a related field, or equivalent practical experience

Benefits and Compensation:

  • Highly competitive salary package
  • Comprehensive employee benefits program
  • Opportunity to work on cutting-edge Generative AI and accelerated computing technologies
  • Exposure to enterprise-scale AI deployments and global customers
  • Collaboration with industry-leading AI researchers, engineers, and architects
  • Career growth within one of the world's leading technology companies

Other Information:

  • Travel requirement is less than 25%
  • Role involves customer-facing technical leadership and architecture consulting
  • Opportunity to represent the organization at customer events, technical forums, and industry conferences
  • Applications are reviewed on an ongoing basis until the position is filled

JOB TYPE

Full-time

Important: To avoid application spam, include this statement at the end of your resume or application: 'I found this position on ( Quantum Jobs List ) .' Applications without it will be disqualified.
