Computer Vision Engineer Job at Tykhe Inc, Palo Alto, CA

  • Tykhe Inc
  • Palo Alto, CA

Job Description

We are seeking experienced Multimodal and Vision AI Engineers/Scientists to research, develop, optimize, and deploy Vision-Language Models (VLMs), multimodal generative models, diffusion models, and traditional computer vision techniques. You will work on foundational models integrating vision, language, and audio, optimize AI architectures, and push the boundaries of multimodal AI research.

Responsibilities:

  • Research, design, and train multimodal vision-language models (VLMs), integrating deep learning, transformers, and attention mechanisms.
  • Distill VLMs into small-scale models and optimize them for efficient deployment on resource-constrained devices.
  • Implement state-of-the-art object detection (YOLO, Faster R-CNN), segmentation (Panoptic Segmentation), classification (ResNets, Vision Transformers), and image generation (Stable Diffusion, Stable Cascade).
  • Train or fine-tune vision models for representation (e.g., Vision Transformers, Q-Former, CLIP, SigLIP), generation, and video representation (e.g., Video Swin Transformer).
  • Work with diffusion models and generative models for conditional image generation and multimodal applications.
  • Optimize CNN-based architectures for computer vision tasks like recognition, tracking, and feature extraction.
  • Implement and optimize audio models for representation (e.g., W2V-BERT) and generation (e.g., HiFi-GAN, SeamlessM4T).
  • Innovate with multimodal fusion techniques such as early fusion and deep fusion, and with transformer architecture techniques such as Mixture-of-Experts (MoE), FlashAttention, MQA, GQA, and MLA.
  • Advance video analysis, video summarization, and video question-answering models to enhance multimedia understanding.
  • Integrate and tailor deep learning frameworks like PyTorch, TensorFlow, DeepSpeed, Lightning, Habana, and FSDP.
  • Deploy large-scale distributed AI models using MLOps frameworks such as Airflow, MosaicML, Anyscale, Kubeflow, and Terraform.
  • Publish research in top-tier conferences (NeurIPS, CVPR, ICCV, ICLR, ICML) and contribute to open-source AI projects.

Qualifications:

  • Ph.D. or Master’s degree with 2+ years of experience in Vision-Language Models (VLMs), multimodal AI, diffusion models, CNNs, ResNets, computer vision, and generative models.
  • Demonstrated expertise in high-performance computing and proficiency in Python, C/C++, CUDA, and kernel-level programming for AI applications.
  • Experience in optimizing training and inference of large-scale AI models, with knowledge of quantization, distillation, and LLMOps.
  • Hands-on experience with object detection (YOLO, Faster R-CNN), image segmentation (Panoptic Segmentation), and video understanding (Swin Transformer, TimeSformer).
  • Proficiency in AI toolkits like PyTorch, TensorFlow, OpenCV, and familiarity with MLOps frameworks.
