Location: Open to candidates across Latin America & West Africa
Experience Required: 6+ Years
We’re a coding-first research team working as a trusted partner to a frontier AI lab. Our mission is to build high-quality coding tasks, evaluations, datasets, and tooling that directly improve how large language models (LLMs) think, reason, and write code.
This is a hands-on engineering role where precision, correctness, and reproducibility matter. You’ll work on production-grade code, investigate subtle model failures, and design rigorous evaluations that shape next-generation AI systems.
If you enjoy solving non-obvious technical problems, breaking systems to understand them, and working in developer-centric environments—this role is for you.
What You’ll Be Working On
- Writing, reviewing, and debugging production-quality code across multiple languages
- Designing coding, reasoning, and debugging tasks for LLM evaluation
- Analyzing LLM outputs to identify hallucinations, regressions, and failure patterns
- Building reproducible dev environments using Docker and automation tools
- Developing scripts, pipelines, and tools for data generation, scoring, and validation
- Producing structured annotations, judgments, and high-signal datasets
- Running systematic evaluations to improve model reliability and reasoning
- Collaborating closely with engineers, researchers, and quality owners
What We’re Looking For
Must-Have Skills
- Strong hands-on coding experience (professional or research-based) in:
  ◦ Python
  ◦ JavaScript / Node.js / TypeScript
- Experience using LLM coding tools (Cursor, Copilot, CodeWhisperer)
- Solid knowledge of Linux, Bash, and scripting
- Strong experience with Docker, dev containers, and reproducible environments
- Advanced Git skills (branching, diffs, patches, conflict resolution)
- Strong understanding of testing & QA (unit, integration, edge-case testing)
- Ability to reliably overlap with the 8:00 AM – 12:00 PM PT window
Nice to Have
- Experience with dataset creation, annotation, or evaluation pipelines
- Familiarity with benchmarks such as SWE-bench or Terminal-Bench
- Background in QA automation, DevOps, ML systems, or data engineering
- Experience with additional languages (Go, Java, C++, C#, Rust, SQL, R, Dart, etc.)
Who Will Thrive Here
- Engineers who enjoy breaking things and understanding why
- Builders who like designing tasks, running experiments, and debugging deeply
- Detail-oriented developers who catch subtle bugs and model issues
- Engineers who prefer clean, reusable workflows over quick hacks
Why Join Us?
- Work directly on systems that improve state-of-the-art AI models
- Solve unique, non-routine engineering problems
- Collaborate with smart, quality-driven engineers and researchers
- Build tools and datasets that have real impact at scale