Inworld AI

Staff / Principal Machine Learning Engineer, Serving - UK

Reposted 7 Days Ago
Remote
Hiring Remotely in UK
Expert/Leader
As a Staff/Principal Machine Learning Engineer, you will optimize and scale multimodal inference systems, ensuring high performance and reliability in production environments.
The summary above was generated by AI

About Inworld

Inworld is a research lab of top researchers and engineers, building the world’s top-ranked realtime voice models.

Today our models are the #1 ranked realtime voice models in the world. They power the largest consumer-facing AI applications available, across categories like health, fitness, learning, therapy, companions, customer experience and media, representing hundreds of millions of end users. Our work spans areas like research and development of state-of-the-art models, optimizing realtime inference, and creating best-in-class APIs and products that allow developers to engage their users.

We’ve raised more than $125M from Lightspeed, Section 32, Kleiner Perkins, Microsoft’s M12 venture fund, Founders Fund, Meta and Stanford, among others. Our technology has powered experiences from companies such as NVIDIA, Microsoft Xbox, Niantic, Logitech Streamlabs, Wishroll, Little Umbrella and Bible Chat. We’ve also been recognized by CB Insights as one of the 100 most promising AI companies globally and have been named one of LinkedIn’s Top 10 Startups in the USA.

Who We're Looking For

A year ago, reliably working agentic systems and sub-second multimodal inference at scale barely existed. Nobody has a decade of experience here. So we're not screening for a resume template — we're looking for strong people from varied backgrounds who learn fast, thrive in ambiguity, and can show us what they've built, broken, and understood.

Experience We Find Useful

You don't need all of this. But you need enough to make a case.

  • Inference Optimization. Deep understanding of modern serving frameworks, such as vLLM or TRT-LLM, and the optimization techniques they use.

  • Model Acceleration. Hands-on experience with quantization, distillation, caching strategies, continuous batching, paged attention, and speculative decoding.

  • High-Performance Systems. Proficiency in C++, CUDA, Rust, or highly optimized Python. You know how to profile code and squeeze every ounce of performance out of NVIDIA GPUs.

  • Distributed Systems & Scaling. Experience with Kubernetes, Ray, custom load balancing, multi-GPU/multi-node inference, and reliably handling thousands of concurrent connections.

  • Public work. Non-trivial systems programming projects, open-source contributions to major inference engines, or deep-dive technical write-ups.

  • Full-cycle ownership. You can take a model from the research team, containerize it, optimize its serving, and ensure it runs reliably in production.

  • Background. PhD in CS, Physics, Math, or equivalent practical experience building backend or ML systems.

Who Thrives Here

  • You don’t need a roadmap to start walking; you’re comfortable picking a direction and building the map as you go.

  • You believe engineering isn't finished until it’s shipped and stable. You have a bias for impact over purely theoretical optimizations.

  • You don't just ship code; you obsess over the why. You’re the first to question an architecture if you think there’s a better way to solve the core latency or throughput problem.

  • You aren't satisfied with "the PM said so." You thrive on deep context and want to understand the fundamental logic behind every decision we make.

What Working Here Is Like

We hand you unclear problems and expect you to make them clear. We value engineers who say "I don't know yet" and then design the benchmark or prototype that finds out. We treat performance, latency, and reliability as first-class product features, not a box to check before launch. Impact comes before everything else, though we support sharing work and open-source contributions that move the field forward. Your work should be visible. Flat structure, fast iterations, minimal process theater.

The base salary range for this full-time position is £140,000 – £200,000. In addition to base pay, total compensation includes equity and benefits. Within the range, individual pay is determined by work location, level, and additional factors, including competencies, experience, and business needs. The base pay range is subject to change and may be modified in the future.

Candidates must already have the legal right to work in the United Kingdom, as visa sponsorship is not available for this role. For candidates interested in relocating to the San Francisco Bay Area in the future, full U.S. visa and relocation support may be available, subject to business needs and applicable legal and work authorization requirements.
