Nebius

Senior Software Engineer (Data Platform, C++)

Posted 6 Hours Ago
In-Office or Remote
4 Locations
Senior level
Design and implement core YTsaurus functionality in C++, build and evolve platform-level capabilities, improve APIs and UX for internal users, own production quality including incident response/on-call, and collaborate with stakeholders to prioritize and deliver scalable data platform features.

Why work at Nebius
Nebius is leading a new era in cloud computing to serve the global AI economy. We create the tools and resources our customers need to solve real-world challenges and transform industries, without massive infrastructure costs or the need to build large in-house AI/ML teams. Our employees work at the cutting edge of AI cloud infrastructure alongside some of the most experienced and innovative leaders and engineers in the field.

Where we work
Headquartered in Amsterdam and listed on Nasdaq, Nebius has a global footprint with R&D hubs across Europe, North America, and Israel. The team of over 800 employees includes more than 400 highly skilled engineers with deep expertise across hardware and software engineering, as well as an in-house AI R&D team.

The role

We’re looking for a Software Engineer with strong C++ expertise to join the team building and operating the Nebius Data Platform, a distributed storage and processing platform that acts as the company’s “source of truth” and the backbone of many internal (and some external) products.

Nebius Data Platform is a single multi-tenant ecosystem based on YTsaurus — instead of running separate HDFS/Kafka/HBase-style systems, we provide storage, compute, and analytics capabilities inside one platform.

We build on the open-source YTsaurus ecosystem: we run and extend our own Nebius distribution and develop significant in-house functionality (core and platform-level). We can design, implement, and roll out features end-to-end on our clusters without waiting for upstream approvals, and we contribute upstream when it makes sense.

At today’s scale, this includes ~500 servers, ~20k CPU cores, and ~10 PB of compressed data in our largest production cluster, supporting workloads ranging from business-critical pipelines and financial transactions to large-scale ML/LLM training datasets and compute.

What’s inside the platform

You’ll work on a system that includes (and ties together):

  • Distributed Storage (Cypress): transactional semantics, tiered storage, erasure coding, replication, and strong reliability expectations.
  • Compute & ETL: a cluster-wide job scheduler (tens of thousands of cores), MapReduce, YQL for SQL-like data processing, and SPYT (Spark over YTsaurus) for modern data engineering.
  • Interactive analytics (CHYT): ClickHouse® instances spun up directly on compute nodes for fast SQL over data in-place.
  • Dynamic Tables: low-latency NoSQL KV with distributed ACID transactions for OLTP-style workloads and feature stores.
  • Orchestracto: workflow orchestration deeply integrated with the platform (Airflow-like, but platform-native).

What you’ll do

We’re looking for engineers who combine strong systems skills with product sense: understanding who uses the platform and why certain capabilities matter, and making pragmatic trade-offs to maximize impact. On our team, engineering work is expected to be connected to real users and outcomes: you’ll regularly align with internal stakeholders, clarify requirements, and help drive prioritization.

In this role, you will:

  • Design and implement new functionality in YTsaurus core (C++) with production reliability in mind.
  • Build and evolve platform-level capabilities, including the platform architecture and operating model: multi-cluster growth, shared primitives, and a consistent experience that scales with new teams and use cases.
  • Improve end-to-end platform experience for internal (and external-facing) users: APIs, guardrails, debugging workflows, and automation.
  • Own production quality: incident response / on-call rotation, root cause analysis, and turning learnings into durable fixes.

Example projects
  • Roll out sharded YTsaurus masters (incl. Kubernetes operator support) and build automatic balancing of metadata across master cells (consensus groups) to remove control-plane bottlenecks and unlock 10–100x cluster growth.
  • Make CHYT interactive SQL faster and more predictable at high load via performance work like data-skipping / min-max-style indexes and improved execution introspection.
  • Turn Orchestracto into a platform product by defining the building blocks, developer experience, and governance for how teams create and share workflows.
  • Scale and harden Parquet-on-S3 for native YTsaurus workloads by tackling replication/movement, consistent lifecycle semantics, and master-server metadata optimizations for performance and reliability.
  • Design and ship complete, trustworthy audit trails for data changes (who/what/when) across heterogeneous storage and compute paths.

Tech stack
  • Core: modern C++ (C++20, async + multithreaded primitives)
  • Services & tooling: Go and Python (microservices, utilities, integration tests)

What we expect
  • 5+ years of software engineering experience.
  • Strong C++ skills (you’ll write core code).
  • Working knowledge of Python and/or Go (you don’t have to be an expert, but you should be comfortable navigating code in both).
  • Experience developing and/or operating high-load, distributed services.
  • Production mindset: ability to use SSH, read logs/metrics/traces, and debug distributed systems behavior.
  • Solid CS fundamentals: algorithms, data structures, concurrency basics.

Nice to have
  • Experience with Big Data systems (YTsaurus/Hadoop/Spark/ClickHouse/Kafka-like ecosystems).
  • Experience with multi-tenant platforms, schedulers, resource isolation, quotas, and reliability engineering.
  • Strong performance engineering skills (profiling, lock contention, latency/throughput trade-offs).

We conduct coding interviews as part of the process.

What we offer 

  • Competitive salary and comprehensive benefits package.
  • Opportunities for professional growth within Nebius.
  • Flexible working arrangements.
  • A dynamic and collaborative work environment that values initiative and innovation.

We’re growing and expanding our products every day. If you’re up to the challenge and are as excited about AI and ML as we are, join us!

Top Skills

C++
C++20
ClickHouse
Go
Hadoop
Kafka
Kubernetes
MapReduce
Parquet
Python
S3
Spark
SPYT
SQL
SSH
YQL
YTsaurus
