Lead the design of data solutions, ensure FAIR principles in data management, build scalable data products for biomedical applications, and collaborate with engineers and scientists.
Headquartered in Silicon Valley, we are a newly established start-up where a collective of visionary scientists, engineers, and entrepreneurs is dedicated to transforming the landscape of biology and medicine through the power of Generative AI. Our team comprises leading minds and innovators in AI and Biological Science, pushing the boundaries of what is possible. We are dreamers who reimagine a new paradigm for biology and medicine.
We are committed to decoding biology holistically and enabling the next generation of life-transforming solutions. As the first mover in pan-modal Large Biological Models (LBM), we are pioneering a new era of biomedicine, with our LBM training leading to ground-breaking advancements and a transformative approach to healthcare. Our exceptionally strong R&D team and leadership in LLM and generative AI position us at the forefront of this revolutionary field. With headquarters in Silicon Valley, California, and a branch office in Paris, we are poised to make a global impact. Join us as we embark on this journey to redefine the future of biology and medicine through the transformative power of Generative AI.
Key Responsibilities:
- Lead the strategic design of a holistic solution to our large and diverse data usage needs.
- Set the collaboration and reusability strategy for data consumption, including publicly available and partner-generated data.
- Ensure the FAIR principles are followed in our data storage and retrieval strategy.
- Build and maintain scalable, efficient, and reusable data products and codebases for large-scale foundation model training, adaptation, evaluation, and inference.
- Collaborate closely with data engineers and research scientists to integrate models into production environments.
- Ensure code quality, scalability, and performance through rigorous testing and code reviews.
Qualifications:
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field. Experience in life sciences or healthcare is required.
- Strong familiarity with at least some (the more the better) of the following biomedical data types: sequencing data, other high-throughput omics data, biological imaging data, and clinical and phenotypic data.
- Experience using (developing is an advantage) large-scale data products and systems for biological or biomedical applications.
- Strong programming skills in JavaScript, Python, and modern web development frameworks, and familiarity with GPU-accelerated tools (e.g., CUDA, cuDNN, Triton).
- Knowledge of major deep learning frameworks such as PyTorch, HuggingFace Transformers & Accelerate, or Megatron-LM/DeepSpeed.
- Familiarity with resource management and scheduling systems (e.g., SLURM, Kubernetes).
- Proficiency in back-end frameworks like Django, Flask, or Node.js, and database technologies (e.g., PostgreSQL, MongoDB).
- Expertise in distributed systems, cloud computing (AWS, GCP), and containerization tools (Docker, Kubernetes).
Preferred Qualifications:
- Prior experience pre-training or serving large language models or large-scale foundation models.
- Experience with deep learning workflows.
- Knowledge of the challenges of working with bioinformatics tools, and hands-on experience with them.
- Familiarity with version control systems like Git and CI/CD pipelines.
- Strong understanding of RESTful APIs, authentication, and deployment pipelines.
- Familiarity with machine learning workflows and biological datasets.
Join us as we embark on this journey to redefine the future of biology and medicine.
We are an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.
Top Skills
AWS
CUDA
cuDNN
DeepSpeed
Django
Docker
Flask
GCP
HuggingFace Transformers
JavaScript
Kubernetes
Megatron-LM
MongoDB
Node.js
Postgres
Python
PyTorch
Slurm
Triton