Data Engineer

Neurons Lab

Posted Yesterday

In-Office or Remote

Hiring Remotely in Greece

Mid level

Lead foundational data-engineering work to validate and re-engineer pipelines for an anonymized, centralized credit data lake. Harmonize schemas across entities; build dbt models and tests; implement data-quality suites (Great Expectations), entity resolution, and anonymization controls; optimize Spark/Glue jobs; orchestrate pipelines (Airflow/Step Functions); and produce documented, feature-ready datasets and runbooks for a regulated UK/Ireland lending environment.

The summary above was generated by AI

About the project (description, duration, stage)

Join Neurons Lab as a Data Engineer on a new engagement with a regulated UK & Ireland credit and lending company. The client has lifted data from multiple business entities into a newly centralized, anonymized data lake, but lacks the data-engineering depth to make it trustworthy and analytics-ready: current pipelines were assembled quickly (partly AI-assisted), and the descriptive statistics cannot yet be validated or reproduced.

You will put that foundation on solid ground so the Data Science Lead can model on it with confidence: validate and re-engineer the pipelines, build the harmonization / semantic layer across entities, enforce data quality and lineage, and prepare clean, feature-ready datasets.

This is a foundational data-engineering role on a regulated data estate; data protection and reproducibility are the primary constraints on every decision.

A full-time engagement is preferred.

What you'll actually do (example tasks)

Reproduce a descriptive-statistics report end-to-end so that every figure traces back to raw source, closing the gap the client has acknowledged: numbers it cannot currently defend.
Profile and reconcile differing source schemas across acquired entities: map differing field names, types, encodings and business definitions for the same concept into one conformed model.
Build dbt staging → intermediate → mart models with tests; codify the harmonized definitions the Data Science Lead specifies.
Write Great Expectations suites (null / range / uniqueness / referential checks) and wire them into the pipeline so bad data fails loudly rather than silently corrupting analysis.
Implement entity / identity resolution (deterministic + fuzzy matching) where there is no clean shared key for the same customer or account across sources.
Implement and verify anonymization / pseudonymization (hashing / tokenization / k-anonymity), and provide evidence to the client's IT / compliance team that re-identification risk is controlled.
Optimize Spark / Glue jobs over tens of millions of rows — partitioning, file formats (Parquet), incremental loads, cost control.
Orchestrate with Airflow / Step Functions; build repeatable, scheduled pipelines rather than one-off scripts.
Prepare clean, documented, feature-ready datasets for the PD / delinquency models.
Document runbooks so the offshore team can operate the pipelines and handover takes days, not weeks; help scope onboarding of the remaining (Ireland + additional) sources.
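
To give a flavour of the anonymization task listed above, here is a minimal sketch, not the client's actual method. It assumes pseudonymization is done with a keyed hash (HMAC-SHA256) over a normalized identifier and that k-anonymity is measured as the size of the smallest group sharing the same quasi-identifier values. The key, the field names (customer_id, age_band, region) and the sample records are hypothetical and chosen only for illustration.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical key for illustration; a real key would live in a secrets manager.
SECRET_KEY = b"example-key-do-not-use"


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a keyed hash (HMAC-SHA256) of a normalized identifier."""
    normalized = value.strip().lower()
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def k_anonymity(rows, quasi_identifiers):
    """Return the size of the smallest group sharing the same quasi-identifier values."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values()) if groups else 0


# Hypothetical records; the field names are assumptions, not the client's schema.
records = [
    {"customer_id": "A-001", "age_band": "30-39", "region": "Leinster"},
    {"customer_id": "a-001 ", "age_band": "30-39", "region": "Leinster"},
    {"customer_id": "B-002", "age_band": "40-49", "region": "Munster"},
]

tokens = [pseudonymize(r["customer_id"]) for r in records]
k = k_anonymity(records, ["age_band", "region"])

# The first two records normalize to the same identifier, so their tokens match.
print(tokens[0] == tokens[1])
# The smallest quasi-identifier group here has one member, so k is 1.
print(k)
```

A k of 1 means at least one record is unique on its quasi-identifiers, which is the kind of finding that would be reported to the compliance team before a dataset is released for modeling.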

Skills

Strong SQL and Python for large-scale data processing
AWS data stack: S3, Glue, Lake Formation, Athena / Redshift, EMR / Spark, Step Functions / Airflow
Data modeling & semantic layer (dbt or equivalent); dimensional modeling
Entity resolution / record linkage across heterogeneous sources
Data-quality & testing frameworks (Great Expectations, dbt tests) and data lineage
Anonymization / pseudonymization techniques and their analytical trade-offs
Big-data processing (Spark) with performance and cost optimization at scale
Clear written and verbal English; able to document for handover and work well with a distributed team

Knowledge

GDPR fundamentals as applied to anonymized / pseudonymized financial data and UK / EU data residency
AWS Well-Architected (Analytics, Security) for BFSI
Awareness of credit / risk data structures and what downstream modeling consumers need — a plus

Experience

4+ years in data engineering, with strong AWS + Spark / SQL at scale
Demonstrated experience harmonizing / integrating data across multiple source systems
Experience building validated, reproducible pipelines in a regulated environment (BFSI, healthcare, government) — strong plus
Comfortable stepping into a messy, partly-built data estate and bringing it up to standard
Comfortable as the sole or lead data engineer on a small (3–4 person) delivery pod
