Data Engineer
Job Description
A client is hiring a mid-level Data Engineer to join their existing data products and tooling team, which sits within the broader data organisation. The team is focussed on developing carbon offset-related data products for their clients, as well as building internal data tools to increase the efficiency of their ratings teams.
You’ll be responsible for building data products and tools that directly affect the way the ratings teams analyse carbon offset projects. This is a cross-functional role: you will be working with colleagues from the product, ratings, and software engineering teams every day.
To give you a flavour of the kind of work this team does, these are some of the projects members of the team have been working on recently:
- Designing robust back-end API services that power our in-house central data portal, enabling ratings analysts to access prepared and curated data essential for evaluating carbon offset projects.
- Introducing an in-house knowledge management tool, with generative AI capabilities, to help ratings analysts navigate the large amounts of unstructured document data that exist in the carbon market.
- Collaborating closely with our ratings analysts to standardise and automate quantitative analyses central to assessing renewable energy offsetting projects.
- Deploying a system of web crawlers to aggregate carbon project-related data into our data warehouse, alongside developing a standardised data model so the data can be used internally and on some external platforms too.
If you’re excited by working on such problems and making impactful contributions to data in the climate space, then we’re looking for you.
Tech stack
As a data team, they have a bias towards shipping products, staying close to internal and external customers, and end-to-end ownership of infrastructure and deployments.
This is a team that follows software engineering best practices closely.
Data stack includes the following technologies:
- AWS serves as the cloud infrastructure provider.
- Snowflake acts as the central data warehouse for tabular data. AWS S3 is used for geospatial raster data, and PostGIS for storing and querying geospatial vector data.
- dbt for building SQL-style data models and Python jobs for non-SQL data transformations.
- Computational jobs are executed in Docker containers on AWS ECS, with Prefect as the workflow orchestration engine.
- GitHub Actions for CI / CD.
- Metabase serves as a dashboarding solution for end-users.
They are a remote-friendly company and many colleagues work fully remotely; however, for this position, they will only consider applications from candidates based in the UK. If you live in or near London, you are welcome (but not required!) to work from their London office.
Responsibilities:
You will be an individual contributor in our data engineering team, focused on designing and building robust data pipelines for the ingestion and processing of carbon offset-related data.
You will contribute to and maintain our analytical data models in our data warehouse.
You will work with internal research and ratings teams to integrate the outputs of (analytical) data pipelines into the business processes & products.
You will work with other teams in the business, building data tools and automation that enable them to be more efficient.
You’ll be our ideal candidate if:
You care deeply about the climate and carbon markets and are excited by solutions for decarbonising our economy.
You are a highly collaborative individual who wants to solve problems that drive business value.
You have at least 2 years of experience building ELT/ETL pipelines in production for data engineering use cases, using Python and SQL.
You have hands-on experience with workflow orchestration tools (e.g., Airflow, Prefect, Dagster), containerization using Docker, and a cloud platform like AWS.
You can write clean, maintainable, scalable, and robust code in Python and SQL, and are familiar with collaborative coding best practices and continuous integration tooling.
You are well-versed in code version control and have experience working in team setups on production code repositories.
You’ve designed back-end services and deployed APIs yourself, ideally using a framework like FastAPI.
You have experience deploying cloud resources to production and maintaining them, using tools such as AWS CloudFormation, Terraform, or others.
Our interview process:
- Initial screening interview with recruiter (15 mins)
- Introduction call with Chief Data Officer (30 mins)
- 2x Technical interviews with members of the data engineering team (60-90 mins)
- Reference checks + offer
They value diversity: they need a team that brings different perspectives and backgrounds together to build the tools needed to make the voluntary carbon market transparent. They are therefore committed to not discriminating based on race, religion, colour, national origin, sex, sexual orientation, gender identity, marital status, veteran status, age, or disability.
If you possess the required experience and skills to carry out the role and its tasks to a high standard, please get in touch with Ian at the earliest opportunity!
Ian@theBDPN.com 07944-841968 – www.TheBDPN.com