Data Engineer (Remote, UK)

Posted 5 months ago

Permanent
London or UK-based
Full time, London or UK-based.

About our client

Our client is a global ratings agency for the Voluntary Carbon Market. It distributes ratings via a SaaS product, informing all market participants on how to price and manage risk. Our ratings and research tools support buyers, intermediaries, investors and carbon project developers.

Founded in April 2020, our 170+ strong team combines climatic and earth sciences, sell-side financial research, earth observation, machine learning, data and technology, engineering, and public policy expertise, and works across four continents. Having raised a significant Series B funding round in late 2022, we are growing rapidly as a company, accelerating the Net-Zero transition through ratings.

Job Description

They are hiring a mid-level data engineer to join their existing data products and tooling team, which sits within the broader data organisation. The team is focused on developing carbon offset-related data products for our clients, as well as building internal data tools to increase the efficiency of our Ratings teams.

You’ll be responsible for building data products and tools that directly affect the way their ratings teams analyse carbon projects. This is a cross-functional role: you will be working together with colleagues from the product, ratings, and software engineering teams every day.

To give you a flavour of the kind of work this team does, these are some of the projects members have been working on recently:

Designing robust back-end API services that power our in-house central data portal, enabling ratings analysts to access prepared and curated data essential for evaluating carbon offset projects.
Introducing an in-house knowledge management tool, with generative AI capabilities, to help ratings analysts navigate the large amounts of unstructured document data that exist in the carbon market.
Collaborating closely with our ratings analysts to standardise and automate quantitative analyses central to assessing renewable energy offsetting projects.
Deploying a system of web crawlers to aggregate carbon project-related data into our data warehouse, alongside developing a standardised data model so the data can be used internally and displayed on our client-facing platform.

If you’re excited by working on such problems and making impactful contributions to data in the climate space, then let’s have a call at the earliest opportunity.

Tech stack

The data team has a bias towards shipping products, staying close to internal and external customers, and end-to-end ownership of its infrastructure and deployments. This is a team that follows software engineering best practices closely. The data stack includes the following technologies:

AWS serves as the cloud infrastructure provider.
Snowflake acts as the central data warehouse for tabular data. AWS S3 is used for geospatial raster data, and PostGIS is used for storing and querying geospatial vector data.
dbt is used for building SQL-style data models, and Python jobs are used for non-SQL data transformations.
Our computational jobs are executed in Docker containers on AWS ECS, and we use Prefect as our workflow orchestration engine.
GitHub Actions is used for CI/CD.
Metabase serves as a dashboarding solution for end-users.

Responsibilities:

You will be an individual contributor in the data engineering team, focused on designing and building robust data pipelines for the ingestion and processing of carbon offset-related data.
You will contribute to and maintain our analytical data models in our data warehouse.
You will work with our internal research and ratings teams to integrate the outputs of (analytical) data pipelines into business processes and products.
You will work with other teams in the business to enable them to be more efficient, by building data tools and automations.

You’ll be an ideal candidate if:

You care deeply about the climate and carbon markets and are excited by solutions for decarbonising our economy.
You are a highly collaborative individual who wants to solve problems that drive business value.
You have at least 2 years of experience building ELT/ETL pipelines in production for data engineering use cases, using Python and SQL.
You have hands-on experience with workflow orchestration tools (e.g., Airflow, Prefect, Dagster), containerisation using Docker, and a cloud platform like AWS.
You can write clean, maintainable, scalable, and robust code in Python and SQL, and are familiar with collaborative coding best practices and continuous integration tooling.
You are well-versed in code version control and have experience working in team setups on production code repositories.
You’ve designed back-end services and deployed APIs yourself, ideally using a framework like FastAPI.
You have experience deploying and maintaining cloud resources in production using tools such as AWS CloudFormation, Terraform, or others.

This role is full-time and can be performed either on a hybrid basis (London office) or remotely elsewhere in the UK.