Python (pyspark) developer

Job Title: Python (pyspark) developer
Contract Type: Contract
Location: England
Salary: £25 - £27 per hour + Inside IR35
Start Date: June 2021
Reference: BBBH27156_1622047386
Contact Name: Zoe Latuszka
Job Published: May 26, 2021 17:43

Job Description

Python (PySpark) Developer

Eligible for SC

£25-27 per hour. Inside IR35

3-6 Months.

Office in Newport. Remote for the foreseeable future or the duration of the contract.

Description of Requirement:
The Coronavirus (COVID-19) Infection Survey (CIS) within ONS provides statistical analysis on the COVID-19 pandemic for Government and academic research purposes. The CIS Data Processing Pipeline (DPP) receives data from multiple sources and engineers these data to provide cleaned, linked datasets for analysis.

These roles are responsible for the development and maintenance of data processing pipelines. You will work with a small development team and wider analysis team to meet requirements for analysis. You will need experience of data-driven projects.
We are looking for both a senior developer and a developer to join the team in hands-on programming roles. We are looking for strong development skills coupled with wider awareness and the ability to contribute fully to the work of the development team. Rates will depend on demonstrated skills and experience.
Please highlight specific experience that demonstrates the skills described below.

You will be responsible for the development of sophisticated data processing pipelines, whilst supporting team members and others to apply agile principles to deliverables. In doing so, you will:
  • Contribute to the solution design and development approach for the pipeline.
  • Work hands-on to produce code solutions in Python / PySpark / SQL Server / Hive.
  • Communicate and work with other developers, through pair programming, to implement data processing pipelines.
  • Liaise with analysis teams to ensure that requirements for outputs are understood and incorporated into the pipeline.
  • Address technical blockers, actively seeking solutions to remove them and proposing alternative routes to delivery.
  • Routinely test the developed pipelines to ensure that they are fit for purpose.
  • Peer review the work of others in the team to ensure quality and consistency.

Relevant Skills and Experience:

Desirable or
  • Python - proficiency in Python development, including PySpark.

  • Data storage - working with relational and non-relational databases.

  • Testing - writing unit tests for PySpark routines using pytest.

  • Agile - applying Agile software development practices, including Git version control and Jira project management.

  • DevOps - applying DevOps practices, including continuous integration and continuous deployment.

  • Analysis - identifying the root cause of data quality issues and communicating these to data providers.