Job Description
Data Engineer x4 - Central Government
- Tech Stack: Cloudera/Hadoop, Spark, Python, PySpark
- DBS Check before starting
- Fully remote working, anywhere in the UK
- 6-month contracts - Inside IR35, rates negotiable
- Deadline: 12 Noon 12/01/22
SERVICE REQUIREMENT
Software developers are required to lead the design and implementation of data interfaces that move data products through the processing tech stack. Outputs are to be delivered to various QA teams to validate census return quality, enabling the outputs-in-a-year publication date to be met. The candidate will work as one of a team of engineers using agile methodologies to implement these interfaces and provide first-line support through critical operational periods.
Software developers are required to build data pipelines on Cloudera using Hadoop, Spark, Python, PySpark, Parquet, Avro, Hive and Hue, with an understanding of and focus on sourcing and transforming data, establishing governance, and building flexible frameworks to transform and aggregate data.
Responsibilities: Hands-on Data Engineer to build data pipelines on Cloudera using Hadoop, Spark, Python, PySpark, Parquet, Avro, Hive and Hue, with an understanding of and focus on sourcing and transforming data, establishing governance, and building flexible frameworks to transform data and prepare it for aggregation.
Relevant Skills and Experience:
Familiarity with the Cloudera toolset desirable (Hue, Hive, Impala, Cloudera Data Science Workbench, HDFS, Avro, Parquet).
Essential
Python, Spark, PySpark, GitLab - development languages and code repository
Essential
Agile - to align with the team's current way of working and promote iterative development
Essential
Confluence, Jira, SharePoint - agile project tooling
Desirable
