9-month contract
SC clearance required
As indicated above, the role requires experience of manipulating big data using Hadoop-based tools such as Hive, Impala and Spark, so experience of data manipulation and querying using Python and SQL in such tools is a core element of the position.
The contractor will need to develop an understanding of key government operational data and quickly become familiar with Census data and collection methods. Specifically, the role will involve:
- Collaborating with key members of the Data Engineering team to develop automated coding solutions for a range of ETL, data cleaning, structuring and validation processes;
- Working with area leads across the broader Data Architecture Division to provide ad-hoc coding support on a range of Data Architecture projects that use cross-government data;
- Forming part of a joint project team with the Census group to deliver a number of primary data outputs in support of the 2019 Census Rehearsal;
- Providing training and coaching to new members of staff across the Data Engineering team.
Skills and Experience
- Extensive proven experience of data engineering and architectural techniques, including data wrangling, data profiling, data preparation, metadata development, and data upload/download;
- Proven experience of 'big data' environments, including the Hadoop stack (Cloudera), covering data ingestion, processing and storage using HDFS, Spark, Hive and Impala;
- Extensive hands-on experience of developing ETL functionality in a cloud or on-premise environment;
- Experience of using tools such as Python and SQL (in Spark) to profile, query and structure large-volume data;
- Proven experience of using cloud services, particularly in the context of Hadoop;
- Experience of developing with and using programming and query languages, e.g. SQL (Hive and Impala specifically), Python (through Spark) and Scala;
- SC-level clearance valid for at least 1 year on commencement of the contract;
- Understanding of databases and of applying data models;
- Experience of coaching and training others in programming and ETL techniques;
- Experience of UK Government administrative data.