- Security Clearance required (SC level)
- Start Date: ASAP
- Deadline: 29.09.21
- Good knowledge of PySpark and demonstrable experience using PySpark on big data projects (Essential)
- Experience with spatial/geospatial data and geospatial work (Beneficial)
- Develop methods for further expanding the use of UN Global AIS shipping data.
- Maintenance, improvement and further development of the processing pipeline for the current Shipping Faster Indicator.
- Mentoring and knowledge sharing within the project team and Data Science Campus.
- Apply PySpark and Python for data and software engineering on UN Global AIS shipping data, maintaining and improving the current processing pipeline for the Shipping Faster Indicator.
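The pipeline work above centres on processing AIS position reports at scale. As a minimal illustration of the core grouping step (plain stdlib Python for portability; the real pipeline would use PySpark DataFrames, and all field names and sample values here are assumptions), counting port visits per ship from AIS-style records might look like:

```python
from collections import defaultdict

# Hypothetical AIS-style records: (mmsi, port, timestamp) tuples.
# In practice these would be PySpark rows derived from the UN Global
# AIS feed; the schema shown here is purely illustrative.
records = [
    ("244660000", "Felixstowe",  "2021-09-01T08:00"),
    ("244660000", "Felixstowe",  "2021-09-01T09:30"),
    ("244660000", "Rotterdam",   "2021-09-03T14:00"),
    ("235082896", "Southampton", "2021-09-02T11:15"),
]

def port_visits(rows):
    """Count ship visits per port.

    Repeated reports from the same ship in the same port are
    collapsed to a single visit by deduplicating on (mmsi, port).
    """
    seen = set()
    counts = defaultdict(int)
    for mmsi, port, _ts in rows:
        if (mmsi, port) not in seen:
            seen.add((mmsi, port))
            counts[port] += 1
    return dict(counts)

print(port_visits(records))
# → {'Felixstowe': 1, 'Rotterdam': 1, 'Southampton': 1}
```

In PySpark the same step would typically be a `dropDuplicates` on the ship/port pair followed by a `groupBy("port").count()`.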
- Apply time series, machine learning and data science techniques to improve the shipping faster indicator as outlined in the project scope.
- Data visualisation techniques and reporting.
- Regular reporting to the Senior Leadership Team
- Maintaining and developing the shipping faster indicator
- Improved Shipping Faster Indicators (indicative areas for improvement: ship visits by ship type, time-based port indicators, ship journey matrices by direction, cargo flow matrices, estimation of container build-up and individual business activity).
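One of the indicative improvements above, ship journey matrices by direction, can be sketched as a directed origin-destination count over consecutive port calls. This is a stdlib Python illustration under stated assumptions (per-ship call sequences already sorted by time, hypothetical port names), not the Campus's actual implementation:

```python
from collections import Counter

# Hypothetical ordered port-call sequences per ship. Assumption:
# calls are already time-sorted, as the real pipeline would ensure.
port_calls = {
    "ship_a": ["Felixstowe", "Rotterdam", "Felixstowe"],
    "ship_b": ["Southampton", "Rotterdam"],
}

def journey_matrix(calls):
    """Count directed journeys between consecutive port calls.

    (origin, dest) is kept distinct from (dest, origin), giving the
    'by direction' property of the matrix.
    """
    matrix = Counter()
    for ports in calls.values():
        for origin, dest in zip(ports, ports[1:]):
            matrix[(origin, dest)] += 1
    return matrix

m = journey_matrix(port_calls)
print(m[("Felixstowe", "Rotterdam")])   # → 1
print(m[("Rotterdam", "Felixstowe")])   # → 1 (the reverse leg, counted separately)
```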
- The Data Science Campus has already developed, in collaboration with the Faster Indicators team, a Shipping Faster Indicator. Ensure consistency between the existing Shipping indicator and the Improved Shipping Faster Indicators.
- Work with the delivery team following agile practices.
- Protect security of data and code as required.
- Version controlled and well documented, clean, maintainable code for the improved Shipping Faster Indicator.
- Regular automated reporting of the Shipping Faster Indicator.
- Regular updates to the Faster Indicators team and Data Science Campus scientists to ensure that the product meets user needs.
- Support Emerging Platforms in the publication of the Shipping Faster Indicator.
- Writing technical blogs, scientific papers, and reports.
- Presentation to stakeholders.
You should be able to demonstrate significant expertise in at least one of the following areas:
(Note: We do not expect any one person to cover all the skills listed; scores will be awarded based on the breadth and depth of your experience).
* Data analytics - such as supervised and unsupervised machine learning, natural language processing, geospatial analysis, econometrics and regression, microdata, and causal inference
* Data management/curation - such as the manipulation and analysis of complex, high volume and high dimensionality data, distributed processing, relational and non-relational databases, cloud storage and data management, interoperability and standardisation for data, metadata management, knowledge of UK datasets
* Data engineering - such as the design of algorithms, implementation of big data solutions, multi-core/distributed processing, SQL and NoSQL database systems, statistical analysis languages and tooling
* Storytelling and data visualisation - including the visualisation of insights drawn from data and the building of data driven products
* Scientific research methods - analytical, critical, and curious analysis of data or modelling, ability to master the scientific literature, a track record of high-quality publications
* Software engineering - software installation and distribution, algorithm and implementation optimisation, design and/or implementation of user interfaces, and the software lifecycle
* A portfolio of open-source projects and contributions
* Any other tools, techniques or programming languages, for example, D3, Hive tables, or GPU programming

Some travel to other Campus and Government locations will be required when restrictions allow.