
Data Engineer-IT

at The Jackson Laboratory in Bar Harbor, Maine, United States

Job Description

The Data Engineer builds, manages, and optimizes data pipelines into production for key Business Products. The incumbent plays a pivotal role in operationalizing large-scale data capture, data management, and data analytics initiatives. Data engineers comply with data quality, governance, and security requirements while creating and operationalizing integrated and reusable data pipelines for faster data access. This individual works closely with IT members, applying creativity, problem solving, and collaboration. This role will also work with collaborators outside of our department to architect and manage the flow and integration of externally sourced data. This role requires a deep understanding of data architecture, data engineering, data analysis, and reporting, and a basic understanding of data science techniques and workflows. The ideal candidate is a skilled data/software engineer with experience creating data products that support analytic solutions. The Data Engineer reports to a Product Manager, IT.

 

Key Responsibilities:

 

Build data pipelines: Architect, create, and maintain data pipelines. Manage data pipelines consisting of a series of stages through which data flows, from data sources to integration to end-user consumption. Build pipelines for data transformation, models, schemas, metadata, and workload management. Create, maintain, and optimize pipelines as workloads move from development to production for specific use cases. 35%

Drive automation through effective metadata management: Apply innovative and modern tools, techniques, and architectures to partially or completely automate the most common, repeatable data preparation and integration tasks in order to minimize manual and error-prone processes and improve productivity. Improve the data management infrastructure process to drive automation in data integration and management. Learn and use modern data preparation, integration, and AI-enabled metadata management tools and techniques. 25%

Collaborate with Product Managers to refine their data requirements and data consumption requirements for various data and analytics initiatives. Work with collaborators outside the Laboratory to integrate or ingest external data. 20%

Stay current with new innovations and propose modern data ingestion, preparation, integration, and operationalization techniques that best address the data requirements. Train counterparts in these data pipelining and preparation techniques to enable them to better analyze the data. Contribute to grant writing as needed. 10%

Comply with data quality, data governance and security requirements. 10%

Other duties as assigned.

 

Knowledge, Skills, and Abilities:

Proven experience in leading and mentoring others

Bachelor's degree in computer science or engineering with 3 years' experience in relevant area of computer science or engineering

Expertise with advanced analytics tools for object-oriented/object function scripting using languages such as MuleSoft, Python, Java, C++, Scala, and others

Expertise with popular database programming languages for relational and nonrelational databases to build, manage and integrate data pipelines. Experience with ETL processes

Familiar with multiple deployment environments and operating systems and containerization techniques

Ability to develop, recommend and execute solutions to complex problems often under time constraints

A positive attitude, a self-starter mindset, a sense of urgency, and the ability to perform successfully in a fast-paced environment

Familiarity with open source and commercial data science platforms

Exposure to multiple, diverse technologies and processing environments. Ability to quickly comprehend the functions and capabilities of new technologies.

Thorough knowledge of Microsoft SQL Server

Working knowledge of SQL Server Integration Services (SSIS)

Working knowledge of SQL Server Reporting Services (SSRS)

Bachelor of Science in Computer Science, Engineering, Mathematics, Statistics or related subject

Expertise in SQL programming language

Expertise designing and implementing data pipelines using modern data engineering approaches and tools: SQL, MuleSoft, Delta Lake, Databricks, Spark, Glue, NiFi, StreamSets, cloud-native DWH (BigQuery, Snowflake), Kafka/Confluent, Presto/Dremio/Athena

Experience with developing solutions on cloud computing services and infrastructure with Azure

Experience with database development using a variety of relational, NoSQL, and cloud database technologies

Experience working with BI tools such as Power BI

Conceptual knowledge of data and analytics, such as dimensional modeling, ETL, reporting tools, data governance, DWH, etc.

Exposure to machine learning, data science, computer vision, artificial intelligence, statistics, and/or applied... For full info follow application link.

 

All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, disability or protected veteran status.


Job Posting: 627922

Posted On: Oct 23, 2021

Updated On: Oct 23, 2021