at The Jackson Laboratory in Bar Harbor, Maine, United States
The Data Engineer builds, manages, and optimizes data pipelines and moves them into production for key Business Products. The incumbent plays a pivotal role in operationalizing large-scale data capture, data management, and data analytics initiatives. Data engineers comply with data quality, governance, and security requirements while creating and operationalizing integrated and reusable data pipelines for faster data access. This individual interfaces with IT members and applies creativity, problem-solving, and collaboration in working with them. This role will also work with collaborators outside of our department to architect and manage the flow and integration of externally sourced data. This role requires a deep understanding of data architecture, data engineering, data analysis, and reporting, and a basic understanding of data science techniques and workflows. The ideal candidate is a skilled data/software engineer with experience creating data products supporting analytic solutions. The Data Engineer reports to a Product Manager, IT.
Build data pipelines: Architect, create, and maintain data pipelines. Manage data pipelines consisting of a series of stages through which data flows, from data sources to integration to end-user consumption. Build pipelines for data transformation, models, schemas, metadata, and workload management. Create, maintain, and optimize pipelines as workloads move from development to production for specific use cases.
Drive Automation through effective metadata management: Apply innovative and modern tools, techniques, and architectures to partially or completely automate the most common, repeatable data preparation and integration tasks in order to minimize manual and error-prone processes and improve productivity. Improve the data management infrastructure process to drive automation in data integration and management. Learn and use modern data preparation, integration, and AI-enabled metadata management tools and techniques.
Collaborate with Product Managers to refine their data requirements and data consumption requirements for various data and analytics initiatives. Work with collaborators outside the Laboratory to integrate or ingest external data.
Stay current with new innovations and propose modern data ingestion, preparation, integration, and operationalization techniques that best address the data requirements. Train counterparts in these data pipelining and preparation techniques to enable them to better analyze the data. Contribute to grant writing as needed.
Comply with data quality, data governance, and security requirements.
Bachelor's degree in Computer Science, Engineering, Mathematics, Statistics, or a related subject, with 3 years' experience in a relevant area of computer science or engineering
Expertise with advanced analytics tools for object-oriented/object function scripting using languages such as MuleSoft, Python, Java, C++, Scala, and others
Expertise with popular database programming languages for relational and nonrelational databases to build, manage and integrate data pipelines. Experience with ETL processes
Familiarity with multiple deployment environments, operating systems, and containerization techniques
Thorough knowledge of Microsoft SQL Server
Working knowledge of SQL Server Integration Services (SSIS)
Working knowledge of SQL Server Reporting Services (SSRS)
Expertise in SQL programming language
Expertise designing and implementing data pipelines using modern data engineering approaches and tools: SQL, MuleSoft, Delta Lake, Databricks, Spark, Glue, NiFi, StreamSets, cloud-native DWH (BigQuery, Snowflake), Kafka/Confluent, Presto/Dremio/Athena
Experience with developing solutions on cloud computing services and infrastructure with Azure
Experience with database development using a variety of relational, NoSQL, and cloud database technologies
Experience with Business Intelligence tools such as Power BI
Conceptual knowledge of data and analytics, such as dimensional modeling, ETL, reporting tools, data governance, DWH, etc.
Exposure to machine learning, data science, computer vision, artificial intelligence, statistics, and/or applied mathematics
Exposure to Agile software development life cycle process and concepts such as Continuous Integration, Continuous Delivery, and Test-Driven Development
Ability to develop, recommend, and execute solutions to complex problems, often under time constraints
Familiarity with open source and commercial data science platforms
Exposure to multiple, diverse technologies and processing environments. Ability to quickly comprehend the functions and capabilities of new technologies.
Strong advocate of a culture of process and data quality across all development teams
A positive attitude,...
All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, disability or protected veteran status.