We are looking for Data/Machine Learning engineers at all levels to help us build a robust and scalable data platform that supports AI/ML data pipelines, reporting, and data analysis as our business scales. We use cutting-edge, cloud-native (AWS) technologies such as Spark, Kinesis/Kafka streaming, Graph, infrastructure as code, and CI/CD to deliver high-quality data solutions to analysts, data scientists, and partners.
We’re looking for an engineer who takes ownership of their work, has a strong focus on quality, and enjoys working in a collaborative environment.
- Apply proven expertise to build high-performance, scalable data warehouses
- Design, build and launch efficient & reliable data pipelines to move and transform data (both large and small amounts)
- Securely source external data from numerous partners
- Intelligently design data models for optimal storage and retrieval
- Deploy comprehensive data quality checks to ensure high data quality
- Optimize existing pipelines and maintain all domain-related data pipelines
- Own the end-to-end data engineering component of the solution
- Collaborate with the Data Center SMEs, Data Scientists, and Program Managers
- Cover on-call shifts as needed to support the team
- Design and develop new systems in partnership with software engineers to enable quick and easy consumption of data
- Bachelor’s degree or Master’s degree in Computer Science, Software Engineering, or a related field
- 5+ years of SQL experience (Oracle, Vertica, Hive, etc.) and relational database experience (Oracle, MySQL)
- 5+ years of experience in custom or structured (e.g. Informatica/Talend/Pentaho) ETL design, implementation and maintenance
- 5+ years of experience in data engineering, including applying DWH/ETL best practices
- 5+ years of Java and/or Python development experience
- 5+ years of experience in LAMP and Big Data stack environments (Hadoop, MapReduce, Hive)
- 5+ years of experience working with enterprise DE tools and experience learning in-house DE tools
- 3+ years of experience with the AWS data solutions stack – EMR, S3, Redshift, Kinesis, ECS, Docker
- 3+ years of experience with the CI/CD stack – Jenkins, Git
- Master’s degree or Bachelor’s degree in a related field.
- Cloudera Administrator certification.
- Office environment.