AWS Cloud Data Engineer

Position Responsibilities / Accountabilities:

Implements and enhances complex data processing pipelines with a focus on collecting, parsing, cleaning, managing and analyzing large data sets that produce valuable business insights and discoveries.
Determines the infrastructure, services, and software required to build advanced data ingestion and transformation pipelines and solutions in the cloud.
Assists data scientists and data analysts with data preparation, exploration and analysis activities.
Applies problem-solving experience and knowledge of advanced algorithms to build high-performance, parallel, and distributed solutions.
Performs code and solution review activities and recommends enhancements that improve efficiency, performance, and stability and decrease support costs.
Applies the latest DevOps and Agile methodologies to improve delivery time.
Works with Scrum teams in daily stand-ups, providing frequent progress updates.
Supports applications, including incident and problem management.
Debugs and triages incidents and problems and deploys fixes to restore services.
Documents requirements and configurations and clarifies ambiguous specifications.
Performs other duties as assigned by management.

Minimum Requirements:

3+ years of experience working in large-scale data integration and analytics projects, using AWS cloud (e.g. AWS Redshift, S3, EC2, Glue, Kinesis, EMR), big data (Hadoop) and orchestration (e.g. Apache Airflow) technologies
3 years of experience with SQL, Python, and Spark
3+ years of experience in implementing distributed data processing pipelines using Apache Spark
3+ years of experience in designing relational/NoSQL databases and data warehouse solutions

Nice to have, but not required:

2+ years of experience in writing and optimizing SQL queries in a business environment with large-scale, complex datasets.
2+ years of Unix/Linux operating system knowledge (including shell programming).
2+ years of experience in automation/configuration management tools such as Terraform, Puppet or Chef.
2+ years of experience in container development and management using Docker.
2+ years of experience in continuous integration tools (e.g. Jenkins).
Basic knowledge of machine learning algorithms and data visualization tools such as Microsoft Power BI and Tableau.

Nearest Major Market:
New York City

Job Segment:
Cloud, Database, SQL, Linux, Unix, Technology