Key Responsibilities
Pipeline Development:
Design, develop, and maintain end-to-end ETL/ELT pipelines using Python and PySpark.
Big Data Processing:
Build large-scale data processing frameworks to handle structured and unstructured data, ensuring high performance and reliability.
Cloud Infrastructure:
Architect and manage data solutions within the GCP ecosystem, focusing on cost-efficiency and security.
Data Modeling:
Design and implement robust data warehouse models (Star/Snowflake schemas) and data lake architectures.
Optimization:
Identify, design, and implement internal process improvements, such as automating manual processes and optimizing data delivery for greater scalability.
Collaboration:
Work closely with stakeholders to understand data requirements and translate them into technical specifications.
Technical Qualifications
Core Programming: Strong profic...