Data Quality Framework for a Business-to-Consumer (B2C) Marketing environment.
Build the vision and innovation roadmap for the Real Time Data Factory (RTDF) on GCP, including AI/ML-based data recovery, data-poisoning detection, etc.
Build a deployment and technology roadmap aligned to the innovation roadmap.
Work with the application development team to implement the data strategies, build data flows and develop conceptual data models.
Design solutions using Dataproc, Dataflow, BigQuery, Cloud Composer/Airflow, Data Fusion, and other ETL tools.
Design an end-to-end, scalable CI/CD architecture with automated production gatekeeping.
Analyze data-related system integration challenges and propose appropriate solutions.
Assemble large, complex data sets that meet functional / non-functional business requirements.
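To make the data-flow responsibilities above concrete, here is a toy sketch of the map / group-by-key / aggregate shape that Spark and Beam pipelines share. It is a pure standard-library illustration (a word count, with all names hypothetical), not an actual Beam or Spark program:

```python
from collections import defaultdict

def run_wordcount(lines):
    """Toy batch pipeline: map -> group-by-key -> aggregate,
    the same shape a Beam/Spark word count takes."""
    # Map: emit (word, 1) pairs
    pairs = [(w.lower(), 1) for line in lines for w in line.split()]
    # Group by key: collect all counts for each word
    grouped = defaultdict(list)
    for word, n in pairs:
        grouped[word].append(n)
    # Aggregate: sum the counts per word
    return {word: sum(ns) for word, ns in grouped.items()}

print(run_wordcount(["the quick fox", "the lazy dog"]))
```

In a real Dataflow job the same three stages would be expressed as Beam transforms over a PCollection, with the runner handling distribution and scaling.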
Who we're looking for:
The candidate must have experience using the following software/tools:
o Strong command of GCP services: Compute Engine VMs, Dataproc, Dataflow, BigQuery, Data Fusion, Cloud Composer/Airflow, GKE, Flux, Pub/Sub
o Experience with data-processing tools and frameworks: Apache Spark, Apache Beam
o Experience with stream-processing systems
o Experience with object-oriented/functional scripting languages: Python, Java, etc.
o Experience with relational SQL and NoSQL databases
o Experience with data pipeline and workflow management tools

NICE TO HAVE
o Cross-location distributed delivery
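The workflow management tools listed above (Cloud Composer/Airflow) schedule tasks as a DAG: each task runs only after its upstream dependencies complete. A minimal standard-library sketch of that idea, using a hypothetical extract/transform/load pipeline rather than real Airflow operators:

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

def run_dag(tasks, deps):
    """Run task callables in dependency order.

    tasks: {name: callable(results_dict) -> value}
    deps:  {name: set of upstream task names}
    This mirrors, in miniature, what Airflow does when it
    schedules a DAG of operators.
    """
    order = list(TopologicalSorter(deps).static_order())
    results = {}
    for name in order:
        # Each task can read the outputs of its upstream tasks
        results[name] = tasks[name](results)
    return order, results

# Hypothetical extract -> transform -> load pipeline
tasks = {
    "extract":   lambda r: [1, 2, 3],
    "transform": lambda r: [x * 10 for x in r["extract"]],
    "load":      lambda r: sum(r["transform"]),
}
deps = {"transform": {"extract"}, "load": {"transform"}}
order, results = run_dag(tasks, deps)
print(order)    # upstream tasks always appear before their dependents
```

In Airflow the same dependencies would be declared with operators and `>>` chaining, and the scheduler, rather than a loop, decides when each task runs.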