Project description
As a Data Engineer you'll work alongside data architects to take data through its entire lifecycle: acquisition, exploration, cleaning, integration, analysis, interpretation and visualization. You will build data processing pipelines and own end-to-end data handling, management and analytics processes.
Your tasks
- Developing data processing pipelines
- Developing, constructing, testing and maintaining data flow architectures
- Developing dataset processing workflows
- Identifying ways to improve data reliability, efficiency and quality
- Using large data sets to address technical and business issues
- Designing, developing and maintaining large, scalable databases
- Performing hands-on DevOps work to keep the data platform/product secure and reliable
- Working closely with product managers and data scientists to bring various datasets together and support business intelligence and analytics use cases
- Working with other team members or stakeholders to understand objectives and gather requirements
Who we're looking for
Requirements:
- 3+ years of industry experience in large-scale data management
Must Have:
- Proven hands-on data engineering and ETL experience
- Solid experience with data modelling, data warehouse design and data lake concepts and practices
- Experience working in Python
- Experience delivering data products with Kafka and Spark, preferably using PySpark
- SQL data engineering skills and experience working with both analytical and relational databases
- Fluent in English
Nice to have:
- Knowledge of Hadoop ETL tools (Sqoop, Impala, Hive, Oozie)
- Exposure to working in a Microsoft Azure data platform environment (Data Factory, Storage (Blob or Data Lake), Databricks, Functions, Synapse)
- Exposure to working with cloud services (AWS, GCP, Azure)
- Experience with workflow management systems such as Airflow
- Experience with Bash scripting
Healthcare
- Healthcare package
Training
- Conferences
- Trainings