- Next Retail — Senior Data Engineer (Self-Employed Contractor)
September 2023 - Present (1 year 9 months)
Technologies: GCP, Databricks, PySpark, Pandas, Python, AWS, Azure, Azure DevOps, SQL, Looker, BigQuery, dbt
Key Achievements:
• Developed and maintained Databricks data pipelines that deliver processed data to cloud storage (a minimal sketch follows this list).
• Managed Delta pipelines and Delta tables for multiple brands.
• Used Databricks notebooks for interactive data exploration, analysis, and visualization.
• Worked with Delta Lake on Databricks for reliable, scalable data storage.
• Processed Google Analytics data at cloud scale and made it readily accessible in BigQuery.
• Implemented comprehensive data quality checks and Pytest suites to ensure pipeline integrity and data reliability.
• Improved storage efficiency by refining the data model to minimize the data footprint.
• Migrated existing reports from Tableau to Looker, improving data visualization and accessibility.
• Built interactive Looker dashboards connected to BigQuery, surfacing marketing insights.
• Wrote and executed dbt models in SQL and Python.
• Designed schemas for tables shared between teams.
• Created CI/CD pipelines to automate dbt deployments.
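A minimal sketch of the kind of Databricks Delta pipeline described above. The bucket paths, column names, and brand partitioning are hypothetical placeholders; the Delta format is available out of the box on Databricks (elsewhere it requires the delta-spark package).

```python
from pyspark.sql import SparkSession, functions as F

# On Databricks a SparkSession already exists; getOrCreate() reuses it.
spark = SparkSession.builder.getOrCreate()

RAW_PATH = "s3://example-bucket/raw/orders/"      # hypothetical source path
DELTA_PATH = "s3://example-bucket/delta/orders/"  # hypothetical Delta target

def run_orders_pipeline() -> None:
    """Read raw order files, standardize columns, deduplicate,
    and append to a brand-partitioned Delta table (illustrative only)."""
    raw = (
        spark.read.format("json").load(RAW_PATH)
        .withColumn("order_date", F.to_date("order_ts"))
        .dropDuplicates(["order_id"])
    )
    (
        raw.write.format("delta")
        .mode("append")
        .partitionBy("brand")   # one partition per retail brand
        .save(DELTA_PATH)
    )

if __name__ == "__main__":
    run_orders_pipeline()
```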
- Qontigo — Senior Data Engineer (Self-Employed Contractor)
March 2023 - August 2023 (6 months)
Technologies: Python, GCP, Firebase, Compute Engine, BigQuery, Cloud SQL, Okta, CrushFTP, Cloud Functions, Liferay, Postgres, Snowflake, Apache Beam, dbt, PySpark, Pandas, Dataflow, Pub/Sub, Cloud Composer, data modeling, Data Vault, GitLab
• Migrated Liferay-based systems to GCP, leveraging cloud-native infrastructure and services.
• Used Cloud Functions to efficiently replicate file listings, ensuring data consistency and accessibility.
• Integrated Firebase and Okta to establish robust user access controls for data files.
• Installed and configured CrushFTP on Linux, giving users secure access to data files and mirroring Firebase's functionality.
• Integrated Postgres (Cloud SQL) with CrushFTP to facilitate data exchange.
• Built data pipelines to extract and load index and stock data into Cloud SQL, making it readily available via FTP and Cloud Functions.
• Used DataBP APIs to retrieve user entitlement information and integrate it into the product system.
• Ran a proof of concept (POC) with Apache Beam and Dataflow to evaluate tool suitability against project requirements (see the sketch after this list).
• Enhanced the Liferay data model for compatibility with Cloud SQL Postgres, ensuring data persistence and accessibility.
• Used dbt on UAT to thoroughly test the data model, load dimension data, and deploy the code to Snowflake.
• Integrated dbt with various data sources, including MySQL, PostgreSQL, and Snowflake.
• Created CI/CD pipelines to automate dbt deployments.
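A minimal sketch of the kind of Apache Beam POC pipeline mentioned above. The bucket paths and the three-column CSV layout are hypothetical; swapping DirectRunner for DataflowRunner (plus project/region options) would run the same pipeline on Dataflow.

```python
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

def run() -> None:
    # DirectRunner executes locally; use DataflowRunner on GCP.
    opts = PipelineOptions(runner="DirectRunner")
    with beam.Pipeline(options=opts) as p:
        (
            p
            | "ReadPrices" >> beam.io.ReadFromText("gs://example-bucket/indices/*.csv")
            | "ParseRows" >> beam.Map(lambda line: line.split(","))
            | "KeepValid" >> beam.Filter(lambda row: len(row) == 3)  # assumed schema
            | "FormatOut" >> beam.Map(lambda row: ",".join(row))
            | "WriteOut" >> beam.io.WriteToText("gs://example-bucket/clean/indices")
        )

if __name__ == "__main__":
    run()
```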
- Tamoco — Lead Data Engineer
April 2022 - March 2023 (11 months)
Technologies: AWS, GCP, Snowflake, Python, PySpark, Pandas, BigQuery, Airflow, Bitbucket, MySQL, dbt, TypeScript, data modeling, Data Vault
Key Achievements:
• Streamlined Operations: Migrated to AWS and optimized pipelines to minimize manual intervention.
• Enhanced Geospatial Data Pipelines: Upgraded existing geospatial pipelines using Python, PySpark, and Airflow, improving data reliability and efficiency (an illustrative DAG sketch follows this list).
• Team Leadership: Led a team of three engineers.
• Agile Development and Bug Resolution: Resolved existing bugs promptly and efficiently.
• Data Exploration and Support: Supported data analysts with data exploration and bug fixes, safeguarding data integrity and accessibility.
• Sample Pipelines for Potential Customers: Developed sample pipelines to demonstrate capabilities to prospective customers.
• Cost-Effective Operations: Continuously monitored costs and planned work accordingly to optimize resource utilization.
• Privacy-Compliant Pipelines: Built pipelines compliant with geographic data privacy regulations, protecting user privacy.
• CI/CD and Automation: Established CI/CD, MWAA, and Athena pipelines for seamless deployment and automation.
• Future Improvement Plans: Collaborated with the team on improvement plans with defined deadlines and cost approvals.
• TypeScript for Automated Deployments: Wrote TypeScript scripts for automated deployments, improving operational efficiency.
• Data Model on AWS: Created a new data model on AWS, accessible via Athena, to ease data access and analysis.
• dbt and Snowflake Deployment: Used dbt to test the data model, load dimension data, and deploy the code to Snowflake.
• Created CI/CD pipelines to automate dbt deployments.
• Integrated dbt with Snowflake and GCP data models.
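A minimal sketch of an Airflow DAG of the kind used for the geospatial pipelines above, assuming hypothetical file locations and a simple lat/lon validity filter; a real pipeline would read from object storage and do far more.

```python
from datetime import datetime

import pandas as pd
from airflow import DAG
from airflow.operators.python import PythonOperator

# Hypothetical locations; real pipelines would use S3/GCS paths.
RAW_EVENTS = "/data/geo/raw/events.csv"
CLEAN_EVENTS = "/data/geo/clean/events.parquet"

def clean_geospatial_batch() -> None:
    """Drop rows whose coordinates fall outside valid lat/lon ranges."""
    df = pd.read_csv(RAW_EVENTS)
    valid = df[df["lat"].between(-90, 90) & df["lon"].between(-180, 180)]
    valid.to_parquet(CLEAN_EVENTS, index=False)

with DAG(
    dag_id="geo_cleaning_example",  # hypothetical DAG name
    start_date=datetime(2022, 4, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    PythonOperator(
        task_id="clean_geospatial_batch",
        python_callable=clean_geospatial_batch,
    )
```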
- Education: Diploma in Data Science; Diploma, Business Management; BCA