Description

TL;DR

Working remotely (90% of the time)

2016 -> Now: Software / Data (Big Data) / Azure Engineer

2010 -> 2016: Software engineer (mainly Microsoft technologies)

2006 -> 2010: R&D engineer

A bit more:

I received my PhD (in data mining) in 2011. My work focused on designing a statistical data analysis model, Symbolic Data Analysis (SDA), combined with relational algebra and nested algebra. Thanks to the algebraic soundness and completeness of the model, a set of statistical algorithms (k-means, hierarchical clustering, and other data mining / machine learning methods) could be expressed as simple queries. All of this was embedded in SQL Server using C#.

I then moved to the software industry, working in various sectors (stock market, financial institutions, transport, insurance, building, energy, media) as a software developer (mainly in .NET).

Since 2016, I have been working on big data/Azure projects as a data engineer, covering massive data processing, data quality, and data migration.

Languages

Arabic: Native or bilingual
French: Native or bilingual
English: Fluent
Spanish: Fluent

Workplace preferences

Can work on-site

Clermont-Ferrand (up to 50km), Paris (up to 30km), Lyon (up to 30km)

SAUR
Data Azure Engineer & AI-Driven Data Platforms
ENVIRONMENTAL
December 2024 - May 2026 (1 year and 5 months)
Data Engineer specialized in designing, developing, and operating large-scale cloud-native data platforms on Microsoft Azure, supporting both batch and real-time processing across multiple business domains. Strong focus on data engineering, cloud infrastructure, and AI-assisted engineering to enhance system reliability, scalability, and productivity.

- Delivered and operated a Lakehouse architecture (Medallion model) handling high-volume data workflows

- Managed and optimized 230+ ingestion pipelines using Azure Data Factory

- Developed batch pipelines using Databricks, PySpark, dbt, and Databricks Asset Bundles

- Built and operated real-time event-driven applications using Azure Event Hub (Kafka), Spark Streaming, and AKS

- Integrated messaging systems (RabbitMQ) for event processing and orchestration

- Managed Azure SQL databases with large-scale datasets

- Containerized and deployed services using Docker and Helm

- Automated infrastructure provisioning with Terraform and CDKTF

- Designed and maintained CI/CD pipelines using Azure DevOps

- Implemented end-to-end observability (Grafana, Log Analytics, KQL)

- Provided L3 support and root cause analysis for critical incidents

- Leveraged GitHub Copilot and LLM-based tools to accelerate code generation, refactoring, and analysis across complex repositories

- Applied AI-assisted engineering and agentic workflows to improve troubleshooting, incident resolution, and documentation

- Contributed to reusable knowledge and AI-enabled engineering practices, enhancing onboarding and decision-making

- Accelerated upskilling (with AI) on dbt, CDKTF, Docker Compose, and Azure DevOps pipelines

Tech stack: Azure Data Factory, Databricks, PySpark, dbt, Terraform, CDKTF, Azure SQL, AKS, Event Hub, RabbitMQ, Azure DevOps, Docker, Helm, Grafana, Generative AI, LLMs, GitHub Copilot
PySpark, Databricks, dbt, Kubernetes, LLM
KPN
Data Engineer / Developer
TELECOMMUNICATIONS
March 2022 - December 2024 (2 years and 9 months)
Netherlands
Mission: As a member of the DPPR team within Data Dragon (20 people), I contributed to the design, delivery, and L1/L2/L3 support of a platform used by 15+ teams and 100+ users to provision Azure infrastructure and perform data loading within Data Mesh.
Key Responsibilities:
- Onboard and support on-premise big data projects migrating to Azure throughout the full migration lifecycle
- Maintain and evolve an in-house framework used to integrate data sources and load data into raw, core, mirror layers, and Teradata
- Enable new platform capabilities across Azure services such as AKS, HDInsight, Databricks, and Oracle GoldenGate
- Deliver custom features including Airflow integrations, advanced Spark capabilities such as dynamic allocation, and configuration-driven enablement
- Automate feature delivery using Jinja templating and custom YAML-based configuration
- Resolve platform security vulnerabilities across multiple components
Security and AI-assisted remediation:
- Started using ChatGPT in an ask-driven mode to accelerate investigation and remediation of vulnerabilities reported by the SRT+ tool across the platform stack
- Worked on remediation for Linux VMs, Oracle GoldenGate, Airflow, Docker images, and other infrastructure or runtime components
- Applied recommended fixes such as package upgrades, repository additions, dependency updates, and hardening actions
- Rebuilt components directly from source code when no official patched release was available yet
- Used AI assistance to speed up analysis, compare remediation paths, and support secure resolution workflows across heterogeneous technical environments
- Accelerated upskilling on CVE remediation on Linux, Docker image optimization, and Oracle GoldenGate
Tech Stack:
Terraform, Azure HDInsight, Databricks, Docker, AKS, Virtual Desktop, PostgreSQL, SQL Server, WSL, Bash, Azure CLI, Oracle GoldenGate, Python, Spark, PySpark, Hive, Jinja2, Livy, Airflow, Git, Teradata, Robot Framework, ChatGPT
EM2 Technologies
Data engineer
March 2014 - Today (12 years and 3 months)
France
Key Responsibilities:
- Set up data processing pipelines (Airflow, Jenkins, Spark, Talend)
- Optimize data workloads (Spark Map/Reduce)
- Monitor and debug data pipelines (Elastic, Kibana, Spark UI)
- Cost optimization (Financial Operations, FinOps)
- Integrate and expose APIs
- Security fixes (SRT+, Azure Policies, Azure Advisor)
Achievements:
- 30-50% cost reductions with HDInsight
- Involved in data projects to onboard/migrate 400+ data projects to the new Data Mesh organization
- Trained 50+ students on Apache Spark, HBase, Hive, Hadoop, and other big data technologies
- Automated infrastructure setup for 100+ projects
Tech Stack:
Azure (HDInsight, Databricks, Kubernetes, Docker), Airflow, DB (SQL Server, MySQL, PostgreSQL, etc.), Terraform


Doctor of Philosophy (PhD), Computer Science
Université Paris Dauphine
2010
Master of Science in Computer Science
Université Paris Dauphine
2006
Master's degree, Computer Science


CCD-410 Cloudera Certified Developer for Apache Hadoop (CCDH)
Cloudera
2015
Hadoop
Databricks Certified Associate Developer
Databricks
https://api.accredible.com/v1/frontend/credential_website_embed_image/certificate/111988568
Adaptive query execution, Spark SQL functions, PySpark, Scala, UDFs, Spark DataFrames, Spark architecture

Omar Merroun

Spark Expert - Azure - Developer
