Amine Bousmah

Data & AI Engineer

I turn data into decisions. I shape messy data into simple answers, work closely with teams, and ship things that truly help people. 🤝🚀

Python · SQL · Airflow · Spark · Cloud (GCP / AWS / Azure) · Databricks · dbt · Snowflake · BigQuery · TensorFlow · PyTorch · scikit-learn · MLflow · Docker · Git · CI/CD · API · Jira · Power BI · Tableau · Kafka · C# · Dataiku DSS

Technical Skills

🛠️

Data Engineering & Analytics

Foundations for reliable data systems.

ETL/ELT with Python & SQL; orchestration basics (dbt/Airflow).
Star/snowflake modeling and essential warehousing patterns.
Data validation and tests with clear SLAs/expectations.
Query optimization fundamentals and cost-aware thinking.
Small streaming prototypes (Kafka) + batch/stream joins.
🤖

Machine Learning & Modeling

Pragmatic ML with strong evaluation discipline.

Solid baselines (linear, tree-based) before complex models.
Time-series forecasting (ARIMA/Prophet, boosting) when useful.
Feature pipelines with leakage-safe cross-validation.
Explainability (SHAP/feature importances) and readable model cards.
Experiment tracking (MLflow) and packaging models for APIs.
📊

BI & Data Visualization

Make results clear, trusted, and actionable.

Defined KPIs and a simple semantic/metric layer.
Interactive dashboards with filters and drill-through.
Data storytelling: annotations, small multiples, clear legends.
Row-level security basics and governance-ready layouts.
Scheduled refreshes, exports, and lightweight QA checks.
🧩

Application Design & API

Product-minded developer focused on clean, secure services.

DDD-lite: clear module boundaries and dependency rules.
REST APIs (FastAPI/Flask) with typed schemas and OpenAPI.
Auth (JWT/OAuth2), input validation, and robust error handling.
Background jobs (Celery/RQ), file ingestion, async I/O.
Frontend integration with React/Next and reusable UI patterns.
☁️

Cloud & DevOps

Ship small, observe, and iterate.

Containerized dev with Docker; reproducible environments.
CI/CD (GitHub Actions): tests, linting, type checks.
Deploy on Vercel/Cloud Run; env & secrets management.
Basic monitoring (logs/metrics/traces) and alerting.
Cost awareness and usage-based scaling (serverless first).
💹

Data Finance & Revenue Analytics

Applied analytics for markets, risk, and growth.

Credit risk scorecards & PD estimation; calibration & backtesting.
Market analytics: returns/volatility, simple VaR/ES & stress tests.
Fraud/AML: anomaly detection and KYC/KYB entity matching.
Portfolio analytics: mean-variance, factor tilts, Black-Litterman basics.
Documentation & governance aligned with IFRS/Model Risk standards.
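As an illustration of the "simple VaR/ES" item above, here is a minimal historical-simulation sketch. The function name, the sample returns, and the tail-size rounding convention are all assumptions made for this example; production risk models typically use longer histories and a documented quantile convention.

```python
def historical_var_es(returns, confidence=0.95):
    """Historical-simulation VaR and Expected Shortfall, reported as positive losses.

    Assumption: the tail holds round(n * (1 - confidence)) observations
    (at least one). Other quantile conventions exist.
    """
    if not returns:
        raise ValueError("returns must not be empty")
    # Convert returns to losses and sort so the worst loss comes first.
    losses = sorted((-r for r in returns), reverse=True)
    tail_size = max(1, int(round(len(losses) * (1 - confidence))))
    tail = losses[:tail_size]
    var = tail[-1]               # smallest loss inside the tail
    es = sum(tail) / len(tail)   # average loss within the tail
    return var, es


# Hypothetical daily returns, for illustration only.
sample_returns = [0.01, -0.02, 0.005, -0.035, 0.012,
                  -0.008, 0.02, -0.015, 0.003, 0.007]
var_80, es_80 = historical_var_es(sample_returns, confidence=0.8)
```

With ten observations and an 80% confidence level, the tail contains the two worst days, so ES is their average and VaR is the milder of the two.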

Selected Projects

Vinted Extension: Smart auto-repost to boost visibility

Browser extension that automatically republishes listings to leverage Vinted's algorithmic boost. Features safe scheduling, anti-duplicate logic and local anti-tracking to maximize views and click-throughs without manual effort.

Key results

95% automation
85% time saved
30% view uplift

Technical implementation

  • Chrome Extension (content + background service worker)
  • Task scheduler, de-duplication & cooldown management
  • Local headers/cookies handling; optional Express helper
  • Image helpers (crop/compress) when needed

Tribara: Talent Matching Optimization

AI-powered recruitment optimization to automate candidate screening and ranking, integrated with ATS for seamless workflows. Delivered faster shortlists and more relevant matches for recruiters.

Key results

50% screening time reduction
500 CV volume
30% relevance gain

Technical implementation

  • Python ETL & ML pipeline (parsing + scoring)
  • NLP-based candidate ranking with continuous fine-tuning
  • ATS integration (webhooks/API) & scoring feedback loop
  • Dashboard & export for recruiter decision support

Face Recognition: Find all photos of a person

Application that lets you upload a few photos of yourself to automatically detect all occurrences within an event album (ideal for team building/seminars).

Key results

91.4% detection AP
99.83% LFW accuracy
10,000 index size

Technical implementation

  • Face detection: RetinaFace/SCRFD (InsightFace)
  • Face embeddings: ArcFace (InsightFace, 512-d vectors)
  • Similarity search & scaling: FAISS (IVF/PQ or HNSW)
  • De-duplication & robustness: thresholding + DBSCAN; multi-reference averaging
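The matching step listed above (multi-reference averaging followed by similarity thresholding) can be sketched in a few lines. This is a toy illustration, not the project's code: it uses 3-d vectors instead of 512-d ArcFace embeddings, a linear scan instead of a FAISS index, and the function names and the 0.9 threshold are assumptions chosen for the example.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def average_reference(embeddings):
    """Average several normalized reference embeddings into one template."""
    normalized = [l2_normalize(e) for e in embeddings]
    dim = len(normalized[0])
    mean = [sum(e[i] for e in normalized) / len(normalized) for i in range(dim)]
    return l2_normalize(mean)


def find_matches(template, album, threshold):
    """Return (photo_id, score) pairs whose cosine similarity meets the threshold."""
    hits = []
    for photo_id, emb in album:
        score = sum(a * b for a, b in zip(template, l2_normalize(emb)))
        if score >= threshold:
            hits.append((photo_id, score))
    # Best matches first.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical toy data: two reference photos of one person, three album faces.
references = [[1.0, 0.1, 0.0], [0.9, 0.0, 0.1]]
album = [
    ("img_001", [1.0, 0.05, 0.05]),
    ("img_002", [0.0, 1.0, 0.0]),
    ("img_003", [0.8, 0.2, 0.0]),
]
template = average_reference(references)
matches = find_matches(template, album, threshold=0.9)
```

Averaging the references before searching means one query per person rather than one per uploaded photo; the threshold would need tuning on real data.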

11Field: Football analytics & scouting suite

End-to-end scouting toolkit: xG/xGA, role-based radars, league comparators, match reports and player similarity. Adds ML models for clustering and explainability to support recruitment decisions.

Key results

12% leagues covered
40% modeled features
60% time to insight

Technical implementation

  • Data ingestion from public football APIs (FBref/ESPN/ClubElo, etc.)
  • Interactive dashboards (Streamlit + Plotly)
  • PCA + KMeans for playing-style clusters
  • RandomForest + SHAP for explainable player ranking

Modern Data Capabilities

🔌

Ingestion & Connectivity

  • REST/GraphQL, webhooks, SaaS & DB connectors
  • Batch files (CSV/Parquet) + CDC/event streams
  • Secrets, retries, backoff, idempotency
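The "retries, backoff, idempotency" item can be illustrated with a short sketch. Everything here is hypothetical: `TransientError`, `call_with_backoff`, and `IdempotentSink` are names invented for this example, and the in-memory sink stands in for a real destination that de-duplicates on a key.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical error standing in for timeouts or HTTP 5xx responses."""


def call_with_backoff(func, max_attempts=5, base_delay=0.5, max_delay=8.0,
                      sleep=time.sleep):
    """Call func, retrying on TransientError with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            # Delay cap doubles each attempt; random jitter spreads out retries.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))


class IdempotentSink:
    """In-memory stand-in for a destination that ignores replays of the same key."""

    def __init__(self):
        self.rows = {}

    def write(self, idempotency_key, record):
        if idempotency_key in self.rows:
            return False  # already written: a replay changes nothing
        self.rows[idempotency_key] = record
        return True


# Demo: the write fails twice with a simulated timeout, then succeeds.
sink = IdempotentSink()
attempts = {"count": 0}


def flaky_write():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TransientError("simulated timeout")
    return sink.write("order-42", {"amount": 10})


result = call_with_backoff(flaky_write, sleep=lambda seconds: None)  # no real waiting
replayed = sink.write("order-42", {"amount": 10})
```

Retries are only safe when the write is idempotent, which is why the two ideas are paired: replaying `order-42` after a retry leaves exactly one row in the sink.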
🧭

Workflow Orchestration

  • Reproducible DAGs with clear SLAs
  • Idempotent tasks, alerts, backfills
  • Data-aware scheduling & dependency management
🗄️

Lakehouse Storage & Formats

  • Object storage + warehouse interoperability
  • Parquet/Delta/Iceberg, partitioning & compaction
  • Schema evolution, time travel & ACID tables
🏗️

Modeling & ELT

  • Layered models (staging → core → marts)
  • Data contracts & tests (quality as code)
  • SCD patterns, surrogate keys, audit columns
🧪

Data Quality & Observability

  • Freshness, completeness, accuracy monitors
  • Column-level lineage & impact analysis
  • Anomaly detection with playbooks/runbooks
📊

BI & Semantic Layer

  • Governed metrics/semantic layer for consistency
  • Row-level security & policy-based access
  • Drill-through dashboards, alerts & subscriptions
✨

Data Apps & UX

  • API-first apps (Next.js/React) with great UX
  • Accessible, fast, mobile-friendly interfaces
  • Shareable exports & decision-ready views
⚑

Realtime & Streaming

  • CDC & event-driven pipelines (micro-batch/stream)
  • Live dashboards via WebSockets/SSE
  • Materialized views & low-latency caches
🤖

ML & MLOps

  • Feature pipelines with reproducible training
  • Experiment tracking, registry & versioning
  • Drift/fairness monitoring & A/B evaluations
🧠

LLM & RAG

  • Embeddings & chunking with prompt versioning
  • Hybrid retrieval + guardrails & citations
  • Privacy-aware grounding on enterprise data
🧭

Vector Search

  • ANN indexes (HNSW, IVF-PQ) at scale
  • Hybrid keyword + vector retrieval
  • Deduplication & clustering for discovery
🔒

Governance, Privacy & Security

  • RBAC/ABAC, masking/tokenization of PII
  • Catalog, lineage, ownership & audit logs
  • Compliance by design (GDPR, ISO 27001)
💸

FinOps & Performance

  • Cost tags/budgets & storage lifecycle
  • Pruning, partition pushdown, caching
  • Autoscaling, SLAs/SLOs with clear error budgets
🚀

Data CI/CD & DevEx

  • Git-based reviews, tests & linters for data code
  • Reproducible builds & artifact versioning
  • Ephemeral preview envs & safe rollbacks
🔁

Interop & Internal APIs

  • OpenAPI/JSON Schema contracts & governance
  • Reverse ETL to operational tools
  • Pagination, rate limits & idempotent writes

Let's work together 🚀

Are you looking for a Data & AI profile capable of combining technical expertise with a human touch, someone who turns data into meaningful stories, builds solutions that matter, and works hand in hand with teams to create impact?

Paris, France