MSc Data Science @ Paris 1 Panthéon-Sorbonne · Data Engineer / Data Scientist in the making
Data science student with hands-on experience in credit risk modelling at Groupe BPCE (France's second-largest banking group). Passionate about building end-to-end ML pipelines and GenAI applications, from data to production. Exchange semester at Peking University (GPA 4.0/4.0).
ML / Stats: Regression (OLS, LASSO, Ridge) · Clustering (K-Means, GMM) · XGBoost · Neural Networks · PCA

GenAI: RAG · LangChain LCEL · FAISS · HuggingFace Embeddings · Groq / Llama 3.3 70B
| Project | Description | Stack |
|---|---|---|
| ⚡ energy-ops-assistant | GenAI chatbot for energy data analysis — RAG pipeline (FAISS + LangChain) powered by Llama 3.3 70B. Ask questions about energy reports & sensor data in plain English. ▶ Live demo | Python · LangChain · Groq · Streamlit |
| hotel-booking-ml | Customer segmentation & cancellation prediction — GMM clustering, XGBoost, PCA on 36K bookings | Python · Scikit-learn |
| global-sustainable-energy | PCA & K-Means clustering on global energy data — country profiling & temporal trends (2000–2020) | Python · Scikit-learn |
| analyse-crypto-csd | Critical Slowing Down indicators (AR1, skewness, Kendall's τ) on BTC/ETH pre-crash signals | SAS |
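
For readers unfamiliar with the Critical Slowing Down indicators named in the last row, here is a minimal Python sketch of the general idea: compute lag-1 autocorrelation and skewness over a rolling window, then use Kendall's τ to measure whether the indicator trends upward over time. It is an illustration only, not the code from analyse-crypto-csd (which is written in SAS). The function names, the window size, and the toy series are all assumptions made for this example.

```python
from statistics import fmean


def lag1_autocorr(x):
    """Lag-1 autocorrelation of a window (the AR1 indicator)."""
    m = fmean(x)
    den = sum((v - m) ** 2 for v in x)
    if den == 0:
        return 0.0  # constant window: no variance to correlate
    num = sum((x[i] - m) * (x[i + 1] - m) for i in range(len(x) - 1))
    return num / den


def skewness(x):
    """Population skewness (third standardized moment) of a window."""
    m = fmean(x)
    n = len(x)
    m2 = sum((v - m) ** 2 for v in x) / n
    if m2 == 0:
        return 0.0
    m3 = sum((v - m) ** 3 for v in x) / n
    return m3 / m2 ** 1.5


def rolling(series, window, stat):
    """Apply `stat` to every full window of length `window`."""
    return [stat(series[i - window:i]) for i in range(window, len(series) + 1)]


def kendall_tau(values):
    """Kendall's tau-a between `values` and their time index.

    Close to +1 means the indicator rises steadily over time.
    """
    n = len(values)
    if n < 2:
        return 0.0
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            d = values[j] - values[i]
            s += (d > 0) - (d < 0)  # +1 concordant, -1 discordant, 0 tie
    return s / (n * (n - 1) / 2)


# Toy series (hypothetical, not market data) to show the call pattern.
toy = [0.0, 1.0, 0.5, 1.5, 1.0, 2.0, 1.8, 2.6, 2.5, 3.1]
ar1_trend = kendall_tau(rolling(toy, 5, lag1_autocorr))
skew_trend = kendall_tau(rolling(toy, 5, skewness))
```

In practice the series would usually be detrended first, and the window length chosen with a sensitivity check; both steps are left out here to keep the sketch short.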