Work Experience
01/04/2024-Present Data Scientist / Open Source Developer, PyMC Labs, Remote.
- Working on open-source software packages and helping clients get the most out of them to generate business value.
- Advising on data science projects in the area of marketing analytics and experimentation.
- Coaching and technical workshops on Bayesian statistics and applications.
- Technical writing and communication.
01/08/2023-Present Senior Applied Scientist (Pricing & Forecasting), Wolt, Berlin, Germany.
- Deployed end-to-end, product-level demand forecasting models (~100K time series) for Wolt Market.
- Scoping (product requirements and simulations), planning, and executing pricing experiments.
- Price elasticity estimation for thousands of retail products using modern Bayesian methods at scale.
01/10/2021-31/07/2023 Senior Applied Scientist (Marketing Tech), Wolt, Berlin, Germany.
- Member of the marketing tech team, a cross-functional product team. Leading data science projects from conceptualization and modeling through to deployment.
- Developing data science products in the following domains: marketing attribution, customer lifetime value, churn prediction and prevention, cohort revenue-retention matrix modeling, A/B testing, marketing efficiency measurement and optimization through geo-experimentation, causal inference methods and media mix models (via modern Bayesian techniques in PyMC).
- Mentoring and coaching data scientists (technical, project management and soft skills).
01/03/2020-30/09/2021 Senior Data Scientist (Forecasting), HelloFresh SE, Berlin, Germany.
- Main contributor of the data science efforts in the meal-box, recipe and add-ons forecasting, using methods in machine learning and time series analysis for various international markets. Maintained several time series models running in production and contributed to internal utility libraries in Python. Main tech stack: Python, Docker, Concourse for the CI/CD pipelines and Airflow for job scheduling.
- Responsible for planning and developing data science products: business requirements, conceptualization, exploratory data analysis, modelling, deployment and monitoring. Collaborated closely with stakeholders, product owners, data analysts and data engineers.
- Team member of the data science continuous learning work-stream. Responsible for the organisation and planning of knowledge sharing sessions (seminars, workshops, and external speakers).
- Involved in the recruitment and hiring process.
- Mentored junior/mid level data scientists.
01/09/2019-31/01/2020 Senior Data Scientist, TD Reply GmbH, Berlin, Germany.
- Project Management: Led various data projects working closely with data engineers, consultants and clients. Mentored junior data scientists.
- Machine Learning + Forecasting: Sales prediction models via Time Series Analysis + ML • Explainable ML models • Media Mix modeling and ROI optimization.
- Geolocation Modeling: Outlet Segmentation • Traffic Modeling.
- Data Integration: Scalable automated data integration (Spark) • KPI design + definition • Tracking.
01/04/2018-01/09/2019 Data Scientist, TD Reply GmbH, Berlin, Germany.
- Machine Learning + Forecasting: Sales prediction models via Time Series Analysis + ML • Explainable ML models • Media Mix modeling and ROI optimization.
- Time Series Analysis: Time Series Clustering • Product life-cycle analysis.
- Supervisor Bachelor Thesis: Applying Network Analysis and Data Visualization in the Medical Domain, Laert Nuhu (TU Berlin).
01/07/2017-01/04/2018 Junior Data Scientist, TD Reply GmbH, Berlin, Germany.
- Text Analysis: Social media text data mining • n-gram networks • Topic Modeling.
- Social Networks: Analysis of social network interactions, node importance and community detection.
- Others: Dynamic data visualization (e.g. Shiny, Plotly).
01/01/2017-01/07/2017 Trainee - Data Analyst, GoEuro, Berlin, Germany.
- Integrate provider data from various sources into the search engine • Normalize and geo-reference data from different sources • Automate and define new tools to scale and increase efficiency for the data quality program.
2014-2015 Student Representative, Berlin Mathematical School, Berlin, Germany.
- Member of the organization committee of the 3rd BMS Student Conference.
2008-2010 Teaching Assistant, Universidad de los Andes, Bogotá, Colombia.
- Courses: Linear Algebra, Basic Physics II, Riemannian Geometry.
Volunteering Activities
- 07/2017-01/2020 Data Scientist, TD ProBonoProData Team, supporting and sharing knowledge with NPOs.
- 01/2018-03/2018 Data Scientist, CorrelAid Network, Dashboard Project for Projekt Seehilfe e.V.
- 03/2011-08/2011 Mathematics Teacher, ColombiaCrece, Bogotá, Colombia.
- 06/2007-08/2007 Camp Counselor, Skylake Yosemite Camp, Wishon, CA 93669, USA.
Education
2014-2017 PhD in Mathematics (Magna Cum Laude), Humboldt Universität zu Berlin, Berlin, Germany.
Advisor: Prof. Jochen Brüning
- Thesis: Induced Dirac-Schrödinger operators on quotients of semi-free circle actions.
2011-2014 M.Sc. Mathematics, Humboldt Universität zu Berlin, Berlin, Germany.
Advisor: Prof. Jochen Brüning
- Thesis: The signature operator on certain singular spaces.
2005-2011 B.Sc. Mathematics (Cum Laude), Universidad de los Andes, Bogotá, Colombia.
Advisor: Alexander Cardona
2005-2011 B.Sc. Physics (Cum Laude), Universidad de los Andes, Bogotá, Colombia.
Advisor: Andrés Reyes
Publications & Preprints
Notes & Expository Articles
- \(S^1\)-Equivariant Dirac operators on the Hopf Fibration (2018)
- \(L^2\)-Cohomology and the Hodge Theorem, BMS Student Conference (2016)
- \(C^*\)-Algebras and the Gelfand-Naimark Theorems, BMS Student Conference (2014)
- The Signature Theorem, Villa de Leyva Summer School (2013)
- What is a Dirac Operator?, What is Seminar (2012)
- Introduction to the Chern Class (Dirac’s Monopole), Index Theory Seminar (2012)
- Introduction to the Moment Map, Villa de Leyva Summer School (2011)
Certificates
Data Analysis and Machine Learning
TensorFlow 2 for Deep Learning (Coursera) Tools: tensorflow (probability), keras.
- Getting started with TensorFlow 2
- Customising your models with TensorFlow 2
- Probabilistic Deep Learning with TensorFlow 2
Deep Learning Specialization (Coursera) Tools: numpy, tensorflow, keras.
- Neural Networks and Deep Learning
- Improving Deep Neural Networks
- Structuring Machine Learning Projects
- Convolutional Neural Networks
- Sequence Models
Applied Data Science with Python Specialization (Coursera) Tools: scipy, numpy, pandas, matplotlib, scikit-learn, nltk, networkX.
- Introduction to Data Science in Python
- Applied Plotting, Charting & Data Representation in Python
- Applied Machine Learning in Python
- Applied Text Mining in Python
- Applied Social Network Analysis in Python
Machine Learning (Coursera) Tools: matlab
Statistical Analysis
- Econometrics: Methods and Applications (Coursera) Tools: Python (statsmodels).
- Bayesian Statistics: From Concept to Data Analysis (Coursera) Tools: R.
- Bayesian Statistics: Techniques and Models (Coursera) Tools: R and JAGS.
Data Visualization
- Information Visualization: Programming with D3.js (Coursera) Tools: D3.js
Selected Talks and Events
09/2024 Time Series forecasting with NumPyro, PyData Amsterdam 2024, Amsterdam, Netherlands. Slides
04/2024 A conceptual and practical introduction to Hilbert Space Gaussian Process (HSGP) approximation methods, PyConDE & PyData Berlin 2024, Berlin, Germany. Slides
08/2023 Cohort Revenue & Retention Analysis: A Bayesian Approach, PyData Berlin Meetup, Berlin, Germany. Slides
05/2023 Bayesian Methods in Modern Marketing Analytics, PyMC Labs webinar. Slides
03/2023 Media Mix Models: A Bayesian Approach with PyMC, Artificial Intelligence Association of Lithuania. Slides
10/2022 Offline Campaign Analysis Measurement: A journey through causal impact, geo-experimentation and synthetic control, Data Science at Wolt, Helsinki Data Science Meetup, Helsinki, Finland.
09/2022 Introduction to BTYD (Buy Till You Die) Models, Berlin Bayesians Meetup, Berlin, Germany. Slides
07/2022 Introduction to Bayesian Modeling with PyMC, PyCon Colombia 2022. Slides
05/2022 Podcast: Machine Learning in Marketing, DataTalks.Club.
04/2022 Introduction to Uplift Modeling, PyCon DE & PyData Berlin 2022, Berlin, Germany. Slides
10/2021 Exploring Tools for Interpretable Machine Learning, PyData Global 2021. Slides
09/2020 Gaussian Processes for Time Series Forecasting with Applications in Scikit-Learn, Second Symposium on Machine Learning and Dynamical Systems 2020, The Fields Institute for Research in Mathematical Sciences.
06/2020 Member of the Organization Committee, satRday Berlin 2020: A conference for R users in Berlin, Berlin, Germany.
10/2019 Gaussian Process for Time Series Analysis, PyCon DE & PyData 2019, Berlin, Germany. Slides
08/2019 BMS Summer School 2019: Mathematics of Deep Learning, Berlin, Germany.
06/2019 Remedies for Severe Class Imbalance, satRday Berlin 2019: A conference for R users in Berlin, Berlin, Germany.
07/2018 On Laplacian Eigenmaps for Dimensionality Reduction, PyData Berlin 2018, Berlin, Germany. Slides
08/2017 Introduction to Bayesian modeling with PyMC3, Python Users Berlin (PUB), Berlin, Germany.
07/2017 Workshop on Loop Spaces, Supersymmetry and Index Theory, Chern Institute of Mathematics, Tianjin, China.
07/2017 PyData Berlin, Hochschule für Technik und Wirtschaft, Berlin, Germany.
08/2016 Focus Program on Topology, Stratified Spaces and Particle Physics, The Fields Institute for Research in Mathematical Sciences, Toronto, Canada.
07/2016 Contributed Talk: Induced Dirac-Schrödinger operators on quotients of semi-free circle actions, 7th European Congress of Mathematics, Berlin, Germany.
06/2015 Summer School: Geometric and Computational Spectral Theory, Université de Montréal, Montreal, Canada.
09/2014 Trimester Program: Non-commutative Geometry and its Applications, Hausdorff Research Institute for Mathematics, Bonn, Germany.
Languages
- Spanish: Native proficiency.
- English: Full professional proficiency.
- German: Professional working proficiency.