Research data scientist / research software engineer

Xavi Rubio

Data scientist and research software engineer working on reliable ML, statistical pipelines, and AI systems that can be evaluated. My path has moved through software, IoT analytics, biomedical research, and finance, and I am now building toward deeper statistical work at the University of Geneva.

Focus: Reliable ML systems, statistical modelling, and evaluation
Experience: Finance, IoT analytics, biomedical research, and data software
Education: MSc Statistics at the University of Geneva and BSc Applied Data Science

About

I am drawn to the part of data science where statistics, software, and real systems meet.

I came into data science through applied work rather than toy problems. I worked with chemical-data software, IoT sensor networks, neurological research data, and credit-risk reporting. Those environments taught me that useful ML is rarely just the model. It is the data pipeline, the assumptions, the validation checks, the interface, and the honesty about uncertainty. That is the direction I am taking into statistics and research software.

I like work where the implementation and the reasoning have to hold together. A good project should be easy to inspect, possible to reproduce, and honest about what is and is not known.

The projects I want to keep building sit between statistical thinking and practical systems. I use them to become stronger in software development while building a deeper foundation in research and statistics, so I can approach new technologies with enough breadth to understand the whole system and enough depth to apply them carefully.

Project portfolio

Technical work

A compact set of projects ordered by technical scope, maturity, and relevance to my current research and engineering direction.

Background

Experience

Sep 2025 - Feb 2026

Banco Sabadell

Data Science Intern

Reworked monthly credit reporting into a weekly workflow, adapting data sources, validation checks, execution logic, and BI visualisations for risk monitoring.

SAS, SQL, Credit risk, BI reporting

Jul 2022 - Dec 2024

Orpheus

Data Scientist

Owned IoT data workflows for thousands of sensors and hundreds of hubs. Built ETL, PostgreSQL/Grafana analytics, backend processing, and regression-based anomaly detection.

Python, PostgreSQL, Grafana, ETL, IoT analytics

Jun 2024 - Nov 2024

Institut de Recerca Sant Pau

Data Science Intern

Harmonised neurological research datasets, prepared APOE biomarker data, and supported exploratory and Kaplan-Meier survival analyses in Python and R within remote research environments.

Python, R, Kaplan-Meier, Research data, VNC/Linux

Feb 2021 - Dec 2021

Chemotargets

Software Developer

Optimised SQL queries and refactored database schemas, improving ETL runtime and maintainability for data-processing workflows.

Python, SQL, Flask, MySQL, Angular

Education

Academic background

Period: 2026 -
Status: Enrolled

MSc Statistics

University of Geneva

Period: 2021 - 2026
Grade: 8.08/10

BSc Applied Data Science

Open University of Catalonia

Period: 2019 - 2021
Grade: 8.12/10

AAS Computer Science

Escuelas Universitarias Gimbernat

Tooling

Working stack

Programming

Python, SQL, R, SAS, Bash, MATLAB, Swift, JavaScript, Go (basics).

ML and AI

Regression, anomaly detection, supervised and unsupervised learning, NLP, embeddings, RAG, LLM systems.

Statistics

EDA, Bayesian inference, PCA, survival analysis, calibration, conformal prediction, credit-risk metrics.

Data systems

FastAPI, PostgreSQL, Docker, Git, ETL pipelines, validation checks, Grafana, Linux/VNC, AWS.

Contact

Open to research software, data science, and applied AI work.