Ilenia Fortuna

MSc Student in Data Science | Data Analysis, Machine Learning, Big Data

About Me

I hold a Bachelor's degree in Digital Humanities from the University of Pisa, where I am now pursuing a Master's degree in Data Science and Business Informatics.

I am particularly interested in data analysis, machine learning, business intelligence, and real-world data-driven applications. My academic projects have covered a range of datasets and analytical problems: large-scale distributed data processing, clustering, classification, regression, time series analysis, and explainability.

I have also gained experience in data warehousing and decision support systems, including OLAP analysis, ETL processes, and dashboard development. I have worked with HTML, CSS, JavaScript, Python, R, SQL, XML, and PySpark, as well as reporting tools such as Power BI.

Skills

Python, R, SQL, PySpark, HTML, CSS, JavaScript, XML, Machine Learning, Data Mining, Clustering, Classification, Regression, Time Series, Outlier Detection, Explainability, Data Cleaning, Feature Engineering, ETL, Data Warehouse, OLAP, MDX, Power BI

Projects

Distributed Data Analysis and Mining – US Accidents

Big data project on the US Accidents dataset (2016–2023), which contains about 7.7 million records. Using PySpark, I handled large-scale data preprocessing and cleaning, then applied clustering, classification, regression, and explainability techniques to analyze accident severity and duration.

The project involved handling large volumes of data in a distributed environment, building predictive models, and extracting meaningful insights from temporal, geographical, weather, and traffic-related variables.

View project on GitHub

Data Mining – IMDb Analytics

End-to-end data mining project on an IMDb dataset of over 16,000 titles. The first phase covered data understanding and preparation, clustering, classification, and regression. The second phase moved on to more advanced topics: outlier and anomaly detection, imbalanced learning, advanced classification, explainability, and time series analysis, including motif and discord discovery.

This project gave me hands-on experience with each stage of the data mining pipeline, from preprocessing and modeling to more advanced analytical techniques.

View project on GitHub

Decision Support System – Music Streaming Analytics

Decision support project analyzing a music streaming service dataset. I contributed to data understanding, cleaning, and enrichment; dimensional modeling and data warehouse design; ETL implementation; OLAP cube analysis with MDX queries; and dashboard development for business insights.

The project combined technical and business-oriented analysis, with a focus on transforming raw data into structured decision support tools.

View project on GitHub

Other Academic and Technical Projects

In addition to the projects above, I have worked on web development and other data-oriented projects, including a blog database, a website built with HTML and CSS, a statistical analysis of ranked data in R, and coding exercises in computational linguistics.

Contacts

LinkedIn: linkedin.com
Email: ilenia.fortuna@outlook.it