About Me
I am Marcel Ferrari, a computational scientist with a BSc in Computational Science and Engineering, currently pursuing an MSc at ETH Zurich. Here you will find an overview of my work, including my experience, publications, and projects.
Education
Eidgenössische Technische Hochschule Zürich (ETHZ)
Sep. 2023 - Ongoing, Zurich
Master of Science - M.Sc. Computational Science and Engineering
Eidgenössische Technische Hochschule Zürich (ETHZ)
Sep. 2020 - Jan. 2024, Zurich
Bachelor of Science - B.Sc. Computational Science and Engineering
Liceo Cantonale Lugano 2 (LiLu2)
Sep. 2016 - Jun. 2020, Lugano
High School - Majored in physics and applied mathematics
Professional Experience
Swiss National Supercomputing Centre (CSCS)
Jan. 2024 - Ongoing, Zurich
Software Engineer. Developing GPU performance-monitoring tools for large distributed workloads.
Technical Skills
- Programming Languages: C, C++, Python, Bash, Fortran, x86 Assembly, SQL
- HPC Libraries and Tools: MPI, Eigen, BLAS, LAPACK, NumPy, SciPy, PyTorch, CuPy, CUDA, Slurm, Apptainer, Enroot
- Machine Learning Models: Neural Networks, Probabilistic Models, Gaussian Processes, Reinforcement Learning, PINNs, Neural Operators, Transformers
- Mathematics and Numerical Methods: Linear Algebra, Calculus, Differential Equations, Numerical Analysis, Optimization, Monte Carlo Methods
- Physics: Classical Mechanics, Quantum Mechanics, Thermodynamics, Electromagnetism, Geophysics
- Languages: Italian (Native), English (Native), Spoken Arabic (Native), German (B2), French (B2)
Research
My research interests focus on applying machine learning, parallel numerical algorithms, and high-performance computing to complex scientific problems. Below are some of the key projects I have undertaken:
Developing an ML-ready Framework for Geodynamic Modelling
ETH Zurich, Jan. 2024 - Ongoing
- Research project.
- Main developer of a high-performance, ML-ready Python/C++ framework for geodynamic modelling. The goal is to help geophysics researchers integrate ML models built with standard Python ML frameworks (e.g. PyTorch, TensorFlow) into fast and efficient Python-based numerical PDE solvers.
Applications of Machine Learning and AI in scientific computing
ETH Zurich, Jan. 2024 - Jun. 2024, Zurich
- Semester projects. Part of the course AI in the Sciences and Engineering.
- Implemented machine learning models for scientific and engineering tasks, focusing primarily on Physics-Informed Neural Networks (PINNs), neural operator learning (CNOs, FNOs), neural networks for PDE discovery, and Transformers.
Parallel numerical algorithms for quantum tunnelling in molecular systems
ETH Zurich, Jan. 2023 - Jan. 2024, Zurich
- Bachelor's thesis supervised by Prof. Jeremy Richardson (ETHZ).
- Title: Design and Implementation of Optimized Eigendecomposition Algorithms For Symmetric Block-tridiagonal and Symmetric Broad-arrowhead Matrices and their Applications in Ring-polymer Instanton Theory.
Download
Optimized MPI collectives for low diameter network topologies
ETH Zurich, Sep. 2022 - Jan. 2023, Zurich
- Semester project. Part of the course Design of Parallel and High-Performance Computing.
- Title: Optimized Alltoall Algorithm for Low-Diameter Network Topologies.
- Lead author in a team of five.
Simulations of an HIV-1 diagnostic kit based on synthetic proteins
Liceo Cantonale Lugano 2 (LiLu2), Jun. 2019 - Jun. 2020, Lugano
- High school thesis. Obtained the highest grade.
- Title: Molecular Simulations on HPC Clusters of an HIV-1 Diagnostic kit based on the Top7 protein.
Publications
A Comparison of Sparse Solvers for Severely Ill-Conditioned Linear Systems in Geophysical Marker-In-Cell Simulations
arXiv, Sep. 2024
Marcel Ferrari. 2024. "A Comparison of Sparse Solvers for Severely Ill-Conditioned Linear Systems in Geophysical Marker-In-Cell Simulations." arXiv preprint arXiv:2409.11515.
View | Download
Arrowhead Factorization of Real Symmetric Matrices and its Applications in Optimized Eigendecomposition
ACM Digital Library, Jun. 2024
Marcel Ferrari, Francesco Cavalli, Hussein Harake, Christopher Lompa, and Nicola Lo Russo. 2024. Arrowhead Factorization of Real Symmetric Matrices and its Applications in Optimized Eigendecomposition. In Proceedings of the Platform for Advanced Scientific Computing Conference (PASC '24). Association for Computing Machinery, New York, NY, USA, Article 4, 1-12.
View | Download
A High-Performance Design, Implementation, Deployment, and Evaluation of The Slim Fly Network
USENIX, Apr. 2024
Blach, N., Besta, M., De Sensi, D., Domke, J., Harake, H., Li, S., Iff, P., Konieczny, M., Lakhotia, K., Kubicek, A., Ferrari, M., Petrini, F., & Hoefler, T. (2024). A High-Performance Design, Implementation, Deployment, and Evaluation of The Slim Fly Network. In 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI 24) (pp. 1025-1044). Santa Clara, CA: USENIX Association.
View | Download
Competitions
I have taken part in several student cluster competitions organized by world-leading HPC conferences, working with my teammates to solve complex computational problems and optimize the performance of HPC and ML applications. The key events are listed below:
ISC23 Student Cluster Competition
May 21-25 2023, Hamburg
- Participated as a member of team RACKlette (ETH Zurich).
- Won the award for the best LINPACK result.
SC22 Student Cluster Competition
Nov. 14-17 2022, Dallas
- Participated as a member of team RACKlette (ETH Zurich).
- Achieved the best overall performance on the application I was responsible for (LAMMPS).
ISC22 Student Cluster Competition
May 21-25 2022, Hamburg
- Participated as a member of team RACKlette (ETH Zurich).
- Awarded 2nd place.
- Achieved the best overall performance on the application I was responsible for (XCompact3D).
Software Projects
Nvimon
Nvimon is a utility for monitoring GPU performance metrics in large-scale jobs. It integrates NVIDIA DCGM with the Slurm workload manager to give users a system-wide overview of all GPU resources involved in a specific Slurm job. The tool is inspired by the "top" command and presents data in a similar table, except that each row corresponds to a GPU on a separate host rather than to a running process.
Freccia Eigensolver Library
The Freccia Eigensolver library implements a set of high-performance parallel eigendecomposition routines for matrices with special structure. It currently provides algorithms for Diagonal Plus Rank One (DPR1), Diagonal Plus Rank k (DPRK), Banded, Arrowhead, and Broad Arrowhead matrices. It is written in C++ on top of Eigen3 and supports multiple BLAS/LAPACK backends, with partial GPU support through NVBLAS.
View
Contact
You can contact me via one of the following channels: