About Me

I am Marcel Ferrari, a computational scientist with a BSc in Computational Science and Engineering, currently pursuing an MSc at ETH Zurich. Here you will find an overview of my work, including my experience, publications, and projects.

Education

Eidgenössische Technische Hochschule Zürich (ETHZ)

Sep. 2023 - Ongoing, Zurich

Master of Science - M.Sc. Computational Science and Engineering

Eidgenössische Technische Hochschule Zürich (ETHZ)

Sep. 2020 - Jan. 2024, Zurich

Bachelor of Science - B.Sc. Computational Science and Engineering

Liceo Cantonale Lugano 2 (LiLu2)

Sep. 2016 - Jun. 2020, Lugano

High School - Majored in physics and applied mathematics

Professional Experience

Swiss National Supercomputing Centre (CSCS)

Jan. 2024 - Ongoing, Zurich

Software Engineer. Developing GPU performance-monitoring tools for large distributed workloads.

Technical Skills

  • Programming Languages: C, C++, Python, Bash, Fortran, x86 Assembly, SQL
  • HPC Libraries and Tools: MPI, Eigen, BLAS, LAPACK, NumPy, SciPy, PyTorch, CuPy, CUDA, Slurm, Apptainer, Enroot
  • Machine Learning Models: Neural Networks, Probabilistic Models, Gaussian Processes, Reinforcement Learning, PINNs, Neural Operators, Transformers
  • Mathematics and Numerical Methods: Linear Algebra, Calculus, Differential Equations, Numerical Analysis, Optimization, Monte Carlo Methods
  • Physics: Classical Mechanics, Quantum Mechanics, Thermodynamics, Electromagnetism, Geophysics
  • Languages: Italian (Native), English (Native), Spoken Arabic (Native), German (B2), French (B2)

Research

My research applies machine learning, parallel numerical algorithms, and high-performance computing to complex scientific problems. Below are some of the key projects I have undertaken:

Developing an ML-ready Framework for Geodynamic Modelling

ETH Zurich, Jan. 2024 - Ongoing

  • Research project.
  • Main developer of a high-performance, ML-ready Python/C++ framework for geodynamic modelling. The goal is to help geophysics researchers integrate ML models built with standard Python-based ML frameworks (e.g. PyTorch, TensorFlow) into fast and efficient Python-based numerical PDE solvers.

Applications of Machine Learning and AI in scientific computing

ETH Zurich, Jan. 2024 - Jun. 2024, Zurich

  • Semester projects. Part of the course AI in the Sciences and Engineering.
  • Implemented machine learning models for scientific and engineering tasks, focusing primarily on Physics-Informed Neural Networks (PINNs), Neural Operator Learning (CNOs, FNOs), Neural Networks for PDE discovery, and Transformers.

Parallel numerical algorithms for quantum tunnelling in molecular systems

ETH Zurich, Jan. 2023 - Jan. 2024, Zurich

  • Bachelor's thesis supervised by Prof. Jeremy Richardson (ETHZ).
  • Title: Design and Implementation of Optimized Eigendecomposition Algorithms For Symmetric Block-tridiagonal and Symmetric Broad-arrowhead Matrices and their Applications in Ring-polymer Instanton Theory.

Optimized MPI collectives for low diameter network topologies

ETH Zurich, Sep. 2022 - Jan. 2023, Zurich

  • Semester project. Part of the course Design of Parallel and High-Performance Computing.
  • Title: Optimized Alltoall Algorithm for Low-Diameter Network Topologies.
  • Lead author in a team of five.

Simulations of an HIV-1 diagnostic kit based on synthetic proteins

Liceo Cantonale Lugano 2 (LiLu2), Jun. 2019 - Jun. 2020, Lugano

  • High school thesis. Obtained the highest grade.
  • Title: Molecular Simulations on HPC Clusters of an HIV-1 Diagnostic Kit Based on the Top7 Protein.

Publications

A Comparison of Sparse Solvers for Severely Ill-Conditioned Linear Systems in Geophysical Marker-In-Cell Simulations

arXiv, Sep. 2024

Marcel Ferrari. 2024. "A Comparison of Sparse Solvers for Severely Ill-Conditioned Linear Systems in Geophysical Marker-In-Cell Simulations." arXiv preprint arXiv:2409.11515.

Arrowhead Factorization of Real Symmetric Matrices and its Applications in Optimized Eigendecomposition

ACM Digital Library, Jun. 2024

Marcel Ferrari, Francesco Cavalli, Hussein Harake, Christopher Lompa, and Nicola Lo Russo. 2024. Arrowhead Factorization of Real Symmetric Matrices and its Applications in Optimized Eigendecomposition. In Proceedings of the Platform for Advanced Scientific Computing Conference (PASC '24). Association for Computing Machinery, New York, NY, USA, Article 4, 1-12.

A High-Performance Design, Implementation, Deployment, and Evaluation of The Slim Fly Network

USENIX, Apr. 2024

Blach, N., Besta, M., De Sensi, D., Domke, J., Harake, H., Li, S., Iff, P., Konieczny, M., Lakhotia, K., Kubicek, A., Ferrari, M., Petrini, F., & Hoefler, T. (2024). A High-Performance Design, Implementation, Deployment, and Evaluation of The Slim Fly Network. In 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI 24) (pp. 1025-1044). Santa Clara, CA: USENIX Association.

Competitions

I have taken part in several student cluster competitions organized by world-leading HPC conferences, working with my peers to solve complex computational problems and optimize the performance of HPC and ML applications. Below are the key events I have been involved in:

ISC23 Student Cluster Competition

May 21-25 2023, Hamburg

  • Participated as a member of team RACKlette (ETH Zurich).
  • Obtained the best LINPACK award.

SC22 Student Cluster Competition

Nov. 14-17 2022, Dallas

  • Participated as a member of team RACKlette (ETH Zurich).
  • Obtained the overall best performance for the application I was responsible for (LAMMPS).

ISC22 Student Cluster Competition

May 21-25 2022, Hamburg

  • Participated as a member of team RACKlette (ETH Zurich).
  • Awarded 2nd place.
  • Obtained the overall best performance for the application I was responsible for (XCompact3D).

Software Projects

Nvimon

Nvimon is a utility for monitoring GPU performance metrics in large-scale jobs. It integrates the NVIDIA DCGM utility with the Slurm workload manager to give users a system-wide overview of all the GPU resources involved in a specific Slurm job. The tool is inspired by the "top" command and presents the data in a similar tabular layout, except that each row corresponds to a GPU on a separate host rather than to a running process.
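To give a rough idea of the tabular layout described above, here is a minimal Python sketch that renders one row per GPU. It is not Nvimon's code and does not call DCGM or Slurm: the GpuSample type, its field names, the column set, and the sample values are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class GpuSample:
    # Hypothetical fields for illustration; not Nvimon's actual data model.
    host: str
    gpu_index: int
    util_percent: float
    mem_used_mib: int


def render_table(samples):
    """Return a top-like table with one row per (host, GPU) pair."""
    header = f"{'HOST':<12}{'GPU':>4}{'UTIL%':>8}{'MEM(MiB)':>10}"
    rows = [header]
    # Sort by host, then GPU index, so rows from the same host stay together.
    for s in sorted(samples, key=lambda s: (s.host, s.gpu_index)):
        rows.append(
            f"{s.host:<12}{s.gpu_index:>4}{s.util_percent:>8.1f}{s.mem_used_mib:>10}"
        )
    return "\n".join(rows)


# Placeholder values for illustration only, not real measurements.
samples = [
    GpuSample("node002", 0, 87.5, 30210),
    GpuSample("node001", 1, 12.0, 1024),
    GpuSample("node001", 0, 98.2, 39800),
]
table = render_table(samples)
print(table)
```

In a real tool the samples would come from a metrics source on each host of the job; here they are hard-coded so the sketch stays self-contained.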

Freccia Eigensolver Library

The Freccia Eigensolver library implements a set of exotic high-performance parallel eigendecomposition routines. Currently, it supports algorithms for Diagonal Plus Rank One (DPR1), Diagonal Plus Rank k (DPRK), Banded, Arrowhead, and Broad Arrowhead matrices. It is built in C++ on top of Eigen3 and supports multiple BLAS/LAPACK backends, with partial GPU support through NVBLAS.
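For readers unfamiliar with these matrix structures, the following Python sketch builds a small symmetric arrowhead matrix and a DPR1 matrix with NumPy and computes their eigenvalues with a generic dense solver. It only illustrates the sparsity patterns: it does not use Freccia or its API, and the helper names arrowhead and dpr1 are hypothetical.

```python
import numpy as np


def arrowhead(diag, border, tip):
    """Build a dense symmetric arrowhead matrix.

    The leading block is diagonal (diag); the only other non-zeros are the
    last row and column (border) and the bottom-right corner (tip).
    """
    diag = np.asarray(diag, dtype=float)
    border = np.asarray(border, dtype=float)
    n = diag.size
    a = np.zeros((n + 1, n + 1))
    a[:n, :n] = np.diag(diag)
    a[:n, n] = border
    a[n, :n] = border
    a[n, n] = tip
    return a


def dpr1(diag, vec, rho):
    """Build a dense diagonal-plus-rank-one matrix D + rho * z z^T."""
    diag = np.asarray(diag, dtype=float)
    vec = np.asarray(vec, dtype=float)
    return np.diag(diag) + rho * np.outer(vec, vec)


A = arrowhead([1.0, 2.0, 3.0], [0.5, 0.5, 0.5], 4.0)
B = dpr1([1.0, 2.0, 3.0], [1.0, 1.0, 1.0], 0.1)

# Dense reference eigenvalues in ascending order. A structured solver would
# exploit the sparsity pattern instead of treating the matrix as dense.
eig_a = np.linalg.eigvalsh(A)
eig_b = np.linalg.eigvalsh(B)
print(eig_a)
print(eig_b)
```

The dense call costs O(n^3) regardless of structure, which is the kind of overhead that specialized routines for these matrix classes aim to avoid.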

Contact

You can contact me via one of the following channels: