Magdy Mahmoud

About

I am an M.Sc. Informatics student at the Technical University of Munich, working on computer vision and 3D perception. My current research focuses on monocular and multi-modal 3D scene understanding, with an emphasis on geometric representation learning, segmentation, and vision foundation models for robust perception.

Before TUM, I worked as a machine-learning engineer at Aigorithm on computer-vision systems for object detection, segmentation, and active-learning data labeling. I also interned with the Zurich-based Google Shopping team, working from the Munich office on a machine-learning system prototype for offer freshness. These experiences shaped my interest in research that connects visual representation learning, 3D geometry, and reliable perception in real-world settings.

Research Interests

My interests center on robust scene understanding from visual and multi-modal data.

Monocular and multi-modal 3D scene understanding for autonomous driving and robotics.
Geometry-aware perception: 3D detection, segmentation, tracking, depth/shape estimation, and scene reconstruction.
Vision and geometric foundation models: developing, adapting, and evaluating models for robust spatial perception.
Weakly supervised, self-supervised, and scalable learning from images, masks, depth, and 3D structure.

Experience & Education

M.Sc. Informatics

Technical University of Munich

May 2022 - Dec 2026 expected

Focus: computer vision, machine learning, and 3D scene understanding.
Research & Teaching Assistant

Technical University of Munich

Dec 2022 - Apr 2025

Teaching assistant for Computer Vision 3: Detection, Segmentation and Tracking. Researched unsupervised semantic segmentation with denoising diffusion and student-teacher distillation.
Software Engineer Intern

Google Shopping, Zurich team; based in Munich

Aug 2022 - Dec 2022

Developed a machine-learning system prototype for estimating offer freshness from product and price signals.
Junior Machine Learning Engineer

Aigorithm, Cairo

Nov 2020 - Apr 2022

Developed computer-vision pipelines for data labeling, active learning, object detection, segmentation, and model deployment.
Software Engineer Intern

DevisionX, Cairo

Jun 2018 - Aug 2018

Worked on ID verification using segmentation and recognition of Arabic text in TensorFlow, with Python/Flask backend integration.
B.Sc. Computer Science

Thebes University

Oct 2014 - Aug 2018

Final GPA: 3.44/4.00. Thesis: Arabic Image Captioning.

Projects

Selected Research Projects

M.Sc. Thesis / Current Research: Monocular and Multi-Modal 3D Scene Understanding PyTorch 3D Perception Autonomous Driving
Researching foundation-model-based and geometry-aware approaches for 3D scene understanding in autonomous-driving scenarios, with emphasis on robust perception and reliable evaluation.
Guided Research: Multi-Modal 3D Object Detection [report] LiDAR Image Fusion
Designed a LiDAR-image 3D detector with relation-aware reasoning for driving scenes and studied feature fusion strategies for 3D box localization.
IDP: Semi-Supervised Vehicle Part Segmentation [report | code] Point Clouds Segmentation
Explored semi-supervised vehicle part segmentation in LiDAR point clouds to reduce annotation cost for autonomous-driving perception.
Master-Praktikum: Unsupervised Video Segmentation [report | code] Video Optical Flow
Combined motion and appearance cues for unsupervised video object segmentation with optical-flow-based consistency losses.
3DMM Estimation from Highly Distorted Images [code | report] 3D Reconstruction
Estimated 3D morphable model parameters from highly distorted images and analyzed robustness under camera distortion.
Semantic Segmentation with DGCNN on ScanNet [code | report] DGCNN ScanNet
Implemented point-cloud semantic segmentation on the ScanNet indoor dataset.
Cobb Angle Estimation for Scoliosis Screening [project | hackathon] Medical Imaging
Built a deep-learning pipeline to estimate the Cobb angle from spine X-ray images for automatic scoliosis screening.

Additional Technical Projects

Road Damage Detection PyTorch YOLOv5
Fine-tuned object-detection models and adapted training code for custom road-damage datasets.
Human Action Recognition Keras Video
Built video-classification models for human action recognition on UCF101.
Arabic Image Captioning Keras NLP
Prototyped Arabic image captioning using translated Flickr8k captions and Arabic word embeddings.
Predict Future Sales scikit-learn Time Series
Solved a Kaggle forecasting task with feature engineering, lag features, mean encoding, and ensemble models.
Gender-Age Prediction Keras Faces
Built a face-attribute prediction prototype using pretrained convolutional networks.
Web and C++ Systems [Q&A | store | courses | tic-tac-toe] JSP MySQL C++
Built full-stack, backend, and algorithmic software prototypes during undergraduate and self-directed work.

Awards & Competitive Programming

ICPC: Qualified for the ICPC World Finals through ACPC, with strong regional results including 6th place at ECPC and 2nd place at ECPCQ.
Meta Hacker Cup: Reached Round 2 in 2019 and 2021; among the top contestants in Egypt.
Contest judge: Served as ICPC contest judge at several Egypt local contests and regionals.
Online certificates: Received a scholarship for the Udacity Machine Learning Engineer Nanodegree and completed it in 3 months instead of the expected 6. Also completed the Deep Learning Specialization by deeplearning.ai.

Certificates

Udacity Machine Learning Engineer Nanodegree [certificate]
Deep Learning Specialization by deeplearning.ai [certificate]
Machine Learning Specialization by the University of Washington [certificate]
