Summary

I am a master's student in Data Engineering and Analytics at the Technical University of Munich and a research intern at Harvard Medical School, where I am writing my thesis on estimating neural orientation distribution fields for diffusion MRI reconstruction. I also work part-time as a student researcher at the Helmholtz AI Central Unit in Munich, where we are developing a tool to detect and classify organoids in microscopy images. My research focuses on signal and image processing for inverse problems in medical imaging, at the intersection of machine learning and biomedical imaging. Back in Syria, I completed a five-year bachelor's degree in Information Systems Engineering, graduating at the top of my cohort in the Intelligent Systems specialization. As a volunteer, I serve on the graduate leadership team of the Syrian Youth Empowerment Initiative, a nonprofit that helps Syrian students apply to prestigious universities and scholarship programs worldwide.

Publications

Ghandoura, Abdulkader, Farouk Hjabo, and Oumayma Al Dakkak. “Building and benchmarking an Arabic Speech Commands dataset for small-footprint keyword spotting.” Engineering Applications of Artificial Intelligence 102 (2021): 104267.

Datasets

Arabic Speech Commands Dataset (CC BY 4.0): Our dataset is a list of pairs (x, y), where x is the input speech signal and y is the corresponding keyword. The final dataset consists of 12000 such pairs covering 40 keywords. Each audio file is one second long and sampled at 16 kHz. Each of the 30 participants recorded 10 utterances of every keyword, which gives 300 audio files per keyword and 12000 files in total (30 * 10 * 40 = 12000). The total size of all the recorded keywords is ~384 MB. The dataset also contains several background noise recordings obtained from various natural noise sources. These files are stored in a separate folder named background_noise, with a total size of ~49 MB.
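As a rough illustration of the numbers above, the following Python sketch recomputes the file counts and shows one way the (x, y) pairs could be read from disk. It rests on assumptions that are not stated here: that the recordings are WAV files stored as 16-bit mono PCM, and that each keyword has its own folder next to background_noise. The function names expected_counts and load_pairs are hypothetical helpers, not part of the dataset.

```python
import wave
from pathlib import Path

# Figures taken from the dataset description.
PARTICIPANTS = 30
UTTERANCES_PER_KEYWORD = 10
KEYWORDS = 40
SAMPLE_RATE_HZ = 16_000
CLIP_SECONDS = 1

# Assumption (not stated in the description): 16-bit mono PCM audio.
BYTES_PER_SAMPLE = 2


def expected_counts():
    """Return (files per keyword, total files, raw audio bytes)."""
    per_keyword = PARTICIPANTS * UTTERANCES_PER_KEYWORD
    total = per_keyword * KEYWORDS
    raw_bytes = total * SAMPLE_RATE_HZ * CLIP_SECONDS * BYTES_PER_SAMPLE
    return per_keyword, total, raw_bytes


def load_pairs(root, noise_dir="background_noise"):
    """Yield (x, y) pairs: raw PCM bytes and the keyword.

    Assumes a hypothetical layout <root>/<keyword>/<file>.wav, with the
    keyword taken from the folder name. The noise folder is skipped.
    """
    for wav_path in sorted(Path(root).glob("*/*.wav")):
        keyword = wav_path.parent.name
        if keyword == noise_dir:
            continue
        with wave.open(str(wav_path), "rb") as wav:
            if wav.getframerate() != SAMPLE_RATE_HZ:
                raise ValueError(f"unexpected sample rate in {wav_path}")
            x = wav.readframes(wav.getnframes())
        yield x, keyword


if __name__ == "__main__":
    print(expected_counts())
```

Under the 16-bit mono assumption, the raw audio comes to 384,000,000 bytes, which is consistent with the stated ~384 MB if file headers are ignored.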