Summary
Research Intern at Harvard Medical School, focusing on estimating neural orientation distribution fields for diffusion MRI reconstruction. Interested in signal and image processing for solving inverse problems in medical imaging. Pursuing a Master's degree in Data Engineering and Analytics at the Technical University of Munich, funded by a DAAD scholarship. Volunteers on the graduate leadership team of Syrian Youth Empowerment, a national nonprofit that helps Syrian students apply to universities and scholarships worldwide. Holds a Bachelor's degree in Information Systems Engineering with a specialization in Intelligent Systems.
Publications
- Ghandoura, Abdulkader, Farouk Hjabo, and Oumayma Al Dakkak. “Building and benchmarking an Arabic Speech Commands dataset for small-footprint keyword spotting.” Engineering Applications of Artificial Intelligence 102 (2021): 104267.
Datasets
- Arabic Speech Commands Dataset (CC BY 4.0): Our dataset is a list of pairs (x, y), where x is the input speech signal and y is the corresponding keyword. The final dataset consists of 12,000 such pairs covering 40 keywords. Each audio file is one second long and sampled at 16 kHz. We had 30 participants, each of whom recorded 10 utterances of every keyword. This gives 300 audio files per keyword and 30 * 10 * 40 = 12,000 files in total, with a combined size of ~384 MB. The dataset also contains several background noise recordings that we obtained from various natural noise sources. We saved these audio files in a separate folder named background_noise, with a total size of ~49 MB.
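
The file counts and approximate size quoted above can be cross-checked with a few lines of Python. This is only a sketch: the participant, utterance, keyword, clip-length, and sample-rate figures come from the description, while the 2 bytes per sample (16-bit mono PCM) is an assumption not stated there, and file headers are ignored. The function name `dataset_stats` is hypothetical.

```python
# Figures taken from the dataset description.
PARTICIPANTS = 30
UTTERANCES_PER_KEYWORD = 10
KEYWORDS = 40
SAMPLE_RATE_HZ = 16_000
CLIP_SECONDS = 1

# Assumption (not stated in the description): 16-bit mono PCM audio.
BYTES_PER_SAMPLE = 2


def dataset_stats():
    """Return (files per keyword, total files, approximate size in MB)."""
    files_per_keyword = PARTICIPANTS * UTTERANCES_PER_KEYWORD
    total_files = files_per_keyword * KEYWORDS
    # Raw audio payload of one clip; container headers are ignored.
    bytes_per_clip = SAMPLE_RATE_HZ * CLIP_SECONDS * BYTES_PER_SAMPLE
    total_mb = total_files * bytes_per_clip / 1e6
    return files_per_keyword, total_files, total_mb


if __name__ == "__main__":
    per_keyword, total, size_mb = dataset_stats()
    print(f"{per_keyword} files per keyword, {total} files, ~{size_mb:.0f} MB")
```

Under the 16-bit assumption, the raw audio payload works out to the same ~384 MB reported for the recorded keywords.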