Overview
I recently had the pleasure of hosting a workshop, Exploring Big Data on High-Performance Computing (HPC), at the University of California, Los Angeles (UCLA). The workshop covered the challenges of processing Big Data and how HPC resources can help address them.
Event Details
- Date: September 24, 2022
- Location: Virtual (via Zoom)
Workshop Highlights
The workshop focused on applying Big Data techniques on HPC infrastructure. As datasets grow in size and complexity, traditional single-machine computing methods become less effective. Training complex machine learning models poses similar scaling challenges.
During the workshop, participants were introduced to various tools and techniques to address these challenges. Specifically, we covered the following topics:
- Big Data Processing: Approaches and Challenges
- Using UCLA’s HPC Resource: Hoffman2
- Introduction to Big Data Libraries: Apache Spark and Dask
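To make the last topic a little more concrete, here is a minimal sketch of the partition, map, and combine pattern that libraries such as Apache Spark and Dask automate at scale. It is not taken from the workshop materials and uses only the Python standard library. The names `partition`, `partial_stats`, and `chunked_mean` are hypothetical helpers chosen for illustration, and the thread pool is only a stand-in for the cluster workers a real framework would manage.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Iterator, List, Tuple


def partition(values: List[float], chunk_size: int) -> Iterator[List[float]]:
    """Yield consecutive chunks of at most chunk_size items."""
    for start in range(0, len(values), chunk_size):
        yield values[start:start + chunk_size]


def partial_stats(chunk: List[float]) -> Tuple[float, int]:
    """Map step: reduce one chunk to a small (sum, count) summary."""
    return sum(chunk), len(chunk)


def chunked_mean(values: List[float], chunk_size: int = 1000, workers: int = 4) -> float:
    """Combine step: merge the per-chunk summaries into a global mean."""
    # Each chunk is summarized independently, so the work can be spread
    # across workers; only the small summaries need to be brought together.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        summaries = list(pool.map(partial_stats, partition(values, chunk_size)))
    total = sum(s for s, _ in summaries)
    count = sum(n for _, n in summaries)
    if count == 0:
        raise ValueError("cannot take the mean of an empty dataset")
    return total / count


if __name__ == "__main__":
    data = [float(i) for i in range(10_000)]
    print(chunked_mean(data))
```

The key design point is that each chunk is reduced to a small summary before anything is combined, so no single step needs the whole dataset at once. Spark and Dask apply the same idea to data spread across many nodes, and add scheduling, fault handling, and lazy evaluation on top.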
Workshop Materials
The workshop materials, including the presentation PDF and Zoom recording, are available in the UCLA ORAC GitHub repository. You can access them using the following link:
Conclusion
This introductory workshop served as an opportunity for participants to learn more about the world of Big Data and HPC. We look forward to hosting similar events in the future and continuing to explore the intersection of Big Data and High-Performance Computing.