Exploring Big Data on HPC: Workshop Recap

UCLA
Big Data
workshops
Author

Charles Peterson

Published

October 24, 2022

Overview

I recently had the pleasure of hosting a workshop, Exploring Big Data on High-Performance Computing (HPC), at the University of California, Los Angeles (UCLA). The workshop examined the challenges of processing Big Data and how HPC resources can help address them.

Event Details

  • Date: September 24, 2022
  • Location: Virtual (via Zoom)

Workshop Highlights

The workshop focused on applying Big Data techniques on HPC infrastructure. As data grows in size and complexity, traditional computing methods become less effective, and training complex machine learning models poses similar challenges.

During the workshop, participants were introduced to various tools and techniques to address these challenges. Specifically, we covered the following topics:

  • Big Data Processing: Approaches and Challenges
  • Using UCLA’s HPC Resource: Hoffman2
  • Introduction to Big Data Libraries: Apache Spark and Dask
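To give a flavor of the partition-and-combine pattern behind these topics, here is a minimal sketch using only the Python standard library. It is not code from the workshop: the function names (`split_into_partitions`, `partial_stats`, `partitioned_mean`) are hypothetical, and a thread pool stands in for the cores or nodes that libraries such as Apache Spark and Dask would schedule work across.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(values, n_partitions):
    """Split a list into contiguous partitions of roughly equal size."""
    if not values:
        raise ValueError("values must not be empty")
    size = -(-len(values) // n_partitions)  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]


def partial_stats(partition):
    """Map step: reduce one partition to a small (sum, count) summary."""
    return sum(partition), len(partition)


def partitioned_mean(values, n_partitions=4):
    """Reduce step: combine per-partition summaries into a global mean."""
    partitions = split_into_partitions(values, n_partitions)
    # Threads are used here only to illustrate the pattern; they are an
    # assumption of this sketch, not a recommendation for real workloads.
    with ThreadPoolExecutor(max_workers=n_partitions) as pool:
        summaries = list(pool.map(partial_stats, partitions))
    total = sum(s for s, _ in summaries)
    count = sum(c for _, c in summaries)
    return total / count


if __name__ == "__main__":
    data = list(range(1, 1_000_001))
    print(partitioned_mean(data))
```

The key idea is that each partition is reduced to a small summary before anything is combined, so no single step needs the whole dataset at once.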

Workshop Materials

The workshop materials, including the presentation PDF and Zoom recording, are available in the UCLA OARC GitHub repository. You can access them using the following link:

Workshop Materials (GitHub)

Conclusion

This introductory workshop served as an opportunity for participants to learn more about the world of Big Data and HPC. We look forward to hosting similar events in the future and continuing to explore the intersection of Big Data and High-Performance Computing.