Spark memory-linked errors

Amit Singh Rathore
3 min read · May 6, 2024

Common memory-related issues in Apache Spark applications

There are many potential reasons for memory problems:

  • Too few shuffle partitions
  • Large broadcast
  • UDFs
  • Window function without a PARTITION BY clause (see the sketch after this list)
  • Skew
  • Streaming state

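To make the window-function point concrete, here is a minimal PySpark sketch (the dataframe, column names, and row counts are illustrative assumptions, not from this article). A window defined only with orderBy forces Spark to move every row into a single partition for the sort, which is a classic executor OOM trigger; adding partitionBy spreads that work across executors.

```python
# Minimal sketch: a window without PARTITION BY pulls all rows into one
# task, a common executor OOM cause. Names and sizes are illustrative.
from pyspark.sql import SparkSession, Window
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("window-oom-demo").getOrCreate()
df = spark.range(0, 10_000_000).withColumn("group_id", F.col("id") % 100)

# Risky: no partitionBy, so Spark moves all data to a single partition
# for the global sort (it also logs a warning about this).
global_window = Window.orderBy("id")
ranked_all = df.withColumn("rank", F.row_number().over(global_window))

# Safer: partition the window so the sort is done per group, in parallel.
grouped_window = Window.partitionBy("group_id").orderBy("id")
ranked_grouped = df.withColumn("rank", F.row_number().over(grouped_window))
```
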
Out-of-Memory Errors (OOM)

java.lang.OutOfMemoryError

  • Driver OOM: The Spark driver runs the main program and…

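As a hedged illustration of the driver-side case (the dataframe, output path, and sizes below are assumptions, not from the article): the most common driver OOM pattern is pulling a large result back to the driver with collect(). The usual fixes are to sample, aggregate on the executors, or write results to storage instead.

```python
# Minimal sketch of a typical driver OOM pattern; names and the output
# path are illustrative.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("driver-oom-demo").getOrCreate()

big_df = spark.range(0, 500_000_000)

# Risky: collect() materializes every row in the driver JVM and can raise
# java.lang.OutOfMemoryError on the driver.
# rows = big_df.collect()

# Safer alternatives: inspect a small sample, aggregate on the executors,
# or write results out instead of bringing them to the driver.
sample = big_df.limit(10).collect()
row_count = big_df.count()
big_df.write.mode("overwrite").parquet("/tmp/big_df_output")

# If a large collect is unavoidable, driver memory is normally raised at
# submit time, e.g. spark-submit --driver-memory 4g.
```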