The challenge with ephemeral clusters and Dataproc Serverless for Spark is that application logs are lost when the cluster machines are deleted after the job finishes. Persistent History Server (PHS) gives you access to the details of completed Hadoop and Spark applications that ran on different ephemeral clusters or on serverless Spark. It can list both running and completed applications, which is important for troubleshooting and for exploring job history.
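As a rough illustration of the setup described above, the Python sketch below only builds (and does not run) a gcloud command for a single-node PHS cluster, plus the Spark property an ephemeral job cluster would set so that its event logs land in a bucket the PHS can read. The function names, the bucket layout, and the example cluster names are hypothetical. The flags and properties reflect the general pattern in the Dataproc documentation as best understood here, so verify them against the linked documentation before use.

```python
def phs_create_command(cluster, region, bucket):
    """Build (not run) a gcloud command for a single-node PHS cluster.

    Assumption: job clusters write Spark event logs under
    gs://<bucket>/<job-cluster-name>/spark-job-history, so the PHS reads
    them through a wildcard over the cluster-name path segment.
    """
    log_glob = f"gs://{bucket}/*/spark-job-history"
    return [
        "gcloud", "dataproc", "clusters", "create", cluster,
        f"--region={region}",
        "--single-node",
        "--enable-component-gateway",
        f"--properties=spark:spark.history.fs.logDirectory={log_glob}",
    ]


def job_cluster_properties(bucket, job_cluster):
    """Return the Spark property an ephemeral job cluster would set so its
    event logs outlive the cluster (same assumed bucket layout as above)."""
    log_dir = f"gs://{bucket}/{job_cluster}/spark-job-history"
    return {"spark:spark.eventLog.dir": log_dir}


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(" ".join(phs_create_command("my-phs", "us-central1", "my-logs-bucket")))
    print(job_cluster_properties("my-logs-bucket", "job-cluster-1"))
```

Keeping the logs in Cloud Storage rather than on cluster disks is what lets them survive cluster deletion; the PHS cluster itself runs no jobs and only serves the history UI.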
Check out this video to understand more about Persistent History Server!
Persistent History Server Best Practices and Code Snippets → https://goo.gle/3mjVkZL
Dataproc Persistent History Server Documentation → https://goo.gle/3JakmUg
Run faster and more cost-effective Dataproc jobs → https://goo.gle/3IHAye5
Subscribe to Google Cloud Tech → https://goo.gle/GoogleCloudTech
Publisher: Google Cloud