How can I use the Snowflake JAR in a Bitnami Spark Docker container?
I was able to create a Docker-based Bitnami standalone Spark instance and run Spark jobs on it. However, I'm not able to write data to Snowflake from a Spark DataFrame.
I created a Dockerfile to copy the Snowflake JAR into the image, but Spark still doesn't find the Snowflake plugin, even though the JAR file is present in the jars folder. I get the following error:
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 8.0 failed 4 times, most recent failure:
Lost task 0.3 in stage 8.0 (TID 10) (172.19.0.3 executor 0): java.lang.ClassNotFoundException:
net.snowflake.spark.snowflake.io.SnowflakeResultSetPartition
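The trace reports the failure on an executor (172.19.0.3), so one thing worth checking is whether the class is visible inside each container, not only the one where the jars folder was inspected. A minimal sketch of such a check follows. `ClasspathProbe` is a hypothetical helper written for illustration; it is not part of Spark or the Snowflake connector, and it only tests the classpath it is launched with, which may differ from the class loader Spark uses for tasks.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper (an assumption for illustration, not a Spark or Snowflake API).
final class ClasspathProbe {

    // Returns true if the named class can be located by this JVM's class loader.
    static boolean isLoadable(String className) {
        try {
            // initialize=false: locate the class without running its static initializers.
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Returns the JAR or directory the class was loaded from, or a placeholder string.
    static String describe(String className) {
        try {
            Class<?> c = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            if (src == null || src.getLocation() == null) {
                return "(no code source, likely a JDK class)";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not found)";
        }
    }

    public static void main(String[] args) {
        // Defaults to the class named in the stack trace above.
        String name = args.length > 0
                ? args[0]
                : "net.snowflake.spark.snowflake.io.SnowflakeResultSetPartition";
        System.out.println(name + " -> " + describe(name));
    }
}
```

Assuming a compiled copy of this class is available in a container, it could be run there with something like `java -cp "/opt/bitnami/spark/jars/*:." ClasspathProbe`. If it prints `(not found)` in the worker container but a JAR path in the driver container, the two containers are probably not built from the same image.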
Here's my Dockerfile:
FROM docker.io/bitnami/spark
USER root
COPY *.jar /opt/bitnami/spark/jars
What other settings do I need for the Snowflake plugin to be recognized?
Here are my maven dependencies:
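<modelVersion>4.0.0</modelVersion>
<groupId>com.test</groupId>
<artifactId>test-spark</artifactId>
<version>1.0.0</version>
<properties>
<maven.compiler.source>1.8</maven.compiler.source>
<maven.compiler.target>1.8</maven.compiler.target>
</properties>
<dependencyManagement>
<dependencies>
<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>aws-java-sdk-bom</artifactId>
<version>1.11.837</version>
<type>pom</type>
<scope>import</scope>
</dependency>
</dependencies>
</dependencyManagement>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.12</artifactId>
<version>3.2.1</version>
</dependency>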
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.12</artifactId>
<version>3.2.1</version>
<scope>compile</scope>