PySpark: No suitable driver found for jdbc:mysql://dbhost


I am trying to write my DataFrame to a MySQL table. I get "No suitable driver found for jdbc:mysql://dbhost" when I try to write.

As part of the preprocessing I read from other tables in the same DB and have no issues doing that. I can do the full run and save the rows to a Parquet file, so reading from the MySQL DB definitely works; only the write fails.

I am submitting using:

spark-submit --conf spark.executor.extraClassPath=/home/user/Downloads/mysql-connector-java-5.1.35-bin.jar --driver-class-path /home/user/Downloads/mysql-connector-java-5.1.35-bin.jar --jars /home/user/Downloads/mysql-connector-java-5.1.35-bin.jar main.py

And I am writing using:

df.write.jdbc(url="jdbc:mysql://dbhost/dbname", table="tablename", mode="append", properties={"user":"dbuser", "password": "s3cret"})


This is a bug related to the classloader. The ticket for it is https://issues.apache.org/jira/browse/SPARK-8463 and the pull request is https://github.com/apache/spark/pull/6900.

A workaround is to copy mysql-connector-java-5.1.35-bin.jar to every machine in the cluster, at the same path as on the driver (here /home/user/Downloads/mysql-connector-java-5.1.35-bin.jar).
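
Another thing that may be worth trying, and this is an assumption on my part rather than something taken from the ticket, is to name the driver class explicitly in the connection properties so that Spark does not have to discover it through DriverManager. A minimal sketch follows. The helper build_jdbc_properties is hypothetical and only for illustration; com.mysql.jdbc.Driver is the driver class shipped in Connector/J 5.1.x.

```python
# Hypothetical helper (not part of PySpark): builds the properties dict
# that is passed to DataFrameWriter.jdbc.
def build_jdbc_properties(user, password, driver="com.mysql.jdbc.Driver"):
    # "driver" names the JDBC driver class explicitly; the default here is
    # the class name used by MySQL Connector/J 5.1.x.
    return {"user": user, "password": password, "driver": driver}


props = build_jdbc_properties("dbuser", "s3cret")

# With a live SparkSession and the DataFrame `df` from the question,
# the write would then look like this (left commented out because it
# needs a running cluster and a reachable database):
# df.write.jdbc(url="jdbc:mysql://dbhost/dbname", table="tablename",
#               mode="append", properties=props)
```

The jar still has to be reachable on the driver and the executors, so the spark-submit options from the question stay as they are.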