

ModuleNotFoundError: No module named 'py4j'


Conclusion

Unresolved.

Background

In the Dockerfile, the base image is specified as

FROM jupyter/pyspark-notebook:~~~~~~~~~

and this is how PySpark gets loaded.

Environment

Python 3.7.6
pyspark 2.4.5

Relevant code

from pyspark.sql import SparkSession

Error

/usr/local/spark/python/pyspark/__init__.py in <module>
     49 
     50 from pyspark.conf import SparkConf
---> 51 from pyspark.context import SparkContext
     52 from pyspark.rdd import RDD, RDDBarrier
     53 from pyspark.files import SparkFiles

/usr/local/spark/python/pyspark/context.py in <module>
     27 from tempfile import NamedTemporaryFile
     28 
---> 29 from py4j.protocol import Py4JError
     30 
     31 from pyspark import accumulators

ModuleNotFoundError: No module named 'py4j'
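The traceback shows pyspark being imported from /usr/local/spark/python, that is, from the Spark distribution itself. Spark distributions normally ship their own copy of py4j as a source zip under python/lib, so one thing worth checking is whether that zip is simply missing from the import path. The following is a minimal sketch of that idea, not a confirmed fix: the helper name `add_bundled_py4j` is hypothetical, and the default `spark_home` is an assumption taken from the paths in the traceback.

```python
import glob
import os
import sys


def add_bundled_py4j(spark_home="/usr/local/spark"):
    """Put Spark's bundled py4j source zip on sys.path, if one is found.

    spark_home defaults to the directory seen in the traceback; this is
    an assumption and may differ in other images.
    Returns the path that was added, or None when no zip was found.
    """
    pattern = os.path.join(spark_home, "python", "lib", "py4j-*-src.zip")
    matches = sorted(glob.glob(pattern))
    if not matches:
        return None
    py4j_zip = matches[0]
    if py4j_zip not in sys.path:
        # Insert at the front so the bundled copy wins over other installs.
        sys.path.insert(0, py4j_zip)
    return py4j_zip
```

Calling `add_bundled_py4j()` in a notebook cell before `from pyspark.sql import SparkSession` would make the bundled copy importable if it exists. A return value of `None` means no bundled zip was found at the assumed location.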

Workaround

Since this was blocking my work, for now I ran the following in the Jupyter notebook:

!pip install py4j

That got things going for the time being.
However, pip then reports the error below, so I will add an update once I find a fix that avoids it.

ERROR: pyspark 2.4.5 has requirement py4j==0.10.7, but you'll have py4j 0.10.9.1 which is incompatible.
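The pip message names the version it expects (py4j==0.10.7), so pinning that version instead of taking the latest release is a natural thing to try. Below is a minimal sketch under that assumption; it has not been verified as a fix in this environment. The helper names `pinned_install_command` and `install_pinned_py4j` are hypothetical. Running pip through `sys.executable` is meant to ensure the package lands in the interpreter the notebook kernel is using.

```python
import subprocess
import sys

# Version taken from the pip error message: "py4j==0.10.7".
REQUIRED_PY4J = "0.10.7"


def pinned_install_command(package, version):
    """Build a pip command that targets the currently running interpreter."""
    return [sys.executable, "-m", "pip", "install", f"{package}=={version}"]


def install_pinned_py4j():
    """Install the py4j version that pyspark 2.4.5 declares as its requirement."""
    subprocess.check_call(pinned_install_command("py4j", REQUIRED_PY4J))
```

After calling `install_pinned_py4j()` in a cell, restarting the kernel before importing pyspark again is a sensible precaution, since an already imported py4j would otherwise stay loaded.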

