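PySpark memory issue with a join condition
I am using Spark 2.1.0. I have two DataFrames, each no larger than 3 MB. When I run an inner join on the two DataFrames, all of my transformation logic works perfectly. However, when I join them with a RightOuter join, I get the error below.
Error: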
RN for exceeding memory limits. 1.5 GB of 1.5 GB physical memory used.
Consider boosting spark.yarn.executor.memoryOverhead.
17/08/02 02:29:53 ERROR cluster.YarnScheduler: Lost executor 337 on ip-172-21-1-105.eu-west-1.compute.internal: Container killed by YARN for exceeding memory limits. 1.5 GB of 1.5 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead.
17/08/02 02:29:53 WARN scheduler.TaskSetManager: Lost task 34.0 in stage 283.0 (TID 11396, ip-172-21-1-105.eu-west-1.compute.internal, executor 337): ExecutorLostFailure (executor 337 exited caused by one of the running tasks) Reason: Container killed by YARN for exceeding memory limits. 1.5 GB of 1.5 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead.
17/08/02 02:29:53 WARN server.TransportChannelHandler: Exception in connection from /172.21.1.105:50342
java.io.IOException: Connection reset by peer
I tried the following alternatives: 1) df.coalesce(xvalue).show() 2) setting the executor memory. Neither had any effect.
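For context on the log message: on YARN, the container limit is the executor memory plus spark.yarn.executor.memoryOverhead, so raising the executor memory alone only grows the overhead portion slowly. The sketch below just illustrates that arithmetic. It assumes the Spark 2.x default overhead of max(384 MB, 10% of executor memory); the function names and sample values are hypothetical and are not taken from this job.

```python
# Sketch of how a YARN container request is sized for a Spark executor.
# Assumption: Spark 2.x defaults, where the overhead is
# max(384 MB, 10% of spark.executor.memory) unless set explicitly.

MIN_OVERHEAD_MB = 384      # assumed default minimum overhead
OVERHEAD_FRACTION = 0.10   # assumed default overhead factor


def default_overhead_mb(executor_memory_mb):
    """Overhead used when spark.yarn.executor.memoryOverhead is not set."""
    return max(MIN_OVERHEAD_MB, int(executor_memory_mb * OVERHEAD_FRACTION))


def container_request_mb(executor_memory_mb, overhead_mb=None):
    """Executor heap plus overhead, before any rounding done by YARN."""
    if overhead_mb is None:
        overhead_mb = default_overhead_mb(executor_memory_mb)
    return executor_memory_mb + overhead_mb


if __name__ == "__main__":
    # Hypothetical values: a 1 GB executor with default vs. explicit overhead.
    print(container_request_mb(1024))        # default overhead
    print(container_request_mb(1024, 1024))  # explicit 1 GB overhead
```

If the default overhead is too small, it can be set explicitly, for example with `--conf spark.yarn.executor.memoryOverhead=1024` (value in MB) on spark-submit. Whether that resolves this particular failure is untested here.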
This issue has been unresolved for the past few weeks. Can anyone please tell me where I am going wrong?