In 4.1.11, with numParts set to 50,000, a table of 1,000 rows takes about 1 minute to complete. In 5.1.4, the same configuration takes over an hour.
25/01/08 17:11:43 INFO SparkContext: Running Spark version 3.5.1
25/01/08 17:11:43 INFO SparkContext: OS info Linux, 3.10.0-1160.125.1.el7.x86_64, amd64
25/01/08 17:11:43 INFO SparkContext: Java version 11
25/01/08 17:11:44 INFO PropertyHelper: Known property [spark.cdm.perfops.ratelimit.origin] is configured with value [20000] and is type [NUMBER]
25/01/08 17:11:44 INFO PropertyHelper: Known property [spark.cdm.schema.origin.column.ttl.automatic] is configured with value [true] and is type [BOOLEAN]
25/01/08 17:11:44 INFO PropertyHelper: Known property [spark.cdm.schema.origin.column.writetime.automatic] is configured with value [true] and is type [BOOLEAN]
25/01/08 17:11:44 INFO PropertyHelper: Known property [spark.cdm.perfops.numParts] is configured with value [50000] and is type [NUMBER]
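For context: as we understand it, numParts controls how many pieces the full token range is split into, each processed as its own unit of work. The sketch below is purely illustrative (not CDM's actual code) of why the 50,000 parts shown in the log above are pathological for a 1,000-row table: nearly every sub-range is empty, yet each still costs a task and a query round-trip.

```scala
// Illustrative sketch only (not CDM's implementation): splitting the full
// Murmur3 token range (-2^63 .. 2^63-1) into numParts contiguous sub-ranges.
object NumPartsSketch {
  val Min = BigInt(Long.MinValue)
  val Max = BigInt(Long.MaxValue)

  def splitTokenRange(numParts: Int): Seq[(BigInt, BigInt)] = {
    val width = (Max - Min + 1) / numParts
    (0 until numParts).map { i =>
      val lo = Min + width * i
      val hi = if (i == numParts - 1) Max else lo + width - 1
      (lo, hi)
    }
  }

  def main(args: Array[String]): Unit = {
    // 50,000 sub-ranges for a ~1,000-row table: the vast majority of
    // partitions match no rows, but each still incurs per-partition overhead.
    println(splitTokenRange(50000).size) // 50000
  }
}
```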
We dropped numParts to 10, and the job completed in about 1 minute.
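For reference, a minimal sketch of that override; in practice the property goes in the CDM properties file or is passed via spark-submit --conf, so the SparkConf form here is just illustrative of the key and value:

```scala
import org.apache.spark.SparkConf

object NumPartsOverride {
  // Illustrative only: the override that restored the ~1 minute runtime.
  val conf: SparkConf = new SparkConf()
    .set("spark.cdm.perfops.numParts", "10")
}
```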
The following error is thrown every 10 minutes:
25/01/08 17:35:29 ERROR Inbox: Ignoring error
org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.SparkThreadUtils$.awaitResult(SparkThreadUtils.scala:56)
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:310)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:102)
at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:110)
at org.apache.spark.util.RpcUtils$.makeDriverRef(RpcUtils.scala:36)
at org.apache.spark.storage.BlockManagerMasterEndpoint.driverEndpoint$lzycompute(BlockManagerMasterEndpoint.scala:124)
at org.apache.spark.storage.BlockManagerMasterEndpoint.org$apache$spark$storage$BlockManagerMasterEndpoint$$driverEndpoint(BlockManagerMasterEndpoint.scala:123)
at org.apache.spark.storage.BlockManagerMasterEndpoint.isExecutorAlive$lzycompute$1(BlockManagerMasterEndpoint.scala:688)
at org.apache.spark.storage.BlockManagerMasterEndpoint.isExecutorAlive$1(BlockManagerMasterEndpoint.scala:687)
at org.apache.spark.storage.BlockManagerMasterEndpoint.org$apache$spark$storage$BlockManagerMasterEndpoint$$register(BlockManagerMasterEndpoint.scala:725)
at org.apache.spark.storage.BlockManagerMasterEndpoint$$anonfun$receiveAndReply$1.applyOrElse(BlockManagerMasterEndpoint.scala:133)
at org.apache.spark.rpc.netty.Inbox.$anonfun$process$1(Inbox.scala:103)
at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:213)
at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100)
at org.apache.spark.rpc.netty.MessageLoop.org$apache$spark$rpc$netty$MessageLoop$$receiveLoop(MessageLoop.scala:75)
at org.apache.spark.rpc.netty.MessageLoop$$anon$1.run(MessageLoop.scala:41)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: org.apache.spark.rpc.RpcEndpointNotFoundException: Cannot find endpoint: spark://CoarseGrainedScheduler@vmb.foo.bar-dns.com:38327
at org.apache.spark.rpc.netty.NettyRpcEnv.$anonfun$asyncSetupEndpointRefByURI$1(NettyRpcEnv.scala:148)
at org.apache.spark.rpc.netty.NettyRpcEnv.$anonfun$asyncSetupEndpointRefByURI$1$adapted(NettyRpcEnv.scala:144)
at scala.concurrent.impl.Promise$Transformation.run(Promise.scala:470)
at org.apache.spark.util.ThreadUtils$$anon$1.execute(ThreadUtils.scala:99)
at scala.concurrent.impl.ExecutionContextImpl.execute(ExecutionContextImpl.scala:21)
at scala.concurrent.impl.Promise$Transformation.submitWithValue(Promise.scala:429)
at scala.concurrent.impl.Promise$DefaultPromise.submitWithValue(Promise.scala:338)
at scala.concurrent.impl.Promise$DefaultPromise.tryComplete0(Promise.scala:285)
at scala.concurrent.impl.Promise$Transformation.run(Promise.scala:504)
at org.apache.spark.util.ThreadUtils$$anon$1.execute(ThreadUtils.scala:99)
at scala.concurrent.impl.ExecutionContextImpl.execute(ExecutionContextImpl.scala:21)
at scala.concurrent.impl.Promise$Transformation.submitWithValue(Promise.scala:429)
at scala.concurrent.impl.Promise$DefaultPromise.submitWithValue(Promise.scala:338)
at scala.concurrent.impl.Promise$DefaultPromise.tryComplete0(Promise.scala:285)
at scala.concurrent.impl.Promise$Transformation.run(Promise.scala:504)
at scala.concurrent.ExecutionContext$parasitic$.execute(ExecutionContext.scala:222)
at scala.concurrent.impl.Promise$Transformation.submitWithValue(Promise.scala:429)
at scala.concurrent.impl.Promise$DefaultPromise.submitWithValue(Promise.scala:335)
at scala.concurrent.impl.Promise$DefaultPromise.tryComplete0(Promise.scala:285)
at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:278)
at scala.concurrent.Promise.trySuccess(Promise.scala:99)
at scala.concurrent.Promise.trySuccess$(Promise.scala:99)
at scala.concurrent.impl.Promise$DefaultPromise.trySuccess(Promise.scala:104)
at org.apache.spark.rpc.netty.NettyRpcEnv.onSuccess$1(NettyRpcEnv.scala:225)
at org.apache.spark.rpc.netty.NettyRpcEnv.$anonfun$askAbortable$5(NettyRpcEnv.scala:239)
at org.apache.spark.rpc.netty.NettyRpcEnv.$anonfun$askAbortable$5$adapted(NettyRpcEnv.scala:238)
at scala.concurrent.impl.Promise$Transformation.run(Promise.scala:484)
at org.apache.spark.util.ThreadUtils$$anon$1.execute(ThreadUtils.scala:99)
at scala.concurrent.impl.ExecutionContextImpl.execute(ExecutionContextImpl.scala:21)
at scala.concurrent.impl.Promise$Transformation.submitWithValue(Promise.scala:429)
at scala.concurrent.impl.Promise$DefaultPromise.submitWithValue(Promise.scala:338)
at scala.concurrent.impl.Promise$DefaultPromise.tryComplete0(Promise.scala:285)
at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:278)
at scala.concurrent.Promise.complete(Promise.scala:57)
at scala.concurrent.Promise.complete$(Promise.scala:56)
at scala.concurrent.impl.Promise$DefaultPromise.complete(Promise.scala:104)
at scala.concurrent.Promise.success(Promise.scala:91)
at scala.concurrent.Promise.success$(Promise.scala:91)
at scala.concurrent.impl.Promise$DefaultPromise.success(Promise.scala:104)
at org.apache.spark.rpc.netty.LocalNettyRpcCallContext.send(NettyRpcCallContext.scala:50)
at org.apache.spark.rpc.netty.NettyRpcCallContext.reply(NettyRpcCallContext.scala:32)
at org.apache.spark.rpc.netty.RpcEndpointVerifier$$anonfun$receiveAndReply$1.applyOrElse(RpcEndpointVerifier.scala:31)
... 8 more
25/01/08 17:35:29 WARN Executor: Issue communicating with driver in heartbeater
org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.SparkThreadUtils$.awaitResult(SparkThreadUtils.scala:56)
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:310)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:101)
at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:85)
at org.apache.spark.storage.BlockManagerMaster.registerBlockManager(BlockManagerMaster.scala:80)
at org.apache.spark.storage.BlockManager.reregister(BlockManager.scala:642)
at org.apache.spark.executor.Executor.reportHeartBeat(Executor.scala:1223)
at org.apache.spark.executor.Executor.$anonfun$heartbeater$1(Executor.scala:295)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1928)
at org.apache.spark.Heartbeater$$anon$1.run(Heartbeater.scala:46)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.SparkThreadUtils$.awaitResult(SparkThreadUtils.scala:56)
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:310)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:102)
at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:110)
at org.apache.spark.util.RpcUtils$.makeDriverRef(RpcUtils.scala:36)
at org.apache.spark.storage.BlockManagerMasterEndpoint.driverEndpoint$lzycompute(BlockManagerMasterEndpoint.scala:124)
at org.apache.spark.storage.BlockManagerMasterEndpoint.org$apache$spark$storage$BlockManagerMasterEndpoint$$driverEndpoint(BlockManagerMasterEndpoint.scala:123)
at org.apache.spark.storage.BlockManagerMasterEndpoint.isExecutorAlive$lzycompute$1(BlockManagerMasterEndpoint.scala:688)
at org.apache.spark.storage.BlockManagerMasterEndpoint.isExecutorAlive$1(BlockManagerMasterEndpoint.scala:687)
at org.apache.spark.storage.BlockManagerMasterEndpoint.org$apache$spark$storage$BlockManagerMasterEndpoint$$register(BlockManagerMasterEndpoint.scala:725)
at org.apache.spark.storage.BlockManagerMasterEndpoint$$anonfun$receiveAndReply$1.applyOrElse(BlockManagerMasterEndpoint.scala:133)
at org.apache.spark.rpc.netty.Inbox.$anonfun$process$1(Inbox.scala:103)
at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:213)
at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100)
at org.apache.spark.rpc.netty.MessageLoop.org$apache$spark$rpc$netty$MessageLoop$$receiveLoop(MessageLoop.scala:75)
at org.apache.spark.rpc.netty.MessageLoop$$anon$1.run(MessageLoop.scala:41)
... 3 more
Caused by: org.apache.spark.rpc.RpcEndpointNotFoundException: Cannot find endpoint: spark://CoarseGrainedScheduler@vmb.foo.bar-dns.com:38327
at org.apache.spark.rpc.netty.NettyRpcEnv.$anonfun$asyncSetupEndpointRefByURI$1(NettyRpcEnv.scala:148)
at org.apache.spark.rpc.netty.NettyRpcEnv.$anonfun$asyncSetupEndpointRefByURI$1$adapted(NettyRpcEnv.scala:144)
at scala.concurrent.impl.Promise$Transformation.run(Promise.scala:470)
at org.apache.spark.util.ThreadUtils$$anon$1.execute(ThreadUtils.scala:99)
at scala.concurrent.impl.ExecutionContextImpl.execute(ExecutionContextImpl.scala:21)
at scala.concurrent.impl.Promise$Transformation.submitWithValue(Promise.scala:429)
at scala.concurrent.impl.Promise$DefaultPromise.submitWithValue(Promise.scala:338)
at scala.concurrent.impl.Promise$DefaultPromise.tryComplete0(Promise.scala:285)
at scala.concurrent.impl.Promise$Transformation.run(Promise.scala:504)
at org.apache.spark.util.ThreadUtils$$anon$1.execute(ThreadUtils.scala:99)
at scala.concurrent.impl.ExecutionContextImpl.execute(ExecutionContextImpl.scala:21)
at scala.concurrent.impl.Promise$Transformation.submitWithValue(Promise.scala:429)
at scala.concurrent.impl.Promise$DefaultPromise.submitWithValue(Promise.scala:338)
at scala.concurrent.impl.Promise$DefaultPromise.tryComplete0(Promise.scala:285)
at scala.concurrent.impl.Promise$Transformation.run(Promise.scala:504)
at scala.concurrent.ExecutionContext$parasitic$.execute(ExecutionContext.scala:222)
at scala.concurrent.impl.Promise$Transformation.submitWithValue(Promise.scala:429)
at scala.concurrent.impl.Promise$DefaultPromise.submitWithValue(Promise.scala:335)
at scala.concurrent.impl.Promise$DefaultPromise.tryComplete0(Promise.scala:285)
at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:278)
at scala.concurrent.Promise.trySuccess(Promise.scala:99)
at scala.concurrent.Promise.trySuccess$(Promise.scala:99)
at scala.concurrent.impl.Promise$DefaultPromise.trySuccess(Promise.scala:104)
at org.apache.spark.rpc.netty.NettyRpcEnv.onSuccess$1(NettyRpcEnv.scala:225)
at org.apache.spark.rpc.netty.NettyRpcEnv.$anonfun$askAbortable$5(NettyRpcEnv.scala:239)
at org.apache.spark.rpc.netty.NettyRpcEnv.$anonfun$askAbortable$5$adapted(NettyRpcEnv.scala:238)
at scala.concurrent.impl.Promise$Transformation.run(Promise.scala:484)
at org.apache.spark.util.ThreadUtils$$anon$1.execute(ThreadUtils.scala:99)
at scala.concurrent.impl.ExecutionContextImpl.execute(ExecutionContextImpl.scala:21)
at scala.concurrent.impl.Promise$Transformation.submitWithValue(Promise.scala:429)
at scala.concurrent.impl.Promise$DefaultPromise.submitWithValue(Promise.scala:338)
at scala.concurrent.impl.Promise$DefaultPromise.tryComplete0(Promise.scala:285)
at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:278)
at scala.concurrent.Promise.complete(Promise.scala:57)
at scala.concurrent.Promise.complete$(Promise.scala:56)
at scala.concurrent.impl.Promise$DefaultPromise.complete(Promise.scala:104)
at scala.concurrent.Promise.success(Promise.scala:91)
at scala.concurrent.Promise.success$(Promise.scala:91)
at scala.concurrent.impl.Promise$DefaultPromise.success(Promise.scala:104)
at org.apache.spark.rpc.netty.LocalNettyRpcCallContext.send(NettyRpcCallContext.scala:50)
at org.apache.spark.rpc.netty.NettyRpcCallContext.reply(NettyRpcCallContext.scala:32)
at org.apache.spark.rpc.netty.RpcEndpointVerifier$$anonfun$receiveAndReply$1.applyOrElse(RpcEndpointVerifier.scala:31)
... 8 more
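The RpcEndpointNotFoundException above means the executor-side heartbeater could not locate the driver's CoarseGrainedScheduler endpoint, which would be consistent with a driver bogged down managing tens of thousands of near-empty partitions. As a hedged sketch only (these are generic Spark timeout settings, not a CDM-specific or confirmed fix), the knobs below relax heartbeat and RPC timing while the regression is investigated:

```scala
import org.apache.spark.SparkConf

object TimeoutSketch {
  // Standard Spark settings, shown as a possible mitigation to try,
  // not a root-cause fix for the numParts slowdown.
  val conf: SparkConf = new SparkConf()
    .set("spark.network.timeout", "600s")           // default 120s
    .set("spark.executor.heartbeatInterval", "30s") // default 10s
    .set("spark.rpc.askTimeout", "600s")            // defaults to spark.network.timeout
}
```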