
Dataproc provisioning timeout due to network unreachable to googleapis.com

I'm trying to create a basic Dataproc cluster (using default values) in a GCP project. The VMs are created, but the cluster stays in the Provisioning state forever until it times out.

  • I tried both the console and the gcloud command line (a sketch of the command is shown after this list).
  • I tried different image versions (2.0-debian, 2.0-ubuntu, 1.5-debian, 1.5-ubuntu).
  • No optional components are selected (the cluster will only be used for Spark jobs).
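
For reference, a minimal sketch of the create command I ran (the cluster name is a placeholder; the region matches the dataproccontrol-europe-west1 endpoint in the error below, and the image version is one of those I tried):

    gcloud dataproc clusters create my-cluster \
        --region=europe-west1 \
        --image-version=2.0-debian10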

In all of those cases I get the following error (found by SSHing into the master and checking /var/log/google-dataproc-agent.0.log):

Network is unreachable: dataproccontrol-europe-west1.googleapis.com/2a00:1450:400c:c04:0:0:0:5f:443

The full error trace:

Jul 24, 2021 11:02:53 AM com.google.cloud.hadoop.services.repackaged.com.google.cloud.hadoop.util.ResilientOperation nextSleep INFO: Transient exception caught. Sleeping for 1120, then retrying.
com.google.cloud.hadoop.services.repackaged.io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 29.974635818s. [buffered_nanos=30006131805, waiting_for_connection]
        at com.google.cloud.hadoop.services.repackaged.io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:244)
        at com.google.cloud.hadoop.services.repackaged.io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:225)
        at com.google.cloud.hadoop.services.repackaged.io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:142)
        at com.google.cloud.dataproc.control.v1.AgentServiceGrpc$AgentServiceBlockingStub.createAgent(AgentServiceGrpc.java:735)
        at com.google.cloud.hadoop.services.agent.protocol.AgentApiAsyncUpdater$1.call(AgentApiAsyncUpdater.java:238)
        at com.google.cloud.hadoop.services.agent.protocol.AgentApiAsyncUpdater$1.call(AgentApiAsyncUpdater.java:235)
        at com.google.cloud.hadoop.services.repackaged.com.google.cloud.hadoop.util.ResilientOperation.retry(ResilientOperation.java:67)
        at com.google.cloud.hadoop.services.agent.protocol.AgentApiAsyncUpdater.executeWithBackoff(AgentApiAsyncUpdater.java:345)
        at com.google.cloud.hadoop.services.agent.protocol.AgentApiAsyncUpdater.createAgent(AgentApiAsyncUpdater.java:234)
        at com.google.cloud.hadoop.services.agent.protocol.AgentApiAsyncUpdater.getOrCreateAgent(AgentApiAsyncUpdater.java:203)
        at com.google.cloud.hadoop.services.agent.protocol.AgentApiAsyncUpdater.run(AgentApiAsyncUpdater.java:183)
        at com.google.cloud.hadoop.services.repackaged.com.google.common.util.concurrent.MoreExecutors$ScheduledListeningDecorator$NeverSuccessfulListenableFutureTask.run(MoreExecutors.java:679)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Jul 24, 2021 11:03:23 AM com.google.cloud.hadoop.services.repackaged.com.google.cloud.hadoop.util.ResilientOperation nextSleep INFO: Transient exception caught. Sleeping for 1958, then retrying.
com.google.cloud.hadoop.services.repackaged.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
        at com.google.cloud.hadoop.services.repackaged.io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:244)
        at com.google.cloud.hadoop.services.repackaged.io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:225)
        at com.google.cloud.hadoop.services.repackaged.io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:142)
        at com.google.cloud.dataproc.control.v1.AgentServiceGrpc$AgentServiceBlockingStub.createAgent(AgentServiceGrpc.java:735)
        at com.google.cloud.hadoop.services.agent.protocol.AgentApiAsyncUpdater$1.call(AgentApiAsyncUpdater.java:238)
        at com.google.cloud.hadoop.services.agent.protocol.AgentApiAsyncUpdater$1.call(AgentApiAsyncUpdater.java:235)
        at com.google.cloud.hadoop.services.repackaged.com.google.cloud.hadoop.util.ResilientOperation.retry(ResilientOperation.java:67)
        at com.google.cloud.hadoop.services.agent.protocol.AgentApiAsyncUpdater.executeWithBackoff(AgentApiAsyncUpdater.java:345)
        at com.google.cloud.hadoop.services.agent.protocol.AgentApiAsyncUpdater.createAgent(AgentApiAsyncUpdater.java:234)
        at com.google.cloud.hadoop.services.agent.protocol.AgentApiAsyncUpdater.getOrCreateAgent(AgentApiAsyncUpdater.java:203)
        at com.google.cloud.hadoop.services.agent.protocol.AgentApiAsyncUpdater.run(AgentApiAsyncUpdater.java:183)
        at com.google.cloud.hadoop.services.repackaged.com.google.common.util.concurrent.MoreExecutors$ScheduledListeningDecorator$NeverSuccessfulListenableFutureTask.run(MoreExecutors.java:679)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: com.google.cloud.hadoop.services.repackaged.io.netty.channel.AbstractChannel$AnnotatedSocketException: Network is unreachable: dataproccontrol-europe-west1.googleapis.com/2a00:1450:400c:c04:0:0:0:5f:443
Caused by: java.net.SocketException: Network is unreachable
        at sun.nio.ch.Net.connect0(Native Method)
        at sun.nio.ch.Net.connect(Net.java:482)
        at sun.nio.ch.Net.connect(Net.java:474)
        at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:647)
        at com.google.cloud.hadoop.services.repackaged.io.netty.util.internal.SocketUtils$3.run(SocketUtils.java:91)
        at com.google.cloud.hadoop.services.repackaged.io.netty.util.internal.SocketUtils$3.run(SocketUtils.java:88)
        at java.security.AccessController.doPrivileged(Native Method)
        at com.google.cloud.hadoop.services.repackaged.io.netty.util.internal.SocketUtils.connect(SocketUtils.java:88)
        at com.google.cloud.hadoop.services.repackaged.io.netty.channel.socket.nio.NioSocketChannel.doConnect(NioSocketChannel.java:315)
        at com.google.cloud.hadoop.services.repackaged.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.connect(AbstractNioChannel.java:248)
        at com.google.cloud.hadoop.services.repackaged.io.netty.channel.DefaultChannelPipeline$HeadContext.connect(DefaultChannelPipeline.java:1342)
        at com.google.cloud.hadoop.services.repackaged.io.netty.channel.AbstractChannelHandlerContext.invokeConnect(AbstractChannelHandlerContext.java:548)
        at com.google.cloud.hadoop.services.repackaged.io.netty.channel.AbstractChannelHandlerContext.connect(AbstractChannelHandlerContext.java:533)
        at com.google.cloud.hadoop.services.repackaged.io.netty.channel.ChannelDuplexHandler.connect(ChannelDuplexHandler.java:54)
        at com.google.cloud.hadoop.services.repackaged.io.grpc.netty.WriteBufferingAndExceptionHandler.connect(WriteBufferingAndExceptionHandler.java:150)
        at com.google.cloud.hadoop.services.repackaged.io.netty.channel.AbstractChannelHandlerContext.invokeConnect(AbstractChannelHandlerContext.java:548)
        at com.google.cloud.hadoop.services.repackaged.io.netty.channel.AbstractChannelHandlerContext.access$1000(AbstractChannelHandlerContext.java:61)
        at com.google.cloud.hadoop.services.repackaged.io.netty.channel.AbstractChannelHandlerContext$9.run(AbstractChannelHandlerContext.java:538)
        at com.google.cloud.hadoop.services.repackaged.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
        at com.google.cloud.hadoop.services.repackaged.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
        at com.google.cloud.hadoop.services.repackaged.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500)
        at com.google.cloud.hadoop.services.repackaged.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
        at com.google.cloud.hadoop.services.repackaged.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        at com.google.cloud.hadoop.services.repackaged.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.lang.Thread.run(Thread.java:748)

Any help would be appreciated!

Thank you in advance.

Edit: My firewall rules & VPC settings (screenshots omitted).

Cluster configuration (screenshot omitted).



1 Answer

Based on the error message Network is unreachable: dataproccontrol-europe-west1.googleapis.com/2a00:1450:400c:c04:0:0:0:5f:443 and your network settings, it seems you are missing a route to the internet.

You can fix the problem by adding a route to 0.0.0.0/0 (IPv4) and ::/0 (IPv6) with --next-hop-gateway=default-internet-gateway; see this doc for more details. This route is created automatically for a new VPC network, so it was most likely deleted at some point; see this doc.
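
As a sketch (the network name my-vpc is a placeholder; check which routes already exist first):

    # Confirm the default route is missing on the VPC
    gcloud compute routes list --filter="network=my-vpc"

    # Re-create the IPv4 default route via the default internet gateway
    gcloud compute routes create default-internet-route \
        --network=my-vpc \
        --destination-range=0.0.0.0/0 \
        --next-hop-gateway=default-internet-gateway

The IPv6 equivalent uses --destination-range=::/0, if your subnets use IPv6.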

The route is needed because the Dataproc agent on the VMs has to reach the Dataproc control API to pick up jobs and report status. The API domain name dataproccontrol-<region>.googleapis.com resolves to an external IP, so the VMs need a route to the internet (or at least to the Google API IP ranges); when Private Google Access is enabled, that traffic still never leaves Google's network. The recommendation is to always keep a route to the internet and use firewall rules for more granular access control.

Also note that VMs without external IPs cannot access the internet by default, even if routes and firewall rules would allow it; see this doc for a solution. You can also use the Connectivity Test tool for troubleshooting.
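
If the cluster VMs have no external IPs, common options are enabling Private Google Access on the subnet or adding Cloud NAT, and the Connectivity Test tool can confirm reachability. A sketch, assuming placeholder subnet, router, NAT, project, and instance names (not values from the question):

    # Enable Private Google Access so internal-only VMs can reach
    # *.googleapis.com endpoints without leaving Google's network
    gcloud compute networks subnets update my-subnet \
        --region=europe-west1 \
        --enable-private-ip-google-access

    # Or add Cloud NAT so internal-only VMs can reach the internet
    gcloud compute routers create my-router \
        --network=my-vpc \
        --region=europe-west1
    gcloud compute routers nats create my-nat \
        --router=my-router \
        --region=europe-west1 \
        --auto-allocate-nat-external-ips \
        --nat-all-subnet-ip-ranges

    # Connectivity Test from the master VM to the control API on port 443
    gcloud network-management connectivity-tests create dataproc-control-test \
        --source-instance=projects/my-project/zones/europe-west1-b/instances/my-cluster-m \
        --destination-ip-address=<IP-resolved-from-dataproccontrol-europe-west1.googleapis.com> \
        --destination-port=443 \
        --protocol=TCP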
