Trino intermittently fails to pick up IRSA #15267
@Pluies I wonder what is being done in the custom credentials provider.
@skyahead it doesn't even keep looking, it only does it once 😄 Well, actually, being a subclass of AWS' `WebIdentityTokenCredentialsProvider`, there's not much to it. If it helps, it looks like this:

```kotlin
package com.foo

import com.amazonaws.auth.WebIdentityTokenCredentialsProvider
import java.net.URI
import org.apache.hadoop.conf.Configuration

// Barebones subclass that pins credential resolution to the web identity
// token (IRSA) provider, bypassing the rest of the default chain.
class CustomWebIdentityTokenCredentialsProvider(uri: URI, hadoopConf: Configuration) :
    WebIdentityTokenCredentialsProvider() {}
```

And our configuration:

```xml
<configuration>
  <property>
    <name>trino.s3.credentials-provider</name>
    <value>com.foo.CustomWebIdentityTokenCredentialsProvider</value>
    <description>Custom IAM credentials provider to force IRSA</description>
  </property>
</configuration>
```
thanks so much, will give it a try
I suspect the fact that the provider sits inside a credentials chain is relevant here. If your code happens to run in an environment which has multiple sets of credentials available (e.g. an IRSA role from the container and an EC2 instance profile), and only one of them allows access to the S3 bucket, then what might be happening is the following: the IRSA provider fails transiently, and the chain silently falls back to the instance profile credentials, which lack the S3 permissions.
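The fallback behavior described above is inherent to how a provider chain works. Here is a minimal stand-in in plain Java (hypothetical names, no AWS SDK dependency; the real classes are `AWSCredentialsProviderChain`, `WebIdentityTokenCredentialsProvider`, etc.) showing why a transient failure of the first provider silently yields the second provider's credentials:

```java
import java.util.List;

// Hypothetical stand-in for an AWS-style credentials provider.
// Returns credentials, or null when this source is currently unavailable.
interface CredentialsProvider {
    String resolve();
}

// Hypothetical stand-in for an AWS-style provider chain.
class ProviderChain implements CredentialsProvider {
    private final List<CredentialsProvider> providers;

    ProviderChain(List<CredentialsProvider> providers) {
        this.providers = providers;
    }

    // Try each provider in order and return the first non-null result.
    // The chain has no notion of the "right" credentials, only the first
    // that worked, so a transient IRSA/STS failure silently falls through
    // to the instance profile.
    @Override
    public String resolve() {
        for (CredentialsProvider p : providers) {
            String creds = p.resolve();
            if (creds != null) {
                return creds;
            }
        }
        throw new IllegalStateException("no credentials available");
    }
}

public class ChainDemo {
    public static void main(String[] args) {
        CredentialsProvider irsa = () -> null; // e.g. STS throttled at startup
        CredentialsProvider instanceProfile = () -> "instance-profile-creds";
        ProviderChain chain = new ProviderChain(List.of(irsa, instanceProfile));
        System.out.println(chain.resolve()); // prints "instance-profile-creds"
    }
}
```

The query then proceeds with whichever credentials won, which is why the failure only shows up later as a 403 from S3 rather than as an authentication error at startup.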
cc: @pettyjamesm Have you by chance ever run into this, or have ideas if something is wrong on the Trino side?
We have a very different setup for credential management on our side, because instances don't typically use their own credentials to access AWS services. But I have seen the instance profile metadata endpoint return throttling exceptions or otherwise fail intermittently in some cases, so presumably something similar could happen to the IRSA provider. If that were to happen, then @mccartney's description of what could happen makes sense based on the code snippets linked. In general, it does seem like a bad idea to have a "chain" of credentials providers that can yield credentials with very different access permissions. Probably you want a way to specify either to use IRSA credentials or to fail, without trying to fall back to instance profile credentials.
Got it. So we can add a config which, when enabled, would use just IRSA creds instead of the default chain. cc: @electrum this is regarding better EKS integration
cc @hashhar @pettyjamesm @mccartney @electrum We have also faced similar issues and ended up implementing a custom credential provider, which internally uses the WebIdentity provider directly. We have also added extra logging so that if anything breaks we get more information; having nothing to trace is another problem when these issues happen in production. Since we pinned the credential provider to the WebIdentity provider, we have not faced the issue any more. IMO, it would be good to add a config option to use only IRSA credentials instead of the default chain. We are happy to contribute this upstream as well, as it looks like it will help other teams too. Thanks!
For those who are experiencing the same problem in a production environment and need an immediate fix: I assume the cause of this problem is the same as what other users discussed above. We need to fix the credential provider configuration so that it uses the web identity provider directly instead of the default chain:

1. Glue Catalog: we can use `hive.metastore.glue.aws-credentials-provider` to point at a custom provider class.
2. HDFS-S3 File System: fix `trino.s3.credentials-provider` in the Hadoop configuration resources.
3. (Optional) Exchange File System: add a similar configuration for the exchange manager.

After the Trino cluster is deployed, you can check whether the credential provider chain is used by setting `com.amazonaws=DEBUG` in the logging configuration.
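As a concrete sketch of item (2) above (assuming the legacy Hive S3 file system; the file path is a placeholder, and the provider class is the `com.foo` subclass shown earlier in the thread):

```properties
# etc/catalog/hive.properties (sketch)
# Point the connector at a Hadoop config resource that pins the S3
# credentials provider, bypassing the AWS default chain.
hive.config.resources=/etc/trino/conf/core-site.xml
```

where `core-site.xml` contains the `trino.s3.credentials-provider` property from the earlier comment.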
FYI: `WebIdentityTokenFileCredentialsProvider` is different from `WebIdentityTokenCredentialsProvider`.
After the modification, the IRSA Access Denied issue never happened again.
Hi team, any update on this? Did this get fixed in a later version?
Hey, we came across this issue today too; we're on Trino 435. Is this currently being looked into?
@1ambda would you be able to provide your config file? It's been quite hard to understand how to piece things together.
This has also become an issue for me. I will work on finding a fix today, but it'd be nice if someone already had one. I have attached our Helm values (helm_values.txt).
Wondering how we fixed it? We fixed it by using the custom credential provider to pin it to WebIdentity instead of the chain, made it the default, and did not face that issue any more.
Regards,
Manish
Is the solution simply to update the configs to include the custom credentials provider? Sorry, I'm a DevOps guy trying to help out the data team and have limited Java and no Trino knowledge.
Hey all, I was following along with this issue as we ran into the same thing. I've now updated to version 451. Not often, but occasionally my worker pods can still get into a bad state: any queries running on those pods fail with the error below, and sure enough, restarting the pods seems to do the trick. I'm using IRSA to query Parquet files in S3 buckets.
@mgorbatenko The version may matter here; can you confirm which release you are actually running?
Hey @rohanag12, apologies for the confusion. I am actually using 453; we upgraded from 451 to 453 when it was released. I misspoke in my previous comment.
Hello Trino folks!
We've been running Trino 403 on EKS via the Helm chart, with IRSA enabled to read Delta files from S3 buckets.
This setup works very well for us... Most of the time. Unfortunately, sometimes, a node will start up and fail to get its IRSA authentication, and will use its instance profile instead. The instance profile does not have the correct S3 permissions, and any calls to S3 will then fail with a 403 Access Denied, causing Trino queries to fail with the following stacktrace:
The only way to fix this issue is to delete the worker pod.
We confirmed via S3 logs that the problematic requests were issued with the node profile rather than the IRSA profile, pointing to an issue with the credentials chain.
After asking on Slack, we thought it might be an issue with IRSA itself (i.e., the pod starting before IRSA credentials were available). In order to examine this possibility, we did two things:

- We ran a container executing `aws sts get-caller-identity` and waited until the right IAM role came up. This container always returned the correct IAM role, yet some Trino workers still started up faulty, so we could confirm that the pod always received the correct credentials.
- After adding `com.amazonaws=DEBUG` to our logging configuration, we could see how the authentication chain behaves on a normal node: after trying static credentials, which don't exist, Trino tries the WebIdentityTokenCredentialsProvider, which works. All good!

(There are a lot of very similar log lines in there; I assume this happens once per process?)
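For anyone wanting to reproduce that diagnostic step, enabling the AWS SDK debug logging in Trino is a one-line change (a sketch; assumes the standard `etc/log.properties` file):

```properties
# etc/log.properties
# Format is <logger-name>=<level>; this turns on verbose output from the
# AWS SDK so each step of the credentials chain is visible in the logs.
com.amazonaws=DEBUG
```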
But in a problematic node, we see the following errors:
The credentials chain then tries the next auth steps and eventually uses the instance credentials from the node itself, following AWS's default credentials chain.
We managed to work around the issue for now by setting up a custom S3 credentials provider class, a barebones subclass of com.amazonaws.auth.WebIdentityTokenCredentialsProvider. Since implementing this workaround, we have not seen any auth issues on any of our nodes.
So my bug report is: something is interrupting the WebIdentityTokenCredentialsProvider, causing it to fail intermittently. I don't have enough knowledge to dive into the Trino code, but if someone could, I'd be super grateful 🙏