Shaded sdk bundle doesn't shade mozilla/public-suffix-list.txt #2860

Open
steveloughran opened this issue Oct 5, 2022 · 6 comments
Labels
bug This issue is a bug. p3 This is a minor priority issue

Comments

@steveloughran

Describe the bug

The aws-sdk-bundle doesn't shade mozilla/public-suffix-list.txt, which httpclient uses to determine how it handles HTTPS certificates.

As a result, if a different library on the classpath ships an out-of-date list, applications may not be able to connect to more recent S3 regions.

This surfaced in HADOOP-18159

Expected Behavior

AWS httpclients get the up-to-date public suffix list and so connect to all S3 regions.

Current Behavior

If an out-of-date copy of the resource comes from a different library on the classpath, the caller sees the error message "Certificate doesn't match any of the subject alternative names: [*.s3.amazonaws.com, s3.amazonaws.com]".

Reproduction Steps

  1. Add a JAR with an outdated copy of the same resource, e.g. cos_api-bundle-5.6.19.jar, to the classpath.
  2. Attempt to talk to S3 regions.

Possible Solution

When shading resources, relocate this one too.

The AWS SDK is not unique in not shading this (Hadoop doesn't do it properly either).

Additional Information/Context

Reported against the v2 SDK as well: aws/aws-sdk-java-v2#1786

AWS Java SDK version used

1.12.262

JDK version used

not known

Operating System and version

not known

@steveloughran steveloughran added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Oct 5, 2022
@yasminetalby yasminetalby self-assigned this Oct 5, 2022
@yasminetalby yasminetalby removed their assignment Jan 27, 2023
@debora-ito debora-ito added needs-review This issue or PR needs review from the team. p3 This is a minor priority issue and removed needs-triage This issue or PR still needs to be triaged. labels Mar 29, 2023
@garretwilson

garretwilson commented Jun 25, 2023

I appreciate the issue, but it's not clear what the appropriate approach to shading would be. I can't readily find a transformer to merge the files without duplicate lines; see my Stack Overflow question, Maven Shade Plugin transformer to merge text files, discarding duplicate lines. Perhaps it would be OK to use the existing AppendingTransformer and merge the files, duplicating virtually all of the lines? Will Apache HttpClient merely discard duplicates it finds?

@steveloughran
Author

I don't know what to do here. Could the file be renamed and the shaded client set to pick it up? That way you can ship something which knows of all your regions.

@garretwilson

that way you can ship something which knows of all your regions

Regions? This is not a per-region file, is it? It's just that one source happened to be out of date, isn't it?

Honestly I am not familiar with how the original ticket came about, but in my case there were simply different versions of Apache HttpClient being used by transitive dependencies, and they had slightly different content because it looks like they were generated at different times with slightly different changelogs, not because they cover different regions. I understood in the case of this ticket that one was simply out of date and was overwriting the newer one via shading.

@steveloughran
Author

The original issue is HADOOP-18159:
Certificate doesn't match any of the subject alternative names: [*.s3.amazonaws.com, s3.amazonaws.com].

The shaded HTTPS client couldn't talk to newer S3 regions because of a certificate issue.

@garretwilson

the shaded https client couldn't talk to newer s3 regions because of certificate issue

Yes, I saw that ticket, although I didn't read all the comments. I inferred that there was simply an older public-suffix-list.txt file that Maven Shade Plugin used to overwrite the newer one, since they both had the same name.

For example, a recent public-suffix-list.txt for Apache HttpClient 5.x contains the excerpt below. All regions are included. My understanding is that in the case of HADOOP-18159, an old version was present that didn't have any s3.amazonaws.com entries at all. How is this ticket related to regions? It's simply the common issue of a file being out of date, right? Shade overwrote an updated version with an old, outdated version.

We simply need a Shade transformer that will merge two files of the same name and throw out duplicates.
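
The merge step itself could look roughly like the sketch below. This is only the "merge and throw out duplicates" logic in plain Java, not an actual Maven Shade Plugin transformer; the class name SuffixListMerge and the method mergeDistinct are hypothetical, and the sample lines in main are illustrative rather than the contents of any real list. It also treats the file as a flat list of lines, so a real transformer would still have to decide what to do with comment blocks and any section structure in the list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: merges two line-based lists, keeping the first
// occurrence of each non-blank line and preserving encounter order.
final class SuffixListMerge {

    static List<String> mergeDistinct(List<String> first, List<String> second) {
        // LinkedHashSet drops duplicates while remembering insertion order.
        Set<String> merged = new LinkedHashSet<>();
        for (List<String> source : Arrays.asList(first, second)) {
            for (String line : source) {
                String trimmed = line.trim();
                if (!trimmed.isEmpty()) {
                    merged.add(trimmed);
                }
            }
        }
        return new ArrayList<>(merged);
    }

    public static void main(String[] args) {
        // Illustrative input only: an "older" copy missing one entry.
        List<String> older = Arrays.asList(
                "// Amazon S3",
                "s3.amazonaws.com",
                "s3-eu-west-1.amazonaws.com");
        List<String> newer = Arrays.asList(
                "// Amazon S3",
                "s3.amazonaws.com",
                "s3-eu-west-1.amazonaws.com",
                "s3-eu-west-2.amazonaws.com");
        for (String line : mergeDistinct(older, newer)) {
            System.out.println(line);
        }
    }
}
```

Because the result is a union, an entry that was deliberately removed from the newer list would come back from the older copy, so merging is not automatically safer than picking the newest file.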

…
// Amazon S3
// Submitted by Luke Wells <psl-maintainers@amazon.com>
// Reference: d068bd97-f0a9-4838-a6d8-954b622ef4ae
s3.cn-north-1.amazonaws.com.cn
s3.dualstack.ap-northeast-1.amazonaws.com
…
s3.amazonaws.com
…
s3-ca-central-1.amazonaws.com
s3-eu-central-1.amazonaws.com
s3-eu-west-1.amazonaws.com
s3-eu-west-2.amazonaws.com
…

// AWS Cloud9
// Submitted by: AWS Security <psl-maintainers@amazon.com>
// Reference: 2b6dfa9a-3a7f-4367-b2e7-0321e77c0d59
vfs.cloud9.af-south-1.amazonaws.com
webview-assets.cloud9.af-south-1.amazonaws.com
vfs.cloud9.ap-east-1.amazonaws.com
…

@steveloughran
Author

There was another JAR on the classpath with the older list. Because both JARs had the same path to the list, it was up to the JVM to pick a version, and it chose the one without the newer regions listed as TLDs; hence S3A stopped talking to those regions.
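
To check whether a given classpath is in this situation, something like the following sketch may help. It uses the standard ClassLoader.getResources call to list every copy of the resource visible to a class loader; the class name FindSuffixListCopies and the method findCopies are hypothetical, and the assumption is that the copy a single-resource lookup returns is normally the first one in this enumeration.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic: lists every copy of a resource on the classpath.
final class FindSuffixListCopies {

    // Resource path named in this issue.
    static final String RESOURCE = "mozilla/public-suffix-list.txt";

    // Returns all URLs the loader can resolve for the resource, in lookup order.
    static List<URL> findCopies(ClassLoader loader, String resource) throws IOException {
        return Collections.list(loader.getResources(resource));
    }

    public static void main(String[] args) throws IOException {
        ClassLoader loader = FindSuffixListCopies.class.getClassLoader();
        List<URL> copies = findCopies(loader, RESOURCE);
        if (copies.isEmpty()) {
            System.out.println("No copy of " + RESOURCE + " found");
        }
        for (URL copy : copies) {
            // Each URL names the JAR (or directory) that supplies a copy.
            System.out.println(copy);
        }
        if (copies.size() > 1) {
            System.out.println("Multiple copies found; which one is used depends on classpath order");
        }
    }
}
```

Run with the same classpath as the failing application; more than one URL in the output means two JARs ship the file under the same path.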

@debora-ito debora-ito removed the needs-review This issue or PR needs review from the team. label Aug 3, 2023

4 participants