
S3-Transfer-Manager Streaming #317

Closed
djchapm opened this issue Jun 16, 2023 · 22 comments
Labels
bug Something isn't working

Comments

@djchapm

djchapm commented Jun 16, 2023

Describe the bug

I've been testing the combination of the latest aws-c-s3 update for #285 and the ability to stream data of unknown size to S3 using a publisher.

I have two concerns aside from the errors I'm getting:

  1. For small loads this works great, but it starts breaking down somewhere after 128 MB, at least in my setup. I am not sure whether the issue is with this project or with the Java v2 SDK's s3-transfer-manager - feel free to redirect.
  2. As the size increases, there is a jump in the time to complete the upload - at 128 MB I've seen the completion step take anywhere from ~4 s to 18 s.

Platform: MacBook Pro, Ventura 13.4, 32 GB, Apple M1 Max ARM64 processor.
Aws CLI: aws-cli/2.12.1 Python/3.11.4 Darwin/22.5.0 source/arm64
TargetThroughput: 20.0Gbps
Minimum Part Size: 1000000L (I think this causes issues after 128 MB)

I've mocked up data, which is just a 128-byte string that I send over and over in a ByteBuffer.

Errors include a failed response (400) from the AWS SDK, missing checksums for parts, and a SIGSEGV in libobjc.A.dylib... I've also received a SIGABRT, which doesn't even give me a dump.

Attaching a simple Java project to test - configure your credentials and a bucket name and execute; as you increase the number of lines, you'll start to see issues. The CRT log is created in whatever your working directory happens to be.
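For context, the core of the scenario looks roughly like this (a minimal sketch, not the attached sample - the bucket name, key, and data loop are placeholders):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

import reactor.core.publisher.Flux;

import software.amazon.awssdk.core.async.AsyncRequestBody;
import software.amazon.awssdk.services.s3.S3AsyncClient;
import software.amazon.awssdk.transfer.s3.S3TransferManager;
import software.amazon.awssdk.transfer.s3.model.Upload;
import software.amazon.awssdk.transfer.s3.model.UploadRequest;

public class Repro {
    public static void main(String[] args) {
        S3AsyncClient s3 = S3AsyncClient.crtBuilder()
                .targetThroughputInGbps(20.0)
                .minimumPartSizeInBytes(1_000_000L)
                .build();
        S3TransferManager tm = S3TransferManager.builder().s3Client(s3).build();

        // 128-byte line repeated 1M times => ~128 MB of data whose total
        // length is never declared to the SDK.
        byte[] line = "x".repeat(128).getBytes(StandardCharsets.UTF_8);
        Flux<ByteBuffer> body = Flux.range(0, 1_000_000).map(i -> ByteBuffer.wrap(line));

        UploadRequest req = UploadRequest.builder()
                .putObjectRequest(p -> p.bucket("my-bucket").key("flux.bin"))
                .requestBody(AsyncRequestBody.fromPublisher(body))
                .build();
        Upload upload = tm.upload(req);
        upload.completionFuture().join();
    }
}
```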

Expected Behavior

Files upload successfully, and the client can handle S3 objects of at least 100 GB.
The time to close the stream doesn't grow as drastically with file size as it does now.

Current Behavior

Crashes, failed uploads, and a heavy delay when completing the upload.

Reproduction Steps

s3AsyncTest.zip

Possible Solution

No response

Additional Information/Context

No response

aws-c-s3 version used

aws.sdk.version 2.20.79, aws.crt.version 0.22.1

Compiler and version used

openjdk 17.0.3 2022-04-19

Operating System and version

Darwin US10MAC44VWYPKH 22.5.0 Darwin Kernel Version 22.5.0: Mon Apr 24 20:52:24 PDT 2023; root:xnu-8796.121.2~5/RELEASE_ARM64_T6000 arm64 arm Darwin

@TingDaoK
Contributor

TingDaoK commented Jun 19, 2023

Hey @djchapm , thanks for reporting the issue.

I was able to reproduce the crash and the occasional 400 - it's a bug on our side, and I have a possible fix for it.

But as for the jump in time to complete the upload, I am not sure I understand what you mean.
Looking at your sample code, you are measuring the time from when the body finishes being read to when the whole upload completes. So you start the timer when the last part has been read, and there is a lot happening afterward:

  • We wait for the UploadPart requests to complete
  • We generate the CompleteMultipartUpload request
  • We wait for the CompleteMultipartUpload request to finish

So the time really varies for me across my tests - I saw anywhere between 23 and 40 seconds with a 1.28 GB object. But as I said above, the time depends on the network and such, which I think is understandable even if it looks weird. I tried both with withContentLength and without it, and I didn't see much difference between them based on the CloseDuration your sample reports. Let me know if you see it differently.

Edit: I'll look into why there is such a gap between the read completing and the whole upload completing, to get more info.

@TingDaoK
Contributor

TingDaoK commented Jun 19, 2023

I think the time difference you saw is unrelated to the unknown content-length change. I used aws-crt@0.21.17, which predates the unknown content-length work being merged. A run with a 128 MB object produced:

15:08:07.443 [AwsEventLoop 8] INFO  com.test.s3.AWSTransferManager - Done
15:08:25.496 [AwsEventLoop 5] DEBUG software.amazon.awssdk.requestId - Received successful response: 200, Request ID: B753BVB3AD3QE70W, Extended Request ID: UnmyUS202nmIpAuptKfYW/nFub0o+WjuaBXP95/HuHrW7iF+MRKmWPpLyqvZCNQJP/oYMZG4j1o=
15:08:25.496 [AwsEventLoop 5] DEBUG software.amazon.awssdk.request - Received successful response: 200, Request ID: B753BVB3AD3QE70W, Extended Request ID: UnmyUS202nmIpAuptKfYW/nFub0o+WjuaBXP95/HuHrW7iF+MRKmWPpLyqvZCNQJP/oYMZG4j1o=
15:08:25.512 [main] INFO  com.test.s3.AWSTransferManager - CompleteUploadResponse: PutObjectResponse(ETag="becb7ef096615795b4da34e3707f5874-25", ServerSideEncryption=AES256)
	CloseDuration: 18.906 seconds
15:08:25.512 [sdk-async-response-0-0] INFO  com.test.s3.S3TransferListener - Transfer complete for resource: flux.bin
	 Bytes: 128000000.000000
	 MBs: 122.070
	Throughput: 6.42 MB/s

In this run it took 18.9 secs. From the CRT log we can see that the last UploadPart request was made at 2023-06-19T22:08:07Z, which is 15:08:07 PDT - basically matching when the read finishes.

[INFO] [2023-06-19T22:08:07Z] [00007000073ea000] [AuthSigning] - (id=0x600002f7e0d0) Signing successfully built canonical request for algorithm SigV4, with contents 
PUT
/flux.bin
partNumber=25

However, the response was not received until 2023-06-19T22:08:25Z, which is 15:08:25 PDT, matching the time at which the whole object upload completes.

[TRACE] [2023-06-19T22:08:25Z] [0000700006dc6000] [http-stream] - id=0x7fc5de94dcc0: Incoming response status: 200 (OK).
[TRACE] [2023-06-19T22:08:25Z] [0000700006dc6000] [http-stream] - id=0x7fc5de94dcc0: Incoming header: x-amz-id-2: hVGxd2t6wzhqeRPleRocQUoe3TdeMzhXZtOseeUXJdjuDcu1fYY4MDN/m6QSnT1IBtRHoJiJ+rs=
[TRACE] [2023-06-19T22:08:25Z] [0000700006dc6000] [http-stream] - id=0x7fc5de94dcc0: Incoming header: x-amz-request-id: 2W47P7AV612AVDNB
[TRACE] [2023-06-19T22:08:25Z] [0000700006dc6000] [http-stream] - id=0x7fc5de94dcc0: Incoming header: Date: Mon, 19 Jun 2023 22:08:08 GMT

So I think the time issue you saw really is just the network or something on the server side.


To add more info, I also tested the latest change with unknown content length: the last upload part is created at basically the same time the body read completes, and the whole object upload completes basically when the last UploadPart response is received. The major difference I saw is that the S3 response time to the request varies - sometimes around 7 secs, as below:

[TRACE] [2023-06-19T22:28:06Z] [000070000b3f3000] [http-stream] - id=0x7f7f17b2be30: Incoming response status: 200 (OK).
[TRACE] [2023-06-19T22:28:06Z] [000070000b3f3000] [http-stream] - id=0x7f7f17b2be30: Incoming header: x-amz-id-2: dmcQwE/7Q3vpalej4CDNJyAkLNObdoROhrQQXcVpgNmJbVtf15Lvv/+hBDiq0/h/ZHjs6e8Rkd4=
[TRACE] [2023-06-19T22:28:06Z] [000070000b3f3000] [http-stream] - id=0x7f7f17b2be30: Incoming header: x-amz-request-id: V6SCBY4AWPM1KKYR
[TRACE] [2023-06-19T22:28:06Z] [000070000b3f3000] [http-stream] - id=0x7f7f17b2be30: Incoming header: Date: Mon, 19 Jun 2023 22:27:59 GMT

and other times around 20 secs:

[TRACE] [2023-06-19T22:27:30Z] [0000700008635000] [http-stream] - id=0x7fdabe533e00: Incoming response status: 200 (OK).
[TRACE] [2023-06-19T22:27:30Z] [0000700008635000] [http-stream] - id=0x7fdabe533e00: Incoming header: x-amz-id-2: RyJoIUHzgatbAj0CvRheGG1oOLROxwT4FTg+2OH5J/i/IF5A1e2juu92ZMedwOUzG1Wtdz2ramg=
[TRACE] [2023-06-19T22:27:30Z] [0000700008635000] [http-stream] - id=0x7fdabe533e00: Incoming header: x-amz-request-id: 9TYKWNE08MTQPKK9
[TRACE] [2023-06-19T22:27:30Z] [0000700008635000] [http-stream] - id=0x7fdabe533e00: Incoming header: Date: Mon, 19 Jun 2023 22:27:10 GMT

@TingDaoK
Contributor

To add more:
The time difference I have seen is mostly down to my local machine and network. For the 20-sec case I saw the log:

[TRACE] [2023-06-19T22:27:30Z] [0000700008635000] [http-stream] - id=0x7fdabe533e00: Done sending data.

So S3 responded right after we finished streaming the request body. The time difference is mostly the network and the machine - I ran all those tests on my local MacBook with all sorts of other things going on, which accounts for the variation.

@djchapm
Author

djchapm commented Jun 20, 2023

Thanks @TingDaoK for the analysis.

Regarding with/without the minimum part size - it was random for me, but it seemed I could push more data before hitting an error when it was unset. That may not hold, since the behavior was somewhat random; I only mentioned it in case it meant something, because more often than not it was better when the part size was unset.

On the response time - some background. We currently work around this whole problem (uploading without knowing the content length) manually: we buffer up to an estimated part size, and once the buffer is full we push the part and start a new buffer, keeping track of parts and overall size so we can create the CompleteMultipartUpload request at the end (a sketch of that approach is below). Not sure if this is what you are doing or not. That's our old code on the AWS S3 v1 SDK, and I'm using it as the baseline against the new features in aws-c-s3 + Transfer Manager v2. My throughput is consistently 2-3 times higher with the old solution, and it's much more consistent on response times.
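For illustration, a minimal sketch of that buffering approach (shown with the v2 synchronous client for brevity - our real code is on the v1 SDK, and names like `ManualMultipartUploader` and `PART_SIZE` are made up):

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

import software.amazon.awssdk.core.sync.RequestBody;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.CompletedPart;
import software.amazon.awssdk.services.s3.model.UploadPartResponse;

// Buffer bytes until an estimated part size is reached, upload that buffer
// as one part, then complete the multipart upload at the end.
public class ManualMultipartUploader {
    private static final int PART_SIZE = 15 * 1024 * 1024; // 15 MB parts

    private final S3Client s3;
    private final String bucket;
    private final String key;
    private final String uploadId;
    private final List<CompletedPart> completedParts = new ArrayList<>();
    private ByteArrayOutputStream partBuffer = new ByteArrayOutputStream(PART_SIZE);
    private int partNumber = 1;

    public ManualMultipartUploader(S3Client s3, String bucket, String key) {
        this.s3 = s3;
        this.bucket = bucket;
        this.key = key;
        this.uploadId = s3.createMultipartUpload(
                b -> b.bucket(bucket).key(key)).uploadId();
    }

    public void write(byte[] data) {
        partBuffer.writeBytes(data);
        if (partBuffer.size() >= PART_SIZE) {
            flushPart();
        }
    }

    private void flushPart() {
        int thisPart = partNumber++;
        UploadPartResponse resp = s3.uploadPart(
                b -> b.bucket(bucket).key(key).uploadId(uploadId).partNumber(thisPart),
                RequestBody.fromBytes(partBuffer.toByteArray()));
        // Track each part's ETag for the final CompleteMultipartUpload call.
        completedParts.add(CompletedPart.builder()
                .partNumber(thisPart).eTag(resp.eTag()).build());
        partBuffer = new ByteArrayOutputStream(PART_SIZE);
    }

    public void finish() {
        if (partBuffer.size() > 0) {
            flushPart(); // the last part may be smaller than PART_SIZE
        }
        s3.completeMultipartUpload(b -> b.bucket(bucket).key(key)
                .uploadId(uploadId)
                .multipartUpload(m -> m.parts(completedParts)));
    }
}
```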

Maybe we're buffering more and sending in larger chunks than you, or maybe you're making more calls than we are for the final processing - I'm not sure. We'd like to get to the point where we see comparable performance. I'll see if I can get some numbers together on that.

@djchapm
Author

djchapm commented Jun 20, 2023

Figured out the discrepancy: our manual workaround uses compression, which makes a huge difference. Now the times are more comparable. The times below are with a 15 MB part size.

Old Solution:
[screenshot: timing output of the old solution]

New Solution:
[screenshot: timing output of the new solution]

To use compression, I copied BlockingOutputStreamAsyncRequestBody.java from aws-core and wrapped its output stream in a GZIPOutputStream - for some reason they lock this class up and make it package-private. I'll submit a request to open that up.
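For reference, the shape of the gzip wrapping (a sketch assuming the public factory `AsyncRequestBody.forBlockingOutputStream` is available; otherwise the copied class plays the same role):

```java
import java.util.zip.GZIPOutputStream;

import software.amazon.awssdk.core.async.AsyncRequestBody;
import software.amazon.awssdk.core.async.BlockingOutputStreamAsyncRequestBody;
import software.amazon.awssdk.transfer.s3.S3TransferManager;
import software.amazon.awssdk.transfer.s3.model.Upload;
import software.amazon.awssdk.transfer.s3.model.UploadRequest;

public class GzipUploadSketch {
    public static void upload(S3TransferManager tm, String bucket, String key,
                              byte[] data) throws Exception {
        // A null content length means "unknown size", so the body is
        // streamed as a multipart upload.
        BlockingOutputStreamAsyncRequestBody body =
                AsyncRequestBody.forBlockingOutputStream(null);

        Upload upload = tm.upload(UploadRequest.builder()
                .putObjectRequest(p -> p.bucket(bucket).key(key))
                .requestBody(body)
                .build());

        // Bytes written here are gzip-compressed before the SDK sees them:
        try (GZIPOutputStream gzip = new GZIPOutputStream(body.outputStream())) {
            gzip.write(data);
        }
        upload.completionFuture().join();
    }
}
```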

Attached is the updated test code.

@djchapm
Author

djchapm commented Jun 20, 2023

s3AsyncTest.zip

@TingDaoK
Contributor

The fix for the crash and the error has been merged and released. I don't think the performance discrepancy is related to the CRT.

@TingDaoK
Contributor

Please use the latest aws-crt-java@v0.22.2 (https://github.com/awslabs/aws-crt-java/tree/v0.22.2) and let us know if you still have issues.

@rickyeng127

@TingDaoK, when we run in AWS, we see the stack trace below using the code that @djchapm attached. It works without issue on Windows.

software.amazon.awssdk.core.exception.SdkClientException: Failed to send the request: An input string was passed to a parser and the string was incorrectly formatted.
at software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:111) ~[sdk-core-2.20.94.jar:?]
at software.amazon.awssdk.core.exception.SdkClientException.create(SdkClientException.java:43) ~[sdk-core-2.20.94.jar:?]
at software.amazon.awssdk.services.s3.internal.crt.S3CrtResponseHandlerAdapter.handleError(S3CrtResponseHandlerAdapter.java:127) ~[s3-2.20.94.jar:?]
at software.amazon.awssdk.services.s3.internal.crt.S3CrtResponseHandlerAdapter.onFinished(S3CrtResponseHandlerAdapter.java:93) ~[s3-2.20.94.jar:?]
at software.amazon.awssdk.crt.s3.S3MetaRequestResponseHandlerNativeAdapter.onFinished(S3MetaRequestResponseHandlerNativeAdapter.java:24) ~[aws-crt-0.22.2.jar:0.22.2]
at software.amazon.awssdk.crt.s3.S3MetaRequestResponseHandlerNativeAdapter.onFinished(S3MetaRequestResponseHandlerNativeAdapter.java:24) ~[aws-crt-0.22.2.jar:0.22.2]

@DmitriyMusatkin
Contributor

@rickyeng127 Are you modifying the sample in any way? That is a common parser error, and in the S3 case it's probably an error parsing the host header or an endpoint override. One possible reason for it working on Windows but not on Linux is the use of backslashes in the path.

@campidelli
https://github.com/awslabs/aws-crt-java exposes the CRT to Java users through JNI bindings.
https://github.com/aws/aws-sdk-java-v2/tree/master/http-clients/aws-crt-client is a Java SDK v2-compatible HTTP client based on the CRT; it implements the SDK's HTTP client interfaces on top of the bindings in aws-crt-java.
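For example, the aws-crt-client module is what lets a standard SDK client run its HTTP layer on the CRT - a sketch, distinct from the deeper `S3AsyncClient.crtBuilder()` integration used elsewhere in this thread:

```java
import software.amazon.awssdk.http.crt.AwsCrtAsyncHttpClient;
import software.amazon.awssdk.services.s3.S3AsyncClient;

public class CrtHttpClientExample {
    public static void main(String[] args) {
        // A standard async S3 client whose HTTP transport is the CRT,
        // supplied via the aws-crt-client module:
        S3AsyncClient s3 = S3AsyncClient.builder()
                .httpClientBuilder(AwsCrtAsyncHttpClient.builder())
                .build();
        s3.close();
    }
}
```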

@rickyeng127

@DmitriyMusatkin, I am not modifying the sample. The following is the sample code that works on Windows but not in AWS.

```java
// Each public class below lives in its own source file; imports are shown
// once for brevity.
import java.nio.ByteBuffer;
import java.util.concurrent.CountDownLatch;
import java.util.function.Consumer;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

import reactor.core.publisher.Flux;
import reactor.core.publisher.FluxSink;

import software.amazon.awssdk.core.async.AsyncRequestBody;
import software.amazon.awssdk.services.s3.S3AsyncClient;
import software.amazon.awssdk.transfer.s3.S3TransferManager;
import software.amazon.awssdk.transfer.s3.model.CompletedUpload;
import software.amazon.awssdk.transfer.s3.model.Upload;
import software.amazon.awssdk.transfer.s3.model.UploadRequest;
import software.amazon.awssdk.transfer.s3.progress.TransferListener;

public class S3StreamLoader {

    private final Logger _logger = LogManager.getLogger(S3StreamLoader.class);

    private final String _bucketName;
    private final String _objectKey;
    private final long _minPartSize = 1000000L;
    private S3AsyncClient _s3AsyncClient;
    private S3TransferManager _s3TransferManager;
    private Upload _upload;
    private ConsumerFeed<byte[]> _consumerFeed;

    private long _bytesWritten;

    public S3StreamLoader(String bucketName, String objectKey) {
        _bucketName = bucketName;
        _objectKey = objectKey;
    }

    public void initiateRequest() throws InterruptedException {
        _s3AsyncClient = getS3AsyncClient();
        _s3TransferManager = getTransferManager(_s3AsyncClient);

        // Create a ConsumerFeed we can 'stream' messages to:
        _consumerFeed = new ConsumerFeed<>();

        Flux<ByteBuffer> flux = _consumerFeed.getFlux()
                .map(ByteBuffer::wrap)
                .doOnComplete(() -> _logger.info("Done"));

        UploadRequest.Builder req = UploadRequest.builder()
                .requestBody(AsyncRequestBody.fromPublisher(flux))
                .addTransferListener(new S3TransferListener(_objectKey))
                .putObjectRequest(b -> b.bucket(_bucketName).key(_objectKey));

        _upload = _s3TransferManager.upload(req.build());
        // Wait for the Flux subscription to be established:
        _consumerFeed.getSubscription().await();
    }

    public void write(byte[] buf) {
        if (buf == null) return;
        _bytesWritten += buf.length;
        _consumerFeed.getDataFeed().accept(buf);
    }

    public S3AsyncClient getS3AsyncClient() {
        return S3AsyncClient.crtBuilder()
                .targetThroughputInGbps(20.0)
                .minimumPartSizeInBytes(_minPartSize)
                .build();
    }

    public S3TransferManager getTransferManager(S3AsyncClient s3AsyncClient) {
        return S3TransferManager.builder()
                .s3Client(s3AsyncClient)
                .build();
    }

    public CompletedUpload finish() {
        CompletedUpload completedUpload = null;
        try {
            _consumerFeed.complete();
            if (_bytesWritten > 0) {
                completedUpload = _upload.completionFuture().join();
            } else {
                _upload.completionFuture().cancel(true);
            }
            _s3TransferManager.close();
            _s3AsyncClient.close();
        } catch (Exception ex) {
            _logger.error("Error while attempting to finish stream with " + _bytesWritten + " bytes", ex);
        }
        return completedUpload;
    }
}

public class ConsumerFeed<T> {

    private final Logger _logger = LogManager.getLogger(ConsumerFeed.class);

    private Consumer<T> _dataFeed;
    private FluxSink<T> _sink;
    private final CountDownLatch _subscription = new CountDownLatch(1);

    private final Flux<T> _flux = Flux.create(sink -> {
        _logger.info("Create");
        this._sink = sink;
        _dataFeed = sink::next;
        _subscription.countDown();
    });

    public Flux<T> getFlux() {
        return _flux;
    }

    public CountDownLatch getSubscription() {
        return _subscription;
    }

    public Consumer<T> getDataFeed() {
        return _dataFeed;
    }

    void complete() {
        _sink.complete();
    }
}

public class S3TransferListener implements TransferListener {

    private final Logger _logger = LogManager.getLogger(S3TransferListener.class);

    final String _resource;
    long _startTime;
    private int _step = 0;

    public S3TransferListener(String objectKey) {
        _resource = objectKey;
    }

    @Override
    public void transferInitiated(Context.TransferInitiated context) {
        _logger.info("Streaming transfer initiated: {}, {}", _resource, context.progressSnapshot().ratioTransferred());
        _startTime = System.currentTimeMillis();
        status(context.progressSnapshot().transferredBytes());
    }

    // Log progress roughly once per MB transferred:
    private void status(long l) {
        if (l > (long) _step * 1_000_000) {
            _logger.info("Bytes transferred {}", l);
            _step++;
        }
    }

    @Override
    public void bytesTransferred(Context.BytesTransferred context) {
        status(context.progressSnapshot().transferredBytes());
    }

    @Override
    public void transferComplete(Context.TransferComplete context) {
        // Guard against division by zero on sub-second transfers:
        long seconds = Math.max(1, (System.currentTimeMillis() - _startTime) / 1000);
        double bytes = (double) context.progressSnapshot().transferredBytes();
        double megabytes = bytes / 1_048_576;
        double throughput = megabytes / seconds;
        _logger.info("Streaming transfer complete for resource: {}\n\t Bytes: {}\n\t MBs: {}\n\tThroughput: {} MB/s",
                _resource, String.format("%10f", bytes), String.format("%.3f", megabytes),
                String.format("%.2f", throughput));
    }

    @Override
    public void transferFailed(Context.TransferFailed context) {
        _logger.error("Streaming transfer failed on resource " + _resource, context.exception());
    }
}

public class TestClass {

    public static void main(String[] args) throws InterruptedException {
        String s3BucketName = "myTestBucket";
        String objectKey = "myPrefix/TestFile.txt";
        String testString = "This is some test data that will get sent to S3 in stream form";
        S3StreamLoader streamLoader = new S3StreamLoader(s3BucketName, objectKey);
        streamLoader.initiateRequest();
        streamLoader.write(testString.getBytes());
        streamLoader.finish();
    }
}
```

@djchapm
Author

djchapm commented Jun 30, 2023

Hey @TingDaoK - I've run a couple of tests today using aws-crt-java@v0.22.2, uploading files of >17 GB with a part size of 15 MB without issue.
Note - I meant to send this confirmation a week ago but didn't submit it before going on vacation. Should we close the ticket?

@rickyeng127

Hi @TingDaoK, can you provide the sample code you used to run these tests?

@TingDaoK
Contributor

TingDaoK commented Jun 30, 2023

@rickyeng127 I just used the sample provided by @djchapm, and I don't think I made any changes. I am not sure about the error you've seen - the exception comes from the Java SDK, and I haven't really looked deeply into the code you pasted.

Edit: Oh, the exception actually comes from us. As Dmitriy said, enabling the log will probably help.

@djchapm Sounds good! We can also update the issue on the Java SDK side.

@DmitriyMusatkin
Contributor

@rickyeng127 I would recommend turning on trace logging in the SDK to see if it provides any more info on what's failing. You can use one of the init methods here to do that: https://awslabs.github.io/aws-crt-java/software/amazon/awssdk/crt/Log.html.
As I mentioned above, the error you are seeing comes from the CRT parsers (probably the URI parser in this case). The thing to confirm in the logs is that the HTTP request is constructed properly, i.e. that it does not have any weird Unicode or escape characters in places where it should not.
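For instance, a minimal way to turn that on (it must run before any CRT client is created; "crt.log" is just an example file name):

```java
import software.amazon.awssdk.crt.Log;

public class EnableCrtTraceLogging {
    public static void main(String[] args) {
        // Route CRT trace output to a file before building any S3 client:
        Log.initLoggingToFile(Log.LogLevel.Trace, "crt.log");
        // ... now create the S3AsyncClient / S3TransferManager ...
    }
}
```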

I agree we should close this issue. @rickyeng127, if you continue to face the issue, I would recommend opening a separate one with sample code and a trace log.

@rickyeng127

Hi @DmitriyMusatkin, I enabled trace logging and do in fact see an error in the URI parser. Note, I am attempting to write a file to a top-level bucket (with no prefix). Here is the trace log showing the error. Is this an error in AWS or in my code? As noted earlier, this works when connecting to AWS remotely but fails when running natively in AWS.

[TRACE] [2023-07-05T18:11:01Z] [00007fdbc6242700] [event-loop] - id=0x7fdbe42c5a80: detected more scheduled tasks with the next occurring at 0, using timeout of 0.
[TRACE] [2023-07-05T18:11:01Z] [00007fdbc6242700] [event-loop] - id=0x7fdbe42c5a80: waiting for a maximum of 0 ms
[TRACE] [2023-07-05T18:11:01Z] [00007fdbc6242700] [event-loop] - id=0x7fdbe42c5a80: wake up with 0 events to process.
[TRACE] [2023-07-05T18:11:01Z] [00007fdbc6242700] [event-loop] - id=0x7fdbe42c5a80: running scheduled tasks.
[DEBUG] [2023-07-05T18:11:01Z] [00007fdbc6242700] [task-scheduler] - id=0x7fdc0419e908: Running aws_exponential_backoff_retry_task task with status
[DEBUG] [2023-07-05T18:11:01Z] [00007fdbc6242700] [exp-backoff-strategy] - id=0x7fdbe4509160: Vending retry_token 0x7fdc0419e8a0
[DEBUG] [2023-07-05T18:11:01Z] [00007fdbc6242700] [standard-retry-strategy] - id=0x7fdbe4508fd0: token acquired callback invoked with error Success. with token 0x7fdbfc00c8f0 and nested token 0x7fdc0419e8a0
[TRACE] [2023-07-05T18:11:01Z] [00007fdbc6242700] [standard-retry-strategy] - id=0x7fdbe4508fd0: invoking on_retry_token_acquired callback
[DEBUG] [2023-07-05T18:11:01Z] [00007fdbc6242700] [connection-manager] - id=0x7fdbe45a2950: Acquire connection
[DEBUG] [2023-07-05T18:11:01Z] [00007fdbc6242700] [connection-manager] - id=0x7fdbe45a2950: snapshot - state=1, idle_connection_count=0, pending_acquire_count=1, pending_settings_count=0, pending_connect_count=1, vended_connection_count=0, open_connection_count=0, ref_count=1
[INFO] [2023-07-05T18:11:01Z] [00007fdbc6242700] [connection-manager] - id=0x7fdbe45a2950: Requesting 1 new connections from http
[DEBUG] [2023-07-05T18:11:01Z] [00007fdbc6242700] [http-connection] - https_proxy environment found
[ERROR] [2023-07-05T18:11:01Z] [00007fdbc6242700] [http-connection] - Could not parse found proxy URI.
[ERROR] [2023-07-05T18:11:01Z] [00007fdbc6242700] [connection-manager] - id=0x7fdbe45a2950: http connection creation failed with error code 36(An input string was passed to a parser and the string was incorrectly formatted.)
[DEBUG] [2023-07-05T18:11:01Z] [00007fdbc6242700] [connection-manager] - id=0x7fdbe45a2950: Failing excess connection acquisition with error code 36
[WARN] [2023-07-05T18:11:01Z] [00007fdbc6242700] [connection-manager] - id=0x7fdbe45a2950: Failed to complete connection acquisition with error_code 36(An input string was passed to a parser and the string was incorrectly formatted.)
[ERROR] [2023-07-05T18:11:01Z] [00007fdbc6242700] [S3Endpoint] - id=0x7fdbe4592e80: Could not acquire connection due to error code 36 (An input string was passed to a parser and the string was incorrectly formatted.)
[DEBUG] [2023-07-05T18:11:01Z] [00007fdbc6242700] [S3Client] - id=0x7fdbe4401fa0 Client scheduling retry of request 0x7fdc0419e130 for meta request 0x7fdbe45779f0 with token 0x7fdbfc00c8f0 with error code 36 (An input string was passed to a parser and the string was incorrectly formatted.).
[DEBUG] [2023-07-05T18:11:01Z] [00007fdbc6242700] [standard-retry-strategy] - token_id=0x7fdbfc00c8f0: reducing retry capacity by 10 from 500 and scheduling retry.
[DEBUG] [2023-07-05T18:11:01Z] [00007fdbc6242700] [exp-backoff-strategy] - id=0x7fdbe4509160: Attempting retry on token 0x7fdc0419e8a0 with error type 0
[DEBUG] [2023-07-05T18:11:01Z] [00007fdbc6242700] [exp-backoff-strategy] - id=0x7fdbe4509160: Computed backoff value of 8310907ns on token 0x7fdc0419e8a0
[TRACE] [2023-07-05T18:11:01Z] [00007fdbc6242700] [event-loop] - id=0x7fdbe42c5a80: scheduling task 0x7fdc0419e908 in-thread for timestamp 977992596685
[DEBUG] [2023-07-05T18:11:01Z] [00007fdbc6242700] [task-scheduler] - id=0x7fdc0419e908: Scheduling aws_exponential_backoff_retry_task task for future execution at time 977992596685
[TRACE] [2023-07-05T18:11:01Z] [00007fdbc6242700] [standard-retry-strategy] - id=0x7fdbe4508fd0: on_retry_token_acquired callback completed

@waahm7
Contributor

waahm7 commented Jul 5, 2023

@rickyeng127, it seems you have an invalid proxy configuration in your environment, which is causing the issue. Please verify the proxy configuration and also check the HTTPS_PROXY environment variable. If the problem persists, kindly create a new issue with additional details about your specific use case and the proxy environment you're working with.

@waahm7 waahm7 closed this as completed Jul 5, 2023
@rickyeng127

@waahm7, these requests are being initiated from an ECS container to S3. Do I still need to configure an HTTPS_PROXY when running in this mode?

@waahm7
Contributor

waahm7 commented Jul 5, 2023

@rickyeng127 It is not required to configure a proxy. Your environment has an invalid proxy configuration, which is causing the issue: the S3 client tries to parse the proxy URL and fails.

[DEBUG] [2023-07-05T18:11:01Z] [00007fdbc6242700] [http-connection] - https_proxy environment found
[ERROR] [2023-07-05T18:11:01Z] [00007fdbc6242700] [http-connection] - Could not parse found proxy URI.

@rickyeng127

@waahm7, when I check these system properties, they all return null, meaning they haven't been set. How is the https_proxy environment being detected?

log.info("http.proxyHost: " + System.getProperty("http.proxyHost"));
log.info("https.proxyHost: " + System.getProperty("https.proxyHost"));

@waahm7
Contributor

waahm7 commented Jul 6, 2023

@rickyeng127 The environment variable name is HTTPS_PROXY or HTTP_PROXY - it is an OS environment variable, not a Java system property. Please create a new issue with the details of your setup.
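A quick way to confirm what the CRT actually sees is to read the process environment rather than Java system properties, e.g.:

```java
public class ProxyEnvCheck {
    public static void main(String[] args) {
        // The CRT reads OS environment variables, not http.proxyHost-style
        // Java system properties; check both common spellings:
        for (String name : new String[] {"HTTPS_PROXY", "https_proxy", "HTTP_PROXY", "http_proxy"}) {
            System.out.println(name + ": " + System.getenv(name));
        }
    }
}
```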
