Partial download of codegen #394

Open
johanandren opened this issue Jul 12, 2022 · 5 comments
Labels
bug Something isn't working kalix-runtime Runtime and SDKs sub-team

Comments

@johanandren
Contributor

Created the shopping cart quickstart and tried to build it.

On Linux x86_64:

/srv/homes/johan/code/lightbend/shopping-cart/node_modules/@kalix-io/kalix-scripts/bin/kalix-codegen-js.bin --typescript --proto-source-dir ./proto --source-dir ./src --generated-source-dir ./lib/generated --test-source-dir ./test --integration-test-source-dir ./integration-test
Segmentation fault

On darwin arm64:

/Users/johan/Code/Lightbend/Kalix/shopping-cart-quickstart/node_modules/@kalix-io/kalix-scripts/bin/kalix-codegen-js.bin --typescript --proto-source-dir ./proto --source-dir ./src --generated-source-dir ./lib/generated --test-source-dir ./test --integration-test-source-dir ./integration-test
fish: Job 1, '/Users/johan/Code/Lightbend/Kal…' terminated by signal SIGKILL (Forced quit)
@johanandren johanandren added bug Something isn't working kalix-runtime Runtime and SDKs sub-team labels Jul 12, 2022
@johanandren
Contributor Author

None of the usual tools on Linux (gdb, ldd) seem to recognize kalix-codegen-js.bin as an executable, although file says:

file /srv/homes/johan/code/lightbend/shopping-cart/node_modules/@kalix-io/kalix-scripts/bin/kalix-codegen-js.bin
/srv/homes/johan/code/lightbend/shopping-cart/node_modules/@kalix-io/kalix-scripts/bin/kalix-codegen-js.bin: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, missing section headers at 19418976

readelf output, maybe these errors are a hint about what's wrong?

$ readelf -h /srv/homes/johan/code/lightbend/shopping-cart/node_modules/@kalix-io/kalix-scripts/bin/kalix-codegen-js.bin
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              DYN (Shared object file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0xbb200
  Start of program headers:          64 (bytes into file)
  Start of section headers:          19416672 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         11
  Size of section headers:           64 (bytes)
  Number of section headers:         37
  Section header string table index: 36
readelf: Error: Reading 2368 bytes extends past end of file for section headers
readelf: Error: the dynamic segment offset + size exceeds the size of the file
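
For what it's worth, the readelf errors above are consistent with a truncated file: 37 section headers of 64 bytes each make up the 2368 bytes readelf could not read, so a table starting at offset 19416672 would end at byte 19419040. Below is a minimal sketch of the same check in TypeScript, assuming a little-endian ELF64 file. `sectionHeaderTableEnd` and `looksTruncated` are hypothetical helper names for illustration, not part of kalix-scripts.

```typescript
import { closeSync, fstatSync, openSync, readSync } from "fs";

// Hypothetical helper: given the first 64 bytes of a little-endian ELF64 file,
// returns the file offset at which the section header table ends.
export function sectionHeaderTableEnd(header: Buffer): number {
  if (header.length < 64 || header.readUInt32BE(0) !== 0x7f454c46) {
    throw new Error("not an ELF file");
  }
  if (header[4] !== 2 || header[5] !== 1) {
    throw new Error("only little-endian ELF64 is handled in this sketch");
  }
  // e_shoff is a 64-bit field; combine two 32-bit halves (exact below 2^53).
  const shoff =
    header.readUInt32LE(0x28) + header.readUInt32LE(0x2c) * 0x100000000;
  const shentsize = header.readUInt16LE(0x3a); // size of one section header
  const shnum = header.readUInt16LE(0x3c); // number of section headers
  return shoff + shentsize * shnum;
}

// Hypothetical helper: true when the section header table would extend
// past the end of the file on disk.
export function looksTruncated(path: string): boolean {
  const fd = openSync(path, "r");
  try {
    const header = Buffer.alloc(64);
    readSync(fd, header, 0, 64, 0);
    return sectionHeaderTableEnd(header) > fstatSync(fd).size;
  } finally {
    closeSync(fd);
  }
}
```

This only catches a missing tail in ELF files that carry section headers, so it is a diagnostic aid rather than a general integrity check.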

@johanandren
Contributor Author

johanandren commented Jul 12, 2022

Hmm, tried to download the binary manually and I'm getting a lot of disconnects from the Lightbend repo. Maybe node doesn't handle that well and that's why the file is incomplete: https://repo.lightbend.com/raw/kalix/versions/1.0.0/kalix-codegen-js-x86_64-apple-darwin

Downloading with wget keeps retrying until it has the whole file, and that one can then be run without segfaults.

The successfully (manually) downloaded one is 17 MB; the binary in node_modules is just 6.3 MB.

@johanandren johanandren changed the title Codegen segfaulting Partial download of codegen Jul 12, 2022
@johanandren
Contributor Author

It seems node-fetch doesn't report when the connection is closed before all bytes have been delivered, but instead tells us it is "OK".

@johanandren
Contributor Author

johanandren commented Jul 12, 2022

Didn't figure out a way to detect this. The body is a stream, so we would have to compare response.headers.get('content-length') with the number of bytes piped through once writing is done, or something like that.
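
The byte-counting idea could look roughly like the sketch below. It only uses Node's built-in stream module and assumes the body arrives as a Node Readable that emits Buffer chunks. `copyAndVerifyLength` is a hypothetical name, and `expectedLength` is assumed to be the parsed content-length header (or null if the server did not send one).

```typescript
import { pipeline, Readable, Transform, Writable } from "stream";

// Hypothetical helper: copies source into sink while counting bytes, then
// fails if the count does not match the expected length.
export async function copyAndVerifyLength(
  source: Readable,
  sink: Writable,
  expectedLength: number | null
): Promise<number> {
  let received = 0;
  // Pass-through stage that counts the bytes flowing through it.
  const counter = new Transform({
    transform(chunk, _encoding, callback) {
      received += chunk.length;
      callback(null, chunk);
    },
  });
  // Wrap the callback form of pipeline so this also works on older Node.
  await new Promise<void>((resolve, reject) => {
    pipeline(source, counter, sink, (err) => (err ? reject(err) : resolve()));
  });
  if (expectedLength !== null && received !== expectedLength) {
    throw new Error(
      `incomplete download: got ${received} of ${expectedLength} bytes`
    );
  }
  return received;
}
```

One caveat: content-length describes the bytes on the wire, so if the response is compressed and the client decompresses it transparently, the counted bytes will not match and the check would need to be skipped or moved before decompression.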

The IT team is looking into why repo downloads are partial or closing the connection, though, so maybe that will sort this out.

@pvlugter
Member

That's unpleasant. Seems to fail the downloads quite often.

Have also had a play around with detecting this. Couldn't get any errors from the stream until trying it on Node 16, where it signals this error event on the response body:

Error: aborted
    at connResetException (node:internal/errors:692:14)
    at TLSSocket.socketCloseListener (node:_http_client:414:19)
    at TLSSocket.emit (node:events:539:35)
    at node:net:709:12
    at TCP.done (node:_tls_wrap:582:7) {
  code: 'ECONNRESET'
}

Following the changes there, it seems that there should be an aborted event on Node 14, but I don't see that emitted for the body stream with node-fetch. Trying out axios in place of node-fetch, I can get the aborted signal (it seems to be a different type of stream as well: IncomingMessage instead of PassThrough).

Can also use stream.pipeline to have error handling attached automatically, and then it has this error:

Error [ERR_STREAM_PREMATURE_CLOSE]: Premature close
    at new NodeError (internal/errors.js:322:7)
    at IncomingMessage.onclose (internal/streams/end-of-stream.js:117:38)
    at IncomingMessage.emit (events.js:400:28)
    at TLSSocket.socketCloseListener (_http_client.js:432:11)
    at TLSSocket.emit (events.js:412:35)
    at net.js:686:12
    at TCP.done (_tls_wrap.js:564:7) {
  code: 'ERR_STREAM_PREMATURE_CLOSE'
}
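
To illustrate the stream.pipeline behaviour without a flaky server, here is a minimal sketch that fakes a dropped connection with a plain Readable that is destroyed before it ever ends. `simulateAbortedDownload` is a hypothetical name, and the Readable is only a stand-in for a real response body.

```typescript
import { pipeline, Readable, Writable } from "stream";

// Simulates a body stream that closes before the data is complete and
// resolves with the error stream.pipeline reports (or null if there is none).
export function simulateAbortedDownload(): Promise<NodeJS.ErrnoException | null> {
  const body = new Readable({ read() {} }); // stand-in for a response body
  const received: Buffer[] = [];
  const sink = new Writable({
    write(chunk, _encoding, callback) {
      received.push(chunk);
      callback();
    },
  });
  return new Promise((resolve) => {
    // pipeline attaches error and close handling to every stream in the chain.
    pipeline(body, sink, (err) => resolve(err || null));
    body.push(Buffer.from("partial data"));
    // Destroy without ever signalling end-of-stream, like a dropped socket.
    setImmediate(() => body.destroy());
  });
}
```

Because the source closes without emitting end, the pipeline callback should receive an error with code ERR_STREAM_PREMATURE_CLOSE, matching the trace above.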

I'll push up a draft that captures this at least. We could add retries on top.
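
Retries on top could be as simple as the sketch below. `withRetries` is a hypothetical helper, and the attempt count and delay are arbitrary assumptions rather than anything decided for kalix-scripts.

```typescript
// Hypothetical helper: runs an async attempt up to maxAttempts times,
// waiting delayMs between failures, and rethrows the last error.
export async function withRetries<T>(
  attempt: () => Promise<T>,
  maxAttempts = 3,
  delayMs = 500
): Promise<T> {
  let lastError: unknown;
  for (let i = 1; i <= maxAttempts; i++) {
    try {
      return await attempt();
    } catch (err) {
      lastError = err;
      if (i < maxAttempts) {
        // Wait before the next attempt.
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}
```

Each attempt would need to start the download from scratch (or truncate the partial file first), since a failed pipeline leaves an incomplete file behind.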

Depending on what the underlying issue is (repo proxy or Cloudsmith), we could also look at hosting these downloads somewhere else, like downloads.lightbend.com (S3).
