Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

can't federate with matrix.org #17678

Open
jans23 opened this issue Sep 9, 2024 · 8 comments
Open

can't federate with matrix.org #17678

jans23 opened this issue Sep 9, 2024 · 8 comments

Comments

@jans23
Copy link

jans23 commented Sep 9, 2024

Description

We reported the federation issue #17371 which got fixed. Now we face a similar issue. Users at matrix.org can't decrypt messages from users at nitro.chat. It seems that other home servers are not affected. We think that the cause is at matrix.org as last time.

Steps to reproduce

Users at matrix.org can't decrypt messages from users at nitro.chat

Homeserver

nitro.chat

Synapse Version

1.114.0

Installation Method

Debian packages from packages.matrix.org

Database

PostgreSQL

Workers

I don't know

Platform

Ubuntu

Configuration

No response

Relevant log output

In the homeserver.log are some errors like:


2024-08-20 18:54:08,085 - synapse.http.matrixfederationclient - 1658 - WARNING - GET-2206194- {GET-O-110526} [matrix.org] Requested file is too large > 52428800 bytes

Errors from the federation_sender.log are:

2024-08-20 19:13:51,825 - synapse.http.matrixfederationclient - 811 - INFO - federation_transaction_transmission_loop-1729019 - {PUT-O-1724313} [matrix.org] Request failed: PUT matrix-federation://matrix.org/_matrix/federation/v1/send/1724110145885: ResponseNeverReceived:[CancelledError()]
2024-08-20 19:15:14,219 - synapse.http.matrixfederationclient - 811 - INFO - federation_transaction_transmission_loop-1729019 - {PUT-O-1724313} [matrix.org] Request failed: PUT matrix-federation://matrix.org/_matrix/federation/v1/send/1724110145885: ResponseNeverReceived:[CancelledError()]
2024-08-20 19:17:13,379 - synapse.http.matrixfederationclient - 811 - INFO - federation_transaction_transmission_loop-1729019 - {PUT-O-1724313} [matrix.org] Request failed: PUT matrix-federation://matrix.org/_matrix/federation/v1/send/1724110145885: ResponseNeverReceived:[CancelledError()]
2024-08-20 19:19:30,001 - synapse.http.matrixfederationclient - 811 - INFO - federation_transaction_transmission_loop-1729019 - {PUT-O-1724313} [matrix.org] Request failed: PUT matrix-federation://matrix.org/_matrix/federation/v1/send/1724110145885: ResponseNeverReceived:[CancelledError()]
2024-08-20 19:21:48,354 - synapse.http.matrixfederationclient - 811 - INFO - federation_transaction_transmission_loop-1729019 - {PUT-O-1724313} [matrix.org] Request failed: PUT matrix-federation://matrix.org/_matrix/federation/v1/send/1724110145885: ResponseNeverReceived:[CancelledError()]
2024-08-20 19:23:49,161 - synapse.http.matrixfederationclient - 811 - INFO - federation_transaction_transmission_loop-1729019 - {PUT-O-1724313} [matrix.org] Request failed: PUT matrix-federation://matrix.org/_matrix/federation/v1/send/1724110145885: ResponseNeverReceived:[CancelledError()]
2024-08-20 19:25:41,177 - synapse.http.matrixfederationclient - 768 - INFO - federation_transaction_transmission_loop-1729019 - {PUT-O-1724313} [matrix.org] Got response headers: 520


### Anything else that would be useful to know?

_No response_
@daedric7
Copy link

daedric7 commented Sep 9, 2024

There's no chance that it's this right ?

#17035

@jans23
Copy link
Author

jans23 commented Sep 10, 2024

There's no chance that it's this right ?

#17035

We tried the steps mentioned in the ticket but it didn't help thus it seems we face a different issue.

@jans23
Copy link
Author

jans23 commented Sep 10, 2024

@erikjohnston Any idea?

@daedric7
Copy link

You'll need to figure out what exactly is being GET that is too large. There must be more data in the logs.

@ansiwen
Copy link

ansiwen commented Sep 13, 2024

I created a non-encrypted room to isolate the issue, and there I see that messages from matrix.org -> nitro.chat get delivered immediately, but messages from nitro.chat -> matrix.org are delayed by about 1 day. Could be a separate issue, though.

@ansiwen
Copy link

ansiwen commented Sep 13, 2024

Some observations:

All failed connections to matrix-federation://matrix.org/_matrix/federation/v1/send/ are "bundled" requests with (PDUs: 50, EDUs: 100) which fail with either HttpResponseException('520: ') or ResponseNeverReceived:[CancelledError()]

This is the last example:

2024-09-13 17:31:23,899 - synapse.federation.sender.transaction_manager - 127 - INFO - federation_transaction_transmission_loop-1361440 - TX [matrix.org] {1726183365481} Sending transaction [1726183365481], (PDUs: 50, EDUs: 100)
2024-09-13 17:32:24,265 - synapse.http.matrixfederationclient - 811 - INFO - federation_transaction_transmission_loop-1361440 - {PUT-O-1344847} [matrix.org] Request failed: PUT matrix-federation://matrix.org/_matrix/federation/v1/send/1726183365481: ResponseNeverReceived:[CancelledError()]
2024-09-13 17:33:30,238 - synapse.http.matrixfederationclient - 811 - INFO - federation_transaction_transmission_loop-1361440 - {PUT-O-1344847} [matrix.org] Request failed: PUT matrix-federation://matrix.org/_matrix/federation/v1/send/1726183365481: ResponseNeverReceived:[CancelledError()]
2024-09-13 17:33:49,571 - synapse.http.matrixfederationclient - 811 - INFO - federation_transaction_transmission_loop-1361440 - {PUT-O-1344847} [matrix.org] Request failed: PUT matrix-federation://matrix.org/_matrix/federation/v1/send/1726183365481: HttpResponseException('520: ')
2024-09-13 17:36:03,086 - synapse.http.matrixfederationclient - 811 - INFO - federation_transaction_transmission_loop-1361440 - {PUT-O-1344847} [matrix.org] Request failed: PUT matrix-federation://matrix.org/_matrix/federation/v1/send/1726183365481: ResponseNeverReceived:[CancelledError()]
2024-09-13 17:38:11,440 - synapse.http.matrixfederationclient - 811 - INFO - federation_transaction_transmission_loop-1361440 - {PUT-O-1344847} [matrix.org] Request failed: PUT matrix-federation://matrix.org/_matrix/federation/v1/send/1726183365481: ResponseNeverReceived:[CancelledError()]
2024-09-13 17:40:32,957 - synapse.http.matrixfederationclient - 811 - INFO - federation_transaction_transmission_loop-1361440 - {PUT-O-1344847} [matrix.org] Request failed: PUT matrix-federation://matrix.org/_matrix/federation/v1/send/1726183365481: ResponseNeverReceived:[CancelledError()]
2024-09-13 17:42:54,415 - synapse.http.matrixfederationclient - 811 - INFO - federation_transaction_transmission_loop-1361440 - {PUT-O-1344847} [matrix.org] Request failed: PUT matrix-federation://matrix.org/_matrix/federation/v1/send/1726183365481: ResponseNeverReceived:[CancelledError()]
2024-09-13 17:44:43,320 - synapse.http.matrixfederationclient - 811 - INFO - federation_transaction_transmission_loop-1361440 - {PUT-O-1344847} [matrix.org] Request failed: PUT matrix-federation://matrix.org/_matrix/federation/v1/send/1726183365481: HttpResponseException('520: ')
2024-09-13 17:46:48,366 - synapse.http.matrixfederationclient - 811 - INFO - federation_transaction_transmission_loop-1361440 - {PUT-O-1344847} [matrix.org] Request failed: PUT matrix-federation://matrix.org/_matrix/federation/v1/send/1726183365481: ResponseNeverReceived:[CancelledError()]
2024-09-13 17:48:41,961 - synapse.http.matrixfederationclient - 811 - INFO - federation_transaction_transmission_loop-1361440 - {PUT-O-1344847} [matrix.org] Request failed: PUT matrix-federation://matrix.org/_matrix/federation/v1/send/1726183365481: HttpResponseException('520: ')
2024-09-13 17:50:31,480 - synapse.http.matrixfederationclient - 811 - INFO - federation_transaction_transmission_loop-1361440 - {PUT-O-1344847} [matrix.org] Request failed: PUT matrix-federation://matrix.org/_matrix/federation/v1/send/1726183365481: ResponseNeverReceived:[CancelledError()]
2024-09-13 17:50:31,480 - synapse.util.retryutils - 283 - INFO - federation_transaction_transmission_loop-1361440 - Connection to matrix.org was unsuccessful (<class 'synapse.api.errors.R
equestSendFailed'>(Failed to send request: ResponseNeverReceived: [<twisted.python.failure.Failure twisted.internet.defer.CancelledError: >])); backoff now 600000
2024-09-13 17:50:31,480 - synapse.federation.sender.per_destination_queue - 421 - WARNING - federation_transaction_transmission_loop-1361440 - TX [matrix.org] Failed to send transaction: 
Failed to send request: ResponseNeverReceived: [<twisted.python.failure.Failure twisted.internet.defer.CancelledError: >]
2024-09-13 17:50:31,480 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $JPs9G73PlQVYtvS4EOoR_Hne4XoO00JqckH3RyBogtU to matrix.org
2024-09-13 17:50:31,480 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $yHe98tmVnE7CNlCEzasNwDWRYyuY3Y58R7L15fssLDg to matrix.org
2024-09-13 17:50:31,480 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $6ljNKpZUExnXgiU7PeGr0FFtoYvgZIosFFPJUp_Hykk to matrix.org
2024-09-13 17:50:31,480 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $VXcQkJQ_qZ1xP-wPKLZDLtsfNqqbE4D6QWYSVy7NTh8 to matrix.org
2024-09-13 17:50:31,480 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $_-4Mk9dg8OkIhb-87wmRGSg776BcvrO01j9m2RgOoEE to matrix.org
2024-09-13 17:50:31,480 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $Xd-8qJdUKfcKbvoAkIRwdbAUReq6SyTWURtFfbAPS8E to matrix.org
2024-09-13 17:50:31,480 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $TRTb9BE9P8txa6-HdVaOlmWziDRok4JGhALdRUtjZO0 to matrix.org
2024-09-13 17:50:31,480 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $KuXd2wC7sKmGlfBf8tc4rwTjrUsxpZyEObJpQeUvOOw to matrix.org
2024-09-13 17:50:31,480 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $pQoppTeuEjLpDAmwVWdTZbWvDA5C2oY0WiOTezk8SC0 to matrix.org
2024-09-13 17:50:31,480 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $h7SqRbBi1fjIQwdzXb4kw6p2TI1I1VH6C_qisT1UxP8 to matrix.org
2024-09-13 17:50:31,481 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $i67qkznO7TGf1V69uQSy2Lci7UhEewwVr3dbY1DP46U to matrix.org
2024-09-13 17:50:31,481 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $jmUArY3m_nnlhuWwxzCvNWhykiwts_5wH7Ryenty4aM to matrix.org
2024-09-13 17:50:31,481 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $YEPJ6SEHFEheOGbEg0oWs11MsI1WwX45OTN4NdkUPiE to matrix.org
2024-09-13 17:50:31,481 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $WRBlSD3iAR6Q-YHH-QeH0XnSezJJAD5_ppvLBLWNnxE to matrix.org
2024-09-13 17:50:31,481 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $ZvngO_Sxih3Gut9_nrxyvOkt9sOjJxFcfiW7X-Hzqzw to matrix.org
2024-09-13 17:50:31,481 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $2ZEZYZcBoUXYZIqrxHgqAvUK71D94gwkxfXu--_Y-eM to matrix.org
2024-09-13 17:50:31,481 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $dl53v-h29fD54MYzeKUo2AMb3mJeXaX6RI4GfDrxUBs to matrix.org
2024-09-13 17:50:31,481 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $mmM25-aZM6THV9b6n9Hq0knLt8TrFvy-ZBsRN0yfEHw to matrix.org
2024-09-13 17:50:31,481 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $UOnAvcCuR_vgZ7Wii8I7TPulIyES4NQjJs0T1PJE9hE to matrix.org
2024-09-13 17:50:31,481 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $Mjuem-MUSrQ0zpv7S0Fw7KNHIBgImOJW3qq-lk0e0HE to matrix.org
2024-09-13 17:50:31,481 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $a_7UCharIpYUtpJHoDvpHi2oOq08zHVV8eVWt2gIuLw to matrix.org
2024-09-13 17:50:31,481 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $9iUMns_aWFZkTcSNT5Nfk_esaVZXFL1XZcBRvtO1_tA to matrix.org
2024-09-13 17:50:31,481 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $f5Fo7Dq43cigiTgSxDH2_yDQBltK_a7lc1U8AdUGJ0Y to matrix.org
2024-09-13 17:50:31,481 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $GZF5RgApYSrXO5aSnmq3gaczSeECITGGFwS4CK8UBxU to matrix.org
2024-09-13 17:50:31,481 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $cmR-QFhhe53D11XibfJ-nZ-7IoDxGqAlrN2JjfzSEBY to matrix.org
2024-09-13 17:50:31,481 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $BiOUCPObIQKFoSmHe-08pcooXxuHathFCYsc2CI4iyY to matrix.org
2024-09-13 17:50:31,481 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $9m7zUbp4Ty3MyXZp1sxrFk4cXudZVjc-giKggc-k9qc to matrix.org
2024-09-13 17:50:31,481 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $gxX16nm7Uvuvj0BjVGoyL8enbMM5mXHXa86cx0zuZhQ to matrix.org
2024-09-13 17:50:31,481 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $j63vKkZ2_iAry7j-R_C_V0AJBNnYMCU_Uz0MbXQTXAY to matrix.org
2024-09-13 17:50:31,481 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $Wh6qkqKgClHnwIDRpy9irmmv70hL7F0m1VxstAGrtO0 to matrix.org
2024-09-13 17:50:31,482 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $KUbT65NXgFcx2JUIpygzTB0gDYxZqehyh4EDBJ9bSzk to matrix.org
2024-09-13 17:50:31,482 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $35cH3B5cvw68PfFhq2x5AhCMJMFQMF9iUbALu32Mho0 to matrix.org
2024-09-13 17:50:31,482 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $eJ0YSYVCmmO5mDpEf-ToOmdYB4PaVsTNoATmeheaOUE to matrix.org
2024-09-13 17:50:31,482 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $0x25tCDhq8bZnFFk61jr7q6v4P6i0Qnw7aCqRlDCcdU to matrix.org
2024-09-13 17:50:31,482 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $7I9CI-ec9oflO234c499_07Hzt-bll6fgpVSBiq_JhA to matrix.org
2024-09-13 17:50:31,482 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $3J5NgxZ6Rraa6Rz5qe42W5wds1VJwGW1hU60AQzAFGo to matrix.org
2024-09-13 17:50:31,482 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $WTcCJ7Dzf7iMBFEiJuDLKTS3pietj36aKCguTPD-J0g to matrix.org
2024-09-13 17:50:31,482 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $rOHZpjUyh6apD4FI4ey9-VuV2p5jjEl7G9pLbpQ7Zv0 to matrix.org
2024-09-13 17:50:31,482 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $samXGu-1CsNMQtapnhLH1zagg_tI0vjpb3-woP24ICs to matrix.org
2024-09-13 17:50:31,482 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $EujQbI0qgWrWjTpgBCrzCSpZACms9Fqne57z1htIzEE to matrix.org
2024-09-13 17:50:31,482 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $nssmgMX-wjuxIuS8AeyQuLQRkgRIL8__wVwshvgmiOQ to matrix.org
2024-09-13 17:50:31,482 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $ai2H7EnYBWmSYjV_F-o0s1KxKczo8dZsHi83T16NUyc to matrix.org
2024-09-13 17:50:31,482 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $prYxmqda7uFMQgDtnnO6tAzaoyaTLs3jvTcMHe-cECc to matrix.org
2024-09-13 17:50:31,482 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $PvyKD0UenhdMZfMd9PgBK9CH8lz8h8vlRkQUEITI0kI to matrix.org
2024-09-13 17:50:31,482 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $uxDWMTWQz-sHcP_KI4SD7QP2S3k18I4NPEWvMEKXsRY to matrix.org
2024-09-13 17:50:31,482 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $eQejTwtJbw53M7pxyuAL0delZMISlzGPembP0PkxlIg to matrix.org
2024-09-13 17:50:31,482 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $clDwGE-IgpCAXscj9W5GmGMm7E8UF5T3ecNznF8FZ5U to matrix.org
2024-09-13 17:50:31,482 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $3EZrD3hoe7S137CLeSbk6wve-D5xblbXWImNFp321_4 to matrix.org
2024-09-13 17:50:31,482 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $nZd1JW0PGztiTwz2fasgewvZcF3B5Ry6c2juA9ae2Vk to matrix.org
2024-09-13 17:50:31,482 - synapse.federation.sender.per_destination_queue - 426 - INFO - federation_transaction_transmission_loop-1361440 - Failed to send event $gli4sVR3DEFTObpw_xmzs50HSmhv1LX5ywXwDcK_xcA to matrix.org

@ansiwen
Copy link

ansiwen commented Sep 14, 2024

I found a workaround:

if I divide all the PDU/EDU limits in federation/sender/per_destination_queue.py by two (see patch below), the problem seems to be gone, or at least a lot more unlikely. After the first restart after applying the patch interestingly it happened one time again with the lower limit, here the transactions:

2024-09-13 22:21:20,970 - synapse.federation.sender.transaction_manager - 127 - INFO - federation_transaction_transmission_loop-7 - TX [matrix.org] {1726258837961} Sending transaction [1726258837961], (PDUs: 1, EDUs: 0)
2024-09-13 22:21:21,150 - synapse.federation.sender.transaction_manager - 127 - INFO - federation_transaction_transmission_loop-7 - TX [matrix.org] {1726258837979} Sending transaction [1726258837979], (PDUs: 1, EDUs: 0)
2024-09-13 22:21:21,739 - synapse.federation.sender.transaction_manager - 127 - INFO - federation_transaction_transmission_loop-7 - TX [matrix.org] {1726258838032} Sending transaction [1726258838032], (PDUs: 0, EDUs: 50)
2024-09-13 22:21:26,998 - synapse.federation.sender.transaction_manager - 127 - INFO - federation_transaction_transmission_loop-7 - TX [matrix.org] {1726258838520} Sending transaction [1726258838520], (PDUs: 0, EDUs: 49)
2024-09-13 22:21:33,283 - synapse.federation.sender.transaction_manager - 127 - INFO - federation_transaction_transmission_loop-7 - TX [matrix.org] {1726258839127} Sending transaction [1726258839127], (PDUs: 0, EDUs: 46)
2024-09-13 22:21:36,898 - synapse.federation.sender.transaction_manager - 127 - INFO - federation_transaction_transmission_loop-7 - TX [matrix.org] {1726258839485} Sending transaction [1726258839485], (PDUs: 0, EDUs: 45)
2024-09-13 22:21:44,646 - synapse.federation.sender.transaction_manager - 127 - INFO - federation_transaction_transmission_loop-7 - TX [matrix.org] {1726258840244} Sending transaction [1726258840244], (PDUs: 0, EDUs: 47)
2024-09-13 22:21:54,335 - synapse.federation.sender.transaction_manager - 127 - INFO - federation_transaction_transmission_loop-7 - TX [matrix.org] {1726258841215} Sending transaction [1726258841215], (PDUs: 0, EDUs: 41)
2024-09-13 22:21:55,450 - synapse.federation.sender.transaction_manager - 127 - INFO - federation_transaction_transmission_loop-7 - TX [matrix.org] {1726258841329} Sending transaction [1726258841329], (PDUs: 0, EDUs: 40)
2024-09-13 22:21:57,276 - synapse.federation.sender.transaction_manager - 127 - INFO - federation_transaction_transmission_loop-7 - TX [matrix.org] {1726258841503} Sending transaction [1726258841503], (PDUs: 0, EDUs: 40)
2024-09-13 22:22:01,397 - synapse.federation.sender.transaction_manager - 127 - INFO - federation_transaction_transmission_loop-7 - TX [matrix.org] {1726258841885} Sending transaction [1726258841885], (PDUs: 0, EDUs: 40)

[ ... 340 transactions with 40 EDUs, all successful, matrix.org replied with 200 OK ... ]

Then this one is failing again

2024-09-13 22:48:50,080 - synapse.federation.sender.transaction_manager - 127 - INFO - federation_transaction_transmission_loop-7 - TX [matrix.org] {1726258881112} Sending transaction [1726258881112], (PDUs: 0, EDUs: 40)
2024-09-13 22:48:53,785 - synapse.http.matrixfederationclient - 811 - INFO - federation_transaction_transmission_loop-7 - {PUT-O-46809} [matrix.org] Request failed: PUT matrix-federation://matrix.org/_matrix/federation/v1/send/1726258881112: HttpResponseException('520: ')
2024-09-13 22:48:59,480 - synapse.http.matrixfederationclient - 811 - INFO - federation_transaction_transmission_loop-7 - {PUT-O-46809} [matrix.org] Request failed: PUT matrix-federation://matrix.org/_matrix/federation/v1/send/1726258881112: HttpResponseException('502: Bad Gateway')
2024-09-13 22:49:22,317 - synapse.http.matrixfederationclient - 811 - INFO - federation_transaction_transmission_loop-7 - {PUT-O-46809} [matrix.org] Request failed: PUT matrix-federation://matrix.org/_matrix/federation/v1/send/1726258881112: HttpResponseException('520: ')
2024-09-13 22:50:17,969 - synapse.http.matrixfederationclient - 811 - INFO - federation_transaction_transmission_loop-7 - {PUT-O-46809} [matrix.org] Request failed: PUT matrix-federation://matrix.org/_matrix/federation/v1/send/1726258881112: HttpResponseException('520: ')
2024-09-13 22:51:23,460 - synapse.http.matrixfederationclient - 811 - INFO - federation_transaction_transmission_loop-7 - {PUT-O-46809} [matrix.org] Request failed: PUT matrix-federation://matrix.org/_matrix/federation/v1/send/1726258881112: HttpResponseException('520: ')
2024-09-13 22:52:16,359 - synapse.http.matrixfederationclient - 811 - INFO - federation_transaction_transmission_loop-7 - {PUT-O-46809} [matrix.org] Request failed: PUT matrix-federation://matrix.org/_matrix/federation/v1/send/1726258881112: HttpResponseException('520: ')
2024-09-13 22:53:23,943 - synapse.http.matrixfederationclient - 811 - INFO - federation_transaction_transmission_loop-7 - {PUT-O-46809} [matrix.org] Request failed: PUT matrix-federation://matrix.org/_matrix/federation/v1/send/1726258881112: HttpResponseException('520: ')
2024-09-13 22:54:38,146 - synapse.http.matrixfederationclient - 811 - INFO - federation_transaction_transmission_loop-7 - {PUT-O-46809} [matrix.org] Request failed: PUT matrix-federation://matrix.org/_matrix/federation/v1/send/1726258881112: HttpResponseException('520: ')
2024-09-13 22:55:57,357 - synapse.http.matrixfederationclient - 811 - INFO - federation_transaction_transmission_loop-7 - {PUT-O-46809} [matrix.org] Request failed: PUT matrix-federation://matrix.org/_matrix/federation/v1/send/1726258881112: HttpResponseException('520: ')
2024-09-13 22:57:01,351 - synapse.http.matrixfederationclient - 811 - INFO - federation_transaction_transmission_loop-7 - {PUT-O-46809} [matrix.org] Request failed: PUT matrix-federation://matrix.org/_matrix/federation/v1/send/1726258881112: HttpResponseException('520: ')

So it looks like we try to deliver pending EDUs, and at some point matrix.org starts to block our transactions because there are too many or so?

After another restart, it again shortly went up to 40 EDUs, but then recovered:

*RESTART*
2024-09-13 22:58:21,425 - synapse.federation.sender.transaction_manager - 127 - INFO - federation_transaction_transmission_loop-4 - TX [matrix.org] {1726261100644} Sending transaction [1726261100644], (PDUs: 1, EDUs: 0)
2024-09-13 22:58:21,606 - synapse.federation.sender.transaction_manager - 127 - INFO - federation_transaction_transmission_loop-4 - TX [matrix.org] {1726261100647} Sending transaction [1726261100647], (PDUs: 1, EDUs: 0)
2024-09-13 22:58:21,712 - synapse.federation.sender.transaction_manager - 127 - INFO - federation_transaction_transmission_loop-4 - TX [matrix.org] {1726261100649} Sending transaction [1726261100649], (PDUs: 1, EDUs: 0)
2024-09-13 22:58:21,800 - synapse.federation.sender.transaction_manager - 127 - INFO - federation_transaction_transmission_loop-4 - TX [matrix.org] {1726261100652} Sending transaction [1726261100652], (PDUs: 1, EDUs: 0)
2024-09-13 22:58:22,024 - synapse.federation.sender.transaction_manager - 127 - INFO - federation_transaction_transmission_loop-4 - TX [matrix.org] {1726261100662} Sending transaction [1726261100662], (PDUs: 0, EDUs: 41)
2024-09-13 22:58:29,524 - synapse.federation.sender.transaction_manager - 127 - INFO - federation_transaction_transmission_loop-4 - TX [matrix.org] {1726261101119} Sending transaction [1726261101119], (PDUs: 0, EDUs: 40)
2024-09-13 22:58:31,954 - synapse.federation.sender.transaction_manager - 127 - INFO - federation_transaction_transmission_loop-4 - TX [matrix.org] {1726261101338} Sending transaction [1726261101338], (PDUs: 0, EDUs: 42)
2024-09-13 22:58:36,458 - synapse.federation.sender.transaction_manager - 127 - INFO - federation_transaction_transmission_loop-4 - TX [matrix.org] {1726261101730} Sending transaction [1726261101730], (PDUs: 0, EDUs: 40)
2024-09-13 22:58:38,599 - synapse.federation.sender.transaction_manager - 127 - INFO - federation_transaction_transmission_loop-4 - TX [matrix.org] {1726261101906} Sending transaction [1726261101906], (PDUs: 1, EDUs: 41)
2024-09-13 22:58:42,249 - synapse.federation.sender.transaction_manager - 127 - INFO - federation_transaction_transmission_loop-4 - TX [matrix.org] {1726261102211} Sending transaction [1726261102211], (PDUs: 0, EDUs: 40)
2024-09-13 22:58:53,842 - synapse.federation.sender.transaction_manager - 127 - INFO - federation_transaction_transmission_loop-4 - TX [matrix.org] {1726261103181} Sending transaction [1726261103181], (PDUs: 2, EDUs: 41)
2024-09-13 22:58:56,380 - synapse.federation.sender.transaction_manager - 127 - INFO - federation_transaction_transmission_loop-4 - TX [matrix.org] {1726261103392} Sending transaction [1726261103392], (PDUs: 0, EDUs: 40)
2024-09-13 22:59:00,876 - synapse.federation.sender.transaction_manager - 127 - INFO - federation_transaction_transmission_loop-4 - TX [matrix.org] {1726261103767} Sending transaction [1726261103767], (PDUs: 1, EDUs: 5)
2024-09-13 22:59:08,930 - synapse.federation.sender.transaction_manager - 127 - INFO - federation_transaction_transmission_loop-3838 - TX [matrix.org] {1726261104475} Sending transaction [1726261104475], (PDUs: 0, EDUs: 1)
2024-09-13 22:59:16,567 - synapse.federation.sender.transaction_manager - 127 - INFO - federation_transaction_transmission_loop-4183 - TX [matrix.org] {1726261104816} Sending transaction [1726261104816], (PDUs: 0, EDUs: 1)
2024-09-13 22:59:16,672 - synapse.federation.sender.transaction_manager - 127 - INFO - federation_transaction_transmission_loop-4183 - TX [matrix.org] {1726261104818} Sending transaction [1726261104818], (PDUs: 0, EDUs: 1)
2024-09-13 22:59:18,807 - synapse.federation.sender.transaction_manager - 127 - INFO - federation_transaction_transmission_loop-4186 - TX [matrix.org] {1726261104820} Sending transaction [1726261104820], (PDUs: 0, EDUs: 1)
...

Since then no incident anymore (~18 hours).

Also note that all transactions with 40 EDUs are done by federation_transaction_transmission_loop-7 and federation_transaction_transmission_loop-4, probably loops who deliver pending EDUs, and after that, every transaction is done by different loops.

Here's the patch I applied, that seems to mitigate the issue:

--- /opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/federation/sender/per_destination_queue.py	2024-09-02 18:16:07.000000000 +0200
+++ per_destination_queue.py	2024-09-13 19:36:30.462914769 +0200
@@ -50,7 +50,7 @@
     import synapse.server
 
 # This is defined in the Matrix spec and enforced by the receiver.
-MAX_EDUS_PER_TRANSACTION = 100
+MAX_EDUS_PER_TRANSACTION = 50
 
 logger = logging.getLogger(__name__)
 
@@ -71,7 +71,7 @@
 
 # Limit how many presence states we add to each presence EDU, to ensure that
 # they are bounded in size.
-MAX_PRESENCE_STATES_PER_EDU = 50
+MAX_PRESENCE_STATES_PER_EDU = 25
 
 
 class PerDestinationQueue:
@@ -790,7 +790,7 @@
 
         # Now we look for any PDUs to send, by getting up to 50 PDUs from the
         # queue
-        self._pdus = self.queue._pending_pdus[:50]
+        self._pdus = self.queue._pending_pdus[:25]
 
         if not self._pdus and not pending_edus:
             return [], []

@ansiwen
Copy link

ansiwen commented Sep 14, 2024

Ok, after looking into #17035 (which I should have done in the beginning 🤦‍♂️) I believe this is the same issue, but the PR #17371 didn't fix it yet. Will comment there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants