This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Synapse fails to make DNS query for its own hostname when attempting invite via 3PID #9475

Closed
alex-caelus opened this issue Feb 23, 2021 · 42 comments

Comments

@alex-caelus

alex-caelus commented Feb 23, 2021

Hi!

Synapse makes DNS ANY (ALL) requests through use of twisted, which sometimes fail. EDIT: My bad, this was apparently not what was happening, see thread for discussion.

For example when I'm trying to invite 'ma1sd-federation-test@kamax.io' as per instructions on https://github.com/ma1uta/ma1sd/blob/master/docs/getting-started.md the invitation fails. Upon investigation I see the following in the logs:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/synapse/http/server.py", line 252, in _async_render_wrapper
    callback_return = await self._async_render(request)
  File "/usr/local/lib/python3.8/site-packages/synapse/http/server.py", line 430, in _async_render
    callback_return = await raw_callback_return
  File "/usr/local/lib/python3.8/site-packages/synapse/rest/client/v1/room.py", line 734, in on_POST
    await self.room_member_handler.do_3pid_invite(
  File "/usr/local/lib/python3.8/site-packages/synapse/handlers/room_member.py", line 879, in do_3pid_invite
    stream_id = await self._make_and_store_3pid_invite(
  File "/usr/local/lib/python3.8/site-packages/synapse/handlers/room_member.py", line 941, in _make_and_store_3pid_invite
    ) = await self.identity_handler.ask_id_server_for_third_party_invite(
  File "/usr/local/lib/python3.8/site-packages/synapse/handlers/identity.py", line 856, in ask_id_server_for_third_party_invite
    data = await self.blacklisting_http_client.post_json_get_json(
  File "/usr/local/lib/python3.8/site-packages/synapse/http/client.py", line 530, in post_json_get_json
    response = await self.request(
  File "/usr/local/lib/python3.8/site-packages/synapse/http/client.py", line 422, in request
    response = await make_deferred_yieldable(request_deferred)
  File "/usr/local/lib/python3.8/site-packages/twisted/internet/defer.py", line 654, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/usr/local/lib/python3.8/site-packages/twisted/internet/endpoints.py", line 981, in startConnectionAttempts
    raise error.DNSLookupError(
twisted.internet.error.DNSLookupError: DNS lookup failed: no results for hostname lookup: matrix.nilsson.link.

Note that it's trying to make a DNS lookup to my own server, on my own network which my local dns server responds to. Unfortunately the dns request is of type 255 or ANY/ALL which has been deprecated for many years. My DNS server, correctly, returns an empty response.

BTW, the following code also fails (for me) when doing a lookup on matrix.org, because Google's DNS server responds with an HINFO record instead of an A or AAAA record (or CNAME):

import sys

from twisted.python import log
from twisted.names import client

from twisted.internet import reactor

if __name__ == "__main__":
    log.startLogging(sys.stdout)
    client.theResolver = client.Resolver(servers=[("8.8.8.8", 53)])

    def cb(*args):
        log.msg(args)

    def do_lookup(domain):
        d = client.getHostByName(domain)
        d.addBoth(cb)

    reactor.callLater(0, do_lookup, "matrix.org")
    reactor.run()
@ShadowJonathan
Contributor

Relevant links:

Btw, what twisted version are you using? (pip freeze | grep -i twisted)

@richvdh
Member

richvdh commented Feb 23, 2021

this seems to be https://twistedmatrix.com/trac/ticket/9691

@richvdh
Member

richvdh commented Feb 23, 2021

I'm surprised though. I didn't think we used twisted.names except for SRV lookups (which don't use ANY).

I've been very confused about this in the past though (see #7113 (comment) and https://github.com/matrix-org/synapse/blob/v1.27.0/synapse/app/_base.py#L393), and it's possible I'm still confused. It seems like somehow twisted.names is being installed as the resolver for you.

@alex-caelus
Author

alex-caelus commented Feb 24, 2021

I'm running with docker as per http://github.com/spantaleev/matrix-docker-ansible-deploy

docker exec -it matrix-synapse python -c 'from twisted import version
print(str(version))'

[Twisted, version 20.3.0] <- this appears to be the latest version

@ShadowJonathan
Contributor

ShadowJonathan commented Feb 25, 2021

@alex-caelus do you have a dnsmasq installation on your host machine? (What OS/version is that machine?) Or anything that catches DNS requests from the system and alters them before passing them on (for example, having localhost as your resolver)?

@ShadowJonathan
Contributor

I think this needs more info, like synapse version, OS version, and maybe the DNS server's logs denoting the ANY DNS request (and other requests) making it possible to properly debug this.

@clokep added the X-Needs-Info label (This issue is blocked awaiting information from the reporter) on Feb 25, 2021
@alex-caelus
Author

It's a clean and up-to-date Ubuntu 20.04. Matrix was installed using the Ansible scripts from https://github.com/spantaleev/matrix-docker-ansible-deploy, which means everything runs from Docker images.

root@matrix:~# dpkg -l

https://paste.ubuntu.com/p/m5DSsWqy33/

root@matrix:~# docker ps
CONTAINER ID   IMAGE                                     
faa1c199c9ea   dock.mau.dev/tulir/mautrix-whatsapp:latest
f83844d8eb32   zeratax/matrix-registration:v0.7.2        
d52c3de3c073   ma1uta/ma1sd:2.4.0-amd64                  
93e6d2492d39   sorunome/mx-puppet-slack:latest           
984f6676415c   matrixdotorg/synapse:v1.27.0              
efec68203cbb   jitsi/jvb:stable-5142                     
d75c34164315   jitsi/jicofo:stable-5142                  
d1c66be55161   turt2live/matrix-dimension:latest         
d3840e23e75b   instrumentisto/coturn:4.5.2               
919cb8b552ae   devture/exim-relay:4.93-r1                
22e6a545e635   jitsi/web:stable-5142                     
de5dd243400a   vectorim/element-web:v1.7.21              
2693ae26beb1   postgres:13.2-alpine                      
020be23c3de3   jitsi/prosody:stable-5142                 
65b3655dc118   nginx:1.19.6-alpine                       
root@matrix:~# systemd-resolve --status | tail -n 11
Link 2 (ens18)
      Current Scopes: DNS
DefaultRoute setting: yes
       LLMNR setting: yes
MulticastDNS setting: no
  DNSOverTLS setting: no
      DNSSEC setting: no
    DNSSEC supported: no
  Current DNS Server: 10.0.0.1
         DNS Servers: 10.0.0.1
          DNS Domain: nilsson.link

But this is weird, and perhaps the title of this issue is wrong. I tried to reproduce the stack trace on my own before I filed this bug: I ran a few scripts with Twisted on my desktop (on both Windows and Linux), got the same exception, and then used Wireshark to look at the traffic. That is how I came to the conclusion in my first post. However, when I run tcpdump on the Matrix server instead, I see the correct requests for A and AAAA records, which means the stack trace in my first post is caused by something else.

I'm attaching the tcpdump in case you are interested, but I can see nothing wrong in it. (Note the bind9 man-in-the-middle: 127.0.0.1 <-> 127.0.0.53 <-> 10.0.0.1).

@ShadowJonathan
Contributor

I'll take a look at that once i have some time, thanks for coming through!

@ShadowJonathan
Contributor

@alex-caelus i'm not seeing any ANY requests in here, does the bug still persist? If so, please record the tcpdump while the bug is occurring, because this looks like normal behaviour.

@richvdh
Member

richvdh commented Feb 26, 2021

i'm not seeing any ANY requests in here,

isn't that exactly what they said in #9475 (comment) ?

@alex-caelus
Author

@alex-caelus i'm not seeing any ANY requests in here, does the bug still persist? If so, please record the tcpdump while the bug is occurring, because this looks like normal behaviour.

Well, my assumed root-cause might be wrong so the title is off. But the backtrace and symptoms (failed to invite via email-address) still affect me, so yes the bug is still persisting.

@ShadowJonathan
Contributor

Alright, just for clarification, does your DNS server still register ANY DNS requests when you do that?

@alex-caelus
Author

alex-caelus commented Feb 26, 2021

No, it apparently never did. I just assumed that it did (see my earlier comment)

@alex-caelus changed the title from "Synapse makes DNS ANY (ALL) requests through used of twisted, which fails sometimes." to "Synapse fails to make DNS query for its own hostname when attempting invite via 3PID" on Feb 26, 2021
@alex-caelus
Author

I edited the title and description; they were confusing, to say the least.

@ShadowJonathan
Contributor

Then the question becomes: why is it failing a lookup to that address (from the server)?

Could you maybe do host matrix.nilsson.link 10.0.0.1/dig matrix.nilsson.link 10.0.0.1 on your host machine (assuming 10.0.0.1 there is your dns server)?

@alex-caelus
Author

alex-caelus commented Feb 26, 2021

If you look at the pcap file I attached you can see that the query is successful but still the exception happens. Here is a screenshot of the relevant parts, for those that do not have wireshark installed:

[Screenshot: pcap-summary]

@alex-caelus
Author

I see three possibilities myself, based on the pcap:

  1. The IPv6 response is empty and this triggers some bug in twisted.
  2. There is some kind of DNSSEC or authoritative issue that makes the responses untrusted.
  3. Some other bug...

I have trouble believing in number 2 since I have had no other issues with other programs or devices on the same network before.

Could you maybe do dig matrix.nilsson.link 10.0.0.1 on your host machine (assuming 10.0.0.1 there is your dns server)?

Sure thing:

root@matrix:~# dig matrix.nilsson.link 10.0.0.1

; <<>> DiG 9.16.1-Ubuntu <<>> matrix.nilsson.link 10.0.0.1
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 45630
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;matrix.nilsson.link.           IN      A

;; ANSWER SECTION:
matrix.nilsson.link.    1087    IN      A       10.0.0.110

;; Query time: 0 msec
;; SERVER: 127.0.0.53#53(127.0.0.53)
;; WHEN: Fri Feb 26 12:45:03 UTC 2021
;; MSG SIZE  rcvd: 64

;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 60238
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;10.0.0.1.                      IN      A

;; Query time: 4 msec
;; SERVER: 127.0.0.53#53(127.0.0.53)
;; WHEN: Fri Feb 26 12:45:03 UTC 2021
;; MSG SIZE  rcvd: 37

@ShadowJonathan
Contributor

oh whoops, forgot dig needed @ in front of dns server directions

thanks for the information, though, this is probably enough to figure out this bug

@hidraulicChicken

hidraulicChicken commented Jun 9, 2021

Hey, I have the same problem:

Jun 09 11:09:48 synapse matrix-synapse[10074]: 2021-06-09 09:09:48,649 - synapse.http.server - 93 - ERROR - POST-107881 - Failed handle request via 'ThreepidBindRestServlet': <XForwardedForRequest at 0x7f19947ff3d0 method='POST' uri='/_matrix/client/r0/account/3pid/bind' clientproto='HTTP/1.0' site='8008'>
Jun 09 11:09:48 synapse matrix-synapse[10074]: Traceback (most recent call last):
Jun 09 11:09:48 synapse matrix-synapse[10074]: File "/usr/local/lib/python3.8/site-packages/synapse/http/server.py", line 258, in _async_render_wrapper
Jun 09 11:09:48 synapse matrix-synapse[10074]: callback_return = await self._async_render(request)
Jun 09 11:09:48 synapse matrix-synapse[10074]: File "/usr/local/lib/python3.8/site-packages/synapse/http/server.py", line 446, in _async_render
Jun 09 11:09:48 synapse matrix-synapse[10074]: callback_return = await raw_callback_return
Jun 09 11:09:48 synapse matrix-synapse[10074]: File "/usr/local/lib/python3.8/site-packages/synapse/rest/client/v2_alpha/account.py", line 760, in on_POST
Jun 09 11:09:48 synapse matrix-synapse[10074]: await self.identity_handler.bind_threepid(
Jun 09 11:09:48 synapse matrix-synapse[10074]: File "/usr/local/lib/python3.8/site-packages/synapse/handlers/identity.py", line 213, in bind_threepid
Jun 09 11:09:48 synapse matrix-synapse[10074]: data = await self.blacklisting_http_client.post_json_get_json(
Jun 09 11:09:48 synapse matrix-synapse[10074]: File "/usr/local/lib/python3.8/site-packages/synapse/http/client.py", line 545, in post_json_get_json
Jun 09 11:09:48 synapse matrix-synapse[10074]: response = await self.request(
Jun 09 11:09:48 synapse matrix-synapse[10074]: File "/usr/local/lib/python3.8/site-packages/synapse/http/client.py", line 437, in request
Jun 09 11:09:48 synapse matrix-synapse[10074]: response = await make_deferred_yieldable(request_deferred)
Jun 09 11:09:48 synapse matrix-synapse[10074]: File "/usr/local/lib/python3.8/site-packages/twisted/internet/defer.py", line 662, in _runCallbacks
Jun 09 11:09:48 synapse matrix-synapse[10074]: current.result = callback(current.result, *args, **kw)
Jun 09 11:09:48 synapse matrix-synapse[10074]: File "/usr/local/lib/python3.8/site-packages/twisted/internet/endpoints.py", line 1024, in startConnectionAttempts
Jun 09 11:09:48 synapse matrix-synapse[10074]: raise error.DNSLookupError(
Jun 09 11:09:48 synapse matrix-synapse[10074]: twisted.internet.error.DNSLookupError: DNS lookup failed: no results for hostname lookup: matrix.example.com.

It also runs in a container, and I have no issue getting the IP on the host.

@evodicka

evodicka commented Jun 9, 2021

Hey, I am also experiencing this issue:

Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/synapse/http/matrixfederationclient.py", line 567, in _send_request
response = await request_deferred
File "/usr/local/lib/python3.8/site-packages/twisted/internet/defer.py", line 1443, in _inlineCallbacks
result = current_context.run(result.throwExceptionIntoGenerator, g)
File "/usr/local/lib/python3.8/site-packages/twisted/python/failure.py", line 500, in throwExceptionIntoGenerator
return g.throw(self.type, self.value, self.tb)
File "/usr/local/lib/python3.8/site-packages/synapse/http/federation/matrix_federation_agent.py", line 189, in request
res = yield make_deferred_yieldable(
File "/usr/local/lib/python3.8/site-packages/twisted/internet/defer.py", line 1443, in _inlineCallbacks
result = current_context.run(result.throwExceptionIntoGenerator, g)
File "/usr/local/lib/python3.8/site-packages/twisted/python/failure.py", line 500, in throwExceptionIntoGenerator
return g.throw(self.type, self.value, self.tb)
File "/usr/local/lib/python3.8/site-packages/synapse/http/federation/matrix_federation_agent.py", line 295, in _do_connect
raise first_exception
File "/usr/local/lib/python3.8/site-packages/synapse/http/federation/matrix_federation_agent.py", line 281, in _do_connect
result = await make_deferred_yieldable(
File "/usr/local/lib/python3.8/site-packages/twisted/internet/defer.py", line 662, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/usr/local/lib/python3.8/site-packages/twisted/internet/endpoints.py", line 1024, in startConnectionAttempts
raise error.DNSLookupError(
twisted.internet.error.DNSLookupError: DNS lookup failed: no results for hostname lookup: *********.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/synapse/http/server.py", line 258, in _async_render_wrapper
callback_return = await self._async_render(request)
File "/usr/local/lib/python3.8/site-packages/synapse/http/server.py", line 446, in _async_render
callback_return = await raw_callback_return
File "/usr/local/lib/python3.8/site-packages/synapse/rest/client/v1/room.py", line 90, in on_POST
info, _ = await self._room_creation_handler.create_room(
File "/usr/local/lib/python3.8/site-packages/synapse/handlers/room.py", line 824, in create_room
) = await self.room_member_handler.update_membership_locked(
File "/usr/local/lib/python3.8/site-packages/synapse/handlers/room_member.py", line 654, in update_membership_locked
return await self._local_membership_update(
File "/usr/local/lib/python3.8/site-packages/synapse/handlers/room_member.py", line 287, in _local_membership_update
result_event = await self.event_creation_handler.handle_new_client_event(
File "/usr/local/lib/python3.8/site-packages/synapse/util/metrics.py", line 91, in measured_func
r = await func(self, *args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/synapse/handlers/message.py", line 985, in handle_new_client_event
result = await make_deferred_yieldable(
File "/usr/local/lib/python3.8/site-packages/twisted/internet/defer.py", line 1445, in _inlineCallbacks
result = current_context.run(g.send, result)
File "/usr/local/lib/python3.8/site-packages/synapse/handlers/message.py", line 1049, in _persist_event
event = await self.persist_and_notify_client_event(
File "/usr/local/lib/python3.8/site-packages/synapse/handlers/message.py", line 1252, in persist_and_notify_client_event
returned_invite = await federation_handler.send_invite(
File "/usr/local/lib/python3.8/site-packages/synapse/handlers/federation.py", line 1398, in send_invite
pdu = await self.federation_client.send_invite(
File "/usr/local/lib/python3.8/site-packages/synapse/federation/federation_client.py", line 786, in send_invite
content = await self._do_send_invite(destination, pdu, room_version)
File "/usr/local/lib/python3.8/site-packages/synapse/federation/federation_client.py", line 818, in _do_send_invite
return await self.transport_layer.send_invite_v2(
File "/usr/local/lib/python3.8/site-packages/synapse/federation/transport/client.py", line 338, in send_invite_v2
response = await self.client.put_json(
File "/usr/local/lib/python3.8/site-packages/synapse/http/matrixfederationclient.py", line 843, in put_json
response = await self._send_request_with_optional_trailing_slash(
File "/usr/local/lib/python3.8/site-packages/synapse/http/matrixfederationclient.py", line 383, in _send_request_with_optional_trailing_slash
response = await self._send_request(request, **send_request_args)
File "/usr/local/lib/python3.8/site-packages/synapse/http/matrixfederationclient.py", line 569, in _send_request
raise RequestSendFailed(e, can_retry=retry_on_dns_fail) from e
synapse.api.errors.RequestSendFailed: Failed to send request: DNSLookupError: DNS lookup failed: no results for hostname lookup: *******

I am running the latest Synapse Docker Container matrixdotorg/synapse:v1.35.1

A few points that might be interesting:

  • The Hostname that cannot be resolved is an internal name, that is only served from the DNS server in our company network
  • Other "external" hostnames seem to be working
  • Calling dig or nslookup inside the Docker container returns the expected result
  • DNS name only has an A-Record (no IPv6 Address)

@evodicka

evodicka commented Jun 9, 2021

Ok, after some additional digging through the code and the logs, I found the solution to my problem:
Synapse is blocking DNS resolution to private IP ranges, as the comment for ip_range_blacklist in homeserver.yaml says:

Prevent outgoing requests from being sent to the following blacklisted IP address
CIDR ranges. If this option is not specified then it defaults to private IP
address ranges (see the example below).

The blacklist applies to the outbound requests for federation, identity servers,
push servers, and for checking key validity for third-party invite events.

(0.0.0.0 and :: are always blacklisted, whether or not they are explicitly
listed here, since they correspond to unroutable addresses.)

This option replaces federation_ip_range_blacklist in Synapse v1.25.0.

After setting the blacklist explicitly and excluding my IP range from it, name resolution (and federation) started working. Strangely, it was not enough to put the IP range on the whitelist; I had to remove it from the blacklist.
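
For illustration, here is a minimal sketch of how a blacklist with a whitelist override could be evaluated for a single address. It uses only Python's standard ipaddress module. The CIDR lists and the is_blocked helper are assumptions made for this example, not Synapse's actual code, and the ranges shown are only a subset of the private ranges.

```python
import ipaddress

# Illustrative ranges only: a subset of private/loopback CIDRs, assumed for this sketch.
BLACKLIST = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "127.0.0.0/8")
]
# Hypothetical exception for one internal subnet.
WHITELIST = [ipaddress.ip_network("10.0.0.0/24")]


def is_blocked(ip, blacklist=BLACKLIST, whitelist=WHITELIST):
    """Return True if ip is in a blacklisted range and not covered by a whitelisted one."""
    addr = ipaddress.ip_address(ip)
    in_blacklist = any(addr in net for net in blacklist)
    in_whitelist = any(addr in net for net in whitelist)
    return in_blacklist and not in_whitelist


if __name__ == "__main__":
    for ip in ("10.0.0.110", "10.1.2.3", "8.8.8.8"):
        print(ip, "blocked" if is_blocked(ip) else "allowed")
```

With an empty whitelist, 10.0.0.110 is blocked in this sketch, which matches the behaviour of removing the range from the blacklist being the only thing that helped.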

@erikjohnston
Member

@evodicka the problem with the whitelist is likely due to #10115


@alex-caelus looks like your server is trying to send a request to a host with a blacklisted IP (from your pcap, the answer to the DNS query is in 10.0.*, which is private and blacklisted by default). Is it trying to talk to itself? If so that's a bug, as Synapse should never attempt to talk to itself over federation.

@alex-caelus
Author

I don't know about actually talking to itself, but yes, it's trying to resolve its own hostname. I'm inviting ma1sd-federation-test@kamax.io, but the step that fails is the DNS resolution of 'matrix.nilsson.link', which is the server itself.

@alex-caelus
Author

Here's the synapse logs again (v1.35.1):

Jun 15 07:29:10 matrix matrix-synapse[1715079]: 2021-06-15 07:29:10,980 - synapse.handlers.identity - 666 - WARNING - POST-1310686 - Error when looking up hashing details: DNS lookup failed: no results for hostname lookup: matrix.nilsson.link.
Jun 15 07:29:11 matrix matrix-synapse[1715079]: 2021-06-15 07:29:11,010 - synapse.http.server - 93 - ERROR - POST-1310686 - Failed handle request via 'RoomCreateRestServlet': <XForwardedForRequest at 0x7f4c5dee0430 method='POST' uri='/_matrix/client/r0/createRoom' clientproto='HTTP/1.0' site='8008'>
Jun 15 07:29:11 matrix matrix-synapse[1715079]: Traceback (most recent call last):
Jun 15 07:29:11 matrix matrix-synapse[1715079]:   File "/usr/local/lib/python3.8/site-packages/synapse/http/server.py", line 258, in _async_render_wrapper
Jun 15 07:29:11 matrix matrix-synapse[1715079]:     callback_return = await self._async_render(request)
Jun 15 07:29:11 matrix matrix-synapse[1715079]:   File "/usr/local/lib/python3.8/site-packages/synapse/http/server.py", line 446, in _async_render
Jun 15 07:29:11 matrix matrix-synapse[1715079]:     callback_return = await raw_callback_return
Jun 15 07:29:11 matrix matrix-synapse[1715079]:   File "/usr/local/lib/python3.8/site-packages/synapse/rest/client/v1/room.py", line 90, in on_POST
Jun 15 07:29:11 matrix matrix-synapse[1715079]:     info, _ = await self._room_creation_handler.create_room(
Jun 15 07:29:11 matrix matrix-synapse[1715079]:   File "/usr/local/lib/python3.8/site-packages/synapse/handlers/room.py", line 840, in create_room
Jun 15 07:29:11 matrix matrix-synapse[1715079]:     last_stream_id = await self.hs.get_room_member_handler().do_3pid_invite(
Jun 15 07:29:11 matrix matrix-synapse[1715079]:   File "/usr/local/lib/python3.8/site-packages/synapse/handlers/room_member.py", line 931, in do_3pid_invite
Jun 15 07:29:11 matrix matrix-synapse[1715079]:     stream_id = await self._make_and_store_3pid_invite(
Jun 15 07:29:11 matrix matrix-synapse[1715079]:   File "/usr/local/lib/python3.8/site-packages/synapse/handlers/room_member.py", line 993, in _make_and_store_3pid_invite
Jun 15 07:29:11 matrix matrix-synapse[1715079]:     ) = await self.identity_handler.ask_id_server_for_third_party_invite(
Jun 15 07:29:11 matrix matrix-synapse[1715079]:   File "/usr/local/lib/python3.8/site-packages/synapse/handlers/identity.py", line 890, in ask_id_server_for_third_party_invite
Jun 15 07:29:11 matrix matrix-synapse[1715079]:     data = await self.blacklisting_http_client.post_json_get_json(
Jun 15 07:29:11 matrix matrix-synapse[1715079]:   File "/usr/local/lib/python3.8/site-packages/synapse/http/client.py", line 545, in post_json_get_json
Jun 15 07:29:11 matrix matrix-synapse[1715079]:     response = await self.request(
Jun 15 07:29:11 matrix matrix-synapse[1715079]:   File "/usr/local/lib/python3.8/site-packages/synapse/http/client.py", line 437, in request
Jun 15 07:29:11 matrix matrix-synapse[1715079]:     response = await make_deferred_yieldable(request_deferred)
Jun 15 07:29:11 matrix matrix-synapse[1715079]:   File "/usr/local/lib/python3.8/site-packages/twisted/internet/defer.py", line 662, in _runCallbacks
Jun 15 07:29:11 matrix matrix-synapse[1715079]:     current.result = callback(current.result, *args, **kw)
Jun 15 07:29:11 matrix matrix-synapse[1715079]:   File "/usr/local/lib/python3.8/site-packages/twisted/internet/endpoints.py", line 1024, in startConnectionAttempts
Jun 15 07:29:11 matrix matrix-synapse[1715079]:     raise error.DNSLookupError(
Jun 15 07:29:11 matrix matrix-synapse[1715079]: twisted.internet.error.DNSLookupError: DNS lookup failed: no results for hostname lookup: matrix.nilsson.link.

@erikjohnston
Member

Does matrix.nilsson.link resolve to a private IP on that box? (Or an IP in the blacklist?)

@alex-caelus
Author

it resolves to 10.0.0.110

@erikjohnston
Member

erikjohnston commented Jun 18, 2021

Ok, yup, that will be in the IP blacklist by default. Looks like you're using something on matrix.nilsson.link as an ID server? I guess this is "expected" in that the blacklist is working correctly, though it's terrible that it isn't more obvious.

I wonder if we should log loudly every time we blacklist an IP or something?

@alex-caelus
Author

I'm using https://github.com/spantaleev/matrix-docker-ansible-deploy to deploy everything on the same server, including both synapse and ma1sd.

But why is it trying to look up itself?

What is special about my setup so that not everyone is having the same issue? Also, should I remove my ip from the blacklist or is it there for a reason?

@clokep
Member

clokep commented Jun 18, 2021

I wonder if we should log loudly every time we blacklist an IP or something?

See

# if we have a blacklisted IP, we'd like to raise an error to block the
# request, but all we can really do from here is claim that there were no
# valid results.

What is special about my setup so that not everyone is having the same issue? Also, should I remove my ip from the blacklist or is it there for a reason?

You're running two matrix services that are attempting to communicate via a private IP, which is not common, although it seems I've been fielding a bunch of questions about this frequently.

@richvdh
Member

richvdh commented Jun 18, 2021

But why is it trying to look up itself?

#4857 is probably related here

@daudo

daudo commented Jun 18, 2021

stumbled upon this issue as well. My setup is based on https://github.com/spantaleev/matrix-docker-ansible-deploy, too.

It's a little bit weird to have Synapse refusing to make DNS queries against our own internal nameservers, and if I look at the sample config file, it says:

The blacklist applies to the outbound requests for federation, identity servers,
push servers, and for checking key validity for third-party invite events.

I fail to see DNS queries being mentioned here. So IMHO at least DNS queries should be mentioned there as well plus any other resources that might fail, too.

@clokep
Member

clokep commented Jun 18, 2021

I fail to see DNS queries being mentioned here. So IMHO at least DNS queries should be mentioned there as well plus any other resources that might fail, too.

DNS queries are (generally) made using getaddrinfo, not directly from Synapse, so queries to DNS servers on a private IP are not blocked by Synapse (since Synapse doesn't even know what is being queried -- the underlying OS/glibc implementation is doing it).
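
To see what the OS resolver returns for a name independently of Synapse, something like the following can be run (for example inside the container). This is a minimal sketch using Python's standard socket.getaddrinfo; the resolve helper is a name made up for this example.

```python
import ipaddress
import socket


def resolve(hostname):
    """Resolve hostname through the OS resolver and return the unique addresses, sorted."""
    infos = socket.getaddrinfo(hostname, None)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # "localhost" is used here so the example does not depend on external DNS;
    # substitute the hostname you are debugging.
    for addr in resolve("localhost"):
        # Strip any IPv6 zone suffix (e.g. "%eth0") before parsing.
        ip = ipaddress.ip_address(addr.split("%")[0])
        print(addr, "private" if ip.is_private else "public")
```

If the name resolves here but Synapse still reports "no results", the lookup itself is probably fine and the addresses are being discarded afterwards.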

@alex-caelus
Author

I fail to see DNS queries being mentioned here. So IMHO at least DNS queries should be mentioned there as well plus any other resources that might fail, too.

DNS queries are (generally) made using getaddrinfo, not directly from Synapse, so queries to DNS servers on a private IP are not blocked by Synapse (since Synapse doesn't even know what is being queried -- the underlying OS/glibc implementation is doing it).

Maybe I'm dense, but that sounds like synapse is not blocking the dns queries based on the blacklist. I'm getting a bit confused here

@evodicka

I fail to see DNS queries being mentioned here. So IMHO at least DNS queries should be mentioned there as well plus any other resources that might fail, too.

DNS queries are (generally) made using getaddrinfo, not directly from Synapse, so queries to DNS servers on a private IP are not blocked by Synapse (since Synapse doesn't even know what is being queried -- the underlying OS/glibc implementation is doing it).

Maybe I'm dense, but that sounds like synapse is not blocking the dns queries based on the blacklist. I'm getting a bit confused here

Well, it looks like Synapse is not blocking the DNS queries, but it throws away the resolved IPs if they are on the blacklist. You can checkout client.py to have a look what it actually does.
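
A rough sketch of that behaviour, using only the standard library: resolved addresses that fall in a blacklisted range are dropped, and if nothing is left the caller sees what looks like an empty DNS answer. The range list, filter_addresses, and LookupFailed are hypothetical names for this example, not Synapse's or Twisted's API.

```python
import ipaddress
import logging

logger = logging.getLogger(__name__)

# Assumed for illustration: one private range standing in for the configured blacklist.
BLACKLIST = [ipaddress.ip_network("10.0.0.0/8")]


class LookupFailed(Exception):
    """Hypothetical stand-in for a DNS lookup error."""


def filter_addresses(hostname, addresses, blacklist=BLACKLIST):
    """Drop blacklisted addresses and raise LookupFailed if none remain."""
    allowed = []
    for addr in addresses:
        ip = ipaddress.ip_address(addr)
        if any(ip in net for net in blacklist):
            logger.info(
                "Dropped %s from DNS resolution to %s due to blacklist", addr, hostname
            )
        else:
            allowed.append(addr)
    if not allowed:
        # The caller cannot tell this apart from a genuinely empty DNS answer.
        raise LookupFailed("no results for hostname lookup: %s" % hostname)
    return allowed
```

In this sketch, filter_addresses("matrix.nilsson.link", ["10.0.0.110"]) raises LookupFailed even though the name resolved, because the only address was filtered out.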

@daudo

daudo commented Jun 21, 2021

I fail to see DNS queries being mentioned here. So IMHO at least DNS queries should be mentioned there as well plus any other resources that might fail, too.

DNS queries are (generally) made using getaddrinfo, not directly from Synapse, so queries to DNS servers on a private IP are not blocked by Synapse (since Synapse doesn't even know what is being queried -- the underlying OS/glibc implementation is doing it).

Maybe I'm dense, but that sounds like synapse is not blocking the dns queries based on the blacklist. I'm getting a bit confused here

Well, it looks like Synapse is not blocking the DNS queries, but it throws away the resolved IPs if they are on the blacklist. You can checkout client.py to have a look what it actually does.

All right, but if I read the stack trace correctly, this has got nothing to do with client.py. As a side note, client.py even nicely logs when it throws away an IP because it has been blacklisted:

logger.info(
    "Dropped %s from DNS resolution to %s due to blacklist"
    % (ip_address, hostname)
)

The stack trace reads like this (copied from comment 861258334)

Jun 15 07:29:10 matrix matrix-synapse[1715079]: 2021-06-15 07:29:10,980 - synapse.handlers.identity - 666 - WARNING - POST-1310686 - Error when looking up hashing details: DNS lookup failed: no results for hostname lookup: matrix.nilsson.link.
Jun 15 07:29:11 matrix matrix-synapse[1715079]: 2021-06-15 07:29:11,010 - synapse.http.server - 93 - ERROR - POST-1310686 - Failed handle request via 'RoomCreateRestServlet': <XForwardedForRequest at 0x7f4c5dee0430 method='POST' uri='/_matrix/client/r0/createRoom' clientproto='HTTP/1.0' site='8008'>
Jun 15 07:29:11 matrix matrix-synapse[1715079]: Traceback (most recent call last):
Jun 15 07:29:11 matrix matrix-synapse[1715079]:   File "/usr/local/lib/python3.8/site-packages/synapse/http/server.py", line 258, in _async_render_wrapper
Jun 15 07:29:11 matrix matrix-synapse[1715079]:     callback_return = await self._async_render(request)
Jun 15 07:29:11 matrix matrix-synapse[1715079]:   File "/usr/local/lib/python3.8/site-packages/synapse/http/server.py", line 446, in _async_render
Jun 15 07:29:11 matrix matrix-synapse[1715079]:     callback_return = await raw_callback_return
Jun 15 07:29:11 matrix matrix-synapse[1715079]:   File "/usr/local/lib/python3.8/site-packages/synapse/rest/client/v1/room.py", line 90, in on_POST
Jun 15 07:29:11 matrix matrix-synapse[1715079]:     info, _ = await self._room_creation_handler.create_room(
Jun 15 07:29:11 matrix matrix-synapse[1715079]:   File "/usr/local/lib/python3.8/site-packages/synapse/handlers/room.py", line 840, in create_room
Jun 15 07:29:11 matrix matrix-synapse[1715079]:     last_stream_id = await self.hs.get_room_member_handler().do_3pid_invite(
Jun 15 07:29:11 matrix matrix-synapse[1715079]:   File "/usr/local/lib/python3.8/site-packages/synapse/handlers/room_member.py", line 931, in do_3pid_invite
Jun 15 07:29:11 matrix matrix-synapse[1715079]:     stream_id = await self._make_and_store_3pid_invite(
Jun 15 07:29:11 matrix matrix-synapse[1715079]:   File "/usr/local/lib/python3.8/site-packages/synapse/handlers/room_member.py", line 993, in _make_and_store_3pid_invite
Jun 15 07:29:11 matrix matrix-synapse[1715079]:     ) = await self.identity_handler.ask_id_server_for_third_party_invite(
Jun 15 07:29:11 matrix matrix-synapse[1715079]:   File "/usr/local/lib/python3.8/site-packages/synapse/handlers/identity.py", line 890, in ask_id_server_for_third_party_invite
Jun 15 07:29:11 matrix matrix-synapse[1715079]:     data = await self.blacklisting_http_client.post_json_get_json(
Jun 15 07:29:11 matrix matrix-synapse[1715079]:   File "/usr/local/lib/python3.8/site-packages/synapse/http/client.py", line 545, in post_json_get_json
Jun 15 07:29:11 matrix matrix-synapse[1715079]:     response = await self.request(
Jun 15 07:29:11 matrix matrix-synapse[1715079]:   File "/usr/local/lib/python3.8/site-packages/synapse/http/client.py", line 437, in request
Jun 15 07:29:11 matrix matrix-synapse[1715079]:     response = await make_deferred_yieldable(request_deferred)
Jun 15 07:29:11 matrix matrix-synapse[1715079]:   File "/usr/local/lib/python3.8/site-packages/twisted/internet/defer.py", line 662, in _runCallbacks
Jun 15 07:29:11 matrix matrix-synapse[1715079]:     current.result = callback(current.result, *args, **kw)
Jun 15 07:29:11 matrix matrix-synapse[1715079]:   File "/usr/local/lib/python3.8/site-packages/twisted/internet/endpoints.py", line 1024, in startConnectionAttempts
Jun 15 07:29:11 matrix matrix-synapse[1715079]:     raise error.DNSLookupError(
Jun 15 07:29:11 matrix matrix-synapse[1715079]: twisted.internet.error.DNSLookupError: DNS lookup failed: no results for hostname lookup: matrix.nilsson.link.

And if you look at the actual error logged, it says twisted.internet.error.DNSLookupError: DNS lookup failed: no results for hostname lookup: matrix.nilsson.link.

So the message explicitly says that the DNS lookup failed. That's definitely something that should be corrected to say that the DNS response has been filtered out due to the blacklist.

@evodicka

evodicka commented Jun 21, 2021

I fail to see DNS queries being mentioned here. So IMHO at least DNS queries should be mentioned there as well plus any other resources that might fail, too.

DNS queries are (generally) made using getaddrinfo, not directly from Synapse, so queries to DNS servers on a private IP are not blocked by Synapse (since Synapse doesn't even know what is being queried -- the underlying OS/glibc implementation is doing it).

Maybe I'm dense, but that sounds like Synapse is not blocking the DNS queries based on the blacklist. I'm getting a bit confused here.

Well, it looks like Synapse is not blocking the DNS queries, but it throws away the resolved IPs if they are on the blacklist. You can check out client.py to see what it actually does.

All right, but if I read the stack trace correctly, this has nothing to do with client.py. As a side note, client.py even nicely logs when it throws away an IP because it has been blacklisted:

logger.info(
    "Dropped %s from DNS resolution to %s due to blacklist"
    % (ip_address, hostname)
)

Yes, it does. And when I ran into this issue, I could actually find exactly that log a couple of lines above the stacktrace. I just wasn't paying attention to it at first, because I focused only on the stacktrace.

And if you look at the actual error logged, it says twisted.internet.error.DNSLookupError: DNS lookup failed: no results for hostname lookup: matrix.nilsson.link.

So the message explicitly says that the DNS lookup failed. That's definitely something that should be corrected to say that the DNS response has been filtered out due to the blacklist.

I am not an expert on this, but here is what I understood when I investigated after running into the same issue:

  • client.py defines a wrapper around the Twisted name resolver that drops blacklisted IP addresses.
  • This wrapper is used, for example, in matrix_federation_agent.py.
  • endpoints.py (the immediate source of the stack trace) throws the DNSLookupError when it tries to connect but the DNS query returns no IP addresses, because they were all filtered out by the reactor defined in client.py.

And yes, I find that very misleading.
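
To make the sequence above concrete, here is a minimal sketch of the mechanism. It is not Synapse's or Twisted's actual code: the ranges, the function names and the LookupFailed class are all hypothetical stand-ins. It only shows how dropping every resolved address ends in an error that reads like a DNS failure.

```python
import ipaddress
import logging

logger = logging.getLogger(__name__)

# Hypothetical ranges for illustration; these are NOT Synapse's real defaults.
BLACKLIST = [ipaddress.ip_network(n) for n in ("10.0.0.0/8", "127.0.0.0/8", "192.168.0.0/16")]


class LookupFailed(Exception):
    """Stand-in for twisted.internet.error.DNSLookupError."""


def is_blocked(ip, blacklist, whitelist=()):
    """A whitelisted address is never blocked; otherwise check the blacklist."""
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in whitelist):
        return False
    return any(addr in net for net in blacklist)


def drop_blacklisted(hostname, resolved_ips, blacklist, whitelist=()):
    """Hypothetical resolver wrapper: return only the addresses that are allowed."""
    kept = []
    for ip in resolved_ips:
        if is_blocked(ip, blacklist, whitelist):
            logger.info("Dropped %s from DNS resolution to %s due to blacklist", ip, hostname)
        else:
            kept.append(ip)
    return kept


def start_connection(hostname, addresses):
    """Hypothetical connect step: it sees only an empty list, not the reason for it."""
    if not addresses:
        raise LookupFailed("no results for hostname lookup: %s" % hostname)
    return addresses[0]


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    survivors = drop_blacklisted("matrix.example.com", ["10.9.9.5"], BLACKLIST)
    try:
        start_connection("matrix.example.com", survivors)
    except LookupFailed as exc:
        print("connect failed:", exc)
```

The connect step never learns that the resolver did return an address, which is why the only hint at the real cause is the INFO line logged a few lines before the traceback.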

@erikjohnston erikjohnston added T-Task Refactoring, removal, replacement, enabling or disabling functionality, other engineering tasks. and removed X-Needs-Info This issue is blocked awaiting information from the reporter T-Task Refactoring, removal, replacement, enabling or disabling functionality, other engineering tasks. labels Jun 21, 2021
@erikjohnston
Member

I've opened #10224 to track the fact that the stack trace is quite misleading.

I think otherwise all the issues that have been brought up in this thread have been resolved?

@daudo

daudo commented Jun 21, 2021

I'm not sure & I'm no expert here. From my POV, I still see 3 potential remaining problems:

  1. Why does Synapse talk to itself?
  2. If Synapse talks to itself, this should never fail when using default options.
  3. Why is talking to internal networks blocked by default?

I think the first problem is already tracked in #4857.

IMHO, the second problem really should just never occur. I fail to see any (security) implications when a service like Synapse is allowed to talk to itself. From my POV, this seems to be an almost "natural" behavior :). No default setting of a blacklist should disallow this.

And per the third potential problem, I honestly don't understand why Synapse should not be allowed to talk to internal networks at all, or at least to servers that share its same network segment.

So all I can say is that problems should be fixed where they occur, but I am in no position to say what the "right place" for such a fix is.

@erikjohnston
Member

If Synapse talks to itself, this should never fail when using default options

That is true, but it looks to me from the stack traces that it's talking to a configured identity server, which happens to have the same domain as the home server implementation.

And per the third potential problem, I honestly don't understand why Synapse should not be allowed to talk to internal networks at all, or at least to servers that share its same network segment.

There are a number of ways in which clients and other servers can get Synapse to make requests to arbitrary IPs and ports, completely bypassing any firewalls etc. This can easily lead to security issues if people don't realise that this is possible (and generally it's rare for there to be any reason for Synapse to talk to private IPs).

@Jieiku

Jieiku commented Dec 12, 2021

Nobody in this thread covered how to remove an IP range from the blacklist.

I am running HAProxy on pfSense and it handles external requests just fine, but internally I use a virtual IP so that clients on the local network also get routed (I do not use NAT reflection).

Unfortunately the result is:
2021-12-12 14:45:23,067 - sydent.http.blacklisting_reactor - 91 - INFO - Dropped 10.9.9.5 from DNS resolution to matrix.example.com due to blacklist

Update:

Sydent lists a lot of config values under [default], but you cannot just edit them there; you have to figure out which of the sections below each one belongs to. Once I set the whitelist under the [general] section, I got past that hurdle and on to the next one.

sydent.conf:

[general]
server.name = example.com
ip.whitelist = 10.9.9.5

@DMRobertson
Contributor

I edited my matrix synapse homeserver config:

@Jieiku your homeserver is separate from Sydent (the identity server). The homeserver's black- and whitelists are separate from the identity server's black- and whitelists.

I strongly recommend you do not set the ip.blacklist as above, because doing so wipes out the defaults, which is probably not what you want. Instead, you want to set ip.whitelist under the [general] section of sydent's config.

If that doesn't resolve the problem, please open an issue in the Sydent repo, or ask in the matrix room #sydent:matrix.org for help.

@Jieiku

Jieiku commented Dec 14, 2021

Thank you! I appreciate it! I did actually figure out that it was the Sydent config to edit. The confusing part of this config is that there is a [default] section, but changing things there does not actually change anything. You have to add the lines to the sections below, such as [general].

I think the Sydent config file should have the sections below populated, with the lines simply commented out. It was not at all obvious how to edit this file.
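
For readers puzzled by the two sections, the following sketch shows how Python's standard configparser treats a DEFAULT section: it only supplies fallback values, and a key set in a named section takes precedence. That Sydent loads its file this way is an assumption here, the file contents are hypothetical, and the sketch does not explain why edits under [default] were ignored in the report above.

```python
import configparser

# Hypothetical file contents, loosely modelled on the sydent.conf snippet in this thread.
SAMPLE_CONF = """
[DEFAULT]
server.name =
ip.whitelist =

[general]
server.name = example.com
ip.whitelist = 10.9.9.5

[db]
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE_CONF)

# A key set in a named section wins over the DEFAULT fallback.
general_whitelist = parser.get("general", "ip.whitelist")

# A section that does not set the key falls back to the DEFAULT value.
db_whitelist = parser.get("db", "ip.whitelist")

print(repr(general_whitelist))  # '10.9.9.5'
print(repr(db_whitelist))       # ''
```

Under that precedence rule, a value placed under [general] is the one a reader of that section will see, regardless of what the fallback section contains.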
