Page MenuHomePureOS Tracker

Resource temporarily unavailable [IP: 138.201.228.45 443] (notorious https)
Closed, ResolvedPublic

Description

Could not wait for server fd - select (11: Resource temporarily unavailable) [IP: 138.201.228.45 443]

Event Timeline

Similar issue:

Get:393 https://repo.pureos.net/pureos byzantium/main amd64 xvfb amd64 2:1.20.10-2 [3036 kB]
Fetched 182 MB in 5s (40.4 MB/s)
E: Failed to fetch https://repo.pureos.net/pureos/pool/main/l/llvm-toolchain-11/libllvm11_11.0.1-2_amd64.deb Error reading from server - read (5: Input/output error) [IP: 138.201.228.45 443]
E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
E: Failed to process build dependencies

jeremiah.foster renamed this task from Resource contention in APT to Resource temporarily unavailable [IP: 138.201.228.45 443] .Feb 10 2021, 10:42
jeremiah.foster triaged this task as High priority.

Similar:

Get:289 https://repo.pureos.net/pureos byzantium/main arm64 xdg-user-dirs arm64 0.17-2 [53.2 kB]
E: Failed to fetch https://repo.pureos.net/pureos/pool/main/c/chardet/python3-chardet_4.0.0-1_all.deb Error reading from server - read (5: Input/output error) [IP: 138.201.228.45 443]
E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
Fetched 94.6 MB in 9s (10.1 MB/s)

jeremiah.foster renamed this task from Resource temporarily unavailable [IP: 138.201.228.45 443] to Resource temporarily unavailable [IP: 138.201.228.45 443] (notorious https).Feb 11 2021, 06:47
jeremiah.foster claimed this task.

nginx error log on Artemis is saying:

2021/02/11 00:36:48 [error] 503#503: *9986273 directory index of "/srv/repo.puri.sm/" is forbidden, client: 138.201.228.45, server: artemis.pureos.net, request: "POST / HTTP/1.0", host: "artemis.pureos.net"
2021/02/11 01:17:27 [error] 503#503: *9990896 open() "/srv/repo.puri.sm/.well-known/security.txt" failed (2: No such file or directory), client: 138.201.228.45, server: artemis.pureos.net, request: "GET /.well-known/security.txt HTTP/1.0", host: "artemis.pureos.net"
2021/02/11 09:31:20 [error] 503#503: *10043922 directory index of "/srv/repo.puri.sm/" is forbidden, client: 138.201.228.45, server: artemis.pureos.net, request: "POST / HTTP/1.0", host: "artemis.pureos.net"
2021/02/11 10:46:11 [error] 503#503: *10049424 directory index of "/srv/repo.puri.sm/" is forbidden, client: 138.201.228.45, server: artemis.pureos.net, request: "POST / HTTP/1.0", host: "artemis.pureos.net"

guido added subscribers: mak, guido.EditedFeb 16 2021, 00:01

to add some context:

we're seeing image build and CI failures when downloading packages. @mak also saw this when generating isos. Besides the above the client side also sometimes says:

Fetched 164 MB in 14s (11.4 MB/s)
E: Failed to fetch https://repo.pureos.net/pureos/pool/main/libx/libxcb/libxcb-shm0_1.14-3_arm64.deb  Undetermined Error [IP: 138.201.228.45 443]
E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
E: Failed to process build dependencies

Switching to http:// reliably works around the problem

maybe relevant to your research:
curl -O https://repo.pureos.net/pureos/dists/amber/main/source/Sources.xz works fine,
but curl -O --cert-status https://repo.pureos.net/pureos/dists/amber/main/source/Sources.xz fails:

curl: (91) No OCSP response received

Similar for https://deb.debian.org/ succeeds.

I..e might be this issue can be narrowed down to OCSP stapling on our server.

https://stackoverflow.com/a/60243923 mentions how to disable OCSP for apt:

touch /etc/apt/apt.conf.d/99verify-peer.conf \
&& echo >>/etc/apt/apt.conf.d/99verify-peer.conf "Acquire { https::Verify-Peer false }"

https://serverfault.com/a/950568 mentions intermittent OCSP failures tied to IPv6.

mak added a comment.Mar 4 2021, 14:31

I tested this with https::Verify-Peer false, still the same issue happens:

Fetched 1016 MB in 4min 12s (4025 kB/s)                                        
2021/03/04 22:30:39 apt | E: Failed to fetch https://repo.pureos.net/pureos/pool/main/f/fftw3/libfftw3-double3_3.3.8-2_amd64.deb  Connection timed out [IP: 138.201.228.45 443]
2021/03/04 22:30:39 apt | E: Failed to fetch https://repo.pureos.net/pureos/pool/main/s/spice-gtk/libspice-client-glib-2.0-8_0.39-1_amd64.deb  Connection timed out [IP: 138.201.228.45 443]
2021/03/04 22:30:39 apt | E: Failed to fetch https://repo.pureos.net/pureos/pool/main/libs/libsodium/libsodium23_1.0.18-1_amd64.deb  Connection timed out [IP: 138.201.228.45 443]
2021/03/04 22:30:39 apt | E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
mak added a comment.Mar 4 2021, 14:53

I pasted the wrong log, but the connection timeout is actually even more frequent now than the Resource temporarily unavailable issue - but both appear.

mak added a comment.Mar 4 2021, 15:22

I also tried messing with timeouts on APTs transport methods, with no luck - according to APT, the server just stops responding (according to curl though, it doesn't).

Benchmark:

Server Software:        nginx/1.10.3
Server Hostname:        repo.pureos.net
Server Port:            443
SSL/TLS Protocol:       TLSv1.2,ECDHE-RSA-AES128-GCM-SHA256,2048,128
Server Temp Key:        X25519 253 bits
TLS Server Name:        repo.pureos.net

Document Path:          /
Document Length:        381 bytes

Concurrency Level:      5
Time taken for tests:   91.208 seconds
Complete requests:      1000
Failed requests:        0
Total transferred:      518000 bytes
HTML transferred:       381000 bytes
Requests per second:    10.96 [#/sec] (mean)
Time per request:       456.042 [ms] (mean)
Time per request:       91.208 [ms] (mean, across all concurrent requests)
Transfer rate:          5.55 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:      285  339 164.2    300    1784
Processing:    92  114  71.8     96     641
Waiting:       91  102  45.8     95     637
Total:        379  452 179.8    398    1886

Percentage of the requests served within a certain time (ms)
  50%    398
  66%    402
  75%    410
  80%    427
  90%    547
  95%    733
  98%   1060
  99%   1489
 100%   1886 (longest request)
Certificate version: 3
Valid from: Oct  7 19:21:40 2020 GMT
Valid to  : Sep 29 19:21:40 2021 GMT
Public key is 2048 bits
The issuer name is /O=Digital Signature Trust Co./CN=DST Root CA X3
The subject name is /C=US/O=Let's Encrypt/CN=R3
Extension Count: 8
Peer certificate
Certificate version: 3
Valid from: Feb 27 18:12:29 2021 GMT
Valid to  : May 28 18:12:29 2021 GMT
Public key is 2048 bits
The issuer name is /C=US/O=Let's Encrypt/CN=R3
The subject name is /CN=downloads.pureos.net
Extension Count: 9
Transport Protocol      :TLSv1.2
Cipher Suite Protocol   :TLSv1.2
Cipher Suite Name       :ECDHE-RSA-AES128-GCM-SHA256
Cipher Suite Cipher Bits:128 (128)
SSL-Session:
    Protocol  : TLSv1.2
    Cipher    : ECDHE-RSA-AES128-GCM-SHA256
    Session-ID: 39DB1E294804DA2D5AB727DE4CF12062B4FA46A36F9DFA278CD675B3535CE0FD
    Session-ID-ctx: 
    Master-Key: BCF95A63D726D1B9685B5293C6212D1CBD8620E94904D9D3A4CA8B6A9EAA6CF5976F668441B9F8F4DF24A70F457C5422
    PSK identity: None
    PSK identity hint: None
    SRP username: None
    TLS session ticket lifetime hint: 86400 (seconds)
    TLS session ticket:
    0000 - b1 a5 92 f4 25 9b 67 fc-d5 c9 5e 0b 0d ba e7 5e   ....%.g...^....^
    0010 - 66 2e d9 f2 68 3a 4f e9-3e 00 9d 33 7b e2 66 49   f...h:O.>..3{.fI
    0020 - ff 93 f6 af 6a a0 64 7b-84 eb fc 07 f1 bf 10 ba   ....j.d{........
    0030 - 48 55 66 ca 4a 9e 44 de-3b 5e 7b f9 e0 e9 23 6a   HUf.J.D.;^{...#j
    0040 - 88 6f 52 da 28 43 c3 92-2b 9a da f7 d4 f1 3b 9c   .oR.(C..+.....;.
    0050 - 2e 6f 9c a3 71 78 cf f2-4d e6 b1 62 16 87 c3 01   .o..qx..M..b....
    0060 - 58 7d b4 9f 89 e2 e2 98-39 71 3b bd 05 06 5d 22   X}......9q;...]"
    0070 - 0e b6 fc 17 2c 86 08 13-3c e3 65 24 a3 7b 45 9a   ....,...<.e$.{E.
    0080 - 31 10 70 30 1e d7 64 92-09 b4 10 bf 09 e9 be 10   1.p0..d.........
    0090 - 18 56 32 e6 60 bf 0f 24-10 ae df 8f 48 b9 8f 48   .V2.`..$....H..H
    00a0 - 1c e3 fa bc 2b a7 d2 52-da 1f cf 28 d1 01 cd 95   ....+..R...(....
    00b0 - 91 6b c6 b2 9d 60 96 a1-24 51 18 92 19 c9 ab 3b   .k...`..$Q.....;

    Start Time: 1615226134
    Timeout   : 7200 (sec)
    Verify return code: 20 (unable to get local issuer certificate)
    Extended master secret: yes
LOG: header received:
HTTP/1.1 200 OK
Server: nginx/1.10.3
Date: Mon, 08 Mar 2021 17:55:34 GMT
Content-Type: text/html; charset=utf-8
Connection: close

<html>
<head><title>Index of /</title></head>
<body bgcolor="white">
<h1>Index of /</h1><hr><pre><a href="../">../</a>
<a href="pureos/">pureos/</a>                                            13-Sep-2019 21:41                   -
<a href="pureos-debug/">pureos-debug/</a>                                      14-Sep-2019 13:10                   -
</pre><hr></body>
</html>

LOG: Response code = 200
SSL/TLS Alert [read] warning:close notify
Completed 1000 requests
SSL/TLS Alert [write] warning:close notify
Finished 1000 requests


Server Software:        nginx/1.10.3
Server Hostname:        repo.puri.sm
Server Port:            443
SSL/TLS Protocol:       TLSv1.2,ECDHE-RSA-AES128-GCM-SHA256,2048,128
Server Temp Key:        X25519 253 bits
TLS Server Name:        repo.puri.sm

Document Path:          /
Document Length:        381 bytes

Concurrency Level:      1
Time taken for tests:   442.822 seconds
Complete requests:      1000
Failed requests:        0
Keep-Alive requests:    0
Total transferred:      518000 bytes
HTML transferred:       381000 bytes
Requests per second:    2.26 [#/sec] (mean)
Time per request:       442.822 [ms] (mean)
Time per request:       442.822 [ms] (mean, across all concurrent requests)
Transfer rate:          1.14 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:      285  335 153.0    299    1400
Processing:    92  107  57.6     97     526
Waiting:       91  101  37.1     96     396
Total:        378  443 165.2    396    1781

Percentage of the requests served within a certain time (ms)
  50%    396
  66%    399
  75%    401
  80%    403
  90%    680
  95%    698
  98%    984
  99%   1485
 100%   1781 (longest request)

I did two runs of 1000 calls to the server and did not see any anomalies.

What exactly was tested 1000 times? some apt tool using an up-to-date libapt, or something more low-level?

If using a different method than actual apt, then is it certain that the level of TLS validation is equal or stronger than that done by up-to-date libapt?

I notice the following in above data dump (which I guess is a sample output similar to those 1000 calls:

Verify return code: 20 (unable to get local issuer certificate)

That looks like a potential anomaly to me...

mak added a comment.Apr 16 2021, 18:47

Some new observations:

  • This is not a proxy server issue: Even without proxy, the issue occurs
  • The TLS version doesn't matter at all
  • Before the issue occurs, we get quite a few TCP retransmissions from the client to the server, and then the current connection is dropped:
  • There is nothing suspicious in the Nginx logs, not even at info priority. A quick glance at the debug logs also didn't show anything interesting, but those are massive and it's possible that I missed something.

There's also a TCP reset sent from the server to the client, but I'm not sure if that's actually related to the issue - it is suspicious that this always happens before the connection dies.
So, I am pretty sure now that this is something weird on our side, the client appears to behave fine. Nginx however also appears to behave as it should, so I wonder whether we have something else in the network or an odd firewall setting which causes this.

mak closed this task as Resolved.Apr 17 2021, 17:11
mak claimed this task.

So, that RST packet is indeed the issue, and there's a high chance that either APT or GnuTLS don't handle this correctly. I talked with an APT developer, and we may actually need to debug this further in future.
In the meanwhile though, the issue can be mitigated by throwing an Apache2 webserver in front as proxy, instead of Nginx.

I implemented that in our infrastructure, so the issue is gone for us now though. Therefore I am marking this issue as resolved, but we may actually need to open a new one against either Nginx or, more likely, APT in future to properly fix this.

Hello,

the problem still persists on my Librem 5. I cannot download updates for tzdata and certificates

patata added a comment.Apr 22 2021, 03:12

After a recent flashing of my L5, this is not an issue anymore.

patata removed a subscriber: patata.Apr 22 2021, 03:12