Decryption failed or bad record mac

SSL Decryption Error when using git and pip on Windows

It’s a Windows 10 system. I attempted to download the newest OpenSSL and set all the system variables to point to its installation folder or config, depending on the variable (OPENSSLDIR, PATH, OPENSSL_CONFIG, OPENSSL_CFG). The new version of OpenSSL was successfully detected (verified with openssl version -a), but the errors remained. I then tried downloading Mozilla’s certificates.pem, as suggested by ChatGPT, and added them with certutil, but that didn’t help either. I have tried different Wi-Fi networks; it doesn’t work even on the university eduroam.

asked Feb 26, 2023 at 8:26 by Dennis Hvel

1 Answer

When you set up an encrypted connection with TLS (including with HTTPS), the connection includes a message authentication code (MAC) or a similar construction (an AEAD) to verify that the data has not been changed. This is important because it means that an attacker cannot tamper with your connection and this feature cannot be disabled.
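To make that concrete, here is a minimal Python sketch using the pyca/cryptography package (an illustration only, not anything git or pip runs internally): flipping a single bit of an AEAD-protected message makes decryption fail outright.

import os
from cryptography.exceptions import InvalidTag
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=128)
nonce = os.urandom(12)  # AES-GCM uses a 96-bit nonce
aead = AESGCM(key)

ciphertext = aead.encrypt(nonce, b"GET / HTTP/1.1", None)
tampered = ciphertext[:-1] + bytes([ciphertext[-1] ^ 0x01])  # flip one bit

try:
    aead.decrypt(nonce, tampered, None)
except InvalidTag:
    print("record rejected: the TLS equivalent of 'bad record mac'")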

The message “decryption failed or bad record mac” means that the MAC (or AEAD, as appropriate) validation failed, and the data received is not the data that was sent. The only safe thing to do is abort the connection, which is what happened. This isn’t a certificate problem; it’s an indication that something between your computer and the destination, whether software on your system, network hardware, or anything in between, caused the data to change.

On Windows, common causes of failure are non-default antivirus or firewall software, as well as monitoring software. The recommended approach is to completely uninstall such software and reboot (disabling it is often not enough), relying only on the defaults (Windows Defender and Windows Firewall). It can also be a proxy or TLS MITM device on your network, poor-quality Wi-Fi or network drivers (I’ve seen similar problems with some settings on Killer network cards), a flaky Internet connection, or really any sort of network problem, network software, or network device between your computer and the destination, inclusive. What that is, we can’t know, but it’s up to you to try to narrow it down and figure it out.
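One way to narrow it down is to take git and pip out of the picture entirely. A rough Python sketch that exercises plain HTTPS in a loop (github.com is just an example host; run it on the affected machine and compare with a known-good one):

import socket
import ssl

HOST = "github.com"  # assumption: any HTTPS host you need to reach
ctx = ssl.create_default_context()

for attempt in range(20):
    try:
        with socket.create_connection((HOST, 443), timeout=10) as sock:
            with ctx.wrap_socket(sock, server_hostname=HOST) as tls:
                tls.sendall(b"GET / HTTP/1.1\r\nHost: " + HOST.encode()
                            + b"\r\nConnection: close\r\n\r\n")
                while tls.recv(65536):  # drain the whole response
                    pass
        print(f"attempt {attempt}: ok")
    except ssl.SSLError as exc:
        print(f"attempt {attempt}: {exc}")  # 'bad record mac' shows up here

If the loop fails intermittently here too, the problem is below git and pip, pointing at the network path or interfering software.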

I’m getting the error: SSL3_GET_RECORD:decryption failed or bad record mac

And the strange thing is that I’m getting this error every fifth click on my website. From my conf file:

SSLEngine on
SSLCertificateFile /etc/letsencrypt/live/mywebsite/cert.pem
SSLCertificateKeyFile /etc/letsencrypt/live/mywebsite/privkey.pem
Include /etc/letsencrypt/options-ssl-apache.conf
SSLCertificateChainFile /etc/letsencrypt/live/mywebsite/chain.pem
SSLCompression off

From options-ssl-apache.conf:

SSLProtocol all -SSLv2 -SSLv3
SSLCipherSuite EECDH+AESGCM:EDH+AESGCM:AES256+EECDH:AES256+EDH
SSLHonorCipherOrder on
SSLCompression off

I have checked the website’s log file but found nothing, and nothing here either: /var/log/apache2/error.log. I’m trying to figure out what is causing this error. Any ideas where I can find more info or, even better, how to solve this problem? EDIT: If I try openssl s_client -connect mywebsite.com:443, it returns the output below. I’m using OpenSSL 1.1.0f.

CONNECTED(00000003)
3073276480:error:1408F119:SSL routines:ssl3_get_record:decryption failed or bad record mac:../ssl/record/ssl3_record.c:469:

ANOTHER EDIT: As @quadruplebucky suggested, I changed options-ssl-apache.conf to:

SSLProtocol all -SSLv2 -SSLv3
SSLCipherSuite HIGH:MEDIUM:!aNULL:!MD5:!SSLv3:!SSLv2:!TLSv1
SSLHonorCipherOrder on
SSLCompression off
#SSLSessionTickets off

I also tried adding SSLProtocol all -SSLv2 -SSLv3 to my virtualhost conf file, and at the same time I changed a couple of things in /etc/apache2/mods-available/ssl.conf:

#SSLCipherSuite HIGH:!aNULL
SSLCipherSuite HIGH:MEDIUM:!aNULL:!MD5:!SSLv3:!SSLv2:!TLSv1
SSLHonorCipherOrder on
# The protocols to enable.
# Available values: all, SSLv3, TLSv1, TLSv1.1, TLSv1.2
# SSL v2 is no longer supported
SSLProtocol all -SSLv2 -SSLv3
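One quick way to verify what the server actually negotiates after edits like these is a short Python check from any client machine (the hostname is a placeholder):

import socket
import ssl

HOST = "mywebsite.com"  # placeholder
ctx = ssl.create_default_context()
with socket.create_connection((HOST, 443)) as sock:
    with ctx.wrap_socket(sock, server_hostname=HOST) as tls:
        # Prints e.g. 'TLSv1.2' and the negotiated cipher tuple
        print(tls.version(), tls.cipher())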

EDIT: After changing LogLevel to info, the log shows:

[Sat Jul 08 13:34:53.374307 2017] [ssl:info] [pid 8710] [client] AH02008: SSL library error 1 in handshake (server mywebsite:443)
[Sat Jul 08 13:34:53.374717 2017] [ssl:info] [pid 8710] SSL Library Error: error:140940F4:SSL routines:ssl3_read_bytes:unexpected message
[Sat Jul 08 13:34:53.374750 2017] [ssl:info] [pid 8710] [client] AH01998: Connection closed to child 1 with abortive shutdown (server mywebsite:443)

EDIT: If I run it with the -crlf option, like this:

openssl s_client -crlf -connect mywebsite:443 

I get no error. One more thing: if I change LogLevel to debug, right before that error I see this:

[Tue Jul 11 23:00:38.641568 2017] [core:debug] [pid 26561] protocol.c(1273): [client 188.64.25.162:23165] AH00566: request failed: malformed request line
[Tue Jul 11 23:00:38.641634 2017] [headers:debug] [pid 26561] mod_headers.c(900): AH01503: headers: ap_headers_error_filter()

After this, that same error happens:

SSL Library Error: error:140940F4:SSL routines:ssl3_read_bytes:unexpected message

openssl version
OpenSSL 1.1.0f  25 May 2017

asked Jul 7, 2017 at 20:28 by user134969

It shouldn’t be possible for any endpoint config error to cause ‘bad decrypt or mac’ (alert 20). Is there anything in the network between these clients and this server, like a firewall, IDS, IPS, DLP, or even a ‘smart’ router? Can you test repeatedly (since this may be data dependent and thus quasi-random) connecting from the server itself, or from a machine on the same segment as the server?

Jul 9, 2017 at 5:24

Show the relevant contents of the *:443 virtualhost (the whole virtualhost if needed), also the output of "apachectl -S" so we can rule out misconfiguration.

Jul 10, 2017 at 6:31
The error message complains about SSLv3, yet in your config it’s disabled.
Jul 11, 2017 at 21:08

You say you are using OpenSSL 1.1.0f. Is that on both the server and the machine where you issued the s_client commands above? What platform is your server running on? You might want to see if the error occurs with specific ciphersuites, e.g. what happens if you add "-cipher AES128-SHA" to the end of your s_client command?

Jul 13, 2017 at 10:56

@alexus: function and file names and some literals (ssl3* and SSL3*) in OpenSSL are also used for TLS (1.0 through 1.2) because of the technical similarities between those protocols. user134969: ‘length too short’ also should never be caused by any config. If that’s repeatable, please try to get an s_client -debug (with a plain-RSA -cipher if you didn’t do that on the server) and a Wireshark capture for the same event so we can look at the actual wire data and compare it to what the program sees.

Jul 15, 2017 at 12:25

5 Answers

You can’t tell from that error whether your server is negotiating SSLv3 or TLSv1 (you might want to have a look at this question on Unix & Linux and make sure it’s disabled everywhere in Apache); the 1.1.0f source code on GitHub deliberately blurs the two.

if (enc_err < 0) {
    /*
     * A separate 'decryption_failed' alert was introduced with TLS 1.0,
     * SSL 3.0 only has 'bad_record_mac'. But unless a decryption
     * failure is directly visible from the ciphertext anyway, we should
     * not reveal which kind of error occurred -- this might become
     * visible to an attacker (e.g. via a logfile)
     */
    al = SSL_AD_BAD_RECORD_MAC;
    SSLerr(SSL_F_SSL3_GET_RECORD, SSL_R_DECRYPTION_FAILED_OR_BAD_RECORD_MAC);
    goto f_err;
}

So you might want to reorder your cipher suite.

This post on askubuntu about the POODLE vulnerability has an excellent list of resources for SSL inspection and plumbing.

The "getting this error every fifth click" comment is a little strange. Do you mean clicks, or that every fifth line in the logs is a bad request? Try starting Apache single-threaded (the -X flag) and see if it does the same thing, or maybe try setting SSLSessionTickets off.

My thinking here is to eliminate threading and session/cache coherency as the source of trouble. Running Apache single-threaded (starting it with the -X flag) is one way to accomplish this; another is to set MaxClients=1 (at least with the prefork MPM). Session tickets have been a source of trouble in the past with TLSv1.2, and they are enabled by default; this is the reasoning behind SSLSessionTickets off (note this is part of the TLS "Server Hello" message, not a session cookie or similar). The "every fifth click" error still bothers me: I can’t help but notice that most browsers will pipeline four resource requests over a single connection and open a new connection (a new SSL handshake, etc.) for the fifth. Without a packet capture it’s hard to say what’s actually going on.

It would seem that you’ve eliminated cipher negotiation as a source of error (you can duplicate the error condition under much more restrictive cipher specs, unless I’m mistaken). I would be curious to know if you can trigger the error by renegotiating SSL (just for kicks): run openssl s_client -connect server:443, then type ‘R’ and see what the logs say.
Also see if session caching is working with the -reconnect option to s_client.
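A rough Python equivalent of the s_client -reconnect check, if scripting it is more convenient (the hostname is again a placeholder):

import socket
import ssl

HOST = "mywebsite.com"  # placeholder
ctx = ssl.create_default_context()

# First connection: capture the negotiated session.
with socket.create_connection((HOST, 443)) as s1:
    with ctx.wrap_socket(s1, server_hostname=HOST) as tls1:
        session = tls1.session

# Second connection: offer the saved session for resumption.
with socket.create_connection((HOST, 443)) as s2:
    with ctx.wrap_socket(s2, server_hostname=HOST, session=session) as tls2:
        print("session reused:", tls2.session_reused)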
Something must be different about the receiving contexts for the SSL requests, and it seems the best way to figure that out (short of a byte-by-byte inspection of what’s going over the wire, which might be tough to anonymize) is to severely limit the size of what’s listening (that is, the number of listeners).

Other debugging tools I’d try (assuming posting packet captures is out of the question):
— ssltap (in libnss3-tools on ubuntu)
— cipherscan
— sslscan

UPDATE
Poking at this through ssltap, it looks an awful lot like OpenSSL bug #3712 has resurfaced (key renegotiation during read/write, basically). Looking for a decent workaround that won’t kill performance. Fun stuff!

Fixing “SSL error: decryption failed or bad record mac”

A couple of days ago, I was greeted by a message that a particular API endpoint in our application was returning 500 Server Error. I was curious because I had tests for it and had even called it manually a couple of times while coding. I logged in and immediately saw this error for a line that queries the database: django.db.utils.OperationalError: SSL error: decryption failed or bad record mac. I grabbed the terminal and tried python manage.py shell to see if I could get results back from my models. Ha! I could! This is weird.

The affected part of the application is where the system updates the user’s registration tier. There are three steps in our registration process: (1) the user registers and is assigned tier=initial_reg; (2) the user updates their profile, filling in important information, and is assigned tier=pending; (3) an admin verifies the submitted information and, if it is approved, the user is assigned tier=full_reg. All other endpoints that query the database work properly, and I’m starting to scratch my balls over why it only happens when an admin approves the registration. It also doesn’t help that the endpoint in question works correctly in my local Docker development setup. After a lot of pdb and serious keyboard banging, I finally found the issue.

When an admin approves the registration, the system generates a Membership ID; angels sing, signifying a revered divine event in the user’s life within the app. The ID is in a specific format, and unfortunately I can’t use a UUID. The generated ID is saved in the database, and if an IntegrityError occurs, another one is generated until it can be saved successfully. Membership ID generation can therefore block the request, so I implemented a timeout() utility method which spawns a new process and times out after a specified maximum number of seconds. This is where it all goes awry.
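A hypothetical sketch of that retry loop (the model instance and the generate_membership_id() helper are invented for illustration):

from django.db import IntegrityError

def assign_membership_id(profile):
    # Keep generating IDs until one saves without a uniqueness collision.
    while True:
        profile.membership_id = generate_membership_id()  # hypothetical helper
        try:
            profile.save()
        except IntegrityError:
            continue  # duplicate ID, generate a new one and try again
        return profile.membership_id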

A helper function to limit execution of a function in seconds.
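A minimal sketch of such a helper, assuming the multiprocessing-based approach described below:

import multiprocessing

def timeout(func, args=(), kwargs=None, seconds=10, default=None):
    # Run func in a separate process and give up after `seconds`.
    # Note: a closure target needs a fork start method (the Linux default);
    # spawn-based platforms cannot pickle it.
    kwargs = kwargs or {}
    queue = multiprocessing.Queue()

    def worker(q):
        q.put(func(*args, **kwargs))

    proc = multiprocessing.Process(target=worker, args=(queue,))
    proc.start()
    proc.join(seconds)
    if proc.is_alive():
        proc.terminate()  # the function ran too long
        proc.join()
        return default
    return queue.get()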

The SSL error: decryption failed or bad record mac occurs either when the certificate is invalid or when the message hash value has been tampered with; in our case it’s the latter. Django creates a single database connection the first time it queries. Any subsequent calls to the database reuse this existing connection until it expires or is closed, at which point a new one is automatically created the next time you query. The PostgreSQL engine in Django uses psycopg to talk to the database; according to the documentation it is level 2 thread safe. Unfortunately, the timeout() method uses the multiprocessing module, so the child process shares the parent’s connection and effectively tampers with the SSL MAC. There are different ways to fix this. We can either (1) use plain threads instead of spawning a new process, or (2) use a new database connection in the timeout() method. We could also (3) scrap the timeout() method altogether and handle the async task properly via Celery.
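For options (1) and (2), the key point is that the forked child must never reuse the parent’s connection. One hedged sketch: close Django’s connections in the parent right before spawning, so each process lazily opens its own (this only works outside an open transaction, which is exactly the constraint hit below):

import multiprocessing

from django.db import connections

def run_in_fresh_process(func, *args):
    # Close all (possibly SSL-backed) connections before forking; parent
    # and child then each open their own connection on the next query,
    # instead of interleaving records on one shared SSL session.
    connections.close_all()
    proc = multiprocessing.Process(target=func, args=args)
    proc.start()
    proc.join()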

The timeout() method is simple, generic, and quick to implement for limiting the execution time of a method. We can use it for anything that takes time, not only for code that queries databases, and most of the time spawning a new process is safer than using threads. At this point I also don’t want to introduce and maintain another moving part like Celery; so we’re choosing to use a new database connection to fix the issue! Unfortunately, it isn’t as pretty as it sounds. To create a new database connection, you just close the existing one and Django will create a new one, but this doesn’t work if you are inside a database transaction: it fails with InterfaceError: connection already closed. There’s also no easy way to pass a new database connection when using the models. The closest would be the .using() method, where you specify a database alias. You could probably overload the cached property django.db.connections.databases and add your alias with a new connection, but I wouldn’t really do that! Haha! So we’re left with executing custom SQL directly:

A sample rawSQL query using a new database connection.
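A sketch of the idea, assuming psycopg2 and Django settings, with an invented table and column (the post’s actual SQL will differ):

import psycopg2
from django.conf import settings

def save_membership_id(user_id, membership_id):
    # Open a dedicated connection so the spawned process never touches
    # Django's shared (possibly SSL) connection.
    db = settings.DATABASES["default"]
    conn = psycopg2.connect(
        dbname=db["NAME"],
        user=db["USER"],
        password=db["PASSWORD"],
        host=db["HOST"],
        port=db["PORT"],
    )
    try:
        with conn, conn.cursor() as cur:  # commits on clean exit
            cur.execute(
                "UPDATE app_profile SET membership_id = %s WHERE user_id = %s",
                (membership_id, user_id),
            )
    finally:
        conn.close()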

The issue we talked about could probably have been detected earlier if I had used PostgreSQL, as in production, instead of in-memory SQLite when running the integration tests. I also could’ve tested that it works properly in the dev environment, which uses proper PostgreSQL. Silly me! Overall, the debugging process has kept me busy in this quarantine period while working from home. It was also a little fun and a good learning experience on the way to becoming a better Software Engineer.

Keep safe folks! Let’s come back stronger after this pandemic.

SSL: decryption failed or bad record mac with upstream servers

2012/09/07 20:23:52 [error] 3417#0: *1 SSL_read() failed (SSL: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac) while reading upstream, client: 1.2.3.4, server: _, request: "GET /512K.bin HTTP/1.1", upstream: "https://192.168.1.1:443/512K.bin", host: "test.com"

I can repeat this at will by requesting two 512k files simultaneously. It fails every time. The upstream servers are IIS 7.5.
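For reference, this failure mode is easy to script. A rough Python repro of the two simultaneous 512K downloads (host and path are the reporter’s examples, not real endpoints):

import concurrent.futures
import http.client
import ssl

def fetch(path):
    conn = http.client.HTTPSConnection("test.com",
                                       context=ssl.create_default_context())
    conn.request("GET", path)
    body = conn.getresponse().read()  # an aborted transfer raises ssl.SSLError here
    conn.close()
    return len(body)

with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(fetch, "/512K.bin") for _ in range(2)]
    for f in futures:
        print(f.result())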

I’ve dug into the network, SSL on both sides, looked at captures, upgraded and downgraded openssl and Nginx, etc.

In the end, I seem to have worked around it with:

proxy_buffers 8 32k;

The number doesn’t seem to matter, but any less than 32k and the issue repeats.

Is this something getting ‘lost’ and corrupting the SSL transfer if the buffer isn’t large enough?

Change History (15)

comment:1 by Juan Hoyos , 11 years ago

I can reproduce this issue with three simultaneous requests to the same (~200k) resource. The response is gzipped and truncated.

2013/03/22 18:03:23 [error] 11395#0: *195839 SSL_read() failed (SSL: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac) while reading upstream, client: 123.213.123.123, server: my.host.com, request: "GET /api/resource/1234 HTTP/1.1", upstream: "https://127.0.0.1:4443/resource1234?", host: "my.host.com", referrer: "https://my.host.com/resource/1234"

Single 1.2.4 installation on CentOS, configured with a front server proxying to an SSL upstream server.

Fixed as OP mentioned.


comment:2 by Ziga Mahkovec , 11 years ago

We’re seeing the same issue. We have nginx talking to another upstream nginx over https (both 1.2.7).

Changing proxy_buffer size seems to help, but occasionally we still see a large response (4MB+) causing this issue.

We’ve eliminated a few things while debugging this:

  • keepalive upstream (we tried with and without)
  • http 1.0 and http 1.1
  • with and without range request support
  • with and without SSL session reuse
  • different number of workers/upstream servers

comment:3 by Michel Samia , 10 years ago

Please paste the minimal config that reproduces this issue. I had a similar issue when proxy_buffering was off and sendfile was on and I downloaded the same file twice in parallel: it closed one connection prematurely (only a small part of the file was transferred). When I disabled sendfile it started to work correctly.

comment:4 by Agent Coulson , 10 years ago

I am seeing the same problem. Here is my config; my upstream server has been obfuscated, but I am able to reproduce it consistently.

error_log logs/error.log debug;

include mime.types;
default_type application/octet-stream;

ssl_protocols SSLv3 TLSv1;
ssl_ciphers RC4:HIGH:!aNULL:!MD5;
ssl_prefer_server_ciphers on;

proxy_redirect off;
proxy_read_timeout 10s;
proxy_connect_timeout 6s;

proxy_buffering off;
proxy_buffer_size 64k;
proxy_buffers 6 16k;
proxy_busy_buffers_size 80k;

proxy_pass_header Server;
proxy_pass_header Date;
proxy_pass_header X-Pad;

proxy_set_header Connection "Keep-Alive";
proxy_set_header Host "upstream.srv";

comment:5 by Maxim Dounin , 10 years ago

sensitive: → 0

Just for the record, see this thread for additional information. This seems to be a bug in OpenSSL 1.0.0+ related to SSL_MODE_RELEASE_BUFFERS; it needs additional investigation.

comment:6 by Aleksey Samsonov , 10 years ago

I found a bug in OpenSSL 1.0.0+. This patch solved the problem in my tests.
I can reproduce this issue on nginx changeset 64d4837c9541 (OpenSSL commit f3a3903).


comment:7 by Maxim Dounin , 10 years ago

This also seems related:

It points to an OpenSSL ticket from 2010 with an identical patch:

comment:8 by Maxim Dounin , 10 years ago

Just for the record:

This seems to be already fixed by many/most major OS vendors (at least OpenBSD, FreeBSD, Debian and Ubuntu were reported; most notably, Red Hat seems missing), and will be fixed in the next OpenSSL releases on all affected branches.

comment:9 by Maxim Dounin , 10 years ago

Resolution: → invalid
Status: new → closed

OpenSSL 1.0.1h with the fix is out; closing this.

comment:10 by shifty35@… , 7 years ago

Resolution: invalid
Status: closed → reopened

I’m experiencing this same issue again using the latest nginx (1.11.5) and the latest OpenSSL (1.1.0b) while reverse proxying websockets.

I’m assuming this is a regression in OpenSSL, as using OpenSSL 1.0.1x doesn’t fail in the same way?

It is also very difficult to track down, as there are zero error logs about the closed connections; the nginx error log must be put into debug mode to see the SSL_read() failure.

comment:11 by Maxim Dounin , 7 years ago

Resolution: → invalid
Status: reopened → closed

As you can see directly from the ticket description, SSL_read() errors are logged at the error level. And yes, as long as previous versions of the OpenSSL library don’t fail, this is a regression in OpenSSL and should be reported to OpenSSL; reopening this ticket doesn’t make sense.

comment:12 by bblack.wikimedia.org@… , 7 years ago

This probably deserves a new ticket at least, but in any case: I’ve observed the same error as reported recently above (nginx 1.11.x + OpenSSL 1.1.0b, reverse-proxy use case). No websockets here, just normal HTTP/2 client traffic into an HTTP/1.1 upstream backend. It takes a fair number of bytes and/or HTTP/2 streams to trigger reliably (my test case has been a page that loads ~500 images over an HTTP/2 connection, totaling several megabytes).

It’s not reported at the "error" level. For me it’s reported at the "info" level (so debug mode isn’t necessary to see it, but still):

2016/10/27 18:00:39 [info] 44966#44966: *132434 SSL_read() failed (SSL: error:1408F119:SSL routines:ssl3_get_record:decryption failed or bad record mac) while processing HTTP/2 connection, client: .

I’ve done a ton of testing with tweaking various nginx parameters related to buffering and buffer sizes, but I’m always able to reproduce the issue to varying degrees on live servers. ssl_buffer_size seems to have a notable impact on the bug behavior (it gets much harder to repro at smaller sizes). We’ve had reports of the issue from a wide variety of disparate client browsers.

For the moment I’m assuming an OpenSSL-1.1.x regression, but it may be one that depends on exactly how nginx is using the API for buffer management and such, and may be fixable on the nginx side? We’ll try to get some deeper/real debugging done on a reproduction soon.

comment:13 by Maxim Dounin , 7 years ago

It’s not reported at the "error" level. For me it’s reported at the "info" level.

In your case, errors are reported at the "info" level because these are errors happening on a connection to a client. Such errors can easily be triggered by incorrect client behaviour and hence are reported at the "info" level. The initial report was about errors happening on SSL connections to upstream servers, hence the difference.

comment:14 by bblack.wikimedia.org@… , 7 years ago

Update: taking some cues from older OpenSSL bug reports, I tried commenting out nginx’s calls to SSL_CTX_set_mode(ssl->ctx, SSL_MODE_RELEASE_BUFFERS) and SSL_CTX_set_read_ahead(ssl->ctx, 1), and this stopped my bug repro. I think that puts this more firmly in OpenSSL bug territory and starts giving some ideas of where to look.

comment:15 by bblack.wikimedia.org@… , 7 years ago

FYI, in case anyone else searches up this ticket: filed upstream with OpenSSL at https://github.com/openssl/openssl/issues/1799
