Kartikaya Gupta
2018-10-30 00:34:28 UTC
Hello curl users,
I was recently writing a script to download a JSON file with curl, and discovered that the server was sending the file with 'Content-Encoding: gzip'. The downloaded file therefore had to be gunzip'd before it was usable. Other similar JSON files from the same server were *not* being similarly encoded, so I couldn't just pipe the result through gunzip unconditionally. After some searching online, I found [1] which said to use the --compressed argument, and sure enough that resolved my problem.
The documentation for --compressed says that it makes curl *request* a compressed response, which is not quite the same as just decompressing the received response. In practice --compressed does both: it requests a compressed response, and it automatically decompresses the response if needed. I only need the latter, so using the --compressed flag seems semantically incorrect for my purpose, even though it works.
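To make the "decompress if needed" half concrete, here is a rough Python sketch of what I mean. It is only an illustration, not how curl implements it: the function name decode_body is made up, it assumes the caller has already read the Content-Encoding header value, and it only handles gzip.

```python
import gzip
from typing import Optional


def decode_body(body: bytes, content_encoding: Optional[str]) -> bytes:
    """Undo gzip content-coding only when the response header declares it.

    Hypothetical helper for illustration; not part of curl or libcurl.
    """
    if not content_encoding:
        # No Content-Encoding header: the body is already usable as-is.
        return body

    # The header may list several codings in the order they were applied,
    # so they have to be undone in reverse order.
    codings = [c.strip().lower() for c in content_encoding.split(",") if c.strip()]
    for coding in reversed(codings):
        if coding in ("gzip", "x-gzip"):
            body = gzip.decompress(body)
        elif coding == "identity":
            continue
        else:
            # Assumption: anything else is left for the caller to handle.
            raise ValueError(f"unsupported content-coding: {coding}")
    return body


if __name__ == "__main__":
    plain = b'{"ok": true}'
    # A gzip-encoded response and a plain one both come out the same.
    assert decode_body(gzip.compress(plain), "gzip") == plain
    assert decode_body(plain, None) == plain
```

Note that nothing here touches the request side: no Accept-Encoding header is involved, which is exactly the part of --compressed I don't need.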
I also looked at the relevant HTTP spec [2], which says (paraphrasing) that a request without an Accept-Encoding header means the server can send any Content-Encoding in response. Personally, I think that if the client side is capable of decoding the encoding, it should attempt to do so, as that provides the most useful default. Otherwise it's up to the user of curl to check for encodings and explicitly decompress them, which seems like an easy pitfall to fall into.
Does anybody have examples where turning on automatic content-decoding would adversely impact the use case? Any other comments on changing the default behaviour here? I'm also curious to know whether other people have run into this problem before. According to Daniel [3] the current behaviour is something that has just been inherited down through the ages, and so it's possible there's no strong argument for keeping it the way it is.
[1] https://stackoverflow.com/questions/8364640/how-to-properly-handle-a-gzipped-page-when-using-curl
[2] https://tools.ietf.org/html/rfc7231#section-5.3.4
[3] https://github.com/curl/curl/issues/3192#issuecomment-434124116
-----------------------------------------------------------
Unsubscribe: https://cool.haxx.se/list/listinfo/curl-users
Etiquette: https://curl.haxx.se/mail/etiquette.h