Discussion:
10 second stalls when retrieving pages from washingtonpost.com
John Brayton
2018-11-09 20:53:57 UTC
Permalink
$ time curl -i "https://www.washingtonpost.com/politics/trump-says-he-doesnt-know-his-new-acting-ag-hasnt-talked-to-him-about-russia-probe/2018/11/09/c3f00922-e429-11e8-b759-3d88a5ce9e19_story.html?hpid=hp_hp-top-table-main_trumpwhitaker-11a%3Ahomepage%2Fstory-ans"
HTTP/2 302
server: AkamaiGHost
content-length: 0
location: https://www.washingtonpost.com/politics/trump-says-he-doesnt-know-his-new-acting-ag-hasnt-talked-to-him-about-russia-probe/2018/11/09/c3f00922-e429-11e8-b759-3d88a5ce9e19_story.html?hpid=hp_hp-top-table-main_trumpwhitaker-11a%3Ahomepage%2Fstory-ans&noredirect=on
cache-control: private, max-age=0
expires: Fri, 09 Nov 2018 20:16:05 GMT
date: Fri, 09 Nov 2018 20:16:05 GMT
set-cookie: wp_devicetype=0; expires=Sun, 09-Dec-2018 20:16:05 GMT; path=/; domain=.washingtonpost.com
set-cookie: rplampr=3|20181008; expires=Sat, 09-Nov-2019 20:16:05 GMT; path=/; domain=.washingtonpost.com
content-security-policy: upgrade-insecure-requests
8.16 real 0.02 user 0.01 sys
Opening the page in a browser is fast, and retrieving the page with wget takes less than 0.4 seconds.
$ curl --version
curl 7.61.0 (x86_64-apple-darwin17.7.0) libcurl/7.61.0 OpenSSL/1.0.2o zlib/1.2.11 nghttp2/1.32.0
Release-Date: 2018-07-11
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtsp smb smbs smtp smtps telnet tftp
Features: AsynchDNS IPv6 Largefile NTLM NTLM_WB SSL libz TLS-SRP HTTP2 UnixSockets HTTPS-proxy
$ curl --version
curl 7.58.0 (x86_64-pc-linux-gnu) libcurl/7.58.0 OpenSSL/1.1.0g zlib/1.2.11 libidn2/2.0.4 libpsl/0.19.1 (+libidn2/2.0.4) nghttp2/1.30.0 librtmp/2.3
Release-Date: 2018-01-24
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtmp rtsp smb smbs smtp smtps telnet tftp
Features: AsynchDNS IDN IPv6 Largefile GSS-API Kerberos SPNEGO NTLM NTLM_WB SSL libz TLS-SRP HTTP2 UnixSockets HTTPS-proxy PSL
time curl -i -H "User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_1) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/12.0.1 Safari/605.1.15" \
"https://www.washingtonpost.com/politics/trump-says-he-doesnt-know-his-new-acting-ag-hasnt-talked-to-him-about-russia-probe/2018/11/09/c3f00922-e429-11e8-b759-3d88a5ce9e19_story.html?hpid=hp_hp-top-table-main_trumpwhitaker-11a%3Ahomepage%2Fstory-ans"
Things I have tried include:

* Forcing HTTP/2 and forcing HTTP/1.1
* Forcing IPv4 and forcing IPv6
* Adding an empty "Expect" header
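As a sketch, those variants look something like this (the homepage URL is used here as a stand-in for the full article URL above):

```shell
# Stand-in URL; substitute the full article URL from the report above.
URL="https://www.washingtonpost.com/"

time curl -sS -o /dev/null --http1.1 "$URL"     # force HTTP/1.1
time curl -sS -o /dev/null --http2   "$URL"     # force HTTP/2
time curl -sS -o /dev/null -4 "$URL"            # force IPv4
time curl -sS -o /dev/null -6 "$URL"            # force IPv6
# -H "Expect:" suppresses the Expect header entirely;
# -H "Expect;" would send it with an empty value.
time curl -sS -o /dev/null -H "Expect:" "$URL"
```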

Any suggestions would be very helpful to me. Thank you.

John
-----------------------------------------------------------
Unsubscribe: https://cool.haxx.se/list/listinfo/curl-users
Etiquette: https://curl.haxx.se/mail/etiquette.html
Daniel Stenberg
2018-11-17 11:04:22 UTC
Permalink
Post by John Brayton
I am trying to retrieve pages from washingtonpost.com using curl and am
seeing 8-10 second stalls. For example, this command takes between 8 and 10
seconds.
Any suggestions would be very helpful to me. Thank you.
There's no doubt that this is the server's choice or flaw.

If you add -v and --trace-time to the command line you can see clearly that
the server is taking a long time to respond in these cases.
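For reference, a minimal version of that invocation (any of the stalling URLs will do in place of the homepage used here):

```shell
# -v prints the protocol exchange; --trace-time prefixes every trace
# line with a timestamp, so the gap before the response headers
# arrive stands out clearly.
curl -v --trace-time -o /dev/null \
  "https://www.washingtonpost.com/"
```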

I also tried with different curl versions and TLS backends and I couldn't
avoid the stall.

Curious! Time to fire up wireshark and see if that can teach us something
here...
--
/ daniel.haxx.se
Ray Satiro
2018-11-17 21:56:05 UTC
Permalink
Post by Daniel Stenberg
I am trying to retrieve pages from washingtonpost.com using curl and
am seeing 8-10 second stalls. For example, this command takes between
Any suggestions would be very helpful to me. Thank you.
There's no doubt that this is the server's choice or flaw.
If you add -v and --trace-time to the command line you can see clearly
that the server is taking a long time to respond in these cases.
I also tried with different curl versions and TLS backends and I
couldn't avoid the stall.
Curious! Time to fire up wireshark and see if that can teach us
something here...
It's AkamaiGHost, which I'm sure is caching the pages based on certain
header fields (usually the user agent). Unfortunately it doesn't respond
to debug requests, so I can't say for sure. Nothing can be done about
this except sending headers that already match a cached copy. I suggest
the user agent of the Firefox ESR version, since I think that would be
most likely to keep matching over the longest period of time.

--compressed --user-agent "Mozilla/5.0 (Windows NT 6.1; rv:60.0)
Gecko/20100101 Firefox/60.0"

If you do that you may receive compressed content, which is why I
included --compressed, so it's automatically decompressed. Also, here is
an old ESR version that seems to work just as well: "Mozilla/5.0
(Windows NT 6.1; rv:38.0) Gecko/20100101 Firefox/38.0"
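Put together, the original request with the suggested flags would look something like this (article URL abbreviated to the hostname here; the output filename is just an example):

```shell
# Firefox 60 ESR user agent plus --compressed, so a gzip-encoded
# response body is decoded automatically before it is written out.
curl --compressed \
     --user-agent "Mozilla/5.0 (Windows NT 6.1; rv:60.0) Gecko/20100101 Firefox/60.0" \
     -o article.html \
     "https://www.washingtonpost.com/"
```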
John Brayton
2018-11-18 16:40:42 UTC
Permalink
Thank you, Ray and Daniel.

John
