Discussion:
URL parsing for the command line tool?
David Niklas
2018-08-15 23:48:47 UTC
On Thu, 9 Aug 2018 15:57:32 +0200 (CEST), Daniel Stenberg wrote:
Hi,
In a future libcurl release, there is going to be a URL
parsing/handling API [1] added. It can parse URLs, allow access to the
individual parts of a URL, set or update specific parts, and "merge" a
relative URL onto an absolute one.
Cool.
Do you think the command line tool would benefit from offering these
services somehow? If so, how? Would you like curl to help your scripts
or shells in this regard?
if (( want_shell || want_script )) && (( IN_LATEST_COOL_CURL_VERSION )); then
    printf "Yes.\n\n"
fi


Yes.
Extract the different parts from a URL? Maybe like this?
in="https://example.com/index.html"
curl --url-input "$in" --url-out "%{host} %{path}\n"
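Until an option like --url-out exists, the host and path can be split off with plain shell parameter expansion. A rough sketch, not a full RFC 3986 parser (no userinfo, port, query or fragment handling):

```shell
# Crude URL splitting with parameter expansion only.
in="https://example.com/index.html"
no_scheme="${in#*://}"    # example.com/index.html
host="${no_scheme%%/*}"   # example.com
path="/${no_scheme#*/}"   # /index.html
printf '%s %s\n' "$host" "$path"   # example.com /index.html
```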
Or like this:
export base="https://curl.haxx.se" # Base URL.
export in2="download.html" # New part. Notice that the / is missing.
export in3="#Win64" # Something I added.

curl --append-urls $base $in2 $in3

Or

curl --base-url $base $in2 $in3
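Neither --append-urls nor --base-url exists today; one way to get the same "merge" behaviour right now is to borrow urljoin from Python (python3 assumed to be on the path):

```shell
# Merge a relative part onto a base URL following RFC 3986 resolution.
base="https://curl.haxx.se"
merged=$(python3 -c 'import sys; from urllib.parse import urljoin; print(urljoin(sys.argv[1], sys.argv[2]))' "$base" "download.html")
printf '%s\n' "$merged"   # https://curl.haxx.se/download.html
```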
Maybe change parts of a URL and output the new version?
curl --url-input "$in" --url-replace "host=example.org" --url-out "%{url}\n"
Please try something more real world (even contrived). Why would
you want to change hosts? Your example makes no sense to me.
Perhaps applying a relative path onto an absolute URL and showing the
resulting path part?
curl --url-input $in --url-new "../index?nooo" --url-out "%{path}\n"
<snip>
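For what it's worth, --url-input/--url-new/--url-out are all hypothetical flags; the relative-reference resolution itself can be sketched today with Python's urllib from the shell (python3 assumed available):

```shell
# Resolve a relative reference against an absolute URL and print the
# resulting path part; urljoin/urlsplit follow RFC 3986 resolution.
in="https://example.com/index.html"
path=$(python3 -c 'import sys; from urllib.parse import urljoin, urlsplit; print(urlsplit(urljoin(sys.argv[1], sys.argv[2])).path)' "$in" "../index?nooo")
printf '%s\n' "$path"   # /index
```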

export in1="../download.html"

for i in DragonFlyBSD FreeBSD NetBSD; do
    curl --base-url "$base" --output-url - "$in1" "#" "$i" | ./download_curl.sh
done

As a reward for your work on curl I award you the extra
large virtual happy face.

<@ @>
v

Sorry, I don't have anything more exciting...

David
-----------------------------------------------------------
Unsubscribe: https://cool.haxx.se/list/listinfo/curl-users
Etiquette: https://curl.haxx
Daniel Stenberg
2018-08-16 13:45:46 UTC
Please try something more real world (even contrived). Why would you want to
change hosts? Your example makes no sense to me.
I was just trying to provoke thoughts and ideas; I didn't have any particular
use case in mind.

Features we can consider:

1. Host-specific sections in config files, so .curlrc can specify, for example,
one user-agent to use only when connecting to example.com and a different
user-agent for example.org.
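A sketch of what such a section might look like; the syntax here is entirely made up, nothing like it exists in .curlrc today:

```
# Hypothetical .curlrc with host-specific sections (invented syntax):
user-agent = "default-agent/1.0"

[host example.com]
user-agent = "agent-for-example-com/1.0"

[host example.org]
user-agent = "agent-for-example-org/1.0"
```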

2. Command-line variables based on the most recently used URL. If you want to
save the output from a download in a directory named after the host name, with
the file name part also taken from the URL:

curl https://example.com/file -o "%{url_host}/%{url_file}"
export in1="../download.html"
for i in DragonFlyBSD FreeBSD NetBSD; do
    curl --base-url "$base" --output-url - "$in1" "#" "$i" | ./download_curl.sh
done
Or just a way to apply a relative URL on the absolute one before it is used:

curl http://example.org/foo --rel-url "../here/it/is" -O

That could be fun for those who download an HTML page and want to download
something that is pointed to with a relative URL within it.

curl "$url" > raw.html
extract_hrefs
for i in $all_hrefs; do
    curl "$url" --rel-url "$i" -O
done
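extract_hrefs above is pseudocode; a crude stand-in with grep and sed (it breaks on multi-line tags and single-quoted attributes, but is fine for a sketch) could be:

```shell
# Pull href values out of a saved page; naive, regex-based.
printf '<a href="/a.html">a</a>\n<a href="b.png">b</a>\n' > raw.html
all_hrefs=$(grep -o 'href="[^"]*"' raw.html | sed 's/^href="//;s/"$//')
printf '%s\n' "$all_hrefs"
```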
As a reward for your work on curl I award you the extra large virtual happy
face.
Thank you!
--
/ daniel.haxx.se