The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests

  • My badges

  • Twitter Updates

  • My Flickr Stream

  • Pages

  • All categories

  • Enter your email address to subscribe to this blog and receive notifications of new posts by email.

    Join 1,778 other followers

Archive for the ‘wget’ Category

wget and curl: downloads that sometimes fail

Posted by jpluimers on 2018/10/19

For my archive somewhere between cURL 7.21.0 and 7.34.0 it does not like to be started from an RDP based tsclient share:

C:\Users\jeroen\Downloads>\\tsclient\bin\curl.7.21.0.exe --remote-name https://www.xs4all.nl/index.html
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 86465    0 86465    0     0  60805      0 --:--:--  0:00:01 --:--:-- 70012

C:\Users\jeroen\Downloads>\\tsclient\bin\curl.7.34.0.exe --remote-name https://www.xs4all.nl/index.html
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (6) Could not resolve host: web.archive.org

C:\Users\jeroen\Downloads>\\tsclient\bin\curl.7.61.0.exe --remote-name https://www.xs4all.nl/index.html
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (6) Could not resolve host: www.xs4all.nl

C:\Users\jeroen\Downloads>copy \\tsclient\bin\curl.7.61.0.exe
        1 file(s) copied.

C:\Users\jeroen\Downloads>curl.7.61.0.exe --remote-name https://www.xs4all.nl/index.html
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    13    0    13    0     0     10      0 --:--:--  0:00:01 --:--:--    10

It fails the same way after net use B: \\tsclient\bin, so that does not matter.

The best link I could find until I got to the real problem was [WayBack] curl: (6) Could not resolve host: application – Stack Overflow which shows a different problem: properly quoting.

In addition to remote-name, you can also grab the file name from the headers using --remote-header-name, and --remote-time use the remote file time. The --location follows 302-redirects. You can see that in the example below which I build based on

[WayBack] unix – Curl to grab remote filename after following location – Stack Overflow: The remote side sends the filename using the Content-Disposition header.curl 7.21.2 or newer does this automatically if you specify –remote-header-name / -J.curl -O -J -L $url

C:\Users\jeroen\Downloads>b:\curl.7.21.0.exe --location --remote-name --remote-time --remote-header-name "https://web.archive.org/web/20180712073755if_/https://www.danielwolf.eu/?wpdmdl=1965"
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 86465    0 86465    0     0  45748      0 --:--:--  0:00:01 --:--:-- 50772
curl: Saved to filename 'pkgWuppdiWP_DX102T_1-1-2.zip'

wget failed big time:

C:\Users\jeroen\Downloads>B:\wget.exe --no-check-certificate -v -v -v --content-disposition --restrict-file-names=windows "https://web.archive.org/web/20180712073755if_/https://www.danielwolf.eu/?wpdmdl=1965"
wget: Cannot read b:/.wgetrc (No such file or directory).
--2018-07-12 09:55:23--  https://web.archive.org/web/20180712073755if_/https://www.danielwolf.eu/?wpdmdl=1965
Resolving web.archive.org... 207.241.225.186
Connecting to web.archive.org|207.241.225.186|:443... failed: Invalid argument.
Retrying.

...

--2018-07-12 09:55:23--  (try:20)  https://web.archive.org/web/20180712073755if_/https://www.danielwolf.eu/?wpdmdl=1965
Connecting to web.archive.org|207.241.225.186|:443... failed: Invalid argument.
Giving up.

This is not caused by the filename (Windows does not like the ? question mark in output file names, so  – like & ampersand in file URLs – you have to quote the full URL, but also provide the --restrict-file-names=windows parameter; see [WayBack] wget – I can’t download files with “?” – Super User).

–jeroen

Posted in *nix, *nix-tools, cURL, Power User, wget | Leave a Comment »

https://altd.embarcadero.com/ TLS certificate does not match domain name

Posted by jpluimers on 2018/09/07

One of the domains not yet monitored at embarcaderomonitoring.wiert.me, was the altd download server for ISOs and installers on http and https level. Ultimately you want https, as most of these are about installers, so you do not want any man-in-the-middle to fiddle with them.

TLS on altd fails

Upitmerobot is not yet smart enough to check validity of TLS certificates on https connections.

Chrome, Firefox, Safari, Internet Explorer, wget, curl and ssllabs however are.

altd hides as much from itself as possible

Uptimerobot did not like monitoring the plain http://altd.embarcadero.com/ and https://altd.embarcadero.com/ URLs, because the altd is not browsable, so it tries to hide most of its structure from access. This means they both return an odd response:

Those responses are actually 404 errors (note the - minus sign after curl --trace-ascii: it sends the trace to stdout):

$ wget http://altd.embarcadero.com/
--2018-09-05 10:44:23-- http://altd.embarcadero.com/
Resolving altd.embarcadero.com (altd.embarcadero.com)... 88.221.144.40, 88.221.144.10
Connecting to altd.embarcadero.com (altd.embarcadero.com)|88.221.144.40|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
2018-09-05 10:44:23 ERROR 404: Not Found.

$ curl --verbose http://altd.embarcadero.com/
*   Trying 88.221.144.40...
* TCP_NODELAY set
* Connected to altd.embarcadero.com (88.221.144.40) port 80 (#0)
> GET / HTTP/1.1
> Host: altd.embarcadero.com
> User-Agent: curl/7.54.0
> Accept: */*
> 
< HTTP/1.1 404 Not Found
< Server: Apache
< Content-Type: text/html; charset=iso-8859-1
< Content-Length: 16
< Date: Wed, 05 Sep 2018 08:45:57 GMT
< Connection: keep-alive
< 
* Connection #0 to host altd.embarcadero.com left intact
File not found."

$ curl --trace-ascii - http://altd.embarcadero.com/
== Info:   Trying 88.221.144.40...
== Info: TCP_NODELAY set
== Info: Connected to altd.embarcadero.com (88.221.144.40) port 80 (#0)
=> Send header, 84 bytes (0x54)
0000: GET / HTTP/1.1
0010: Host: altd.embarcadero.com
002c: User-Agent: curl/7.54.0
0045: Accept: */*
0052: 
<= Recv header, 24 bytes (0x18)
0000: HTTP/1.1 404 Not Found
<= Recv header, 16 bytes (0x10)
0000: Server: Apache
<= Recv header, 45 bytes (0x2d)
0000: Content-Type: text/html; charset=iso-8859-1
<= Recv header, 20 bytes (0x14)
0000: Content-Length: 16
<= Recv header, 37 bytes (0x25)
0000: Date: Wed, 05 Sep 2018 08:47:19 GMT
<= Recv header, 24 bytes (0x18)
0000: Connection: keep-alive
<= Recv header, 2 bytes (0x2)
0000: 
<= Recv data, 16 bytes (0x10)
0000: File not found."
File not found."== Info: Connection #0 to host altd.embarcadero.com left intact

This is also the reason that WayBack does not want to archive that link, but it can be archived at [Archive.ishttps://altd.embarcadero.com/.

Luckily, a Google search for site:altd.embarcadero.com revealed there is a non-installer file short enough (~72 kibibytes) for Uptime robot to check, so it now verifies it can access these:

–jeroen

Read the rest of this entry »

Posted in *nix, *nix-tools, cURL, Encryption, HTTPS/TLS security, Monitoring, Power User, Security, Uptimerobot, wget | Leave a Comment »

Use TLS 1.2 or higher, as TLS 1.1 is phased out on many sites, after TLS 1.0/SLL has been disabled by most for a while now

Posted by jpluimers on 2018/07/23

If you get an error like this in one of your tools

OpenSSL: error:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert protocol version

it means you are using a tool not yet properly supporting TLS 1.2 or higher.

Or in other words: update your tool set.

The reason is that – after turning off TLS 1.0 a while ago – more and more sites do the same for TLS 1.1.

A prime example of a site that warned on this in a clear way very early on is github:

Others have done this too, for instance:

TLS 1.0 is vulnerable to many attacks, and certain configurations of TLS 1.1 as well (see for instance [WayBack] What are the main vulnerabilities of TLS v1.1? – Information Security Stack Exchange), which means that properly configuring the non-vulnerable TLS 1.1 over times gets more and more complex. An important reason to say goodbye to that as well, as TLS 1.2 (from 2008) is readily available for a long time. The much more recent TLS 1.3 (from 2018) will take a while to proliferate.

I ran in the above error because on one of my systems, an old version of wget was luring around, so I dug up the easiest place to download recent Windows binaries for both win32 (x86) and win64 (x86_64):

[WayBack] eternallybored.org: GNU Wget for Windows having a table indicating the OpenSSL version for each wget build.

–jeroen

Reference: Transport Layer Security – Wikipedia: History and development

Posted in *nix, https, HTTPS/TLS security, OpenSSL, Power User, Security, wget | Leave a Comment »

HowTo: Wget Command Examples

Posted by jpluimers on 2017/04/28

HowTo: Wget Command Examples – Wget is a free software package for retrieving files using HTTP, HTTPS and FTP, the most widely-used Internet protocols

Source: HowTo: Wget Command Examples [WayBack]

I totally forgot about

  • the -c switch that continues an aborted download.
  • the -r -A combination to only download certain file types.

–jeroen

Posted in *nix, *nix-tools, Power User, wget | Leave a Comment »

Simple download script to get all FREE Microsoft eBooks again! Including: Windows 10, Office 365, Office 2016, Power BI, Azure, Windows 8.1, Office 2013, SharePoint 2016, SharePoint 2013, Dynamics CRM, PowerShell, Exchange Server, System Center, Cloud, SQL Server and more! | Microsoft Director of Sales Excellence – Eric Ligman

Posted by jpluimers on 2016/12/21

I saw a lot of people mention the Eric Ligman, Microsoft Director of Sales Excellence Blog a while ago: FREE! That’s Right, I’m Giving Away MILLIONS of FREE Microsoft eBooks again! Including: Windows 10, Office 365, Office 2016, Power BI, Azure, Windows 8.1, Office 2013, SharePoint 2016, SharePoint 2013, Dynamics CRM, PowerShell, Exchange Server, System Center, Cloud, SQL Server and more! | Microsoft Director of Sales Excellence – Eric Ligman

Even though I make most of my income from the Microsoft World, my main machine is a Mac and I dislike browsing

So I wanted to download them all in one easy go so SpotLight could index them.

Luckily there is a file that has all the download URLs in it: http://ligman.me/29zpthb (it expands to http://www.mssmallbiz.com/ericligman/Key_Shorts/MSFTFreeEbooks.txt and is archived at http://web.archive.org/web/*/http://www.mssmallbiz.com/ericligman/Key_Shorts/MSFTFreeEbooks.txt).

The file is mentioned at How to “Download All” of the FREE eBooks and Resources in My FREE eBooks Giveaway | Microsoft Director of Sales Excellence – Eric Ligman.

It’s very easy to download from there using wget (on Windows get it from https://eternallybored.org/misc/wget/ – the x64 versions work fine).

Be sure to use a “recent” version as 1.12 and lower have no support for the --trust-server-names parameter which makes wget use filenames from the http 301 followed links:

--trust-server-names
     If this is set to on, on a redirect the last component of the redirection URL will be used as the local file name.  By default it is used the last component in the original URL.

This is the script:

wget -m -np --trust-server-names http://ligman.me/29zpthb
wget -m -np --trust-server-names --input-file http://www.mssmallbiz.com/ericligman/Key_Shorts/MSFTFreeEbooks.txt

You might think why not do just wget -m -np --trust-server-names --input-file http://ligman.me/29zpthb  in one go?

Simple answer: like all software, wget occasionally crashes somewhere in the middle of downloading the URLs embedded.

If you restart, then it sees the followed http://ligman.me/29zpthb file has already been downloaded and won’t re-scan its contents.

Bug? I’m not sure. But the two-liner just works.

–jeroen

PS: if you want a script with all the URLs, try these:

https://news.ycombinator.com/item?id=12071552

 

 

Posted in *nix, .NET, Development, Office, Power User, Software Development, wget, Windows | 2 Comments »

 
%d bloggers like this: