The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests

  • My work

  • My badges

  • Twitter Updates

  • My Flickr Stream

    20140508-Delphi-2007--Project-Options--Cannot-Edit-Application-Title-HelpFile-Icon-Theming

    20140430-Fiddler-Filter-Actions-Button-Run-Filterset-now

    20140424-Windows-7-free-disk-space

    More Photos
  • Pages

  • All categories

  • Enter your email address to subscribe to this blog and receive notifications of new posts by email.

    Join 1,722 other followers

sed: convert Google Drive urls to direct download ones

Posted by jpluimers on 2017/03/14

RegEx Fu

RegEx Fu

One of the things after moving most of my things from copy.com to Google Drive was the direct (public) download URLs that copy.com provides. DropBox has them as well, but Google Drive lacks them in the UI.

There is a URL format that does allow for direct download though:

While Google aims for Drive to be a competent Dropbox competitor, there’s one small but key feature that isn’t easy: sharing direct download links. Fortunately, you can create your own.

Source: Share Direct Links to Files in Google Drive and Skip the Web Viewer

You can do a similar replacement for Google Doc URLs: How to Create Direct Download Links for Files on Google Drive

The Google Drive conversion seems straightforward as they convert from either of

https://drive.google.com/file/d/FILE_ID/edit?usp=sharing
https://drive.google.com/file/d/FILE_ID/view
https://drive.google.com/open?id=FILE_ID

to

https://drive.google.com/uc?export=download&id=FILE_ID

There are tons of RegEx examples for doing the first conversion at Regex to modify Google Drive shared file URL – Stack Overflow, but

  1. they don’t cover the two conversions
  2. they use the non-greedy (.*?) capturing groups which are tricky, introduce question mark escaping issues in hash and many sed implementations fail to implement non-greedy

Since I’m a command-line person, I’ve opted for a sed conversion that wasn’t in the above list. I choose sed because it allows you to convert either a line or a complete file at one time.

There are a few indispensable resources to get my regex expressions right:

So here it goes, starting with fixing https://drive.google.com/open?id=FILE_ID as it’s the most simple replacement because the FILE_ID is at the end.

First of all, these code fragments below are part of bash functions as bash functions remove the quoting hell you have with bash aliases.

Where bash aliases have no parameters (i.e. the arguments are put after the end of the expansion), functions have parameters. So if you want to pass all function parameters to a command inside a function, you have to use “$@” to pass all parameters.

This fragment fixes https://drive.google.com/open?id=FILE_ID printing each fix on one line using the p for printing command in sed:

sed -n 's@https://drive.google.com/open?id=@https://drive.google.com/uc?export=download\&id=@p' "$@"

A few remarks:

The second fragment fixes https://drive.google.com/file/d/FILE_ID/edit?usp=sharing and https://drive.google.com/file/d/FILE_ID/view again printing each fix:

sed -n 's@https://drive.google.com/file/d/\([^.]*\)/.*@https://drive.google.com/uc?export=download\&id=\1@p' "$@"

Some more remarks:

  • The FILE_ID is obtained from a capturing group during the match using \([^.]*\) and using the value in the replace with \1 as reference.
  • There is backslash escaping of the parentheses because that’s the sed way.
  • I’ve used a non-greedy \(.*?\) capturing group (sed can’t do that) but \([^.]*\)/ which matches any non-slash inside the capturing group until the first slash outside that group.

The final part is combing both replacement into one sed command:

sed 's@https://drive.google.com/open?id=@https://drive.google.com/uc?export=download\&id=@;s@https://drive.google.com/file/d/\([^.]*\)/.*@https://drive.google.com/uc?export=download\&id=\1@' "$@"

Final remarks:

–jeroen

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

 
%d bloggers like this: