
RegEx Fu
One of the things I missed after moving most of my files from copy.com to Google Drive was the direct (public) download URLs that copy.com provides. Dropbox has them as well, but Google Drive lacks them in the UI.
There is a URL format that does allow for direct download though:
While Google aims for Drive to be a competent Dropbox competitor, there’s one small but key feature that isn’t easy: sharing direct download links. Fortunately, you can create your own.
Source: Share Direct Links to Files in Google Drive and Skip the Web Viewer
You can do a similar replacement for Google Doc URLs: How to Create Direct Download Links for Files on Google Drive
The Google Drive conversion seems straightforward, as it converts from any of
https://drive.google.com/file/d/FILE_ID/edit?usp=sharing
https://drive.google.com/file/d/FILE_ID/view
https://drive.google.com/open?id=FILE_ID
to
https://drive.google.com/uc?export=download&id=FILE_ID
There are tons of RegEx examples for doing the first conversion at Regex to modify Google Drive shared file URL – Stack Overflow, but
- they don’t cover both conversions
- they use non-greedy (.*?) capturing groups, which are tricky: they introduce question mark escaping issues in bash, and many sed implementations do not support non-greedy matching
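To illustrate the non-greedy problem, here is a minimal sketch on a made-up input string. The function names `first_part_lazy` and `first_part_class` are hypothetical and only exist for this demonstration. In sed's basic regular expressions a bare ? is a literal question mark, so the "lazy" pattern simply fails to match, whereas a negated character class behaves the same everywhere.

```shell
# In basic regular expressions a bare ? is a literal question mark,
# so .*? is not a non-greedy match and this pattern finds nothing to replace.
first_part_lazy() { sed 's@\(.*?\)/.*@\1@'; }

# A negated character class stops at the first slash instead.
first_part_class() { sed 's@\([^/]*\)/.*@\1@'; }

echo 'a/b/c' | first_part_lazy    # prints a/b/c (unchanged)
echo 'a/b/c' | first_part_class   # prints a
```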
Since I’m a command-line person, I’ve opted for a sed conversion that wasn’t in the above list. I chose sed because it allows you to convert either a single line or a complete file in one go.
There are a few indispensable resources for getting my regular expressions right:
So here it goes, starting with fixing https://drive.google.com/open?id=FILE_ID as it’s the simplest replacement because the FILE_ID is at the end.
First of all, the code fragments below are part of bash functions, because functions avoid the quoting hell you get with bash aliases.
Bash aliases have no parameters (i.e. the arguments simply end up after the expanded alias), whereas functions do. So if you want to pass all function parameters to a command inside a function, you have to use "$@".
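As a minimal sketch of how "$@" behaves inside a function (the name `show_args` is hypothetical, just for illustration):

```shell
# Hypothetical helper: prints each argument it receives on its own line.
show_args() {
  printf '%s\n' "$@"
}

# "$@" keeps each argument intact, including embedded spaces,
# so this prints two lines rather than three.
show_args "first file.txt" second.txt
```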
This fragment fixes https://drive.google.com/open?id=FILE_ID, printing each fixed line by combining the -n option with the p (print) flag in sed:
sed -n 's@https://drive.google.com/open?id=@https://drive.google.com/uc?export=download\&id=@p' "$@"
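A quick way to try this fragment is to wrap it in a function and feed it a line on standard input. This is only a sketch: the function name `fix_open_url` is hypothetical, FILE_ID is a placeholder, and the example.com line is made up to show that non-matching lines are not printed.

```shell
# Hypothetical wrapper around the fragment above.
fix_open_url() {
  sed -n 's@https://drive.google.com/open?id=@https://drive.google.com/uc?export=download\&id=@p' "$@"
}

# Only the Drive URL matches; -n suppresses the other line.
printf '%s\n' \
  'https://drive.google.com/open?id=FILE_ID' \
  'https://example.com/not-a-drive-link' |
  fix_open_url
# prints https://drive.google.com/uc?export=download&id=FILE_ID
```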
A few remarks:
The second fragment fixes https://drive.google.com/file/d/FILE_ID/edit?usp=sharing and https://drive.google.com/file/d/FILE_ID/view, again printing each fix:
sed -n 's@https://drive.google.com/file/d/\([^/]*\)/.*@https://drive.google.com/uc?export=download\&id=\1@p' "$@"
Some more remarks:
- The FILE_ID is obtained from a capturing group during the match using \([^/]*\), and the value is used in the replacement with \1 as the reference.
- The parentheses are backslash-escaped because that’s the sed way.
- I’ve not used a non-greedy \(.*?\) capturing group (sed can’t do that) but \([^/]*\)/, which matches any non-slash characters inside the capturing group up to the first slash outside that group.
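To check the capturing group on both URL forms, a small sketch like the following can help. The function name `fix_file_url` is hypothetical and FILE_ID is a placeholder; the capture group is written as a character class that excludes the slash, matching the description above.

```shell
# Hypothetical wrapper; [^/]* stops at the first slash after the ID.
fix_file_url() {
  sed -n 's@https://drive.google.com/file/d/\([^/]*\)/.*@https://drive.google.com/uc?export=download\&id=\1@p' "$@"
}

# Both inputs should yield the same direct download URL.
printf '%s\n' \
  'https://drive.google.com/file/d/FILE_ID/edit?usp=sharing' \
  'https://drive.google.com/file/d/FILE_ID/view' |
  fix_file_url
```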
The final part is combining both replacements into one sed command:
sed 's@https://drive.google.com/open?id=@https://drive.google.com/uc?export=download\&id=@;s@https://drive.google.com/file/d/\([^/]*\)/.*@https://drive.google.com/uc?export=download\&id=\1@' "$@"
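Wrapped in a function, the combined command could look like the sketch below. The name `gdrive_direct` is hypothetical and FILE_ID is a placeholder; the capture group uses a character class that excludes the slash. Since this version has no -n option and no p flag, lines that don't match pass through unchanged.

```shell
# Hypothetical function name; the sed expression combines both substitutions.
gdrive_direct() {
  sed 's@https://drive.google.com/open?id=@https://drive.google.com/uc?export=download\&id=@;s@https://drive.google.com/file/d/\([^/]*\)/.*@https://drive.google.com/uc?export=download\&id=\1@' "$@"
}

# Convert URLs from standard input; the last line is not a Drive URL
# and is printed as-is.
printf '%s\n' \
  'https://drive.google.com/open?id=FILE_ID' \
  'https://drive.google.com/file/d/FILE_ID/view' \
  'some other line' |
  gdrive_direct
```

Because the function passes "$@" on to sed, it should also accept one or more file names as arguments instead of standard input.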
Final remarks:
–jeroen