One of my most frequently visited posts is "Retrieving Oracle patches with wget." Recently, a commenter on that post asked about using wget to download software from the Oracle Technology Network (OTN). Downloading from OTN with wget is a little complicated, because there are cookies involved. This post discusses how to extract the relevant cookies from Firefox and Google Chrome, and how to use those cookies with wget to retrieve files. If you manage to wade through my excessive exposition, there's even a script at the end that you might find useful.
First things first: This is not a new idea. Pythian's Marc Fielding wrote an excellent article a long time ago about using wget and the text-based browser lynx to download files from OTN. His article is the reason that I even knew this was possible; I'm just expanding on the concept to include other browsers. If you have lynx installed on your server, you can stop reading this post, go read Marc's post instead, and you're all set. In fact, it's a good idea to read Marc's article anyway, because I'm going to reference it a few times in this post.
Overview and assumptions
As you've read in Marc's article (you did go to read that, right?), the basic process to set up downloads from OTN with wget is as follows:
1. Authenticate to OTN with your username and password. You need to do this for cookies to be set properly. No workarounds or backdoors here; you have to sign in.
2. Navigate to the relevant page in the Downloads section for the software to be downloaded.
3. Click the radio button to accept the license agreement. This is necessary to reveal the download links so you can copy the URL to feed to wget. Again, no backdoors, no workarounds.
4. Copy the URLs of the files you want to download.
5. Extract the cookies created during your browser session into a text file.
6. Invoke wget with the --load-cookies option to download the desired files.
By now, you might be thinking, "If it's that easy, why is the scroll bar next to this post so long?" The answer is that step 5 is somewhat involved.
I'm going to assume a few things. First, we'll be working with either Firefox or Google Chrome, because they store their cookies in similar ways. Safari uses XML to store cookies, and Internet Explorer is just an abomination ;-), so they're beyond the scope of this post. My second assumption is that you don't need help with steps 1-4 above. We're going to jump right into extracting cookies from Firefox and Chrome.
Update, 25-Apr-2010: Here's one last opportunity to stop reading and start doing. If you're a Firefox user, and not averse to using plugins in your browser, you could install a plugin like Cookie Exporter to extract cookies to a properly formatted text file. Also, notwithstanding my crack about IE, I have read that it's possible to export IE's cookies in the correct format without having to resort to any special shenanigans. I had intended to point out both of these things in my original post, but instead I launched right into the SQLite discussion below. Sorry about that; the editing department has been sternly reprimanded.
Lynx very conveniently exports its cookies to a text file, in the format expected by wget (the classic "Netscape cookies.txt" spec). Firefox and Chrome, on the other hand, save cookies in a SQLite database file. So, I guess that's it, right? Cookies locked up in some goofy binary file. Not going to be able to get those out. Guess I'm out of luck.
Oh. Wait. I do database stuff. There's hope!
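Before pulling anything out of SQLite, it helps to know the target. As I understand the cookies.txt format, each line carries seven tab-separated fields: domain, a TRUE/FALSE subdomain flag, path, a TRUE/FALSE secure flag, expiry in Unix seconds, name, and value. Here is a minimal sketch that prints one such line. The function name cookie_line is made up for illustration, and deriving the subdomain flag from a leading dot is a common convention that I am assuming here, not something wget documents in detail.

```sh
# Print one cookie in Netscape cookies.txt format (seven tab-separated fields).
# Arguments: $1 domain, $2 path, $3 secure (TRUE|FALSE),
#            $4 expiry in Unix seconds, $5 name, $6 value
cookie_line() {
    # Second field: TRUE when the domain starts with a dot, i.e. it also
    # applies to subdomains (assumed convention).
    case $1 in
        .*) subdomains=TRUE ;;
        *)  subdomains=FALSE ;;
    esac
    printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
        "$1" "$subdomains" "$2" "$3" "$4" "$5" "$6"
}

# Example using a harmless cookie name and value:
cookie_line .oracle.com / FALSE 1303012327 ORA_UCM_VER '%200'
```

This is the shape of line that the SQLite extraction has to produce for each cookie.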
The first thing to do is to install a SQLite client. Instructions are beyond the scope of this post, but you can download precompiled binaries of the sqlite3 command-line client from the SQLite site. If you're a Mac user, you're in luck: sqlite3 should already be installed in /usr/bin.
The next step is to locate the cookies database file for your browser. For Firefox, it should be cookies.sqlite in your default profile directory, and for Chrome, it should be the file creatively named Cookies in your profile directory. If you're really stuck, and are running Linux or OS X, you should be able to find the location by issuing the following command while your browser is running:
lsof | grep -i cookies
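If the browser isn't running, searching by file name is another option. The sketch below is only an illustration: find_cookie_dbs is a hypothetical helper, and it assumes the database files still use the names mentioned above (cookies.sqlite for Firefox, Cookies for Chrome).

```sh
# List candidate cookie databases under a directory (default: $HOME).
# Matches the file names mentioned above; other files are ignored.
find_cookie_dbs() {
    find "${1:-$HOME}" -type f \( -name cookies.sqlite -o -name Cookies \) 2>/dev/null
}
```

On a Mac, something like find_cookie_dbs "$HOME/Library/Application Support" keeps the search short. Quote the directory, since these paths tend to contain spaces.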
Here's an example of what the cookies look like in the SQLite database in Firefox:
zathras:~ jpiwowar$ sqlite3 /Users/jpiwowar/Library/Application\ Support/Firefox/Profiles/ftpzvxjz.default/cookies.sqlite
SQLite version 3.6.10
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .schema moz_cookies
CREATE TABLE moz_cookies (id INTEGER PRIMARY KEY, name TEXT, value TEXT, host TEXT, path TEXT,expiry INTEGER, lastAccessed INTEGER, isSecure INTEGER, isHttpOnly INTEGER);
sqlite> select *
   ...> from moz_cookies
   ...> where name like 'ORA_UCM%'
   ...> ;
1271476327342462|ORA_UCM_INFO|3~70969please_dont_copy_my_cookies|.oracle.com|/|1303012327|1271477696515030|0|0
1271476327343576|ORA_UCM_VER|%200|.oracle.com|/|1303012327|1271477696515030|0|0
1271476327344121|ORA_UCM_SRVC|no_really_just_use_your_own|.oracle.com|/|1303012327|1271477696515030|0|0
The name of the cookies table in Firefox is moz_cookies. At the time of this post, the OTN cookies needed for this exercise have names starting with ORA_UCM. The name of the SQLite table in the Chrome cookies database is simply 'cookies'. You'll note that some of the column names are also different:
zathras:~ jpiwowar$ sqlite3 /Users/jpiwowar/Library/Application\ Support/Chromium/Default/Cookies
SQLite version 3.6.10
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .schema cookies
CREATE TABLE cookies (creation_utc INTEGER NOT NULL UNIQUE PRIMARY KEY,host_key TEXT NOT NULL,name TEXT NOT NULL,value TEXT NOT NULL,path TEXT NOT NULL,expires_utc INTEGER NOT NULL,secure INTEGER NOT NULL,httponly INTEGER NOT NULL,last_access_utc INTEGER NOT NULL);
Of course, a simple "select *" from the cookies table is not going to produce a usable cookies file to use with wget. The cookies.txt spec requires 7 tab-, not pipe-, separated fields, and the expiration timestamp for Chrome cookies is in an unusual format. Rather than take up even more space in this post, I've written a shell script, OTNcookies.sh (it'll probably download as .txt, not .sh), that performs the work of extracting the cookies in the proper format. Here's a quick overview of the script:
- Takes two arguments: the first (required) is the location of the cookies database file, and the second (optional) is the name of the text file containing the exported cookies.
- Determines the type of cookie database (Firefox or Chrome)
- Runs an appropriate query to extract the cookies to the text file specified by the user, or to a default output file.
- Tries to accommodate spaces in file and directory names, but you will still need to escape (or quote) any spaces in the paths that you pass to the script.
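For the curious, the steps above can be sketched roughly as follows. This is not the actual OTNcookies.sh, just an illustration of the same idea. It assumes the two schemas shown earlier, and it assumes that Chrome's expires_utc counts microseconds since 1601-01-01 UTC (11644473600 seconds before the Unix epoch), while Firefox's expiry is already in Unix seconds. The function names are made up for this example.

```sh
# Convert a Chrome expires_utc value to Unix seconds.
# Assumption: expires_utc is microseconds since 1601-01-01 00:00:00 UTC.
chrome_to_unix() {
    echo $(( $1 / 1000000 - 11644473600 ))
}

# Guess the browser from the table names in the database file $1.
cookie_db_type() {
    tables=$(sqlite3 "$1" .tables)
    case $tables in
        *moz_cookies*) echo firefox ;;
        *cookies*)     echo chrome ;;
        *)             echo unknown ;;
    esac
}

# Write the ORA_UCM cookies from database $1 to the text file $2
# (default: cookies.txt), one tab-separated line per cookie.
export_otn_cookies() {
    db=$1 out=${2:-cookies.txt}
    case $(cookie_db_type "$db") in
        firefox)
            sql="SELECT host,
                        CASE WHEN host LIKE '.%' THEN 'TRUE' ELSE 'FALSE' END,
                        path,
                        CASE isSecure WHEN 0 THEN 'FALSE' ELSE 'TRUE' END,
                        expiry, name, value
                 FROM moz_cookies WHERE name LIKE 'ORA_UCM%';" ;;
        chrome)
            # Same timestamp arithmetic as chrome_to_unix, done in SQL.
            sql="SELECT host_key,
                        CASE WHEN host_key LIKE '.%' THEN 'TRUE' ELSE 'FALSE' END,
                        path,
                        CASE secure WHEN 0 THEN 'FALSE' ELSE 'TRUE' END,
                        expires_utc / 1000000 - 11644473600, name, value
                 FROM cookies WHERE name LIKE 'ORA_UCM%';" ;;
        *)
            echo "Unrecognized cookie database: $db" >&2
            return 1 ;;
    esac
    # -separator sets the field delimiter; here it is a literal tab.
    sqlite3 -separator "$(printf '\t')" "$db" "$sql" > "$out"
}
```

Passing the query as an argument to sqlite3 keeps the sketch non-interactive, and quoting "$db" everywhere is what lets paths such as "Application Support" survive.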
As with any script you grab from the Intertubes, please exercise caution. This isn't designed to do anything malicious or damaging, but that doesn't mean you couldn't do something silly with it (like, say, running it as root and overwriting critical system files with cookies). Wouldn't that be embarrassing? Yes, but it would be your fault, not mine.
Here's a sample run of the OTNcookies script:
zathras:~ jpiwowar$ OTNcookies.sh /Users/jpiwowar/Library/Application\ Support/Firefox/Profiles/ftpzvxjz.default/cookies.sqlite /var/tmp/FFcookies.txt
Looks like Firefox.
OTN cookies written to /var/tmp/FFcookies.txt
After extracting the cookies, it's a simple matter of invoking wget with the --load-cookies option, as documented in Marc's article.
zathras:~ jpiwowar$ wget --load-cookies=/var/tmp/FFcookies.txt http://download.oracle.com/otn/mac/instantclient/10204/instantclient-basic-10.2.0.4.0-macosx-x86.zip
--2010-04-22 17:00:39-- http://download.oracle.com/otn/mac/instantclient/10204/instantclient-basic-10.2.0.4.0-macosx-x86.zip
Resolving download.oracle.com... 126.96.36.199, 188.8.131.52
Connecting to download.oracle.com|184.108.40.206|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 34109360 (33M) [application/zip]
Saving to: `instantclient-basic-10.2.0.4.0-macosx-x86.zip'
10% [===> ] 3,563,808 545K/s eta 65s
A final note: The cookie file should be reusable, so you don't need to regenerate one for each file you want to download. The cookies will expire eventually, however, at the time indicated in the expiry or expires_utc field. At that point, you'll need to go through the whole process again, starting with logging in to OTN to create new cookies.
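To make that reuse a little more convenient, here is a rough sketch that checks the exported file for expired cookies and then hands wget a list of URLs in one go. Both function names are hypothetical, and the URL list is assumed to be a plain text file with one URL per line. The only wget options used are --load-cookies and --input-file.

```sh
# Succeed (exit status 0) if any cookie in the cookies.txt file $1 has
# already expired. Field 5 of each line is the expiry time in Unix seconds.
cookies_expired() {
    awk -F '\t' -v now="$(date +%s)" \
        'NF == 7 && $5 + 0 < now + 0 { expired = 1 } END { exit !expired }' "$1"
}

# Download every URL listed in file $2 (one per line) using cookie file $1.
otn_fetch_all() {
    if cookies_expired "$1"; then
        echo "Cookies in $1 have expired; sign in to OTN and export them again." >&2
        return 1
    fi
    wget --load-cookies="$1" --input-file="$2"
}
```

For example, otn_fetch_all /var/tmp/FFcookies.txt urls.txt would fetch everything listed in urls.txt with the cookie file exported earlier.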