Mplayer ripping
From FBSD_tips
The way that conventional media entities are transitioning to the "new media" (basically internet technology applied to mass media) is simultaneously exciting and maddening. Exciting because of the new options this gives us to enjoy our favorite programming. Maddening because the half steps they have taken are designed to hold on to the traditional distribution paradigms, with even more draconian controls than were imposed on the older media. I once heard computer security described as "preventing you from being able to use your computer"; I believe this situation follows the same tenet. One thing I find particularly maddening: I can record from my radio and no one bats an eye, but if I want to record an "internet broadcast" they are seemingly ready to take legal action. So here is a method of ripping internet streams. As always, I neither condone nor advise breaking the laws of wherever you are; this is merely the set of steps it would take to do ...
So, one day I hear part of a Donald Knuth interview on NPR (National Public Radio) in my car. Later that day I look it up on the web. Sure enough, I can listen to the full interview as a stream, but I can't put it on my MP3 player (PUBLIC radio, indeed!). However, many of the transport protocols HAVE been reverse engineered, and one of the most capable pieces of software for reading them is MPlayer.
So, to brass tacks ... Job #1 is to extract the URL from all the HTML/JavaScript/wrapper files/etc. Here is the page link:
http://www.npr.org/templates/story/story.php?storyId=4532247
So when we click on the "Listen" icon in the upper left of the page a popup window appears. Here I use a Firefox plugin called Unplug that pulls media URLs out of pages. Using it I get this mess (w/o the escaped line breaks):
http://www.npr.org/templates/dmg/dmg_wmref_em.php?id=4532314&type=1&date=14-Mar-2005&au=1&\
pid=12006253&random=2062402431&guid=000F331CE82D070B4989C9B861626364&upf=mac&splayer=sp&mtype=WM&\
ssid=&topicName=Health___Science&subtopicName=Health___Science&prgCode=ME&hubId=4111499&\
thingId=4532247&tableModifier=
No, brave scholar, we are not done yet, not by half. Now we have to download the file that URL points to. I will omit most of the ugliness from above in the next step. I had to use -o <filename> because the file name that fetch tried to derive from that URL could not be created on a UFS filesystem.
fetch -o URL.txt "http://www.npr.org/templates/dmg/dmg_wmref_em.php?id=4532314& ... "
Now the contents of URL.txt bring us one step closer to our goal. Again I inserted escaped line breaks for legibility's sake.
# cat URL.txt
<asx version = "3.0">
<ENTRYREF href="http://www.npr.org/templates/dmg/dmg_em.php?id=4532314&type=1&date=14-Mar-2005&au=1&\
pid=12006253&random=2062402431&guid=000F331CE82D070B4989C9B861626364&upf=mac&splayer=sp&\
mtype=WM&ssid=&topicName=Health___Science&subtopicName=Health___Science&prgCode=ME&\
hubId=4111499&thingId=4532247&tableModifier=&mswmext=.asx" />
<asx>
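Copying that URL out by hand gets old fast. Here is a minimal sketch of pulling it out with standard tools, assuming the real file keeps each href on a single line (the escaped line breaks above are only mine). The function name entryref_href is my own invention, not part of fetch or mplayer.

```shell
#!/bin/sh
# Print the href of each ENTRYREF element in an ASX playlist.
# entryref_href is a hypothetical helper name, used here for illustration only.
entryref_href() {
    # Put each tag on its own line, keep the ENTRYREF tags (any case),
    # then strip everything except the quoted href value.
    tr '>' '\n' < "$1" |
        grep -i '<entryref' |
        sed 's/.*[Hh][Rr][Ee][Ff] *= *"\([^"]*\)".*/\1/'
}

# Possible usage, assuming URL.txt was saved as in the step above:
#   fetch -o URL2.txt "$(entryref_href URL.txt)"
```

The quotes around the command substitution matter: the URL is full of & characters that the shell would otherwise treat as "run in the background".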
Think this is the pot of gold yet? Think again! One more level of indirection.
fetch -o URL2.txt "http://www.npr.org/templates/dmg/dmg_em.php?id=4532314&type=1&date=14-Mar-2005&\
au=1&pid=12006253&random=2062402431&guid=000F331CE82D070B4989C9B861626364&upf=mac&splayer=sp&\
mtype=WM&ssid=&topicName=Health___Science&subtopicName=Health___Science&prgCode=ME&\
hubId=4111499&thingId=4532247&tableModifier=&mswmext=.asx"
And we'll cat this hairball too now.
# cat URL2.txt
<asx version = "3.0">
<title>NPR's Morning Edition - Monday, March 14, 2005</title>
<abstract>more info at : NPR's Morning Edition Web site</abstract>
<moreinfo href="http://www.npr.org/templates/rundowns/rundown.php?prgId=3" />
<PARAM name="ShowPlayList" value="true"/>
<entry>
<param name="track" value="Array['SEG_NUM'][0]" />
<title>Donald Knuth, Founding Artist of Computer Science</title>
<author>NPR's Morning Edition - Monday, March 14, 2005</author>
<copyright>(c) 2007 NPR</copyright>
<ref href="mms://wm.npr.org/wm.npr.na-central/npr/me/2005/03/20050314_me_06.wma?v1st=&mt=1&\
primaryTopic=1007&assignedTopics=1019,1007,1024,1021,1001&aggIds=4111499" / >
<ref href="http://wm.npr.org/wm.npr.na-central/npr/me/2005/03/20050314_me_06.wma?v1st=&mt=1&\
primaryTopic=1007&assignedTopics=1019,1007,1024,1021,1001&aggIds=4111499" />
</entry>
</asx>
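The playlist offers the same stream under two schemes. A rough sketch of picking one automatically follows. It assumes each ref element sits on one line in the real file, and that preferring mms:// over http:// is what you want, which is my guess rather than any rule. The names ref_hrefs and pick_stream are made up for this example.

```shell
#!/bin/sh
# List the href of every <ref> element in an ASX playlist, one per line.
# ref_hrefs and pick_stream are hypothetical helper names.
ref_hrefs() {
    # Split on '>' so each tag is on its own line, keep only <ref ...> tags
    # (this skips <moreinfo href=...>), then keep just the quoted href value.
    tr '>' '\n' < "$1" |
        grep -i '<ref ' |
        sed 's/.*[Hh][Rr][Ee][Ff] *= *"\([^"]*\)".*/\1/'
}

# Print the first mms:// URL if there is one, otherwise the first URL found.
pick_stream() {
    urls=$(ref_hrefs "$1")
    mms=$(printf '%s\n' "$urls" | grep -i '^mms://' | head -n 1)
    if [ -n "$mms" ]; then
        printf '%s\n' "$mms"
    else
        printf '%s\n' "$urls" | head -n 1
    fi
}

# Possible usage, assuming URL2.txt was saved as in the step above:
#   mplayer "$(pick_stream URL2.txt)"
```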
NOW we have something that looks like a media URL! Pick one and try to play it in mplayer to see if we are getting a media stream now.
mplayer "mms://wm.npr.org/wm.npr.na-central/npr/me/2005/03/20050314_me_06.wma?v1st=&mt=1&\
primaryTopic=1007&assignedTopics=1019,1007,1024,1021,1001&aggIds=4111499"
Now, if mplayer can play it, mplayer can save it. I offer two examples of saving. The first converts to WAV file format on the fly:
mplayer -ao pcm:waveheader:file=stream.wav \
"mms://wm.npr.org/wm.npr.na-central/npr/me/2005/03/20050314_me_06.wma?\
v1st=&mt=1&primaryTopic=1007&assignedTopics=1019,1007,1024,1021,1001&aggIds=4111499"
This second alternative writes the stream in its native file format (WMA here), so the dump still has to be decoded to WAV before lame can encode it:
mplayer -dumpstream -dumpfile stream.wma \
"mms://wm.npr.org/wm.npr.na-central/npr/me/2005/03/20050314_me_06.wma?\
v1st=&mt=1&primaryTopic=1007&assignedTopics=1019,1007,1024,1021,1001&aggIds=4111499"
Now we have our stream saved as a WAV file on the hard disk (via the first method). It is a simple matter for an MP3 encoder like lame to make an MP3 of it. I use a constant bitrate of 128 kbit/s, as my MP3 player barfs on variable bitrates.
lame --preset cbr 128 stream.wav stream.mp3
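If you do this more than once, the capture and encode steps can be glued together. This is only a sketch: stream_name and rip_stream are names I made up, and it assumes the last path component of the URL makes a sensible local file name (no colons or commas, which would confuse mplayer's -ao option parsing).

```shell
#!/bin/sh
# Derive a local base name from a stream URL.
# stream_name and rip_stream are hypothetical helper names.
stream_name() {
    name=${1%%\?*}                # drop the query string ("?v1st=..." onward)
    name=${name##*/}              # drop scheme, host and directories
    printf '%s\n' "${name%.*}"    # drop the extension (.wma here)
}

# Capture a stream to WAV, then encode it to a 128 kbit/s CBR MP3,
# using the same mplayer and lame invocations shown above.
rip_stream() {
    base=$(stream_name "$1")
    mplayer -ao pcm:waveheader:file="$base.wav" "$1" &&
        lame --preset cbr 128 "$base.wav" "$base.mp3"
}

# Possible usage (quote the URL so the shell leaves the & characters alone):
#   rip_stream "mms://wm.npr.org/ ... /20050314_me_06.wma?v1st=&mt=1& ... "
```

The && means lame only runs if mplayer exits cleanly, so a failed capture does not get encoded.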
As you can see, circumventing all the popups and goo that go into making a "user friendly" experience can take some sleuthing and creativity, but can yield satisfactory results.
