Download ALL The Music

Given a file containing a list of songs, one per line, in the format “Artist – Song Title”, download the audio of the first YouTube video link in a Google search for that song. This is quite useful if you want the MP3 for every song you ever gave a thumbs up on Pandora. On my computer, this averages about 4 songs a minute.

The Requests API and BeautifulSoup make writing screenscrapers and automating the web really clean and easy.


# Takes a list of song titles, one per line, in the format "artist - song", and searches
# for each song on Google. The first YouTube link is passed off to youtube-dl, which
# downloads it and extracts the MP3. There is no explicit throttling because (in theory)
# the conversion step takes enough time to provide it.

import re
from subprocess import call

import requests
from bs4 import BeautifulSoup

def queryConverter(videoURL):
	# Hand the video URL to youtube-dl and have it extract the audio as MP3
	call(["youtube-dl", "--extract-audio", "--audio-format", "mp3", videoURL])

def queryGoogle(songTitle):
	reqPreamble = ""  # base URL of the search endpoint goes here
	reqData = {'q': songTitle}
	r = requests.get(reqPreamble, params=reqData)
	if r.status_code != 200:
		print("Failed to issue request to {0}".format(r.url))
		return None
	bs = BeautifulSoup(r.text, "html.parser")
	# Result links that point at YouTube videos contain "watch" in their href
	tubelinks = bs.find_all("a", attrs={'href': re.compile("watch")})
	if len(tubelinks) > 0:
		# The href wraps the real URL; keep everything from "https" up to the first "&"
		match = re.search("https[^&]*", tubelinks[0]['href'])
		if match is not None:
			return requests.utils.unquote(match.group(0))
	print("No video for {0}".format(songTitle))
	return None

if __name__ == "__main__":
	with open("./all_pandora_likes", 'r') as inFile:
		for line in inFile:
			# Drop the trailing newline so it does not end up in the query
			videoURL = queryGoogle(line.strip())
			if videoURL is not None:
				queryConverter(videoURL)

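The URL-extraction step can be tried on its own, without touching the network. This is a minimal sketch: the function name `extract_video_url` and the sample href are made up for illustration, and the real search results may wrap links differently.

```python
import re
from urllib.parse import unquote

def extract_video_url(href):
    """Return the decoded video URL embedded in a result link, or None."""
    # Keep everything from "https" up to (not including) the first "&"
    match = re.search(r"https[^&]*", href)
    if match is None:
        return None
    # Percent-escapes such as %3F and %3D become "?" and "="
    return unquote(match.group(0))

# Hypothetical redirect-style href, invented for this example
sample_href = "/url?q=https://www.youtube.com/watch%3Fv%3DEXAMPLE_ID&sa=U"
print(extract_video_url(sample_href))
# prints https://www.youtube.com/watch?v=EXAMPLE_ID
```

Returning None when nothing matches lets the caller skip a song instead of crashing on it.
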
Playatech started charging for their plans

Unfortunately for burners, you can no longer download Playatech’s plans for their furniture without paying them first. They used to offer the plans as free downloads, and then asked that you donate some small amount if you used them.

Unfortunately for Playatech, they left all the PDFs in a world-readable directory. The command line below gets the index of that directory, finds all the lines with “pdf” in them, gets the file names out using cut, and then downloads each file.

for file in `wget -qO- | grep pdf | cut -d '>' -f 2 | cut -d '"' -f 2`; do wget$file; done

"Weaponized" Quadcopters

For a long time, I’ve been thinking it would be possible to strap a small explosively-formed penetrator (EFP) to a quadcopter. Then you feed the GPS coordinates of your enemy’s apartment or office into the on-board navigation system, and the quadcopter flies over to their place and fires a hypersonic slug of copper through their window.

Leaving aside the ethical concerns, there are a couple of issues with this. The main one for asymmetric warfare enthusiasts is that it destroys your quadcopter and leaves bits of it at the scene, which wastes resources and gives clues to whoever you were trying to shoot.

Then I saw this little post over at Hackaday. If you put a high wattage diode laser on a quadcopter, you can have it set fire to things. It could probably shoot through a glass window and set fire to things on the inside of the window. Once the place is nicely in flames, you just fly the ‘copter away again, leaving no trace.

Microcontroller fun

The Arduino has a huge hobbyist-level codebase and lots of libraries for talking to various devices.

The 8051 is a venerable old processor that still gets used in lots of stuff because it’s cheap and has a low gate count.

It’s probably possible to port a lot of the Arduino stuff (everything that doesn’t use specific on-chip features) to the 8051, thus allowing people to use the software environment they are comfortable with on a new chip. The same is likely true of PICs, and other chips.

The general case, then, is to create a translation system that automates, as much as possible, the process of porting the Arduino libraries and environment from one chip to another. This is, at a high level, possible because anything a computer with a Turing-complete instruction set can do, any other computer with a Turing-complete instruction set can also do. The hang-up would be the limitations of real hardware (there’s a lot of cool stuff in there, but no infinite data/instruction tape).
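
As a toy illustration of what one tiny piece of such a translator might look like, the sketch below rewrites Arduino-style `digitalWrite` calls into port-bit assignments. Everything specific here is an assumption: the pin table `PIN_MAP` is invented, and the `P3_5`-style names only stand in for whatever the target compiler's 8051 headers actually provide.

```python
import re

# Hypothetical mapping from Arduino pin numbers to 8051 (port, bit) pairs
PIN_MAP = {0: (1, 0), 1: (1, 1), 13: (3, 5)}

# Matches calls such as digitalWrite(13, HIGH)
CALL = re.compile(r"digitalWrite\(\s*(\d+)\s*,\s*(HIGH|LOW)\s*\)")

def translate_line(line):
    """Rewrite digitalWrite calls in one source line as port-bit assignments."""
    def repl(match):
        pin = int(match.group(1))
        if pin not in PIN_MAP:
            raise KeyError("no 8051 mapping for Arduino pin {}".format(pin))
        port, bit = PIN_MAP[pin]
        level = 1 if match.group(2) == "HIGH" else 0
        return "P{}_{} = {}".format(port, bit, level)
    return CALL.sub(repl, line)

print(translate_line("digitalWrite(13, HIGH);"))
# prints P3_5 = 1;
```

Anything that leans on chip-specific peripherals (timers, ADCs, hardware serial) would need far more than a lookup table, which is where the real work would be.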

So, about that re-distillation prevention on PDFs…

I got a PDF of my transcript from a university I used to attend. I won’t say which, but it’s in Salisbury and not very imaginatively named. The transcript had a variety of password-protection baloney on it, which I didn’t want.

Stripping the password is easy if you know it, because you can open the PDF in Acrobat Reader, select “print”, then “print to file”, and print to a PostScript file (with a .ps extension). That removes the password protection, giving you a clear PostScript file, which, in theory, you can distill back to PDF.

Unfortunately, the postscript file is marked as unredistillable (if that’s even a word). There is a fix for this, though: the exact opposite of what is described here.

Find the lines that say:

%ADOBeginClientInjection: DocumentSetup Start "No Re-Distill"
%% Removing the following eleven lines is illegal, subject to the Digital Copyright Act of 1998.
mark currentfile eexec
%ADOEndClientInjection: DocumentSetup Start "No Re-Distill"

and cut them out of the file. Yes, the file says this is illegal, and it might even be true, but that’s how it’s done.

Cross-system chat log sync

I created a directory called “pidgin_logs” in my Dropbox folder, backed up my logs, removed the old log directory, and then created a link from where the log directory should have been to the Dropbox directory. Pidgin still starts up fine, and after I do this on all of my systems, they will all log chats to my Dropbox account.

I’m not sure I’m comfortable with the level of trust that this places in Dropbox, but it will be very convenient.

The commands to do it are:

cd ~/Dropbox/
mkdir pidgin_logs
cd ~/.purple/
mv logs logs_backup
ln -sfn ~/Dropbox/pidgin_logs ~/.purple/logs
mv logs_backup/aim logs/
mv logs_backup/jabber logs/
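
For doing this on several machines, the same steps could be scripted. The following is a rough sketch, not a tested migration tool: the function name `link_logs` is made up, it assumes a system where symlinks are available, and it does not merge logs if the shared folder already has a directory with the same name. The demonstration runs in a throwaway temporary directory rather than a real home directory.

```python
import os
import shutil
import tempfile

def link_logs(purple_dir, dropbox_dir):
    """Move existing logs into dropbox_dir/pidgin_logs and symlink purple_dir/logs to it."""
    shared = os.path.join(dropbox_dir, "pidgin_logs")
    logs = os.path.join(purple_dir, "logs")
    os.makedirs(shared, exist_ok=True)
    if os.path.islink(logs):
        return shared  # already linked, nothing to do
    if os.path.isdir(logs):
        # Move each protocol directory (aim, jabber, ...) into the shared folder
        for name in os.listdir(logs):
            shutil.move(os.path.join(logs, name), os.path.join(shared, name))
        os.rmdir(logs)
    # The link only works if nothing is left at the logs path
    os.symlink(shared, logs)
    return shared

# Demonstration with stand-in directories under a temporary root
root = tempfile.mkdtemp()
purple = os.path.join(root, ".purple")
dropbox = os.path.join(root, "Dropbox")
os.makedirs(os.path.join(purple, "logs", "aim"))
os.makedirs(dropbox)
link_logs(purple, dropbox)
print(os.path.islink(os.path.join(purple, "logs")))
```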

Academic Problems in New Media Art

I’m trying to convert Deleuze & Guattari’s A Thousand Plateaus into a corpus for use by a program. Among other things, the program will break the text down into its component sentences. The text has a lot of notes, which connect the text to a lot of other sources, but are not always written in complete sentences, and so will result in odd output from the program when considered as sentences.
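
To make the problem concrete, here is a naive sketch of sentence splitting plus a crude filter for note debris. The sample text is invented, the function names are hypothetical, and the "fewer than four words" rule is only a stand-in heuristic. A real splitter would also have to cope with abbreviations such as "p." or "Ibid." in mid-sentence.

```python
import re

def split_sentences(text):
    """Naively split text at whitespace that follows '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def looks_like_fragment(sentence, min_words=4):
    """Crude heuristic: very short pieces are probably notes, not sentences."""
    return len(sentence.split()) < min_words

# Invented sample mixing body text with note-style fragments
text = ("The body text makes a claim here. See note 12. Ibid. "
        "The argument then continues at length.")

sentences = split_sentences(text)
kept = [s for s in sentences if not looks_like_fragment(s)]
print(kept)
```

With this sample, the two note fragments are dropped and the two full sentences survive, which is exactly the trade: cleaner output, lost references.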

So I’m faced with a choice: lose the notes, and so lose the cultural context and references to stuff that went before, or keep them and suffer degraded output. Nobody warns you about the odd stuff you’ll have to decide when you start doing new media “art”.

CSS: Turing complete?

According to the page on selectors, “The simple selector matches if all of its components match.” That sounds to me like an AND gate, in that something is selected only if the logical AND of its component matches is true.

Combine that with logical inversion, provided by the :not() selector, and it seems to me that you get NAND, which is a universal gate. Any boolean function can be composed of NAND gates, so it is in theory possible to compose an entire CPU out of NAND gates.
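
The universality claim is easy to check outside of CSS. The sketch below (in Python, purely as an illustration, with made-up function names) builds NOT, AND, OR and XOR from nothing but a NAND function, then wires them into a half adder, the smallest piece of arithmetic a CPU needs.

```python
def nand(a, b):
    """The only primitive gate: false only when both inputs are true."""
    return not (a and b)

def not_(a):
    return nand(a, a)

def and_(a, b):
    return not_(nand(a, b))

def or_(a, b):
    return nand(not_(a), not_(b))

def xor_(a, b):
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))

def half_adder(a, b):
    """Return (sum, carry) for two one-bit inputs."""
    return xor_(a, b), and_(a, b)

for a in (False, True):
    for b in (False, True):
        print(a, b, half_adder(a, b))
```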

This makes it seem to me like you could write a processor simulation in CSS.

The main thing that I wonder about is how browsers evaluate CSS. If the rules are evaluated in a single pass, rather than re-evaluated until nothing changes (e.g. no element’s attributes change), then the CPU wouldn’t work, because the “output” of one selection could not be used as input to further selections (e.g. by changing their classes). Heck, I’m not sure you can even have CSS change the classes of an element on the fly.

But if you can…

Bad ideas for NaNoWriMo

I’m not a novelist. I can write, and with sufficient editing, I can even write passable English text, but it’s not really something that I’ve put a lot of effort into, and so not something that I’m good at. My real time investment in writing has been writing software, which is read only by compilers and interpreters, and so what it lacks in excitement, narrative, and plausible dialog, it makes up for in conciseness and precision.

My ideas for National Novel Writing Month (November, for those as don’t know) then, are coding a program that writes novels, and coding a program that takes novels as inputs and generates interactive fiction (text adventure games) from them. Both of these are problems with infinite hair, and are arguably AI-Hard problems (that is, problems whose solution is on par, difficulty-wise, with creating a human-level general-purpose AI). On the other hand, writing a good novel is probably also quite hard.

I think the most reasonable approach for the novel generator would be a recursively-defined novel description language, which selects from tropes and plot stubs, generates characters, and so forth, based on relatively simple rules. The complexity would come from applying the rules over and over, so a simple quest to throw a ring into a volcano grows branches on branches on branches until it is One Damn Thing After Another Until All The Orcs Are Dead. The goal of the program would be to use generative content and emergent behavior to do most of the writing, and leave me to fill out the turns of phrase and details (or generate them a la Dwarf Fortress, which menaces with spikes of ivory). Done badly, this would read like a Mad Lib. Done well, it would read like a Mad Lib filled out by people who don’t say “dongs” every time they are asked for a noun.
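
A minimal sketch of the recursive idea, assuming a made-up rule table: each symbol expands into a sequence of other symbols or finished plot beats, and a depth limit forces the recursion to wind down. The rule names, the beats, and the `expand` function are all invented for illustration.

```python
import random

# Hypothetical rules: each symbol expands to one of several sequences.
# The first option for every symbol is written so that it terminates.
RULES = {
    "QUEST": [
        ["GOAL"],
        ["OBSTACLE", "QUEST"],
        ["SIDE_QUEST", "QUEST"],
    ],
    "SIDE_QUEST": [["OBSTACLE", "GOAL"]],
    "OBSTACLE": [
        ["The heroes fight through orcs."],
        ["The heroes get lost in a swamp."],
    ],
    "GOAL": [["The heroes throw the ring into the volcano."]],
}

def expand(symbol, rng, depth=0, max_depth=6):
    """Recursively expand a symbol into a flat list of plot beats."""
    if symbol not in RULES:
        return [symbol]  # terminal: a finished plot beat
    options = RULES[symbol]
    # Past max_depth, always take the first (terminating) option
    choice = options[0] if depth >= max_depth else rng.choice(options)
    beats = []
    for part in choice:
        beats.extend(expand(part, rng, depth + 1, max_depth))
    return beats

for beat in expand("QUEST", random.Random(2024)):
    print(beat)
```

Raising `max_depth` is what grows branches on branches. The rules stay the same size while the output gets longer.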

Making interactive fiction (IF) out of novels would be substantially harder. The novel parser would have to read English, which is actually quite a trick. English has multiple words for one meaning and multiple meanings for one word, highly flexible structure, and counts on the reader to sort it all out. On top of that, most of the awesome tricks one can pull in English are more a matter of exploiting shared cultural context with your reader than they are particular sequences of words. If I wrote such a parser in one month, or at all, a lot of linguistics researchers would be out of work.

Assume I went for one tiny part of the problem: identifying the locations in the novel. The same place might be described as “where that party was”, “Joe’s house”, “the darkened house”, and “a pit of iniquity”. Only the events of the novel link them, and so the program would have to determine that these totally different words referred to the same place. The best approximation I could likely come up with is identifying all the things in the novel that sound like places, and then performing some sort of clustering based on what words are mentioned close to mentions of those places. This would likely lead to a bunch of spurious places getting generated, and real places getting overlooked. There is an entire company, called Metacarta, that did this sort of analysis on much more constrained data sets, and even then it was a difficult problem for a team of people who were likely smarter than me.
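
The clustering step could be sketched roughly as follows. This assumes the hard part (finding place-like phrases and their nearby words) has already been done by hand, and it swaps in a simple greedy grouping by Jaccard overlap, which is just one possible choice. The mentions, the context words, and the threshold are all invented.

```python
def jaccard(a, b):
    """Overlap between two sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_places(mentions, threshold=0.3):
    """Greedily group (name, context_words) pairs whose contexts overlap."""
    clusters = []  # each cluster holds a set of names and a merged context
    for name, context in mentions:
        ctx = set(context)
        for cluster in clusters:
            if jaccard(ctx, cluster["context"]) >= threshold:
                cluster["names"].add(name)
                cluster["context"] |= ctx
                break
        else:
            # No existing cluster was close enough, so start a new one
            clusters.append({"names": {name}, "context": ctx})
    return clusters

# Hand-made example data: place phrase plus words seen near it
mentions = [
    ("Joe's house", ["party", "music", "porch", "Joe"]),
    ("the darkened house", ["porch", "party", "lights", "Joe"]),
    ("the train station", ["platform", "tickets", "rain"]),
    ("a pit of iniquity", ["party", "Joe", "music", "police"]),
]

clusters = cluster_places(mentions)
for cluster in clusters:
    print(sorted(cluster["names"]))
```

On this toy data the three descriptions of the house land together and the station stays separate. On real text, a loose threshold would merge unrelated places and a strict one would split the same place apart, which is the spurious/overlooked problem in miniature.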

However, doing a good job of adapting novels to interactive fiction might not be the best approach. It might be better to get a rough cut of the software together to do anything at all, and gradually improve it until it writes things that are playable curios, rather than detailed simulations of well-loved novels. It wouldn’t be a matter of playing through “A Game of Thrones” so much as it would be “poking around in a demented dreamscape based loosely on ‘A Game of Thrones’”.

This is actually related to another idea that I had, which is sort of a rails shooter based on the consequences of shooting things. You play through the game, riding the rails and shooting enemies of varying levels of craftiness and menace. When you reach the end of the level, you just loop through it again, passing the bodies of everything you killed, and getting another shot at everything you missed. This repeats until you have killed everything in the game world, whereupon you continue to loop, passing through scenes of slaughter as the heroic music fades and is replaced with silence and the buzzing of flies. Perhaps, if you let it run long enough, the dead bodies would rot to skeletons. Now, of course, I’ve spoiled it for you, but I’ll probably never get around to writing it, so at least you’ve had the idea.

Notacon Talk on Brain Hacking

This is the talk I gave at Notacon in 2008. It’s kind of goofy, but provides a broad overview of wireheading/neurohacking technologies.