Category: Programming

Bad ideas for NaNoWriMo

I’m not a novelist. I can write, and with sufficient editing, I can even write passable English text, but it’s not really something that I’ve put a lot of effort into, and so not something that I’m good at. My real investment of writing time has been in software, which is read only by compilers and interpreters, and so what it lacks in excitement, narrative, and plausible dialog, it makes up for in conciseness and precision.

My ideas for National Novel Writing Month (November, for those who don’t know), then, are coding a program that writes novels, and coding a program that takes novels as input and generates interactive fiction (text adventure games) from them. Both of these are problems with infinite hair, and are arguably AI-hard problems (that is, problems whose solutions are on par, difficulty-wise, with creating a human-level general-purpose AI). On the other hand, writing a good novel is probably also quite hard.

I think the most reasonable approach for the novel generator would be a recursively-defined novel description language, which selects from tropes and plot stubs, generates characters, and so forth, based on relatively simple rules. The complexity would come from applying the rules over and over, so a simple quest to throw a ring into a volcano grows branches on branches on branches until it is One Damn Thing After Another Until All The Orcs Are Dead. The goal of the program would be to use generative content and emergent behavior to do most of the writing, and leave me to fill out the turns of phrase and details (or generate them a la Dwarf Fortress, which menaces with spikes of ivory). Done badly, this would read like a Mad Lib. Done well, it would read like a Mad Lib filled out by people who don’t say “dongs” every time they are asked for a noun.
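Something like this toy sketch is the kind of recursive expansion I have in mind. Every rule name and plot stub below is made up on the spot purely for illustration; the real thing would need far more rules and some way to keep the branches consistent with each other.

```python
import random

# A toy "novel description language": each symbol expands into a randomly
# chosen alternative, which may itself contain more angle-bracket symbols.
RULES = {
    "<plot>": ["<hero> must <quest>, but <complication>."],
    "<hero>": ["an orphaned farmhand", "a disgraced knight", "a tired cartographer"],
    "<quest>": ["throw <artifact> into <place>", "deliver <artifact> to <place>"],
    "<artifact>": ["a cursed ring", "a singing sword", "a sealed letter"],
    "<place>": ["a volcano", "the drowned city", "the last library"],
    "<complication>": [
        "<minor_villain> wants <artifact> too",
        "the road is blocked, so <plot>",   # recursion: branches on branches
    ],
    "<minor_villain>": ["a bored dragon", "a tax collector", "an old rival"],
}

def expand(symbol, depth=0, max_depth=6):
    """Recursively expand a symbol until no angle-bracket tokens remain."""
    if symbol not in RULES:
        return symbol
    if depth >= max_depth:              # stop runaway recursion
        return "one damn thing after another"
    text = random.choice(RULES[symbol])
    for token in RULES:
        while token in text:
            text = text.replace(token, expand(token, depth + 1, max_depth), 1)
    return text

print(expand("<plot>"))
```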

Making interactive fiction (IF) out of novels would be substantially harder. The novel parser would have to read English, which is actually quite a trick. English has multiple words for one meaning and multiple meanings for one word, highly flexible structure, and counts on the reader to sort it all out. On top of that, most of the awesome tricks one can pull in English are more a matter of exploiting shared cultural context with your reader than they are particular sequences of words. If I wrote such a parser in one month, or at all, a lot of linguistics researchers would be out of work.

Assume I went for one tiny part of the problem: identifying the locations in the novel. The same place might be described as “where that party was”, “Joe’s house”, “the darkened house”, and “a pit of iniquity”. Only the events of the novel link them, and so the program would have to determine that these totally different words referred to the same place. The best approximation I could likely come up with is identifying all the things in the novel that sound like places, and then performing some sort of clustering based on what words are mentioned close to mentions of those places. This would likely lead to a bunch of spurious places getting generated, and real places getting overlooked. There is an entire company, called Metacarta, that did this sort of analysis on much more constrained data sets, and even then it was a difficult problem for a team of people who were likely smarter than me.
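To make the co-occurrence idea concrete, here is a rough Python sketch. It assumes some upstream pass has already produced the candidate place phrases; the similarity threshold, the window size, and the input file name are all invented, and the tokenization is deliberately crude.

```python
from collections import Counter
from itertools import combinations

def context_profile(text, phrase, window=30):
    """Bag of words appearing within `window` words of each mention of `phrase`."""
    words = text.lower().split()
    target = phrase.lower().split()
    profile = Counter()
    for i in range(len(words) - len(target) + 1):
        if words[i:i + len(target)] == target:
            lo = max(0, i - window)
            hi = i + len(target) + window
            profile.update(words[lo:i] + words[i + len(target):hi])
    return profile

def cosine(a, b):
    """Cosine similarity between two word-count profiles."""
    num = sum(a[w] * b[w] for w in set(a) & set(b))
    den = (sum(v * v for v in a.values()) * sum(v * v for v in b.values())) ** 0.5
    return num / den if den else 0.0

# Candidate phrases would come from a "sounds like a place" pass;
# these are just the examples from the paragraph above.
candidates = ["Joe's house", "the darkened house", "a pit of iniquity"]
novel_text = open("novel.txt").read()        # hypothetical input file
profiles = {c: context_profile(novel_text, c) for c in candidates}
for a, b in combinations(candidates, 2):
    if cosine(profiles[a], profiles[b]) > 0.4:   # threshold picked arbitrarily
        print(a, "and", b, "may be the same place")
```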

However, doing a good job of adapting novels to interactive fiction might not be the best approach. It might be better to get a rough cut of the software together to do anything at all, and gradually improve it until it writes things that are playable curios, rather than detailed simulations of well-loved novels. It wouldn’t be a matter of playing through “A Game of Thrones” so much as it would be “poking around in a demented dreamscape based loosely on ‘A Game of Thrones’”.

This is actually related to another idea that I had, which is sort of a rails shooter based on the consequences of shooting things. You play through the game, riding the rails and shooting enemies of varying levels of craftiness and menace. When you reach the end of the level, you just loop through it again, passing the bodies of everything you killed, and getting another shot at everything you missed. This repeats until you have killed everything in the game world, whereupon you continue to loop, passing through scenes of slaughter as the heroic music fades and is replaced with silence and the buzzing of flies. Perhaps, if you let it run long enough, the dead bodies would rot to skeletons. Now, of course, I’ve spoiled it for you, but I’ll probably never get around to writing it, so at least you’ve had the idea.

Chatbot is done

I finished writing the chatbot that I was working on. It consists of a set of scripts to prepare the data, and another script that listens for incoming messages and responds. You can get the code and an overview of how it works here.

Obviously, I’m not publishing my chat logs. Use your own. It is designed to work with Pidgin’s HTML-like format for chat logs, but it could be modified to work on almost any corpus. I really should clean up things like the string cleaning routines, but it worked for class, and that’s what actually matters.
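For a sense of what the data-preparation step looks like, here is a rough sketch that pulls the text out of a directory of HTML-ish log files. This isn’t the published code, just an illustration: the tag stripping is generic rather than tuned to Pidgin’s exact markup, and the directory layout is a guess.

```python
import glob
from html.parser import HTMLParser

class TagStripper(HTMLParser):
    """Collects only the text content, discarding the HTML-ish markup."""
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

def strip_tags(html_text):
    stripper = TagStripper()
    stripper.feed(html_text)
    return " ".join(stripper.chunks)

# Hypothetical layout: one HTML log file per conversation.
corpus = []
for path in glob.glob("logs/**/*.html", recursive=True):
    with open(path, encoding="utf-8", errors="ignore") as f:
        corpus.append(strip_tags(f.read()))
```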

Some Short Notes

The ToyBrain project has been on hold for a variety of reasons, mostly time and money. I finally have enough money to order the motor driver chips I wanted from Digikey. They are on back order, but should arrive near the end of the month. Once I have those, I’m going to put together a little video of the first boards doing a variety of motor driving tasks. That video will go into a Kickstarter funding round to get the second edition of the boards produced and populated.

At least one of the ToyBrain boards is going to end up hooked to a computer via the serial port at one end, and a vibrating motor at the other end. I’m reviving an old project to add a teledildonics plugin to Pidgin. It will allow a remote user to use commands like /harder and /faster (and of course /softer and /slower) to control the speed of the vibrator. That one may not make it into the Kickstarter video.

I found an interesting post about laser power ratings recently. It covers the relationship between PWM and laser output power, which is going to be useful for the power supply that I’m building for my laser. Once I build that power supply, I’ll be in the rather interesting position of having designed a cutting laser power supply that can be built from easy-to-obtain materials. Hopefully, that will knock the price down enough that more people can do DIY CNC laser builds. I may also make that PCB available as a kit, so people can build their own.

I also looked up TEA laser plans again, and started wondering about making a dye laser. The TEA laser emits in the UV range, so it could be used to pump a UV-reactive dye. Vitamin B12 (found in energy shot drinks) and tonic water are both UV-reactive, so it may be possible to make a yellow or blue laser using a dye that is drinkable. Normally, laser dye is a toxic dye in a toxic solvent, so this would be pretty neat for the home experimenter.


OpenCV and finding rectangles

I have been working, on and off, on a computer vision application that will recognize a card from a certain game in an image, figure out what the card is, and add it to a database. I had some luck in earlier versions of the code looking for template matches to try to find distinctive card elements, but that fails if the card is scaled or skewed, and it rapidly becomes too processor-heavy if there are many templates to match. Recently, at work, I have had even more opportunity to play with OpenCV (a computer vision library), and have found a few blogs and tricks that might help me out.

The first blog post shows how to pick out a Sudoku puzzle from a picture. The most important part is finding the corners of the puzzle; after that, it can be mapped to a square with a perspective transform. I can do a similar trick, only I’ll be mapping to a rectangle. Since corner-finding is kind of scale-invariant (the corner of something is a corner at any scale), this will let me track a card pretty easily.
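Once the corners are known, the warp itself is only a couple of OpenCV calls. A sketch, assuming the corners are already ordered top-left, top-right, bottom-right, bottom-left; the output size is arbitrary, just roughly card-shaped.

```python
import cv2
import numpy as np

def flatten_card(image, corners, width=400, height=560):
    """Map a skewed quadrilateral (the card) onto an upright rectangle.
    `corners` must be ordered top-left, top-right, bottom-right, bottom-left."""
    src = np.array(corners, dtype=np.float32)
    dst = np.array([[0, 0],
                    [width - 1, 0],
                    [width - 1, height - 1],
                    [0, height - 1]], dtype=np.float32)
    matrix = cv2.getPerspectiveTransform(src, dst)
    return cv2.warpPerspective(image, matrix, (width, height))
```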

I think that I can actually use OpenCV’s contour finding to get most of the edges of the card, and then the Hough transform to get the corner points. I may even be able to get away with using just contour finding, getting the bounding rectangle of each contour, and checking that it has something like the proper aspect ratio. This will work in the presence of cards that are rotated, but fails on perspective-related skewing.
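Roughly what that contour-and-aspect-ratio check might look like. The Canny thresholds, target ratio, minimum area, and tolerance are all guesses, not tuned values.

```python
import cv2

def find_card_candidates(image, target_ratio=0.71, tolerance=0.1, min_area=5000):
    """Return bounding rectangles whose shape is roughly card-like.
    Handles upright and mildly rotated cards; perspective skew will slip through."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)
    # [-2] picks the contour list regardless of OpenCV version's return tuple.
    contours = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
    candidates = []
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        if w * h < min_area:
            continue
        ratio = min(w, h) / max(w, h)      # orientation-independent aspect ratio
        if abs(ratio - target_ratio) < tolerance:
            candidates.append((x, y, w, h))
    return candidates
```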

This StackOverflow post has a nice approach to getting the corners of a rectangle that has some rotation and perspective skew.
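As I understand it, the usual shape of that approach is to approximate each contour as a polygon and keep the large, convex, four-sided ones; the four vertices are then the card corners. A sketch, with guessed blur and edge-detection parameters:

```python
import cv2

def find_card_corners(image):
    """Return the four corner points of the largest convex quadrilateral found,
    or None if no such contour exists."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)
    edges = cv2.Canny(blurred, 50, 150)
    contours = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
    for c in sorted(contours, key=cv2.contourArea, reverse=True):
        approx = cv2.approxPolyDP(c, 0.02 * cv2.arcLength(c, True), True)
        if len(approx) == 4 and cv2.isContourConvex(approx):
            # Corners still need to be ordered before the perspective warp.
            return approx.reshape(4, 2)
    return None
```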

Once I have the card located, I’m going to throw a cascade of classifiers at it and try something like AdaBoost to get a good idea of which card it is. Some of the classifiers are simple, things like determining the color of the front of the card. Others may actually pull in a bit of OCR or template-based image recognition on (tiny) subsections of the card. Since I will actually know the card border at this point, I can scale the templates to match the card, and get solid matches fast.
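As a sketch of that tail end of the pipeline, here is the simplest classifier input (mean card color) feeding scikit-learn’s AdaBoostClassifier. The training-data structure is hypothetical, and the real version would combine many more features than this one.

```python
import cv2
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

def color_features(card_image):
    """Mean hue/saturation/value of a flattened card image: one of the
    'simple' classifier inputs."""
    hsv = cv2.cvtColor(card_image, cv2.COLOR_BGR2HSV)
    return hsv.reshape(-1, 3).mean(axis=0)

def train_card_classifier(labeled_cards):
    """labeled_cards: hypothetical list of (flattened card image, card name) pairs."""
    X = np.array([color_features(img) for img, _ in labeled_cards])
    y = np.array([name for _, name in labeled_cards])
    clf = AdaBoostClassifier(n_estimators=50)
    clf.fit(X, y)
    return clf   # clf.predict([color_features(unknown_card)]) names the card
```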

Human factors research

For a piece of code I’m writing, I need to break the angle of a turn down into “continue”, “bear left (or right)”, and “turn left (or right)”. The program takes a topological map of points within a building and converts a path through the building into spoken directions. Intuitively, it seems to me, there is some small range of angles that are effectively “continue forwards” or “straight ahead”, some range of angles to the left and right of that that are “bearing” without being “turning”, and some range of angles to the left and right of that range that are “turning”.

Before I go any further, I should point out that this isn’t a study. Anything with n=7 and a population consisting of exclusively white male computer science students in their 20s is not exactly an unbiased sample. It’s a guess, based on the opinions of the people who happened to be standing around at the time.

My procedure was to show the participant a protractor, and ask them to imagine a person standing at the origin and facing the 90° mark. I then asked them what ranges, in degrees, constituted “straight forward”, “bear left/right” and “turn left/right”.

The results:

Participant "Straight"  "Bear"      "Turn"
p1          70-110      L 100-140   L 140-180
                        R 80-40     R 40-0

p2          80-100      L 130-150   L 130-180
                        R 50-30     R 50-0
 
p3          60-120      L 130-150   L 130-180
                        R 50-30     R 50-0

p4          75-115      L 115-125   L 140-180
                        R 75-50     R 40-0

p5          70-110      L 110-130   L 160-180
                        R 70-50     R 20-0

p6          88-92       L 92-112    L 160-180
                        R 88-68     R 20-0

p7          75-105      L 105-125   L 125-180
                        R 75-55     R 55-0

Based on this, I’m going to say that “straight” corresponds to about 70-110 degrees, “bear” is about 110-130 on the left and 70-50 on the right, and anything outside of that is a “turn”. This is nothing more than a stupid rule of thumb, but if anyone complains, it’s easy enough to change the code.
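The rule of thumb is small enough to write down directly. Angles here are protractor-style, matching the procedure above: 90° is straight ahead, 180° is hard left, 0° is hard right.

```python
def describe_turn(angle):
    """Spoken-direction label for a turn angle in degrees (90 = straight ahead)."""
    if 70 <= angle <= 110:
        return "continue straight"
    if 110 < angle <= 130:
        return "bear left"
    if 50 <= angle < 70:
        return "bear right"
    return "turn left" if angle > 130 else "turn right"
```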

I could complicate it further and add “bear slightly L/R” and “turn hard L/R”, but I’m not sure the gain in resolution translates to any gain for indoor navigation. Changing how the questions are presented or whether or not the user gets to refer to a protractor would probably also change the answers.