Posted: May 30th, 2012 | Author: Abe S. | Filed under: Research | Comments Off
This is really just a rant, I’ll have neat and useful information in my next post, I swear.
Everything that this guy lists that I have had the displeasure to run into is correct, and worse than he makes it sound.
LabView is a graphical environment for creating processes to manage a flow of data, usually acquiring it from some device, performing some processing on it, and displaying or recording the data. To create a process in LabView, you drag and drop little boxes which do things to the data, and then point and click to draw wires between the boxes, which show how the data should flow. This is kind of a problem, for a number of reasons.
Imagine if you are trying to get the 1st through the 50th elements of an array. Near as I can tell, in LabView, you use an “Index Array” block, wired to fifty integer constant blocks (each containing a digit in 0..49) with 50 wires. The “Array Subset” block sounds promising, but actually gets you an array, not the elements. If you want to expand this to, say, 100 elements, you need to add 50 more constants, and 50 more wires. This gets tedious very fast, but I guess it’s not so bad, if you otherwise can’t program, and so don’t know that real programming languages will just let you operate on members of an array without pulling them out first, using syntax that takes more time to describe than it does to type.
See, if I wanted to get the fifth element of an array of integers in LabVIEW, multiply it by 10, and put it back, I’d have to use an “Index Array” block, an “Integer Constant” block with the value 4 (zero-based array indexing) to get the value, a “Multiply” block and an “Integer Constant” block with the value 12 to do the multiplication, and a “Replace Array Subset” block using the first “Integer Constant” block to put the value back. This would also require at least 8 wires. I would have to put all of those things in place, and then wire them up using the mouse.
Or I could type “arrayName = arrayName * 10;”, assuming I was working in C, C++, or Java. For Python and Perl, leave off the semicolon. Oh hey. See what I did there? The same operation, only mine is good for 5 languages, takes around 33 characters (so seconds to type), and is all done without a single mouse click or paying thousands of dollars for a license. If that were the only problem with LabVIEW, it would be enough to exclude it from me ever considering using it for anything. Since I don’t get to make that choice at work, I just spent the better part of three days trying to get it to format some data and send that data over the network. For those playing along at home, slinging some data across the network is maybe a 20 minute “problem” in any other language that real humans use (brainfuck, befunge, etc. don’t count, and assembly is a corner case).
The miserable interface doesn’t just make it slow to create anything in LabVIEW. It also makes it easy to get wrong in annoying ways. The analog of a typo in Labview is connecting things wrong. Given that you are aiming for a ~10 pixel target with no space between it and and the equally connectable, but incorrect, targets on either side of it, the quality of your code depends on the resolution of your mouse and your visual acuity. That’s right. Not your ability to break down a problem into its component parts and determine what algorithms solve those parts. Your mousing skills are what determines if you got your “code” right. This also means that if you are hooking up 60 connections, and you accidentally skip one, you have to move a bunch of the other connections to get the one that you skipped back into place. God help you if you don’t catch it, as the data will be ok, but one element of it will be out of order.
Posted: November 9th, 2011 | Author: Abe S. | Filed under: Bad Ideas, Electronics, Neurohacking, Research | Comments Off
This is the talk I gave at Notacon in 2008. It’s kind of goofy, but provides a broad overview of wireheading/neurohacking technologies.
Posted: October 8th, 2011 | Author: Abe S. | Filed under: Artificial Intelligence, Computer Vision, Programming, Research | Comments Off
I have been working, on and off, on a computer vision application that will recognize a card from a certain game in an image, figure out what the card is, and add it to a database. I had some luck in earlier versions of the code looking for template matches to try to find distinctive card elements, but that fails if the card is scaled or skewed, and it rapidly becomes too processor-heavy if there are many templates to match. Recently, at work, I have had even more opportunity to play with OpenCV (a computer vision library), and have found a few blogs and tricks that might help me out.
The first blog shows how to pick out a Sudoku puzzle from a picture. The most important part is finding the corners of the puzzle, as after that, it can be mapped to a square with a perspective transform. I can do a similar trick, only I’ll be mapping to a rectangle. Since corner-finding is kind of scale-invariant (the corner of something is a corner at any scale), this will let me track a card pretty easily.
I think that I can actually use OpenCV’s contour finding to get most of the edges of the card, and then the Hough transform to get the corner points. I may even be able to get away with using just contour finding, getting the bounding rectangle of each contour, and checking that it has something like the proper aspect ratio. This will work in the presence of cards that are rotated, but fails on perspective-related skewing.
This StackOverflow post has a nice approach to getting the corners of a rectangle that has some rotation and perspective skew.
Once I have the card located, I’m going to throw a cascade of classifiers at it and try something like AdaBoost to get a good idea of which card it is. Some of the classifiers are simple, things like determining the color of the front of the card. Others may actually pull in a bit of OCR or template-based image recognition on (tiny) subsections of the card. Since I will actually know the card border at this point, I can scale the templates to match the card, and get solid matches fast.
Posted: October 4th, 2011 | Author: Abe S. | Filed under: Computer Vision, Research, Scripting | Comments Off
Suppose you have used FindContours to find the outlines of things in an image, and now want to color each one a different color, perhaps in order to create a seed map for the watershed algorithm. This gets you an image with each area that FindContours found in a different color:
#Showing the contours.
#contour_list is the output of cv.FindContours(), degapped is the image contours were found in
contour_img = cv.CreateImage(cv.GetSize(degapped), IPL_DEPTH_8U, 3)
contour = contour_list
while contour.h_next() != None:
color = get_rand_rgb(80,255)
holecolor = get_rand_rgb(80,255)
cv.DrawContours(contour_img, contour, color, holecolor, -1, CV_FILLED)
contour = contour.h_next()
show_img(contour_img, "contours " + str(iteration))
The real key here is iterating over the list of contours using h_next(), which took me longer than it should have to find.
The show_img() function is effectively just printf for OpenCV images, and looks like this:
def show_img(img, winName):
#Debugging printf for images!
The function get_rand_rgb() gets a random RGB color with values for each color set to an integer in the range you pass it, like so:
def get_rand_rgb(min, max):
I used 80,255 to get bright colors.
Posted: September 9th, 2011 | Author: Abe S. | Filed under: Programming, Research | Comments Off
For a piece of code I’m writing, I need to break the angle of a turn down into “continue”, “bear left (or right)”, and “turn left (or right)”. The program takes a topological map of points within a building and converts a path through the building into spoken directions. Intuitively, it seems to me, there is some small range of angles that are effectively “continue forwards” or “straight ahead”, some range of angles to the left and right of that that are “bearing” without being “turning”, and some range of angles to the left and right of that range that are “turning”.
Before I go any further, I should point out that this isn’t a study. Anything with n=7 and a population consisting of exclusively white male computer science students in their 20s is not exactly an unbiased sample. It’s a guess, based on the opinions of the people who happened to be standing around at the time.
My procedure was to show the participant a protractor, and ask them to imagine a person standing at the origin and facing the 90° mark. I then asked them what ranges, in degrees, constituted “straight forward”, “bear left/right” and “turn left/right”.
Participant "Straight" "Bear" "Turn"
p1 70-110 L 100-140 L 140-180
R 80-40 R 40-0
p2 80-100 L 130-150 L 130-180
R 50-30 R 50-0
p3 60-120 L 130-150 L 130-180
R 50-30 R 50-0
p4 75-115 L 115-125 L 140-180
R 75-50 R 40-0
p5 70-110 L 110-130 L 160-180
R 70-50 R 20-0
p6 88-92 L 92-112 L 160-180
R 88-68 R 20-0
p7 75-105 L 105-125 L 125-180
R 75-55 R 55-0
Based on this, I’m going to say that “straight” corresponds to about 70-110 degrees, “bear” is about 110-130 on the left and 70-50 on the right, and anything outside of that is a “turn”. This is nothing more than a stupid rule of thumb, but if anyone complains, it’s easy enough to change the code.
I could complicate it further and add “bear slightly L/R” and “turn hard L/R”, but I’m not sure the gain in resolution translates to any gain for indoor navigation. Changing how the questions are presented or whether or not the user gets to refer to a protractor would probably also change the answers.