Raspbian Still Bad

The command “sudo apt-get install arduino” doesn’t get you a working Arduino development environment. Instead, when you try to launch the IDE from the command line, you get “Error occurred during initialization of VM. Server VM is only supported on ARMv7+ VFP”. I get that cross-compilation can be tricky, but this software didn’t get released, it escaped. Arduino isn’t exactly an obscure package.

To be fair, this isn’t because Arduino is broken. It’s because apt automatically chooses a version of Java that can’t run on the Zero W. So any package that uses Java is probably also broken, but I’m not looking into it, because I’m trying to actually do something with the Raspberry Pi. If I wanted to fuck around with a broken package chain, I’d… have gotten my fill of that like 10-13 years ago, so I guess I’d need a time machine. Really, the Pi Zero W is a lot like a time machine, taking me back to a time when Linux was a project, rather than being something you can use for projects.

If you, for some reason, want to use the Arduino IDE on a Pi Zero W, the incantation is to install Java 8 with ‘sudo apt-get install openjdk-8-jre-headless openjdk-8-jre’ and then switch to it with ‘sudo update-alternatives --config java’. Then you’ll be running a 5-year-old Java and a 6-year-old Arduino IDE, but at least the IDE will start up.

The Year of Linux on the Desktop is “Ha, Get Fucked”

People talk about usability, and in a lot of ways, Ubuntu is pretty close. It certainly beats Windows 10, at least for me, since you’re allowed to know what has gone wrong and fix it, instead of just rebooting whenever your computer gets infested with ghosts. I do complain about Ubuntu, but it’s usually because most things are decent, so the things that are bad are particularly egregious. And, after all, I did get what I paid for.

This time, though, I’m trying to work with a Raspberry Pi Zero W, and Raspbian. It has a little setup walkthrough that clearly took a serious blow to the head at some point, since it asks you to connect to WiFi, and then, if you didn’t connect to a network, still asks if you want to download software. From where, I might ask, since there’s no network connection?

Of course, the reason I didn’t connect to the network is that it’s hidden. Not a problem, I have the SSID on paper here… and no way to tell it to the Pi. There’s a network configurator, but it only lets you pick from visible networks, not enter your own SSID. Maybe this was done because “SSID” is one of those worrying “techie” terms, and this is supposed to be for somewhat less technical users? Protip: it’s not. It ships as a bare board in an antistatic bag, for fuck’s sake. Or maybe manual SSID entry was left out because it’s somehow hard to do?

At any rate, the solution appears to be: manually edit /etc/wpa_supplicant/wpa_supplicant.conf. Manually edit a text file that only root can edit, in this, the Year of Our Lord 2019. I don’t have a problem with this. I have a PhD in making tiny robots go and have been using Linux for 15 years, because everything else is worse for my use cases. Normal humans, who are perhaps entering college and just want to check out Linux and maybe try writing a little Scratch or Python, are going to have a problem with this.
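For anyone else stuck there, the stanza for a hidden network looks something like this (the SSID and passphrase are placeholders; scan_ssid=1 is what tells wpa_supplicant to probe for the hidden SSID):

network={
    ssid="YourHiddenSSID"
    scan_ssid=1
    psk="YourPassphrase"
}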

Also, the same startup script asked me if my screen had black bars around the edges. It did, so I said yes (more fool me!). When I rebooted, the edges of the desktop (you know, where the UI goes) were mostly off the edge of the screen. Setting the hostname with the network config tool on the toolbar caused hostname resolution errors every time I used sudo, apparently because sudo wanted “ouija1” but “ouija_1” got written into /etc/hosts. That’s not actually a valid hostname (my error), but the tool wrote it into /etc/hosts anyway (an error by whoever wrote the alleged network config tool). Again, I know what /etc/hosts is, and editing it isn’t an issue. I’m weird. For most people, this is an issue.

So in general, my experience with the Raspberry Pi and Raspbian is that it’s not ready for end users who want to use it to do things. If you want it in order to do things to it, rather than with it, and are already an experienced electronics enthusiast and Linux user, you’ll be fine.

Curious Effects Getting List Extents

I have a program that gets a list of GPS waypoints, and wants to figure out their bounding box. The naive way[1] to do this is find the maximum and minimum latitude and longitude, and use the maxes as one corner and the minimums as the other corner.

Off the top of my head, I can think of two ways to do this. The first is to iterate the list of waypoints, comparing each point to the max and minimum so far, and updating as I go. The list has N points, I have to look at all of them, so O(N); so far so good.

The other way to do it is to use list comprehensions to get the latitudes and longitudes as separate lists, and then call max() and min() on each of those. I would assume that each list comprehension is O(N), and each call to max() or min() is also O(N), since they have to look through the whole list to find the maximum or minimum. That makes 6 passes over the data (2 comprehensions, 2 max() calls, 2 min() calls), so I assumed this would be the slower way to do it.

It turns out, not so much.

I ran the code below on Repl.it and, usually, the list comprehension version was anywhere from very slightly faster to about twice as fast. Occasionally, the 10,000-point case was slower, but not all the time.

import random 
from timeit import default_timer as timer

#Try some different sizes of lists
for jj in [10, 100, 1000, 10000, 100000, 1000000]:

  #Set up N waypoints
  waypoints = []
  for ii in range(jj):
    lat = (random.random() * 360) - 180
    lon = (random.random() * 360) - 180
    waypoints.append({"lat":lat, "lon":lon})

  start = timer()

  # One loop
  maxLat = maxLon = -float("inf")
  minLat = minLon = float("inf")
  for point in waypoints:
      lat = float(point["lat"])
      if lat < minLat:
          minLat = lat
      if lat > maxLat:
          maxLat = lat
      lon = float(point["lon"])
      if lon < minLon:
          minLon = lon
      if lon > maxLon:
          maxLon = lon

  mid = timer()

  # List comprehensions
  lons = [float(point["lon"]) for point in waypoints]
  lats = [float(point["lat"]) for point in waypoints]
  minLat1 = min(lats)
  minLon1 = min(lons)
  maxLat1 = max(lats)
  maxLon1 = max(lons)

  end = timer()

  #Print the results
  print(f"{jj} points")
  print(f"  first way {mid-start}")
  print(f"  second way {end-mid}")
  print(f"  speedup {(mid-start)/(end-mid)}")
  assert(minLat == minLat1)
  assert(maxLat == maxLat1)
  assert(minLon == minLon1)
  assert(maxLon == maxLon1)

So why is it faster? Clearly, I’m assuming something wrong. I suspect the main thing that I’m assuming wrong is that the constant 6 multiplied by the O(N) matters. It probably doesn’t, and that’s why we typically drop the constant multipliers in runtime comparisons. It’s likely that list comprehensions and max()/min() of iterables are calls to a very fast C implementation, and are just so much faster than my loop in Python that the fact that I’m doing 6 iterations doesn’t really matter.

Another thing that I’m assuming is that max and min are implemented as linear searches over iterables. It’s entirely possible that iterables store references to their maximum and minimum values, and just return that when asked, rather than going looking. I doubt it, since the overhead on removing an element would be large [2], but it is possible.

I haven’t looked into either of these assumptions, since timing the runs answered the question I had (“Which is faster?”), and the follow-on question (“Why?”) isn’t useful to me at this time.
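For what it’s worth, there’s a third variant I didn’t time, which splits the latitudes and longitudes out in a single pass with zip() and still leans on the C-level min()/max():

lats, lons = zip(*((p["lat"], p["lon"]) for p in waypoints))
bbox = ((min(lats), min(lons)), (max(lats), max(lons)))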

[1] It does some stupid stuff around the poles and across the international date line, for example.

[2] You’ve removed the largest element. What is the new largest? Time to go searching…

Alternatively, the list could be implemented as a linked list with two parallel sets of links, where one set links the elements in their list order, and the other links them in their sorted order, but then the list [1, 3, 5, “pickles”, <built-in method foo of Bar object at 0x1f34b881>, 6] doesn’t have well-defined links for the sorted order.

Drag and Drop Python Objects in WxPython

I’m working on a UI for a system that has agents, which have some qualities, and units, which have multiple agents. I want to be able to display each unit as a list, and drag agents from that list to other units, to reassign them.

There are a lot of examples for using drag and drop with text and files, because WxPython provides drop targets for text and files already. One way that this could be implemented is to serialize the dragged objects to JSON, drop them as text, and then de-serialize them. This has some disadvantages, notably that you end up restricted by what you can pass via JSON. I wanted to pass Python objects, so I used pickle.
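For comparison, the JSON-over-text route would look roughly like this. This is a sketch, not what I actually used; AgentListDropTarget is a made-up name, and it assumes the Agent class from the code further down.

import json
import wx

class AgentListDropTarget(wx.TextDropTarget):
    # Hypothetical drop target that expects a JSON list of agent dicts as text
    def __init__(self, olv):
        super().__init__()
        self.olv = olv

    def OnDropText(self, x, y, text):
        # Only plain dicts/lists/strings/numbers survive the JSON round trip
        agents = json.loads(text)
        self.olv.AddObjects([Agent(**a) for a in agents])
        return True

# On the drag side, something like:
#   data = wx.TextDataObject(json.dumps([a.__dict__ for a in selected]))
#   src = wx.DropSource(self.dataOlv)
#   src.SetData(data)
#   src.DoDragDrop(wx.Drag_DefaultMove)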

What I eventually came up with is below. It uses ObjectListView, which is a great wrapper around WxPython’s ListCtrl. On drag, the selected items of the source list are pickled and passed through the Wx drag and drop mechanism. When they are dropped on another ObjectListView, they are then unpickled and added to that ObjectListView (OLV), and removed from the source OLV.

One thing that this code does leave up to the programmer is ensuring that what goes in on one side of the drag and drop is the same as what is expected out on the other side. Another, slightly more subtle issue is that this unpickles the drop data, so it would be possible for another program to generate malicious pickle data and let you drag it from its UI window to my script’s OLV, whereupon it unpickles into something nasty.

That said, if your attacker is sitting at your computer, launching malicious programs and dragging and dropping stuff out of them, you have already lost, and should probably invest in better door locks.
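If you do want a little insurance anyway, pickle’s documented find_class hook lets you refuse to reconstruct anything except the classes you expect. Something like this sketch (not part of the code below; AgentOnlyUnpickler and safe_loads are made-up names) would only let Agent objects through:

import io
import pickle

class AgentOnlyUnpickler(pickle.Unpickler):
    # Refuse to reconstruct anything except the Agent class
    def find_class(self, module, name):
        if name == "Agent":
            return Agent  # in real code, check the module too
        raise pickle.UnpicklingError(f"Refusing to load {module}.{name}")

def safe_loads(data):
    return AgentOnlyUnpickler(io.BytesIO(data)).load()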

#!/usr/bin/env python
import wx
import pickle
from ObjectListView import ObjectListView, ColumnDefn  #pip install objectlistview
import wx.lib.scrolledpanel as scrolled

# UI with a few list boxes that can be drag/dropped between, and have title bars

class Agent(object):
    # Model of a single agent, has:
    # identifier (string?)
    # range (km)
    # speed (kph)
    # capacity (integer)
    def __init__(self, ident, range, speed, capacity):
        self.ident = ident
        self.range = range
        self.speed = speed
        self.capacity = capacity

    def __repr__(self):
        return f"<Agent: {self.ident}>"

# Drag and drop Drop target that supports receiving pickled 
# python data structures and doing something with them. 
class GenericDropTarget(wx.DropTarget):
    def __init__(self, object):
        super(GenericDropTarget, self).__init__()
        self.object = object

        self.data = wx.CustomDataObject("PickleData")
        self.SetDataObject(self.data)

    def OnData(self, x, y, defResult):
        #print(f"OnData({x},{y})")
        
        if self.GetData():
            # unpickle data and do something with it
            pickled_stuff = self.data.GetData()
            cukes = pickle.loads(pickled_stuff)

            # TODO We are making the assumption that a "PickleData"
            # actually only has a list of Agents in it. 
            # Add some checking before making this a real thing, or
            # limit the type to a more-specific format like "AgentList"
            self.object.AddObjects(cukes)

        return defResult

    def OnDrop(self, x, y):
        #print(f"OnDrop({x},{y})")
        return True

    def OnDragOver(self, x, y, defResult):
        #print(f"OnDragOver({x},{y})")
        return defResult

    def OnEnter(self, x, y, defResult):
        #print(f"OnEnter({x},{y})")
        return defResult

class UnitPanel(wx.Panel):
 
    def __init__(self, parent, unitName="No name set"):
        wx.Panel.__init__(self, parent=parent, id=wx.ID_ANY)

        self.dataOlv = ObjectListView(self, wx.ID_ANY, style=wx.LC_REPORT|wx.SUNKEN_BORDER)

        self.dataOlv.SetColumns([
            ColumnDefn("ID", "left", -1, "ident", minimumWidth=100),
            ColumnDefn("Range", "right", -1, "range", minimumWidth=60),
            ColumnDefn("Speed", "right", -1, "speed", minimumWidth=60),
            ColumnDefn("Capacity", "right", -1, "capacity", minimumWidth=60)
        ])
 
        self.agents = []
        self.dataOlv.SetObjects(self.agents)
        self.dataOlv.Bind(wx.EVT_LIST_BEGIN_DRAG, self.OnDragInit)
        
        # Set up a drop target on the listview
        dt = GenericDropTarget(self.dataOlv) 
        self.dataOlv.SetDropTarget(dt) 

        # Set up a title for this box
        self.unitLabel = wx.StaticText(self, id=wx.ID_ANY, label=unitName)

        mainSizer = wx.BoxSizer(wx.VERTICAL)
        mainSizer.Add(self.unitLabel, proportion=0, flag=wx.ALL, border=5)
        mainSizer.Add(self.dataOlv, proportion=1, flag=wx.ALL|wx.EXPAND, border=5)
        self.SetSizer(mainSizer)

    def populate(self, units):
        self.agents = []
        # for unit in units:
        #     self.agents.append(Agent(unit["ident"], unit["range"], unit["speed"], unit["capacity"]))
        # self.dataOlv.SetObjects(self.agents)
        for unit in units:
            a = Agent(unit["ident"], unit["range"], unit["speed"], unit["capacity"])
            self.dataOlv.AddObject(a)

    def OnDragInit(self, event): 
        # Get all the selected items from this list
        selected = self.dataOlv.GetSelectedObjects()

        # Pickle them and put them in a custom data object
        pickled_selection = pickle.dumps(selected)
        drag_obj = wx.CustomDataObject("PickleData")
        drag_obj.SetData(pickled_selection) 

        #Create a drag and drop source from this ObjectListView
        src = wx.DropSource(self.dataOlv) 
        src.SetData(drag_obj)
        print("Drag started")
        result = src.DoDragDrop(wx.Drag_DefaultMove) 
        if result == wx.DragCopy:
            # We don't copy because agents are hardware
            self.dataOlv.RemoveObjects(selected)
        elif result == wx.DragMove:
            # Remove the data from here, add it to another list
            self.dataOlv.RemoveObjects(selected)
        else:
            # Default, do nothing
            print("Nothing, nothing, nothing at all")
        

class AssetFrame(wx.Frame):
    def __init__(self):
        wx.Frame.__init__(self, parent=None, id=wx.ID_ANY, 
                          title="ObjectListView Demo", size=(800, 600))
        
        self.panel = scrolled.ScrolledPanel(self, id=wx.ID_ANY)

        self.mainSizer = wx.BoxSizer(wx.VERTICAL)
        self.panel.SetSizer(self.mainSizer)
        self.panel.SetupScrolling()
        self.Show()

    def populate(self, config):
        for unit in config["units"]:
            unit_panel = UnitPanel(self.panel, unitName=unit["name"])
            unit_panel.populate(unit["agents"])
            self.mainSizer.Add(unit_panel)

 
if __name__ == "__main__":
    app = wx.App(False)

    # We're going to need to be able to populate the frame from the config,
    # so represent the data as a data structure and initialize with that
    config = {"units": [{"name": "Unimatrix 01",
                        "agents": [
                            {"ident": "7 of 9", "range": 10, "speed": 10, "capacity": 12},
                            {"ident": "Third of 5", "range": 10, "speed": 10, "capacity": 12}
                        ]},
                        {"name": "Unit 2",
                        "agents": [
                            {"ident": "u2s1", "range": 10, "speed": 10, "capacity": 12},
                            {"ident": "u2a2", "range": 10, "speed": 10, "capacity": 12},
                            {"ident": "u2a3", "range": 10, "speed": 10, "capacity": 12}
                        ]},
                        {"name": "Unit Foo",
                        "agents": [
                            {"ident": "bar", "range": 10, "speed": 10, "capacity": 12},
                            {"ident": "baz", "range": 10, "speed": 10, "capacity": 12},
                            {"ident": "quux", "range": 10, "speed": 10, "capacity": 12}
                        ]},
                        {"name": "Enterprise",
                        "agents": [
                            {"ident": "Galileo", "range": 10, "speed": 10, "capacity": 12},
                            {"ident": "Copernicus", "range": 10, "speed": 10, "capacity": 12}
                        ]},
                        {"name": "GSV Insufficent Gravitas",
                        "agents": [
                            {"ident": "GCU Nervous Energy", "range": 1000000, "speed": 3452334, "capacity": 13452342},
                            {"ident": "GCU Grey Area", "range": 1000000, "speed": 234523546, "capacity": 234562312}
                        ]}] 
              }

    frame = AssetFrame()
    frame.populate(config)
    app.MainLoop()

I called the unpickled data “cukes” because this is demo code, and I was being silly. A cucumber is, after all, what you get when you undo the process of making pickles. You may want to change that if you copy/paste this into production code.

A Flat Earth Would Be Odd

I’m not sure what “flat earthers” actually believe, but assuming the world is literally flat leads to some interesting results. For the sake of keeping things simple, let’s assume that the world is flat like a coin: locally bumpy, but overall shaped like a disk. This assumption is based on what the word “flat” means. A saddle isn’t flat, so a hyperbolic-curved earth doesn’t count as “flat” in any reasonable way. Further, let’s assume that it keeps the same surface area that the, shall we say, “conventional” model claims that it has. This assumption lets us avoid the absurdities [1] required to preserve distances, and so travel times, while unwrapping a sphere to cover a disk.

These assumptions tell us how big the disk is. The earth’s surface area is allegedly 196.9 million square miles. A disk’s surface area is given by πr², so do the math to get r and you wind up with a radius of 7.916 million miles. This may present astronavigation problems, as the moon is only 238,900 miles away, so it might hit the disk… if the moon were real! [2]

Ok, so we all live on a disk that’s about 16 million miles across. Because I’m comically egotistical, I’m going to say that my home town is in the middle of the disk, equidistant from every edge.

Now we come to an interesting point: How thick is the disk? Let’s assume that we believe in gravity. The gravity in my home town is one G, and it causes objects to fall down, which is to say “towards the surface of the earth”. Under the conventional model, this is because the earth is under me, and so the gravity caused by the large mass of the earth exerts a force on objects above it. Now there are two ways we can go: either the earth is made of the stuff that it is observed to be made of, and has the density it is generally observed to have, or it’s made out of something far more dense. All that this really varies is how thick the disk is under my home town. To have 1G there, using the conventional materials, requires at least the alleged thickness of the round earth, which is to say about 7,917 miles. Using something denser makes it thinner, without affecting the gravity, but there are limits on how dense matter can get.

Ok, so far, so good. However, we’ve set a little bit of a trap for ourselves here. Everywhere you go on the surface of the earth, the gravity is about 1G, so everywhere you go, the disk earth has to be about 8k miles thick. In my home town, at the center of the disk, this isn’t a problem, because the gravity in all the other directions balances out.

What’s that you say? Yes, gravity in other directions. You see, we’re talking about a disk that measures about 8k miles thick and 16M miles across. If you’re in the center, there are equal amounts of disk around you in all directions, so the pull in all the other directions balances out. If you’re off center, there’s more mass on one side of you than on the other, and so there is a component to the gravitational pull that isn’t straight down, and is unbalanced by the ratio of how much mass is on each side.

Initially, this would probably be pretty subtle. Things would fall a little bit in the direction of my home town, but more or less straight down. Friction would suffice to keep things on surfaces, but round things would always roll towards my home town. It would only be a little harder to walk away from my home town than towards it. I doubt it would affect which way water goes down the toilet all that badly, although the water would pile up on one side of the bowl. However, as you got towards the edge, things would get FUCKING DIRE.

How dire? Well, let’s look at the volume of that disk earth, which is properly a cylinder now that we know how thick it is. A cylinder that’s 16,000,000 miles across and 8,000 miles thick has a volume of 6,430,000,000,000,000,000 cubic miles. The conventional model earth has a volume of 260,000,000,000 (that is to say, 260 billion) cubic miles, and exerts a gravitational pull of 1G when you’re on the surface, which is to say that all of it is under you. When you’re on the edge of the disk earth, which is to say that almost all of it is, say, east of you, it exerts a gravitational pull of (around) 24,730,769.2308G. So to understand the force exerted on some poor schmuck who happens to get teleported from my town to the literal eastern edge of the world, assume that the ISO standard schmuck weighs 150 lbs. He will suddenly experience a thrust of about four billion pounds of force to the west. For comparison, a Saturn V rocket generated about 7.5 million (with an “m”) pounds of thrust, or about 500 times less thrust than people living on the edge experience as a consequence of just being there.

Unfortunately, since some of that force (at least 1G of it, thanks to the thickness of the disk) is downwards, towards the center of mass of the disk, rather than the location of my home town, the schmuck is going to hit the ground going absurdly fast and get spread all over it.

But wait, what is that ground made of? We said earlier that this disk is made out of normal earth stuff, which you can go out and observe to be mostly silicate-based rocks. The rocks under my home town have about 8 million miles of rock around them, pressing in towards the center of mass of the disk. Extremely weird stuff happens to matter under those kinds of pressures. Hydrogen (theoretically) becomes a metal. Atomic nuclei get mashed into each other.

That said, my home town is going to have other problems. For example, all the water in the world, and all the air, and all the stuff that’s far enough away to experience mostly-sideways gravity, is all going to flow towards the center of mass of the disk, and some of it will be coming in very fast. Since my town is barely above sea level under the conventional model, I think it’s going to get both extremely hot, due to the abrupt change in pressure, and rather wet, although possibly not before the disaster of degenerate matter that’s forming under it gets to the surface.

Alright flat-earthers, you got my home town into this, you get it out. Why are none of these effects observed on the ostensibly flat earth that we live on?

Well, maybe it’s not 8,000 miles thick. Maybe, it is in fact quite thin, and They can manipulate gravity to provide ~1G everywhere you go. They put some thickness everywhere that anyone decides to dig, or anywhere a tree falls over, but everywhere else it’s…. 1 inch thick. One inch is pretty thin, but that still puts the volume of the disk around 99,804,672,000,000,000 cubic miles, and so the inwards G-force experienced by someone on the edge at around 383,864.123G. This is still a troubling amount of force, but clearly if They can provide 1G over the whole surface of the earth, They can sort this out too.

That said, how long have They been doing this? If the earth has always been a disk, then obviously They were doing it before we evolved, so they’re not human. If They did it recently, why did no one notice the change? Was that the “road work” that was making my commute to work slow this morning? Either way, this moves “Them” from the category of “Federal/NWO government conspiracy to hide the truth, man!” to “Capricious god or gods with odd senses of humor”. At that point, there’s no use arguing what shape the world is, because it might be different tomorrow.


[1] These absurdities extend from the simple “everyone who has ever traveled is part of the conspiracy and lies about how long it takes” to the complex “THEY (It’s always ‘they’, innit?) can alter spacetime to slow or speed up travel”.

[2] Spoiler alert: it is.

Let’s make a 3D rendering of a forest!

Let’s do it with a video shot from a GoPro by someone walking around a camp site!

I’m not actually sure this is all that easy, or even possible. The main way to do this sort of rendering from a monocular camera is structure from motion (SfM), which relies on SIFT features in the image, and from what I recall of SIFT features, I’m not 100% convinced that they’re going to track very well from frame to frame, because the frames are mostly full of trees, and leaves look a lot like each other. On the other hand, there’s plenty of texture in the image, so maybe it will work just fine.

I’m planning to use Visual SfM, for which there are instructions online. Those instructions are for Ubuntu 12.04, but I’m on 16.04, so there are some modifications required. I got and ran the latest CUDA installer, using the local install rather than the network one because NVidia broke the network one somehow. I suspect a bad URL in the step where you install the keys.

I got what I think are the dependencies with the command

sudo apt-get install libgtk2.0-dev glew-utils libdevil-dev libboost-all-dev libatlas-cpp-0.6-dev libatlas-dev imagemagick libatlas

The build for VSfM was so quick I thought something had gone wrong, but it did build.

The SiftGPU link in the instructions is dead, but this looks legit. The make process threw some warnings at me, but no errors.

Rather than copying the library once it’s built, I linked it with ln -s ../../SiftGPU/bin/libsiftgpu.so ./libsiftgpu.so, we’ll see if this comes back to haunt me later.

I got PBA from the suggested place, and made it without adding the include for stdlib.h, since it’s 1.05 rather than 1.04 now, and perhaps they fixed that. It seems good, as it built (although with some warnings).

I went ahead and applied the mylapack hack to pmvs-2; it seems to have gone well.

cd pmvs-2/program/main/
mv mylapack.o mylapack.o.bak
make clean
mv mylapack.o.bak mylapack.o
make depend
make

Graclus 1.2 built just fine (again with the warnings, though). Read the readme; it explains how to set the number of bits used, and if you’re not on a laughably ancient machine, the value you want is 64.

I installed the cmvs from here; you want the one called cmvs-fix2.tar.gz, because the non-fix versions seem to have issues with finding lapack.

After that I had to reboot, because everything that built was linked against CUDA and video drivers that I wasn’t running, and so it all crashed when it tried to render anything in X. After rebooting, everything came up fine and I can run VisualSFM, so the next part of this will be getting the video out of the GoPro and feeding it to VisualSFM.

Configuring video cropping and resizing with ROS’s image_proc

The existing documentation on this seemed a little sparse, so I figured I’d post up an example based on what I figured out for my project.

I have a 1024×768 sensor_msgs/Image coming out of image rectification on the topic /overhead_cam/image_rect, with camera info on the predictably-named topic /overhead_cam/camera_info. What I want is to crop off the top 120 px of that image so that it’s the right aspect ratio to get scaled to 1680×1050 without distortion (technically, scaling is a distortion, but it’s the only distortion I want). Then I want to scale it up to 1680×1050.

<!-- Video cropping -->
<node pkg="nodelet" type="nodelet" args="standalone image_proc/crop_decimate" name="crop_img">
  <param name="x_offset" type="int" value="0" />
  <param name="y_offset" type="int" value="120" />
  <param name="width" type="int" value="1024" />
  <param name="height" type="int" value="684" />

  <!-- remap input topics -->
  <remap from="camera/image_raw" to="overhead_cam/image_rect_color"/>
  <remap from="camera/image_info" to="overhead_cam/camera_info"/>

  <!-- remap output topics -->
  <remap from="camera_out/image_raw" to="camera_crop/image_rect_color"/>
  <remap from="camera_out/image_info" to="camera_crop/camera_info"/>
</node>

<!-- Video resizing -->
<node pkg="nodelet" type="nodelet" args="standalone image_proc/resize" name="resize_img">
  <!-- remap input topics -->
  <remap from="image" to="camera_crop/image_rect_color"/>
  <remap from="camera_info" to="camera_crop/camera_info"/>

  <!-- remap output topics -->
  <remap from="resize_image/image" to="camera_resize/image_rect_color"/>
  <remap from="resize_image/camera_info" to="camera_resize/camera_info"/>
</node>

<!-- Dynamic reconfigure the resizing nodelet -->
<node name="$(anon dynparam)" pkg="dynamic_reconfigure" type="dynparam" args="set_from_parameters resize_img">
  <param name="use_scale" type="int" value="0" />
  <param name="width" type="int" value="1680" />
  <param name="height" type="int" value="1050" />
 </node>

The first bit runs the crop_decimate nodelet as a standalone node and does the cropping. It can also decimate (shrink images by deleting rows/columns), but I’m not doing that.

The second bit starts up the video resizing nodelet as standalone, and the third bit fires off a dynamic reconfigure signal to set the image size. If you set use_scale to true, it will scale an image by some percentage, which isn’t quite what I wanted.

I imagine it’s possible to fold these into image rectification as nodelets (image rectification is itself a debayer nodelet and two rectify nodelets, one for the mono image and one for the color image), but I didn’t look into that because this worked for me.

 

In which Ellipses are Dealt with Handily

I’m making a user interface where people can draw a lasso select on an image, and I want to see if certain points in the image are within that lasso select. Now that I think about it, I could probably have done some form of polygon membership test (e.g. raycasting) for each of the points that I wanted to check, with the points of the lasso forming the polygon, but that has a few problems. One of them is that the ends of the lasso can overlap, but that can be handled by e.g. a convex hull. In fact, I may end up doing that, because the current approach has the problem that it can’t handle an arbitrarily-shaped lasso selection.

What I ended up doing as a first cut, however, has some other useful properties. I had the software find a minimum enclosing ellipse of the points for the lasso selection, and then tested the points I wanted checked to see if they were inside the ellipse.

 

import math
import numpy as np

def min_bounding_oval(points, tolerance=0.01):
  #Create a numpy 2xN matrix from a list of 2D points
  P = np.matrix(points).T

  # Pseudocode from https://stackoverflow.com/questions/1768197/bounding-ellipse
  # Pythonification by me
  # Input: A 2xN matrix P storing N 2D points 
  #        and tolerance = tolerance for error.
  # Output: The equation of the ellipse in the matrix form, 
  #         i.e. a 2x2 matrix A and a 2x1 vector C representing 
  #         the center of the ellipse.

  # Dimension of the points
  d = 2;   
  # Number of points
  N = len(points);  

  # Add a row of 1s to the 2xN matrix P - so Q is 3xN now.
  Q = np.vstack([P,np.ones((1,N))]) 

  # Initialize
  count = 1;
  err = 1;
  #u is an Nx1 vector where each element is 1/N
  u = (1.0/N) * np.ones((N,1))

  # Khachiyan Algorithm
  while err > tolerance:
    # Matrix multiplication: 
    X = Q * np.diagflat(u) * Q.T

    M = np.diagonal(Q.T * X.I * Q)

    # Find the value and location of the maximum element in the vector M
    maximum = M.max()
    j = np.argmax(M);

    # Calculate the step size for the ascent
    step_size = (maximum - d -1)/((d+1)*(maximum-1));

    # Calculate the new_u:
    # Take the vector u, and multiply all the elements in it by (1-step_size)
    new_u = (1 - step_size) * u ;

    # Increment the jth element of new_u by step_size
    new_u[j] = new_u[j] + step_size;

    # Store the error by finding the square root of the SSD 
    # between new_u and u
    err = math.sqrt(((new_u - u)**2).sum());
    
    # Increment count and replace u
    count = count + 1;
    u = new_u;

  # Put the elements of the vector u into the diagonal of a matrix
  # U with the rest of the elements as 0
  U = np.diagflat(u);

  # Compute the A-matrix
  A = (1.0/d) * (P * U * P.T - (P * u)*(P*u).T ).I

  # And the center,
  c = P * u

  return [A, c]

That gets me A, a 2×2 matrix representing the ellipse and c, a 2×1 matrix representing the center of the ellipse.

To check if a point is in the ellipse, some fiddling has to be done to A.

  
#Convert A to a whitening matrix (W = A**(1/2))
w, v = np.linalg.eig(A)
D = np.diagflat(w)
W = v * np.sqrt(D) * v.I

#Subtract the center of the ellipse and use the whitening matrix
#tag_x and tag_y are the x and y positions of the point to check
p = np.matrix([[tag_x],[tag_y]])
p_center = p - c
p_white = W * p_center

#Check if the whitened point is in the ellipse
if np.linalg.norm(p_white) <= 1:
     return True #or whatever you want to do with it

Note that I haven’t proven this code correct or anything, just run it and had it work for me.
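For completeness, here’s roughly how the two pieces fit together, with made-up lasso points and tag positions; point_in_ellipse is just a wrapper name for the whitening check above, and it assumes the min_bounding_oval function and imports from earlier.

def point_in_ellipse(A, c, tag_x, tag_y):
  #Whitening matrix W = A**(1/2), same steps as above
  w, v = np.linalg.eig(A)
  W = v * np.sqrt(np.diagflat(w)) * v.I
  p_white = W * (np.matrix([[tag_x], [tag_y]]) - c)
  return np.linalg.norm(p_white) <= 1

lasso_points = [(10, 12), (40, 9), (52, 30), (35, 55), (8, 41)]
A, c = min_bounding_oval(lasso_points)
print(point_in_ellipse(A, c, 30, 30))  # near the middle of the lasso, expect True
print(point_in_ellipse(A, c, 90, 90))  # well outside it, expect False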

Chatbots as Translation

I got a translation program based on deep neural networks (http://opennmt.net/ if anyone wants to play along at home). I’m training it on a corpus of my previous text chats. The “source language” is everything that everyone has said to me, and the “target language” is what I said in return. The resulting model should end up translating from things that someone says to appropriate responses. My endgame is to hook it up to an instant messaging client, so people can chat with a bot that poses as me.

This has a couple of problems. The main one is that statistical translation generally works by converting the input language into some abstract form that represents a meaning (I’m handwaving so hard here that I’m about to fly away) and then re-representing that meaning in the output language. Essentially, the overall concept is that there is some mathematical function that converts a string in the input language to a string in the output language while preserving meaning, and that function is what is learned by the neural network. Since what someone says and what I respond with have different, but related meanings, this isn’t really a translation problem.

The other problem comes up when I do the second part of this plan, which is to train the network with a question and answer corpus. At its most abstract, a question is a request to provide knowledge which is absent, and an answer is an expression of the desired knowledge. “Knowledge”, in this case, refers to factual data. One could attempt an argument that by training on a Q&A corpus, the network is encoding the knowledge within itself, as the weights used in the transformation function. As a result, the network “knows” at least the things that it has been trained on. This “knowing” is very different from the subjective experience of knowing that humans have, but given the possibility that consciousness and subjective experience may very well be an epiphenomenon, maybe it has some similarities.

Unfortunately, this starts to fall apart when the deep network attempts to generalize. Generalization, in this case, is producing outputs in response to inputs that are not part of the training input data set. If one trains a neural network for a simple temperature control task, where the input is a temperature, and the output is how far to open a coolant valve, the training data might look like this:

Temperature   Valve Position
0             0.0 (totally closed)
10            0.1
20            0.2
30            0.3
40            0.4
50            0.5 (half open)
60            0.6
70            0.7
80            0.8
90            0.9
100           1.0 (fully open)

So far, so good. This is a pretty simple function to learn: the valve position is 0.01 * Temperature. The generalization comes in when the system is presented with a temperature that isn’t in the training data set, like 43.67 degrees, which one would hope results in a valve setting of 0.4367 or thereabouts. There is a problem in that temperatures less than zero or greater than 100 degrees ask for the valve to be more than completely shut, or more than fully open, but we can just assume that the valve has end stops and doesn’t do that, rather than trying to automatically add a second valve and open that too.
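As a toy sketch, the mapping the network should learn, with the end stops bolted on, is just:

def valve_position(temperature):
    # The function from the training table: 0.01 * temperature,
    # clamped at the valve's physical end stops
    return max(0.0, min(1.0, 0.01 * temperature))

print(valve_position(43.67))  # about 0.4367
print(valve_position(-20))    # 0.0, the valve just stays shut
print(valve_position(250))    # 1.0, fully open, no second valve appears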

The problem comes when we start generalizing across questions and answers. Assume there is some question in the training data that asks “My husband did this or that awful thing, should I leave him?” and the answer is mostly along the lines of “Yes, bail on that loser!”, and another question that asks “My husband did some annoying but not really awful thing, should I leave him?” and the answer is “No, concentrate on the good in your relationship and talk to him to work through it.” These are reasonable things to ask, and reasonable responses. Now imagine that there is a new question. The deep network does its input mapping to the space of questions, and the result (handwaved down to a single value for explanation purposes) falls somewhere between the representations for the “awful thing” question and the “annoying thing” question. Clearly, the result should fall somewhere between “DTMFA” and “Stick together”, but “Hang out near him” isn’t really good advice and “Split custody of the kids and pets, but keep living together” seems like bizarro-world nonsense. There isn’t really a mathematical mapping for the midrange here. Humans have knowledge about how human relationships work, and models of how people act in them, that we use to reason about relationships and offer advice. This kind of knowing is not something deep networks do (and it’s not even something that anyone is trying to claim that they do), so I expect that there will be a lot of hilarious failures in this range.

Ultimately, this is what I’m hoping for. I’m doing this for the entertainment value of having something that offers answers to questions, but doesn’t really have any idea what you’re asking for or about, and so comes up with sequences of words that seem statistically related to it. We (humans) ascribe meaning to words. The deep network doesn’t. It performs math on representations of sequences of bytes. That the sequences have meaning to us doesn’t even enter into the calculations. As a result, its output has flaws that our knowledge allows us to perceive and reflect on.

Plus, I’m planning to get my Q&A corpus from Yahoo Answers, so not only will the results be indicative of a lack of knowing (in the human sense), they’ll also be amazingly low quality and poorly spelled.

 

TF-IDF in Python

I am trying to process a bunch of text generated by human users to figure out what they are talking about, in the context of an experiment where there were robots a person could be controlling, a few target areas the robots could move to, and a few things they could move, like a crate. One thing that occurred to me was to use TF-IDF, which, given a text and a collection of texts, tells you what words in the text are relatively unusual to that particular text, compared to the rest of the collection.

It turns out that that’s not really what I wanted, because the words that are “unusual” are not really the ones that the particular text is about.

select(1.836) all(3.628) red(2.529) robots(1.517) ((1.444) small(5.014) drag(2.375) near(3.915) lhs(4.321) of(1.276) screen(2.018) )(1.444)

This is a sentence from the collection, and the value after each word is the TF-IDF score for that word. The things I care about are that it’s about robots, and maybe a little that it’s about the left hand side of the screen. “Robots” actually got a pretty low score (barely more than “of”), but “small” got the highest score in the sentence.

At any rate, this is how I did my TF-IDF calculation. Add documents to the object with add_text, get scores with get_tfidf. It uses NLTK for tokenization; you could also use t.split(" ") to break strings up on spaces.

import math
import nltk  # for word_tokenize; needs the 'punkt' tokenizer data downloaded

class TFIDF(object):
  def __init__(self):
    #Count of docs containing a word
    self.doc_counts = {}
    self.docs = 0.0

  def add_text(self, t):
    #We're adding a new doc
    self.docs += 1.0
    #Get all the unique words in this text
    uniques = list(set(nltk.word_tokenize(t)))
    for u in uniques:
      if u in self.doc_counts.keys():
        self.doc_counts[u] += 1
      else:
        self.doc_counts[u] = 1

  def get_tfidf(self, t):
    word_counts = {}
    #Count occurrences of each word in this text
    words = nltk.word_tokenize(t)
    for w in words:
      if w in word_counts.keys():
        word_counts[w] += 1
      else:
        word_counts[w] = 1
    #Calculate the TF-IDF for each word
    tfidfs = []
    for w in words:
      #Doc count is either 0 (the word appears in no docs) or the number of docs containing it
      w_docs = 0
      if w in self.doc_counts.keys():
        w_docs = self.doc_counts[w]

      #the 1 is to avoid div/zero for previously unseen words
      idf = math.log(self.docs/(1+w_docs))
      tf = word_counts[w]
      tfidfs.append((w, tf * idf))
    return tfidfs
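Using it looks something like this (the strings here are made up, and it assumes the class above):

tfidf = TFIDF()
for doc in ("select all red robots",
            "move the crate to the target area",
            "drag the small robot near the lhs of screen"):
  tfidf.add_text(doc)

for word, score in tfidf.get_tfidf("select all red robots near the crate"):
  print(f"{word}({score:.3f})")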