Journal Entry #78
Okay. Programming. Where to even start?
Neural networks are a good place, I suppose. Neural networks are the bread and butter of developmental robotics. The behaviors needed to navigate real-world scenarios are far too complex to program by hand. There are simply too many variables to take into consideration. Neural networks allow complex and nuanced behaviors to emerge naturally, through trial and error learning.
Mariimo is going to use neural networks extensively. She’s going to be running on a multiple network system. Each sensory system will have its own dedicated neural network, roughly analogous to the various sensory cortices of a human brain.
Her sensors will feed raw data into the corresponding network’s input layer. The data will then be interpreted within the web of neurons that compose the hidden layers, and the results will be spat out the other side. From the output layer, the data will be rerouted into a central neural network, where the compiled data from every sensory network will be compared, and patterns identified.
It’s my hope that by keeping the sensory networks separate, I can prevent a synesthesia-like blending of senses. A machine with synesthesia could very well prove to be a worthwhile project, but it’s not my goal with Mariimo.
Journal Entry #79
Speaking of goals, they will be absolutely vital to Mariimo’s development. Without goals, there is no drive. No learning. An artificial intelligence without goals would have no reason to take any action at all. The question is, what motivates Mariimo?
There is a concept in machine learning known as “reinforcement learning.” It works on the same principles in machine learning as it does in behavioral psychology; i.e. positive and negative reinforcement. This concept will be at the core of Mariimo’s behavior.
Think of it as a scoreboard. Each time Mariimo perceives a stimulus designated as positive, a point is added. Each time she perceives a stimulus designated as negative, a point is removed. By specifying that her goal is to increase her score, she can be motivated to seek out positive stimuli, while avoiding negative ones.
Her metaphorical scoreboard will be separated into several different categories. This will allow different types of stimuli to be weighted by importance. In this way, important stimuli can be given priority over unimportant ones. I’ll provide these weights in the beginning, but she’ll be able to adjust her own priorities over time, given new information.
I’m going to spend the next little while deciding, sensor by sensor, which stimuli will be designated as positive, and which will be designated as negative.
Journal Entry #80
While striving toward goals is all well and good, learning to achieve those goals without outside influence would be a very, very slow process. This is where mimicry comes in. Mimicry is going to be another core aspect of Mariimo’s behavior. It will allow her to learn by example, rather than simply be left to her own devices.
I want to be able to teach Mariimo in two ways. The first is passive observation. I want her to be able to look at me, observe my actions, and attempt to replicate them. This can be accomplished through the use of body tracking software. I suspect this type of mimicry will be the less precise of the two. It will likely require a bit of trial and error before she can translate my movements into something she can actually use.
The second is kinesthetic demonstration, where I take hold of Mariimo’s body and guide her movements directly. This type of mimicry will be more precise, I think. There are fewer steps between demonstration and proprioceptive feedback. I can use this technique to demonstrate new behaviors, as well as assist in refining existing ones.
It may seem as if kinesthetic demonstration is the superior teaching method, but I personally think passive observation is by far the more interesting of the two. Kinesthetic demonstration requires me to take on an active teaching role. I choose what I’m going to teach her.
Passive observation, on the other hand, is always running in the background. She’ll be learning from my behavior whether I intend her to or not.
What she chooses to adopt for herself will be entirely up to her.
Journal Entry #81
The first sensory system I’m going to approach is vision. As a humanoid machine, Mariimo is likely to adopt vision as her primary sense. There’s no guarantee that will be the case, but I’d say it’s a pretty safe bet.
As far as vision is concerned, goals will be achieved through gaze. Imagine Mariimo’s field of view. In its center, there will be an invisible reticle. This represents her macular vision. Everything outside that central point represents varying degrees of peripheral vision.
Now imagine a positive visual stimulus. A bright red ball, for instance. When Mariimo’s reticle overlaps that red ball, the result is positive reinforcement. An increasing score. The further the ball strays from the reticle, the slower the score increases. Now, imagine a negative visual stimulus. Let’s say, a dark blue ball. The reinforcement algorithm is reversed. Looking directly at the blue ball results in a decreasing score. The further the ball strays from the reticle, the slower the score decreases.
I can foresee a problem arising with the use of this system, however. Hyperfixation. If the red ball provides a never-ending stream of positive reinforcement, she would have no reason to look away. She could potentially just stare at the ball until her supercapacitors run dry.
Hyperfixation can be avoided through the use of diminishing returns. The longer she stares at the ball, the less intense the reward. Eventually, the reward will become unsatisfactory, and her attention will shift to a positive stimulus of greater interest.
Artificial boredom, if you will.
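As a rough sketch only, the gaze reward and its diminishing returns could be written something like this. The Gaussian falloff, the exponential boredom curve, and every name and constant here are placeholders I am assuming for illustration, not final values.

```python
import math

def gaze_reward(distance, valence, seconds_fixated,
                falloff=0.25, boredom_rate=0.5):
    """Reward per time step for one visual stimulus.

    distance: offset of the stimulus from the reticle,
              0.0 (centered) to 1.0 (edge of view)
    valence: +1.0 for a positive stimulus, -1.0 for a negative one
    seconds_fixated: how long the gaze has rested on this stimulus
    """
    # The effect is strongest at the reticle and fades toward the periphery.
    proximity = math.exp(-(distance / falloff) ** 2)
    # Diminishing returns (assumed to apply to rewards only):
    # the longer the stare, the weaker the reward.
    if valence > 0:
        boredom = math.exp(-boredom_rate * seconds_fixated)
    else:
        boredom = 1.0
    return valence * proximity * boredom
```

With these placeholder curves, a centered red ball starts at the full reward and fades the longer it is stared at, while a blue ball hurts less the further it drifts from the reticle.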
Journal Entry #82
Color seems like a straightforward place to begin setting visual goals. Not bright or vivid colors, necessarily. That would be too obvious. Too simple.
I want Mariimo to seek out contrasting colors. Colors that stand out from their surroundings. A dark stone on a sandy beach. A red leaf on a forest floor. To Mariimo, an object’s relevance will be dictated by its uniqueness. How it differs from the things around it.
By seeking out contrasting colors, Mariimo will be able to quickly differentiate objects of interest from their surroundings, and choose to investigate further.
That seems like an elegant way to view the world.
Journal Entry #83
I wanted to avoid assigning negative reinforcement to colors. Colors aren’t inherently good or bad, so I didn’t see any point in it. It has since occurred to me, however, that I should make at least one exception.
If every pixel in Mariimo’s field of view suddenly goes a stark, overexposed white, it’s very likely that she’s damaging her sensors. I’d obviously like to avoid a situation where she chooses to stare at the sun for a prolonged period of time.
To be clear, I’m not designating the color white as a negative stimulus. Only overexposure. I’m going to implement a threshold. If a certain percentage of sensor pixels become overexposed, it will result in negative reinforcement. In this way, she can learn to shield her eyes from bright light.
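A minimal sketch of that threshold, assuming 8-bit brightness values. The cutoff of 250 and the sixty percent limit are numbers I picked purely for illustration.

```python
def overexposure_penalty(pixels, white_level=250, max_fraction=0.6):
    """Return negative reinforcement if too much of the frame is blown out.

    pixels: flat list of 8-bit brightness values (0 to 255)
    """
    if not pixels:
        return 0.0
    # Count only pixels at or above the (assumed) clipping level.
    blown_out = sum(1 for p in pixels if p >= white_level)
    fraction = blown_out / len(pixels)
    if fraction < max_fraction:
        return 0.0  # below the threshold: a neutral stimulus
    # Past the threshold, the penalty grows with the overexposed fraction.
    return -fraction
```

A bright but properly exposed white wall sits below `white_level` and stays neutral. Only clipped pixels count.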
I thought about designating underexposure as a negative stimulus as well. I’d hoped it would help her avoid dimly lit areas, and therefore keep her safe. Those smartphone cameras aren’t built for night vision.
Eventually I realized that a total lack of visual stimuli would likely result in reduced activity. I suspect she’d simply choose to stay put until the light returns. Like a bird wrapped in a towel. That’s safest, I think.
Besides, it seems a little cruel to make her afraid of the dark.
Journal Entry #84
Mariimo’s stereoscopic cameras are going to collect depth data in the form of a point cloud. As far as I can tell, this is one of the least computationally expensive ways to represent 3D space.
Proximity feedback will be useful in several ways. It will be particularly important in learning obstacle avoidance. Mariimo will be able to compare point cloud data with data from her proprioceptive and touch networks. The patterns she’ll find will eventually allow her to develop an understanding of barriers. I do suspect she’ll try to pass through a few walls before she arrives at this point, however.
Mariimo’s point cloud will allow her to create a mental map of her surroundings. By committing point cloud data to memory, she’ll become increasingly adept at navigating familiar environments.
Proximity will also be useful for determining the relevance of an object. The closer an object is, the more accessible it is, and therefore the more relevant it is. An object that is nearby will be more likely to catch Mariimo’s attention.
What Mariimo will consider relevant becomes a complex question when combining multiple stimuli. Will a faraway object with striking colors take priority over a nearby object that blends into the background? It’s difficult to say.
By the time I get every sensory system up and running, it will be nearly impossible to say.
Journal Entry #85
The motion of an object will also be an important visual clue as to its relevance. A moving object is far more likely to be of immediate significance than a stationary one, and will therefore be weighted accordingly by Mariimo’s neural network.
There are several types of motion that Mariimo will need to keep track of. The first is camera motion. This has to be detected through visual cues, as data from her proprioceptive network will not be available to her visual network until the outputs of both are rerouted to her central network.
Once in the central network, comparisons between gyroscope data and camera data can be made, and patterns identified. But in the meantime, she needs a way to differentiate camera motion and environmental motion.
The reason for this is twofold. The obvious reason is that it will help Mariimo interpret her environment. The less obvious reason is that camera motion cannot be rewarded. If it were, the end result would be constant and repetitive head shaking behavior.
I’m going to approach this problem through the use of pixel tracking. If all of the pixels in Mariimo’s field of view travel in roughly the same direction, at roughly the same speed, it’s most likely the result of camera motion. This data can be safely treated as a neutral stimulus.
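Here is a hedged sketch of that test. It assumes the pixel tracker already hands back one displacement vector per tracked point. The function name and all three thresholds are hypothetical.

```python
import math

def is_camera_motion(vectors, tolerance=0.2, min_speed=0.5, agreement=0.9):
    """Guess whether a set of tracked-pixel motion vectors is camera motion.

    vectors: list of (dx, dy) displacements, one per tracked pixel
    Returns True when nearly every vector matches the average motion.
    """
    if not vectors:
        return False
    mean_dx = sum(dx for dx, _ in vectors) / len(vectors)
    mean_dy = sum(dy for _, dy in vectors) / len(vectors)
    mean_speed = math.hypot(mean_dx, mean_dy)
    if mean_speed < min_speed:
        return False  # the view as a whole is essentially still
    # Count the vectors that stay close to the average direction and speed.
    agreeing = sum(
        1 for dx, dy in vectors
        if math.hypot(dx - mean_dx, dy - mean_dy) <= tolerance * mean_speed
    )
    return agreeing / len(vectors) >= agreement
```

When this returns True, the frame’s motion can be treated as neutral. A single object crossing an otherwise still scene fails the test, and so remains eligible for reward.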
The second type of motion is spatial, i.e. movement along the X and Y axes and, thanks to Mariimo’s stereoscopic vision, the Z axis. This is where visual goals come into play. As with contrasting colors and object proximity, the act of looking directly at a moving object will result in positive reinforcement. In theory, this should result in motion tracking behavior.
The third and final type of motion is rotation, i.e. the orbit of tracked points around an axis. An understanding of rotation will be vital to the development of skills such as grasping and object manipulation.
One final note. While motion is an easy visual shorthand for determining the relevance of an object, it’s important that the object not lose its relevance after it becomes stationary. Just because an object has stopped moving does not mean Mariimo should immediately lose interest in it.
With that in mind, any motion of a tracked object will result in a reinforcement multiplier being assigned to that object. The end result is that Mariimo will find an object that has moved in the past to be far more interesting than one that has always remained stationary.
I think that’s true of most people as well, honestly.
Journal Entry #86
Object recognition. This is where neural networks really shine. Trying to program something like this by hand would be a fool’s errand. By using neural networks, the program itself does most of the dirty work.
Mariimo is going to collect a lot of visual data. Sixty frames per second. By training her neural network on these images, she’ll begin to identify patterns in what she’s seeing. By associating these patterns with a tracked object, she’ll be able to recall that object at a later date.
Not only that, but by classifying objects by their shared properties, she’ll be able to make assumptions about an object that she’s never encountered before. Suppose she’s encountered a fork in the past, but its matching spoon is still unfamiliar. By recognizing the similarities between the two, she might place both objects into a single category, alongside other cutlery. When she finally encounters a knife for the first time, she might very well recognize it as cutlery, and use her past experience with forks and spoons to interact with it efficiently.
In most object recognition experiments, objects and categories are labeled using natural language. A fork is a fork. A spoon is a spoon. Cutlery is cutlery. This makes it easy to check for errors in identification. That won’t be the case with Mariimo. Her thought process will be a bit more opaque.
Mariimo’s object recognition library will be labeled with numbers, rather than words. I want her to be able to label and classify objects without assistance. By assigning random number codes, she bypasses the need for human labeling, giving her a lot more autonomy.
I’m very interested in this type of unsupervised learning. Letting Mariimo come to her own conclusions is an exciting prospect. This approach means that the relationships she perceives between objects will not always be clear to an outside observer. But in the end, these identifications are for her own personal use, not mine.
Journal Entry #87
Mariimo’s facial expression recognition algorithm will be based on the universality hypothesis. It’s said that there are just six universal human expressions: anger, fear, sadness, disgust, joy, and surprise. By mixing these expressions like paint on a palette, a full range of complex and subtle expressions can be created.
This approach is useful from a programming perspective. The identification of these six facial expressions makes the problem of expression recognition a lot less intimidating. It’s simply a matter of tracking facial points, and defining the extremes of each expression.
Using this technique, Mariimo will be able to measure emotional intensity. Let’s use fear as an example. To us, a neutral expression has no emotional meaning. To Mariimo, it would read as zero percent fear. As the intensity of the expression increases, it passes through phases of concern, anxiety, fear, and finally terror. To Mariimo, these would read as twenty-five, fifty, seventy-five, and one hundred percent fear.
When reading a person’s face, Mariimo will apply a percentage to all of the six universal expressions simultaneously. What might read as puzzlement to us, Mariimo might interpret as forty percent disgust, forty percent sadness. What we might read as malice, Mariimo might interpret as ninety percent anger, eighty percent joy.
Why is any of this important? Because facial expression will be an instrumental part of Mariimo’s reinforcement learning. Expressions of anger, fear, sadness, and disgust will be treated as negative stimuli, while joy will be treated as a positive stimulus. Any expression that mixes two or more of these emotions will be subject to an algorithm that will decide its positive or negative status, as well as the intensity of that status.
Surprise is a special case, as it’s not an inherently positive or negative emotion. I’m going to treat surprise as a multiplier instead. If Mariimo were to startle someone, for instance, that would qualify as a strong negative stimulus. If she were to amaze, however, that would be a strong positive stimulus.
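One possible shape for that algorithm, strictly as a sketch. Letting the strongest negative expression dominate the mix is an assumption on my part, as are the function name and the `surprise_gain` parameter.

```python
NEGATIVE = ("anger", "fear", "sadness", "disgust")

def expression_reinforcement(reading, surprise_gain=1.0):
    """Collapse a six-expression reading into one signed reinforcement value.

    reading: dict mapping expression name to intensity, 0.0 to 1.0
    """
    joy = reading.get("joy", 0.0)
    # Assumption: the strongest negative expression dominates the mix.
    worst = max(reading.get(name, 0.0) for name in NEGATIVE)
    valence = joy - worst
    # Surprise has no sign of its own; it only amplifies what is already there.
    multiplier = 1.0 + surprise_gain * reading.get("surprise", 0.0)
    return valence * multiplier
```

Under these assumptions, ninety percent anger mixed with eighty percent joy comes out mildly negative, and pure surprise with nothing behind it comes out neutral.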
This type of reinforcement learning is particularly fascinating due to its subconscious nature. It’s a human-driven learning experience, but it doesn’t require that human to take on an active teaching role. Their natural, instinctive response is all it takes to modify Mariimo’s behavior.
Journal Entry #88
While I’m on the topic of facial expressions, I should probably talk about my plans for Mariimo’s faceplate display, programming-wise.
The segmented display pattern I created, while very expressive, allows for a lot of facial expressions with no emotional meaning at all. Switching segments on and off randomly would result in confusing and meaningless patterns the vast majority of the time.
This is a problem, as Mariimo is going to have to learn what expressions are appropriate for each situation by trial and error. For this reason, I’m going to create a comprehensive list of approved facial expression display patterns. By restricting the number of choices Mariimo can make, the rate at which she becomes a fluent communicator increases dramatically.
Journal Entry #89
Hearing is the next sensory system on my programming checklist. Mariimo’s audio-based reinforcement is going to work a little differently than her other sensory systems. Unlike vision, there will be no predesignated positive or negative audio stimuli. Instead, Mariimo’s hearing will be based on directionality and association.
Upon hearing an unfamiliar sound, Mariimo will automatically turn to face the source of that sound. This will be a reflex behavior, facilitated by her asymmetrical microphone array. If the source of the sound triggers negative reinforcement in a non-hearing sensory network, any future instance of that sound will also be regarded as a negative stimulus. Repeated associations will only strengthen the intensity of a given audio stimulus.
The opposite also holds true. If the source of an unfamiliar sound triggers positive reinforcement upon its identification, the associated sound will also be regarded as positive. In this way, Mariimo will learn to seek out positive audio stimuli, and retreat from negative ones, even before their associated sources enter her field of view.
It’s all very Pavlovian.
Journal Entry #90
Mariimo’s audio recognition will work much like her object recognition. However, it’s going to be a lot easier to implement. There’s no need to worry about colors, or edges, or angles, or tracking. Everything about a sound can be boiled down to a waveform. It’s simply a matter of finding and identifying patterns within those waveforms.
Journal Entry #91
Touch. Touch is an interesting one. I want to encourage Mariimo to explore her world through touch. Therefore, a soft, gentle touch will be treated as a positive stimulus. A pleasurable touch.
Some sensors will be designated as more sensitive than others. Generally, the denser the sensor resolution in a given area, the greater the tactile reward. I modeled Mariimo’s sensor density after the sensitivity of human skin, so the end result should be a roughly anthropomorphic touch sensitivity map. The only notable exceptions will be her face, and the soles of her feet, which due to technical limitations possess no pressure sensitivity at all.
For Mariimo, a pleasurable touch will be defined by the compressive strength of her memory foam padding. A slowly rising pressure readout means that the memory foam is currently providing a cushioning effect. A sudden spike in pressure means that the foam has reached its compression limit, and is beginning to exert force on her internal components.
This is a problem, as any additional force risks damaging Mariimo’s hardware. Any pressure beyond this threshold will be treated as a negative stimulus. A painful touch.
By applying this threshold, I can ensure that Mariimo will never exert any force beyond that of a gentle pillow fight. This feather-soft touch will not only keep her safe, it will help ensure the safety of those around her.
Journal Entry #92
Temperature should be fairly straightforward. The higher Mariimo’s core temperature, the slower her compressor will run, and the weaker her sensory reinforcement will be.
Reduced sensory reinforcement means less exertion, which will allow her to catch her breath while her compressor is cooling. A sort of programmed exhaustion.
Of course, like anything in programming, a simple concept rarely means a simple execution.
Journal Entry #93
Proprioception and balance are my next programming tasks. These sensory systems are so tightly linked that I’m going to have them share a network.
Mariimo’s proprioception will be based on a concept known in artificial intelligence circles as self-modeling. The idea is that a machine without any preprogrammed knowledge of its own anatomy can, through trial and error, form and test hypotheses about the structure of its own body.
Through gentle experimentation, Mariimo will be able to form and refine a virtual model of herself, taking into account joint location, joint rotation limit, and weight distribution. By running this model through repeated physics simulations, Mariimo will be able to plan her movements in the safety of virtual space before executing them physically.
This will be made possible through the implementation of a genetic algorithm. A genetic algorithm works on the principles of natural selection. First, a goal is chosen. This goal could be an attempt at mimicry, the tracking of a positive visual stimulus, anything. Then, Mariimo will run a series of physics simulations with that goal in mind. The simulations that come closest to achieving that goal are bred, and the resulting offspring are tested in the same way. By running the simulation through multiple generations, the planned movement can be optimized to a point where it consistently results in success. The movement can then be executed physically.
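The loop described above can be sketched in a few lines. This is a toy version under stated assumptions: a candidate movement is just a list of numbers, the `fitness` function stands in for a full physics simulation, and the population size, selection fraction, and mutation scale are arbitrary.

```python
import random

def evolve(fitness, genome_length, generations=60, population_size=40,
           mutation_scale=0.1, seed=0):
    """Breed candidate movements toward a goal; returns the best one found.

    fitness: function scoring a candidate (higher is better); it stands in
             for one run of the physics simulation
    """
    rng = random.Random(seed)
    population = [[rng.uniform(-1.0, 1.0) for _ in range(genome_length)]
                  for _ in range(population_size)]
    for _ in range(generations):
        # Keep the candidates that came closest to the goal.
        population.sort(key=fitness, reverse=True)
        parents = population[:population_size // 4]
        children = []
        while len(parents) + len(children) < population_size:
            mother, father = rng.sample(parents, 2)
            # Crossover: each gene comes from one parent or the other.
            child = [rng.choice(pair) for pair in zip(mother, father)]
            # Mutation: a small random nudge to every gene.
            child = [g + rng.gauss(0.0, mutation_scale) for g in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)
```

For a quick check, the goal can be a target pose such as `[0.3, -0.5, 0.8]`, with fitness defined as the negative squared distance to it. Because the parents survive into the next generation, the best score never gets worse from one generation to the next.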
By comparing her physics simulation to the resulting sensory feedback, she can further refine her simulations to more closely match reality. The more refined these simulations become, the more accurate her predictions will be.
Journal Entry #94
I kind of failed to mention it earlier, but the system that keeps track of Mariimo’s charge level is very much a sensory system. Therefore, it will need its own neural network to match.
Mariimo’s power management network will run on relief-based reinforcement. A fully charged supercapacitor array will result in a neutral reinforcement state. This will allow for uninterrupted functioning during periods where charge level is a low priority.
As Mariimo’s supercapacitors are depleted, the level of negative reinforcement will rise along an exponential curve. For instance, a fifty percent charge level would only result in mild negative reinforcement. If other sensory stimuli are providing a sufficient level of reinforcement, positive or negative, Mariimo will remain undistracted.
However, by the time she reaches a twenty-five percent charge level, this negative reinforcement will become a lot more difficult to ignore. As her charge level continues to drop, it would eventually become unbearable, drowning out all other stimuli. A sort of programmed hunger.
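To get a feel for the curve, here is a minimal sketch. The rescaled exponential and the `steepness` value are assumptions chosen only so the numbers land roughly where I described them.

```python
import math

def hunger(charge, steepness=5.0):
    """Negative reinforcement for a given charge level.

    charge: 1.0 for a full supercapacitor array, 0.0 for an empty one
    Returns 0.0 at full charge and -1.0 at empty.
    """
    charge = min(max(charge, 0.0), 1.0)
    depletion = 1.0 - charge
    # Exponential rise, rescaled so the curve runs from 0 to 1.
    scale = math.exp(steepness) - 1.0
    return -(math.exp(steepness * depletion) - 1.0) / scale
```

With this placeholder steepness, a fifty percent charge produces less than a tenth of the maximum signal, a twenty-five percent charge produces over a quarter of it, and the curve climbs steeply from there.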
Mariimo’s only source of relief will be her charging pad. By positioning her feet correctly on the pad, her supercapacitor array will begin to recharge. As she charges, the reinforcement level will begin reverting to neutral.
In the absence of other stimuli, Mariimo would remain on the pad until fully charged. However, since her reinforcement learning weighs various stimuli against each other, she could potentially be lured off the pad by a sufficient distraction.
Journal Entry #95
I mentioned early on in the project that I’d teach Mariimo how to forget. Today I make good on that promise.
It may seem a little counterintuitive, but it is absolutely vital that Mariimo be able to forget. Her hard drives only have so much space available. Considering the amount of sensory information she’s going to be processing, it’s only a matter of time before that space is filled completely.
On a personal computer, this is when a little pop-up window would appear asking you to delete unwanted files in order to clear up space. Attempting to do this manually on an eight terabyte hard drive full of unlabeled data would be absurd. There’s simply no way for a person to sift through that much data. Instead, Mariimo will have to decide for herself which memories are valuable, and which can be forgotten.
Think of Mariimo’s memory as a timeline. When she boots up for the first time, that timeline will immediately begin to fill with a constant stream of sensor data. Every frame of video, every audio sample, every pressure reading from every touch sensor will be recorded and timestamped.
Now, imagine a graph running alongside that timeline. This graph measures how often any given point on the timeline is referenced. Frequently referenced memories will be displayed as spikes on the graph. These memories are valuable. These are the types of memories that will allow her to function smoothly in everyday situations.
Infrequently referenced memories will be displayed as dips in the graph. This data will likely consist largely of neutral stimuli. Images of blank walls, snippets of background noise, etc.
This will continue until Mariimo’s hard drives reach full capacity. The moment this happens, a constant and calculated stream of data will begin to disappear, beginning with the least referenced.
As data is deleted from the timeline, Mariimo’s memories of those events will become fuzzier. This will free up space for the recording of new sensory data. Eventually, memories of unimportant events will disappear entirely.
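A simplified sketch of the forgetting pass follows. It assumes each memory is a record with a timestamp, a size, and a reference count, and it deletes whole records rather than gradually thinning them out. The field names are hypothetical.

```python
def forget(memories, capacity):
    """Delete the least-referenced memories until the rest fit in capacity.

    memories: list of dicts with "timestamp", "size", and "references" keys
    Returns the surviving memories in timeline order.
    """
    used = sum(m["size"] for m in memories)
    # Least referenced first; among ties, the oldest goes first (an assumption).
    order = sorted(
        range(len(memories)),
        key=lambda i: (memories[i]["references"], memories[i]["timestamp"]),
    )
    forgotten = set()
    for i in order:
        if used <= capacity:
            break
        forgotten.add(i)
        used -= memories[i]["size"]
    survivors = [m for i, m in enumerate(memories) if i not in forgotten]
    return sorted(survivors, key=lambda m: m["timestamp"])
```

Nothing is deleted while the timeline still fits, which matches the idea that forgetting only begins once the drives are full.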
Journal Entry #96
Okay. Mariimo’s neural network is complete. I hope.
Debugging a neural network isn’t like debugging other software. Other software has defined inputs, and defined outputs. If a given input results in the expected output, you move on. If not, you fix it. It’s time-consuming, but it’s not complicated.
Neural networks are not so straightforward. They’re designed to make mistakes. They’re designed to experiment. They’re designed to adapt. The concept of what constitutes a “correct” output starts to get pretty fuzzy. On top of that, they take time to develop. You can’t just switch on a neural network and immediately see how well it performs a given task. It needs to be trained. It needs experience. That kind of learning takes time and patience.
In the event that a neural network fails to perform its intended task, most programmers would simply reset the network, make adjustments, and start the training process over again. I really, really hope to avoid resorting to that with Mariimo.
Mariimo doesn’t have an “intended task,” so to speak. There are things I hope she’ll accomplish, certainly, but she’s not an assembly line robot. She’s not meant to defuse bombs or perform search and rescue. Lives aren’t on the line. As long as her functioning isn’t severely impaired, I’d rather just let her be what she is. No resets.
I’ve double, triple, and quadruple checked my code. As far as I can tell, everything is in working order. I’m just going to have to switch everything on and cross my fingers.