Introduction 00:00
-
Implementing behavioral models is a crucial aspect of understanding and predicting human behavior, and it involves using data and analytics to inform decision-making processes 00:10.
-
The field of behavioral modeling has evolved significantly over the years, with advancements in technology and data collection enabling more accurate and nuanced models of human behavior 00:42.
-
Behavioral models can be applied in various domains, including marketing, finance, and healthcare, to name a few, and they have the potential to drive business growth, improve customer engagement, and enhance overall well-being 02:06.
-
The implementation of behavioral models requires a multidisciplinary approach, involving expertise in psychology, statistics, and computer science, among other fields, to ensure that models are both theoretically sound and practically applicable 04:15.
-
Effective implementation of behavioral models also depends on the quality and availability of data, as well as the ability to integrate insights from multiple sources and stakeholders, to create a comprehensive understanding of human behavior 06:30.
-
The ultimate goal of implementing behavioral models is to create a more personalized and responsive experience for individuals, whether in a commercial or social context, by leveraging data-driven insights to inform decision-making and drive positive outcomes 08:50.
Object Behaviors in Monty 00:08
-
The presentation covers object behaviors in Monty, including how they can be implemented and related open questions, with a high-level structure that outlines the key points to be discussed 00:08.
-
The behavior capabilities to be added to Monty include learning models of object behavior, using those models to recognize object behaviors independent of morphology, and learning associations between behavior and morphology models 00:29.
-
Hierarchy is currently being considered as a way to speed up recognition and to learn compositional models, which involves assigning behaviors to different locations on morphology objects 00:47.
-
The fourth capability is to compensate for object movement to make accurate predictions in the morphology model, which may involve using the behavior model to update expectations of an object's location as it moves, although this is still an open question 01:04.
-
The idea of using model-free signals to compensate for object movement is also being considered, but it is still a gray area and one of the open questions 01:27.
Overview of the Capabilities we Want to Add 01:34
-
Implementing behavioral models in Monty would start with learning and recognizing object behaviors, which is a significant capability to add, and it is believed that once this is achieved, other aspects will automatically fall out of it, such as associating behaviors to morphology models 01:34.
-
The general structure of the implementation would involve going through each of the four main points and concretely discussing how to implement them in Monty, along with related open questions, with the fourth point being the one that is currently the least certain about the solution 02:31.
-
Learning models of object behaviors would require adding a new type of sensor module to Monty that can detect changes instead of static features and send them to a learning module, which would learn a behavior model from this new kind of input 04:04.
-
The new sensor module would detect movement or changes at locations, sending a location in space along with information like movement direction or orientation, and the learning module would remain unchanged, learning from this different kind of input 04:41.
-
The ideal setup would be for the sensor modules to be configured so that the static one only sends input to the learning module when the object is not moving, and the new sensor module only sends input when the object is moving 05:16.
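As a rough illustration of what such a change-detecting sensor module and the accompanying gating could look like, here is a minimal Python sketch. The class and field names (`ChangeDetectingSM`, `StaticFeatureSM`, `movement_direction`) and the flow threshold are hypothetical and do not correspond to Monty's actual sensor module API.

```python
import numpy as np


class ChangeDetectingSM:
    """Illustrative sensor module that only emits when the feature within
    its patch is moving (flow magnitude above a threshold)."""

    def __init__(self, flow_threshold=0.01):
        self.flow_threshold = flow_threshold

    def step(self, patch_location, sensed_flow):
        """patch_location: 3D location of the patch (body coordinates).
        sensed_flow: 3D movement vector of the feature inside the patch."""
        speed = np.linalg.norm(sensed_flow)
        if speed <= self.flow_threshold:
            return None  # stay silent while the object is static
        # Emit a location plus the direction of movement at that location,
        # analogous to the pose + feature message of a static sensor module.
        return {
            "location": np.asarray(patch_location, dtype=float),
            "movement_direction": np.asarray(sensed_flow, float) / speed,
            "movement_speed": float(speed),
        }


class StaticFeatureSM:
    """Illustrative static sensor module gated to be silent while the
    feature in its patch is moving."""

    def __init__(self, flow_threshold=0.01):
        self.flow_threshold = flow_threshold

    def step(self, patch_location, features, sensed_flow):
        if np.linalg.norm(sensed_flow) > self.flow_threshold:
            return None  # the change-detecting SM handles moving input
        return {"location": np.asarray(patch_location, dtype=float),
                "features": features}
```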
-
The implementation would also involve considering the analogy between the parvocellular and magnocellular pathways, and how in primates the cortex receives center-surround fields without edge detection or movement extraction 05:56.
-
The processing of visual information in primates, including humans, appears to occur in the cortex itself, whereas in non-primates, the information is passed up in a more pre-processed form, with this difference being noted in the context of movement detection and direction sensitivity 06:16.
-
In primates, the cells coming out of the LGN to the cortex have center-surround fields and are not directionally sensitive, which is different from the processing that occurs in non-primates, and this has led to speculation about the reasons for this difference 07:13.
-
The idea that evolution changed the mechanism of visual processing from non-primates to primates has sparked curiosity about whether this change is related to depth perception or other aspects of vision, with the suggestion that pre-processing of visual information may limit the ability to extract certain types of information 07:52.
-
The proposal of an extra layer, 4 alpha (4Cα), which directly feeds forward input to MT, has been mentioned as a possible way to address the differences in visual processing between primates and non-primates, with the idea that this extra layer could be involved in learning behavior models 08:30.
-
The concept of extra layers in the cortex being thought of as sensor module extras, rather than traditional cortex, has been suggested as a way to approach the design of artificial visual systems, with the idea that these extra layers could be used to process different types of sensory information 09:25.
-
The significance of understanding why evolution changed the mechanism of visual processing from non-primates to primates is considered important for the design of artificial systems, such as Monty, which may need to incorporate different types of sensors and processing mechanisms 10:03.
-
The discussion revolves around the implementation of behavioral models, and it is noted that the current view of layer 4 cells as pre-processing units that throw information into the sensor module may not be a fundamental limitation 10:41.
-
A feature-change sensor module is currently being used, which only sends signals when a significant change is detected, and a similar mechanism may be needed for behavior, where signals are only sent when changes are detected; the difference is that feature changes are compared across movements of the sensor, whereas behavioral changes are changes within the receptive field of the sensor 11:19.
-
The need to disentangle feature movement from sensor movement is identified as a potential challenge when writing the new sensor module, as the sensor location relative to the body may be changing, but there may be no feature movement within the patch, and a mechanism to calculate the difference between sensor movement and feature movement may be necessary 11:58.
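A minimal sketch of the kind of compensation described above, assuming the sensor's own displacement relative to the body is available (e.g. from the motor system) and that the patch and body frames are axis-aligned; the function name and conventions are illustrative only.

```python
import numpy as np


def feature_movement(observed_flow, sensor_displacement):
    """Estimate how much the feature itself moved, given the displacement
    observed within the patch and the sensor's own movement.

    observed_flow:       3D displacement of the feature measured within the
                         patch (sensor frame, assumed axis-aligned with the
                         body frame in this sketch).
    sensor_displacement: 3D displacement of the sensor relative to the body
                         over the same interval.

    When the sensor moves over a static object, the feature appears to shift
    by -sensor_displacement inside the patch, so adding the sensor
    displacement back cancels the self-induced component. Whatever remains
    is attributed to the object behaving (moving on its own).
    """
    return np.asarray(observed_flow, float) + np.asarray(sensor_displacement, float)


# Example: the sensor moved 1 cm to the right while the patch content
# appeared to shift 1 cm to the left -> the feature itself did not move.
print(feature_movement([-0.01, 0.0, 0.0], [0.01, 0.0, 0.0]))  # -> [0. 0. 0.]
```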
-
The concept of smooth pursuit is mentioned, where the eyes track an object, but it is unclear if this would be applicable in the current scenario, and it is suggested that this may not be a problem, as the sensor would not be moving in a way that would require smooth pursuit 13:11.
-
The possibility of observing an object's behavior while moving the sensor is discussed, and it is noted that during eye movements, vision is not typically processed, so this could be considered a null input, and it is suggested that this issue may become more relevant later on, such as when tracking a moving object like a car 13:32.
-
The distinction between local and global flow is mentioned, where noticing an object moving is not the same as noticing things within the object moving relative to each other, and this is related to the concept of local versus global flow, which has been discussed previously 14:48.
-
The concept of smooth pursuit is discussed in relation to local and global flow, where the car is an example of a local flow that is not moving much on the retina due to smooth pursuit, but the global flow is changing because the eyes are moving 15:08.
-
The idea of tracking a car and updating its location in a model is mentioned, with the suggestion that smooth pursuit works automatically and does not require compensation for the car's movement 15:26.
-
It is proposed that the focus should not be on a specific aspect of the model, and instead, a lot of progress can be made with model-free information, with smooth pursuit being an example of a model-free concept 16:22.
-
The need to modify the learning module to include time or state in the models is discussed, with the suggestion to add a fourth dimension to the models, allowing them to represent different states of the same object 17:00.
-
The idea of representing time and state in morphology models is explored, with the possibility of using key frames to represent instantaneous changes, rather than relying on behavior models 17:38.
-
The example of a traffic light changing color is used to illustrate the difference between behavior and morphology models, with the suggestion that such changes could be represented in a morphology model with time, rather than a behavior model 18:37.
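A minimal sketch of the key-frame idea using the traffic light example just mentioned: the same morphology model stores observations indexed by a discrete state in addition to the 3D location, so a feature at a location can differ between key frames. The class and method names are hypothetical and not part of Monty.

```python
from collections import defaultdict

import numpy as np


class FourDObjectModel:
    """Illustrative object model whose points live in (x, y, z, state):
    one 3D reference frame plus a discrete state / key-frame index."""

    def __init__(self):
        # state index -> list of (location, feature) observations
        self._keyframes = defaultdict(list)

    def add_observation(self, location, feature, state):
        self._keyframes[state].append((np.asarray(location, float), feature))

    def query(self, location, state):
        """Return the stored feature closest to `location` in key frame `state`."""
        points = self._keyframes.get(state, [])
        if not points:
            return None
        dists = [np.linalg.norm(loc - np.asarray(location, float))
                 for loc, _ in points]
        return points[int(np.argmin(dists))][1]


# A traffic light lamp modeled as one morphology with three key frames.
model = FourDObjectModel()
lamp_location = [0.0, 0.2, 0.0]
model.add_observation(lamp_location, {"color": "red"}, state=0)
model.add_observation(lamp_location, {"color": "green"}, state=1)
model.add_observation(lamp_location, {"color": "yellow"}, state=2)
print(model.query(lamp_location, state=1))  # -> {'color': 'green'}
```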
-
The discussion concludes with the idea that all models could be considered behavioral in some sense, and that the inclusion of time in morphology models could be a way to represent changes and behaviors 19:15.
-
The concept of time is considered to be available to every learning module, and if something has a time element to it, it could learn it, with the time signal going everywhere, making it potentially useful for all modules 19:35.
-
There are still open questions, but considering time as a separate entity that every learning module can use, and adding other context signals, is a good approach to architecting models like Monty, with time being one of many context signals 20:29.
-
The idea of context signals is broad and can include various things, such as time, specific states, or action conditions, which can transition from one state to another, and can be thought of as a kind of context signal that is not necessarily time-dependent 20:46.
-
In terms of neuroscience, everything that projects to layer one can be thought of as context, including time, and other context signals can be added as well, with the recommendation being to consider time as a separate entity 21:03.
-
Different changes, such as the traffic lights changing colors or the morphology of an object changing, may be represented in different layers, with color changes potentially being represented in layer 4, and location changes being represented in layer 6 21:39.
-
Layer 6 is thought to represent movement in the object's reference frame, whereas changes at locations are considered inputs to layer 4, which detects flow in a specific direction from directionally sensitive cells 22:18.
-
The change in color is still a topic of discussion, and it is unclear if it would be represented in layer 4, but it is decided to focus on implementing the model and letting the details come out later 22:55.
-
The discussions about traffic lights and color changes are put aside for now, with the focus being on implementing the model and exploring the concepts of time and context signals further 23:12.
Recognize Object Behaviors 26:13
-
To implement behavioral models, the hypothesis space needs to extend into the fourth dimension, allowing for recognition of different locations in a 3D reference frame and in the temporal dimension of a sequence, with the ability to interpolate in temporal space 26:13.
-
The output from each learning module will require an additional entry, which is the state or time point in the sequence, including the location, orientation, state or point in the sequence, and the ID of the recognized behavior, such as a pinch behavior 27:07.
-
The behavior ID is independent of the object ID, but including it in the message sent out by the learning module can be helpful for making associations between morphology and different points in the sequence, as well as for voting 27:44.
-
Adding this information to the message protocol can serve as insurance against breaking systems in the future and can be useful for internal evaluations of correctness, even if it is not immediately necessary 28:20.
-
To implement this, a relatively minor change can be made by adding another field to the state class, which may need to be renamed to avoid overloading the term "state", and updating the voting algorithm to take into account the state or point in the sequence 28:59.
-
The voting algorithm may need to be updated to vote on locations in four-dimensional space, or to vote on ID, state, and orientation separately, which can be useful for fast inference on moving objects and behaviors 29:20.
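To make the proposed change concrete, here is a hedged sketch of what the extended output message and a state-aware vote comparison might look like. The `LMOutput` dataclass and `votes_agree` function are illustrative stand-ins, not Monty's actual State class or voting algorithm, and the tolerances are arbitrary.

```python
from dataclasses import dataclass
from typing import Optional

import numpy as np


@dataclass
class LMOutput:
    """Illustrative learning-module output. Monty's messaging protocol uses
    a State class; this only shows the proposed extra fields."""
    location: np.ndarray            # 3D location in a common reference frame
    orientation: np.ndarray         # e.g. a quaternion
    object_id: str                  # recognized morphology ID
    sequence_state: Optional[int]   # NEW: point in the behavior sequence
    behavior_id: Optional[str]      # NEW: recognized behavior, e.g. "pinch"


def votes_agree(a: LMOutput, b: LMOutput, loc_tol=0.01, state_tol=1):
    """Two modules support the same hypothesis if their locations match in a
    common reference frame AND they are at (nearly) the same point in the
    behavior sequence."""
    same_object = a.object_id == b.object_id
    same_place = np.linalg.norm(a.location - b.location) < loc_tol
    if a.sequence_state is None or b.sequence_state is None:
        same_state = True
    else:
        same_state = abs(a.sequence_state - b.sequence_state) <= state_tol
    return same_object and same_place and same_state
```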
-
Having a global clock may not be necessary if the learning module can vote on the state or point in the sequence, but the global clock is not object-specific and only provides information on the time between events in a sequence, rather than the global time or the position in the sequence 30:02.
-
The evidence suggests that the clock is a countdown between events and does not represent a global time or the position in a sequence, making it necessary to vote on the state or point in the sequence for accurate recognition of behaviors 30:41.
-
The concept of a global clock is discussed, where it is questioned whether it provides information about the intervals between each beat or if it also indicates the position within a sequence, such as being on the first of four beats, with the conclusion being that it likely only provides information about the intervals between events 31:00.
-
The idea of breaking down sequences into events, such as beats in a melody or notes in music, is explored, where the time between these events is learned, and this process is thought to be limited to intervals of around a second 31:57.
-
The importance of voting on the state of a sequence is highlighted, as not doing so would result in a "bag of features" where two behaviors with the same events but in different orders would be indistinguishable, drawing an analogy to the use of locations for morphology 32:50.
-
The difference between morphology and time is noted, with morphology allowing for multiple sensors to provide information without moving, whereas time requires movement through a sequence to recognize a behavior 33:27.
-
The concept of time is suggested to be treated as a non-model specific signal, potentially fitting into the idea of an interval timer that resets on events, with the example of music providing a clear illustration of this concept, but the equivalent events in other domains, such as watching someone walk, being less clear 34:05.
-
The challenge of understanding how time plays into recognizing behaviors in different domains, such as walking, is acknowledged, with the equivalent of the "attack on a note" in music being uncertain 34:42.
-
The discussion revolves around the concept of state voting in learning modules, where each module learns reference frames for an object independently, but there is a common body reference frame that allows translation between them 35:04.
-
When learning modules learn states in different reference frames, it becomes challenging to translate between them without a common reference frame, making voting between modules difficult 35:23.
-
The issue of voting on morphology requires a common space for the modules to communicate, but it's suggested that voting on states might not require a common time, instead relying on associative connections between states 36:01.
-
The state is not specific to behavior, but rather a global rotation of the reference frame, and building associative connections between states specific to behaviors would require a large number of connections 36:20.
-
The concept of state is compared to sequence memory in neuroscience, where a particular pattern of feature movements is represented, and individual cells represent a specific set of movements at a given time, forming high-order sequences 37:18.
-
It's proposed that the state could be a sparse distributed representation (SDR) unique to a particular behavior, making it challenging to vote on, which aligns with the fact that learning behaviors is more difficult for humans than learning morphologies 38:14.
-
The idea that learning behaviors is harder for humans than learning morphologies is supported by the fact that it takes more time and study to learn behaviors, and the complexity of voting on states unique to each behavior 39:07.
-
The learning of behaviors is considered a more effortful process compared to learning morphologies, and it seems that not as many behaviors are learned as morphologies are recognized, which is consistent with the idea that learning behaviors is a more challenging task 39:27.
-
The concept of a sub-ID is introduced, where different states of an object, such as an open and closed stapler, can be considered as the same object but with different states, and it wouldn't make sense to share this sub-ID between other objects 40:01.
-
The idea of voting on a location in a sequence rather than the full state is discussed, and it's suggested that this could be too general, as every behavior will have a start of a sequence, and voting on this information may not be meaningful 40:34.
-
The information about the behavior ID is available, and it's similar to how global rotation of an object is voted on without being specific to the object itself, and it's possible to extract information that can be voted on without respect to the object or behavior 41:08.
-
The complexity of voting between learning modules is introduced, and it's noted that while a single learning module is fine, the issue of voting between modules is still unresolved, and updating the voting algorithm to take account of state is necessary 42:08.
-
The idea of voting on the state specific to a behavior is discussed, and it's suggested that voting on the behavior ID may be redundant if the unique location in a sequence is known, as this information can be used to determine the object ID 43:03.
-
The concept of temporal pooling is mentioned, where it's possible to go from a unique location to an object ID, but it's still necessary to know where you are in the sequence and on the object 43:24.
-
The discussion involves voting on unique locations on each object, rather than explicitly voting on object IDs, as the locations are unique to each object, which eliminates the need for an additional vote on the object ID 43:41.
-
The concept of voting is abstract and not specific to the object itself, even though the location is referenced in a specific frame for a particular object 44:20.
-
There is a disagreement on whether voting should be specific to the object or behavior ID, with one person feeling that all voting has to be specific to the object or behavior ID 44:36.
-
A point is made to write down the thought about voting so that it can be discussed later, as the current focus is on not getting stuck on the topic of voting this week 44:53.
Environment to Test Learning and Recognizing 45:12
-
A test bed environment is needed to learn and recognize object behaviors, with requirements including objects moving repeatedly in the same way, different morphologies with the same behavior, and the same behavior found in varying orientations and speeds, with the possibility of behaviors involving changes in features instead of movement 45:12.
-
The environment should allow for objects to stop moving at some points in the sequence, and the learning module may need to be supervised by providing the object ID, pose, and state during learning to tell it where it is in the sequence 46:25.
-
The learning module could start with one module, but it would probably require supervision, and the evaluation would be done in the same way as currently, including state as one of the outputs 46:43.
-
Testing the environment in a 2D setting could be a good starting point, as it would make it easier to visualize the third temporal dimension, but there is a concern that this might miss some 3D issues, such as recognizing behaviors as three-dimensional movement vectors 47:02.
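A minimal sketch of what such a 2D test bed could look like: a behavior defined as key frames of per-location movements, played back at different orientations and speeds, with supervised labels (object ID, behavior ID, state) available during learning. All names and the `PINCH_BEHAVIOR` data are made up for illustration and do not describe an existing Monty environment.

```python
import numpy as np

# Each behavior is a list of key frames; each key frame maps a point on the
# object (in the object's 2D reference frame) to its displacement that step.
PINCH_BEHAVIOR = [
    {"tip_left": np.array([+0.01, 0.0]), "tip_right": np.array([-0.01, 0.0])},
    {"tip_left": np.array([+0.01, 0.0]), "tip_right": np.array([-0.01, 0.0])},
    {"tip_left": np.array([0.0, 0.0]),   "tip_right": np.array([0.0, 0.0])},  # pause
]


class BehaviorEpisode:
    """Plays one behavior at a given 2D orientation and speed and yields
    supervised (observation, label) pairs for learning."""

    def __init__(self, object_id, behavior_id, keyframes, angle_deg=0.0, speed=1.0):
        self.object_id, self.behavior_id = object_id, behavior_id
        self.keyframes, self.speed = keyframes, speed
        a = np.deg2rad(angle_deg)
        self.rot = np.array([[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]])

    def __iter__(self):
        for state, frame in enumerate(self.keyframes):
            for point_name, motion in frame.items():
                observation = {"point": point_name,
                               "movement": self.rot @ (motion * self.speed)}
                label = {"object_id": self.object_id,
                         "behavior_id": self.behavior_id,
                         "state": state}
                yield observation, label


for obs, label in BehaviorEpisode("stapler", "pinch", PINCH_BEHAVIOR, angle_deg=90):
    print(obs, label)
```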
-
Recognizing behaviors in 3D space is a complex task, similar to recognizing objects in 3D space at different distances and orientations, and it may require the sensor module to take depth information into account so that the detected flow can be expressed as movement in three-dimensional space 48:15.
-
The same methods used for 3D rotations of morphology objects could potentially work for 3D rotations of behavioral objects, allowing for testing in a 2D environment, but it is important to look into the literature on 3D flow estimation from 2D camera images to ensure accuracy 49:14.
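For reference, a back-of-envelope sketch of lifting a 2D flow vector to a 3D movement vector with a pinhole camera model, assuming per-pixel depth (and its change) is available from a depth sensor; this is a simplification, not a vetted method from the 3D flow estimation literature mentioned above.

```python
import numpy as np


def flow_2d_to_3d(u, v, du, dv, depth, depth_change, fx, fy, cx, cy):
    """Lift a 2D optical-flow vector to an approximate 3D movement vector.

    (u, v): pixel position, (du, dv): pixel flow over one step,
    depth: distance along the optical axis at (u, v),
    depth_change: change in that depth over the same step,
    fx, fy, cx, cy: pinhole camera intrinsics.
    """
    def backproject(px, py, z):
        # Pinhole model: pixel + depth -> 3D point in camera coordinates.
        x = (px - cx) * z / fx
        y = (py - cy) * z / fy
        return np.array([x, y, z])

    p0 = backproject(u, v, depth)
    p1 = backproject(u + du, v + dv, depth + depth_change)
    return p1 - p0  # approximate 3D displacement of the surface point


print(flow_2d_to_3d(320, 240, 5, 0, depth=1.0, depth_change=0.0,
                    fx=600, fy=600, cx=320, cy=240))
```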
-
Implementing behavioral models can be challenging, but even if the system is not perfect, it can still work well with noisy input, and it may not be a fatal flaw if it's not entirely accurate, as it might just be a bit more noisy 49:33.
-
The system's ability to detect certain behaviors, such as someone walking away or running, might be more difficult to infer if most of the movement is in a 2D plane, making it harder to determine what's actually happening 50:13.
-
Using a 2D approach might be a safe option, and there's a potential dataset that could be used, which includes objects like a stapler, and this environment could be set up in Blender to animate objects and test the system 50:33.
-
The system could be designed to work in 3D space and then tested in 2D by only showing input with X and Y coordinates and keeping the Z coordinate the same, without changing the learning module 51:33.
-
The use of supervised learning is being considered as a first step, providing detailed information, but it's questioned whether this is different from unsupervised learning used in the sequence memory algorithm, and how the system would handle resetting when a sequence starts or ends 51:53.
-
The discussion touches on the challenges of unsupervised learning, including inferring something new, and the importance of relying on high-order sequences and common features, rather than unique features, to address issues like knowing when to start or stop learning 53:04.
-
The sequence memory algorithm had its own set of problems, such as knowing when to start or stop learning, and dealing with new or unfamiliar information, like hearing a new melody that sounds similar to another one but then deviates 53:23.
-
Creating a new sequence and determining when to keep trying to infer an existing sequence is a challenging issue, similar to the problems encountered with morphology models in unsupervised learning, and providing state during learning is crucial, especially when indicating the start and repetition of a sequence 53:43.
-
The main difficulty lies in figuring out when a sequence begins and ends, particularly with tiny receptive fields, and providing supervision can help avoid the need for voting, which can be complex 54:20.
-
Alternatively, the system can automatically learn to represent equivalent inputs uniquely based on their sequence, but the problem of knowing when a sequence starts and ends remains, as seen in examples like melodies or walking, where there may be no clear demarcation 55:00.
-
Using signals like the stapler stopping at the top and bottom can help mark the beginning and end of a sequence, but this approach is complicated by the sensor's location on the object, which can change as the object moves 56:14.
-
Supervised learning can help address this issue, especially during the learning phase, and interacting with an object and actively moving it through a sequence can make it easier to learn behaviors 56:48.
-
Many behaviors are learned through interaction and control, such as opening and closing objects, which can facilitate learning, but there are cases where this is not possible, like with melodies 57:09.
-
The problem of implementing behavioral models involves solving cases where the behavior is not fully observable, such as someone walking or a metronome swinging back and forth, and this problem needs to be addressed as a voting problem during learning 57:28.
-
There are two main problems to be solved: the unsupervised learning of the start and end of a sequence, and learning when the behavior happens relatively fast, which requires voting on the state 58:05.
-
A learning module is envisioned to have a complete model of the behavior, knowing what movement should occur at every location at any point in time, but it can only observe one location at a time, making learning through limited observation a significant challenge 58:24.
-
The real problem lies in the learning process, where multiple learning modules need to learn the same model despite not being exposed to the same experiences, and this issue needs to be addressed in the Monty system 59:03.
-
A proposal has been made for a first environment to get Monty to learn and recognize object behaviors, with the understanding that further steps will be taken to build upon this initial approach 59:21.
-
It is crucial to ensure that the chosen path does not lead to regret and that the reliance on state being provided during training is eventually resolved, as it is unrealistic to expect the state to always be supplied 59:39.
-
The open question of how to solve the learning problem in Monty will be kept on the table for discussion in future research meetings to find a solution 01:00:13.
Open Questions - Learning and Recognizing 01:00:28
-
The concept of interpolating in the temporal dimension is discussed, where performing a nearest neighbor search in 4D space with XYZT coordinates could be a nice property if achievable, allowing for interpolation in time as well 01:00:28.
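A small sketch of what a nearest-neighbor search over (x, y, z, t) could look like, here using SciPy's `cKDTree`; the `time_scale` factor that makes the temporal axis commensurate with the spatial ones is an assumption, and returning the two nearest neighbors lets the caller interpolate between the slice just before and just after the queried time.

```python
import numpy as np
from scipy.spatial import cKDTree

# Stored model points: (x, y, z) plus a time / state coordinate.
rng = np.random.default_rng(0)
locations = rng.random((200, 3))
times = rng.random((200, 1))        # normalized position in the sequence
time_scale = 0.5                    # assumption: weighting of time vs. space

points_4d = np.hstack([locations, time_scale * times])
tree = cKDTree(points_4d)


def query(location, t, k=2):
    """Return the k nearest stored points in 4D; with k=2 the caller can
    interpolate between the neighboring temporal slices."""
    q = np.hstack([location, [time_scale * t]])
    dists, idx = tree.query(q, k=k)
    return dists, idx


print(query(np.array([0.5, 0.5, 0.5]), t=0.3))
```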
-
The idea of representing state in a sequence as part of the location in a four-dimensional model is considered, which seems easy and elegant to do in Monty, but difficult to imagine how it would work in the brain with locations in layer 6A and time in layer 1 01:01:32.
-
Time is not considered a regular dimension, but rather a timer between events, and the sequence of events is dictated by the association between different Sparse Distributed Representations (SDRs) in layer 4, with time adding a duration between states 01:02:12.
-
The temporal memory algorithm involves a series of SDRs representing points in a sequence, flowing and associated with each other, with the order of events dictated by the association between SDRs, not by time 01:02:30.
-
The possibility of retrieving changes at a previous or next time step if a change is not observed at a particular location is considered, and it is thought that if the changes are sufficiently similar, it could provide an expectation of what to expect at that time step 01:03:27.
-
It is suggested that predicting a state multiple time steps ahead is not possible, but rather, the system might represent one or two steps ahead and behind, with the activation of cells in the cortex going through a sequence that could represent a little time ahead and behind 01:04:20.
-
The idea of interpolating when missing points by considering the step before and after is discussed, rather than trying to jump ahead multiple steps 01:04:39.
-
The discussion revolves around the concept of phase precession and grid cells, which represent where an individual is going to be, where they are, and where they were, all within one cycle of a background frequency, and this process is rapid 01:04:59.
-
The idea of referring to slices in behavior as states is considered more appropriate, as time is seen as an additional factor that can vary, and these states are independent of time, similar to notes in a melody that cannot be imagined in reverse 01:05:38.
-
The sequence memory algorithm works by sequencing states without considering time, and the timing signal is the transition between individual elements, allowing for the inference of sequences without time 01:06:16.
-
To infer through time, it is suggested that only a little can be done, such as going slightly forward, and a lookup in the state space can be performed to interpolate between slices 01:06:53.
-
The learning module's ability to know when a sequence repeats and adjust its speed is still an open question, but a proposal involves the learning module noticing temporal offsets between states and communicating this to the global clock 01:07:32.
-
The proposal for adjusting sequence speed involves the learning module signaling whether an event occurred sooner or later than expected, allowing for adjustments to be made to the global clock 01:08:12.
-
The concept of tempo is compared to scale in morphology, where scale was a signal stored in a column and used to modulate inputs, and a similar approach may be applied to tempo 01:09:07.
-
The global clock and learning module work together in a reciprocal manner, with the global clock defining a tempo and the learning module making behavioral predictions, and the predictions' timing feeding back to the global clock to adjust its tempo 01:09:23.
-
The learning module can infer whether events are happening quicker or slower than expected based on multiple inputs, and it sends feedback to the global clock to speed up or slow down, but it does not infer the tempo itself 01:09:44.
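The following sketch illustrates that reciprocal loop under simple assumptions: the clock stores a tempo multiplier over learned inter-event intervals, the learning module reports only whether an event arrived sooner or later than expected, and the clock nudges its tempo accordingly. Class names and the proportional-feedback rule are illustrative, not the proposed mechanism verbatim.

```python
class GlobalClock:
    """Illustrative interval timer shared by learning modules. It counts
    down to the next expected event; modules only report whether the event
    came sooner or later than predicted."""

    def __init__(self, tempo=1.0, gain=0.1):
        self.tempo = tempo   # multiplier on learned inter-event intervals
        self.gain = gain     # how strongly feedback adjusts the tempo

    def expected_interval(self, learned_interval):
        return learned_interval * self.tempo

    def feedback(self, observed_interval, learned_interval):
        """Called by a learning module when a predicted event actually arrives."""
        expected = self.expected_interval(learned_interval)
        # Earlier than expected -> shorten future intervals (speed up);
        # later than expected -> lengthen them (slow down).
        error = (observed_interval - expected) / expected
        self.tempo *= (1.0 + self.gain * error)


clock = GlobalClock()
for _ in range(5):
    # The observed behavior is consistently 20% faster than learned.
    clock.feedback(observed_interval=0.8, learned_interval=1.0)
print(round(clock.tempo, 3))  # drifts toward 0.8, i.e. the clock speeds up
```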
-
The concept of time in this context is different from the traditional understanding of time as a dimension, and it is more similar to an event timer that measures the time between events, such as in melodies 01:10:17.
-
The brain's representation of time is unique and may be worth emulating, and it is similar to but distinct from the representation of scale, with both potentially having model-free signals that provide information about tempo or scale 01:10:34.
-
Model-free signals could be used to quickly estimate tempo or scale, and voting based on relative positions could be used to determine scale, potentially tying together multiple concepts 01:11:16.
-
The idea of voting based on relative positions relates to scale, as the distance between two learning modules detecting features can provide information about the scale of an object 01:12:11.
-
Besides time, other state change signals, such as actions, could be used to transition between different states or slices, and while this may not be necessary for the first implementation, it is an interesting question to consider for the general design 01:12:47.
-
The answer to whether other state change signals besides time are necessary is likely yes, as many real-world events are triggered by specific actions, and it may be possible to move forward with the design without fully addressing this question 01:13:25.
-
The discussion revolves around the implementation of behavioral models, with a focus on whether actions are necessary for the first prototype, and it is agreed that actions will be needed at some point, but the exact implementation can be discussed in future research meetings 01:13:45.
-
The fourth dimension in morphology models is questioned, with uncertainty about whether it corresponds to time or discrete states associated with times in behavior models, and it is suggested that the labels should be states that can get time as input 01:14:46.
-
The possibility of morphology models learning temporal inputs and transitions between states is explored, with the idea that if a morphology model transitions between states, it can learn those transitions as sequences, even if there is no movement involved 01:15:44.
-
The neural mechanisms of learning models and morphology models are compared, with the assumption that if the morphology goes through a sequence of predictable transitions, those transitions would be learned, and it is noted that sometimes morphology changes happen without detectable flow or movement 01:16:40.
-
The difference between learning modules that focus on changes and static features is discussed, with the understanding that both can learn sequences, but the morphology model learns sequences of key frames without learning the flow or motion between them 01:17:40.
-
The concept of key frames is introduced, where the morphology model learns the transition between key frames without learning the motion, allowing it to learn sequences of states without necessarily learning the actions or movements involved 01:17:59.
-
The discussion revolves around implementing behavioral models, with a focus on learning modules that can learn transitions, temporal sequences, and non-temporal sequences, whether it's morphology or change input, and the idea is to use the same learning module for all of it 01:18:18.
-
The learning module's policy and setup are crucial, and it's essential to determine when the sensor should be moving to learn a behavior or remain static, as smooth pursuit of the behavior is not desired 01:19:14.
-
A critical issue is how a learning module can observe and learn all the necessary information, and it seems impossible for a single learning module to learn behaviors, but the answer may lie in columns training each other and transferring knowledge to other learning modules 01:20:13.
-
The idea of supervision by columns that are observing is proposed as a possible solution, and it's suggested not to focus too much on this issue yet, but rather supervise and figure out the learning policy, which may be a group's learning policy rather than a single column's 01:21:03.
-
For inference, the current policies should still work, and the conclusion is that smooth pursuit will not be used, with the sensor movements being processed as feature movements 01:21:41.
-
The discussion also touches on learning association, but it's decided to postpone further discussion and potentially do a part two, as well as wait with voting, which is still being worked on 01:22:19.
Learning Associations Between Behavior and Morphology Models 01:24:08
-
The idea is to assign a behavior to a location and orientation on an object by taking the behavior ID as input to layer 4 of the morphology model, which would assign the behavior on a location-by-location basis to the morphology, and also consider the orientation of the behavior relative to the parent object 01:24:45.
-
The morphology can bias which behavior is expected at a particular location through feedback connections, and this approach is considered elegant, but there is a tricky part regarding the relationship between columns in different learning modules, such as V2 and V1, which need to be collocated in the sensory array 01:25:23.
-
The separation of behavior learning modules, such as MT, from other modules, like V2, requires a one-to-one relationship between columns representing the same location in retinal space, which may be easy to implement in Monty, but the evidence for this relationship in the MT mapping is unclear 01:26:05.
-
The proposal to separate behavior learning modules is still considered the right approach, but it requires careful consideration of the retinotopic mapping between modules like MT and V2, and Monty can accommodate this requirement as long as it is agreed upon 01:26:45.
-
There is a need to investigate whether associating state in the behavior sequence is necessary, or if simply associating the existence of the behavior at a particular location is sufficient, and Monty can easily implement the latter, but additional mechanisms may be needed to communicate state in the sequence 01:27:41.
-
The state of the sequence could be communicated through the CMP message in Monty, which would provide information about where in the behavior sequence the system should be, based on the current morphological state 01:28:03.
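A minimal sketch of the simpler association described above: storing which behaviors (with their relative orientations) exist at which locations on a morphology, with lookups in both directions so a static morphology can bias expected behaviors and an observed behavior can bias candidate morphologies. It also allows several behaviors at one location, as in the leg example discussed later. Class and method names are hypothetical, not Monty's API.

```python
from collections import defaultdict

import numpy as np


class BehaviorAssociations:
    """Illustrative store of 'behavior X happens at location L, orientation R
    on morphology M', queryable in both directions."""

    def __init__(self, resolution=0.01):
        self.resolution = resolution
        self._at_location = defaultdict(list)  # (object_id, loc key) -> [(behavior_id, orientation)]
        self._by_behavior = defaultdict(set)   # behavior_id -> {object_id}

    def _key(self, location):
        return tuple(np.round(np.asarray(location) / self.resolution).astype(int))

    def associate(self, object_id, location, behavior_id, orientation):
        self._at_location[(object_id, self._key(location))].append(
            (behavior_id, orientation))
        self._by_behavior[behavior_id].add(object_id)

    def expected_behaviors(self, object_id, location):
        """Seeing a static object part -> which behaviors to expect there."""
        return self._at_location.get((object_id, self._key(location)), [])

    def candidate_objects(self, behavior_id):
        """Seeing a behavior -> which morphologies it biases toward."""
        return self._by_behavior.get(behavior_id, set())


assoc = BehaviorAssociations()
assoc.associate("stapler", [0.0, 0.05, 0.0], "open_close", orientation=np.eye(3))
assoc.associate("leg", [0.0, 0.3, 0.0], "jumping", orientation=np.eye(3))
assoc.associate("leg", [0.0, 0.3, 0.0], "flexing", orientation=np.eye(3))
print(assoc.expected_behaviors("leg", [0.0, 0.3, 0.0]))
print(assoc.candidate_objects("open_close"))
```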
-
The discussion revolves around implementing behavioral models, where the relationship between morphology models and behavioral models is explored, and it is suggested that the morphology models could be the child of the behavioral models, with the behavioral models being the parent, allowing for input when the child object is changing, such as rotating or moving in space 01:28:26.
-
The idea of adding associative voting on ID is mentioned, which could help in recognizing objects and their behaviors, even when the receptive fields are not always collocated, and this concept is compared to seeing a dog and associating it with the barking behavior without hallucinating the sound 01:30:34.
-
To test the capability of the behavioral models, it is proposed to assign different behaviors to locations on one object, allowing for the testing of seeing a static object and inferring the behavior, as well as seeing a behavior and biasing which morphology object to recognize, with the state in the behavioral sequence activating a state in the morphology model and vice versa 01:31:52.
-
The importance of feedback connections is highlighted, where seeing a static object can infer behavior, and the state in the behavioral sequence can activate a state in the morphology model, with the example of seeing a stapler and predicting its movement, and the possibility of adding associative voting to address issues with non-collocated receptive fields 01:32:29.
-
The concept of receptive fields and their potential non-collocation is mentioned as a challenge, where the system may need to add associative voting to improve object recognition and behavior association, and the idea of testing the system with objects that have different behaviors at different locations and orientations is discussed 01:32:48.
-
The need to test feed and feedback connections to ensure they work properly is emphasized, with the understanding that these connections can be interpreted in different ways 01:33:06.
-
It is suggested that multiple behaviors can be associated with a single location, such as a leg that can be jumping, flexing, or stretching its toes, making it reasonable to think that some morphology can have multiple different behaviors 01:33:27.
-
The concept of inferring behaviors implies a one-to-one relationship, which can introduce bias, but this can be adjusted to account for multiple behaviors at one location, as noted by Jeff 01:34:02.
-
If there are multiple behaviors at one location, it is still a bias issue, but if only one behavior is stored, it should predict that one behavior, as illustrated by the example of a leg with motion dots on different points participating in various behaviors 01:34:22.
-
The example of a leg with multiple motion dots participating in different behaviors, such as jumping or flexing, highlights the complexity of associating multiple behaviors with a single location 01:34:41.
Open Questions - Learning Associations 01:34:50
-
The morphology model is considered the child and the behavior model is considered the parent, with the input representing changes in location and orientation of an object recognized in the child column, such as a stapler top changing its location and orientation as the stapler is opening 01:34:50.
-
The child column represents a morphology model of a specific part of an object, like the stapler top, and recognizes changes in its location and orientation, which are then sent as input to the behavior model 01:35:10.
-
The behavior model learns the behavior of the object, and the rotation change of the stapler top is sent to the behavior model, with the orientation change being calculated relative to the parent column's reference frame 01:36:49.
-
However, location changes are more complex, as they are represented as locations on the object model and not in the environment, requiring a different approach to communicate location changes of the object in the environment to the parent column 01:37:05.
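As an illustration of the orientation part of this input, here is a sketch that expresses the rotation change of a recognized child object (e.g. the stapler top) in the parent column's reference frame, using SciPy rotations; the open question about communicating location changes in the environment is deliberately left out. The function name and conventions are assumptions for illustration only.

```python
import numpy as np
from scipy.spatial.transform import Rotation as R


def child_pose_change_for_parent(child_rot_before, child_rot_after, parent_rot):
    """Rotation of the child object between two time steps, expressed in the
    parent column's reference frame. All inputs are scipy Rotation objects
    giving each model's orientation in a common (body) frame."""
    # Rotation of the child over the step, in the body frame.
    delta_body = child_rot_after * child_rot_before.inv()
    # Re-express that rotation change in the parent's frame.
    return parent_rot.inv() * delta_body * parent_rot


before = R.from_euler("x", 0, degrees=True)    # stapler top closed
after = R.from_euler("x", 30, degrees=True)    # stapler top opened 30 degrees
parent = R.from_euler("z", 90, degrees=True)   # pose of the whole stapler
change = child_pose_change_for_parent(before, after, parent)
# The 30-degree opening, expressed about the corresponding axis of the
# (rotated) parent reference frame.
print(change.as_euler("xyz", degrees=True))
```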
-
The detection of motion seems to require raw sensor input, and the behavior model may rely on this input to detect movement, rather than using information from the child object, which can only infer the presence of an object at different orientations but not its motion 01:38:25.
-
The model may be able to gate the attention of the sensor to restrict it to a specific area, such as the movement of a particular object, but the actual detection of motion appears to need to be done using more raw sensory data 01:39:16.
-
The column on the right can detect changes in sensory input on its own, and it is intuitive that some of that change is communicated up and would be useful, especially for movement in abstract spaces that cannot come from direct sensory input 01:39:32.
-
Using information from other parts of the model could make it more robust and based on the actual model and shape of the object, but it is not necessary for a first implementation, and the goal is to get started with implementing the behavioral model 01:39:54.
-
The issue of communicating object location is an open question, but it seems useful in several applications, and it can be revisited later when discussing voting or other related topics 01:41:04.
-
To make progress, it is necessary to pick a set of problems to focus on for behavior and implement a solution, even if it does not capture all possible scenarios, similar to the approach taken with compositional object research 01:41:41.
-
It is important to prioritize and sort out the list of questions and open issues, answering the necessary ones now and leaving others for later, in order to make progress with implementing the behavioral model 01:42:36.
-
Starting to build the model is exciting, and implementing it will likely reveal new issues and insights that were not considered initially, but having an initial idea of how to implement it is a good starting point 01:43:12.
-
The idea of conceptualizing object behaviors is still in its early stages, and creating initial prototypes can help identify potential issues, whereas compositional objects are a more mature concept that still requires further work, but exploring object behaviors can provide valuable insights 01:43:29.
-
The goal is to incorporate changes and reduce the chance of breaking the system later when introducing new features, but there is no clear answer on how to achieve this, and analogies can be helpful in understanding the challenges 01:44:07.
-
The proposal for modeling object behaviors in Monty involves adding four capabilities: learning object behaviors, recognizing object behaviors independent of morphology, learning associations between behavior and morphology models, and compensating for object movement to make predictions in the morphology model 01:44:44.
-
The plan is to start with the first two capabilities for the initial implementation, and the third capability should become available once compositional objects are implemented, while the fourth capability is a big topic that will not be addressed in the first prototype 01:45:58.
-
The discussion highlighted that morphology models and behavior models share similarities, as they both involve modeling transitions between states and can be informed by a global signal, such as time, which can help understand the concepts of state and transitions 01:47:13.
-
The team has identified open questions and has decided to figure some of them out later, but has concrete answers to the relevant ones, allowing them to start working on a first prototype of object behaviors in Monty 01:46:55.
-
The concept of behavioral columns is being reevaluated, with the suggestion that behavior isn't unique to these columns and that a different name, such as "dynamic input" or "learning modules," might be more accurate, as all columns seem to be doing the same thing 01:47:55
-
The idea of a single learning module that can handle different types of input is being considered, with the possibility of adding new types of input, such as movement, which would simplify the implementation and eliminate the need for specialized learning modules for behaviors 01:48:31
-
The neurons in the learning module represent sequences without time and can move through those sequences at different speeds, which is equivalent to having a sequence of states that can be moved through by applying actions, ultimately representing different states of objects 01:49:08
-
Vivian's argument is that the incorporation of movement as a new type of modality doesn't require significant changes to the learning module, which supports the idea of a single, versatile learning module 01:48:49
-
The discussion is open to continuation, with the option for attendees to take a break or leave, as the meeting can go longer than two hours and a recording will be available for those who need to leave 01:50:13
-
Neil had a minor point to discuss regarding behavioral connectivity, which he considered recording separately, but decided to address during the meeting, which can be constrained to allow for a relatively quick discussion 01:49:56
Asymmetric Connections and Behavior Columns 01:50:44
-
The discussion revolves around the concept of asymmetry in columns, where not all columns are doing the same thing, particularly in relation to behavior and how it can be applied to compensate in a model, with an idea that originated from Vivian and Jeff 01:51:01.
-
The idea involves applying motion to move through a reference frame as a compensatory movement, enabling the prediction of low-level features, and it is thought that this connection could be related to the purple line connection in a hierarchical relation 01:51:57.
-
The concern is that this connection seems specific to movement and it is unclear what a morphology model would do with that connection, potentially leading to different types of connections being sent by behavioral and morphology columns 01:52:48.
-
The usual connectivity is shown, with a behavioral column, a morphology column, and feedback connections with local synapses in L6, predicting the specific location, and it is observed that these synapses also go into L5 01:53:24.
-
There is evidence that cells in L5 are equivalent to double bouquet cells but for movement vectors, and if they were to form synapses on those cells, it could provide a way for hierarchical connections to provide movement through the space of the reference frame 01:53:45.
-
The projection going up through the column could choose to form synapses on double bouquet cells or in the reference frame, depending on what the parent column is trying to predict in the low-level one, and this would mean that all columns send hierarchical feedback connections through the column and up to L1 01:54:24.
-
The difference in the connections would depend on the type of things being learned, and it is suggested that this could be a way for columns to send the same type of connections, regardless of their specific function 01:54:41.
-
The brain's neural connections, specifically in layers 5 and 6, form synapses to predict movements or locations, with local connectivity patterns observed in diagrams from Rockland's papers, showing widespread L1 connectivity and more local projections in L5 01:55:01.
-
The formation of synapses in L5 or L6 may be necessary for certain functions, such as goal states or action policy, and could be involved in various processes, with the possibility of half a dozen other functions for the L5 connection 01:56:00.
-
The idea is proposed that if a higher-level column is a behavior model, it will synapse in layer 5, while a morphology model in a hierarchy would synapse in layer 6, with this process being driven by learning in a non-predefined manner 01:56:57.
-
The genes specify the overall structure of the brain, including the types of cells in different layers and their destinations, but do not specify particular synapses, which are instead associatively learned through experience 01:58:14.
-
The formation of synapses as an axon rises up through the layers is thought to be an associative learning process, where the axon forms connections with other cells it can associate with, without prior knowledge of its location or specific targets 01:58:56.
-
The formation of synaptic connections between axons and dendrites is opportunistic and depends on associative connections, with excitatory cells forming pairings of synaptic connections with other nearby axons or dendrites, and some inhibitory cells behaving differently 01:59:16.
-
The growth and direction of axons and dendrites are determined by their ability to form synapses, with those that form synapses continuing to grow and those that do not retracting and growing in another direction in search of associatively paired connections 01:59:56.
-
The solution to a previously discussed problem seems reasonable, involving the formation of connections between different layers, such as layer five or layer six, depending on what is most useful and predictive 02:00:32.
-
The discussion involves understanding the biological basis of connection and movement information from higher-level columns, with the possibility of a totally different kind of top-down connection, such as one sent by L5, that has not been previously discussed 02:00:50.
-
The introduction of a new connection raises questions about asymmetry and specificity to the relationship between behavioral and morphology columns, and how this might be handled in a model like Monty 02:01:27.
-
In the context of Monty, it is suggested that the model could interpret connections in a way that is most useful and predictive, similar to how the brain connects to different layers, but this would require explicit specification of how associations are formed and learned 02:02:26.
-
The current implementation of Monty does not include synaptic learning, and would need to be modified to allow for the formation of associations between different components, with the addition of a mechanism to determine when to learn and when not to 02:02:45.
-
The general proposal involves distinguishing between feedback connections as general movement vectors and context signals that are specific to certain objects or layers, with the context signal going to layer one being a specific example 02:03:03.
-
The discussion revolves around the idea that certain cells, such as double bouquet cells, are non-specific to a particular object and may require reference frame transformations, with the issue of reference frame being a potential problem 02:03:20.
-
The concept of general movement vectors is introduced, where it is suggested that layer one may not benefit from information about general movement vectors, but rather it could be a biasing signal that influences the movement to compensate for what is seen 02:03:39.
-
The idea of biasing signals is further explored, with the example of a stapler being used to illustrate how a movement vector can be applied to different objects in various orientations, but it may not be useful for learning associations 02:04:19.
-
The L1 connectivity is mentioned as potentially biasing the ID of an object, such as a stapler, but it is noted that it would not be a comprehensive model of the object's behavior, just a single movement vector 02:04:36.
-
The conversation touches on the idea that the directions of movement vectors can be applied to different objects in various orientations, limiting the usefulness of learning associations from this information 02:04:52.
-
The topic is categorized under a previous presentation's category four, and it is decided that further discussion on this topic can be postponed, with an alternative proposal being mentioned for future consideration 02:05:12.
-
The importance of considering the formation of connections in different layers, such as layer five, is highlighted, and the potential usefulness of this concept for other applications, like goals, is noted 02:05:30.
-
The meeting is wrapped up, with a suggestion to prepare for the next week's meeting and a reminder to address the problem of learning and voting 02:06:11.
Jeff Talks About the Problem of Inter-Column Teaching 02:06:32
-
The problem of inter-column teaching arises when different learning modules do not experience everything, and this issue is more pronounced in behavioral models, with possible solutions including the use of hierarchy or local sharing between learning modules 02:06:32.
-
One potential approach to addressing this issue is to have learning modules that are not receiving input learn from other modules that are sensing something, by being fed movement vectors and modeling through an equivalent space to the learning module that is learning 02:07:28.
-
This approach would require mechanisms for sharing and training between nearby learning modules, with one module getting input and the other not, and may also necessitate the use of hierarchical methods 02:08:04.
-
The idea being explored is whether learning modules that receive no direct input can still learn all the time by getting input from other modules, and this concept may be applicable to various senses, including touch and vision 02:08:25.
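A very speculative sketch of that idea: a learning module that senses nothing itself replays the movement vectors and observed flow broadcast by a sensing neighbor, traversing an equivalent path in its own reference frame and storing the result. Nothing here reflects an existing Monty mechanism; all names are illustrative.

```python
import numpy as np


class TaughtLearningModule:
    """Illustrative learning module that receives no sensory input but
    learns a behavior model from a sensing neighbor's broadcasts."""

    def __init__(self):
        self.model = []               # list of (location, movement) samples
        self._location = np.zeros(3)  # current location in this LM's own frame

    def receive_from_neighbor(self, movement_vector, feature_movement):
        """movement_vector: how the sensing LM moved over the object;
        feature_movement: the behavior (flow) it observed at that location."""
        self._location = self._location + np.asarray(movement_vector, float)
        self.model.append((self._location.copy(),
                           np.asarray(feature_movement, float)))


# A sensing neighbor streams (movement, observed flow) pairs to the idle LM.
teacher_stream = [([0.01, 0.0, 0.0], [0.0, 0.02, 0.0]),
                  ([0.01, 0.0, 0.0], [0.0, 0.01, 0.0])]
student = TaughtLearningModule()
for move, flow in teacher_stream:
    student.receive_from_neighbor(move, flow)
print(student.model)
```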
-
In the context of vision, columns that receive non-visual input or are looking at a blank space may not get any input due to the lack of center-surround activity, highlighting the complexity of this issue 02:09:01.
-
The discussion touches on the fact that even if all columns in the retina receive axons, most of those axons are not active, which further complicates the problem of inter-column teaching 02:09:20.
-
The overall goal is to figure out how different learning modules can learn the same thing even if they are not exposed to the same input, and this is an ongoing area of research and exploration 02:09:38.