Dynamic Domain Architectures for Model-based Autonomy
Bob Laddaga, Howard Shrobe, Brian C. Williams (PI)
MIT Artificial Intelligence Lab, Space Systems Lab

Structure of the MIT Project
• Two Major Design Foci
  – Autonomous Vehicles
    • Building on Brian Williams’ Work at NASA Ames
    • 1st Generation Software flown on Deep Space 1
    • Extending the modeling framework to support hybrid systems
    • New mode-identification and mode-control algorithms
  – Perceptually Enabled Spaces
    • Building on the MIT AI Lab’s Intelligent Room
    • Emphasis on self-adaptivity, recovery from faults and attacks
    • Use of Machine Vision, Speech and NLP
    • New Emphasis on modeling, frameworks
• Common Themes
  – Self-diagnosis, recovery, “domain architecture frameworks”
  – Integration through model-driven online inference
  – Multiplicity of methods for common abstract tasks
• Extension to the MOBIES OEPs.

Model-based Integration of System Interactions Through Online Deduction
[Diagram: Model-based Programs (software component control templates, models of physical constituents) supply Goals and a Model to a Model-based Deductive Executive, which drives a Controller and Plant. Signal labels: g, s’(t), s(t), f.]
• Mode identification
  – monitoring
  – tracking goals
  – confirming commands
  – isolating faults
  – diagnosing faults
• Mode control
  – reconfiguring hardware
  – coordinating control policies
  – recovering from faults
  – avoiding failures
• Further executive tasks
  – allocate resources
  – select execution times
  – select procedures
  – order actions
• Online Propositional Deduction
  – TMS
  – generate successor
  – conflict database

Model Based Troubleshooting: GDE
[Figure: circuit of three multipliers (Times) and two adders (Plus), color coded and annotated with observed and predicted values. Panel labels: Conflicts, Diagnoses.]
• Blue or Violet Broken
• Green Broken, Red with compensating fault
• Green Broken, Yellow with masking fault

Applying Failure Models
[Figure: IN feeds component A, whose output MID feeds B (output OUT1) and C (output OUT2). Each component mode has a delay range L to H and a probability P.]

Component A (IN to MID)
  Mode      L     H    P
  Normal    3     6    .7
  Fast    -30     2    .1
  Slow      7    30    .2

Component B (MID to OUT1)
  Mode      L     H    P
  Normal    2     4    .9
  Fast    -30     1    .04
  Slow      5    30    .06

Component C (MID to OUT2)
  Mode      L     H    P
  Normal    5    10    .8
  Fast    -30     4    .03
  Slow     11    30    .07

IN: 0
MID: Predicted Low = 3, High = 6
OUT1: Observed 5; Predicted Low = 5, High = 10
OUT2: Observed 17; Predicted Low = 8, High = 16

Consistent Diagnoses
  A       B       C       MID Low  MID High  Prob    Explanation
  Normal  Normal  Slow       3        3      .04410  C is delayed
  Slow    Fast    Normal     7       12      .00640  A Slow, B Masks (runs negative!)
  Fast    Normal  Slow       1        2      .00630  A Fast, C Slower
  Normal  Fast    Slow       4        6      .00196  B not too fast, C slow
  Fast    Slow    Slow     -30        0      .00042  A Fast, B Masks, C slow
  Slow    Fast    Fast      13       30      .00024  A Slow, B Masks, C not masking fast

Modeling Reactive Fallible Systems
• Create a new modeling reactive programming language to describe the behavior of the combined hardware-software system
• Underlying semantics is that of a (partially observable) Markov Model
• Mode Identification (diagnosis) is now the problem of state estimation in an HMM

Moving to a Multi-Tiered Bayesian Framework
• The model has two levels of detail specifying computations, the underlying resources and the mapping of computations to resources
• Each resource has models of its state of compromise
• The modes of the resource models are linked to the modes of the computational models by conditional probabilities
• The Model can be viewed as a Bayesian Network
[Diagram: Component 1 is Located On Node17. Component 1 has models Normal (Delay 2 to 4), Delayed (Delay 4 to +inf) and Accelerated (Delay -inf to 2). Node17 has models Normal (Probability 90%), Parasite (Probability 9%) and Other (Probability 1%). Links between the two sets of models are labeled with conditional probabilities .2, .4 and .3.]

Summary of Autonomous Vehicle Work
• To survive decades, autonomous systems must orchestrate complex regulatory and immune systems.
• Future systems will be programmed with models, describing themselves and their environments.
• Future runtime kernels will be agile, deducing and planning from these models within the reactive loop.
• This requires a new foundation for embedded computation, replacing discrete automata with partially observable Markov decision processes.
• We propose to extend model-based programming to dynamic domain specific languages with distributed model-based executives that reason over complex, functionally redundant behaviors.

Software Frameworks for Embedded Systems
A framework reifies a model in code, API, and in the constraints and guarantees the model provides. It includes:
• A set of properties & formal ontology of the domain of concern
• An axiomatization of the core domain theory
• Analytic (e.g. proof) techniques tailored to these properties and their domain theory
• A run-time infrastructure providing a rich set of layered services
• Models describing the goal-directed structure of these software services
• A protocol specifying the rules by which other software interacts with the infrastructure provided by the framework
• A domain-specific extension language for coupling an application to the services rendered by the framework.

Frameworks Support Analysis
• The model is specified in terms of a domain specific extension language that captures the terms and concepts used by application domain experts.
• The ontology provides a language in which to state annotations of the program (e.g. goals, alternative strategies and methods for achieving goals, sub-goal structure, state-variables, declarations, assertions, and requirements).
• Annotations inform program analysis.
  – Both logical and probabilistic.
• Annotations facilitate the writing of high level generators
  – Synthesizing the wrapper code that integrates multiple frameworks.

Dynamic Domain Architecture Frameworks
• Structures the procedural knowledge of the domain into layers of services
• Services at one layer invoke services from lower layers to achieve their sub-goals.
• Each service has many implementations corresponding to the variability and parameterization of the domain.
• The choice of implementation to invoke is made at runtime, in light of runtime conditions, with the goal of maximizing expected utility.
• Exposes its models, goal structure, state-variables, its API, its protocol of use and constraints on those subsystems that interact with it.

DDA Framework
[Diagram: a Development Environment holding a Component Asset Base arranged in Layer1, Layer2 and Layer3, with Plan Structures and Super routines, and a Runtime Environment containing Rational Selection, Self Monitoring, and Diagnosis & Recovery (Diagnostic Service, Repair Plan Selector, Rollback Designer, Resource Allocator, Enactment, alerts). Rational Selection annotation: “To: Execute Foo, Method 3 Is most Attractive”. Plan structure annotations for Foo with steps A, B and C: “Post Condition 1 of Foo Because Post Cond 2 of B And Post Cond 1 of C”; “PreReq 1 of B Because Post Cond 1 of A”.]

Integration of DDA Frameworks
• Frameworks interact at runtime by observing and reasoning about one another's state and by posting goals and constraints to guide each other's behaviors.
• The posting and observation of state is facilitated by wrapper code inserted into each framework by model-based generators of the interacting frameworks.
• Use of generated observation and control points, together with novel, fast propositional reasoning techniques, allows this to happen within reactive time frames.
• The composite system behaves as if it is goal directed while avoiding the overhead normally associated with generalized reasoning.

Perceptually Enabled Spaces
• The Intelligent Room is an Integrated Environment for Multi-modal HCI. It has Eyes and Ears.
  – The room provides speech input
  – The room has deep understanding of natural language utterances
  – The room has a variety of machine vision systems that enable it to:
    • Track motion and maintain the position of people
    • Recognize gestures
    • Recognize body postures
    • Identify faces (eventually)
    • Track pointing devices (e.g. laser pointer)
    • Select Optimal Camera for Remote Viewers
    • Steer Cameras to track focus of attention
  – Perceptually enabled environments are good surrogates for sensor driven DoD applications (e.g. missile defense).
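
Sketch: Rational Selection in a DDA Framework
The Dynamic Domain Architecture slides state that each service has several implementations and that the one to invoke is chosen at runtime to maximize expected utility. The Python sketch below illustrates only that selection step. The `Method` record, its fields, the numbers, and the utility formula (success probability times benefit, minus resource cost) are assumptions made for illustration; the slides do not specify how utility is computed.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Method:
    """One implementation of an abstract service (hypothetical record)."""
    name: str
    success_probability: float  # assumed estimate under current conditions
    benefit: float              # assumed value to the requester on success
    cost: float                 # assumed price of the resources it consumes


def expected_utility(method: Method) -> float:
    # Assumed formula: expected benefit minus resource cost.
    return method.success_probability * method.benefit - method.cost


def select_method(
    methods: Iterable[Method],
    is_applicable: Callable[[Method], bool] = lambda m: True,
) -> Method:
    """Return the applicable method with the highest expected utility."""
    candidates = [m for m in methods if is_applicable(m)]
    if not candidates:
        raise LookupError("no applicable method for this service")
    return max(candidates, key=expected_utility)


# Three made-up implementations of the service "Foo".
METHODS_FOR_FOO = [
    Method("method-1", success_probability=0.9, benefit=10.0, cost=6.0),
    Method("method-2", success_probability=0.5, benefit=20.0, cost=8.0),
    Method("method-3", success_probability=0.8, benefit=15.0, cost=4.0),
]

best = select_method(METHODS_FOR_FOO)
# Runtime conditions can rule a method out, e.g. after a diagnosed failure.
fallback = select_method(METHODS_FOR_FOO, lambda m: m.name != "method-3")
print(best.name, fallback.name)
```

With these made-up numbers the third method is selected first; excluding it (as a recovery step might) makes the selector fall back to the next best implementation.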
Command Post Demo (2 years old)
[Video]

MetaGlue: A Platform for Perceptually Enabled Environments
• Naming and Discovery
  – Society, Role within Society, Required Properties
  – Societies are collections that act on behalf of a common entity (person or space)
• Central Registry
  – Discovery
  – Environment-specific info
  – Storage of Agent State
• Communication
  – Direct Method Call (using RMI)
• Robustness
  – Freezing of State, and Thawing on automatic restart of dead agents
• Dynamic Reloading of Agents during system execution
• Dynamic Collaboration Between Agents
  – Publish and Subscribe driven event interfaces

Agents Currently Provided in MetaGlue
• Control of Devices
  – X10 and similar simple sensors and effectors
  – Audio visual equipment and multiplexers
• Display and Screen Management
• Speech recognition components
  – Contextual grammars and command processing
• Visual processing
  – Laser pointer tracking
  – Face tracking for video conferencing
• Natural language processing
  – Interfaces to the START system
[Diagram: network of MetaGlue agents. Labels include Command Post, Logger, Notifier, Living Room, AgentTester, Demo Manager, Info Retrieval, START, Room Tutor, Map Display, Eliza, WWW, Preference Learner, Person Tracker, Grammar Agents, Vision Agents, Audio Manager, Display Manager, Lamp Manager, Music Selector, Room State, Laser-1, Laser-2, X10, IR, RS-232, CD Player, VCR, Mux, Tuner, TV, Blinds, Drapes, Lamp, Window, Door.]

The Need for Frameworks
• MetaGlue is a lightweight, distributed object infrastructure for perceptually enabled systems. MetaGlue provides tools for integrating and dynamically connecting components, for extensibility, and for saving and restoring the state of components.
• The current incarnation of the system deals with the key modeling challenges with ad hoc techniques.
• In MOBIES we are developing principled frameworks for these modeling challenges.
• Each framework provides languages, interfaces and guarantees for a specific set of concerns.
• These frameworks deal with semantics, context, resource management and robustness.

5 Key Modeling Challenges
• How to model people, processes and perceptions so as to enable group interactions in multiple spaces
• How to model services so as to enable reasonably optimal use of resources
• How to model processes so as to recover from failures (e.g. equipment breakdown, failed assumptions)
• How to model perceptual events so as to coordinate and fuse information from many sensors
• How to model and exploit context

Grounding in Semantics
• We want to build applications that simultaneously service multiple people, organizations, physical spaces, sensors and effectors.
• The individuals move within and among many physical spaces
• The roles and responsibilities of individuals change over time.
• The devices and resources they use change as time progresses
• The context shifts during interactions
• The relevant information base evolves over time.

Models and Knowledge Representations
• People
  – Interests, Skills, Responsibilities, Organizational Role
• Organizations
  – Members, structure, roles, processes and procedures.
• Spaces
  – Location, Subspaces
  – Devices and Resources contained in the space.
• Services: Methods, parameter bindings, resource requirements
• Agents: Capabilities, interfaces, society.
• Resources: Interfaces, capabilities, cost, reliability
• Information Nodes: Topic area, place in ontology, format
• Events
  – Changes in any of the above representations or in the system’s knowledge about any of the above.
  – E.g. person identification, motion in space.

Abstracting Away from Specific Devices
• Until recently, applications were written in terms of specific resources without context (e.g. the left projector).
• This conflicts with:
  – Portability across physical contexts
  – Changes in equipment availability across time
  – Multiple applications demanding similar resources
  – Need to take advantage of new resources
  – Need to integrate mobile devices as they migrate into a space
  – Need to link two or more spaces
• What is required is a more abstract approach, a framework for resource management, in which no application needs to be tied to a specific device.

Framework 1: Service Mapping and Resource Management
• Users request abstract services from the Service Mapper
  – “I want to get a textual message to a system wizard”
• The Service Mapper has many plans for how to render each service
  – “Locate a wizard, project on a wall near her”
  – “Locate a wizard, use a voice synthesizer and a speaker near her”
  – “Print the message and page the Wizards to go to the printer”
• Each plan requires certain resources (and other abstract services)
  – Some resources are more valuable, scarce, or heavily utilized than others
  – The Resource Manager places a “price” on each resource
• Each plan provides the service with different qualities
  – Some of these are more desired by the user (higher benefit)
• The Service Mapper picks a plan which is (nearly) optimal
  – Maximum net benefit

Service Mapping and Resource Management
[Diagram: an Abstract Service with Service Control Parameters is linked to Method1, Method2, ..., Methodn; each method is linked to resources (Resource1,1, Resource1,2, ..., Resource1,j). The User’s Utility Function and the Resource Cost Function combine into Net Benefit.]
• User Requests A Service with certain parameters
• Each Service can be Provided by Several Methods
• Each Method Binds the Settings of The Control Parameters in a Different Way
• The binding of parameters has a value to the user
• Each Method Requires Different Resources
• The Resources Used by the Method Have a cost
• The System Selects the Method Which Maximizes Net Benefit

A Service Mapping Example
Request: “I need to ask a question of a Systems Wizard”
• Plan 1: Locate A Systems Wizard, Project on a wall near her
  – Costs: Projector (in use) high, person location high
  – Benefits: Fast, Clear
• Plan 2: Locate A Systems Wizard, Voice Synthesize on nearby speaker
  – Costs: Loudspeaker (unused) mid, person location high
  – Benefits: Fast, Catches attention
• Plan 3: Print on printer, Send a page on the Wizard Line
  – Costs: Printer (busy) mid, pager mid
  – Benefits: Mid
• The best plan under the Circumstances is Plan 3!

Framework 2: Recovery From Failures
• The Service Mapping framework renders services by translating them into plans involving physical resources
• Physical resources have known (and unknown) failure modes
• Each plan step accomplishes sub-goal conditions needed by succeeding steps
  – Each condition has some way of monitoring whether it has been accomplished
  – These monitoring steps are generated into the code implementing the plan
• If a sub-goal fails to be accomplished, the diagnostic infrastructure is invoked

Making the System Responsible for Achieving Its Goals
[Diagram: plan step A achieves Condition-1, which plan step B requires as a prerequisite. A Monitor raises alerts to the Diagnostic Service (Localization & Characterization), which feeds the Repair Plan Selector (Scope of Recovery, Selection of Alternative), the Rollback Designer and the Resource Allocator (Resource Plan), producing a Concrete Repair Plan for Enactment.]

Diagnosis and Recovery Framework
• Model-based diagnosis isolates and characterizes the failure
  – Driven by detected discrepancies between expectations and observations
  – Each component has models of both normal and abnormal modes of operation
  – Selects a set of component models consistent with observations
• A recovery is chosen based on the diagnosis
  – It might be as simple as “try it again” (e.g. after a network glitch)
  – It might be “try it again, but with a different selection of resources”
  – It might be as complex as “clean up and try a different plan”

Example of Recovering from Failures
• Plan
  – Locate a Wizard by the Screen
    • Monitoring: Check that the wizard is still there
  – Turn on selected projector
    • Monitoring: Check that the projector is on
  – Project the message
    • Monitoring: Check that the person noticed the message
• Plan Breakdown
  – “I don’t see light on the screen”
  – “I see a wizard by the screen”
  – “The projector-1 must be broken”
  – “We’ll try again, but use Projector-3”

Framework 3: Coordination of Perceptual Information
• We wish to separate the implementation of perceptual tasks from the uses to which perception is put
• Communication between modules focuses on “behavioral events”
  – An event is signaled whenever the state of any object in the model is perceived to change
  – E.g. a person moves, a person is identified, a device dies
• Communication is controlled by a publish-subscribe interface
  – Each module publishes in a central registry which events it can notice
  – Each module subscribes with the registry for events of interest
  – The registry distributes to signalers the list of consumers
    • More exactly, code to test the list of consumers
    • No centralized communications hub

The Event “Bus” For Perceptual Coordination
[Diagram: Face Spotter, Face Recognition, White Board Context Manager, Voice Identification and Visual Tracker attached to an event bus, each announcing the events it subscribes to and the events it signals:]
• “I’m interested in the location of individuals. I signal people approaching the whiteboard.”
• “I’m interested in face location. I signal the location of individuals.”
• “I’m interested in body motion. I signal face location.”
• “I’m interested in body motion. I signal the location of individuals.”
• “I signal body motion.”

Coordination of Perceptual Information
• “Behavioral events” are organized into a taxonomy
  – Some behaviors are “close to the physics”
  – Some behaviors are more abstract (e.g. a person is near the white-board)
• Events are signaled with a certainty estimate
• Behaviors abstract over time
  – Recognizers take the form of Markov chains
• The same event can be signaled by different perceptual modules
  – Both face and voice recognition can identify a person
• Recognizer code is synthesized from logical descriptions

Interaction of Frameworks: Perceptual Integration through Service Request
• For each behavior there is a corresponding Service
• This can be requested through the Service Mapping framework
  – Requested when the initial perception lacks confidence
  – Driven by diagnosis of perceptual breakdown
• Causes the system to marshal the resources necessary to gather the additional information needed for disambiguation
[Diagram: perceptual hypotheses “There is a body” (Certainty: high), “There is a face” (Certainty: high), “It’s Sue” (Certainty: high), “It’s Sally” (Certainty: very low). Diagnosis: Bad Pose. Recovery: Get Good Pose. Service Request: Get Good Pose. Plan: Synthesize Good Pose (Get another view, Visual Hull, Interpolate).]

Framework 4: Contextual Management of Perception
• Context is a combination of:
  – Information from all perceptual modalities
  – Task Structure
• Context biases perceptual processing
  – speech
  – vision (eventually)
[Diagram: perception, context and task structure shown at time = i and time = i+1.]

Using Context to Guide Speech Recognition
• Many small grammars
• Each Appropriate to a Specific Context
• Based on IBM’s ViaVoice
• Large assembly (~50) of software agents
• Dynamically activate & deactivate based on the interaction
• Simulate a very large set of supported utterances
• Reject utterances inappropriate to the Context

An Example of Context Management
• The attention of the system should be focused by what people do and say.
• Speech recognition should be biased in favor of things going on at that time
  – Speech system is made up of many “grammar fragments”
  – Grammar fragments are activated (and deactivated) when perceptual events (visual or speech) suggest they should be
[Diagram: rule chaining in the White Board Place Manager, ending in “Activate the Drawing Grammar” so that an utterance such as “Frobulate the sidetracker” can be recognized.]
• Rule: If Person ?P approaches space ?x and Grammar ?y is relevant to ?x, Then Activate Grammar ?y
  – Sally approaches space Whiteboard-1
  – The Drawing Grammar Is Relevant to space Whiteboard-1
• Rule: If A space ?x contains a drawing device ?y, Then the Drawing Grammar is relevant to space ?x
  – Space Whiteboard-1 contains device mimeo-1
  – Mimeo devices are drawing devices

Framework 5: Application Frameworks for Display and Cmd Mgt
• An application manages:
  – A set of displays
  – A set of perceptual capabilities
  – An underlying set of state-variables (the root nodes of its representation)
• Application loop:
  – Accept a service request (a command)
  – Perform the service
  – Update the displays to reflect new state-variables
• Gives a sense of synchronicity at the high level even though at the lower level there are many distributed agents at work

Elements of Application Framework
[Diagram: several Displays, a Display Manager, and State Variables.]

Summary
• First focus is on model-driven integration of autonomous vehicles
  – New modeling and mode-identification techniques being developed
• Second focus is on perceptually guided environments
  – Several frameworks being developed to support specific issues
  – Service Mapping, Perceptual Coordination, Diagnosis and recovery, Contextual Biasing
• Common theme of structuring into frameworks along the lines of Dynamic Domain Architectures.
• Will soon investigate OEP domains as well.
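
Sketch: Chaining the Grammar Activation Rules
The context management example chains two rules: a space containing a drawing device makes the Drawing Grammar relevant to that space, and a person approaching a space activates the grammars relevant to it. The Python sketch below replays that chain with a small forward-chaining loop. The tuple encoding of facts, the tag names (`contains`, `is-a`, `drawing-device-kind`, `approaches`, `relevant`, `active`) and the `forward_chain` function are hypothetical, chosen only for illustration; the slides do not show the room's actual rule engine, and deactivation is not modeled here.

```python
def forward_chain(initial_facts):
    """Apply the two slide rules repeatedly until no new fact is derived."""
    facts = set(initial_facts)
    while True:
        derived = set()
        drawing_kinds = {f[1] for f in facts if f[0] == "drawing-device-kind"}
        for fact in facts:
            if fact[0] == "contains":
                # Rule: a space that contains a drawing device makes the
                # Drawing Grammar relevant to that space.
                _, space, device = fact
                if any(("is-a", device, kind) in facts for kind in drawing_kinds):
                    derived.add(("relevant", "Drawing Grammar", space))
            elif fact[0] == "approaches":
                # Rule: a person approaching a space activates every
                # grammar that is relevant to that space.
                _, _person, space = fact
                for other in facts:
                    if other[0] == "relevant" and other[2] == space:
                        derived.add(("active", other[1]))
        if derived <= facts:  # fixpoint reached: nothing new was derived
            return facts
        facts |= derived


# Facts taken from the slide, in the assumed tuple encoding.
ROOM_FACTS = {
    ("contains", "Whiteboard-1", "mimeo-1"),
    ("is-a", "mimeo-1", "mimeo"),
    ("drawing-device-kind", "mimeo"),
    ("approaches", "Sally", "Whiteboard-1"),
}

result = forward_chain(ROOM_FACTS)
active_grammars = sorted(f[1] for f in result if f[0] == "active")
print(active_grammars)
```

Removing the `approaches` fact leaves the Drawing Grammar relevant to Whiteboard-1 but inactive, which mirrors the slide's point that grammars are switched on only when perceptual events suggest they should be.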