Invited speakers

Kaoru Hirota – Beijing Institute of Technology, China

Humans-Robots Interaction in Multi-agent Smart Society

Humans-robots interaction through the internet is presented in the framework of a multi-agent smart society. First, a deep-level emotion understanding method is proposed for agent-to-agent communication, in which customized learned knowledge of an observed agent is combined with the observed input information from Kinect. It aims to realize agent-dependent emotion understanding by utilizing customized knowledge of the agent, rather than ordinary surface-level emotion understanding that uses visual/acoustic/distance information without any customized knowledge. Then the concept of the Fuzzy Atmosfield (FA) is introduced to express the atmosphere in humans-robots communication. It is characterized by a 3D fuzzy cubic space with the axes “friendly-hostile”, “lively-calm”, and “casual-formal”, derived from cognitive science experiments and PCA. The atmosphere in the communication is expressed by a point in the 3D fuzzy cubic space and is assumed to move through the space over time. To make such changes in atmosphere intuitively understandable, a graphical representation method is also proposed. To illustrate the FA and its visualization method for the public, a demonstration scenario, “enjoying home party by five eye robots and four humans”, is presented.

Pitoyo Hartono – Chukyo University, Japan

Toward comprehensible neural networks

In recent years we have seen the rapid proliferation of neural networks, especially deep models. Learning tasks that were unattainable until a few years ago because of their data sizes and complexity can now be learned on reasonable time scales. Deep models are now being applied widely in various fields, including self-driving vehicles, industrial and service robots, and medical sciences such as pathological diagnostics. One of the primary drawbacks of hierarchical neural networks, including deep models, is their incomprehensibility: it is hard to understand how they do what they do. For many applications, such as game playing, this incomprehensibility is not a drawback. However, for critical applications such as pathological diagnostics, it may hinder further application of neural networks. It is important to understand, at least to some extent, the rationale behind neural networks’ decisions. In this talk, some studies on understanding neural networks will be reviewed. The primary focus will be on a recently proposed neural network model called Softmax Restricted Radial Basis Function Networks (S-rRBF), which provides a visualization of its internal layer to complement its decisions. This visualization gives an intuitive understanding of how the neural network maps its input to its output. Some examples of real-world problems, including pathological diagnostics, will be given, and the potential for further uses of the S-rRBF will be explained.

Takahiro Yamanoi – Hokkai-Gakuen University, Japan

Elucidation of Brain Activities by Electroencephalograms and its application to Brain Computer Interface

According to research on the human brain, a visual stimulus is first processed in V1, in the occipital lobe. Processing then proceeds to the parietal associative area. Higher-order processes of the brain thereafter show laterality. For instance, 99% of right-handed people and 70% of left-handed people have their language area in the left hemisphere, in Wernicke’s area and Broca’s area. Yamanoi and collaborators presented several visual stimuli to subjects and measured electroencephalograms (EEGs) while those stimuli were presented. The EEG data were summed and averaged according to the type of stimulus and the subject so as to obtain the event-related potentials (ERPs). These ERPs were analyzed by equivalent current dipole source localization (ECDL). Some spatiotemporal activities in the brain will be explained as results of the ECDL method. Yamanoi and his collaborators also measured EEGs while subjects recognized and recalled ten types of still images of movements of the robot PLEN. They tried to discriminate these EEGs according to the ten image types by using canonical discriminant analysis. As a result, the mean discrimination rate was 100%; hence the robot PLEN could be controlled by EEGs.
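The averaging step described above (summing stimulus-locked EEG epochs and averaging them per stimulus type to obtain ERPs) can be sketched as follows. This is a minimal illustration only, not the speakers' actual pipeline: the function name `average_erps`, the single-channel data layout, and the toy values are all assumptions, and real analyses typically add filtering, baseline correction, and artifact rejection.

```python
from collections import defaultdict


def average_erps(epochs):
    """Average stimulus-locked EEG epochs per stimulus type.

    `epochs` is a list of (stimulus_type, samples) pairs, where `samples`
    is a list of voltages for one channel. All epochs of the same type
    must have the same length. Returns {stimulus_type: averaged waveform}.
    """
    sums = {}
    counts = defaultdict(int)
    for stim, samples in epochs:
        if stim not in sums:
            sums[stim] = [0.0] * len(samples)
        elif len(samples) != len(sums[stim]):
            raise ValueError("all epochs of one type must have equal length")
        # Sum sample by sample, time-locked to stimulus onset.
        for i, value in enumerate(samples):
            sums[stim][i] += value
        counts[stim] += 1
    # Divide each summed waveform by the number of epochs of that type.
    return {
        stim: [s / counts[stim] for s in total]
        for stim, total in sums.items()
    }


# Hypothetical toy data: two stimulus types, three samples per epoch.
toy_epochs = [
    ("image_A", [1.0, 2.0, 3.0]),
    ("image_A", [3.0, 4.0, 5.0]),
    ("image_B", [0.0, 0.0, 6.0]),
]
erps = average_erps(toy_epochs)
```

Averaging over many epochs attenuates activity that is not time-locked to the stimulus, which is why the ERP emerges from the averaged waveform rather than from any single trial.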

Vladimir Marik – Czech Technical University in Prague, Czech Republic

Hamido Fujita – Iwate Prefectural University, Japan

Takashi Minato – Advanced Telecommunications Research Institute International, Japan

Development of an autonomous android that can naturally talk with people

Our research group has been developing a very humanlike android robot that can talk with people in a humanlike manner, using not only verbal but also non-verbal behaviors such as gestures, facial expressions, and gaze, while exploring the essential mechanisms for generating natural conversation. Humans interact most effectively with other humans; hence, very humanlike androids can be promising communication media to support people’s daily life. Existing spoken dialogue services, such as voice search on smartphones and traffic information services, have mainly focused on task-oriented communication that provides information through natural verbal interaction. However, the dialogue system itself has no intention or agency, and it cannot be a partner for casual conversation. A conversation essentially involves mutual understanding of each other's intentions and opinions among the participants; therefore, to achieve a humanlike manner, we introduced into our android a hierarchical model of decision-making for dialogue generation that is based on the android’s desires and intentions. Furthermore, expressing humanlike bodily movements is also important for natural conversation, and we have developed a method to automatically generate humanlike motions that are synchronized with the android's utterances. So far, we have studied human-android interaction in both verbal and non-verbal aspects, and this talk will introduce some research topics related to those studies.

Filippo Cavallo – Scuola Superiore Sant’Anna, Italy

Andreas Holzinger – Graz University of Technology, Austria

Explainable AI: Augmenting Human Intelligence with Artificial Intelligence

Explainable AI is not a new field. Rather, the problem of explainability is as old as AI itself. While the rule-based approaches of early AI are comprehensible “glass-box” approaches, at least in narrow domains, their weakness was in dealing with the uncertainties of the real world. The introduction of probabilistic learning methods has made AI increasingly successful. Meanwhile, deep learning approaches even exceed human performance in particular tasks. However, such approaches are becoming increasingly opaque, and even if we understand the underlying mathematical principles of such models, they still lack explicit declarative knowledge. For example, words are mapped to high-dimensional vectors, making them unintelligible to humans. What we need in the future are context-adaptive procedures, i.e. systems that construct contextual explanatory models for classes of real-world phenomena. One step may be to link probabilistic learning methods with large knowledge representations (ontologies), thus making it possible to understand how a machine decision has been reached and making results re-traceable, explainable and comprehensible on demand – the goal of explainable AI.

Napoleon Reyes – Massey University, New Zealand

Combining Fuzzy Logic with Optimal Path-Finding Algorithms for Robot Navigation

Recently, there has been an influx of research activities involving the development of semi-autonomous and fully autonomous vehicles. Still, autonomous robot navigation is a feat that is yet to be perfected, attracting major car manufacturers and tech giants in the quest to build a completely safe driverless vehicle. Many have made progress, but none has completely solved the problem. Navigation easily extends outside the realm of well-structured urban settings, and that includes navigating partially uncharted territories and navigating with a partially incorrect map. In these cases, robot navigation typically takes place with the aid of sensor-based terrain acquisition systems. Using some assumptions about the unknown parts of the world, the robot immediately proceeds with goal-directed path-planning to find a candidate shortest path. En route to the target, on-board sensors verify whether the path is indeed traversable, noting any inconsistencies found in the map and any new discoveries about the world. When faced with a path that leads to a dead end, re-planning takes place for course correction, and the process repeats. Central to the navigation problem is the need for a fast, optimal and incremental path-planner that can work cohesively with the lowest level of motor control. Whenever possible, the algorithm should be able to reuse previous path-planning results to speed up future path-planning tasks. In light of this problem, this work focuses on integrating a cascade of fast reactive fuzzy systems with the A* and D* Lite algorithms. The hybrid algorithms are able to re-plan smooth trajectories continuously in dynamic environments. Moreover, the algorithms provide the computational mechanisms for completely manoeuvring robots, down to controlling the actual wheel velocities.
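The plan, sense, and re-plan loop described above can be sketched with plain A* on a small occupancy grid. This is a simplified illustration under stated assumptions, not the speaker's hybrid algorithm: it re-plans from scratch after each discovery (D* Lite would instead reuse earlier search results), it omits the fuzzy motion controllers entirely, and the names `astar` and `navigate`, the 4-connected grid, and the toy maps are hypothetical.

```python
import heapq


def astar(grid, start, goal):
    """Shortest 4-connected path on a grid of 0 (free) / 1 (blocked) cells.

    Returns a list of (row, col) cells from start to goal, or None.
    """
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        # Manhattan distance: admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]
    g_cost = {start: 0}
    parent = {start: None}
    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        if g > g_cost[cell]:
            continue  # stale heap entry; a cheaper route was found later
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_g = g + 1
                if new_g < g_cost.get(nxt, float("inf")):
                    g_cost[nxt] = new_g
                    parent[nxt] = cell
                    heapq.heappush(open_heap, (new_g + h(nxt), new_g, nxt))
    return None


def navigate(believed, actual, start, goal):
    """Follow the planned path; on a sensed obstacle, fix the map and re-plan."""
    pos, travelled, replans = start, [start], 0
    path = astar(believed, pos, goal)
    while path is not None and pos != goal:
        nxt = path[1]
        if actual[nxt[0]][nxt[1]] == 1:        # sensor contradicts the map
            believed[nxt[0]][nxt[1]] = 1       # record the discovery
            path = astar(believed, pos, goal)  # re-plan from scratch
            replans += 1
            continue
        pos = nxt
        travelled.append(pos)
        path = path[1:]
    return (travelled if pos == goal else None), replans


# Hypothetical toy maps: the robot believes the world is empty,
# but cell (0, 1) is actually blocked.
believed_map = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
actual_map = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
route, replan_count = navigate(believed_map, actual_map, (0, 0), (0, 2))
```

Here the robot first plans straight through the cell it wrongly believes to be free, senses the obstacle, updates its map, and re-plans a detour. An incremental planner such as D* Lite targets exactly the repeated search that this sketch performs from scratch.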

Cindy L. Bethel – Mississippi State University, USA

Therabot: An Adaptive Therapeutic Support Robot

Mental health disorders are a prominent problem across the world. Animal-assisted therapy has been an effective treatment; however, not everyone can interact with or care for a live animal. Therabot has been developed as an assistive robot to provide therapeutic support at home and in the counseling setting. Therabot is designed as a stuffed robotic dog and has adaptive touch sensing to allow for improved human-robot interactions. Through its touch sensing, it will determine whether its user's stress level is increasing and adapt to provide support during therapy sessions and for home therapy practice. Over time, Therabot is expected to learn the preferences of its user and adapt its behaviors.

Daniel W. Carruth – Mississippi State University, USA

Simulation for Training and Testing Intelligent Systems

Intelligent systems require sizable sets of known data for training. Developing datasets from real-world data can be a time-consuming and complex process. Simulation provides an alternative with key advantages, such as rapid data collection, automated semantic tagging, and lower cost, though at the expense of realism. The major automotive autonomy efforts rely on simulation to supplement real-world driving with millions of miles of simulated driving. Simulators are also being used to develop and test autonomous algorithms for do-it-yourself autonomous remote-control cars, small-to-medium sized unmanned ground vehicles, and intelligent agents that exist only in virtual environments. Building and using simulated environments for training and testing intelligent systems provides many significant advantages. However, users must be aware of the limitations and constraints of simulated environments and carefully evaluate the transfer of learning from simulated datasets to real-world performance.

Yuan-Fu Liao – National Taipei University of Technology, Taiwan

Some experiences on applying DNNs to speech/speaker/language recognition, TTS and sentiment analysis

With the advent of deep learning there has been a major paradigm shift in how speech signal processing techniques work. In this talk, I would like to share my experiences on applying DNNs to speech/speaker/language recognition, TTS and sentiment analysis. First, I will talk about word2vec/bi-LSTM-based dimensional sentiment analysis for Chinese phrases. Second, I will introduce an LSTM/RNNLM-based bilingual (Chinese/English) ASR system for broadcast radio transcription. Third, I will describe a d-vector-based speaker/language recognition system. Fourth, a pure modular DNN-based bilingual TTS system will be discussed. Finally, some closing remarks will be given, especially about the promise of deep semi-supervised and unsupervised training to exploit large amounts of readily available unlabeled data.

Wang XingFu – University of Science and Technology of China, Hefei, China

Key Technologies of Software-Defined Wireless Body Area Network

After a decade of extensive research on application-specific Wireless Body Area Networks (WBANs), the recent development of information and communication technologies makes it practical to realize a new BAN paradigm, the software-defined sensor network (SDN-WBAN), which is able to adapt to various application requirements and to fully exploit the communication, computation and sensing resources of WBANs. Sensor nodes in SDN-WBAN can be dynamically reprogrammed for different sensing tasks via the over-the-air programming technique. SDN decouples the control plane, which makes the network forwarding decisions, from the data plane, which mainly forwards the data. This decoupling enables more centralized control, where coordinated decisions directly guide the network to desired operating conditions. Moreover, decoupling the control enables graceful evolution of protocols and the deployment of new protocols without having to replace the data plane switches. This project mainly proposed two protocols. The first, Mini-SDN, is an SDN-based architecture that separates the control and data planes in WBANs. It consists of three integrated sub-architectures: Mini-SDN for sensor nodes, Mini-SDN for sinks, and the controller. The second, Mini-Flow, is an integrated communication protocol that facilitates and manages the data routing between the data plane and the control plane. Mini-Flow comprises three routing mechanisms: the uplink routing (data plane to control plane), the downlink routing (controller to data plane), and the information reporting at the network initialization level (when the controller knows nothing about the data plane). To reduce the communication overhead and the amount of information exchanged between the two planes, the uplink/downlink routing mechanisms are enhanced by an exponential-controlled probability distribution that prioritizes the candidate nodes based on node temperature, Received Signal Strength Indicator (RSSI), number of hops and remaining energy.
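The abstract does not give the form of the exponential-controlled distribution, so the following is only one plausible reading: a softmax over a weighted score of the four metrics. The function name `forwarding_probabilities`, the weights, the normalization of metrics to [0, 1], and the toy nodes are all assumptions made for illustration, not the published Mini-Flow formula.

```python
import math


def forwarding_probabilities(candidates, weights=(1.0, 1.0, 1.0, 1.0)):
    """Turn per-node metrics into selection probabilities via a softmax.

    Each candidate is a dict with metrics already normalized to [0, 1]:
    'temperature' and 'hops' (lower is better), 'rssi' and 'energy'
    (higher is better). The weights and the softmax form are illustrative
    assumptions, not the published Mini-Flow formula.
    """
    w_temp, w_rssi, w_hops, w_energy = weights
    scores = [
        w_rssi * c["rssi"] + w_energy * c["energy"]
        - w_temp * c["temperature"] - w_hops * c["hops"]
        for c in candidates
    ]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical candidates: a cool, well-connected node and a hot, distant one.
toy_nodes = [
    {"temperature": 0.2, "rssi": 0.9, "hops": 0.25, "energy": 0.8},
    {"temperature": 0.8, "rssi": 0.4, "hops": 0.75, "energy": 0.3},
]
probs = forwarding_probabilities(toy_nodes)
```

Under these assumptions, a node that is cooler, closer, better connected and better charged receives a higher forwarding probability, while less favourable nodes keep a nonzero chance, which spreads load instead of always choosing the single best node.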