Proceedings of the Future Technologies Conference (FTC) 2018, Volume 1

ISSN 2194-5357    ISSN 2194-5365 (electronic)
Advances in Intelligent Systems and Computing
ISBN 978-3-030-02685-1    ISBN 978-3-030-02686-8 (eBook)
https://doi.org/10.1007/978-3-030-02686-8
Library of Congress Control Number: 2018957983

© Springer Nature Switzerland AG 2019
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.

Editor's Preface

The Future Technologies Conference (FTC) 2018 was held on November 13-14, 2018, in Vancouver at the Marriott Pinnacle Downtown Hotel, with sweeping views of the coastal mountains, Coal Harbour, and Vancouver's city skyline. The city of Vancouver is considered one of the most beautiful cities in the world. With great privilege, we present the Proceedings of FTC 2018 to the readers in two volumes. We hope that you will find them useful, exciting, and inspiring.

FTC 2018 aims at producing a bright picture and charming landscape for future technologies by providing a platform to present the best of current systems research and practice, emphasizing innovation and quantified experience. The ever-changing scope and rapid development of future technologies create new problems and questions, resulting in a real need to share brilliant ideas and stimulate awareness of this important research field. Researchers, academics, and technologists from leading universities, research firms, government agencies, and companies from 50+ countries presented the latest research at the forefront of technology and computing.

After the double-blind review process, we finally selected 173 full papers, including six poster papers, for publication. We would like to express our gratitude and appreciation to all of the reviewers who helped us maintain the high quality of the manuscripts included in these conference proceedings. We would also like to extend our thanks to the members of the organizing team for their hard work. We are tremendously grateful for the contributions and support received from authors, participants, keynote speakers, program committee members, session chairs, organizing committee members, steering committee members, and others in their various roles.
Their valuable support, suggestions, dedicated commitment, and hard work have made FTC 2018 a success. Finally, we would like to thank the conference's sponsors and partners: Western Digital, IBM Research, and Nature Electronics. We believe this event will help further disseminate new ideas and inspire more international collaborations. We hope that all the participants of FTC 2018 had a wonderful and fruitful time at the conference and that our overseas guests enjoyed their sojourn in Vancouver!

Kind Regards,
Kohei Arai

Contents

Towards in SSVEP-BCI Systems for Assistance in Decision-Making
  Rodrigo Hübner, Linnyer Beatryz Ruiz Aylon, and Gilmar Barreto
Image-Based Wheel-Base Measurement in Vehicles: A Sensitivity Analysis to Depth and Camera's Intrinsic Parameters
  David Duron-Arellano, Daniel Soto-Lopez, and Mehran Mehrandezh
Generic Paper and Plastic Recognition by Fusion of NIR and VIS Data and Redundancy-Aware Feature Ranking
  Alla Serebryanyk, Matthias Zisler, and Claudius Schnörr
Hand Gesture Recognition with Leap Motion
  Lin Feng, Youchen Du, Shenglan Liu, Li Xu, Jie Wu, and Hong Qiao
A Fast and Simple Sample-Based T-Shirt Image Search Engine
  Liliang Chan, Pai Peng, Xiangyu Liu, Xixi Cao, and Houwei Cao
Autonomous Robot KUKA YouBot Navigation Based on Path Planning and Traffic Signals Recognition
  Carlos Gordón, Patricio Encalada, Henry Lema, Diego León, and Cristian Peñaherrera
Towards Reduced Latency in Saccade Landing Position Prediction Using Velocity Profile Methods
  Henry Griffith, Subir Biswas, and Oleg Komogortsev
Wireless Power Transfer Solutions for 'Things' in the Internet of Things
  Tim Helgesen and Moutaz Haddara
Electronic Kintsugi
  Vanessa Julia Carpenter, Amanda Willis, Nikolaj "Dzl" Møbius, and Dan Overholt
A Novel and Scalable Naming Strategy for IoT Scenarios
  Alejandro Gómez-Cárdenas, Xavi Masip-Bruin, Eva Marín-Tordera, and Sarang Kahvazadeh
The IoT and Unpacking the Heffalump's Trunk
  Joseph Lindley, Paul Coulton, and Rachel Cooper
Toys That Talk to Strangers: A Look at the Privacy Policies of Connected Toys
  Wahida Chowdhury
A Reinforcement Learning Multiagent Architecture Prototype for Smart Homes (IoT)
  Mario Rivas and Fernando Giorno
Real-Time Air Pollution Monitoring Systems Using Wireless Sensor Networks Connected in a Cloud-Computing, Wrapped up Web Services
  Byron Guanochanga, Rolando Cachipuendo, Walter Fuertes, Santiago Salvador, Diego S. Benítez, Theofilos Toulkeridis, Jenny Torres, César Villacís, Freddy Tapia, and Fausto Meneses
A Multi-agent Model for Security Awareness Driven by Home User's Behaviours
  Farhad Foroughi and Peter Luksch
Light Weight Cryptography for Resource Constrained IoT Devices
  Hessa Mohammed Zaher Al Shebli and Babak D. Beheshti
A Framework for Ranking IoMT Solutions Based on Measuring Security and Privacy
  Faisal Alsubaei, Abdullah Abuhussein, and Sajjan Shiva
CUSTODY: An IoT Based Patient Surveillance Device
  Md. Sadad Mahamud, Md. Manirul Islam, Md. Saniat Rahman, and Samiul Haque Suman
Personal Branding and Digital Citizenry: Harnessing the Power of Data and IOT
  Fawzi BenMessaoud, Thomas Sewell III, and Sarah Ryan
Testing of Smart TV Applications: Key Ingredients, Challenges and Proposed Solutions
  Bestoun S. Ahmed and Miroslav Bures
Dynamic Evolution of Simulated Autonomous Cars in the Open World Through Tactics
  Joe R. Sylnice and Germán H. Alférez
Exploring the Quantified Experience: Finding Spaces for People and Their Voices in Smarter, More Responsive Cities
  H. Patricia McKenna
Prediction of Traffic-Violation Using Data Mining Techniques
  Md Amiruzzaman
An Intelligent Traffic Management System Based on the Wi-Fi and Bluetooth Sensing and Data Clustering
  Hamed H. Afshari, Shahrzad Jalali, Amir H. Ghods, and Bijan Raahemi
Economic and Performance Based Approach to the Distribution System Expansion Planning Problem Under Smart Grid Framework
  Hatem Zaki, R. A. Swief, T. S. Abdel-Salam, and M. A. M. Mostafa
Connecting to Smart Cities: Analyzing Energy Times Series to Visualize Monthly Electricity Peak Load in Residential Buildings
  Shamaila Iram, Terrence Fernando, and Richard Hill
Anomaly Detection in Q & A Based Social Networks
  Neda Soltani, Elham Hormizi, and S. Alireza Hashemi Golpayegani
A Study of Measurement of Audience in Social Networks
  Mohammed Al-Maitah
Predicting Disease Outbreaks Using Social Media: Finding Trustworthy Users
  Razieh Nokhbeh Zaeem, David Liau, and K. Suzanne Barber
Detecting Comments Showing Risk for Suicide in YouTube
  Jiahui Gao, Qijin Cheng, and Philip L. H. Yu
Twitter Analytics for Disaster Relevance and Disaster Phase Discovery
  Abeer Abdel Khaleq and Ilkyeun Ra
Incorporating Code-Switching and Borrowing in Dutch-English Automatic Language Detection on Twitter
  Samantha Kent and Daniel Claeser
A Systematic Review of Time Series Based Spam Identification Techniques
  Iqra Muhammad, Usman Qamar, and Rabia Noureen
CNN with Limit Order Book Data for Stock Price Prediction
  Jaime Niño, German Hernandez, Andrés Arévalo, Diego Leon, and Javier Sandoval
Implementing Clustering and Classification Approaches for Big Data with MATLAB
  Katrin Pitz and Reiner Anderl
Visualization Tool for JADE Platform (JEX)
  Halim Djerroud and Arab Ali Cherif
Decision Tree-Based Approach for Defect Detection and Classification in Oil and Gas Pipelines
  Abduljalil Mohamed, Mohamed Salah Hamdi, and Sofiene Tahar
Impact of Context on Keyword Identification and Use in Biomedical Literature Mining
  Venu G. Dasigi, Orlando Karam, and Sailaja Pydimarri
A Cloud-Based Decision Support System Framework for Hydropower Biological Evaluation
  Hongfei Hou, Zhiqun Daniel Deng, Jayson J. Martinez, Tao Fu, Jun Lu, Li Tan, John Miller, and David Bakken
An Attempt to Forecast All Different Rainfall Series by Dynamic Programming Approach
  Swe Swe Aung, Shin Ohsawa, Itaru Nagayama, and Shiro Tamaki
Non-subsampled Complex Wavelet Transform Based Medical Image Fusion
  Sanjay N. Talbar, Satishkumar S. Chavan, and Abhijit Pawar
Predicting Concussion Symptoms Using Computer Simulations
  Milan Toma
Integrating Markov Model, Bivariate Gaussian Distribution and GPU Based Parallelization for Accurate Real-Time Diagnosis of Arrhythmia Subclasses
  Purva R. Gawde, Arvind K. Bansal, and Jeffery A. Nielson
Identification of Glioma from MR Images Using Convolutional Neural Network
  Nidhi Saxena, Rochan Sharma, Karishma Joshi, and Hukum Singh Rana
Array of Things for Smart Health Solutions Injury Prevention, Performance Enhancement and Rehabilitation
  S. M. N. Arosha Senanayake, Siti Asmah @ Khairiyah Binti Haji Raub, Abdul Ghani Naim, and David Chieng
Applying Waterjet Technology in Surgical Procedures
  George Abdou and Nadi Atalla
Blockchain Revolution in the Healthcare Industry
  Sergey Avdoshin and Elena Pesotskaya
Effective Reversible Data Hiding in Electrocardiogram Based on Fast Discrete Cosine Transform
  Ching-Yu Yang, Lian-Ta Cheng, and Wen-Fong Wang
Semantic-Based Resume Screening System
  Yu Hou and Lixin Tao
The Next Generation of Artificial Intelligence: Synthesizable AI
  Supratik Mukhopadhyay, S. S. Iyengar, Asad M. Madni, and Robert Di Biano
Cognitive Natural Language Search Using Calibrated Quantum Mesh
  Rucha Kulkarni, Harshad Kulkarni, Kalpesh Balar, and Praful Krishna
Taxonomy and Resource Modeling in Combined Fog-to-Cloud Systems
  Souvik Sengupta, Jordi Garcia, and Xavi Masip-Bruin
Predicting Head-to-Head Games with a Similarity Metric and Genetic Algorithm
  Arisoa S. Randrianasolo and Larry D. Pyeatt
Artificial Human Swarms Outperform Vegas Betting Markets
  Louis Rosenberg and Gregg Willcox
Genetic Algorithm Based on Enhanced Selection and Log-Scaled Mutation Technique
  Neeraj Gupta, Nilesh Patel, Bhupendra Nath Tiwari, and Mahdi Khosravy
Second-Generation Web Interface to Correcting ASR Output
  Oldrich Kruza and Vladislav Kubon
A Collaborative Multi-agent System for Oil Palm Pests and Diseases Global Situation Awareness
  Salama A. Mostafa, Ahmed Abdulbasit Hazeem, Shihab Hamad Khaleefah, Aida Mustapha, and Rozanawati Darman
Using Mouse Dynamics for Continuous User Authentication
  Osama A. Salman and Sarab M. Hameed
Ten Guidelines for Intelligent Systems Futures
  Daria Loi
Towards Computing Technologies on Machine Parsing of English and Chinese Garden Path Sentences
  Jiali Du, Pingfang Yu, and Chengqing Zong
Music Recommender According to the User Current Mood
  Murtadha Al-Maliki
Development of Extreme Learning Machine Radial Basis Function Neural Network Models to Predict Residual Aluminum for Water Treatment Plants
  C. D. Jayaweera and N. Aziz
Multi-layer Mangrove Species Identification
  Fenddy Kong Mohd Aliff Kong, Mohd Azam Osman, Wan Mohd Nazmee Wan Zainon, and Abdullah Zawawi Talib
Intelligent Seating System with Haptic Feedback for Active Health Support
  Peter Gust, Sebastian P. Kampa, Nico Feller, Max Vom Stein, Ines Haase, and Valerio Virzi
Intelligence in Embedded Systems: Overview and Applications
  Paul D. Rosero-Montalvo, Vivian F. López Batista, Edwin A. Rosero, Edgar D. Jaramillo, Jorge A. Caraguay, José Pijal-Rojas, and D. H. Peluffo-Ordóñez
Biometric System Based on Kinect Skeletal, Facial and Vocal Features
  Yaron Lavi, Dror Birnbaum, Or Shabaty, and Gaddi Blumrosen
Towards the Blockchain-Enabled Offshore Wind Energy Supply Chain
  Samira Keivanpour, Amar Ramudhin, and Daoud Ait Kadi
Optimal Dimensionality Reduced Quantum Walk and Noise Characterization
  Chen-Fu Chiang
Implementing Dual Marching Square Using Visualization Tool Kit (VTK)
  Manu Garg and Sudhanshu Kumar Semwal
Procedural 3D Tile Generation for Level Design
  Anthony Medendorp and Sudhanshu Kumar Semwal
Some Barriers Regarding the Sustainability of Digital Technology for Long-Term Teaching
  Stefan Svetsky and Oliver Moravcik
Digital Collaboration with a Whiteboard in Virtual Reality
  Markus Petrykowski, Philipp Berger, Patrick Hennig, and Christoph Meinel
Teaching Practices with Mobile in Different Contexts
  Anna Helena Silveira Sonego, Leticia Rocha Machado, Cristina Alba Wildt Torrezzan, and Patricia Alejandra Behar
Accessibility and New Technology MOOC - Disability and Active Aging: Technological Support
  Samuel A. Navarro Ortega and M. Pilar Munuera Gómez
Lecturing to Your Students: Is Their Heart In It?
  Aidan McGowan, Philip Hanna, Des Greer, and John Busch
Development of Collaborative Virtual Learning Environments for Enhancing Deaf People's Learning in Jordan
  Ahmad A. Al-Jarrah
Game Framework to Improve English Language Learners' Motivation and Performance
  Monther M. Elaish, Norjihan Abdul Ghani, Liyana Shuib, and Abdulmonem I. Shennat
Insights into Design of Educational Games: Comparative Analysis of Design Models
  Rabail Tahir and Alf Inge Wang
Immersive and Collaborative Classroom Experiences in Virtual Reality
  Derek Jacoby, Rachel Ralph, Nicholas Preston, and Yvonne Coady
The Internet of Toys, Connectedness and Character-Based Play in Early Education
  Pirita Ihamäki and Katriina Heljakka
Learning Analytics Research: Using Meta-Review to Inform Meta-Synthesis
  Xu Du, Juan Yang, Mingyan Zhang, Jui-Long Hung, and Brett E. Shelton
Students' Evidential Increase in Learning Using Gamified Learning Environment
  V. Z. Vanduhe, H. F. Hassan, Dokun Oluwajana, M. Nat, A. Idowu, J. J. Agbo, and L. Okunlola
Improving the Use of Virtual Worlds in Education Through Learning Analytics: A State of Art
  Fredy Gavilanes-Sagnay, Edison Loza-Aguirre, Diego Riofrío-Luzcando, and Marco Segura-Morales
Design and Evaluation of an Online Digital Storytelling Course for Seniors
  David Kaufman, Diogo Silva, Robyn Schell, and Simone Hausknecht
The Role of Self-efficacy in Technology Acceptance
  Saleh Alharbi and Steve Drew
An Affective Sensitive Tutoring System for Improving Student's Engagement in CS
  Ruth Agada, Jie Yan, and Weifeng Xu
Multimedia Interactive Boards as a Teaching and Learning Tool in Environmental Education: A Case-Study with Portuguese Students
  Cecília M. Antão
Author Index
Towards in SSVEP-BCI Systems for Assistance in Decision-Making

Rodrigo Hübner (1,3), Linnyer Beatryz Ruiz Aylon (2), and Gilmar Barreto (3)

1 Computer Department, Computer Interfaces Research Group, Federal University of Technology - Paraná, Campo Mourão, Paraná 87301-899, Brazil, rodrigohubner@utfpr.edu.br
2 Manna Research Group, State University of Maringá, Maringá, Paraná 87020-900, Brazil
3 School of Electrical and Computer Engineering, Intelligent Systems and Control Laboratory, State University of Campinas, Campinas, São Paulo 13083-970, Brazil

Abstract. In recent years, Brain-Computer Interface (BCI) research has placed a major focus on systems outside the clinical scope. These systems have been used to control electrical and electronic equipment, digital games and other kinds of "control". Such control can be accomplished through decision-making by a BCI system. A well-known paradigm for this purpose is SSVEP (the steady-state visually evoked potential paradigm), in which targets flickering at different frequencies can be distinguished through the visual responses they evoke. This paper proposes a human-computer interaction system using SSVEP for assistance in decision-making. In particular, the work describes a prototype of traffic lights proposed as a case study. The experiments with this prototype create decision-making situations in which the SSVEP-BCI system assists the individual in deciding correctly.

Keywords: BCI · SSVEP · Decision-making

1 Introduction

Brain-Computer Interfaces (BCI) [3,7,19] are commonly used for the development of systems that can improve the quality of life of people who have some physical constraint (visual, auditory or motor) that limits their capacity. In this way, a BCI system should minimize the subject's disability by assisting in a task that the subject could not perform alone. An example of this is the speller of [10], a system in which a subject with a speech impairment focuses on an array of letters on a monitor and, through the visual stimuli generated, the BCI system classifies which letter the subject is looking at and displays it.

A BCI system can also aid in the decision-making of healthy subjects. There are situations that can be considered risky, for example braking a vehicle while driving when you see a red traffic light or a car's headlights flashing ahead. In such situations, a BCI system can assist the driver if the decision taken by him is not the correct one. With this premise, we are developing work to investigate the SSVEP (Steady-State Visually Evoked Potential) paradigm [13–15], which is used to determine which flickering target a subject is focused on and can be recognized with electroencephalography (EEG) equipment. In order for the BCI system to make the right decision, it is necessary that the different events be presented at different flicker frequencies.

To conduct this research, we built simulations that reproduce SSVEP-based techniques, because when this concept of decision-making is applied to the real world, such situations cannot be reproduced in the same way with the traditional SSVEP paradigm: real bright targets do not flicker at a frequency that the BCI system can classify, and recreating them would endanger the lives of the experiment subjects.
In this context, the objective of this paper is to present an empirical study of the techniques used for processing SSVEP signals, aiming at the development of an SSVEP-BCI system to assist in decision-making in situations close to the real world. For this, we built a prototype of traffic lights with light-emitting diodes (LEDs) to create decision-making situations.

To fulfill this objective, a set of experiments based on the SSVEP paradigm was reproduced using a public database, with the intention of evaluating the programming methods. We also constructed databases of acquired EEG signals, evaluated with a prototype of LED-based traffic lights that generates the visual evocation needed for the experimentation. Finally, we investigated different SSVEP stimulation strategies, so that the constructed prototype traffic lights behave closer to reality, without displaying the traditional flicker frequencies of the SSVEP paradigm.

This paper is divided as follows. Section 2 presents a brief grounding in the SSVEP paradigm. Section 3 presents some related works. Section 4 presents experiments with the public database and with the constructed prototype, using the traditional SSVEP model. Section 5 presents directions for a BCI system for evaluating decision-making at traffic lights, using the SSVEP paradigm with non-flickering targets. Finally, Sect. 6 presents the conclusion.

2 SSVEP-BCI Background

BCI paradigms determine what the subject must do, and how, to produce certain known patterns that can be interpreted by a BCI system. The subject must generally undergo equipment calibration and training before the experiment. The configuration of the physical environment, the positioning of the electrodes and the software set are directly associated with the paradigm used. The paradigms currently used in BCI systems are selective attention and motor imagery [18]. In this paper we focus on selective attention.

Selective Attention. BCI paradigms based on selective attention require external stimuli that result in response patterns in the brain [8]. Such stimuli may be visual, auditory or tactile. In this method, each stimulus is associated with a specific command, and the user must focus attention on a target stimulus to generate the corresponding action. In this work visual stimuli are used, and the main paradigms that use these stimuli are Steady-State Evoked Potentials (SSEP) and P300.

– P300: the P300 paradigm consists of obtaining a series of positive peaks in the input signal, with a variation in amplitude in a short space of time. This variation should occur after the appearance of an infrequent target stimulus among several frequent ones [6]. In this way it is possible to visualize a variation in signal amplitude in the time domain. Stimuli can be auditory, visual or sensory.
An example of a visual stimulus is a letter or symbol on a computer screen that the subject is focused on, which, upon receiving a contrast change (generally lighter), generates a peak in the signal approximately 300 milliseconds after the stimulus evocation. This peak gives the paradigm its name, P300 (a peak at 300 ms).
– SSEP: responses to periodic external stimuli that can be verified in the signal obtained from the corresponding cortical region. The stimuli may be sensory or auditory, but are mainly visual, in which case the paradigm is known in the literature as SSVEP.
– SSVEP: stimuli triggered by a visual target flickering at a given frequency in front of the subject. Usually these stimuli are generated by a computer simulation on a monitor screen, but it is also common to use LEDs [25]. When a monitor screen is used, the experiment must be set up so that the screen refresh rate is a multiple of the flicker frequencies used as targets. A target may be a light flickering at a frequency of 8 Hz; when a subject is visually focused on it, a response around 8 Hz can be recognized in the electroencephalogram (EEG) signal obtained from the visual cortex. A study conducted by [20] found that stimulation frequencies can range from 5 to 100 Hz. The SSVEP signal has other characteristics, such as luminance, contrast and chromaticity, that can be modulated together with the flicker frequency of a target stimulus [4].

2.1 Signal Processing in the SSVEP Paradigm

A BCI experiment based on the SSVEP paradigm depends on how the stimuli are presented to the subject and on how the signals obtained through the EEG equipment are processed. We present the processing steps of the SSVEP signal below.

Pre-processing of EEG Signals. In pre-processing, the EEG signal is filtered without losing relevant information. In addition, the signal can be improved by separating out the noise present, as quantified by the signal-to-noise ratio (SNR). When the SNR of the signal is low, detectable patterns are difficult to find; conversely, when the SNR is high, the patterns are easy to identify. Signal filtering techniques can be applied in combination, facilitating the identification of the signals of interest. Temporal and spatial filters are used for signal pre-processing. In this paper we used bandpass temporal filtering by the finite impulse response (FIR) method [22] and the Common Average Reference (CAR) spatial filter [17], which consists of the point-by-point subtraction from each signal of the mean of the EEG signals obtained from all electrodes.

Feature Extraction. This step searches for the features that best describe the expected properties of the input signal. Such characteristics can be obtained using the signal waveform analyzed in the time domain, the frequency components in the frequency domain, the power density spectrum, time-frequency analysis (e.g. the Short-Time Fourier Transform, STFT), autoregressive models, etc. [11]. SSVEP-BCI systems use feature-extraction methods based on the spectral information present in the EEG signal. For a given set of evoked frequencies, the Power Spectral Density (PSD) calculation extracts from the signal the information of interest to be classified. The main methods used for SSVEP spectral analysis are the Filter Bank, the Spectrogram, the Welch method [2] and the Multitaper method [16]. In this work the Multitaper method was used, as implemented in the MNE-Python tool (http://martinos.org/mne).
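To make the pre-processing and feature-extraction steps just described concrete, the following is a minimal sketch in Python; it is our illustration, not the authors' code, and the array shape, filter order and band limits are assumptions.

```python
# Hedged sketch of pre-processing and feature extraction for one recording.
# `eeg` is assumed to be an (n_channels, n_samples) array sampled at `sfreq`.
import numpy as np
from scipy.signal import firwin, filtfilt
from mne.time_frequency import psd_array_multitaper

def preprocess_and_psd(eeg, sfreq, band=(5.0, 50.0), n_taps=257):
    # Band-pass FIR filter (Hamming window), applied forward and backward
    # so that no phase distortion is introduced.
    nyq = sfreq / 2.0
    taps = firwin(n_taps, [band[0] / nyq, band[1] / nyq],
                  pass_zero=False, window="hamming")
    filtered = filtfilt(taps, [1.0], eeg, axis=-1)

    # Common Average Reference: subtract, point by point, the mean of all
    # channels from every channel.
    car = filtered - filtered.mean(axis=0, keepdims=True)

    # Multitaper PSD; the per-channel spectrum is the feature set handed on
    # to feature selection and classification.
    psd, freqs = psd_array_multitaper(car, sfreq, fmin=band[0],
                                      fmax=band[1], verbose=False)
    return psd, freqs
```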
Feature Selection. Feature extraction can produce a large number of variables to be analyzed later by a classifier. In this step, the most relevant features of the extracted set are selected, improving the performance of the classifier in terms of both execution speed and effectiveness. Feature selection techniques include filter methods (Pearson's correlation coefficient and the Davies-Bouldin index) and wrapper methods [2]. The wrapper-based Recursive Feature Elimination (RFE) technique is used in this work because it generally showed better performance in the work cited.

Classification. Classification is the final stage of EEG signal processing, in which it is decided which action or command should be executed. Feature selection outputs a feature vector that is used to classify the data into different classes. Classifiers that follow the supervised learning approach use sets of labeled examples called training sets. Such a set contains several labeled samples of each class, so that the classifier is able to recognize new samples and assign them to one of the classes in the set. There are several supervised classification algorithms, such as the Support Vector Machine (SVM) and Linear Discriminant Analysis (LDA). In this work we chose the SVM classifier, based on its performance reported in [15].
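As an illustration of these last two stages, the following hedged sketch wires RFE around a linear SVM with scikit-learn. The variable names and the number of retained features are assumptions, not values reported by the authors.

```python
# Hedged sketch: wrapper-based feature selection (RFE) followed by an SVM,
# the combination named above, applied to the SSVEP PSD features.
from sklearn.feature_selection import RFE
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

def make_ssvep_classifier(n_features_to_select=16):
    # RFE repeatedly fits the linear SVM and discards the least useful
    # features until the requested number remains; the surviving features
    # are then used by the final SVM.
    return Pipeline([
        ("select", RFE(estimator=SVC(kernel="linear"),
                       n_features_to_select=n_features_to_select)),
        ("svm", SVC(kernel="linear")),
    ])

# Usage (X: (n_trials, n_features) PSD features, y: attended-target labels):
#   clf = make_ssvep_classifier().fit(X_train, y_train)
#   accuracy = clf.score(X_test, y_test)
```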
3 Related Works

The main works that contributed to the development of this paper are presented below.

In Development of an SSVEP-based BCI spelling system adopting a QWERTY-style LED keyboard [12], a speller system was developed in the QWERTY layout using 30 LEDs, one for each key of the keyboard, flickering at different frequencies. This method allows the individual to select a character without the multiple steps of traditional BCI speller systems. It was possible to obtain a fine frequency resolution, strictly recognizing, for example, flickering stimuli separated by 0.1 Hz. The experiments were performed with ten healthy subjects, five participating in an offline experiment and five in an online experiment, and 68 English words were used for the evaluations. In the offline results, accuracies of 76.67% and 72.33% were obtained for viewing angles of 40 and 30 degrees, respectively. The online results were better because the best angle and the best combination of electrodes (Oz and O2 in the 10-20 system) were used, with the accuracy depending on the time participants took to recognize each character: 5 s (84.69%), 6 s (86.17%) and 7 s (89.53%). From this work it was possible to obtain important information about the distance and positioning angle of the LEDs for a better result, as well as the best electrode positions.

In A novel stimulation method for multi-class SSVEP-BCI using intermodulation frequencies [4], a method was developed using different intermodulation frequencies for SSVEP-BCIs with targets flickering at the same frequency of 15 Hz, a setup that allows a greater number of targets. The authors encoded nine target objects on an LCD screen, arranged as squares in a 3 x 3 matrix. The modulation frequency for each target was generated by a color characteristic (C), alternating frames in green, red and gray; a luminance characteristic (L), alternating frames with a difference of 20 cd/m²; and a mixture of the two (CL), forming three approaches. As a result, the average accuracy for the online assessment of the three approaches was 85%, with the mixture of the two (CL) reaching the highest value of 96.41%. This work presents alternatives within the SSVEP paradigm that make it possible to recognize different targets flickering at the same frequency.

In Towards an optimization of stimulus parameters for brain-computer interfaces based on steady state visual evoked potentials [5], the influence of several characteristics of the visual stimulus on the SSVEP signal is presented. Five characteristics were evaluated for the targets: size, distance, color, shape and the presence of a fixation point in the middle of each flickering object. The distance between the stimulation targets and the presence or absence of the fixation point had no significant effect on the results, whereas the color and size of the flickering target played an important role in the SSVEP response. Experiments were performed with 5 subjects, and four stimuli were presented on the monitor screen with different flickering frequencies. A group of LEDs was added adjacent to each object shown on the screen, responsible for randomly generating the imposed luminance. The spectral responses were largest for white, followed by yellow, red, green and blue. Regarding object size, the quality of the spectral information grows proportionally with the size of the object. The other features did not have relevant effects in that study. This work provided important information for characterizing the environment in which the prototype of our work is inserted.

The work Use of high-frequency visual stimuli above the critical flicker frequency in a SSVEP-based BMI [21] presents an evaluation using frequencies above those traditionally used in SSVEP-BCI systems. Green (low luminance) and blue (high luminance) LEDs were used to verify the accuracy of the system and the level of visual fatigue of the subjects. Subjects fixated green and blue flickering lights (30 and 70 Hz, respectively), and the SSVEP amplitude was evaluated. The subjects were asked to indicate whether the stimulus was visibly flickering and to report their subjective level of discomfort. The study also evaluated visible frequencies (41, 43 and 45 Hz) against invisible frequencies (61, 63 and 65 Hz). As a result, accuracies of 93.1% and 88% were obtained for the visible and invisible stimuli, respectively. In addition, it was concluded that high frequencies continue to offer good performance and that visual fatigue is reduced. In our paper we investigate the use of high flicker frequencies (invisible to the human eye) to approach a real situation.

The related works presented encouraged the use of new concepts beyond the traditional SSVEP method. These methods can contribute to an SSVEP-BCI system applied in a real situation. The next section presents the conduct of the preliminary experiments.

4 Preliminary Experiments

This section presents the two experimental sets that are the basis for our investigation:
1. Development of code for the evaluation of a public SSVEP-BCI database; and
2. Construction of a prototype using traffic lights with LEDs as flickering targets.
Initially, we demonstrate the results of the code produced as part of this work to evaluate a public database.
After that evaluation, a second experimental set was performed to evaluate a database produced by us, using a prototype of traffic lights built with LEDs, in which the LEDs produce traditional SSVEP stimuli based on flickering target frequencies. By analyzing these results, in addition to investigating new methods linked to SSVEP-BCI systems, it will be possible to develop a new BCI system for decision-making with non-flickering targets using the same physical components as the second experimental set. The proposal resulting from this research is in Sect. 5.

All experiments used the MNE-Python tool [9], a set of libraries written in the Python programming language for analyzing EEG and MEG data, together with the scikit-learn library (http://scikit-learn.org) for the computational intelligence routines, also written in Python.

4.1 Public Database SSVEP-BCI

This section describes the experiment performed with the AVI SSVEP database (http://www.setzner.com/avi-ssvep-dataset/), developed by [24] as part of a work by the same author [23] on a speller with dictionary support. First the database built by [24] is introduced, and then the algorithmic strategies developed by us are presented, detailing the loading and preparation of the data, the procedures and the results, respectively.

Description of the Public AVI SSVEP Database. The database contains EEG data measured from healthy subjects exposed to flickering targets to obtain SSVEP responses. Data were recorded using three electrodes (Oz, Fpz and Pz) positioned according to the 10-20 system; only the data from the Oz electrode are stored in the database, with Fpz used as reference and Pz as ground. A BenQ XL2420T LCD monitor with a 120 Hz refresh rate was used for stimulus generation. The EEG equipment was the g.USBamp, which has a sampling rate of 512 Hz and gold-plated electrodes moistened with electrolytic gel. During the experiment, subjects had to concentrate on targets of 2.89 cm² on the monitor screen, seated at a distance of 60 cm from it.

Two types of experiments were performed to compose this database. The first was performed with a single target (ST) to verify the existence of the VEP signal. Four subjects were used, each submitted to a single session, focusing on a single target for thirty seconds, four times. The frequencies chosen in each trial were random, but they were the same for every subject. The second experiment was performed with multiple targets (MT), adding seven targets at different frequencies. Five subjects were used in two sessions, focusing on the targets for sixteen seconds, ten times. In each trial the subject focused on one of the indicated flickering targets, and the indicated sequence was also random but the same for the five subjects.

Loading and Data Preparation. The code developed for the ST analysis was necessary because it deals with a single target; considering our main research on traffic lights, only one light will be lit at a time. The MT data were also analyzed because they offer a greater variety of samples, making it possible to construct and evaluate a greater combination of strategies.
In the ST data, each subject performed only one session with four trials; since twenty-seven samples are available per session, the training and test data could be divided in different proportions within the same session, so that 33% of the samples (9 samples) were used for training and 67% of the samples (18 samples) were used for testing. In the MT data, the training and test data of the classifier are divided across sessions because few samples are available, each session adding ten trials, with each subject performing two sessions. Thus, the second session of each subject, with ten samples, was used for training the classifier, and the first session of the same subject was used for the tests.

Experimental Procedures. Regardless of the division of the data for each experiment, the algorithms for preprocessing, feature extraction and selection, and classification were the same. Figure 1 shows the execution flow and the algorithms applied in each experimental stage.

Fig. 1. General flow of execution of the experiments, presenting the algorithms used in each step.

Generally, the classification algorithm uses different combinations of extracted features for training. In this experiment, the only feature extracted is the Power Spectral Density (PSD) of the SSVEP signal, which allows the classification model to be trained independently of its class. This is because, regardless of the stimulated frequency, the PSD should be higher at that frequency than at the non-evoked frequencies; thus, models trained on any frequency can be applied to classify any test sample.

Results. In the analysis of the results with the ST data, three combinations of data were used for training and testing, since each subject performed the same experimental sequence three times. Thus, the first portion was used to train the classification model and the second and third were used for testing, with the other two possible combinations tested likewise, giving three different possibilities. The best frequency range for feature extraction used a standard deviation equal to 0.3 (found by exhaustive search); that is, if the feature was extracted around a frequency of 6 Hz, the frequency range was 5.7 to 6.3 Hz.

Figure 2a presents the bar plot with the results of the experiment on the ST data. The best result was obtained with subject 4, whose accuracy for the three sessions was 100%. The worst result was with subject 3 using the first session as test, for which an accuracy of 14% was obtained. The overall mean accuracy of all subjects was 70.75%.

Fig. 2. Results of the experiment with the ST data from the AVI database.

The PSD charts were analyzed to explain the low results of subject 3. In the first session, the target evoked a 6.0 Hz signal, but the PSD is higher around 12.0 Hz. This affects both the training of the classifier and the use of these data for testing, resulting in low accuracy. Figure 2b presents the PSD of the first session performed by subject 4, who obtained the highest accuracy (100%). It can be observed that the PSD is highest around the evoked frequency while the remaining frequencies have low values; such data provide good classifier training and also result in good accuracy when used for testing.
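To make the frequency-band feature used in these evaluations concrete, the following hedged sketch reduces a PSD (the output of the extraction step in Sect. 2.1) to one band-power value per candidate stimulus frequency, using the ±0.3 Hz band described above. The list of target frequencies is a placeholder, not the database's actual set.

```python
# Illustrative sketch, not the authors' exact code: one band-power feature
# per candidate stimulus frequency.
import numpy as np

def band_features(psd, freqs, target_freqs, half_width=0.3):
    feats = []
    for f0 in target_freqs:
        band = (freqs >= f0 - half_width) & (freqs <= f0 + half_width)
        feats.append(psd[..., band].mean(axis=-1))  # mean power in the band
    return np.stack(feats, axis=-1)                 # shape (..., n_targets)

# The attended target can then be taken as the frequency whose band carries
# the most power, or the band powers can be passed to the SVM classifier.
```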
For the MT experiment, the second session of each subject was considered the better choice for classifier training. The best frequency range for feature extraction was again obtained with a standard deviation equal to 0.3. Figure 3a shows the bar plot with the results of the MT experiment. Most results were better when using the second session for training, with the exception of subject 2. The best results were with subjects 4 and 5, whose accuracy was 100% in both cases when training on the second session. The worst result was with subject 3, with accuracies of 50% and 60% when training on the first and second sessions, respectively. The overall mean accuracy of all subjects was 84%.

The PSD plots were analyzed to explain the low results of subject 3. Figure 3b presents the PSD of the first session performed by this subject: a 9.3 Hz signal was evoked, but the PSD is larger around 6.5 Hz. The tests performed with the database of [24] demonstrated that the code developed in our work can be used to evaluate an SSVEP-BCI system.

Fig. 3. Results of the experiment with the MT data from the AVI database.

Fig. 4. Traffic lights built with LEDs used in the experiment 2 prototype.

4.2 SSVEP-BCI System Based on Flickering Traffic Lights

In this experimental stage, we started building our own database for the evaluation of the prototype using traffic lights with flickering LEDs, as well as testing the operation of the EEG equipment used.

Description of the Equipment Used. For the development of the prototype, two traffic lights made of LEDs were used. Figure 4a shows the traffic light built into the rest of the prototype, constructed with three 10 mm diffuse LEDs in red, yellow and green. Figure 4b presents the traffic light built with three high-brightness 5 mm LEDs and one 3 mm high-brightness LED: two red, one yellow and one green (the 3 mm one). The two variants were constructed to verify the difference in the EEG signal when using diffuse or high-brightness LEDs, since the latter have a higher light intensity, despite causing visual discomfort.

The traffic lights are operated with the aid of an Arduino UNO (https://www.arduino.cc/), an open-hardware electronic prototyping platform that uses an ATmega328P microcontroller with 32 KB of flash memory and a 16 MHz clock. In addition to the LEDs connected to the traffic lights, a push button was added to manually start each session or to stop it if necessary.

The EEG equipment used in the experiments is the 32-bit OpenBCI board (http://openbci.com) with 8 channels for EEG/EMG/ECG (electroencephalogram/electromyogram/electrocardiogram) measurement, plus three auxiliary channels used for a gyroscopic sensor. The equipment can be expanded to 16 channels using the Daisy module that accompanies it. A headset produced with a 3D printer, the Ultracortex Mark 3 (https://github.com/OpenBCI/Ultracortex/tree/master/Mark 3), was used to hold the electrodes and the OpenBCI board. The electrodes used for the experimentation are made of a silver-silver chloride (Ag-AgCl) alloy, dispensing with electrolytic paste or gel and thus allowing easy placement of the headset on different subjects during an experiment.

Experimental Procedures.
To simulate the traffic light with LEDs at flickering frequencies, code was developed for the microcontroller that allows the frequency of each LED to be specified. In a conventional SSVEP-BCI experiment it is desirable for the multiple targets to flicker at different frequencies, so Eq. 1 was applied in the Arduino code: the interval I between LED toggles is obtained by dividing one by the desired frequency f, dividing by 2 to account for the on/off half-cycle of the LED, multiplying by 1000 to express the time in milliseconds, and finally subtracting n, which is the delay of the code loop running on the hardware. This delay was measured using an LDR light sensor connected to an Arduino: the sensor was pointed at the LED lit at different frequencies, and the sensor readings were sent to the computer for analysis as a graph over time. The delay was found to vary from 1 to 2 ms, so its average value (1.5 ms) was assigned to n.

I = ((1 / f) / 2) * 1000 - n    (1)

For example, for f = 8 Hz and n = 1.5 ms, Eq. 1 gives I = 62.5 - 1.5 = 61 ms.

The following frequencies were configured for the LEDs: red = 8 Hz, yellow = 10 Hz and green = 12 Hz. Frequencies that are not multiples of one another were chosen, which prevents overlap in the spectrogram, since the signal magnitude is also high around multiples of the evoked frequency. Figure 5 shows a flowchart of the experiment, detailing the software and hardware used, as well as the communication model between them.

The EEG signal is obtained from the OpenBCI board with the OpenBCI GUI v2 software (https://github.com/OpenBCI/OpenBCI GUI). This software streams the acquired signal over the Lab Streaming Layer (LSL) interface (https://github.com/sccn/labstreaminglayer) to a program written in Python that receives the EEG signal and writes it to a FIF file (the MNE file format), together with the markers received from the microcontroller's serial port. Such markers are time indications denoting the moment each light of the traffic light was lit (a sketch of this acquisition path is given after the protocol list below).

Fig. 5. Representation of the flow of experiment 2.

This stage of the experiments was performed with only one subject, since the objective was to test the correct functioning of the EEG equipment and to verify whether the prototype is sufficient to evoke a good SSVEP signal. The following protocol was adopted for the sessions:
– Indoor environment with low luminosity.
– Subject seated approximately one meter away from the target.
– Subject exposed to two sessions. Figure 6c shows how the sequence of a session is performed. In each session the SSVEP signal was evoked twenty times with a random light sequence at the target. During the session, each LED is active for 10 s, with intervals of 5 s between one activation and the next. In this way, a session lasts 15 min and 42 s.
– EEG data and markers recorded in a single FIF file (the MNE format) in a database for further offline analysis.
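The following is a hedged sketch of the acquisition path described before the protocol list: EEG samples are pulled from the OpenBCI GUI's LSL stream, light-onset markers are read from the Arduino serial port, and both are written to a FIF file with MNE. The stream type, serial device, marker strings, channel scaling and sampling rate are assumptions, not details reported by the authors.

```python
# Hedged sketch of the LSL-to-FIF recording step (assumed device names).
import numpy as np
import serial
import mne
from pylsl import StreamInlet, resolve_stream

SFREQ = 250.0                                # assumed OpenBCI sampling rate
CH_NAMES = ["O1", "Oz", "O2", "PO3", "PO4", "PO7", "PO8", "Pz"]

inlet = StreamInlet(resolve_stream("type", "EEG")[0])
arduino = serial.Serial("/dev/ttyACM0", 115200, timeout=0)  # assumed port

samples, marks = [], []
while len(samples) < int(60 * SFREQ):        # record one minute as a demo
    chunk, _ = inlet.pull_chunk(timeout=1.0)
    samples.extend(chunk)
    line = arduino.readline().decode(errors="ignore").strip()
    if line:                                 # e.g. "RED", "YELLOW", "GREEN"
        marks.append((len(samples), line))   # approximate onset in samples

data = np.asarray(samples).T * 1e-6          # assume uV from the board -> V
raw = mne.io.RawArray(data, mne.create_info(CH_NAMES, SFREQ, "eeg"))
raw.set_annotations(mne.Annotations(
    onset=[s / SFREQ for s, _ in marks],
    duration=[10.0] * len(marks),            # each light stays on for 10 s
    description=[m for _, m in marks]))
raw.save("session_raw.fif", overwrite=True)
```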
Fig. 6. Illustrations of the protocol for experiment 2.

The electrodes were positioned on the subject's scalp at locations over the occipital, parieto-occipital and parietal lobes, respecting the 10-20 system. Figure 6a shows the positions of the eight electrodes that measure the EEG signal (O1, Oz, O2, PO3, PO4, PO7, PO8 and Pz), plus two electrodes used for reference and grounding (Fz on the frontal lobe and A2 on the right earlobe, respectively). Finally, Fig. 6b shows the complete assembly of the OpenBCI board connected to the Ag-AgCl electrodes together with the Ultracortex Mark 3 headset.

Results. The code was developed with some modifications relative to that used in experimental set 1. In this experiment we added the CAR (Common Average Reference) spatial filter, taking as reference the channels Oz, O2, PO4 and PO7, as they were the channels with the highest VEP response, in addition to FIR filters (Hamming window) with cut-off frequencies of 5 Hz and 50 Hz and a notch filter at 60 Hz and 120 Hz. The training and test data were divided into 30% and 70% portions, respectively, in a cross-validation scheme in which the first 30% of the trials (the first six) were used for training and the remainder for testing, then trials two to seven for training, and so on until fifteen different combinations were completed (a sketch of this scheme is given below).

Fig. 7. Accuracy of the results obtained from the cross-validation of experiment 2.

Fig. 8. Evoked 8 Hz with multiple channels.

The best frequency range for feature extraction was obtained with a standard deviation equal to 1.0. This value was found by exhaustive search using the first 30% of the trials for training the SVM classifier. Figure 7 shows the results of experiment 2 using cross-validation. The best result was obtained with the 9th data portion used for classifier training, with an accuracy of 100%. The worst results were with the 8th and 14th portions, with an accuracy of 78% in both cases. The overall mean accuracy over the whole cross-validation was 86%. Figure 8 shows the PSDs of the session performed with stimuli at the frequencies of 8, 10 and 12 Hz, which obtained the highest accuracy (100%). It can be observed that the PSD is highest around the evoked frequency while the remaining frequencies have low values; such data provide good classifier training and also result in good accuracy when used for testing.
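A minimal sketch of the sliding cross-validation described above follows; the exact indexing is our reading of the text (20 trials per session, 6 consecutive trials for training, the remaining 14 for testing, shifted one trial at a time), so it should be taken as an assumption rather than the authors' implementation.

```python
# Hedged sketch: 15 train/test folds over 20 trials (30%/70% split).
import numpy as np

def sliding_window_folds(n_trials=20, train_size=6):
    for start in range(n_trials - train_size + 1):   # 15 folds for 20 trials
        train_idx = np.arange(start, start + train_size)
        test_idx = np.setdiff1d(np.arange(n_trials), train_idx)
        yield train_idx, test_idx

# Example: evaluate the classifier from Sect. 2.1 on each fold and average.
#   accuracies = [clf.fit(X[tr], y[tr]).score(X[te], y[te])
#                 for tr, te in sliding_window_folds()]
```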
Analyzing the results of experimental set 2, it was possible to find a set of flickering frequencies that can be used to evaluate the constructed traffic-light prototype, together with the EEG equipment used for data acquisition. In the next section we present the directions we are taking to develop an SSVEP-BCI system with non-flickering targets.

5 Towards a New SSVEP-BCI System

In this section we present some hypotheses raised by the research carried out in the previous experiments. The idea is to develop an SSVEP-BCI system for decision-making at traffic lights in which the targets do not have a visible flicker frequency. In this context, the decision to be made is to determine which of the lights of a traffic light is active. Thus, the objective of this third experimental set-up will be to construct a new BCI system with targets that do not visibly flicker, approaching the real decision-making situation a driver faces when viewing a traffic light while driving.

Some hypotheses are presented for this future third experimental set, which consists of taking advantage of strategies of the SSVEP paradigm presented in the related works and in the previous experiments, using the same prototype as the second experiment.

The first hypothesis is to set targets with different flicker frequencies, such that these frequencies are not visible to the human eye.
Strategy: the system should be able to identify frequencies above those used in traditional SSVEP-BCI systems, which generally use frequencies of up to 30 Hz. This strategy will be applied by presenting increasing frequencies above 30 Hz (in 1 Hz steps) to a set of subjects. Each subject will indicate at what point the flicker is no longer visible, and three different frequencies not visible to the subjects will then be configured for the targets.
Potential problems: the work of [21] shows that SSVEP stimuli above the traditional frequencies can be used in BCI systems, but they are more difficult to detect because the SSVEP signal is very weak, which may imply low accuracy in the proposed system.

The second hypothesis considers the same flicker frequency for all targets when they are active, again with this frequency not visible to the human eye.
Strategy: the BCI system should be able to differentiate the color/luminance of the targets by the amplitude of the VEP response at the same stimulated frequency. The same procedure presented in the first hypothesis will be used to find frequencies not visible to the human eye, and the lowest of them will be used to configure the targets. The VEP responses of the different targets will then be analyzed, using the amplitude difference as the main feature. Supporting this hypothesis, the work of [1] shows how the colors used as targets can influence the phase value in an SSVEP system.
Potential problems: even with different values for the colors/luminance of the LEDs, such values may be poorly discriminative, resulting in low accuracy of the proposed system. For this reason, a third and last hypothesis is raised.

The third hypothesis is the development of a BCI system that combines the first and second hypotheses.
Strategy: this model will use a combination of the two previous hypotheses with the premise of improving the performance of the proposed system. Assuming positive (though not necessarily good) results for the first and second hypotheses, the intention of this strategy is to obtain the maximum performance from the two strategies used. For this, the classifier training model must be applied to a data sequence containing at least all possible combinations of the different frequencies and the different LED colors.
Potential problems: the problems of this hypothesis are the same as those presented for the first and second hypotheses. In addition, different flicker frequencies can evoke different values in the VEP signal regardless of the colors of the LEDs, because there are evoked frequencies at which the VEP response is stronger than at others.

In this work we identify new strategies that can be used in SSVEP-BCI systems applied to real situations; in our context, we want to apply them to aid decision-making at traffic lights. Summing up the hypotheses raised, the next step is to develop and evaluate a system based on these concepts.
6 Conclusion

In this paper we investigate SSVEP-BCI systems, evaluate a public database using newly developed code, and create our own database through a simulation of decision-making with traffic lights. The decision process applied has well-known actions: when the light is green the driver can continue driving normally; on red the driver must decelerate and stop the car; and, for some traffic-light models, on yellow the driver should pay more attention at the intersection and consequently reduce the speed of the vehicle. However, we verified that the traditional SSVEP-BCI system usually uses flickering frequencies visible to the human eye, which makes such a model unfeasible for future real situations. The bibliographic survey of related works allowed us to identify characteristics of this model that can be useful for developing a simulation closer to reality. It was also possible to identify other factors in the methodology of these works that contribute to the development of our system: the algorithms used in SSVEP signal processing, the session time performed by the subjects, rest time, number of times each experiment was performed, possible experimental scenarios, position of the electrodes for EEG acquisition, etc.

The practical experiments carried out have already contributed to much of what we want to develop, since it was possible to evaluate the code developed, the prototype built and the EEG equipment used, besides generating satisfactory results for our research. With this, it was possible to formulate hypotheses for the new system. The first two hypotheses will certainly be developed, while the third will be developed considering the results obtained in the first and second, resulting in the proposed SSVEP-BCI system.

Acknowledgment. We would like to thank CNPq (Brazilian Council for Scientific and Technological Development), Brazil, for scholarship 311685/2017-0.

References

1. Cao, T., Wan, F., Mak, P.U., Mak, P.I., Vai, M.I., Hu, Y.: Flashing color on the performance of SSVEP-based brain-computer interfaces. In: 2012 Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pp. 1819-1822. IEEE, San Diego, August 2012
2. Carvalho, S.N., Costa, T.B., Uribe, L.F., Soriano, D.C., Yared, G.F., Coradine, L.C., Attux, R.: Comparative analysis of strategies for feature extraction and classification in SSVEP BCIs. Biomed. Signal Process. Control 21, 34-42 (2015)
3. Chaudhary, U., Birbaumer, N., Ramos-Murguialday, A.: Brain-computer interfaces for communication and rehabilitation, pp. 513-525 (2016)
4. Chen, X., Wang, Y., Zhang, S., Gao, S., Hu, Y., Gao, X.: A novel stimulation method for multi-class SSVEP-BCI using intermodulation frequencies. J. Neural Eng. 14(2), 026013 (2017)
5. Duszyk, A., Bierzyńska, M., Radzikowska, Z., Milanowski, P., Kuś, R., Suffczyński, P., Michalska, M., Labecki, M., Zwoliński, P., Durka, P.: Towards an optimization of stimulus parameters for brain-computer interfaces based on steady state visual evoked potentials. PLoS ONE 9(11), e112099 (2014)
6. Fazel-Rezai, R., Ahmad, W.: P300-Based Brain-Computer Interface Paradigm Design. INTECH Open Access Publisher (2011)
7. Fouad, M.M., Amin, K.M., El-Bendary, N., Hassanien, A.E.: Brain computer interface: a review. In: Hassanien, A.E., Azar, A.T. (eds.) Brain-Computer Interfaces: Current Trends and Applications, pp. 3-30.
Springer International Publishing, Cham (2015)
8. Graimann, B., Allison, B., Pfurtscheller, G.: Brain-computer interfaces: a gentle introduction. In: Graimann, B., Pfurtscheller, G., Allison, B. (eds.) Brain-Computer Interfaces. The Frontiers Collection, pp. 1-27. Springer, Heidelberg (2010)
9. Gramfort, A., Luessi, M., Larson, E., Engemann, D., Strohmeier, D., Brodbeck, C., Goj, R., Jas, M., Brooks, T., Parkkonen, L., Hämäläinen, M.: MEG and EEG data analysis with MNE-Python. Front. Neurosci. 7, 267 (2013). http://journal.frontiersin.org/article/10.3389/fnins.2013.00267
10. Halder, S., Pinegger, A., Käthner, I., Wriessnegger, S.C., Faller, J., Antunes, J.B.P., Müller-Putz, G.R., Kübler, A.: Brain-controlled applications using dynamic P300 speller matrices. Artif. Intell. Med. 63(1), 7-17 (2015)
11. Yang, B.-H., Yan, G.-Z., Wu, T., Yan, R.: Subject-based feature extraction using fuzzy wavelet packet in brain-computer interfaces. Signal Process. 87(7), 1569-1574 (2007)
12. Hwang, H.-J., Lim, J.-H., Jung, Y.-J., Choi, H., Lee, S.W., Im, C.-H.: Development of an SSVEP-based BCI spelling system adopting a QWERTY-style LED keyboard. J. Neurosci. Methods 208(1), 59-65 (2012)
13. Lin, K., Cinetto, A., Wang, Y., Chen, X., Gao, S., Gao, X.: An online hybrid BCI system based on SSVEP and EMG. J. Neural Eng. 13(2), 026020 (2016)
14. Lin, Y.-P., Wang, Y., Jung, T.-P.: Assessing the feasibility of online SSVEP decoding in human walking using a consumer EEG headset. J. NeuroEng. Rehabil. 11(1), 119 (2014)
15. Martišius, I., Damaševičius, R.: A prototype SSVEP based real time BCI gaming system. Intell. Neurosci. 2016, 18 (2016)
16. McCoy, E.J., Walden, A.T., Percival, D.B.: Multitaper spectral estimation of power law processes. IEEE Trans. Signal Process. 46(3), 655-668 (1998)
17. McFarland, D.J., McCane, L.M., David, S.V., Wolpaw, J.R.: Spatial filter selection for EEG-based communication. Electroencephalogr. Clin. Neurophysiol. 103(3), 386-394 (1997)
18. Mühl, C., Gürkök, H., Bos, D.P.-O., Thurlings, M.E., Scherfig, L., Duvinage, M., Elbakyan, A.A., Kang, S., Poel, M., Heylen, D.: Bacteria hunt: evaluating multi-paradigm BCI interaction. J. Multimodal User Interfaces 4(1), 11-25 (2010)
19. Prashant, P., Joshi, A., Gandhi, V.: Brain computer interface: a review. In: 2015 5th Nirma University International Conference on Engineering (NUiCONE), pp. 1-6. IEEE, Ahmedabad, November 2015
20. Regan, D.: Steady-state evoked potentials. J. Opt. Soc. Am. 67(11), 1475-1489 (1977)
21. Sakurada, T., Kawase, T., Komatsu, T., Kansaku, K.: Use of high-frequency visual stimuli above the critical flicker frequency in a SSVEP-based BMI. Clin. Neurophysiol. 126(10), 1972-1978 (2015)
22. Shenoi, B.A.: Introduction to Digital Signal Processing and Filter Design. Wiley-Interscience (2005)
23. Vilic, A., Kjaer, T.W., Thomsen, C.E., Puthusserypady, S., Sorensen, H.B.D.: DTU BCI speller: an SSVEP-based spelling system with dictionary support. In: 2013 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 2212-2215. IEEE, Osaka, July 2013
24. Vilic, A.: AVI SSVEP dataset (2014). http://www.setzner.com/avi-ssvep-dataset
25. Zhu, D., Bieger, J., Molina, G.G., Aarts, R.M.: A survey of stimulation methods used in SSVEP-based BCIs. Intell. Neurosci.
2010, 1:1–1:12 (2010) Image-Based Wheel-Base Measurement in Vehicles: A Sensitivity Analysis to Depth and Camera’s Intrinsic Parameters David Duron-Arellano(&) , Daniel Soto-Lopez, and Mehran Mehrandezh University of Regina, 3737 Wascana Pkwy, Regina, SK S4S 0A2, Canada duad92@gmail.com, {sotolopd,mehran.mehrandezh}@uregina.ca Abstract. Image-based metric measurement has been widely used in industry for the past decade due to the recent advancement in processing power and also the unobtrusiveness of this method. In particular, this method is gaining atten-tion in the realm of real-time detection, classi?cation, and inspection of vehicles used in intelligent transportation systems for law enforcement. These systems have proven themselves as a plausible competition to under-the-pavement loop sensors. In this paper, we analyze the sensitivity in image-based metric mea-surement for vehicles’ wheel base estimation. Results lead to a simple guideline for calculating the optimal con?guration yielding the highest resolution and accuracy. More speci?cally, we address the sensitivity of the metric measure-ments to the depth (i.e., the distance between the camera and the vehicle) and also internal calibration parameters of the visible-light imaging system (i.e., camera’s intrinsic parameters). We assumed a pinhole projection model with added barrel effect, aka, lens distortion. A 3D video simulation was developed and used as a Hardware-in-the-Loop (HIL) testbed for veri?cation and valida-tion purposes. Through a simulated environment, three case studies were con-ducted to verify and validate theoretical data from which we concluded that the error due lens distortion accounted for 0.014% of the total error whereas the uncertainty in the depth of the vehicle with respect to the location of the camera accounted for 99.8% of the total error. Keywords: Image-processing Digital-metrologyVision-systems 1 Introduction As vehicle population has been increasing exponentially over the years, new and cost-effective technologies for monitoring and controlling the traf?c have been developed. Intelligent systems, such as vision-based vehicle classi?cation systems, have been continuously investigated for its affordability and ef?ciency. Two major applications of these systems are toll collection and law enforcement, which make use of a wide variety of techniques to detect, characterize, count and classify vehicles. These previously mentioned techniques are usually implemented in accordance to the 13-vehicle classi?cation scheme [1] described by the Federal Highway © Springer Nature Switzerland AG 2019 K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 19–29, 2019. https://doi.org/10.1007/978-3-030-02686-8_2 Administration (FHWA). This scheme is based on the classi?cation of vehicles’ wheelbase and number of axles. Also, even though, several technologies have been explored to comply with the FHWA regulations, under-pavement loop sensors have been the most broadly implemented ones, mainly because of its reliability and robustness. Nevertheless, one of the biggest concerns of loop sensors is the intrusiveness that they entail. That is to say, when one of the sensors has to be replaced, the pavement needs to be removed and rebuilt again, which is an expensive, complex and time-consuming process. Alternative techniques, such as vision systems, which don’t involve any kind of intrusion without compromising accuracy, ef?ciency or affordability, are being further explored. 
Therefore, the purpose of our research narrows down to selecting a vision system and analyzing the sensitivity of its parameters, which would lead us to the most effective con?guration for accurately measuring wheelbase and counting axles. Through our analysis we conclude that lens distortion and depth assumption are regarded as the parameters that carry the biggest error in metrology applications of this nature. Due to the convexity of the lens, the error on the output measurements obtained from an image grows non-linearly as the observed features approximate the borders of the image. Also, as depth is not an implicit parameter in the vision system, it has to be assumed to provide the scale factor for the measurement, which is originally depicted in pixels. This assumption carries a range of uncertainty which accounts for signi?cant errors on the output. In this paper, the sensitivity of the latter are analyzed by means of the Pinhole Projection (PHP) and the Brown-Conrady (BC) models and an optimized setup is proposed as a result of the analysis. Also, a 3D simulated environment is presented to run tests for veri?cation and validation of the theoretical data. 2 De?nition of Parameters In the case study presented in this paper, it is ?rst assumed that a camera is located on the side of a 2-lane freeway regulated under the FHWA. The objective of this analysis is to observe how the wheelbase estimation is affected due to major uncertainties in the process, provided an assumed depth. As it is depicted in Fig. 1, there are six parameters involved in the process. Namely, wheelbase Wl, vehicle length Vl, vehicle width Vw, lane width lw, distance from the camera to the center of the lane Zcand position of the vehicle in the lane Pl. As well as the assumed depth Za and the real depth Zr, which are not depicted in Fig. 1. Even though all these parameters may vary, we can assume that given speci?c conditions such as location and ?xed con?guration, or because they do not have a direct relation to the output, most can be disregarded as uncertainties. Thus, the assumed depth Za and the lens distortion are regarded as the major uncertainties, and the only ones that pertain to the analysis. 20 D. Duron-Arellano et al. Vehicle length. Even though the uncertainty due to vehicle length is disregarded as it is implicit in the wheelbase and thus irrelevant, it is initially relevant when de?ning the camera location to guarantee the required ?eld of view. Lane Width. For this case study the width of the lane is set to be 3.6 m, as it is the required width of a single lane on any rural/urban freeway according to the FHWA [2]. Wheelbase Length. According to the FHWA 13-vehicle classi?cation scheme, under the Function Class 11 depicted in Table 1, which describes the Urban Interstate Freeways statistics, the overall wheelbase distribution falls within 1 and 45 ft (0.3048 to 13.716 m). Nevertheless, at least 75.7% of the samples fall in the class 2 (Table 1), within 6 and 10.10 ft (1.8288 to 3.0784 m) and at least 93.8% of the samples fall within 6 and 23.09 ft (1.8288 to 7.0378 m). Although this parameter does not directly affect the process of wheelbase estimation, it is considered as it de?nes the required ?eld of view. Moreover, the understanding of the distribution helps us narrowing down the case study. 
Since we can observe that the variation of wheelbase is considerably broad, it is important to note that, as the wheels' positions move within the image frame, the estimation is subjected to higher distortions due to lens convexity as the features of interest (the wheels) approach the edges.

Fig. 1. Camera located on the freeway side, perpendicular to the vehicle.

Table 1. Urban Interstate Freeways wheelbase range for the FHWA 13-vehicle classification scheme, Function Class 11 [7]

Class | Vehicles on the road distribution (%) | Wheelbase range (ft)
1  | 0.2  | 1.00-5.99
2  | 75.7 | 6.00-10.10
3  | 15.7 | 10.11-23.09
4  | 0.2  | 23.10-40.00
5  | 1.6  | 6.00-23.09
6  | 0.8  | 6.00-23.09
7  | 0    | 6.00-23.09
8  | 1.1  | 6.00-26.00
9  | 3.9  | 6.00-30.00
10 | 0.2  | 6.00-26.00
11 | 0.2  | 6.00-30.00
12 | 0.1  | 6.00-26.00
13 | 0.6  | 6.00-45.00

Position in the Lane Pl. As depicted in Fig. 2, the position of the vehicle in the lane directly affects the perceived dimension of the object. Therefore, for this case study it is assumed that the car moves only within the lane and that its position follows a normal distribution with an average location in the middle of the lane, 1.8 m from the sideline.

Vehicle width Vw. Just as with the position in the lane, the width of the vehicle also modifies the real depth, which in turn modifies the perceived dimension. Consequently, although the vehicle width may vary from virtually 0 to the maximum allowable width of 2.6 m established by the Federal-Aid Highway Act [3], the average vehicle width is 1.8 m [4], and this is the value considered for this analysis. Therefore, under the previously described assumptions on Pl and Vw, we can establish a variation in depth of P_l - V_w/2 = 1.8 m - 0.9 m = 0.9 m. It can be observed that the vehicle width and the position in the lane act simultaneously, since together they affect the real depth Zr, which deviates from the assumed depth Za, as described in the next section.

3 Depth Assumption Uncertainty

For this case study, a Canon EOS 7D with an EF-S 18-135 mm lens set at 50 mm (focal length) and an image resolution of 2592 x 1728 pixels has been used. The focal length in pixels, i.e. the distance between the lens and the point where the rays converge to a focus, obtained by means of the MATLAB Calibration Toolbox, is 5922.84 +/- 54 pixels. This focal length does not vary along this analysis, since the proposed system is a fixed one and the focal length has been chosen for the desired visibility at a given distance from the object of interest. As stated before, the vehicle is assumed to move within +/-0.9 m of the center of the lane. Therefore, since the assumed depth Za used in the estimation of the wheelbase should be the distance between the camera and the visible wheels (the outer face of the vehicle), the assumed depth must be 0.9 m before the center of the lane.

Fig. 2. (a) A vehicle close to the left lane line is perceived smaller; (b) a vehicle close to the right lane line is perceived bigger.

Assuming that the field of view is determined by the maximum length to be perceived, which is that of a single-trailer semi-truck (65 ft, about 20 m), by means of the PHP model we obtain

Z_c = X_c f / x = (20 m x 5922.84 px) / 2592 px = 45.70 m    (1)

where Zc is the distance to the object, which for this first calculation is assumed to be the distance to the center of the lane, Xc is the length of the object in meters, f is the focal length in pixels and x is the length of the object in pixels.
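As a quick sanity check of Eq. (1), the computation can be reproduced in a few lines; the variable names are ours and only the values stated above are used.

```python
# Numeric check of Eq. (1) with the values given in the text.
f_px = 5922.84   # focal length in pixels (from the MATLAB calibration)
x_px = 2592.0    # object length in pixels (full image width)
X_c = 20.0       # longest vehicle to be perceived, in metres

Z_c = X_c * f_px / x_px        # pinhole projection: Z_c = X_c * f / x
print(f"Z_c = {Z_c:.2f} m")    # -> Z_c = 45.70 m, as in Eq. (1)
```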
Since the distance to the center of the lane should be 45.70 m in order to perceive a maximum length of 20 m, the assumed depth Za, considering the outer face of the vehicle and the variations of its width and position in the lane, should be 44.80 m +/- 0.9 m. To illustrate the variation of the wheelbase estimation error due to the depth assumption, a random vehicle with a 2.5 m wheelbase (Xc) is considered. By means of the basic PHP model we can estimate that

x = X_c f / Z_c = (2.5 m x 5922.84 px) / 44.80 m = 330.51 px    (2)

where x is the estimated wheelbase in pixels and Zc is the distance to the object. This gives the wheelbase in pixels when the side face of the car is exactly 44.80 m away from the camera (the vehicle is centered), disregarding all other uncertainties. Nevertheless, as discussed before, the actual depth may vary by up to 0.9 m as the car moves within the lane. This uncertainty in depth is reflected as follows:

X_c = Z_c x / f = (44.80 m +/- 0.9 m) x 330.51 px / 5922.84 px = 2.50 m +/- 0.05 m    (3)

It is important to note that, since the convexity of the lens is not being considered, the variation of the estimate Xc is linear due to the linearity of the equation. It can also be observed that there is a 2% uncertainty in the estimation of the wheelbase. It then stands out that as either the distance from the camera to the object or the focal length increases, the variation of the estimate decreases. Nevertheless, this decrease in variation is directly proportional to a decrease in resolution. Therefore, the accuracy of the results relies on a point where both parameters, the variation due to the depth assumption and the resolution, are optimized.

4 Camera Intrinsic Parameter Uncertainty

For the case in which the intrinsic parameters are regarded as uncertainties, in our analysis barrel distortion together with tangential distortion accounts for the major variation. To account for the above-mentioned distortions, the BC equation (4) [5] has been utilized for the wheelbase estimation:

x_2 = x_1 (1 + k_1 r^2 + k_2 r^4) + 2 p_1 x_1 y_1 + p_2 (r^2 + 2 x_1^2)    (4)

where x_2 is the distorted point, x_1 and y_1 are the real point coordinates, k_1 and k_2 are the radial distortion coefficients of the lens, p_1 and p_2 are the tangential distortion coefficients, and r = sqrt(x_1^2 + y_1^2).

For this case study, following the previously stated assumptions, most importantly the camera location and configuration, and isolating this newly presented uncertainty source, two cases arise: (1) as the length of the vehicle increases, the features (axles) get closer to the edges and are thus subjected to higher distortions; (2) as the height of the wheels deviates from the average, set at the center of the projection, the features are also subjected to higher distortions as they get closer to the edges. As seen in Fig. 3, this situation for the camera lens used in this case analysis is represented by blue vectors on a unitary frame. These vectors represent the deviation of the pixels from their real location before the lens distortion, and it can be observed that the deviation is slightly bigger for the vectors at the bottom because of the tangential distortion. The radial distortion coefficients [-0.0941, 0.1017] and the tangential distortion coefficients [-0.0012, 0.0051] have been obtained through the Image Calibration Toolbox by MATLAB. Below we present two case studies.
In the ?rst one, we show how the perception of the location of the points-of-interest (POI) varies due to lens distortion depending on its location on the x-axis; this by analyzing two scenarios: the ?rst one with an average small vehicle and the second one with an average large vehicle. In the second case study, in the other hand, we analyze the variation depending on the position of the POI on the y-axis. For case 1, since the distortion grows exponentially, as seen in (4), the variation of wheelbase close to the center of the projection is less sensitive than that closer to the edges. Fig. 3. Barrel and tangential distortion on unitary frame. 24 D. Duron-Arellano et al. In a ?rst scenario, we assume that the small vehicles (from 2 to 6 m long) will present the lesser variation for its proximity to the center of the image, as explained before. When the average small vehicle (4 m) is considered, according to the BC equation in (4), we observe a variation from the real value of 0.014 m. In a second scenario, considering the biggest vehicle assumed for this analysis, a single-trailer semi-truck (20 m), and according to Eq. 4, we obtain a variation of 0.152 m. It can be observed that the sensitivity increases with the length of the vehicle as the features approach the edges of the frame. When increasing the size of the vehicle 5 times the error not only increases but it does it non-linearly: more than 10 times, 0.152:0.014 or 10.85. For case 2, as stated before, the variations on wheels’ height also affect the output non-linearly as it deviates from the center in any direction. In order to obtain the lesser variation in the output, the center of projection of the camera is matched with the center of the axle of the average wheel, 16 in. (40.64 cm) [6], which is 20.32 cm apart from the pavement. In the ?rst scenario for this second case, taking a semitrailer-truck’s wheel as the highest allowable wheel size, 22.5 in. (57.15 cm), a maximum variation of 8.255 cm in the positive y-axis from the average height is considered. Then, by means of (4), we calculate an error of 0.0049 m. On the other hand, in the second scenario, when we consider the same variation of 8.255 cm but this time towards the negative direction in the y-axis, we now obtain an error of 0.0051 m. From this we can observe that the variation is slightly more sensitive when wheels are smaller than that when they are bigger than the average. As we can see in the representation of the distortions in Fig. 3, the distortions tend to be bigger in—y; this is attributed to the tangential distortion of the current camera setup. A similar process is followed when analyzing the sensitivity of the wheelbase estimation when the distance from the camera to the object (Zc) is considered to be uncertain and at the same time considering the image to be subjected to lens distortion. In this case, the image is subjected to two different uncertainty sources, which lead to even bigger variations on the wheelbase estimations. Nevertheless, it is well understood that resolution plays a bigger role when varia-tions due lens distortion can be minimized. That is to say, when having a closer picture of an object, the error due to a minimized lens distortion is compensated and even outperformed by the increase in resolution. This situation is possible since the barrel distortion is almost completely eradicated when undistorting the frames by means of the BC model [5] and the tangential distortion is negligible. 
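For reference, Eq. (4) can be written down directly. The sketch below uses the radial and tangential coefficients quoted above and hypothetical normalized point coordinates; it only illustrates how the distorted position of a point of interest is evaluated, and is not the authors' analysis code.

```python
# Brown-Conrady model of Eq. (4); coefficient values as quoted in Sect. 4
# (their signs follow the cleaned-up text and are therefore an assumption).
K1, K2 = -0.0941, 0.1017    # radial distortion coefficients
P1, P2 = -0.0012, 0.0051    # tangential distortion coefficients

def distorted_x(x1, y1):
    """Distorted x-coordinate of a normalized image point (x1, y1)."""
    r2 = x1 * x1 + y1 * y1
    radial = x1 * (1.0 + K1 * r2 + K2 * r2 * r2)
    tangential = 2.0 * P1 * x1 * y1 + P2 * (r2 + 2.0 * x1 * x1)
    return radial + tangential

# Example: a feature near the image centre vs. one near the border
# (hypothetical coordinates) to show the growing deviation towards the edge.
for x1 in (0.1, 0.8):
    print(x1, distorted_x(x1, -0.05))
```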
5 Validation and Veri?cation Using a 3D Simulated Environment Accuracy in wheelbase measurement requires actual values that are a challenge to collect due the nature of real-world scenarios. To gain a better understanding of how variations in the vehicle width and its position on the lane affect the accuracy of the result, a 3D simulated environment was created. The wheelbase of a rendered vehicle in Image-Based Wheel-Base Measurement in Vehicles 25 real time was displayed on a LED monitor and measured by counting the number of pixels between the center of each axle as well as physically measured with a ruler. The center of each of the wheels was denoted with a one-pixel red dot for easier reference. Unlike measuring the wheelbase of a real vehicle, with this method is possible to ?nd the wheelbase of the vehicle with absolute accuracy. This proposed methodology creates a validation tool to provide a simulated test bench for testing and evaluation of visual sensors used for inspecting wheelbase in a structured lab environment without having to leave the lab for in-?eld testing for the ?rst time. In order to reproduce the setup of a camera located aside the freeway as shown in Fig. 1, a video camera was placed in front of the LED monitor as displayed in Fig. 4. To simulate the depth change due the position of the vehicle within the width of the lane, the rendered vehicle was resized in order for the camera to perceive the size of the vehicle as portrayed in Fig. 5. Fig. 4. Experimentation setup of camera located on the freeway side of a lane, perpendicular to the vehicle. Fig. 5. (a) Vehicle in middle of left lane line is perceived at one size. (b) vehicle far in the left lane is perceived as smaller while the vehicle in (c) close to right lane line is perceived bigger. 26 D. Duron-Arellano et al. In the simulation setup, a LG LED LCD E250 V monitor with a native resolution of 1920 f 1080 pixels and a screen size of 54.85 cm diagonally was utilized to render a 3D simulation of a Class 2 vehicle. Also, a video camera Sanyo Xacti VPC-FH1 with a built-in lens set at 5.95 mm was used to record video at a resolution of 1920 f 1080 pixels. The focal length (f) in pixels 2181 ± 0.95 pixels was obtained by means of the MATLAB Calibration Toolbox. The ?eld of view is determined by the maximum length to be perceived, for this experiment the maximum length is the width of the monitor 47.8 cm, and by means of the Eq. 1 we obtain that the distance of the camera to the monitor Zcis 54.3 cm, where Xc is the length of the monitor in centimeters, f is the focal length in pixels and x is the length of the monitor in pixels. Once we obtained the ideal distance of the camera to the monitor, we subtracted f = 5.95 mm and accomplished the ?nal distance of the camera with respect to the monitor as 53.7 cm. We achieved the alignment of the 1920 h 1080 pixels of the monitor with the 1920 h 1080 pixels of the video samples recorded with the camera through exhaustive calibration. With the above-mentioned parameters the absolute distance between the camera and the monitor Zr is 54.3 cm and to recreate the variation in depth as Fig. 5 demonstrates, the rendered size of the vehicle was decreased in case 2 by 5 pixels to illustrate a higher depth and increased by 5 pixels in case 3. For all these cases, the recreated wheelbase Xr is 5.85 cm and the assumed value of Za is 54.3 cm. 
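The geometry of this monitor set-up can be cross-checked with the pinhole relation of Eq. (1). The numbers below are those stated in this section, and the predicted wheelbase of about 235 px agrees with case 1 of Table 2; a minimal sketch, with variable names of our own:

```python
# Pinhole check of the simulated set-up (values from this section).
f_px = 2181.0        # focal length of the Sanyo camera in pixels
monitor_w_cm = 47.8  # width of the monitor in centimetres
monitor_w_px = 1920  # monitor width in pixels

# Camera-to-monitor distance so that the full monitor width is seen (Eq. 1).
Z_c = monitor_w_cm * f_px / monitor_w_px
print(f"Z_c = {Z_c:.1f} cm")     # -> about 54.3 cm

# Expected wheelbase in pixels for Xr = 5.85 cm at the assumed depth (Eq. 2).
X_r, Z_a = 5.85, 54.3
x_px = X_r * f_px / Z_a
print(f"x = {x_px:.1f} px")      # -> about 235 px (case 1 in Table 2)
```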
One video was recorded for each case, and from each video a frame was extracted for analysis at the moment the rendered vehicle was located closest to the center of the field of view. Each frame was analyzed to obtain the wheelbase x for the PHP model and was undistorted using the radial distortion coefficients [-0.1604, 0.0653] and the tangential distortion coefficients [7.5313e-04, -6.0965e-04] obtained through the Image Calibration Toolbox by MATLAB. The wheelbase in pixels x was measured in each of the six pictures. The measurement was made using the area of pixels with the highest red contrast denoting the center of each wheel. By measuring the corresponding values of x1-x3 and Zc1-Zc3, we observed that for each of the extracted frame samples the wheelbase x values were exactly the same number of pixels displayed on the monitor.

For this experiment we performed the wheelbase estimation for each of the recreated depth values of case 1, case 2 and case 3 as depicted in Fig. 5. By means of Eq. 3 we calculated the wheelbase distance in centimeters, Xc, for case 1, case 2 and case 3. In case 1, the assumed depth is fixed at 54.3 cm. From Table 2, using Eq. 3, it can be seen that the picture taken at the assumed depth Za shows a variation in X1 of 0.0020 cm for the PHP model and 0.0005 cm for the BC model, accounting for an error due to distortion of 0.021% for the PHP model and 0.004% for the BC model. For case 2, using the same Za value, a variation in X2 of 0.1240 cm for the PHP model and 0.1235 cm for the BC model was observed, accounting for an error due to distortion of 0.0004% for PHP and 0.0004% for BC, and causing a final total error of 2.119% for PHP and 2.111% for the BC model. Lastly, case 3 showed a variation in X3 of 0.1235 cm for PHP and 0.1247 cm for the BC model, accounting for an error due to distortion of 0.029% for PHP and 0.008% for BC, and causing a final error of 2.111% for PHP and 2.132% for the BC model.

6 Conclusions

In this paper, by means of the Pinhole Projection and the Brown-Conrady models, we analyzed how the wheelbase estimation is affected by the major uncertainties in the measuring process when a certain depth is assumed. Through a simulated environment, three case studies were conducted to verify and validate the theoretical data, from which we can conclude that in the three cases the error due to radial and tangential distortion was at most 0.03%, accounting for 0.014% of the total error for the PHP model and 0.004% for the BC model in case 3, whereas the uncertainty in the depth of the vehicle with respect to the location of the camera represented an error of up to 2.132% in Xc3, accounting for 99.8% of the total error. The distortion model has proven to reduce the sensitivity of the wheelbase estimation, although further applications should prioritize estimating depth accurately, since it is the most sensitive source of variation and accounts for the highest errors. Finally, it can also be concluded that for metrology applications based on vision systems, even though there are several uncertainty sources to be considered, apart from correction models, resolution and processing speed, precise measurements depend to a very high degree on an accurate estimation of depth.

References

1. Hallenbeck, M.E., Selezneva, O.I., Quinley, R.: Verification, Refinement, and Applicability of Long-Term Pavement Performance Vehicle Classification Rules. No. FHWA-HRT-13-091 (2014)
2. Stein, W.J., Neuman, T.R.: Mitigation Strategies for Design Exceptions. No.
FHWA-SA-07-011 (2007)
3. Weingroff, R.F.: Federal-aid highway act of 1956: creating the interstate system. Public Roads 60(1) (1996)
4. DoT, U.S.: Federal size regulations for commercial motor vehicles (2004)
5. Brown, D.C.: Decentering distortion of lenses. Photogramm. Eng. 32(3), 444-462 (1966)
6. Blow, P.W., Woodrooffe, J.H., Sweatman, P.F.: Vehicle Stability and Control Research for US Comprehensive Truck Size and Weight (TS&W) Study. No. 982819. SAE Technical Paper (1998)
7. Hajek, J.J., Selezneva, O.J., Mladenovic, G., Jiang, Y.J.: Estimating Cumulative Traffic Loads, Volume II: Traffic Data Assessment and Axle Load Projection for the Sites with Acceptable Axle Weight Data, Final Report for Phase 2. No. FHWA-RD-03-094 (2005)

Table 2. Results for Case 1, Case 2 and Case 3

Case (observed x) | Model | Wl recreated wheelbase (cm) | Za assumed depth (cm) | Zr recreated depth (cm) | x observed wheelbase (px) | Zc calculated depth (cm) | Xc calculated wheelbase (cm) | Xc offset w.r.t. Wl (cm) | % error due to distortion | % error due to Za and distortion
Case 1 (x1 = 235 px) | PHP | 5.85 | 54.3 | 54.3 | 235.05 | 54.2814 | 5.8520 | 0.0020 | 0.021 | 0.034
Case 1 (x1 = 235 px) | BC  | 5.85 | 54.3 | 54.3 | 234.99 | 54.2953 | 5.8505 | 0.0005 | 0.004 | 0.009
Case 2 (x2 = 230 px) | PHP | 5.85 | 54.3 | 55.47 | 229.99 | 55.4733 | 5.7260 | 0.1240 | 0.004 | 2.119
Case 2 (x2 = 230 px) | BC  | 5.85 | 54.3 | 55.47 | 230.01 | 55.4708 | 5.7265 | 0.1235 | 0.004 | 2.111
Case 3 (x3 = 240 px) | PHP | 5.85 | 54.3 | 53.16 | 239.93 | 53.1774 | 5.9735 | 0.1235 | 0.029 | 2.111
Case 3 (x3 = 240 px) | BC  | 5.85 | 54.3 | 53.16 | 239.98 | 53.1663 | 5.9747 | 0.1247 | 0.008 | 2.132

Generic Paper and Plastic Recognition by Fusion of NIR and VIS Data and Redundancy-Aware Feature Ranking

Alla Serebryanyk1(B), Matthias Zisler2, and Claudius Schnörr1
1 University of Applied Sciences Munich, Munich, Germany
alla.serebryanyk@hm.edu, schnoerr@cs.hm.edu
2 Institute of Applied Mathematics, University of Heidelberg, Heidelberg, Germany
zisler@math.uni-heidelberg.de
http://schnoerr.userweb.mwn.de/

Abstract. Near infrared (NIR) spectroscopy is used in many applications to gather information about the chemical composition of materials. For paper waste sorting, with a small number of scores computed from NIR spectra and assuming more or less unimodally clustered data, a pixel classifier can still be crafted by hand using knowledge about chemical properties and a reasonable amount of intuition. Additional information can be gained from visual data (VIS). However, it is not obvious which features, e.g. based on color, saturation or textured areas, are finally important for successfully separating the paper classes in feature space. Hence, a rigorous feature analysis becomes inevitable. We have chosen a generic machine-learning approach to successfully fuse NIR and VIS information. By exploiting a classification tree and a variety of additional visual features, we could increase the recognition rate to 78% for 11 classes, compared to 63% when using only NIR scores. A modified feature ranking measure, which takes redundancies of features into account, allows us to analyze the importance of features and reduce them effectively. While some visual features like color saturation and hue proved to be important, some NIR scores could even be dropped. Finally, we generalize this approach to analyze raw NIR spectra instead of score values and apply it to plastic waste sorting.

Keywords: Near Infrared (NIR) Spectroscopy · Waste sorting · Visual Features (VIS) · CART · Feature ranking · Machine-learning

1 Introduction

More than 16 million tons of waste paper are processed each year in Germany [4].
At our partner facility around 130,000 tons per year are handled. A high sorting quality of the waste paper is critical to achieve a high grade of recy-cled paper while keeping the environmental footprint to a minimum. In [10], a general overview of many methods in the ?eld of paper waste sorting is given, s c Springer Nature Switzerland AG 2019 K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 30–45, 2019. https://doi.org/10.1007/978-3-030-02686-8_3 Generic paper and plastic recognition and redundancy-aware feature ranking 31 and the impact is emphasized these methods can have on the conservation of natural resources in terms of energy and water consumption, CO2-footprint, and environmental pollution. Ultimately, good knowledge about the input material may be used to optimize the parameters of the sorting facility, e.g. the conveyor belt speed. We address this paper sorting problem by using near infrared (NIR) and additional RGB (red-green-blue) visual data. From the visual data, we use the RGB and HSV (hue-saturation-value) color components and compute a huge variety of features consisting of classical and statistical texture sensitive features (VIS-features). There is also a strong need for optimizing the parameters of sorting facilities for plastic waste based on the composition of the input material in order to improve the throughput and the sorting quality. In the European Community alone there are 26 million tons of plastic waste to be sorted, only 30% of them are recycled1 . This is all the more important since China has denied to take the plastic waste from Europe any longer. The quality of the sorted output in terms of purity and attainable constant properties of sorts is crucial for the usability in many applications and thus for the price of the recycled materials. Our classi?er implementation of a Classi?cation and Regression Tree (CART) allows a ranking of the features by importance and thus can be used to select only the most important features. Furthermore, the complexity of the classi?er can be parameterized to create simpler decision trees which has proven to be more robust in case of high measuring errors and partly non-representative data. The optimal decision tree ultimately results by a cross-validation training scheme. For paper waste, we compare the classi?cation performance in three experi-ments: First, only NIR scores are used for training, then RGB and HSV data is added, and ?nally a whole variety of visual (VIS) features is combined. Based on the set of NIR and VIS features we were able to show the power of an importance ranking for an e?ective feature selection. For plastic waste, we have direct access to the raw spectra, so we can analyse the raw spectra of a NIR camera instead of pre-processed score values, as we were limited to do in the paper waste case. In this case the improved feature ranking is able to identify the wavelengths with most discriminative power for the trained plastic sorts. The rest of the paper is arranged as follow: in Sect. 2 the setting for the recording of the paper and plastic waste material is sketched and the character-istics of the available sensor data is described. Section 3 brie?y mentions classic approaches to analyse and classify waste material, and a list of feature ranking approaches is given, one of them based on the CART is pursued further and discussed in more detail in Sect. 4. In particular, in Sect. 4.2, our modi?cation of the CART feature ranking is given to adequately regard the redundancy of features. 
This modification is empirically verified by a synthetic data example. Section 4.3 states a modification to the pruning of the CART to improve its robustness. The preprocessing of the paper data and plastic spectra is stated in Sect. 4.4. Section 5 describes how the recognition rate could be increased from 63% to 78% by fusing NIR and VIS data, and the effectiveness of our feature ranking and reduction method is demonstrated on the paper features used and on the plastic spectra. Finally, Sect. 6 summarizes the main results and states ideas for future work.

1 According to a recent newspaper report.

2 Characteristics of Waste Data

2.1 Paper Data

Line scan cameras for NIR and RGB were used to image the conveyor belt transporting the waste paper. The system used in a real paper sorting plant recorded 172 NIR tracks and 1204 RGB tracks at 175 scans per second and a belt speed of around 0.5 m/s, and covered a width of circa 90 cm (see top of Fig. 1).

Fig. 1. Example visualization of the classification results on real-world paper data. The upper image shows the RGB data of a section of the conveyor belt. Each color in the lower image represents the recognized paper class. The background is colored black.

Overall, 29 NIR-based features or scores were used for the classification problem; they were processed from the raw NIR spectra similarly to [9]. A third-party project partner, a NIR camera manufacturer, provided these scores. They consist of 11 scores discriminating plastic versus paper, 15 scores sensitive to different paper classes, and 3 values measuring the content of characteristic chemicals: talcum, kaolin, and lignin. Plastic content may result from coated paper classes, adhesive tapes or foils, for example.

Table 1. Paper classes to be discriminated, with N = Sum_i N_i = 4,175,121 samples in total

Class index | Abbreviation | Description | Samples Ni
0  | BG    | Background | 853573
1  | ZD    | Newspaper | 473144
2  | MGWD  | Magazine/advertising print | 854485
3  | BP    | Bureau paper | 540297
4  | WPb   | Corrugated paper brown | 196494
5  | WPw-u | Corrugated paper white covered and uncoated | 217558
6  | WP-g  | Corrugated paper coated | 118834
7  | KA-u  | Carton package uncoated | 90218
8  | KA-g  | Carton package coated | 538842
9  | SV    | Other packages | 152433
10 | UN    | Unassigned objects | 139243

Based on the visual RGB data, a huge variety of features is computed, consisting of co-occurrence features, histogram moments, Haar wavelet filters, anisotropic Gaussian filters, and first and second order spatial derivatives for various mask widths and orientation angles (VIS features). The NIR scores and VIS features are then combined into a feature vector of dimension d, x in R^d, for each pixel of a track. The set of feature vectors X = {x_i}, i in {1,...,N}, together with a class label from labeled data, forms the training data set we operate on. Thus, NIR and VIS features are fused in these vectors and treated in a common way by the classifier and the feature ranking procedure. We discriminate 10 paper classes which were defined by a third-party project partner. The conveyor belt is treated as a separate background class. Thus, a total of 11 classes are discriminated for the results in this paper (see Table 1).

2.2 Plastic Data

To test the recognition of plastic waste, only one bottle per plastic class was available. The bottles were cleaned, and labels or markers were removed.
This is only a small data set, and the careful preparation is bound to lead to overly optimistic results in terms of recognition rates, but we wanted to check two aspects:

- Does our generic approach have a chance to be successfully transferred to the treatment of plastic waste?
- Can the feature selection analysis be successfully applied to raw NIR spectra as well, to overcome the need for expert experience in computing application-dependent score values?

For plastic objects, the NIR camera recorded 320 tracks perpendicular to the belt movement, in the range of 900-1200 nm and with a wavelength resolution of 256 values. The background was suppressed by an intensity threshold. For the training of the background as a separate class, some additional measurements were taken from an empty belt. The background data were reduced as in the paper data experiments, so that the background does not dominate the other classes and hence the determined recognition rate. Based on these data a labeled training set was built up. Note that some PET classes differ only in color. Table 2 lists all defined plastic classes.

Table 2. Plastic classes to be discriminated

Class index | Abbreviation | Description
0  | BG | Background
1  | PET raw | Polyethylene terephthalate raw material
2  | PET bottles | PET bottles
3  | PET blue | PET blue
4  | PET brown | PET brown
5  | PET green | PET green
6  | PET transp | PET transparent
7  | ABS | Acrylonitrile butadiene styrene
8  | PE | Polyethylene
9  | PE UHMW | PE ultra-high-molecular
10 | PE UHMW TG 1.2 | PE ultra-high-molecular TG 1.2
11 | PE hard | Polyethylene hard
12 | Polyester resin | Polyester resin
13 | PA | Polyamide
14 | PC | Polycarbonate
15 | PP | Polypropylene
16 | PVC hard | Polyvinyl chloride hard
17 | PAK | Polyacrylate

3 Related Work

NIR spectroscopy is a well-established technique for material identification in general and paper sorting in particular [9-11]. Besides characteristic absorption bands, first and second order derivatives are also used to preprocess the raw reflectance spectra. Smoothing filters like Savitzky-Golay are used to reduce noise in the derivatives [9]. Furthermore, Principal Component Analysis (PCA) is used to reduce the dimension of the feature space [7]. Classification is then carried out by evaluating several subsequent binary decision rules, for which Partial Least Squares (PLS) regression is applied. The order of these substeps is based on a sequence of manual analysis steps or on rather intuitive decisions. Along with PCA, other techniques for feature analysis, such as Fisher Linear Discriminant Analysis (LDA) or the divergence measure based on the Kullback-Leibler distance for probability distributions, among others, have been used for similar problems in pattern recognition [3]. Generally, the linear techniques PCA and LDA are only optimal if the class distributions are well separated and Gaussian in feature space. Well-known classifiers include Classification and Regression Trees (CART) [2], Randomized Trees or Random Forests [1] and Support Vector Machines (SVM), among many others [3]. Feature ranking can be done, e.g., by using a CART with surrogates [2], Randomized Trees [5], or Recursive Feature Elimination (RFE) using the weight parameters of trained SVMs [6].
We decided to use a CART classifier, since it is a rule-based and parameter-free technique which can handle a large number of features and performs well on arbitrary distributions, provided a large number of training samples is available, which is clearly the case in our application [2]. In [8], the approach of a generic data fusion of VIS and NIR data using a classifier and a machine-learning approach was first described. In the following sections, we describe the progress of this work and the first step towards applying the methods to the task of plastic waste sorting by analyzing whole raw NIR spectra.

4 Methodology

4.1 Classifier

We use our own C++ implementation of the CART algorithm, which is based on the principles presented in [2]. The CART algorithm trains a binary decision tree. In each node the pattern set is split at a threshold on a feature which minimizes the impurity of the resulting subsets. As impurity metric we use the Gini diversity index for a node t, as proposed by [2]:

i(t) = Sum_{j != k} p(j|t) p(k|t),    (1)

where the indices j and k represent different classes. A splitter s is defined by the feature which is used to split and the corresponding threshold. The decrease of impurity from one node to the left and right child nodes t_L and t_R by a splitter s is described by the delta impurity

Delta i(s, t) = i(t) - p_R i(t_R) - p_L i(t_L),    (2)

where p_L and p_R are the proportions of data in t_L and t_R, respectively. The splitter s which maximizes Delta i(s, t) is then used as the primary splitter. Each leaf of the tree finally represents a class. To use a trained classification tree, the tree is traversed for a given pattern according to the splits in each node, and the class of the reached leaf node is returned.

4.2 Feature Ranking and Selection

In order to rate the importance of features, surrogates are chosen in each node of the tree. That is, splitting thresholds for the other features not used in the primary splitter are sought such that the resulting child trees are most similar to the trees created by the original primary splitter. For each surrogate s* and the primary splitter s, the delta impurity measure from (2) is calculated. Finally, these delta impurities are summed over all nodes for each feature, which gives a measure M(x_m) for the importance of each feature x_m:

M(x_m) = Sum_{t in T} ( Delta i(s*_m, t) + Delta i(s_m, t) ),    (3)

where m in {1,...,d} denotes the index of the specific feature, T is the set of all nodes of the decision tree, and s*_m and s_m denote the surrogates and the primary splitter which involve feature x_m. As opposed to the importance measure found in [2], which ignores the delta impurity of the primary splitter, we deliberately included it, since we think the feature actually used in the primary splitter is important by definition. Tests with an artificially designed test dataset also yielded more realistic importance measures when the primary splitter was included. Moreover, we defined an importance measure M'(x_m) which only sums up the delta impurities of the primary splitter of each node, thus leaving out those of the surrogate splitters. This means that only features actually used by the classifier gain importance. This has the effect that the importance ranking selects between similarly important but redundant features, thus dropping unnecessary features, as we observed in the selection of characteristic wavelengths in raw NIR spectra of plastic waste (see Sect. 5.2).
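Before the synthetic validation that follows, the quantities in Eqs. (1)-(3) can be sketched against a toy node representation of our own choosing (class-count dictionaries). This is an illustration only, not the C++ implementation described above.

```python
# Toy illustration of Eqs. (1)-(3): Gini diversity index, delta impurity of a
# split, and (conceptually) the per-feature importance as a sum of delta
# impurities over tree nodes. The node representation is our own choice.
from collections import Counter

def gini(counts):
    """Gini index i(t) = sum_{j != k} p(j|t) p(k|t) = 1 - sum_j p(j|t)^2."""
    n = sum(counts.values())
    return 1.0 - sum((c / n) ** 2 for c in counts.values()) if n else 0.0

def delta_impurity(parent, left, right):
    """Delta i(s, t) = i(t) - p_R i(t_R) - p_L i(t_L)."""
    n = sum(parent.values())
    p_l = sum(left.values()) / n
    p_r = sum(right.values()) / n
    return gini(parent) - p_r * gini(right) - p_l * gini(left)

# One node split on a single feature: a large impurity decrease marks an
# informative split; summing such decreases per feature over all nodes where
# that feature is the primary splitter (and, for M, also a surrogate) gives
# the importance measures of Eq. (3).
parent = Counter(a=50, b=50)
left, right = Counter(a=45, b=5), Counter(a=5, b=45)
print(delta_impurity(parent, left, right))   # -> 0.32
```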
To validate this observation we created an artificial dataset comprising 1000 samples each of 11 overlapping Gaussian distributions with identity covariance matrices, i.e. they scatter isotropically. One distribution is centered at the origin, and the others are placed on the coordinate axes at increasing distances from the origin. These distributions overlap mostly with the distribution around the origin and not with each other. A sketch is given in Fig. 2 for d = 2 features. A CART classifier can easily separate the distribution centered at the origin from a distant distribution by one threshold on the corresponding coordinate axis, that is, on the corresponding feature. The farther apart a distribution is, the smaller the overlap and thus the more important that feature. When applying the CART, the measure M(x_m) leads to an increasing feature ranking of features 1, 2, ..., 10, as expected. In a next step, we replicated feature 5 in the data set as feature 11. Thus, these two features are completely redundant. As expected, these features are assigned the same importance by M(x_m), as shown in Table 3. Incidentally, a Randomized-Tree classifier leads to the same ranking result.

Fig. 2. A sketch of two isotropic Gaussian distributions overlapping to a different degree with the distribution centered at the origin. The circles represent the contour lines of the distributions. Feature x2 can separate classes 1 and 3 by a threshold better than x1 can separate classes 1 and 2, thus feature x2 is regarded as more important than x1 by the ranking measure.

Table 3. Normalized feature ranking by M(x_m) with two redundant features 5 and 11 ranked equally

Feature | Importance
10 | 1
9  | 0.90932413
8  | 0.86688438
7  | 0.76397420
6  | 0.66307053
5  | 0.65805597
11 | 0.65805597
4  | 0.47340730
3  | 0.18303054
2  | 0.11822442
1  | 0

Table 4. Normalized feature ranking by M'(x_m) with two redundant features 5 and 11. Note that feature 11 is ranked 0 in this case.

Feature | Importance
10 | 1
9  | 0.90932413
8  | 0.86688438
7  | 0.76397420
6  | 0.66307053
5  | 0.65805597
4  | 0.47340730
3  | 0.18303054
2  | 0.11822442
1  | 0
11 | 0

In contrast, when using the measure M'(x_m), the classifier decides to use feature 5 and rates the completely redundant feature 11 as worthless, as shown in Table 4. This is the sort of feature ranking we need to strongly reduce the feature count while retaining most of the information about the material classes.

4.3 Robustness Improvement

If the classifier is trained until each leaf contains a single training pattern, the classifier will likely be overfitted, since outliers are also 'learned by heart' and might be confused with representative data from other classes. This problem is addressed by an internal cross-validation scheme that prunes back the fully trained tree to some degree until it generalizes well on the given dataset. However, in a real-world scenario with changing side conditions, feature measurements might be slightly influenced by additional effects not covered by the original training dataset. We address this problem by continuing the pruning process of the trained tree to make it more robust against small changes in the measurement conditions. Incidentally, this also leads to simpler trees.

4.4 Data Preprocessing

Paper Data. The training data is compiled from mono-fraction recordings for each class. As a preprocessing step, the paper objects were separated from the background by using a threshold on the intensity of the visual data.
For the results in this paper, the visual resolution of 1204 pixels per scan was scaled down to the 172-pixel resolution of the NIR data by a simple data reduction. Since the background class of the conveyor belt turned out to be quite dominant and very well distinguishable from the paper classes, the background data was resampled to roughly the same amount as the next biggest classes. This avoids the overall recognition rate being too optimistic simply because of a good background recognition.

Plastic Data. According to [12], varying intensities from scan line to scan line were caused by varying distances between the camera and the objects and by diffuse scattering effects. Following the norming procedure described in [12], all spectra are normed so that

Sum_{i=1}^{d} |x_i| = const = 256,

where x_i is a component of the feature vector x in R^d, in this case the intensity value at a particular wavelength of the spectrum at a pixel of the scan track. Essentially, this normalization removes a constant bias. The constant value 256 is chosen to avoid inaccuracies due to floating point errors for very big or very small spectral values. Superimposed PP spectra, normalized and smoothed, are shown in Fig. 3 as an example. These spectra match quite well; they do not spread much vertically. Since the spectra do not show sharp peaks, no peak-retaining smoothing filter is necessary. We used simple Gaussian smoothing filters and calculated the first and second derivatives by derivated Gaussian filters as additional spectral features used in the material classification.

Fig. 3. Example of superimposed spectra for the plastic sort Polypropylene (PP) after normalization and smoothing, to show the variation in the spectra. The spectra do not spread much vertically after normalization (the color scale represents the frequency of overlapping spectra and can be ignored here).

5 Experimental Results

5.1 Paper Data

The dataset used for the following results consisted of almost 4 million samples, of which 80% were used as training set and 20% as validation set in a 3-fold cross-validation scheme. To be clear, the purpose of this cross-validation is to obtain the most accurate estimate of the real recognition rate. We emphasize that this dataset originates from a real sorting facility with all dirty effects like probe contamination, light scattering, changing detector-probe distances, shadow effects, etc. Using solely the given NIR features as described in Sect. 2.1, our classifier achieved an overall recognition rate of 63%. The classification statistics are given in Table 5, and the corresponding error matrix or confusion matrix F is visualized in Fig. 4. Ni/N is the fraction of data belonging to class i. The elements Fij of F are the number of samples from class i which are classified as class j, where i is the row index and j the column index. The diagonal elements of F represent the frequency of correct classification decisions, while the off-diagonals show false-positive and false-negative decision rates. From F the diagonal elements diag(F) are extracted and the F1 measure is computed. The F1 measure is the harmonic mean of precision and recall and thus also accounts for false positives and false negatives. The overall recognition rate is calculated as 1 - P(F), where P(F) is the error probability. Adding the RGB and HSV channels, the recognition rate could be raised to 69%.
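As an aside, the per-class statistics reported in Tables 5, 6 and 8 (F1, diag(F) and the overall rate 1 - P(F)) can be reproduced from an error matrix of raw counts in a few lines. This is a minimal sketch with variable names of our own, not the authors' evaluation code.

```python
import numpy as np

def classification_stats(F):
    """F[i, j] = number of samples of class i classified as class j (raw counts)."""
    F = np.asarray(F, dtype=float)
    n = F.sum()
    tp = np.diag(F)
    precision = tp / np.maximum(F.sum(axis=0), 1e-12)   # per predicted class
    recall = tp / np.maximum(F.sum(axis=1), 1e-12)      # per true class
    f1 = 2 * precision * recall / np.maximum(precision + recall, 1e-12)
    diag_share = 100.0 * tp / n            # contribution of each class, in percent
    recognition_rate = 100.0 * tp.sum() / n             # 1 - P(F), in percent
    return 100.0 * f1, diag_share, recognition_rate

# Tiny example with a made-up 3-class error matrix:
F = [[90, 5, 5],
     [10, 80, 10],
     [5, 15, 80]]
f1, diag_share, rate = classification_stats(F)
print(f1.round(2), diag_share.round(2), round(rate, 2))
```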
In a first attempt to include other features, a variety of 386 additional visual features were computed, consisting of co-occurrence features, histogram moments, Haar wavelet filters, anisotropic Gaussian filters, and first and second order spatial derivatives for various mask widths and orientation angles. The total of 419 features resulted in a recognition rate of around 77%. As a remark, the trained CART classifier consists of 484054 decision nodes and 33371 leaves in this case. Two reasons led us to the decision not to use a Randomized Tree (RT) instead of a CART: first, an RT ranks the features like a CART with surrogate rules, i.e. according to M(x_m); second, the couple of minutes needed to read in a trained RT consisting of e.g. 100 CART classifiers is somewhat prohibitive in a real facility environment.

Table 5. Classification statistics for all NIR features (d = 29)

Class i | Abbrev. | Ni/N (%) | F1 measure (%) | diag(F) (%)
0  | BG    | 16.65 | 95.09 | 16.169
1  | ZD    | 11.87 | 54.68 | 7.120
2  | MGWD  | 21.44 | 60.35 | 14.346
3  | BP    | 13.56 | 65.75 | 9.618
4  | WPb   | 4.93  | 43.68 | 2.284
5  | WPw-u | 5.46  | 36.32 | 1.702
6  | WP-g  | 2.98  | 36.03 | 0.736
7  | KA-u  | 2.26  | 19.23 | 0.276
8  | KA-g  | 13.52 | 68.98 | 9.060
9  | SV    | 3.83  | 30.82 | 0.789
10 | UN    | 3.49  | 34.39 | 0.858
Overall recognition rate 1 - P(F) = 62.958

Table 6. Classification statistics for the best d = 59 features selected among NIR, RGB, HSV and a mixture of visual features

Class i | Abbrev. | Ni/N (%) | F1 measure (%) | diag(F) (%)
0  | BG    | 16.65 | 96.49 | 16.026
1  | ZD    | 11.87 | 72.60 | 8.704
2  | MGWD  | 21.44 | 75.19 | 17.086
3  | BP    | 13.56 | 80.84 | 11.074
4  | WPb   | 4.93  | 82.79 | 4.079
5  | WPw-u | 5.46  | 70.18 | 3.629
6  | WP-g  | 2.98  | 63.42 | 1.641
7  | KA-u  | 2.26  | 69.81 | 1.457
8  | KA-g  | 13.52 | 75.57 | 10.242
9  | SV    | 3.83  | 62.53 | 2.172
10 | UN    | 3.49  | 61.99 | 1.973
Overall recognition rate 1 - P(F) = 78.082

By iteratively deleting the most unimportant features (according to the measure described in Sect. 4.2), the number of features could be reduced to just 59, while even improving the recognition rate slightly to 78%. The error statistics are listed in Table 6, and the corresponding error matrix F is visualized in Fig. 5. It is worth noting that the increase in recognition rate from 63% to 78% is mainly attributable to the paper classes and not to the background class (compare the F1 measures in Tables 5 and 6). An example of classified paper waste is shown at the bottom of Fig. 1, where the paper classes are labeled by different colors. To further illustrate the feature selection process and its relevance to the achievable recognition rate, Fig. 6 shows the recognition rate versus the number of selected features among the 419 total features. At the far right, when all NIR and VIS features are used, a 77% recognition rate is achieved. Surprisingly, when moving to the left in this plot, a further deletion of features results in a slight increase of the recognition rate, because the classifier is no longer troubled by useless and redundant information in the data set. The CART classifier is, however, a parameter-free approach and deals robustly with useless information. The most important result is that the features can be reduced down to 59 with no loss in the recognition rate, which leads to 78%. Only when reducing the features further does a significant decrease of the recognition rate result (see far left in Fig. 6).

Fig. 4. Visualization of the class error matrix F for the 29 NIR features. With i being the row index and j the column index, the elements Fij are the number of samples from class i which are classified as class j. Low values are colored in blue, high values in red.
Fig. 5. Visualization of the class error matrix F for the best 59 NIR+VIS features (see the peak in Fig. 6). The recognition rate is much improved compared to Fig. 4.

Thus, with appropriate feature selection, the computational cost can be reduced, since only the best visual features need to be computed. Interestingly, our feature ranking also showed that the H and S channels of the HSV data are quite important, which is also stated by [9]. More surprisingly, almost half of the original NIR features could be dropped from the remaining set of 59 features – even the values for talcum and lignin. While [10] states that rule-based classifiers like CART are generally too slow for real-time applications, we would be able to process at a conveyor speed of 4 m/s on a standard 4-core computer based on 29 NIR, 3 RGB and 3 HSV features, without the need for further hardware parallelization. This would be eight times the actual conveyor speed. When, however, exploiting many hundreds of visual features, more sophisticated data preprocessing steps need to be applied.

5.2 Plastic Data

In the first experiment, a CART classifier was trained for all 17 classes with 768 features. The size of the training data is big enough, and the classifier uses an internal cross-validation, so that overfitting is avoided. The class error matrix in Fig. 7 nevertheless shows an almost perfect recognition of all classes with 1 - P(F) = 89.57%. Even the five PET classes, which only differ in color and cause the most recognition errors, are recognized quite well. This is an overly optimistic result, of course, but it shows that it is worthwhile to proceed with our generic approach.

In the next experiment, only the most important classes from an application point of view are considered further by merging all PET classes (1–6) and all PE classes (8–11) into one PET and one PE class, respectively, and dropping classes 7, 12, and 17; see Table 7 and compare with Table 2.

Table 7. Most important plastic classes to be discriminated, with N = 537267 samples in total. The class index runs from 0, ..., c with c = 6 classes plus background
Class index i   Abbreviation   Class                         Pattern samples N_i
0               BG             Background                    192678
1               PET            Polyethylene Terephthalate    192676
2               PE             Polyethylene                  105113
3               PA             Polyamide                     12078
4               PC             Polycarbonate                 2641
5               PP             Polypropylene                 15059
6               PVC hard       Polyvinylchloride hard        17022

Fig. 6. Recognition rate over selected features. Best trade-off with 59 features and a recognition rate of 78%.

Table 8. Classification statistics for the 6 important classes with d = 768 features
Class index i   0      1      2      3      4      5      6
Class abbrev.   BG     PET    PE     PA     PC     PP     PVC
N_i/N           35.86  35.86  19.56  2.25   0.49   2.80   3.17
F1 measure      99.79  99.62  99.74  99.94  94.48  99.47  97.15
diag(F)         35.809 35.734 19.507 2.247  0.454  2.793  3.061
Overall recognition rate: 1 - P(F) = 99.604%

Fig. 7. Class error matrix for all plastic classes. The overall recognition rate is 1 - P(F) = 89.57%. Mostly the differently colored PET classes contribute to the recognition error.

Fig. 8. Class error matrix for the 6 important plastic classes. The overall recognition rate is 1 - P(F) = 99.604%.

Fig. 9. Recognition rate versus selected feature count for the Breiman measure (blue) and the primary-splitter-only measure (red). Fewer features are needed in the red case.

Figure 8 shows the related class error matrix, and Table 8 the classification statistics.
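The quantities reported in Tables 5–8 can all be read off the confusion matrix F. The following NumPy sketch shows one way to compute them (as percentages, like in the tables); it is illustrative and not taken from the paper.

```python
import numpy as np

def classification_statistics(F):
    """Per-class statistics from a confusion matrix F, where F[i, j] counts
    samples of true class i classified as class j (i = row, j = column)."""
    F = np.asarray(F, dtype=float)
    n = F.sum()
    tp = np.diag(F)
    precision = tp / np.maximum(F.sum(axis=0), 1e-12)
    recall = tp / np.maximum(F.sum(axis=1), 1e-12)
    f1 = 2 * precision * recall / np.maximum(precision + recall, 1e-12)
    return {
        "Ni/N": 100.0 * F.sum(axis=1) / n,          # class fractions
        "F1 measure": 100.0 * f1,
        "diag(F)": 100.0 * tp / n,                  # correct decisions
        "overall rate (1 - P(F))": 100.0 * tp.sum() / n,
    }
```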
As before, the recognition rate is very good, now almost 100%. The effect of considering only the primary splitter in the feature ranking is shown in Fig. 9. The recognition rate drops at a smaller number of features compared to the feature selection based on the original ranking criterion. That is because the ranking now selects between equally important but redundant features, thus dropping highly ranked but unnecessary features as well. Figure 10 shows the second derivative of the spectra of various plastic materials. The grey bars indicate the importance assigned to the wavelengths for this feature by the importance measure M(x_m). Wavelengths where this feature shows great diversity are rated high.

Fig. 10. Importance (grey bars) of the 2nd derivative of the spectra versus wavelength.

As mentioned above, these recognition rates are overly optimistic due to (a) the careful probe preparation and (b) the data set being far from representative of all possible appearances of plastic waste in a real facility. But the results show that even identical PET probes, only differently colored, can be recognized well, and that the feature selection scheme can be applied to whole raw NIR spectra too. This is all the more important as
– it is a generic approach without the need for any expert knowledge, and
– the amount of data of a raw spectrum is about eight times that of preprocessed score values, hence the need for data reduction increases considerably.

6 Conclusion and Outlook

The experimental results including additional visual features show a significant improvement over NIR scores alone. Our results on the real-world paper data confirm the preliminary results attained on a laboratory dataset with 14 different paper classes. The feature ranking of the CART classifier enables us to use many potential features at first and to automatically select only the best subset for a productive environment. The application of the material recognition methods to raw NIR spectra of plastic waste reveals that wavelengths can be selected in a generic way, where material classes exhibit characteristic diversity; thus preprocessed scores dependent on the experience of a particular camera manufacturer are no longer necessary. This way, the amount of data of raw spectra can be successfully reduced as well while retaining the crucial information.

For the future, we plan to exploit the full visual resolution in order to capture finer structural details in paper waste. At the same time, intelligent data fusion of multivariate data of different resolutions is needed to avoid resubstitution error due to partially replicated data. With a sevenfold higher resolution, the computational costs will also be a critical factor. Therefore, we want to investigate the applicability of a regional pre-clustering procedure and other data reduction techniques. We also intend to compare the feature ranking technique used in our CART classifier to other possible techniques, like e.g. l1-regularized data reduction. Compared to a simple RGB camera, a NIR sensor is rather expensive. Thus, it is also of interest whether visual features alone suffice to achieve an at least acceptable recognition rate for a lower price. Since real-world paper waste is not guaranteed to contain only paper, detection of problematic material such as inflammable materials or rigid objects which might damage the sorting plant would be much appreciated.
For these classes it is generally hard to gather much training data, as the variety of possible objects is huge. The recognition results for plastics on a small data set of raw NIR spectra are quite promising and encourage us to determine the recognition rates on a large scale in a real sorting facility for plastic materials as well.

References
1. Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001). ISSN: 0885-6125
2. Breiman, L., Friedman, J., Olshen, R., Stone, C.: Classification and Regression Trees. Chapman & Hall/CRC, Boca Raton (1984). ISBN: 978-0-412-04841-8
3. Duda, R.O., Hart, P.E., Stork, D.G.: Pattern Classification, 2nd edn. Wiley, New York (2000). ISBN: 0-471-05669-3
4. Verband Deutscher Papierfabriken e.V.: Facts about Paper (2015). Brochure. Accessed 30 Nov 2015. http://www.vdp-online.de/en/papierindustrie/statistik
5. Genuer, R., Poggi, J.-M., Tuleau-Malot, C.: Variable selection using random forests. Pattern Recognit. Lett. 31(14), 2225–2236 (2010)
6. Guyon, I., Weston, J., Barnhill, S., Vapnik, V.: Gene selection for cancer classification using support vector machines. Mach. Learn. 46(1–3), 389–422 (2002). ISSN: 0885-6125
7. Jolliffe, I.T.: Principal Component Analysis. Springer Series in Statistics. Springer, New York (1986). ISBN: 0-387-96269-7
8. Klippel, P., Zisler, M., Schröder, F., Schleich, S., Serebryanyk, A., Schnörr, C.: Improvement of dry paper waste sorting through data fusion of visual and NIR data. In: Pretz, T., Wotruba, H. (eds.) 7th Sensor-Based Sorting & Control 2016, Shaker (2016)
9. Leitner, R., Rosskopf, S.: Identification of flexographic-printed newspapers with NIR spectral imaging. Int. J. Comput. Inf. Syst. Control. Eng. 2(8), 68–73 (2008). ISSN: 1307-6892
10. Rahman, M.O., Hussain, A., Basri, H.: A critical review on waste paper sorting techniques. Int. J. Environ. Sci. Technol. 11(2), 551–564 (2014). ISSN: 1735-1472
11. Rahman, M.O., Hussain, A., Scavino, E., Basri, N.E.A., Basri, H., Hannan, M.A.: Waste paper grade identification system using window features. J. Comput. Inf. Syst. 6(7), 2077–2091 (2010). ISSN: 1553-9105
12. Siesler, H.W., Ozaki, S., Kawata, Y.-a., Heise, H.M.: Near-Infrared Spectroscopy: Principles, Instruments, Applications. Wiley-VCH Verlag GmbH (2002)

Hand Gesture Recognition with Leap Motion

Lin Feng1, Youchen Du1, Shenglan Liu1(B), Li Xu2, Jie Wu1, and Hong Qiao3
1 Dalian University of Technology, Dalian, China liusl@mail.dlut.edu.cn
2 Neusoft Co. Ltd., Shenyang, China
3 Chinese Academy of Sciences, Beijing, China

Abstract. Hand gesture is a natural way for people to communicate and plays an important role in Human-Computer Interaction (HCI). Nowadays, many developers build HCI applications on top of hand gesture recognition, but making such recognition more accurate still has a long way to go. The recent introduction of depth cameras like the Leap Motion Controller (LMC) allows researchers to exploit depth information to recognize hand gestures more robustly. This paper proposes a novel hand gesture recognition system based on the LMC. Histogram of Oriented Gradient (HOG) features are extracted from binarized and undistorted Leap Motion sensor images. We feed these features into a multi-class Support Vector Machine (SVM) classifier to recognize the performed gesture. The results show that our model is much more accurate than previous work.
Keywords: Hand gesture recognition · Support Vector Machine (SVM) · Histogram of Oriented Gradient (HOG) · Leap Motion

1 Introduction

In recent years, with the enormous development in the field of machine learning, problems such as understanding human voice, language, movement, and posture have become more and more popular; hand gesture recognition, as one of these fields, has attracted many researchers' interest [1]. The hand is an important part of the human body and, as a way to supplement human language, gestures play an important role in daily life; in the fields of human-computer interaction, robotics, and sign language, how to recognize a hand gesture is one of the core issues [2–4]. In previous work, orientation histograms have been used to recognize hand gestures [5], and a variant of the Earth Mover's Distance (EMD) has also been used for this task [6]. Recently, depth cameras such as Time-of-Flight cameras and the Microsoft Kinect have been marketed one after another, and the use of depth features has been added to gesture recognition based on low-dimensional feature extraction [7]. A volumetric shape descriptor has been used to achieve robust pose recognition in real time [8], and adding features like distance, elevation, and curvature based on 3D information about the hand shape and finger posture contained in depth data has also improved accuracy [9]. Recognizing hand gestures from contours has also been explored [10], hand gesture recognition using finger segmentation has been tested [11], and using HOG features with an SVM to recognize hand gestures has also been proposed [12].

The Leap Motion Controller (LMC) is a consumer-oriented tool for gesture recognition and finger positioning developed by Leap Motion. Unlike the Microsoft Kinect, it is based on binocular visual depth and provides data on fine-grained locations such as hands and knuckles. Due to its different design concept, it only works normally at close range, but it offers good data accuracy, on the order of 0.2 mm [13]. Many studies have tried to recognize hand gestures with the LMC [14, 15]. Combining Leap Motion and Kinect for hand gesture recognition has also been proposed and achieved good accuracy [16].

Our main contributions are as follows:
1. We propose an LMC hand gesture dataset, which contains 13 subjects and 10 gestures; each gesture is repeated 20 times by each subject, giving 2600 samples in total.
2. We use the Leap Motion only. We extract HOG features from the LMC sensor images; the HOG feature significantly improves gesture recognition accuracy.

This paper is organized as follows. In Sect. 2, we give a brief introduction to our model architecture, methods, and dataset. In Sect. 3, we present the HOG feature extracted from binarized LMC sensor images. In Sect. 4, we analyze and compare the performance of the HOG feature with the work presented by Marin et al. In Sect. 5, we present the conclusions of this paper and thoughts on future work.

2 Overview

In this section, we describe the model architecture we used and the way data is handled (Sect. 2.1), and how we collected our dataset with the LMC (Sect. 2.2).

2.1 System Architecture

Figure 1 shows in detail the recognition model we designed.
For sensor images, we retrieve the sensor images from the LMC and binarize them, then we extract the HOG feature, and finally feed these features into a One-vs-One multi-class SVM to classify the hand gesture.

2.2 Hand Gesture Dataset

In order to evaluate the performance of the HOG feature of the raw sensor images, we propose a new dataset; the setup is shown in Fig. 2. The dataset contains a total of 10 gestures (Fig. 3) performed by 13 individuals; each gesture is repeated 20 times, so the dataset contains a total of 2600 samples. The tracking data and sensor images are captured simultaneously, and each individual is told to perform gestures within the LMC's valid visual range, allowing translation and rotation, with no other prior knowledge.

Fig. 1. System architecture. Fig. 2. Capture setup. Fig. 3. Gestures in dataset.

3 Feature Extraction from Sensor Images

3.1 Sensor Images Preprocessing

Barrel distortion is introduced by the LMC's hardware (Fig. 4). In order to get realistic images, we use an official method provided by Leap Motion that applies bilinear interpolation to correct the distorted images. We then apply threshold filtering to the corrected image; after doing so, the image is binarized, retaining the area of the hand and removing the non-hand area as much as possible, as shown in Fig. 5.

3.2 Histogram of Oriented Gradient

The HOG feature is a feature descriptor used for object detection in computer vision and image processing. Its essence is the statistics of image gradient information. In this paper, we use the HOG feature to extract feature information about gestures from the binarized, undistorted sensor images.

Fig. 4. Raw images from LMC. Fig. 5. Binarized images.

4 Experiments and Results

4.1 Comparison Between Different Datasets

In order to show that our dataset has a data distribution similar to previous work and no special bias towards our HOG feature, we reconstruct the calculations for features like fingertip angles, fingertip distances, and fingertip elevations from [16]; the results are shown in Table 1.

Table 1. Tracking features accuracy on both datasets
Marin et al.   79.80%
Ours           82.30%

4.2 HOG Feature with Different Classifiers

We compare the performance of the HOG feature with different classifiers, such as LR, SVM (RBF), SVM (linear), RF, KNN, and MLP. In each round, we split the dataset into an 80% training set and a 20% test set, then train these classifiers on the same data and validate their performance. The results of 10 rounds show that the SVM with RBF kernel outperforms the other classifiers by a significant margin, as shown in Table 2.

Table 2. Performance of HOG feature on different classifiers
Classifier     Precision
LR             88.15%
SVM (RBF)      96.42%
SVM (linear)   96.31%
RF             82.50%
KNN            94.69%
MLP            94.00%

4.3 SVM Details

We use the One-vs-One strategy for the multi-class SVM with RBF kernel to classify 10 classes; for each class pair there is one SVM, resulting in a total of 10*(10-1)/2 = 45 classifiers, and the final classification result is based on the votes received. For the hyper-parameters (C, γ), we use grid search on 80% of the samples with 10-fold cross-validation; C is searched from 10^0 to 10^3, and γ is searched from 10^-4 to 10^0. We present our best results with the parameters found by grid search in Table 3.

Table 3. Best results with parameters searched by grid search
Classifier     Precision
SVM (RBF)      98.27%
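As a rough illustration of this pipeline (binarization of an undistorted sensor image, HOG extraction, and a One-vs-One RBF SVM tuned by grid search over the ranges given above), the following Python sketch uses OpenCV, scikit-image and scikit-learn. It is a sketch under assumptions: the threshold value, image size and HOG parameters are illustrative choices, not the paper's settings.

```python
import cv2
import numpy as np
from skimage.feature import hog
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

def preprocess(sensor_img, thresh=80):
    """Binarize an (already undistorted) grayscale LMC sensor image;
    the threshold value is an assumption."""
    _, binary = cv2.threshold(sensor_img, thresh, 255, cv2.THRESH_BINARY)
    return binary

def hog_feature(binary_img):
    """Extract a HOG descriptor; cell/block sizes are illustrative choices."""
    resized = cv2.resize(binary_img, (128, 128))
    return hog(resized, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), block_norm='L2-Hys')

def train_classifier(images, labels):
    """One-vs-One RBF SVM with the grid-search ranges given in the paper."""
    X = np.array([hog_feature(preprocess(img)) for img in images])
    grid = {'C': np.logspace(0, 3, 4), 'gamma': np.logspace(-4, 0, 5)}
    clf = GridSearchCV(SVC(kernel='rbf', decision_function_shape='ovo'),
                       grid, cv=10)
    return clf.fit(X, labels)
```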
5 Conclusions and Future Works

In this paper, we proposed an LMC hand gesture dataset which contains 13 subjects and 10 gestures. We proposed a way to extract HOG features from LMC raw sensor images using binarization and undistortion. We compared the performance of the HOG feature with different classifiers and presented the best results obtained in our experiment.

In future work, we will explore the characteristics of the tracking data; we think the characteristics of the joints will also affect the accuracy of the overall classification due to the correlation between joints. We will try to perform feature fusion between the tracking features and the HOG feature, and the results should be considerable. The current training process consumes much time in our experiment, so we will continue to optimize it by introducing techniques like removing linearly dependent features with PCA. At the same time, we will study the interaction between the system and virtual reality application scenarios.

Acknowledgments. This work was supported in part by the National Natural Science Foundation of China under Grants 61627808, 91648205, 61602082, and 61672130. This work was also supported in part by the Development of Science and Technology of Guangdong Province Special Fund Project Grant 2016B090910001 and the Open Program of the State Key Laboratory of Software Architecture (Item number SKLSAOP1701).

References
1. Rautaray, S.S., Agrawal, A.: Vision based hand gesture recognition for human computer interaction: a survey. Artif. Intell. Rev. 43(1), 1–54 (2015)
2. Ohn-Bar, E., Trivedi, M.M.: Hand gesture recognition in real time for automotive interfaces: a multimodal vision-based approach and evaluations. IEEE Trans. Intell. Transp. Syst. 15(6), 2368–2377 (2014)
3. Wan, C., Yao, A., Van Gool, L.: Hand pose estimation from local surface normals. In: European Conference on Computer Vision, pp. 554–569. Springer (2016)
4. Chaudhary, A., Raheja, J.L., Das, K., Raheja, S.: Intelligent approaches to interact with machines using hand gesture recognition in natural way: a survey. arXiv preprint arXiv:1303.2292 (2013)
5. Freeman, W.T., Tanaka, K.-i., Ohta, J., Kyuma, K.: Computer vision for computer games. In: Proceedings of the Second International Conference on Automatic Face and Gesture Recognition, pp. 100–105. IEEE (1996)
6. Ren, Z., Yuan, J., Meng, J., Zhang, Z.: Robust part-based hand gesture recognition using Kinect sensor. IEEE Trans. Multimed. 15(5), 1110–1120 (2013)
7. Suarez, J., Murphy, R.R.: Hand gesture recognition with depth images: a review. In: RO-MAN, 2012 IEEE, pp. 411–417. IEEE (2012)
8. Suryanarayan, P., Subramanian, A., Mandalapu, D.: Dynamic hand pose recognition using depth data. In: 2010 20th International Conference on Pattern Recognition (ICPR), pp. 3105–3108. IEEE (2010)
9. Dominio, F., Donadeo, M., Zanuttigh, P.: Combining multiple depth-based descriptors for hand gesture recognition. Pattern Recognit. Lett. 50, 101–111 (2014)
10. Yao, Y., Yun, F.: Contour model-based hand-gesture recognition using the Kinect sensor. IEEE Trans. Circuits Syst. Video Technol. 24(11), 1935–1944 (2014)
11. Chen, Z.-h., Kim, J.-T., Liang, J., Zhang, J., Yuan, Y.-B.: Real-time hand gesture recognition using finger segmentation. Sci. World J. 2014 (2014)
12. Feng, K.-p., Yuan, F.: Static hand gesture recognition based on HOG characters and support vector machines.
In: 2013 2nd International Symposium on Instrumenta-tion and Measurement, Sensor Network and Automation (IMSNA), pp. 936–938. IEEE (2013) 13. Weichert, F., Bachmann, D., Rudak, B., Fisseler, D.: Analysis of the accuracy and robustness of the leap motion controller. Sensors 13(5), 6380–6393 (2013) 14. Ameur, S., Khalifa, A.B., Bouhlel, M.S.: A comprehensive leap motion database for hand gesture recognition. In: 2017 International Conference on Information and Digital Technologies (IDT), pp. 514–519. IEEE (2017) 54 L. Feng et al. 15. Wei, L., Tong, Z., Chu, J.: Dynamic hand gesture recognition with leap motion controller. IEEE Signal Process. Lett. 23(9), 1188–1192 (2016) 16. Marin, G., Dominio, F., Zanuttigh, P.: Hand gesture recognition with leap motion and kinect devices. In: 2014 IEEE International Conference on Image Processing (ICIP), pp. 1565–1569. IEEE (2014) A Fast and Simple Sample-Based T-Shirt Image Search Engine Liliang Chan(?) , Pai Peng, Xiangyu Liu, Xixi Cao, and Houwei Cao Department of Computer Science, New York Institute of Technology, New York, USA {lchen25,ppeng,xliu24,xcao01,hcao02}@nyit.edu Abstract. In this paper, we proposed a fast and simple sample-based T-shirt image retrieval system TColor, which can e?ectively search T-shirt image by main color, and optional secondary colors. We considered several distinct prop- erties of T-shirt images. Instead of traversing all pixels on T-shirt image, we search T-shirt by color based on 12 representative pixels extracted from the esti- mated e?ective T-shirt area. We evaluated our system based on a small amount of pilot T-shirt image data. Our results indicated that the proposed system signif- icantly outperforms the straight-forward, brute force un?ltered traverse search, and obtains similar results with a much complex, time-consuming ?ltered traverse algorithm which removes the background color for t-shirt image during the search. Keywords: T-shirt image · Image search · Search engine 1 Introduction In the era of information age, there are dramatically number of images being distributed and shared over the web. As a result, many search engines have added the function of image search, such as Google, Baidu, Bing, etc. The most common approach for image search is “content-based” image retrieval, which is based on the image analysis in order to extract low-level visual properties, such as color, shape, and texture [1, 2]. Besides, other systems search images based on the visual similarity, regardless of the content of the real images [3]. The ?rst step in image retrieval is feature extraction. Most image search engines use the color space feature extractor and the composition space feature extractor to extract the image features, and then search the best image based on the similarities. During the search process, the perceptual hash algorithm is usually used to generate a “?ngerprint” string for each picture, and the similarity between images can be measured by comparing the ?ngerprints between di?erent pictures. Although image search has been successfully applied in many search engines and applications, it is not trivial and there are many challenges encountered in the search process. For example, simplifying the color and calculating the gray-scale average of pixels can take very long time on large image databases. In addition, compared with general image search, T-shirt search has some distinctive characteristics and challenges. In this paper, we proposed a fast and simple sample-based T-shirt image search engine. 
By considering several © Springer Nature Switzerland AG 2019 K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 55–62, 2019. https://doi.org/10.1007/978-3-030-02686-8_5 distinct properties of T-shirt image, our system can e?ectively search T-shirt by main color, and optional secondary colors. In this paper, we proposed a very fast and simple sample-based T-shirt image search engine, which can e?ectively search T-shirt by color. Compared with general image search, T-shirt search has some distinctive characteristics and challenges. For example, the t-shirt images usually have large portion of background, and the background color can cause perturbations to search accuracy. On the other hand, the t-shirt images usually have symmetrical structure, and located in the relatively ?xed position of the entire image. By considering these distinct properties of T-shirt images, we proposed a simple but e?ective search system which can search the T-shirt images by main color and optional secondary colors. For each t-shirt image, instead of traversing all pixels, we ?rst select 12 pixels based on some sampling rules derived from analyzing a small amount of pilot data, and extract the RGB data of these pixels [5]. Then we transform three-dimensional microscopic RGB data into visual colors. In the process, we chose 12 common colors, and classi?ed pixels with di?erent RGB into colors based on the Euclidean distance [6]. Meanwhile, we compute the proportion of each color and stored the information into our t-shirt image database for future search. Based on the pilot evaluation results on 200 t-shirt images, our proposed system signi?cantly outperforms the general un?ltered traverse search, and obtains similar results with much complex, time-consuming ?ltered traverse algorithms which removed the background color for t-shirt image. 2 Methods In this section, we introduce how we implement the proposed sample-based t-shirt search algorithm. 2.1 Selection of Representative Pixels Instead of traversing all pixels, our proposed sample-based t-shirt search system search t-shirt is only based on a few samples. How to select representative sampling points is very crucial for the search accuracy. Here we introduce our strategies for data sampling. First of all, as most of the t-shirts are symmetrical, we only focus on left half of the image. Chopping half of the image can obviously decrease the search time and complexity, reduce the data size from 2n to n. On the other hand, t-shirt images usually have large portion of background. We try to avoid the background area and only select data samples from the e?ective t-shirt region. In order to do that, we try to determine the relative position of T-shirt boundary in four directions (left, right, upper, and lower) based on statistical analysis on 50 pilot t-shirt images in our dataset. Figure 1(a) and 1(b) shows the histogram of the boundary locations based on the 50 pilot images, clearly indicates the range of boundary locations. Based on that, we can roughly determine the valid area of t-shirt images as shown in Fig. 2. Then we randomly sample 12 pixels from the valid area, and an example of t-shirt image and how the 12 selected pixels distributed can be found in Fig. 2 as well. 56 L. Chan et al. Fig. 1a. Histogram for left/right boundary distribution. Fig. 1b. Histogram for upper/lower boundary distribution. Fig. 2. Valid search area (left) & example of how 12 selected pixels distributed. 
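A minimal sketch of this sampling step is shown below, in Python with PIL. The valid region is expressed as relative bounds on the left half of the image; the concrete numbers in the example call are placeholders, since the paper derives its bounds from the boundary histograms of the 50 pilot images.

```python
import random
from PIL import Image

def sample_pixels(image_path, bounds, n=12, seed=None):
    """Randomly pick n sample positions inside the estimated valid T-shirt
    area of the left half of the image. `bounds` gives the valid region
    (left, right, top, bottom) as fractions of the image width/height."""
    img = Image.open(image_path)
    w, h = img.size
    left, right, top, bottom = bounds
    rng = random.Random(seed)
    return [(rng.randint(int(left * w), int(right * w) - 1),
             rng.randint(int(top * h), int(bottom * h) - 1))
            for _ in range(n)]

# Example with hypothetical bounds:
# pts = sample_pixels("tshirt.jpg", bounds=(0.15, 0.50, 0.20, 0.85))
```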
2.2 Determining the Color for Selected Pixels

For each selected pixel, we can easily get the corresponding microscopic R-G-B data with the Python Imaging Library (PIL) [4]. However, the microscopic R-G-B information is not visual enough: we need to transform the microscopic R-G-B values into macroscopic candidate colors [7–9]. In our proposed t-shirt search system, we give users 12 candidate colors to choose from, including black, white, red, orange, yellow, green, cyan, blue, purple, pink, grey and brown. Therefore, we divide the 3-dimensional R-G-B space into 12 parts based on the Euclidean distance [10]. By computing the Euclidean distance between the sampling pixel and the standard colors as in (1), the sample pixel is assigned to the color category Ci with the shortest distance:

D(sample pixel, standard color) = √((R - R')² + (G - G')² + (B - B')²),   (1)

D(P, Ci) = min_C D(P, C).   (2)

2.3 Traversal-Based T-shirt Retrieval

Two traversal-based search algorithms are implemented as well for the sake of comparison.

Unfiltered Traversal Search. We first consider the most straightforward search approach, the unfiltered traversal search. This simple brute-force approach does not take into account the background color of the t-shirt image. We traverse every pixel of the image, get the corresponding R-G-B data for each pixel, and then classify it into one of the twelve candidate colors [11].

Filtered Traversal Search. In the filtered traversal search, we try to filter out the background color of the t-shirt image. As the Euclidean distance between two pixels of obviously different colors should be much bigger than that between two pixels of similar colors, we can identify whether a pixel is located on the boundary or not by examining the Euclidean distance between the current pixel and the adjacent pixel during the traverse search. Figure 3 shows the Euclidean distance across the boundary for the 50 pilot t-shirt images in our dataset. We can see that the minimum distance across the boundary is 1500. As a result, we choose this value as the threshold to filter the background color in the filtered traversal algorithm.

Fig. 3. Euclidean distance across the boundary on 50 pilot T-shirt images.
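A minimal sketch of the nearest-standard-color assignment of Eqs. (1) and (2), in Python with PIL, is given below. The RGB values chosen for the 12 standard colors are placeholders (the paper does not list them), and the squared distance is used for the comparison, which yields the same ordering as Eq. (1).

```python
from PIL import Image

# Hypothetical RGB values for the 12 candidate colors (placeholders).
STANDARD_COLORS = {
    "black": (0, 0, 0), "white": (255, 255, 255), "red": (255, 0, 0),
    "orange": (255, 165, 0), "yellow": (255, 255, 0), "green": (0, 128, 0),
    "cyan": (0, 255, 255), "blue": (0, 0, 255), "purple": (128, 0, 128),
    "pink": (255, 192, 203), "grey": (128, 128, 128), "brown": (139, 69, 19),
}

def squared_distance(p, q):
    """Squared Euclidean distance in RGB space (same ordering as Eq. (1))."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def classify_pixel(rgb):
    """Assign a pixel to the nearest standard color (cf. Eq. (2))."""
    return min(STANDARD_COLORS,
               key=lambda c: squared_distance(rgb, STANDARD_COLORS[c]))

def color_proportions(path, sample_points):
    """Color proportions over a list of (x, y) sample positions."""
    img = Image.open(path).convert("RGB")
    counts = {}
    for xy in sample_points:
        c = classify_pixel(img.getpixel(xy))
        counts[c] = counts.get(c, 0) + 1
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}
```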
3 Results

3.1 Dataset

3000 T-shirt images were collected for our study. In our pilot study, 200 testing T-shirt images were labelled by human labelers. Specifically, the labelers label each T-shirt image with one of the 12 common colors, including black, white, red, orange, yellow, green, cyan, blue, purple, pink, grey and brown. The main color of the T-shirt is marked as the color occupying more than 45 percent of the T-shirt. A secondary color is one occupying less than the main color but more than 0% of the t-shirt area. Figure 4 shows the color distribution of the main color and secondary colors on the pilot test data set.

Fig. 4. Color distribution of the main color and secondary colors for the 200 pilot test T-shirt images.

3.2 Evaluations

Table 1 compares the performance of the three different search approaches. We consider two different evaluation metrics, MAP (Mean Average Precision) and MRR (Mean Reciprocal Rank). MAP is used to evaluate the system precision in general. Different from a standard image search engine, in a t-shirt search engine the accuracy of searching by main color is much more significant, so we also compute the MRR to evaluate the main color search.

Table 1. Performance of the three different search approaches
Algorithm applied to the engine    MAP    MRR for main color search
Sampling Algorithm                 0.61   0.90
Traverse Algorithm (Filtered)      0.63   0.90
Traverse Algorithm (Unfiltered)    0.52   0.78

From Table 1, we can see that the MAP for the sample-based search is 0.61, which is comparable with the 0.63 obtained by the filtered traversal search, and significantly better than the simple brute-force traversal search with a mean average precision of 0.52. Similar results can be seen for MRR. The MRR for main color search is 0.90 for the two search algorithms which benefit from removing the background colors during the search, while the MRR is much lower for the simple traversal search. Significance tests were also performed: the improvement of the proposed sample-based search over the simple traversal search is statistically significant with a p-value of 0.02, and there is no significant difference (p-value 0.15) between the sample-based search and the filtered traversal search.

We also evaluated the three different systems by testing the execution speed in the same testing environment. The results are shown in Table 2. It is clear that the search engine using the proposed sampling-based algorithm has a clear advantage in execution efficiency: its execution time is less than 1/50 of that of the other two engines. The filtered traversal search takes the longest time to search a T-shirt image among the three approaches.

Table 2. Comparison of execution speed
Algorithm applied to the engine    Average time for analyzing the color information of one T-shirt image
Sampling Algorithm                 10 ms
Traverse Algorithm (Filtered)      900 ms
Traverse Algorithm (Unfiltered)    760 ms

We are also interested in how our proposed t-shirt color search engine works on different colors, so we further break down the results for each color. Figure 5 shows the results. First of all, we can see that our proposed system performs significantly differently on different colors. For example, the system can search red and green T-shirts with very high MAP, while it did not show good performance on T-shirts in cyan, pink, grey, purple and brown.

Fig. 5. Break-down MAP (Mean Average Precision) for each color based on the proposed sample-based T-shirt image search.

4 Conclusions

This paper focuses on the T-shirt image search task. We considered several distinct properties of T-shirt images and proposed a fast and simple sample-based T-shirt image search engine, which can effectively search T-shirts by main color and optional secondary colors. Instead of traversing all pixels, our proposed sample-based t-shirt search system searches t-shirts based on only a few samples. How to select representative sampling points is crucial for the search accuracy. In this study, 12 representative pixels were extracted from the estimated T-shirt area, and several statistical analyses were performed to bound the sampling region. We evaluated our system based on 200 pilot T-shirt images. Both the MAP and MRR results indicated that the proposed system significantly outperforms the straightforward, brute-force unfiltered traverse search, and obtains results similar to a much more complex, time-consuming filtered traverse algorithm which removes the background color of the t-shirt image during the search.
We further break-down the results for each color, and the results indicate that the proposed system performs signi?cantly di?erent on di?erent colors. The system can search red, green color T-shirt with very high MAP, while it did not show good performance on purple and brown T-shirt. We also evaluated the three di?erent systems by testing the execution speed in the same testing environment. The proposed system shows clear advantage in execution e?ciency. The execution speed is less than 1/50 compared with the other two engines. In future, we will validate our proposed sample-based T-shirt search engine on large dataset with more T-shirt images. A Fast and Simple Sample-Based T-Shirt Image Search Engine 61 References 1. Veltkamp, R.C., Tanase, M.: Content-Based Image Retrieval Systems: A Survey. Technical Report UU-CS-2000-34, Dept. of Computing Science, Utrecht University (2002) 2. Ortega, M., Rui, Y., Chakrabarti, K., Mehrotra, S., Huang, T.S.: Supporting similarity queries in MARS. In: Proceedings of ACM Conference on Multimedia, pp. 403–413 (1997) 3. Terragalleria. http://www.terragalleria.com 4. Pajankar, A.: Raspberry Pi Image Processing Programming: Develop Real-Life Examples with Python, Pillow, and SciPy. Apress (2017) 5. Zhang, Q., Song, X., Shao, X., Zhao, H., Shibasaki, R.: From RGB-D images to RGB images: single labeling for mining visual models. ACM Trans. Intell. Syst. Technol. 6(2), 16 (2015) 6. Huang, X.Y., Chen, W.W.: Study on image search engine based on color feature algorithm. Adv. Mater. Res. 267, 1010–1013 (2011) 7. Huang, X., Chen, W: A modular image search engine based on key words and color features. In: Transactions on Edutainment VIII. LNCS, vol. 7220, pp. 200–209 (2012) 8. Tedore, C., Johnsen, S.: Using RGB displays to portray color realistic imagery to animal eyes. Curr. Zool. 63, 27–34 (2017) 9. Lieb, A.: Color indexing for images. US20080044081 (2008) 10. Claussen, R.: Algorithms: Euclidean algorithm. ACM (1960) 11. Leon, K., et al.: Color measurement in L*a*b* units from RGB digital images. Food Res. Int. 39(10), 1084–1091 (2006) 62 L. Chan et al. Autonomous Robot KUKA YouBot Navigation Based on Path Planning and Tra?c Signals Recognition Carlos Gordón(?) , Patricio Encalada(?) , Henry Lema(?) , Diego León(?) , and Cristian Peñaherrera(?) Facultad de Ingeniería en Sistemas, Electrónica e Industrial, Universidad Técnica de Ambato, Ambato 180150, Ecuador {cd.gordon,pg.encalada}@uta.edu.ec Abstract. We present the successful demonstration of autonomous robot KUKA YouBot navigation based on path planning and tra?c signals recognition. The integration of both capabilities path planning and tra?c signals recognition was carried out, thanks to the integration among Robot Operating System, MATrix LABoratory software and Open Source Computer Vision Library working envi- ronments. The Robot Operating System allows the simulation of the autonomous robot navigation by using Gazebo and provides the implementation of the algo- rithms in simulated and real platforms. MATrix LABoratory software improves the communication tasks taking advantage of data processing tools in the path planning process. Finally, Open Source Computer Vision Library allows the tra?c signals recognition by using the Scale-Invariant Feature Transform and Speeded-Up Robust Features algorithm. 
The integration of Robot Operating System, MATrix LABoratory software and Open Source Computer Vision Library is a promising approach to provide autonomous navigation capability in any mobile robot and in uncontrolled environments. Keywords: Autonomous navigation · KUKA YouBot Robot operating system component · Path planning · Tra?c signals recognition 1 Introduction Autonomous robot navigation (ARN) in uncontrolled environments is an extraordinary ability for any mobile robot in order to achieve a speci?c goal or perform any task without external assistance [1]. ARN requires set of subsystems which are working together, such as building a map of the surrounding world, localizing the robot and the goal point within the map, making a motion plan according to the map and the localization of the beginning and goal points, executing that plan, and be prepared when something changes during the motion execution. All the subsystems should be executed at the same time which is a challenging task for mobile robots [2]. Several working environments have been used for providing autonomous navigation with arti?cial vision techniques in robots. Among them we can mention: ROS (Robot Operating System, which is a leading development environment in robotics providing tools and libraries for the development © Springer Nature Switzerland AG 2019 K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 63–78, 2019. https://doi.org/10.1007/978-3-030-02686-8_6 of robotic systems) [3], Matlab (MATrix LABoratory software, which includes the Robotics System Toolbox since the R2015A Matlab’s release) [4], and OpenCV (Open Source Computer Vision Library, specially designed for the treatment, capture and visualization of images in a wide range of areas such as pattern recognition in robotics, biometrics, segmentation, etc.) [5]. Di?erent algorithms have been developed in order to integrate the set of subsystems for ARN in uncontrolled environments such as path planning [6] and Tra?c Signals Recognition [7]. On one hand, path planning is the method of ?nding the best feasible path from beginning to goal locations. This topic is of major research and di?erent techniques have been reported with the intention to implement the path planning approach. Among them, we have the Probabilistic RoadMap (PRM) which is a motion planning algorithm used to ?nd a path from start to the goal point in occupancy grid map [8]. Other path planning approaches have included Normal Probability [9], e?cient interpolation [10], and Heuristics [11] approaches. On the other hand, tra?c Signals Recognition has been required to ensure autonomous robot navigation which needs the integration of arti?cial vision techniques in order to perform the recognition task [12]. Arti?cial vision not only allows the recognition of tra?c signals but also it allows the taking decisions when robots perform the autonomous navigation and new sceneries appear in the robot’s trajectory [13]. The aim of this work is to present the viability and possibility of the integration of ROS, Matlab and OpenCV working environments in order to develop the autonomous robot KUKA YouBot navigation based on path planning and tra?c signals recognition. ROS Hydro medusa is the seventh ROS distribution release, which allows the simulation of the autonomous robot navigation by using Gazebo and provides the implementation of the algorithm in the real platform of the robot KUKA YouBot. 
Matlab with the Robotics System Toolbox improves the communication tasks with ROS, taking advantage of data processing tools in the path planning process. Besides, OpenCV allows the tra?c signals recognition by using the SIFT (Scale-Invariant Feature Transform) [14] & SURF (Speeded-Up Robust Features) [15] combined algorithm. It is important to mention that we mainly take account in reaching the goal location by the robot KUKA YouBot and we do not consider the time that the robot requires to achieve the goal location due to the fact that the path planning and tra?c signals recognition algorithms working together takes a lot of computation time. We are working further in the imple- mentation and optimization of other path planning and tra?c signals recognition algo- rithms in order to reduce the execution time. Finally, the integration of Robot Operating System, MATrix LABoratory software and Open Source Computer Vision Library is a promising approach to provide autonomous navigation capability in any mobile robot. The following sections describe all the process carried out in the demonstration of autonomous robot KUKA YouBot navigation based on path planning and tra?c signals recognition. Thus, Sect. 2 describes the Robot Operating System, MATrix LABoratory software and Open Source Computer Vision Library working environments integration. Then, Sect. 3 introduces the path planning and tra?c signals recognition implemented algorithms. Next, Sect. 4 presents the features of the robot KUKA YouBot in which all the algorithms were tested. Then, Sect. 5 explains in detail the results reached in the 64 C. Gordón et al. Simulation and Experimental testing. And ?nally, Sect. 6 summarizes the conclusions of the present work. 2 Working Environments Integration As aforementioned, in order to achieve the autonomous robot navigation approach, it was necessary the integration of ROS, Matlab and OpenCV working environments as shown in Fig. 1. ROS Hydro medusa is the seventh ROS distribution release, which offers tools and libraries for the development robotic systems. In recent years, ROS has gained wide currency for the creation of working robotic systems, not only in the laboratory but also in industry. The autonomous navigation of KUKA youbot was simulated by using gazebo simulator which is integrated with ROS. With the intention of achieving ROS integration with stand-alone Gazebo, a set of ROS packages named gazebo_ros_pkgs provides wrap- pers around the stand-alone Gazebo. They provide the necessary interfaces to simulate a robot in Gazebo using ROS messages, services and dynamic features [16]. It is important to mention that the youBot Gazebo packages incorporates geometry, kinematics, dynamics and visual models of the KUKA youBot in Universal Robotic Description Format (URDF) as well as launch files and tools needed to operate the robot in Gazebo. The Robotics System Toolbox included in Matlab provides a complete integration between Matlab, Simulink and ROS. The toolbox enables to write, compile and execute code on ROS-enable robot’s and on robots simulators like aforementioned Gazebo, allowing to generate ROS node from Simulinks model and implement it into the ROS network [17]. 
The artifi- cial vision algorithm for traffic signals recognition was implemented by using Open CV, which is the Open Source Computer Vision Library, specially designed for the treatment, capture and visualization of images in a wide range of areas such as robotics, biometrics, segmentation, human–computer interaction, monitoring and object recognition. Fig. 1. Integration of working environments. A detailed architecture of ROS, Matlab and OpenCV integration is depicted in Fig. 2. ROS is fundamentally a client/server system. It consists of a series of nodes (programs) that communicate with each other through topics (dissemination) or services (interactive communication). It is a process that provides a hard-realtime-compatible Autonomous Robot KUKA YouBot Navigation Based on Path Planning 65 Fig. 2. Architecture of ROS, Matlab and OpenCV integration. 66 C. Gordón et al. loop to control a robot mechanism, which is usually designed in a modular way, so that a system is formed by di?erent controllers as di?_drive_controller, position_controllers, force_torque_sensor_controller and others. ROS working environment mainly includes three nodes: image processing, user application and controller node. ROS Node: Image processing converts images from ROS to OpenCV format or vice versa through CvBridge, a library which enables to send or receive images with the OpenCV image processing. Also, this node obtains images with the subscribers from the publishers established in the ROS Nod: User application and sends di?erent commands with its publisher to the subscriber in the ROS Node: controller_node. ROS Node: User_application executes the communication between Client and Server via ROS Action Protocol, which is built on top of ROS messages. The client and server then provide a simple API (application program interface, which is a set of routines, protocols, and tools for building software applications) for users to request goals (on the client side) or to execute goals (on the server side) via function calls and callbacks. The User_application and controller nodes communication provides to the controller node the logical commands for being interpreted to physical actions. The ROS Action Clients send the position and trajectory information processed with the API and other tools and protocols to the Action Server of controller node. While, the ROS Publisher of the User_application node sends the commands like velocity, to the ROS Subscriber of controller node for the next stage of the process in the communication. ROS Node: Controller_node transforms commands into measures or signals that can be understood by the actuators of the robot. ROS Node: Matlab_global_node corresponds to the script or program created in Matlab, which receive the data from the controller_node process the information, and sends a new command through publisher to the controller_node in order to perform an action in the di?erent actuators in the robot KUKA youbot. OpenCV image processing handles images, which uses di?erent scripts, libraries and techniques like SIFT & SURF. The images are processed thanks to the communi- cation between the cv:Mat (OpenCV-Class to store images) and CvBridge (ROS-library to transform images formats). Finally, YouBot Hardware is the space where the robot system is represented as a combination of decoupled functional subsystems. The manipulator arm and the base platform are arranged as the combination of several joints. At the same time, each joint is de?ned as a combination of a motor and a gearbox. 
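To make the publisher/subscriber and CvBridge path just described more tangible before moving on to the hardware and algorithms, a minimal rospy node might look as follows. This is a hypothetical sketch: the node and topic names are placeholders and are not taken from the paper.

```python
#!/usr/bin/env python
import rospy
from sensor_msgs.msg import Image
from cv_bridge import CvBridge

bridge = CvBridge()

def image_callback(msg):
    # Convert the ROS image message to an OpenCV array via CvBridge.
    frame = bridge.imgmsg_to_cv2(msg, desired_encoding="bgr8")
    # ...hand the frame over to the OpenCV-based recognition code...
    rospy.loginfo("received frame %dx%d", frame.shape[1], frame.shape[0])

if __name__ == "__main__":
    rospy.init_node("image_processing_example")
    # Placeholder topic name; the actual camera topic depends on the setup.
    rospy.Subscriber("/camera/image_raw", Image, image_callback)
    rospy.spin()
```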
The communication with the hardware and the driver is done over a serial EtherCAT connection.

3 Implemented Algorithms

The ARN was performed by applying two algorithms: the first is the path planning algorithm and the second is the traffic signal recognition algorithm.

Considering the path planning requirement, different algorithms were studied. We can mention the probabilistic roadmap (PRM), a probabilistic method whose main virtue is its efficiency in the calculation of trajectories for robots with many degrees of freedom; it can be used either for a single query or for multiple queries [18]. There is also the Lazy PRM algorithm, a single-query variant whose pre-processing phase is quite simple, since it is not necessary to generate a complete network but only one that helps to solve the particular problem [19]. Finally, another algorithm is the rapidly exploring random tree (RRT), a sub-optimal, static, model-based, probabilistic planning algorithm that builds a single, unidirectional, tree-like graph; it starts from the starting point and expands throughout the working environment through a sampling process that looks for random points until it reaches the end point, at which point it stops [20]. The features of the cited path planning algorithms are summarized in Table 1 in terms of processing time, space-constrained solutions, robustness, and computational cost.

Table 1. Algorithms for path planning
Algorithm   Processing time (s)   Space-constrained solutions (%)   Robustness (%)   Computational cost (IPS)
PRM         Average               Low                               Low              Average
Lazy PRM    Low                   Average                           Average          Low
RRT         Average               Average                           High             High

Taking into account the features of the reviewed path planning algorithms, we chose the PRM algorithm, which provides average processing time and average computational cost. In fact, the PRM algorithm avoids increasing the processing time of the integrated ROS, Matlab and OpenCV architecture. The path planning algorithm was implemented in Matlab through the implementation of the pure pursuit algorithm using probabilistic roadmaps (PRM) for robot navigation. The flow chart of the implemented PRM algorithm is depicted in Fig. 3. First, we consider the robot and algorithm parameters, such as the robot dimensions, the start and objective points, the number of PRM nodes and the PRM minimum distance. Then, we get the image of the scenery from Gazebo and process it. Next, we generate the occupancy grid of the image processed in grayscale (0 = free, 1 = occupied). The following step is to inflate the map according to the robot dimensions. Then, it is necessary to find random paths and perform the decision process for the question: is the path empty? If it is, the map is updated and the number of nodes is incremented; otherwise, the path is free and the navigation continues until the goal location is reached.

Fig. 3. Flow chart of the PRM algorithm.
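For readers less familiar with PRM, the following Python sketch illustrates the idea on a binary occupancy grid: sample free cells, connect nearby samples whose straight-line connection is collision-free, and search the resulting roadmap. It is a generic illustration under assumptions (grid, node count, connection radius), not the Matlab Robotics System Toolbox implementation used in this work, and it omits map inflation and path optimization.

```python
import numpy as np
import networkx as nx

def collision_free(grid, p, q, steps=50):
    """Check that the straight segment p->q stays in free cells (grid == 0)."""
    for t in np.linspace(0.0, 1.0, steps):
        r, c = np.round(p + t * (q - p)).astype(int)
        if grid[r, c] == 1:
            return False
    return True

def prm_path(grid, start, goal, n_nodes=60, radius=8.0, rng=np.random):
    """Probabilistic roadmap on an occupancy grid; returns a list of waypoints."""
    free = np.argwhere(grid == 0)
    samples = [np.asarray(start), np.asarray(goal)]
    samples += [free[rng.randint(len(free))] for _ in range(n_nodes)]
    g = nx.Graph()
    for i, p in enumerate(samples):
        for j, q in enumerate(samples[:i]):
            d = np.linalg.norm(p - q)
            if d <= radius and collision_free(grid, p.astype(float), q.astype(float)):
                g.add_edge(i, j, weight=d)
    idx = nx.shortest_path(g, 0, 1, weight="weight")   # raises if no path is found
    return [tuple(samples[i]) for i in idx]
```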
Different algorithms were then reviewed in order to implement the traffic signals recognition requirement. Among them we can mention the Binary Robust Independent Elementary Features (BRIEF) algorithm, which works with strings of bits to describe characteristic points. For this reason, the BRIEF algorithm is much faster than the SIFT and SURF algorithms; it also reduces the complexity of the matching and detection process between images, which lets low-powered devices run this algorithm [21]. It is important to mention that the BRIEF algorithm is not invariant to rotation, because it can only handle a maximum difference of 10 to 15 degrees. Another interesting algorithm is the Oriented FAST and Rotated BRIEF (ORB) algorithm. ORB was created from BRIEF and was modified in order to be invariant to rotation and robust against noise [22]. This method uses the FAST (Features from Accelerated Segment Test) detector to obtain points and the BRIEF descriptor; as a result, ORB can run on devices with reduced processing capacity. Finally, an advanced algorithm is the combination of the Scale-Invariant Feature Transform and Speeded-Up Robust Features algorithms. The SIFT & SURF algorithm allows automatic traffic signal detection in real time [23]. The main advantage of this algorithm is that the extraction of interest points is acceptable and provides the best features with respect to scale, illumination and rotation. As an added value, the SIFT & SURF algorithm provides higher robustness, as indicated by lower BER values [24, 25]. The features of the studied algorithms for traffic signals recognition in terms of processing time, accuracy, robustness, computational cost and rotation are summarized in Table 2.

Table 2. Algorithms for traffic signals recognition
Algorithm     Processing time (s)   Accuracy (dispersion, s)   Robustness (%)   Computational cost (IPS)   Rotation (degrees)
BRIEF         High                  Medium                     Low              Low                        10°–15°
ORB           High                  Medium                     Medium           Low                        Invariant
SIFT & SURF   Low                   High                       High             High                       Invariant

Finally, the traffic signals detection system was implemented in OpenCV with the SIFT & SURF algorithm, considering its features of high accuracy, high robustness and invariance to rotation. It is important to mention that in the present work we do not consider the processing time and computational cost features; we are working further to reduce processing time and computational cost with other algorithms in future studies.

4 Robot KUKA YouBot

The integration of ROS, Matlab and OpenCV was implemented experimentally on the KUKA youBot, which is an open, expandable and modular robotic system. This robot is specially developed for research purposes with emphasis on robotics. The KUKA youBot mainly consists of an omnidirectional platform, a robotic arm with five degrees of freedom, and a gripper with two fingers, as depicted in Fig. 4. All the data acquisition and the experimental demonstration were carried out in the robotics laboratory of the Technical University of Ambato in Ecuador.

Fig. 4. KUKA YouBot, available at the Technical University of Ambato in Ecuador.

5 Simulation and Experimental Results

The simulation of the system using Gazebo consists in having the robot with its actuators and sensors in a three-dimensional environment, where the traffic signals are placed so that they have line of sight with the camera. The procedure begins with the modeling of the robot, which is obtained from the YouBot Store repository, and of its surroundings with 3D models made in SketchUp and Blender. These models must be managed by Gazebo, for which the physical properties of the 3D objects, such as mass, inertia, texture, shape and color, are configured in the .config and .sdf files so that they can be imported into the Gazebo workspace, where we can already use them to assemble the navigation environment of the mobile robot.
Finally, we can execute the movement control scripts of both the omnidirectional platform and the robotic arm. The pictures of the simulation of robot KUKA Youbot in gazebo environment are depicted as follows. Figure 5(a) sketches the robot KUKA Youbot in 3D environment. Figure 5(b) depicts a zoom in of the robot KUKA Youbot in 3D environment. Figure 5(c) shows the robot KUKA Youbot closed Fig. 5. Gazebo Simulation. (a) Robot KUKA Youbot in 3D environment. (b) Zoom in of robot KUKA Youbot in 3D environment. (c) Robot KUKA Youbot and stop tra?c signal. (d) Robot KUKA Youbot and one way tra?c signal. Autonomous Robot KUKA YouBot Navigation Based on Path Planning 71 to the stop tra?c signal. And Fig. 5(d) depicts the robot KUKA Youbot closed to the one way tra?c signal. The path planning algorithm was implemented in the created road map (25 m * 20 m) which is depicted in Fig. 6. Where we observe depicted with asterisks, the start location, goal location, one way tra?c signal and stop tra?c signal within the map. The purpose of one way tra?c signal is the path changing and the stop tra?c signal is to wait for 60 s before continuing the path. Moreover, the result of PRM algorithm applied in the prob- abilistic road map is depicted in Fig. 7, in which we are able to identify 60 nodes. We do not use greater number of nodes with the intention of reducing the computational e?ort. We are able to detect the path in orange line. It is important to mention that the solutions provided by PRM are not the optimal path. Also, the optimized path is depicted in green dashed line obtained via mean square optimization. Besides, we have the real trajectory in red continuous line performed by the robot KUKA YouBot, we mainly appreciate the changes in the trajectory due to the tra?c signal detection and taking decisions. It is necessary to mention that we avoid some features like the proximately to walls and other objects in order to reduce complexity. Fig. 6. Road Map with start, goal and tra?c signals locations. 72 C. Gordón et al. Fig. 7. Road Map with PRM execution. Probabilistic Path in orange line, Optimized path in green dashed line, Real trajectory in red continuous line. The arti?cial vision techniques based on SIFT & SURF algorithm allowed performing the Tra?c Signals Recognition in the real platform execution. The summar- ized process was carried out in the following way. First, it is necessary to have the pattern library of the tra?c signals. The pattern of the Stop tra?c signal is sketched in Fig. 8(a). Second, it is the acquisition of the image with a Microsoft HD camera located in the ?ngers of the gripper, when the KUKA YouBot is executing the path. The obtained image from the camera is sketched in Fig. 8(b). Third, it is the extraction of the features of the pattern image. Figure 8(c) shows the features extraction from the pattern. Fourth, it is the extraction of the features of the obtained image from the camera which is depicted in Fig. 8(d). Fourth, it is the comparison of the features between the two previous extractions, Fig. 8(e) depicts the feature comparison. Finally, we have the detection result of the tra?c signal, which is shown in Fig. 8(f). Autonomous Robot KUKA YouBot Navigation Based on Path Planning 73 a) b) c) d) e) f) Fig. 8. SIFT & SURF algorithm execution, (a) Pattern of the Stop tra?c signal, (b) Obtained image from the camera, (c) Features extraction from the pattern, (d) Features extraction from the obtained image, (e) Feature comparison Pattern, and (f) Detection result. 
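The pattern-matching step illustrated in Fig. 8 can be sketched with OpenCV as follows. This is an illustrative sketch only: it uses SIFT (available as cv2.SIFT_create in OpenCV 4.4+), since SURF is only provided by the opencv-contrib build, and the ratio-test threshold and minimum match count are assumptions rather than the paper's values.

```python
import cv2

def detect_sign(pattern_path, frame, min_matches=10):
    """Return True if enough SIFT keypoint matches between the traffic-sign
    pattern and the camera frame survive Lowe's ratio test."""
    pattern = cv2.imread(pattern_path, cv2.IMREAD_GRAYSCALE)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(pattern, None)
    kp2, des2 = sift.detectAndCompute(gray, None)
    if des1 is None or des2 is None:
        return False

    matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des1, des2, k=2)
    good = [p[0] for p in matches
            if len(p) == 2 and p[0].distance < 0.75 * p[1].distance]
    return len(good) >= min_matches
```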
Also, the pictures when the robot KUKA YouBot meets the traffic signals during the real test are depicted in Figs. 9 and 10. The KUKA YouBot with the one-way traffic signal is sketched in Fig. 9, while Fig. 10 shows the moment when the robot reaches the place where the stop traffic signal is located. In both the simulation and the real platform, the tasks were performed with an average linear velocity of around 0.20 m/s and an average angular velocity of around 0.45 rad/s. The average time for the robot KUKA YouBot to reach the goal was around 2 min. We mainly take into account whether the robot KUKA YouBot reaches the goal location, and we do not consider the time that the robot requires to achieve it, due to the fact that the path planning and traffic signal recognition algorithms working together take a lot of computation time and effort. We are working further on the implementation and optimization of other path planning and traffic signal recognition algorithms in order to reduce the execution time. In addition, we are looking into the implementation of machine learning algorithms in order to improve the ability to recognize all available traffic signals.

Fig. 9. Robot KUKA YouBot and stop traffic signal.

Fig. 10. Robot KUKA YouBot and one-way traffic signal.

6 Conclusions

In conclusion, autonomous robot KUKA YouBot navigation based on path planning and traffic signal recognition has been presented. The integration of both capabilities, path planning and traffic signal recognition, was achieved by integrating the ROS, MATLAB and OpenCV working environments. ROS allowed the simulation of the autonomous robot navigation using Gazebo and provided the implementation of the algorithm on the real robot KUKA YouBot platform. MATLAB improved the communication tasks by taking advantage of its data processing tools in the path planning process. Finally, OpenCV allowed the traffic signal recognition by using the SIFT & SURF algorithm. We have successfully demonstrated that the integration of ROS, MATLAB and OpenCV is a promising approach to provide autonomous navigation capability to any mobile robot. Finally, it is important to mention that the capability of traffic signal recognition opens new areas of research in the fields of artificial intelligence and object recognition, due to the fact that the fundamentals of traffic signal recognition can be applied to the recognition of other kinds of objects.

Acknowledgement. The authors acknowledge the Technical University of Ambato in Ecuador for providing all support and facilities, including the robot KUKA YouBot.

References

1. Perez, A., Karaman, S., Shkolnik, A., Frazzoli, E., Teller, S., Walter, M.R.: Asymptotically-optimal path planning for manipulation using incremental sampling based algorithms. In: IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 4307–4313 (2011)
2. Corke, P.: Integrating ROS and MATLAB. IEEE Robot. Autom. Mag. 22(2), 18–20 (2015)
3. Quigley, M., Gerkey, B., Conley, K., Faust, J., Foote, T.: ROS: an open-source robot operating system. In: ICRA Workshop on Open Source Software, vol. 3, no. 2, p. 5 (2009)
4. MATLAB: Robotics System Toolbox. http://mathworks.com/help/robotics/index.html. Accessed 21 Mar 2018
5. Bradski, G., Kaehler, A.: OpenCV. Dr. Dobb's Journal of Software Tools (2000)
6. Kumar, N., Zoltán, V., Szabó-Resch, Z.: Robot path pursuit using probabilistic roadmap.
In: IEEE 17th International Symposium on Computational Intelligence and Informatics (CINTI), pp. 000139–000144 (2016)
7. Adorni, G., Monica, M., Agostino, P.: Autonomous agents coordination through traffic signals and rules. In: IEEE Conference on Intelligent Transportation System (ITSC 1997), pp. 290–295 (1997)
8. Kavraki, L.E., Švestka, P., Latombe, J.C., Overmars, M.H.: Probabilistic roadmaps for path planning in high-dimensional configuration spaces. IEEE Trans. Robot. Autom. 12(4), 566–580 (1996)
9. Amith, A.L., Singh, A., Harsha, H.N., Prasad, N.R., Shrinivasan, L.: Normal probability and heuristics based path planning and navigation system for mapped roads. Procedia Comput. Sci. 89, 369–377 (2016)
10. Akulovi, M., Ikeš, M., Petrovi, I.: Efficient interpolated path planning of mobile robots based on occupancy grid maps. IFAC Proc. 45(22), 349–354 (2012)
11. Jun, J.Y., Saut, J.P., Benamar, F.: Pose estimation-based path planning for a tracked mobile robot traversing uneven terrains. Robot. Auton. Syst. 75, 325–339 (2016)
12. Mahadevan, S.: Machine learning for robots: a comparison of different paradigms. In: Workshop on Towards Real Autonomy, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 1996) (1996)
13. Lidoris, G., Rohrmuller, F., Wollherr, D., Buss, M.: The Autonomous City Explorer (ACE) project—mobile robot navigation in highly populated urban environments. In: IEEE International Conference on Robotics and Automation (ICRA 2009), pp. 1416–1422 (2009)
14. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60(2), 91–110 (2004)
15. Bay, H., Tuytelaars, T., Van Gool, L.: SURF: speeded up robust features. In: European Conference on Computer Vision, pp. 404–417. Springer, Berlin (2006)
16. Craig, C.: A Robotics Framework for Simulation and Control of a Robotic Arm for Use in Higher Education. MS in Computer Science Project Reports (2017)
17. Galli, M., Barber, R., Garrido, S., Moreno, L.: Path planning using Matlab-ROS integration applied to mobile robots. In: IEEE International Conference on Autonomous Robot Systems and Competitions (ICARSC), pp. 98–103 (2017)
18. Kavraki, L.E., Latombe, J.C.: Probabilistic roadmaps for robot path planning. In: Practical Motion Planning in Robotics: Current Approaches and Future Directions, pp. 1–21 (1998)
19. Bohlin, R., Kavraki, L.E.: Path planning using lazy PRM. In: Proceedings of the IEEE International Conference on Robotics and Automation, vol. 1, pp. 521–528 (2000)
20. LaValle, S.M.: Rapidly-exploring random trees: a new tool for path planning. Technical report TR 98-11, Iowa State University (1998)
21. Calonder, M., Lepetit, V., Strecha, C., Fua, P.: BRIEF: binary robust independent elementary features. In: Proceedings of the 11th European Conference on Computer Vision, ser. ECCV 2010, pp. 778–792. Springer, Berlin (2010)
22. Rublee, E., Rabaud, V., Konolige, K., Bradski, G.: ORB: an efficient alternative to SIFT or SURF. In: IEEE International Conference on Computer Vision (ICCV) (2011)
23. Dreuw, P., Steingrube, P., Hanselmann, H., Ney, H.: SURF-face: face recognition under viewpoint consistency constraints. In: BMVC, pp. 1–11 (2009)
24. Tareen, S.A.K., Saleem, Z.: A comparative analysis of SIFT, SURF, KAZE, AKAZE, ORB, and BRISK. In: International Conference on Computing, Mathematics and Engineering Technologies (iCoMET) (2018)
25. Zrira, N., Hannat, M., Bouyakhf, E.
H., Ahmad, H.: 2D/3D object recognition and categorization approaches for robotic grasping. In: Advances in Soft Computing and Machine Learning in Image Processing, pp. 567–593. Springer, Cham (2018)

Towards Reduced Latency in Saccade Landing Position Prediction Using Velocity Profile Methods

Henry Griffith1(✉), Subir Biswas1, and Oleg Komogortsev2
1 Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI 48824, USA
{griff561,sbiswas}@msu.edu
2 Department of Computer Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
ok@msu.edu

Abstract. Saccade landing position prediction algorithms are a promising approach for improving the performance of gaze-contingent rendering systems. Amongst the various techniques considered in the literature, velocity profile methods operate by first fitting a window of velocity data obtained at the initiation of the saccadic event to a model profile known to resemble the empirical dynamics of the gaze trajectory. The research described herein proposes an alternative approach to velocity profile-based prediction aimed at reducing latency. Namely, third-order statistical features computed during a finite window at the saccade onset are mapped to the duration and characteristic parameters of the previously proposed scaled Gaussian profile function via a linear support vector machine regression model, with an offline fitting process performed over the entire saccade duration. Prediction performance is investigated for a variety of window sizes for a data set consisting of 9,109 horizontal saccades of a minimum mandated data quality induced by a 30-degree step stimulus. An RMS saccade amplitude prediction error of 1.5169° is observed for window durations of one-quarter of the saccade duration using the newly proposed method. Moreover, the method is demonstrated to reduce prediction execution time by three orders of magnitude versus techniques mandating online fitting.

Keywords: Eye movement prediction · Gaze-contingent rendering · Foveated rendering

1 Purpose

While gaze-contingent rendering systems (GCRS) offer tremendous potential for enhancing the user experience in virtual reality (VR) environments, latency concerns during saccadic eye movements remain an area of open interest in the academic literature [1, 2]. To address these limitations, a variety of techniques for predicting the landing position at the onset of saccadic events continue to be proposed [3, 4]. A subclass of these techniques develops predictions based upon fitting kinematic gaze data to a characteristic function known to resemble the empirical dynamics of saccadic trajectories [5]. Approaches fitting eye velocity data to a model profile consistent with the main sequence relationship between saccade velocity, amplitude, and duration [6], hereby referred to as velocity profile methods, have been previously considered. While promising in principle with respect to their capacity to produce physiologically-meaningful gaze location estimates across the entire saccadic duration through direct integration, current approaches instead use profile parameters obtained from the fitting process as predictor values in a linear regression model for amplitude determination.
Furthermore, application of this technique assumes the feasibility of performing the requisite optimization for fitting in an online capacity, which may prove challenging depending upon the computational capacity of the deployment hardware, along with the specific profile model and optimization algorithm utilized [7].

The research described herein seeks to address these concerns by introducing an alternative technique for velocity profile-based prediction of saccade landing position. The proposed approach performs the requisite profile fitting process in an offline training process. This modification allows for fitting over the entire saccade duration, thereby improving adherence to the model profile versus online methods fitting only over the initial portion of the saccade. Using these results, linear support vector machine regression models are developed which map simplistic features computed over a finite duration window occurring near the saccade onset to both the parameter sets defining the profile function, along with the saccade duration. These models are subsequently utilized in online operation, thus providing physiologically-meaningful estimates of the saccadic trajectory throughout its duration without requiring the previously mandated online fitting process. Results are presented for a data set consisting of 9,109 horizontal saccades induced by a 30-degree step stimulus, each subjected to specified quality inclusion criteria. Details regarding the experimental procedure, data quality filtering, algorithm development and analysis, and plans for further research are provided in the remainder of this manuscript.

2 Background/Significance

Eye tracking technology has long been employed across a variety of research domains. Specific applications range from fundamental endeavors, such as exploring the nature of information processing through the human visual system (HVS) [8], to more applied efforts, including applications in visual marketing [9] and biometrics [10]. Commercial interest in the technology has recently accelerated, as indicated by considerable acquisition activity in the space (i.e. Google's acquisition of Eyefluence, Facebook's acquisition of Eye Tribe, Apple's acquisition of SensoMotoric Instruments (SMI), etc.). Amongst emerging applications, eye tracking is especially promising for integration within VR environments, due to its potential to improve display performance through application of gaze-contingent rendering paradigms [11].

GCRS operate by varying display content as a function of the user's assumed point of gaze, which is obtained through use of an eye tracker. Such foveated rendering strategies exploit the inherent asymmetry in visual acuity across the HVS, where high quality vision is isolated in the center of the field. This asymmetry is associated with the dense concentration of photoreceptors in the fovea, along with the supporting processing capacity throughout the remainder of the visual pathway [12]. While GCRS have received attention in the literature for research investigating the unique contributions of central and peripheral vision during various tasks (i.e. reading [13], visual search [14], etc.), commercial applications seeking to enhance display performance through improved efficiency and reduced latency have also been considered. Namely, studies modulating various determinants of display quality, such as spatial resolution [15] and color [16], have been investigated.
While the specifications of display and eye tracking hardware are continuously improving, system latency remains a fundamental limitation for implementing GCRS [17]. Latency concerns are especially pronounced during the rapid eye movements between points of fixation known as saccades, where substantial misalignment between the optimized display region and true gaze location may occur. While saccadic suppression is generally believed to mitigate the effect of misalignment by reducing the sensitivity of the HVS during the saccadic event, examples of intrasaccadic perception have been noted in the literature [18, 19]. Moreover, such misalignments are problematic after the saccade has ended, as evidence suggests that perception is restored rapidly (between 10 and 50 ms) after completion [20]. To help avoid misalignments in the presence of saccades, GCRS may utilize saccade landing position prediction (SLPP) techniques, in which the subsequent display update is adjusted based upon the anticipated gaze landing point. Predictions are performed at the initiation of the saccadic event as identified using various online eye movement classification algorithms (i.e. I-VT, etc.) [21].

A variety of techniques for SLPP have been proposed in the literature over the prior two decades. While diverse in their approach, recent research [4] has proposed a partition of current methods into those regressing data onto a specific model motivated by the anatomy and physiology of the underlying oculomotor system, and those which operate independently of such models. With respect to model-based algorithms, techniques leveraging functions derived from an underlying oculomotor plant model [22], along with those approaches which assume a model profile function based upon empirical observations of eye movement trajectories, have been proposed [5, 7]. Amongst the latter class of solutions, algorithms performing standard linear regression [3], along with an approach based upon a Taylor series expansion [4], have been demonstrated.

3 Methods

3.1 Experimental Procedure

Data was obtained from an eye-tracking study conducted at Texas State University in 2014 under a protocol approved by the Institutional Review Board. A total of 335 participants (178 male, 157 female), ranging in age from 18 to 46, were initially enrolled in the study, which required completion of a variety of tasks aimed at investigating multiple oculomotor behaviors of interest (i.e. performing horizontal and oblique saccades under the induction of a stimulus, reading, etc.). Of those initial enrollees, 322 participants completed two consecutive sessions of the horizontal stimulus (HS) task under consideration within this research.

Within the HS task, saccades were induced by varying a stimulus along the horizontal axis of a 474 × 297 mm (1680 × 1050 pixel) Viewsonic 22″ display in a 30-degree step-wise fashion. Participants were positioned 550 mm from the black background display. The utilized stimulus was a white circle of diameter corresponding to approximately 1° of the visual angle, which enclosed a smaller black circle to promote focus at the center. Beginning at the origin, the stimulus displaced horizontally, oscillating between −15° and 15° for 100 iterations, remaining stationary for 1 s between each step. Oculomotor behaviors were recorded using an SR EyeLink 1000 eye tracking sensor.
The sensor performs monocular eye tracking at a sampling rate of 1000 Hz, with a specified typical accuracy of 0.25–0.50° and a spatial resolution of 0.25° during saccadic events. An example of the raw data output of the eye tracker over a HS task session is depicted in Fig. 1.

Fig. 1. Sample eye tracker output (Subject 1, Trial 1).

3.2 Data Inclusion Criteria

To ensure adequate data quality, inclusion criteria were established at both the session and event level. Namely, session-level data was screened according to the mean accuracy computed during post-calibration verification, along with the portion of lost data and spatial precision computed during each session. Intra-recording precision was computed as the root-mean-square (RMS) value of the inter-sample angular distances [23] occurring during classified inter-stimuli fixation events of at least 500 ms duration, with fixation events identified using an offline eye movement classifier described in [24]. A visualization of two classified fixation events of varying duration occurring during the stimulus stationary period is depicted in Fig. 2.

Fig. 2. Visualization of varying duration fixation events occurring during the stationary stimulus interval for precision computation.

The distribution of all three session-level inclusion metrics across the 644 sessions is depicted in Fig. 3, with the associated inclusion thresholds summarized in Table 1.

Fig. 3. Distribution of session-level data quality inclusion metrics across the candidate data set.

To produce a symmetrical data set (i.e. two sessions per participant), the matching session for each participant was also removed for records violating session-level inclusion criteria. The resulting data set after preliminary quality filtering consisted of 91 subjects, having a mean accuracy of 0.3908° ± 0.1044°, a portion of lost data during recording of 0.8724% ± 0.7570%, and a precision of 0.0149° ± 0.0058° (mean ± std).

Table 1. Session-level data quality thresholds
Data quality inclusion metric           | Threshold value
Maximum mean accuracy                   | 0.6° of the visual angle
Maximum proportion of lost data samples | 3%
Minimum intra-recording mean precision  | 0.05°

Additional inclusion criteria were applied on the saccadic event level, with events identified using the aforementioned offline eye movement classification algorithm. Namely, all classified saccades whose amplitudes were not consistent with the induced stimulus (i.e. corrective saccades, partitions of the stimulus interval into two saccadic events, etc.) were discarded. Moreover, events exhibiting any lost data samples, or physiologically infeasible eye velocities, were also removed from the analysis set. Finally, to remove scenarios in which classifier timing errors may corrupt results due to either delayed detection or premature termination, a maximum initial and final velocity value was also mandated. Saccadic event-level exclusion criteria are summarized in Table 2.

Table 2. Event-level data quality thresholds
Data quality inclusion metric       | Threshold value
Allowable amplitude range           | 28°–32°
Maximum number of lost data samples | 0%
Maximum velocity                    | 800°/s
Maximum initial and final velocity  | 100°/s

The aggregate application of session and event level data inclusion criteria produced an analysis data set of 9109 saccades. The distribution of amplitudes of classified saccadic events in both the original and analysis data sets is depicted in Fig. 4.
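As a concrete illustration of the precision metric used in the session-level screening above, the following MATLAB fragment computes the RMS of the inter-sample angular distances for a single classified fixation event; the variable names and the assumption that gaze samples are already expressed as angular positions in degrees are ours, not taken from the paper.

```matlab
% Sketch only: RMS inter-sample precision for one fixation event, assuming
% x and y are column vectors of gaze angles (deg) sampled at 1000 Hz during
% a classified fixation of at least 500 ms.
dx = diff(x);
dy = diff(y);
interSample = sqrt(dx.^2 + dy.^2);            % angular distance between consecutive samples
precisionRMS = sqrt(mean(interSample.^2));    % RMS of inter-sample angular distances (deg)
```

The session-level value would then be obtained by averaging this quantity over all qualifying fixation events in the recording.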
Fig. 4. Distribution of saccade amplitudes for the entire classifier output and the analysis subset.

3.3 Analysis Methods

The scaled Gaussian velocity profile specified in (1), originally introduced in [7] for SLPP applications, was employed as a model velocity function within this work:

$$\hat{v}(t;p) = a \cdot e^{-\left(\frac{t-b}{c}\right)^{2}} \tag{1}$$

where $p = [a, b, c]^{T}$ denotes the characteristic parameter vector of the profile function, $a$ is a scaling parameter representing the maximum saccade velocity, $b$ is a location parameter representing the time of occurrence of the maximum velocity, and $c$ is a shape parameter related to the width of the profile.

To begin, an offline procedure was performed for each element of the analysis set, where optimal parameter values were computed by fitting the velocity data of each sample over the entire saccadic event to the profile function in (1) via non-linear least squares optimization, as specified in (2):

$$\min_{p} \sum_{i} r_{i}^{2}, \quad r_{i} = v_{i} - \hat{v}(t_{i};p), \quad \text{s.t. } p_{i} \in I_{i} \tag{2}$$

where $\sum_{i} r_{i}^{2}$ is the residual sum of squares loss function, $v$ is the velocity data computed from the eye tracker output using a second order Savitzky–Golay filter, and $I_{i}$ is the interval bound on the $i$th component of the parameter vector. To control for variability associated with classifier performance with respect to detection of the saccade onset, all records were adjusted such that any preliminary data for which the radial velocity was below 20° per second was truncated (i.e. reducing excessive data in the case of premature detection; no such adjustments were performed for late detection cases, as they were addressed in the data pre-filtering process). Interval bounds were established using physiological information and empirical analysis as a function of the local data profile as follows: $a \in [0.9\,v_{max},\ 1.1\,v_{max}]$, $b \in [0.7\,\tfrac{D}{2},\ 1.3\,\tfrac{D}{2}]$, $c \in [0,\ 1.3\,\tfrac{D}{2}]$, where $v_{max}$ is the maximum value of the velocity sample, and $D$ is the duration of the velocity sample. All fitting operations were performed using the MATLAB fit function, which performs non-linear least squares optimization using the Levenberg–Marquardt algorithm.

Next, a feature set based upon the third-order statistics of the windowed time series was computed for the three durations of interest, as given in (3):

$$X_{W} = \left[v_{W}^{*},\ n_{v}\big|_{v_{W}=v_{W}^{*}},\ s(v_{W}),\ k(v_{W}),\ a_{W}^{*},\ n_{a}\big|_{a_{W}=a_{W}^{*}},\ s(a_{W}),\ k(a_{W})\right]^{T} \tag{3}$$

where $v_{W}$ and $a_{W}$ denote the fixed windowed velocity and acceleration data (determined as the traditional derivative of the velocity signal) of duration $W$, $(\cdot)^{*}$ denotes the maximum value of the windowed time series, and $s(\cdot)$ and $k(\cdot)$ denote the standard deviation and skewness operators, respectively. For the current experiment, the considered window durations were $W \in \{\tfrac{D}{2}, \tfrac{D}{4}, \tfrac{D}{8}\}$. The feature set was chosen in an ad-hoc fashion on the basis of preliminary simplicity, along with initial analysis and supporting domain intuition.

Once the feature set had been computed for the various window durations, predictive linear support vector machine regression models, one for each element of the characteristic parameter set and one for the saccade duration (indexed $j \in \{1, 2, 3, 4\}$), were developed. All models were obtained using the fitrsvm function in MATLAB under default algorithm hyperparameters, with 5-fold cross validation performed. A summary of the proposed modified prediction workflow versus the previously proposed online method is depicted in Fig. 5.
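To make the offline stage above concrete, the following MATLAB sketch fits one saccade's velocity trace to the scaled Gaussian profile and assembles the feature vector of (3) for a quarter-duration window. It assumes t and v are column vectors of time (s) and velocity (deg/s) for a single saccade, that X and Y hold the stacked features and one regression target across all saccades, and it relies on MATLAB's built-in gauss1 model, whose form a·exp(−((x−b)/c)²) matches (1). It is an illustration of the described workflow, not the authors' code.

```matlab
% Sketch only: offline profile fit and feature/SVM training for one target.
vmax = max(v);
D    = t(end) - t(1);

% Fit the scaled Gaussian profile (1) over the whole saccade, with the
% interval bounds given in the text.
lb = [0.9*vmax, 0.7*D/2, 0      ];
ub = [1.1*vmax, 1.3*D/2, 1.3*D/2];
g  = fit(t, v, 'gauss1', 'Lower', lb, 'Upper', ub);
p  = [g.a1, g.b1, g.c1];                       % characteristic parameters for this saccade

% Feature vector (3) for a fixed onset window of duration W = D/4.
w  = t <= t(1) + D/4;
vw = v(w);
aw = diff(vw) ./ diff(t(w));                   % acceleration as the velocity derivative
Xw = [max(vw), find(vw == max(vw), 1), std(vw), skewness(vw), ...
      max(aw), find(aw == max(aw), 1), std(aw), skewness(aw)];

% With features X (one row per saccade) and a target Y (a, b, c, or D),
% train a cross-validated linear SVM regression model as described above.
mdl = fitrsvm(X, Y, 'KernelFunction', 'linear', 'KFold', 5);
```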
A visualization of profile estimates for the various fixed window durations considered herein is depicted in Fig. 6. As noted, all three symmetrical estimates are unable to model the demonstrated skewness of the velocity data associated with large amplitude saccades.

Fig. 5. Proposed workflow for modified velocity profile-based SLPP.

Fig. 6. Predicted velocity profiles for varying window durations.

4 Results

The online amplitude estimation procedure depicted in Fig. 5 was employed across the entire analysis data set. The RMS error of the saccade amplitude prediction was used as a metric to evaluate prediction accuracy. The requisite computational time, as quantified using the internal timer available in MATLAB through the native tic and toc functions (Intel i7-7500U processor, 16 GB RAM), was also recorded. Amplitude estimates were formulated on kinematic principles as denoted in (4); integrations were estimated numerically in MATLAB using the trapz function:

$$E_{i} = \hat{A}_{i} - A_{i} = \int_{0}^{D} \hat{v}(t)\,dt - A_{i} \tag{4}$$

To perform preliminary benchmarking of the efficacy of the proposed approach, amplitude estimates were also developed using a variation of the technique described in [7]. Namely, fitting of the velocity data for fixed window durations was performed online in a manner identical to that presented above for the offline training procedure introduced herein. A linear regression model (performed using MATLAB's fitlm function) was then developed using 5-fold cross validation to estimate the saccade amplitude as a function of the four parameters proposed in the original work, as estimated from the online fitting procedure (i.e. $a$, $b$, $c$, and $c/a$). It should be noted that this method does not provide an estimate of the velocity trajectory over the remainder of the saccade duration, due to its inability to directly estimate the saccade duration. This benchmarking approach differs slightly from that originally proposed in [7], in that a rolling window with convergence criteria is replaced by the fixed windows to promote comparability between the two methods.

The RMSE of the amplitude predictions is presented in Table 3 for both the newly proposed method and the benchmarking algorithm. Corresponding mean execution times required for each prediction are presented in Table 4. While the traditional method produces improved accuracy bounded by a factor of 2 across the various durations considered, the newly proposed method reduces execution time by three orders of magnitude for the computational workflow (i.e. algorithm and architecture parameters) used in this analysis. Furthermore, for both methods, inclusion of a larger portion of the saccade duration within the prediction provides either limited or no marginal improvement in prediction accuracy. For the newly proposed method, the reduction in accuracy observed when expanding the window duration from D/4 to D/2 may be associated with a reduction in the diversity of the considered feature set (for example, in the limiting sense where the duration includes the profile peak, the maximum velocity feature should be nearly identical across saccades, as suggested by the main sequence relationship for the constant step stimulus used in data generation).

Table 3. Comparative RMSE accuracy
Window duration | RMSE(E_i), New method (°) | RMSE(E_i), Traditional method (°)
W = D/8         | 1.6917                    | 0.9758
W = D/4         | 1.5169                    | 0.9624
W = D/2         | 1.7006                    | 0.9408

Table 4. Comparative mean execution times
Window duration | Mean exec. time, New method (s) | Mean exec. time, Traditional method (s)
W = D/8         | 19.1 × 10⁻⁶                     | 32.2 × 10⁻³
W = D/4         | 18.3 × 10⁻⁶                     | 29.1 × 10⁻³
W = D/2         | 17.1 × 10⁻⁶                     | 29.5 × 10⁻³
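For completeness, a minimal MATLAB sketch of the online amplitude estimate in (4) is given below; aHat, bHat, cHat, and dHat stand for the profile parameters and duration predicted by the regression models, and A for the reference amplitude, all of which are assumed variable names rather than the authors' notation.

```matlab
% Sketch only: amplitude estimate (4) from the predicted profile parameters.
tGrid = linspace(0, dHat, 1000);                    % time grid over the predicted duration
vHat  = aHat .* exp(-((tGrid - bHat) ./ cHat).^2);  % predicted velocity profile (1)
AHat  = trapz(tGrid, vHat);                         % numerical integration (trapz, as in the text)
E     = AHat - A;                                   % amplitude prediction error

% Per-prediction execution time can be measured with MATLAB's native timer:
tic; AHat = trapz(tGrid, vHat); elapsedSeconds = toc;
```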
Preliminary investigation has been conducted attempting to identify common sources of error in amplitude estimates for the newly proposed method. Namely, manual investigation of the worst-case estimates across the data set has been performed, with initial analysis indicating that estimates are particularly corrupted for those velocity profiles deviating from the ideal case (i.e. noisy profiles whose dynamics are not consistent with the ideal scenario of a concave function). While additional pre-filtering may be utilized to remove these results in subsequent analysis attempting to quantify the best-case performance of the proposed approach, options for best handling such noisy profiles in practice are of primary concern in future research.

5 Conclusions

A novel method for reducing the latency of existing velocity profile-based SLPP algorithms is introduced and explored herein. Rather than performing the requisite fitting process for determination of the characteristic parameter set in real time, the proposed method uses linear SVM mappings relating simplistic third-order statistical features computed during fixed duration windows at the saccade onset to both the profile's characteristic parameter set and the saccade duration. Models are developed offline based upon fitting conducted over the entire saccade duration. This proposed methodology offers the benefit of producing physiologically-meaningful saccade landing position predictions without requiring the online solution of the underlying non-linear optimization problem mandated in determining the characteristic parameter set for the previously proposed scaled Gaussian profile. Benchmarking versus a slight variation of the previously proposed technique demonstrated that although RMSE prediction accuracy was reduced on the order of a factor of 2 (corresponding to an RMSE percent accuracy reduction of 2.25% computed for the ideal step stimulus amplitude), the requisite execution time is reduced by 3 orders of magnitude for the computational workflow considered herein. For all cases considered, increasing window duration provided limited to no marginal improvement in prediction accuracy. This latter result is promising for enhancing prediction speed in practical implementations.

While the reported results are encouraging, their generalizability is inherently limited by the level of pre-filtering that was performed to yield empirical profiles resembling the produced model functions. This analysis approach was chosen to establish a performance baseline for the highest possible data quality conditions. Future research will establish the performance of various saccade prediction methods in cases of varied data quality, and for a more diverse set of amplitude values and directions (i.e. vertical and oblique saccades).
Furthermore, subsequent work will attempt to optimize the general workflow introduced herein through application of standard best practices in regression approximation, including utilization of traditional feature selection algorithms, consideration of alternative regression models and optimization of associated hyperparameters, along with the consideration of alternative velocity profiles suitable for modeling a broader range of trajectories encountered in practice, such as the skewed model profile based upon the Wald distribution recently proposed [25]. This latter modification is especially promising for predicting the known skewed velocity profiles of large amplitude saccades.

References

1. Padmanaban, N., Konrad, R., Stramer, T., Cooper, E.A., Wetzstein, G.: Optimizing virtual reality for all users through gaze-contingent and adaptive focus displays. In: Proceedings of the National Academy of Sciences, p. 201617251 (2017)
2. Albert, R., Patney, A., Luebke, D., Kim, J.: Latency requirements for foveated rendering in virtual reality. ACM Trans. Appl. Percept. 14(4), 25 (2017)
3. Arabadzhiyska, E., Tursun, O.T., Myszkowski, K., Seidel, H.-P., Didyk, P.: Saccade landing position prediction for gaze-contingent rendering. ACM Trans. Graph. 36(4), 50 (2017)
4. Wang, S., Woods, R.L., Costela, F.M., Luo, G.: Dynamic gaze-position prediction of saccadic eye movements using a Taylor series. J. Vis. 17(14), 3 (2017)
5. Han, P., Saunders, D.R., Woods, R.L., Luo, G.: Trajectory prediction of saccadic eye movements using a compressed exponential model. J. Vis. 13(8), 27 (2013)
6. Bahill, A.T., Clark, M.R., Stark, L.: The main sequence, a tool for studying human eye movements. Math. Biosci. 24(3–4), 191–204 (1975)
7. Paeye, C., Schütz, A.C., Gegenfurtner, K.R.: Visual reinforcement shapes eye movements in visual search. J. Vis. 16(10), 15 (2016)
8. Rayner, K.: Eye movements in reading and information processing: 20 years of research. Psychol. Bull. 124(3), 372 (1998)
9. Wedel, M., Pieters, R.: A review of eye-tracking research in marketing, pp. 123–147. Emerald Group Publishing Limited (2008)
10. Bednarik, R., Kinnunen, T., Mihaila, A., Fränti, P.: Eye-movements as a biometric, pp. 780–789 (2005)
11. Patney, A., et al.: Towards foveated rendering for gaze-tracked virtual reality. ACM Trans. Graph. 35(6), 179 (2016)
12. Banks, M.S., Sekuler, A.B., Anderson, S.J.: Peripheral spatial vision: limits imposed by optics, photoreceptors, and receptor pooling. J. Opt. Soc. Am. A 8(11), 1775 (1991)
13. Rayner, K.: The gaze-contingent moving window in reading: development and review. Vis. Cognit. 22(3–4), 242–258 (2014)
14. Nuthmann, A.: How do the regions of the visual field contribute to object search in real-world scenes? Evidence from eye movements. J. Exp. Psychol. Hum. Percept. Perform. 40(1), 342 (2014)
15. Prince, S.J., Rogers, B.J.: Sensitivity to disparity corrugations in peripheral vision. Vis. Res. 38(17), 2533–2537 (1998)
16. Duchowski, A.T., Bate, D., Stringfellow, P., Thakur, K., Melloy, B.J., Gramopadhye, A.K.: On spatiochromatic visual sensitivity and peripheral color LOD management. ACM Trans. Appl. Percept. 6(2), 9 (2009)
17. Saunders, D.R., Woods, R.L.: Direct measurement of the system latency of gaze-contingent displays. Behav. Res. Methods 46(2), 439–447 (2014)
18. Diamond, M.R., Ross, J., Morrone, M.C.: Extraretinal control of saccadic suppression. J. Neurosci. 20(9), 3449–3455 (2000)
19.
Mathôt, S., Melmi, J.-B., Castet, E.: Intrasaccadic perception triggers pupillary constriction. PeerJ 3, e1150 (2015)
20. Anliker, J.: Eye movements: online measurement, analysis, and control. In: Eye Movements and Psychological Processes (1976)
21. Salvucci, D.D., Goldberg, J.H.: Identifying fixations and saccades in eye-tracking protocols, pp. 71–78 (2000)
22. Bahill, A.T., Latimer, J.R., Troost, B.T.: Linear homeomorphic model for human movement. IEEE Trans. Biomed. Eng. 11, 631–639 (1980)
23. Holmqvist, K., Nyström, M., Mulvey, F.: Eye tracker data quality: what it is and how to measure it, pp. 45–52 (2012)
24. Friedman, L., Rigas, I., Abdulin, E., Komogortsev, O.V.: A novel evaluation of two related and two independent algorithms for eye movement classification during reading. Behav. Res. Methods (2018)
25. Griffith, H., Biswas, S., Komogortsev, O.V.: Towards improved saccade landing position estimation using velocity profile methods. In: IEEE SoutheastCon 2018, St. Petersburg, FL (2018)

Wireless Power Transfer Solutions for 'Things' in the Internet of Things

Tim Helgesen(✉) and Moutaz Haddara
Westerdals – Oslo School of Arts, Communication and Technology, Oslo, Norway
Timrobbyh@gmail.com, Hadmoa@westerdals.no

Abstract. The Internet of Things (IoT) has several applications in various industries and contexts. During the last decade, IoT technologies were mainly dominated by the supply chains and warehouses of large manufacturers and retailers. Recently, IoT technologies have been adopted in virtually all other fields, including healthcare, smart cities, and self-driving cars. While the opportunities for IoT applications are endless, challenges do exist. These challenges can be broadly classified as social, political, organizational, privacy, security, environmental, and technological challenges. In this paper, we focus on one dimension of the technological challenges, specifically on how IoT products/devices can be powered and charged without interruption, while either in use or in motion, since they are known to be intensively power-consuming objects. This literature review paper explores how the emerging technology of Wireless Power Transfer (WPT) could aid in solving power and charging problems for various IoT devices. Our findings suggest that, in theory, WPT can indeed be used to solve the charging and power challenges of IoT's intelligent devices, or "things". However, we found that human exposure and safety, industrial context, environmental issues, and cost of technology are important factors that could affect WPT adoption in organizations.

Keywords: Wireless power transfer · Internet of Things · Wireless energy transfer · Literature review

1 Introduction

The Internet of Things (IoT) domain has increased in popularity and research focus in recent years, and is sometimes even described as the next big thing, much like the internet back in its early days [1]. IoT can be broadly described as a cyber-physical network where "smart" objects, or "things", communicate and cooperate with each other (and with humans) to create new applications or services to achieve a common goal [2, 3]. There are several formal definitions of IoT, and Vermesan et al. [4] proposed an ideal one: "The Internet of Things could allow people and things to be connected anytime, anyplace, with anything and anyone, ideally using any path/network and any service." [4, p. 12].
Through this connection of people and things, the goal is to achieve a better world where things know what we like, what we want, and give them to us with minimal human intervention [5]. Yet some simply describe IoT as increased machine-to-machine communication. However, as pointed out by Isenberg et al. [6], at its core the Internet of Things is more than just communication technologies; it goes beyond communication, as it endows the individual object, or "thing", with intelligence. These intelligence-equipped objects, or "Smart/Intelligent Products", can be described as physical objects equipped or coupled with computational software [6, 7]. Wong et al. [8] proposed several requirements for intelligent products: (1) the object should have a unique identification, (2) be able to communicate with its environment, such as other objects, (3) retain or store data about itself, (4) deploy a language to display its features, production requirements, etc., and (5) participate in or make decisions relevant to its own destiny. These criteria must also be met to enable the interaction between things [6].

One of the main challenges related to these intelligent products is their power consumption; that is, the power they need to be able to perform their functions normally [6, 7]. These functions could include communication through wireless technology, or the use of sensors [6]. Power is limited because these objects often move around, and therefore need a self-sufficient energy source, such as batteries, to power the mentioned functions [9]. Power consumption is a challenge that could affect the decision of which wireless technology can be used and adopted, because of the potential latency in the communication [9], or it could be a potential performance bottleneck [10]. Another problem is that battery replacement could be costly, especially in large-scale deployments and IoT infrastructures [11]. Beyond the cost of replacement, discarded batteries add to the ever-increasing electronic waste issue.

One solution to the power consumption problem is "clustering", as proposed by López et al. [7]. Clustering gives the possibility to manage the power of the devices by electing so-called representative network "members", which have the responsibility to collect and forward all communication within the network. These members are elected based on their residual energy, where devices under a pre-set percentage of energy will not be elected. However, this solution only slows battery consumption, as battery charging or replacement is still needed at some point in time. Another potential solution is the use of Bluetooth Low Energy (BTLE) technology, which allows greater battery efficiency compared to other communication technologies [12]. But this solution, again, only slows the inevitable, which is the replacement of batteries.

The remainder of the paper is structured as follows. First, an overview of wireless power transfer technology opportunities and challenges is provided in Sect. 2. The research methodology is discussed in Sect. 3, followed by an overview of the reviewed articles in Sect. 4. Section 5 provides an overview of the literature review's main findings. A discussion is provided in Sect. 6. Finally, research conclusions are provided in Sect. 7.

2 Wireless Power Transfer

Wireless power transfer (WPT) technology (see Fig.
1) is a technique also known as wireless charging or wireless energy transfer (WET) [13]. WPT can be briefly explained as the process of transmitting electricity from one power system to another through an air gap via, for instance, an electromagnetic field or electromagnetic radiation [10]. Wireless charging happens when one of the transmitting systems is constantly powered, and therefore continues to transfer power until the other system/device is fully charged [14]. The object that transmits power is commonly referred to as the power source (e.g. a charging station), and the object that receives the power is commonly referred to as the energy-harvesting object, or simply the "load" (e.g. a robot) [15, 16].

Fig. 1. Generic wireless power transfer illustration.

While this technology has the potential to completely reshape the IoT landscape, there is little research surrounding wireless power transfer in the IoT context. The aim of this study is to explore the current literature and identify the potential use and applications of WPT technologies to wirelessly charge intelligent products, or things, and answer the following two main research questions:

• What wireless power transfer technologies could potentially solve the power challenges related to intelligent products?
• What are the challenges following the use of wireless power transfer technologies?

3 Methodology

Literature review papers represent a well-established method for accumulating existing, documented, and state-of-the-art knowledge within a domain of interest. In this article we have applied a systematic review approach as described by Webster and Watson [17]. This approach is characterized by adopting explicit procedures and conditions, and involves the use of a variety of procedures combined with various search criteria to minimize bias as much as possible [18].

The review covers articles published between 2007 and February 2018. We have narrowed down the search process through the condition that the articles need to be published in peer-reviewed journals, edited books, or conference proceedings. Moreover, no delimitation has been imposed on the outlets' field, to enable potential research results from various fields. The following search procedures have been applied to provide a comprehensive and systematic methodology.

1. An initial search was done through Google Scholar. The search option was limited to articles' titles. The keywords wireless charging, wireless power transfer, wireless energy transfer, IoT, internet of things, and their combinations were used.
2. Due to their high relevance for research, other research databases were used. These databases included the ACM Digital Library, IEEE Xplore Digital Library, EBSCOhost and Springer. The search procedure was restricted to the same keywords as in the previous step. In addition to the title area, the abstract and keyword parts of the articles have been included in the search.
3. In order to minimize the search results, we have put a constraint that the papers included in this review must have at least five citations.
4. Additionally, we conducted a secondary search through scanning all of the selected articles' reference lists, to identify further potential literature sources.
5. The articles' abstracts were then carefully read by both authors to check their relevance for this review paper.
Only articles directly addressing wireless power transfer technologies within the IoT domain were selected.
6. Based on the preliminary review, two main categories of wireless transfer technology ranges were identified. Hence, the articles were classified into two main groups: near-field and long-field power transfer technologies.

The authors independently classified the articles into a concept matrix [17], which included the research themes. The results were then compared and discussed in order to achieve a consensus on each article's classification. It is important to mention that an article could fall into one or more themes, based on the article's technology focus. One of the main limitations of this research methodology is that some potentially relevant papers may have been omitted because they did not meet our condition of a minimum number of citations. Omitted research papers that were more recent, and therefore had a low number of citations, particularly affect the scope of this literature review.

4 Overview of the Articles

In total, we reviewed thirty articles that were published in various outlets; of these, 24 are journal articles, 1 is a conference proceeding, and 5 are articles in books. As seen in the following figure (Fig. 2), the review shows a gradual increase in research interest in wireless power transfer, with a maximum of 9 publications in 2016.

Fig. 2. Number of publications per year.

5 Main Findings

In the literature, several potential wireless power transfer technologies were identified and split into two main categories, near-field and long-field wireless charging technologies, as shown in the following table and discussed in this section (Table 1).

Table 1. Overview of research topics and their corresponding papers.
Range category | WPT technology                           | Papers
Near-field     | Inductive Power Transfer (IPT)           | [10, 15, 19–24]
Near-field     | Resonant Inductive Power Transfer (RIPT) | [10, 14–16, 25–32]
Near-field     | Capacitive Power Transfer (CPT)          | [33, 34]
Long-field     | Radio Frequency (RF) radiation           | [13, 35–38]
Long-field     | Microwave Power Transfer (MPT)           | [10, 15, 38–40]
Long-field     | Laser Power Transfer (LPT)               | [10, 41, 42]

5.1 Near-Field Power Transfer

(1) Inductive Power Transfer
Inductive power transfer (IPT), also known as inductive coupling, transfers power from one coil to another, and has been used for powering RFID tags and medical implants [26]. The field IPT generates is in the kilohertz range, and is typically used within a few millimeters to a few centimeters (20 cm) of the targeted load [15]. Power varies between watts and kilowatts depending on transmission efficiency [33]. The transmission efficiency decreases as range increases, and even more so if there is any misalignment between the coils [23, 25]. Beyond misalignment, any change to the range requires the coils to be recalibrated [39]. Loss of electricity through misalignment, range, or metallic objects between the coils will lead to an increase in heat [14, 15]. Due to its low transmission efficiency, the field is considered safe for humans [15]. In the IoT domain, this technology has been recommended for several applications. For example, Rim and Mi [24] explored the possibilities of wireless power transfer to electric vehicles and other mobile devices.

(2) Resonant Inductive Power Transfer
One of the earliest implementations of resonant inductive power transfer (RIPT) is Nikola Tesla's magnifying transmitter, or coil [43] (Fig. 3).
The magnifying transmitter succeeded in wirelessly transmitting power to power-harvesting objects, such as lamps, as shown in Fig. 3. Resonant inductive power transfer follows the same basic principles as IPT. However, this technology makes use of magnetic resonant coils, which operate at the same resonance frequency [10]. This technique creates a stronger coupling, and therefore increases the potential range and efficiency. The first documented optimal use of RIPT for WPT was performed by Kurs et al. [28], and achieved a transmission efficiency of around 90% at 1 m, and 40% at 2 m. Power varies between watts and kilowatts depending on transmission efficiency [34]. As with IPT, the transmission efficiency decreases as range increases, though RIPT has proven to have a longer range and better efficiency [15, 20, 28]. As with IPT, RIPT requires calibration for each change made to the distance or coil [39]. RIPT technology can charge multiple receivers at the same time, even if the receivers are out of sight [15]. As with IPT, the resonant field is considered safe for humans, which was shown by Ding et al. [19]. Thus, Bito et al. [32] have developed a real-time, electrically controlled wireless charging infrastructure and algorithms that can be used to recharge biomedical and implanted devices (e.g. pacemakers). This could effectively abolish the need for the surgical procedures that are currently necessary for occasional battery replacement.

Fig. 3. Tesla's magnifying transmitter wirelessly powering a lamp [44].

(3) Capacitive Power Transfer
Capacitive power transfer (CPT) is a coupling made up of two metal surfaces where electricity is transferred at the points of contact [33]. Though potentially cheaper than IPT and RIPT, CPT requires close contact between the two metal surfaces. Hence, it is greatly limited by range requirements [27, 33, 34]. CPT technology has only recently seen kilowatt-scale loads, and was overlooked until 2008, which could explain this [33].

5.2 Long-Field Power Transfer

(1) Radio Frequency Radiation
Radio frequency (RF) radiation uses radio frequencies emitted from an antenna to carry radiant energy [10]. It can send power from a meter up to several kilometers, depending on the technique used [15]. However, it has a very low efficiency rate, and requires line of sight to deliver power [29]; with regard to the low efficiency rate, one project reported a transmission efficiency of around 1% at 30 cm [10]. It also needs to know the location of the intended target [15]. Due to its health risks through exposure, radio frequency is commonly used and operated in low-power areas [15]. Boshkovska et al. [35] proposed a simultaneous wireless information and power transfer (SWIPT) model that enables information and power to be transferred wirelessly on the same waveforms. This model also extends the possibilities for IoT energy-harvesting devices, which also need continuous communication [35, 40]. One of the paramount obstacles for far-field wireless power implementations is the end-to-end power transfer efficiency and the optimization needed to increase the direct current power level at the output of the rectenna (energy harvester), without the need to increase the transmission power and waveform output [36].
Through simulations, Clerckx and Bayguzina [36, 37] and Huang and Clerckx [45] have provided models and algorithms that could potentially increase the transmission output of the waveforms and decrease power loss during radio-frequency-to-direct-current conversion in far-field transmissions.

(2) Microwave Power Transfer
Microwave power transfer (MPT) is a technique that increases transmission efficiency and range through, for instance, a parabolic dish, which focuses the radio waves [14, 22]. However, MPT requires complicated tracking mechanisms and a large scale of devices [15]. Galinina et al. [22] have proposed a framework for applying MPT techniques to transfer power to 5G devices, such as wearables, through beacons that facilitate a continuous supply of power, creating self-sustainable devices. Finally, Di Renzo and Lu [38] developed a stochastic mathematical model to analyze and optimize low-energy cellular-enabled mobile devices that have dual wireless information and beam power transmission capabilities.

(3) Laser Power Transfer
Another long-field technique is optical laser power transfer (LPT), which transmits power at visible or near-infrared frequencies [10]. However, like MPT, it requires complicated tracking mechanisms and a large spectrum of devices [15]. One of the potential applications of LPT is Industry 4.0, otherwise known as the 4th industrial revolution [2]. On a larger scale, with the emergence of cloud computing and the current advancements in mobile networks, billions of heterogeneous smart devices with different application requirements are connected to networks, and are currently generating large volumes of data that need to be processed in distributed cloud infrastructures [42]. Hence, Munoz et al. [42] have presented a platform that is currently under development, which utilizes fifth-generation (5G) mobile network technologies to develop new radio interfaces to cope with the exponential traffic growth, and integrates diverse networks from end to end with distributed cloud resources to deliver E2E IoT and mobile services. Moreover, a paper by Liu et al. [41] explored the possibilities of transforming the current Chinese power grid into a smart grid to enable IoT applications. The paper focuses on optical/laser technologies as enablers for IoT devices' communication and wireless charging through the grid.

6 Discussion

The reviewed articles are spread across 20 different outlets. Among these outlets, we have recognized only one special journal issue focusing on wireless power transfer technologies within the IoT context. As research interest in WPT for IoT is increasing, research outlets should pay more attention to this domain. In general, 30 articles across a 12-year period is a low number of publications. Although the need for research on WPT for IoT was recognized in previous literature, the amount of research conducted on this issue is still very limited. Thus, more research needs to be carried out in order to gather sufficient knowledge about this phenomenon, as WPT in IoT has not received appropriate attention compared to other IoT-related topics. Based on our literature review of WPT in IoT, in the following part we answer our research questions and present some research gaps and future research suggestions.

To answer the first question: what wireless power transfer technologies could potentially solve the power challenges related to intelligent products?
It is apparent that virtually all of the technologies identified in the literature could solve the device charging and power harvesting challenges that were discussed earlier in this paper. However, the decision of which of these technologies would be the best fit should be based on several factors. One factor is the target environment. For instance, one type of environment could be an industrial workplace, where intelligent devices are being used to inform users about exposure to hazardous equipment, such as in the case of Kortuem et al. [46]. Since this would most likely be a very open and dynamic environment, microwave power transfer could be used through the use of power beacons (PBs), as recommended by Huang and Lau [39]. Likewise, the use of a capacitive power transfer solution is also viable, where the smart object has to be placed on top of a charging platform when at rest, though this would require the device to hold out until it is charged. This technology is very similar to existing wireless mobile phone charging stations. The decision of which solution would be the best fit should take into consideration another factor: cost. Though costs might be reduced due to the absence of batteries that would need to be replaced, the high implementation costs of the technology are still factors to be considered. Implementation costs could include, for example, the price of replacing traditional charging cords with wireless chargers, and the cost of installing wireless power receivers in the intelligent products [15], though this would depend on the chosen technology. Cost is also affected by the required charging range, as long-range charging is not as effective as wired charging and therefore consumes more electricity. Another factor is the size of the object/device. It has been pointed out that both inductive and resonant inductive coupling require a relatively large receiver for effective long-range charging [15], though this most likely depends on the amount of power the device needs, as Cannon et al. [26] pointed out that one large source-coil transponder can be used to charge many small load-coil receivers. However, the most important factor should be the planned performance level of the smart object, as on-the-go charging allows for more power consumption, therefore opening the way for more functions. The goal should be to utilize this extra power to increase the performance of the smart object. To illustrate this, clustering, as explained in the introduction, was proposed to slow power consumption at the cost of real-time data and could lead to potentially disconnected environments. However, always having the power needed to perform their functions would give devices always-available real-time data, communication, and coordination, which is closer to the ideal definition of IoT.

Regarding the second research question: what are the challenges following the use of wireless power transfer technologies? There are some general challenges with the use of WPT technology. One of the paramount challenges is how businesses could outweigh the cost of acquiring and using the technology in terms of business value. Another challenge is to implement the technology in an optimal way, so that it does not disrupt or slow down business processes. In addition, the technology must be implemented in a way that does not pose any potential health risks to humans in the vicinity. Based on this review's findings, several research gaps have been identified.
For example, it is evident that the majority of the reviewed papers focused more on near-field wireless power transfer technologies than on the far-field context. As discussed earlier, the longer the range, the more wireless power is needed to charge distant objects, which could be inefficient and costly for the time being. Thus, more research is needed in order to find power optimization techniques among available power sources and power harvesting devices. It is also palpable that very little research has been conducted within the laser power transfer domain in far-field WPT. This lack of research could possibly be explained by the expensive infrastructure required to implement this technology. In addition, as virtually all of the papers reviewed are highly technical papers (mostly in IEEE outlets), there is also an apparent research gap on the business value and feasibility of the different WPT technologies from a business perspective. Furthermore, almost none of the papers have reported a real-world case study on WPT implementations within businesses. This could explain the slow adoption of WPT technology in this particular domain, as bridging research between technical and business issues is needed to reach managers and to increase businesses' awareness of such technologies.

7 Conclusions

This paper contributes to both research and practice by providing a comprehensive literature review on the potential of wireless power transfer technologies in the IoT domain. For practice, the paper sheds light on past and recent issues as well as challenges that can guide IoT consultants, vendors, and clients in their future projects. For researchers, the organization of the literature into the different WPT technologies can aid them in identifying the topics, findings, and gaps discussed in each technology of interest. Finally, we have provided our observations and future research suggestions that would enrich knowledge in this domain.

References

1. Sajid, O., Haddara, M.: NFC mobile payments: are we ready for them? In: SAI Computing Conference (SAI), 2016, pp. 960–967 (2016)
2. Haddara, M., Elragal, A.: The readiness of ERP systems for the factory of the future. Procedia Comput. Sci. 64, 721–728 (2015)
3. Misra, G., Kumar, V., Agarwal, A., Agarwal, K.: Internet of Things (IoT)—a technological analysis and survey on vision, concepts, challenges, innovation directions, technologies, and applications (an upcoming or future generation computer communication system technology). Am. J. Electr. Electron. Eng. 4, 23–32 (2016)
4. Vermesan, O., Friess, P., Guillemin, P., Gusmeroli, S., Sundmaeker, H., Bassi, A., et al.: Internet of Things strategic research roadmap. In: Internet of Things-Global Technological and Societal Trends, vol. 1, pp. 9–52 (2011)
5. Perera, C., Zaslavsky, A., Christen, P., Georgakopoulos, D.: Context aware computing for the Internet of Things: a survey. IEEE Commun. Surv. Tutor. 16, 414–454 (2014)
6. Isenberg, M.-A., Werthmann, D., Morales-Kluge, E., Scholz-Reiter, B.: The role of the Internet of Things for increased autonomy and agility in collaborative production environments. In: Uckelmann, D., Harrison, M., Michahelles, F. (eds.) Architecting the Internet of Things, pp. 195–228. Springer, Berlin (2011)
7. López, T.S., Brintrup, A., Isenberg, M.-A., Mansfeld, J.: Resource management in the Internet of Things: clustering, synchronisation and software agents. In: Uckelmann, D., Harrison, M., Michahelles, F. (eds.)
Architecting the Internet of Things, pp. 159–193. Springer, Berlin (2011)
8. Wong, Y., McFarlane, D., Zaharudin, A.A., Agarwal, V.: The intelligent product driven supply chain. In: 2002 IEEE International Conference on Systems, Man and Cybernetics, vol. 4, p. 6 (2002)
9. Mattern, F., Floerkemeier, C.: From the internet of computers to the Internet of Things. In: Sachs, K., Petrov, I., Guerrero, P. (eds.) From Active Data Management to Event-Based Systems and More, pp. 242–259. Springer, Berlin (2010)
10. Xie, L., Shi, Y., Hou, Y.T., Lou, A.: Wireless power transfer and applications to sensor networks. IEEE Wirel. Commun. 20, 140–145 (2013)
11. Miorandi, D., Sicari, S., De Pellegrini, F., Chlamtac, I.: Internet of Things: vision, applications and research challenges. Ad Hoc Netw. 10, 1497–1516 (2012)
12. Swan, M.: Sensor mania! The Internet of Things, wearable computing, objective metrics, and the quantified self 2.0. J. Sens. Actuator Netw. 1, 217–253 (2012)
13. Yuan, F., Jin, S., Wong, K.K., Zhao, J., Zhu, H.: Wireless information and power transfer design for energy cooperation distributed antenna systems. IEEE Access 5, 8094–8105 (2017)
14. Chawla, N., Tosunoglu, S.: State of the art in inductive charging for electronic appliances and its future in transportation. In: 2012 Florida Conference on Recent Advances in Robotics, pp. 1–7 (2012)
15. Lu, X., Wang, P., Niyato, D., Kim, D.I., Han, Z.: Wireless charging technologies: fundamentals, standards, and network applications. IEEE Commun. Surv. Tutor. 18, 1413–1452 (2016)
16. Lu, X., Wang, P., Niyato, D., Han, Z.: Resource allocation in wireless networks with RF energy harvesting and transfer. IEEE Netw. 29, 68–75 (2015)
17. Webster, J., Watson, R.T.: Analyzing the past to prepare for the future: writing a literature review. MIS Q. 26, xiii–xxiii (2002)
18. Bryman, A.: Social Research Methods. OUP, Oxford (2012)
19. Ding, P.-P., Bernard, L., Pichon, L., Razek, A.: Evaluation of electromagnetic fields in human body exposed to wireless inductive charging system. IEEE Trans. Magn. 50, 1037–1040 (2014)
20. Hui, S.Y.R., Zhong, W., Lee, C.K.: A critical review of recent progress in mid-range wireless power transfer. IEEE Trans. Power Electron. 29, 4500–4511 (2014)
21. Zhao, B., Kuo, N.-C., Niknejad, A.M.: An inductive-coupling blocker rejection technique for miniature RFID tag. IEEE Trans. Circuits Syst. I Regul. Pap. 63, 1305–1315 (2016)
22. Galinina, O., Tabassum, H., Mikhaylov, K., Andreev, S., Hossain, E., Koucheryavy, Y.: On feasibility of 5G-grade dedicated RF charging technology for wireless-powered wearables. IEEE Wirel. Commun. 23, 28–37 (2016)
23. Imura, T., Hori, Y.: Maximizing air gap and efficiency of magnetic resonant coupling for wireless power transfer using equivalent circuit and Neumann formula. IEEE Trans. Ind. Electron. 58, 4746–4752 (2011)
24. Rim, C.T., Mi, C.: Wireless Power Transfer for Electric Vehicles and Mobile Devices. Wiley, Hoboken (2017)
25. Beh, T.C., Kato, M., Imura, T., Oh, S., Hori, Y.: Automated impedance matching system for robust wireless power transfer via magnetic resonance coupling. IEEE Trans. Ind. Electron. 60, 3689–3698 (2013)
26. Cannon, B.L., Hoburg, J.F., Stancil, D.D., Goldstein, S.C.: Magnetic resonant coupling as a potential means for wireless power transfer to multiple small receivers. IEEE Trans. Power Electron. 24, 1819–1825 (2009)
27. Hui, S.: Planar wireless charging technology for portable electronic products and Qi. Proc.
IEEE 101, 1290–1301 (2013)
28. Kurs, A., Karalis, A., Moffatt, R., Joannopoulos, J.D., Fisher, P., Soljacic, M.: Wireless power transfer via strongly coupled magnetic resonances. Science 317, 83–86 (2007)
29. Xie, L., Shi, Y., Hou, Y.T., Sherali, H.D.: Making sensor networks immortal: an energy-renewal approach with wireless power transfer. IEEE/ACM Trans. Netw. 20, 1748–1761 (2012)
30. Choi, B.H., Thai, V.X., Lee, E.S., Kim, J.H., Rim, C.T.: Dipole-coil-based wide-range inductive power transfer systems for wireless sensors. IEEE Trans. Ind. Electron. 63, 3158–3167 (2016)
31. Yeo, T.D., Kwon, D., Khang, S.T., Yu, J.W.: Design of maximum efficiency tracking control scheme for closed-loop wireless power charging system employing series resonant tank. IEEE Trans. Power Electron. 32, 471–478 (2017)
32. Bito, J., Jeong, S., Tentzeris, M.M.: A real-time electrically controlled active matching circuit utilizing genetic algorithms for wireless power transfer to biomedical implants. IEEE Trans. Microw. Theory Tech. 64, 365–374 (2016)
33. Dai, J., Ludois, D.C.: A survey of wireless power transfer and a critical comparison of inductive and capacitive coupling for small gap applications. IEEE Trans. Power Electron. 30, 6017–6029 (2015)
34. Dai, J., Ludois, D.C.: Wireless electric vehicle charging via capacitive power transfer through a conformal bumper. In: 2015 IEEE Applied Power Electronics Conference and Exposition (APEC), pp. 3307–3313 (2015)
35. Boshkovska, E., Koelpin, A., Ng, D.W.K., Zlatanov, N., Schober, R.: Robust beamforming for SWIPT systems with non-linear energy harvesting model. In: 2016 IEEE 17th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), pp. 1–5 (2016)
36. Clerckx, B., Bayguzina, E.: Waveform design for wireless power transfer. IEEE Trans. Signal Process. 64, 6313–6328 (2016)
37. Clerckx, B., Bayguzina, E.: Low-complexity adaptive multisine waveform design for wireless power transfer. IEEE Antennas Wirel. Propag. Lett. 16, 2207–2210 (2017)
38. Renzo, M.D., Lu, W.: System-level analysis and optimization of cellular networks with simultaneous wireless information and power transfer: stochastic geometry modeling. IEEE Trans. Veh. Technol. 66, 2251–2275 (2017)
39. Huang, K., Lau, V.K.: Enabling wireless power transfer in cellular networks: architecture, modeling and deployment. IEEE Trans. Wirel. Commun. 13, 902–912 (2014)
40. Bi, S., Zeng, Y., Zhang, R.: Wireless powered communication networks: an overview. IEEE Wirel. Commun. 23, 10–18 (2016)
41. Liu, J., Li, X., Chen, X., Zhen, Y., Zeng, L.: Applications of Internet of Things on smart grid in China. In: 2011 13th International Conference on Advanced Communication Technology (ICACT), pp. 13–17 (2011)
42. Munoz, R., Mangues-Bafalluy, J., Vilalta, R., Verikoukis, C., Alonso-Zarate, J., Bartzoudis, N., et al.: The CTTC 5G end-to-end experimental platform: integrating heterogeneous wireless/optical networks, distributed cloud, and IoT devices. IEEE Veh. Technol. Mag. 11, 50–63 (2016)
43. Brown, W.C.: The history of power transmission by radio waves. IEEE Trans. Microw. Theory Tech. 32, 1230–1242 (1984)
44. Tesla, N.: The Problem of Increasing Human Energy: With Special Reference to the Harnessing of the Sun's Energy. Cosimo Inc., New York (2008)
45. Huang, Y., Clerckx, B.: Waveform optimization for large-scale multi-antenna multi-sine wireless power transfer.
In: 2016 IEEE 17th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), pp. 1–5 (2016)
46. Kortuem, G., Kawsar, F., Sundramoorthy, V., Fitton, D.: Smart objects as building blocks for the Internet of Things. IEEE Internet Comput. 14, 44–51 (2010)

Electronic Kintsugi
An Investigation of Everyday Crafted Objects in Tangible Interaction Design

Vanessa Julia Carpenter1, Amanda Willis2, Nikolaj "Dzl" Møbius3, and Dan Overholt1
1 Technical Doctoral School of IT and Design, Aalborg University, Copenhagen, Denmark {vjc,dano}@create.aau.dk
2 Simon Fraser University, Surrey, Canada
3 HumTek, Roskilde University, Roskilde, Denmark

Abstract. In the development of enhanced and smart technology, we explore the concepts of meaningfulness, tangible design and interaction with everyday objects through Kintsugi, the Japanese craft of repairing broken ceramics with gold. Through two workshops, this emergent design research develops an iterative prototype, Electronic Kintsugi, which explores how we can facilitate more human-to-human or human-to-self connection through a hybrid crafted everyday object. We identify three themes: (1) enhancing human connection through embedded or "magic" technology; (2) using everyday objects to prompt personal reflection and development; and (3) exploring transferable design principles of smart products with a device of undefined purpose, which converges traditional craft and technology.

Keywords: Craft · Internet of Things (IoT) · Tangible interaction · Everyday objects

1 Introduction

This work explores Kintsugi, the Japanese craft of repairing broken ceramics with gold, and explores how we can use capacitive touch to facilitate tangible interaction with an everyday, crafted object. We situate ourselves within interaction design and look to craft and tangible interaction related works. The grounding question for this work asks how we can facilitate more human-to-human or human-to-self connection through a digital/crafted hybrid everyday object, and which design benefits this can offer future technology. We explore this through three themes which emerge in our work about technology, craft and interaction. Much of the recent work within interaction design about tangible interaction has shown an increased focus on traditional craft work [1–4] and a return to tangible interaction [5–7] from screen interaction. Despite a focus on the craft and the tangible, in commercial areas a strong focus on app-based interaction, digital displays, and screen-based solutions has become the norm, even pushing towards virtual or augmented reality. Meanwhile, a number of critical views about the value of the Internet of Things (IoT) have recently been published [3, 8], and a wave of research and devices around the themes of mindfulness, self-exploration, reflection, and well-being is emerging [9, 10]. In this area of overlap, between screens and tangible interaction, between making devices and traditional craft, between the IoT devices and the mindfulness tools, we find ourselves interested in exploring the potential engagement qualities of non-screen, tangible interaction in the form of everyday crafted objects.
We are specifically interested in the physical nature of both the IoT gadgets and the mindfulness tools, as they tie into the physicality of crafted objects. We rely on physical objects in our lives, and while designing future smart homes, offices, cars, etc., we might benefit from a deeper understanding of how we relate to these physical things [11]. Núñez Pacheco and Loke elaborate: "A focus on a more reflective approach can offer fresh ways of understanding how the lived body interacts with artefacts, products and spaces" [12]. This speaks to how we can look further into understanding how humans can interact with 'things', and our focus is to take that further and ask how we can facilitate more human-to-human or human-to-self connection through a hybrid crafted everyday object.

2 Introducing Kintsugi as a Device to Explore Connection and Meaning Making

Electronic Kintsugi was developed as an investigation tool into how we could use everyday objects to explore human-to-human connection and human-to-self connection, and to find out whether we could develop something which intrigued and engaged people, moving from the IoT (Internet of Things) towards an appreciation and use of crafted, tangible, interactive, everyday objects. Electronic Kintsugi is a platform for exploration and meaning-making, an opportunity to engage with others and with oneself, and to create new narratives. In our work, our context was Japan's artisanal craft of Kintsugi, where we developed our work with a Kintsugi artist, and our focus was on the tangible, non-screen interaction properties of how a device with an undefined purpose might exist in between these realms of traditional craft, technology and sound. Inspired by Tsaknaki and Fernaeus' work Expanding on Wabi-Sabi as a Design Resource in HCI [13], where they explored unfinished craft and interaction design, the authors created a device and facilitated two participatory workshops exploring the Japanese craft of Kintsugi: mending broken ceramics with a precious metal to make them more beautiful and valuable than before. These concepts were adopted with the creation of Electronic Kintsugi: a sound- or light-reactive piece of repaired ceramics with touch interaction on the precious metal seams. Our interest is in the aesthetics of individuality and human touch, and to explore and respect the tradition of the craft of Kintsugi itself (video of Electronic Kintsugi here: https://youtu.be/p5Pu0-gZ3u0) (Fig. 1).

Fig. 1. Electronic Kintsugi in a design expert's home; the Kintsugi artist creating traces; first workshop explorations with light and sound.

3 Related Works: Exploring the Physical Qualities of Hybrid Tangible Embedded Interaction, Through Crafted "Things"

The literature review examined works where craft is referenced for the transferable physical qualities of interaction design: material, texture, touch, and recognition of craftsmanship, as opposed to the sleek, smooth, machined surfaces of our current smart products. We see this as a natural progression from a screen-based society, moving towards embodied engagement and beyond the swipe interaction of the "black mirror" (screen) as described by Rose [14]. Three thematic findings informed our prototype and workshop development.

3.1 Traditional Craft as a Starting Point for Exploration

Tsaknaki and Fernaeus explore craft in depth in a variety of their works, and thereby evaluate the role of interaction design in craft.
In their work on Wabi-Sabi, Tsaknaki and Fernaeus [13] present the concept of Wabi-Sabi and the idea to "approach perfection through explicitly unfinished designs". We embrace the concept of unfinished design with Electronic Kintsugi, deliberately designing an unfinished device to prompt curiosity and exploration of the prototype. In their work with leather, Tsaknaki, Fernaeus, and Schaub [15] explore how leather can be a touch-based, rich material for tangible interactions. This work informs how we can look to everyday materials, in our case ceramics, for stroking interaction, much like the leather interactions of their SoundBox. In exploring silversmithing, Tsaknaki, Fernaeus, Rapp and Belenguer [16] both engaged local artisans and focused especially on the "cultural and historical significance" of the craft, and explored the design "space of screen-less" interactions. This finding informed our choice of working with the Japanese artisanal craft of Kintsugi, where we developed our work with a Kintsugi artist, and our focus was on the tangible, non-screen interaction properties of how a device with an undefined purpose might exist in between these realms of traditional craft and technology.

3.2 Designing from Everyday Things with Social Implications in Mind

In recent works about the Internet of Things (IoT), Cila, Smit, Giaccardi and Kröse [8], Nordrum [17], and Lingel [3] all explore the social significance of the "thing" and suggest that we need to look not only at the everyday (home and workplace) but also at the social and cultural implications of these everyday interactions with things. Our work focuses on this "thing" and thus on the development of Electronic Kintsugi.

3.3 Technology and Touch

Significant work has been done in the field of interaction design with regard to touch, and in the interest of space we do not cover that here; however, the particular work by Cranny-Francis [18] covers a sizeable portion of the touch research done within design. In Semefulness: a social semiotics of touch, Cranny-Francis introduces the experience of touch as 'semefulness' – "multiply significant, physically, emotionally, intellectually, spiritually, politically" [18]. She describes the 'tactile regime' of touch in culture, how it shapes how we engage with one another or with the tools we design and then use. She describes that "Touch is semeful in that it is full of meanings - physical, emotional, intellectual, spiritual and those meanings are socially and culturally specific and located." Here we can begin to touch upon the multi-faceted nature of Electronic Kintsugi. It is culturally and location specific to traditional Japanese craft; it is emotional to some - as an heirloom or a piece of valuable art; it fosters social interaction when acting as Electronic Kintsugi (see Sect. VI. C); and it is physical in nature, requiring touch, stroking, and holding the bowl. One ambition of Electronic Kintsugi is to enable meaningful experiences for the participants, and by addressing Cranny-Francis' 'semeful' attributes, we may begin to explore this domain.

3.4 A Focus on Audio and Playfulness

Schoemann and Nitsche [4] use the "Stitch Sampler", a sew-able musical instrument, to focus on embodiment via the act of sewing, and on audio feedback, "to respond to the crafter's personality". These qualities of craft, tangible non-screen interaction, and playfulness with sound inform our process, helping to frame the area we are exploring.
Electronic Kintsugi allows participants to explore the interaction qualities of a hybrid crafted device and consider its potential uses in their lives. We encourage curiosity, unexpected encounters, and reflections on those encounters. This speaks to our objective to inform future smart product design and encourage a tangible, non-screen interface which utilizes craft and the qualities of curiosity and reflectivity.

4 Methodology

Initially, we were fascinated by the idea of Kintsugi and made a basic prototype to explore possible values of Electronic Kintsugi. This work spans from the first prototype to two workshops, one in Japan and one in Denmark, six months apart. We present an overview of methods here and then describe each workshop and the findings in the following sections.

4.1 Workshop 1: Methods

The first workshop was designed in a collaborative process with FabCafe Tokyo and Kintsugi artist Kurosawa, where we combined electronics with an everyday "craft" object, involving the artisan in this process [16] so they could both introduce us to the nuances of the craft and help us to understand what we should be paying attention to. Following the process described by Tsaknaki, Fernaeus and Schaub [15] in their leather material explorations, we created a workshop session to explore the properties of Kintsugi and gain insight into the craft, and to investigate how our prototype was received by participants in that context. We used thin strips of copper tape to conduct electrical current and worked with the Kintsugi artist to carefully overlay the traces of precious metals where the repair had been, to emulate the traditional Kintsugi (see http://www.kurovsya.com/). The workshop consisted of two of the authors (one an electrical and mechanical engineer and the other an interaction designer), the Kintsugi artist, and seven participants of varying electronics skill levels who were recruited through an open FabCafe Tokyo Facebook event. During the workshop, the Kintsugi artist presented and demonstrated their process, allowing participants to try their hand at creating Kintsugi. The authors presented their work and the thoughts behind the Electronic Kintsugi. The workshop explored Kintsugi and interaction with it, using two familiar outputs, sound and light, which acted as examples of possible outputs, so that participants were able to extrapolate from this in terms of what the Electronic Kintsugi might be used for. We conducted the workshop in a focus-group style and did two rounds of explorative, hands-on evaluation. A questionnaire was developed to capture their experience (results in the section "First Workshop").

4.2 Second Iteration of the Electronic Kintsugi

Cila, Smit, Giaccardi and Kröse [8] describe the interventionist product, for creating dialogues, which senses, responds to, and interprets data. The Electronic Kintsugi was developed to sense touch and respond to it, and, for the second workshop, could interpret data, such as how often it is being stroked. After feedback from the first workshop, the Electronic Kintsugi was updated to be more responsive, while making it less apparent how the light interaction would emerge or how it would progress, in order to prompt explorative and playful behaviour with the device. Rather than a direct mapping, it had a certain level of ambiguity [19] via the programmed adaptive behaviours, based on how much it was interacted with and for how long, e.g., if it had been left alone, or off for a period.
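To make this adaptive behaviour more concrete (and anticipating the details given in the next paragraphs: several output variations per modality, cycled on a timer that is held back while the user keeps interacting), the following is a minimal, hypothetical sketch in Python. It is not the firmware that ran on the prototype, which used an Arduino-class board with capacitive touch sensing; the class name ModeCycler and all timing constants are illustrative assumptions.

```python
import time

class ModeCycler:
    """Illustrative simulation of the second-iteration behaviour:
    several output variations per modality, cycled on a timer, where
    ongoing interaction postpones the switch so the user's flow of
    interaction is not interrupted (all constants are assumptions)."""

    def __init__(self, n_modes=5, dwell_s=30.0, idle_gap_s=3.0):
        self.n_modes = n_modes        # e.g. five variations for sound (or light)
        self.dwell_s = dwell_s        # minimum time spent in one mode
        self.idle_gap_s = idle_gap_s  # pause in touching that permits a switch
        self.mode = 0
        self.mode_since = time.time()
        self.last_touch = 0.0

    def on_touch(self, touch_value):
        """Take a single parameter from the touch interface and return the
        (mode, value) pair that a sound or light reaction would map to
        tones, chords or LED patterns."""
        self.last_touch = time.time()
        return self.mode, touch_value

    def update(self):
        """Call periodically: switch mode only after the dwell time has
        passed AND the user has paused interacting."""
        now = time.time()
        dwelled = (now - self.mode_since) > self.dwell_s
        paused = (now - self.last_touch) > self.idle_gap_s
        if dwelled and paused:
            self.mode = (self.mode + 1) % self.n_modes
            self.mode_since = now

# Example use: feed in a normalized capacitance reading and poll the cycler.
cycler = ModeCycler()
mode, value = cycler.on_touch(0.42)
cycler.update()
```

Keeping the mapping from touch to output indirect in this way is what gives the device the "certain level of ambiguity" referred to above.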
Several touch-to-sound and touch-to-light reactions were developed for the workshop. Each reaction took input from the touch interface (built following the Touché for Arduino approach: http://www.instructables.com/id/Touche-for-Arduino-Advanced-touch-sensing/) and created a specific output in the form of either light or sound. Light was output on a strip of NeoPixels, and sound was synthesized using a software library (https://github.com/dzlonline/the_synth) and output to a speaker. The light reactions transform a single parameter from the touch interface into a specific light pattern on the LED display. Likewise, the sound reactions transform a single parameter from the touch interface into single tones, chords or evolving sound figures. In the second iteration, we wanted to increase the complexity [20] of interacting with the device so the interaction was less binary, such as a touch = a sound. Instead, it was decided to make the coupling between the input and output less apparent, giving the device the autonomy to interpret the frequency of interaction and respond accordingly. Within the second-iteration algorithm, there exist five cases for interaction modalities for either sound or light, meaning five for sound and five for light. There is a manual switch on the Electronic Kintsugi so participants can choose whether they are interacting with light or sound. These five cases were five variations in types of output, cycled through a timer based on interaction. If the user was interacting with the Electronic Kintsugi, then it would remain in that mode longer, until they paused interacting, so as not to interrupt their flow of interaction. Then it would move to the next mode. Each mode was a variation in output; for example, for sound, it might be different chords or tones. This had the purpose of giving the participant less time to recognize patterns in the behaviour and of enhancing the user's curiosity. We focused on how the interaction between the participants and the Electronic Kintsugi could be more tightly or loosely coupled, yet also incorporate elements of surprise, and on what implications this interaction had for the participants' association to the Electronic Kintsugi as a device, versus as an instrument, companion, or tool.

4.3 Workshop 2: Methods

The second workshop was scheduled six months after the first, due to travel and revisions to the technology and workshop design. Approaching workshop two, Wakkary et al. [11] published a work, "Morse Things", wherein they utilised a methodology for engaging design researchers to evaluate their everyday object by having the object in their home for some weeks, and then following up with a workshop with the design researchers to explore the experiences with the object. We adopted this methodology for our work and asked four design researchers to evaluate the Electronic Kintsugi in their homes for a period of five weeks, followed by a workshop. We chose to use this method in agreement with Wakkary et al., who explain, "A key motivation in our approach was the desire to deepen our investigation by including a wider range of experts that have the design expertise to perceive and investigate the nuanced and challenging notions of thing-centeredness."

4.4 Participant Selection and Introduction to Electronic Kintsugi

Opportunity sampling was used to select experts in design research from different backgrounds, aged 30–38, living in Copenhagen, to ensure different perspectives on the experience and imagined future uses.
Participants' names have been changed for their privacy. Their backgrounds are crossovers between the fields of engineering, interaction design, dance, performance design, industrial design, robotics, and hardware development. Participants were recruited by email, and it was explained to them that they would have the object in their home for 5 weeks and engage with it for a minimum of 15 min per week, spending another 15 min per week journaling their experiences. Participants were asked to keep a record of their thoughts and experiences, and to both keep these as a document and bring these thoughts to the workshop at the end. We found four researchers who were available to review how the device worked. Our goal here was to invite these experts to explore with us and find out what questions to ask participants [21]. We describe the specific methods we used during workshop 2 in the section "Second Workshop" to maintain continuity and legibility of this work (Fig. 2).

Fig. 2. Touching the traces on the Kintsugi bowl with the Electronic Kintsugi boxes displaying light and playing sound.

5 First Workshop: FabCafe Tokyo

Workshop 1 informed our work and set the scene for workshop 2. The workshop was conducted in both English and Japanese, and participants could communicate in their preferred language. We used a written questionnaire so participants could answer in their preferred language. We briefly present workshop one and then move on to reflect on findings from workshops one and two. After a brief demonstration of function, the Electronic Kintsugi was explored by participants. They touched the traces with one, two or all fingers, and tried turning the ceramics over, holding it in one hand or two. We explained that "the output could be anything, it could start your car, or feed your pet".

Since participants were familiar with the interaction technique after exploring the sound interaction, the light interaction had a much different approach. Participants knew how they could touch it, with one or several fingers, and they now focused on light or harder touches, strokes, or resting their finger on the traces. The light was much more unpredictable than the sound. Whereas with the sound they were acting almost as musicians, experimenting to find patterns and particular notes, with the light it was more about getting a bigger or smaller reaction than about the nuances in between these small or large bursts of light. One participant asked, "I want to know how much it's me that is controlling it and how much it is doing on its own".

5.1 Findings

We highlight several responses here from the questionnaire to inform future researchers in this field who might be interested in working further with this.

• Encouraging senses and emotions – Being able to handle the Kintsugi was a special experience: "There is a different feel to a real Kintsugi. It's rare to see the hitting of the device so profoundly." (P-1A) and "We're often not given permission to touch traditional art. It feels good to be encouraged to touch it." (P-1E).
• An interest in other senses: taste, smell, and food – One participant suggested it be used as a bowl to eat from: "Japanese people eat with bowls close to their mouth, so I want to see some sound installation when someone is eating" (P-1A), and another suggested that it could be used as a cat or dog food request device: "imagine the cat's tongue licking the Kintsugi!" (P-1C).
• Light – Unpredictable but has potential – One participant noted that the light reminded them of a starry sky and stated, "In a larger, or aesthetically ordered or different setting (night), it would be very soothing" (P-1C). Another participant was inspired and shared an idea: "The combination of the craft and the touch with the light feedback reminded me of the challenges of regaining fine motor control in a finger after an accident. The focus required and the tranquility of the lights may be a fun alternative physical therapy." (P-1E)
• Sound – Alive characteristics – One participant remarked, "Craft has character, especially as it ages. How might that character be represented as sound? I feel the sounds were lovely but not aligned with the character of the craftwork. Or maybe it had juxtaposition of sound quality and physical character which enhances the contrast between tradition and technology." (P-1F). Two participants related to the object in an anthropomorphic way, stating "It was like the cup was telling me how he/she's doing. Since Kintsugi part is a past wound, sometimes I felt like it's telling me it had pain." (P-1E).

5.2 Findings Summary

The workshop provided us with some considerations about the role of art and objects and the potential interactivity of these objects. Participants were excited to play with art and traditional craft-based objects. They were fascinated by the light and sound output and could extrapolate to imagine other interaction scenarios. They explored the aesthetic interaction qualities and played the Kintsugi like an instrument, using expressive hand gestures to explore the touch interaction. And they could reflect on the role of technology and tradition and how we live our lives: "Developing a closer, more physical relationship with the objects in our lives feels meaningful." (P-1E).

6 Second Workshop: Copenhagen

To prepare for the second workshop, we asked participants to spend 20 min in silence [22] completing a written activity to gather their pre-workshop thoughts and feedback prior to engaging in dialogue. We used Kujala, Walsh, Nurkka, and Crisan's [23] method of sentence completion to extract these initial reactions. We provided the instructions that participants should answer quickly (20 questions in 20 min); the beginning of each sentence was given and was then completed by the participant in a way they saw fit. Kujala and Nurkka [24] used categories of user values to classify questions. In Fig. 1, one can see the sentences we defined, as per each value category. We tried to make a nearly even number of positive and negative questions, and allowed extra space if they wished.

6.1 Sentence Completion Tool

A Likert scale [25] was used to determine their reactions to the sound and light interactions. We asked participants to rate the light and sound interaction. For light, we asked "I found the light output to be:" and gave one end of the scale the value "Calming" and the other end "Attention Seeking". For sound, we asked the same, but added an additional scale from "noise" to "music". We spent the remaining 2.5 hours engaged in a group discussion about their experiences, comparing, contrasting, and exploring possible future interactions.

6.2 Findings of Workshop Two

We used mind mapping as a technique to map out the responses from the discussion and journals [26]. We present here the results of the sentence completion as well as the discussion and journals.
6.3 Sentence Completion

We compared the sentence completion responses sentence by sentence and by category. The Electronic Kintsugi was described as "enjoyable, calming, interesting, and different" in the one-word descriptions. The findings from participants, ordered by the Sentence Completion Tool headlines [23], were:

General: Participants felt a sense of achievement when interacting with others and felt connected to it when it "reacted to my own and others touching it".

General: Predictability. They were disappointed and frustrated with the light interaction: "the light interaction was unpredictable, non-responsive and not interesting". It is noted here that in both workshops, the light was reported to be not as responsive as the sound. Participants in both workshops reported that they were more fascinated with the sound feedback, particularly because there were more nuances in the sound than in the light.

Emotional: Participants described their emotional response as "playfulness and companionship, calming, joy and puzzled" and again highlighted their frustration with the lights, describing them as "underwhelm(ing), disappoint(ing), and distanced". Two participants referenced the social values and stated that their best experiences were while playing with others.

Stimulation and epistemic: Participants described the changing soundscape and mentioned their desire to use it when someone asked about it.

Growth and self-actualization: Participants described both relaxation and concentration, as well as creative thinking and social interaction, as outcomes of their interactions with the Electronic Kintsugi.

Traditional values: Participants noted that, as an object in their home, it was "cute and modern", "playful and interactive" and that it "combined ceramics with playfulness".

Finally, in the extra space provided, three responses were thought provoking:
• I kept receipts in it and I liked how it became less precious and more functional
• I wonder if you were tracking my use
• It was a search into new creative possibilities.

The Likert scales gave us the results below, indicating that while results varied, the light was generally thought to be more attention seeking than calming, the sound was found to be generally more calming than attention seeking, and the sound was more musical than noisy.

"I found the light output to be:" (Calming = 1, Attention Seeking = 10) Average rating of 5.75 (Actual Rating Values = 8, 4, 4, 7)
"I found the sound output to be:" (Calming = 1, Attention Seeking = 10) Average rating of 3.75 (Actual Rating Values = 3, 3, 7, 2)
Extra question for sound: (Noise = 1, Music = 10) Average rating of 6.25 (Actual Rating Values = 6, 5, 5, 9)

From the discussion and journaling, three primary categories of interest emerged: (1) enhancing human connection through embedded or "magic" technology, (2) using a craft-based object to prompt personal reflection and development, and (3) exploring transferable design principles of smart products with a device which has no defined purpose, and which converges traditional craft and technology. In the accounts below, participants focused primarily on the sound-based interaction, as they were not interested in the light interaction and spent most of their time with sound (Fig. 3).

Fig. 3. The Electronic Kintsugi bowl with a design researcher; she is playing with the light as a break from work.
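As a quick arithmetic check, the reported averages follow directly from the four individual ratings listed above; the short Python snippet below (variable names are ours, purely illustrative) reproduces them.

```python
# Individual ratings from the four design experts (1-10 scales), as reported above.
light_attention = [8, 4, 4, 7]    # Calming = 1 ... Attention Seeking = 10
sound_attention = [3, 3, 7, 2]    # Calming = 1 ... Attention Seeking = 10
sound_musicality = [6, 5, 5, 9]   # Noise = 1 ... Music = 10

mean = lambda ratings: sum(ratings) / len(ratings)
print(mean(light_attention))    # 5.75
print(mean(sound_attention))    # 3.75
print(mean(sound_musicality))   # 6.25
```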
7 Three Themes Identified

7.1 Enhancing Human Connection Through Embedded or "Magic" Technology

There were several accounts of how the Electronic Kintsugi sparked social connections and interactions. Antonio had placed it in the kitchen, and he explained that the bowl on its own might not have sparked curiosity, but the box did, and visitors asked what it was and then wanted to play with it. Sandra was having an evening of entertaining guests, and as they were finally leaving (she was tired), she stood in the doorway and absent-mindedly touched the bowl as they were putting on their shoes. The guests became immediately intrigued, asked questions and wanted to play with it, which was both charming and exhausting, since, as Sandra explained, she was ready for them to go home, but also happy to play and show them the bowl. For Henry, it was a social life saver, as he suddenly found himself spending time with his father-in-law, who doesn't speak much English, while Henry doesn't speak much Danish. The Electronic Kintsugi came to the rescue as a medium they could explore together, without a need for verbal language. Martin explained that he took it on the bus and it was "totally inappropriate" there; it was loud and kept making screeching noises. He was frustrated with it, and imagined that if it had been quiet and making nicer sounds, as it often did (though not on the bus), then he could have asked others to join in on the playing. The 'magic' of the object was intriguing to people who didn't know what it was and sparked both play and conversation, even, in Sandra's case, when they should have been leaving. It offered a needed social lubricant in the case of Henry and sparked ideas on how to engage strangers on the bus for Martin. Having an everyday object with 'magical' and unexpected properties, without being a gadget or being used for some other purpose (a fancy remote, a communications device, etc.), seemed to be the key to sparking this social interaction. Unexpected qualities of playfulness via a changing soundscape were the right recipe for the Electronic Kintsugi.

7.2 Using an Everyday Object in Prompting Personal Reflection and Development

Our experts felt that an everyday object combining traditional craft and technology was important, commenting that they "wanted to come back to it again, it levels up, it evolves over time" (Martin) and "I love that it's not intuitive, you have to spend time with it and get to know it. It's nice that it doesn't have a defined purpose, somehow it's good to just have something nice and electronic in your home, especially with the copper tape, it feels like a crafted aesthetic, you can see craft, and the time put into it, but you can't see code, so somehow this makes tangible the craft of the code" (Henry). Sandra likened it to a "Tibetan singing bowl, you have to hit it just right and there's a pleasure behind controlling that energy". And Martin continued, "The electronics force you into movement, I've never done this with an Ikea bowl". Bringing together physical and digital materials, considering both the craft of the object and the craft of the code, and considering the social surroundings that the object inhabits were important aspects of creating a hybrid craft [16]. For us, it is the combination of these things which is a significant part of designing for meaningful interactions and experiences when working with future smart everyday products in the home.
7.3 The Role of an Object with a Non-defined Purpose

The fact that the purpose of the object was open-ended was well liked, and the participants used this opportunity to explore the possibilities with it. Some of their comments included "I love that it's not intuitive, you have to spend time with it and get to know it" (Martin) and "It was interesting, as a dancer, that I played a lot with the hand movements and did improvised hand movements" (Sandra). It was briefly discussed what it might be like to grow up with an object like this in your home, instead of an iPad or TV, and how that might change your perceptions of how you interact with the world and come to appreciate objects. Sandra explained, "I prefer it as an ornament, something non-connected. It can be a companion, or a container, such as for my receipts." The combination of a non-defined interaction purpose with the functionality of a common object, a bowl, seemed to work well to invite playful and curious interactions. While some experts poured water into the bowl to explore the sound, Antonio took it a step further and ate his breakfast cereal from the bowl: "it made me aware of how fast I was eating". (Interestingly, in workshop one, this was a suggestion from participants, that it could be nice to eat from the bowls.) The choice to use a bowl came from our fascination with Kintsugi and the tendency there to repair bowls, and we learned that, as a starting object for this exploration, a bowl has many inherent properties: something to eat from, to store things in, a decorative object, a historical object; it is nice to hold, and it exists in many cultures and many homes. Creating an object with a non-defined purpose can be one way to encourage curiosity, playfulness and an opportunity for the creation of meaningful or important moments in one's life, especially when there is a human-to-self (self-development) or human-to-human (social) aspect. On the contrary, further interaction design would be necessary once an object moves beyond being something with a non-defined purpose. In this work, our focus on a non-defined purpose does not disregard designing interactions for a specific context; rather, our focus is on designing interaction concepts at an earlier phase of project development.

8 Discussion

It is worthwhile to revisit Borgmann (as described by Fallman [19]) here, who worried that technology would "turn us into passive consumers, increasingly disengaged from the world and from each other" [19]. Our aim with Electronic Kintsugi, and a focus on designing for ambiguous interactions with everyday objects, is to move back towards each other, towards engagement with familiar objects, towards creativity and playfulness, and towards technology that is "not simply [a] neutral means for realizing human ends, but actively help[s] to shape our experiences of the world" [19]. Despite work in academia developing tangible, non-screen devices or criticising IoT (as presented earlier), the products which emerge on the market today do not abundantly reflect this. These products do not necessarily engage people on a human-to-human or human-to-self level and instead often cater to fixing a small problem without necessarily considering a more holistic impact. Cila, Smit, Giaccardi and Kröse [8] describe the current approach to IoT as being short-sighted and emphasize the potential for the role of interaction design in new smart things.
In our work, we expand on this and emphasize a need for smart things to perhaps be rooted in craft to enhance meaning-making, to utilize non-screen interaction, and to move towards facilitating human-to-human or human-to-self exploration. We further emphasize the role of a device with an undefined interaction purpose, as opposed to the very specific devices emerging on the market today, such as smart candles controllable via an app (https://www.ludela.com/) or smart hairbrushes (https://www.kerastase-usa.com/connected-brush). Although we needed to use copper tape to achieve the conductivity, in the future we would like to explore which material properties would allow a Kintsugi artist to create something more conductive using the traditional precious metals. Given this, the most significant aspect was the conceptual consideration of how one might interact with an object which has been created by an artist but is otherwise an 'everyday object' (one which we might find in our homes anyway, such as a bowl). Returning to Cranny-Francis' semefulness, we can see the physical, emotional, intellectual, spiritual, social, and cultural aspects [18] in the Electronic Kintsugi. We essentially augment a crafted object with technology, with the aim of creating an enchanted [14] everyday object with a historical, crafted background which is open to interpretation and explorative play. The role of an enchanted [14] everyday object is especially important to consider in a world of increasing IoT gadgets. Considering a future vision of connected everything, we feel it is important that we do not become too focused on the technology, such as having RFIDs under our skin [27] or being laden with smart tablets, smart watches or smart water bottles, but rather that we embrace humanness. We want to create devices which provoke thoughtful and critical reflection and engage people on a tangible level, not just a screen asking if you have been mindful today [28]. When considering the design of new 'smart' objects, we should perhaps ask, "does it need to be connected, and if so, why?", or "how can I enhance the existing values in this everyday object?" A door handle, for example, doesn't just open a door; it is the literal door to coming home from work, relaxing after a long day, seeing your family again, and more (from an interview with designer Carl Alviani: http://meaningfuldevices.vanessa-carpenter.com/2017/08/10/anything-but-personal-is-a-failure/). The affordances inherent in everyday objects are many, and it is our job as interaction designers not only to invent new technologies and uses but to consider how to support these values and avoid turning the objects in our world into cloud-connected gadgets. Electronic Kintsugi embraces new technology and established craft practices, emphasizing curiosity and playfulness while facilitating interaction between people and the self. Furthermore, we felt that the aspect of craft was a key identifier in what made the everyday object special. The history and delicate quality of the Kintsugi prompted multiple reactions: the participants in Japan were intrigued that they were allowed to play with a piece of art, and the participants in Denmark were eager to engage with, and learn more about, Kintsugi. Our primary concern was the investigation of a non-screen, tangible everyday object coming from a place of craft, and in future work we hope to further investigate how we could work with a Kintsugi artist to create a fully functional piece of Electronic Kintsugi, with capacitive traces in the piece.
9 Conclusion

In this work, we have presented Electronic Kintsugi: an exploration of how an everyday object (a bowl), in combination with artisanal craft (Kintsugi) and electronics (conductive sensing), could result in more human-to-human connection and human-to-self development. Through two workshops, one in Japan with a Kintsugi artist and participants, and one in Denmark with design research experts, we explored the properties of this Electronic Kintsugi, an interactive object with no defined purpose and two main interaction outputs - sound and light. We found that sound as feedback was of significant interest due to its nuanced nature and reactiveness, and between workshops the sound was programmed to evolve over time with use. Using copper tape, we augment a traditional, crafted object, namely Kintsugi, with electronics and call it Electronic Kintsugi, creating an open platform for play, exploration and development. In future work, we hope to continue working with Kintsugi artists to find a material which can be used in the craft practice and which would also be conductive enough for Electronic Kintsugi. We identified three categories of reflection from our studies with participants, and areas which future smart products can look to in order to enable more meaningful interactions between human and human, and between human and device. These categories are: (1) enhancing human connection through embedded or "magic" technology, (2) using everyday objects to prompt personal reflection and development, and (3) exploring transferable design principles of smart products with a device of undefined purpose, which converges traditional craft and technology. Finally, we discussed that, as interaction designers, we would like to focus on embracing humanness in future technology designs and could look to the values and affordances inherent in everyday objects to bring out these values and design for these moments in our lives.

Acknowledgment. We are grateful to FabCafe Tokyo, Kurosawa-San, the participants of workshop one, the design experts of workshop two, and all the user testers and helpers along the way.

References

1. Zheng, C., Nitsche, M.: Combining practices in craft and design. In: Proceedings of the Tenth International Conference on Tangible, Embedded, and Embodied Interaction (TEI 2017), pp. 331–340. ACM, New York (2017). https://doi.org/10.1145/3024969.3024973
2. Zoran, A., Buechley, L.: Hybrid reassemblage: an exploration of craft, digital fabrication and artifact uniqueness. Leonardo 46(1), 4–10 (2013). http://www.research.lancs.ac.uk/portal/en/publications/designing-information-feedback-within-hybrid-physicaldigital-interactions(4709b666-bbe3-46f8-ad3a-6d06fdd6f5cd)/export.html
3. Lingel, J.: The poetics of socio-technical space: evaluating the internet of things through craft. In: Proceedings of Conference on Human Factors in Computing Systems (CHI 2016). ACM, New York (2016). https://doi.org/10.1145/2858036.2858399
4. Schoemann, S., Nitsche, M.: Needle as input: exploring practice and materiality when crafting becomes computing. In: Proceedings of the Eleventh International Conference on Tangible, Embedded, and Embodied Interaction (TEI 2017). ACM, New York (2017). https://doi.org/10.1145/3024969.3024999
5. Hogan, T., Hornecker, E.: Feel it! See it! Hear it!
Probing tangible interaction and data representational modality. In: Proceedings of DRS 2016, Design Research Society 50th Anniversary Conference, Brighton, UK (2016)
6. Kettley, S., Sadkowska, A., Lucas, R.: Tangibility in e-textile participatory service design with mental health participants. In: Proceedings of DRS 2016, Design Research Society 50th Anniversary Conference, Brighton, UK (2016)
7. Mols, I., van den Hoven, E., Eggen, B.: Informing design for reflection: an overview of current everyday practices. In: Proceedings of the 9th Nordic Conference on Human–Computer Interaction (NordiCHI 2016). ACM, New York (2016). https://doi.org/10.1145/2971485.2971494
8. Cila, N., Smit, I., Giaccardi, E., Kröse, B.: Products as agents: metaphors for designing the products of the IoT age. In: Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (CHI 2017), pp. 448–459. ACM, New York (2017). https://doi.org/10.1145/3025453.3025797
9. Akama, Y., Light, A., Bowen, S.: Mindfulness and technology: traces of a middle way. In: Proceedings of the 2017 Conference on Designing Interactive Systems (DIS 2017), pp. 345–355. ACM, New York (2017). https://doi.org/10.1145/3064663.3064752
10. Mols, I., van den Hoven, E., Eggen, B.: Balance, cogito and dott: exploring media modalities for everyday-life reflection. In: Proceedings of the Eleventh International Conference on Tangible, Embedded, and Embodied Interaction (TEI 2017), pp. 427–433. ACM, New York (2017). https://doi.org/10.1145/3024969.3025069
11. Wakkary, R., Oogjes, D., Hauser, S., Lin, H., Cao, C., Ma, L., Duel, T.: Morse things: a design inquiry into the gap between things and us. In: Proceedings of the 2017 Conference on Designing Interactive Systems (DIS 2017), pp. 503–514. ACM, New York (2017). https://doi.org/10.1145/3064663.3064734
12. Núñez Pacheco, C., Loke, L.: Tacit narratives: surfacing aesthetic meaning by using wearable props and focusing. In: Proceedings of the Eleventh International Conference on Tangible, Embedded, and Embodied Interaction (TEI 2017), pp. 233–242. ACM, New York (2017). https://doi.org/10.1145/3024969.3024979
13. Tsaknaki, V., Fernaeus, Y.: Expanding on wabi-sabi as a design resource in HCI. In: Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems (CHI 2016), pp. 5970–5983. ACM, New York (2016). https://doi.org/10.1145/2858036.2858459
14. Rose, D.: Enchanted Objects: Design, Human Desire, and the Internet of Things. Simon and Schuster, New York (2014)
15. Tsaknaki, V., Fernaeus, Y., Schaub, M.: Leather as a material for crafting interactive and physical artifacts. In: Proceedings of the 2014 Designing Interactive Systems (DIS 2014). ACM, New York (2014). https://doi.org/10.1145/2598510.2598574
16. Tsaknaki, V., Fernaeus, Y., Rapp, E., Belenguer, J.S.: Articulating challenges of hybrid crafting for the case of interactive silversmith practice. In: Proceedings of the 2017 Conference on Designing Interactive Systems (DIS 2017), pp. 1187–1200. ACM, New York (2017). https://doi.org/10.1145/3064663.3064718
17. Nordrum, A.: Popular Internet of Things Forecast of 50 Billion Devices by 2020 Is Outdated (2016). https://spectrum.ieee.org/tech-talk/telecom/internet/popular-internet-of-things-forecast-of-50-billion-devices-by-2020-is-outdated
18. Cranny-Francis, A.: Semefulness: a social semiotics of touch. Soc. Semiot. 21(4), 463–481 (2011). https://doi.org/10.1080/10350330.2011.591993
19. Fallman, D.: The new good: exploring the potential of philosophy of technology to contribute to human–computer interaction. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI 2011), pp. 1051–1060. ACM, New York (2011). https://doi.org/10.1145/1978942.1979099
20. Hobye, M.: Designing for Homo Explorens: Open Social Play in Performative Frames, pp. 16–17. Malmö University, Malmö (2014)
21. Bødker, S.: When second wave HCI meets third wave challenges. In: Mørch, A., Morgan, K., Bratteteig, T., Ghosh, G., Svanaes, D. (eds.) Proceedings of the 4th Nordic Conference on Human–Computer Interaction: Changing Roles (NordiCHI 2006), pp. 1–8. ACM, New York (2006). https://doi.org/10.1145/1182475.1182476
22. Martin, B., Hanington, B.: Universal Methods of Design. Rockport Publishers, Beverly (2012)
23. Kujala, S., Walsh, T., Nurkka, P., Crisan, M.: Sentence completion for understanding users and evaluating user experience. Interact. Comput. 26(3), 238–255 (2014). https://doi.org/10.1093/iwc/iwt036
24. Kujala, S., Nurkka, P.: Identifying user values for an activating game for children. In: Lugmayr, A., Franssila, H., Sotamaa, O., Näränen, P., Vanhala, J. (eds.) Proceedings of the 13th International MindTrek Conference: Everyday Life in the Ubiquitous Era (MindTrek 2009), pp. 98–105. ACM, New York (2009). https://doi.org/10.1145/1621841.1621860
25. Brooke, J.: SUS: a quick and dirty usability scale. In: Jordan, P., Thomas, B., Weerdmeester, B.A., McClelland, I. (eds.) Usability Evaluation in Industry, pp. 189–194. Taylor & Francis, London (1996)
26. Wheeldon, J., Faubert, J.: Framing experience: concept maps, mind maps, and data collection in qualitative research. Int. J. Qual. Methods (2009). https://doi.org/10.1177/160940690900800307
27. Astor, M.: Microchip implants for employees? One company says yes. New York Times (2017). https://www.nytimes.com/2017/07/25/technology/microchips-wisconsin-company-employees.html
28. Newman, K.M.: Free Mindfulness Apps Worthy of Your Attention. Mindful (2017). https://www.mindful.org/free-mindfulness-apps-worthy-of-your-attention/

A Novel and Scalable Naming Strategy for IoT Scenarios

Alejandro Gómez-Cárdenas, Xavi Masip-Bruin, Eva Marín-Tordera, and Sarang Kahvazadeh
Advanced Network Architectures Lab (CRAAX), Universitat Politècnica de Catalunya (UPC), Barcelona, Spain {alejandg,xmasip,eva,skahvaza}@ac.upc.edu

Abstract. Fog-to-Cloud (F2C) is a novel paradigm aimed at increasing the benefits brought by the growing Internet-of-Things (IoT) device population at the edge of the network. F2C is intended to manage the available resources from the core to the edge of the network, allowing services to choose and use either a specific cloud or fog offer or a combination of both. Recognizing the key benefits brought by F2C systems, such as low latency for real-time services, location-aware services, mobility support and the possibility to process data close to where they are generated, research efforts are being made towards the creation of a widely accepted F2C architecture. However, in order to achieve the desired F2C control framework, many open challenges must be solved. In this paper, we address the identity management challenges and propose an Identity Management System (IDMS) that is based on the fragmentation of the network resource IDs. In our approach, we divide the IDs into smaller fragments and then, when two nodes connect, they use a portion of their full ID (n fragments) for mutual identification.
The conducted experiments show that a significant reduction in both the query execution times and the space required to store IDs can be achieved when our IDMS is applied.

Keywords: IDMS · Identity management · Fog-to-Cloud · Resource identity

1 Introduction

The Internet of Things (IoT) is a communication paradigm that allows all kinds of objects to connect to the Internet. According to [1], by 2020 the number of connected devices will reach 50 billion, that is, 6.58 times the estimated world population for the same year. In line with the constant growth of the IoT device population, the amount of data generated at the edge of the network is growing as well. Every day, large volumes of data in all formats (video, pictures, audio, plain text, among others) are generated and then moved to cloud datacenters to be processed. In fact, it is estimated that in the near future a single autonomous car will produce up to 4 TB of data per day [2]. It is widely accepted that useful information can be extracted from such data using cloud-based data mining techniques. Nevertheless, moving large amounts of data from the edge to the datacenters located at the core of the network may incur significant overhead in terms of time, network throughput, energy consumption and cost [3]. To overcome these issues, novel computing paradigms such as fog computing have emerged at the edge of the network.

Fog computing is a paradigm intended to extend cloud computing capacities to the edge of the network, allowing data to be processed and aggregated close to where it is generated [4]. The fact that fog computing is deployed close to the end users' devices enables some key characteristics for IoT services and applications, such as low latency, mobility support, and location awareness [5]. Indeed, fog computing emerged to collaborate with cloud computing rather than to compete with it. More recently, the combined fog-to-cloud (F2C) paradigm [6] has been proposed to ease service execution in a hierarchical fashion, using fog, cloud, or a combination of both. Two ongoing initiatives work towards deploying such a hierarchical and combined F2C system: the OpenFog Consortium [7] and the mF2C project [8]. At an early stage, the mF2C project proposed a hierarchical, layered architecture in which the whole set of resources can be exploited in the cloud, in the fog, or in a combination of both. In mF2C, distributed fog nodes can be used for delay-sensitive, low-latency services and for processing at the edge of the network, while, in parallel, the cloud can be used for massive, long-term processing and storage.

In a realistic scenario, F2C is organized as a hierarchical three-tier architecture [9] where the most constrained devices are located at the lowest tier. The middle tier consists of nodes that act as aggregators of the resources available in the lower tier (see Fig. 1) and, finally, the cloud datacenter sits at the top of the hierarchy.

Fig. 1. Fog-to-Cloud general topology.

Certainly, the F2C resource continuum must be managed by a control strategy (a sort of control plane), but because there are still many challenges to be solved, the control concept as a whole is still an open issue for fog and, surely, for F2C systems. One of the challenges to be addressed in F2C systems is the lack of an Identity Management System that meets the specific requirements of the paradigm.
In F2C, the Identity Management System (IDMS) is the set of functions that provides a mechanism to assign and, more generally, to manage the resource identities of both physical and virtual devices. According to [10], the management of resource identifiers at the edge is very important for programming, addressing, thing identification, data communication, authentication, and security. Thus, the IDMS is a key component of the F2C control plane framework. In short, some of the features an IDMS should provide in an F2C system are: (i) the capability to scale smoothly along with the network; (ii) support for device mobility without loss of identity; (iii) security and privacy protection; (iv) interoperability among different service providers; and (v) support for highly dynamic network topologies.

In this paper, we focus on the IDMS challenge and propose a solution that addresses the aforementioned system requirements. The key contributions of our work, when compared with other available solutions, include mobility support, that is, the capability of edge devices to keep their identifiers even when they are on the move. Such ID persistence eases the mutual identification and authentication processes between a node and an aggregator node in future interactions. Likewise, the IDMS strategy we propose allows the size of the identifiers that resources use in the network to be adjusted without losing the identity uniqueness property. Finally, unlike other solutions, our proposal focuses on reducing the compute load required to identify the resources in the network. This benefits the entire network, especially the lowest layer of the hierarchy, where resources are very constrained and therefore must be managed as efficiently as possible.

The remainder of this paper is organized as follows. In Sect. 2 other IDMS solutions are reviewed. In Sect. 3 our IDMS proposal is described. The evaluation and results are presented in Sect. 4 and, finally, in Sect. 5 the conclusions and future work are discussed.

2 Related Work

In computer networks, the name and the address of a device stand for two different things. The general distinction between a name and an address is that a name can remain with an entity even when that entity is mobile and moves among different locations (i.e. addresses) [11]. From the IDMS perspective, the mobility support offered by F2C means that the identifiers assigned to the network resources are persistent, i.e., they remain even if attributes such as the location of the devices change. Therefore, addressing techniques are not the proper solution to manage resource identity in F2C. Rather, an IDMS that supports both static and mobile nodes in the network must be considered. Under this premise, in this section we pay special attention to IDMS solutions whose targets include IoT devices. The rationale for this decision is that, generally speaking, the IoT puts together static and mobile devices, so supporting all of them is mandatory in any solution to be deployed in the IoT arena.

In [12], the authors present a smart home operating system for the IoT named EdgeOSH. In EdgeOSH, the architecture component in charge of managing device identities is the naming module. This module allocates unique, human-friendly names describing the location (where), role (who) and data description (what) of the devices, for example, LivingRoom.CellingLight.Bulb2.
These names are used by the operating system to manage services, data and devices. Nevertheless, the way in which EdgeOSH manages device identities presents several drawbacks that prevent it from being used in F2C environments. For example, human-meaningful names make it easier to disclose sensitive information and to access unauthorized network resources through masquerade attacks. Another issue is that the scheme is not prepared to support the tremendously large number of devices expected in F2C; in other words, it is not scalable. Indeed, the authors themselves concluded that an efficient IDMS for the IoT is still an open problem and that further investigation is required.

Motivated by the need for an identity information service in which the provider of the service is unable to access the information that passes through its servers, the authors in [13] proposed BlindIdM, an Identity Management as a Service (IDaaS) model with a focus on data privacy protection. In this model three main types of actors are defined: users, service providers and identity providers. The user is a node in the network that holds the identity information of a set of entities, and its goal is to transfer that information to the service provider in a secure fashion. The authors claim that, through encryption techniques, BlindIdM permits the identity information to be sent from the user to the service provider without the identity provider being able to read it. To achieve this, the information is initially encrypted by the user, then re-encrypted by the identity provider and finally decrypted by the service provider. The results obtained during the evaluation of the proposal show acceptable times for the three cryptographic operations; however, it is important to note that these operations were performed by powerful cloud data centers. Given the decentralized nature of the F2C paradigm, it is likely that some of the key functions of the control plane will be executed at the edge of the network, including the identity management service. In this sense, the three cryptographic operations proposed by the authors may cause an important bottleneck, degrading the overall quality of service (QoS) of the system in terms of response times.

In [14] the authors introduce a user-centric identity management framework for the IoT. They propose the creation of a global identity provider (gIdP), responsible for maintaining global identities, which is used by the service providers (SP) to generate local identities. However, this proposal has two major drawbacks: (i) the global identity provider represents a single point of failure in the system, and such centralization contradicts the F2C paradigm; (ii) the proposed framework is intended to provide identities to users rather than to devices. In F2C, regardless of whether several devices belong to the same person, every node in the network must have its own unique identifier; thus, an object-centric approach should be applied.

The work in [15] presents a machine-to-machine IDMS that allows network devices to generate multiple pseudonyms to be used as identifiers in different applications. It uses anonymous attestation to verify a pseudonym, i.e., an interactive method by which one party proves to another that the pseudonym is valid and should be accepted, without revealing anything other than the validity of the pseudonym.
The problem with implementing this identity management strategy in an F2C system is that anonymous attestation involves a set of complex mathematical operations that the nodes have to perform in order to validate the identities of other nodes. These calculations will add a significant delay to connection establishment between nodes, mainly because of the low computational power of the devices at the lowest F2C layer.

3 IDMS Proposal

Our IDMS proposal consists in partitioning globally unique IDs into a set of smaller fragments (fg). This partitioning allows network resources to be identified by a fraction of their ID, instead of the full identifier, according to their position in the hierarchical F2C network, as shown in Fig. 2.

Fig. 2. Identifier fragmentation.

First of all, we define the connection layers of the hierarchical F2C network. In F2C, the scope of a connection between two nodes is given by the node at the higher hierarchical level. According to [9], three layers are identified at an early stage of the F2C system. However, this three-layer view does not consider inter-service-provider interaction; we therefore add a fourth layer and distinguish the following connection layers:

– Edge: This connection layer covers all connections among resources (physical devices or virtual entities) under the same fog node. The resources that form an area at the edge layer are located geographically close to each other; for example, an edge area could be a hospital building or a school.
– Fog: This connection layer includes the connections among fog nodes and the resources they aggregate. An example is a connection between a sensor and another device grouped under different fog nodes.
– Cloud: This connection layer includes all resource connections established within the same service provider. The main difference with respect to the fog layer is that the resources may be located geographically far from each other, for example, resources in different cities connected by the same Internet Service Provider.
– Global: This connection layer covers all connections among resources in a global scope. In this context, the resources may or may not be located close to each other, and inter-service-provider connectivity plays a key role; for example, a connection between two smart cities served by two different providers.

Figure 3 presents the four described F2C connection layers and their borders. Since the number of layers in the F2C architecture may change, the set of connection layers and the ID fragmentation policy may change as well, so as to stay aligned with the number of F2C layers. It is therefore worth highlighting that this is a simple, illustrative approach.

Fig. 3. Hierarchical F2C network connections.

Once all the F2C connection layers have been defined for the network topology, we divide the resource identifiers into n parts, where n is the number of connection layers defined in the F2C system. Now, every time a connection between two nodes is established in the network, the nodes use a fraction of the identifier, rather than the full identifier, for mutual identification. The number of fragments to be used in each connection depends on the node at the higher hierarchical level. For example, in the F2C network topology illustrated in Fig. 3, device (b) connects to fog node #2, so the connection is set as a fog connection and only two fragments of the global identifier are used during the identification process. In general, from the connection and topological perspective, nodes located at higher layers need to use more ID fragments, and consequently the ID used during connections with other nodes will be longer. The reason is that nodes in higher layers have more devices as children; to be able to identify each of these devices, longer identifier prefixes are required.

Regarding the division of identities into fragments, the length of the fragments may vary according to the different use cases and implementation needs. The lowest layer in an F2C system is the IoT layer. In the IoT layer, the length of the first fragment depends on the maximum number of resource IDs that a fog node can store in its cache during a given period of time, that is, the identifier cache size: larger identifier caches in the fog nodes entail longer identifier fragments. IoT devices usually have limited resources, so small cache sizes can be expected in this layer. Fog nodes can play a key role in adjusting the ID fragment length so that collision problems do not arise. Collision problems in naming are addressed in [16–18]. In the proposed identity management scheme, a collision occurs when two or more resources in the same F2C connection use the same identifier. Since the purpose of an ID is to identify a resource unambiguously, the collision probability must be kept low.

In order to enhance the security and privacy of the IDMS, the full resource identifier is neither propagated nor stored throughout the network; it is only known by: (i) the resource to which the ID belongs; (ii) the fog node, as long as the resource is connected to the F2C network through it; and (iii) other resources in a global connection that require the full resource ID for proper identification. In short, preventing collisions during the identification process is the reason why nodes in a global connection use their full ID instead of a fraction of it. In our proposal, fog nodes play a key role because they perform the ID fragmentation and share the required resource ID fragments with other nodes according to the connection layer involved.
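To make the fragmentation scheme above concrete, the following minimal Python sketch shows one possible way to derive a hexadecimal global ID, expose only the prefix of fragments that a given connection layer requires, and estimate the collision probability for a given fragment length. The per-layer fragment lengths, layer names and function names are illustrative assumptions rather than values prescribed by the proposal, and the SHA-256 derivation simply stands in for a hash-based naming scheme such as the one in [19].

```python
import hashlib
import math

# Illustrative fragment lengths per connection layer, in hexadecimal characters.
# These values are assumptions chosen only to make the example concrete.
FRAGMENT_LENGTHS = {"edge": 4, "fog": 8, "cloud": 16, "global": 32}
LAYER_ORDER = ["edge", "fog", "cloud", "global"]

def full_id(resource_name: str) -> str:
    """Derive a fixed-length hexadecimal identifier for a resource
    (SHA-256 here stands in for a hash-based naming scheme)."""
    return hashlib.sha256(resource_name.encode()).hexdigest()

def id_for_connection(resource_id: str, layer: str) -> str:
    """Return the ID prefix (the first n fragments) used for a connection
    whose scope is decided by the node at the higher hierarchical level."""
    n_fragments = LAYER_ORDER.index(layer) + 1           # edge=1 ... global=4
    prefix_len = sum(FRAGMENT_LENGTHS[l] for l in LAYER_ORDER[:n_fragments])
    return resource_id[:prefix_len]

def collision_probability(cached_ids: int, hex_chars: int) -> float:
    """Birthday-bound estimate of the probability that at least two of the
    IDs cached by a fog node share the same truncated identifier."""
    space = 16 ** hex_chars
    return 1.0 - math.exp(-cached_ids * (cached_ids - 1) / (2.0 * space))

if __name__ == "__main__":
    rid = full_id("hospital-3/ward-2/sensor-17")         # hypothetical resource
    print(id_for_connection(rid, "fog"))                  # first two fragments only
    print(f"{collision_probability(1_000, 8):.2e}")       # roughly 1.2e-04
```

Under these assumptions, a fog node could use the collision estimate to pick the shortest fragment length whose collision probability stays below a target threshold for its expected cache size.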
4 Evaluation and Results

In this section we describe the experiment used to validate our proposal and the results obtained. Two parameters have been considered in the evaluation: we compare the storage required to store the resource identifiers and the query execution times when the resources use their full identifier in the network and when they use a fraction of it.

In F2C, the resources grouped in the lowest layer of the network hierarchy will be the most challenging to identify. This complexity is caused by the tremendous number of devices concentrated at the bottom of the network topology (users' devices, sensor networks and other IoT artifacts), the lack of control that the service provider will have over those devices, and the highly dynamic network topology caused by the inherent mobility of many of them. Recognizing this, in this section we focus on the IoT layer, hence evaluating the performance of our proposal when using the first ID fragment.
4.1 Experiment Description

In the conducted experiment we used a Raspberry Pi 3 Model B. This device integrates a 1.2 GHz quad-core ARM processor and 1 GB of RAM. The reason for using it is that we consider its specifications to be the minimum hardware requirements a device should meet in order to be considered for the fog node role in an F2C system. The software preinstalled on the Raspberry Pi was Ubuntu 16.04 as the operating system and a SQL Database Management System (DBMS).

We then created five databases and filled them with one million synthetic resource identifiers. The length of the resource identifiers in the first database was set to 128 bytes (in line with the length used in [19]); this first database held the full identifiers. The next four databases stored truncated versions of the identifiers in the first database, truncated at 32, 16, 8 and 4 bytes respectively. In all cases, the identifiers were generated using only the hexadecimal charset.

4.2 Used Storage

In F2C, the IoT layer is the one with the most limited resources. In fact, many of the devices that operate in the lowest layer do not even have the necessary hardware resources to process the data they generate, so effective resource management is a must. In this sense, storage is one of the most constrained aspects of the devices in the IoT layer. An F2C framework that requires excessive storage capacity for the data generated at runtime may prevent a large number of devices from being used as fog nodes, causing, in the worst case, the existing fog nodes to reject new connections because they are overloaded.

The storage required to hold the resource IDs in the fog nodes is the first parameter we evaluated. The results obtained during the validation (Table 1) show that truncating the identifiers that the resources use in the IoT layer reduces the disk space required to store them.

Table 1. Database sizes

  Database    Size (MB)   %
  128 Bytes   162.17      100.00
  32 Bytes     67.09       41.37
  16 Bytes     51.08       31.50
  8 Bytes      42.08       25.95
  4 Bytes      37.06       22.85

Table 1 shows the size in megabytes of the databases previously described. The right-hand column presents the percentage of space required by each truncated database with respect to the database that stores the full (128-byte) resource identifiers. It is worth noting that the difference in megabytes between the databases with 8-byte and 4-byte identifiers is minimal, even though the identifiers stored in the former are longer. This is rooted in the fact that the size of the indexes the DBMS maintains does not depend on the length of the fields in the tables. In all cases, the disk space required to store the identifier fragments is between 58.63% and 77.15% less than the space needed to store the full identifiers.
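As an illustration of the setup described in Sects. 4.1 and 4.2, the sketch below generates one million synthetic hexadecimal identifiers, stores the full and truncated versions in separate databases, and reports the resulting file sizes. The paper does not name the DBMS used, so SQLite is assumed here purely as a stand-in; the table and file names are likewise hypothetical, and the absolute sizes will differ from those in Table 1.

```python
import os
import secrets
import sqlite3

ID_LENGTHS = [128, 32, 16, 8, 4]   # identifier lengths in hex characters (Sect. 4.1)
N_IDS = 1_000_000                  # one million synthetic identifiers

def build_database(length: int, full_ids: list[str]) -> str:
    """Store the identifiers truncated to `length` characters, with an index
    on the ID column so that lookups are comparable across databases."""
    path = f"ids_{length}.db"
    conn = sqlite3.connect(path)
    with conn:
        conn.execute("CREATE TABLE resources (id TEXT)")
        conn.execute("CREATE INDEX idx_resources_id ON resources (id)")
        conn.executemany(
            "INSERT INTO resources (id) VALUES (?)",
            ((fid[:length],) for fid in full_ids),
        )
    conn.close()
    return path

if __name__ == "__main__":
    # 128 hex characters correspond to 64 random bytes rendered as hex.
    # Reduce N_IDS on memory-constrained hardware such as a Raspberry Pi.
    full_ids = [secrets.token_hex(64) for _ in range(N_IDS)]
    for length in ID_LENGTHS:
        path = build_database(length, full_ids)
        print(f"{length:>3} chars -> {os.path.getsize(path) / 1e6:7.1f} MB")
```

What matters for the comparison is the relative difference between identifier lengths rather than the absolute file sizes, which depend on the DBMS and its indexing scheme.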
4.3 Query Times

One of the main advantages the F2C paradigm offers is the possibility of executing applications and services with lower delay than cloud computing. This opens the door to the development and deployment of all kinds of novel services that require real-time responses, such as e-health services, online video games, earthquake alarm triggers, etc. To achieve this goal, it is imperative that the individual components of the F2C framework are highly efficient and avoid adding delays to internal processes.

In the F2C framework, the IDMS component should be able to identify resources quickly enough to allow devices on the move to switch among different fog nodes without interrupting their activities. This identification process includes a database lookup. In this sense, our proposal aims at reducing the database lookup times by reducing the amount of information the fog nodes store.

In the validation phase, we used the databases described in Sect. 4.1 to measure the lookup times. For each database, we measured ten times the time required to fetch 200, 400, 600, 800 and 1,000 thousand records, and then calculated the averages of the obtained results (Table 2).

Table 2. Query execution times (for each ID length: average time, and percentage relative to the 128-byte identifiers)

  ID length   IDs in the fog node (thousands)
              200       400       600       800       1,000
  128 Bytes   2.97      6.29      9.62      12.93     16.68
              100.00%   100.00%   100.00%   100.00%   100.00%
  32 Bytes    1.51      2.49      3.97      4.95      7.55
              50.87%    39.65%    41.27%    38.28%    45.24%
  16 Bytes    1.29      2.34      3.43      4.92      6.34
              43.30%    37.24%    35.67%    38.01%    38.01%
  8 Bytes     1.26      2.20      2.93      4.42      5.52
              42.51%    34.90%    30.43%    34.16%    33.10%
  4 Bytes     1.16      1.91      3.14      3.98      5.02
              38.99%    30.31%    32.62%    30.80%    30.11%

Table 2 and Fig. 4 summarize the results obtained. For the sake of comparison, the percentages relative to the first database are also included in Table 2. As can be observed, using a fraction of the full resource identifier significantly reduces the time required to search for an item in the database. By using a quarter of the device name (32 bytes), our proposal shows a reduction of up to 49.13% in the search time. In fact, a 32-byte ID is still a large identifier for the lowest F2C layer, which means that the ID length can be reduced even further and, with it, the search time. It is worth noting that, in general, the times obtained with 8-byte and 4-byte identifiers are very similar, which indicates that the lookup time does not decrease proportionally with the identifier length; this is explained by the indexes and primary keys the DBMS uses to speed up data retrieval.

In Fig. 4, the query execution times are presented graphically. The blue bars represent the lookup times in the database that stores the full resource identifiers. It can easily be observed that, in all cases, the time required to search that database is considerably longer than the query execution times when the resources use a fraction of their full ID. The figure also shows that the gap between the full and the truncated identifiers widens as the volume of data to be handled increases.

Fig. 4. Query execution times.

From the results shown in Table 2 and Fig. 4, we can conclude that when the edge devices use a fraction of their identifier instead of the full version of it, the lookup time decreases significantly (between 54.76% and 69.89% for large volumes of data), all of this without affecting the ID uniqueness property, that is, keeping a very low collision probability.
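The measurement loop behind Table 2 can be approximated with the short benchmark below. It assumes the SQLite databases and table layout from the previous sketch, and it reads the procedure as fetching the first N identifier records from each database and averaging ten repetitions; both this reading of the measurement and all names used are assumptions rather than the authors' actual scripts.

```python
import sqlite3
import time

RECORD_COUNTS = [200_000, 400_000, 600_000, 800_000, 1_000_000]

def average_fetch_time(path: str, limit: int, repetitions: int = 10) -> float:
    """Average wall-clock time to fetch `limit` identifier records,
    mirroring the ten repeated measurements described in Sect. 4.3."""
    conn = sqlite3.connect(path)
    total = 0.0
    for _ in range(repetitions):
        start = time.perf_counter()
        conn.execute("SELECT id FROM resources LIMIT ?", (limit,)).fetchall()
        total += time.perf_counter() - start
    conn.close()
    return total / repetitions

if __name__ == "__main__":
    for length in (128, 32, 16, 8, 4):
        path = f"ids_{length}.db"          # databases built by the previous sketch
        times = [average_fetch_time(path, n) for n in RECORD_COUNTS]
        print(length, " ".join(f"{t:6.2f}" for t in times))
```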
5 Conclusions and Future Work

The F2C computing paradigm has arisen as a novel solution that intends both to manage the resource continuum from the edge of the network to the cloud datacenter and to overcome some of the cloud's inherent limitations, for instance by offering remote resources at the edge, with reduced latency, to delay-sensitive services that require real-time responses. However, there is still a list of open challenges that must be addressed before a deployable F2C framework can exist. One of those challenges is the management of resource identities in the network, especially in the lowest hierarchical layer, where most of those resources will be concentrated.

In this paper, we propose a strategy to manage the identity of the resources that consists of fragmenting the unique global resource ID into smaller fragments. Each time a connection to a resource is established, the fog node that aggregates the resource into the network determines the connection scope and, thereafter, the number of fragments required for mutual, unambiguous identification.

The results obtained during the validation phase show that our proposal reduces both the disk space required to store the resource identifiers in the fog nodes and the query execution times, thereby achieving a more efficient use of resources in the IoT layer and streamlining the resource identification process. Future work on this topic includes implementing the proposal in a real scenario to validate its effectiveness in a complete F2C environment, and proposing an algorithm to determine the optimal fragment lengths for each level of the network hierarchy.

Acknowledgment. This work is supported by the H2020 mF2C project (730929), by the Spanish Ministry of Economy and Competitiveness and the European Regional Development Fund, both under contract TEC2015-66220-R (MINECO/FEDER), and, for Alejandro Gómez-Cárdenas, by the Consejo Nacional de Ciencia y Tecnología de los Estados Unidos Mexicanos (CONACyT) under Grant No. 411640.

References

1. Evans, D.: The Internet of Things: How the Next Evolution of the Internet is Changing Everything (2011)
2. Burkert, A.: Modern Cars' Insatiable Appetite for Data (2017)
3. Mehdipour, F., Javadi, B., Mahanti, A.: FOG-engine: towards big data analytics in the fog. In: 2016 IEEE 14th International Conference on Dependable, Autonomic and Secure Computing, 14th International Conference on Pervasive Intelligence and Computing, 2nd International Conference on Big Data Intelligence and Computing and Cyber Science and Technology Congress (DASC/PiCom/DataCom/CyberSciTech), pp. 640–646 (2016)
4. Ferrer-Roca, O., Roca, D., Nemirovsky, M., Milito, R.: The health fog. Small data on health cloud. Presented at the International eHealth, Telemedicine and Health ICT Forum for Educational, Networking and Business, Luxembourg, 23 April 2015
5. Firdhous, M., Ghazali, O., Hassan, S.: Fog computing: will it be the future of cloud computing? Presented at the Proceedings of the Third International Conference on Informatics and Applications, Kuala Terengganu, Malaysia (2014)
6. Masip-Bruin, X., Marín-Tordera, E., Jukan, A., Ren, G.-J., Tashakor, G.: Foggy clouds and cloudy fogs: a real need for coordinated management of fog-to-cloud (F2C) computing systems (2016)
7. OpenFog Consortium: OpenFog Reference Architecture for Fog Computing, USA (2017)
8. mF2C Consortium: mF2C Project. http://www.mf2c-project.eu/
9. Sarkar, S., Misra, S.: Theoretical modelling of fog computing: a green computing paradigm to support IoT applications. IET Netw. 5, 23–29 (2016)
10. Shi, W., Cao, J., Zhang, Q., Li, Y., Xu, L.: Edge computing: vision and challenges. IEEE Internet Things J. 3, 637–646 (2016)
11.
European Telecommunications Standards Institute: Corporate telecommunication Networks (CN); User Identi?cation in a SIP/QSIG Environment (2004) 12. Cao, J., Xu, L., Abdallah, R., Shi, W.: EdgeOS_H: a home operating system for internet of everything. In: 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS), pp. 1756–1764 (2017) 13. Nuñez, D., Agudo, I.: BlindIdM: a privacy-preserving approach for identity management as a service. Int. J. Inf. Secur. 13, 199–215 (2014) 14. Chen, J., Liu, Y., Chai, Y.: An identity management framework for Internet of Things. In: 2015 IEEE 12th International Conference on e-Business Engineering, pp. 360–364 (2015) 15. Fu, Z., Jing, X., Sun, S.: Application-based identity management in M2M system. In: 2011 International Conference on Advanced Intelligence and Awareness Internet (AIAI 2011), pp. 211–215 (2011) 16. Farrell, S., Kutscher, D., Dannewitz, C., Ohlman, B., Keranen, A., Hallam-Baker, P.: Naming Things with Hashes (2013) 17. Bouk, S.H., Ahmed, S.H., Kim, D.: Hierarchical and hash based naming with Compact Trie name management scheme for vehicular content centric networks. Comput. Commun. 71, 73–83 (2015) 132 A. Gómez-Cárdenas et al. 18. Savolainen, T., Soininen, J., Silverajan, B.: IPv6 addressing strategies for IoT. IEEE Sens. J. 13, 3511–3519 (2013) 19. Gómez-Cárdenas, A., Masip-Bruin, X., Marín-Tordera, E., Kahvazadeh, S., Garcia, J.: A hash-based naming strategy for the fog-to-cloud computing paradigm. In: Heras, D.B., Bougé, L., Mencagli, G., Jeannot, E., Sakellariou, R., Badia, R.M., Barbosa, J.G., Ricci, L., Scott, S.L., Lankes, S., Weidendorfer, J. (eds.) Euro-Par 2017: Parallel Processing Workshops, pp. 316–324. Springer, Cham (2018) A Novel and Scalable Naming Strategy for IoT Scenarios 133 The IoT and Unpacking the He?alump’s Trunk Joseph Lindley(?) , Paul Coulton, and Rachel Cooper Imagination, Lancaster University, Lancaster, UK {j.lindley,p.coulton,r.cooper}@lancaster.ac.uk Abstract. In this paper we highlight design challenges that the Internet of Things (IoT) poses in relation to two of the guiding design paradigms of our time; Privacy by Design (PbD) and Human Centered Design (HCD). The terms IoT, PbD, and HCD are both suitcase terms, meaning that they have a variety of meanings packed within them. Depending on how the practices behind the terms are applied, notwithstanding their well-considered foundations, intentions, and theory, we explore how PbD and HCD can, if not considered carefully, become He?alump traps and hence act in opposition to the very challenges they seek to address. In response to this assertion we introduce Object Oriented Ontology (OOO) and experiment with its theoretical framing order to articulate possible strategies for mitigating these challenges when designing for the Internet of Things. Keywords: Internet of Things · Privacy by Design · Human-Centered Design 1 Introduction Although the term the Internet of Things (IoT) is employed regularly, particular in discussions relating to emerging technologies, its actual meaning is ambiguous as it is de?ned di?erently depending on who’s using it and in what context. Although it was preceded by other terms such as ubiquitous computing and pervasive computing it has gained traction with a general audience, perhaps because the terms ‘internet’ and ‘things’ are more accessible. 
However, having ambiguity baked in to the term means that ‘the IoT’ is likely to be interpreted di?erently dependent upon the meanings a particular individual might associate with these terms. This ambiguity means there is huge varia- tion within discourses utilizing the term. Although the research presented in this paper is aimed at contributing to practices relating to the design of IoT products and services, it also resonates with other, more general, discussions relating to emerging technologies. In particular it seeks to contribute to the debates about privacy, ethics, trust and security in the IoT [37] and understand potential barriers to adoption that may arise through the establishment of problematic design patterns. Our title is a play on the word trunk being synonymous with suitcase, and makes reference to Hyman Minsky’s term, suitcase words. These words describe complex concepts that, when one tries to de?ne them, reveal a nested series’ of other meanings contained within. The other odd term in the title, He?alump, refers a ?ctional elephant like creature, appearing in A.A. Milne’s books about Winne the Pooh. In one story Pooh and his friend Piglet decide to catch a He?alump in a cunning trap, unfortunately they © Springer Nature Switzerland AG 2019 K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 134–151, 2019. https://doi.org/10.1007/978-3-030-02686-8_11 only succeed in trapping themselves. The irony of this story has given rise to He?alump Traps being used by political journalists to describe strategies in which a politician might set a rhetorical trap to catch their opponent and that ultimately back?res on the trapper, leaving them to appear foolish! Thus, despite their intentions, and often ?ne execution, He?alump traps fail to achieve their aims and instead are detrimental toward the desired outcome. In this paper we illustrate how the suitcase terms IoT, Privacy by Design (PbD), and Human Centered Design (HCD) can, become He?alump traps by virtue of their nested complexities. The paper is structured as follows. First, we discuss PbD, paying particular attention to the linguistic complications when trying to de?ne what it really means using the example of the ambiguity present in the European Union’s invocation of the term in the recently introduced (EU) General Data Protection Regulations’ (GDPR). Next, we discuss the challenge to the well-established paradigms of Human-Centered Design (HCD) resulting from the complexities introduced by networked nature of IoT products and services. Third we argue that, if interpreted hubristically, PbD and HCD can result in unintended consequences, and, in essence, become He?alump traps. Finally, we propose the use of new design research techniques incorporating concepts derived contemporary philosophies of technology that can be used to develop and test strategies when navigating the complexities of the IoT and thus to minimize the risk of becoming caught in a He?alump trap. 2 Privacy by Design (and This by That) It is important to start this discussion by acknowledging that PbD does not exist in isolation; there are other propositions which overlap with it such as privacy, security and/or data protection by default. The semantics of the terms use does not aid our under- standing; for example, con?guring something by default would not the same as creating something in a particular way, or put di?erently, by design. Although, for something to have a default con?guration implies that it must have been designed that way. 
Adding to this confusion is the fact that in English language the word ‘design’ can be used in a multitude of di?erent way to mean very di?erent things, e.g. the designer uses her/his knowledge of design to design a thingamajig, which was part of the ?nal system design (which was built in accordance with the original design schematic). It was perhaps inevitable for confusion to result when the terms appeared in an in?uential report in the form “incorporates Privacy by Design principles by default” [6]. The already murky waters that contain PbD are made more di?cult to navigate when we introduce the complex abstractions like ‘privacy’ and ‘security’. To unpack these very quickly: privacy is not the same as security, but in some circumstances, privacy may be delivered by security and conversely security may be delivered by privacy. It is also evident that disciplinary idiosyncrasies can also come into play when trying to bring some clarity to a particular situation. For example, an engineer may interpret security operationally in terms of a particular implementation, like access control lists, whereas a psychologist may draw their understanding from a psychological theory, such as Maslow’s hierarchy of needs. While both considerations are equally valid even when The IoT and Unpacking the He?alump’s Trunk 135 their epistemological roads intersect, a common understanding will not necessary emerge. These de?nitional complexities are not, in themselves, anything to do with how one delivers PbD, they must be acknowledged within any critical discussion. Whilst the argument in this research is relevant to wider discourses of emerging technology, primarily the speci?c issues we are concerned with are (1) Privacy by Design [6] and (2) Data protection by design and by default as referred to in article 25 of the GDPR [42]. Whilst the term PbD emerged originally in a 1995 report1 it came to prominence in 2012 through the work of Ann Cavoukian and Je? Jonas [6]. Introducing PbD Cavoukian quotes the words of a 13th century Persian poet who posits that to ‘reinvent the world’ one must ‘speak a new language’. The premise is that technological progress is itself a new language that brings with it fundamental challenges to the notion of privacy. Going on to provide more concrete examples, the report describes the use of a one-way hash function to protect data subjects’ privacy so that even if patterns can be observed in the data, it cannot be reverse engineered to reveal the names of the participants. While this, and the other examples provided are compelling they are arguably a little naïve. Although in particular contexts such approaches can protect the privacy of individuals represented in the data in the increasingly heterogeneous contexts the IoT represents they can be extremely vulnerable to exploitation through amalgamation with other, seemingly unconnected, data sources and complete reliance on them could prove detri- mental. In the report Cavoukian builds upon the technical contribution of Je? Jonas to propose seven principles for the creation of systems that are private by design. 
These include: • Full attribution of each data record; • Data is tethered (any changes to data are recorded at the time of change); • Analytics only occur when data has been anonymized; • Tamper-resistant audit can be performed; • Systems are created that tend towards false negative rather than false positive in borderline cases; • Self-correcting conclusions (conclusions can be changed based on new data anal- ysis); • Information ?ows are transparent (data movements should be trackable and traceable —whether that is through a hard copy, appears on monitor, or is sent to another system). These principles are aimed at what the report refers to as ‘sense making systems’, systems that synthesize data from multiple systems such as payroll, customer relation- ship management, ?nancial accounting, in order to reach new work?ow conclusions. While the principles make some sense within the bounded context described, they are regrettably too speci?c to become generally applicable to the heterogeneous user groups and devices found within the IoT. In her discussion of PbD Sarah Spiekermann notes “Data is like water: it ?ows and ripples in ways that are di?cult to predict” [33], the implication being that PbD is rather 1 http://www.ontla.on.ca/library/repository/mon/10000/184530.pdf. 136 J. Lindley et al. idealistic and when implemented in practice can be as simple as the utilizing Privacy- Enhancing Technologies with additional security, with the aspiration being an appa- rently “fault-proof” system. Although such an aim is worthy, and the approach is valid, as she states, “the reality is much more challenging”. Spiekermann problematizes this idealism by re?ecting business models of Google and Facebook. They provide a range of apparently ‘free’ services but “without personal data such services are unthinkable”. She argues that proponents of PbD “hardly embrace these economic facts in their reasoning”. In other words, it may not be possible to create feature rich systems that are pro?table for the companies that supply them without contravening some of PbD’s fundamental ideals. In Cavoukian’s response, whilst broadly agreeing with Spiekermann’s analysis, she also insists “the challenges of PbD are not as great as Spiekermann suggested; the engi- neers I have met have embraced the PbD principles, ?nding implementation not di?cult” [5]. Whilst this may be true, it somewhat misses the more interesting element of Spie- kermann’s analysis which touches on potentially systemic shortcomings at the core of PbD’s rhetoric: a ‘fault-proof’ landscape is unrealistic when the ‘economic facts’ of many business models are not acknowledged. Spiekermann’s critique highlights that to do PbD e?ectively, it must become part of overall organizational culture, cutting across management, ?nance, marketing, design and engineering. This is perhaps the reason behind why PbD stagnates, and struggles to move from principles to practicalities— particularly in consumer goods. An alternative perspective on this echoes Shapiro’s suggestion that neither engineers nor customers are able to properly articulate, under- stand, or analyze the impact of ‘non-functional’ requirements like privacy [32]. These hard-to-grasp requirements operate at a completely di?erent level of abstraction to what either engineers and customers are accustomed to thinking about. To recap, the new language of technology is making our world anew, but, we are not yet ?uent in this emerging language. 
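As a minimal illustration of the kind of one-way hash protection referred to above (and revisited in the next paragraph), the Python sketch below replaces data subjects' names with keyed, irreversible pseudonyms so that patterns remain observable in the records without the names being recoverable. It is a generic illustration under assumed names and keys, not the mechanism actually used in the report, and, as the discussion that follows argues, it does nothing by itself to prevent re-identification once the data is amalgamated with other sources.

```python
import hashlib
import hmac
import secrets

# A per-dataset secret key turns the hash into a keyed pseudonym (HMAC-SHA256);
# without the key the pseudonyms cannot be recomputed or reversed.
DATASET_KEY = secrets.token_bytes(32)

def pseudonymise(name: str) -> str:
    """Replace a data subject's name with an irreversible pseudonym, so that
    records belonging to the same subject can still be linked and analysed."""
    return hmac.new(DATASET_KEY, name.encode("utf-8"), hashlib.sha256).hexdigest()

# Hypothetical records: the same subject yields the same pseudonym, so patterns
# in the data survive even though the raw names are never stored.
records = [
    {"subject": pseudonymise("Alice Example"), "clinic_visits": 3},
    {"subject": pseudonymise("Alice Example"), "clinic_visits": 5},
    {"subject": pseudonymise("Bob Example"), "clinic_visits": 1},
]

if __name__ == "__main__":
    for row in records:
        print(row)
```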
While purely technical responses to privacy sometimes appear to o?er faultless solutions (e.g. processing irreversibly hashed data), rarely will such a solution be generalizable across a range of contexts. While principles of PbD appear to be useful mechanisms they can be easily compromised when the complexities of ‘in the wild’ contexts are encountered. Whilst we are not disputing that PbD has demonstrably helped inform the delivery of privacy-aware projects with buy-in from developers, customers, and management alike, such examples appear to be in very speci?c contexts and do not necessarily cut through the aforementioned issues. Although the rhetoric deployed for PbD hints at the practicality of creating a ‘fault-proof’ approach to privacy this fails to appreciate the economic realities of what currently makes data-centric businesses viable. On the 25th May 2018 when GDPR became active the data protection legislation across a large swathe of Europe immediately changed. As GDPR protects citizens regardless of where the data pertaining to them is being held, it has also impacted on any organization who holds data about European citizens. We are yet to fully understand how GDPR will play out in practice, test cases and precedents will need emerge before its full implications are understood. Notwithstanding this uncertainty, GDPR is being cited as a legal framework that will clarify and enforce PbD, because article 25 of GDPR explicitly mentions Data protection by default and design [40]. The opening words of The IoT and Unpacking the He?alump’s Trunk 137 the article say that data controllers must take “the state of the art” approaches of PbD into account however no indication is given to what state of the art might mean in practice [14]. Given that this assertion is made under the heading ‘data protection by design and default’ we might reasonably infer that there is a relationship between the two, although the nature of that relationship is unde?ned. Article 25 also makes reference to the ‘by default’ trope, stating that appropriate measures should be taken to ensure that by default “only personal data which are necessary for each speci?c purpose of the processing are processed”. Thus, it appears that GDPR’s interpretation of data-protection by design, and relatedly by default, is at best ambiguous and certainly does not progress our under- standing of how to e?ectively operationalize the rather abstract principles of PbD. This lack of speci?city with respect to PbD (and its relatives) is not con?ned to the document de?ning GDPR. The UK Information Commissioners O?ce (ICO) which is the UK organization responsible for interpreting and enforcing GDPR calls on data controllers to utilize PbD, but does not pro?er any guidance as to how this may be practically enacted.2 While the de?nitional challenges facing European regulators are undoubtedly signi?cant, by including the terminology within the text of GDPR without attending to PbD’s inherent ambiguity, further challenges are almost certainly abound. 3 Human-Centered Design In his book The Design of Everyday Things [27] Don Norman presented principles for designing ‘things’ in such a way that human interaction with them is smooth and fruitful. Until relatively recently such interactions tended to occur predominantly between users, things and/or systems that were standalone and self-contained. In the book Norman provides numerous examples including a refrigerator, a telephone, and a clock. 
Despite the fact that some of his examples, such as the telephone, depend upon several technol- ogies interacting across a diverse technical infrastructure, the user experience of using the phone is encapsulated within a discrete interface made up of handset, dialer, and ringer. Today, interactions occur in much more complex contexts which present designers with new challenges. The “networki?cation of the devices that previously made up our non-Internet world” [29] is creating the IoT and while, interactions with these devices may appear familiar on the surface they inevitably produce an associated digital residue. This digital residue is data, and in stark contrast to the “visibility, appro- priate clues, and feedback of one’s actions” that Norman highlights as key properties of HCD [27:8–9] the full impact of the data is rarely visible either during or after actual user interactions (with connected, or IoT, devices). While this data is necessary to support business models, to train algorithms and, ultimately, to make stu? work, it is possible that by obscuring agency of underlying data, models and algorithms at the point of interaction, designers are in fact operating against the underlying ideology of HCD. The foundations of HCD are in ergonomics with the aim of supporting the “ways in which both hardware and software components of interactive systems can enhance human-system interaction” [43]. Despite being demonstrably useful [2, 16] this engi- neering derived paradigm relied on simpli?cations of complex contexts [11, 13, 38]. 2 https://ico.org.uk/for-organisations/guide-to-data-protection/privacy-by-design/. 138 J. Lindley et al. These reductive stances are incompatible with other more modern approaches that have become integral to HCD and acknowledge “the coherence of action is not adequately explained by either preconceived cognitive schema or institutionalized social norms” [36:177]. The result is that HCD methods have become extremely diverse, build upon a variety theoretical and epistemological stances, and are applied variously as both an evaluative and a generative tool [13, 23, 34]. The spectrum of approaches to utilizing HCD now includes methodological assemblages that can draw upon ethnography, participatory design, cultural probes, workshop techniques, scenarios, extreme users, and personas. Applied sensitively these techniques can produce designs that are “phys- ically, perceptually, cognitively and emotionally intuitive” [13], while also matching “the needs and capabilities of the people for whom they are intended” [27:9]. Whilst it’s true that “there is no simple recipe for the design or use of human-centered computing” [17], HCD—particularly among the design research community—has become ubiqui- tous is greatly in?uence on the technologies that concurrently we shape, and then ulti- mately shape us. Even amongst this diverse methodological landscape, a core theme that pervades HCD utilization is the axiom of simplicity. This is oft interpreted to mean that HCD should inform the design of services and software that are e?cient, e?ortless, and edifying to use; that fade into the background becoming invisible, and that ensure any complexity is that of the underlying task and not of the tool that has been developed to achieve it [25:197, 26]. 
Norman himself acknowledges that dogmatically blunt inter- pretations of this simplicity axiom can, perhaps unsurprisingly, introduce unintended consequences that drive HCD towards a “limited view of design” and result in analysis preoccupied with narrowly focused “page-by-page” and “screen-by-screen” [24] eval- uations. This narrow focus can sti?e potential users, and/or researchers, form being able to fully intuit a particular designed ‘thing’ on a crucial cognitive, emotional, and percep- tual level. In the hyper-connected and data-mediated assemblages of the IoT, the prev- alent assumption that simpler-is-better is already proving highly problematic as the recent revelations concerning Facebooks use of data illustrate. While some aspects of HCD are worthy and hold fast, the complexity, ubiquity, and interconnectedness of systems—represented by the IoT—means that HCD needs to be reevaluated. In the age of the IoT, whilst we need to re?ect the human centered ideals of HCD, it may be necessary to accept that there are, e?ectively, multiple centers and actants relevant to any given interaction. 4 Hubris and He?alumps The common thread that connects the previous discussions of PbD and HCD relates to the risk that occurs when their principles are interpreted hubristically; with excessive self-con?dence. To illustrate this, take a moment to think about the story of the Titanic. The ship employed cutting edge technology in an e?ort to make as safe as possible and was famed for being ‘unsinkable’. As well as explaining a lack of lifeboats on board, this in?ated con?dence meant that even though a spotter saw the iceberg in good time, the helmsman was never asked to take avoiding action—if the ship is The IoT and Unpacking the He?alump’s Trunk 139 unsinkable, why avoid a sinking hazard? After the tragedy the owners were accused of using misleading rhetoric about her sinkability, in response they pointed out their claim was only that the ship was designed to be unsinkable (as opposed to actually being unsinkable). The tale of the Titanic illustrates that hubristic reliance can, if circumstances conspire, be extremely dangerous. Relying on supposed guidelines and principles for HCD and PbD is, arguably, equivalent to the Titanic’s relying on cutting edge anti-sinking technologies. Hence, we cast HCD and PbD as potential He?alump traps. By solely relying on these approaches —despite their unequivocal worthy aims and demonstrated practical virtues—technol- ogists may inadvertently end up ensnaring themselves by the very issues that HCD or PbD may have sought to avoid (see Fig. 1). The problem, in many ways, is with binary and didactic positions. Describing ships as unsinkable, systems as private, or designs as human centered—is irrational. The results of such irrational beliefs may, at worst, result in tragedies like the Titanic. The IoT is so pervasive that the scope of resulting impacts range from the relative inconsequence of the Mirai botnet taking down Net?ix, through to the destabilization of national infrastructure and potential dissolution of democratic processes. Fig. 1. Depiction of a He?alump Trap. If treated insensitively, ideals like PbD and HCD may coerce technologists to believe that privacy is something that can be ‘achieved’ and a system’s simplicity is analogous to being ‘human centered’. Notions of apparently perfect systems are as dangerous as considering a ship unsinkable; these positions are misconceptions. 
Ship captains, system developers, and He?alump trappers alike; be careful. Don’t suggest your ocean liner is 140 J. Lindley et al. unsinkable, don’t believe your door-lock is uncrackable, don’t attempt to trap the made-up animal—refrain from assuming that it might be feasible to design a computerized device that is perfectly private by design. Do, however, embrace those driving ideals, just with a healthy skepticism towards the hubristic tendencies. In the following we describe theoretically-informed strategies to mitigate the dangers of hubris and He?a- lumps. 5 Tempering the Hubris; Designing a Philosophical Response 5.1 Object Oriented Ontology In the following we introduce Object Oriented Ontology (OOO), a modern philosophy which can help to make sense of the complex heterogeneous contexts emerging from the IoT that are so problematic for PbD and HCD. This framework is enacted with a contemporary speculative design methodology, Design Fiction [7, 19], to develop responses to the problematic aspects of PbD and HCD’s He?alump traps. We are not scholars of philosophy; hence we do not intend to discuss the nuances of OOO’s place within the broader gamut of philosophy and theory. However, in order to add some context in the following we o?er a short introduction to OOO, speci?cally within the context of computing and HCD. Philosophically underpinning HCD’s simplicity axiom in studies of Human- Computer Interaction, Heidegger’s seminal Being and Time argues most objects and tools make most sense in relation to human use. Heidegger uses a hammer as an example, he says that technologies are either ‘ready-to-hand’ (in their normal context of use) or ‘present-at-hand’ (if the ‘norm’ is disrupted, for example if the head fell o? the hammer). The metaphysics of this distinction are fascinating, but the salient issue is that the hammer comes to ‘Be’ through interaction with a human. As such the hammer’s very existence is the product of a correlation between the human mind, and the physical world [3]. This conceptual con?guration described as ‘correlationism’ [15]. What OOO does di?erently is to reject correlationism, and by doing so creates the possibility that objects have realities that are independent from human use and the mind/world correlation. Seen this way anything from a ?ber optic cable, to a blade of grass, to a quantum computer, to an apple pie—may be given agency in its own ontological limelight. If we imagine that every individual concept—the ?ber cable or the blade of grass—giving o? a little light in this way, then we might say their collective hue is the “?at ontology” that scholars of OOO refer to [4]. “In short, all things equally exist, yet they do not exist equally […] This maxim may seem like a tautology—or just a gag. It’s certainly not the sort of quali?ed, reasoned, hand-wrung ontolog- ical position that’s customary in philosophy. But such an extreme take is required for the curious garden of things to ?ow. Consider it a thought experiment, as all speculation must be: what if we shed all criteria whatsoever and simply hold that everything exits, even things that don’t? […] none’s existence fundamentally di?erent from another, none more primary nor more orig- inal.” [3:11] Bogost uses the famously ill-fated video game E.T. the Extra-Terrestrial as an example of how a single thing can be broken into many di?erent types of OOO object. 
He notes The IoT and Unpacking the He?alump’s Trunk 141 that the game is simultaneously: a series of rules and mechanics; source code; source compiled into assembly; radio frequency signals; a game cartridge; memory etched on silicon; intellectual property; arguably ‘the worst game ever made’; a portion of the 728,000 Atari games that were once buried in the ground in New Mexico;3 a conglom- erate of all of these. There is no fundamental thing which de?nes The E.T. video game. Instead it is all of these things simultaneously, and all of them independently of any human interaction. Contemplating what this sort of shift in ontology could mean Bogost muses “the epistemological tide ebbed, revealing the iridescent shells of realism they had so long occluded” [3]. This branch of metaphysics may seem very far removed from the development of technology, however, through a more practically-oriented approach known as Carpentry it can be materialized. Carpentry involves the creation of “machines” that attempt to reveal clues about the phenomenology of objects. While it’s accepted that objects’ experiences can never be fully understood, the machines of carpentry act as proxies for the unknowable. They pro?er a “rendering satisfactory enough to allow the artifact’s operator to gain some insights into an alien thing’s perspective” [3:100]. Sometimes achieved through programming, and sometimes through other practice, “through the making of things we do philosophy” [41]—lending the theory a material tangibility is the kernel of Carpentry. The purpose of Carpentry is to give the otherwise ethereal study of ontology a very practical legitimacy: “If a physician is someone who practices medicine, perhaps a metaphysician ought be someone who practices ontology. Just as one would likely not trust a doctor who had only read and written journal articles about medicine to explain the particular curiosities of one’s body, so one ought not trust a metaphysician who had only read and written books about the nature of the universe.” [3:91] 5.2 Design Fictions All design usually seeks to change the current context, and thus to create futures by answering questions or solving problems [22]. Speculative design is somewhat di?erent, it uses design to pose questions about possible futures, rather than to answer them.4 This family of design practices does not aim to create products for market, or which solve a real problem, instead they use the traditions of design in order to elicit insights and provoke new understandings [1, 8, 9] (a stance that is central to ‘Research through Design’ [10, 12]). The speculative design landscape is quite broad5 however the speci?c approach we employed in this work is Design Fiction. There continues to be much disagreement about the ‘best’ ways to do Design Fiction, but the ‘Design Fiction as World Building’ approach [7] is the one we adopted with this work. Doing Design Fiction this way involves designing a series of artifacts which all 3 4 cf. https://en.wikipedia.org/wiki/E.T._the_Extra-Terrestrial_(video_game). “A/B” is an excellent keyword based summary of the contrast between a?rmative and spec- 5 ulative design [30]. Dunne and Raby’s book [9] provides a thorough overview of speculative design practice and Tonkinwise’s review of the book o?ers some useful critique of speculation tooå [39]. 142 J. Lindley et al. contribute to the same ?ctional world. Individual artifacts act as ‘entry points’ in to the ?ctional world by depicting parts of it at a range of di?erent scales (Fig. 2). 
This results in a reciprocal prototyping effect: the artifacts define the world, the world prototypes the artifacts, which, in turn, prototype the world.

Fig. 2. Design Fiction as World Building

We utilize Design Fiction this way in a form of Bogostian Carpentry. In Bogost's examples he explores the inner world of objects by using computer code. The flexibility of code allows him to, effectively, 'play God' within that realm. The demiurgic quality afforded Bogost by using computer code also exists when building Design Fiction worlds. However, instead of the functions, APIs and code of the computer's domain, it is the essence of Design Fiction worlds, and the designed things that define them, that are the tools of this particular creationist trade.

The World's First Truly Smart Kettle. Employing the world building approach, we attempted to enact Bogostian carpentry in the design of a smart kettle; the kettle is branded as Polly, in reference to the nursery rhyme Polly Put the Kettle On. The contours of Polly's world are crafted through the creation of various artifacts, including a fictional press release for the kettle, packaging materials, and user interfaces. The press release describes many of the kettle's features; these include smart notifications, integration with social media, voice commands, energy tracking, location-based boiling, and the trademarked JustRight smart fill meter. Some of these features are prototyped in user interface designs (e.g. Fig. 3), and the artifacts aim to provide historical context to the Polly world too: the product was originally crowdfunded before subsequently being bought out by Amazon's IoT division; it is regulated by a government organization, and in order to achieve its accreditation it must utilize the Minimum Necessary Datagram Protocol [cf. 20, 22].

Fig. 3. Polly's OOO-inspired timeline and volumetric data graph.

When building Polly's fictional world we worked from the assumption that continuing IoT adoption will result in even more ubiquity of data-collecting devices [35]. Among these, presumably, devices such as kettles will (continue to) collect data too. Today, the visibility of the data shared by these devices is at best opaque and at worst absent, isolating the user from the underlying data transactions. While PbD principles can protect the user from unwanted or nefarious processing of their personal data, on occasions where that sort of processing is necessary to facilitate the device's functional requirements, the better alternative would be to communicate the nature of the data transactions rather than disguising them. We may liken this to an autonomous car that would choose an optimized route to its destination. Most of the time, routes designed to reduce journey times are desirable, but if the car were designed in such a way that it would not reveal precisely what that route was, it would likely engender a feeling of distrust. Responding to this need we constructed two key features in Polly's fictional world. Figure 3 (left) shows a timeline depicting events taking place over the course of a day. From the timeline, we can tell that, in data terms, Polly was dormant for over 4 h since the 'daily cloud pingback', which uploads usage data to the cloud and downloads configuration, security, and update data from the cloud. We can also see that Polly was removed from its base and partially refilled, at which point the kettle's software anticipates that it may be boiled soon.
We can see that removing the kettle from the base and refilling it result in the immediate sharing of data to the cloud. The anticipation event, however, does not share data to the cloud, but it does share data with the home's smart meter and other appliances to inform them of an impending power-consumption spike. The right-hand side of Fig. 3 depicts the volume of the data uploaded from Polly, downloaded to Polly, and moving around the local network. This display differs from the timeline in that we cannot tell from it why data is moving around. However, what we can tell is the relative amount of data this smart kettle consumes and generates, as well as the relative volume of those transactions. Both displays are intended to be used in conjunction with each other such that Polly is quite transparent about what it communicates and for what purposes. Based on the examples we can infer that Polly downloads much less data than it uploads. The specific reason for the upload/download disparity is not important; rather, the takeaway point is that by utilizing Carpentry and Design Fiction, considering the reality of the kettle itself and giving the kettle's Object Oriented perspective as much weight as the user's perspective and the manufacturer's perspective, a more egalitarian interface can be designed that doesn't detract from the usability forwarded by HCD or the privacy credentials of PbD, but that does reveal the reality of what is happening and why, thus detracting from the dangers of hubris.

Orbit, a Privacy Enhancing System. This project was in part motivated by a desire to explore how the European Union's GDPR may impact on user/technology interactions. We were minded to develop a system that could obtain GDPR-compliant consent in a modern, simple and transparent way. Although legal precedents are yet to be tested and established in court, the articles of the GDPR theoretically protect various rights, including: the right to be aware of what personal data is held about an individual; the right to access personal data; the right to rectify inaccurate data; the right to move personal data from one place to another; the right to refuse permission for profiling based on personal data; and the right that any consent obtained relating to personal data must be verifiable, specific, unambiguous and given freely. The process by which users consent to have their data collected and processed is an area of particular contemporary relevance. The alleged involvement of British marketing company Cambridge Analytica in Donald Trump's election victory, and how, if this is shown to be true, consent was gained for the collection and processing of data from Facebook, is one factor driving interest in consent. Although some advances have been made in recent years (for example, pre-checked boxes and non-consensual cookie usage were both outlawed in Europe in 2011; see http://www.bbc.co.uk/news/world-europe-15260748), tick boxes for users to indicate they have understood and agree to conditions of use are still the norm. There are fundamental problems with this approach, the most obvious of which being that while users often tick boxes saying they have read terms and conditions, the tick is no indication of whether they have actually read the text, nor whether they have understood it. In one study only 25% of participants looked at the agreement at all, and as little as 2% could demonstrate comprehension of the agreement's content [28].
User agreements that obtain a wide spectrum of consent, whereby a user gives all the permission a device or service could ever possibly need, stifle users' agency to be selective about which features of a system they would like to use (which in turn seems to contravene the GDPR-protected right to specific and unambiguous consent). These systems also fail to account for changes over time; once consent has been gained it is frequently impossible (or very difficult) to remove or change the nature of the consent. Again using the Design Fiction world building approach, we decided to build the world around an IoT lock device. Inspired by IoT locks that already exist on the market (cf. http://uk.pcmag.com/surveillance-cameras/77460/guide/the-best-smart-locks-of-2017), the fictional lock was imbued with the following features:

• Using short-range radio instead of a key;
• Location-based access (geofencing);
• Temporary access codes (for guests);
• Integration with voice agents (e.g. smart assistants);
• Integration with other services such as If This Then That (IFTTT).

Each feature has a different relationship with the data collected, where that data is stored, and how it is processed. Using a short-range radio (NFC) instead of a key relies only on data inside the user's own network; location-based access requires that data be accessed and stored by the lock company; and utilizing services like IFTTT would lead to data being shared with any number of 3rd parties. Given that our purpose was to explore GDPR-compliant consent mechanisms, our crafting of the Design Fiction paid only brief attention to the technical implementation (we assumed that the lock would utilize an IoT radio standard such as ZigBee and that suitable APIs facilitate integration with external services such as IFTTT). Our original aim with this project was to design a map that could be used during a consent procedure to show a user what data goes where, so that they would be "informed by design" [21]. However, this aim was immediately challenged by the vast number of possible variations, even within a relatively small and straightforward IoT context. Figure 4 illustrates a scenario with an IoT lock which has been configured to turn on a smart lighting system when the user opens their door. While the cause and effect are simple and clear to the user (opening the door makes the lights turn on), there are actually several cloud-based services behind the scenes that are necessary to make the hardware work. There may also be unknown 3rd parties using the data too (e.g. data brokers). Hence, to turn this into a map that details precisely where data goes, when, and in what circumstances is simply not possible. A significant factor driving this challenge is that each specific situation needs to be treated as an ad hoc scenario, as something completely unique [31].

Fig. 4. Diagram showing how a user opening the door may trigger a number of possible data flows around the constellation, and that there is no single end point.

In order to progress, some of the design parameters had to be amended. Initially we made our investigation more tightly scoped: rather than addressing GDPR compatibility per se, we focused solely on personal identifiability. Next, it was necessary to forget the reducible concept of a map that would represent specific and quantifiable measures of probable risk, and accept that any map would require much more extensive use of 'shades of grey'.
As a result of these changes, our experiment with OOO went in directions we had not predicted. Our original intention was that OOO's tiny ontologies would provide us with the means to investigate the lock, the associated data streams, and potential users; our attempt at carpentry, we thought, would lead us to a deeper understanding of those objects directly. Contrastingly, however, what came to pass is that our carpentry resulted in the creation of an entirely original object (complete with its own tiny ontology). The purpose of this new object is to provide a new lens for looking at collections of IoT devices, platforms, the data that mediates between these, and the people that use them. These new objects, referred to as Orbits, communicate the relative likelihood that a person may be identified based upon device use. They present this in a fashion that distinguishes between data held locally, with known providers, or with unknown 3rd parties. These 'maps' provided some means to bridge between the vast gamut of possibilities in the computer-world and the succinct concreteness of judging acceptability in the human-world. They facilitate value judgements. The privacy Orbits map IoT systems and the data they utilize, and communicate the likelihood of identifiability based on data held in different places. The 'levels' (i.e. each concentric circle) represent data that is held locally, with known providers, or with unknown 3rd parties (see labels in Fig. 5). The definition (blurriness or sharpness) at the edge of each level describes the probability, or certainty, of the user being identifiable based on the data at that specific level. If the inner-most level has a pin-sharp edge, then it is almost definite that the user could be identified based on those data (e.g. the right-hand diagram's 1st level in Fig. 5). Blurrier levels mean that the chance of identifiability is reduced (e.g. the left-hand diagram's 3rd level in Fig. 5).

Fig. 5. Example identifiability Orbits (the name 'Orbit' stems from a visual similarity to the diagrams used in the Bohr model of the hydrogen atom (https://en.wikipedia.org/wiki/Bohr_model)).

The Design Fiction world we had created was a useful tool to then import the identifiability Orbits into, and to prototype how they might be used. We created a short film that shows a user installing a new IoT smart lock device in their home (https://youtu.be/A37SmnNFstA) using a voice interface and a supporting app. In essence the user is provided with a slider which enables or disables all the possible functions of the lock, and the Orbits communicate how the associated changes in data flows impact on identifiability. The same scenario may be extended to show the implications of dynamically modifying settings, for example to temporarily provide access to a delivery agent using a system similar to Amazon Key (https://www.theverge.com/2017/10/25/16538834/amazon-key-in-home-delivery-unlock-door-prime-cloud-cam-smart-lock). If the user has configured their system for maximum privacy (or, minimal identifiability), then Orbits could be used to temporarily provide access to the 3rd party and to show the user what the impact on data flows would be. Though this interaction is clearly achievable, it raises a host of other questions relating to the temporality of consent. For example, if a user gives consent for their data to be used by a 3rd party for a few hours, what happens to that data after those hours have elapsed?
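As a purely illustrative sketch of the kind of data model that might sit behind an Orbit-style display (this is our reading of the description above, not an artifact from the design fiction itself; the class and field names are hypothetical), each concentric level could carry an estimated identifiability probability from which the edge definition is derived:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class OrbitLevel:
    """One concentric level of an identifiability Orbit."""
    name: str              # "local", "known providers" or "unknown 3rd parties"
    p_identifiable: float  # estimated probability (0..1) that the user can be
                           # identified from data held at this level

    @property
    def edge_definition(self) -> float:
        """Sharper edge (closer to 1) = near-certain identifiability;
        blurrier edge (closer to 0) = identifiability unlikely."""
        return self.p_identifiable


def render_orbit_text(levels: List[OrbitLevel]) -> str:
    """Textual stand-in for the graphical Orbit, innermost level first."""
    rows = []
    for i, level in enumerate(levels, start=1):
        rows.append(f"level {i} ({level.name}): identifiability "
                    f"{level.p_identifiable:.0%}, edge definition {level.edge_definition:.2f}")
    return "\n".join(rows)


# Hypothetical lock configuration: key data stays local, little reaches 3rd parties.
print(render_orbit_text([
    OrbitLevel("local", 0.95),
    OrbitLevel("known providers", 0.40),
    OrbitLevel("unknown 3rd parties", 0.05),
]))
```

In a working interface, toggling a feature via the consent slider would presumably just update these per-level probabilities and redraw the Orbit.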
6 Discussion and Conclusions

Our OOO-informed Design Fictions work within the boundaries of the following sentiments: "the Internet must be grasped in metaphorical terms" [29] and "Security by design and privacy by design can be achieved only by design. We need a firmer grasp of the obvious" [32]. Of course, acting on such sentiments is easier said than done, particularly when each of the constructs that we deal with (IoT, PbD and HCD) is a suitcase term with multiple possible meanings. Because of this network of problematic aspects, we assert that drawing on philosophy, and employing speculative design, is a productive way to begin to unpack the problem (as opposed to more directly applied/engineering-led approaches). The examples we have provided above are intended to be used in two ways. First, we wish to forward the method itself: enacting Bogostian Carpentry as a way of practicing OOO to address the complexities of PbD and HCD in an IoT context. This conclusion is relatively straightforward; we invite other researchers and technologists to apply a similar method and in doing so research the concepts further. Second, using Design Fiction as a method of Research through Design [10, 12], we offer the following primary contributions which may be directly applied by technologists.

Augmenting HCD with Constellations. Our critique and exploration of HCD is not meant unkindly. We acknowledge and applaud the rich history that HCD has, and rather than calling out shortcomings we wish to augment it for the 21st century. Thus, we propose the 'Constellation' design metaphor. This is a wrapper for the complexities of OOO and calls upon designers, developers and analysts to understand and acknowledge multiple different perspectives in their products. Just as the constellations in the night sky appear different depending on where you stand, the constellations of devices, data, networks, and users of the IoT appear different depending on who you are. Rather than obfuscating this complexity, interfaces such as those exemplified in Polly and Orbit should communicate and reveal the complexity so as to inform all parties of any relevant others' interests, activities, and agency. In doing so, the otherwise well-developed tools in HCD's toolbox may be utilized and leveraged in order to produce technologies that deliver on the promise of the IoT without compromising users' interests.

Humbling the Hubris; Toward Informed by Design. Precisely echoing our exploration of HCD, the perspective we present on PbD is not a scornful one. However, we cannot escape the temptation to use guidelines and principles as a kind of 'safety blanket' beneath which technologists may hide, hubristically arguing that 'because I have ticked the boxes my system design is good enough to protect privacy'. Systems should be designed in such a way that the potential conflation of understanding relating to privacy, security, and data protection by design (and/or by default) is reduced; this may be achieved by purposeful disambiguation.
This disambiguation may involve acknowledging that manufacturers cannot guarantee total privacy and explaining the factors which underpin that uncertainty (as demonstrated in the privacy Orbits in particular). The complexities of non-functional requirements, particularly in IoT contexts, should be approached heuristically; users, and every other actor in the given constellation, should be given the agency to understand any given situation for themselves.

Avoid Heffalump Traps. Adoption of IoT devices has unequivocal societal and economic benefits, but to capitalize on those benefits designers, engineers and policy-makers need to set aside beliefs that are founded on the conceptual possibility of 'perfect' systems. Such beliefs are incongruous with the unavoidable realities of privacy, trust, and security issues. Instead, the IoT needs to be designed with a considered approach that accepts that IoT devices definitely do pose problems for individuals' privacy, but that those problems can be tempered by subtly shifting our design paradigms such that they incorporate constellations of meaning and inform all participants in a constellation of their roles within it. To reinvent the world, we must speak a new language, and that language should ensure that Heffalump traps are not part of the vernacular.

Acknowledgements. This research was supported by the RCUK Cyber Security for the Internet of Things Research Hub PETRAS under EPSRC grant EP/N02334X/1.

References
1. Auger, J.: Speculative design: crafting the speculation. Dig. Creat. 24(1), 11–35 (2013). https://doi.org/10.1080/14626268.2013.767276
2. Bevan, N.: How you could benefit from using ISO standards. In: Extended Abstracts of the ACM CHI 2015 Conference on Human Factors in Computing Systems, pp. 2503–2504 (2015). https://doi.org/10.1145/2559206.2567827
3. Bogost, I.: Alien Phenomenology, or What It's Like to Be a Thing. University of Minnesota Press, Minneapolis (2012)
4. Bryant, L.R.: Democracy of Objects. Open Humanities Press, London (2011). https://doi.org/10.3998/ohp.9750134.0001.001
5. Cavoukian, A.: Operationalizing privacy by design. Commun. ACM 55(9), 7 (2012). https://doi.org/10.1145/2330667.2330669
6. Cavoukian, A., Jonas, J.L.: Privacy by Design in the Age of Big Data (2012)
7. Coulton, P., Lindley, J., Sturdee, M., Stead, M.: Design fiction as world building. In: Proceedings of the 3rd Biennial Research Through Design Conference (2017). https://doi.org/10.6084/m9.figshare.4746964
8. Dunne, A.: Hertzian Tales: Electronic Products, Aesthetic Experience, and Critical Design. The MIT Press, London (2006)
9. Dunne, A., Raby, F.: Speculative Everything. The MIT Press, London (2013)
10. Frayling, C.: Research in art and design. R. Coll. Art Res. Pap. 1(1), 1–9 (1993)
11. Gasson, S.: Human-centered vs. user-centered approaches to information system design. J. Inf. Technol. Theory Appl. 5(2), 29–46 (2003)
12. Gaver, W.: What should we expect from research through design? In: Proceedings of the 2012 ACM Annual Conference on Human Factors in Computing Systems - CHI 2012, p. 937 (2012). https://doi.org/10.1145/2207676.2208538
13. Giacomin, J.: What is human centred design? Des. J. 17(4), 606–623 (2014). https://doi.org/10.2752/175630614X14056185480186
14. Von Grafenstein, M., Douka, C.: The "state of the art" of privacy- and security-by-design (measures). In: Proceedings of MyData (2017)
15. Gratton, P., Ennis, P.J.: The Meillassoux Dictionary. Edinburgh University Press, Edinburgh (2014)
16. Jokela, T., Iivari, N., Matero, J., Karukka, M.: The standard of user-centered design and the standard definition of usability. In: Proceedings of the Latin American Conference on Human-Computer Interaction - CLIHC 2003, pp. 53–60 (2003). https://doi.org/10.1145/944519.944525
17. Kling, R., Star, S.L.: Human centered systems in the perspective of organizational and social informatics. ACM SIGCAS Comput. Soc. 28(1), 22–29 (1998). https://doi.org/10.1145/277351.277356
18. Lindley, J., Coulton, P.: On the Internet Nobody Knows You're a Whatchamacallit (or a Thing). Making Home: Asserting Agency in the Age of IoT Workshop (2017). http://eprints.lancs.ac.uk/84761/1/On_the_Internet_Everybody_Knows_Youre_a_Thing.pdf
19. Lindley, J., Coulton, P.: Back to the future: 10 years of design fiction. In: Proceedings of the 2015 British HCI Conference, pp. 210–211 (2015). https://doi.org/10.1145/2783446.2783592
20. Lindley, J., Coulton, P., Cooper, R.: Why the Internet of Things needs object orientated ontology. Des. J. (2017). https://doi.org/10.1080/14606925.2017.1352796
21. Lindley, J., Coulton, P., Cooper, R.: Informed by design. In: Living in the Internet of Things: PETRAS Conference (2018)
22. Lindley, J., Sharma, D., Potts, R.: Anticipatory ethnography: design fiction as an input to design ethnography. In: Ethnographic Praxis in Industry Conference Proceedings 2014, vol. 1, pp. 237–253 (2014). https://doi.org/10.1111/1559-8918.01030
23. Macdonald, N., Reimann, R., Perks, M., Oppenheimer, A.: Beyond human-centered design? Interactions (2005). https://doi.org/10.1145/1013115.1013184
24. Norman, D.A.: HCD Harmful? A Clarification - jnd.org. http://www.jnd.org/dn.mss/hcd_harmful_a_clari.html
25. Norman, D.A.: The Invisible Computer: Why Good Products Can Fail, the Personal Computer is So Complex, and Information Appliances are the Solution. The MIT Press, London (1998)
26. Norman, D.A.: Human-centered design considered harmful. Interactions 12(4), 14 (2005). https://doi.org/10.1145/1070960.1070976
27. Norman, D.A.: The Design of Everyday Things, Revised edn. Basic Books, New York (2013)
28. Obar, J.A., Oeldorf-Hirsch, A.: The biggest lie on the internet: ignoring the privacy policies and terms of service policies of social networking services. In: The 44th Research Conference on Communication, Information and Internet Policy (2016). https://doi.org/10.2139/ssrn.2757465
29. Pierce, J., DiSalvo, C.: Dark clouds, Io$#!+, and ? [Crystal Ball Emoji]: projecting network anxieties with alternative design metaphors. In: Proceedings of the 2017 Conference on Designing Interactive Systems, DIS 2017, pp. 1383–1393 (2017). https://doi.org/10.1145/3064663.3064795
30. Raby, F., Dunne, A.: A/B (2009). http://www.dunneandraby.co.uk/content/projects/476/0. Accessed 27 Oct 2014
31. Schraefel, M.C., Gomer, R., Alan, A., Gerding, E., Maple, C.: The Internet of Things: interaction challenges to meaningful consent at scale. Interactions 24(6), 26–33 (2017). https://doi.org/10.1145/3149025
32. Shapiro, S.S.: Privacy by design. Commun. ACM 53(6), 27 (2010). https://doi.org/10.1145/1743546.1743559
33. Spiekermann, S.: The challenges of privacy by design. Commun. ACM 55(7), 38 (2012). https://doi.org/10.1145/2209249.2209263
34. Steen, M.: Tensions in human-centred design. CoDesign 7(1), 45–60 (2011). https://doi.org/10.1080/15710882.2011.563314
35. Sterling, B.: The Epic Struggle of the Internet of Things. Strelka Press, Moscow (2014)
36. Suchman, L.: Human-Machine Reconfigurations: Plans and Situated Actions. Cambridge University Press, Cambridge (2007)
37. Taylor, P., Allpress, S., Carr, M., Norton, J., Smith, L.: Internet of Things: Realising the Potential of a Trusted Smart World (2018). https://www.raeng.org.uk/publications/reports/internet-of-things-realising-the-potential-of-a-tr
38. Thomas, V., Remy, C., Bates, O.: The limits of HCD. In: Proceedings of the 2017 Workshop on Computing Within Limits - LIMITS 2017, pp. 85–92 (2017). https://doi.org/10.1145/3080556.3080561
39. Tonkinwise, C.: How we intend to future: review of Anthony Dunne. Des. Philos. Pap. 12(2), 169–187 (2014). https://doi.org/10.2752/144871314X14159818597676
40. Vollmer, N.: Article 25 EU General Data Protection Regulation (EU-GDPR) (2017). http://www.privacy-regulation.eu/en/article-25-data-protection-by-design-and-by-default-GDPR.htm. Accessed 15 Jan 2018
41. Wakkary, R., Oogjes, D., Hauser, S., Lin, H., Cao, C., Ma, L., Duel, T.: Morse things: a design inquiry into the gap between things and us. In: Proceedings of the 2017 Conference on Designing Interactive Systems, pp. 503–514 (2017). https://doi.org/10.1145/3064663.3064734
42. Summaries of Articles contained in the GDPR. http://www.eugdpr.org/article-summaries.html. Accessed 15 Sept 2017
43. ISO 9241-210: Ergonomics of human-system interaction – Part 210: Human-centred design for interactive systems. International Organization for Standardization (2015). https://www.iso.org/standard/52075.html

Toys That Talk to Strangers: A Look at the Privacy Policies of Connected Toys

Wahida Chowdhury
University of Ottawa, Ottawa, ON, Canada
Wahida.Chowdhury@hotmail.ca

Abstract. Toys that are connected to the Internet are able to record data from users and share the data with company databases. The security and privacy of user data thus depend on companies' privacy policies. Though there is rising concern about the privacy of children and parents who use these connected toys, there is a scarcity of research on how toy companies are responding to the concern. We analyzed the privacy policies of 15 toy companies to investigate the ways toy companies publicly document digital standards of their connected products. Our results show that most toy companies are either unclear or do not mention in their privacy policy documents how their toys protect the security and privacy of users. We recommend measures that toy companies may adopt to explicitly respond to security and privacy concerns so that parents can make informed decisions before purchasing connected toys for their children.

Keywords: Connected toys · Smart toys · Internet of Things · Information privacy · Data security · Privacy policies · Digital standards · Children · Parents

1 Introduction

Toys that gather information from owners via microphone, camera or user inputs, and share the information via the Internet with whomever these toys are connected to, are known as connected toys. These toys may replace traditional friends by being highly interactive, such as by recording the child's preferences and by talking back to the child. These toys may also replace traditional babysitters and keep the child busy when parents are working. Toy companies quickly noted these benefits and advertised their connected products to children and parents while obscuring the associated risks to privacy and data security.
For example, Edwin the Duck uses Bluetooth technology to broadcast lullabies to its young users; however, the toy company also collects and retains everything the child says and shares that information with "trusted" third parties. The purpose of our research was to investigate the extent to which connected toy companies respond to the benefits versus the threats to consumers' privacy and data security. We analyzed the privacy policies of 15 connected toys; the connected products were selected from the privacy guide developed by the Mozilla Foundation, a not-for-profit organization that supports and promotes the use of connected products. We asked 16 questions about the privacy and data security of each product and looked through the manufacturers' privacy policies for answers. The results provide a snapshot of the informational practices of the connected toy companies, and recommend ways to make privacy policies more explicit so consumers can make informed decisions before purchasing.

2 Literature Review

Connected toys relate to 'a future in which digital and physical entities can be linked, by means of appropriate information and communication technologies, to enable a whole new class of applications and services' [1]. A wide variety of toys fall under the domain of connected toys. Some of these toys are connected to voice and/or image recognition software (e.g. Hello Barbie™ or the Hatchimals); some are connected to app-enabled robots and other mechanical toys (e.g. Dash and Dot); and others are connected to video games (e.g. Skylanders or Lego Dimensions) [2]. Some connected toys are connected to the Internet but do not simulate human-like behaviour; some toys simulate human interaction by talking to users; and other toys, such as connected robots, can be coded by users to perform novel activities [3]. Mascheroni and Holloway [3] identified articles about connected toys from 12 countries (Australia, Austria, Finland, Germany, Italy, Lithuania, Malta, Portugal, Romania, Serbia, Slovenia and Spain), and documented the benefits of connected toys as reported by parents. The benefits included the development of digital literacy, creativity, motivation to learn, reading and writing literacy, social skills, physical activity, etc. Despite the benefits, however, concerns about the security and privacy of users (who are primarily children) have been documented in the literature since the early days of connected toys [4]. Concerns about children's security and privacy were already in place as social networking, gaming, and other websites gathered, stored, and shared data from child users with third parties, often without the child users' knowledge or consent [5]. Connected toys intensified the concerns by making data collection from children easier (such as by microphone, camera, location tracker, and movement detectors) and by being able to collect more personal data (such as by being able to follow child users everywhere and by being always "on"). These developments exacerbated the risks of easy access to personal information, simply by hacking company databases. Recent examples include the hacking of data collected by the connected toys Hello Barbie and VTech from millions of child users [2]. The security and privacy concerns imply that toy makers should incorporate effective measures from inception to completion of the development process of connected toys [6].
Our research looks into the privacy policies of toy companies to report how the companies are addressing public hopes and fears surrounding connected toys.

3 Methodology

The Mozilla Foundation published a report, Privacy Not Included, in December 2017 that reviewed the openly accessible privacy policies of different connected products. The report aimed to draw buyers' attention to three questions related to privacy and security before purchasing the products: (1) How do the products spy on users? (2) What information about the users do the products collect? and (3) What could happen to users if data breaches occur? For example, the Mozilla guide reports that the connected toy Dash the Robot is a one-eyed robot that can sing, dance, and play to give a highly interactive and fun experience to children; however, parents should be warned that the robot can spy on children via its microphone and that parents have no control over the data that the robot collects. To extend the Mozilla product reviews and provide a more in-depth synopsis of users' privacy and data security related to connected products, we conducted further analyses of the privacy policies of 15 toys and game consoles listed in the Mozilla report. These connected products were: Smart Letters, Edwin the Duck, Adidas miCoach Smart Soccer Ball, Ozobot Evo, Beasts of Balance, Toymail Talkie, Sphero SPRK+, Osmo, Dash the Robot, BB-8 by Sphero, Airjamz Air Guitar, Hello Barbie, Microsoft Xbox One, Sony PlayStation 4, and Nintendo Switch. We developed 16 distinct questions from the open-access Digital Standard, created by Consumer Reports, Disconnect, Ranking Digital Rights and the Cyber Independent Testing Lab, to evaluate the privacy and security of the 15 connected toys. For example, we investigated how secure user information is when using a connected product; we looked through the product's privacy policies to determine if the company routinely audits user data and restricts third-party access to the data. The various questions addressed what privacy measures were put in place, what privacy controls were available, and what kind of information the companies gathered from users and disclosed to third parties.

4 Results

4.1 How secure is users' data?
Almost all the companies we studied claimed that they take steps or comply with standards to protect user data, but they are not always clear about what steps they take or what standards they follow. Furthermore, none of the companies we studied are confident that they are hack-proof, and they admit that security breaches can still happen.

4.2 Do users need to make a password?
Most companies require users to make a password. However, passwords are not required to be complex or secure. This means that user information could be easily hacked.

4.3 Does the company encrypt users' information?
Only four (27%) of the companies we studied fully encrypt user data; others partly encrypt user data or do not encrypt at all. This means that user information could be easily understood if hacked.

4.4 Can users control the data that the company collects?
About half the companies we studied (53%) do not mention if users can control their own data. In fact, a few companies, such as the maker of the "Osmo" toy, automatically collect information without user control.

4.5 Can users delete their data when they leave the service?
Almost all the companies we studied allow users to delete data when they leave services, but perhaps not completely.
For example, companies may retain non-personally identifiable data, as well as cached or backup copies of user data that the companies are not explicit about. This means that even if users leave a service, their information could be hacked.

4.6 Do users know what information the company collects?
Almost all the companies we studied give users a snapshot of what information is collected from them. However, the hidden rules are often too complex to understand and are easy to overlook.

4.7 Does the company collect only the information needed for the product to function?
Almost all the companies we studied collect more information from users than is needed to make their product work.

4.8 Is users' privacy protected from third parties by default?
None of the companies we studied protect user data from third-party companies by default. Some companies allow users to review and change their privacy settings. However, it is not clear to what extent users are able to protect their privacy without losing access to services.

4.9 How does the company use users' data?
The privacy documents of almost all the companies we studied explicitly state how they might use user data. However, most companies leave the responsibility on users to control their own privacy, and users are warned that they might not get the best service if they restrict access to their data.

4.10 Does the company have a privacy policy document?
All the companies we studied have privacy policy documents. However, the documents are often very long, written in intangible language, and often do not answer important questions.

4.11 Will users receive a notification if the company changes its privacy policy?
Less than half (40%) of the companies we studied send notifications if their privacy policies change. Most companies either do not mention any change or simply update the date at the top of their policy documents, which are very unlikely to be read twice by users to notice the change.

4.12 Does the company comply only with legal and ethical third-party requests for users' information?
Only 27% of the companies we studied explicitly mentioned that they comply only with legal and ethical third-party requests for user information. Most companies claim to share non-identifiable information or are not explicit about how information requests are handled.

4.13 Does the company require users to verify identity with government-issued identification, or with other forms of identification that could be connected to users' offline identity?
None of the companies we studied require users to verify identity with government-issued identification, indicating that users can register for services under false names.

4.14 Does the company notify users of any unauthorized access to data?
Only two (13%) of the companies we studied notify users of security breaches. This means that users may continue to use connected products even after these are hacked.

4.15 Is the company transparent about its practices for sharing users' data with the government and third parties?
Only four (27%) of the companies we studied were transparent about sharing practices with the government and third parties.

4.16 Does the company send notifications if the government or third parties request access to users' data?
Only three (20%) of the companies we studied notify users of third-party requests. This means that third parties may collect users' information without their awareness.
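For clarity, the percentages quoted above are simply counts out of the 15 products reviewed. The short Python sketch below is purely illustrative (the code is not part of the study; only the counts and question labels are taken from the findings above) and shows how such a tally converts into the reported figures, e.g. 4/15 ≈ 27%, 2/15 ≈ 13%, 3/15 = 20%:

```python
# Hypothetical tally of the 16-question review: for each question, count how
# many of the 15 products' policies clearly satisfy it, then express the count
# as a percentage of the products reviewed.
N_PRODUCTS = 15

counts = {
    "fully encrypts user data (4.3)": 4,
    "transparent about government/3rd-party sharing (4.15)": 4,
    "notifies users of unauthorized access (4.14)": 2,
    "notifies users of 3rd-party data requests (4.16)": 3,
}

for question, n_yes in counts.items():
    print(f"{question}: {n_yes}/{N_PRODUCTS} = {n_yes / N_PRODUCTS:.0%}")
```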
5 Discussion

Childhood experiences are rapidly becoming digital, including connected toys and games that let children connect to strangers effortlessly from the comfort of their home. Although this may seem fun and safe, our findings indicate that none of the toys provided satisfactory answers to all 16 questions related to privacy and data security. There remain a variety of different ways a connected toy company may gather information, such as recording users' preferences, tracking a user's IP address and turning on a device's camera every time the toy is used. The security of user information thus relies on the security of the databases of a connected toy company or of the third parties that the company shares information with. If hackers or even employees access the databases with any wrong motive, from having fun to stealing money to initiating a cyber-war, strangers can talk back to the young users and make them do inappropriate things. To prevent data breaches, the privacy policy documents of the 15 toy companies that we analyzed claimed to have privacy measures in place; this might make parents feel relieved and trust the companies to be responsible caretakers of their children. However, the privacy policies of almost all the companies accepted that their databases might not be secure enough to prevent data breaches. Companies seem to posit that users are responsible for their own security. However, users were often threatened with losing services if they exercised control of their privacy, for example if they did not share data with third parties. The privacy policies of each company attempt to document their data collection and sharing practices, which might give the feeling of making an informed decision about purchasing the company's products. However, the policies do not follow a standardized format and are not always written in a way that the general user could understand. Also, the definitions of privacy measures such as data control and data collection are not standardized between companies. This means that many parents may not be aware of the information that companies gather about their children, which may limit their ability to make fully informed decisions about the products they are purchasing. For example, when a parent signs up for an account for various toys or consoles, certain information is asked of them, but the sign-up mechanisms do not draw the parent's attention to the fact that the toy's microphone may be accessed or that the child's IP address and/or Wi-Fi information may be stored on the company's servers. Furthermore, users may not read lengthy documents, such as ambiguous privacy policies, that describe before purchase what a certain connected toy does. For example, users may ignore an ambiguous warning that a toy may be harmful if it does not state clearly why or how the toy may be harmful. Users may also feel that if a product is on the market, the company must have done security checks. For example, if a new car is on the market, users should not have to question whether the car is safe to drive, let alone investigate whether children's toys are safe to play with.

6 Recommendations for Toy Companies

Our findings suggest that a Frequently Asked Questions (FAQ) document should accompany privacy policy documents, itemizing privacy-related questions the way we did in this report, so that it is easier for people to see how their information is collected, used and disclosed.
Secondly, if the concerns stem from sharing data with company databases, toy companies should reconsider the necessity of sharing data with remote databases, which have the possibility of being hacked, rather than storing data locally within the toy itself, where it can only be accessed if the child loses the toy. Furthermore, more evaluations need to be done as new toys are developed to ensure that children's information is given the highest level of protection. Manufacturers should strive to make connected toys more reliable and capable each year, while service providers, software engineers, governments, private organizations, and technical experts should strive to prevent and solve security and socio-economic problems arising from connected toys.

Acknowledgment. The author wishes to thank Diana Cave (Criminology Department, University of Ottawa) for assisting in conducting the research, and Professor Valerie Steeves (Criminology Department, University of Ottawa) for her valuable comments on previous drafts of this article.

References
1. Miorandi, D., Sicari, S., De Pellegrini, F., Chlamtac, I.: Internet of Things: vision, applications and research challenges. Ad Hoc Netw. 10(7), 1497–1516 (2012). https://doi.org/10.1016/j.adhoc.2012.02.016
2. Holloway, D., Green, L.: The internet of toys. Commun. Res. Pract. 2(4), 506–519 (2016)
3. Mascheroni, G., Holloway, D. (eds.): The Internet of Toys: A Report on Media and Social Discourses Around Young Children and IoToys. DigiLitEY, London (2017)
4. Dobbins, D.L.: Analysis of security concerns and privacy risks of children's smart toys. Ph.D. Dissertation. Washington University St. Louis, St. Louis, MO, USA (2015)
5. Steeves, V., Jones, O.: Surveillance, children and childhood (Editorial). Surveill. Soc. 7(3/4), 187–191 (2010)
6. Nelson, B.: Children's Connected Toys: Data Security and Privacy Concerns. United States Congress Senate Committee on Commerce, Science, and Transportation, 14 December 2016. https://www.hsdl.org/?view&did=797394. Accessed 4 July 2017

A Reinforcement Learning Multiagent Architecture Prototype for Smart Homes (IoT)

Mario Rivas and Fernando Giorno
Instituto de Pesquisas Tecnológicas – IPT, São Paulo, Brazil
mariorivas@hotmail.com, fgiorno@gmail.com

Abstract. Continuous technology progress is fueling the delivery of new and less expensive IoT components, providing a variety of options for the Smart Home. Although most of the components can be easily integrated, achieving an optimal configuration that prioritizes environmental goals over individual performance strategies is a complex task that requires manual fine-tuning. The objective of this work is to propose an architecture model that integrates reinforcement learning capabilities in a Smart Home environment. In order to ensure the completeness of the solution, a set of architecture requirements was elicited. The proposed architecture is extended from the IoT Architecture Reference Model (ARM), with specific components designed to coordinate the learning effort, as well as data governance and general orchestration. Besides confirming the fulfillment of the architecture requirements, a simulation tool was developed to test the learning capabilities of a system instantiated from the proposed architecture. After six million four hundred thousand execution cycles, it was verified that the system was able to learn in most of the configurations.
Unexpectedly, results show very similar performance for collaborative and competitive environments, suggesting that a more varied selection of agent scenarios should be tested as an extension of this work, to confirm or contest the Q-Learning hypothesis.

Keywords: IoT · Reinforcement learning · Q-Learning · Architecture

1 Introduction

Considering the continuous progress in the scientific landscape that facilitates the delivery of new IoT (Internet of Things) components, and the absence of a single fully adopted industry standard [1], the goal of achieving an optimally efficient setup for a Smart Home relies on empirical approaches and context-based rules, rather than AI techniques. Furthermore, strategies to achieve context-specific goals like energy efficiency, home safety or environmental control require pre-emptive knowledge of the components and their interactions, reducing flexibility and resilience. The objective of this work is to propose an abstract architecture model that integrates reinforcement learning capabilities in a Smart Home environment, allowing real-time agent configuration and information exchange governance. By this means, concrete systems derived from this architecture will be able to learn optimal strategies to achieve environmental goals.

The rest of this paper is structured as follows. Section 2 introduces related research that contributed to the background of this work. The proposed architecture is presented in Sect. 3, describing the architecture requirements, the design approach and the final architecture description. Section 4 details the testing process, introducing the design of the testing tool, the simulation cases and their results. Finally, conclusions and future scope are included in Sect. 5.

2 Related Work and Research Contribution

2.1 Related Work

Several approaches have been proposed to resolve the complex interaction issues of IoT environments and their dynamic configuration requirements. In the specific field of manufacturing, Katasonov et al. [2] introduced the concept of multiagent platforms with autonomous behaviour to overcome the interoperability issues derived from the multiplicity of standards and protocols. Wang et al. [3] proposed an agent-based hybrid service delivery, composed of four subsystems: (1) hybrid services based on agents, (2) a hybrid-service ontological search engine, (3) a service enablers repository and (4) a service-oriented agent lifecycle manager. In order to reduce the uncertainty of the inherently stochastic IoT environment, Nastic et al. [4] introduced the platform U-GovOps to manage elastic IoT systems, applying a declarative proprietary language to define policies and resolve real-time issues. While most authors defined solutions based on multiagent systems, few references to agent learning techniques were found [3], and there was no specific mention of reinforcement learning.

2.2 Research Contribution

This paper presents an integrated vision of several recent studies related to Smart Home architecture powered by multiagent systems and reinforcement learning techniques, defining a framework to instantiate concrete architectures. In order to verify the learning capacity of the resulting architecture, a simulation tool was created in which 64 different scenarios were tested. The results of these simulations are relevant to understanding the impact of the hyperparameters in the reinforcement learning approach.
3 Proposed Architecture

This section describes the proposed architecture model to cover the objectives explained in Sect. 1. Initially, a summary of the architecture requirements is listed, followed by an explanation of the design approach and finally the architecture description itself.

3.1 Architecture Requirements

The general objective of this architecture is to provide learning capabilities in a Smart Home architecture based on a multiagent system, supporting online reconfiguration and resilience. The list of architecture requirements, classified into functional and non-functional requirements, is presented in Table 1.

Table 1. Architecture requirements
Functional requirements:
– Initial system configuration: the architecture provides suitable artefacts for system configuration
– Learning process oversight: individual agent learning progress is calculated and utilized in the reward provision
– New agent inclusion: new agents are included in real time
– Agent removal: the architecture provides artefacts to remove agents in real time
– System parameters modification: system parameters are modifiable in real time
– Information consumers coordination: external information consumers can be added/removed in real time
– External information flow: system information flows externally to the consumers, according to the information governance in place
– System governance control: the architecture provides artefacts to define and manage governance
Non-functional requirements:
– Resilience: the architecture provides a redundant structure to support operations continuity
– Scalability: system resource requests are anticipated and capacity limitations are proactively managed
– Performance: component orchestration is aligned with system performance

3.2 Design Approach

The architecture reference model for IoT (ARM) was developed in a joint effort by the European Platform on Smart Systems (EPoSS) and the IoT-A project. Its main function is to provide a common structure and a set of guidelines to elaborate concrete IoT architectures in different contexts. ARM consists of a set of interdependent sub-models describing the basic aspects of the reference architecture. The intersection of this model with the system requirements determines the instantiated architecture, represented by views and perspectives. The basic models described by ARM are: IoT Domain (physical entities and their logical representation, etc.), Information Domain (information structures, service modelling, etc.), Functional Domain (group of functionalities included), Communication Domain, and Trust, Security and Privacy Domain.

The approach of Bassi et al. [5] to generating architectures based on ARM is supported by the usage of views and perspectives as described by Rozanski and Woods [6]. The set of basic views suggested by these authors [6] is: Functional, Information, Concurrent, Development, Deployment and Operational. This collection of views provides a comprehensive description of the architecture; however, it does not explicitly consider non-functional requirements like information security or resilience. Since this type of requirement is orthogonal to the functional requirements, Rozanski and Woods [6] suggest documenting them as "perspectives", describing their intersection with functional views as a complement to the main description.
Following the recommendation of those authors, this work considered the following perspectives: Information Security, Performance and Scalability, Availability and Resilience, and Evolution. A graphic view of the ARM components and their interaction is represented in Fig. 1.

Fig. 1. Architecture reference description model after ARM.

3.3 Architecture Description

The main concept of the proposed architecture is based on the virtualization of the agents and their asynchronous learning management. It is composed of the following elements: Physical Context, Virtual Agent Farm (VAF), Asynchronous Data Layer (ADL), Data Exchange Manager (DEM), Context Manager and Learning Manager, displayed below in Fig. 2.

Fig. 2. Proposed architecture main components.

The Physical Context is the representation of the elements that compose the environment, such as sensors, actuators and other hardware devices. Depending on the number of sensors and actuators, several non-exclusive combinations may be defined to instantiate corresponding agents. The VAF is the logical component that stores the virtual agents and their system processes. The ADL intermediates data traffic among the different components, assuring the persistence and resilience of the information. Data exchange with external/internal consumers and publishers is managed by the DEM, based on the information governance defined in the configuration and administered by the Context Manager. The Context Manager is in charge of system orchestration, initiating and controlling all processes and resources. The Learning Manager calculates and distributes rewards to the agents and oversees the system learning process. While a full representation of the views and perspectives of the proposed architecture exceeds the scope of this paper, the functional view, information view and context view are briefly described below.

Fig. 3. Functional view.

The functional view describes four main logical components and their basic interactions. As depicted here, the Context Manager orchestrates and supervises most of the functional flows within the system. Although the ADL is embedded in the background of this visual representation, none of its functionalities justify its inclusion as a logical component.

The entities included in the Information View diagram represent the main information concepts and their composition/aggregation relationships. An information flow diagram (usually described using a UML message flow diagram) complements this view, representing the information system lifecycle. The most important entities of the proposed architecture are described in Fig. 4.

Fig. 4. Information view.

As defined by Rozanski and Woods [6], the Context View describes the relationships, dependencies and interactions between the system and its environment. The proposed architecture is defined by the logical and physical context. Data and software components are included in the logical context, while hardware components are included in the physical context. External entities interacting with the system are represented as out-of-system-boundary in Fig. 5.

Fig. 5. Context view.

4 Testing

The proposed architecture was designed to cover all the architecture requirements mentioned in Sect. 3; however, its material verification cannot be performed without a concrete system derived from it.
While creating a concrete IoT system to assess the feasibility of the proposed architecture is out of the scope of this work, an execution simulator tool (EST) was designed to confirm the learning capabilities of the solution.

4.1 EST Design

The EST was developed as a functional prototype of the proposed architecture, considering its main structures and the relationships among components. Due to the experimental approach and the limited resources, some particularities were defined:

– The physical environment was reduced to a bi-dimensional space;
– Every agent has an individual id, a pair of coordinates (x, y) and a reference to two other agents, known as its "vertices";
– There are nine (9) possible actions to be taken by an agent at any cycle: stand still or move in one of eight (8) possible directions (0°, 45°, 90°, 135°, 180°, 225°, 270° or 315°);
– At every cycle each agent knows its current coordinates and the coordinates of each of its vertices;
– The individual reward calculation is based on the angular difference between the triangle formed by the agent itself and its vertices and a hypothetical equilateral triangle. To calculate the difference, every internal angle of the triangle is compared with a target of 60°, and the sum of these three differences is subtracted from 120:

R = 120 − (|ang1 − 60| + |ang2 − 60| + |ang3 − 60|)

4.2 Software Project

The prototype was developed on Linux Ubuntu 16, using the Python 3.6 language and the machine learning library PyTorch 0.2.0. Its neural network was built using a deep Q-learning approach, with five (5) input parameters, a hidden layer of thirty (30) neurons and nine (9) output parameters (one per possible agent action). The loss optimization function applied was adaptive moment estimation (ADAM), and the neuron activation was linear rectification (ReLU).

4.3 Simulation Cases

In order to define the simulation cases, the following parameters were considered: (1) number of agents, (2) algorithm learning rate (alpha), (3) softmax policy temperature (tau) and (4) number of execution cycles. The number of agents was limited to four, using the following configurations: (a) three agents with only one active agent, (b) three agents with only two active agents, (c) three agents all active and (d) four agents all active. The learning rate is a parameter utilized by the Q-Learning algorithm [7] to define the prevalence of new knowledge over previous knowledge. Large alpha values (closer to 1) imply a faster substitution of knowledge, while lower values (closer to 0) imply a more conservative approach. The values defined for the simulation cases were {0.2, 0.5, 0.8}. The decision policy utilized by the tool is a version of Softmax [8] implemented in the PyTorch library. This policy aims to define whether to choose the greedy action (the best-known action for a specific state) or a random action (to explore the environment) based on the amount of knowledge currently harvested, i.e. the more knowledge, the more likely a greedy action is chosen. The "temperature" parameter (tau) provides a magnitude to the policy. The values chosen to create the test cases were {0.01, 0.1, 1, 10, 100}. The number of execution cycles was determined after some exploratory cases, aiming to gather enough executions to support the conclusions within the expected timeframe. As a result, the target number of execution cycles was ten thousand (10,000).
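To make the mechanics above concrete, the following is a minimal Python/PyTorch sketch of the angular reward, the 5-30-9 network and a temperature-controlled softmax action selection, roughly as described in Sects. 4.1 to 4.3. It is an illustrative reconstruction rather than the prototype's actual code: the exact temperature scaling used by the tool is not stated, so the standard Boltzmann form exp(Q/tau) is assumed here, and all names and values other than those given above are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim


def triangle_reward(ang1, ang2, ang3):
    """Angular reward from Sect. 4.1: 120 minus the summed deviation of the
    triangle's internal angles from the 60 degrees of an equilateral triangle."""
    return 120 - (abs(ang1 - 60) + abs(ang2 - 60) + abs(ang3 - 60))


class AgentNet(nn.Module):
    """Network shape described in Sect. 4.2: 5 inputs, one hidden layer of
    30 ReLU units, 9 outputs (stand still or move in one of 8 directions)."""

    def __init__(self, n_inputs=5, n_hidden=30, n_actions=9):
        super().__init__()
        self.fc1 = nn.Linear(n_inputs, n_hidden)
        self.fc2 = nn.Linear(n_hidden, n_actions)

    def forward(self, state):
        return self.fc2(F.relu(self.fc1(state)))


def select_action(q_values, tau):
    """Softmax (Boltzmann) policy: sample an action with probability
    proportional to exp(Q / tau). The scaling convention is an assumption,
    since the paper does not give the exact form used by the prototype."""
    probs = F.softmax(q_values.detach() / tau, dim=-1)
    return torch.multinomial(probs, num_samples=1).item()


# Hypothetical single decision step for one agent.
net = AgentNet()
optimizer = optim.Adam(net.parameters())   # ADAM optimizer, as in Sect. 4.2
state = torch.rand(1, 5)                   # stand-in encoding of own and vertex coordinates
action = select_action(net(state), tau=0.1)
print("chosen action:", action, "example reward:", triangle_reward(70, 55, 55))
```

A full training step would additionally compute a temporal-difference target from the reward and the next state's Q-values and back-propagate the resulting loss through the Adam optimizer; that loop is omitted here for brevity.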
A Reinforcement Learning Multiagent Architecture Prototype 167 For each agent con?guration, ?fteen (15) parameter combinations were de?ned (three learning rate values x ?ve policy temperatures) plus a scenario without learning capabilities, totalizing sixty-four (64) combinations. Each scenario was executed ten (10) times, through ten-thousand (10,000) cycles, completing six-million four-hundred-thousand cycles. 4.4 Results Every execution generated a text ?le containing the reward of the system (calculated as the sum of the individual rewards) for each cycle and its correspondent graph. In order to evaluate the convergence of the learning curve, reward values were segmented in ?fty (50) stages, and standard deviation was calculated each one. Whenever the standard deviation remains decreasing or stable at a very low value, learning curve convergence is con?rmed. In general, all test cases con?rmed the convergence of the learning curves, except for a few cases where policy temperature was very low and (as expected) scenarios with no learning capabilities. Figure 6 describes the results of the simulations consolidated by policy temperature. Fig. 6. Learning curve convergence by policy temperature. When compared scenarios by learning rate, no relevant di?erence was found, as shown in Fig. 7. 168 M. Rivas and F. Giorno Fig. 7. Learning curve convergence by learning rate. 5 Conclusions The objective of this work was to de?ne an architecture of reference that provides learning capabilities to a Smart Home environment, allowing for real-time component con?guration and external information governance, as described on the architecture requirements section. The proposed architecture de?nes components and functionalities covering the architecture requirements, introducing reinforcement learning features. Simulated scenarios executed also con?rmed the learning curve convergence of the system, under several di?erent con?gurations. According to the Q-Learning algorithm de?nition [7], collaborative multiagent systems should converge to an optimal policy in a ?nite number of cycles, however this is not guaranteed for competitive environments. Unexpectedly, test results shown a very similar convergence curve for collaborative and competitive environments, suggesting that a more variated selection of agent scenarios should be tested as an extension of this work, to con?rm or contest Q-Learning hypothesis. Future extensions of this work may cover the study of learning convergence curves for more variated con?gurations, eventually approaching to real life smart home setups. Another study path suggested by the results of this work refers to the possibility of sharing intelligence among di?erent con?gurations, by persisting the agent neural networks. Figure 8 represents a consolidated view of the simulations executed by agent con?g- uration. A Reinforcement Learning Multiagent Architecture Prototype 169 Fig. 8. Learning curve convergence by agent con?guration. References 1. Madakam, S., Ramaswamy, R., Tripathi, S.: Internet of Things (IoT): a literature review. J. Comput. Commun. 3, 164–173 (2015) 2. Katasonov, A., Kaykova, O., Khriyenko, O., Nikitin, S., Terziyan, S.: Smart semantic middleware for the Internet of Things. In: Proceedings of the 5th International Conference on Informatics in Control, Automation and Robotics, Portugal, pp. 169–178 (2008) 3. Wang, J., Zhu, Q., Ma, Y.: An agent-based hybrid service delivery for coordinating internet of things and 3rd party service providers. J. Netw. Comput. Appl. 
36, 1684–1695 (2013) 4. Nastic, S., Copil, G., Truong. H., Dustdar. S.: Governing elastic IoT cloud systems under uncertainty. In: 2015 IEEE 7th International Conference on Cloud Computing Technology and Science, pp. 131–138. IEEE, Canada (2015) 5. Bassi, A., Bauer, M., Fiedler, M., Kramp, T., Van Kranenburg, R., Lange, S., Meissner, S.: Enabling Things to Talk: Designing IoT solutions with the IoT Architectural Reference Model, p. 349. Springer, Berlin (2013) 6. Rozanski, N., Woods, E.: Software Systems Architecture. Working with Stakeholders Using Viewpoints and Perspectives, p. 529. Pearson, London (2005) 7. Watkins, C.J.C.H.: Learning from Delayed Rewards. Ph.D. thesis, UK (1989) 8. Tuyls, K., Weiss, G.: Multiagent learning: basics, challenges and prospects. AI Mag. 3, 41–52 (2012) 170 M. Rivas and F. Giorno Real-Time Air Pollution Monitoring Systems Using Wireless Sensor Networks Connected in a Cloud-Computing, Wrapped up Web Services Byron Guanochanga1 , Rolando Cachipuendo1 , Walter Fuertes1(B) , Santiago Salvador1 , Diego S. Ben´itez2 , Theo?los Toulkeridis1 , Jenny Torres3 , C´esar Villac´is1 , Freddy Tapia1 , and Fausto Meneses1 1 Universidad de las Fuerzas Armadas ESPE, 171-5-231B Sangolqu´i, Ecuador {beguanochanga,recachipuendo,wmfuertes,mssalvador,ttoulkeridis, cjvillacis,fmtapia,fhmeneses}@espe.edu.ec 2 Universidad San Francisco de Quito USFQ, Campus Cumbay´a, Casilla Postal, 17-1200-841 Quito, Ecuador dbenitez@usfq.edu.ec 3 Escuela Polit´ecnica Nacional, P.O. Box 17-01-2759, Quito, Ecuador jenny.torres@epn.edu.ec Abstract. Air pollution continues to grow at an alarming rate, decreas-ing the quality of life around the world. As part of preventive mea-sures, this paper presents the design and implementation of a secure and low-cost real-time air pollution monitoring system. In such sense, a three-layer architecture system was implemented. The ?rst layer contains sensors connected to an Arduino platform towards the data processing node (Raspberry’s Pi), which through a wireless network sends messages, using the Message Queuing Telemetry Transport (MQTT) protocol. As a failback method, strings are stored within the data processing nodes within ?at ?les, and sent via SSH File Transfer Protocol (SFTP) as a restore operation in case the MQTT message protocol fails. The appli-cation layer consists of a server published in the cloud infrastructure having an MQTT Broker service, which performs the gateway functions of the messages sent from the sensor layer. Information is then published within a control panel using the NODE-RED service, which allowed to draw communication ?ows and the use of the received information and its posterior storage in a No SQL database named “MongoDB”. Fur-thermore, a RESTFUL WEB service was shared in order to transmit the information for a posterior analysis. The client layer can be accessed from a Web browser, a PC or smartphone. The results demonstrate that the proposed message architecture is able to translate JSON strings sent by the Arduino-based sensor Nodes and the Raspberry Pi gateway node, information about several types of air contaminants have been e?ectively visualized using web services. Keywords: Air pollution · IoT· IaaS · WSN · Web services .e c Springer Nature Switzerland AG 2019 K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 171–184, 2019. https://doi.org/10.1007/978-3-030-02686-8_14 172 B. Guanochanga et al. 
1 Introduction The World Health Organization (WHO) [1] reported that “Air pollution is the biggest environmental risk to health, carrying responsibility for about one in every nine deaths annually”. Although industry and the scienti?c community have developed various solutions based on conventional Wireless Sensor Net-works (WSN) for air pollution monitoring, the existing products and the gener-ated results lack to represent low-cost solutions, some require hiring hosting or web services, as well as having a number of limited messages without a failback method. The aim of this work is to develop a secure environmental monitoring sys-tem based on WSN that are integrated to the Internet of Things (IoT) concept, increasing the capacity and life span of the sensor nodes of the WSN with rel-ative low-costs. Therefore, ?rst, a hardware and software prototype has been assembled using Arduino and Raspberry Pi platforms, comprising several air pollution sensors as well as newly designed and constructed wireless expansion modules. Second, a three-layer architecture, which leverages a real-time air pol-lution monitoring system has been designed and implemented: (1) The ?rst sensor layer includes the electronic hardware circuits and the software compo-nents, both for the Arduino-based sensor nodes and the gateway node, which was assembled using a Raspberry Pi together with a low-cost wireless expansion module for capturing the data. (2) The application layer, where a Web service has been designed and implemented using a set of protocols and formats that are used to process the data and store them in a MongoDB Database as part of the Cloud infrastructure. (3) The client layer, which consists of a Web graphical user interface, providing a visual information about environmental parameters in order to allow the communication with the WSN and users. The main contributions of this paper include: (1) The creation of a low-cost wireless monitoring system (i.e., software) as an IoT application to visualize the levels of air pollution. (2) The implementation of a novel three-layer message architecture to translate JSON strings sent by Arduino-based sensor Nodes and the Raspberry Pi gateway node, which are e?ectively visualized in Web services. (3) A failback method as a process for restoring operations via SFTP protocol, in case the MQTT message protocol fails. The remainder of this paper is organized as follows: Sect. 2 discusses related work, Sect. 3 presents the experimental setup, as well as the implementation of electronic devices and web services, while Sect. 4 provides the experimental results; ?nally, Sect. 5 ends the paper with the conclusion and future work. 2 Related Work The scienti?c community has been developing innovative alternatives to mea-sure air pollution using WSN. Nevertheless, several studies has been designed conventionally. In relation to low-power wireless communication protocols, similar to this work, some authors such as [2–14] have used ZigBee technology (based on Real-Time Air Pollution Monitoring Systems Using WSN 173 the IEEE 802.15.4). Conversely, in this work the NRF24L01 radio frequency transceiver module, [15] which has an advanced energy management, was used. The NRF24L01 has an enhanced Shock- Burst hardware protocol accelerator, which helps to implement a robust and advanced wireless network with low-cost micro-controllers. In relation to the connection platform for the di?erent nodes, the study pro-posed by [7] used Octopus II. 
The sensor node implemented had a humidity sensor, temperature and a CO sensor. In [11], the same device was used, with the di?erence that the 501A Dust sensor module (DSM501A) was added, which was designed to detect particles larger than 1 µm. In [5,16,17] the Waspmote platform was applied, which is characterized by the use of lower energy consump-tion. In [6], nodes were prepared to monitor gases such as carbon monoxide (CO), nitrogen dioxide (NO2), sulfur dioxide (SO2), ozone (O3), metals such as lead (Pb) and particulate matter. In [16] authors proposed a clustering protocol for the sensor network. For the connection of di?erent sensors, di?erent models of the Arduino plat-form have been used. For instance, in [12] the Arduino Mega 128 microcontroller was used together with the MQ-7 sensitive gas sensor detector in order to deter-mine CO. For the implementation of the sensor node in [18], the Arduino one with the Digi XBee module were used for the wireless mesh communication of the nodes. Similarly, in [19] authors used the Arduino R3 board that has an Atmel Atmega328 microcontroller with a clock speed of 16 MHz, together with a XBee model. Raspberry Pi model B was also used for the base station, where a database has been available for the storage of the received readings and a Web application was used for data presentation. The majority of these studies [12,19–34,36] resemble this work since the same open-source Arduino platform is used. However, they di?er in the way the data is transmitted towards the database, since a Raspberry Pi acting as the Gateway node is used in this work, using a three-layer message architecture, together with the NRF24L01 module for Wireless communication. Regarding the number of sensors for measuring air quality parameters, in [32] a device was implemented to monitor the CO in di?erent industrial plants. In [35] temperature and relative humidity data were collected using the SHT11 and SHT75 sensors, respectively. In [36], a predesigned sensor node, called CanarIT was used, which displayed several sensors. Data from each sensor node were stored in the cloud by GPRS communication. In [37], the sensors used were MG-811 for CO2, MQ-7 for CO and GP2Y1010AU0F for powder particles. In comparison with this study, most sensor nodes determined only up to four pol-lutants, including the most common being CO, CO2 and particulate matter. Nonetheless, more sensors were implemented in this study in order to mea-sure more pollutants, including CO, CO2, methane (CH4), sulfur dioxide (SO2), hydrogen sul?de (H2S), NO2 and particulate material (2.5 and 10 µm). Furthermore, similar to the study proposed in [37], in this work all data have been stored in a non-relational database and processed in a private cloud computing infrastructure. 174 B. Guanochanga et al. 3 Experimental Setup The general architecture of the real-time air pollution monitoring system is illus-trated in Fig. 1. The system has been divided into three layers. First, the Sensors layer is formed by the sensor nodes (SN) connected by Arduino R3 boards located in a distributed manner and the Gateway node, consisting of a Raspberry Pi board, forming a WSN. The sensor nodes send the polluting gas measurement information to the corresponding Gateway node wirelessly. Second, the Gate-way node with Internet access sends the received information to an application server in the cloud computing. The information will be stored in a non-relational database such as MongoDB. 
Third, this information will be published on a Web page so that users would be able to access it through their Web browser and smartphones. Fig. 1. Architecture that leverages the WSN system. 3.1 Sensor Nodes The electronic circuit diagram of a typical Sensor Node prototype, which depicts the connections made in each sensor node, is shown in Fig. 2. It consists of the Arduino board, the Wireless module NRF24L01, and the CO, CO2, CH4, SO2, H2S, NO2 and particulate material sensors. For the measurement of polluting gases, the modules MQ-7 (CO), MG-811 (CO2), MQ-4 (CH4), MQ-136 (SO2, Real-Time Air Pollution Monitoring Systems Using WSN 175 Fig. 2. Schematic electronic circuit diagram for each Sensor Node. and H2S) and MICS-2714 (NO2) were used. Finally, for the measurement of particulate material of 2.5 and 10 µm, the digital sensor HK-A5 was used. The CO, CH4, SO2, H2S and NO2 sensors are Metal Oxide Semiconductor (MOS) based sensors. This type of sensors displays a small heating element inside as well as an electrochemical sensor. The heater is necessary in order to ?t the sensor to the its proper operating conditions, since the sensitive surface of the sensor will react only at certain temperatures. The detection principle is based on the change of resistance due to incoming gas contact. The CO2 sensor is a chemical sensor that operates under the principle of a solid electrolyte cell. When the sensor is exposed to CO2, chemical reactions occur in the cell producing an electromotive force. The temperature of the sensor must be high enough for these reactions to occur. Therefore, a heating circuit was used to heat up the sensor to an adequate temperature. The MOS sensors required signal conditional circuits for converting their readings to voltage that will be measured by the Arduino board. Similarly, the CO2 module has an ampli?er circuit to improve the accuracy of the measure-ments since the output voltage of the sensor is relatively low. Sensor voltages were measured by the analog inputs of an Arduino Mega microcontroller board. The particulate material sensor communicates serially with the Arduino board. Figure 2 shows the sensor connections with the Arduino Mega board. For wire- 176 B. Guanochanga et al. less communication, the NRF24L01 transceiver module was used operating in the 2.4 GHz ISM band. Since the Arduino board lacks of enough connections for powering the sensor modules, a shield-type board was designed for connecting the board with the sensors and the wireless extension module. The Gateway node consists of a Raspberry Pi board, and a NRF24L01 Wireless expansion module. This node receives the measurements from all connected sensor nodes. 3.2 Hardware Prototype The sensor node prototype was implemented inside a sealed chamber in which the sensor modules were placed, as shown in Fig. 3. The chamber has two air ducts, air is sucked by a fan to the interior and then it escapes towards the outside. The sensors and the wireless module were connected to the shield and to the Arduino board, where all the sensor node is controlled. Signals from the sensors were interpreted to di?erent gas concentrations according to the characteristic curves described in their corresponding data sheets. Fig. 3. Photograph of the sensor node prototype. 3.3 Web Services In order to send messages from the Gateway node to the application server, the MQTT protocol was used with the character string and format illustrated in Fig. 4. 
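The exact string format is the one shown in Fig. 4 and detailed in the next paragraph. As a rough, hypothetical sketch of how a gateway node might assemble and publish one such reading, the snippet below uses the Eclipse Paho MQTT client; the topic name, field names, broker address and the classic paho-mqtt 1.x client API are all assumptions for illustration, not the authors' implementation.

```python
import json
import time
import paho.mqtt.client as mqtt  # assumes the classic paho-mqtt 1.x API

BROKER_HOST = "broker.example.org"   # assumed address of the cloud MQTT broker
TOPIC = "wsn/air/node01"             # assumed topic naming scheme

def build_payload(node_id, ip, lat, lon, readings):
    # JSON string with node ID, IP, timestamp, coordinates and sensor readings
    return json.dumps({
        "id": node_id,
        "ip": ip,
        "timestamp": time.strftime("%Y-%m-%d %H:%M:%S"),
        "lat": lat,
        "lon": lon,
        "measurements": readings,   # e.g. {"co": 3.1, "co2": 812, "pm25": 9.4}
    })

client = mqtt.Client()
client.connect(BROKER_HOST, 1883, keepalive=60)

payload = build_payload("node01", "192.168.1.20", -0.33, -78.45,
                        {"co": 3.1, "co2": 812.0, "pm25": 9.4})
client.publish(TOPIC, payload, qos=1)

# Failback idea: also append the same string to a flat file so it can later be
# transferred via SFTP if the MQTT path fails.
with open("node01_backlog.log", "a") as f:
    f.write(payload + "\n")
```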
The format uses the JavaScript Object Notation (JSON) type, composed of the sensor node ID, the IP address, date and time of measurement, latitude and longitude, as well as the measurements of the sensors. As a failback method for handling errors, this string is also stored in the Gateway node in a flat file after it has been sent to the application server through the SFTP protocol, with the purpose of processing it and acting as redundancy in case the MQTT message protocol fails.

Fig. 4. String chain with a format based on the MQTT protocol.

The application server has an MQTT Broker service, which represents a central node or broker server and is responsible for managing the network by receiving the messages sent from the Gateway nodes. The system has a Delay-Access process, which synchronizes the reception of messages from the processing nodes. This process continuously checks the status of the messages to guarantee their availability and, when a node fails, verifies that the failback option has been performed for it. With the NODE-RED service installed on the server, several information flows have been created in order to publish the measurement data received from the MQTT Broker on the Web and, at the same time, to store them within a MongoDB database. Additionally, an information flow was implemented through a RESTful Web service and a GET method, which allowed information to be retrieved from the database in order to be shared with other systems. Figure 5 presents the control panel information about the state of the system's central node; it displays the temperature, CPU load and memory consumption, which may help to diagnose the status of the WSN. Figure 6, on the other hand, shows an example of the real-time monitoring of methane by one of the sensors, sent by the central node.

4 Results and Discussion

For the proof of concept of this monitoring solution, several pollutant measurements were taken every seven seconds. These measurements were conducted at three different locations in Ecuador: a university campus located in the city of Sangolquí, the southern zone of the city of Machachi, and the “La Virgen Santisima” cave in Tena [38]. A total of 260 samples were obtained for the CO, CO2, CH4, SO2, H2S and NO2 gases, and for powder density of types PM2.5 and PM10. The obtained measurements were within the detection ranges of the gas sensors used in the prototype. Table 1 shows the sensor ranges together with the typical concentrations of such gases in the environment. Figure 7 shows the resulting CO2 measurements for the three locations; as can be seen, the concentration inside the cave is much higher than in the cities of Sangolquí and Machachi. An average of 1240 ppm was obtained inside the karstic cave with a standard deviation (SD) of 319 ppm. In Sangolquí, it was about 962 ppm with an SD of 112 ppm, while in Machachi an average of 794 ppm with a corresponding SD of 89 ppm was obtained.

Fig. 5. Control panel of the central node of the system.
Fig. 6. Example of real-time monitoring of methane by one of the sensors.

Table 1. Types of polluting gases and measurement ranges for the sensors used

Polluting gas              Sensor      Range            Reference
Carbon monoxide, CO        MQ-7        20–200 ppm       Ecua. Stand.
Carbon dioxide, CO2        MG-811      400–10000 ppm    Ref. value
Methane, CH4               MQ-4        200–10000 ppm    Ref. value
Sulfur dioxide, SO2        MQ-136      1–200 ppm        WHO
Hydrogen sulfide, H2S(a)   MQ-136      1–100 ppm        OSHA
Nitrogen dioxide, NO2      MiCS-2714   0.05–5 ppm       WHO
PM2.5/PM10                 HK-A5       0–999 ug/m3      WHO
(a) Occupational Safety and Health Administration (OSHA), USA.

Fig. 7. Comparison of CO2 measurement results in the three different locations.

Figure 8, on the other hand, shows the data obtained for the 2.5-micron particulate material. In Sangolquí, it reached up to 10 ug/m3, a higher concentration density than Machachi with about 5 ug/m3, while for the cave this concentration is about 6 ug/m3. Nevertheless, the Ecuadorian standard of air quality [1] specifies a limit of about 50 ug/m3 as an average over a day of monitoring and 15 ug/m3 as an annual average; therefore, the density of dust particles in the studied sectors remains within the recommended levels.

Fig. 8. Measurements of PM2.5 at three different locations.
Fig. 9. Measurements of PM10 at three different locations.

Finally, Fig. 9 illustrates the measurements of particulate material finer than 10 microns at the three locations. Here, the Sangolquí sector again presents mostly data of about 12 ug/m3, higher than the Machachi sector with 6 ug/m3 and the Amazonian cave with about 8 ug/m3. Similarly, the Ecuadorian standard of air quality establishes that the annual PM10 concentration should not exceed 50 ug/m3 and that the daily average should not exceed 100 ug/m3. Therefore, the measurements obtained for the three locations comply with the values recommended by the WHO and the Ecuadorian standard of air quality.

5 Conclusions and Future Work

This paper focused on the design and implementation of a real-time air pollution monitoring system based on the use of WSN under the IoT concept, using a cloud computing infrastructure. A three-layer architecture was designed and implemented with low-cost electronic hardware, such as Arduino-based sensor nodes as well as a Raspberry Pi-based gateway node with a low-cost wireless expansion module that captures the data. In addition, a Web service was also designed and implemented using a set of protocols and formats used to process the data and store them in a MongoDB database as part of the cloud infrastructure. The implemented Web graphical user interface allowed the communication between the WSN and the users. Compared with other solutions described in the literature, the solution proposed here is secure, since the same string is stored within the data processing nodes in a flat file and sent to the application layer by means of SFTP, acting as a failback method, in order to process it and keep it as redundancy in case the MQTT message protocol fails. Next steps will include the integration of the proposed solution with an analytical data system based on big data tools, as well as performance improvements in the capture of the frames by using an Odroid electronic board.

Acknowledgment. The authors would like to thank the Ecuadorian Corporation for the Development of Research and the Academy (RED CEDIA) for its financial support in the development of this work, under Project Grant CEPRA-XI-2017-13.

References

1. World Health Organization: Ambient air pollution: a global assessment of exposure and burden of disease (2016)
2. Zhi-gang, H., Cai-hui, C.: The application of Zigbee based wireless sensor network and GIS in the air pollution monitoring.
In: 2009 International Conference on Environmental Science and Information Application Technology, Wuhan, pp. 546– 549 (2009). https://doi.org/10.1109/ESIAT.2009.192 3. Banghong, X., Yang, L., Honglei, Z., Junfeng, L.: Application design of wire-less sensor networks in environmental pollution monitoring. Comput. Measur. Control 2, 003 (2009) 182 B. Guanochanga et al. 4. Postolache, O.A., Dias Pereira, J.M., Silva Girao, P.M.B.: Smart sensors network for air quality monitoring applications. IEEE Trans. Instrum. Measur. 58(9), 3253– 3262 (2009). https://doi.org/10.1109/TIM.2009.2022372 5. Eren, H., Al-Ghamdi, A., Luo, J.: Application of Zigbee for pollution monitoring caused by automobile exhaust gases. In: 2009 IEEE Sensors Applications Sympo-sium, New Orleans, LA, pp. 164–168 (2009). https://doi.org/10.1109/SAS.2009. 4801799 6. Bader, S., Anneken, M., Goldbeck, M., Oelmann, B.: SAQnet: experiences from the design of an air pollution monitoring system based on o?-the-shelf equipment. In: 2011 Seventh International Conference on Intelligent Sensors, Sensor Networks and Information Processing, Adelaide, SA, pp. 389–394 (2011). https://doi.org/ 10.1109/ISSNIP.2011.6146632 7. Liu, J.H., Chen, Y.F., Lin, T.S., Lain, D.W., Wen, T.H., Sun, C.H., Jiang, J.A.: Developed urban air quality monitoring system based on wireless sensor networks. In: 2011 Fifth International Conference on Sensing Technology, Palmerston North, pp. 549–554 (2011). https://doi.org/10.1109/ICSensT.2011.6137040 8. Zhou, G., Chen, Y.: The research of carbon dioxide gas monitoring platform based on the wireless sensor networks. In: 2011 2nd International Conference on Arti?- cial Intelligence, Management Science and Electronic Commerce (AIMSEC), Deng Leng, pp. 7402–7405 (2011). https://doi.org/10.1109/AIMSEC.2011.6010423 9. Yan, Z., Eberle, J., Aberer, K.: OptiMoS: optimal sensing for mobile sensors. In: 2012 IEEE 13th International Conference on Mobile Data Management, Bengaluru, Karnataka, pp. 105–114 (2012). https://doi.org/10.1109/MDM.2012.43 10. Mao, X., Miao, X., He, Y., Li, X.Y., Liu, Y.: CitySee: urban CO2 monitoring with sensors. In: 2012 Proceedings IEEE INFOCOM, Orlando, FL, pp. 1611–1619 (2012). https://doi.org/10.1109/INFCOM.2012.6195530 11. Wang, C.H., Huang, Y.K., Zheng, X.Y., Lin, T.S., Chuang, C.L., Jiang, J.A.: A self sustainable air quality monitoring system using WSN. In: 2012 Fifth IEEE Inter-national Conference on Service-Oriented Computing and Applications (SOCA), Taipei, pp. 1–6 (2012). https://doi.org/10.1109/SOCA.2012.6449427 12. Devarakonda, S., Sevusu, P., Liu, H., Liu, R., Iftode, L., Nath, B.: Real-time air quality monitoring through mobile sensing in metropolitan areas. In: Proceedings of the 2nd ACM SIGKDD International Workshop on Urban Computing, p. 15, August 2013. https://doi.org/10.1145/2505821.2505834 13. Kadri, A., Yaacoub, E., Mushtaha, M., Abu-Dayya, A.: Wireless sensor network for real-time air pollution monitoring. In: 2013 1st International Conference on Communications, Signal Processing, and their Applications (ICCSPA), Sharjah, pp. 1–5 (2013). https://doi.org/10.1109/ICCSPA.2013.6487323 14. Kelly, S.D.T., Suryadevara, N.K., Mukhopadhyay, S.C.: Towards the Implementa-tion of IoT for environmental condition monitoring in homes. IEEE Sens. J. 13(10), 3846–3853 (2013). https://doi.org/10.1109/JSEN.2013.2263379 15. 
Fuertes, W., Carrera, D., Villac´is, C., Toulkeridis, T., Gal´arraga, F., Torres, J., Aules, H.: Distributed system as internet of things for a new low-cost, air pollution wireless monitoring on real time. In: IEEE/ACM 19th International Symposium on Distributed Simulation and Real Time Applications (DS-RT), Chengdu, China, pp. 58–67 (2015). https://doi.org/10.1109/DS-RT.2015.28 16. Mansour, S., Nasser, N., Karim, L., Ali, A.: Wireless sensor network-based air quality monitoring system. In: 2014 International Conference on Computing, Net-working and Communications (ICNC), Honolulu, HI, pp. 545–550 (2014). https:// doi.org/10.1109/ICCNC.2014.6785394 Real-Time Air Pollution Monitoring Systems Using WSN 183 17. Kim, J.Y., Chu, C.H., Shin, S.M.: ISSAQ: an integrated sensing systems for real-time indoor air quality monitoring. IEEE Sens. J. 14(12), 4230–4244 (2014). https://doi.org/10.1109/JSEN.2014.2359832 18. Abraham, S., Li, X.: A cost-e?ective wireless sensor network system for indoor air quality monitoring applications. Procedia Comput. Sci. 34, 165–171 (2014). https://doi.org/10.1016/j.procs.2014.07.090 19. Ferdoush, S., Li, X.: Wireless sensor network system design using Raspberry Pi and Arduino for environmental monitoring applications. Procedia Comput. Sci. 34, 103–110 (2014). https://doi.org/10.1016/j.procs.2014.07.059 20. Liu, S., Xia, C., Zhao, Z.: A low-power real-time air quality monitoring system using LPWAN based on LoRa. In: 2016 13th IEEE International Conference on Solid-State and Integrated Circuit Technology (ICSICT), Hangzhou, pp. 379–381 (2016). https://doi.org/10.1109/ICSICT.2016.7998927 21. Sugiarto, B., Sustika, R.: Data classi?cation for air quality on wireless sensor net-work monitoring system using decision tree algorithm. In: 2016 2nd International Conference on Science and Technology-Computer (ICST), Yogyakarta, pp. 172–176 (2016). https://doi.org/10.1109/ICSTC.2016.7877369 22. Pieri, T., Michaelides, M.P.: Air pollution monitoring in lemesos using a wireless sensor network. In: 2016 18th Mediterranean Electrotechnical Conference (MELE- CON), Lemesos, pp. 1–6 (2016). https://doi.org/10.1109/MELCON.2016.7495468 23. Boubrima, A., Bechkit, W., Rivano, H.: Optimal WSN deployment models for air pollution monitoring. IEEE Trans. Wirel. Commun. 16(5), 2723–2735 (2017). https://doi.org/10.1109/TWC.2017.2658601 24. Pavani, M., Rao, P.T.: Real time pollution monitoring using Wireless Sensor Net-works. In: 2016 IEEE 7th Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON), Vancouver, BC, pp. 1–6 (2016). https:// doi.org/10.1109/IEMCON.2016.7746315 25. Pavani, M., Rao, P.T.: Urban air pollution monitoring using wireless sensor net-works: a comprehensive review. Int. J. Commun. Netw. Inf. Secur. (IJCNIS) 9(3) (2017) 26. Hojaiji, H., Kalantarian, H., Bui, A.A.T., King, C.E., Sarrafzadeh, M.: Temper-ature and humidity calibration of a low-cost wireless dust sensor for real-time monitoring. In: 2017 IEEE Sensors Applications Symposium (SAS), Glassboro, NJ, pp. 1–6 (2017). https://doi.org/10.1109/SAS.2017.7894056 27. Jaladi, A.R., Khithani, K., Pawar, P., Malvi, K., Sahoo, G.: Environmental mon-itoring using Wireless Sensor Networks (WSN) based on IOT. Int. Res. J. Eng. Technol. (IRJET) 4, 1371–1378 (2017) 28. Sivamani, S., Choi, J., Bae, K., Ko, H., Cho, Y.: A smart service model in green-house environment using event-based security based on wireless sensor network. Concurrency Comput. Pract. Exp. 30, 1–11 (2018). https://doi.org/10.1002/cpe. 
4240 29. Yadav, M., Sethi, P., Juneja, D., Chauhan, N.: An agent-based solution to energy sink-hole problem in ?at wireless sensor networks. In: Next-Generation Networks, vol. 638, pp. 255–262. Springer, Singapore (2018). https://doi.org/10.1007/978- 981-10-6005-2-27 30. Aznoli, F., Navimipour, N.J.: Deployment strategies in the wireless sensor net-works: systematic literature review, classi?cation, and current trends. Wirel. Pers. Commun. 95, 819–846 (2017). https://doi.org/10.1007/s11277-016-3800-0 184 B. Guanochanga et al. 31. Xu, Y., Liu, F.: Application of wireless sensor network in water quality monitoring. In: 2017 IEEE International Conference on Computational Science and Engineering (CSE) and IEEE International Conference on Embedded and Ubiquitous Comput-ing (EUC), Guangzhou, pp. 368–371 (2017). https://doi.org/10.1109/CSE-EUC. 2017.254 32. Yu, J., Wang, W., Yin, H., Jiao, G., Lin, Z.: Design of real time monitoring system for rural drinking water based on wireless sensor network. In: 2017 International Conference on Computer Network, Electronic and Automation (ICCNEA), Xi’an, pp. 281–284 (2017). https://doi.org/10.1109/ICCNEA.2017.102 33. Yang, J., Zhou, J., Lv, Z., Wei, W., Song, H.: A real-time monitoring system of industry carbon monoxide based on wireless sensor networks. Sensors 15(11), 29535–29546 (2015) 34. Nikhade, S.G.: Wireless sensor network system using Raspberry Pi and Zigbee for environmental monitoring applications. In: 2015 International Conference on Smart Technologies and Management for Computing, Communication, Controls, Energy and Materials (ICSTM), pp. 376–381 (2015) 35. Delamo, M., Felici-Castell, S., P´erez-Solano, J.J., Foster, A.: Designing an open source maintenance-free environmental monitoring application for wireless sensor networks. J. Syst. Softw. 103, 238–247 (2015) 36. Moltchanov, S., Levy, I., Etzion, Y., Lerner, U., Broday, D.M., Fishbain, B.: On the feasibility of measuring urban air pollution by wireless distributed sensor networks. Sci. Total Environ. 502, 537–547 (2015) 37. Chen, Z., Hu, C., Liao, J., Liu, S.: Protocol architecture for wireless body area network based on nRF24L01. In: 2008 IEEE International Conference on Automa-tion and Logistics, Qingdao, pp. 3050–3054 (2008). https://doi.org/10.1109/ICAL. 2008.4636702 38. Constantin, S., Toulkeridis, T., Moldovan, O.T., Villacis, M., Addison, A.: Caves and karst of Ecuador - state-of-the-art and research perspectives. Physical Geog-raphy in press (2018). https://doi.org/10.1080/02723646.2018.1461496 A Multi-agent Model for Security Awareness Driven by Home User’s Behaviours Farhad Foroughi(?) and Peter Luksch Institute of Computer Science, University of Rostock, Rostock, Germany {farhad.foroughi,peter.luksch}@uni-rostock.de Abstract. Computer users are limited to perform multitask operations and processing information. These limitations a?ect their decision and full attention on security tasks. The majority of cybercrimes and frauds including e?ective security decisions and practising security management are related to human factors even for experts. Information Security awareness and e?ective home user training depend on concrete information and accurate observation of user behav- iours and their circumstances. Users’ awareness and consciousness about security threats and alternatives motivate them to take proper actions in a security situa- tion. This research proposes a multi-agent model that provides security awareness based on users’ behaviours in interaction with home computer. 
Machine learning is utilized by this model to pro?le users based on their activities in a cloud infra- structure. Machine learning improves intelligent agent accuracy and cloud computing makes it ?exible, scalable and enhances performance. Keywords: Home user’s behaviour · Security awareness Intelligent multi-agent model · User pro?ling 1 Introduction Computer users are limited to perform multitask operations and processing information. These limitations a?ect their decision and full attention on security tasks. Two signi?- cant factors to choose the best action are individual perception climate and self-e?cacy [1, 2]. There is a wide range of home computer usage with di?erent types of users. More- over, research and study over the home computer security are challenging because there is no canonical and speci?c de?nition of home computer user. A home user may use a computer for shopping and banking and other normal daily tasks. The user could be students who use the computer for learning purposes and use educational software. The age and gender also may a?ect the using of a computer at home. According to these conditions and contexts, users’ information security behaviour is very dynamic and changeable. The di?erences between users a?ect their decisions to support security or often ignore it [3]. In addition, Information Technology brings new technology in houses, and the focus of security solutions is also technological. The majority of the cybercrimes and frauds even for experts are related to human factors including e?ective security decisions and practicing security management [4, 5]. © Springer Nature Switzerland AG 2019 K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 185–195, 2019. https://doi.org/10.1007/978-3-030-02686-8_15 Byrne et al. [6] analysis presents that computer knowledge and expertise a?ects the importance of new threats. For example, integrity perception is signi?cant for users with extensive knowledge. They also provide evidences that users ignore privacy settings to follow their habits. Information Security awareness and e?ective home user training depend on concrete information and accurate observation of user behaviours and their circumstances. Without this information, it would be di?cult to provide e?ective advice or create proper policy. In additions, as long as individuals fail to provide secure behaviour and interact with computer safely, other relevant organisations such as government, ?nancial insti- tutes and shopping markets that provided online services could be in danger and at risk. This paper proposes a multi-agent model to provide security awareness and training material based on users’ behaviours and home computer interactions. This model uses machine learning on a cloud platform to analyse behaviours in real time or very close to that. In Sect. 2, signi?cant human factors that in?uence user’s decision in a security situation are discussed. Section 3 introduces the required characteristics of an e?ective awareness program for home users. User pro?ling is the process of capturing user-computer interaction to model user’s behaviours introduced in Sect. 4. Finally, Sect. 5 proposes a multi-agent model and discusses each element of the model. 2 Human Factors Psychologists and cognitive scientists say personal behaviours are linked to the person- ality pro?le. Some factors like age or age group, gender, personal interests and hobbies, occupation, education, and history of actions are included in the personality pro?le. 
This is important to understand users’ online activities and behaviours as well as their personality and occupation (or computer role) to be able to provide appropriate security awareness and training. By providing systematic awareness and guidance for all users sharing a computer or home network based on their behaviours, this e?ect could become a security culture. “Every [security] system is inadequate if there is no security culture shared by the whole sta?” [7]. The information security culture for home users is an important element to provide an e?ective and continues secure, safe behaviour [7]. The information security respon- sibility as well as physical security of users are the essential pieces of a comprehensive way to deal with information security management. Metalidou has categorised all human factors related to this aspect in four groups. These groups are (1) user interfaces of security-related systems; (2) information security management concerns for risk, busi- ness processes and ?nance; (3) organisational issues related to information security behaviour, and (4) counterproductive computer usage [5]. It is ?nally individuals who make decisions in any information security implemen- tation, but most of the home users’ security decisions are limited to their technological solutions. 186 F. Foroughi and P. Luksch Having improved security controls does not mean they are free from risk. West proves that individuals maintain an appropriate level of risk and danger [2]. In the home security context, it means a security control implementation or improvement will increase the users’ risky behaviour. Technical security controls in?uence the users’ actions by providing security func- tions and mechanisms, but human factors also a?ect individual’s decisions. Human factors are including motivation, knowledge, attitude, values and so on. The quality and accuracy of risk perception impacts users’ awareness, consciousness and behaviour and motivate them to take proper action in an information security management system [8]. In addition, any awareness program and education plan depend on the views of facili- tating the people to make relevant and e?ective security choices and thus achieve greater suitable information security consequences [9]. 2.1 Security Awareness Program When a home user is in a security situation or having a risky behaviour, having appro- priate skills or knowledge against the threat would lead the user to play an active role. The con?dence based on appropriate solutions will push users to choose adaptive behaviours more than maladaptive actions [10]. Awareness training generally includes security situations that may occur, the risks confronted, fundamental methods of security, how to build e?ective security behaviour, and recommended resources and support in a security scenario. Within the home security context, users are able to decide whether and how to carry out security actions because their options and alternatives are voluntary and subjective. To follow the decision-making process and to analyse the situation, researchers recognise ?ve factors that in?uence users’ decisions in computer security situations. These factors are [3]: (1) Recognition, awareness and consciousness of safe practices. (2) Recognition, awareness and consciousness of possible negative consequences of unsafe actions. (3) Recognition, awareness and consciousness of possible supportive resources for safe practices. (4) Probably and likelihood of negative consequences. (5) Cost of consequences. 
These ?ve factors could be categorised in two general divisions: (1) Awareness and knowledge of risks as well as consequences. (2) Awareness and understanding of defen- sive and protective measures [3]. Therefore, to provide an e?ective security awareness program, it is signi?cant to support human factors that in?uence users’ decisions. Home users like other individuals are unmotivated and have a limited capacity for information processing speci?cally in multitasking scenarios. Users need the motivation to improve their capabilities. When a user has to evaluate alternative options in a situation to make the best choice and decision, results which are actually abstract in nature such as security and protection A Multi-agent Model for Security Awareness 187 are likely to be less persuasive compared to those that are concrete. Consequently, users need to have a concrete understanding of security de?nitions [11]. In a typical and normal learning position, a behaviour is formed by positive rein- forcement whenever take action “right”. Hence, users need feedback and learning form speci?c and particular security-related decisions and not just common protection or dangerous choices. The protection and safety measure gain is generally conceptual but negative e?ects, and consequences are stochastic, costly and immediate. Accordingly, users should be able to evaluate any security and risk trade-o?. Furthermore, security bene?t and gain are usually intangible or conceptual, but in the opposite, security cost or losses values are more probable [12]. Because of this, cost and loss perception are more important in?uence factors than gain and bene?t when individuals try to evaluate security risks. However, Tversky and Kahneman proved that individuals are a lot more likely to stay away risk when options are provided as bene?ts and take risk when alternatives are presented as losses [13]. They also con?rmed that when users perceive a gain and loss to have the very same bene?t, the loss is considerably more motivating in choosing alternatives (Fig. 1). For example, online shoppers respond more properly to the understanding of likelihood and chance of negative threats than to awareness of the threats themselves [3]. Fig. 1. Losses carry more value compared to gain when both are perceived as equal. The fear manipulation will in?uence the perceived intensity of the risk and threat. In addition, an increase of fear appeals will improve the chance and likelihood of a threat to be realised. Rewards could be an individual pleasure or a ful?lment by peers. The social acceptance might also be a kind of rewards. Fear awaking could adjust both threat (risk) perception and threat (risk) probability. Therefore, providing threat (risk) evalu- ation is considered to prevent maladaptive reactions [14]. As it is discussed, users usually feel they are at less risk than others. Based on these ?ndings, it is almost always necessary to improve and enhance users’ risk perception awareness to increase their security and protection compliance. Raising risk perception and understanding might also be corporate and comprehen- sive to decrease the probability and chance of security policy violation. It means home user security awareness should be assembled to produce su?cient information and 188 F. Foroughi and P. Luksch knowledge and support all family members or individuals who share a computer to eliminate security risks. 
Clearly, in case home users have to take extra measures and steps to increase their level of protection, it should not be di?cult, and the cost of applying and employing security controls should be reduced as much as possible with e?cient support. 3 User Pro?ling Computer security awareness and training has to be personalised to produce the home user with a su?cient and e?ective learning experience lined up with his/her day-to-day occupation, activities, time availability, interests, generation and connection with owned technology. The capability of data analysis to correlate information and data from a broad range of sources across substantial time periods could bring out a clear and e?cient under- standing of home users’ activities and behaviours. By using this analysis concerning big data sources, makes security awareness program able to categories users in di?erent risk groups and provide the appropriate information and training. For this reason, recognising user behaviour in real time is an important element of providing relevant information and help to take suitable action or decision. It is possible to employ user modelling to make this process automatic by using an application or intelligent agent [15]. It is proved that the user should be realised in a variety of contexts. Therefore, a context-aware system should be utilised to identify user context in a certain time period [16]. This aspect drives the idea of using data science and machine learning to automate the user behaviour analysis to provide a data-driven decision-making model. A home user could be recognised in cyberspace by a digital pro?le [3]. A research by Weber et al. proves that a user pro?le presents (1) the user’s behavioural patterns or preferences, (2) the user’s characteristics, (3) the user’s skills, and (4) the cognitive process that a user chooses an action [17, 18]. The primary function of user pro?ling is capturing user’s information about interest domain. This information may be used to understand more about individual’s knowledge and skills and to improve user satisfaction or help to make a proper decision. The user pro?le consists of all information about a user that could be known by the system. User pro?ling is usually either knowledge-based or behaviour-based. The knowl- edge-based strategy uses statistical models to categorise a user in the closest model based on dynamic attributes. The behaviour-based strategy employs the user’s behaviours and actions as a model to observe bene?cial patterns by applying machine learning techniques. Real-time user behaviour analysis requires on-line monitoring to predict users actions. These behav- iours could be extracted through monitoring and logging tasks [19]. Batch analysis or o?-line monitoring could be carried out in time intervals or after a task has been ?nished by a user in accordance with statistical parameters of user actions. Using online and o?- line monitoring modes together provide both statistical and dynamical analysis of user actions [20]. A Multi-agent Model for Security Awareness 189 Generally, a user pro?ling begins with user’s data retrieval and data collection. Collecting user information (actions details) is the ?rst step to create a user model. It includes “what” information required and “how” to collect relevant information. Data gathering model could be explicit or implicit [21]. 
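As a minimal illustration of the behaviour-based strategy just described, the sketch below turns implicitly logged user actions into simple numeric features and groups sessions with an off-the-shelf clustering algorithm. The feature names, the toy action log and the choice of k-means are assumptions for illustration, not the authors' implementation.

```python
from collections import Counter
import numpy as np
from sklearn.cluster import KMeans

def extract_features(actions):
    """Turn a list of implicitly logged (action, resource) pairs into a feature vector."""
    counts = Counter(kind for kind, _ in actions)
    total = max(len(actions), 1)
    return [
        counts["browse"] / total,      # share of browsing activity
        counts["download"] / total,    # share of downloads
        counts["install"] / total,     # share of software installs
        counts["settings"] / total,    # share of security-setting changes
    ]

# Hypothetical sessions captured by silent (implicit) monitoring
sessions = [
    [("browse", "news"), ("browse", "bank"), ("settings", "firewall")],
    [("download", "freeware.exe"), ("install", "freeware.exe"), ("browse", "shop")],
    [("browse", "school"), ("browse", "video"), ("download", "homework.pdf")],
]

X = np.array([extract_features(s) for s in sessions])
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("behaviour group per session:", model.labels_)
```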
Explicit model means the computer user should be encouraged to provide a speci?c amount of information, but just a few number of users participate in such a process and furthermore, the provided information also has poor quality. Another signi?cant point, if keeping data up to date is necessary, this data collection model becomes more chal- lenging [22]. Implicit data collection model is a “silent” process to collect information through analysing observed users’ actions and reactions in a computer interaction environ- ment [22]. A hybrid pro?ling model considers both static characteristics and features of a user and also, tries to retrieve the behavioural information about the user. This strategy creates a more e?cient pro?le and maintains the accuracy of user data by keeping it up to date. A major attribute of discovery through observation is user’s change adaptation. It means, when user’s interest, preferences, habits and goals are changed over the time, these changes could be re?ected in the user pro?le to keep it updated. This attribute is possible by using pro?ling techniques which adapt and adjust the content of user pro?les when new observation data arrived. User feedback could also play an essential role in this particular process [23]. Collecting a wide range of user’s data creates speci?c challenges and needs an infra- structure to support several requirements including security, privacy and performance. The data collection should be transparent as much as possible with minimum user inter- action. It also should not make the limitation on system computing or network perform- ance. Because the behaviour analysis model may require a di?erent type of data over a time period, data collector architecture should be ?exible to cover various sensor types and technologies on di?erent platforms. 4 Multi-agent Model Multiple heterogeneous software entities (agents) that interact with each other directly or indirectly in a complex system with common or common or con?icting goals build a multi-agent system. [24]. A direct communication might be via messaging, and indirect communication could be through making an e?ect on the environment which the other agent(s) can sense it [25]. An agent provides noticeable characteristics including autonomous, social (interact with other agents), reactive, proactive, trustworthiness, rationality and learning. Reac- tivity character makes agents able to provide ongoing interaction with the system. Agents are proactive and rational which develops agent behaviour in accordance with its goal [20]. 190 F. Foroughi and P. Luksch The environment that home user interacts with a computer is continual, observable, dynamic, accessible and non-deterministic. This complex environment requires a multi-agent system to provide an infrastructure which agents could interact with each other to achieve the system goal. An intelligent agent is an ideal rational agent that provides actions to reach the highest level of performance measure by using provided evidence and built-in knowledge. The performance measure determines principles of success but should be carefully de?ned to concern con?icting criteria. A rational agent is an agent that performs right actions to achieve its goal as successful as possible. It means that a rational agent has to be reasonable, sensible and provides good judgment. 
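The toy sketch below illustrates the two interaction styles mentioned above: direct communication by message passing between agents, and indirect communication by modifying a shared environment that other agents can sense. Class, method and agent names are hypothetical and only echo the roles described later in the proposed model.

```python
class Environment:
    """Shared state that agents can modify and sense (indirect communication)."""
    def __init__(self):
        self.blackboard = {}

class Agent:
    def __init__(self, name, env):
        self.name = name
        self.env = env
        self.inbox = []

    # Direct communication: send a message straight to another agent.
    def send(self, other, message):
        other.inbox.append((self.name, message))

    # Indirect communication: leave information in the environment ...
    def post(self, key, value):
        self.env.blackboard[key] = value

    # ... which another agent can later sense.
    def sense(self, key):
        return self.env.blackboard.get(key)

env = Environment()
ui, profiler = Agent("UI", env), Agent("UserProfiler", env)

ui.send(profiler, {"event": "new_activity_log", "entries": 42})  # direct
ui.post("risk_context", "browsing")                              # indirect
print(profiler.inbox, profiler.sense("risk_context"))
```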
Rationality depends on performance measures (determines the level of success), agent perception from the past (prior knowledge), agent understanding about the environment (perception sequence) and possible actions [26]. The perform- ance measure de?nes the criterion of success for an agent. An intelligent agent is based on learning model to run the inference engine. Feature extraction block receives information from sensors to extract useful features and then it will send them to the inference engine. The trained inference engine uses this information based on learning model to predict a result. The learning model is constructed by a machine learning algorithms [27]. The inference engine provides a decision and sends it to the actuator. The actuator is responsible for performing necessary action(s). Machine learning, stored knowledge and condition rules are typical techniques to make an agent intelligent. Machine learning imparts intelligence by using labelled data and training process. This approach makes it possible to extract patterns and relation- ships to predict unknown data to solve the problem [27]. A distributed multi-agent architecture could supply the ?exibility of providing required functions in the necessary locations. It also requires less programming chal- lenges and system control by employing global objectives to supply necessary knowl- edge and experiences that make agents able to solve complex problems by more autonomy [28]. Distributed multi-agent system by using cloud computing power is a combination of distributed independent, autonomous and incomplete agents that work together to address a complex global issue with no need of centralised system control [28]. In this cloud architecture, data is decentralised, and computing is asynchro- nous [29]. This architecture lets devices implement more features with limited storage and processing capabilities. The proposed model (Fig. 2) tries to develop an architecture by integrating cloud computing approach and multi-agents architecture to provide a dynamic, ?exible, robust and scalable intelligent system. A Multi-agent Model for Security Awareness 191 Fig. 2. Proposed multi-agent model. In this architecture, the user interface (UI) agent directly interacts with user and computer to collect required data through independent sensors and also provides relevant information including warnings, or training materials. The UI agent has sensor modules including di?erent independent sensors which are responsible for capturing users’ actions from a wide range of resources such as browser history, system settings, ?le system, network interfaces and user data (via an explicit method). It extracts relevant features and also generates data logs. These logs consist of personal and private details about a user. It is necessary to provide appropriate security measures to keep them safe and con?dential. For this reason, the UI requires two types of data storages to create and maintain (update) information. Online storage to store user data and pro?le: For security purpose, a Secured Virtual Di?used File System (SVDFS) by using private cloud is proposed. The data exchange between UI agent and cloud is also protected by a secure communication protocol using PKI. O?ine storage to store log ?les and user activities for further analysis or until trans- mitted to the server. These log ?les are stored in an encrypted container with password protection. 
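As a minimal sketch of the offline storage just described, where log entries are kept encrypted under a password until they are transmitted, the snippet below uses the Python cryptography library (Fernet with a PBKDF2-derived key). The library choice, file names and entry fields are assumptions for illustration, not the authors' implementation.

```python
import base64, json, os, time
from cryptography.fernet import Fernet
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC

def key_from_password(password: str, salt: bytes) -> bytes:
    # Derive a Fernet key from the user's password and a stored random salt
    kdf = PBKDF2HMAC(algorithm=hashes.SHA256(), length=32, salt=salt, iterations=480_000)
    return base64.urlsafe_b64encode(kdf.derive(password.encode()))

def append_encrypted_log(path: str, password: str, entry: dict) -> None:
    """Encrypt one activity-log entry and append it to the offline container."""
    salt_path = path + ".salt"
    if not os.path.exists(salt_path):
        with open(salt_path, "wb") as f:
            f.write(os.urandom(16))
    with open(salt_path, "rb") as f:
        salt = f.read()
    token = Fernet(key_from_password(password, salt)).encrypt(json.dumps(entry).encode())
    with open(path, "ab") as f:
        f.write(token + b"\n")

append_encrypted_log("ui_agent.log.enc", "home-user-passphrase",
                     {"ts": time.time(), "sensor": "browser_history", "event": "visit"})
```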
The user pro?ler (UP) agent receives extracted features from UI and uses machine learning to process information and create (and update) user pro?le. The UP agent uses cloud computing to provide a dynamic, distributed and scalable service. The risk evaluator (RE) agent receives user pro?le information from the UP agent and also recent threats and vulnerabilities from the threat ?nder (TF) agent. According to the user pro?le which describes the necessary level of security and relevant security measures, the RE agent analyses user’s actions by utilising machine learning techniques and provides a risk level and related threat’s information to the awareness provider agent. 192 F. Foroughi and P. Luksch The awareness provider (AP) agent uses an awareness and security control repository to create appropriate awareness and training material covering threats, vulnerabilities, risk level and required protective or preventive actions. This information will be sent to the UI agent to be presented by a suitable method through visualiser modules. Figure 3 illustrates the layered architecture of the multi-agent model and the commu- nication links between agents. Fig. 3. Layered architecture of proposed multi-agent model. 5 Conclusion Home users like other individuals are unmotivated and have a limited capacity for information processing in the security situations. Users’ awareness and consciousness about security threats and alternatives motivate them to take proper action in an infor- mation security management system. An e?ective security awareness requires a concrete understanding of security de?nitions, and learning form speci?c security-related deci- sions. It should also provide security control evaluation and risk trade-o? when loss perception and cost is considerably more motivating in choosing alternatives. Risk perception awareness is a signi?cant factor to increase user’s security and protection compliance. This research has proposed a multi-agent model that provides security awareness based on users’ behaviours in interaction with home computer. Machine learning is utilized by this model to pro?le users based on their activities in a cloud infrastructure. Machine learning improves intelligent agent accuracy and cloud computing makes it ?exible, scalable and enhances performance. This research is limited to cover only home users’ requirements and awareness program is based on security risks which might be occurred in accordance of general users’ activities. Moreover, it is significant to handle a huge amount of data in an online mode and process data streams in real time. Therefore, there are many machine learning classifiers based on Neural Network (NN), Bayesian learnings, Decision trees and, statistical analysis tools which should be trained and tested in A Multi-agent Model for Security Awareness 193 accordance with samples which will be collected through a volunteer program to find best possible online classifier. In this ?eld, the next challenge is to identify required monitoring sensors to observe users’ behaviour and provide a comparison between machine learning algorithms to achieve the best performance. References 1. Hazari, S., Hargrave, W., Clenney, B.: An empirical investigation of factors in?uencing information security behavior. J. Inf. Priv. Secur. 4(4), 3–20 (2008) 2. West, R.: The psychology of security. Commun. ACM 51(4), 34–40 (2008) 3. Howe, A.E., et al. The psychology of security for the home computer user. In: 2012 IEEE Symposium on Security and Privacy (SP). 
IEEE (2012) 4. Wash, R.: Folk models of home computer security. In: Proceedings of the Sixth Symposium on Usable Privacy and Security. ACM (2010) 5. Metalidou, E., et al.: The human factor of information security: unintentional damage perspective. Proc. Soc. Behav. Sci. 147, 424–428 (2014) 6. Bryant, P., Furnell, S., Phippen, A.: Improving protection and security awareness amongst home users. Adv. Netw. Comput. Commun. 4, 182 (2008) 7. Malcolmson, J.: What is security culture? Does it di?er in content from general organisational culture? In: 43rd Annual 2009 International Carnahan Conference on Security Technology (2009) 8. Albrechtsen, E.: A qualitative study of users’ view on information security. Comput, Secur. 26(4), 276–289 (2007) 9. Mai, B., et al.: Neuroscience Foundations for Human Decision Making in Information Security: A General Framework and Experiment Design, in Information Systems and Neuroscience, pp. 91–98. Springer, Berlin (2017) 10. Milne, G.R., Labrecque, L.I., Cromer, C.: Toward an understanding of the online consumer’s risky behavior and protection practices. J. Consum. A?airs 43(3), 449–473 (2009) 11. Borgida, E., Nisbett, R.E.: The di?erential impact of abstract vs. concrete information on decisions. J. Appl. Soc. Psychol. 7(3), 258–271 (1977) 12. Zurko, M.E., Simon, R.T.: User-centered security. In: Proceedings of the 1996 Workshop on New Security Paradigms. ACM (1996) 13. Tversky, A., Kahneman, D.: Rational choice and the framing of decisions. J. Bus. 59, S251– S278 (1986) 14. Mckenna, S.P., Predicting health behaviour: research and practice with social cognition models. In: Conner, M., Norman, P. (eds.) Open University Press, Buckingham (1996). 230 p. Elsevier, ISBN 0-335-19320-X 15. Iglesias, J.A., et al.: Creating evolving user behavior pro?les automatically. IEEE Trans. Knowl. Data Eng. 24(5), 854–867 (2012) 16. Dino?, R., et al. Learning and managing user context in personalized communications services. In: Proceedings of the International Workshop in Conjunction with AVI 2006 on Context in Advanced Interfaces. ACM (2006) 17. Weber, E.U., Blais, A.R., Betz, N.E.: A domain-speci?c risk-attitude scale: measuring risk perceptions and risk behaviors. J. Behav. Decis. Mak. 15(4), 263–290 (2002) 18. Iglesias, J.A., Ledezma, A., Sanchis, A.: Evolving systems for computer user behavior classi?cation. In: 2013 IEEE Conference on Evolving and Adaptive Intelligent Systems (EAIS). IEEE (2013) 194 F. Foroughi and P. Luksch 19. Middleton, S.E., Shadbolt, N.R., De Roure, D.C.: Ontological user pro?ling in recommender systems. ACM Trans. Inf. Syst. 22(1), 54–88 (2004) 20. Kussul, N., Skakun, S.: Intelligent system for users’ activity monitoring in computer networks. In: Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications, IDAACS 2005. IEEE (2005) 21. Schölkopf, B., et al.: Estimating the support of a high-dimensional distribution. Neural Comput. 13(7), 1443–1471 (2001) 22. Ouaftouh, S., Zellou, A., Idri, A.: User pro?le model: a user dimension based classi?cation. In: 2015 10th International Conference on Intelligent Systems: Theories and Applications (SITA). IEEE (2015) 23. Schia?no, S., Amandi, A.: Intelligent user pro?ling. In: Arti?cial Intelligence an International Perspective, pp. 193–216. Springer (2009) 24. Shoham, Y., Leyton-Brown, K.: Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations. Cambridge University Press, Cambridge (2008) 25. Maes, P.: Pattie Maes on software agents: humanizing the global computer. 
IEEE Internet Comput. 1(4), 10–19 (1997) 26. Stuart, R., Peter, N.: Arti?cial Intelligence-A Modern Approach, vol. 3. California, Berkeley (2016) 27. Joshi, P.: Arti?cial Intelligence with Python. Packt Publishing, Birmingham (2017) 28. Rodríguez, S., et al.: Cloud computing integrated into service-oriented multi-agent architecture. In: Balanced Automation Systems for Future Manufacturing Networks, pp. 251– 259. Springer, Berlin (2010) 29. Wooldridge, M.: An Introduction to Multiagent Systems. Wiley, London (2009) A Multi-agent Model for Security Awareness 195 Light Weight Cryptography for Resource Constrained IoT Devices Hessa Mohammed Zaher Al Shebli(?) and Babak D. Beheshti(?) New York Institute of Technology, Old Westbury, NY 11568, USA Babak.beheshti@nyit.edu Abstract. The Internet of Things (IoT) is going to change the way we live dramatically. Devices like alarm clocks, lights and speaker systems can inter- connect and exchange information. Billions of devices are expected to be inter- connected by the year 2020, thus raising the alarm of a very important issue ‘security’. People have to be sure that their information will stay private and secure, if someone hacked into your medical device (hand watch) he will be able to view all your medical records, and he could be able to use it against you. If one device is hacked your entire network is going to be compromised. Transmitting your information securely between IoT devices using traditional crypto algo- rithms are not possible because those devices have limited energy supply, limited chip area and limited memory size; because of those constraints a new type of crypto algorithm came into place: the light weight crypto algorithms. As the name implies those algorithms are light and can be used in those devices with low computational power. In this paper, we start by describing some of the heavy ciphers. We also highlight some lightweight ciphers and the attacks known against them. Keywords: Light weight cryptography · IoT devices · Grain cipher Present cipher · Hight cipher 1 Introduction Security is the key concern on the technology world. With the rapid increase in the number of devices connecting to the internet these days, transmitting con?dential infor- mation in a secure manner is what people try to achieve when they use encryption. Encryption is the term used to hide the context of the original message (using an encryp- tion algorithm and a key) so only the intended user can decrypt it and read it. Figure 1 illustrates the basic ?ow of information through encryption. Encryption algorithms are divided into two main categories, symmetric algorithms and asymmetric algorithms; where the symmetric algorithms mean using only one key to perform both the encryption and decryption process. While the asymmetric algorithms use two keys (public and private) one to encrypt and the other to decrypt. Symmetric algorithms are also divided into two main groups stream ciphers and block ciphers, from their name indicates stream ciphers encrypts a bit by bit, while the block cipher encrypts a bunch of bits together. © Springer Nature Switzerland AG 2019 K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 196–204, 2019. https://doi.org/10.1007/978-3-030-02686-8_16 Fig. 1. Encryption process. In this paper we will start by introducing the general categories of symmetric key and assymetric key crypto algorithms. We will then proceed to survey some leading light weight algorithms. 
For each algorithm we introduce the fundamental structure, followed by the attacks that have been studied against it. At the end we present a comparison of performance parameters for these algorithms. 1.1 Symmetric Algorithms – Block Cipher (AES) We take AES as an example of a symmetric algorithm, since it is the most widely used algorithm today. AES stands for Advanced Encryption Standard; it is also known as Rijndael, its original name [10]. AES encrypts a fixed block size of 128 bits and has a key size of 128, 192, or 256 bits. AES was developed to replace DES, which had become vulnerable to brute-force attacks. AES encrypts blocks of data in a number of rounds that depends on the key size; for example, a 256-bit key requires 14 rounds [8]. The relation between the number of rounds and the key size is illustrated in Table 1.
Table 1. Number of rounds (R) in relation to cipher key size
  No. of rounds   Key size (bits)
  10              128
  12              192
  14              256
For encryption, each round of processing includes four steps: byte substitution, shift rows, mix columns, and add round key. All rounds are identical except for the last one. One round is shown in Fig. 2. Fig. 2. AES encryption round steps. The byte substitution step replaces each byte with a byte from a 16 × 16 lookup table. The shift rows step consists of shifting the rows of the state to the left: the first row is not shifted, while the second row is shifted by one byte, the third row by two bytes, and the fourth row by three bytes (Fig. 3). Fig. 3. Shift rows. No practical attacks against full AES are known, but AES requires considerable power and chip area for the encryption and decryption process. While this is not an issue in devices such as workstations and laptops, it is a concern for small devices that have to save power and have limited chip area. AES with a key size of 128 bits requires about 3,400 GE¹ of chip area, whereas only about 2,000 GE of chip area is typically allocated for security in an IoT device [9]. ¹ A gate equivalent (GE) is a unit of measure that allows the complexity of digital electronic circuits to be specified independently of the manufacturing technology. For today's CMOS technologies, the silicon area of a two-input drive-strength-one NAND gate usually constitutes the technology-dependent unit area commonly referred to as a gate equivalent. A specification in gate equivalents for a certain circuit reflects a complexity measure from which a corresponding silicon area can be deduced for a dedicated manufacturing technology (https://en.wikipedia.org/wiki/Gate_equivalent).
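To make the ShiftRows step concrete, the following is a minimal sketch (not the authors' code) that rotates each row of a 4 × 4 state as described above; for simplicity the state is represented row-wise, although the AES specification arranges it column-wise:

```python
def shift_rows(state):
    """AES ShiftRows: row r of the 4x4 byte state is rotated r bytes to the left."""
    return [row[r:] + row[:r] for r, row in enumerate(state)]

# Toy state with recognizable values, so the rotation is easy to follow.
state = [[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11],
         [12, 13, 14, 15]]
print(shift_rows(state))
# [[0, 1, 2, 3], [5, 6, 7, 4], [10, 11, 8, 9], [15, 12, 13, 14]]
```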
1.2 Asymmetric Algorithms – RSA RSA is another widely used crypto algorithm; it is an asymmetric algorithm that uses two different but mathematically linked keys, one to encrypt (the public key) and one to decrypt (the secret key). RSA got its name from the initials of the three scientists who first publicly described the algorithm in 1977 (Ron Rivest, Adi Shamir, and Leonard Adleman). The RSA algorithm consists of two steps: 1. Key generation. 2. RSA encryption and decryption. In the key generation step (generating a public key and a corresponding private key), two large prime numbers p and q are generated. The modulus n is then obtained by multiplying these two primes. Generating the modulus is easy, but factoring it back into the two primes we used is considered hard even with today's supercomputers. Next, we calculate φ(n) using the formula φ(n) = (p − 1)(q − 1). The public key (expressed as e) is then generated by choosing a prime number in the range between 3 and φ(n). The final public key is the pair of e and n, represented as (e, n). The private key (d) is the multiplicative inverse of the public key with respect to φ(n), and is also represented as a pair (d, n). For encryption and decryption the formula F(m, k) = m^k mod n is used, where k is the public key or the private key. Asymmetric algorithms (also known as public-key algorithms) rely on mathematical operations such as factorization to be effective. These operations need a lot of resources to complete and require a large hardware footprint, making them too expensive for IoT devices.
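As a quick illustration of these RSA steps, here is a toy sketch with deliberately tiny, insecure parameters (real deployments use moduli of 2048 bits or more); it is not the authors' implementation, just the textbook formulas written in Python:

```python
# Toy RSA with tiny primes -- for illustration only, never for real use.
p, q = 61, 53
n = p * q                   # modulus n = 3233
phi = (p - 1) * (q - 1)     # phi(n) = 3120
e = 17                      # public exponent, coprime to phi(n)
d = pow(e, -1, phi)         # private exponent: modular inverse of e, here 2753

m = 65                      # message encoded as an integer smaller than n
c = pow(m, e, n)            # encryption: c = m^e mod n
assert pow(c, d, n) == m    # decryption: m = c^d mod n recovers the message
```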
2 Light Weight Cryptography Lightweight cryptography is designed to secure the communication between IoT devices, since traditional cryptographic algorithms are not an option. IoT devices (also known as constrained devices) have constraints when it comes to speed, power consumption, area, processing, memory space and size [14]. The challenge is to reduce some of the algorithm parameters without affecting the overall security of the algorithm: the number of rounds, the key length and the processing cost have to be reduced. There are two ways to design a lightweight cryptographic algorithm: the first is to develop it from scratch, like the PRESENT cipher, and the second is to optimize the functionality of an existing traditional cryptographic algorithm such as AES or RSA. Lightweight algorithms fall into two main categories, hardware-oriented and software-oriented, based on the requirements of the cipher. Hardware-oriented ciphers are used when we are concerned about the number of clock cycles and the chip size, while software-oriented ciphers are used when we are concerned about memory space and power consumption. A standardization subcommittee of the Joint Technical Committee ISO/IEC JTC 1 of the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) started working on the lightweight cryptography project. ISO/IEC 29192 is the known standard for lightweight cryptography; its parts two and three specify block ciphers and stream ciphers, respectively. Some lightweight ciphers are introduced below. 2.1 Grain (Stream Cipher) In this bit-oriented synchronous stream cipher, the keystream is generated independently of the plaintext. The cipher operates in two phases: the first phase initializes the internal state using the secret key and the initialization vector [7]; in the second phase, the state is repeatedly updated and used to generate keystream bits. There are two variants, Grain v1 and Grain-128. The overall algorithm block diagram is illustrated in Fig. 4. Fig. 4. Grain v1 algorithm. Grain v1 uses an 80-bit key, receives a 64-bit initialization vector and needs 160 initialization cycles; Grain-128 uses a 128-bit key, receives a 96-bit initialization vector and needs 256 initialization cycles. Figure 4 shows the basic structure of the Grain v1 algorithm. "f" and "g" are two feedback polynomials (functions) of degree 80; they provide the feedback for the two shift registers, the linear feedback shift register (LFSR) and the non-linear feedback shift register (NFSR). The filter polynomial "h" uses selected bits from both feedback shift registers. Bits from the NFSR are XORed and then added to the output of "h"; during the initialization phase this output is fed back into the LFSR and the NFSR (shown by the light blue lines in Fig. 4), while during normal operation it is released as the keystream. More precisely, one bit of the non-linear feedback register and four bits of the linear feedback register are supplied to the non-linear 5-to-1 filter function, and its output is linearly combined with 7 bits of the linear feedback register and released as output. 2.2 Present (Block Cipher) PRESENT is an ultra-lightweight block cipher with a block length of 64 bits, two key lengths of 80 and 128 bits, and 31 rounds [6]. Its block diagram is shown in Fig. 5. Fig. 5. Present cipher. The PRESENT design takes its characteristics from the Serpent cipher (the non-linear substitution layer, sBoxLayer) and from DES (the linear permutation layer, pLayer). There are three stages involved in PRESENT: the first stage is addRoundKey, the second stage is sBoxLayer, and the third stage is the bit permutation pLayer [6]. Figure 5 shows that each of the 31 rounds contains an XOR operation to introduce a round key Ki, with 1 ≤ i ≤ 32, where K32 is used for post-whitening (post-whitening combines the data with portions of the key to increase the security of the block cipher), a linear bitwise permutation, and a non-linear substitution layer. The non-linear layer uses a 4-bit to 4-bit S-box which is applied 16 times in parallel in each round. The key can be 80 or 128 bits long; it is stored in a key register K and written in descending order as k79 k78 … k0. At round i, the 64-bit round key Ki, denoted k63 k62 … k0, consists of the leftmost 64 bits of the contents of the K register. After the round key Ki has been extracted, the K register is rotated by 61 bit positions to the left, the leftmost 4 bits are passed through the S-box, and the round counter value i is XORed with bits in the low-order part of the register; the whole operation is then repeated for the next round.
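A minimal sketch of this 80-bit key-register update, assuming Python integers as the register and a placeholder S-box (the real 4-bit S-box and the exact counter bit positions are given in the PRESENT specification [6]):

```python
SBOX = list(range(16))        # placeholder 4-bit S-box; the real one is defined in [6]
MASK80 = (1 << 80) - 1

def update_key_register(k: int, round_counter: int) -> int:
    """One PRESENT-80 key-schedule step as described above (illustrative sketch)."""
    k = ((k << 61) | (k >> 19)) & MASK80   # rotate the 80-bit register left by 61 positions
    top = SBOX[(k >> 76) & 0xF]            # pass the leftmost 4 bits through the S-box
    k = (k & ~(0xF << 76)) | (top << 76)
    return k ^ (round_counter << 15)       # XOR the counter into low-order bits (assumed positions 19..15)

def round_key(k: int) -> int:
    """Round key K_i: the leftmost 64 bits of the register."""
    return k >> 16
```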
2.3 Hight (Block Cipher) HIGHT is a lightweight encryption algorithm that was proposed one year before the PRESENT cipher. It consists of 32 rounds and makes use of XOR and addition modulo 256 (2^8) operations, which allow it to achieve good performance in hardware [1]. Figure 6 shows the cipher block diagram. Fig. 6. Hight cipher. HIGHT has a block size of 64 bits and a 128-bit key length. The encryption starts with an Initial Transformation (IT) that is applied to the plaintext together with input whitening keys (WKs) [11]. After the last round, a Final Transformation (FT) is applied to the output of that round together with output whitening keys in order to obtain the ciphertext [1]. The plaintext is divided into 8 bytes denoted P = P7, P6, …, P0; likewise, the ciphertext is divided into 8 bytes denoted C = C7, C6, …, C0. The 64-bit intermediate values are represented as Xi = Xi,7, Xi,6, …, Xi,0, and the master key is divided into 16 bytes denoted MK = MK15, MK14, …, MK0 [13]. The key schedule consists of two algorithms: one generates the whitening keys (WK) and the other generates the subkeys (SK). The operation uses 8 whitening keys, 4 for the initial transformation and another 4 for the final transformation. 128 subkeys are generated throughout the process, 4 of which are used in each round. In the initial transformation, the first intermediate value X0,0 is obtained from P0 by addition modulo 2^8 with the whitening key WK0; the second intermediate value X0,1 is taken from P1; the third intermediate value X0,2 is obtained from P2 combined with the whitening key WK1; and this pattern is repeated up to X0,7 and P7 [12]. In each round, Xi is transformed into Xi+1; for example, Xi+1,0 = Xi,7 XOR (the output of an auxiliary function applied to Xi,6, added modulo 2^8 to the subkey SK4i+3), and this repeats for every word up to X32,0. The final transformation is the same as the initial transformation, but with the plaintext notation P replaced by the ciphertext notation C, the whitening keys WK4 to WK7 instead of WK0 to WK3, and the intermediate values X32,0 to X32,7. 3 Comparison Between the Algorithms Since PRESENT and HIGHT are both block ciphers, it is fair to compare them to each other. Table 2 lists key performance criteria for PRESENT and HIGHT.
Table 2. Comparison between the PRESENT and HIGHT ciphers
  Algorithm   Key size (bits)   Area (GE)   RAM requirement (bytes)
  PRESENT     80                1570        142
  HIGHT       128               3048        18
We assume a block size of 64 bits for both algorithms. Table 2 shows that the PRESENT cipher does not need as much area as the HIGHT cipher does [2]. As for the stream cipher Grain, Table 3 shows its area requirement.
Table 3. Grain cipher
  Algorithm   Key size   Area (GE)
  Grain       80 bits    1294
4 Attacks Against Lightweight Algorithms The designers of the PRESENT cipher presented security margins for differential, linear and algebraic cryptanalysis. Since then it has been discovered that 32% of PRESENT keys (80-bit key size) are weak with respect to linear cryptanalysis. In 2009 a study of linear hull and algebraic cryptanalysis was conducted for PRESENT; it proposed a linear attack on 25 rounds of PRESENT (128-bit key size) and an algebraic attack on 5 rounds of PRESENT (80-bit key size). A year after this study, an attack on 25-round PRESENT was proposed that can recover the 80-bit secret key with a data complexity of 2^62.4 [3]. In linear cryptanalysis, an attacker tries to find biased linear approximations for the non-linear components of a cipher (e.g., an S-box) and then uses them to find a biased linear approximation for the entire cipher. These biased approximations can then be used to recover certain subkey bits; afterwards, the remaining key bits are recovered by brute force [4]. One of the studied attacks against the PRESENT cipher is a statistical saturation attack that takes advantage of a weakness in its diffusion layer. PRESENT can also be exploited using a differential key attack [5]. 5 Conclusion and Future Work Resource-constrained devices such as RFID (radio-frequency identification) tags are entering our lives more and more because of their low prices, and the need for cryptographic solutions for them is pressing. While many ciphers have been proposed, their security has to be studied continuously against emerging attacks. In this paper we highlighted three lightweight algorithms and compared them; we also touched on possible attacks against the PRESENT cipher.
For future work, we plan to simulate several key lightweight crypto-algorithms on multiple embedded platforms and pro?le their performance. These performance comparisons will be important to recognize each algorithm’s internal computation a?nity to speci?c CPU architectures. References 1. Ozen, O., Varici, K., Tezcan, C., Kocair, C.: Lightweight Block Ciphers Revisited: Cryptanalysis of Reduced Round PRESENT and HIGHT. http://citeseerx.ist.psu.edu 2. Bogdanov, et al.: PRESENT: An Ultra-Lightweight Block Cipher. http://lightweightcrypto.org Light Weight Cryptography for Resource Constrained IoT Devices 203 3. Lacko-Bartošová, L.: Algebraic Cryptanalysis of Present Based on the Method of Syllogisms. www.sav.sk 4. Bulygin, S.: More on linear hulls of PRESENT-like ciphers and a cryptanalysis of full-round EPCBC–96. http://eprint.iacr.org 5. Collard, B., Standaert, F.X.: A Statistical Saturation Attack against the Block Cipher PRESENT. http://citeseerx.ist.psu.edu 6. Aura, T.: Cryptoanalysis of Lightweight Block Ciphers, November 2011. http://into.aalto.?f 7. Grain: A Stream Cipher for Constrained Environments (n.d.). https://cr.yp.to 8. Block and Stream Cipher Based Cryptographic Algorithms: A Survey (n.d.). www.ripublication.com 9. Simon and Speck: Block Ciphers for the Internet of Things, July 2015. https://csrc.nist.gov 10. Single-Cycle Implementations of Block Ciphers (n.d.). https://csrc.nist.gov 11. Han, B., Lee, H., Jeong, H., Won, Y.: The HIGHT Encryption Algorithm draft-kisa-hight-00”, November 2011. https://tools.ietf.org 12. Impossible Di?erential Cryptanalysis of the Lightweight Block Ciphers TEA, XTEA and HIGHT (n.d.). https://eprint.iacr.org 13. IP Core Design of Hight Lightweight Cipher and Its Implementation (n.d.). http://airccj.org 14. Rekha, R., Babu, P.: On Some Security Issues in Pervasive Computing: Light Weight Cryptography”, February 2012. http://www.enggjournals.com 204 H. M. Z. Al Shebli and B. D. Beheshti A Framework for Ranking IoMT Solutions Based on Measuring Security and Privacy Faisal Alsubaei1,2(&) , Abdullah Abuhussein3 , and Sajjan Shiva1 1 University of Memphis, Memphis, TN 38152, USA {flsubaei,sshiva}@memphis.edu 2 University of Jeddah, Jeddah, Saudi Arabia 3 St. Cloud State University, St. Cloud, MN 56301, USA aabuhussein@stcloudstate.edu Abstract. Internet of Medical Things (IoMT) is now growing rapidly, with Internet-enabled devices helping people to track and monitor their health, early diagnosis of their health issues, treat their illness, and administer therapy. Because of its increasing demand and its accessibility to high Internet speed, IoMT has opened doors for security vulnerabilities to healthcare systems. The lack of security awareness among IoMT users can provoke serious and perhaps fatal security issues. The disastrous consequences of these issues will not only disrupt medical services (e.g., ransomware) causing ?nancial losses but will also put the patients’ lives at risk. This paper proposes a framework to compare and rank IoMT solutions based on their protection and defense capability using the Analytic Hierarchy Process. The proposed framework measures the security, including privacy, in the compared IoMT solutions against a set of user requirements and using a detailed set of assessment criteria. This works aims to help in determining and avoiding risks associated with insecure IoMT solutions and reduce the gap between solution providers and consumers by increasing the security awareness and transparency. 
Keywords: IoMTQuantitative evaluationSecurityAssessment MetricsMeasurementsPrivacy 1 Introduction The Internet of Medical Things (IoMT), also known as the healthcare Internet of Things (IoT), can be described as a collection of medical devices and applications that are connected through heterogeneous networks. IoMT solutions are being utilized by many healthcare providers to facilitate the management of diseases and drugs, improve treatment methods and the patient experience, and reduce cost and errors. Currently, about a third of IoT devices are found in healthcare; this number is expected to increase by 2025, with healthcare accounting for the largest percentage (approximately 40%) of the total global worth of IoT technology ($6.2 trillion) [1]. Further, approximately 60% of healthcare organizations have already adopted IoT technologies, and that percentage is expected to rise to approximately 87% by 2019 [2]. © Springer Nature Switzerland AG 2019 K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 205–224, 2019. https://doi.org/10.1007/978-3-030-02686-8_17 One of the most prevalent problems currently facing IoMT solutions is security fragility [3]. A survey found that of more than 370 organizations using the IoMT, approximately 35% suffered at least one cybersecurity incident in 2016 [4]. The lack of security awareness among IoMT users is a key factor for the security issues in IoMT. According to a recent survey, only 17% of connected medical device makers and 15% of medical professionals are aware of potential security issues and take serious mea-sures to prevent them [5]. This could explain why more than 36,000 healthcare-related devices in the U.S. alone are easily discoverable on Shodan, a search engine for IoT devices [6]. In addition, while there is a lack of security standards for the IoT in general, extra efforts are needed to regulate and ensure security in the IoMT. Unlike other domains, security in the medical ?eld is vital due to the sensitivity of the medical data and critical nature of the operations involved. The U.S. Food and Drug Administration (FDA) has taken steps to secure medical devices; however, only 10% of these devices are clas-si?ed under FDA Class III, which includes devices designed to support or sustain life (e.g., pacemakers) [7]. However, reduced patient wellbeing is not the only consequence of IoMT attacks, as these attacks can also have negative effects on medical data privacy, brand reputation, business continuity, and ?nancial stability. Moreover, there is a lack of consensus among stakeholders in healthcare organi-zations regarding security requirements [8]. This dissension and the lack of security awareness leaves adopters unsure about which security features are relevant to their solutions [9]. IoMT adopters usually are compelled to accept the default security in solutions. Adopters should instead be able to measure and verify security themselves to make well-educated decisions. It is also important to enable adopters to select security features based on their requirements (i.e., priorities) because security goals depend not only on the scenario but also on the assets and tolerance to risks. Due to the rapid evolution of IoMT technologies, there is a need to introduce a structured quantitative model that is expandable and offers opportunities to improve security. Thus, we propose a framework to assess the security and privacy levels provided in IoMT solutions using the Analytic Hierarchy Process (AHP). 
The proposed framework allows users to make knowledgeable choices when obtaining new or enhancing existing IoMT solution. It also allows adopters to de?ne their security priorities that reflect their security objectives and utilize them to rank prospective solutions in terms of security. The AHP-based method uses a list of detailed security assessment criteria collected by examining security controls published by specialized organizations such as the Open Web Application Security Project (OWASP), the International Organizations of Standardization (ISO), FDA among others. In addition, our method uses previous IoMT attacks and available IoMT solutions. The rest of this paper is organized as follows: the literature for measuring the security in IoMT is discussed briefly in Sect. 2. Section 3 presents the assessment criteria used in the framework. The security assessment method is demonstrated in Sect. 4. Section 5 presents a case study of the framework by assessing the degree of security in real IoMT solutions. Sections 6 and 7 discuss the evaluation and limitations, respectively. Lastly, in Sect. 8, we draw concluding remarks and outline some future works. 206 F. Alsubaei et al. 2 Related Work This section surveys previous work in assessing the security of IoMT solutions. The main gaps in the current literature can be summarized as follows: • The assessment criteria are speci?c to a set of IoMT scenarios (e.g., patient mon-itoring) [10, 11]. • The security recommendations are abstract and target only manufacturers who primarily focus on one part of the IoMT (e.g., devices) to the exclusion of others, such as mobile and back-end [12–15]. • Lack of an assessment model that helps adopters, according to their security pri-orities, to quantify and compare the security of potential IoMT solutions [16–22]. • The focus is only on assessing existing solution(s) by utilizing post-deployment parameters such as con?gurations and current users’ feedback, which requires technical knowledge that often most IoMT users lack [14, 23, 24]. Despite the fact that these works are viewed as a valuable contribution, they cannot be incorporated ef?ciently into an assessment method for the IoMT. They also do not provide a practical assessment method that considers the user security priorities. In this paper, we build upon and complement the past efforts by proposing a framework for quantifying security in IoMT solutions that is twofold: (1) a detailed list of security assessment criteria that includes over 200 assessment questions for IoMT security. These questions were gathered by examining the IoMT security considerations from different sources and IoMT solution providers. (2) an AHP-based security assessment method for IoMT solutions utilizing the assessment criteria. The proposed framework enables users to rank candidate IoMT solutions based on their security to help them in making educated decisions. The importance of our framework lies in its ability to aid adopters in selecting or improving current IoMT solutions considering their security priorities. 3 Assessment Criteria Because of the rapid development in the IoT technologies and therefore the complexness of IoMT, it is imperative to design a simple-to-use and elaborate list of assessment criteria that considers any IoMT solution. Therefore, we utilize the goal-question-metric (GQM) approach while designing the assessment criteria [25]. 
GQM is a popular approach to measure assessment goals by identifying questions and developing metrics to answer the questions [26]. These metrics are then used to ensure that the goals are met. As in Fig. 1, GQM is utilized in our framework such that for every IoMT com-ponent, there is a list of yes/no questions and corresponding answers (i.e., metrics). A small sample of the assessment criteria is shown in Table 1 and organized as. A Framework for Ranking IoMT Solutions 207 Fig. 1. Typical IoMT components. Table 1. A sample of the assessment criteria. Component Security feature Question Goal Sub-goal Secure Endpoint (E) 1. Intrusion prevention 1. Can the IoMT ecosystem detect endpoints that are connecting to abnormal service, or connecting to service at unusual times? 2. Can IoMT ecosystem detect endpoints leaving or joining a communication network at erratic intervals? 3. Can endpoint devices detect a signi?cantly abnormal network traf?c ?ngerprint of other devices? 4. Do endpoint devices have secure event logging? 2. Strong authentication 1. Do endpoint devices require users to authenticate themselves before using/access any function? 2. Do the endpoint devices provide mechanisms to prevent brute force attacks? 3. Do endpoint devices use cryptographic certi?cates for self-authentication or to verify the broker identity of a user? 4. Does the IoMT ecosystem ensure that no hardcoding or default passwords are allowed in endpoint devices? 3. Secure updates 1. Does the IoMT ecosystem provide automated alerts, via SMS or email, for available manual updates for endpoint devices? 2. Are endpoint devices updates and patches, including extensions or plugins, veri?ed (e.g., binary signing and hash values) after download and before installation to ensure their legitimacy? 3. Does the IoMT ecosystem clearly identify the endpoints software running version? (continued) 208 F. Alsubaei et al. Table 1. (continued) Component Security feature Question Goal Sub-goal 4. Protected memory Is the use of direct memory access in endpoint devices by other peripherals carefully managed and controlled? 5. Secure communications Do endpoint devices renegotiate and verify communication security keys each time it reconnects to the communication network? 6. Secure administration Do management systems distinguish between active and inactive endpoint devices? 7. Secure hardware Do endpoint devices use epoxy covering for core circuit components? 8. Secure software Are all debugging and test technologies disabled in the endpoint devices? 9. Secure web interface Is the web interface of endpoint devices presented over hyper-text transfer protocol secure (HTTPS)? 10. Secure storage Are all data stored in the endpoints’ removable media, protected cryptographically? 11. Regulatory compliance Are the medical endpoint devices approved by the FDA? 12. Secure root of trust Are the roots of trust certi?ed by FIPS or CC? Secure Gateway (G) 1. Secure communications Does the gateway provide standard bidirectional end-to-end encryption? 2. Secure storage Does the gateway cryptographically store data collected from endpoint devices? 3. Intrusion prevention Does the gateway have robust security logging of all events? 4. Secure hardware Does the gateway provide countermeasures against physical attacks? 5. Strong authentication Does the gateway cryptographically authenticate endpoint devices to different components and vice versa? 6. Secure updates Does the gateway allow for modular updates and monitoring of extensions and plugins? 7. 
Secure web interface Is the gateway’s web interface presented over HTTPS? Secure Mobile (M) 1. Secure communications 1. Are the communications in mobile devices always encrypted? 2. Intrusion prevention 1. Does the mobile provide alerts for mobile status (e.g., connectivity or power outages)? 3. Strong authentication Do mobile applications, or devices support biometrics authentication (e.g., ?ngerprint, face recognition)? 4. Secure updates Are mobile vendor-speci?c security updates checked and installed automatically? (continued) A Framework for Ranking IoMT Solutions 209 Table 1. (continued) Component Security feature Question Goal Sub-goal 5. Secure software Is the application certi?ed and listed in vendors’ application stores (e.g., Apple App Store, Google Play)? 6. Secure web interface Is the mobile’s web interface presented in HTTPS? 7. Secure storage Does the mobile application share any data with third parties? Secure Back-end (B) 1. Secure cloud environment 1. Does the cloud services always available even during scaling up/down? 2. Does the cloud service provider hide information about the servers physical locations? 3. Does the cloud have countermeasures against data leakage in multi-user storage services? 4. Does the cloud service provider have an of?cial insider threat program? 2. Secure software 1. Does the back-end utilize an API for the application to cryptographically identify itself to its peers? 2. Are back-end third-party libraries actively monitored, managed, and audited? 3. Are the back-end applications designed to mitigate buffer errors using the operating system’s mechanisms? 3. Secure web interface Does the back-end web interface use certi?cates that are signed by a certi?cate authority? 4. Regulatory compliance Does the back-end use standard protocols and technologies? 5. Risk assessment Did the IoMT solution provider identify the assets, risk factors, and threat agents? 6. Privacy assurance Does the IoMT solution provider have a process to ensure that the privacy of individuals’ personal and medical information complies with the latest relevant privacy laws (e.g., Health Insurance Portability and Accountability Act (HIPAA), Health Information Technology for Economic and Clinical Health Act (HITECH) or the General Data Protection Regulations (GDPR), Personally Controlled Electronic Health Records Act, etc.) in effect over user control of their data? 7. Secure development lifecycle Does the IoMT solution provider validate management of the supply chain, the software, the sources of the equipment, and the purchaser and supplier aspects of the infrastructure? (continued) 210 F. Alsubaei et al. 3.1 Goals The goals are the IoMT components to be secured (?rst column in Table 1). The IoMT typical components we use, as outlined in Fig. 1, are de?ned as follows [27]: Endpoints: These are connected medical devices that typically have embedded sen-sors to collect data and forward it to the back-end servers. Based on their operating system, hardware, communication media, mobility, etc., these devices can be of various kinds but collaborate heterogeneously to perform a common task. Endpoint devices can be wearable sensors (e.g., blood pressure monitors, heart monitors, pulse oximeters), implantable devices (e.g., embedded cardiac function monitoring systems, swallowable camera capsules), ambient sensors (e.g., motion sensors, pressure sensors, room tem-perature sensors), or stationary devices (e.g., computerized tomography scanners, surgical instruments). 
Gateway: These are optional devices to support some weak endpoint devices. Some strong endpoint devices can have gateway capabilities and can serve as gateways; in this case, these devices are called border routers. Gateways act as a bridge network to aggregate the data collected from the endpoint devices and transmit it to the back-end. Because of its location, it also serves as a secure channel between the insecure, but trusted, local network and the untrusted public Internet. Table 1. (continued) Component Security feature Question Goal Sub-goal 8. Incident response Does the IoMT solution provider have an incident response procedure in place for information recovery? 9. Secure storage Are the back-end authentication credentials (i.e., usernames, passwords, device ids, etc.) salted and hashed before stored? 10. Secure communications Does the back-end have quality of service mechanisms for delivery of targeted messages to speci?c components? 11. Secure updates Does the back-end report and update service infrastructure’s third-party components (both software and hardware) regularly to ensure the latest security updates are installed once available? 12. Strong authentication Does the authentication service gather metrics to determine whether the user changed to an alternative computing platform, but still uses the former token? 13. Secure administration Does the back-end include load-balancing features and redundancy systems? 14. Intrusion prevention Do the back-end protect against malware-based attacks? A Framework for Ranking IoMT Solutions 211 Back-end: Most current IoMT environments have back-end server(s) that are often hosted on the cloud for better scalability. IoT platforms are often utilized for provi-sioning, management, and automation of endpoint devices. They also provide other common server-side tasks such as centralized data storage, backups, reports, and analytics, etc. Mobile: IoMT systems can also have mobile applications to control endpoint devices and provide limited back-end capabilities and instant alerts. Every goal (i.e., IoMT component in the ?rst column of Table 1) has sub-goals (i.e., security features in the second column) to ensure that the security goals are achieved. For instance, to secure the endpoint devices, the identi?ed sub-goals are as follows: secure administration, strong authentication, secure updates, intrusion prevention, protected memory, secure communications, secure web interface, secure hardware, secure software, secure storage, regulatory compliance, and secure root of trust. 3.2 Questions The assessment questions (third column of Table 1) were thoroughly examined and collected from various reliable resources that include: • Medical-speci?c sources, such as guidelines from the FDA [28], ISO [15], the Medical Device Risk Assessment Platform (MDRAP) assessment questionnaire [13], and the Naval Medical Logistics Command (NMLC) [29], among others. • General IoT security considerations provided by OWASP [17], the Cloud Security Alliance (CSA) [16], the Global System for Mobile Communication Association (GSMA) [19], and others [18]. • The documentation of popular IoMT solutions and their accompanying Security Level Agreements (SecLAs). The yes/no questions are less demanding and provide the answers to the respon-dent. Thus, they are quick and easy to answer and provide an accurate and consistent assessment. These questions precisely measure the different levels of security in the security features. 
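One convenient way to hold this goal / sub-goal / question hierarchy in software is as a nested data structure. The sketch below is only an illustration (Python; the class and field names are our own, not the paper's), mirroring the Component → Security feature → Question levels of Table 1:

```python
from dataclasses import dataclass, field

@dataclass
class Question:
    text: str
    answer: int = 0          # metric: 1 = yes, 0 = no

@dataclass
class SecurityFeature:       # sub-goal, e.g. "Strong authentication"
    name: str
    questions: list[Question] = field(default_factory=list)

@dataclass
class Component:             # goal, e.g. "Secure Endpoint (E)"
    name: str
    features: list[SecurityFeature] = field(default_factory=list)

endpoint = Component("Secure Endpoint (E)", [
    SecurityFeature("Strong authentication", [
        Question("Do endpoint devices require users to authenticate "
                 "themselves before using/accessing any function?"),
    ]),
])
```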
For example, the security of encryption depends on the used algo-rithm and encryption key size. Hence, our questions consider, and quantify, such levels of security. Due to the space constraint, in Table 1 we listed only a sample list of the assessment criteria in which, only one question is included per security feature except for the questions used in the case study. The full list will be available in future publications. 3.3 Metrics A single metric is a score that depends on the question answer. Our proposed frame-work utilizes the documentation presented by solution providers to determine the metrics. Metrics measure the degree of achieving a sub-goal and, ultimately, a goal. The overall degree of security provided by a security feature is the total scores for all the assessment questions under that feature. The security features are then used to calculate the degree of security of a component. As illustrated in Fig. 3, this forms a hierarchy for the assessment. 212 F. Alsubaei et al. Fig. 2. The proposed framework flow. Fig. 3. Sample pro?le represented in hierarchy. A Framework for Ranking IoMT Solutions 213 4 IoMT Security Assessment In this section, we present an assessment method that employes the presented hierarchal list of assessment criteria and perfectly suits its hierarchal structure. IoMT security depends on multiple factors; therefore Multiple criteria decision-making (MCDM) is required such that all goals (and sub-goals) are assessed, and their scores are aggregated in a meaningful score. Hence, we use the AHP in our assessment method to achieve this task. AHP is a popular technique to solve MCDM problems [30]. What makes the AHP more suitable in this scenario than other MCDM techniques, is its flexibility as well as its ability to address inconsistencies across requirements. It also allows for composite quantitative and qualitative weighted questions to be compared easily because of its pairwise comparisons of decision criteria [31]. The pairwise results of comparisons and weights for every criterion are structured into a hierarchy. These comparisons of the questions and weights are the basis for the security assessment of IoMT solutions. As shown in Fig. 2, there are three main stages in our assessment method, which are described as follows. 4.1 De?ning Security Pro?les In this stage of the framework, security pro?les are de?ned to prepare them for comparisons in the next stage. In other words, pro?led IoMT solutions are described in terms of their security capabilities producing IoMT solution pro?le. The user desired degree of security is also captured. Thus, the output of this stage is a user pro?le that includes the user requirements (i.e., security priorities) and at least two IoMT solution security pro?les. This allows the user to (1) verify that the solution’s security matches their requirements, and (2) compare the security in two or more solutions. The two types of security pro?les are described as follows. User Requirements Pro?le. This is where IoMT users specify their desired security degree. The user assigns weights for all elements in the second, third, and/or fourth levels as in Fig. 3. This detailed pro?ling is crucial for better accuracy when comparing the relative importance of two (or more) elements within the same level. This ensures that all the user’s security priorities are met. The framework provides flexibility in assigning weights. 
It allows users to assign weights on a scale of 1 to 10 (i.e., a weight of 10 denotes that an element is extremely more important than the others, whereas 1 denotes equal importance), or binary weights (i.e., 1 or yes denotes required, and 0 or no denotes not required), or a mix of both at the various layers of the hierarchy. For example, a user may mark one component as very important, assign quantitative weights to the security features of another component, and assign Boolean (yes/no) values to a third component. A weight of 0 can be assigned to irrelevant question(s) so that they are disregarded in the assessment. Solution Profiles. To compare the degree of security in IoMT solutions, the assessment criteria questions (described in Sect. 3) should be answered for each IoMT solution individually. One can use the publicly available specifications of an IoMT solution, for example from the product FAQ page, or contact the solution provider's customer service, to answer the assessment criteria questions. For open-source solutions, security experts can be involved in answering these questions. 4.2 Security Quantification In this stage, the security profiles generated in the previous stage (i.e., the user requirements profile and the solutions' profiles) are used to assess the security of the solutions and to check whether they match the user security requirements. The terms used in the assessment are shown in Table 2. Since the questions in our assessment criteria require only yes or no answers, these values can be represented as 1 and 0. The relationship across solutions (S) of the question values (V) can be defined as a ratio:
S1/S2 = V1/V2 = 0 if (V1 = 0 ∧ V2 = 0) ∨ (V1 = 0 ∧ V2 = 1),
                1 if (V1 = 1 ∧ V2 = 1) ∨ (V1 = 1 ∧ V2 = 0).        (1)
For example, assume two IoMT solutions, S1 and S2, have values V1,q = 0 and V2,q = 1, respectively, for question q, which user U requires (thus Vu,q = 1). The pairwise comparison ratio of S1 and U is defined as V1,q/Vu,q = 0, which means that S1 does not satisfy the user requirement. However, the pairwise comparison ratio V2,q/Vu,q = 1 means that S2 fulfills the user requirement. This stage relies on the pairwise comparison matrix (CM) of the security questions in the solutions' profiles and the user requirements profile. Using a CM for a question over all profiles, we obtain a one-to-one comparison in which V1,q/V2,q denotes the relative rank of S1 over S2. If there are n IoMT solutions, the one-to-one CM (including the user requirement profile) will be of size
(n + 1) × (n + 1).        (2)
Table 2. Description of assessment terms.
  Term      Description
  q         Assessment question
  Si        Solution i, where i ∈ {1, …, n} and n denotes the number of IoMT solutions to be compared
  Vi,q      Metric (answer) of q provided by Si
  Si,q      Si provides q with value Vi,q
  U         IoMT user (adopter)
  Vu,q      The user-required value of q
  S1/S2     Relative rank ratio of S1 over S2 regarding q
  S2/S1     Relative rank ratio of S2 over S1 regarding q
  Si,q/U    Relative rank ratio of Si over U, which indicates whether Si fulfills Vu,q
4.3 Ranking The relative ranking of all the IoMT solutions for any question, known as the priority vector (PV), is derived by calculating the eigenvector of the CM. The PV transforms the CM into a meaningful vector that summarizes the results of all comparisons (ratios) into a normalized numerical ranking. The eigenvector principle in AHP is necessary to reduce human errors in the judgment process [32].
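As a rough, non-authoritative illustration of Eqs. (1)–(2) and the eigenvector step, the sketch below (Python/NumPy, with hypothetical metric values V1,q = 0 and V2,q = V3,q = Vu,q = 1) builds the (n + 1) × (n + 1) comparison matrix for a single question and normalizes its principal eigenvector into a priority vector:

```python
import numpy as np

def ratio(v_row: int, v_col: int) -> float:
    """Pairwise ratio of Eq. (1): 1 when the row profile answers 'yes' (v_row = 1),
    0 when it answers 'no', regardless of the column value."""
    return 1.0 if v_row == 1 else 0.0

# Hypothetical metrics for one question q: three solutions plus the user U.
values = {"S1": 0, "S2": 1, "S3": 1, "U": 1}
names = list(values)
cm = np.array([[ratio(values[a], values[b]) for b in names] for a in names])

# Priority vector: principal eigenvector of the CM, normalized to sum to 1.
eigvals, eigvecs = np.linalg.eig(cm)
pv = np.abs(eigvecs[:, np.argmax(eigvals.real)].real)
pv /= pv.sum()
print(dict(zip(names, pv.round(3))))  # S1 -> 0.0; S2, S3 and U share the ranking
```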
A PV of this kind shows, for instance, that solutions 2 and 3 meet the user requirement U. After the PVs (i.e., rankings) for all questions have been computed, they are aggregated (from bottom to top) to determine the overall security rankings of the IoMT solutions. All the questions' PVs are combined with the relative weights assigned in the previous stage:
PV_aggregated = Σ_{j=1}^{g} w_j · PV_j        (3)
where PV_j denotes the PV of the CM of question j, w_j denotes the relative weight assigned to the question, and g is the total number of questions. If the user wants to compare the security of the underlying levels, the weights of the upper levels are not considered in the aggregation. For example, if a user wants to compare only the security features of one component, then only the weights of the security features and their corresponding questions are considered. 5 Case Study In this section, we demonstrate how the framework can be used to assess and rank three popular real-world cloud-based IoT platforms that are widely used in healthcare. We examined the SLAs and other available documentation describing the offered security to answer the questions in our assessment criteria; as a result, we have three distinct security profiles for these platforms. We consulted their customer service in order to answer the questions that we could not answer using their publicly published documentation. The questions for which we could not find relevant information are treated as answered "no", because if the answers were "yes", the providers would have used this to market the security of their products. To illustrate the flexibility of our framework, we show three examples of hypothetical weights (i.e., user requirements) for a sample of the assessment criteria, as described in Table 3 (where yes and no are denoted by 1 and 0, respectively). Case 1 In this detailed case, the user assigned Boolean weights by answering all relevant questions (i.e., yes denotes required, and no denotes not required). For every question, Eq. 1 is used to perform pairwise comparisons on its CM; this yields the CM of B.2.3. Then PV_B.2.3 is calculated by finding the normalized eigenvector of CM_B.2.3. This indicates that S1 does not satisfy the user requirement, whereas S2 and S3 fulfill it. The same step is applied to all questions to obtain the relative ranking of the lowest level in the hierarchy. Then, to aggregate the PVs from bottom to top, the normalized weights of each level are considered in order to prepare all questions' PVs for the final aggregation.
Table 3. Case study assessment values. Columns: Goals (Component / Security feature), Question, Metrics (S1 S2 S3), User requirements – weights (U1 U2 U3).
Secure Endpoint (E) 1 1 1 1 1 1 0 2 9 2 1 1 1 1 3 1 1 1 0 4 1 1 1 1 2 1 1 1 1 0 2 1 1 1 1 3 1 1 1 1 4 1 1 1 0 3 1 1 1 1 0 2 1 1 1 0 3 1 1 1 1
Secure Mobile (M) 1 1 1 1 0 1 0 4 5 2 1 0 1 1 1 1
Secure Back-end (B) 1 1 1 0 1 0 8 4 9 2 1 1 1 1 3 1 1 1 1 4 1 1 1 1 2 1 1 1 1 1 0 2 1 1 1 0 3 0 1 1 1
Thus, the final weighted ranking is PV_B.2.3 = (1/3 · 1/2 · 1/3 · 1 · 1 · 1) PV_B.2.3; similarly, PV_E.2.4 = (1/3 · 1/3 · 1/4 · 1 · 0 · 1) PV_E.2.4. Once all PVs have been weighted, aggregating them reveals the final overall rankings. Thus, in case 1, S2 fulfills the user security requirements (Fig. 4c). The rankings at lower levels can be compared in the same way: Figs. 4a and 4b show the component-level and security-feature-level comparisons, respectively.
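Continuing the same non-authoritative sketch, the aggregation of Eq. (3) amounts to a weighted sum of per-question priority vectors; the weights and vectors below are hypothetical placeholders, not the paper's actual case-study numbers:

```python
import numpy as np

# Hypothetical per-question priority vectors (entry order: S1, S2, S3, U)
# and their already-normalized hierarchical weights w_j.
pvs = {
    "B.2.3": np.array([0.0, 1/3, 1/3, 1/3]),
    "E.2.4": np.array([0.25, 0.25, 0.25, 0.25]),
}
weights = {"B.2.3": 1/3 * 1/2 * 1/3, "E.2.4": 1/3 * 1/3 * 1/4}

# Eq. (3): PV_aggregated = sum over j of w_j * PV_j
pv_aggregated = sum(w * pvs[q] for q, w in weights.items())
print(pv_aggregated.round(3))
```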
Case 2 In this case, the user assigned priority weights at various levels. For level 1, E is assigned a weight of 2, which denotes low importance. For level 2, B.1 is assigned a weight of 8, which denotes relatively high importance, whereas B.2 is not required and hence is assigned a weight of 0; thus, the normalized weights for B.1 and B.2 are 1 and 0, respectively. For instance, the weighted PV_B.1.1 = 0.4 · 1 · 1/4 · 1 · PV_B.1.1. Finally, for the lowest level, M.1.2.1 is assigned a weight of 0 (not required) and M.2.1 is assigned 1. Applying the steps described in Case 1 to all questions with the new weights yields the final rankings. As Fig. 4c shows, unlike the other cases, only S3 satisfies the user security requirements. This is because, in this case, the endpoint security features are not important and were given a low weight; since S3 fully satisfies the other components, it obtains a better ranking. Fig. 4. Case study assessment results. Case 3 In this case, the user assigned weights only to level 1. The normalized weights are B = E = 0.39 and M = 0.22. Thus, as shown in Fig. 4c, the final ranking reveals that only S2 fulfills the user requirements. 6 Evaluation To evaluate the framework, we present two methods. First, to verify the completeness of the list of assessment criteria, we tested its ability to identify and avoid known real-world security incidents. Since our list of assessment criteria is collected from publications by multiple specialized organizations, it should cover all security considerations related to the IoMT. We verified that by gathering all IoMT-related vulnerabilities reported, as of April 2018, in NIST's National Vulnerability Database (NVD)¹ and CVE Details² during the last three years to ensure their recentness. The keywords used in this extensive search were IoT, IoMT, medical, health, medical device, and healthcare. After filtering out the vulnerabilities that are irrelevant to the IoMT (e.g., non-medical endpoints), we found 40 distinct vulnerabilities. We then analyzed the details of each vulnerability and mapped it to the corresponding security feature(s). In this way, we verified our framework's accuracy in highlighting all missing or inadequate security features. Table 4 shows the results of our analysis, listing each vulnerability with its Common Vulnerabilities and Exposures (CVE) ID and the most relevant feature(s) in the affected IoMT component. It is very likely that every vulnerability is covered by more than one security feature. As shown in Table 4, these vulnerabilities have diverse characteristics in terms of the affected IoMT component, solution type, and scenario. Since our framework is able to provide security considerations that safeguard against these varied vulnerabilities, we believe it can scale well to different and unknown vulnerabilities. This also demonstrates the framework's extensibility and cross-domain applicability.
¹ https://nvd.nist.gov.
² https://www.cvedetails.com.
To verify the effectiveness of the framework in capturing missing or inadequate security features, we analyzed two commercial IoMT solutions that are known to have, or to have had, serious security issues. For example, Medfusion 4000 syringe infusion pumps³ are stationary medical endpoints that are used to deliver small doses of medication in acute care settings. These pumps were vulnerable to eight serious security issues (vulnerabilities 1–8 in Table 4).
These vulnerabilities are discussed in detail in an advisory issued by the U.S. Computer Emergency Response Team (CERT) [33]. Using our framework to assess the security of this device (before the patches were applied) and compare it with other devices would show that the device has a low security score, especially regarding authentication. This information should help future adopters in making better decisions; for instance, they can choose a better alternative or wait until the vulnerabilities are patched. This helps users or adopters avoid the severe consequences associated with these unpatched endpoint devices, whose severity was rated in the Common Vulnerability Scoring System (CVSS)⁴ as medium to high [33]. Similarly, Kaa⁵ is an IoT platform that allows healthcare systems to establish cross-device connectivity and implement smart features in medical devices and related software systems. Kaa is vulnerable (no. 9) to code injection attacks; comparing its security with other platforms will result in a low score for feature B.2.
Table 4. IoMT vulnerabilities and their relevancy to our assessment framework.
  No.   Vulnerability CVE ID   Relevant feature(s)
  1     2017-12726             E.2
  2     2017-12725             E.2
  3     2017-12724             E.2
  4     2017-12720             E.2
  5     2017-12723             E.10
  6     2017-12722             E.4
  7     2017-12721             E.5
  8     2017-12718             E.8
  9     2017-7911              B.2
  10    2017-11498             B.2
  11    2017-11497             B.2
  12    2017-11496             B.2
  13    2017-6780              B.14
  14    2017-7730              G.3
  15    2017-7729              G.1
  16    2017-7728              G.5
  17    2017-7726              G.7
  18    2017-3215              M.3
  19    2017-3214              M.7
  20    2017-8403              M.3
  21    2017-5675              E.8
  22    2017-5674              E.8
  23    2017-14002             E.2
  24    2018-5457              B.11
  25    2016-8355              E.6, E.8
  26    2017-6018              E.9
  27    2017-5149              E.5
  28    2015-3958              E.1
  29    2015-3957              E.10
  30    2015-3955              E.7
  31    2015-1011              E.2
  32    2015-3459              E.5
  33    2017-14008             B.12
  34    2017-14004             B.12
  35    2017-14006             B.12
  36    2017-14101             B.13
  37    2018-5438              B.12
  38    2016-9353              B.9
  39    2016-8358              E.5
  40    2017-12713             B.12
³ https://www.smiths-medical.com.
⁴ https://nvd.nist.gov/vuln-metrics/cvss.
⁵ https://www.kaaproject.org/healthcare/.
7 Limitations Solution providers cannot be forced to cooperate by making the technical details of their products available to the public, owing to service abstraction constraints. This lack of technical details can be one limitation of this work, as these details are required for the assessment. Nevertheless, adopters can always contact the solution providers' customer service to inquire about missing information; this also gives them the opportunity to find out how cooperative and knowledgeable the customer service teams of the candidate solutions are. We do not anticipate that providers will voluntarily make their security features publicly available, but our work can motivate them to cooperate in order to meet customers' needs and compete with others transparently. Moreover, the assessment criteria might not be easy to understand, especially for novice users such as patients and medical professionals, who often lack the technical knowledge; however, this work encourages them to learn about the security features and the potential issues. Some users might also find the process followed in this framework lengthy and complex. Nevertheless, we argue that it is worth the initial effort and time investment because it helps in discovering and avoiding the severe consequences of improper security. 8 Conclusion and Future Work Security plays a vital role in IoMT success.
In this paper, we presented a security assessment framework to increase the trust in IoMT solutions. This framework pro-vides a list of security assessment criteria for IoMT solutions, composed of detailed and simple-to-use questions. Using this assessment criteria, the framework also provides an assessment method for IoMT solutions. The signi?cance of this work lies in its ability to assess a wide range of (1) stakeholders’ requirements (e.g., patients, medical pro-fessionals, system administrators etc.); (2) solutions (services, devices, platforms, etc.); and (3) architectures (e.g., mobile-controlled, cloud-based, etc.). This work educates IoMT users (e.g., patients, medical professionals, etc.) who often have a low level of awareness about the IoMT security issues and how to address them. The bene?ts of this work are not only limited to adopters. This framework can also be bene?cial to IoMT solution providers in assessing their products and compare them to other IoMT solutions. This encourages healthier and transparent competition among solution providers. Moreover, researchers and legislators/standardization bodies can utilize it to understand the security issues in order to better design security solutions and regulations. Our future work includes updating the list of assessment criteria that was mentioned in this paper as well as in our previous work [34] to adapt to the continuous and rapid evolution of IoMT solutions and their technologies. We will also develop a web-based tool based on the framework presented in this paper. References 1. A Guide to the Internet of Things Infographic. https://intel.com/content/www/us/en/internet-of- things/infographics/guide-to-iot.html 2. 87% of Healthcare Organizations Will Adopt Internet of Things Technology by 2019 (2017). https://www.hipaajournal.com/87pc-healthcare-organizations-adopt-internet-of-things-technology- 2019–8712/ 3. Alsubaei, F., Abuhussein, A., Shiva, S.: Security and privacy in the internet of medical things: taxonomy and risk assessment. In: 2017 IEEE 42nd Conference on Local Computer Networks Workshops (LCN Workshops), pp. 112–120 (2017) 4. Cyber Risk Services|Deloitte US|Enterprise Risk Services. https://www2.deloitte.com/us/en/ pages/risk/solutions/cyber-risk-services.html 5. Inc, S.: Synopsys and Ponemon study highlights critical security de?ciencies in medical devices. https://www.prnewswire.com/news-releases/synopsys-and-ponemon-study-highlights- critical-security-de?ciencies-in-medical-devices-300463669.html 6. Medical Devices are the Next Security Nightmare. https://www.wired.com/2017/03/medical-devices- next-security-nightmare/ 7. Hamlyn-Harris, J.H.: Three Reasons Why Pacemakers are Vulnerable to Hacking. http:// theconversation.com/three-reasons-why-pacemakers-are-vulnerable-to-hacking-83362 8. Jalali, M.S., Kaiser, J.P.: Cybersecurity in hospitals: a systematic, organizational perspective. J. Med. Internet Res. 28, 10059 (2018) 222 F. Alsubaei et al. 9. MSV, J.: Security is Fast Becoming the Achilles Heel of Consumer Internet of Things. https://www.forbes.com/sites/janakirammsv/2016/11/05/security-the-fast-turning-to-be-the-achilles- heel-of-consumer-internet-of-things/ 10. Abie, H., Balasingham, I.: Risk-based adaptive security for smart IoT in eHealth. In: Proceedings of the 7th International Conference on Body Area Networks, pp. 269–275. ICST (Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering) (2012) 11. 
Savola, R.M., Savolainen, P., Evesti, A., Abie, H., Sihvonen, M.: Risk-driven security metrics development for an e-health IoT application. In: Information Security for South Africa (ISSA), pp. 1–6. IEEE (2015) 12. Food and Drug Administration: Postmarket Management of Cybersecurity in Medical Devices (2016). https://www.fda.gov/downloads/MedicalDevices/DeviceRegulationand Guidance/GuidanceDocuments/UCM482022.pdf 13. MDRAP|Home Page. https://mdrap.mdiss.org/ 14. McMahon, E., Williams, R., El, M., Samtani, S., Patton, M., Chen, H.: Assessing medical device vulnerabilities on the Internet of Things. In: 2017 IEEE International Conference on Intelligence and Security Informatics (ISI), pp. 176–178. IEEE (2017) 15. Medical Equipment in General. https://www.iso.org/ics/11.040.01/x/ 16. New Security Guidance for Early Adopters of the IoT. https://cloudsecurityalliance.org/ download/new-security-guidance-for-early-adopters-of-the-iot/ 17. OWASP Internet of Things Project-OWASP. https://owasp.org/index.php/OWASP_ Internet_of_Things_Project#tab = Medical_Devices 18. [Press Release WP29] Opinion on the Internet of Things|CNIL. https://www.cnil.fr/en/press-release- wp29-opinion-internet-things 19. GSMA IoT Security Guidelines-Complete Document Set. https://www.gsma.com/iot/gsma-iot- security-guidelines-complete-document-set/ 20. Laplante, P.A., Kassab, M., Laplante, N.L., Voas, J.M.: Building caring healthcare systems in the internet of things. IEEE Syst. J. 12, 1–8 (2017) 21. Islam, S.M.R., Kwak, D., Kabir, M.H., Hossain, M., Kwak, K.S.: The internet of things for health care: a comprehensive survey. IEEE Access. 3, 678–708 (2015) 22. Williams, P.A., Woodward, A.J.: Cybersecurity vulnerabilities in medical devices: a complex environment and multifaceted problem. Med. Devices Auckl. NZ. 8, 305–316 (2015) 23. Leister, W., Hamdi, M., Abie, H., Poslad, S.: An evaluation framework for adaptive security for the iot in ehealth. Int. J. Adv. Secur. 7(3&4), 93–109 (2014) 24. Wu, T., Zhao, G.: A novel risk assessment model for privacy security in Internet of Things. Wuhan Univ. J. Nat. Sci. 19, 398–404 (2014) 25. Caldiera, V., Rombach, H.D.: The goal question metric approach. Encycl. Softw. Eng. 2, 528–532 (1994) 26. Bayuk, J., Mostashari, A.: Measuring systems security. Syst. Eng. 16, 1–14 (2013) 27. OWASP Internet of Things Project-OWASP. https://www.owasp.org/index.php/OWASP_ Internet_of_Things_Project 28. Health, C. for D. and R.: Digital Health-Cybersecurity. https://www.fda.gov/ MedicalDevices/DigitalHealth/ucm373213.htm 29. Naval Medical Logistics Command (NMLC): Medical Device Risk Assessment Question-naire Version 3.0. (2016). http://www.med.navy.mil/sites/nmlc/Public_Docs/Solicitations/ RFP/MDRA%203.0-20160815RX.PDF 30. Saaty, T.L.: Decision making with the analytic hierarchy process. Int. J. Serv. Sci. 1, 83–98 (2008) A Framework for Ranking IoMT Solutions 223 31. Cheng, Y., Deng, J., Li, J., DeLoach, S.A., Singhal, A., Ou, X.: Metrics of Security. In: Kott, A., Wang, C., Erbacher, R.F. (eds.) Cyber Defense and Situational Awareness, pp. 263–295. Springer International Publishing, Cham (2014) 32. Saaty, T.L.: Decision-making with the AHP: why is the principal eigenvector necessary. Eur. J. Oper. Res. 145, 85–91 (2003) 33. Smiths Medical Medfusion 4000 Wireless Syringe Infusion Pump Vulnerabilities (Update A)|ICS-CERT. https://ics-cert.us-cert.gov/advisories/ICSMA-17-250-02A 34. Alsubaei, F., Abuhussein, A., Shiva, S.: Quantifying security and privacy in Internet of Things solutions. 
In: NOMS 2018–2018 IEEE/IFIP Network Operations and Management Symposium, pp. 1–6 (2018) 224 F. Alsubaei et al. CUSTODY: An IoT Based Patient Surveillance Device Md. Sadad Mahamud(?) , Md. Manirul Islam, Md. Saniat Rahman, and Samiul Haque Suman American International University-Bangladesh, Dhaka, Bangladesh {sadad,manirul,saniat,samiul}@aiub.edu Abstract. In this paper, the authors present an assistance device for patient’s surveillance. An IoT based system is developed for monitoring patient’s heart rate, body temperature and saline rate. An Arduino microcontroller is used here for processing the data and ESP32 module is used for monitoring the patient’s data through internet and a GSM module is used for notifying the doctors in emergency case. The main objective of this project is to help the doctors and nurses to monitor a patient’s health condition through internet and over cellular network. On the other hand, if the monitoring parameters exceed beyond their nominal values, the ready message is sent to the concerned duty doctor as well as the attendant and display it in the LCD screen and a speci?c audio sound is played for urgent awareness. Keywords: IoT · ESP32 module · Arduino · Heart rate · Body temperature Saline measurement · GSM · Micro SD card module · Audio · LCD display 1 Introduction The arrival of modern technology has made our lives much easier and comfortable in comparison with the previous decades. But after having this technology, still a lot of medical patients die each year due to the lack of integration of the technologies and make it accessible at a very a?ordable cost. It is so di?cult for a doctor to monitor a patient 24/7 incessantly who is su?ering from critical disease or some corporal malady. One of the CCN health reports showed the 10 shocking medical mistakes for the patient’s date case [1] and most of them occurred for lack of timely care. Hence, to remove human trouble and lessen the compulsion of monitoring a patient restlessly from a doctor and a nurse, this paper proposes a low-cost surveillance system called CUSTODY for moni- toring a patient through internet with the conduct of GSM technology. Health monitoring system measures patient’s health condition in regular interval of time. This paper describes the design of an IoT based pulse rate, saline level rate and body temperature measuring system with the help of Arduino microcontroller and ESP32 module. The system raises an alarm when the pulse rate or body temperature rate or the saline level goes beyond or falls behind the threshold value and sends an emergency alert noti?cation to concerned doctor and family member. Patient’s real-time monitoring parameters can also be viewed via internet at any time. © Springer Nature Switzerland AG 2019 K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 225–234, 2019. https://doi.org/10.1007/978-3-030-02686-8_18 2 Related Works A Heart rate monitoring system is designed by P.A. Pawar [2] using IR based sensor which can measure the heart rate and send the signal through GSM module. This system is also based on the Arduino Microcontroller. Author mainly designed this system for home-based e?ective heart rate monitoring system. A LPC2129 health monitoring system is designed by M. Pereira [3]. In this paper the authors presented an IoT based device using ARM 7 processor, ECG, Heart Rate, AD8232, and Body Fat percentage module. 
The main idea presented in that paper is to provide better and more efficient health services to patients by implementing a networked information cloud, so that experts and doctors can make use of the data and provide a quick and efficient solution. An IoT-based health monitoring system that uses only an Android app for monitoring is proposed by N. Gupta and his co-authors [4]. The paper presents a health monitoring system using a pulse oximeter sensor, temperature sensors, and a PIR motion sensor; the patient's data is uploaded to a custom server over the GPRS network. According to that study, a health monitoring system is an efficient way to keep track of one's health condition. C. Raj and his co-authors proposed an IoT-based e-health care system for remote telemedicine [5]. For testing their system, they used body temperature, pulse oximeter, ECG, GSR, and EMG sensors to measure the patient's body parameters. The paper mainly focused on building a common interface between multiple remote centers and medical practitioners. An IoT-based smart health care system using CNT electrodes is designed by M. Bansal and his co-authors [6]. The main objective of that paper is to provide people with an effective solution for living comfortably in their homes or workplaces instead of going to expensive hospitals. S. Lavanya and co-authors developed a remote prescription and i-home healthcare system based on IoT [7]. The authors used a heart rate sensor, a real-time clock, RFID tags, and a Raspberry Pi server for network connectivity. In general, that paper presents an IoT-based intelligent home-centric healthcare platform which seamlessly connects smart sensors attached to the human body for physiological monitoring and daily medication management. Much more research is ongoing in this broad field.

3 Architecture Model of the System

Our system is based on an Arduino Mega microcontroller unit board and an ESP32 WiFi module board. All the sensor data are fetched and decoded by the microcontroller and then sent in real time using the ESP32 module. Figure 1 describes the architecture model of the proposed system.

Fig. 1. Architecture model of the system.

4 Design System

The total system is primarily based on the Arduino Mega microcontroller, which serves as the main controlling unit. After receiving the data from the temperature sensor, saline load sensor, and pulse sensor, the microcontroller unit decodes the data for the final operation. The ESP32 WiFi module is used for communication with the public network. All the data received by the Arduino is stored on the micro SD card module, and that stored data is made available to the web server through the ESP32 module. Figure 2 shows the simulation model of the total system. The simulation was done with the Fritzing simulation software [8].

Fig. 2. Simulation circuit diagram.

If any of the sensor values crosses its predefined nominal value, a predefined SMS or call is sent to the doctor and a specific audio sound is played. The LCD display panel shows the current state of the patient.

4.1 Arduino Mega 2560

The Arduino Mega 2560 is a microcontroller board based on the ATmega2560. The Mega 2560 is designed for more complex projects, with 54 digital I/O pins and 16 analog inputs [9]. The Mega is the main controlling unit for this system.
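To make the sensing-and-alert flow of Sects. 3 and 4 concrete, the following minimal Arduino-style sketch (C++) illustrates one possible shape of the main control loop: read the three sensors, log to the SD card and cloud, and raise the SMS/audio/LCD alarms summarized later in Table 1. The thresholds, helper names, and stub implementations are illustrative assumptions, not the authors' actual firmware; the pulse and temperature helpers correspond to the calculations derived in Sects. 4.2 and 4.3 below.

```cpp
#include <Arduino.h>

// Illustrative alarm thresholds (the paper does not list its nominal values).
const float BPM_MAX          = 120.0;  // pulse-rate alarm threshold (assumed)
const float TEMP_F_MAX       = 100.4;  // body temperature in Fahrenheit (assumed)
const int   SALINE_MIN_LEVEL = 1;      // level 1 = saline almost finished (Sect. 4.4)

// Hypothetical helpers standing in for the sensor, GSM, audio, LCD,
// SD-card and ESP32 routines described in Sects. 4.2-4.9.
float readPulseBPM();                          // Sect. 4.2, Eq. (1)
float readBodyTempF();                         // Sect. 4.3, Eq. (2)
int   readSalineLevel();                       // 3 = full, 2 = half, 1 = almost finished
void  logToSdAndCloud(float bpm, float tf, int saline);  // Sects. 4.5 and 4.7
void  sendSms(const char *msg);                // Sect. 4.6, SIM900A
void  playAlertAudio();                        // Sect. 4.8, clip stored on the SD card
void  showOnLcd(const char *msg);              // Sect. 4.9, 16x2 LCD

void setup() {
  Serial.begin(9600);                          // serial monitor, as in Figs. 3 and 4
}

void loop() {
  float bpm    = readPulseBPM();
  float tempF  = readBodyTempF();
  int   saline = readSalineLevel();

  logToSdAndCloud(bpm, tempF, saline);         // SD card + ESP32 -> web portal

  bool alarm = false;
  if (bpm > BPM_MAX)              { sendSms("Pulse rate increased");        alarm = true; }
  if (tempF > TEMP_F_MAX)         { sendSms("Body temperature increased");  alarm = true; }
  if (saline <= SALINE_MIN_LEVEL) { sendSms("Saline level low");            alarm = true; }

  if (alarm) { playAlertAudio(); showOnLcd("EMERGENCY"); }
  else       { showOnLcd("Patient condition: normal"); }

  delay(1000);                                 // sampling interval (assumed)
}

// Stubs so the sketch compiles; real drivers would replace these.
float readPulseBPM()                      { return 72.0; }
float readBodyTempF()                     { return 98.6; }
int   readSalineLevel()                   { return 3; }
void  logToSdAndCloud(float, float, int)  {}
void  sendSms(const char *m)              { Serial.println(m); }
void  playAlertAudio()                    {}
void  showOnLcd(const char *m)            { Serial.println(m); }
```

In the real device the thresholds, messages, and timing would follow whatever values the authors configured; the sketch only mirrors the alarm logic that Sects. 4.2-4.9 and Table 1 describe.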
4.2 Pulse Sensor

The heart beat can be measured from the variation in optical power as light is scattered or absorbed along its path through the blood while the heart beats [10]. In this system we have used an ear-clip pulse sensor. Let value_1 and value_2 be the counter values at the first and last pulse of a window of ten pulses. Then Ten_Pulse_time = value_1 - value_2, and Single_pulse_time = Ten_Pulse_time / 10. Our final equation for beats per minute (BPM) is:

Heart rate (BPM) = 60 / Single_pulse_time   (1)

After calculating the pulse rate using (1), the Arduino stores the current rate on the internet server through the ESP32 module, and if the pulse rate crosses the nominal value it sends an SMS, plays an audio output, and shows the current condition on the display. Figures 3 and 4 show the change in pulse rate measured in the Arduino Serial Monitor for our test patient, a 23-year-old boy.

Fig. 3. Normal pulse rate. Fig. 4. Increased pulse rate.

4.3 Temperature Sensor

This system uses the waterproof DS18B20 temperature sensor. The DS18B20 provides 9- to 12-bit (configurable) temperature readings over a one-wire [12] interface [11]. Here, Temp = Output voltage * 0.48828125, and finally

Temp_final = (Temp * 1.8) + 32   (2)

Using (2) [13], the Arduino calculates the patient's body temperature and executes its operation.

4.4 Load Sensor Module

For this system we have used a strain gauge load cell module [14]. The main concept behind the load sensor is to measure the saline weight, because we cannot put any sensor inside the saline packet. The load sensor measures the saline weight in liters and divides it into three levels: level 3 indicates that the saline is full, level 2 indicates that it is half full, and level 1 indicates that it is almost finished, so the nurse or doctor should change the saline packet. The load sensor values are fetched by the Arduino Mega microcontroller, checked against the set values, and an alarm is triggered if necessary.

4.5 ESP32 WiFi Module

The ESP32 already integrates an antenna, power amplifier, low-noise amplifiers, filters, and a power management module, so the entire solution takes up very little printed circuit board area. The board provides 2.4 GHz dual-mode Wi-Fi and Bluetooth built with TSMC 40 nm low-power technology [15]. In this system the ESP32 is used to connect the system to the cloud: the module reads the sensor data saved on the SD card and pushes it to the cloud. A private cloud domain server was created for testing this system as "custody.com". The monitoring web portal is written in PHP, and all the data is stored in a MySQL server. After logging into the web portal, the user can see the patient's real-time condition using the patient ID. The ESP32 module operates in the network layer of the OSI model [16]. Figure 5 shows the current health condition of a patient in the CUSTODY web portal.

Fig. 5. Private cloud domain server custody.com.

4.6 SIM900A GSM Module

GSM is mainly used in devices such as mobile phones as well as for long-distance communication; it transmits and receives data over GPRS, supports video calls, and sends SMS [17]. In this project the SIM900A GSM module is used for sending SMS. When the sensor values exceed their given ranges, the GSM module sends an SMS to selected numbers. Figures 10 and 12 show the SMS received by the cell phone describing the patient's condition.

4.7 Micro SD Card Module

The micro SD card module transfers data from an SD card.
The Arduino relates to the SD card through the breakout board and audio commands were saved in this SD card. The connection of the module with Arduino is shown in Fig. 2. When any sensor value crosses the range for the given coordinates an audio output will be generated to make the people aware about the danger. And it also stores the sensor data in the SD card for transfer it to internet with the help of Arduino and ESP32 Module. 4.8 Audio Ampli?er and Speaker When the audio commands are played form the micro SD card the audio volume is relatively low. So, to make it louder, we used our own custom made 9 V hearable audio ampli?er. Audio ampli?er was made using LA4440 IC and an 8 ?n speaker is used. 4.9 16 * 2 LCD Display A 16 * 2 LCD display is connected with the system. This display shows the current sensor value and the patient’s current condition. 5 Hardware Model Figure 6 shows the hardware model of the system. All the sensors are connected with the Arduino and the output results are displayed into the LCD module along with Emer- gency audio output. 230 Md. S. Mahamud et al. Fig. 6. Hardware model of the system. 6 Results Table 1 shows the test result of this system where the audio output and SMS output are set. We test this system for only one patient. Di?erent analysis results for this system are given below. Figures 7 and 8 show the test result when patient is in normal condition and the saline level is in normal condition as well. No SMS will be triggered or no audio will be played. The web portal will have patient’s current data. Figures 9 and 10 show the test output when the saline is in low-level condition. For testing purpose, we used a 250 ml bottle as a saline packet. When saline is almost ?nished the load sensor gets a very small weight and a SMS will be sent and audio will be played. Figures 11 and 12 show the test output of the situation when the temperature increases. We increased the temperature manually and a SMS is sent to the pre-de?ned number and an audio is played as well. Table 1. Results analysis for audio and SMS output Condition Audio SMS Normal condition No audio No SMS Normal condition No audio No SMS Normal condition No audio No SMS Body temperature increased Audio played SMS sent Pulse rate Increased Audio played SMS sent Saline level low Audio played SMS sent Figure 13 shows the web server monitoring window when the patient’s pulse rate is increased. And Fig. 14 shows that if anyhow two or three parameter falls in one time the system will return patient’s condition as emergency. On this condition a call will be sent to the attending doctor and also an emergency audio will be played by the system. CUSTODY: An IoT Based Patient Surveillance Device 231 Fig. 7. Normal condition. Fig. 8. Normal saline condition test. Fig. 9. Saline level low. Fig. 10. SMS received for saline low. Fig. 11. Temperature increased. Fig. 12. SMS sent for temp. increased. 232 Md. S. Mahamud et al. Fig. 13. Web server monitoring when patient pulse rate increased. Fig. 14. Web server monitoring when more than one sensor parameter crosses its nominal value. 7 Conclusion The main objective of this paper is to create a low-cost IoT based medical surveillance system that can be a true virtual assistant to a doctor using smart technique. Real-time monitoring of the patient’s current health condition by family members is an added advantage of this system. The initial test run of the prototype is successful. But some future work is needed for this system. 
More upgraded sensors can be used to calculate the pulse rate. As it is an IoT based system the patient’s data must be safe and the data processing must be faster. In future, further research can be carried out to improve the algorithm of the system. CUSTODY: An IoT Based Patient Surveillance Device 233 References 1. 10 shocking medical mistakes—CNN. https://www.cnn.com/2012/06/09/health/medical– mistakes/index.html. Accessed 2018 2. Heart rate monitoring system using IR base sensor and Arduino Uno—IEEE Conference Publication. Ieeexplore.ieee.org (2018). https://ieeexplore.ieee.org/document/7057005/. Accessed 25 Apr 2018 3. A novel IoT based health monitoring system using LPC2129—IEEE Conference Publication. Ieeexplore.ieee.org (2018). https://ieeexplore.ieee.org/document/8256660/. Accessed 25 Apr 2018 4. IOT based health monitoring systems—IEEE Conference Publication. Ieeexplore.ieee.org (2018). https://ieeexplore.ieee.org/document/8276181/. Accessed 25 Apr 2018 5. HEMAN: Health monitoring and nous: An IoT based e-health care system for remote telemedicine—IEEE Conference Publication. Ieeexplore.ieee.org (2018). https:// ieeexplore.ieee.org/document/8300134/. Accessed 19 Jun 2018 6. IoT based smart health care system using CNT electrodes (for continuous ECG monitoring) —IEEE Conference Publication. Ieeexplore.ieee.org (2018). https://ieeexplore.ieee.org/ document/8230002/. Accessed 19 Jun 2018 7. Remote prescription and I-Home healthcare based on IoT—IEEE conference publication. Ieeexplore.ieee.org (2018). https://ieeexplore.ieee.org/document/8094069/. Accessed 19 Jun 2018 8. Fritzing. Fritzing.org (2018). http://fritzing.org/home/. Accessed 25 Apr 2018 9. A. [closed]: Arduino Mega 2560 serial port location. Arduino.stackexchange.com (2018). https://arduino.stackexchange.com/questions/47727/arduino-mega-2560-serial-port-location. Accessed 25 Apr 2018 10. Grove—Ear-clip Heart Rate Sensor| Techshopbd. Techshopbd.com (2018). https:// www.techshopbd.com/product-categories/biometrics/1389/grove-ear-clip-heart-rate-sensor- techshop-bangladesh. Accessed 25 Apr 2018 11. https://playground.arduino.cc/Learning/OneWire. Accessed 25 Apr 2018 12. DS18B20 Digital Temperature Sensor (CN) | Techshopbd. Techshopbd.com (2018). https:// www.TechSoup.com/product-categories/temperature/2796/ds18b20-digital-temperature-sensor- cn-techshop-bangladesh. Accessed 25 Apr 2018 13. Sensing heart beat and body temperature digitally using Arduino—IEEE Conference Publication. Ieeexplore.ieee.org (2018). https://ieeexplore.ieee.org/document/7955737/. Accessed 25 Apr 2018 14. D. Load Cell—200 kg, S. Load Cell—10 kg and D. Load Cell—50 kg: Getting started with load cells—learn.sparkfun.com. Learn.sparkfun.com (2018). https://learn.sparkfun.com/ tutorials/getting-started-with-load-cells. Accessed 25 Apr 2018 15. Overview | Espressif Systems. Espressif.com (2018). https://www.espressif.com/en/ products/hardware/esp32-devkitc/overview. Accessed 25 Apr 2018 16. What is OSI model (Open Systems Interconnection)?—De?nition from WhatIs.com. SearchNetworking (2018). https://searchnetworking.techtarget.com/de?nition/OSI. Accessed 25 Apr 2018 17. Sim900a Gsm Module Interfacing with Arduino Uno. Electronicwings.com (2018). http:// www.electronicwings.com/arduino/sim900a-gsm-module-interfacing-with-arduino-uno. Accessed 25 Apr 2018 234 Md. S. Mahamud et al. 
Personal Branding and Digital Citizenry: Harnessing the Power of Data and IOT Fawzi BenMessaoud(&) , Thomas Sewell III, and Sarah Ryan School of Informatics and Computing, Indiana University and Purdue University, Indianapolis, IN 46202, USA fawzbenm@iu.edu Abstract. With all that the internet has to offer, it is easy to get lost in the myriad of resources available to us both academically and socially. We have so many ways to learn, connect, and promote ourselves that in trying to stay current in today’s digital world, we can quickly ?nd ourselves overwhelmed. To be successful, we need a way to conveniently organize educational materials and references while also ensuring that only our very best self is on display. According to a study we conducted on this subject, the idea of personal online management is something which many value highly, but are unsure how to fully realize. We feel like this is problematic for any modern user, but this can be resolved. Using multiple data collection methods in our research, we explored the concept of “Digital Citizenship”. Digital Citizenship is de?ned as a way of expressing the online presence and personal brand that users have curated in a digital space; as well as a simpler, more ef?cient way to store and organize a personal digital library. We are presenting an app that would help to ?ll this need within the realm of academia and beyond. This app is a way of simplifying our lives, making the internet more accessible and managing personal, educa-tional, and academic materials, online pro?les, and social media accounts. Keywords: Personal brand .n Digital footprint .n Digital Citizenry Social media 1 Major Aspects Our study was based on three factors that we felt were interdependent. We were interested in seeing how people consider their public data, represent their images online, and how they store personal data. These topics were labeled as Personal Brand, Web Presence, and Digital Content Storage. 2 Personal Brand Personal Brand encompasses the way a person presents themselves online. Bridgen asserted that a person can become successful by developing and marketing their per-sonal brand, highlighting themselves in a positive light and developing their online self in such away to be engaging to others [2]. © Springer Nature Switzerland AG 2019 K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 235–240, 2019. https://doi.org/10.1007/978-3-030-02686-8_19 Over a period of time, successful individuals obtain a reputation and position based on a combination of their expertise and “connectedness”, which makes them attractive to other players operating in the same space. An authentic personal brand therefore delivers both a track record and a promise of the ongoing delivery of value. From the Journal of Business Strategy, we see the statement, “… in most cases authentic per-sonal brand builders are genuinely strong performers who are highly sought after by employers because they have the ability to use their personal social capital for the bene?t of the organization and their own career progression within it” [3]. 2.1 Web Presence Inextricably linked to one’s personal brand is their web presence, particularly in the context of social media. Web presence is the public way an individual is observed from the point of view of an audience while on the internet. Jones postulated that the more connections people make, the larger their digital footprint, and the more likely potential employers will ?nd less positive aspects of a person’s digital life [4]. 
This is important to consider, particularly when searching for a career. However, no one lives in a vacuum, and digital connections with friends, family members, or even professional contacts is virtually inevitable in the world we live in today. According to Brake, “Pro?les and entries on Facebook, Twitter and many other such services can contain diaristic or confessional material that looks as if it is only for the author to read or perhaps for trusted friends and family - but although social media services often include tools to keep such writings private, many are visible to a large number of people or even published openly on the web with potential audience of millions.” [1]. The solution then is not simply to be aware, but to be able to manage one’s image in the digital space, promoting positive aspects while diminishing the aspects that are less so. In the article by Harris & Rae, the authors state, “… the ‘digital divide’ between the ‘haves’ and the ‘have nots’ in the developed world is now less about access to the web than it is about understanding how to actively participate in the networked society” [3] having the power to manage one’s overall web presence is key to success in modern times. 2.2 Digital Content Library The concept of Digital Content Storage is a library, a collection place of all digital content that a person owns and uses. This is similar to other methods used to save and share ?les of different types and sizes, such as Dropbox or Google Docs. The app we are presenting as a solution works in this same way, but with the added bonus of the “vault” feature, which would be a speci?c space located within the library with extra security features for more sensitive and restricted document ?les and information. 236 F. BenMessaoud et al. 3 Motivation Our motivation for this research was based on our hypothesis that the general populace has a lack of awareness regarding the importance of monitoring online presence and has dif?culty managing the vast resources available to them. We tested this theory. 3.1 Methods In choosing a method of study we thought it would appropriate to make use of an online survey in order to reach a variety of respondents in light of our triple constraints: we were able to reach the highest amount of people in our given time by the most cost-effective means. We conducted our study using Google forms. We posted several links to our survey on Facebook and Twitter, in order to gain a wide viewing and have the most success. Distributing the survey in this way allowed us to get feedback from those who may no longer be students or in the academic world, and did not assume any prior knowledge of our topics, giving us the widest possible net to cast for data. This survey included questions based on personal brand, web presence, and digital content storage, gauging the participants both in their current knowledge of these topics and also their current usage of applications and software/hardware speci?c to these subjects. This initial survey was left open for one week. We used the analytics provided by Google Docs initially, and then used the raw data to analyze the information for ourselves to make our conclusions. We split this survey into sections, and each section was speci?c to one of the three topics we were testing. This allowed us to get a somewhat general idea of the prior knowledge our participants had for each of our topics. 
For example, “Are you Familiar with Personal Brand?” was a speci?c question we asked our participants in order to try to gain an understanding of what the general public might or might not already know about the subject, an approach we felt was useful in giving meaning to the survey. 3.2 Findings The data we collected from our surveys proved to hold a number of patterns which we found in the process of our analyzation. Our initial survey collected data from 60 volunteer participants. This gave us quite a bit of information, which was useful in gaining knowledge from a large variety of data. One of our main interests was in determining how important people considered their social media presence. We were interested in the importance people put on themselves and their personal media ?rst. Our results showed that over 50% of people placed themselves and their social media in the mid-range. 81.67% of respondents rated their social media at a 3 or higher on a scale of 1–5, with 66.67% rating themselves at 3 or 4, the middle rankings (see Fig. 1). Another interesting pattern we found in our data was the distribution of gender, in the way that affected our survey. In Fig. 2, our respondents were 66% female, 33% male, so we thought it prudent to measure some of our responses by gender to ?nd important differences in the use of social media and web accounts. Personal Branding and Digital Citizenry 237 According to these results, gender is not a highly determining factor when con-sidering number of social media accounts currently in use by users. This is especially interesting considering the gender of respondents: as stated, 66% of respondents were female and 33% male, and interestingly, our data shows a low level of disparity between the two. We determined that this further proves that a better knowledge of one’s digital footprint is universally bene?cial (see Fig. 2). We were also very interested to see how highly people determine the importance of security of their saved content, asking them to rank that importance on a scale of 1–5. Interestingly, from this data, we found that zero respondents rated their security Fig. 1. Results of a survey question regarding participant’s ranking of the importance of their own social media presence. Fig. 2. Side by Side graphic showing media accounts held by gender. We did not ?nd gender to have any signi?cant impact on number of social media accounts held by participants. 238 F. BenMessaoud et al. importance at one, the lowest. Alternatively, 56.67% of respondents rated their interest in security of saved data at a 5, the highest possibility on the scale, which shows very clearly how highly security is considered (see Fig. 3). 4 Conclusion In summary, we found that our initial hypothesis was correct. Our belief is that the internet is ever-expanding, producing more connections than have existed in any time prior. The many nuances of our presence in this digital space is often missed or not fully understood, and this can result in unexpected repercussions. The goal of our research was to see to what extent the people we surveyed were aware of their larger online presence and the way that they navigated the digital landscape. In examining the responses, we received for our survey, the patterns showed us that while the people we surveyed answered that they understood each part of the three categories we were questioning about, they lacked a big-picture perspective of how those categories were intertwined. 
Digital Citizenry combines these concepts together, providing users with a way to manage their online selves by understanding the overlap that comes from a digital space, and therefore empowering people to make the best decisions, both for their present and for their future. References 1. Brake, D.R.: Sharing Our Lives Online: Risks and Exposure in Social Media. Palgrave Macmillan, Hampshire (2014) Fig. 3. Figure graphing the result of our question regarding importance of saved content. Over 50% of all respondents indicated that it was of extreme importance to them by ranking security at the highest possible level. Personal Branding and Digital Citizenry 239 2. Bridgen, L.: Emotional labour and the pursuit of the personal brand: Public relations practitioners’ use of social media. J. Med. Pract. 12(1), 61–76 (2011). https://doi.org/10.1386/ jmpr.12.1.61_1 3. Harris, L., Rae, A.: Building a personal brand through social networking. J. Bus. Strategy 32 (5), 14–21 (2011). https://doi-org.proxy.ulib.uits.iu.edu/10.1108/02756661111165435. Accessed 9 Apr 2018 4. Jones, C., et al.: Net generation or digital natives: is there a distinct new generation entering university? Comput. Educ. 54(3), 722–732 (2010). https://doi.org/10.1016/j.compedu.2009. 09.022 240 F. BenMessaoud et al. Testing of Smart TV Applications: Key Ingredients, Challenges and Proposed Solutions Bestoun S. Ahmed(B) and Miroslav Bures Department of Computer Science, Faculty of Electrical Engineering, Czech Technical University, Karlovo n´am. 13, 121 35 Praha 2, Czech Republic {albeybes,buresm3}@fel.cvut.cz Abstract. Smart TV applications are software applications that have been designed to run on smart TVs which are televisions with integrated Internet features. Nowadays, the smart TVs are going to dominate the television market, and the number of connected TVs is growing expo-nentially. This growth is accompanied by the increase of consumers and the use of smart TV applications that drive these devices. Due to the increasing demand for smart TV applications especially with the rise of the Internet of Things (IoT) services, it is essential to building an application with a certain level of quality. Despite the analogy between the smart TV and mobile apps, testing smart TV applications is di?er-ent in many aspects due to the di?erent nature of user interaction and development environment. To develop the ?eld and formulate the con-cepts of smart TV application testing, this paper aims to provide the essential ingredients, solutions, answers to the most critical questions, and open problems. In addition, we o?er initial results and proof of con-cepts for a creeper algorithm to detect essential views of the applications. This paper serves as an e?ort to report the key ingredients and chal-lenges of the smart TV application testing systematically to the research community. Keywords: Smart tv application testing · Software testing Model-based testing · Internet of Things (IoT) 1 Introduction A connected TV, which is popularly called smart TV, is a technological assem-blage device among computer and traditional television. The device is a combina-tion of conventional TV terminal, operating system (OS), and digital contents in which all of them are connected to the Internet. Smart TVs are providing di?er-ent digital services like multimedia, gaming, Internet browsing, on-demand enter-tainment access, a various online interactive session in addition to broadcasting media. 
In fact, these devices were expected to be more intelligent, interactive, and useful in the future [1]. Recently, the electronic companies along with IT ?rms were rising investments in the technological advancements of these devices .f c Springer Nature Switzerland AG 2019 K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 241–256, 2019. https://doi.org/10.1007/978-3-030-02686-8_20 242 B. S. Ahmed and M. Bures by launching new terminals and applications for smart TVs. It is expected shortly that these devices will be a frequent part of our smart homes within an Internet of Things (IoT) context1 . This explains why the smart TV market worth $265 Billion by 20162 . Just like the new technological smart devices, smart TVs are operated by an OS with di?erent applications (apps) installed on it. Although the OS is the key software for operation, the installed apps on the smart TV brings di?erent uses and functionalities to the device. At a glance, the smart TV app may look like a mobile app due to the similarities of the OSs or the development kits. Due to this “fake” similarity, one may think of testing smart TV apps just like the mobile app testing. However, in fact, testing smart TV apps is di?erent due to the nature of user interaction with the app itself. In mobile apps, the user is interacting with the device touchscreen (i.e., the application) directly by hand whereas, within smart TVs, the user is interacting with the app through another device which is the remote controller. Of course, some vendors are providing interaction by touchscreen to the users, but the way that application behaves is still based on the remote control device when it comes to testing practices. In addition, the user of any TV (including the smart TVs) is usually staying away from the screen and almost use the remote device to operate the apps all the time. In the literature, mobile apps testing is well-studied, and many research direc-tions have been established, (e.g., [2–4]). However, testing smart TV apps is a new area and many challenges still without a solution, and many research ques-tions may arise without answers. To address these challenges and questions, it is essential to explore the app structures, interaction ways, development envi-ronments, and the technology behind the apps. In doing so, this paper examines the key ingredients of smart TV app testing. The paper aims to address the most demandable questions. The paper also discusses the challenges addressed so far in the literature and open problems for test automation and generation. Based on that, a systematic framework for testing applications on Smart TVs is illustrated throughout a prototype. The framework includes the testing process, its steps, and also the test generation strategy. This will help to validate the di?erent aspects of the applications before release. This could also serve as an initiative topic for further research in the near future. The framework will help to address and formulate more open problems and research questions. The rest of this paper is organized as follows. Section 2 summarizes the related works in the literature and those e?orts in smart TV app testing that could be useful here. Section 3 explains the technology behind the smart TV apps. Section 4 illustrates some analogy and di?erences between mobile and smart TV apps. Section 5 describes the navigation and control mechanism of smart TV apps. Section 6 discusses the open research problems in the smart TV app testing. 
Section 7 de?nes a prototype for a systematic automated testing strategy. Section 8 discusses the functional and non-functional testing Opportunities in 1 https://read.bi/2L4CDSI. 2 https://bit.ly/2HxnMkL. Testing of Smart TV Applications 243 Smart TV Applications. Finally, Sect. 9 give concluding remarks and also future research recommendations. 2 Motivation and Literature Testing software applications on smart devices is considered to be a development and an evolution of testing practice from the traditional user interfaces (UI) like graphical user interface (GUI) and web application testing. The testing practices for these UIs have been studied extensively in the last decade, and as a result, many sophisticated methods, algorithms, and tools have been developed. Banerjee et al. [5] studied more than 230 articles published between 1991–2013 in the area of GUI testing and Li et al. [6] surveyed the literature in two decades of web application testing. Mobile application testing could be considered as the ?rst e?ort towards smart application testing. There are many di?erences between mobile apps and graphical/web UI. In fact, the main issue that makes the di?erence in the testing process is the user interaction with the application. In the standard GUI and web applications, the keyboard and mouse combination is still the standard user input to interact with the applications. However, this is not the case for mobile apps as the user interacts with the device touchscreen by ?ngers and hence, there would be di?erent interaction behavior from various users. Although this issue leads to develop new testing strategies for mobile apps, still many of these strategies are taking bene?ts, wholly or partially, from the earlier methods and practices published for GUI and web application testing. For example, Amal?tano et al. [7] developed MobiGUITAR strategy for systematic mobile application testing from the GUITAR strategy [8] for GUI testing. An extensive study on mobile application testing is presented in [2]. Smart TV application is a new smart device application type. The views of the application are not like other applications. The application structure looks like web application as it relies on HTML, CSS, and JavaScript; however, the user interaction with the application di?ers from other types of applications. Usually, the user is not interacting with the application directly by hand, and it should be through another input device, which is the remote device. This could lead to think that the testing process is similar to the GUI or web application. However, the remote device does not behave like the standard mouse. While the standard mouse input device can move in every direction on the application, the remote device movement is restricted to four explicit directions. The inter-action di?erence makes many obstacles and di?culties when it comes to testing process. While the general concepts of model-based testing are applicable here, the construction of the model and the model type makes the di?erence. For example, due to the di?erent interaction nature, Nguyen et al. [8] used Event Flow Graph (EFG) as a model of the GUI testing, whereas Amal?tano et al. [7] uses state machine as a model for the mobile application testing. In smart TV app, both EFG and state machine models are not applicable. In Smart TV app, each transition from a state to another is practically just one step, while 244 B. S. Ahmed and M. Bures this is not the case in other applications. 
For example, in the mobile app, the distance between two icons (states) does not make sense in the transition, while this is very important in the smart TV application, and that will lead to a di?er-ent model. An important e?ort to formulate this model is done recently by Cui et al. [9]. Here, the Hierarchical State Transition Matrix (HSTM) is proposed as a model for the Android smart TV applications. While the model is promising, there is a need to develop and formulate it for the complex structure of di?erent applications. In fact, testing smart TV apps could be seen from di?erent angles. For exam-ple, usability testing is one of the critical testing issues to address the interaction between the user and the smart TV through remote device. This will help to improve the quality of the user interfaces of the applications. Ingrosso et al. [10] addressed this issue by using several users to test an e-commerce application on smart TV. Security testing is also an essential issue in the smart TV apps. However, we could not ?nd a published study addressing security in Smart TV apps. Recently, Sabina C. [11] discussed and described some of the testing plat-forms for Smart TV apps. The study chooses Opera and Samsung TV Stores for testing the applications. The testing process relies on the upload of the applica-tions to the Opera and Samsung application stores to verify them based on the code writing. Hence, there is no de?nition of the testing strategy itself, and that could not be considered as a formal testing process. The study has also addressed the importance of functional testing of these applications without giving details since it is a bachelor study with limitations. Although it is essential from the industrial point of view, we could not ?nd many companies giving solutions for smart TV apps testing. One of the exciting projects so far is the suite.st framework3 . The framework depends on record and replay testing style by using two di?erent devices, one for recording the actions, and the other is for acting like an emulator. In fact, the platform dealing with the application just like a web application and uses record and replay style of testing being employed by SeleniumHQ4 . The framework is a good startup for the industry to adapt selenium style of testing for smart TV apps. Although the framework claims that it is dealing with the functional testing of mobile apps, still the pass/fail criteria are not clear from an academic point of view. As a result, there is a need to de?ne a test oracle for the framework. In addition, the framework does not rely on some automatic test generator for fully testing of the applications. In fact, de?ning a test oracle for smart TV application could be a new research direction as we will address it later in this paper. 3 Smart TV Apps Development and Technology Just like Android apps, smart TV apps are developed using Software Develop-ment Kits (SDK). The new versions of Android SDK supporting the development of smart TV apps. However, these applications can be run on Android Smart TV 3 https://suite.st. 4 http://www.seleniumhq.org/. Testing of Smart TV Applications 245 devices only. In fact, few SDKs were available for cross-platform development. For example, Josh?re5 Smart TV SDK was a platform to develop applications to work on Google and Samsung TV devices but not on LG TV devices. Mautilus6 Smart TV SDK is also a platform for development, but still, the application is working on some versions of devices only. 
Smart TV Alliance7 was the most advanced SDK by supporting di?erent features and platforms. However, the project is shut down, and the SDK is not available for download. Samsung Tizen SDK provides a set of tools and frameworks to develop smart TV apps through Tizen Studio. The SDK is depending on the latest web tech-nologies such as JavaScript, CSS, HTML5, and W3C widget packaging. In fact, Samsung has established Tizen.Net which is a new cross-platform application development that has been integrated with Visual Studio. Nowadays, most of the SDK tools are relying on a uni?ed approach to the development technology for smart TV apps. The technologies behind the appli-cations are JavaScript, HTML5, and CSS3. JavaScript is used as a standard programming language to program the behavior of the applications. The use of JavaScript adds the page jumping capability of the application. It enables the developer also to code complex expressions and calculations like condi-tional branches, and loops. The ?fth version of the Hypertext Markup Language (HTML5) is used as the latest version for developing the web elements’ structure and content. The HTML5 is essential to develop the structure of the application page even without the JavaScript code, but that will lack the interactivity with the user [12]. Finally, the third version of the Cascading Style Sheets (CSS3) is used for the presentation of these web elements and polishing them for better visualization. These essential components are forming the latest and best tech-nology of the smart TV application, and also they are the newest technology for the World Wide Web. In general, Smart TV app could be one of two types, installed or cloud-based. Installed TV app is a stand-alone app installed on the smart TV without the need for the Internet connection, while the cloud-based TV app works as an interface between the cloud and the TV with a shallow content (almost no additional functionality) when there is no Internet connection. 4 The Analogy and Di?erences of Smart TV and Mobile Apps There are many similarities and di?erences between the Mobile and Smart TV apps. These similarities and di?erences could be seen in three dimensions, (1) Functionality, (2) Design, and (3) User interaction. Both applications are working on smart devices. Hence, the functionality could be similar, as they are both connected to the Internet. The mobile apps 5 https://www.josh?re.com/. 6 https://www.mautilus.com. 7 http://www.smarttv-alliance.org. 246 B. S. Ahmed and M. Bures could be useful even without connection to the Internet; however, several smart TV apps are useless without the network connection. The computation power of the smart device also could de?ne the functionalities of the application itself. In fact, the mobile apps could be more functional than smart TV apps because the mobile devices nowadays may have more computational power than smart TVs. In addition, the aim of the mobile apps is almost di?erent from the smart TV apps. Speaking about the application design, there are many di?erences. For exam-ple, the size of the screen and icons could de?ne the layout of the application. Smart TV screens are wider than the mobile devices. The background color of the smart TV apps could be di?erent from the color in the mobile devices. From the user interaction point of view, smart TV apps are having less text entry as it is di?cult to enter text from the remote device. 
Most of the smart TV apps are designed to get the content from the Internet when connecting whereas this is not the case for the mobile apps, as they could be standalone applications without Internet connections interfaces8 . The typical smart TV application is much more straightforward than the mobile app, especially in the design layout. The way that the user interacts with the application de?nes an essential di?erence between the smart TV and mobile apps. The user of the mobile app interacts directly with the application without an intermediate device, while in the smart TV application, the user interacts with the help of a remote device. In fact, the UI of the smart TV apps sometimes called 10-foot user interfaces since the 10 ft (3 m) distance from the TV is the standard distance between the user and the TV. The developers are considering this distance when developing the user interface [11]. Using the remote device with this distance is not user-friendly and not responsive. Hence, the UI must consider this signi?cant di?culty. As mentioned previously in Sect. 2, this interaction di?erence will be signi?cant also when approaching the testing process with model-based testing. 5 Navigation and Control in Smart TV Apps As mentioned previously, navigation on a smart TV application is through the remote device. Although some new TV devices are o?ering the direct interaction by the user with the screen, the most common interaction with the TV is still the remote device. The remote device consists of four essential navigation Right, Left, Up and Down. In addition, the remote device has an OK button to choose any selected view on the application after exploration. These ?ve key buttons should work properly while using an application. Figure 1 shows an example of the TV remote device. In addition to those ?ve buttons, there are many other buttons on the remote device that vary from a TV brand to another depending on the level of function-alities. Some of them are related to the hardware functionalities of the TV itself, as the power button to turn ON/OFF the TV. There are also ten buttons (from 0–9) for channel jumps and even entering numbers in text ?elds if necessary. 8 https://bit.ly/2IiNb30. Testing of Smart TV Applications 247 Fig. 1. TV remote device. The UI layout of any application plays a primary rule in the testing process. Understanding the layout could lead to an e?cient test generator and runner. Smart TV apps are following some limited number of layout patterns. Figure 2 shows three main patterns in which most of the smart TV apps are following. In fact, layout (b) is mostly used, since it puts many views in one window. Fig. 2. Three main layout design patterns for smart TV apps [13]. The remote device is putting constraints on the navigation from a view to another because it supports just one step navigation. Hence, each move on the layout is a step. This would not be a problem when two views are adjacent; however, for those non-adjacent views, more than one step is needed to move from one view to another. This navigation is very important when coming to the test generation strategy based on the application’s model. 6 Open Problems and Challenges In this section, we discuss di?erent problems and challenges that need to be addressed for the smart TV app testing. In the following subsections, we will address each problem, the challenges to solve the problem and our suggestions. 248 B. S. Ahmed and M. 
Bures 6.1 Start Point of Navigation One of the ?rst problems that the tester face when testing a smart TV app is the position of the navigational cursor. Technically speaking, from a JavaScript developer point of view, this happened when the focus point is not set in the application. For several applications on the store, this focus point is not set by the developers. As a result, when the application runs on the emulator, there is no pre-selected view on the application. The user must use the remote device to chose a view. Hence, the starting point of the navigator is missing. This problem is happening clearly with cloud-based TV apps because the views are changing in real-time with the cloud content. In fact, this is a challenging issue because it prevents the pre-generation of test sets. One solution to this problem is to let the tester choose the starting point of the testing. Yet, there could be a problem of good or bad selection point. Some starting points may lead to explore the app window sooner by navigating faster on the views. 6.2 Repository and Benchmark In general, any software testing veri?cation and validation process should be eval-uated through some benchmarks. These benchmarks could be real instrumented programs with some properties for testing. For example, many testing strategies are using the benchmarks available at Software-artifact Infrastructure Reposi-tory website9 for benchmarking and evaluation. For android testing, there are di?erent applications for testing. For instance many papers were using TippyTip-per10 , PasswordMaker Pro11 , MunchLife, K-9 Mail12 , Tomdroid13 , AardDict14 , and a few other applications for testing. In smart TV apps testing, we don’t have enough applications for benchmark-ing, and we don’t have a repository to store some benchmarks. In fact, there are two reasons behind this. First, smart TV apps are new and more time may be needed for the developers to create and publish open source applications. Sec-ond, the testing process of smart TV app is not de?ned yet, and the research is not initialized, in which this paper could be an e?ort toward that. Samsung maintains a page with some simple applications and examples15 . One solution for this di?culty is to develop applications for testing purposes. Here, the reliability of the testing process would be an issue. However, for better reliability, the testing and development groups could be separated. 9 http://sir.unl.edu/portal/index.php. 10 https://tinyurl.com/yd77qfzd. 11 https://tinyurl.com/ma65bc8. 12 https://tinyurl.com/6mzfdaa. 13 https://launchpad.net/tomdroid. 14 https://github.com/aarddict/android/issues/44. 15 https://bit.ly/2qC5ncS. Testing of Smart TV Applications 249 6.3 Test Generator In mobile app testing, most of the test generation strategies were almost inspired by other UI test generation strategies. For example, the test generator strategy of MobiGUITAR [7] framework was adapted from the GUITAR [8] framework for GUI testing. However, this method could not be followed in smart TV apps. Due to the user interaction di?erence in smart TV app, it is hard to adapt some test generator strategy from GUI or mobile app testing. For this reason, there is a need to develop a new test generation strategy. Although relying on previously investigating strategies is not clear at this early stage, following principles and concepts of model-based testing is still valid. Here, after deciding on the model and notations, the coverage criteria of the testing strategy would be another issue. 
Defining the coverage criteria depends mainly on the functional and non-functional requirements under test.

6.4 Activity Exploration

The test generation stage cannot be performed without input to the generator algorithm. For functional or non-functional testing, the input will most probably consist of two things: the number of events to test and the coverage criteria. As mentioned previously, the coverage criteria can be defined based on a predefined testing strategy. However, obtaining the input views for the test generation algorithm may require an exploration of the entire UI activity (i.e., window) of the smart TV app. Activity exploration is not a big issue (at least technically) when we have the source code of the application, i.e., in white-box testing: a simple code crawler could scan the HTML5 and CSS3 files, detect the views by parsing the code, and then feed the generator algorithm with these views (a minimal illustration of this white-box case is sketched at the end of this section). However, catching the views without the source code (i.e., black-box testing) can be a tricky job. A special algorithm is needed due to the special way the user interacts with the application through the remote device. In Sect. 7.1, we introduce an algorithm that creeps over the significant views of the application activity in a black-box manner.

6.5 Stopping Criteria

Stopping criteria for smart TV apps can be an issue, especially for cloud-based applications. In an installed TV app, there is a finite number of views that the creeper can catch and the testing strategy can cover; when the coverage criteria are met, the testing strategy may stop, so coverage can serve as the stopping criterion. However, in cloud-based apps there can be an effectively unbounded number of events appearing through the real-time feed from the cloud. For example, the YouTube smart TV app presents new views (i.e., videos) as the user scrolls down in the application. Practically, there could be a massive number of views, and the number of views may also vary with each new start of the application. One solution to this challenge is to define a finite number of iterations over which the creeper may iterate, or to limit the number of views to be covered before stopping.

6.6 Test Suite Ripper

When generating the test cases, we expect some obsolete or invalid test cases. For example, some views detected during the creeping process may not be valid and yet still appear in the test cases. To this end, a test ripper is needed to repair those test cases that are not valid. The test ripper may follow an algorithm to repair the test cases, for example by matching several predefined patterns of invalid test cases or of invalid transitions from one view to another. Another repair process could be specific to the remote device: for example, the color buttons on the remote could be used for several functional and non-functional requirements depending on the application configuration.

6.7 Test Runner

When the creeper has detected the views and the test cases have been generated and repaired by the test generator and ripper, a test runner is needed to execute them. A test runner simply takes the test suite and runs the test cases one by one automatically. Here, the same test runner strategy used in Android app testing could be followed for smart TV apps; however, executing the test cases depends on the development kit.
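To make the white-box case of Sect. 6.4 concrete, the sketch below extracts candidate views from an HTML5 file by parsing the markup. It is only an illustration: the assumption that navigable views are marked with a CSS class named "view", and the file name used in the example, are hypothetical and would differ per application.

from html.parser import HTMLParser

class ViewExtractor(HTMLParser):
    # Collects the ids of elements that look like navigable views.
    def __init__(self):
        super().__init__()
        self.views = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = attrs.get("class", "").split()
        if "view" in classes and "id" in attrs:
            self.views.append(attrs["id"])

def extract_views(html_path):
    # Parse one HTML5 file and return the ids of its candidate views.
    extractor = ViewExtractor()
    with open(html_path, encoding="utf-8") as f:
        extractor.feed(f.read())
    return extractor.views

if __name__ == "__main__":
    print(extract_views("index.html"))  # hypothetical entry page of the app project

A real crawler would also read the CSS3 files to recover the spatial arrangement of the views, since the layout determines which views are adjacent for remote-device navigation.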
6.8 Fault Taxonomy and Categorization

After running the test cases on the application, an important task is to identify the encountered faults and the test cases to which these faults relate. However, the faults of smart TV apps are not yet well known, and classical mutation testing is not directly applicable. For example, Deng et al. [14] recently identified different faults in Android apps within a mutation testing framework for mobile devices; those faults are Android-oriented and not applicable here. In addition, some of them relate to Activity faults, for example changing the screen orientation, which is also not appropriate because a smart TV screen is too large to be frequently reoriented. Classical mutation testing tools such as MuDroid [15] or MuJava [16] are normally used for mobile, web, or desktop apps and, as mentioned, they are platform-specific tools.

An important effort in this direction was made by Cui et al. [9], who identified eight types of faults in smart TV applications: TV system halt, TV system reboot, displaying a black screen, having sound but no image, playing images with delay, application exit by exception, playing images with a blurry screen, and a key having no response or the wrong response. While this is an excellent effort toward fault categorization, there is a need to identify more faults related to the application itself, since some of the identified faults may relate to the TV device rather than the app. There is also a need for a method to inject these faults into the smart TV. A significant effort here would be a study defining a taxonomy of faults in smart TV apps. A useful input to such a study could come from the smart TV industry, especially from companies that track and collect user feedback in the cloud; an analytical study categorizing the faults in such data would be an excellent finding.

6.9 Defining Test Oracle

Defining the pass and fail criteria is a challenging task in the software testing process. Within test automation, the mechanism for determining whether a given test case passes or fails is called the test oracle, and the problem of distinguishing correct from incorrect behavior is known as the "test oracle problem" [17]. A classical approach is manual identification of pass and fail by the developer; however, for a significant number of test cases this is inaccurate and impractical. Automating test oracles in smart TV app testing is not an easy task, since we do not yet know precisely the nature and kinds of faults these applications face. In addition, the dynamic behavior of cloud-based smart TV applications may lead to random new views being loaded. This task is therefore connected to the fault taxonomy and categorization discussed in Sect. 6.8: once we know the faults and can categorize them, we can define the test oracle for the automated testing framework.
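As a rough illustration of how the fault categories of Cui et al. [9] (Sect. 6.8) could feed an automated oracle (Sect. 6.9), the sketch below maps emulator log lines to fault labels with simple pattern matching. The log-message patterns are hypothetical placeholders; the real messages depend on the emulator and SDK, and a practical oracle would need a far richer taxonomy.

import re

# Hypothetical mapping from log patterns to the fault types of Cui et al. [9].
FAULT_PATTERNS = {
    r"system halt":        "TV system halt",
    r"system reboot":      "TV system reboot",
    r"black screen":       "Displaying a black screen",
    r"no video signal":    "Sound but no image",
    r"frame delay":        "Playing images with delay",
    r"uncaught exception": "Application exit by exception",
    r"render blur":        "Blurry screen",
    r"key timeout":        "Key has no or wrong response",
}

def classify_log_line(line):
    # Return the fault label matching a log line, or None if it looks healthy.
    for pattern, fault in FAULT_PATTERNS.items():
        if re.search(pattern, line, re.IGNORECASE):
            return fault
    return None

def oracle(log_lines):
    # A test case passes when no log line maps to a known fault category.
    faults = [f for f in map(classify_log_line, log_lines) if f]
    return ("fail", faults) if faults else ("pass", [])

if __name__ == "__main__":
    print(oracle(["app started", "Uncaught exception in player.js"]))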
7 Towards an Automated Testing Strategy

Based on the problems and challenges presented so far, we propose an automated framework to test smart TV apps. This framework presents our vision of a strategy to automate the testing process. The framework works with the Tizen SDK, which includes a smart TV emulator; however, it is a general framework and is applicable to other emerging SDKs in the future.

Figure 3 shows an overview of this framework and illustrates the essential components and their relationships to each other. The framework supports both white-box and black-box testing styles; the tester chooses between them depending on the availability of the source code and the application type. As mentioned previously, even when the source code is available, a cloud-based app must be treated as a black-box testing case. When the source code is available, the tester imports the project and lets the framework do the rest automatically: the creeper scans the source code and tries to identify the essential views in the UI.

Fig. 3. Smart TV app testing framework.

In the case of black-box testing or a cloud-based app, which is probably the most critical case, the creeper must use a special algorithm to creep over and detect all the views. Details of this algorithm are presented in the following section (Sect. 7.1). Here, the creeper uses the log messages from the TV emulator to validate the views. In both the white-box and black-box approaches, the creeper detects the essential views and converts all the views and their relationships into a state machine graph model. This model is the input to the test generator, which consists of a model-based generation algorithm and a test ripper to repair the test cases. The repair is based on predefined patterns of invalid test cases, and the process iterates as long as an invalid test case remains. The framework then executes the test cases through a test runner on the TV emulator, an automated test oracle module validates them one by one, and finally a test report is presented to the user.

7.1 Application Creeper

To detect all the necessary views in the application that need to be present in the model for test generation, we have developed an algorithm called EvoCreeper. Object detection in the UIs of mobile, desktop, and web apps is not new; there are algorithms called crawlers that crawl over the UI and detect such objects. None of those algorithms is directly useful here, since the user interaction behavior in smart TV apps is entirely different. Besides, we consider the name "creeper" to suit what we want to do, as the word "crawler" carries a different meaning due to its use in web and search engine technologies.

Algorithm 1 shows the steps of EvoCreeper. If the focus point is not set by the app developer, EvoCreeper starts with an action from the tester, who chooses at least one view to start from; otherwise, it starts from the focused view. From this view, the creeper starts creeping over the UI evolutionarily and incrementally. The algorithm takes four directions, DUp, DDown, DLeft, and DRight, plus the OK button, from each view to move. When a new view is discovered in a direction (i.e., newView = Active), the algorithm adds it to the list of views to be modeled, Lv.

Algorithm 1. EvoCreeper Steps
1   Input: v1 is the user-selected view
2   Output: list of views to be modeled, Lv
3   Iteration It ← 1
4   Maximum iteration Itmax ← max
5   While ((It < Itmax) ∧ (newView ≠ null))
6       Use v1 as a start point
7       From v1 generate five possible directions DUp, DDown, DLeft, DRight, OK
8       For each direction
9           Navigate a step
10          Monitor the emulator log for a reaction
11          If newView = Active
12              add newView to Lv
13          End If
14          It++
15      End For
16  End While
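A compact Python sketch of Algorithm 1 is given below. The emulator interface is a hypothetical abstraction (the concrete calls depend on the Tizen SDK tooling, and view discovery would in practice be confirmed through the emulator log); the toy GridEmulator is a static 12-view window loosely modeled on the proof of concept of Sect. 7.2.

from collections import deque

DIRECTIONS = ["Up", "Down", "Left", "Right", "OK"]

def evo_creeper(emulator, start_view, max_iterations=1000):
    # Iterative, breadth-first discovery of views through remote-key steps.
    # `emulator` is a hypothetical wrapper exposing:
    #   focus(view)      -- move the cursor to an already discovered view
    #   press(direction) -- send one remote key; return the newly active view or None
    discovered = [start_view]        # Lv: list of views to be modeled
    frontier = deque([start_view])   # views whose neighbours are still unexplored
    iteration = 0
    while frontier and iteration < max_iterations:
        view = frontier.popleft()
        for direction in DIRECTIONS:
            emulator.focus(view)                  # start each step from `view`
            new_view = emulator.press(direction)  # in practice: monitor the emulator log
            iteration += 1
            if new_view is not None and new_view not in discovered:
                discovered.append(new_view)
                frontier.append(new_view)
    return discovered

class GridEmulator:
    # Toy stand-in for the TV emulator: a 3x4 grid of views v1..v12.
    MOVES = {"Up": (-1, 0), "Down": (1, 0), "Left": (0, -1), "Right": (0, 1), "OK": (0, 0)}

    def __init__(self, rows=3, cols=4):
        self.rows, self.cols = rows, cols
        self.pos = (0, 0)

    def focus(self, view):
        index = int(view[1:]) - 1            # 'v5' -> row 1, column 0
        self.pos = divmod(index, self.cols)

    def press(self, direction):
        dr, dc = self.MOVES[direction]
        r, c = self.pos[0] + dr, self.pos[1] + dc
        if 0 <= r < self.rows and 0 <= c < self.cols and (r, c) != self.pos:
            self.pos = (r, c)
            return "v%d" % (r * self.cols + c + 1)
        return None                          # no reaction: edge of the window or OK

if __name__ == "__main__":
    print(evo_creeper(GridEmulator(), "v1"))  # discovers v1..v12 from the worst-case start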
Algorithm 1 continues until no new views are discovered. As an additional stopping criterion, the algorithm takes a preset maximum number of iterations to avoid an endless discovery loop in some special cases of cloud-based apps. In the following section (Sect. 7.2), we present an example as a graphical proof of concept for this algorithm.

7.2 Proof of Concept

In this section, we present a proof of concept for the application creeper of Algorithm 1. We consider a cloud-based app as the pilot example, as it is the most difficult scenario. As shown in Fig. 4, each activity window has 12 views, and as the user shifts down or right, new activities may appear. We consider three iterations of the algorithm and assume that the tester chooses v1 as the start point. In fact, v1 is the worst-case choice; we observed that choosing a view in the middle of the window may lead to fewer iterations and better recognition of the views. From v1, the algorithm considers the four main directions DUp, DDown, DLeft, and DRight plus the OK button. However, here we consider only the four directions, because the OK button may open a new window in the app. For each direction, the creeper checks for new events, which are most likely new views.

In the first iteration, starting from v1, the up and left directions Du and Dl do not lead to new views, while the right direction Dr leads to v2 and the down direction Dd leads to v5. In the next iteration, the creeper starts from the newly discovered views, here v2 and v5. From v2, the new views v3 and v6 are identified by the creeper; in addition, v1 is rediscovered in the Dl direction but is ignored, as it is already in the view list. From v5, the views v1, v9, and v6 lie in the three directions Du, Dd, and Dr, respectively; however, only v9 is considered a new view.

Fig. 4. Proof of concept of the EvoCreeper.

The third iteration again starts from the newly discovered views v3, v6, and v9. In the same way, considering the four directions from each view and filtering out all repeated views, four new views are identified: v4, v7, v10, and v13. The EvoCreeper algorithm thus works in an iterative, evolutionary style to discover new views and events in the application under test. As mentioned, this pilot example considers a cloud-based app, so there is no expectation of a finite number of views in the application; here our proposed stopping criteria are useful, and the creeper continues for a fixed number of iterations or until no new views are discovered.

8 Functional and Non-functional Testing Opportunities in Smart TV Applications

To test a functional or non-functional requirement in a smart TV app, we need a measure. This measure can be used in the test generation process as a coverage criterion and also in the design of the test oracle. While this is straightforward for functional requirements, converting a non-functional requirement into an exact measure is a tricky task; here, an approximation can be useful. Many problems could be addressed. For example, determining the minimum hardware requirements for a specific smart TV application would be an interesting topic to investigate: most smart TV devices on the market today rely on CPUs and memory with low computational power, and extra hardware may be used to measure the energy consumption of the CPU during the testing process.
Covering the event interactions at different levels is also an interesting functional testing target. Here, full, partial, or systematic coverage of the events is a decision that must be made by the tester, and a comparison of these three coverage criteria is an important study topic to learn which approach is better for fault finding.

The limitations in memory and CPU lead to another interesting non-functional requirement that may be used in the testing process: execution time. It would be interesting to know which situations and sequences in a smart TV application cause long or short execution times. This could also be useful for identifying and detecting security vulnerabilities; in fact, security is an essential issue in smart TV applications that has not been addressed before.

Probably the most essential non-functional requirement to be addressed in smart TV applications is usability. Because interaction relies on the remote device, usability testing is necessary; indeed, the remote device remains the main constraint on the usability of smart TV applications. At this early research stage, it is useful to address how to make the applications more usable and which factors affect usability. User-oriented testing techniques may be more realistic here; however, an automated testing method could support the final usability testing report.

9 Conclusion and Future Work

In this paper, we have presented the key ingredients, challenges, and some proposed solutions for smart TV app testing. We think that in the near future, smart TV apps will be an essential piece of software in the whole context of IoT services. Despite this importance, there is no systematic and robust testing strategy for smart TV apps in the literature. After an extensive study of these applications, we discovered many open problems and challenges, which we have illustrated in this paper. We found that the most crucial problem to be solved is the test generation strategy. We proposed a fully automated framework to test smart TV apps, and we illustrated our EvoCreeper algorithm, which creeps over the views available in the application window. The algorithm uses an iterative, evolutionary style to discover new views, and its output serves as input to the test generation strategy that produces the necessary test cases for the automated testing framework. Depending on the testing process, there are many further opportunities in smart TV app testing; for example, security, usability, scalability, and robustness testing are essential issues that have not been addressed in the literature. Our proposed framework is also useful for these non-functional properties, simply by altering the test oracle and test generator components. As future work, we plan to present a more comprehensive strategy together with testing results for different smart TV apps.

Acknowledgment. This research is conducted as a part of the project TACR TH02010296 Quality Assurance System for Internet of Things Technology.

References 1. Jung, K.S.: The prospect of Smart TV service. Inf. Commun. Mag. 28(3), 3–7 (2011) 2. Zein, S., Salleh, N., Grundy, J.: A systematic mapping study of mobile application testing techniques. J. Syst. Softw. 117(C), 334–356 (2016) 3. Sahinoglu, M., Incki, K., Aktas, M.S.: Mobile application verification: a systematic mapping study, pp.
147–163. Springer, Heidelberg (2015) 4. Amal?tano, D., Fasolino, A.R., Tramontana, P., Robbins, B.: Chapter 1 - testing android mobile applications: challenges, strategies, and approaches. In: Advances in Computers, vol. 89, pp. 1–52. Elsevier (2013) 5. Banerjee, I., Nguyen, B., Garousi, V., Memon, A.: Graphical user interface (GUI) testing: systematic mapping and repository. Inf. Softw. Technol. 55(10), 1679–1694 (2013) 6. Li, Y.-F., Das, P.K., Dowe, D.L.: Two decades of web application testing-a survey of recent advances. Infor. Syst. 43(C), 20–54 (2014) 7. Amal?tano, D., Fasolino, A.R., Tramontana, P., Ta, B.D., Memon, A.M.: Mobi-guitar: automated model-based testing of mobile apps. IEEE Softw. 32(5), 53–59 (2015) 8. Nguyen, B.N., Robbins, B., Banerjee, I., Memon, A.: Guitar: an innovative tool for automated testing of GUI-driven software. Autom. Softw. Eng. 21(1), 65–105 (2014) 9. Cui, K., Zhou, K., Song, H., Li, M.: Automated software testing based on hierar-chical state transition matrix for Smart TV. IEEE Access 5, 6492–6501 (2017) 10. Ingrosso, A., Volpi, V., Opromolla, A., Sciarretta, E., Medaglia, C.M.: UX and usability on Smart TV: a case study on a T-commerce application, pp. 312–323. Springer, Cham (2015) 11. Sabina, K.C.: De?ning a testing platform for Smart TV applications. Bachelor thesis, Helsinki Metropolia University of Applied Sciences, January 2016 12. Bluttman, K., Cottrell, L.M.: UX and usability on Smart TV: a case study on a T-commerce application. McGraw Hill Professional, Cham (2012) 13. Murgrabia, M.: Design considerations for Vewd app store applications (2017). Accessed 5 Dec 2017 14. Deng, L., O?utt, J., Ammann, P., Mirzaei, N.: Mutation operators for testing android apps. Inf. Softw. Technol. 81(C), 154–168 (2017) 15. Moran, K., Tufano, M., Bernal-Cardenas, C., Linares-Vasquez, M., Bavota, G., Vendome, C., Di Penta, M., Poshyvanyk, D.: Mdroid+: a mutation testing frame-work for android. In: 40th International Conference on Software Engineering (ICSE) (2018) 16. Ma, Y.-S., O?utt, J., Kwon, Y.R.: MuJava: an automated class mutation system: research articles. Softw. Test. Verif. Reliab. 15(2), 97–133 (2005) 17. Barr, E.T., Harman, M., McMinn, P., Shahbaz, M., Yoo, S.: The oracle problem in software testing: a survey. IEEE Trans. Softw. Eng. 41(5), 507–525 (2015) Dynamic Evolution of Simulated Autonomous Cars in the Open World Through Tactics Joe R. Sylnice and Germ´ an H. Alf´erez(B) School of Engineering and Technology, Universidad de Montemorelos, Apartado 16-5, Montemorelos, N.L. 67500, Mexico 1140134@alumno.um.edu.mx, harveyalferez@um.edu.mx Abstract. There is an increasing level of interest in self-driving cars. In fact, it is predicted that fully autonomous cars will roam the streets by 2020. For an autonomous car to drive by itself, it needs to learn. A safe and economic way to teach a self-driving car to drive by itself is through simulation. However, current car simulators are based on closed world assumptions, where all possible events are already known as design time. Nevertheless, during the training of a self-driving car, it is impossible to account for all the possible events in the open world, where several unknown events may arise (i.e., events that were not considered at design time). Instead of carrying out particular adaptations for known context events in the closed world, the system architecture should evolve to safely reach a new state in the open world. 
In this research work, our contribution is to extend a car simulator trained by means of machine learning so that it evolves at runtime with tactics when the simulation faces unknown context events.

Keywords: Autonomous car · Tactics · Dynamic evolution · Open world · Machine learning

1 Introduction

A human driver learns by practicing how to drive and how to detect problems in the car and on the road. It is basically the same for autonomous cars: these cars learn how to drive from historical data. However, a self-driving vehicle is very expensive to build and maintain. In fact, there are reports that NVIDIA is selling its self-driving processing unit for about $15,000 [1], which is very expensive considering that this is the price of the processing unit alone. Also, it is dangerous and careless to unleash a self-driving car without proper training and testing. Simulations for proving new approaches in autonomous cars could solve the aforementioned problems in the academic world, and especially in developing countries with limited financial resources.

In the closed world, all the possible context events are known beforehand (i.e., at design time or during training under a machine-learning approach). However, in the open world, unknown context events can arise (e.g., a sudden malfunction in one of the car sensors). This kind of event has to be handled efficiently in order to prevent problems for the driver and passengers. Moreover, although there are open-source simulators, these simulators do not manage uncertainty in the open world.

In this research work, our goal is to extend the applicability of machine learning by means of tactics to carry out the dynamic evolution of simulated autonomous cars in the open world. Tactics are last-resort survival actions to be used when the simulated car does not have predefined adaptation actions to deal with arising problematic context events in the open world [2]. In order to apply tactics in the open world, the source code of a car video game was modified. First, the car was trained with the following supervised learning algorithms: K-Nearest Neighbors, Logistic Regression, Support Vector Machines, and Decision Trees. Then, unknown context events were injected at runtime to evaluate how the car faces those events with tactics.

This paper is organized as follows. Section 2 presents the justification for this research work, Sect. 3 the underpinnings of our approach, and Sect. 4 related work. Section 5 presents the results. Finally, Sect. 6 presents the conclusions and future work.

2 Justification

The research field of self-driving cars is a hot topic nowadays. However, the technology behind a self-driving car relies heavily on state-of-the-art software and very expensive hardware. That is why simulation tools are increasingly used in the field: they provide the mechanisms to test and evaluate the system of a self-driving car without having to buy (or even damage) very expensive hardware [3]. Predefined adaptation actions for known context events in the closed world are not enough in the open world, where several unknown context events can arise. Despite the recognized need for handling unexpected events in self-adaptive systems (SAS) [4], the dynamic evolution of SAS in the open world is still an open and challenging research topic.
In order to visualize the impact of unknown context events in the open world, let us imagine a self-driving car that has been trained with machine learning. The training was carried out with datasets composed of known historical data (e.g. data related to sonar and LiDAR sensors). In other words, the training was applied in the closed world. However, at runtime several unknown events may arise in the open world. For instance, although the sensors are highly cali-brated and thoroughly revised, it is possible that a sensor starts recording inac-curate data (e.g. because of a broken sonar sensor). This is a dangerous situation because inaccurate data could lead to an accident. If the car was not trained to face this kind of situations, then the following question arises: what will the car do? In order to answer this question, in addition to applying machine learning to train self-driving cars, it is necessary to count on mechanisms to lead the car to make the best decision despite unknown context events. Dynamic Evolution of Simulated Autonomous Cars in the Open World 259 3 Underpinnings of Our Approach Our approach is based on the following concepts (Fig. 1). Fig. 1. Underpinnings of our approach. 3.1 Machine Learning Machine learning can be de?ned as computational methods using experience to improve performance or to make predictions accurately. Experience can refer to past data that is used by the learner. The quality and size of the data are very important for the accuracy of the predictions made by the learner [5]. 3.2 Tactics Tactics are last-resort surviving actions to be used when a system does not have prede?ned adaptation actions to deal with arising problematic context events in the open world [2]. The use of tactics is common in sports, war, or even in daily matters to accomplish an end. For example, the most important goal during a battle is to win. However, unknown or unforeseen events, such as sur-prise assaults, may arise. These events may negatively a?ect the expected goal. Therefore, it is necessary to choose among a set of tactics to reach the goal (e.g. to escape vs. to do a frontal attack). Tactics are prede?ned at design time and are used at runtime to trigger the dynamic evolution of the self-driving car. The tactics are required to be known beforehand in order for the self-driving car to face uncertainty. However, these tactics are not associated with any speci?c recon?guration actions (as dynamic adaptation does) [6]. 3.3 Dynamic Evolution A self-driving car has to go from dynamic adaptation in the closed world to dynamic evolution in the open world in order to respond to unforeseen ongoing events. Dynamic adaptation can be referred to as punctual changes made to face particular events by activating and deactivating system features based on the current context. Meanwhile, dynamic evolution is not just about applying punctual adaptations to concrete events but it is the gradual growth of the system to a better state depending on the current context events [2]. 260 J. R. Sylnice and G. H. Alf´erez 3.4 Open World Open world can be referred to as a context where events are unpredictable, requiring that software reacts to these events by adapting and organizing its behavior by itself [7]. As far as we know, current simulated autonomous cars are based on the closed world assumption where the relationship between the car and the surroundings are known and unchanging. 
Nevertheless, in the open world where the aforementioned relationship is unknown, unpredictable, and constantly changing, the simulated car has to be able to evolve. 4 Related Work A fully autonomous car or self-driving vehicle is a car that is designed to be able to do all the work of maneuvering the car without the passenger never having to or is not expected to take control of the car at any time or any given moment [8]. A self-driving vehicle has to be able to identify faults in its system. If the faults are critical, the vehicle has to either ?x these faults or isolate them so that the system is not compromised [9]. Self-driving vehicles are equipped with state-of-the art sensors and cameras. Also, they use powerful software behind the hardware to maneuver themselves. The software learns how to drive through machine learning and the software sees through computer vision. There are several self-driving cars in development. For example, the Google Car is being developed by Google. Google hopes to have self-driving cars on the road by 2020. However, this company does not intend to become a car manufacturer. Uber also entered the world of self-driving cars in April 2015. In addition, Tesla expects to launch a fully autonomous car anytime in 2018. Also, in April 2015, BMW has partnered with Baidu the “Chinese Google”, to develop self-driving technology. There are several research works that propose simulations of autonomous cars. For instance, in [10] the authors propose a shader-based sensor to simulate the LiDAR and Radar sensors instead of the common method of ray tracing. They mention that sensor simulations are very important in the ?eld of self-driving cars. In this way, the sensors can be evaluated, tested and optimized. The authors state that ray tracing is an intensive task for the CPU. It is not problematic when the number of simulated rays and detected objects are small. However, in reality it becomes problematic or even impossible. According to the authors, a shader-based sensor simulation is an e?cient alternative to ray casting because it uses parallelism in the GPU and this helps in sparing CPU resources that the software can use in other areas. In [11], the authors mention that they have used a simulation tool called Scene Suite to generate simulated scenes of tra?c scenarios. The tool allows 2.5D simulations and uses patented virtual sensor models. The goal of this work is to show how the data from real world sensor models could be extracted and then to simulate the results using a scene based pattern recognition. Also, this paper introduced an approach for learning sensor models with a manageable Dynamic Evolution of Simulated Autonomous Cars in the Open World 261 demand on computational power based on a statistical analysis of measurement data clustered into scene primitives. In [12], the authors focus on the use of the agent-based simulation framework MATsim and how it could be applied to the ?eld of self-driving cars. Agent-based simulations are state-of-the-art transport models. Agent-based approaches com-bine activity-based demand generation and dynamic tra?c assignments. MAT- Sim is a simulation of multi-agent transport based on activity. It is an open source framework written in JAVA under the GNU license. MATSim’s strength is the modular design around a core, allowing new users to customize it without much e?ort. 
This work is based on the simulation of autonomous vehicles in a realistic environment at a large scale with individual travelers (vehicles) that adapt their movement dynamically with the others. In [13], the author uses an open source simulator to carry out the evaluation and application of a reinforcement learning approach to the problem of control-ling the steering of a vehicle. Reinforcement Learning (RL) is an area of machine learning in which an agent is placed into a certain environment and is required to learn how to take proper actions without having any previous knowledge about the environment itself. If the agent’s behavior is right, it is rewarded. If the behavior is wrong, the agent is punished. This learning system of reinforcement learning is called trial and error. In order to evaluate this approach, the Open Racing Car Simulator (TORCS) was used. In the TORCS environment a car is referred to as a Robot. In [3], the authors use an integrated architecture that is comprised of both a tra?c simulator and a robotics simulator in order to contribute to the self-driving cars simulation. Speci?cally, the proposed approach uses the tra?c sim-ulator SUMO and the robotics simulator USARSim. These tools are open source and have good community support. In one hand, SUMO is a microscopic road tra?c simulator written in C++. It was designed by the Institute of Transporta-tion Systems at the German Aerospace Center to handle large road networks. On the other hand, USARSim is an open-source robotics simulator written in Unreal Script, which is the language of the Unreal game engine. It has high qual-ity sensor simulation and physics rendering. The authors modi?ed the SUMO and USARSim simulators in order to be able to implement the architecture for the self-driving car simulation. The result is a simulator in which a self-driving vehicle can be deployed in a realistic tra?c ?ow. In [14], the authors describe the global architecture of the simulation/proto-typing tool named Virtual Intelligent Vehicle Urban Simulator (VIVUS) devel-oped by the SeT Laboratory. The VIVUS simulator simulates vehicles and sen-sors. It also takes into account the physical properties of the simulated vehicle while prototyping the arti?cial intelligence algorithms such as platoon solutions and obstacle avoidance devices. The goal of VIVUS is therefore overcoming the general drawbacks of classical solutions by providing the possibility of designing a vehicle virtual prototype with simulated embedded sensors. In [15], the authors combine a tra?c simulator and a driving simulator into an integrated framework. They have used the driving simulator SCANeR 262 J. R. Sylnice and G. H. Alf´erez developed by Renault and Oktal, and the AIsum tra?c simulator developed by TSS-Transport Simulation Systems. The framework enables a driver to use the simulator with a local tra?c situation managed by a nano tra?c model that is realistic for the driver and that also provides a realistic global tra?c situation in terms of ?ow and density. The framework can provide information on the simu-lated vehicles and the tra?c situation for the short-ranged sensors: camera and radar and also the long-ranged sensors: wireless and embedded navigation. It also enables the driver and other systems to be involved in an extensive assortment of tra?c situations, accidents, rerouting, road-work zones, and so on. 
5 Results 5.1 Methodology This project has been broken down in the following steps: Looking for an Open Source Car Simulator: To ?nd the open source car simulator, Google Search was used with the term “open source car simulator” in December 2017. The following is the list of the open source car simulators found: – TORCS1 : TORCS is a multi-platform car racing simulation. It is used as an ordinary car racing game, as an arti?cial intelligence (AI) racing game, and as a research platform. – Apollo2 : Apollo is an open-source autonomous driving platform created by Baidu. It has a high performance and ?exible architecture that supports fully autonomous driving capabilities and also has car simulation functionalities. – Udacity’s Self-Driving Car Simulator3 : This simulator was built for Udacity’s Self-Driving Car nanodegree to teach students how to train cars and how to navigate road courses using deep learning. Comparing Di?erent Open Source Car Simulators: The criteria for choosing the car simulator were the following: (1) it had to be open source to ?nd the points in which it could be extended; (2) it had to be mature enough in terms of documentation; (3) it had to be supported by the developer commu-nity; and (4) it had to be easily extensible in terms of programming. The results of the comparison are as follows: 1. TORCS meets three of the four criteria. Although, it is open source, mature, well known in the scienti?c world, and is greatly supported by the developer community, it misses the fourth criteria because it is not easily extensible in terms of programming. 1 http://torcs.sourceforge.net/index.php?name=Sections&op=viewarticle&artid=1. 2 https://github.com/ApolloAuto/apollo. 3 https://github.com/udacity/self-driving-car-sim. Dynamic Evolution of Simulated Autonomous Cars in the Open World 263 2. Apollo is a fully ?edged open autonomous driving platform that meets two of our criteria: it is open source and mature. However, it is a fully autonomous driving platform, much more complex than a simulator. Also, since it was released a couple of months prior to our search, it does not yet have a wide developer community support. Also, the documentation, written in Chinese is not yet translated. 3. Udacity’s self-driving car simulator falls short when it comes to documen-tation. As a result, although it is an open source software, the lack of free documentation makes it di?cult to extend the code. According to the evaluation, none of these simulators ful?lled our needs. There-fore, instead of searching for open source autonomous car simulators, we looked for an open source car game, which could be trained by means of machine learn-ing and extended for usage in the open world. We found an open source car game named Lapmaster4 . It is a simple car game designed with the pygame Python library. It consists of a car running around a circuit for a certain amount of laps. Also, the player is able to shift the gears. The goal of the game is to complete the laps as fast as possible. Fig. 2 shows a screenshot of this game. Fig. 2. Screenshot of the Lapmaster game. 4 http://pygame.org/project-Lap+Master-2923-4798.html. 264 J. R. Sylnice and G. H. Alf´erez Extending the Car Simulator: In this step, the Lapmaster car simulator was extended for the open world. Speci?cally, two steps were carried out: (1) collecting data from the context of the car for training; and (2) training the simulated car with machine learning. These steps are described as follows. 1. 
Collecting data from the context of the car: The source code of the car game was modified to collect the position (x and y coordinates) and the direction (0 = forward, 1 = right, 2 = left) of the car in every frame. Listing 1.1 shows the modified lines of the car's source code. On line 1, a while loop indicates that the code is executed while the car simulator is running. On line 2, the program detects which key is pressed. Starting at line 3, if the car is moving, the program checks whether the "d" key (right) or the "a" key (left) is pressed and sets the direction accordingly; if neither key is pressed, the direction is recorded as 0. Line 12 builds the l_data list with three values: the x and y coordinates and the direction. On lines 13 and 14, if the l_data list is not empty, it is passed to the Writer function together with the path of the log in which the contextual data is written. Listing 1.2 presents the Writer function, which writes the data in comma-separated values (CSV) format. The resulting CSV file contains 4,149 instances, obtained by running the game four times. The x and y coordinates were taken as the features for training, and the direction as the class.

1   while running:
2       key = pygame.key.get_pressed()
3       if red.gear > 0:
4           if key[K_d]:
5               red.view = (red.view + 2) % 360
6               d = 1
7           elif key[K_a]:
8               red.view = (red.view + 358) % 360
9               d = 2
10          else:
11              d = 0
12          l_data = [red.xc, red.yc, d]
13          if l_data:
14              data.Writer(l_data, path)

Listing 1.1. A fragment of the modified code of the Lapmaster source file.

import csv

def Writer(data, path):
    # Append one sample (x, y, direction) as a row of the CSV log.
    with open(path, "a") as c_file:
        write = csv.writer(c_file, delimiter=',')
        write.writerow(data)

Listing 1.2. Implemented function for data writing.

2. Training the simulated car: For the training of the simulated car, four supervised machine learning algorithms from the scikit-learn5 Python library were employed. The algorithms are the following [16]:

(a) K-Nearest Neighbor (KNN): It is a simple algorithm that stores all available cases and classifies new cases by a majority vote of its k nearest neighbors.

(b) Logistic Regression (LR): It is a classification algorithm used to estimate discrete values based on a given set of independent variables. It predicts the probability of occurrence of an event by fitting the data to a logit function.

(c) Support Vector Machine (SVM): In this classification algorithm, each data point is plotted in an n-dimensional space (n being the number of features), where the value of each feature is the value of a particular coordinate. A separating hyperplane (or decision boundary) then splits the data points into two or more groups. The further a data point lies from the decision boundary, the more confident the algorithm is about the prediction; the data points closest to the separating hyperplane are known as support vectors.

(d) Decision Trees (DT): In this classification algorithm, the data is split into two or more homogeneous sets based on the most significant attributes that make the sets distinct.

5 http://scikit-learn.org/stable/#.

The following are the steps used to train the simulated car: (1) a user ran the game to generate a dataset; (2) the KNN, LR, SVM, and DT algorithms were executed to obtain a classification for each class.
The classes were 0 for forward, 1 for right, and 2 for left; (3) the models were evaluated in terms of cross-validation; and (4) the simulated car was extended to use the most accurate classifier.

A fragment of the script that generates the classification models from the collected data is presented in Listing 1.3. The first line declares a list containing the information of the four classifiers used in the experiments. Next, a for loop iterates over this list in order to train and generate a model for each algorithm. Line 9 specifies the location and the name of the model that is going to be trained. On lines 12 and 13, the program splits the data into training and test sets; the code on line 11 indicates that the values are taken randomly from the dataset. On lines 14–16, a classification model is fitted and its cross-validation score is computed. On line 17, the mean accuracy of each algorithm is calculated. On lines 18 and 19, each model is evaluated on the test set and a classification report is generated. Finally, on line 21, the model generated by each algorithm is saved.

1   classifiers = [
2       ('kNN', KNeighborsClassifier(n_neighbors=4)),
3       ('LR', LogisticRegression()),
4       ('SVM', SVC()),
5       ('DT', DecisionTreeClassifier())
6   ]
7
8   for name, clf in classifiers:
9       filename = 'models/%s_%s.pickle' % (name, data.filename)
10      print('training: %s' % name)
11      rs = np.random.RandomState(42)
12      X_train, X_test, y_train, y_test = \
13          train_test_split(X, y, test_size=0.2, random_state=rs)
14      model = clf.fit(X_train, y_train)
15      cv = cross_val_score(clf, X_test, y_test, cv=10,
16                           scoring='accuracy')
17      acc = np.mean(cv)
18      predictions = clf.predict(X_test)
19      report = classification_report(y_test, predictions)
20      print('training %s done... acc = %f' % (name, acc))
21      pickle.dump(model, open(filename, 'wb'))
22      bm.append('%s %s' % (name, report))

Listing 1.3. A fragment of the code to train and generate the classification models.

Injecting Dynamic Evolution Through Tactics: In this step, we emulated a malfunctioning sonar sensor. This situation can cause accidents, since the car is no longer able to "see" its environment (e.g., other cars) properly. To trigger this event, a button on the keyboard is pressed. When the car system recognizes that an unknown context event has arisen, the "decelerate" tactic is triggered. This tactic progressively slows down the car until it reaches a full stop; the reasoning behind it is to prevent the car from driving on without properly detecting its surroundings. The implemented tactic is shown in Listing 1.4: when the "s" key is pressed on the keyboard, the slow variable is set to True to indicate that the car has to reduce its speed until it fully stops.

slow = False

key = pygame.key.get_pressed()
if key[K_s]:
    slow = True
if slow:
    # Progressively decelerate, faster in higher gears.
    red.speed = .95 * red.speed - .05 * (2.5 * red.gear)

Listing 1.4. A fragment of the source code for the decelerate tactic.

5.2 Outcomes

The accuracy of the models generated with the four algorithms is as follows: kNN = 0.9313, LR = 0.8927, SVM = 0.8927, DT = 0.929. Table 1 shows the cross-validation results of each model generated with the four classifiers. Only two classes are shown in Table 1, 0 for forward and 1 for right, because the circuit in the Lapmaster game has only right turns.
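Step (4) of the training procedure, extending the simulated car to use the most accurate classifier, is not shown in the fragments above. A minimal sketch of what it could look like follows; the pickle file name is a hypothetical stand-in for a model saved by Listing 1.3.

import pickle

MODEL_PATH = "models/DT_drive_log.pickle"  # hypothetical Decision Tree model file

with open(MODEL_PATH, "rb") as f:
    model = pickle.load(f)

def predict_direction(x, y):
    # Map the car's current (x, y) position to a steering class:
    # 0 = forward, 1 = right, 2 = left.
    return int(model.predict([[x, y]])[0])

# Inside the game loop, the prediction can replace the keyboard input, e.g.:
#   d = predict_direction(red.xc, red.yc)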
Although the kNN algorithm has the best accuracy, the DT algorithm has better results in terms of precision, recall, and F1-score. These three terms are defined as follows [17]:

– Precision is the ability of the classifier not to label as positive a sample that is negative.
– Recall is the ability of the classifier to find all the positive samples.
– F1-score is a weighted mean of precision and recall.

5.3 Discussion

We published a video6 in which the "decelerate" tactic is effectively triggered at runtime. Although machine learning works fine in the closed world, i.e., where there are no unknown events (e.g., malfunctioning sensors), in the open world it is necessary to have additional mechanisms to face uncertainty. Therefore, we argue that autonomous cars trained by means of machine learning need to be extended with highly general tactics that try to defend the car in extreme conditions of uncertainty.

6 www.harveyalferez.com/autonomous-car-demo.html.

Table 1. Report for each of the algorithm models.

      Class      Precision  Recall  F1-score
kNN   0          0.95       0.99    0.97
      1          0.83       0.56    0.67
      Avg/Total  0.94       0.94    0.94
LR    0          0.89       1.00    0.94
      1          0.00       0.00    0.00
      Avg/Total  0.80       0.89    0.84
SVM   0          0.90       1.00    0.95
      1          1.00       0.03    0.07
      Avg/Total  0.91       0.90    0.85
DT    0          0.97       0.98    0.97
      1          0.82       0.71    0.76
      Avg/Total  0.95       0.95    0.95

6 Conclusions and Future Work

This research work extended the applicability of machine learning by means of tactics to carry out the dynamic evolution of a simulated self-driving car in the open world. To this end, four classifiers were executed and four models were generated and evaluated; the DT model was used in the simulated car after evaluation. Then, a tactic to face a simulated unknown context event in the open world was implemented. This tactic was used to prevent a situation in which the lives of the passengers could be put in jeopardy. Since this research work was limited to the implementation and application of one tactic, as future work we would like to propose additional tactics. For example, tactics related to non-functional requirements, such as availability and performance, could be used to keep or improve service levels. These tactics could also be handled during execution by means of models at runtime, as proposed in our previous work [2]. Moreover, we plan to test our approach on other tracks in which complex unknown context events could arise.

References 1. Frederic, L.: All new Teslas are equipped with NVIDIA's new drive PX 2 AI platform for self-driving. https://goo.gl/xNSo8B 2. Alférez, G.H., Pelechano, V.: Achieving autonomic web service compositions with models at runtime. Comput. Electr. Eng. 63, 332–352 (2017) 3. Pereira, J.L., Rossetti, R.J.: An integrated architecture for autonomous vehicles simulation. In: Proceedings of the 27th Annual ACM Symposium on Applied Computing, pp. 286–292. ACM (2012) 4. Cheng, B.H., De Lemos, R., Giese, H., Inverardi, P., Magee, J., Andersson, J., Becker, B., Bencomo, N., Brun, Y., Cukic, B., et al.: Software engineering for self-adaptive systems: a research roadmap. Software engineering for self-adaptive systems, pp. 1–26. Springer, Heidelberg (2009) 5. Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. MIT Press (2012) 6. Alférez, G.H., Pelechano, V.: Facing uncertainty in web service compositions. In: 2013 IEEE 20th International Conference on Web Services (ICWS), pp. 219–226. IEEE (2013) 7.
Baresi, L., Di Nitto, E., Ghezzi, C.: Toward open-world software: issues and chal-lenges. Computer 39(10), 36–43 (2006) 8. Coles, C.: Automated vehicles: a guide for planners and policymakers (2016) 9. Maurer, M., Gerdes, J.C., Lenz, B., Winner, H.: Autonomous driving: technical, legal and social aspects. Springer, Heidelberg (2016) 10. Wang, S., Heinrich, S., Wang, M., Rojas, R.: Shader-based sensor simulation for autonomous car testing. In: 2012 15th International IEEE Conference on Intelligent Transportation Systems, pp. 224–229. IEEE (2012) 11. Simon, C., Ludwig, T., Kruse, M.: Extracting sensor models from a scene based simulation. In: 2016 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI), pp. 259–264. IEEE (2016) 12. Boesch, P.M., Ciari, F.: Agent-based simulation of autonomous cars. IEEE Am. Control Conf. (ACC) 2015, 2588–2592 (2015) 13. Piovan, A.G.: A neural network for automatic vehicles guidance. ACE 10, 2 (2012) 14. Gechter, F., Contet, J.-M., Galland, S., Lamotte, O., Koukam, A.: Virtual intel-ligent vehicle urban simulator: application to vehicle platoon evaluation. Simul. Modell. Pract. Theory 24, 103–114 (2012) 15. That, T.N., Casas, J.: An integrated framework combining a tra?c simulator and a driving simulator. Procedia-Soc. Behav. Sci. 20, 648–655 (2011) 16. Harrington, P.: Machine Learning in Action. Manning Publications (2012) 17. Scikit-Learn: sklearn.metrics.precision recall fscore support. https://goo.gl/ 4xxkGJ Exploring the Quanti?ed Experience: Finding Spaces for People and Their Voices in Smarter, More Responsive Cities H. Patricia McKenna(?) AmbientEase and the UrbanitiesLab, Victoria, BC V8V 4Y9, Canada mckennaph@gmail.com Abstract. The objective of this paper is to explore the quanti?ed experience in the context of ?nding spaces for people and their voices in smarter and more responsive cities. Using the construct of awareness, this exploration is situated theoretically at the intersection of a?ective computing, social computing, and pervasive computing. This paper problematizes the quanti?ed experience in human computer interactions (HCI), arguing for smart and responsive cities to be enabled by more aware people interacting with and in?uencing aware technolo- gies. Aware people and aware technologies refer to the dynamic interweaving of sensing, sensors, and sensor networks through the Internet of Things (IoT), the Internet of People (IoP), and the Internet of Experiences. The methodology for this paper includes an exploratory case study approach and the research design incorporates multiple methods of data collection including survey and interviews. Findings from this work highlight the need for qualitative data using content analysis and other analytic techniques to augment, complement, and enhance the quantitative data being generated and gathered in urban spaces. This work is signi?cant in that it: (a) explores elements of the contemporary urban quanti?ed experience through the lens of awareness and the sub-constructs of adaptability and openness; (b) advances a framework for people-aware quanti?ed experiences in support of spaces for people and their voices in smarter, more responsive cities; and (c) further develops and innovates the research and practice literature for smart and responsive cities, in relation to people-aware quanti?ed experiences. 
Keywords: A?ective computing · Awareness Human Computer Interactions (HCI) · Internet of Experiences Internet of Things (IoT) · Internet of People (IoP) · Pervasive computing Quanti?ed experience · Responsive cities · Sensing and sensor networks Smart cities · Social computing 1 Introduction The main objective of this paper is to explore the quanti?ed experience in the context of ?nding spaces for people and their voices in smarter and more responsive cities. This work problematizes the quanti?ed experience in human computer interactions (HCI), arguing for smart and responsive cities to be enabled by more aware people interacting © Springer Nature Switzerland AG 2019 K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 269–282, 2019. https://doi.org/10.1007/978-3-030-02686-8_22 with and in?uencing aware technologies. Aware people and aware technologies refer to the dynamic interweaving of sensing, sensors, and sensor networks through the Internet of Things (IoT), the Internet of People (IoP), and the Internet of Experiences. Using the construct of awareness to explore the quanti?ed experience, this work is situated theo- retically at the intersection of a?ective computing, social computing, and pervasive computing. Methodologically, an exploratory case study approach is used in this work and the research design incorporates multiple methods of data collection including survey and interviews. Additional details about the methodology are provided in Sect. 3 of this paper. Brie?y, data were gathered from diverse individuals across multiple small to medium to large sized cities in several countries. Content analysis was used in the analysis of qualitative data and descriptive statistics in the analysis of quantitative data. A literature review was conducted for the Internet of Things, People, and Experi- ences and the complementing of quanti?ed experiences in the context of smart and responsive cities. The literature review enabled formulation of a theoretical perspective for this work. This work is signi?cant in that it: (a) explores elements of the contempo- rary urban quanti?ed experience through the lens of awareness and the sub-constructs of adaptability and openness; (b) advances a framework for people-aware quanti?ed experiences in support of spaces for people and their voices in smarter and more respon- sive cities; and (c) further develops and innovates the research and practice literature for smart and responsive cities, in relation to people-aware quanti?ed experiences. In the context of smart cities, future cities, and rapid urbanization globally, the need for a new urban agenda is advanced by the UN [1] that is, among other things, “people-centered and measurable”. Konomi and Roussos [2] observe a movement beyond the earlier conception of smart cities that emerged over the last decade “towards a deeper level of symbiosis among smart citizens, Internet of Things and ambient spaces”. Gold- smith and Crawford [3] advance the notion of responsive cities, leveraging digital tech- nologies and data analytics in combination with civic engagement and governance. In relation to the digital and aware technologies of sensing, sensors, and the Internet of Things (IoT), Hotho et al. [4] de?ne sensor using the Oxford English Dictionary, as “a device which detects or measures a physical property and records, indicates, or otherwise responds to it”. Hotho et al. 
[4] extend this de?nition to encompass “technological sensors as well as human sensors” and sensing that “relates to the psychosocial envi- ronment” as in “sensing danger”, as well as enabling “a higher level of integration and interpretation of di?erent external and internal signals”. Friberg [5] combines the notion of atmosphere and aesthetic education to propose an approach to the exploration of performing everyday practices in relation to an awareness of the sensorial and bodily in urban spaces. As such, the multi-sensorial capabilities of people described by Lévy [6] from a human geography perspective emerge as awareness, an important form of sensing. This introduction and background gives rise to the main research question under exploration in this work using the construct of awareness and the sub-constructs of adaptability and openness. Q1: How and why do people ?gure strongly in the making of more aware, adaptive, and open analytic spaces to complement existing approaches to quanti?ed experience in contemporary urban environments? 270 H. P. McKenna In summary, the primary purpose of this paper is to explore, innovate, and extend spaces for theoretical and practical debate for quanti?ed experiences in ways that involve people more directly, knowingly, and creatively. What follows is the development of a theoretical perspective for this work in the formulation of a conceptual framework for more people-aware quanti?ed experiences. The framework will then be operationalized for use in this work using quantitative data complemented with qualitative data. The methodology for this work is described and the ?ndings are presented along with an analysis and discussion. The limitations and mitigations of the work are discussed and future directions are identi?ed, followed by the conclusion. 2 Theoretical Perspective A review of the research literature was conducted for smart and responsive cities; the Internet of Things, the Internet of People, and the Internet of Experiences; and oppor- tunities for complementing the quanti?ed experience. This theoretical perspective forms the basis for the formulation of a conceptual framework for more people-aware quan- ti?ed experiences. 2.1 Smart and Responsive Cities Townsend [7] describes smart cities as “places where information technology is combined with infrastructure, architecture, everyday objects and even our bodies, to address social, economic, and environmental problems”. Kyriazopoulou [8] provides a literature review of architectures and requirements for the development of smart cities, highlighting the sectors identi?ed by Gi?nger et al. [9] of smart economy, people, governance, mobility, environment, and living as the focus for improvement. According to Kyriazopoulou [8], “o?ering citizens a great experience” is a primary goal of smart cities. Gil-Garcia et al. [10] identify 14 dimensions in conceptualizing smartness in government such as citizen engagement, openness, creativity, technology savvy, and resilience, to name a few. According to Gil-Garcia et al. [10] citizen engagement “allows two-way communication and enables collaboration and participation, fostering stronger and more intelligent relationships” while resilience contributes to the ability to “adapt to change”. Khatoun and Zeadally [11] provide a smart city model consisting of the Internet of Things (IoT), the Internet of Services (IoS), the Internet of Data (IoD), and the Internet of People (IoP) where the IoP highlights smart living and smart people. 
2.2 Internet of Things, People, and Experiences Herzberg [12] describes the Internet of Things (IoT) as “a network that enables physical objects to collect and exchange data” while describing the Internet of Everything as “a future wherein devices, appliances, people, and process are connected via the global Internet”. Vilarinho et al. [13] describe the use of activity feeds in social computing as a uni?ed communication mechanism for connecting the IoT with the IoP. Li [14] main- tains that the IoP “refers to digital connectivity of people through the Internet Exploring the Quanti?ed Experience: Finding Spaces for People 271 infrastructure forming a network of collective intelligence and stimulating interactive communication among people”. An infrastructure is proposed by Miranda et al. [15] in support of “moving from the Internet of Things to the Internet of People” where “smart- phones play a central role, re?ecting their current use as the main interface connecting people to the Internet”. According to Miranda et al. [15] key principles of the IoP include: social, personalized, proactive, and predictable. Indeed, Miranda et al. [15] employs the IoP concept to draw “the IoT closer to people, for them to easily integrate into it and fully exploit its bene?ts.” Conti et al. [16] argue for “a radically new Internet paradigm” in the form of “the Internet of People (IoP)” in which people move beyond “end users of applications” to “become active elements of the Internet.” McKenna [17] explored the experience of contemporary city environments through urban edges, surfaces, spaces, and the in-between in an e?ort to “complement, extend, and enrich algorithmic and network views.” Wellsandt et al. [18] describe the Internet of Experiences (IoE) in terms of an experience-centered approach “to complement human-centered innovation with experiences from arti?cial systems.” 2.3 Complementing Quanti?ed Experiences The United Nations [1] notes that, “urban space is being reimagined” while Casini [19] calls for smart city initiatives to move beyond a focus on “individual areas” toward a more “integrated approach” taking advantage of “new enabling infrastructures” in combination with sensor technologies. In this way, cities are encouraged to build upon existing structures in “exploiting synergies and interoperability between systems to deliver added value services for citizens to improve their quality of life” [19]. Falcon and Hamamoto [20] claim that the mass amounts of data being generated in everyday life “through the Web” and “on city streets” are opening the way for “bodies of data together with algorithms” that “will shape who we think we are” and “who we will become.” As mentioned earlier, Gil-Garcia et al. [10] identify creativity and openness as two of 14 key drivers for conceptualizing smartness in government. It is worth noting that, according to Amabile [21], a component of creativity is the open-endedness or heuristic dimension as distinct from “having a single, obvious solution (purely algo- rithmic).” And Dourish [22] points out that, “our experience of algorithms can change as infrastructure changes.” McKenna et al. [23] explored the potential for the assessment of creativity through an adaptation of the Consensual Assessment Technique (CAT) for use in technology-pervasive learning environments. Using a social radio application as an example of a social media space, McKenna et al. 
[23] explored environments “characterized by awareness, autonomy, collaboration, and real time data analytics potential.” McKenna and Chauncey [24] introduced the CAT into library, information, and learning spaces, proposing the technique be adapted to accommodate the assessment of creativity, inno- vation, and value in everyday, in-the-moment activities. As such, the CAT was explored [24] in terms of involving people more directly and knowingly in new partnering and collaborative opportunities in relation to data and learning analytics. By extension, this current work proposes the consideration of similar techniques for more meaningfully and directly involving people in the analysis and assessment of quanti?ed experiences 272 H. P. McKenna in the context of smarter and more responsive cities. Indeed, Baumer [25] proposes a human-centered algorithm design (HCAD) to address gaps or disconnects between algorithm metrics focused on performance on the one hand and concerns with incorpo- rating “human and social interpretations” on the other. In making algorithmic design more people centered, Baumer [25] identi?es three approaches focused on the theoret- ical, speculative, and participatory. McKenna [26] explores “the three key enrichment mechanisms of awareness, creativity, and serendipity in the context of the IoT and the IoP” pointing to “the potential for a shift to occur” possibly opening new spaces “for the combining of algorithmic and heuristic activities” and the evolving of “algorithmic/ heuristic relationships in smart cities.” 2.4 Conceptualizing People-Aware Quanti?ed Experiences This theoretical background enables formulation of a conceptual framework for more people-aware quanti?ed experiences. As depicted in Fig. 1, the people-technologies-cities dynamic in public spaces, utilizes a combination of the Internet of Things (IoT), the Internet of People (IoP), and the Internet of Experiences (IoE), combining aware people and aware technologies, in the form of responsive, engaging, and evolving mechanisms and approaches contributing to greater awareness, adaptability, and open- ness for fostering future technology spaces with potentials for developing and accom- modating people-aware quanti?ed experiences. Fig. 1. Conceptual framework for people-aware quanti?ed experiences. The research question (Q1) identi?ed in Sect. 1 of this work is reformulated as a proposition for exploration in this paper, as follows P1: People and their multi-sensorial capabilities, in combination with aware technologies, enable the enhancing of sensing, sensors, and the Internet of Things, People, and Experiences contri- buting to greater awareness, adaptability, and openness in support of greater potentials for more creative and people-aware analytic spaces to complement existing approaches to quanti?ed experience in contemporary urban environments. Exploring the Quanti?ed Experience: Finding Spaces for People 273 3 Methodology An emergent, exploratory case study approach was used for this work, said to be partic- ularly appropriate for the study of contemporary phenomena [27]. Contemporary urban environments constituted the case for this study. In Sects. 3.1–3.3 a description of the process followed for this study is provided, the sources of evidence, and the data analysis techniques used. 3.1 Process A website was used to describe the study, invite participation, and enable sign up. Demographic data were gathered during registration for the study including location, age range, and gender. 
People were able to self-identify in one or more categories (e.g., educator, learner, community member, city o?cial, business, etc.). Registrants were invited to complete a survey containing 20 questions as an opportunity to think about smart cities in relation to awareness, adaptability, and openness for improved livability. In-depth interviews with participants enabled discussion of urban experiences and ideas about smart cities. A pre-tested survey instrument was used for this study as well as a pre-tested interview protocol. 3.2 Sources of Evidence This study attracted international interest with participants located mostly in small to medium to large sized cities in Canada (e.g., St. John’s, Ottawa, Greater Victoria), extending also to other countries such as Israel (e.g., Tel Aviv). Survey responses provided the main source of quantitative data for this study while interview data provided qualitative evidence for this study along with data provided in response to open-ended survey questions. Three questions common to both the survey instrument and interview protocol were adapted from Anderson’s [28] body insight scale (BIS), as a mechanism for exploring the human-centered sensing of cities as a form of awareness. By contrast, other scales such as that by Teixiera et al. [29] pertain to human sensing using computing technologies for the detection of elements such as presence, count, location, track, and identity. More appropriate for this study, the BIS scale was designed for “assessing subtle human qualities” and this body insight scale [28], formerly the body intelligence scale [28], consists of three subscales—energy body awareness (E-BAS); comfort body awareness (C-BAS); and inner body awareness (I-BAS). Anderson encourages use of the scale in other domains and as such, the BIS is explored in this work in relation to people and their experience of everyday urban environments. Also of note is the impor- tance of feeling and a?ect in human computer interactions where emotion is considered to be “a critical element of design for human experience” [30], applicable here in the context of smart and responsive cities. The three questions adapted for use in this work correspond to each of the BIS sub-scales and are slightly altered in terms of wording, as follows: 274 H. P. McKenna 1. Regarding your body awareness in your city, would you agree that your body lets you know when your environment is safe (On a scale of 1 to 5 on a continuum of disagree to agree)? 2. Regarding your comfort body awareness in the world, would you agree that you feel comfortable in the world most of the time (On a scale of 1 to 5)? 3. Regarding your inner body awareness in your city, would you agree that you can feel your body tighten up when you are angry (On a scale of 1 to 5)? In parallel with this study, evidence was also gathered through individual and group discussions with people from diverse sectors across multiple cities (e.g., Toronto, Vancouver, and Greater Victoria). Perspectives across the city emerged from those in business (architectural design, ecology, energy, information technology (IT), tourism), government (city councilors, policy makers, IT sta?), educators (secondary and post-secondary, researchers, IT sta?), students (post-secondary – engineering/design/ computing/education/media), and community members (IT professionals, urban engagement leaders, urban designers, and policy in?uencers). 
3.3 Data Analysis
Qualitative data were analyzed using the content analysis technique, involving inductive analysis to identify emerging terms from the data collected, while deductive analysis enabled the identification of terms emerging from the review of the research literature. Data were then analyzed for patterns and emergent insights. Descriptive statistics were used in the analysis of quantitative data. Qualitative evidence gathered from discussions in parallel with this study supported further analysis, comparison, and triangulation of data, contributing further insight and rigor. Overall, data were analyzed for an n = 61 spanning the age ranges of people in their 20s to their 70s, consisting of 39% females and 61% males.

4 Findings
The findings of this paper are presented in terms of the main construct of awareness with attention given to the sub-constructs of adaptability and openness in terms of the proposition explored in this work, in response to the research question.

4.1 Awareness
Regarding technology awareness, City IT staff described the IoT as "more about the instrumentation of things, with everything connected and communicated". A community member in St. John's observed that "we're not smart about how we use the technology". A student noted the pervasive sharing of "very traditional things" and events in daily lives where people are "all videoing them, sharing them constantly in social media," described as "a seamless behavior" contributing to a "seamless interrelationship" of the "local and global" generating "concurrent awareness."
Based on questions adapted for the city in this study from the body insight scale (BIS), an emerging example of a people-aware quantified experience is presented in Table 1. During the 2015 to 2016 phase of this study an abbreviated version of Anderson's 5-point scale was used to assess urban awareness in relation to the energy body and feeling safe; the comfort body; and the inner body and feelings of tightness when angry. Responses from individuals show feelings of safety at the upper end of the scale with 67% at position 4 and 33% at position 5. Feelings of comfort in the world tend toward the high end of the scale with 67% at position 5 and 33% at the neutral position of 3. Feelings of tightness related to anger are spread equally at 33% across the neutral position of 3 and the upper end of the scale at positions 4 and 5.

Table 1. Awareness in the city – body insight scale (2015/2016)
Awareness                         1     2     3     4     5
Energy body: feeling safe                           67%   33%
Comfort body: in the world                    33%         67%
Inner body: tightens when angry               33%   33%   33%

In discussions with respondents about the BIS questions, it was suggested that the term "world" contributed to confusion when assessing levels of comfort in a particular city. Based on this use experience, it was suggested that the phrase "the world" be replaced with "your city." The 5-point scale was also found to be too restrictive and it was suggested that the scale be extended from 5 to 7 points.

Table 2. Awareness in the city – body insight scale (2016/2018)
Awareness                         1     2     3     4     5     6     7
Energy body: feeling safe               33%                           67%
Comfort body                                  67%                     33%
Inner body: tightens when angry                     33%   33%   33%

Guided by feedback from respondents in 2015 to 2016, wording and scale adaptations were pre-tested and approved for use in this study from 2016 going forward. This enriched and emerging example of a people-aware quantified experience is presented in Table 2.
Survey responses from individuals show that feelings of safety continue to emerge at the upper end of the scale in position 7 (67%), with people indicating that their body lets them know when their environment is safe. However, 33% responded at the much lower end of the scale at position 2. During interviews it was possible to discuss the scale rating choices to learn more about underlying factors. Open-ended survey responses also provided additional insight. For example, in the case of those residing outside the city or urban area, the response rate drops sharply toward the lower end of the scale (33%) for feelings of safety during experiences of visiting the city. Regarding comfort levels in the city, responses varied from the high end at one extreme at position 7 (33%) to an increased concentration appearing at the much lower position of 3 on the scale (67%). Where urban comfort levels tended toward the higher end of the scale in cities in 2015 to 2016, comfort levels shifted noticeably in 2016 to 2018 in cities toward 276 H. P. McKenna the lower end of the scale. In part, comfort was in?uenced by urban design elements, such as the placement of benches. Feelings of tension in the city, such as anger, appearing in Table 1 (33% at the 3, 4, and 5 positions) seem to remain relatively consistent with those emerging in Table 2, tending toward the mid to higher positions of the scale with 33% at the 4, 5, and 6 positions. During interviews it was reported that feelings of tense- ness and anger depended upon the city where, in a smaller scale city, the inability to ?nd a parking spot may contribute to anger, while in a much larger urban center such as London, being tense “would be normal” pointing to “a di?erence in how you carry yourself” depending on the city. 4.2 Adaptability Mechanisms and approaches to accommodate new forms of adaptability in urban inter- actions emerged in a variety of ways. For example, an educator in Vancouver described the importance of people coming together in the city where “the meeting becomes the technology that changes everything.” A building designer noted that, “people want to be able to interact and really be in an overall environment” calling for changes in urban design. A community organizer in Victoria observed how City Council members “go where the citizens are” when there is “an opportunity for public engagement.” In the case of wanting “to reengage with our bylaws about growing food on city land,” Council members and/or city sta? will attend “city events” rather than “just posting something on their website” as “a really e?ective way to engage the community.” From a creativity perspective, a community leader articulated the need to ?gure out how to “move away from sector driven strategies to ones that” feature “clusters” so as to “bring industries and sectors together rather than that sort of silo” approach. Cross-sector initiatives were identi?ed related to “connected cities,” while recognizing the potential for, and impor- tance of, funding for smart cities. 4.3 Openness City IT sta? commented that “fundamentally there is a desire to be very, very open with the available data” as public data. It was noted that “the other element we’re trying to share is even just the processes of City Hall” using the example of permit applications. 
A locally developed mobile app was described by an educator in terms of the capability of being "able to open this kind of feedback" potential to anyone in the city as a way "to transform contributions both in terms of unique ideas and patterns into the design of some urban space or buildings" as in "smart infrastructure." A building designer described the focus on creating a "whole urban space" enabling a coming together of people so as "to make it feel like its not this closed in community." The designer suggested the potential for "having buildings or alleyways" serve as "more than just that intended use" so as to become multi-use and multi-purpose spaces. A community leader suggested that, "one of the challenges that the building community faces in doing these things is financial." Reference was made to the importance of planning for "an open innovation event" designed to be "more engaging" inviting proposals to "pilot ideas" to address urban challenges going forward. Regarding social media and openness, a student questioned the veracity of information provided to platforms, pointing to the frequent contributing of "made up" details in an effort to maintain some degree of privacy.
Explored quantitatively, as illustrated in Table 3, when asked to assess the extent to which openness is associated with smart cities on a 7-point scale (1 – Not at all, 2 – Not sure, 3 – Maybe, 4 – Neutral, 5 – Sort of, 6 – Sure, 7 – Absolutely), the majority of responses emerge toward the upper end of the scale with 33% at positions 6 and 7, along with a 33% response at the neutral position of 4.

Table 3. Openness and smart cities – assessments
Smart cities   1     2     3     4     5     6     7
Openness                         33%         33%   33%

Exploring quantitatively the potentials for attuning, sharing, and trust, people were asked to assess these elements in relation to city-focused social media and other aware technologies on a scale of 1 to 7 (not at all to absolutely). As illustrated in Table 4, assessments of attuning to urban spaces tended toward the upper end of the scale with 33% at the 6 position and 67% at position 7. Again, sharing is strong with 67% at the upper end of the scale in position 7 and 33% in position 6. Trust emerges toward the upper end of the scale with 67% of responses at the 5 position and 33% at 7.

Table 4. Attuning, sharing, and trust – assessments
Smart cities   1     2     3     4     5     6     7
Attuning                                     33%   67%
Sharing                                      33%   67%
Trust                                  67%         33%

A summary of findings is presented in Table 5 in terms of the three constructs of awareness, adaptability, and openness in relation to the technologies of the Internet of Things (IoT), the Internet of People (IoP), and the Internet of Experiences (IoE). IoT technologies emerge in relation to awareness as instrumented, as meeting spaces for adaptability, and as mobile apps for openness. IoP technologies highlight awareness in relation to seamless behaviour, as clusters for adaptability, and as piloting ideas across diverse sectors for openness. IoE technologies contribute to multi-dimensional awareness, connected cities for adaptability, and to calls for attention to the veracity of data in social media and other online platforms in relation to openness and associated concerns with privacy in urban spaces.

Table 5. Summary of findings
Tech   Awareness            Adaptability       Openness
IoT    Instrumented         Meeting spaces     Mobile app
IoP    Seamless behavior    Clusters           Piloting ideas
IoE    Multi-dimensional    Connected cities   Veracity/privacy

278 H. P.
McKenna 5 Discussion Awareness-based ?ndings suggest an instrumented, technology perspective from infor- mation technology professionals balanced by community member voices highlighting the importance of being “smart about how we use the technology.” The seamless inter- mingling of the IoT-IoP-IoE emerges in the observations of a student articulating the “concurrent awareness” of the local and the global. The nature of pervasive sharing described in the ?ndings, enriches the quantitative details provided in Table 4 for attuning and sharing. Trust level assessments in Table 4, while relatively strong, suggest an underlying tentativeness with 67% at position 5 and 33% at the upper end of the scale at 7, when compared with responses for attuning and sharing. The multi-dimensionality of the urban experience is highlighted through early-stage use of the body insight scale (BIS) to explore feelings of safety, comfort, and tension levels more directly with people. Early indications of factors in?uencing responses to use of the BIS pertain to city size, urban design elements, familiarity with the city, and other emerging and evolving aspects of cities and city regions that may include density (e.g., increasing urbanization over time) and geographic location. Adaptability-related ?ndings emphasize the importance of ?guring out e?ective ways to bring people together – meetings, clusters, technologies – in support of more community focused approaches to engagement and governance for connected cities. Openness-related ?ndings pertained to the use of an urban app for more inclusive use as smart infrastructure; the piloting of ideas in developing designs for greater connection in multi-use urban spaces; and the veracity of social media and other platform data in the face of underlying privacy concerns, shedding light on Table 3 and quantitative assessments of openness, with implications for quanti?ed experiences. 6 Future Directions Findings from this work highlight the need for qualitative data to augment, complement, and enhance the quantitative data being generated and gathered in urban spaces. Issues related to the veracity of large amounts of data providing the basis for algorithmic activ- ities gives rise to concerns identi?ed here with “made up” details and the resulting e?ect on algorithmic accuracy. As such, this work points to new pathways for the involvement of people more meaningfully and directly in the creation of spaces, both in theory and practice, for interaction in algorithmic realms. Such spaces will contribute to the shaping of debates, algorithmic designs, and new possibilities and potentials for more creative outcomes in the innovating of quanti?ed experiences as more people-aware. 7 Challenges, Limitations, and Mitigations Limitations of this work related to small sample size are mitigated by in-depth and rich detail from a wide range of individuals across small to medium to large urban centers. Challenges related to geographic location are mitigated by the potential to extend this work to other cities, including megacities and regions exceeding 10 million people. The challenge of studying emergent, dynamic, and evolving understandings of smart cities Exploring the Quanti?ed Experience: Finding Spaces for People 279 through awareness, adaptability and openness is mitigated by opportunities to explore the making of openings and spaces for innovative opportunities going forward for quanti?ed experiences. 
While only a limited number of possible body insight scale (BIS) questions were adapted for exploration in this work, opportunities exist for further vali- dation of these questions for use in urban environments going forward and for the inclu- sion of additional questions. 8 Conclusion This paper provides an exploration of the evolving area of aware people and aware technologies in relation to quanti?ed experiences in smart cities. Key contributions of this work include: (a) the use of awareness, adaptability, and openness in relation to the Internet of Things (IoT), the Internet of People (IoP), and the Internet of Experiences (IoE) as aspects of smart cities, in exploring the potential for innovating quanti?ed experiences; (b) formulation of a conceptual framework for people-aware quanti?ed experiences; (c) early-stage exploration of adaptations to the body insight scale (BIS) for use in the study of quanti?ed experiences in contemporary urban environments; and (d) further development of the smart cities research and practice literature in relation to innovations for quanti?ed experiences. A major take away from this work is the critical importance of aware people in combination with aware technologies in fostering new potentials for the making of innovative spaces to accommodate people more meaning- fully and directly in the algorithmic realm in smart cities. This work will be of interest to technology developers, researchers, research think tanks, urban practitioners, community members, and anyone concerned with more creative and innovative quan- ti?ed experience initiatives for future tech, smarter cities, and more responsive cities. References 1. Habitat, U.N.: Urbanization and Development: Emerging Futures—World Cities Report 2016. UN Habitat, Nairobi (2016) 2. Konomi, S., Roussos, G.: Enriching Urban Spaces with Ambient Computing, the Internet of Things, and Smart City Design. IGI Global, Hershey (2017) 3. Goldsmith, S., Crawford, S.: The Responsive City: Engaging Communities Through Data- Smart Governance. Jossey-Bass, San Francisco (2014) 4. Hotho, A., Stumme, G., Theunis, J.: Introduction: new ICT-mediated sensing opportunities. In: Loreto, V., Haklay, M., Hotho, A., Servedio, V.D.P., Stumme, G., Theunis, J., Tria, F. (eds.) Participatory Sensing, Opinions and Collective Awareness, pp. 3–8. Springer, Cham (2017) 5. Friberg, C.: Performing everyday practices: atmosphere and aesthetic education. Ambiances Int. J. Sens. Environ. Archit. Space Var. 464, 1–12 (2014) 6. Lévy, J. (ed.): The City: Critical Essays in Human Geography. Contemporary Foundations of Space and Place Series. Routledge, London (2016) 7. Townsend, A.M.: Smart Cities: Big Data, Civic Hackers and the Quest for a New Utopia. WW Norton, New York (2013) 280 H. P. McKenna 8. Kyriazopoulou, C.: Architectures and requirements for the development of smart cities: a literature study. In: Elfhert, M., et al. (eds.) Smartgreens 2015 and Vehits 2015, CCIS 579, pp. 75–103. Springer, Cham (2015) 9. Gi?nger, R., Fertner, C., Kramar, H., Kalasek, R., Pichler-Milanovic, N., Meijers, E.: Smart Cities: Ranking of European Medium-Sized Cities. University of Technology, Vienna (2007) 10. Gil-Garcia, J.R., Puron-Cid, G., Zhang, J.: Conceptualizing smartness in government: an integrative and multi-dimensional view. Gov. Inf. Q. 33(3), 524–534 (2016) 11. Khatoun, R., Zeadally, S.: Smart cities: concepts, architectures, research opportunities. Commun. ACM 59(8), 46–57 (2016) 12. 
Herzberg, C.: Smart Cities, Digital Nations: How Digital Urban Infrastructure can Deliver a Better Life in Tomorrow’s Crowded World. Roundtree Press, Petaluma (2017) 13. Vilarinho, T., Farshchian, B.A., Floch, J., Mathisen, B.M.: A communication framework for the Internet of People and Things based on the concept of activity feeds in social computing. In: Proceedings of the 9th International Conference on Intelligent Environments, pp. 1–8 (2013) 14. Li, M.: Editorial: Internet of People. Concurr. Comput. Pract. Exp. 29, 1–3 (2017) 15. Miranda, J., Mäkitalo, N., Garcia-Alonso, J., Berrocal, J., Mikkonen, T., Canal, C., Murillo, J.M.: From the Internet of Things to the Internet of People. IEEE Internet Comput. 19(2), 40– 47 (2015) 16. Conti, M., Passarella, A., Das, S.K.: The Internet of People (IoP): a new wave in pervasive mobile computing. Pervasive Mob. Comput. 41, 1–27 (2017) 17. McKenna, H.P.: Edges, surfaces, and spaces of action in 21st century urban environments— connectivities and awareness in the city. In: Kreps, D., Fletcher, G., Gri?ths, M. (eds.) Technology and Intimacy: Choice or Coercion, Advances in Information and Communication Technology, vol. 474, pp. 328–343. Springer, Cham (2016) 18. Wellsandt, S., Wuest, T., Durugb, C., Thoben, K.D.: The Internet of Experiences—towards an experience-centred innovation approach. In: Emmanouilidis, C., Taisch, M., Kiritsis, D. (eds.) Advances in Production Management Systems, Competitive Manufacturing for Innovative Products and Services, APMS 2012. IFIP Advances in Information and Communication Technology, vol. 397, pp. 669–676. Springer, Berlin (2013) 19. Casini, M.: Green technology for smart cities. In: IOP Conference Series: Earth and Environmental Science, vol. 83, p. 012014, 2nd International Conference on Green Energy Technology, pp. 1–10 (2017) 20. Falcon, R., Hamamoto, B.: Bodies of Data: Who are We Through the Eyes of Algorithms. Future Now. Institute For The Future (IFTF), Palo Alto (2017) 21. Amabile, T.M.: Componential theory of creativity. In: Kessler, E.H. (ed.) Encyclopedia of Management Theory. Sage, Los Angeles (2013) 22. Dourish, P.: Algorithms and their others: algorithmic culture in context. In: Big Data and Society, pp. 1–11 (2016) 23. McKenna, H.P., Arnone, M.P., Kaarst-Brown, M.L., McKnight, L.W., Chauncey, S.A.: Application of the consensual assessment technique in 21st century technology-pervasive learning environments. In: Proceedings of the 6th International Conference of Education, Research and Innovation (iCERi2013), pp. 6410–6419 (2013) 24. McKenna, H.P., Chauncey, S.A.: Exploring a creativity assessment technique for use in 21st century learning, library, and instructional collaborations. In: Proceedings of the 8th International Conference of Education, Research and Innovation (iCERi), pp. 5371–5380 (2015) 25. Baumer, E.P.S.: Toward Human-Centered Algorithm Design. In: Big Data & Society, pp. 1– 12 (2017) Exploring the Quanti?ed Experience: Finding Spaces for People 281 26. McKenna, H.P.: Creativity and ambient urbanizing at the intersection of the Internet of Things and People in smart cities. In: Universal Access in Human–Computer Interaction, Virtual, Augmented, and Intelligent Environments. Lecture Notes in Computer Science, vol. 10908. Springer, Cham (2018) 27. Yin, R.K.: Case Study Research and Applications: Design and Methods. Sage, Los Angeles (2018) 28. Anderson, R.: Body Intelligence Scale: de?ning and measuring the intelligence of the body. Hum. Psychol. 34(4), 357–367 (2006) 29. 
Teixiera, T., Dublon, G., Savvides, A.: A survey of human-sensing: methods for detecting presence, count, location, track, and identity. ENALAB Technical Report 09-2010, vol. 1, no. 1 (2010) 30. Hanington, B.: Design and emotional experience: introduction. In: Jeon, M. (ed.) Emotions and Affect in Human Factors and Human–Computer Interaction, pp. 165–183. Elsevier, London (2017)

Prediction of Traffic-Violation Using Data Mining Techniques
Md Amiruzzaman
Kent State University, Kent, OH 44242, USA
mamiruzz@kent.edu

Abstract. This paper presents the prediction of traffic-violations using data mining techniques, more specifically, when a traffic-violation is most likely to happen. Also, the contributing factors that may cause more damage (e.g., personal injury, property damage, etc.) are discussed in this paper. The national database for traffic-violations was considered for the mining, and the analyzed results indicated that a few specific times are probable for traffic-violations. Moreover, most accidents happened on specific days and times. The findings of this work could help prevent some traffic-violations or reduce the chance of occurrence. These results can be used to increase caution and traffic-safety tips.

Keywords: Traffic · Prediction · Crime · Violations · Data mining

1 Introduction
According to [1], the approximate population of the US is 326,200,000, and there are 196,000,000 licensed drivers [2]. However, based on the data presented in [2], on average 112,000 tickets are issued every day for different types of traffic-violations (mainly speeding). Altogether, approximately 41,000,000 tickets are issued every year (see Table 1). These statistics provide an overview of traffic-violations in the US, and there are a number of reasons that cause traffic-violations. As the number of vehicles increases every day, so does the chance of traffic-violations [3,4]. Often, traffic-violations lead to road accidents and injuries (Chen et al. 2004; Nath 2006). Chen et al. [3] classified different types of crime at different law-enforcement levels, such as sex crime at law-enforcement level two, and theft (e.g., robbery, burglary, larceny, etc.) at law-enforcement level three. In their classification, traffic-violation is one of the common local crimes [3]. In general, bad weather, unskilled drivers, drunk drivers, and drivers who pay less attention while driving may cause traffic-violations, as well as road accidents. However, there may be other contributing reasons that lead to traffic-violations and road accidents, for example, speeding, reckless driving, driving under the influence of drugs or alcohol, hit-and-run, road rage, etc. The research in [3] mainly focused on crimes and who is committing them, rather than traffic-violations.
© Springer Nature Switzerland AG 2019 K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 283–297, 2019. https://doi.org/10.1007/978-3-030-02686-8_23

Table 1. Traffic-violation statistics
Driving citation statistics
Average number of people per day that receive a speeding ticket         112,000
Total annual number of people who receive speeding tickets              41,000,000
Total percentage of drivers that will get a speeding ticket this year   20.6%

Solomon et al. (2006) analyzed traffic-violation data to develop a traffic safety program [4]. Their research focused on identifying places where traffic-violations occurred and how to better monitor those places. Solomon et al.
(2006) proposed to use more camera/surveillance to monitor those identi?ed high tra?c-violation places and use those surveillance footages to identify responsible parties [4]. This research [4] helped to improve tra?c-safety programs. In a separate study, Saran and Sreelekha (2015) found correlations between drunk driver, careless driving, over the speed limit and road accidents [5]. How-ever, these ?ndings are not something new to the law-enforcement agencies and research communities. Moreover, [5] mainly focused on statistical analysis (i.e., correlation analysis) and surveillance. In their paper, Saran and Sreelekha [5] used Arti?cial Neural Network (ANN) for vehicle detection. They also focused on Intelligent Transport System (ITS), which incorporate latest computer tech-nologies and computer vision [5]. Saran and Sreelekha (2015) indicated that ANN is superior in classi?cying moving vehicles than Support Vector Machine (SVN) and k-nearest neighbor (k-nn) algorithms. Note that, SVN and k-nn are two most popular algorithms that are widely used in data mining. Gupta, Mohammad, Syed and Halgamuge (2016) found a correlation between crime rates and accidents from Denver city of Colorado state [6]. Note that tra?c-violations may lead to violent crimes as well. For example, drunk driver may cause some property damage or injury to others. From their mining research, Gupta et al. (2016) were able to predict that in the months of January and February, most crimes are likely to occur. These ?ndings were helpful to the law-enforcement agencies (Gupta et al. 2016). The major drawback of [6] research is that authors only focused on one speci?c city of a state. Analyzing national database is necessary to understand how tra?c-violations occurring in the US. Nath (2006) indicated that most criminals along with other crimes, com-mitted tra?c-violation crimes as well [7]. One of the interesting ?ndings from Nath (2008) was to claim that 10% criminals commits 50% of the crimes. Chen et al. (2004) mentioned that a tra?c-violation is a primary concern for city, county, and state level law-enforcement agencies. In [7], authors mainly focused on where and how many Closed-Circuit Television (CCTV) would be helpful to ?nd responsible parties. The purpose of this study is to predict tra?c-violations based on previous incidents. The national database for tra?c violations is to be examined to deter-mine any factors that contributed to previous tra?c-violations and developed the prediction. Also, what time and days are most violations occur will be deter-mined using the mining as well. Prediction of Tra?c-Violation Using Data Mining Techniques 285 The rest of this paper is organized as follows: Sect. 2 describes existing litera-tures. Section 3 describes the method used in this study and Sect. 4 summarizes the experimental results. Section 5 presents discussion about the experimental results and Sect. 6 concludes the paper with implications and future works. 2 Literature Review Chen et al. (2004) studied di?erent types of crime, such as tra?c-violations, sex crime, theft, fraud, arson, gang/drug o?enses, violent crime, and cybercrime [3]. Also, they classi?ed these crime types to di?erent law-enforcement levels (e.g., level one, level two, etc.). Chen et al. (2004) identi?ed tra?c-violations as level one crime and one of the common local crimes [3]. 
They mentioned that speeding, reckless driving, causing property damage or personal injury in a collision, driving under in?uence of drugs or alcohol, hit-and-run, and road rage are common reasons for tra?c-violations [3]. According to Chen et al. (2004), tra?c-violations mostly considered as less harmful crime, however, sometimes this type of crime could cause severe bodily injury or property damage [3]. Even though, Chen et al. [3] discussed about tra?c-violation and other crimes, but their work actually did not focus on tra?c-violation analysis. Rather, their work focused on other types of crime analysis and prediction of those crimes to help law-enforcement agencies. Solomon, Nguyen, Liebowitz and Agresti (2006) demonstrated how to use data mining (DM) and evaluate cameras that monitor red-light-signals in traf-?c intersections [4]. Based on their ?ndings they proposed some techniques to improve tra?c safety programs. In their work, they used di?erent modeling techniques, such as decision trees, neural networks, market-basket analysis, and k-means. Solomon et al. (2006) focused on identifying places where red-light-signal violations occurred and how to better monitor those places. The red-light violation is known as red light running (RLR), and according to the Federal Highway Administration (FHWA), approximately 1,000 Americans were killed and 176,000 were injured in 2003 because of RLR. To describe the severity of RLR and its damage on the economy, Solomon et al. (2006) in [4] wrote, “The California Highway Patrol estimates that each RLR fatality costs the United States $2,600,000 and other RLR crashes cost between $2,000 and $183,000, depending on severity (California State Auditor, 2002)” (p. 621). As for the recommendation, they proposed to use more cam-era/ surveillance to monitor those identi?ed high tra?c-violation places and use those surveillance footages to identify responsible parties. As for their data, they used tra?c-violation data from Washington, DC area; the data was collected between the year 2000 and 2003 (Solomon et al. 2006). In terms of ?ndings, their [4] work helped law-enforcement agencies to ?nd responsible parties using the red light camera (RLC). However, placing RLCs in a right place is not an easy task. Data mining technique can be helpful to determine the high accident zone and place RLCs in appropriate locations. In a separate study [5], Saran and Sreelekha (2015) found correlations between drunk driver, careless driving, over the speed limit and road accidents. 286 Md. Amiruzzaman However, these ?ndings are not something new to the law-enforcement agencies and to the research communities [5]. Their work [5] was more of a classi?ca-tion than data mining. They used videos obtained from closed circuit television (CCTV) cameras placed in roadsides or driveways are used for the surveillance. They used arti?cial neural networks (ANN) to detect di?erent types of vehi-cles [5]. While detecting di?erent types of vehicles are important and interesting work, however, the need for tra?c-violation data mining remain the unsolved. In their work [5], Saran and Sreelekha (2015) mainly focused on road safety and surveillance system. Gupta, Mohammad, Syed and Halgamuge (2016) found a correlation between crime rates and accidents from Denver city of Colorado state. Note that tra?c-violations may lead to violent crimes as well [6]. For example, drunk driver may cause some property damage or injury to others. 
To describe the phe-nomenon, they said in [6] “The major cause of road accidents is drink driving, over speed[ing], carelessness, and the violation of tra?c rules” (p. 374). From their mining research, Gupta et al. (2016) were able to predict that in the months of January and February, most crimes are likely to occur. These ?ndings were helpful to the law-enforcement agencies (Gupta et al. 2016). They used data from the National Incident-Based Reporting System (NIBRS), The dataset con-tained 15 attributes and 372,392 instances [6]. While, Gupta et al. (2016) in [6] presented interesting ?ndings based on their data mining research, however, their work is mainly focused on a speci?c city of a speci?c state. It is important that a research study focus on the entire US and try to generalize the ?ndings mentioned in [6]. Nath (2006) in [7] indicated that most criminals along with other crimes, committed tra?c-violation crimes as well. One of the interesting ?ndings from Nath (2008) was to claim that 10% criminals commits 50% of the crimes. Chen et al. (2004) mentioned that a tra?c-violation is a primary concern for city, county, and state level law-enforcement agencies. They also added that tra?c-violations and other criminal activities may be related, and information obtained from tra?c-violations can be further used to ?nd criminals. They focused on getting contact information from the Department of Motor Vehicles (DMV). This paper, will provide an overview of tra?c-violation data mining as well as some interesting ?ndings that can be helpful to maintain cautions and prevent unwanted tra?c-violations. The proposed data mining predicts where and what time of the day the incidents (tra?c-violations) will occur based on National database. Also, what combinations of factors contribute to tra?c-violations. 3 Method Several data mining algorithms were used to analyze the data. For example, Na¨ive Bayes, J48 decision tree, Decision Table, and Support Vector Machine. Also, a few statistical analysis, such as, linear regression analysis, correlation analysis, and reliability analysis were considered to analyze the ?nal data. Mul-tiple tools were used to process and analyze the data. For example, SPSS Prediction of Tra?c-Violation Using Data Mining Techniques 287 (i.e., Statistics is a software package developed by IBM company) tests helped to determine which attributes should be considered for data mining. Also, WEKA1 (i.e., Waikato Environment for Knowledge Analysis) tool was used to perform data mining algorithms [8] on the research dataset. 3.1 Data The data was downloaded from the national database for public data2 . The original database consists of 36 attributes. However, there were lots of attributes that did not show any variations. For example, the accident attribute only had “No” as a value. Attributes like that does not contribute to data analysis, so, those attributes were deleted before the ?nal analysis. The database consisted over one million records. Of course, some of the rows had some missing values or wrong values (e.g., human errors). Missing values and wrong values seemed to be due to user errors. The database included demographic information, such as, gender of vehicle drivers, and place of incidents, driver state, driver city, etc. 3.2 Preprocessing The initial task for the preprocessing was to identify which attribute to keep and which attributes to discard. Of course, the database included overwhelming amount of data. 
However, for the data mining, only the most important and relevant attributes were considered for ?nal analysis. The preprocessing process included deleting missing data, deleting irrelevant attributes, modifying records to meaningful format, etc. – SPSS tests helped to determine which attribute could to be deleted or not included for data mining as well as ?nal analysis (see Table 2). – Missing and repeating attributes were discarded as well. Also, wrong entries were discarded from ?nal selection of data analysis. – The dataset was divided into training set and testing set. The training set consisted 67% of the data, whereas testing test consisted of 33% of the total number of records. Holdout method was used to determine the training set and testing set. Initial Processing. After the determining the training set and testing set, and deciding to keep some candidate attribute. Again, SPSS tests were executed to determine which attribute should be deleted to further increase the accuracy of the result. Mainly the test helped to determine which item should be deleted is “items-deleted” to increase the reliability value. For example, SPSS tests indi-cated time of the incident should be deleted to increase the reliability of the results. 1 https://www.cs.waikato.ac.nz/~ml/weka/downloading.html. 2 https://catalog.data.gov/dataset. 288 Md. Amiruzzaman Table 2. Inter-Item correlation matrix Personal injury Property damage Alcohol Contributed to accident Personal injury 1.000 -o0.016 0.013 0.346 Property damage -o0.016 1.000 0.019 0.368 Alcohol 0.013 0.019 1.000 0.014 Contributed to accident 0.346 0.368 0.014 1.000 Initial Results. Initial processing suggested that most tra?c-violations hap-pened in Maryland (DC), more speci?cally in Washington, DC area. Also, after modifying the date of incident to weekdays (e.g., Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, and Saturday), it was noticed that most tra?c-violations happened on Tuesday and Wednesday (see Fig. 1.). This is maybe because people are more anxious on mid-week (i.e., we call it mid-week e?ect). Fig. 1. Number of incidents in days. (x-axis is days–Sunday (starting from left), and end with Saturday (on the right); y-axis is the number of incidents). 4 Results 4.1 SPSS Correlation analysis helped to determine that property damage and alcohol were correlated (17%). Similarly, contributed to accident and property damage Prediction of Tra?c-Violation Using Data Mining Techniques 289 were correlated (34%); contributed to accident and personal injury were corre-lated (37%). The correlation values were calculated using the following equation (see (1)): rxy = .xi=0 n (xi -i x¯)(yi -i y¯) .¯ .¯i=0 n (xi -i x¯)2 .¯i=0 n (yi -i y¯)2 (1) where, rxy is the correlation value between variables, x and y, ., is the symbol for “sum up”, xi is the individual value of variable x, x¯ is the mean of variable x. Similarly, yi is individual value of variable y, y¯ is the mean of variable y. In this analysis linear regression was used to verify some of the prediction made by the WEKA software. The regression equation can be expressed as (see (2)) yi = a + bxi + c (2) where, Y is the dependent variable that the equation tries to predict, X is the independent variable that is being used to predict Y , xi ?i X, and i = 1, 2, 3, ..., n, yi ?i Y , and i = 1, 2, 3, ..., n, a is the Y -intercept of the line, b is the slope, and c is a value called the regression residual, which can be calculated by |yˆi -ˆ yi|, where yˆi is the expected value of y. 
More detail about the regression equation and examples of regression can be found online3. The results obtained from the linear regression analysis are presented in Table 3.
3 http://www.stat.yale.edu/Courses/1997-98/101/linreg.htm

Table 3. Linear regression analysis
Model   R       R^2     Adjusted R^2   Std. error of the estimate
1       0.404   0.163   0.163          0.125

Reliability values were calculated using the equation below (see (3)):

\alpha = \frac{N \cdot \bar{c}}{\bar{v} + (N - 1)\cdot \bar{c}}    (3)

where N is the number of items, \bar{c} is the average inter-item covariance, and \bar{v} is the average variance. The reliability of the four attributes (i.e., personal injury, property damage, alcohol, and contributed to accident) was 0.435 (see Table 4).

Table 4. Reliability statistics
Cronbach's alpha   Cronbach's alpha based on standardized items   N of items
0.435              0.362                                          4

4.2 Naïve Bayes
The Naïve Bayes classifier is one of the most popular classifiers in data mining. To describe the strength of Naïve Bayes, [9] wrote "The naïve Bayes classifier computes the likelihood that a program is malicious given the features that are contained in the program. This method used both strings and byte-sequence data to compute a probability of a binary's maliciousness given its features" (p. 6). Results obtained from Naïve Bayes are presented in Table 5.

Table 5. Comparisons of different methods
Method name                    Correctly classified (%)   Incorrectly classified (%)   Kappa statistics   Root Mean Square Error (RMSE)   Precision   Recall
J48 decision tree              97.67                      2.32                         0.24               0.14                            0.98        0.99
Naïve Bayes                    97.60                      2.39                         0.06               0.13                            0.97        0.99
Support Vector Machine (SVM)   97.61                      2.38                         0.00               0.15                            0.97        1.00
Decision table                 97.64                      2.35                         0.24               0.13                            0.98        0.99

The following mathematical definition helps to explain how the Naïve Bayes classifier works. Let the dataset be d, the set of classes C = c_1, c_2, ..., c_n, and the predicted class c \in C. The Naïve Bayes classification can be expressed as (see (4)):

P(c|d) = \frac{P(d|c)\,P(c)}{P(d)}    (4)

Over 500,000 instances were analyzed using Naïve Bayes (WEKA could not return any results for more than 0.5 million records), with 67% of them as the training set and 33% as the testing set.
The confusion matrix helped to compute the accuracy of the classifying algorithms. The accuracy of a classifying algorithm can be defined as (see (5)):

\text{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN}    (5)

where TP = True Positive, TN = True Negative, FP = False Positive, and FN = False Negative. With 97.6% accuracy, the Naïve Bayes algorithm was able to classify traffic-violations with respect to personal injury, property damage, and the presence of alcohol. The confusion matrix of Naïve Bayes shows that only 297 records were classified as "True Negative" (see Table 6).

Table 6. Confusion matrix (Naïve Bayes)
                Predicted: No              Predicted: Yes
Actual: No      True positive = 327107     False negative = 331
Actual: Yes     False positive = 7715      True negative = 297

In the database, different types of vehicles were reported, for example, motorcycle, automobile, station wagon, limousine, etc. The Naïve Bayes algorithm was able to classify traffic-violations based on vehicle type with an accuracy of 87.444%. Also, the Naïve Bayes algorithm reported that automobile had the highest incident records.
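To make the evaluation setup above concrete, the following sketch shows how a 67/33 holdout split and a Naïve Bayes run could be reproduced with the WEKA Java API. The file name violations.arff and the position of the class attribute are assumptions, and the paper's exact attribute selection is not reproduced here; this is a hedged sketch, not the author's script.

```java
import java.util.Random;

import weka.classifiers.Evaluation;
import weka.classifiers.bayes.NaiveBayes;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

/** Hedged sketch: 67/33 holdout evaluation of Naive Bayes with WEKA. */
public class NaiveBayesHoldout {
    public static void main(String[] args) throws Exception {
        // Hypothetical ARFF export of the traffic-violation records.
        Instances data = DataSource.read("violations.arff");
        data.setClassIndex(data.numAttributes() - 1);   // assume the class attribute is last

        // Shuffle, then split 67% training / 33% testing (holdout method).
        data.randomize(new Random(1));
        int trainSize = (int) Math.round(data.numInstances() * 0.67);
        Instances train = new Instances(data, 0, trainSize);
        Instances test  = new Instances(data, trainSize, data.numInstances() - trainSize);

        NaiveBayes nb = new NaiveBayes();
        nb.buildClassifier(train);

        Evaluation eval = new Evaluation(train);
        eval.evaluateModel(nb, test);
        System.out.println(eval.toSummaryString());   // accuracy, kappa, RMSE, ...
        System.out.println(eval.toMatrixString());    // confusion matrix (cf. Table 6)
    }
}
```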
4.3 J48
The J48 decision tree algorithm was used to visualize and determine how the prediction was made. The J48 algorithm uses a mathematical model to determine information gain, which helps to determine which variable fits better in terms of predicting the target variable. Other data mining research, such as [10], used the J48 decision tree to predict their outcome variables as well. The following mathematical definition helps to explain how the decision tree classifier works. Let the dataset be d. The dependent variable is Y (i.e., the target variable that the algorithm is trying to classify). The dataset d consists of a vector x, which is composed of the features x_1, x_2, x_3, ... that are used to make the classification or the decision tree. Then, the decision tree algorithm can be expressed as (see (6)):

(x, Y) = (x_1, x_2, x_3, ..., x_k, Y)    (6)

where k is the number of features in vector x.
Around 5:00 pm, the traffic-violations that happened did not involve alcohol, which makes sense as most people leave work at that time. However, perhaps the rush to get home may cause those traffic-violations at that time. On the other hand, most traffic-violations between 12:00 am and 1:00 am involved alcohol, which indicates that those were caused by drunk drivers. Perhaps law-enforcement agencies should look into those incidents and maintain more caution. The J48 algorithm achieved 97.6% correct classification. The confusion matrix of J48 shows that only 1290 records were classified as "True Negative" (see Table 7).

Table 7. Confusion matrix (J48)
                Predicted: No              Predicted: Yes
Actual: No      True positive = 326350     False negative = 1088
Actual: Yes     False positive = 6722      True negative = 1290

In addition, the J48 algorithm was able to classify traffic-violations based on vehicle type with an accuracy of 87.433%. Also, the J48 algorithm reported that automobile had the highest incident records.

4.4 Support Vector Machine (SVM)
The support vector machine (SVM) is one of the powerful data classification tools. The SVM was invented at AT&T Bell Laboratories by Cortes and Vapnik in 1997 [11]. To describe the strength of the SVM classification algorithm, Kim, Pang, Je, Kim, Bang and Yang (2003) in [11] wrote, "The SVM learns a separating hyperplane to maximize the margin and to produce a good generalization ability" (p. 2757). Witten and Frank [12] mentioned, "Support vector machines select a small number of critical boundary instances called support vectors from each class and build a linear discriminant function that separates them as widely as possible" (p. 188).
The following mathematical definition helps to explain how the SVM classifier works. Let the dataset be d, the set of classes C = c_1, c_2, ..., c_n, and the predicted class c \in C. Also, let the input set be X = x_1, x_2, ..., x_n with x \in X. Here, X is the input and C is the output. Now, if we want to classify c = f(x, \alpha), where \alpha are the parameters of the function, then the SVM can be expressed as (see (7)):

f(x, \{w, b\}) = \mathrm{sign}(w \cdot x + b)    (7)

where w is the weight and b is the bias.
The SVM algorithm was able to classify traffic-violations based on vehicle type with an accuracy of 87.433%. It also reported that automobile had the highest incident records. The confusion matrix shows the accuracy of the SVM classifier (see Table 8).

Table 8. Confusion matrix (SVM)
                Predicted: No              Predicted: Yes
Actual: No      True positive = 327438     False negative = 0
Actual: Yes     False positive = 8012      True negative = 0
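For completeness, a single harness can loop over all four models compared in Table 5: J48, Naïve Bayes, WEKA's SMO (its SVM implementation), and the Decision Table model described in the next subsection, printing the same summary metrics. This is a hedged sketch under the same assumptions as before (hypothetical file name, class attribute last), not the author's script.

```java
import java.util.Random;

import weka.classifiers.Classifier;
import weka.classifiers.Evaluation;
import weka.classifiers.bayes.NaiveBayes;
import weka.classifiers.functions.SMO;
import weka.classifiers.rules.DecisionTable;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

/** Hedged sketch: evaluate the four classifiers of Table 5 on one holdout split. */
public class ClassifierComparison {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("violations.arff");   // hypothetical file name
        data.setClassIndex(data.numAttributes() - 1);
        data.randomize(new Random(1));

        int trainSize = (int) Math.round(data.numInstances() * 0.67);
        Instances train = new Instances(data, 0, trainSize);
        Instances test  = new Instances(data, trainSize, data.numInstances() - trainSize);

        Classifier[] models = { new J48(), new NaiveBayes(), new SMO(), new DecisionTable() };
        for (Classifier model : models) {
            model.buildClassifier(train);
            Evaluation eval = new Evaluation(train);
            eval.evaluateModel(model, test);
            // Columns roughly matching Table 5: accuracy, kappa, RMSE, precision, recall.
            System.out.printf("%-15s acc=%.2f%% kappa=%.2f rmse=%.2f p=%.3f r=%.3f%n",
                    model.getClass().getSimpleName(),
                    eval.pctCorrect(), eval.kappa(), eval.rootMeanSquaredError(),
                    eval.precision(0), eval.recall(0));
        }
    }
}
```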
This type of method generates rules of associations from the data and groups the data or classi?es the data. The decision table uses best-?rst search and cross-validation for evaluation [12]. Here, the symbol “ def = ” represents de?ning relationship. Let, f(x) def = x + 1 de?nies the ralationship of x with function f. In terms of predicting relationship using DT can be de?ned as (see (8)): R(x, y) def = y = x (8) where, R is relationship function between x and y. Which indicates that some y helps to predict x. DT algorithm was able to classify tra?c-violations based on vehicle type with accuracy of 87.451%. The DT analysis reported that automobile had the highest incident records. The confusion matrix shows the accuracy of SVM classi?er (see Table 9). Table 9. Confusion matrix (Decision table) Predicted class No Yes Actual class No True positive = 326203 False negative = 1235 Yes False positive = 6664 True negative = 1348 294 Md. Amiruzzaman 5 Discussion 5.1 Learning from the Data Processing The original data was download as comma-separated values (CSV) ?le. However, I was important that csv ?le should be converted to WEKA supported ?le for-mat. A Java program was written to csv ?le to Attribute-Relation File Format (ar?) ?le format. During the conversion process, it was discovered that ar? ?le is sensitive to date format. What format is used in the ?le should be explicitly mentioned in the original ar? ?le, otherwise WEKA software cannot recognize the data type. During the data processing and analyzing from visualization tool provided by WEKA, it was discovered that WEKA support csv ?le as input as well. In order to make sense of time of incident, time attribute was discretized to nearest hour value. So, all time was discretized to 24-hour format, excel function was used to accomplish this task (e.g., MROUND(B2, “1:00”)). Also, during the presentation and feedback from experts, it was suggested to include date of the incident. However, date was not much informative. So, date was converted to day; built-in excel function was used to convert date to day number (e.g., WEEKDAY(A2), and then format was changed to dddd to get the day). During the analysis ?h value was calculated; ?a value measures relative improve-ment over random predictor. The ?h statistics was computed using following equa-tion (see (9)): ?e = Dobserved -e Drandom Dperfect -e Drandom (9) In terms of success, precusion and recall values were calculated as well. For precision (10) was used. precision = TP TP + FP (10) where, number of true positive is TP, and number of false positive is FP. Comparisons of di?erent algorithm in terms of precision is shown in Table 10. Table 10. Precision comparison Na¨ive Bayes J48 SVM Decision table 0.977 0.980 0.976 0.980 For recall value (11) was used. recall = TP TP + FN (11) where, number of true positive is TP, and number of false negative is FN. Prediction of Tra?c-Violation Using Data Mining Techniques 295 Comparisons of di?erent algorithm in terms of recall is shown in Table 11. Table 11. Recall comparison Na¨ive Bayes J48 SVM Decision table 0.999 0.997 1.000 0.996 After obtaining precision and recall values, F -n statistics was computed (see (12)). F -e statistics = 2 × recall × precision recall + precision (12) Comparisons of di?erent algorithm in terms of F -f statistics is shown in Table 12. All algorithms provided same F -l statistics value. Table 12. 
To evaluate the prediction accuracy, the root mean-squared error (RMSE) was computed (see (13)):

RMSE = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2 }   (13)

where y_i is the observed value for the ith observation and ŷ_i is the predicted value. A comparison of the different algorithms in terms of root mean-squared error is shown in Table 13.

Table 13. Root mean-squared error (RMSE) comparison

Naïve Bayes   J48     SVM     Decision table
0.132         0.143   0.152   0.131

Fig. 2. Number of traffic violations over 24 h (x-axis: hour of day, from 0/24 on the left through 23 on the right; y-axis: number of incidents).

6 Conclusion

The results obtained from the data mining and statistical analysis suggest that personal injury was almost always involved when the driver was drunk. Also, around 1:00 am was the most dangerous time to be out (see Fig. 2); most property damage and personal injury caused by drunk drivers happened between 11:00 pm and 1:00 am, which was also the time when most incidents occurred. Among all the cities, the DC area appeared to be most consistent with these results, so it is advisable to avoid that area during these hours. Analyzing more data and the latest databases from law-enforcement agencies could help to find more interesting information, and using different data mining algorithms could help to understand the data better. Having a domain expert could be beneficial for interpreting the findings and adding further implications. As future work, visualization techniques can be used to show the intensity of traffic violations over geographic locations and accident-prone areas. Moreover, deep learning can be applied to identify or classify areas based on their violation probability.

Acknowledgment. The author would like to thank the open data website (https://catalog.data.gov/dataset) for making the dataset available for research and analysis. A special thank you to those who participated in the initial presentation and provided valuable feedback (part of this paper was presented and submitted as a class project). Thanks also to Dr. Kambiz Ghazinour for helping me think further about the data and the analysis process.

References

1. Estimates, A.P.: U.S. and world population clock (2017). Accessed 19 Nov 2017
2. Statistics Brain: Driving Citation Statistics (2016). Accessed 20 Nov 2017
3. Chen, H., Chung, W., Xu, J.J., Wang, G., Qin, Y., Chau, M.: Crime data mining: a general framework and some examples. Computer 37(4), 50–56 (2004)
4. Solomon, S., Nguyen, H., Liebowitz, J., Agresti, W.: Using data mining to improve traffic safety programs. Ind. Manag. Data Syst. 106(5), 621–643 (2006)
5. Saran, K.B., Sreelekha, G.: Traffic video surveillance: vehicle detection and classification. In: 2015 International Conference on Control Communication and Computing India (ICCC) (2015)
6. Gupta, A., Mohammad, A., Syed, A., Halgamuge, M.N.: A comparative study of classification algorithms using data mining: crime and accidents in Denver City the USA. Education 7(7), 374–381 (2016)
7. Nath, S.V.: Crime pattern detection using data mining. In: 2006 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology Workshops, WI-IAT 2006 Workshops, pp. 41–44 (2006)
8. Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.H.: The WEKA data mining software: an update. SIGKDD Explor. Newsl. 11(1), 10–18 (2009)
9. Schultz, M.G., Eskin, E., Zadok, F., Stolfo, S.J.: Data mining methods for detection of new malicious executables. In: 2001 IEEE Symposium on Security and Privacy, S&P 2001 Proceedings, pp. 38–49. IEEE (2001)
10. Olson, D.L., Delen, D., Meng, Y.: Comparative analysis of data mining methods for bankruptcy prediction. Decis. Support Syst. 52(2), 464–473 (2012)
11. Kim, H.C., Pang, S., Je, H.M., Kim, D., Bang, S.Y.: Constructing support vector machine ensemble. Pattern Recognit. 36(12), 2757–2767 (2003)
12. Witten, I.H., Frank, E.: Data Mining: Practical Machine Learning Tools and Techniques, 2nd edn. Elsevier Inc., Amsterdam (2005)

An Intelligent Traffic Management System Based on the Wi-Fi and Bluetooth Sensing and Data Clustering

Hamed H. Afshari1(✉), Shahrzad Jalali2, Amir H. Ghods1, and Bijan Raahemi2
1 SMATS Traffic Solutions Inc., Ottawa, ON K1Y 3B5, Canada
h.h.afshari@gmail.com
2 Knowledge Discovery and Data Mining Lab, Telfer School of Management, University of Ottawa, 55 Laurier Ave. E, Ottawa, ON K1N 6N5, Canada

Abstract. This paper introduces an automated clustering solution that applies to Wi-Fi/Bluetooth sensing data for intelligent route planning and city traffic management. The solution is based on sensing Wi-Fi and Bluetooth MAC addresses, preprocessing the collected real data, and implementing clustering algorithms for noise removal. Clustering is used to recognize Wi-Fi and Bluetooth MAC addresses that belong to passengers traveling by a public transit bus. The main objective is to build an intelligent system that automatically filters out MAC addresses that belong to persons located outside the bus for different routes in the city of Ottawa. This system alleviates the need for defining restrictive thresholds that might reduce the accuracy, as well as the range of applicability, of the solution for different routes. Various clustering models are built to filter out the noise based on four features: the average of the signal strength, its variance, the number of detections, and the travel time. We compare the performance of clustering using the Silhouette analysis and the Homogeneity-Completeness-V-measure score. We conclude that the K-means and hierarchical clustering algorithms have a superior performance for clustering.

Keywords: Wi-Fi Bluetooth sensing · Clustering · Intelligent transportation

© Springer Nature Switzerland AG 2019
K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 298–312, 2019. https://doi.org/10.1007/978-3-030-02686-8_24

1 Introduction

1.1 Problem Statement

The cost of city congestion in North America was estimated at about $120B in 2012. This is in addition to its negative impacts on the environment, as well as on the economy that relies on the speed and efficiency of mobility. Public urban transit systems provide a convenient and affordable solution for this problem. However, the limited revenue obtained from bus fares limits the number of operating lines for public transit buses. Hence, to overcome the problem of traffic congestion, optimal operational decisions on bus transit planning play a crucial role. Such decisions rely on estimating the number of passengers, identifying their origins and destinations, and optimizing the travel cost. Traditional methods of transit data gathering and transit decision planning were mainly manual, which made them expensive and time-consuming.
Even though some transit companies use data obtained from smart card transactions, those data may only be used to find the origin of passengers, not their destinations or ride time. A new approach for addressing the traffic congestion problem is based on using Wi-Fi and Bluetooth sensing technologies to estimate the number of passengers, as well as their origins and destinations. Nowadays, Bluetooth and Wi-Fi signals are constantly being emitted by smartphones, tablets, and vehicular embedded systems. These signals can be identified by their device's unique Media Access Control (MAC) address. Note that every MAC address is unique to its device and does not change over time. Sensors can detect such information and, moreover, track the device and the individual who moves with that device over time. These individuals can be drivers, passengers of vehicles, pedestrians, or cyclists. The main concern with such technologies is distinguishing the MAC addresses that belong to passengers traveling by the bus from those that belong to individuals outside the bus.

1.2 Literature Review

A large number of studies in recent years have focused on using Wi-Fi and/or Bluetooth sensors to manage traffic congestion. Wi-Fi and/or Bluetooth MAC addresses may be tracked to find the number of individuals in crowded places such as store lines, supermarkets, public buses, stations, etc. Some of these studies were applied to public transportation systems such as buses, trains, and undergrounds, while others focused only on individual vehicles. Wi-Fi and/or Bluetooth sensors may furthermore be used to estimate the origin-destination (OD) of passengers, their wait time, and their travel time. Dunlap et al. [1] used Wi-Fi and Bluetooth sensing technologies to estimate the OD of passengers in transit buses. They mounted sensors on four buses to collect Wi-Fi, Bluetooth, and GPS data over four weeks. They applied some preprocessing steps to the collected data, in addition to numeric thresholds, to remove noise. They moreover estimated OD data of passengers at different bus stops and validated the results using ground-truth bus routes [1]. Ji et al. [2] employed Wi-Fi sensors and boarding data to present a hierarchical Bayesian model for estimating the OD flow matrix and the sampled OD flow data. They evaluated the accuracy of their method empirically on a bus route. Kostakos et al. [3] developed a Bluetooth detection system that records behaviors of passengers. They showed that approximately 12% of passengers carried Bluetooth devices, and they measured the flow of passengers' daily movements with 80% accuracy [3]. Blogg et al. [4] estimated OD data using MAC addresses of Bluetooth devices embedded in vehicles and in cell phones of motorists. They showed that the use of Bluetooth technologies for capturing OD data in limited networks is a cost-effective solution. Kostakos et al. [5] introduced an automatic method to collect passengers' end-to-end trip data. They collected the location of the bus, the ticket data, and the number of people on the bus using a Bluetooth detection sensor. They calculated the OD matrix and related graphs and analyzed them to optimize transit plans by redesigning routes and providing new services [5].

1.3 Contributions

This paper introduces an intelligent and automated system to recognize the Wi-Fi and/or Bluetooth MAC addresses that belong to persons in the bus.
This system is based on defining some features and clustering them into distinct groups. Experiments are conducted to show the performance of this method for real-world applications. Section 2 briefly reviews the clustering approaches used in this paper. Section 3 presents the test setup and the experiment design. Section 4 discusses cluster modeling and analysis.

2 Main Approaches for Clustering

2.1 Center-Based Clustering

Center-based clustering refers to a class of clustering techniques in which the clusters' centroids are calculated for a user-specified number of clusters. After that, data points are assigned to these clusters such that every cluster contains the set of data points that are most similar (closest in distance) to its centroid [6]. Center-based clustering techniques mainly include K-means, fuzzy K-means, and K-medoids. The K-means algorithm divides data points into groups of equal variance by minimizing the within-cluster sum of squared error. The K-means algorithm attempts to cluster a set of N data points into K disjoint clusters, where each cluster centroid is the mean μ_j of its data points. The cost function is the within-cluster sum of squared error (in the Euclidean norm) and is given by [7]:

\sum_{i=0}^{n} \min_{\mu_j \in C} \left( \| x_i - \mu_j \|^2 \right)   (1)

The K-means algorithm is sensitive to noise and outliers. To overcome this issue, the K-medians algorithm uses the Manhattan norm (instead of the Euclidean norm l2) as the distance between data points [8]. The medoid is defined as the most centrally located object within a cluster, i.e., the object that has the smallest average dissimilarity to the other objects in the cluster. Compared to K-means, K-medoids is more robust to noise and outliers [8]. K-means and K-medoids are both exclusive clustering techniques [6] in which every data point is assigned to a single cluster. There are many cases in which a data point may belong to more than one cluster with a specific probability. Fuzzy K-means clustering assigns every data point to every cluster with a membership weight between 0 and 1. A membership of 0 means that the object does not belong to the cluster, whereas a membership of 1 means that it fully belongs. It is assumed that the sum of the weights (probabilities) for each object is equal to 1.

2.2 Graph-Based Clustering

Graphs are used to represent data in some data mining applications, in which the nodes are data points and the links are the connections among data points [6]. Agglomerative hierarchical clustering is an example of graph-based clustering. It starts with every data point as a single cluster. After that, new clusters are repeatedly generated by merging the two nearest clusters until a single cluster that includes all data points is produced [6]. The key idea of hierarchical clustering is the calculation of the proximity function between two clusters. There are several metrics for calculating the proximity function used to merge the two nearest clusters. They mainly include [7, 9]: (1) the Ward metric, which minimizes the sum of squared differences of data points inside a cluster; (2) the maximum metric, which minimizes the maximum distance between data points of every two clusters; and (3) the group average metric, which minimizes the average of distances between all data points of every two clusters.
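As a concrete illustration of the center-based approach, the following is a minimal Python sketch of K-means (whose within-cluster sum-of-squares objective is Eq. (1)) using scikit-learn on synthetic two-dimensional data; the data and parameter values are placeholders, not the Wi-Fi/Bluetooth features used later in this paper.

# Minimal K-means sketch: three synthetic blobs are clustered, and the inertia_
# attribute reports the within-cluster sum of squared errors minimized in Eq. (1).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(50, 2))
               for c in ((0, 0), (3, 0), (0, 3))])

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print("centroids:\n", km.cluster_centers_)
print("within-cluster sum of squares:", round(km.inertia_, 2))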
2.3 Density-Based Clustering

The key idea of density-based clustering is that a cluster is a dense region of data points surrounded by a region of low density. This idea is used to create clustering algorithms that perform well when clusters are irregular or intertwined, as well as in situations that include noise and outliers [6]. In such situations, center-based or graph-based clustering approaches cannot deliver a satisfactory performance. Density-based clustering techniques find regions of high density that are separated from each other by low-density regions. DBSCAN [6] is one of the most effective density-based clustering techniques; it determines the number of clusters automatically and generates partitioned clusters. Moreover, it can isolate data points in low-density regions as noise and remove them from the clustering subspace. A center-based density metric is used to quantify the density of data points. It may be calculated by counting the number of data points located within a specified radius, named Eps, of every point [6]. The center-based density metric classifies each data point into one of three main categories: core points, border points, and noise points. A core point is a point located inside a density-based cluster. A border point is a point that is not a core point but is located within a close neighborhood of a core point. A noise point is a point that is neither a core point nor a border point and is located relatively far from the centroids [6].
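The sketch below illustrates the density-based idea with scikit-learn's DBSCAN on toy data: points in low-density regions receive the label -1 and can be discarded as noise. The Eps and min_samples values are arbitrary choices for this synthetic example, not settings used in this study.

# DBSCAN sketch: a dense synthetic cluster plus scattered low-density points;
# points labeled -1 are treated as noise and removed, mirroring the noise-removal
# role of density-based clustering described above.
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(2)
dense = rng.normal(loc=(0, 0), scale=0.2, size=(80, 2))
noise = rng.uniform(low=-3, high=3, size=(20, 2))
X = np.vstack([dense, noise])

labels = DBSCAN(eps=0.3, min_samples=5).fit_predict(X)
kept = X[labels != -1]
print("clusters found:", len(set(labels)) - (1 if -1 in labels else 0))
print("points kept after noise removal:", len(kept), "of", len(X))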
3 Test Setup and Experiment Design

3.1 Sensing Device: Smats TrafficBox™

The Smats TrafficBox™ is a pole-mount, battery-operated Bluetooth and Wi-Fi sensor that was designed and built at SMATS Traffic Solutions Inc. The sensors operate inside a ruggedized shockproof and waterproof case, which makes them ideal for tasks that require placing the sensor at a specific location to collect data for several days: the unit can scan for up to 4 days on one charge. The ruggedized case is equipped with a pole-mount configuration, so it can scan for days without the need for monitoring. TrafficBox™ sensors can collect data on moving vehicles as well as in stationary positions. Sensors have adjustable detection zones that cover a circular or a directional area for detecting Bluetooth and Wi-Fi devices. Figure 1 shows a typical TrafficBox™ mounted on a pole. TrafficBox™ detects Bluetooth Classic and Low Energy devices. Note that Bluetooth devices are most easily detected in discovery mode: if a device is in this mode, the chance of detection by a sensor is extremely high, but few Bluetooth devices are in this mode. TrafficBox™ can additionally detect Bluetooth devices in paired mode, in which two devices are connected and communicating with each other.

Fig. 1. A typical Smats TrafficBox™ device that collects Bluetooth and Wi-Fi data.

TrafficBox™ not only stores data offline, but can also send data in real time for online storage and real-time traffic monitoring. For offline data collection, the data is saved onto a micro SD card. The data are later uploaded to a computer as a raw dataset, or are uploaded to the Smats cloud server where they can be analyzed in its analytics platform. TrafficBox™ sensors collect the following data: MAC addresses, detection time stamps, type of device (Bluetooth or Wi-Fi, with Bluetooth Low Energy optional), the signal strength, and GPS location data.

3.2 Experiment Design

Ground-truth experiments were conducted using public urban transit buses traveling in the city of Ottawa. TrafficBox™ is placed inside the bus to collect MAC address data under two different test scenarios, each corresponding to a specific route. Note that the collected raw data contain noise and outliers that mainly correspond to MAC addresses outside the bus. Before feeding the raw data into clustering algorithms, they need to pass through some preprocessing steps (see Sect. 3.3). After clustering the MAC addresses and identifying the ones that belong to passengers on the bus, they can be used for further applications, including calculation of the OD matrix, estimation of the wait and travel time for every passenger, optimization of bus transit plans, etc. Two routes are considered for the tests, where each realizes a test scenario. The first test uses route 101, which starts at the St. Laurent 3C station and ends at the Bayshore 1A station. The GPS data are used to locate bus stops over time. Figure 2 shows a Google map view of route 101 used in test scenario #1.

Fig. 2. Google map view of route 101 in the city of Ottawa.

The second test uses route 85, which starts at the Bayshore 4B station and ends at the Lebreton 2A station. Figure 3 shows a Google map view of route 85. A large part of route 85 passes through downtown Ottawa, where it is usually more crowded than route 101. Route 85 is used to check the performance of the clustering algorithms on scenarios that include a large number of passengers, crowded streets, and crowded bus stations. Note that during the experiments, the number of passengers in the bus, as well as the number of entries and exits at every stop, was manually counted. These numbers are later used to intuitively check the performance of the clustering algorithms. Data collected by TrafficBox™ are uploaded to a computer using a USB port.

Fig. 3. Google map view of route 85 inside the city of Ottawa.

3.3 Data Cleaning and Preprocessing

The collected Bluetooth and Wi-Fi data include MAC addresses that belong to all detected devices within a certain range. This range may be changed by replacing the passive scanner antenna of TrafficBox™. However, under real practical conditions, the range depends on factors such as the weather, indoor obstacles, obstruction by urban infrastructure, etc. For the two test scenarios in which TrafficBox™ is placed inside the bus, the range for Wi-Fi/Bluetooth detections is estimated to be about 200 m. TrafficBox™ generates a CSV file that includes the MAC address, the device type, the signal strength, location coordinates, and the time stamp for every detection. Note that sensors only detect Wi-Fi MAC addresses of devices that are actively communicating with the network; in contrast, sensors detect all paired Bluetooth devices without the need for them to be communicating with another source. The raw data collected by the sensors contain a considerable amount of noise, outliers, and other inconsistencies. For instance, at every bus stop, sensors detect MAC addresses that belong to boarding passengers as well as ones that belong to pedestrians, non-passengers, or other individuals. Sensors may furthermore detect MAC addresses that belong to other moving vehicles near the bus, or to other individuals whose distance from the bus is less than 200 m. Moreover, stationary Wi-Fi routers may have a long detection range, and they should also be considered a source of noise [1].
In practical situations, some passengers may turn their Bluetooth and/or Wi-Fi devices on or off during the trip [1]. Hence, it is sometimes difficult to recognize noise and other outlier MAC addresses, even by eye. In this context, to alleviate the negative impact of noise and outliers, some preprocessing steps are recommended. In these steps, soft thresholds (instead of strict thresholds that completely remove outliers) are defined and applied to the raw data to remove outstanding outliers; the remaining outliers are automatically removed through clustering. In this research, data preprocessing is performed in Python 3 with the Pandas library. Dunlap et al. [1] have described preprocessing steps that include applying strict thresholds. This research uses some of their preprocessing steps, but our thresholds are smaller. In the first step, based on the type of device, Wi-Fi MAC addresses are separated from Bluetooth ones; clustering algorithms are applied separately to the Wi-Fi and Bluetooth MAC addresses. In the next step, a threshold is defined on the number of detections Ndetect for every unique MAC address: MAC addresses whose number of detections is smaller than Ndetect are removed. In this research, Ndetect is set to 2, such that

Detections per travel > Ndetect.   (2)

Another important factor for preprocessing is the travel time, defined as the difference in time between the first and the last detection. The next step is to remove MAC addresses whose travel time is smaller than a threshold Ttravel, keeping only

Detections with travel time > Ttravel.   (3)

In this research, the threshold on the travel time for both Bluetooth and Wi-Fi devices is set to Ttravel = 30 s. This means that MAC addresses with a travel time smaller than 30 s are removed. In the final step, unique MAC addresses (Bluetooth and Wi-Fi separately) are identified, and the average of their signal strength over all detections is calculated. After that, only MAC addresses with an average signal strength greater than a threshold Sstrength are kept, such that

Average signal strength > Sstrength.   (4)

In this research, the threshold on the average signal strength for Wi-Fi and Bluetooth detection data is set to Sstrength = −80 dB. This means that MAC addresses with an average signal strength smaller than −80 dB are filtered out.
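A minimal sketch of how the soft thresholds of Eqs. (2)-(4) might be applied with pandas is shown below (the paper states that preprocessing was done in Python 3 with the Pandas library, but the column names 'mac', 'rssi', 'timestamp', and 'device_type' are assumptions made here for illustration). The per-MAC aggregates computed along the way are also the quantities used as features in the next subsection.

# Sketch of the preprocessing thresholds: aggregate detections per unique MAC address,
# then keep only addresses with enough detections, a long enough travel time, and a
# strong enough average signal.
import pandas as pd

N_DETECT, T_TRAVEL_S, S_STRENGTH_DB = 2, 30, -80

def preprocess(df: pd.DataFrame, device_type: str) -> pd.DataFrame:
    d = df[df["device_type"] == device_type]
    agg = d.groupby("mac").agg(
        n_detections=("rssi", "size"),
        avg_rssi=("rssi", "mean"),
        var_rssi=("rssi", "var"),
        first_seen=("timestamp", "min"),
        last_seen=("timestamp", "max"),
    )
    agg["travel_time_s"] = (agg["last_seen"] - agg["first_seen"]).dt.total_seconds()
    keep = (
        (agg["n_detections"] > N_DETECT)              # Eq. (2)
        & (agg["travel_time_s"] > T_TRAVEL_S)         # Eq. (3)
        & (agg["avg_rssi"] > S_STRENGTH_DB)           # Eq. (4)
    )
    return agg[keep]

# Example usage (assuming a raw_detections DataFrame with the columns above):
# wifi = preprocess(raw_detections, device_type="wifi")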
3.4 Feature Extraction and Feature Engineering

Clustering refers to the task of dividing data points into groups such that data points in the same group have more similar properties than data points in other groups. In this context, clustering algorithms can be used to detect anomalies (discords). Anomalies are unusual or unexpected patterns that occur in a dataset [10]. To use clustering algorithms for the anomaly detection of time-series data, there are three main approaches [10]: (1) model-based approaches, (2) feature-based approaches, and (3) shape-based approaches. In the model-based approach, a parametric model is created for each time-series dataset; the raw time series is thereby converted into model parameters. A suitable model distance and a clustering algorithm are then selected to cluster the dataset into groups. In the feature-based approach, every time-series dataset is converted into a feature vector, and the clustering algorithm is then applied to the feature vectors to divide them into distinct groups. The third approach is shape-based clustering, in which the shapes of time-series datasets are compared based on a similarity index; some nonlinear stretching and contracting transformations are initially applied to the datasets to match them as much as possible [10]. In this research, the feature-based approach is used: every time-series MAC address sensing dataset (after passing through the preprocessing steps) is converted into a feature vector. The generated feature vectors are then fed into clustering algorithms so that the MAC addresses that belong to passengers inside the bus are clustered into one group. Note that clustering algorithms divide datasets into groups based on the statistical properties of the features. In this research, the feature vector is defined from statistical properties of the MAC addresses that belong to passengers inside the bus. It is given by:

x := [ avg(s)  var(s)  n  ΔT ]^T,   (5)

where avg(s) and var(s) are, respectively, the average and the variance of the signal strength values, calculated over all detections of every unique MAC address. Moreover, n and ΔT are the number of detections and the travel time for each MAC address, respectively. The number of feature vectors is equal to the number of unique MAC addresses. Note that before clustering, the feature vectors are normalized so that they have zero mean and unit Euclidean norm.

4 Cluster Modeling and Analysis

Most of the classic clustering algorithms need the number of clusters as an input, e.g., K-means clustering, K-medoids clustering, hierarchical clustering, etc. In contrast, some advanced clustering algorithms automatically select the number of clusters, e.g., Affinity Propagation, Mean shift, DBSCAN, etc. Hence, to apply the classic clustering algorithms, the optimal number of clusters is required. In this context, there are statistical measures in the literature [11] (e.g., the Davies-Bouldin index, Silhouette analysis, etc.) that may be used to determine the best number of clusters.

4.1 Number of Clusters

The Silhouette analysis is used in this research to determine the optimal number of clusters for the classic clustering algorithms. Silhouette analysis [12] is a powerful tool for interpretation and validation of the consistency within clusters of data points. It is mainly based on evaluating the separation distance between the clusters generated by a clustering algorithm [12]. The Silhouette analysis provides an index that shows how similar a data point is to its own cluster (cohesion) compared to other clusters (separation). This index lies in the range [−1, +1], where a high value near +1 indicates that the corresponding datum is well matched to its cluster and far from neighboring clusters. An index of 0 indicates that the corresponding data point is very close to the decision boundary between two neighboring clusters, and a negative index indicates that the datum has been assigned to the wrong cluster [12]. The Silhouette index can furthermore be used to visually determine the proper number of clusters. The Silhouette index is calculated from the mean intra-cluster distance a and the mean nearest-cluster distance b for each data point [12]. The Silhouette coefficient s(i) for data point i is given by [12]:

s(i) = (b(i) − a(i)) / max{a(i), b(i)}.   (6)

Note that b(i) is the mean distance between data point i and the points of the nearest cluster to which i does not belong. It follows from Eq. (6) that −1 ≤ s(i) ≤ +1.
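A minimal sketch of the Silhouette-based selection of the number of clusters described in Sect. 4.1 is given below, assuming scikit-learn and using synthetic stand-ins for the normalized feature vectors of Eq. (5): K-means is fitted for several candidate values of k, and the k with the largest mean Silhouette coefficient is kept.

# Silhouette-based choice of the number of clusters: fit K-means for k = 2..6 on
# synthetic 4-dimensional feature vectors and report the k with the highest score.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(loc=c, scale=0.4, size=(60, 4))
               for c in ((0, 0, 0, 0), (4, 0, 0, 0), (0, 4, 0, 0))])

scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
print({k: round(s, 2) for k, s in scores.items()}, "-> best k =", best_k)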
Figure 4 presents values of the Silhouette index versus the number of clusters for Wi-Fi data under the two test scenarios. According to Fig. 4, the optimal number of clusters for both test scenarios is equal to 3, since the corresponding Silhouette index for each scenario has the largest value. Moreover, Fig. 5 presents a graphical representation of the Silhouette index obtained by the K-means algorithm. Figure 5 confirms that clustering the data into 3 clusters results in well-separated groups of data points, where all clusters pass the average Silhouette index (i.e., the dashed line). Due to lack of space, this paper only presents results corresponding to Wi-Fi MAC address data.

Fig. 4. Values of the Silhouette index versus the number of clusters for Wi-Fi data.

Fig. 5. Graphical representation of the Silhouette analysis for 3 clusters (Wi-Fi MAC addresses).

Tables 1 and 2 present numeric values of the Silhouette index versus the number of clusters obtained by the K-means algorithm for each test scenario. As presented, independent of the cluster number, the Silhouette index has a positive value close to 1, which confirms the proper performance of the K-means algorithm for clustering Wi-Fi data. For both scenarios, the optimal value of the cluster number is equal to 3.

Table 1. Silhouette index versus cluster numbers under test scenario #1

Number of clusters:      2     3     4     5     6
Silhouette coefficient:  0.68  0.72  0.71  0.57  0.53

Table 2. Silhouette index versus cluster numbers under test scenario #2

Number of clusters:      2     3     4     5     6
Silhouette coefficient:  0.51  0.57  0.54  0.54  0.55

4.2 Building Cluster Models

In this research, several algorithms are selected from the three clustering approaches discussed above. They are applied to the feature vectors, and their performance in recognizing Wi-Fi MAC addresses is compared under the two test scenarios. Note that the feature vectors are generated from the preprocessed data, and hence the outstanding noise and outliers have already been removed. The K-means, fuzzy K-means, and K-medians clustering algorithms are selected from the center-based approach. The agglomerative hierarchical clustering and the spectral clustering algorithms are selected from the graph-based approach. The DBSCAN and the Gaussian mixtures algorithm come from the density-based approach. All of the above algorithms, except DBSCAN, need the number of clusters as an input. As discussed in Sect. 4.1, the optimal number of clusters is equal to 3. In this context, cluster 1 contains Wi-Fi MAC addresses that certainly belong to persons traveling by the bus, cluster 2 represents the ones that certainly belong to persons outside the bus, and cluster 3 contains MAC addresses that more likely belong to people outside but near the bus. The decision on the cluster labels is made by looking at the clusters' centroids. Simulation results need to be checked manually to ensure the proper performance of the algorithms. Note that route 101 mostly passes through areas that are far from the downtown, whereas route 85 mostly passes through the downtown. Hence, test scenario #2 deals with clustering a larger dataset collected from a crowded bus and crowded bus stops.

Fig. 6. Profiles of signal strengths over time for Wi-Fi data collected from route 101.

Fig. 7. Profiles of signal strengths over time for Wi-Fi data collected from route 85.
Figure 6 presents profiles of signal strengths for Wi-Fi MAC addresses of test scenario #1, before and after clustering, and Fig. 7 presents the ones under test scenario #2. The clustered data are obtained using the K-means algorithm. From Figs. 6 and 7, it can be seen that the K-means algorithm successfully separates the Wi-Fi MAC addresses that belong to passengers in the bus under the two different test scenarios. To intuitively check the performance of the clustering algorithms, it is a good idea to look at the clustered features. Figure 8 presents 2D plots of the features related to test scenario #1, clustered using the K-means algorithm. There are three main clusters, whose centroids are represented by the numbers 1, 2, and 3, respectively. The clusters' centroids are surrounded by data points whose feature values are close. Figure 8 shows that the K-means algorithm successfully clusters the data points into three groups based on their feature values.

Fig. 8. 2D plots of clustered features generated by K-means clustering for Wi-Fi data of route 101.

4.3 Performance Evaluation of Clustering Algorithms

There are two main approaches for evaluating the performance of clustering algorithms. The first approach concentrates on defining a statistical measure that numerically quantifies how well similar data points are clustered into a group, without knowing the labels. The second approach requires knowledge of the ground-truth classes (similar to supervised learning) and is based on the manual assignment of labels to data points during the experiments. In this paper, the Silhouette analysis was employed to determine the optimal number of clusters. Note that values of the Silhouette index may also be used as a statistical measure for evaluating the clustering performance. The Silhouette index for the first and the second test scenario, assuming 3 clusters, is obtained as 0.72 and 0.57, respectively. Values of the Silhouette index versus the number of clusters were presented in Fig. 4. As shown, the values of the Silhouette index are positive and relatively close to 1, and hence the proper performance of the K-means algorithm for clustering similar data is statistically confirmed. To follow the second approach and evaluate the clustering performance manually, the Wi-Fi MAC address data obtained from the two test scenarios are labeled. After that, the accuracy of the clustering algorithms is evaluated based on metrics that include the Adjusted Rand Index [7], the Adjusted Mutual Information index [7, 13], the Homogeneity-Completeness-V-measure score [7], etc. Homogeneity checks whether each cluster K contains only members of a single class C [7]. Completeness checks whether all members of a given class C are assigned to the same cluster K [7]. Both Homogeneity and Completeness scores are in the range [0, 1], where a larger value represents better performance. The homogeneity and completeness scores are respectively calculated by [7]

h = 1 − H(C|K) / H(C),   (7)

c = 1 − H(K|C) / H(K),   (8)

where H(C|K) is the conditional entropy of the classes given the cluster labels and is calculated by [7]:

H(C|K) = − \sum_{c=1}^{|C|} \sum_{k=1}^{|K|} \frac{n_{c,k}}{n} \log\left(\frac{n_{c,k}}{n_k}\right),   (9)

and H(C) is the entropy of the classes, calculated by [7]:

H(C) = − \sum_{c=1}^{|C|} \frac{n_c}{n} \log\left(\frac{n_c}{n}\right).   (10)

Note that n is the number of data points, n_c and n_k are respectively the numbers of data points that belong to class c and cluster k, and n_{c,k} is the number of data points from class c that are assigned to cluster k [7]. Moreover, the harmonic mean of Homogeneity and Completeness is referred to as the V-measure and is used to evaluate the agreement of two independent assignments on the same dataset [7, 13]. The V-measure score ranges over [0, 1] and is calculated by [7]:

v = 2 × (h × c) / (h + c).   (11)
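The scores of Eqs. (7)-(11) are available directly in scikit-learn; the following small sketch, with hypothetical ground-truth and cluster labels standing in for the manually labeled MAC addresses, shows how they might be computed.

# Homogeneity, completeness, and V-measure for a toy labeling; the label values
# (0 = on-bus passenger, 1 = outside the bus, 2 = near the bus) are illustrative only.
from sklearn.metrics import homogeneity_completeness_v_measure

true_labels    = [0, 0, 0, 1, 1, 2, 2, 2, 1, 0]
cluster_labels = [0, 0, 0, 1, 1, 2, 2, 1, 1, 0]   # output of some clustering algorithm

h, c, v = homogeneity_completeness_v_measure(true_labels, cluster_labels)
print(f"homogeneity={h:.3f} completeness={c:.3f} v-measure={v:.3f}")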
Table 3 presents the Homogeneity-Completeness-V-measure scores calculated for the clustering algorithms under test scenario #1. According to Table 3, the K-means, the hierarchical, and the spectral clustering algorithms have the best performance. Note that in this research some of the clustering algorithms (e.g., DBSCAN and the Affinity Propagation algorithm) did not show an acceptable performance and hence are not considered for comparison.

Table 3. Values of homogeneity-completeness-V-measure scores for test scenario #1

Clustering algorithm      Homogeneity score   Completeness score   V-measure
K-means                   0.896               0.953                0.924
K-medians                 0.821               0.818                0.820
Fuzzy K-means             0.893               0.886                0.890
Hierarchical clustering   0.896               0.953                0.924
Gaussian Mixture          0.857               0.852                0.855
Spectral clustering       0.896               0.953                0.924

5 Conclusion

This paper presented applications of clustering algorithms for removing noise and outliers from Wi-Fi and Bluetooth MAC address detections. To estimate the traffic load and provide an intelligent automated transit plan for public transit buses, it is important to separate MAC addresses that belong to passengers in the bus from those that belong to persons outside the bus. Wi-Fi and Bluetooth detection data were initially passed through preprocessing steps that included applying thresholds to remove outstanding noise and outliers. After that, clustering algorithms were used to automatically filter out the noise based on four features: (a) the average of the signal strength over all detections; (b) its variance; (c) the number of detections; and (d) the travel time. The performances of the clustering algorithms were moreover compared in terms of the Homogeneity-Completeness-V-measure score. It is concluded that the K-means, the hierarchical clustering, and the spectral clustering algorithms had the best clustering performance. Future studies include using the clustering algorithms for origin-destination (OD) estimation, predicting the traffic load at each bus stop, and building an automated intelligent transit plan for public transit buses.

Acknowledgments. This research was supported by the Ontario Centres of Excellence (OCE) Grant 27911–2017 and NSERC Engage Grant EGP 514854–17, in collaboration with SMATS Traffic Solutions.

References

1. Dunlap, M., Li, Z., Henrickson, K., Wang, Y.: Estimation of origin and destination information from Bluetooth and Wi-Fi sensing for transit. Transp. Res. Rec. J. Transp. Res. Board 2595, 11–17 (2016)
2. Ji, Y., Zhao, J., Zhang, Z., Du, Y.: Estimating bus loads and OD flows using location-stamped farebox and Wi-Fi signal data. J. Adv. Transp. 2017
3. Kostakos, V., Camacho, T., Mantero, C.: Towards proximity-based passenger sensing on public transport buses. Pers. Ubiquitous Comput. 17(8), 1807–1816 (2013)
4. Blogg, M., Semler, C., Hingorani, M., Troutbec, R.: Travel time and origin-destination data collection using Bluetooth MAC address readers.
In: Australasian Transport Research Forum, vol. 36 (2010)
5. Kostakos, V., Camacho, T., Mantero, C.: Wireless detection of end-to-end passenger trips on public transport buses. In: 13th IEEE International Conference on Intelligent Transportation Systems (ITSC), Funchal, Madeira Island, Portugal, pp. 1795–1800 (2010)
6. Tan, P., Steinbach, M., Kumar, V.: Introduction to Data Mining, 1st edn. Pearson Education Inc., Boston (2006)
7. Calinski, T., Harabasz, J.: A dendrite method for cluster analysis. Commun. Stat. Theory Methods 3, 1–27 (1974)
8. Park, H., Jun, C.H.: A simple and fast algorithm for K-medoids clustering. Expert Syst. Appl. 36(2), 3336–3341 (2009)
9. Rafsanjani, M., Varzaneh, Z., Chukanlo, N.: A survey of hierarchical clustering algorithms. J. Math. Comput. Sci. 5(3), 229–240 (2012)
10. Aghabozorgi, S., Shirkhorshidi, S., Wah, T.: Time-series clustering: a decade review. Inf. Syst. 53, 16–38 (2015)
11. Legany, C.: Cluster validity measurement techniques. In: Proceedings of the 5th WSEAS International Conference on Artificial Intelligence, Knowledge Engineering and Data Bases, Madrid, Spain (2006)
12. Muca, M., Kutrolli, G., Kutrolli, M.: A proposed algorithm for determining the optimal number of clusters. Eur. Sci. J. 11(36), 1857–7881 (2015)
13. Rosenberg, A., Hirschberg, J.: V-measure: a conditional entropy-based external cluster evaluation measure. In: Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, Prague (2007)

Economic and Performance Based Approach to the Distribution System Expansion Planning Problem Under Smart Grid Framework

Hatem Zaki1(✉), R. A. Swief2(✉), T. S. Abdel-Salam2(✉), and M. A. M. Mostafa2(✉)
1 BC Hydro, Vancouver, BC, Canada
hatemzaki@mail.com
2 Ain Shams University, Cairo, Egypt
rania.swief@gmail.com, tarekabdelsalam@gmail.com, mahmoud.a.mostafa@hotmail.com

Abstract. This paper proposes a new vision of the Distribution System Expansion (DSE) problem considering new system performance measures. The mathematical model has been rebuilt with a new combined multi-objective formula, minimizing the system expansion capital costs and the Operations and Maintenance (OM) costs while achieving the best combined performance measure, consisting of a combination of Reliability, Resiliency and Vulnerability. A new practical weighted combined system performance index is applied and tested for use by utilities in place of the common simple reliability indices. The new model applies multi-objective optimization with mixed-integer design variables and includes a combination of seven logical and technical constraints to provide the best description of the real existing system constraints. In addition to the new proposed system performance index, a new algorithm for checking the system's radial topology is proposed. The objective is to find the optimum sizing, timing and location of substations in the power distribution network. The proposed approach has been tested on a 14-bus real distribution system to demonstrate its validity and effectiveness on real systems. The proposed approach has also been tested on the IEEE 37-bus model distribution system with modified parameters that are significantly larger and more complex than the parameters frequently found in the literature.
Keywords: Distribution system expansion · Smart grids · Reliability · Resiliency · Vulnerability · Genetic algorithm

© Springer Nature Switzerland AG 2019
K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 313–332, 2019. https://doi.org/10.1007/978-3-030-02686-8_25

1 Introduction

The distribution system is a vital part of the electric power system, intended to connect the transformer substations and the customers. DSE is a fundamental task for system planners, asset managers, and operators. DSE is usually driven by the need to add capacity to the system due to load growth and the inability of existing systems to serve future loads. Finding an optimized solution to the DSE problem helps in making right, sound and justifiable decisions, and forms a good defense for any investment decision, especially a decision related to building or expanding a transformer substation with a significant investment. This investment may impact electricity rates and hence affect the financial performance of the utility. DSE planning involves decision making with multiple conflicting criteria, such as capital investment and OM costs, energy losses, and reliability. A utopian solution that optimizes all these objectives at the same time does not exist. Instead, a set of optimal trade-off solutions exists, in which an improvement in any objective leads to deterioration in other objectives. For example, a reduction in the investment cost by using smaller cross-sectional-area conductors or a lower-class metal (such as using aluminum instead of copper) increases energy losses and limits the ability to transmit power over longer distances, hence dictating lower utilization of assets [1]. In traditional research, the DSE problem was modeled using a single-objective formula, often called the cost function. This cost function followed the model presented by Gonen et al. in several publications, with many improvements over the years [2]. This model was further improved and may now include non-financial measures such as energy losses or reliability after aggregating them to fit into the financial formula [1]. Even though Gonen's school of thought dealt with a wide spectrum of DSE problems, it lacked the smart grid dimension of including performance as a measure when making such decisions. In today's decision making under Smart Grid (SG) approaches, system performance is a vital characteristic of the DSE problem and must be included among the objectives when making such a significant decision [3]. Several performance measures have been proposed by researchers in the past, most of them based on the Energy (or Demand) Not Served (ENS) principle [4]. Some researchers and utilities included reliability in the form of customer choices as a newer direction in reliability measures; this formed a means of addressing customer expectations and was called Customer Based Reliability (CBR) [5]. Many utilities have used one or a combination of common reliability indices as a decision-making objective for system expansions and alterations [6]. Lately, after a few destructive storms in North America, the IEEE standards and guides introduced the concept of resiliency as a performance measure [7]. The resiliency of a system is related to the common Customers Experiencing Lengthy Interruption Durations reliability index (CELID), but it looks at outages of extremely long duration, essentially more than 12 h.
Researchers and planners have used resiliency indices for allocating sectionalizing switches on distribution system feeders to avoid entire feeder outages in case of contingencies [8]. Resiliency measures can be applied to a type of outage characterized by being significant but expected (such as storms and hurricanes); these outage causes have a local impact on a limited footprint. In this paper, a new modified resiliency index is applied, combining the commonly used resiliency index with the number of years under study. Vulnerability of a system can also be one of the performance measures of integrated systems. It has been widely used to assess the cyber-security of data management and control systems, including power system Supervisory Control And Data Acquisition (SCADA) [9]. Originally, vulnerability was widely applied to water systems and electric power generation systems; nowadays it is applied to electric power transmission systems with a quantitative risk approach. Vulnerability has several definitions based on the infrastructure it addresses. In the electric power system, vulnerability can be defined as the impact and likelihood of the outage of critical equipment in the system [10]. It describes the ability of the system to stay in service under an unusual disastrous event such as a significant destructive earthquake with an unpredicted destructive area or a terrorist attack. Recently, invulnerability has been applied to distribution system planning as a consideration using graph theory, by ranking all nodes in terms of their criticality with respect to the source node [11]. This approach forms a basis for understanding vulnerability and the criticality of the assets of a distribution system; however, a vulnerability measure was not presented in that recent research. In this paper, a new weighted combination of reliability, resiliency and vulnerability indices is proposed for application to distribution systems. These indices cover all expected and unexpected outage causes that may affect the distribution system infrastructure. A new index is then formed and used in the objectives of the DSE planning problem. To solve the newly formulated DSE model, multi-objective optimization (also called multi-criteria optimization, multi-performance or vector optimization) is used with an evolutionary solution algorithm [12]. Multi-objective optimization can be defined as the problem of finding a vector of decision variables which satisfies the constraints and optimizes a vector function whose elements represent the objective functions [13]. After the significant improvements in computer software and the evolution of Artificial Intelligence and nature-inspired techniques for solving complex multi-objective optimization problems [14], researchers have included reliability as an additional part of the objective function. Most researchers who have included reliability as a separate objective have used the Energy (or Demand) Not Served (ENS) concept as their main means of modelling reliability, such as Cossi et al. [3, 15]. A powerful class of heuristic optimization methods is the family of metaheuristic techniques. The Genetic Algorithm (GA) became particularly suitable for the DSE problem once a well-established formulation for dealing with multi-objective problems had been achieved [16].
In this paper, a commonly available Multi-Objective GA (MOGA) is used as a means of finding the optimum, or near-optimum, solution, with applications to modified IEEE test cases as well as real-life test cases. This paper is divided into six sections. In addition to this introduction, Sect. 2 describes the DSE problem, including the newly proposed parameters, and presents a new approach for determining the radial structure of the distribution system during the solution algorithm. The mathematical formulation and the solution algorithm are discussed in Sects. 3 and 4, respectively. Test cases are presented and discussed in Sect. 5, and a conclusion is provided in Sect. 6.

2 Problem Description

The DSE problem is usually represented as a mixed-integer multi-variable problem [17]. The variables in this paper represent the substation locations and the line segment status (opened/closed, or in-service/out-of-service). The model is presented to find the optimum size, timing, and location of the distribution substation, as well as to determine the optimum status of each line section (opened or closed) recommended for operations [18]. The optimum line section status hence identifies the system configuration. The model used in this paper was evaluated and many complexities were added to make it as close as possible to real systems. The developed model is simple but includes all objectives and constraints necessary to plan and operate the system. These constraints can be divided into logical and technical constraints. The logical constraints ensure that the solution provides a radial system, that all nodes are connected to one substation, and that one of the new substations is selected while the existing substations remain in service. The technical constraints include the voltage limits, the line segment conductor current thermal limit, and the power balance of the system (supply capacity equals total load). In this paper, the objective function is formed of two parts. The first part is the total life cycle asset cost, which includes the installation capital costs and the Operations and Maintenance (OM) costs. This part is represented by a Cost Index (COSTINDEX): the capital cost plus the present value of the life cycle costs, referred to the maximum asset cost of the system. The purpose of this referral is to normalize the value obtained and make it homogeneous with the other components of the objective function. The second part represents a combined system Contingency Index (CONDEX). This index consists of three weighted components, giving the planner the choice to prioritize one component over another by adjusting the three weights as required by the utility's strategic approach. CONDEX is formed of the following components:

(a) The Unified Reliability Index (URI) – this index has previously been used by utilities as a sole indicator of reliability [19]. It is formed of four (or more) common reliability indices: the System Average Interruption Frequency Index (SAIFI), the System Average Interruption Duration Index (SAIDI), the percentage of Customers Experiencing Multiple Interruptions of 4 or more (CEMI-4), and Customers Experiencing Lengthy Interruption Durations of 6 or more hours (CELID-6). In this paper, only these four indices are used due to the practical nature of the distribution system.
Other indices would require special, unusual measuring equipment to provide enough data for their calculation.

(b) The System Resiliency Index (SRI) – This is one of the reliability indices, but with a more stringent condition. The SRI is measured using CELID-12, which represents the percentage of customers experiencing outage durations of 12 h or more per year (or per study period). This definition is also provided by IEEE Std 1366-2012 and has been used by many utilities across the world [7]. SRI was slightly modified to include a measure of the past number of years, in order to add an argument that expresses the period over which the resiliency events occur during a certain study period. For example, if the study period is 5 years and 12-h outages occurred in 3 of the 5 years, then SRI becomes the sum of CELID-12 and the number 3 (assuming the total number of customers has not changed over the study period). This makes the SRI range anywhere from zero to six for a study period of five years. By using this methodology for calculating SRI, both the number of customers and the outage periods are included in this performance measure.

(c) The System Vulnerability Index (SVI) – This is the new index presented in this work. SVI represents the ability of the distribution system to stay in service during and after a massive disaster such as a massive earthquake, a one-of-a-kind storm with destructive wind speeds (not annual storms), large permanent floods, etc. To use a predictive vulnerability index, three weighted arguments are created and selected to form the SVI. This performance index is a function of the following arguments:
– the Node Distance Index (NDI), which represents the distance between each node and its source (a substation in most cases);
– the Node Failure Rate Index (NFRI), which represents the failure rate of each node's route as linked to its source;
– the Node Failure Duration Index (NFDI), which represents the failure duration of each node's route as linked to its source.

SVI is then formed as the sum of the weighted values of NDI, NFRI, and NFDI. Each of these measures is weighted according to its criticality for the distribution system planner and combined in the SVI. CONDEX is hence formed of the weighted sum of URI, SRI and SVI. The presence of these weights provides enough flexibility to adjust the system configuration according to the highest-priority index, in line with the strategy of the utility. The above-mentioned objectives are subjected to a number of constraints. These constraints limit the optimum solution to a practical, implementable solution, making the model as close as possible to the systems implemented in real life. These constraints are explained further in the Mathematical Formulation section. Figure 1 shows an overview of the proposed model of the DSE problem objective function. These objectives are subjected to four logical constraints and three typical technical constraints, all considered together in the solution of the DSE problem. These constraints are applied to represent real-world distribution systems, which usually operate under such constraints.

Fig. 1. Overview of the proposed model of the DSE problem

One of these constraints has also been rebuilt, with a new model, to better represent real systems.
This constraint is the radiality constraint, in which the final solution must respect the radial nature of the distribution power system to be operated.

2.1 Checking the Radial Structure of the System

In the DSE problem, previous research used a single check-point to identify whether the system is radial. Some algorithms compare the number of nodes to the number of line sections after generating the element-node incidence matrix [20]. The Floyd-Warshall algorithm has also been used to find the shortest path in single-source distribution systems [21]. Another method for representing the radiality constraint is to employ the branch-node incidence matrix [22]. These methods were typically oriented to special cases with stringent conditions and cannot be generalized. By studying these past algorithms, it can be observed that they have worked in the past but were conditioned on one or more of the following:

(a) test systems must NOT have internal loops supplied from the same line;
(b) all systems used have one source, or are modified to have one source before applying the algorithm.

In this paper, all radial structure conditions are combined under one algorithm. The proposed algorithm uses an iterative methodology to check for internal loops within the system. Before it terminates, the algorithm uses a connectivity check to ensure that all nodes are connected to a source, and to only one source. Figure 2 presents an overview of the proposed algorithm.

Fig. 2. Radial structure checking algorithm overview

The proposed radial checking algorithm starts by isolating all power sources of the system (such as DG, energy storage, etc.), turning it into the classical, well-known distribution system supplied by substations. The algorithm then performs the following checks:

(A) Checking for internal loops
Internal loops are nodes and branches on the same feeder emerging from one node on the feeder and terminating on another node on the same feeder. Internal loops in graphs are called cycles (or network cycles). In graph theory, there are many numerical methodologies capable of determining the presence of cycles in a graph [23]. One of these methodologies is the Iterative Loop Counting Algorithm (ILCA). This method is characterized by returning the total number of cycles in a graph, as well as by its ease of programming. ILCA searches for loops by moving along a dynamic path. The use of this dynamic path essentially turns the network into a tree, and the path at any given time is a line from the top of the tree to any of the nodes on the branches. Loops occur whenever a node ID exists in two separate places on the path.

(B) Checking for connectivity to a supply node
The connectivity to a supply (or a substation) can be determined using the well-known Floyd-Warshall shortest-paths algorithm, which is part of graph theory applications [24].

(C) Checking if any node is supplied by more than one source
This is a simple algorithm that also uses the connectivity algorithm explained before to determine whether any node in the network is supplied by more than one substation.

As mentioned in the above explanation, the algorithm for determining the presence of loops extensively uses graph theory. It is very similar to the spanning-tree algorithm, with a different alignment to match the required results.
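The following is a hedged Python sketch of the three checks described above, written with the networkx graph library rather than the paper's own ILCA implementation: it flags internal loops via a cycle basis and verifies that every load node is connected to exactly one substation.

# Radial-structure check sketch: returns 1 if the closed-line configuration is loop-free
# and every load node is fed by exactly one substation, else 0.
import networkx as nx

def radial_flag(nodes, closed_lines, substations):
    g = nx.Graph()
    g.add_nodes_from(nodes)
    g.add_edges_from(closed_lines)              # only in-service (closed) line sections
    if nx.cycle_basis(g):                       # (A) any internal loop disqualifies the plan
        return 0
    for n in set(nodes) - set(substations):
        feeders = [s for s in substations if nx.has_path(g, n, s)]
        if len(feeders) != 1:                   # (B) unfed node or (C) fed by two sources
            return 0
    return 1

# Toy 5-node example: substation 1 feeding a radial tree -> prints 1.
print(radial_flag(nodes=[1, 2, 3, 4, 5],
                  closed_lines=[(1, 2), (2, 3), (2, 4), (4, 5)],
                  substations=[1]))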
3 Mathematical Formulation As mentioned in the Problem Description section, the problem is formed of two objectives. The objective function is formed of two parts to be aggregated and minimized under one representation. In order to build the model on an index basis, the two parts of the objective function are normalized by referring them to maximum values in the system. As such, the mathematical minimization problem can simply be written as:

Minimize $COSTINDEX = F_{Normalized} + CONDEX_{Normalized}$   (1)

Equation (1) describes the overview of the objective function. The components of the objective function are as follows: 3.1 Minimization of Assets Life Cycle Costs (F) The capital investment costs and the net present value of the O&M costs are combined under the following formula:

$F = \frac{1}{C_{st,t,max}} \sum_{t=1}^{T} \left\{ \sum_{i=1}^{stn} \left( C_{st,t} X_i + OM_{st,t} \right) + \sum_{j=stn+1}^{stn+m} \left( C_{l,t} X_j + OM_{l,t} \right) \right\}$   (2)

where F is the total life cycle cost of the assets during the study period; T is the number of years of the study period; stn is the total number of substations, including old and new substations; m is the total number of line sections; $C_{st,t}$ and $C_{l,t}$ are the total investment costs of substation st and line section l at year t; $OM_{st,t}$ and $OM_{l,t}$ are the net present values of the operation and maintenance costs for substation st and line section l at year t; X is the binary design variable reflecting the status of substations and line sections; and $C_{st,t,max}$ is the highest asset cost in the system. In order to accommodate the unit differences, all values were normalized by reference to the highest asset cost in the system. This way, all objective function arguments can be added without compromising units or values. 3.2 Minimization of the Contingency Index (CONDEX)

$CONDEX = A \cdot URI + B \cdot SRI + C \cdot SVI$   (3)

where A is the weighting factor of the Unified Reliability Index (URI), B is the weighting factor of the System Resiliency Index (SRI), and C is the weighting factor of the System Vulnerability Index (SVI). URI is mathematically defined as:

$URI = a_1 \cdot SAIFI + a_2 \cdot SAIDI + a_3 \cdot CEMI\text{-}4 + a_4 \cdot CELID\text{-}6$   (4)

where $a_1, a_2, a_3, a_4$ are the weighting factors of each reliability index; SAIFI is the reliability index known as the System Average Interruption Frequency Index; SAIDI is the reliability index known as the System Average Interruption Duration Index; CEMI-4 is the percentage of Customers Experiencing Multiple Interruptions of 4 or more; and CELID-6 is the percentage of Customers Experiencing Lengthy Interruption Durations of 6 h or more. SRI is mathematically defined as:

$SRI = CELID\text{-}12 + N_{yrs}$   (5)

where CELID-12 is the percentage of Customers Experiencing Lengthy Interruption Durations of 12 h or more for a given number of years, and $N_{yrs}$ is the ratio of the number of years containing outages of 12 h or more to the span of years over which CELID-12 has been applied. It is common to use 5 years in most cases as the ultimate number of years for system resiliency measurement. SVI is mathematically defined as:

$SVI = c_1 \cdot NDI + c_2 \cdot NFRI + c_3 \cdot NFDI$   (6)

where $c_1, c_2, c_3$ are the weighting factors of each vulnerability index, and NDI, NFRI and NFDI are as previously defined in the Problem Description section. The above-mentioned objectives are subject to a number of constraints that make the simulation as close as possible to the system in the field.
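Since the index arithmetic of Eqs. (1)-(6) is linear, it can be sketched compactly. The following hedged Python illustration uses placeholder weights, assumes the reliability and vulnerability inputs (SAIFI, SAIDI, CEMI-4, CELID-6, CELID-12, NDI, NFRI, NFDI) are produced by a separate study, and flattens Eq. (2) into a single asset list rather than the full double summation; none of the function or argument names come from the authors' implementation.

```python
# Hedged sketch of the objective of Eqs. (1)-(6); weights and inputs are
# placeholders, not values from the paper.

def uri(saifi, saidi, cemi4, celid6, a=(0.25, 0.25, 0.25, 0.25)):
    # Eq. (4): weighted sum of the four reliability indices
    return a[0]*saifi + a[1]*saidi + a[2]*cemi4 + a[3]*celid6

def sri(celid12, n_yrs_ratio):
    # Eq. (5): CELID-12 plus the ratio of years containing 12-h-plus outages
    return celid12 + n_yrs_ratio

def svi(ndi, nfri, nfdi, c=(1/3, 1/3, 1/3)):
    # Eq. (6): weighted sum of the three vulnerability arguments
    return c[0]*ndi + c[1]*nfri + c[2]*nfdi

def condex(uri_val, sri_val, svi_val, A=1.0, B=1.0, C=1.0):
    # Eq. (3): planner-chosen weights A, B, C
    return A*uri_val + B*sri_val + C*svi_val

def life_cycle_cost(capital, om_npv, status):
    # Eq. (2), flattened: capital + NPV of O&M for the selected (status = 1)
    # substations and line sections, normalised by the highest asset cost.
    total = sum(x*(c + om) for x, c, om in zip(status, capital, om_npv))
    return total / max(capital)

def cost_index(f_norm, condex_norm):
    # Eq. (1): both terms are assumed to be already normalised
    return f_norm + condex_norm
```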
These constraints consist of two sets: (a) logical and (b) technical, as follows. (a) Logical constraints: i. Radiality of the system – This constraint makes sure the distribution system is optimized as a radial system with no loops. It is implemented by an algorithm, described in Sect. 2.1, returning a flag called RadialFlag. If the flag returns 1, the system is radial; if the flag returns 0, the system still has loops. ii. Connectivity of all nodes to a source – This constraint makes sure all nodes are supplied by at least one source (substation). It is also implemented by an algorithm returning a flag called concheck. If the flag returns 1, the system is healthy and fed by its available sources; if the flag returns 0, the system still has a disconnect. The algorithm uses the path function of graph theory as its basis to check connectivity between nodes and substations. iii. Selection of only one new substation – This is enforced by requiring that the sum of the status variables of all new substations proposed to expand the distribution system equals unity. The formula for this constraint is:

$\sum_{i=1}^{n_{newsubs}} X_i = 1$   (7)

where X is the decision variable of the optimization problem and $n_{newsubs}$ is the number of new substations from which one is to be selected. iv. Keeping the existing substations – If an existing substation has enough useful life, it should be kept in service and must be selected as part of the model. This is enforced by requiring that the product of the status variables of all existing substations in the distribution system equals unity. The formula for this constraint is:

$\prod_{i=1}^{n_{existing\,subs}} X_i = 1$   (8)

where X is the decision variable of the optimization problem and $n_{existing\,subs}$ is the number of existing substations. (b) Technical constraints: i. Voltage limits – The voltages of all system nodes must be within the standard range between a minimum value and a maximum value:

$V_{min} \le V_i \le V_{max}, \quad \forall i = 1, 2, 3, \ldots, n$   (9)

where n is the total number of nodes, not including source nodes; $V_{min}$ and $V_{max}$ are the standard allowable voltage limits; and $V_i$ is the node voltage. ii. Current thermal limit:

$I_i \le I_{max}, \quad \forall i = 1, 2, 3, \ldots, m$   (10)

where m is the total number of line sections, $I_{max}$ is the allowable current thermal limit of the line section conductor, and $I_i$ is the line section current flow. iii. Power balance for each substation – By adding all power flowing out of a substation and comparing this power to the substation capacity, a power balance index can be formulated. This is usually achieved by performing a load flow and adding the power flows in the first section of each feeder emerging from each substation. The condition can be expressed as:

$Powerflag = \begin{cases} 1, & \text{if } \sum_{i=1}^{n_{feeders}} PowerFlow_i \,/\, Substation_{capacity} > 1 \\ 0, & \text{if } \sum_{i=1}^{n_{feeders}} PowerFlow_i \,/\, Substation_{capacity} < 1 \end{cases}$   (11)

where $n_{feeders}$ is the number of feeders emerging from a substation. 4 Solution Methodology The solution methodology of the DSE problem, as modeled in this paper, starts by storing the values and parameters of the original system for comparison purposes. The methodology then calculates the objective function and optimizes the system by finding the minimum objective function value subject to the identified constraints. Figure 3 shows the overview of the proposed solution methodology. The solution of the optimization problem was obtained using the MOGA. The GA is a well-established algorithm that came into wide use in the early 1990s [25]. GAs (Goldberg 1989) are search algorithms based on the principles of natural genetics and evolution.
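As a rough illustration of the GA-based loop just introduced (not the authors' MOGA), the sketch below evolves a binary design vector, applies the constraints of this section as a feasibility filter with a large penalty, and stops on either a maximum generation count or a below-tolerance improvement. The evaluate and feasible callbacks, the penalty value and the crossover/mutation settings are all assumptions.

```python
import random

def evolve(pop_size, n_vars, evaluate, feasible, max_gens=200, tol=1e-4):
    """Single-objective GA sketch: `evaluate` returns the COSTINDEX of a 0/1
    design vector, `feasible` bundles the constraint flags (RadialFlag,
    concheck, Eqs. (7)-(11)). Infeasible candidates receive a large penalty."""
    pop = [[random.randint(0, 1) for _ in range(n_vars)] for _ in range(pop_size)]
    best, best_val = None, float("inf")
    for gen in range(max_gens):                       # stop 1: max generations
        scored = sorted(((evaluate(ind) if feasible(ind) else 1e9, ind)
                         for ind in pop), key=lambda s: s[0])
        if scored[0][0] < best_val - tol:
            best_val, best = scored[0][0], scored[0][1]
        elif gen > 0:
            break                                     # stop 2: improvement below tolerance
        parents = [ind for _, ind in scored[:pop_size // 2]]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, n_vars)
            child = a[:cut] + b[cut:]                 # one-point crossover
            if random.random() < 0.05:                # bit-flip mutation
                j = random.randrange(n_vars)
                child[j] = 1 - child[j]
            children.append(child)
        pop = parents + children
    return best, best_val
```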
Figure 4 shows the flow chart of the GA-based algorithm for solving the optimization problem. The stopping criteria, mentioned in Fig. 4, determine when to stop the GA. They include reaching the maximum number of iterations, obtaining a solution that meets the maximum tolerance in comparison to the previous solution, reaching the maximum number of population generations, etc. GAs have proven to be a useful approach to a wide variety of optimization problems. Being a population-based approach, the GA is well suited to solving multi-objective optimization problems. In this work, MOGA is applied to solve the proposed multi-objective, single-representation DSE planning problem. Fig. 3. Overview of the solution methodology 5 Case Studies Test cases have been performed to demonstrate the viability and effectiveness of the proposed model and the optimized solution obtained. Two test cases were chosen and are presented in this paper: the 14-node and the 37-node test systems. The first test case, the 14-node test system, is presented in detail with a deep analysis of its parameters and the obtained solutions. The proposed model was examined on this test case using two basic scenarios. The first scenario, called Case (a), reflects a case in which all line sections have outage durations of less than 12 h. The second scenario, called Case (b), reflects a case in which two line sections were modified to have outage durations of more than 12 h. Fig. 4. Genetic algorithm flow chart The second test case, the 37-node test case, is very similar to the first one and is therefore presented only briefly, with some discussion of its results. This test case was modified from the typical IEEE 37-node test case to reflect a balanced system, as well as the addition of a large DG connected directly to the existing substation. Both test cases obtained good results, with clear improvement from the proposed combination of cost and performance parameters in the objective function. 5.1 Test Case I: The 14-Node Test System Several scenarios were used for testing the algorithm on the 14-node test system. The first scenario provides parameters such that the SRI index is zero, which means that all line sections require less than 12 h to repair in case of an outage. The parameters of this test case are presented in Table 1. The failure rate and duration of outages are functions of each line section's age, installation quality, environment, erosion factors and location. These numbers are typical values for test purposes only and can be modified as required.

Table 1. 14-node test case line parameters
Line section no. | From | To | Conductor size (AWG) | Length (m) | Original status | Modified status | Failure rate (failures/year) | Duration of outage (h/year)
1 | 1 | 10 | 556.5 | 7290 | 1 | 1 | 1 | 0.2
2 | 2 | 10 | 556.5 | 5180 | 1 | 1 | 1 | 0.3
3 | 3 | 10 | 556.5 | 24,390 | 1 | 0 | 1 | 0.5
4 | 3 | 11 | 556.5 | 700 | 0 | 1 | 1 | 8.5
5 | 8 | 11 | 556.5 | 4530 | 0 | 1 | 1 | 0.7
6 | 9 | 11 | 556.5 | 1625 | 0 | 1 | 1 | 0.9
7 | 1 | 6 | 350 | 7320 | 1 | 0 | 2 | 1
8 | 2 | 4 | 350 | 5260 | 1 | 1 | 5 | 1.5
9 | 2 | 5 | 350 | 4770 | 1 | 1 | 2 | 0.6
10 | 2 | 7 | 350 | 6250 | 0 | 1 | 3 | 0.4
11 | 6 | 8 | 350 | 1890 | 1 | 1 | 7 | 7
12 | 7 | 8 | 350 | 4630 | 1 | 0 | 4 | 0.9
13 | 8 | 9 | 350 | 1000 | 1 | 0 | 2 | 0.3
14 | 12 | 13 | 556.5 | 700 | 0 | 0 | 1 | 0.2
15 | 3 | 12 | 556.5 | 725 | 0 | 0 | 1 | 0.6
16 | 12 | 14 | 556.5 | 121 | 0 | 0 | 1 | 0.4
17 | 5 | 13 | 350 | 3850 | 1 | 1 | 2 | 6
18 | 7 | 14 | 350 | 5100 | 1 | 1 | 3 | 1.1
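To make the use of these parameters concrete, the sketch below maps a few of the Table 1 rows onto a graph model, attaching the failure rate and outage duration as edge attributes that could later feed the NFRI/NFDI terms; the networkx dependency and the attribute names are illustrative assumptions only.

```python
# Illustrative mapping of a few Table 1 rows onto a graph model.
import networkx as nx

lines = [
    # (section, from_node, to_node, modified_status, failure_rate, outage_h)
    (1, 1, 10, 1, 1, 0.2),
    (4, 3, 11, 1, 1, 8.5),
    (11, 6, 8, 1, 7, 7.0),
]

G = nx.Graph()
for sec, u, v, status, rate, dur in lines:
    if status:                                   # only in-service sections
        G.add_edge(u, v, section=sec, failure_rate=rate, outage_h=dur)
```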
The data for the existing and proposed substations of this system are shown in Table 2. The original 14-node test system and the optimized system are both shown in Fig. 5. While Case (a) presents the original system that required attention from the planner, Case (b) presents the proposed modified system after applying the objective functions and all constraints. Figure 5, Case (a), shows the original system, which was a radial system supplied by substation 10. As a result of the load growth of the system, two feasible substations are proposed in two different locations, each with three emerging feeders to supply the load growth. It is required to select only one substation and determine the optimum system configuration that minimizes the overall costs while achieving the best reliability indices. After running the proposed algorithm on this system using all constraints, the result becomes Case (b), which proposes the transfer of four nodes from substation 10 to substation 11. Substation 11 is the selected candidate, and the system can now operate using the proposed configuration.

Table 2. Substations of the 14-node test system
Substation node ID | Capacity (kVA) | Capital cost ($k) | O&M annual costs ($k) | Existing/New
10 | 1000 | 7500 | 100 | Existing
11 | 2500 | 4000 | 150 | New
12 | 1500 | 1200 | 60 | New

Fig. 5. 14-node existing and modified test systems Table 3 presents the cost and performance values for both cases shown in Fig. 5. It is clear that, in order to improve the system performance and change it from a fully radial system to an open-loop system, there is an increase in costs. The open-loop system operates in a radial fashion with internal open line sections, called ties, used mainly during contingencies. The cost increased by approximately 1.5 times; however, there is a significant improvement in URI and SVI, which represent the system performance in this case. The reason SRI shows zero values in this case is that none of the line sections has been marked with a failure duration of more than 12 h. To further test the system, another run was made on the same test case after changing some of the line section failure durations to more than 12 h. For comparison purposes, the result of this new modified case is also presented in Table 3. Introducing SRI to the optimization process changes its result. In the same original test case, presented in Fig. 5, Case (a), the outage durations of line sections 5 and 6 were increased to 13 and 15 h, respectively. If Case (b) were maintained, its SRI would have become 3.3 and the total objective function value would have been 18.6. The impact of this failure-duration change was that the optimization algorithm chose substation 10 to be in service, as part of fulfilling the constraint of keeping the existing substation, and substation 12, instead of substation 11 of Case (b), as part of the solution to the expansion problem. As a result, line sections 14 and 16 were also recommended to be in service, and the final configuration became as seen in Fig. 6. This shift in substation choice is logical, as the algorithm tried to avoid supplying the system through the high-failure-duration sections 5 and 6. SRI still measures zero because these two lines were avoided.
Table 3. 14-node test case comparison of objective values
Index | Original system, Case (a) | Failure durations less than 12 h, Case (b) | Failure durations greater than 12 h on lines 5 and 6
F | 1.0312 | 1.5578 | 1.1894
URI | 19.6933 | 4.2031 | 4.956
SRI | 0 | 0 | 0
SVI | 11.0736 | 5.7909 | 6.3029
COSTINDEX | 31.7981 | 11.5518 | 12.4483

As expected, the total objective function value of the new case is higher than that of Case (b). However, the final values are still much better than the original objective function value, with substation 10 supplying the entire system. 5.2 Test Case II: The 37-Node Test System Similar to the 14-node test system, a 37-node test system was also used and analyzed to test the proposed optimization solution. In this test system, there is one existing substation supplying the entire load of the system, and there are three proposed substations in different locations and at different distances from the existing line sections. The 37-node test system is shown in Fig. 7. Fig. 7. The 37-node test system While the 14-node test system did not contain a DG connected to the system, the 37-node test system has a DG connected directly to the existing substation tied to node 4. The parameters of this 37-node test system are similar to those of the 14-node test system, except with a larger number of substations and line sections. Proposed substations are numbered 38, 39 and 40, and they represent three different locations with three and four feeders, as shown in Fig. 7. Due to the size of this test case, and to avoid crowded figures, only the original system is presented. Fig. 6. Modified network supplied from substations 10 and 12 after increasing the outage durations of line sections 5 and 6. The indices and the objective function value of the original and the optimized systems are shown in Table 4.

Table 4. 37-node test case comparison of objective values
Index | Original system case | Optimized system case
F | 1.0475 | 1.3425
URI | 6.0262 | 5.1416
SRI | 3.0495 | 3.0487
SVI | 5.2298 | 4.4474
COSTINDEX | 15.353 | 13.98
Line sections to be closed: 40, 41, 42 and 43; line sections to be opened: 4, 9 and 23.

While the SRI value remained almost the same, the cost (F) deteriorated and URI and SVI improved, hence improving the total objective function value. In this test case, line sections 10 and 39 are assumed to have failure durations of 15 and 18 h per failure per year. Since line section 39 is the main line from one of the proposed substations to the system, the algorithm was able to avoid it in the optimization process by excluding substation 38 from the selection. Line 10, however, is on the pathway of all substations; hence it was selected in all options and is unavoidable when optimizing the system. As in the original case, line 10 is also one of the main components of the system and cannot be set to open. Therefore, the improvement in SRI was marginal, as the algorithm searched for a lower value by avoiding other line sections with fewer customers, given its inability to change the failure duration and set this line section to open. 6 Conclusion In this work a new model for the DSE problem was proposed. The new model combines three performance indicators with the commonly used cost function in a multi-objective formulation. Seven constraints were used in the solution for the first time.
The new proposed model demonstrates its viability to arrive to an optimum solution considering the modern approaches of smart grids including per-formance when planning the expansion of distribution systems. After testing the model on two test systems with variable parameters it can be concluded that the model is a practical implementable model that proposes a solution suitable for ?nding a trade-off between cost and performance. The model can be easily applied in utilities and is recommended to be used by planners to help them make the best investment decisions. References 1. Luong, N.H., Grond, M.O.W., La Poutre, H., Bosman, P.A.N.: Scalable and practical multi-objective distribution network expansion planning. In: IEEE Power and Energy Society General Meeting (2015) 2. Vaziri, M., Tomsovic, K., Bose, A., Gonen, T.: Distribution expansion problem: formulation and practicality for a multistage globally optimal solution. In: IEEE, Power Engineering Society Winter Meeting (2001) 3. Cossi, A.M., da Silva, L.G., La Zaro, R.A.R., Mantovani, J.R.S.: Primary power distribution systems planning taking into account reliability, operation and expansion costs. In: IEEE, The Institute of Engineering and Technology (IET) Generation, Transmission and Distribution, no. ISSN 1751-8687 (2011). https://doi.org/10.1049/iet-gtd.2010.0666 4. de Souza, J., Rider, M.J., Mantovani, J.R.S.: Planning of distribution systems using mixed-integer linear programming models considering network reliability. J. Control Autom. Electr. Syst. (2015). https://doi.org/10.1007/s40313-014-0165-z 5. Mazhari, S.M., Monsef, H., Romero, R.: A multi-objective distribution system expansion planning incorporating customer choices on reliability. IEEE Trans. Power Syst., 1330–1340 (2015). https://doi.org/10.1109/TPWRS.2015.2430278 6. Muñoz-Delgado, G., Contreras, J., Arroyo, J.M.: Reliability assessment for distribution optimization models: a non-simulation-based linear programming approach. In: IEEE, Power and Energy Society General Meeting (2017) 7. IEEE std. 1366-2012 IEEE Guide for Electric Power Distribution Reliability Indices. IEEE Power and Energy Society (2013) 8. Zare-Bahramabadi, M., Abbaspour, A., Fotuhi-Firuzabad, M., Moeini-Aghtaie, M.: Resilience-based framework for switch placement problem in power distribution systems. IET Gener. Transm. Distrib. 12(5), 1223–1230 (2018). https://doi.org/10.1049/iet-gtd.2017. 0970 Economic and Performance Based Approach to DSE Planning Problem 331 9. Chee-Wooi, T., Chen-Ching, L., Govindarasu, M.: Vulnerability assessment of cybersecurity for SCADA systems. IEEE Trans. Power Syst. 23(4), 1836–1846 (2008) 10. Johansson, J.: Risk and vulnarability analysis of large-scale technical infrastructures. Ph.D. thesis, Media-Tryck, Lund University, Lund, Sweden, Lund, Sweden (2007) 11. Chen, J., Peng, M., Gao, X., Li, G.: Multi-objective distribution network planning considering invulnerability. In: IEEE 2nd Information Technology, Networking, Electronic and Automation Control Conference (ITNEC), Chengdu, China (2017) 12. Yang, X.-S.: Nature-Inspired Optimization Algorithms, Wlatham. Elsevier Inc., New York (2014) 13. Ramírez-Rosado, I.J., Bernal-Agustín, J.L.: Genetic algorithms applied to the design of large power distribution systems. IEEE Trans. Power Syst. 13(2), 696–703 (1998) 14. Yang, X.-S.: Nature-Inspired Optimization Algorithms. Elsevier Inc., New York (2014) 15. 
Pereira Jr., B.R., Contreras, J., Mantovani, J.R.S., Cossi, A.M.: Multiobjective multistage distribution system planning using tabu search. In: IEEE, The Institute of Engineering and Technology (IET) Generation, Transmission and Distribution, no. ISSN 1751-8687 (2013). https://doi.org/10.1049/iet-gtd.2013.0115 16. Coello, C.A.C.: An updated survey of GA-based multiobjective optimization techniques. ACM Comput. Surv. 32(2), 109–143 (2000) 17. Turkay, B.: Distribution system planning using mixed integer programming. In: ELEKTRIK, Istanbul, Tubutak Emo, vol. 6, no. 1 (1998) 18. Gonen, T., Ramirez-Rosado, I.J.: Optimal multi-stage planning of power distribution systems. IEEE Trans. Power Deliv., 512–519 (1987). https://doi.org/10.1109/TPWRD.1987. 4308135 19. Sindi, H., El-Saadany, E.: Uni?ed reliability index development for utility performance assessment. Intell. Ind. Syst. 2(2), 149–161 (2016) 20. Aghaei, J., Muttaqi, K.M., Azizivahed, A., Gitizadeh, M.: Distribution expansion planning considering reliability and security of energy using modi?ed PSO algorithm. University of Wollongong Research online, Faculty of Engineering and Information Sciences papers, Wollongong, Australia (2014) 21. Kumar, V., Krishan, R., Sood, Y.R.: Optimization of radial distribution networks using path search algorithm. Int. J. Electron. Electr. Eng. 1(3), 182–187 (2013) 22. Abdelaziz, A.Y., Osama, R.A., El-Khodary, S.M.: Recon?guration of distribution systems for loss reduction using Hyper-Cube Ant Colony optimization algorithm. IET Gener. Transm. Distrib. 6(2), 176–187 (2012) 23. Balakrishnan, R., Ranganathan, K.: A Textbook of Graph Theory, New York. Springer, New York (2013) 24. Floyd, R.W.: Algorithm 97: shortest path. Mag. Commun. ACM 5(6), 345–350 (1962) 25. Heidari, S., Fotuhi-Firuzabad, M., Kazemi, S.: Power distribution network expansion planning considering distribution automation. IEEE Trans. Power Syst. 30(3), 1261–1269 (2015) 332 H. Zaki et al. Connecting to Smart Cities: Analyzing Energy Times Series to Visualize Monthly Electricity Peak Load in Residential Buildings Shamaila Iram1(?) , Terrence Fernando2 , and Richard Hill1 1 University of Hudderts?eld, Hudderts?eld, UK S.Iram@hud.ac.uk 2 University of Salford, Greater Manchester, UK Abstract. Rapidly growing energy consumption rate is considered an alarming threat to economic stability and environmental sustainability. There is an urgent need of proposing novel solutions to mitigate the drastic impact of increased energy demand in urban cities to improve energy e?ciency in smart buildings. It is commonly agreed that exploring, analyzing and visualizing energy consump- tion patterns in residential buildings can help to estimate their energy demands. Moreover, visualizing energy consumption patterns of residential buildings can also help to diagnose if there is any unpredictable increase in energy demand at a certain time period. However, visualizing and inferring energy consumption patterns from typical line graphs, bar charts, scatter plots is obsolete, less infor- mative and do not provide deep and signi?cant insight of the daily domestic demand of energy utilization. Moreover, these methods become less signi?cant when high temporal resolution is required. In this research work, advanced data exploratory and data analytics techniques are applied on energy time series. Data exploration results are presented in the form of heatmap. Heatmap provides a signi?cant insight of energy utilization behavior during di?erent times of the day. 
Heatmap results are articulated from three analytical perspectives; descriptive analysis, diagnostic analysis and contextual analysis. Keywords: Energy e?ciency · Smart buildings · Data analytics · Heatmap 1 Introduction In recent years, energy data analytics has got tremendous attention of researchers, econ- omists, industrialists, and policy makers from all over the world. This could be because of the shortage of natural resources, environmental destruction, or proliferation of energy demand due to the development of urban cities. Confronted, with this rapid increase of energy demand, the researchers and scientists are ?nding greater interest to design and develop advanced techniques and methods that can help us to cope with energy crises or at least to mitigate its worst consequences. Moreover, the rapidly increasing energy consumption rate poses an alarming threat to the worldwide environmental sustainability and economic stability. International Energy Agency’s (IEA) statistics reveal that 32% of the total ?nal energy is being © Springer Nature Switzerland AG 2019 K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 333–342, 2019. https://doi.org/10.1007/978-3-030-02686-8_26 consumed by the buildings [1]. This percentage is even higher in non-industrial areas. The fact that how people consumes energy depends on human behaviour and other social, economic, environmental and geographical factors [2]. In recent years, energy e?ciency and saving strategies have become a priority objective for energy policies due to the proliferation of energy consumption and CO2 emission in the built environment. According to statistics 40% of all primary energy is being consumed in and by the buildings [3]. International Energy Agency (IEA) in [5] claims that “Energy e?ciency is a critical tool to relieve pressure on energy supply and it can also mitigate in part the competitive impacts of price disparities between regions”. Analyzing energy patterns and identifying variations in energy usage with the help of data mining techniques will help to build energy e?cient buildings. It is evident in the past 40 years that increasing energy e?ciency of the buildings helps not only to combat the climate changes but also to reduce the energy consumption [4]. Furthermore, this research work presents a framework that brings multi-domain knowledge to an interdisciplinary project to solve the unaddressed or partially addressed issue in the domain of energy e?cient smart buildings. In doing so, this research work elucidates the importance of mapping multi- domain experts’ opinion to develop the new policies in deploying the signi?cant changes. This new approach that combines social, economic, behavioural and psychological, environmental, statistical and compu- tational phenomena o?ers a dynamic and compelling framework for designing energy e?cient buildings. This research work also acts as a bridge to ?ll the communication gap between research community and the policy makers to make intelligent decisions based on scienti?c evidence. 1.1 Times Series Analysis In time series analysis concern lies in forecasting a speci?c quantity given that the variations in that quantity over time are already known. While, other predictive models that do not involve time series mainly focus on analysing a cross-sectional area of the data which do not have time variance component. 
As stated by Hilda et al., in [6], “When a variable is measured sequentially in time over or at a ?xed interval (sampling interval) the resulting data represents a time series”. They further elaborated that time series is a collection of observations arranged in a natural order where each observation is asso- ciated with a particular instance or interval of time. More speci?cally, time series, compared to common data, holds natural temporal ordering where common data does not necessarily have natural ordering of the obser- vations. Furthermore, Millan et al. [7] de?ned time series analysis as a process of using statistical techniques to model and explain a time-dependent series of data points. Whereas, time series forecasting uses a prediction model to forecast the future events based on the past events. This research work also presents the application of di?erent kinds of analytical and visualization techniques to understand energy utilization patterns in residential building. Data analytical results are visualized in the form of heatmap. Heatmap results are articu- lated from three di?erent analytical perspectives as descriptive analysis, diagnostic analysis and contextual analysis. Rest of the paper is structured as: State of the art work 334 S. Iram et al. is presented in Sect. 2 followed by methodological framework in Sect. 3. Exploratory data analytical techniques are elaborated in Sect. 4; whereas Sect. 5 details the data that is used in this research work along with data preprocessing techniques. Application of heatmap examples are explained in Sect. 6. Section 7 provides brief summary of the work along with conclusion and future research work. 2 Literature Survey Platchkov and Pollitt in their paper [8] critically analysed and overviewed the longer run trends of increasing global electricity demands and explain the potential impact in the UK electri?cation. They claimed that the underlying resources cost for the energy that is being used in di?erent times of the day or the year changes accordingly. For instance, on an o?-peak day the price per megawatt hour (MWh) in the power market does not rise above £50/MWh, however, on the peak day the price may reach to £800 for half hour periods across a 24-h period. This implies that, for median days there is a comparatively great incentive of using electricity during night time. The main emphasis of their research work is that the demand will increase steadily over time but the possible coping solution is to shift the energy demand to o?-peak time. Therefore, a small demand response, either by reducing the consumption or by shifting it to the cheaper time can make a signi?cant di?erence in cost for residential as well as for commercial buildings. This shows the signi?cance of shifting demand to o?- peak time which is also called load balancing. Furthermore, ?guring out the factors that trigger the peak energy demand for a speci?c period of time in a building could poten- tially help to improve building’s heating, ventilation and air conditioning (HVAC) system. Together with this, sudden peak in energy consumption can be because of some mal-functioning or some unexceptional human behavior. Finding possible causes of high energy demand for a certain period of time can possibly lead to ?nd appropriate solutions for it and ultimately a control in energy demand. Understanding this demand and supply behavior in residential areas will further support the sustainable and renewable energy technology. 
David in his paper [2] states that selecting key variables and interactions is therefore an important step in achieving more accurate predictions, better interpretations, and identi?cations of key subgroups in the energy datasets for further analysis. Jenkins et al. [8] visualize energy data to examine the monthly demand of substations and synthesized equivalent. Walker and Pokoski [9] developed a model of residential electric load where they introduced the psychological factors based on a person’s availability that can a?ect the individual use of electrical appliances at a given time. Before that, in early nineties, Capasoo et al. [10] applied bottom up approach to develop “Capasoo Model”. This model uses the socioeconomic and demographic data, for instance, the stock of appliances and their usage pattern in a household to model a load curve. This load shape shows the relationship between the demand of residential customers and the psychological and behavioral factor of the house occupants. Later in 2002, [11] Willis used the bottom up approach to model the typical demand forecasting scheme for the individual customers. Connecting to Smart Cities 335 3 Methodological Framework The proposed methodological framework as shown in Fig. 1, for energy e?cient smart buildings, provides foundation for complex, diverse, contextually aware, eco-driven and intelligently monitored nature of energy demand that frequently requires a multi domain, interdisciplinary approach into research. This framework articulates the energy e?- ciency paradigm with respect to four signi?cant attributes that should be considered to improve end-use energy e?ciency and to reduce energy demand. The embedded features are predicated on the issues related to global climate change, social behavior, economic productivity, and modelling the exceptionally large energy datasets to explore and inter- pret the interesting, useful patterns of energy usage. Fig. 1. A methodological framework for cross disciplinary knowledge exchange to exploit the design and development of energy e?cient smart buildings. The ?rst crucial step to achieve a particular milestone is to identify and analyze the problems, issues and concerns of di?erent stakeholders in order to develop a shared vision with common understanding and clear targets. The most important factor that should be considered in constructing the smart buildings or smart cities is “human beings”, which means, everything that we construct should be human oriented. Creating a comprehensive roadmap will help us to focus on high-return predictive analytics with clear pre-de?ned destinations and achievable milestones which is a starting point for gaining a better understanding of customer’s requirements. Hence, as a part of this research work, one of the milestones is to classify the prereq- uisites to provide a foundation to develop a globally acceptable socio-technical strategy for building the smart buildings and smart cities. This will help to tackle all the issues that are in mutual interests of di?erent stakeholders. Since, this is a long term ongoing project, this ?rst part of the research work has already been accomplished and published [12]. 336 S. Iram et al. Our next research question is what is the role of data science in the design and development of energy e?cient smart buildings. 
In this research work, advanced analytical methods and visualization techniques are used to explore complex energy datasets in order to understand the energy consumption patterns of a residential building. 4 Data Exploration: A Possible Solution Data can be explored, analyzed, visualized and described at different levels of maturity. Most of the existing literature reveals four informative levels of data exploration, depending on the complexity of the case studies under question. These are recognized as descriptive analysis, diagnostic analysis, predictive analysis and prescriptive analysis [1]. However, what is mostly neglected in most case-study analyses is understanding the circumstances in which a particular thing has happened. This is usually called contextual awareness. Credibility of the results can only be attained by linking the outcome of a particular analysis with the situation in which it occurs. We are recommending contextual analysis as a complementary method for describing any analytical results. Therefore, data analytical types can be described from five different perspectives, as listed in Table 1.

Table 1. Data exploration types, description and examples
Analytic type | Description | Example
Descriptive analysis | What is happening? | Historical data reports
Diagnostic analysis | Why did it happen? | Fault detection
Predictive analytics | What is likely to happen? | Cost prediction
Prescriptive analysis | What should we do about it? | Cost optimization
Context analysis | In which circumstances did this happen? | Situation dependency

As mentioned earlier, this research work aims to understand energy utilization patterns in a residential building and to identify any unusual data behavior and its reasons. Hence, the analysis will be carried out from three different perspectives: • Understanding energy utilization patterns → Descriptive analysis • Identifying extreme or abnormal data values → Diagnostic analysis • Finding the root cause of normal and extreme behavior → Context analysis 5 Data Description For this preliminary research, data is collected for 32 different houses in the area of Manchester, across different domains. In the domain of Building Information, data is collected on the archetype of the buildings, their age, addresses as longitude and latitude, class, construction type, ownership, floor area and air test. Fifteen different building archetypes were found in that area, named as BISF, Brick and block, Detached 1980s brick and block, End terrace pre-1919 solid wall, Flat Wimpey no-fines non-trad, Mid terrace pre-1919 solid wall, Semi-detached pre-1919 solid wall, Semi-detached 1919 solid wall, Semi-detached 1920s solid wall, Semi-detached 1930s solid wall, Semi-detached 1970s brick and block cavity, Semi-detached pre-1800 brick, Terraced pre-1919 solid wall and Wates. The age of the building is categorised as 1920s, 1930s, 1950s, 1960s, 1970s, 1980s, pre-1800 and pre-1919. Classes are defined as Detached, End-terraced, Flats, Mid-terraced and Semi-detached. Construction type is recognized as Traditional or Non-traditional. Floor area is measured in square meters (m2) and is further classified into three sections: Small (<50 m2), Medium (50–100 m2) and Large (>100 m2). Air permeability results for the air leakage test are categorised into three sections: <5 m3/(m2·h), 5–10 m3/(m2·h), and >10 m3/(m2·h).
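As an illustration of how such categorical bands can be derived from the raw measurements, the following sketch bins floor area and air-permeability values with pandas; the column names and sample values are invented for the example.

```python
# Illustrative banding of building attributes into the categories listed above.
import pandas as pd

buildings = pd.DataFrame({
    "floor_area_m2": [42.0, 76.0, 128.0],
    "air_permeability": [4.2, 7.8, 11.5],     # m3/(m2.h)
})
buildings["area_band"] = pd.cut(buildings["floor_area_m2"],
                                bins=[0, 50, 100, float("inf")],
                                labels=["Small (<50)", "Medium (50-100)", "Large (>100)"])
buildings["air_band"] = pd.cut(buildings["air_permeability"],
                               bins=[0, 5, 10, float("inf")],
                               labels=["<5", "5-10", ">10"])
print(buildings)
```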
Demographic information collected in the domain of Human Information comprises age, gender, family composition and health status. Family composition is further categorised as single occupants, working couples, small family, small family of three, family of four, family of five, family of six, retired singles, retired couples, family of five with retired couples, and short-term occupants with complex needs. In the Services domain, data is collected for electricity and gas usage in kWh/m2 for one complete year. Electricity data is clustered into three sections: <35 kWh/m2, 35–40 kWh/m2, and >40 kWh/m2, whereas gas data is also clustered into three sections: <120 kWh/m2, 120–140 kWh/m2, and >140 kWh/m2. 5.1 Data Preprocessing To understand the data distribution, to find any outliers caused by extreme external behavior or malfunctioning sensor devices, and to prepare the data for analyzing and visualizing the heatmap, the energy dataset is preprocessed. First, the cumulative distribution function (CDF) is applied to the datasets to understand the probability of the random variables in the datasets. Equations (1) and (2) give the true CDF F(t) and its empirical estimate $F_n(t)$, which is found by making no assumptions about the underlying distribution:

$F(t) = P(X \le t)$   (1)

$F_n(t) = \dfrac{\#\{\text{sample values} \le t\}}{n}$   (2)

Figure 2(a) is the visual representation of the CDF of the temperature dataset for the whole building over one month, including the hallway, lounge and bedrooms. Figure 2(b) shows a boxplot diagram used to identify extreme data behavior, which is sometimes caused by a malfunction in the devices. Fig. 2. (a) Cumulative distribution of the dataset. (b) Outlier identification with a boxplot diagram. The temperature dataset is collected for one complete year for all 40 buildings. However, to keep the analysis and visualization simple for this research work, a dataset of one month (January) is selected for one residential building. The dataset is prepared by applying functions from R packages such as lubridate and timeseries, and the R classes POSIXct and POSIXlt. After discussion, it was decided to resample the datasets to a coarser timestamp to remove any suspicious or null values. The temperature dataset was originally collected every five seconds, 24 h a day, for one year. To reduce the probability of outliers, the dataset was converted to half-hour intervals. This removed the probability of any extreme or malfunction-driven data behavior that could affect the results. After that, the heatmap algorithms are designed using the R package ggplot2. Details of the heatmap application are given in the next section. 6 Peak Identification – Heatmap Example Once the data is preprocessed and cleaned, the next step is to visualize the energy utilization patterns of a residential building. For this, we selected a building occupied by a working couple. The idea is to understand the usual behavior of energy utilization for each day of a month. Apart from identifying the occupants' energy use behavior, the intention was also to diagnose whether there are any extreme or unusual data patterns in the datasets. As explained earlier, the R library ggplot2 is selected to design the heatmap algorithm. Figure 3 provides a visual representation of the heatmap data values, which are categorized from 0–2000 KWH, with a color bar of dark blue, red and yellow, where dark blue represents the lowest data value and yellow represents an extreme data value.
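The paper's preprocessing and heatmap are implemented in R (lubridate, ggplot2). As an equivalent, hedged sketch only, the following Python code resamples high-frequency readings to half-hour means, pivots them into a day-by-half-hour matrix and renders it as a heatmap; the file name, column names and colour map are assumptions rather than the authors' choices.

```python
import pandas as pd
import matplotlib.pyplot as plt

raw = pd.read_csv("house_january.csv", parse_dates=["timestamp"])
raw = raw.set_index("timestamp").sort_index()

# Resample the 5-second readings to half-hour means to suppress outliers.
half_hourly = raw["energy"].resample("30min").mean()

# Pivot into a day-of-month x half-hour-of-day matrix for the heatmap.
frame = half_hourly.to_frame("energy")
frame["day"] = frame.index.day
frame["slot"] = frame.index.hour * 2 + frame.index.minute // 30
matrix = frame.pivot_table(index="day", columns="slot", values="energy")

plt.imshow(matrix, aspect="auto", cmap="plasma", origin="lower")
plt.xlabel("Half-hour of day (0-47)")
plt.ylabel("Day of month")
plt.colorbar(label="Energy")
plt.show()
```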
Each data point in the heatmap presents a data value for half an hour which extends from 1 https://www.r-project.org/. Connecting to Smart Cities 339 0–24 h. However, y-axis represents each day of the moth. Heatmap will help us to perform descriptive, diagnostic as well as contextual analysis. Fig. 3. Heatmap example to diagnose regular and extreme data behavior for a residential building. As we can visualise in Fig. 3, there are some regular and some irregular energy utilisation patterns for each day in the whole month. As we can see in the ?gure, from 11:00 PM to 7:00 AM the data values range comes within blue band, which identi?es low energy usage at that time which is highlighted as night time in the ?gure. Then from 7:00 AM to around 11:30 AM there is comparatively higher usage of electricity which is probably due to the fact that everyone in the home is using electricity for normal house hold activities at that time of the day. This can be visualised as red colour squares in the ?gure. Then during the day time, again there is not much activity at home as compare to the night time. This probably because they have left the house for work. Then, between time span 5:30 PM to 11:00 PM higher energy consumption could be visualised when usually everybody is at home and is engaged with di?erent activities at home. Moreover, this is also evident from the description above that by linking the descrip- tion of analytical results with its particular context actually helps to understand the reasons of least and higher electricity consumption at particular time of the day. Apart from a normal energy utilisation patterns, some extreme data behaviour could also be visualised in the heatmap. For instance, all yellow points in the map tell us some extreme or abnormal energy utilisation behaviour. This implies that there could be some abnormality in the devices integrated in the house or this could be because of some unusual behaviour of the residents. Identifying abnormal or extreme behaviour in energy consumption patterns is called diagnostic analysis of the data. This also implies that further investigation could be recommended to ?nd the root cause of such extreme behaviours that are the reasons of extreme energy utilisation. 340 S. Iram et al. 7 Summary and Conclusion Increased energy demand in residential as well as in commercial buildings in recent years is deteriorating our natural energy resources and whole eco system. New and e?ective solutions are required to control higher rate of energy consumption in the buildings. This research work proposed a holistic multidisciplinary framework to exchange knowledge and understanding from di?erent domains for the design and development of sustainable energy e?cient buildings. This framework also presents the collaboration model to share knowledge among di?erent stakeholders and knowledge experts to implement e?ective policies that help to improve energy e?ciency. This research work focuses on exploring data science techniques to understand users’ energy consumption patterns in residential buildings. Electricity data is collected from 32 di?erent residential buildings for one year. Raw data is visualized using Cumulative Distribution Function to understand its graphical distribution. However, boxplot diagrams are used to visualize outliers in the dataset. Dataset is re-sampled for di?erent timestamp to eliminate the probability of unwanted data values. 
Once data was prepro- cessed, heatmap algorithm is designed and implement to understand electricity consumption patterns for one residential building. Descriptive analytical method is used to elaborate the results of the heatmap. However, unusual or extreme energy utilization behavior is noticed in the energy consumption pattern and elaborated using diagnostic analytical method. Contextual analysis of the results helps to understand the rationale behind normal and unusual energy consumption patterns. Peaks were identi?ed in the heatmap that tell us some extreme behavior of energy consumption. This, sometimes, could be because of any fault in the integrated devices at home. However, this also recommends to understand residents own behavior to use energy at home. Energy analysis results reinforce our statement that ?guring out the factors that trigger the peak energy demand for a speci?c period of time in a building could poten- tially help to improve building’s heating, ventilation and air conditioning (HVAC) system. Together with this, sudden peak in energy consumption can be because of some mal-functioning or some unexceptional human behavior. Finding the possible causes of high energy demand for a certain period of time can possibly leads to ?nd appropriate solutions for it and ultimately a control in energy demand. Understanding this demand and supply behavior in residential areas will further support the sustainable and renew- able energy technology. As part of future research work, authors intend to explore di?erent data analytical techniques that could be used to analyze stakeholders’ requirements that they want to be integrated in smart buildings. References 1. Fan, C., Xiao, F., Wang, S.: Development of prediction models for next-day building energy consumption and peak power demand using data mining techniques. Appl. Energy 127, 1– 10 (2014) Connecting to Smart Cities 341 2. Hsu, D.: Identifying key variables and interactions in statistical models of building energy consumption using regularization. Energy 83, 144–155 (2015) 3. Pérez-Lombard, L., Ortiz, J., Pout, C.: A review on buildings energy consumption information. Energy Buildings 40(3), 394–398 (2008) 4. Pacala, S., Socolow, R.: Stabilization wedges: solving the climate problem for the next 50 years with current technologies. Science 305(5686), 968–972 (2004) 5. Internation Energy Agency (IEA), World Energy Outlook 2015, OECD/IEA, Editor, Paris (2014) 6. Kosorus, H., Honigl, J., Kung, J.: Using R, WEKA and RapidMiner in time series analysis of sensor data for structural health monitoring. In: 22nd International Workshop on Database and Expert Systems Applications (DEXA), pp. 306–310. 29 Aug.-2 Sept., IEEE, France (2011) 7. Millan, P., et al.: Time series analysis to predict link quality of wireless community networks. Comput. Netw. 93(2), 342–358 (2015) 8. Platchkov, L.M., Pollitt M.G.: The Economics of Energy (and Electricity) Demand Cambridge University, 13–14 May 2011 9. Walker, C.F., Pokoski, J.L.: Residential load shape modelling based on customer behavior. IEEE Trans. Power Appar. Syst. 104(7), 1703–1711 (1985) 10. Capasso, A., et al.: A bottom-up approach to residential load modeling. IEEE Trans. Power Syst. 9(2), 957–964 (1994) 11. Willis, H.L.: Spatial Electric Load Forecasting, 2nd edn. CRC Press, New York (2002) 12. Iram, S., Fernando, T., Bassanino, M.: Exploring cross-domain data dependencies for smart homes to improve energy e?ciency. 
In: Companion Proceedings of the 10th International Conference on Utility and Cloud Computing, pp. 221–226. ACM, USA (2017) 342 S. Iram et al. Anomaly Detection in Q & A Based Social Networks Neda Soltani1(&) , Elham Hormizi2 , and S. Alireza Hashemi Golpayegani1 1 Computer and IT Engineering Department, Amirkabir University of Technology, Tehran, Iran {neda.soltani,sa.hashemi}@aut.ac.ir 2 Computer and IT Engineering Department, University of Science and Technology, Babol, Mazandaran, Iran elham.hormozi@gmail.com Abstract. Detection of anomalies in question/answer based social networks is important in terms of ?nding the best answers and removing unrelated posts. These networks are usually based on users’ posts and comments, and the best answer is selected based on the ratings by the users. The problem with the scoring systems is that users might collude in rating unrelated posts or boost their reputation. Also, some malicious users might spam the discussion. In this paper, we propose a network analysis method based on network structure and node property for exploring and detecting these anomalies. Keywords: Anomaly detection .n Social networks .n Q & A Reputation boosting .n Spam detection 1 Introduction Widespread participation in question and answer sites and answering specialized questions, has led to the creation of massive data collections that are growing rapidly. On the other hand, it’s hard to detect related, correct, and non-spam responses. In order to identify spam, misleading or irrelevant answers that are replied to a question or discussion, it is necessary to analyze these responses. Besides natural-language analysis methods that have many complexities, some of these anomalies can be identi?ed based on the structure of communication between individuals and the content of the posts. For instance, authors of [1] state that spammers create star-like sub-networks. Anomaly means deviation from expected behavior. This means there exists patterns in observed data that do not match the de?nition of normal behavior. In social net-works, anomalies mean interactive patterns that have signi?cant differences from the whole network. In fact, the de?nition of anomaly depends on the nature of the problem. Various types of anomalies could be de?ned in social network environments, depending on the network of question. For example, spam emails are known as anomaly. In a network-based trust system, collusion is identi?ed as another type of anomaly. These are just examples of anomaly types in network structures. Considering the total amount of resources, time, and cost spent on these anomalies, it is necessary to © Springer Nature Switzerland AG 2019 K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 343–358, 2019. https://doi.org/10.1007/978-3-030-02686-8_27 develop solutions to this issue. According to statistics, 67% of email traf?c within the period of January to June 2014 was spam. Also, in 82% of cases, social networks were used for online abuse. These examples indicate the importance of the issue. These anomalies appear as abrupt changes in interactions or interaction, which are completely different from the usual form in a particular network. For instance, subnets that are created for collusion have certain forms of interaction. Another symptom of anomalies is highly interconnected subnets or star-like structures. Solutions that have been pro-posed to detect anomalies in social networks are in two categories: • Checking and comparing the network model with a normal interaction model. 
• Checking network attributes. Therefore, detection of anomalies in social networks involves the selection and calculation of network characteristics, and classi?cation and observation in the char-acteristics space. The ?rst challenge is the de?nition of normal behavior. Social net-works do not have a ?xed and balanced structure in all components due to the diversity of individuals and available nodes; and the de?nition of a normal structure in such networks is not possible. Another issue is that distributes of node degrees and network structure of communities changes over time. The scenarios presented for a normal structure are not necessarily real-time and it’s possible for a network to change before structure is extracted. Anomaly detection includes the following steps [1]: (1) Determining the smallest affected unit by behavior. (2) Identifying characteristics that are different from normal states. (3) Determining the context. (4) Calculation of characteristics and extracting a characteristic space. (5) Calculation of the distance between observations. The difference between anomaly detection in social networks and other areas is that in social networks we have individuals –containing characteristics—and the relation-ships between them—, which are relevant to their characteristics. Networks may be static or dynamic, labeled or not, and local or global; all of which affect the de?nitions in the network, and also a de?nition of anomalies. Therefore, the method used for anomaly detection in a friendship social network does not necessarily have optimal result in an authors’ network. In this paper, we will use social network analysis methods to detect anomalies in content sent by users in a question and answer based social network. To achieve this goal we have to ?rst de?ne the anomaly type; and second, present the detection method based on the network and anomaly properties. Then, we will use network analysis methods to use the presented method on the selected network. The main contribution of this paper is using node properties along with graph structure for detecting anomalies. The remainder of the paper is organized as follows: In the next section literature review is throughout the recent works in this area. In Sect. 3, the problem statement is presented in details. Then, our proposed solution methodology is explained. Section 4 covers the experiments and results of our tests and ?nally, in Sect. 5 we conclude our work and discuss future works. 344 N. Soltani et al. 2 Related Work The types of anomalies in terms of the anomaly detection are in the following cate-gories [1]: Static unlabeled anomalies, Static labeled anomalies, Dynamic unlabeled anomaly, and Dynamic labeled anomaly. Detection of anomalies is critical in preventing malicious activities such as bully-ing, designing terrorist attacks and disseminating counterfeit information. The authors of [2] examined the work that has been done to detect anomalies in social networks and focus on the effects of new anomalies in social media and most new techniques to identify speci?c types of anomalies. There are also a variety of studies on the detection of anomalies, data types and data attributes in the social network, anomalies are detected in network data [3–5, 8], which focus on graph data, including data weights to detect anomalies. 
An “ego-nets” is provided that includes sub-graphs of favorite nodes and neighboring nodes, and an “oddball” sphere regards around each node at the substrate of the adjacent nodes that exists to each node. Then, a small list of numerical features is designed for it. Detection of anomalies in temporary data has been done by [7, 9, 10]. The key idea is to create a Granger graphical model on a reference data set, and using a series of restrictions on the existing model, assuming that there is time dependence as reference data, they test the determined dataset and also speed up detection of anomalies by several random and parallel optimization algorithms. The proposed methods in the referred papers cause the effectiveness of accuracy and stability. In [11], the author discusses about advances in detecting fraud and malformation for social network data, including point anomaly detection. In that, a taxi driving fraud detection system was used. To implement the system, there are a large number of GPS trackers for 500 taxi drivers and systematically, they have investigated counterfeit activities of taxi drivers. The author in [12] uses an algorithm called WSAR3E.0 that can detect anomalies in simulated data with the earliest possible detection time and a low false positive number. It is also discussed in some articles about the detection of group malformations in social networks, applications, and systems. In [13], in order to identify the social implicit relations and close entities in the dataset, a framework has been used to solve similar unusual users in the real-world datasets. This approach requires a model for coping of communications, a model for independent users, and a method for distinguishing between them. In [14], a graphical model called GLAD, which has the ability to discover the group structure of social networks and detect group anomalies and also, required tests are performed on real and unrealistic datasets by anomaly injections. This automatically checks the nodes of a multi-layer network based on the degree of similarity of the nodes to the stars in different layers and by parallelizing the extracted features and anomalous detection operations in different layers of the multi-layer network, signi?- cantly, the calculations have been increased by the distribution of inputs to different machines cores. In [16], the author analyzes the distribution of input times and the volume of events such as comments and displays of online surveys for ranking and detecting suspicious users, such as spammers, bots and Internet fraudsters are being Anomaly Detection in Q & A Based Social Networks 345 used. In this paper, a relative model called VOLTIME is presented that measures the distribution of input times from real users. In another research-based on the idea that most user behavior is divergent from what can be considered as ‘normal behavior’, there is a risk assessment that results in more risks [17]. Because similar users follow a series of similar rules on social net-works, this assessment is organized in two phases: Similar users are ?rst grouped together, then, for each identi?ed group, one or more models are constructed for their normal behavior [18]. Using the recorded sessions to solve the problem of whether each session is abnormal determines the degree of anomalies in each session. Imple-menting robust statistical analyzes on such data is very challenging as the number of observed sessions is much smaller than the number of network users. 
The method put forward in that work detects anomalies in very high-dimensional data using hyper-graphs, an important extension of graphs in which a single edge can simultaneously connect more than two vertices. Table 1 compares the above-mentioned studies.

Table 1. Comparison of recent research on social network anomaly detection.
Reference | Anomaly type | Target network | Method | Node/edge property included
[3–5, 8] | Anomalous nodes | Weighted graph | OddBall, ego-net patterns, hybrid method for outlier node detection | Node and edge (density, weights, ranks and eigenvalues)
[7, 9, 10] | Time-series anomaly detection | Weighted graph | Granger graphical model | Edge, weight
[11] | Point anomaly detection | Weighted graph | Taxi driving fraud detection system | Edge, weight
[12] | Bayesian network anomaly detection | Bayesian network | WSARE 3.0 algorithm, simulation | Edge, time
[13] | Intrusion detection | Graph network | Tribes algorithm | Node
[14] | Group anomaly detection | Graph network | Group Latent Anomaly Detection (GLAD) model, d-GLAD | Node, weight
[15] | Anomalies in multilayer networks | Multilayer social networks | ADOMS (Anomaly Detection On Multilayer Social networks), unsupervised and parameter-free | Node, edge, weight
[16] | Suspicious users | – | VolTime model, unsupervised anomaly detection | Time
[17] | User anomalous behaviors | Online social networks | Two-phase risk assessment approach | Time, node
[18] | Anomaly detection | Weighted graphs | OddBall algorithm | Node, density, weights, ranks

3 Problem Statement and Solution Methodology

As mentioned in the introduction, we are looking for anomalies in the selected Q & A dataset. We limit the anomaly types to spam and reputation-boosting sub-networks. Therefore, the following questions are to be answered on the dataset:

1. Which users submit answers that are irrelevant to the question, are spam, or aim at misleading the discussion?
2. Which users boost their reputation on a fraudulent basis?

We have ignored comments for several reasons. First, we want to keep track of the discussion, which is mainly carried by the posts, not the comments. Second, merging the comments into the posts would be time-consuming, as the dataset provides comments separately. Furthermore, comments are written in response to a single post and mostly contain details about that post rather than the whole question. Finally, ratings and badges are based on posts, not comments. So the specific types of anomaly we are looking for are to be found in posts.

3.1 Methodology

In this section, we present the analyses performed on the proposed network. The analyses aim at detecting spammer accounts and, as a result, the spam answers. Based on [4, 6], spammers create a star-like network, so we first detect star-like sub-networks. To do so, we create the ego-net of each individual node and then study its neighbor nodes. A star-like sub-network is detected if few of the neighbors connect directly to one another; the node at the center of a star-like sub-network is likely to be a spammer. The other question raised in the previous section concerns detecting the nodes that try to falsely boost their reputation. This is done by detecting communities whose internal connections are unusually tight [19].

Finding Star-Like Structure. In order to detect star-like structures, we have to detect cliques of size 3, i.e. triads, in the ego network of each node. To study ego networks, we choose the nodes with the highest betweenness; since these nodes connect components of the network to each other, they are likely to form star-like structures. Figure 1 shows the pseudo code of the algorithm proposed in this paper for detecting star-like ego-networks; an illustrative sketch of this procedure is given below.
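The paper gives this procedure only as the pseudo code of Fig. 1. Purely as an illustration, a minimal Python sketch of the same idea follows; the graph is assumed to be a NetworkX user graph, and top_k and connectivity_threshold are illustrative parameters, not values taken from the paper.

```python
import networkx as nx

def star_like_ego_nets(G, top_k=20, connectivity_threshold=0.3):
    """Flag candidate spammer nodes whose ego networks are star-like.

    A node is treated as star-like if only a small fraction of its
    neighbors are directly connected to each other (few triads in its
    ego-net). `top_k` and `connectivity_threshold` are assumptions.
    """
    # Work on an undirected view and rank nodes by betweenness centrality,
    # since the paper examines high-betweenness nodes first.
    U = G.to_undirected()
    betweenness = nx.betweenness_centrality(U)
    candidates = sorted(betweenness, key=betweenness.get, reverse=True)[:top_k]

    flagged = []
    for node in candidates:
        neighbors = list(U.neighbors(node))
        if len(neighbors) < 2:
            continue
        # Ego network without the ego itself: only edges among neighbors.
        ego = U.subgraph(neighbors)
        connected = sum(1 for n in neighbors if ego.degree(n) > 0)
        ratio = connected / len(neighbors)
        if ratio < connectivity_threshold:  # few inter-neighbor links -> star-like
            flagged.append((node, ratio, betweenness[node]))
    return flagged

# Usage: flagged = star_like_ego_nets(user_graph)  # user_graph: nx.DiGraph of answers
```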
Detecting Highly Interconnected Communities. Another type of anomaly considered in this paper is collusion aimed at boosting reputation. Based on [1, 19], this type of anomaly is detected by finding highly interconnected communities. Communities with this property are almost isolated from the rest of the network and have a large number of edges inside. When looking for this type of community, edge weights become important. In the first scenario we used to create the network, we did not consider edge weights. In order to weight the edges so that they reflect how strongly two nodes are connected, we use the number of times one node has answered the other node's questions as the edge weight between them. Given the nature of the anomaly we want to detect, we can ignore edge directions, as we are only looking for high interconnectedness. We assume that these sub-networks contain malicious users who try to boost their own reputation by asking or answering one another's questions. Communities are detected by identifying isolated components of the network (Fig. 2).

Fig. 1. Pseudo code for the proposed algorithm.
Fig. 2. Pseudo code for the algorithm we presented for detecting anomalous communities.

4 Experiments and Results

4.1 Dataset Specifications

The dataset has been downloaded from the Stack Exchange site and includes questions from the "Android" category. It contains user information, badges, comments posted below posts, questions and answers, the history of post changes, post links, and the votes registered for each post; each type of information is stored in a separate XML file [18]. Stack Exchange applies a control mechanism to posts and users: each post receives negative or positive ratings from users, badges are awarded based on posts, and a person's reputation is based on their posts, the number of their answers accepted as correct by other users, and so on. To work with this dataset, we first loaded the information into Excel and saved the sections in CSV format. Then, to make the data readable by the Pajek software, a Java program reads the files and saves the nodes and edges in separate files; an illustrative sketch of this conversion is given below.
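The original pipeline uses Excel, CSV files, and a Java program. As an illustration only, the following hedged Python sketch builds the same user-to-user answer edge list for the network described in the next subsection, directly from the public Stack Exchange dump; the Posts.xml attribute names below follow the published dump schema and should be treated as assumptions.

```python
import csv
import xml.etree.ElementTree as ET

def build_answer_edges(posts_xml="Posts.xml", edges_csv="edges.csv"):
    """Write a directed edge (answerer -> asker) for every answer post."""
    # First pass: remember who asked each question (PostTypeId == "1").
    question_owner = {}
    for _, row in ET.iterparse(posts_xml, events=("end",)):
        if row.tag == "row" and row.get("PostTypeId") == "1":
            question_owner[row.get("Id")] = row.get("OwnerUserId")
        row.clear()

    # Second pass: for every answer (PostTypeId == "2"), link its author
    # to the author of the parent question.
    with open(edges_csv, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["answerer", "asker"])
        for _, row in ET.iterparse(posts_xml, events=("end",)):
            if row.tag == "row" and row.get("PostTypeId") == "2":
                asker = question_owner.get(row.get("ParentId"))
                answerer = row.get("OwnerUserId")
                if asker and answerer:
                    writer.writerow([answerer, asker])
            row.clear()
```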
Network Creation Scenarios. One method of detecting spam is to detect spammer accounts: if we create a network of users and analyze it in order to find the spammer accounts, we can simply flag posts by those accounts as spam. Obviously, we will not be able to detect spam sent by normal users this way. In this network, nodes are users and each edge represents a reply by a user to another user's post. Therefore, an edge connecting user u1 to user u2 shows that user u1 has answered one of user u2's questions. Edges are directed (from u1 towards u2). Hence, a user with a high in-degree is one whose questions have been answered by many users, and a user with a high out-degree is one who has answered the questions of many users. The latter users are more important to us here, as we are considering spam answers. Nodes have properties including id, reputation, account creation date, name, age, positive vote count, negative vote count, and badges; we use these properties to detect spammer users. A large number of users are solitary, i.e. they have neither asked nor answered any question. We remove the solitary nodes, which results in the network illustrated in Fig. 3, created from users based on the answers each user gives to other users' questions. The network has several separate components: in plenty of cases a user has asked only one question, answered by only one other user, and neither of them interacts with the rest of the users. In the following section, we explain the implementation of our proposed solution. In visualizations of the resulting network, nodes are represented as small circles (each representing a user who either answers or asks a question), and a connection between two nodes shows an answer from one user to the other's question.

Fig. 3. Network created based on the scenario.

4.2 Implementation

Detecting Star-Like Ego-Nets. In order to find possible spammer accounts, we choose nodes based on betweenness and examine those nodes first. The first experiment is done on user 137, who has the highest betweenness. Figure 4 shows the neighbor network, Fig. 5 the ego-net of node 137, and Fig. 6 the triads of the network in Fig. 4. Of the 105 nodes in total, 50 form a neighbor network with 137; therefore, the ego network of 137 is not a star-like structure, as more than 70% of its neighbors are connected to each other. Table 2 shows the properties of node 137, which are used to decide whether anything abnormal exists about this node.

Fig. 4. Neighbor network of user 137.
Fig. 5. Ego network of node 137.
Fig. 6. Triads of the neighbor network of node 137.

Table 2. Node 137 properties.
ID | Reputation | CreationDate | DisplayName | UpVotes | DownVotes | Age | Cb
137 | 14905 | 2010-09-14T02:48:38.087 | Matt | 1236 | 18 | – | 0.0040

The next node in decreasing order of betweenness is 16575. Figures 7 and 8 show the ego-net and neighbor network of this node, respectively. There are 502 nodes in 16575's neighborhood, but only 135 of them are connected to each other. To analyze this node further, we check its properties (Table 3). Considering the upvote count of this node compared to its downvotes, its high reputation, and its 79 badges, it is unlikely that this node is a spammer, although the ego network of this user is quite close to a star structure.

Fig. 7. Ego network of 16575.
Fig. 8. Neighbor network of 16575.

Table 3. Properties of node 16575.
ID | Reputation | CreationDate | DisplayName | UpVotes | DownVotes | Cb
16575 | 45479 | 2012-07-02T20:06:13.047 | Izzy | 1452 | 213 | 0.0034

The third experiment is done on user 1465. Of the 272 nodes in 1465's neighborhood, 110 are connected to each other (45%). Considering this node's properties, we can see that it has a high reputation, but its downvotes outnumber its upvotes, so it is possible that 1465 is a spammer (Figs. 9 and 10). Considering other properties of this node, we can see that this user has had 1012 posts with an average rating of 3.32, an average view count of 20500, an average of 1.33 answers per question, and an average of 1.42 comments per post.

Fig. 9. Ego network of 1465.
Fig. 10. Neighbor network of 1465.
We compare these numbers to the overall average values (Table 4). The average values for user 1465 are above, or almost equal to, the overall values, based on which we conclude that user 1465 is not a spammer, despite the initial suspicion. Other nodes with high betweenness are studied in the same way.

Table 4. Properties of node 1465 compared to the overall average.
Average | Score | ViewCount | AnswerCount | CommentCount | FavoriteCount
All data | 1.75 | 2937.04 | 1.175 | 1.226 | 1.655
1465 | 3.32 | 20500.61 | 1.33 | 1.42 | 5.762

Detecting Communities. Communities are detected by identifying isolated components of the network. We omit components with fewer than 4 nodes; the result is shown in Fig. 11. We take the biggest component, detect the communities inside it, and remove the edges that connect communities to each other (Fig. 12). In order to detect highly interconnected communities, each community is studied separately. For each community, we study the degree distribution, the most central node, and the average reputation of the community.

Fig. 11. Communities in the network.
Fig. 12. Communities inside the biggest component of the network after removing components with fewer than 4 nodes and the edges between components.
Fig. 13. Community with the highest number of nodes.

As seen in Fig. 13, this sub-network has a star-like structure and is not highly interconnected. The most central node has the properties listed in Table 5. This user's reputation is higher than the overall average reputation, and nothing is anomalous about this node, so we move on to the next community.

Table 5. Properties of node 40036.
ID | Reputation | CreationDate | DisplayName | UpVotes | DownVotes | Age
40036 | 3705 | 2013-08-25T09:42:20.677 | RossC | 913 | 885 | –

One of the communities does not have a star-like structure, which makes it a candidate for high interconnectedness (Fig. 14). Its biggest clique is shown in Fig. 15. All the nodes in Table 6 were created within two weeks; most of them have a high reputation, and their upvotes far outnumber their downvotes. The clique created in this community is possibly an anomaly because it resembles a highly interconnected sub-network, and given that the other communities do not have a similar structure, this structure is abnormal. The reason most communities have a star-like structure is that experts in each field answer questions within their own expertise and rarely answer questions in all fields; therefore, most users have asked only a few questions, and those questions have been answered by a small number of experts in that specific field, who sit at the centers of the stars. For the community with a different structure, there are two hypotheses: (1) it contains a number of experts who communicate with one another and rarely answer other users' questions, or (2) it contains users who have joined the network in order to collect badges and boost their reputation. Considering the creation times of the users in this clique, the second hypothesis is reinforced.
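A hedged sketch of the community screening step described above (connected components, internal edge density, and largest clique), assuming an undirected, optionally weighted NetworkX graph of users; the density threshold is an illustrative assumption, not a value from the paper.

```python
import networkx as nx

def suspicious_communities(G, min_size=4, density_threshold=0.5):
    """Return components of G that are unusually interconnected.

    Components smaller than `min_size` are dropped, mirroring the paper;
    the density threshold is only an illustrative cut-off.
    """
    suspicious = []
    for nodes in nx.connected_components(G):
        if len(nodes) < min_size:
            continue
        community = G.subgraph(nodes)
        density = nx.density(community)        # fraction of possible edges present
        biggest_clique = max(nx.find_cliques(community), key=len)
        if density >= density_threshold:
            suspicious.append((sorted(nodes), density, biggest_clique))
    return suspicious

# Usage: communities = suspicious_communities(weighted_user_graph.to_undirected())
```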
Fig. 14. A community which is not star-like.
Fig. 15. Biggest clique.

Other communities exist whose structures differ from the star-like sub-network; Fig. 16 shows them.

Table 6. 10 highest degree centrality nodes.
ID | Reputation | CreationDate | DisplayName | UpVotes | DownVotes | Age | Degree
137 | 14905 | 2010-09-14 | Matt | 1236 | 18 | – | 70
10 | 18945 | 2010-09-13 | Bryan Denny | 1481 | 30 | 29 | 65
482 | 15609 | 2010-09-27 | Lie Ryan | 3591 | 141 | – | 56
15 | 4856 | 2010-09-13 | gary | 1498 | 44 | – | 31
594 | 3820 | 2010-10-02 | Edelcom | 376 | 2 | 54 | 23
366 | 915 | 2010-09-22 | Casebash | 154 | 1 | 28 | 18
86 | 2168 | 2010-09-13 | FoleyIsGood | 165 | 1 | 33 | 17
382 | 1804 | 2010-09-22 | BrianCooksey | 119 | 0 | 49 | 16
7 | 1687 | 2010-09-13 | Jonas | 78 | 17 | – | 16
280 | 520 | 2010-09-21 | Radek | 159 | 0 | – | 15

Fig. 16. Other communities with star structure.

5 Conclusion and Discussion

In this paper, we presented a solution for detecting anomalies in social networks. We focused on a well-known Q & A network; accordingly, the anomalies were defined as inappropriate answers (e.g. spam) and false reputation boosting. To detect these two types of anomalies, we suggested and applied two different approaches: for detecting spammers, we used a method that detects star-like ego networks, and for detecting false reputation boosting, we detected highly interconnected sub-networks. As another contribution of this paper, we considered network structure and node properties at the same time, which helps to obtain more accurate results.

Detecting anomalies in social networks depends heavily on the type, structure, and content of the network. Different network scenarios exist depending on the type of anomaly to be detected, and the solution in turn differs with the network creation scenario. All of this makes it impossible to present a general-purpose anomaly detection method. The limitations of this research include the challenge of combining network analysis results with mining results on node properties: as seen in this paper, we analyzed node properties after finding the most probably abnormal nodes using network-based methods, yet there is no unique systematic solution to this.

As future paths for this research, one can consider the following:

• Analysis and detection of other possible types of anomalies in a typical Q & A social network, such as spurious expertise, irrelevant answers, offensive comments, etc.
• Extension of the research to other user-feedback-based areas such as product reviews, discussion forums, and social groups, each of which is potentially susceptible to spam and reputation boosting.
• Implementation of different network generation scenarios, e.g. a weighted graph of users based on the number of interactions between two users, or a second-layer network generated from the keywords of users and questions. These scenarios might help detect abnormal behavior better within the current context.

References

1. Savage, D., Zhang, X., Yu, X., Chou, P., Wang, Q.: Anomaly detection in online social networks. Soc. Netw. 39, 62–70 (2014)
2. Liu, Y., Chawla, S.: Social media anomaly detection: challenges and solutions. In: Proceedings of the Tenth ACM International Conference on Web Search and Data Mining, pp. 817–818. ACM, Cambridge (2017)
3. Akoglu, L., McGlohon, M.: Anomaly detection in large graphs. CMU-CS-09-173 Technical Report (2009)
4. Akoglu, L., McGlohon, M., Faloutsos, C.: Oddball: spotting anomalies in weighted graphs.
In: Zaki, M.J., Yu, J.X., Ravindran, B., Pudi, V. (eds.) Advances in Knowledge Discovery and Data Mining. PAKDD 2010. LNCS, vol. 6119. Springer, Berlin (2010) Anomaly Detection in Q & A Based Social Networks 357 5. Chandola, V., Banerjee, A., Kumar, V.: Anomaly detection for discrete sequences: a survey. IEEE Trans. Knowl. Data Eng. 24(5), 823–839 (2012) 6. Sun, J., Qu, H., Chakrabarti, D., Faloutsos, C.: Neighborhood formation and anomaly detection in bipartite graphs. In: Fifth IEEE International Conference on Data Mining, pp. 418–425. IEEE Computer Society, Washington, DC (2005) 7. Cheng, H., Tan, P.N., Potter, C., Klooster, S.: Detection and characterization of anomalies in multivariate time series. In: Proceedings 8. Tong, H., Lin, C.-Y.: Non-negative residual matrix factorization with application to graph anomaly detection. In: Proceedings of the 2011 SIAM International Conference on Data Mining, pp. 143–153. Society for Industrial and Applied Mathematics (2011) 9. Qiu, H., Liu, Y., Subrahmanya, N.A., Li, W.: Granger causality for time-series anomaly detection. In: IEEE 12th International Conference on Data Mining (ICDM), pp. 1074–1079. IEEE (2012) 10. Sun, P., Chawla, S., Arunasalam, B.: Mining for outliers in sequential databases. In: Proceedings of the 2006 SIAM International Conference on Data Mining, pp. 94–105. Society for Industrial and Applied Mathematics (2006) 11. Ge, Y., Xiong, H., Liu, C., Zhou, Z.H.: A taxi driving fraud detection system. In: 2011 IEEE 11th International Conference on Data Mining (ICDM), pp. 181–190. IEEE (2011) 12. Wong, W.K., Moore, A.W., Cooper, G.F., Wagner, M.M.: Bayesian network anomaly pattern detection for disease outbreaks. In: Proceedings of the 20th International Conference on Machine Learning (ICML-03), pp. 808–815. IEEE (2003) 13. Friedland, L., Jensen, D.: Finding tribes: identifying close-knit individuals from employment patterns. In: Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 290–299. ACM, Vancouver, August 2007 14. Yu, R., He, X., Liu, Y.: Glad: group anomaly detection in social media analysis. ACM Trans. Knowl. Discov. Data (TKDD) 10(2), 18 (2015) 15. Bindu, P.V., Thilagam, P.S., Ahuja, D.: Discovering suspicious behavior in multilayer social networks. Comput. Hum. Behav. 73, 568–582 (2017) 16. Chino, D.Y., Costa, A.F., Traina, A.J., Faloutsos, C.: VolTime: unsupervised anomaly detection on users’ online activity volume. In: Proceedings of the 2017 SIAM International Conference on Data Mining, pp. 108–116. Society for Industrial and Applied Mathematics (2017) 17. Laleh, N., Carminati, B., Ferrari, E.: Risk assessment in social networks based on user anomalous behaviour. IEEE Trans. Dependable Secure Comput. (2016) 18. Stack Exchange Data Dump. https://archive.org/details/stackexchange. Accessed 9 Nov 2017 19. Pandit, S., Chau, D.H., Wang, S., Faloutsos, C.: Netprobe: a fast and scalable system for fraud detection in online auction networks. In: Proceedings of the 16th International Conference on World Wide Web, pp. 201–210. ACM (2007) 358 N. Soltani et al. A Study of Measurement of Audience in Social Networks Mohammed Al-Maitah(?) Computer Science Department, Community College, King Saud University, Riyadh, Saudi Arabia malmaitah@ksu.edu.sa Abstract. This article is dedicated to surveying and analyzing Facebook account performance and developing a set of indicators, which can describe audience of Facebook user. 
The raw experimental data were gathered and analyzed using statistical methods initially developed for Twitter. Based on these, the audience was classified into categories; the main attributes of updates were then studied carefully to develop derived indicators that reflect not only audience quality but also information coverage and, in part, influence (e.g. growth of authority), and the results are demonstrated using graphical charts. The indicators were generalized into formulae, providing a basis for further studies of Facebook account activity. Directions for future work are listed in the conclusion.

Keywords: Social network · Performance · Facebook · Influence · Account survey

1 Introduction

The Facebook engine provides very few attributes to analyze. Most posts are described by the number of "likes" (the number of people who marked the post) and "shares" (the number of people who also placed the post on their own page). These two attributes are not interdependent: a user can mark a post without sharing it, and share it without marking it. But even in such a simple rating system there is a set of difficulties. Firstly, there is no way to determine whether a "like" really expresses liking. A number of events are marked but not actually liked by users, for example a message about someone's death or other sad news [1]. Facebook's official position is that actions on the network are focused on positive interactions, while negative reactions should be expressed through comments. Moreover, if a post contains a link to another resource accompanied by a short comment, there is no way to determine whether it was the link itself or the user's comment that was liked. Empirical studies thus show that a 'like' indicates interest, acknowledgment, or simple support: the resource is worth enough to attract other people's attention, but not enough for them to preserve it on their personal timeline. A "share", in turn, marks an event or text so important to the user that he or she decided to preserve it. But it has the same issues as the "like": we cannot determine what exactly is important, the shared resource itself or the comment attached to it. We cannot even determine the exact number of shares, as a post can be copied directly onto a user's page with or without reference to its author, and there are social network aggregators, special sites that gather news from social networks and reprint it.

This means that we have very little raw data with which to estimate the efficiency of a Facebook account. Obviously, the average number of likes and shares can indicate a level of efficiency (and most network services work exactly that way), but to make a proper estimate we need to know more. For example, the Klout service tries to measure the influence of a given social network user. It gains access to the user's account, tracks the impact of every activity, summarizes them, and produces a final estimate in Klout points from 0 to 100. Klout's algorithms are closed and heavily protected by patents, but the main parameters of its estimation are simple [2, 3]:

• the number of followers (or subscribers; Klout uses the same approach for all major world-wide social networks, including Facebook and Twitter);
• the number of likes and shares;
• the number of users engaged in conversation (i.e. users who leave at least one comment);
• interaction with other users who have a higher score.

According to Klout, the highest scores in 2014 belonged to Barack Obama, Beyonce, and Britney Spears, and later, in 2018, to Barack Obama, Justin Bieber, Zooey Deschanel, and, surprisingly, The Beatles, that is, to accounts that very often publish newsworthy information. This can be compared with the Nielsen rating for TV shows: the more people watch a show, the higher the estimate [4]. There are even studies showing that the Klout score depends on a logarithmic function of the number of subscribers and of users enlisted in conversations [3]. So a Klout estimate can show popularity, but it does not show efficiency. Moreover, it does not work for people who create original yet highly specialized content and have their own devoted audience [3]. Take, for example, Drew Karpyshyn, one of the bestselling writers for Star Wars (and a script writer for the award-winning game Mass Effect). Is it a surprise that his Klout rating is only 53? Dan Abnett has a similarly low rating of 54 points, and even Umberto Eco has 55 points. These people are not unpopular, quite the opposite, but their popularity does not rely on frequent activity and on staying in touch with major global events or memes. So the real question is not the estimate itself. We need a method that can measure relative popularity and efficiency, not in the global context of the social network, but in the context of an account's potential and devoted audience. Such a method will reflect popularity more realistically than estimates based on frequent updates. To reach this goal we need a proper measurement of that audience, which is the main subject of this article.

2 Related Works

In recent years a number of studies of social networks have been performed, most of them on the Twitter platform. One very comprehensive work was conducted by Kwak, Lee, Park and Moon [5], who surveyed more than 4000 trending topics and about 106 million tweets. An analysis of this scale is possible due to the small size of a "tweet", a short message or even just a hashtag (a short slogan used for trending topics on Twitter). Questions of influence in social networks were covered by the theoretical works [6–9], which suggested that a social network can be described as a graph of relationships, so that influence can be modeled with threshold and cascade approximations. Kempe also proposed a set of mathematical approaches for maximizing influence within social networks using marketing strategies. Newman, Watts et al. [10] suggest that the analysis can also be conducted using random graphs with given degree distributions. Such a model makes it possible to describe not only the social network as a whole but also sub-networks such as groups and communities. In contrast, [11] consider a social network as a net of directed links that can be marked, propagated, and mentioned. They differentiate influence in terms of marking (likes, etc.) from influence in terms of propagation; they were perhaps the first to point out that a high in-degree does not necessarily mean real influence over other users. But all these surveys were conducted on Twitter due to its short-messaging nature, as mentioned before. Facebook is still a much less attractive platform for statistical and estimation studies, and relevant studies of its content are rare, so we mainly use Twitter-based work as a basis.
3 Data Extraction and Analysis

Measurement of the audience can be performed only on a very specific group (or segment) within a social network. We developed a set of requirements for such a group: (a) the group must be large enough; (b) there must be at least three opinion leaders within it; (c) the group must have a high update rate; (d) updates must contain original content or original comments, to ensure that the audience has minimal influence from outside. As the experimental space we therefore selected the Ukrainian segment of Facebook. Here is a checklist against our criteria.

Large enough network segment: According to [12], the segment has 2,143,140 users, about 4.72% of the total country population (SocialBakers Facebook Statistics Ukraine, 2017).

Opinion leaders: In Ukraine, at least 20 influential opinion leaders exist who reside primarily on Facebook (i.e., their original content appears there earlier than in the national media) [13]. Moreover, the Proceedings of ECSM-2014 report that about 40% of the Ukrainian population (more precisely, between 38% and 49%, depending on the internal situation) describe Facebook as their primary source of information about important events [14].

High update rate: Our observations show that Ukrainian political and social life produces at least three main events per day (on the hybrid war, on the political process, on everyday life) and about ten events of smaller value. So the daily update rate of an average Facebook account with a certain number of readers is about 3 to 5 updates a day, ranging from long posts to one-liners.

Original content: ECSM-2014 also shows that content in the Ukrainian segment of Facebook more often contains original information and opinions than traditional media do [13, 14].

We therefore picked eight influential accounts that already have a devoted audience, a certain number of readers (more than 10,000), and a certain position in Ukrainian society, and observed them over one month, October 2017. This period was also the last month of an electoral rally, so the active Facebook audience was at its maximum and the measurement fairly accurate. To preserve privacy, we identify the observed accounts only by their initials and summarize their characteristics in Table 1.

Table 1. Base characteristics of observed accounts.
User | Updates per month | Subscribers
O. T. | 11 | 34761
A. Y. | 46 | 290344
A. A. | 28 | 244785
A. G. | 94 | 75070
H. H. | 134 | 18268
Y. S. | 24 | 43037
P. P. | 94 | 264982
Y. T. | 50 | 76600

This update performance can be measured using certain indicators. Figure 1 shows the detailed like performance of one account (namely H. H.), chosen because it has a very large number of updates. The selected accounts performed throughout October as shown in Table 2.

Fig. 1. Account update performance (likes).

Table 2. Account raw performance.
User | Average likes | Average shares | Updates with higher like rate | Min likes | Max likes
O. T. | 132 | 21 | 3 | 15 | 302
A. Y. | 4143 | 350 | 21 | 26 | 9972
A. A. | 4288 | 500 | 9 | 482 | 21989
A. G. | 1222 | 157 | 40 | 97 | 4341
H. H. | 255 | 24 | 40 | 12 | 2306
Y. S. | 1129 | 150 | 7 | 60 | 5964
P. P. | 2663 | 207 | 37 | 750 | 8436
Y. T. | 406 | 32 | 18 | 96 | 943

We can see in the figure that the performance of different posts varies from very low to very high. Such wide diversity allows us to split the general audience into three main categories:
• Supporters (or devoted audience): their number is described by the minimal like rate. This is the lowest level of interested audience of an account: such people tend to like every post of a befriended or tracked account simply to support it, even sad or bad news that cannot really be marked positively.
• Regular audience: their number is described by the average like rate. This is the number of guaranteed readers on which the Facebook user can count when posting a new update.
• Potential audience: their number is described by the maximal number of likes. It is the current potential that the account can reach if a proper information policy is conducted.

Similarly, we can build a chart for shares, displayed in Fig. 2. This indicator characterizes not so much the audience as the sensitivity level of the account owner (i.e., how well the updates correspond to the feelings and views of the subscribers). Hence we have the following categories of topics, depending on their share rate.

Fig. 2. Account update performance (shares).

Notes of zero importance: such updates have zero shares. Mostly these are everyday notes containing information useful only to the account owner.

Notes for a limited audience: these topics mostly have "friends-only" visibility and are intended for sharing only among close friends, partners, and people with similar interests. They include questions, requests, and so on. The share rate for such topics is below average.

Main topics: these are updates with an average share rate (with a certain spread of values) and contain the main topics that attract people to the account. Typically this is an opinion on a specific interest, e.g. economics, politics, games, music, etc., which can be described as a serious hobby or the professional activity of the account owner.

Socially important topics (or hit topics): this category contains the share hits. The higher the rate, the more important the topic to which the update is dedicated. Hits are very rare (see the chart) and often, though not necessarily, have a very high share rate compared to most updates on other topics. It can be pointed out empirically that the Klout rating depends strongly on hits: if an account has a small number of hits, it will have a low Klout score, as well as low values of other statistically based popularity estimates.

Using this raw data, we can build at least two indicators that can be used to measure the audience of an account.

Active and passive audience: the active audience A1 is calculated as the ratio between the average number of likes and the total number of readers. The passive audience A'1 is the complementary value, obtained as the difference between 100 percent and the value of A1 (see Fig. 3):

A1 = Navg.like / Nreaders × 100%    (1)
A'1 = 100% − A1    (2)

Fig. 3. Active audience percentage (blue bars) and social importance (red bars).

Social importance of an account is the ratio between the average number of shares and the total number of readers, similar to the previous indicator:

A2 = Navg.shares / Nreaders × 100%    (3)

Social importance cannot be high for a personal account; otherwise it is not a personal account but a global or local media outlet, which serves as a primary source for a very large number of other accounts. This indicator can indeed be used to determine whether an account belongs to a real person or is a media front end. An importance of more than 0.5% is very good for a person, and an importance higher than 10% is a mark of a media outlet.

Having these two base indicators, we can proceed to derived indicators. For example, the sensitivity level of an account can be measured as the ratio between the average like rate and the number of updates per month:
E1 = Navg.like / Nupd/month × 100%    (4)

This indicator can be used to determine how the account owner's main topics are valued by his or her audience. Moreover, the monthly change of the sensitivity level can be used to evaluate the growth or decline of the account's authority within its regular audience. This indicator does not depend on hit topics, so it is more precise than other statistical ratings.

The next indicator is calculated as the ratio between the minimal and maximal like rates and shows audience coverage:

E2 = Nmin.likes / Nmax.likes × 100%    (5)

Using this indicator and its monthly change, it is also possible to measure the growth of an account's popularity. Likewise, by carefully studying hit topics along with the monthly change of audience coverage, we can evaluate how well the account owner's views correspond to the views and interests of his or her subscribers.

Finally, using the ratio between the average number of shares and the average like rate, we can determine relevance. Updates that are important to the account's audience will not only be "liked" but also "shared", so the larger the percentage of such updates, the larger the value of this indicator:

E3 = Navg.shares / Navg.likes × 100%    (6)

Similar to social importance, this indicator can also be used to determine whether an account is a media outlet. For a personal account it shows the degree of opinion leadership; the persons with the highest sensitivity values are the opinion leaders of the group.
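As an illustration only, the six indicators above can be computed from the raw counts in Tables 1 and 2 with a few lines of Python; the function and argument names below are hypothetical, and the input values simply mirror one row of the tables.

```python
def audience_indicators(avg_likes, avg_shares, min_likes, max_likes,
                        readers, updates_per_month):
    """Compute indicators (1)-(6) for one account, as percentages."""
    a1 = avg_likes / readers * 100            # (1) active audience
    a1_passive = 100 - a1                     # (2) passive audience
    a2 = avg_shares / readers * 100           # (3) social importance
    e1 = avg_likes / updates_per_month * 100  # (4) sensitivity level
    e2 = min_likes / max_likes * 100          # (5) audience coverage
    e3 = avg_shares / avg_likes * 100         # (6) relevance
    return {"A1": a1, "A1_passive": a1_passive, "A2": a2,
            "E1": e1, "E2": e2, "E3": e3}

# Example with the row for account O. T. from Tables 1 and 2:
print(audience_indicators(avg_likes=132, avg_shares=21, min_likes=15,
                          max_likes=302, readers=34761, updates_per_month=11))
```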
Let us calculate these indicators for our test subjects (see Fig. 4). Of course, this study is only an approach to an estimation method, but even in this short form it can be used for analysis in social networks. It can even address the problem of the "invisible audience". In social networks the audience largely remains invisible to users and can be estimated only indirectly, via feedback. But feedback is unstable and varies from day to day, because users may simply be logged out, may not have seen a particular post, and so on. For big media products the audience can be estimated via surveys and web analytics, but for individuals such means are out of reach, so they do not see their audience. Yet that "invisible audience" is critical for them, and our method can quantify it and help them improve their media activity.

Fig. 4. Sensitivity (blue bars), audience coverage (red bars) and relevance (green bars).

4 Conclusion and Future Works

This article covers an experiment conducted over only one month, from raw data to a certain degree of generalization, summarized as a set of indicators and formulas. Given the high rate of events in the selected social network segment, this survey is just an outline, merely an approach to a more complex and more general estimation method. For example, we did not include the number of comments in our survey for two main reasons: (a) we simply do not have a method to determine whether a comment is automated or belongs to a real person and represents a real opinion; (b) we do not have an appropriate method for estimating the value of a comment (Facebook only allows liking a comment).

The question of fake accounts and automated comments is open and highly disputable. Facebook itself estimates that between 5.5% and 11.2% of the accounts on its platform are fake [15]. There are also web services that estimate the number of fakes among the friends of a given account based on certain criteria [16, 17]; such tools are provided by SocialBakers [12], and there are also methods to distinguish fakes from real profiles [18], among others. But these only allow estimating the overall number of fakes, not the nature of a particular comment and its author. So there is a need for a detailed study of comments, which is one of our main goals for future work. Our next goal is to create an integral rating estimate for an account, which can provide an alternative to Klout and other frequency-dependent statistical tools. We intend to survey the selected accounts closely over longer periods and determine not only the base indicators but also the dynamics of their change. The third direction of our future work is surveying trending topics on Facebook, their origins, flow, and process of propagation, along with an analysis of the interest spaces related to them. Such complex studies will be useful not only for exploring information flow in a social network, but will also help people improve their popularity and promote their original content without the need for frequent updates and dependence on global news traffic.

References

1. Benevenuto, F., Rodrigues, T., Cha, M., Almeida, V.: Characterizing user behavior in online social networks. In: Proceedings of the 9th ACM SIGCOMM Conference on Internet Measurement Conference, New York, USA, pp. 49–62 (2009)
2. Golliher, S.: How I reverse engineered Klout score. Online journal by Sean Golliher. http://www.seangolliher.com/2011/uncategorized/how-i-reversed-engineered-klout-score-to-an-r2-094/
3. Stevenson, S.: What your Klout score really means. Wired. http://www.wired.com/2012/04/ff_klout/all/. Accessed Apr 2012
4. Drula, G.: Social and online media research – data, metrics and methods. Rev. Appl. Socio Econ. Res. 3, 77–86 (2012)
5. Haewoon, K., Changhyun, L., Hosung, P., Sue, M.: What is Twitter, a social network or a news media. In: Proceedings of the 19th International Conference on World Wide Web, New York, USA, pp. 591–600 (2010)
6. Kempe, D., Kleinberg, J., Tardos, E.: Maximizing the spread of influence through a social network. Theory Comput. Open Access J. 11, 105–147 (2015)
7. Ruixu, G.: Research on information spreading model of social network. In: Second International Conference on Instrumentation and Measurement, Computer, Communication and Control, Beijing, China, pp. 918–920 (2012)
8. Tang, J.: Computational models for social network analysis. A brief survey. In: Proceedings of the 26th International Conference on World Wide Web Companion, Perth, Australia, pp. 921–925 (2017)
9. Jingbo, M., Lourdes, M., Amanda, H., Minwoong, C., Jeff, C.: Research on social networking sites and social support from 2004 to 2015: a narrative review and directions for future research. Cyberpsychol. Behav. Soc. Netw. 20(1), 44–51 (2017)
10. Newman, M.E.J., Watts, D.J., Strogatz, S.H.: Random graph models of social networks. Proc. Nat. Acad. Sci. U.S.A. 99 (2002)
11. Cha, M., Haddadi, H., Benevenuto, F., Gummadi, K.P.: Measuring user influence in Twitter: the million follower fallacy. In: Proceedings of the 4th International AAAI Conference on Weblogs and Social Media (ICWSM 2010), Washington DC, USA, pp. 10–17 (2010)
12. SocialBakers Facebook Statistics (Ukraine). http://www.socialbakers.com/statistics/facebook/pages/total/ukraine/
13. Jaitne, M., Kantola, H.: Countering threats: a comprehensive model for utilization of social media for security and law enforcement authorities. In: Proceedings of the 13th European Conference on Cyberwarfare and Security, Greece, pp. 102–109 (2014)
14. Ronzhyn, A.: The use of Facebook and Twitter during the 2013–2014 protests in Ukraine.
In: Proceedings of the European Conference on Social Media, University of Brighton, UK, pp. 442–448 (2014) 15. Facebook Estimates from 5.5 to 11.2 accounts are fake. The Next Web. http:// thenextweb.com/facebook/2014/02/03/facebookestimates-5-5-11-2-accounts-fake/ 16. Veerasamy, N., Labuschagne, W.: Determining trust factors of social networking sites. In: Proceedings of 12th European Conference on Information Warfare and Security, Finland, pp. 288–297 (2013) A Study of Measurement of Audience in Social Networks 367 17. Sirivianos, M, Cao, Q., Yang, X., Pregueiro, T.: Aiding the detection of fake accounts in large scale social online services. In: Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation, USENIX Association Berkeley, CA, USA, pp. 15–15 (2012) 18. Cook, D.: Identity multipliers and the mistaken Twittering of birds of feather. In: Proceedings of the 13th European Conference on Cyberwarfare and Security, Greece, pp. 42–48 (2014) 368 M. Al-Maitah Predicting Disease Outbreaks Using Social Media: Finding Trustworthy Users Razieh Nokhbeh Zaeem(B) , David Liau, and K. Suzanne Barber Center for Identity, The University of Texas at Austin, Austin, USA {razieh,sbarber}@identity.utexas.edu, davidliau@utexas.edu Abstract. The use of Internet data sources, in particular social media, for biosurveillance has gained attention and credibility in recent years. Finding related and reliable posts on social media is key to performing successful biosurveillance utilizing social media data. While researchers have implemented various approaches to ?lter and rank social media posts, the fact that these posts are inherently related by the credibility of the poster (i.e., social media user) remains overlooked. We propose six trust ?lters to ?lter and rank trustworthy social media users, as opposed to concentrating on isolated posts. We present a novel biosurveillance application that gathers social media data related to a bio-event, pro-cesses the data to ?nd the most trustworthy users and hence their trust-worthy posts, and feeds these posts to other biosurveillance applications, including our own. We further present preliminary experiments to eval-uate the e?ectiveness of the proposed ?lters and discuss future improve-ments. Our work paves the way for collecting more reliable social media data to improve biosurveillance applications. Keywords: Biosurveillance · Social media · Twitter · Trust 1 Introduction Thanks to the ever-growing use of social media, the Internet is now a rich source of opinions, narratives, and information, expressed by millions of users in the form of unstructured text. These users report, among many other things, their encounters with diseases and epidemics. Internet biosurveillance utilizes the data sources found on the Internet (such as news and social media) to improve detec-tion, situational awareness, and forecasting of epidemiological events. In fact, since mid 1990’s, researches have used Internet biosurveillance techniques to predict a wide range of events, from in?uenza [5] to earthquakes [9]. Internet biosurveillance takes advantage of what is called hivemind on social media—the collective intelligence of the Internet users. The sources of Internet biosurveillance (e.g., social media) are, generally, timely, comprehensive, and available [10]. These sources, however, are enormous and noisy. An important pre-processing step to draw meaningful results from these sources is to ?lter and rank the most related parts of the data sources. 
Such filtering and ranking is widely recognized in the literature. For instance, in their overview of Internet biosurveillance [10], Hartley et al. break the process into four steps: (1) the collection of data from the Internet; (2) the processing of the data into information; (3) the assembling of that information into analysis; and (4) the propagation of the analysis to biosurveillance experts. They identify relevancy ranking as one of the important sub-steps of processing data into information in step two, before the actual analysis begins in step three.

In order to filter and rank the posts (i.e., Twitter posts or news articles), researchers have implemented various approaches, such as Machine Learning (e.g., Naive Bayes and Support Vector Machines [6,19]) and Natural Language Processing (e.g., keyword and semantic-based filtering [8] and Latent Dirichlet Allocation [7]). All the previous efforts, however, have focused on ranking the posts independently [12], ignoring the fact that these posts (Twitter posts or news articles) are inherently related by virtue of the credibility of the poster (the Twitter user or news agency). Furthermore, users of social media can post about anything they wish to talk about. Some users talk about their illnesses online, and these are the users we wish to monitor, as they give us a sampling of the union's infectious disease state. However, users may talk about being ill to elicit sympathy from other users, or they may simply be faking it. It is important to evaluate the trustworthiness of users before extracting data for analysis.

Unlike previous work, we observe that the credibility of users with respect to a given epidemiological event should be taken into account when filtering and ranking related posts. We propose six trust filters that filter and rank social media users who post about epidemiological events: Expertise, Experience, Authority, Reputation, Identity and Proximity. These trust filters assess the credibility or trustworthiness of a user by considering the structure of the social network (e.g., the number of Twitter followers), the user's history of posts, the user's geo-location, and his/her most recent post. While we focus on the relevancy ranking sub-step by measuring user trustworthiness, we introduce a comprehensive framework that performs the entire cycle of Internet biosurveillance as described by the four steps of Hartley et al. [10]. We leave the technical details of some of the steps out of this paper and discuss them separately elsewhere. Finally, in a preliminary set of experiments, we collect the posts and geo-locations of 2,000 real Twitter users, investigate the effectiveness of our proposed trust filters, observe the statistics of the filter scores and the correlations between the filters, and suggest future improvements.

2 Overview: Surety Bio-Event App

The Surety Bio-Event App is our Internet biosurveillance application developed at the University of Texas at Austin for the DTRA Biosurveillance Ecosystem (BSVE) [18] framework. The BSVE provides capabilities allowing for disease prediction and forecasting, similar to the functionality of weather forecasting.

Fig. 1. Overview of the Surety Bio-Event App.
The BSVE is a virtual platform with a set of integrated tools and data ana-lytics which support real-time biosurveillance for early warning and course of action analysis. The BSVE provides a platform to access a large variety of social media data feeds, a software development kit to create applications (apps), var-ious tools, and the cloud service to host a web-based user interface. Developers develop BSVE apps and deploy them to the BSVE to be ultimately used by biosurveillance experts and analysts. Our Surety Bio-Event app covers the entire cycle of Internet biosurveil-lance according to previous work [10]. Figure 1 shows a high level picture of the Surety Bio-Event App. The four steps are: (1) Multi-Source Real-Time Data which collects data (Sect. 5), (2) Trust Filter which processes data into infor-mation (Sect. 3), (3) Surveillance Optimization (including early detection, situ-ational awareness and prediction) which assembles the information into analysis (Sect. 6), and (4) Forecasts and Predictions which propagates the analysis to experts through a Graphical User Interface (Sect. 4). Furthermore, the Surety app is user customizable and receives Goals and Situational Awareness as well as Historical Data, Detections, and Predictions from biosurveillance experts. Figure 2 shows a more detailed view of the App. In this paper, we concen-trate on the second step, the trust ?lter, while we broadly review the other steps too. With data collected from social media, the trust ?lter component of the App evaluates the data sources to ?nd the most trustworthy social media users with respect to a given surveillance goal. The trust ?lter component optimizes range, availability and quality of data using the combination of algorithms mea-suring six dimensions of trust: Expertise, Experience, Authority, Reputation, Identity and Proximity. The primary functions of the trust ?lter component are: (1) improving the quality of data employed by BSVE applications and analysts 372 R. N. Zaeem et al. Fig. 2. Diagram of data collection and analysis with the Surety Bio-Event App (SBEA). to make biosurveillance decisions, (2) tracking and quantifying trustworthiness of known, preferred users to guard against data bias and quality drift for BSVE applications and analysts, and (3) expanding the landscape of possible trusted social media users by o?ering trusted but previously unexplored users via rec-ommendation noti?cations to BSVE applications and analysts. 3 Trust Filters In order to determine user trustworthiness, we introduce the concept of a trust ?lter—a score between 0 and 1 assigned to a user (e.g., a Twitter user) which rates his/her trustworthiness with respect to a given criteria. We propose six trust ?lters: Expertise. Expertise measures a user’s involvement in the subject of inter-est [3]. We de?ne Expertise as the probability that a user will generate content on the topic in question (e.g., an In?uenza outbreak). Using the user’s history of posts, Expertise can be calculated as how often a speci?c user has written about the subject of interest in the past. Expertise(ui,t) = p(t|ui) = #Posts(ui,t)/#Posts(ui), where ui is a user in the social media network, t is a topic, and p(t|ui) is the probability that a user has generated content on that topic. We calculate this probability by counting the number of that user’s posts on the topic and dividing by his/her total number of posts. 
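As an illustration of the Expertise definition above, a minimal Python sketch follows; the keyword classifier is reduced to a hypothetical keyword list, which is an assumption rather than the authors' actual classifier.

```python
FLU_KEYWORDS = {"flu", "influenza", "fever"}   # illustrative keyword set only

def is_on_topic(post_text, keywords=FLU_KEYWORDS):
    """Very rough keyword-based stand-in for the paper's topic classifier."""
    words = post_text.lower().split()
    return any(word.strip(".,!?#@") in keywords for word in words)

def expertise(user_posts, keywords=FLU_KEYWORDS):
    """Expertise(u, t) = #on-topic posts of u / #posts of u (0 if no posts)."""
    if not user_posts:
        return 0.0
    on_topic = sum(1 for post in user_posts if is_on_topic(post, keywords))
    return on_topic / len(user_posts)

# Example: expertise(["Got the flu again :(", "Nice weather today"]) -> 0.5
```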
For all the filters, we use a keyword-based classifier to distinguish the posts concerning the topic of interest and the users posting about that topic.

Experience. Experience is the degree to which a user's posts are corroborated by other users. Informally, Experience seeks to measure how well a user's posts about a subject are corroborated by the ground truth. Assuming that the average involvement of all users in the subject of interest reveals the truth about the outside world (e.g., everybody posts about flu when a flu outbreak actually happens), we can use this average to calculate Experience. To do so, we measure the difference between a user's involvement in the subject, given by Expertise, and the average Expertise. To get a score between 0 and 1, and using the fact that Expertise is already between 0 and 1, we calculate Experience as

Experience(ui,t) = 1 − |Expertise(t) − Expertise(ui,t)|,

where Expertise(t) denotes the average Expertise over all users. The closer one's Expertise is to the average Expertise, the higher his/her Experience score.

Authority. Authority is the number and quality of social media links a user receives from Hubs as an Authority [3]. A link is a relationship between users, e.g., likes and comments on Facebook, or following on Twitter. We utilize the Hyperlink-Induced Topic Search (HITS) [11] algorithm, a link analysis algorithm widely used to rank Web pages and other entities connected by links, to obtain a score between 0 and 1. In this algorithm, certain users, known as Hubs, serve as trustworthy pointers to many other users, known as Authorities. Therefore, Authorities are the users that have been recognized within the social media community.

Reputation. Reputation is the number and quality of social media links to a user. We utilize the PageRank algorithm [2], another widely used ranking algorithm, to obtain a score between 0 and 1.
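Assuming the pruned follower network is available as a directed NetworkX graph, the Authority and Reputation scores described above could be sketched with off-the-shelf HITS and PageRank implementations; this is an illustration under stated assumptions, not the authors' code, and the rescaling to [0, 1] is a design choice of the sketch.

```python
import networkx as nx

def authority_and_reputation(follow_graph):
    """Score users with HITS authorities (Authority) and PageRank (Reputation).

    `follow_graph` is a directed graph with an edge u -> v when u follows
    (or links to) v. Scores are rescaled to [0, 1] by dividing by the maximum.
    """
    hubs, authorities = nx.hits(follow_graph, max_iter=500, normalized=True)
    pagerank = nx.pagerank(follow_graph, alpha=0.85)

    def rescale(scores):
        top = max(scores.values()) or 1.0
        return {user: score / top for user, score in scores.items()}

    return rescale(authorities), rescale(pagerank)

# Usage: authority, reputation = authority_and_reputation(pruned_user_graph)
```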
Identity. Identity is the degree of familial or social closeness between a user and the person afflicted with the disease. The Identity filter is defined as the relationship between the posting user who talks about the disease and the subject of the post who has somehow encountered the disease. If the user reports the disease about himself/herself, the assigned Identity score is the maximum value, 1. If the user reports about a close family member, the score is higher than when the user reports about an acquaintance. We utilize Natural Language Processing and greedy algorithms to calculate this score. The filter first finds all possible grammatical subjects of a sentence (e.g., a Twitter post); then, using the words in the family tree, it finds the closest family relationship to those subjects and reports that relationship (e.g., self, mother, co-worker, son) as the Identity. A score is assigned to this relationship, ranging from 1 (i.e., reporting the disease about oneself) to 0 (i.e., talking about total strangers). To get the Identity score of a user, the Identity values of all of his/her posts about the subject of interest are calculated and averaged. More details on this filter can be found in our previous work [13].

Proximity. Proximity estimates the distance of a user from the event (e.g., the disease outbreak location). Using relationship distance (i.e., the Identity score) and geographical distance (through geo-tagged posts and the user's geo-location), Proximity utilizes a greedy algorithm to perform graph traversal over the social media network and then combines the Identity value with the distance value to calculate Proximity, as shown in Algorithm 1.

Algorithm 1. Proximity Algorithm
Input: Directed user graph G
Output: Proximity scores user.proximity
 1  Initialize Identity threshold T;
 2  for user in users do
 3      if user.identity > T then
 4          user.separation = 1 / user.identity;
 5      else
 6          user.separation = ∞;
 7      end
 8  end
 9  for user u in G do
10      for user v in G \ {u} do
11          distance = distance from v to u;
12          u.separation = min(u.separation, v.separation × distance);
13      end
14  end
15  for user in users do
16      user.proximity = 1 − user.separation;
17  end

Note that the network graph used by the trust filters is pruned so that it contains only those users who have posted (at least once) about the subject of interest. As a result, trust filter scores are calculated with a focus on the community that discusses a particular subject on social media.

4 Trust Filter GUI

Figure 3 displays the Graphical User Interface (GUI) of the trust filter tab of the Surety app. The GUI is composed of four smaller windows. At the top left, the social media users are listed and, for each, the value of each of the six trust filters is shown. Next to the gear icons, the names of the six trust filters appear: Identity, Reputation, Experience, Expertise, Authority, and Proximity. The last column is the Combined trust score, currently the average of the six filters. On the GUI, the analyst or BSVE app developer selects a trust filter and can then sort the users with respect to that score (descending or ascending); the higher the score, the more trustworthy the user with respect to that trust filter. In Fig. 3, the users are sorted by Proximity in descending order. The analyst or BSVE app developer can also select favorite users that he/she has found trustworthy over time and mark them with a star. The GUI suggests social media users that have a higher combined score than the favorite users with a blue glow under the user name (trusted but previously unexplored users), as shown in the figure. The analyst can also review the favorite users (bring all the favorites to the top).

Fig. 3. Trust filter GUI of the Surety Bio-Event App.

On the GUI, the Network Graph is the top right window, which displays the users on social media as nodes, along with their links (e.g., following on Twitter) and sizes. The analyst can select a trust filter to size the nodes in the Network Graph; in this figure, the node sizes are based on Identity. At the bottom left of the GUI, under Node Histogram, the GUI charts the trust filter scores of the top five users for the selected filter. At the bottom right, under Trust Score Distribution, the GUI displays the range of user trustworthiness based on each filter and on the combined score: the distribution of user trust scores, with tunable granularity (set to 0.1 in this figure), shows the number of social media users that have a given trust score.

5 Data Collection

In this section and the next, we briefly overview the first and third steps of the biosurveillance process, namely data collection and optimization, for the sake of completeness. The Surety app (1) uses data already available on the BSVE and (2) collects data and uploads it to the BSVE.
5 Data Collection

In this section and the next, we briefly overview the first and third steps of the biosurveillance process, namely data collection and optimization, for the sake of completeness. The Surety app (1) uses data already available on the BSVE and (2) collects data and uploads it to the BSVE. The data sources monitored within the BSVE include well-established and trusted data providers such as the Centers for Disease Control (CDC) and the World Health Organization (WHO). Data from these sources give the analyst working with the BSVE the best possible measure of the state of disease within the country. In addition, the BSVE collects data from news sources and Twitter. Of what the BSVE already provides, Twitter contains a treasure trove of information; however, other sources such as blogs, Instagram, and Reddit have been underused. The Surety app aims to fix these gaps in data collection. The trust filter part of the Surety app seeks to collect data from other sources not currently supported by the BSVE that contain connectivity (network) information and are typically focused on individuals as opposed to news feeds. Figure 4 shows some of the data sources for the Surety app. Note that not all the data sources are candidates to be used with trust filters; some of them provide only time series data, which is used by the optimization part. The data sources that are appropriate for trust filters, and for which we have implemented methods within our API to collect historical user data as well as connections to streaming APIs, are: Twitter, WordPress, Instagram, Tumblr, Reddit, and Wikipedia.

6 Optimization

The third step of the biosurveillance process analyzes large collections of trusted data sources to assemble systems that efficiently achieve user-specified surveillance goals, such as early outbreak detection. This analysis is accomplished through optimization algorithms that evaluate data collections by comparison to historical and simulated bio-events. The Surety app yields trusted data sources, along with statistical models and performance metrics, to support future surveillance activities. The trust filter part of the Surety app is capable of collecting a wide range of data and then formatting that data into the time series data source required by the optimization part. Our optimization algorithms, discussed elsewhere, include early detection, situational awareness, and prediction [14].

7 Implementation

Our app is implemented with a Python Flask back-end and a JavaScript front-end. The back-end was developed to allow for user interactivity with the front end. It serves JSON data generated from the algorithms to the user interface. The application is integrated into the BSVE.

Fig. 4. Data collection sources of the Surety App.
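The implementation description above suggests a back-end along the following lines; the route name, payload shape, and stubbed scores are assumptions for illustration, not the actual Surety code.

```python
# Sketch of a Flask route serving trust-filter scores as JSON to the front end.
from flask import Flask, jsonify

app = Flask(__name__)

# In the real app these would come from the trust-filter pipeline; stubbed here.
TRUST_SCORES = {
    "user_a": {"identity": 0.5, "reputation": 0.1, "experience": 0.9,
               "expertise": 0.2, "authority": 0.0, "proximity": 0.4},
}

@app.route("/api/trust-scores")
def trust_scores():
    payload = []
    for user, scores in TRUST_SCORES.items():
        combined = sum(scores.values()) / len(scores)
        payload.append({"user": user, **scores, "combined": round(combined, 3)})
    return jsonify(payload)

if __name__ == "__main__":
    app.run(debug=True)
```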
8 Experiments

We have designed a preliminary set of experiments to answer the following research question: how well do the proposed filters perform? To answer this question, we plan to use seed data (e.g., a synthetic network of users, posts, and disease outbreaks) as well as actual data (e.g., an actual network of Twitter users and their posts).
1. We observe the values of the trust filters and their trends.
2. We compare filter scores against hospital data to judge the ability of the trust filters to detect disease outbreaks.
In this paper, we observe the trend of the proposed trust filters for a real network of 2,000 Twitter users and their posts. The use of seed data, as well as the comparison with hospital data, is work in progress. For this set of experiments, we downloaded the posts and geo-locations of 2,000 Twitter users. To do so, we performed a keyword search for the word 'flu' on the Twitter API and then downloaded the user profile information (including geo-location coordinates), the users' friends' timelines, lists of friends and followers, and the past 30 days of tweets. We started the download on July 22, 2016 and, because of Twitter's bandwidth limitations, it took us a week to download 2,000 users who had posted at least once with the word 'flu', totaling 33 GB. Note that not all the posts of these users over the past 30 days are necessarily about flu; we use a keyword-based classifier to distinguish flu-related posts.

Figure 5 shows the filters' maximum, minimum, and average values. The Identity trust filter has an average (as well as peak) value of about 0.48, which means that, when people do post about flu, they tend to post about flu encounters of their nuclear family members, as 0.5 is assigned to nuclear family members for the Identity score. Reputation and Authority scores are uniformly close to 0, implying that the network we downloaded had very little connectivity. The low degree of connectivity is expected, since people who post about flu do not necessarily tend to follow others who post about flu. The average value of Expertise was close to 0 as well, meaning that even among those who have posted about flu at least once, the number of flu-related posts over a 30-day period was relatively very low. The average value of 0.95 for Experience shows that most users' Expertise scores were close to the average Expertise, i.e., close to 0. Investigating the outliers should point to users who were unusually concerned about flu. Finally, we found that Proximity should be redefined to make it independent of Identity and to show concrete distance from outbreak locations.

Fig. 5. Statistics of trust filters.

Figures 6, 7, 8, and 9 display the most interesting correlations we found between the filter values. Figure 6 shows that the Combined score is most heavily influenced by Identity; these two filters are related with an R2 of 0.49. Therefore, we might need to normalize and weight the filters to obtain a new, less-biased definition of the Combined score. Figure 7 charts the correlation between the Reputation and Authority filters (R2 = 0.15). These two filters are not closely related; therefore, while both measure the connectivity of the network, they consider different aspects of connectivity.

Fig. 6. Correlation between Combined Filter and Identity.
Fig. 7. Correlation between Reputation and Authority.

Figure 8 confirms that Experience and Expertise are inversely correlated. We might need to update the definition of Experience to measure the corroboration by others differently.

Fig. 8. Correlation between Expertise and Experience.

Finally, while Proximity is initialized with Identity, as Fig. 9 shows, it is rather independent of Identity. While the Proximity of users to a potential outbreak location can be compared to one another, the absolute value of Proximity still does not show the concrete physical distance between the user and a flu outbreak location.

8.1 Feature Importance

We compare our trust filters with other simple features that are widely studied in processing Twitter data [16]. Figure 10 and Table 1 show the feature importance scores from the Scikit-Learn kit [17]. We use the Extremely Randomized Tree Classifier as our method to evaluate the importance of each feature.
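A minimal sketch of this importance computation with scikit-learn follows; the feature names mirror Table 1, while the feature matrix and labels here are placeholders rather than the study's data.

```python
# Sketch: Gini-based feature importances with extremely randomized trees.
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier

feature_names = ["num_posts", "experience", "expertise", "avg_post_length",
                 "num_at_tags", "identity", "reputation", "proximity",
                 "retweet", "contains_question", "contains_exclaim", "authority"]

X = np.random.rand(2000, len(feature_names))   # placeholder user-level features
y = np.random.randint(0, 2, size=2000)         # placeholder labels

model = ExtraTreesClassifier(n_estimators=250, random_state=0).fit(X, y)
for name, score in sorted(zip(feature_names, model.feature_importances_),
                          key=lambda pair: pair[1], reverse=True):
    print(f"{name}: {score:.3f}")
```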
We utilize a library [1,15] in which the Gini coe?cient is used as a measure to the importance of each feature. In short, the total importance scores sum up to one and the larger the score is, the more important in decision that feature is. As Table 1 shows, the best feature from Extremely Random Tree Classi?er is the number of posts by a speci?c user within the given period of time. Consequently, the ?lters that are based on the number of related posts, such as Experience and Expertise, work well. However, the number of posts can be easily forged with posting robots or Spam posts. Two other features that are known to perform Predicting Disease Outbreaks Using Social Media 381 Fig. 9. Correlation between Identity and Proximity. Fig. 10. Feature importance. 382 R. N. Zaeem et al. well in similar types of problems are the average post length and the number of tagged Twitter IDs which start with the symbol “@” [4]. Therefore, poten-tial ?lters to consider can be based on these features. Identity, Reputation, and Proximity all perform better than the other features studied in previous work, including retweet, and whether or not the posts contain ‘?’ and ‘!’. Finally, Authority performs poorly and can be considered irrelevant. Table 1. Features and corresponding importance scores. Feature Importance score Number of posts 0.205 Experience 0.143 Expertise 0.132 Avg. post length 0.129 Number of @ tags 0.111 Identity 0.100 Reputation 0.099 Proximity 0.033 Retweet 0.029 Contains ‘?’ 0.010 Contains ‘!’ 0.009 Authority 0.002 9 Conclusion Filtering and ranking social media posts is essential to biosurveillance applica-tions that monitor them to detect and forecast disease outbreaks. We introduced a novel way to ?lter and rank social media posts by concentrating on the trust-worthiness of social media users with respect to a given subject. We proposed six trust ?lters and used them in the context of a complete biosurveillance applica-tion. We further evaluated these trust ?lters by observing how they perform on a real set of Twitter posts downloaded from 2,000 users for over 30 days. Improv-ing the ?lter de?nitions and judging the e?ectiveness of the ?lters in ?nding actual disease outbreaks are two major future work directions. Acknowledgment. Surety Bio-Event App is a long term project of the Center for Identity. The authors thank Guangyu Lin, Roger A. Maloney, Ethan Baer, Nolan Corcoran, Benjamin L. Cook, Neal Ormsbee, Haowei Sun, Zeynep Ertem, Kai Liu, and Lauren A. Meyers for their contribution to this project. This work has been funded by Defense Threat Reduction Agency (DTRA) under contract HDTRA1-14-C-0114 CB10002. Predicting Disease Outbreaks Using Social Media 383 References 1. Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J.: Classi?cation and Regres-sion Trees. Statistics/Probability Series. Wadsworth Publishing Company, Belmont (1984) 2. Brin, S., Page, L.: The anatomy of a large-scale hypertextual web search engine. Comput. Netw. ISDN Syst. 30(1), 107–117 (1998) 3. Budalakoti, S., Barber, K.S.: Authority vs a?nity: modeling user intent in expert ?nding. In: 2010 IEEE Second International Conference on Social Computing (SocialCom), pp. 371–378. IEEE (2010) 4. Castillo, C., Mendoza, M., Poblete, B.: Information credibility on Twitter. In: Proceedings of the 20th International Conference on World Wide Web, WWW 2011, pp. 675–684. ACM, New York (2011) 5. Collier, N., Son, N.T., Nguyen, N.M.: OMG U got ?u? Analysis of shared health messages for bio-surveillance. J. Biomed. Semant. 
2(5), S9 (2011) 6. Denecke, K., Krieck, M., Otrusina, L., Smrz, P., Dolog, P., Nejdl, W., Velasco, E.: How to exploit Twitter for public health monitoring. Methods Inf. Med. 52(4), 326–39 (2013) 7. Diaz-Aviles, E., Stewart, A., Velasco, E., Denecke, K., Nejdl, W.: Epidemic intelli-gence for the crowd, by the crowd. Int. AAAI Conf. Web Soc. Media 12, 439–442 (2012) 8. Doan, S., Ohno-Machado, L., Collier, N.: Enhancing Twitter data analysis with simple semantic ?ltering: example in tracking in?uenza-like illnesses. In: IEEE Second International Conference on Healthcare Informatics, Imaging and Systems Biology (HISB), pp. 62–71 (2012) 9. Doan, S., Vo, B.-K.H., Collier, N.: An analysis of Twitter messages in the 2011 Tohoku earthquake. In: International Conference on Electronic Healthcare, pp. 58–66. Springer (2011) 10. Hartley, D.M., Nelson, N.P., Arthur, R., Barboza, P., Collier, N., Lightfoot, N., Linge, J., Goot, E., Mawudeku, A., Mado?, L.: An overview of internet biosurveil-lance. Clin. Microbiol. Infect. 19(11), 1006–1013 (2013) 11. Kleinberg, J.M.: Authoritative sources in a hyperlinked environment. J. ACM (JACM) 46(5), 604–632 (1999) 12. Lamb, A., Paul, M.J., Dredze, M.: Separating fact from fear: tracking ?u infections on Twitter. In: HLT-NAACL, pp. 789–795 (2013) 13. Lin, G., Nokhbeh Zaeem, R., Sun, H., Barber, K.S.: Trust ?lter for disease surveil-lance: Identity. In: IEEE Intelligent Systems Conference, pp. 1059–1066, September 2017 14. Liu, K., Srinivasan, R., Ertem, Z., Meyers, L.: Optimizing early detection of emerg-ing outbreaks. Poster presented at: Epidemics 6, Sitges, Spain, November 2017 15. Louppe, G., Wehenkel, L., Sutera, A., Geurts, P.: Understanding variable impor-tances in forests of randomized trees. In: Proceedings of the 26th International Conference on Neural Information Processing Systems, NIPS 2013, USA, vol. 1, pp. 431–439. Curran Associates Inc. (2013) 16. ODonovan, J., Kang, B., Meyer, G., H¨ollerer, T., Adalii, S.: Credibility in context: an analysis of feature distributions in Twitter. In: 2012 International Conference on Privacy, Security, Risk and Trust and 2012 International Conference on Social Computing, pp. 293–301, September 2012 384 R. N. Zaeem et al. 17. Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E.: Scikit-learn: machine learning in python. J. Mach. Learn. Res. 12, 2825–2830 (2011) 18. Digital Infuzion: DTRA Biosurveillance Ecosystem (BSVE) (2017) 19. Torii, M., Yin, L., Nguyen, T., Mazumdar, C.T., Liu, H., Hartley, D.M., Nelson, N.P.: An exploratory study of a text classi?cation framework for internet-based surveillance of emerging epidemics. Int. J. Med. Inform. 80(1), 56–66 (2011) Detecting Comments Showing Risk for Suicide in YouTube Jiahui Gao1 , Qijin Cheng2(&) , and Philip L. H. Yu1 1 Department of Statistics and Actuarial Science, The University of Hong Kong, Pok Fu Lam, Hong Kong 2 Department of Social Work, The Chinese University of Hong Kong, Shatin, Hong Kong qcheng@cuhk.edu.hk Abstract. Natural language processing (NLP) with Cantonese, a mixture of Traditional Chinese, borrowed characters to represent spoken terms, and Eng-lish, is largely under developed. To apply NLP to detect social media posts showing suicide risk, which is a rare event in regular population, is even more challenging. 
This paper tried different text mining methods to classify comments in Cantonese on YouTube whether they indicate suicidal risk. Based on word vector feature, classi?cation algorithms such as SVM, AdaBoost, Random Forest, and LSTM are employed to detect the comments’ risk level. To address the imbalance issue of the data, both re-sampling and focal loss methods are used. Based on improvement on both data and algorithm level, the LSTM algorithm can achieve more satis?ed testing classi?cation results (84.3% and 84.5% g-mean, respectively). The study demonstrates the potential of auto-matically detected suicide risk in Cantonese social media posts. Keywords: SuicideText miningSocial mediaCantonese Sentiment analysis 1 Introduction Suicide is a serious public health concern globally and Hong Kong is no exception. The latest suicide rate in Hong Kong is about 11.7 per 100,000 [1], which is about the medium level in the global context [2]. In addition, suicide is the leading cause of death among young people in Hong Kong [3]. Due to the popularity of social networking sites in recent years, many young people were found to disclose their emotional distress and even suicidal thoughts through social media [4]. Suicide prevention professionals are, therefore, highly concerned with those online contents and hope to detect online posts showing risk for suicide as early as possible so that interventions can be delivered and lives can be saved. Q. Cheng—Equal ?rst author. © Springer Nature Switzerland AG 2019 K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 385–400, 2019. https://doi.org/10.1007/978-3-030-02686-8_30 1.1 Related Work Some pioneering efforts have been conducted to detect textual content showing suicide risk. Some basic machine learning methods were used to classify suicide notes, achieving 71% accuracy [6]. However, the accumulation of suicide notes is restricted by very limited data sources and can be time consuming. Thanks to the instantaneity of social media content, detection of suicide ideation in social network can strengthen suicide prevention to a large extent. However, few work of suicide text detection in social media has been conducted. In 2007, blogs were ?rst used to detect users at risk. Yen-Pei Huang [7] applied simple counting methods based on suicide-related key-words to detect bloggers with suicide tendency [8], which only achieved 35% success rate with low accuracy. Based on simple token unigram bag-of-word features, machine learning algorithms were also used to predict the suicide tendency on Twitter [9]. Concerning users’ behavior feature in social network, M. Johnson Vioulès [10] applied a martingale framework for suicide warning signs detection. However, Vioulès’ study was run on only two Twitter users’ data. In Mainland China, researchers have tried different statistical and machine learning methods to detect Weibo (a Chinese social media site) posts showing emotional distress and suicide risk [11]. Although achieving promising results, they also noted a few challenges. First, dataset for detecting suicide risk is often highly imbalanced, given that suicidal behavior is a rare event. A number of solutions to the class-imbalance problem are proposed both at the data and algorithm levels [12]. At data level, researchers often had to conduct re-sampling to adjust the imbalance, such as random over-sampling of minority class with replacement, random under-sampling of majority class, direct over-sampling, direct under-sampling, and so on [13]. 
At algorithm level, adjustment of cost function in algorithm is suggested. In addition, those studies often retrospectively collected data from social media and used the historical data for training and testing. However, such solutions will make it questionable to directly apply the results in real life, where suicide is indeed a rare event and social media contents are constantly updating and evolving. Although both Mainland China and Hong Kong consist mainly of Chinese ethnics, Hong Kong people speak Cantonese dialect and often write in a mixture of Cantonese and English due to its history of being a British colony. Due to the absence of Can-tonese natural language processing tool, text feature extraction in Cantonese is often based on simple n-gram features rather than word features [14]. A study found that Cantonese pre-treated by a Mandarin word segmentation tool consistently outperforms the character n-gram split [15]. In order to classify the at-risk online text better, we need to do Cantonese word segmentation using a satisfactory method. The main contribution of this paper is fourfold. First, this might be the ?rst time that Cantonese social media texts’ word vector features are used for detecting suicide risk. We conducted Cantonese word segmentation based on a relatively complete Cantonese dictionary by combining dictionaries on the internet. Second, unlike pre-vious suicide detection that relied on retrospective accumulation, we investigated an algorithm to detect suicide risk based on comments’ text features immediately. Third, deep learning method was used to train the word vector model and achieved a better result than custom machine learning model. Lastly, we introduced the focal loss, in 386 J. Gao et al. addition to the re-sampling method, to tackle the imbalance issue in text ?eld and achieved a satisfactory result. Focal loss, a new loss function, is found to be an effective alternative for dealing with class imbalance [16]. 1.2 Paper Outline In the next section, the construction of Cantonese resource base will be briefly intro-duced. Section 3 presents the methods we used to preprocess the suicide-related comments. Section 4 introduces the feature extraction and classi?cation methods. Evaluation metrics will also be introduced in this section. Section 5 analyzes the experiment result. In the last section, this paper is concluded and future works are discussed. 2 Construction of Cantonese Resource Base Social media posts are openly available at large. However, to label which posts show risk for suicide requires annotations by suicide prevention professionals. Besides, even though the simple Chinese and English text mining is relatively mature, little work was done in the Cantonese text mining. The absence of popular Cantonese dictionary is also an obstacle in the ?eld. 2.1 Data Collection and Annotation There has been a surge of student suicides in Hong Kong in recent years, which was prominently reported by local press and generated wide discussion among the public. One of the authors, QC, has been monitoring how people responded to this issue in social media. She identi?ed 162 YouTube videos relating to this issue published during the 2015/16 school year, to which there were 5051 comments posted in the public domain. The comments were downloaded by calling YouTube API and annotated by QC and a trained research assistant (RA). 
Those comments indicating that the com-menter was having or had serious suicidal thoughts, including having attempted sui-cide, were labelled as at-risk. Both QC and the research assistant have ?rst coded a random sample of 100 comments separately. The inter-rater reliability was examined by Cohen’s Kappa coef?cient as 0.91, which indicated high agreement. Then the RA completed the annotation of the rest of comments. 2.2 Construction of Cantonese Corpus In fact, Cantonese is primarily a spoken language. The most important mechanism by which Cantonese is represented in written form is phonetic borrowing. Sometimes, when confronting the ‘sound but no character’ problem, Cantonese speakers resorted to the strategy of creating a new character to represent a Cantonese word [17]. Similar to comments in YouTube, local online forums also contain a large amount of short Cantonese texts mixed with extra characters. In order to acquire more written Detecting Comments Showing Risk for Suicide in YouTube 387 Cantonese corpus, 4,310,566 written Cantonese posts were crawled from a popular local online forum [18]. 2.3 Construction of Cantonese Dictionary Word segmentation is a very important part before text classi?cation. A good Can-tonese dictionary is important in doing word segmentation. Through combining 26 Cantonese lexicons in Sogou [19], a popular text input software in China, we con-structed a Cantonese dictionary containing 597,731 Cantonese words. 3 Text Preprocessing YouTube comments are mainly written in Cantonese. However, English is also a popular and of?cial language in Hong Kong, 9% of the total comments that we col-lected from YouTube are in English. To complete a full analysis, those English words were ?rst translated into Cantonese. 3.1 Translation Because of lacking direct translation tool from English to Hong Kong Cantonese, two steps were made to translate English comments to Cantonese. First, English words were translated into simpli?ed Chinese using the Google Translate API [20] for Python. Second, Open Chinese Converter Project (OpenCC) [21] was used to convert simpli-?ed Chinese to Hong Kong Cantonese. OpenCC is an open source project for con-version between Traditional Chinese and Simpli?ed Chinese, supporting regional idioms in Mainland China and Hong Kong [22]. 3.2 Filtering Stop words, by de?nition, are those words that appear in the texts frequently but do not carry signi?cant information [23]. Effective text mining can be achieved by removal of stop words. Cantonese and Mandarin Chinese are within the same language family, so their written forms share a number of words in common [15]. Due to the absence of Cantonese stop word dictionary, we used the Mandarin stop words dictionary to ?lter comments. Similar to English stop words, Chinese stop words are usually those words with part of speeches like adjectives, adverbs, prepositions, interjections, and auxil-iaries. Adverb “ ” (of), preposition “ ” (in), conjunction “ ” (because of) and “ ” (so) are some examples [23]. According to the guidelines for manual annotation, a comment would be labelled as non-risk if it only contains stop words, punctuations or emoji, because these simple terms cannot provide suf?cient information for the readers to assess suicide risk. Following this guideline, if a comment only contains these terms, it will be detected and classi?ed as non-risk comment at ?rst. For other comments, these terms will be removed ?rst and the remaining text will be classi?ed using the classi?cation models. 388 J. 
4 Text Classification for Suicidality Detection

4.1 Feature Representation

It is common to represent a document as a vector. In this paper, we utilized the Jieba [24] segmentation tool and the word2vec [25] model to acquire sentence vectors. Unlike English, Chinese sentences do not contain spaces, so words in a sentence cannot be detected automatically. Based on the Cantonese dictionary constructed in the last section, we conducted text segmentation using Jieba [24], a Chinese text segmentation tool, to split each sentence into words. A distributed representation of words in a vector space can group similar words together and help algorithms achieve better results. This paper used the word2vec model developed by Mikolov [25] for learning vector representations of words. We set the dimensionality of the vectors to 100 and learned the word vectors from the large dataset (4,310,566 Cantonese posts) collected from the local forum. We then averaged the word vectors in a comment to acquire its document vector. Figure 1 shows the word for 'suicide' (in Traditional Chinese) and its 100 neighbouring words according to the cosine similarity between word vectors. The 100-dimensional word vector data were projected into 3 dimensions using Principal Component Analysis (PCA).

Fig. 1. Word vector visualization.

4.2 Classifier

After filtering out those comments containing only stop words, punctuation, or emoji as non-risk data, the remaining comments need to be classified. Both machine learning and deep learning methods are popular in text classification; this paper used algorithms from both fields to detect whether a comment shows risk for suicide.

Support Vector Machine (SVM). The Support Vector Machine (SVM) has been shown to be highly effective at traditional text categorization [26]. This method searches for a hyperplane, represented by a vector, that separates the document vectors of the two classes with maximum margin.

AdaBoost. Adaptive Boosting (AdaBoost) aims at constructing a "strong" classifier by combining a number of "weak" classifiers [14]. Weights are used in AdaBoost to increase the importance of misclassified data and decrease the importance of correctly classified data. By combining these weak classifiers based on their relative performance, AdaBoost can achieve improved accuracy.

Random Forest (RF). Random forest is a variant of the bagging methods proposed by Breiman [27]. Similar to bagging, random forest constructs a decision tree for each of the bootstrap samples drawn from the data. Unlike bagging, random forest randomly selects a subset of predictors to determine the optimal splitting rule in each node of the trees, in order to avoid overfitting [28].

Long Short-Term Memory network (LSTM). The Long Short-Term Memory network (LSTM) [29] is a special kind of recurrent neural network, capable of learning long-term dependencies. We trained the LSTM model on words, using the pre-trained word2vec embedding layer with 100 dimensions. As shown in Fig. 2, the model takes the mean of the outputs of all LSTM cells to form a feature vector and then applies multinomial logistic regression to this feature vector [30].

Fig. 2. Long short-term memory.
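A rough Keras sketch of the architecture just described (mean-pooled LSTM outputs on top of a fixed word2vec embedding) is given below. The 100-dimensional embeddings match the paper's setting; the number of LSTM units, vocabulary size, and sequence length are assumptions.

```python
# Sketch: word2vec embeddings -> LSTM -> mean over time steps -> risk probability.
from tensorflow.keras import layers, models

MAX_LEN, VOCAB_SIZE, EMBED_DIM = 50, 20000, 100  # assumed sizes

def build_lstm(embedding_matrix=None):
    inputs = layers.Input(shape=(MAX_LEN,))
    embed = layers.Embedding(
        VOCAB_SIZE, EMBED_DIM,
        weights=[embedding_matrix] if embedding_matrix is not None else None,
        trainable=False)(inputs)                       # pre-trained word2vec layer
    seq = layers.LSTM(64, return_sequences=True)(embed)  # outputs of all LSTM cells
    pooled = layers.GlobalAveragePooling1D()(seq)         # mean over time steps
    output = layers.Dense(1, activation="sigmoid")(pooled)  # at-risk probability
    model = models.Model(inputs, output)
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model

# Usage (placeholders): model = build_lstm(w2v_matrix)
# model.fit(X_train, y_train, batch_size=32, epochs=4)
```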
Topic seed words classification model. The suicide-related comment data studied here are extremely imbalanced, with far more non-risk than at-risk comments. This paper therefore designed a topic seed-word classification model to filter out non-risk comments first and then use the relatively balanced data to train the classifiers. First, seed words [31] relating to the suicide topic were summarized under the guidance of suicide research experts; the seed word list is shown in Table 1. If a document vector is far away from the seed list, the comment can be predicted as non-risk. We measure this by the cosine similarity between a document and the seed list.

In Fig. 3, the x-axis shows the cutoff value for cosine similarity below which a comment is predicted to be non-risk, and the y-axis shows the misclassification rate. From 0.6 to 0.65 the misclassification rate does not increase much, but it rises sharply at a cutoff of 0.7. As we use the seed words here to filter out non-risk comments, we chose 0.65 as the cutoff value for the cosine similarity. If a comment's cosine similarity to the seed list is smaller than 0.65, it is classified as non-risk and removed in the first stage; the remaining comments are then studied in the second stage to identify at-risk comments. Using 0.65 as the cutoff, only 0.22% of the comments in the training data were misclassified. Table 2 shows the top 10 non-risk comments with the highest cosine similarity that were filtered out by the seed words.

Fig. 3. Misclassification rate for various cosine similarity cutoff values.

Table 1. Seed words. (The original Chinese/Cantonese seed words do not reproduce here; only the English glosses are listed.) Suicide (Simplified Chinese); Suicide (Traditional Chinese); Will go die (Both Traditional and Simplified Chinese); Go die (Both Traditional and Simplified Chinese); Why I am a human being (Cantonese); Press (Traditional Chinese); Pressure (Simplified Chinese); Suffering (Traditional Chinese); End one's life (Traditional Chinese); End one's life (Simplified Chinese); Jump off (Cantonese); Die (Both Traditional and Simplified Chinese); End (Both Traditional and Simplified Chinese); Vile (Traditional Chinese); Disgust (Traditional Chinese); Going to die (Both Traditional and Simplified Chinese); Want to die (Both Traditional and Simplified Chinese); Negative energy (Traditional Chinese); Cry (Cantonese); Very hard (Both Traditional and Simplified Chinese); Very tired (Cantonese); Cutting wrist (Cantonese); Jump off a building (Traditional Chinese); Jump off a building (Simplified Chinese); Cutting wrist (Mandarin); Cutting hand (Mandarin); Leave this world (Cantonese); Very stressful (Traditional Chinese); Super stressful (Traditional Chinese); Give up (Traditional Chinese); Heartbroken (Traditional Chinese); Jump off (Mandarin); Unhappy (Cantonese); Helpless (Traditional Chinese); Garbage (Both Traditional and Simplified Chinese); No hope (Cantonese); Pain (Both Traditional and Simplified Chinese); Collapse (Traditional Chinese); Don't want to live (Cantonese); End one's own life (Traditional Chinese); Want suicide (Traditional Chinese); End life (Traditional Chinese); Kill oneself (Traditional Chinese); Hopeless (Traditional Chinese); What is the point to live on (Traditional Chinese); Die (Both Traditional and Simplified Chinese); Better to die (Cantonese); Jumped (Cantonese); What is the meaning of life (Traditional Chinese); Kill (Traditional Chinese).

Table 2. Selected comments filtered by seed words (cosine similarity; label 1: at-risk, 0: non-risk; English translation of the comment. The original Cantonese comment texts do not reproduce here.)
0.6499  0  Actually you have a good point
0.6498  0  Don't feel sad
0.6497  0  Come on, try your best
0.6495  0  We English teachers do a lot of homework
0.6495  0  Why so many things are arranged in the same week
0.6494  0  Thought there was something wrong
0.6493  0  But believe we are the best
0.6493  0  Have you thought how many scores you can get
0.6490  0  Come on, I believe you can do it
0.6490  0  My mom forced me to take Belilios (Note: a school in Hong Kong)
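A minimal sketch of this seed-word pre-filter is shown below. It assumes a trained gensim word2vec model and the seed list; the function names are illustrative, not the authors' code.

```python
# Sketch: segment a comment with jieba, average its word vectors into a document
# vector, and compare it to the averaged seed vector with cosine similarity.
import jieba
import numpy as np

CUTOFF = 0.65  # cosine-similarity cutoff chosen in the paper

def doc_vector(text, w2v):
    vecs = [w2v.wv[w] for w in jieba.cut(text) if w in w2v.wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(w2v.vector_size)

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def seed_filter(text, w2v, seed_words):
    seed_vec = np.mean([w2v.wv[w] for w in seed_words if w in w2v.wv], axis=0)
    return "candidate" if cosine(doc_vector(text, w2v), seed_vec) >= CUTOFF else "non-risk"
```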
4.3 Loss Function

Two kinds of loss function are used in this paper. The cross entropy loss is used when the model is trained on the balanced dataset; the focal loss [16] is used when the model is trained on the imbalanced dataset.

Cross Entropy Loss. The cross entropy (CE) loss for binary classification is

CE(p, y) = -log(p) if y = 1, and -log(1 - p) otherwise,

where y specifies the class and p, with 0 <= p <= 1, is the model's estimated probability for the class y = 1. Defining

p_t = p if y = 1, and 1 - p otherwise,

the cross entropy can be rewritten as CE(p, y) = CE(p_t) = -log(p_t).

Focal Loss. To address the class imbalance problem, the focal loss [16] reshapes the standard cross entropy loss so that it down-weights the loss assigned to well-classified examples. The focal loss is defined as [16]

FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t),

where alpha_t equals alpha, with 0 <= alpha <= 1, for the positive class and 1 - alpha for the negative class, and gamma >= 0 is a tunable focusing parameter. The weight alpha_t balances the importance of positive/negative examples, while the modulating factor (1 - p_t)^gamma balances easy/hard examples (an example with a large loss is defined as a hard example).

4.4 Evaluation

The aim of this paper is to predict whether a YouTube comment shows suicide risk. The confusion matrix, as shown in Table 3, is commonly used in classification evaluation. Here, we take the at-risk class as the positive class. Our purpose is to find the at-risk users and save as many lives as possible, so the costs of false positive and false negative predictions are not the same: a false negative prediction is a serious matter, as we might miss the chance to save a life. Besides, the non-risk class dominates the data. Given such extremely imbalanced data, the error rate is no longer an appropriate performance measure [32].

Table 3. Confusion matrix
                  Predicted positive     Predicted negative
Positive class    True positive (TP)     False negative (FN)
Negative class    False positive (FP)    True negative (TN)

In this paper, we use the geometric mean of the accuracies (G-mean) [33] as the performance measure:

True Positive Rate (Acc+) = TP / (TP + FN)
True Negative Rate (Acc-) = TN / (TN + FP)
G-mean = sqrt(Acc+ x Acc-)

G-mean is a popular performance evaluation measure for imbalanced training data. The idea is to maximize the accuracy on each of the two classes while keeping these accuracies balanced [32]. For example, a high accuracy on negative examples together with a low accuracy on positive examples will result in a poor G-mean value.
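For concreteness, the two quantities above can be restated in a few lines of NumPy. This is an illustrative re-statement under the paper's definitions (labels written as 0/1), not the training code itself.

```python
# Sketch: binary focal loss and the G-mean metric.
import numpy as np

def focal_loss(p, y, alpha=0.75, gamma=1.0, eps=1e-7):
    """FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), with labels y in {0, 1}."""
    p = np.clip(p, eps, 1 - eps)
    p_t = np.where(y == 1, p, 1 - p)
    alpha_t = np.where(y == 1, alpha, 1 - alpha)
    return float(np.mean(-alpha_t * (1 - p_t) ** gamma * np.log(p_t)))

def g_mean(y_true, y_pred):
    """sqrt(TPR * TNR), treating 1 as the at-risk (positive) class."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return float(np.sqrt(tpr * tnr))
```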
5 Experiment and Results

This paper performed suicide-related comment classification with improvements at both the data and the algorithm level.

5.1 Experimental Setting

Experimental Data. The data crawled from YouTube consist of 5,051 comments (251 at-risk and 4,800 non-risk), which were split into two datasets, with 80% for training and 20% for testing. To tackle the imbalance problem, we designed our models in two ways. One possibility is to apply under-sampling to randomly select a balanced training dataset consisting of 201 at-risk comments and 201 non-risk comments; this balanced dataset is then used to train classifiers with the cross-entropy loss. Alternatively, we can use the raw imbalanced training dataset (201 at-risk comments and 3,840 non-risk comments) to train classifiers with the focal loss.

Parameter Setting. This paper used the scikit-learn [34] library in Python to train the SVM, AdaBoost, and Random Forest models; the gensim [35] tool in Python to train the word2vec model; and the Keras [36] framework in Python to train the LSTM model. The model parameters are shown in Table 4.

Table 4. Model parameters
SVM (RBF): kernel = 'rbf', C = 1.5, gamma = 0.05
AdaBoost: base_estimator = decision tree, n_estimators = 50, learning_rate = 1, algorithm = 'SAMME.R'
Random forest: max_depth = 5, n_estimators = 10, max_features = 1
Word2vec: size = 100, min_count = 5, sg = 1
LSTM: vocab_dim = 100 (output dimension of the embedding layer), batch_size = 32 (number of samples per gradient update), n_epoch = 4 (number of epochs to train the model)
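The settings in Table 4 translate roughly into the calls below; argument names have shifted across library versions, as noted in the comments, and the corpus variables are placeholders.

```python
# Sketch: instantiating the classifiers with the parameters listed in Table 4.
from sklearn.svm import SVC
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from gensim.models import Word2Vec

svm = SVC(kernel="rbf", C=1.5, gamma=0.05)
# base_estimator is named `estimator` in scikit-learn >= 1.2.
ada = AdaBoostClassifier(base_estimator=DecisionTreeClassifier(),
                         n_estimators=50, learning_rate=1.0, algorithm="SAMME.R")
rf = RandomForestClassifier(max_depth=5, n_estimators=10, max_features=1)

# word2vec trained on the forum corpus; `size` became `vector_size` in gensim 4.
# w2v = Word2Vec(sentences, vector_size=100, min_count=5, sg=1)
```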
5.2 Experimental Results

Recall that comments containing only stop words, punctuation, or emoji are all non-risk; such comments in the training data are classified as non-risk directly. Topic seed-word classification can also be applied in advance to first remove non-risk comments and balance the dataset. The various classifiers mentioned in Sect. 4 were then trained on the remaining data. Finally, these methods were applied to the testing data; the testing results are shown in Table 5. Note that, using the under-sampling method, Set A, consisting of 402 balanced comments, was generated and used to train the classification models.

Table 5. Testing results of classification based on improvement at the data level (testing data: 50 at-risk comments and 960 non-risk comments)
Feature extraction   Classifier                  G-mean (%)
Set A, CE loss       SVM - no seed filter        78.3
                     SVM - seed filter           78.4
                     AdaBoost - no seed filter   79.2
                     AdaBoost - seed filter      78.6
                     RF - no seed filter         74.3
                     RF - seed filter            69.7
                     LSTM - no seed filter       84.3
                     LSTM - seed filter          82.3

It can be seen from Table 5 that the deep learning algorithm LSTM performed better than the traditional machine learning algorithms (SVM, AdaBoost, and RF). The LSTM classifier without filtering by the seed words performed the best, with 84.3% g-mean. The seed-word filter did not have a significant impact on classification, even though it works well for balancing the training and testing comments; this is because, with the under-sampling method, a balanced dataset was already used to train the model, so the seed-word filter is not necessary. Given that the LSTM model performs well, this paper solves the imbalance problem at the algorithm level based on the LSTM model. The raw imbalanced training dataset (Set B), without under-sampling, was used to train the model, and the focal loss was introduced into the LSTM model (setting alpha = 0.75, gamma = 1). Because an imbalanced dataset was used to train the model, we cannot simply use a 0.5 cutoff to predict a comment's risk level; based on the training dataset, we choose the threshold that achieves the highest g-mean as the model's prediction cutoff. As shown in Table 6, with the topic seed-word filter, the LSTM model with focal loss achieved 84.5% g-mean, which is slightly higher than the g-mean achieved by the LSTM with cross-entropy loss on the balanced dataset (84.3%). Using the LSTM model with focal loss, the top 5 comments with the highest predicted probability of risk are shown in Table 7.

Table 6. Testing results of classification based on improvement at the algorithm level (testing data: 50 at-risk comments and 960 non-risk comments)
Feature extraction   Classifier              G-mean (%)   Cutoff
Set B, FL loss       LSTM - no seed filter   81.8         0.20
                     LSTM - seed filter      84.5         0.25

Table 7. Comments with the highest predicted probability. (The comment texts, in Cantonese, do not reproduce here.)

6 Conclusion

This paper compared the performance of different classification algorithms based on word vector features. Because YouTube comments form a sequential list, the LSTM, which can learn sequential information, performs better than the other machine learning algorithms. Based on the topic seed-word classification model and the improvement of the loss function, it achieves the best testing performance (84.5% g-mean). The focal loss was also effective in handling the imbalanced text classification problem. In addition, when combined with the under-sampling method to classify comments, the LSTM also performed better than the other machine learning algorithms, reaching 84.3% g-mean. The study has pushed forward natural language processing for Cantonese, a complicated dialect that mixes Traditional Chinese, borrowed characters representing spoken terms, and English. It also demonstrates the potential of using machine learning methods to detect suicide risk in real social media settings.
As suicide prevention is a battle against the clock, every minute saved in detecting suicide risk and alerting intervention can be crucial. However, it is challenging to employ staff to monitor and review online content 24/7. Based on the computerized algorithm, suicide professionals can scale up the real-time monitoring of online content to detect potentially at-risk posts, based on which more timely interventions can be implemented. Acknowledgements. The study was supported by Hong Kong General Research Fund (Ref No.: 17628916). References 1. Centre for Suicide Research and Prevention, The University of Hong Kong. https://csrp.hku. hk/statistics/. Accessed 30 Mar 2018 2. World Health Organization Webpage. http://www.who.int/mental_health/suicide-prevention/ world_report_2014/en/. Accessed 30 Mar 2018 3. Cheng, Q., Chen, F., Lee, E.S.T., Yip, P.S.F.: The role of media in preventing student suicides: a Hong Kong experience. J. Affect. Disord. 227, 643–648 (2018) 4. Cheng, Q., Kwok, C.L., Zhu, T., Guan, L., Yip, P.S.F.: Suicide communication on social media and its psychological mechanisms: an examination of Chinese microblog users. Int. J. Environ. Res. Public Health 12(9), 11506–11527 (2015) 5. Chan, M., et al.: Engagement of vulnerable youths using internet platforms. PLoS ONE 12 (12), e0189023 (2017) 6. Pestian, J.P., Matykiewicz, P., Grupp-Phelan, J.: Using natural language processing to classify suicide notes. In: Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing. Association for Computational Linguistics (2008) 7. Huang, Y.-P., Goh, T., Liew, C.L.: Hunting suicide notes in web 2.0-preliminary ?ndings. In: Ninth IEEE International Symposium on Multimedia Workshops, ISMW 2007. IEEE (2007) 8. Moreno, M.A., et al.: Feeling bad on Facebook: depression disclosures by college students on a social networking site. Depress. Anxiety 28(6), 447–455 (2011) 9. O’Dea, B., Wan, S., Batterham, P.J., Calear, A.L., Paris, C., Christensen, H.: Detecting suicidality on Twitter. Internet Interv. 2(2), 183–188 (2015) 10. Vioulès, M.J., Moulahi, B., Azé, J., Bringay, S.: Detection of suicide-related posts in Twitter data streams. IBM J. Res. Dev. 62(1), 7:1–7:12 (2018) 11. Cheng, Q., Li, T.M.H., Kwok, C.L., Zhu, T., Yip, P.S.F.: Assessing suicide risk and emotional distress in Chinese social media: a text mining and machine learning study. J. Med. Internet Res. 19(7), e243 (2017) 12. Kotsiantis, S.B.: Supervised machine learning: a review of classi?cation techniques. Emerg. Artif. Intell. Appl. Comput. Eng. 160, 3–24 (2007) 13. Estabrooks, A., Jo, T., Japkowicz, N.: A multiple resampling method for learning from imbalanced data sets. Comput. Intell. 20(1), 18–36 (2004) 14. Zhang, Z., Ye, Q., Li, Y.: Sentiment classi?cation of Internet restaurant reviews written in Cantonese. Expert Syst. Appl. 38(6), 7674–7682 (2011) 15. Zhang, Z., Ye, Q., Li, Y., Law, R.: Sentiment classi?cation of online Cantonese reviews by supervised machine learning approaches. Int. J. Web Eng. Technol. 5(4), 382–397 (2009) 16. Lin, T.-Y., et al.: Focal loss for dense object detection. arXiv preprint arXiv:1708.02002 (2017) Detecting Comments Showing Risk for Suicide in YouTube 399 17. Cheung, K.-H., Bauer, R.S.: The representation of Cantonese with Chinese characters. University of California, Project on Linguistic Analysis (2002) 18. LIHKG Webpage. https://lihkg.com/category/30. Accessed 30 Mar 2018 19. Sogou Webpage. https://pinyin.sogou.com/dict/search/search_list/%D4%C1%D3%EF/ normal. 
Accessed 30 Mar 2018 20. Python Webpage. https://pypi.python.org/pypi/googletrans. Accessed 30 Mar 2018 21. Python Webpage. https://pypi.python.org/pypi/OpenCC. Accessed 30 Mar 2018 22. GitHub Webpage. https://github.com/BYVoid/OpenCC. Accessed 30 Mar 2018 23. Zou, F., Wang, F.L., Deng, X., Han, S., Wang, L.S.: Automatic construction of Chinese stop word list. In: Proceedings of the 5th WSEAS International Conference on Applied Computer Science (2006) 24. GitHub Webpage. https://github.com/fxsjy/jieba. Accessed 30 Mar 2018 25. Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Adv. Neural Inf. Process. Syst. (2013) 26. Joachims, T.: Text categorization with support vector machines: learning with many relevant features. In: European Conference on Machine Learning (1998) 27. Breiman, L.: Bagging predictors. Mach. Learn. 24(2), 123–140 (1996) 28. Liaw, A., Wiener, M.: Classi?cation and regression by randomForest. R. News 2(3), 18–22 (2002) 29. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Computat. 9(8), 1735– 1780 (1997) 30. Zhang, X., Zhao, J., LeCun, Y.: Character-level convolutional networks for text classi?cation. Adv. Neural Inf. Process. Syst. (2015) 31. Kim, S.-M., Hovy, E.: Determining the sentiment of opinions. In: Proceedings of the 20th International Conference on Computational Linguistics. Association for Computational Linguistics (2004) 32. Liu, X.-Y., Wu, J., Zhou, Z.-H.: Exploratory undersampling for class-imbalance learning. IEEE Trans. Syst. Man Cybern. Part B (Cybern.) 39(2), 539–550 (2009) 33. Kubat, M., Matwin, S.: Addressing the curse of imbalanced training sets: one-sided selection. ICML, Vol. 97 (1997) 34. Scikit-learn Webpage. http://scikit-learn.org/stable/. Accessed 30 Mar 2018 35. Gensim Webpage. https://radimrehurek.com/gensim/models/word2vec.html. Accessed 30 Mar 2018 36. Keras Webpage. https://keras.io/models/sequential/. Accessed 30 Mar 2018 400 J. Gao et al. Twitter Analytics for Disaster Relevance and Disaster Phase Discovery Abeer Abdel Khaleq(&) and Ilkyeun Ra University of Colorado, Denver, CO 80204, USA {abeer.abdelkhaleq,ilkyeun.ra}@ucdenver.edu Abstract. Natural disasters happen at any time and at any place. Social media can provide an important mean for both people affected and emergency per-sonnel in sharing and receiving relevant information as the disaster unfolds across the different phases of the disaster. Focusing on the phases of pre-paredness, response and recovery, certain information needs to be retrieved due to the critical mission of emergency personnel. Such information can be directed depending on the disaster phase towards warning citizens, saving lives, or reducing the disaster impact. In this paper, we present an analytical study on Twitter data for three recent major hurricane disasters covering the three main disaster phases of preparedness, response and recovery. Our goal is to identify relevant tweets that will carry important information for disaster phase discov-ery. To achieve our goal, we propose a cloud-based system framework focused on three main components of disaster relevance classi?cation, disaster phase classi?cation and knowledge extraction. The framework is general enough for the three main disaster phases and speci?c to a hurricane disaster. 
Our results show that relevant tweets from different disaster data sets spanning different disaster phases can be classi?ed for relevancy with an accuracy around 0.86, and for disaster phase with an accuracy of 0.85, where key information for disaster management personnel can be extracted. Keywords: Twitter analytics .e Twitter data mining Social media classi?cation .e Disaster relevance classi?cation Disaster phase classi?cation .e Cloud-based analytics .e Disaster management 1 Introduction Natural disasters are large scale in impact and many of them span multiple disaster phases. Some disasters need more focus on preparedness, some on response and some on recovery. It is necessary to direct each agency to its mission during a disaster based on the disaster phase. For example, warning systems and evacuation plans need to be in place during preparedness, medical personnel need to act during response, and relief agencies will provide shelters during recovery. Twitter provides a rich platform for key information during a disaster. Analyzing and extracting informational tweets from Twitter during disasters is one of the text mining researches in recent years [1]. However, Twitter data is highly unstructured and has a lot of noise and irrelevant © Springer Nature Switzerland AG 2019 K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 401–417, 2019. https://doi.org/10.1007/978-3-030-02686-8_31 messages where identifying relevant tweets is a challenge [2]. There is a need to ?lter out those relevant tweets during the disaster phases and uncover insightful information. During a disaster, we may have massive number of disaster related tweets coming from many different sources carrying important disaster information. Our idea is to build a general system framework that can process the large number of the disaster related tweets and ?lter out the relevant ones that may carry important information and can be used for managing the disaster. From the collected disaster relevant Twitter data, the disaster phase, the disaster location and other key information will be extracted. Our system will be hosted in the cloud for storage and analytics processing capabilities and protecting from the potential loss of resources during a disaster. To accomplish our goal, we conducted an analytical study on Twitter data from three recent hurricanes disasters including hurricane Matthew from 2016, Harvey and Irma from 2017 across the three disaster phases of preparedness, response and recovery to have a well diverse and general data set. We chose hurricanes as a disaster type for our analytical study because they can be predicted, have a sustainable impact for response and recovery. Hurricanes are natural disasters that affect the US and other countries every year. They result in a great loss of civilians and cause a lot of damage and devastation that goes far beyond expectations. Many lives can be saved, and many resources can be sustained with minimal damage if the proper information can be delivered to the right personnel at the right time during the right phase of the disaster. This makes them applicable to our study where the three disaster phases of pre-paredness, response and recovery can be further identi?ed to provide the needed resources. Since each disaster phase has its own requirements and valuable information, it is important to distinguish between these phases and extract the right information for each phase. 
The contributions of our paper are as follows: (1) Provide a general cloud-based framework for Twitter data analytics in hurricane disaster management. (2) Identify relevant tweets during a disaster from different hurricane disaster data sets. (3) Classify the disaster phase of preparedness, response and recovery from relevant tweets. (4) Extract key knowledge from relevant tweets text such as location, key phrases and key terms that can be used by disaster emergency personnel. Our study is not geared toward creating new classi?cation algorithms. Rather it is limited to the use of existing classi?cation algorithms and methodologies to uncover the disaster relevance, the disaster phase and disaster key knowledge from the massive Twitter data that comes during a disaster. In this study we present our work on static hurricane Twitter data to build the classi?cation models, in future work we will implement the system on streaming real-time disaster data. The paper is organized as follows. Section 2 describes related work in Twitter disaster relevance, Sect. 3 describes the proposed Twitter analytics system framework along with the hurricane data sets used for the experiments, Sect. 4 presents the disaster relevance classi?cation experiment on the tweets, Sect. 5 presents the disaster phase discovery experiment on both labeled and unlabeled tweets, Sect. 6 describes the 402 A. A. Khaleq and I. Ra knowledge extraction experiment for the disaster location and other key information from relevant tweets, and ?nally Sect. 7 presents a conclusion and future work directions. 2 Related Work It has been widely acknowledged that Humanitarian Aid and Disaster Relief (HADR) responders can gain valuable insights and situational awareness by monitoring social media-based feeds, from which tactical, actionable data can be extracted from the text [3]. Ashktorab et al. [4], for example, introduced Tweedr, a Twitter-mining tool that extracts actionable information for disaster relief workers during natural disasters. The Tweedr pipeline consists of three main parts: classi?cation, clustering, and extraction. Imran et al. [5] developed an arti?cial intelligence system for disaster response that classi?es real-time Twitter data into relevant disaster categories based on keywords hashtags. Imran et al. [6] performed disaster-relevant information extraction on Twitter data for both hurricane Sandy in 2012 and Joplin tornado in 2011. In their work they proposed a two-step method for disaster-related information extraction which are classi?cation of relevance and information extraction from tweets using off-the-shelf free software. In the same context, Stowe et al. [2] performed Twitter data classi?cation for relevance before, during and after the hurricane Sandy 2012 disaster. Their method was based on binary classi?cation for both relevance and ?ne-grained categories such as action, preparation, movement, etc. They concluded that tweets can be classi?ed accurately combining a variety of linguistic and contextual features which can sub-stantially improve classi?er performance. Those research areas address tweets classi?cation and ?ne-grain category classi?- cation during a disaster without identifying the disaster phase. Wang et al. [7] pointed out that most studies with exceptions of Haworth et al. [8] and Yan et al. [9] have focused on disaster response instead of other phases because of lack of data through those phases. 
This data sparsity problem in phases like, mitigation, preparedness and recovery may cause unreliable analytical results. They emphasized that future work is needed to overcome this limitation and effort needs to be directed toward gaining more useful information for all phases of disaster management through mining social media data. To the best of our knowledge, there is no work on establishing a general classi?- cation framework of Twitter data to classify the three main disaster phases of pre-paredness, response and recovery. Most of the research work is more focused on response and on the subcategories of ?ne-grained classi?cation. There is also a lack for a general hurricane disaster classi?cation framework, thus our work will focus on the characteristics of a disaster from the three shared phases of preparedness, response and recovery speci?c to a hurricane natural disaster. Our work is different on the following aspects: 1. We propose a general hurricane disaster classi?cation framework based on three natural hurricane disaster datasets with accuracy as a measurement for classi?cation. Twitter Analytics for Disaster Relevance and Disaster Phase 403 2. We will identify relevant tweets based on textual context by manually examining and labeling the tweets and not using hashtags and keywords for a more general and accurate classi?cation. 3. We will uncover the disaster phase of preparedness, response and recovery through classi?cation of relevant tweets with accuracy as a measurement for classi?- cation. We believe these three disaster phases can be founded easily in tweets related to natural disasters like hurricanes. 3 System Framework and Data Set Our proposed system framework will have a Twitter analytics component for disaster relevancy and phase discovery specially tuned for hurricanes as part of a complete cloud-based platform for disaster management and response. This can serve as a foundation for a micro-service architecture where new components can be added, or existing ones can be updated for a new disaster phase or new requirements. As the focus of our study is on the Twitter analytics component, we plan on pursuing implementing the cloud-based framework in our future work. Fig. 1. System framework for Twitter analytics. 404 A. A. Khaleq and I. Ra Figure 1 provides the general system framework along with the Twitter analytics system workflow. Our focus in this study is on tweets texts for location and key knowledge extraction. The date and time of a disaster can be extracted from the created_at1 ?eld of the tweets and will be part of the complete framework imple-mentation of consecutive studies. Our work is focused on static Twitter data that was collected from recent hurricane disasters including hurricane Matthew, Harvey and Irma. All three disasters had sig-ni?cant impact on US and other areas with casualties and damage. As we are aiming on having a general classi?cation framework for a hurricane disaster, we sampled the data from three hurricanes to have a more general data set. We also made sure to diversify the data by covering the disaster phases of preparedness, response and recovery from each hurricane disaster. We identi?ed the disaster phase based on the disaster evolving date and time and the available hurricane information. 
We applied a variable number of geo-tagged and non-geo-tagged queries, as our focus is on identifying relevance over a general data set using the different disaster sets and different queries, without biasing the classifier toward certain tweets. We used Gnip (http://support.gnip.com/) for the historic Matthew data set and the Twitter streaming API for Harvey and Irma as they were unfolding. Table 1 provides a more detailed look at the data sets collected from the three hurricanes, listing the query used and the corresponding disaster phase.

Table 1. Collected data sets for the three hurricanes
Hurricane | Date | Query | Disaster phase | Number of tweets collected
Matthew | 10/7/2016 | track=("Hurricane Matthew") (flood OR wind OR storm OR heavy OR rain), no retweets, lang='en' | Preparedness | 27,000 over the three days
Matthew | 10/8/2016 | track=("Hurricane Matthew") (flood OR wind OR storm OR heavy OR rain), no retweets, lang='en' | Response | (included above)
Matthew | 10/9/2016 | track=("Hurricane Matthew") (flood OR wind OR storm OR heavy OR rain), no retweets, lang='en' | Recovery | (included above)
Harvey | 8/25/2017 | Bounding box including Corpus Christi, San Antonio and west of Houston, lang='en', track='Hurricane Harvey', no retweets | Preparedness | 7,728
Harvey | 8/28/2017 | Bounding box around the Houston area, lang='en', track='Hurricane, Harvey, flood, help, rescue, rain', no retweets | Response | 121,658
Harvey | 8/30/2017 | lang='en', track=Houston, no location, no retweets | Recovery | 61,940
Irma | 9/5/2017 | track='Hurricane Irma', lang='en', no retweets | Preparedness | 34,445
Irma | 9/10/2017 | Bounding box around Florida, track='irma', lang='en', no retweets | Response | 1,128
Irma | 9/11/2017 | track='Hurricane Irma', lang='en', no retweets | Recovery | 9,099

4 Disaster Relevance Classification

4.1 Disaster Relevance Annotation

Our goal is to classify a general tweet during a hurricane disaster for relevance. We manually examined the data for the quality of tweet texts and manually labeled a sample of each disaster set over every phase for relevance. We examined the relevance of each tweet text to the disaster phases. If the tweet text contains any crucial information related to disaster phases, such as "need", "water", "evacuate", "rescue", we label it as relevant. If the tweet text does not carry any crucial information, we label it as non-relevant. For example, a relevant message is "Storm getting stronger: 2 million urged to leave", where it has information about evacuation that is important for preparedness. However, a message like "We pray for those in the path of Hurricane Matthew. If you are in an area that may be affected by the disaster phases and…" is labeled as non-relevant. It is important to point out that during this initial step we are classifying for relevance only and not for the disaster phase. As we cannot manually label the huge number of tweets across the three disaster sets, we randomly sampled a smaller data set from each. Table 2 shows the sampled data sets across the three disaster phases of the three hurricanes. Our initial plan was to sample the same number of tweets from each data set over each phase, but some data sets contain a lot of noise and repeated tweets, which explains the lower number of tweets for some sets.
However, we feel that we have captured the three disaster stages of a hurricane disaster with this sample data set, as this is our focus.

4.2 Relevance Classification Model

We have utilized Microsoft Azure Machine Learning Studio (https://azure.microsoft.com/en-us/services/machine-learning-studio/) to conduct our experiment, as we plan on having a cloud-based framework and because Azure Machine Learning Studio offers a large number of classification and text analytics models that can be easily tuned for performance. We combined the three data sets into one.

Table 2. Sampled data set for disaster relevance classification
Disaster phase | Hurricane Matthew | Hurricane Harvey | Hurricane Irma | Total
Preparedness | 200 relevant / 200 non-relevant | 200 relevant / 157 non-relevant | 200 relevant / 106 non-relevant | 600 relevant / 463 non-relevant
Response | 188 relevant / 109 non-relevant | 130 relevant / 50 non-relevant | 191 relevant / 105 non-relevant | 509 relevant / 264 non-relevant
Recovery | 171 relevant / 74 non-relevant | 31 relevant / 16 non-relevant | 126 relevant / 110 non-relevant | 328 relevant / 264 non-relevant
Total | 559 relevant / 383 non-relevant | 361 relevant / 178 non-relevant | 517 relevant / 321 non-relevant | 1437 relevant / 927 non-relevant

We cleaned and removed missing data based on the text and other important fields, which resulted in 2311 tweets: 1434 relevant and 877 non-relevant. We preprocessed the data by removing special characters, URLs and user mentions (for privacy). We kept numbers, as they are important for the hurricane category, number of casualties, addresses, etc. We tokenized, stemmed and removed stop words.

4.3 Binary Classification Algorithm

The work of Stowe et al. [2] showed that logistic regression with uni-gram features and cross-validation achieved the best accuracy on binary classification for tweet relevance. Habdank et al. [10] pointed out that uni-grams achieve better accuracy than bi-grams in tweet text classification for relevance, as shown in other researchers' experiments. We have also experimented with binary classification algorithms on Twitter data in previous work, including logistic regression, support vector machines, Naïve Bayes and the Stanford classifier, and found that logistic regression with uni-gram features gave us the best accuracy. We applied the TF-IDF (term frequency-inverse document frequency) weighting function to uni-gram counts, which gives higher weight to words that appear frequently in a single record but are rare across the entire dataset. We used filter-based feature selection to reduce the dimensionality and chose 1000 features, with chi-squared as the score function to calculate the correlation between the label column and the text vector. We split the data into 70% training and 30% testing. For parameter tuning, we split the testing data 50% for parameter tuning and 50% for scoring. We also used 10-fold cross-validation to alternate between training and testing data and to assess both the variability of the dataset and the reliability of the training model.

4.4 Evaluation Measurement

Accurately classifying relevant tweets during an emergency is critical, which makes measuring classifier performance an important step. A tweet can be a matter of saving or losing a life if it is not correctly classified as relevant. Habdank et al. [10] explained how accuracy and recall are very important evaluation measures. The higher the recall value, the fewer relevant tweets have been falsely marked as negative.
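To make the setup of Sects. 4.3 and 4.4 concrete, a minimal sketch is shown below; scikit-learn is used here as a stand-in for the Azure ML Studio modules, the Azure-specific parameter-tuning split is omitted, and the data file and column names are illustrative assumptions rather than the study's actual artifacts.

```python
# Minimal sketch of the relevance pipeline (uni-gram TF-IDF, chi-squared selection of
# 1000 features, two-class logistic regression, 10-fold cross-validation).
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate

# Hypothetical file: columns 'text' (preprocessed tweet) and 'relevant' (0/1 label).
tweets = pd.read_csv("labeled_relevance_tweets.csv")

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 1), stop_words="english")),  # uni-gram TF-IDF
    ("select", SelectKBest(chi2, k=1000)),                                 # filter-based feature selection
    ("clf", LogisticRegression(max_iter=1000)),                            # two-class logistic regression
])

# 10-fold cross-validation, reporting the measures discussed in Sect. 4.4
scores = cross_validate(pipeline, tweets["text"], tweets["relevant"], cv=10,
                        scoring=["accuracy", "precision", "recall", "f1"])
for metric in ("accuracy", "precision", "recall", "f1"):
    print(metric, round(scores["test_" + metric].mean(), 3))
```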
Precision and F1 score are also significant measures: precision reflects the rate of false positives, and the F1 score is the harmonic mean of precision and recall. In our experiment we focused on accuracy as the main evaluation metric, in addition to recall, precision and F1 score. Table 3 shows the logistic regression results across different feature techniques. The best accuracy we obtained was around 0.86, using 10-fold cross-validation and uni-grams with TF-IDF, which is slightly better than the 0.856 achieved by Stowe et al. [2]. This shows that tweets from multiple data sets over different disaster phases for a certain disaster type can be classified for relevance with an accuracy similar to, and slightly better than, that achieved on a single data set, which helps in building a classifier that is more general for a certain disaster type such as hurricanes.

5 Disaster Phase Discovery

Once the tweets are classified for relevance, we need to identify the disaster phase from the relevant tweets. We focus our work on the three main disaster phases of preparedness, response and recovery, as these are the three main phases that most natural disasters, especially hurricanes, go through. We have experimented with LDA (Latent Dirichlet Allocation) for topic discovery on unlabeled data and with multi-class classification on labeled data. The following sections describe our findings.

5.1 LDA for Disaster Phase Discovery on Unlabeled Data

LDA uses a generative approach on unlabeled data. The algorithm generates a probabilistic model that is used to identify groups of topics, which can then be used to classify either existing training cases or new cases. It uses the distribution of words to mathematically model topics [11]. The topic model gives us two major pieces of information for any collection of documents: (1) the number of topics contained within a corpus; and (2) for each document in the corpus, what proportion of each topic is contained within that document [12]. It is important to note that during a disaster, tweets will usually be coming from one phase at a given time, with some overlap. Based on this, we are not using LDA to uncover the disaster phase as the disaster unfolds in real time; rather, we identify the disaster phase from static data to support disaster phase discovery. Based on similar terms among the disaster phases across the three different disaster sets, we can potentially label the data. In LDA, every topic is a collection of words. Each topic contains all the words in the corpus with a probability of the word belonging to that topic. LDA finds the most probable words for a topic; associating each topic with a theme is left to the user. The LDA approach requires careful validation of the topical clusters.

Table 3. Results of binary classification for disaster relevance
Binary classification | Average accuracy | Precision | Recall | F1 score
Two-class logistic regression, uni-gram with TF-IDF, cross-validation | 0.858 | 0.868 | 0.90 | 0.886
Two-class logistic regression, uni-gram, feature selection, parameter tuning, cross-validation | 0.852 | 0.857 | 0.91 | 0.884
Two-class logistic regression, uni-gram with TF-IDF | 0.841 | 0.852 | 0.90 | 0.876
Two-class logistic regression, uni-gram with feature selection and parameter tuning | 0.835 | 0.85 | 0.893 | 0.871

We applied LDA in Azure Machine Learning Studio on the relevant tweets.
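A minimal sketch of this step is shown below, assuming scikit-learn's LatentDirichletAllocation in place of the Azure LDA module and a hypothetical list of preprocessed relevant tweet texts; the choice of the number of topics is discussed next.

```python
# Minimal LDA sketch (scikit-learn stands in for the Azure module; `relevant_texts`
# is a placeholder list of preprocessed relevant tweet texts).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

relevant_texts = [
    "drone footage naples florida shows complete devastation hurricane irma",
    "nc dps state ready hurricane irmas effects reach north carolina",
]  # placeholder data

# Uni-gram counts, as in the experiments
vectorizer = CountVectorizer(ngram_range=(1, 1), stop_words="english")
counts = vectorizer.fit_transform(relevant_texts)

# One topic per disaster phase (preparedness, response, recovery)
lda = LatentDirichletAllocation(n_components=3, random_state=0)
doc_topics = lda.fit_transform(counts)   # per-tweet topic proportions (cf. Table 4)

# Most probable words per topic; associating a topic with a phase is left to the user.
terms = vectorizer.get_feature_names_out()   # requires a recent scikit-learn release
for k, weights in enumerate(lda.components_):
    top = [terms[i] for i in weights.argsort()[::-1][:10]]
    print("topic", k + 1, ":", ", ".join(top))
```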
In LDA, an important parameter that needs to be identified is the number of topics. We experimented with different numbers of topics and different data sets to find the best topic discovery for the disaster phases. When we applied LDA to one data set, such as hurricane Irma, which covers the three disaster phases, with the number of topics set to 3 and uni-grams, we got good separation by disaster phase. Table 4 shows a sample of the results, where we can identify topic 1 with assessment and recovery, topic 2 with response, and topic 3 with preparedness and updates. However, when we applied LDA to the general data set for the three hurricanes we got mixed results, and as we increased the number of topics we could see subcategories of the disaster emerge more clearly, such as warning, update, and death. We are convinced that LDA can be a good choice for identifying the disaster phase on a single data set, but it does not perform well on a more diverse data set.

5.2 Multi-class Classification for Disaster Phase Discovery on Labeled Data

As LDA did not accurately identify the three disaster phases, we applied multi-class classification to the relevant tweets to classify them by disaster phase. We combined data sets from the three different disasters covering the three disaster phases of preparedness, response and recovery to obtain a well-balanced data set. Only relevant tweets were taken, with phase label 1 for preparedness, 2 for response, and 3 for recovery. The data was labeled manually based on the disaster phase. We acquired a balanced data set with a total of 981 relevant tweets, consisting of 327 tweets from each disaster phase across the three different disasters. The data was preprocessed in the same way as for binary classification and split into 70% training and 30% testing. We performed the experiment in Azure Machine Learning Studio. We identified several multi-class classification algorithms to evaluate for accuracy, based on recommendations from the work of Huang et al. [13] and the Azure machine learning documentation [14]. The classifiers were chosen based on their known high accuracy for multi-class text classification. Table 5 provides the results of the multi-classifiers on the data set.

Table 4. Sample topics identified by LDA on the hurricane Irma data set
Tweet text | Topic 1 | Topic 2 | Topic 3
drone footage naples florida shows complete devastation hurricane irma | 0.997509 | 0.001245 | 0.001245
hurricane irma 10 dead cuba record flooding hits northern florida latest news | 0.000831 | 0.998337 | 0.000831
nc dps state ready hurricane irmas effects reach north carolina | 0.000997 | 0.000997 | 0.998006

We can see that both neural networks with uni-gram feature hashing and parameter tuning and two-class logistic regression with a one-vs-all multi-classifier gave an average accuracy of 85% and an average recall of 78%. Comparing our results to previous work on multi-class text classification: Stowe et al. [2] performed binary classification on the fine-grained subcategories of disaster tweets, with a best feature precision of around 0.71 and recall around 0.80; Huang et al. [13] applied logistic regression binary classification to the fine-grained subcategories of the disaster and obtained an overall precision of 0.647 and recall of 0.711. Our results show that we can achieve an average accuracy of 0.85 on the more general task of disaster phase discovery rather than fine-grained subcategories. This shows that relevant tweets can be classified for disaster phase discovery with good accuracy.
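A minimal sketch of this multi-class setup is shown below, assuming scikit-learn's HashingVectorizer and a one-vs-all logistic regression in place of the Azure modules (parameter sweeping is omitted); the data file and column names are hypothetical.

```python
# Minimal multi-class sketch of the phase classifier (Sect. 5.2).
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Hypothetical file: columns 'text' and 'phase' in {1: preparedness, 2: response, 3: recovery}
data = pd.read_csv("relevant_tweets_phases.csv")

# Uni-gram feature hashing followed by two-class logistic regression in a one-vs-all scheme
model = Pipeline([
    ("hash", HashingVectorizer(ngram_range=(1, 1), alternate_sign=False)),
    ("ovr", OneVsRestClassifier(LogisticRegression(max_iter=1000))),
])

X_train, X_test, y_train, y_test = train_test_split(
    data["text"], data["phase"], test_size=0.3, stratify=data["phase"], random_state=0)

model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test),
                            target_names=["preparedness", "response", "recovery"]))
```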
6 Knowledge Extraction

6.1 Location

After tweets are classified for relevance and disaster phase, useful information needs to be extracted. One key piece of information is the location of the disaster. Tweets can be geo-tagged by the user to indicate where the tweet is coming from. This information is represented in the coordinates field of the tweet, which is in geoJSON form (longitude first, then latitude). For example: "coordinates": {"coordinates": [-75.14310264, 40.05701649], "type": "Point"}. The problem is that not all tweets are geo-tagged: in our data, for example, in a combined hurricane Matthew and Harvey data set of 1973 tweets, only 1% of the tweets are geo-tagged. Another field in which a user can share a location is the place field, which, when present, indicates that the tweet is associated with, but not necessarily originating from, a place. In the same data set, only 5% of the tweets are associated with a place. Extracting location from text therefore helps in identifying the main areas affected by the disaster [15]. In the following sections we present the extraction of tweet location from the text, coordinates and place fields of a tweet object.

Table 5. Results of multi-class text classification for disaster phase identification
Multi-classifier algorithm | Average accuracy | Overall accuracy | Micro-average precision | Macro-average precision | Micro-average recall | Macro-average recall
Neural network, uni-gram feature hashing, parameter sweeping | 0.85 | 0.775 | 0.775 | 0.777 | 0.775 | 0.775
Two-class logistic regression with one-vs-all multi-classifier, uni-gram feature hashing, parameter sweeping | 0.85 | 0.775 | 0.775 | 0.775 | 0.775 | 0.775
Multi-class decision forest with feature hashing, parameter sweeping | 0.845 | 0.768 | 0.768 | 0.77 | 0.768 | 0.768

6.1.1 Text-Based Location Extraction

Our data set consists of 981 relevant tweets, with 327 tweets from each disaster phase across the three hurricane disasters of Matthew, Harvey and Irma. Given the lack of an originating location in the coordinates field for most tweets, we examined the tweet text to extract the location. We applied the Named Entity Recognition module in Azure Machine Learning Studio [16], which identifies the names of entities in text such as people, companies, locations, etc. Figure 2 presents the locations extracted from the tweet text. We can see that the extracted location names are associated with the hurricanes' actual locations. For example, hurricane Matthew was targeting Florida, Haiti, North Carolina and South Carolina; hurricane Harvey was targeting Houston, Texas; and hurricane Irma was targeting South Carolina, North Carolina and Florida. We can also identify the name of the hurricane, such as Harvey, Matthew or Irma. This gives a holistic view of where the disaster is happening; for a precise location, the geo-tagged coordinates field gives the exact position.

6.1.2 Coordinates and Place Fields Location Extraction

In this section we present a holistic approach to uncover a disaster location from the three tweet fields of text, coordinates and place, and compare the results for consistency. Our data set consists of 121,658 tweets of hurricane Harvey during the response phase. We extracted the latitude and longitude from the coordinates field of the geo-tagged tweets and uploaded them to Google Maps for visual representation using Google Fusion Tables.

Fig. 2. Extracted locations from relevant tweet text for the three disasters Matthew, Harvey and Irma (sampled data set).
Figure 3 shows the geo-tagged tweets on a Google map for the hurricane Harvey dataset during the response phase, and how they mainly originate from Houston, TX, the main affected area during the hurricane. The place field in a tweet object consists of subfields such as country, country_code, name and place_type, all within bounding box coordinates. We extracted those subfields from the tweets for which the user decided to share a place. Again, the place is not necessarily where the tweet originated. Figure 4 shows the city names based on the place name field in the same data set. Around 4000 tweets associate Houston with the tweet place, and about 1500 tweets associate Texas with the tweet place. This indicates that the disaster place is associated with Houston, Texas. To compare these results with the text-extracted location on the same data set, we applied the Named Entity Recognition module in Azure to the tweet text, which resulted in around 6000 mentions of Houston and about 4200 mentions of Texas, as shown in Fig. 5. Comparing the results of the coordinates, place and text field extraction shows that they are consistent, which confirms that the disaster is mainly affecting Houston, TX.

Fig. 3. Coordinates of geo-tagged tweets of hurricane Harvey, response phase data set.

Fig. 4. Extracted location from the tweets' place field for hurricane Harvey, response phase data set.

Fig. 5. Extracted location from tweet text for hurricane Harvey, response phase data set.

We can also see from the results that many of the disaster location names come from the tweet text rather than the place and coordinates fields, confirming that tweet texts carry key information during a disaster phase. Other fields, such as the user profile, could also be used to extract a location; this is not necessarily where the tweet originated, but the correlation can be studied in future work.

6.2 Key Knowledge Extraction

For key knowledge extraction, we experimented with both term frequency and the Key Phrase Extraction module in Azure [17]. Our data set consists of 981 tweets from the three different disaster sets with a balanced distribution among the disaster phases, i.e., roughly 33% of the tweets for each disaster phase of preparedness, response and recovery. For term frequency, we created the matrix of terms using R in Azure for each disaster phase. The preparedness phase resulted in 693 key terms. Table 6 shows the top key terms for each disaster phase. For key phrase extraction, the module is a wrapper around a natural language processing API for key-phrase extraction. Phrases are analyzed as potentially meaningful in the context of the sentence for various reasons, such as whether the phrase captures the topic of the sentence and whether it contains a combination of modifier and noun that indicates sentiment. The output is a dataset containing comma-separated key phrases from the text. Figure 6 gives the output of applying the module to the preprocessed data set for each of the disaster phases. Comparing the two outputs, we can see the similarity among the key terms for each disaster phase. The key phrase module gives us more meaningful, high-frequency phrases for the disaster phase, which will be very helpful for disaster personnel. Term frequency can be used as a complementary module for verification and for building a key term dictionary for the disaster phase.
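A minimal sketch of the term-frequency side of this step is shown below; scikit-learn's CountVectorizer stands in for the R term matrix used in the study, and `tweets_by_phase` is a hypothetical mapping from phase name to tweet texts.

```python
# Minimal per-phase key-term frequency sketch (Sect. 6.2).
from sklearn.feature_extraction.text import CountVectorizer

tweets_by_phase = {  # placeholder data
    "preparedness": ["brace for hurricane matthew evacuation ordered", "..."],
    "response":     ["rescue teams help flood victims in houston", "..."],
    "recovery":     ["storm damage and power outages after irma", "..."],
}

for phase, texts in tweets_by_phase.items():
    vectorizer = CountVectorizer(ngram_range=(1, 1), stop_words="english")
    counts = vectorizer.fit_transform(texts)
    totals = counts.sum(axis=0).A1                      # corpus-wide frequency of each term
    terms = vectorizer.get_feature_names_out()
    ranked = sorted(zip(terms, totals), key=lambda t: -t[1])
    print(phase, "top terms:", [term for term, _ in ranked[:10]])
```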
The hurricane names and locations can be stripped out to generalize the dictionary terms to any disaster.

Table 6. Key terms for each disaster phase based on tweet text term frequency
Disaster phase | Top key words in order
Preparedness | Hurricane, storm, Matthew, Harvey, Irma, Florida, category, Haiti, coast, Texas, disaster, death, wind, strengthen, toll, mph, dead, brace, deadly, surge, barrel, hit, near, Caribbean, news, atlantics, evacuation, head, immense, intensify, prepare, suffer, update, order, safe, threaten, approach, flee, flood, declare, expect
Response | Hurricane, Matthew, storm, Florida, Irma, flood, help, key, coast, surge, landfall, wind, Harvey, Houston, batter, category, power, Jacksonville, rain, hit, people, Carolina, foot, downgrade, feel, victim, weaken, need, help, rescue, kill, shelter, emergency, fear, relief, deadly, death, threaten, damage
Recovery | Hurricane, Matthew, storm, Irma, Carolina, north, state, flood, Florida, death, destruction, major, face, governor, rain, fatality, Houston, toll, damage, power, leave, surge, hit, devastation, cholera, expect, river, destructive, head, outbreak, cause, effect, collapse

Fig. 6. Top key phrases for the preparedness, response and recovery disaster phases, in order from left to right.

7 Conclusion and Future Work

In this paper we proposed a general framework for a cloud-based Twitter analytics platform for disaster relevance identification and disaster phase discovery. We examined three major hurricanes and focused specifically on studying three main disaster phases: disaster preparedness, disaster response, and disaster recovery. Our proposed system consists of three main Twitter analytics components: relevance classification, disaster phase classification and knowledge extraction. Our experiments demonstrate that we can build a general classifier with good accuracy, around 86%, to classify relevant tweets from a hurricane disaster. Disaster phase discovery using multi-class text classification turns out to be a better choice for uncovering the three main disaster phases than LDA, which gives mixed results depending on data set size and diversity. We were able to classify the disaster phases of preparedness, response and recovery using a multi-classifier with an accuracy of around 85%. Relevant tweets for a certain disaster phase carry important information for emergency management personnel. We extracted the disaster location name from the tweet text and from the geo-tagged coordinates and place fields. As the number of geo-tagged tweets is usually very limited, the extracted text-based location becomes helpful in identifying the general location of a disaster. We have also extracted the key phrases and key terms for each disaster phase, which can be used to uncover more fine-grained categories and potentially build a disaster phase key term dictionary.

Our study is limited in scope to the use of existing classification algorithms for Twitter text classification of relevance and disaster phase discovery on static hurricane disaster data. We focused on extracting meaningful disaster knowledge from tweet text. However, there is more disaster information that needs to be extracted, including the disaster time and the disaster scale for assessment and recovery. Novel approaches will be needed to uncover those areas from other tweet fields in addition to the text field.
As we continue working on this framework, we plan to have a general Twitter platform that can be utilized in a cloud-based disaster management application as a service. The platform needs to be general enough to allow for dynamic requirements updates through a micro-service architecture. Identifying relevant tweets in real time is another goal, as we plan on implementing the system for real-time streamed data. We would like to test our work on various disasters from different domains, which will help in discovering similarity among the different disasters and disaster phases via key words or other similarity measures. Through our work, we also see a need for novel labeling mechanisms for Twitter data based on text context. Presenting the extracted information about the disaster in a user-friendly or standard format is another area to work on.

Acknowledgements. Special thanks to Dr. Farnoush Banaei-Kashani, University of Colorado Denver. This work is supported by the Department of Education GAANN Program, Fellowship # P200A150283, focused on Big Data Science and Engineering.

References
1. Win, S.S.M., Aung, T.N.: Target oriented tweets monitoring system during natural disasters. In: 16th IEEE/ACIS International Conference on Computer and Information Science (ICIS), pp. 143–148. IEEE, Wuhan (2017)
2. Stowe, K., Paul, M.J., Palmer, M., Palen, L., Anderson, K.: Identifying and categorizing disaster-related tweets. In: The Fourth International Workshop on Natural Language Processing for Social Media, pp. 1–6. Association for Computational Linguistics, Austin (2016)
3. Vieweg, S.E.: Situational awareness in mass emergency: a behavioral and linguistic analysis of microblogged communications. Doctoral dissertation, University of Colorado at Boulder, Boulder, CO (2012)
4. Ashktorab, Z., Brown, C., Nandi, M., Culotta, A.: Tweedr: mining Twitter to inform disaster response. In: Hiltz, S.R., Pfaff, M.S., Plotnick, L., Shih, P.C. (eds.) 11th International ISCRAM Conference, pp. 354–358. The Pennsylvania State University, Pennsylvania (2014)
5. Imran, M., Castillo, C., Lucas, J., Meier, P., Vieweg, S.: AIDR: artificial intelligence for disaster response. In: 23rd International Conference on World Wide Web, pp. 159–162. ACM, Seoul (2014)
6. Imran, M., Elbassuoni, S., Castillo, C., Diaz, F., Meier, P.: Practical extraction of disaster-relevant information from social media. In: 22nd International Conference on World Wide Web, pp. 1021–1024. ACM, Rio de Janeiro (2013)
7. Wang, Z., Ye, X.: Social media analytics for natural disaster management. Int. J. Geogr. Inf. Sci. 32(1), 49–72 (2018)
8. Haworth, B., Bruce, E., Middleton, P.: Emerging technologies for risk reduction: assessing the potential use of social media and VGI for increasing community engagement. Aust. J. Emerg. Manag. 30(3), 36 (2015)
9. Yan, Y., Eckle, M., Kuo, C.L., Herfort, B., Fan, H., Zipf, A.: Monitoring and assessing post-disaster tourism recovery using geotagged social media data. ISPRS Int. J. Geo-Inf. 6(5), 144 (2017)
10. Habdank, M., Rodehutskors, N., Koch, R.: Relevancy assessment of tweets using supervised learning techniques: mining emergency related tweets for automated relevancy classification. In: 4th International Conference on Information and Communication Technologies for Disaster Management (ICT-DM), pp. 1–8. IEEE, Münster (2017)
11. Latent Dirichlet Allocation. https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/latent-dirichlet-allocation. Accessed 02 Feb 2018
12. Anastasopoulos, L.J., Moldogaziev, T.T., Scott, T.A.: Computational Text Analysis for Public Management Research: An Annotated Application to County Budgets (2017)
13. Huang, Q., Xiao, Y.: Geographic situational awareness: mining tweets for disaster preparedness, emergency response, impact, and recovery. ISPRS Int. J. Geo-Inf. 4(3), 1549–1568 (2015)
14. Machine learning algorithm cheat sheet for Microsoft Azure Machine Learning Studio. https://docs.microsoft.com/en-us/azure/machine-learning/studio/algorithm-cheat-sheet. Accessed 02 Feb 2018
15. Spielhofer, T., Greenlaw, R., Markham, D., Hahne, A.: Data mining Twitter during the UK floods: investigating the potential use of social media in emergency management. In: 3rd International Conference on Information and Communication Technologies for Disaster Management (ICT-DM), pp. 1–6. IEEE, Vienna (2016)
16. Named Entity Recognition. https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/named-entity-recognition. Accessed 02 Feb 2018
17. Extract key phrases from text. https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/extract-key-phrases-from-text. Accessed 02 Feb 2018

Incorporating Code-Switching and Borrowing in Dutch-English Automatic Language Detection on Twitter

Samantha Kent and Daniel Claeser
Fraunhofer Institut FKIE, Fraunhoferstrasse 20, 53343 Wachtberg, Germany
{samantha.kent,daniel.claeser}@fkie.fraunhofer.de

Abstract. This paper presents a classification system to automatically identify the language of individual tokens in Dutch-English bilingual Tweets. A dictionary-based approach is used as the basis of the system, and additional features are introduced to address the challenges associated with identifying closely related languages. Crucially, a separate system aimed specifically at differentiating between code-switching and borrowing is designed and then implemented as a classification step within the language identification (LID) system. The separate classification step is based on a linguistic framework for distinguishing between borrowing and CS. To test the effectiveness of the rules in the LID system, they are used to create feature vectors for training and testing machine learning systems. The discussion centres on a Decision Tree Classifier (DTC) and Support Vector Machines (SVM). The results show that there is only a small difference between the rule-based LID system (micro F1 = .95) and the DTC (micro F1 = .96).

Keywords: Code-switching · Borrowing · Dutch · English · Twitter · Machine learning · Decision trees · SVM

1 Introduction

In the European Union, it is estimated that just over half of all European citizens are able to speak at least one other language in addition to their mother tongue [1]. Online micro-blogging platforms such as Twitter provide the perfect setting for multilingual communication, and Tweets containing Dutch and English, as in (1) below, are not uncommon.

(1) oke give me some reasons waarom jij denkt dat het real is
    'ok give me some reasons why you think it's real'

Currently, multilingual communication poses a challenge for Natural Language Processing (NLP) tasks such as Part-of-Speech tagging, machine translation, and Named Entity Recognition. Improving the ability to process multilingual communication is vital, as it will contribute to further solving these tasks.
Automatic language identification (LID) is the task of determining the language of a document, sentence or word. Language identification at Tweet level reaches accuracy levels of over 95% for many languages. Nevertheless, one of the reasons the language of a Tweet is incorrectly identified, aside from the marked Twitter language, is that it can contain code-switching. Code-switching (CS) is defined as "the alternation of two languages within a single discourse, sentence or constituent" [2]. CS can consist of multi-word utterances or single-word insertions. To determine whether or not a Tweet contains multiple languages, an analysis at token level needs to be conducted.

While there are many different LID methods, arguably one of the simplest approaches is based on a lexical lookup system. In this method, dictionaries, which are lists containing lexical items extracted from a particular language, are used to verify that a word is part of the lexicon of that language. This method was used as a starting point to identify the language of tokens in Spanish-English, German-Turkish, and Dutch-English Tweets [3]. The results suggested that the performance of a dictionary-based LID system is much better for language pairs that are not as closely related as Dutch-English. In the case of Dutch and English, many Dutch words are borrowed from English and have been integrated into the Dutch lexicon. The challenge, therefore, lies in determining whether the English words are in fact borrowed and part of a monolingual Tweet, or whether they are English words (CS) that are included in a multilingual Tweet. Without distinguishing between these two types of words, it is very difficult to accurately identify the language of tokens in sentences that contain both English and Dutch. Thus, in order to address this issue, this study presents a method for distinguishing between borrowed and code-switched English words in order to improve the overall language classification of tokens in Dutch-English Tweets. To do so, the method in this paper combines a LID system based on a dictionary lookup with a synonym detection method that identifies whether the token in question is code-switched or borrowed. Even though "words are seldom exactly synonymous" [4], comparing the use of a token and its possible synonyms provides an indication of how the token is integrated into a language.

2 Code-Switching and Borrowing

To fully understand CS, a distinction between CS and lexical borrowing needs to be made. Lexical borrowing is defined as "the incorporation of lexical items from one language in the lexicon of another language" and is, together with CS, one of the more prominent language contact phenomena [5, p. 189]. CS and borrowing are closely related in the sense that lexical items that were once classified as foreign-word CS may be absorbed into the lexicon of a host language over time [6]. Example (2) below illustrates that it is not always easy to determine whether a word should be identified as a foreign word or not.

(2) ik heb een video klaarliggen… een social test met mn docent, wanneer moet die online?
    'I have a video ready to go… a social test with my teacher, when should it go online?'
At first glance, it would seem as though 'social', 'test' and 'online' are all English words in this sentence. In fact, according to the Woordenlijst Nederlandse Taal (a word list that contains the correct spelling of current Dutch words, maintained by de Taalunie, http://woordenlijst.org/), the only word that is actually an English word is 'social', as the Dutch equivalent of this word is 'sociaal'. The other two words are identical to English but are also part of the Dutch lexicon. They should, therefore, not be identified as code-switching but instead as borrowing.

Numerous attempts have been made to distinguish between borrowing and code-switching. They range from establishing a set of specific criteria with which to identify borrowing and CS, to the assertion that there is no clear-cut distinction between the two. In the first view, one of the main distinguishing features between the two is the number of words: lexical borrowings consist of only one word, whereas CS can consist of multiple words [7]. Having said this, the difficulty in distinguishing between the two does not lie in the difference between single-word lexical borrowings and multi-word alternations, but rather between lexical borrowing and single-word CS inclusions. Table 1 provides a set of criteria to establish whether foreign inclusions can be classified as borrowing or CS [7]. These criteria are used as guidelines to differentiate between the two phenomena.

By delineating these criteria, the impression is given that there are only two possibilities for classifying a single-word inclusion: CS or borrowing. However, it is argued that this strict separation of the two phenomena is not always possible and there are many exceptions that do not fall into either category. Instead of strictly differentiating between the two, CS and borrowing can be viewed as a continuum where the canonical forms of CS and borrowing are placed at either end of the spectrum [8]. This continuum makes it possible to account for tokens that may not be precisely in either stage, but are instead transitioning into becoming fully-fledged loanwords. The definition of borrowing adopted in this paper is that borrowed words are words that stem from a foreign language and have been integrated into the lexicon of a native language. In contrast, words that are classified as code-switching are not integrated. Rather than having to define a frequency at which a token is automatically classified as either CS or borrowing, the approach taken here relies on the difference between the frequency of the token and any possible alternatives in the native language. This ensures that instead of having to assign an arbitrary value, the individual difference between the tokens determines whether a word is CS or borrowing.

3 Related Work

Code-switching in Tweets was the topic of the shared task of the workshops on computational approaches to code-switching at the conference on Empirical Methods in Natural Language Processing (EMNLP) in 2014 and 2016. CS detection methods ranged from deep learning algorithms to traditional machine learning approaches and various dictionary-based approaches [9, 10]. The best result was obtained by [11] for Spanish-English, with an F1 score of 91.3%.
The performance of the submissions for the Arabic language pair ranges from an F1 of 66% to an F1 of 83% for the best-performing system [12]. The results suggest that the more similar a language pair is, the more difficult it is to accurately detect CS. To the best of our knowledge, there are currently only two studies that present a method for automatically identifying CS and borrowing on social media, and neither study incorporated its results into a LID method. The first focused on English-Hindi CS and on developing a method that automatically detects whether a foreign-language inclusion is CS or borrowing [13]. The method used is similar to the one in this paper, as the starting point is also the assumption that it is possible to distinguish between CS and borrowing by looking at the distribution of use of a foreign word in a native language. They achieve this by looking at the frequency of use of a token in a monolingual Hindi newspaper. Alternatively, [14] propose three different metrics to measure word usage: the Unique User Ratio (UUR), the Unique Tweet Ratio (UTR) and the Unique Phrase Ratio (UPR). Their overall micro precision/recall is 0.33 for the UUR metric, compared to a baseline of 0.19 established in [13].

It is clear from previous studies that multilingual text within one Tweet still provides a challenge for automatic language detection. The systems described above cite similar reasons for the misclassification of certain tokens. Firstly, the highly informal nature of Tweets makes it difficult to capture the language of all tokens. A second reason misclassification occurs is the presence of named entities, which complicates the LID task [15]. Thirdly, words that share the same spelling in both languages are difficult to detect [15, 16]. This particular challenge seems to increase the more similar the languages in the language pair are: it appears to be more difficult to detect the language of tokens when there is a high level of lexical overlap.

Table 1. Characteristics of borrowing and CS [5, 7]
Criteria | Borrowing | Code-switching
No more than one word | + | −
Phonological adaptation | + | ±
Morphological adaptation | + | −
Syntactic adaptation | + | −
Frequent use | + | −
Replaces own word | + | −
Recognized as own word | + | −

4 Resources

A Dutch-English code-switching corpus was created for the purpose of training and testing the classifier and was compiled with the aim of collecting as many Dutch Tweets containing English CS as possible. The corpus was compiled using the search function of the Twitter streaming API, with a specific language setting (Dutch) and specific search words used to find Tweets containing Dutch-English code-switching. The top 25 most frequently used Dutch words on Dutch Wikipedia, consisting solely of grammatical function words, were used as search terms. The language identification method presented in [3] was used to make a pre-selection of Dutch Tweets that are likely to contain English tokens. Based on these language tags, all Tweets with only Dutch or English tokens were separated from the Tweets that contain both Dutch and English tokens. It was necessary to select not only Tweets that were correctly identified by the LID system as CS, but also Tweets that were incorrectly identified, so as not to introduce a bias.
Therefore, some Tweets in the corpus contain only Dutch words that were mistakenly identified as English and are used to test the classifier's ability to recognize code-switched and borrowed tokens. The authors manually selected 1250 Tweets for annotation. The following four categories were used in the manual annotation of the Tweets:

• Dutch (NL) – This category consists of Dutch words. It also includes all Dutch words that are borrowed from English. Particular attention is paid to the annotation of borrowed words; because they are often overlooked and easily annotated incorrectly as English, these words were double-checked in the Dutch word list.
• English (EN) – All English words are labelled as English. If there is doubt about whether a word is English or Dutch, the same criteria as described in the Dutch category are applied.
• Social Media Token (SMT) – It proved useful to create a separate category for all social media related tokens [16]. It includes all tokens that are specifically related to Twitter, such as at-mentions containing people's usernames, hashtags and URLs, but also tokens such as 'hahahah', 'lol' or 'aww'.
• Ambiguous (AM) – This category includes tokens that cannot be categorized as belonging to a particular language. Similarly to the SMT category, these tokens are used in both languages and are thus considered language independent. For example, company names such as Twitter or Google, as well as the names of places and people, are categorized as ambiguous.

The annotation was conducted by a native speaker of both Dutch and English, and a second native speaker annotated 100 randomly selected Tweets to check the accuracy of the annotation. A comparison of the Tweets annotated by both annotators shows a high inter-rater agreement (Cohen's Kappa = 0.949). 1000 Tweets were used as training material and 250 Tweets were used to test the classifier. An overview of the distribution of Tweets in the training and testing sets is given in Table 2 below. Note that while the category ambiguous (AM) has been included for the purpose of completeness, it is not taken into account in any further classification or analysis.

The synonym dictionaries used in the LID system stem from three different sources. The first dictionary was obtained from Open Dutch WordNet [17], a lexical semantic database containing 117914 synonym sets, of which 51588 sets contain at least one Dutch synonym. The second dictionary is from a Dutch language foundation called Open Taal (http://data.opentaal.org/opentaalbank/woordrelaties/), which provides language resources for the creation of Dutch language software. The final dictionary was created using Dutch Wiktionary (https://nl.wiktionary.org/wiki/Hoofdpagina). The synonyms for each of the Dutch entries in the dictionaries were extracted and used to compile a specific synonym dictionary. The addition of multiple synonym dictionaries not only increases the number of synonym sets but also means that entries can be cross-checked.

The word frequency dictionaries were created using the Wikipedia dumps for Dutch and English (version: "all pages with edit history" on 01/03/2017). This particular version contains the pages themselves and a user discussion section where Wikipedia users may comment on the page content. This means that the dictionary contains both formal and informal language, as well as a wide range of vocabulary from different topics.
The word list was created by stripping the raw input of all special characters, tokenizing the sentences, and sorting the tokens according to their rank. The rank lists were cut at five million types, because entries below that point consist of words with a frequency of one. The Social Media Token (SMT) word list consists of a combination of different elements. The SMT list provided in [16] forms the basis of the list used here, supplemented by two additional resources. Firstly, the addition of an emoticon list from Wikipedia allows tokens such as "xD" to be captured. Secondly, a list of onomatopoeic words, such as 'haha' and 'pff', retrieved from the training corpus was also added. To ensure that as many of these tokens as possible are identified as SMT, the list is extended to include various different forms of the same token. This means that alongside 'haha' and 'pff', 'hahahah' and 'pffff' were also added.

5 Classification

In this section, the classification process is described. Section 5.1 contains an overview of the rule-based system, whereas Sect. 5.2 describes how the features derived from the classification rules are extracted for use in various machine learning classifiers.

Table 2. Number of tokens in each of the four categories in the annotated Tweet training and testing sets
Category | No. of tokens in training set | No. of tokens in testing set
Dutch (NL) | 73% (n = 10637) | 73% (n = 2680)
English (EN) | 15% (n = 2220) | 17% (n = 612)
Social Media Token (SMT) | 9% (n = 1281) | 9% (n = 341)
Ambiguous (AM) | 3% (n = 438) | 1% (n = 41)
Total | 14576 | 3674

5.1 Rule-Based LID System

The notion of word frequency plays a central role in the design of the system. It is assumed that the Dutch and English word frequency dictionaries are large enough for all tokens to be present in both dictionaries. Crucially, the rank of a token will differ between the two, as it will be more frequent in its language of origin than in the other dictionary. Thus, in the first step of the LID system, a token is assigned a language tag based on whether it ranks higher (i.e., is more frequent) in the Dutch or the English dictionary. In the rare instance that a token is not present in either of the dictionaries, it is assigned the tag 'none'. As a final step, all 'none' tags are tagged as the majority language (NL) of the Tweet.

Aside from the binary classification into Dutch or English, tokens that are specific to social media also need to be taken into account. Tweets contain many additional tokens, such as @-mentions, hashtags, and abbreviations, which do not strictly belong to either of the two languages. To account for these tokens, an additional rule based on Social Media Tokens (SMT) is introduced. Once the initial classification based on the rank information is made, an additional lookup is performed in an SMT word list. Without this list, almost all of the SMT tokens would be tagged as English, simply because they are more frequent in the English rank dictionary than in the Dutch one. All tokens present in this SMT list are tagged as such and are excluded from any further steps or rules in the LID system.

The lexical overlap between Dutch and English means that it is challenging to capture the language of tokens that are orthographically identical in both languages. For example, the word "school" is used in both Dutch and English and should therefore also be classified as such.
However, if the word "school" has a rank of 615 in the Dutch dictionary and a rank of 325 in the English dictionary, the classifier will tag the word as English. If the LID system consisted only of a basic dictionary lookup without any additional rules, all Dutch occurrences of the word would be misclassified. In order to account for these tokens, two additional rules have been incorporated into the classifier.

The first additional rule is the inclusion of a synonym detection method to determine whether a token is code-switched or borrowed. To start, the token being classified is matched to an equivalent synonym in the Dutch synonym dictionary. If there is no match for the token, and therefore no synonym, the token is classified as English. If there is a match, the token is classified as Dutch under the following two conditions:

• If the rank of the original English token is higher than that of the selected synonym in the Dutch word frequency dictionary, the token is tagged as Dutch and therefore is borrowed. For example: 'soul' (rank = 6914) vs. 'ziel' (rank = 7291).
• If the difference in ranks between the original English token and the selected synonym is less than 30,000, the token is tagged as Dutch and therefore is also borrowed. For example: 'power' (rank = 4092) vs. 'macht' (rank = 1316).

The maximum rank distance of 30,000 was determined iteratively using a list of English words from the training data that could potentially be borrowing or CS. To select the corresponding synonym, the original English token is compared to each of the synonym sets, and if the token is present in a set, its synonyms are added to a match list. Once the match lists have been created, the correct synonym is selected using a process of elimination. In the first step, the synonym that occurs most frequently as a synonym match is selected. Secondly, if there is a tie, the synonym with the highest rank in the Dutch language dictionary is selected. The information obtained from the synonym dictionaries only outweighs the frequency information gained in step one if there is an actual synonym match; otherwise, the classifier assigns the original tag.

The second additional rule considers the context of a token. It applies to tokens where the token is in one language and both the preceding and the following token are in another language. In these cases, the token is assigned a language tag that matches the language of the surrounding tokens. For example, if token 'n' is Dutch and tokens 'n − 1' and 'n + 1' are English, it is possible that the middle token 'n' is, in fact, English and should be reassigned as such. An essential addition to this rule is that it only comes into effect when the ranks of the token are sufficiently similar in the Dutch and English frequency dictionaries (Fig. 1). If a maximum rank distance is not set, all tokens will be reassigned to match their context and all one-word code-switches could be incorrectly classified and lost. After a distance of 1000 ranks, English recall starts to decrease considerably. Therefore, in order to optimize the identification of the English tokens, the rank distance has been set to a maximum of 1000.

To summarize, the steps in the LID system are as follows (a minimal code sketch follows this list):

• Base rule: dictionary lookup using the rank information in the Dutch and English Wikipedia dictionaries.
• Base rule: SMT lookup.
• Additional rule 1: synonym dictionary lookup.
• Additional rule 2: the context rule.
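The sketch below implements the rules as described in Sect. 5.1; the function and variable names are illustrative, the synonym-selection process is abstracted into a precomputed token-to-synonym mapping, and lower rank numbers are taken to mean higher frequency.

```python
# Minimal sketch of the rule-based LID steps; all names are illustrative.
MAX_SYNONYM_DISTANCE = 30_000   # borrowing rule threshold (additional rule 1)
MAX_CONTEXT_DISTANCE = 1_000    # context rule threshold (additional rule 2)

def base_tag(token, rank_nl, rank_en, smt_list):
    """Base rules: SMT lookup, then rank comparison in the two frequency dictionaries."""
    if token in smt_list:
        return "SMT"
    r_nl, r_en = rank_nl.get(token), rank_en.get(token)
    if r_nl is None and r_en is None:
        return "none"                      # later mapped to the Tweet's majority language (NL)
    if r_en is None or (r_nl is not None and r_nl < r_en):
        return "NL"                        # lower rank number = more frequent
    return "EN"

def synonym_rule(token, rank_nl, synonyms):
    """Additional rule 1: tag an EN-tagged token as borrowed (NL) via its Dutch synonym."""
    syn = synonyms.get(token)              # best synonym chosen by the elimination process
    if syn is None:
        return "EN"                        # no synonym match: code-switching
    token_rank = rank_nl.get(token, float("inf"))
    syn_rank = rank_nl.get(syn, float("inf"))
    if token_rank < syn_rank:              # token used more often than its Dutch synonym
        return "NL"
    if abs(token_rank - syn_rank) < MAX_SYNONYM_DISTANCE:
        return "NL"
    return "EN"

def context_rule(tags, tokens, rank_nl, rank_en):
    """Additional rule 2: reassign a token surrounded by the other language,
    but only if its Dutch and English ranks are sufficiently similar."""
    for i in range(1, len(tokens) - 1):
        if tags[i] == "SMT" or tags[i - 1] == "SMT":
            continue
        if tags[i - 1] != tags[i + 1] or tags[i] == tags[i - 1]:
            continue
        r_nl = rank_nl.get(tokens[i], float("inf"))
        r_en = rank_en.get(tokens[i], float("inf"))
        if abs(r_nl - r_en) <= MAX_CONTEXT_DISTANCE:
            tags[i] = tags[i - 1]
    return tags
```

In the full system, the information behind these rules is then converted into the feature vectors described next.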
5.2 Machine Learning

The four steps in the LID system have been converted into numeric vectors for use as input to the classifiers in scikit-learn 0.18. This allows the system to be tested in a formal classification framework and exported for further use. The resulting vector has four features: rank EN, rank NL, SMT, and synonym rank, each corresponding to the information derived from the rule-based LID system described in Sect. 5.1. Rank EN, rank NL and synonym rank are all integers containing the absolute ranks retrieved from the language dictionaries. For the SMT feature, we converted the Boolean presence/absence in the social media token list into an integer of 0 or 1. A second variation of the vector was also tested: the absolute synonym rank was replaced with the difference in ranks between the token in question and its corresponding synonym, with all other vector dimensions remaining the same. The difference between the first and second versions of the vector is that in the first, the rank difference between the token and the synonym is represented only implicitly; the information is inherent in the synonym rank and the rank of the Dutch token and is thus already in the vector. In the second version, the difference in ranks is explicitly added as a feature. This distinction was made to allow the classifiers to be trained on different information and to see whether they would learn the rank difference without being explicitly given the information. We trained and tested eight different classifiers using 10-fold cross-validation, the results of which can be found in Table 3 below. The two best classifiers are discussed in more detail in the following section.

6 Evaluation

In this section, the results for the LID system and the best-performing classifier, the Decision Tree classifier, are presented. Additionally, in Sect. 6.2, the code-switching and borrowing detection rule is evaluated separately.

6.1 General Evaluation

The LID system and the Decision Tree Classifier were evaluated on a held-out set of 250 Tweets, and the results are very similar. The precision, recall and F1 for the individual categories NL, EN, and SMT in the Decision Tree classifier are shown in Table 4 below. The best result is NL, with an F1 of 97.19%, followed by SMT and EN with F1 scores of 96.47% and 88.73% respectively. Compared to the LID system, both precision and recall for NL and EN improved. The overall F1 scores for the LID system and the DTC are 94.66% and 95.69% respectively, which is a significant improvement over the baseline (F1 = 85.29%) for Dutch-English CS detection in Claeser et al. [3]. Both systems illustrate that it is easier to identify Dutch, the main language of the Tweets, although there is an improvement in the classification of the EN tokens in the DTC. The figures for the DTC do not include any post-processing, since the effect of the context rule on the output of the classifier was below the variance of the results across different test splits within cross-validation. The confusion matrix in Table 5 shows the misclassified tokens for the DTC. Most errors stem from tokens that should have been classified as either NL or EN.

Fig. 1. Dutch and English precision and recall with differing maximum rank distance.

The SMT tokens are rarely misclassified, and when they are, it is because the token is an unusual variant of an SMT token in the SMT list.
One of the largest sources of errors consists of Dutch tokens that should have been classified as English. This includes tokens such as 'god', 'pianist', 'pressure', and 'dreaming'. There are two main types of errors. Firstly, single-word inclusions were misclassified due to the context in which they appeared. For example, 'god' and 'pianist' are part of both the Dutch and English lexicon, and were misclassified in these cases because they were used in an English context but classed as borrowed (NL) through the inclusion of a synonym rank. Secondly, tokens were misclassified because the matched synonym is incorrect. A manual inspection of the tokens and their selected synonyms shows, for example, that the synonym selected for 'love' is 'rose'. While these tokens are related in some way, they cannot be considered synonyms of one another. However, because 'love' is more frequent than 'rose', it is automatically classified as a borrowed (NL) word, since the English token is more frequent than its supposed Dutch synonym.

Another source of errors is English tokens that should have been classified as Dutch. In most cases, they were not detected as borrowed words by the classifier. One of the main reasons is that for these tokens the synonyms were not included in any of the three external synonym dictionaries. For example, 'respect', 'defect', 'story', 'highlight' and 'trends' are all part of the Dutch lexicon, but were classified as English. A second reason for misclassification is the inclusion of multi-word code-switched segments. For example, 'minute' is misclassified as English; however, if it is used as part of the phrase 'last minute', it should be considered Dutch. Only the phrase as a whole is considered Dutch, not the individual tokens within the phrase. In order to capture these specific instances, multi-word token sequences would need to be included in the dictionaries, and currently the classifier operates on single tokens.

Table 3. Classifier performance (micro F1)
Classifier | Micro F1
Decision tree classifier | 0.9537
Support vector machine | 0.924
Ada boost classifier | 0.9096
Linear discriminant analysis | 0.8187
Quadratic discriminant analysis | 0.8186
Logistic regression | 0.7729
Neural network | 0.7503

Table 4. P, R, F1 for the individual categories in the DTC and LID system
Language | Precision (%) | Recall (%) | F1 (%)
Decision tree classifier:
Dutch (NL) | 97.16 | 97.23 | 97.19
English (EN) | 88.58 | 88.87 | 88.73
Social Media Token (SMT) | 97.10 | 95.86 | 96.47
Rule-based LID system:
Dutch (NL) | 95.85 | 97.22 | 96.53
English (EN) | 86.50 | 80.21 | 83.23
Social Media Token (SMT) | 97.56 | 98.00 | 97.77

6.2 Evaluation of the Synonym Selection Rule

To evaluate the effect of the synonym detection step on the overall classification process, a list of 400 words tagged as English in the base step of the LID system was extracted for further analysis. Each token was tagged as either borrowed or code-switching based on the information from the synonym dictionaries, and the original language tag (EN) was changed to Dutch whenever the system indicated that the word may be borrowed. This output was then compared to the gold standard, which was based on the presence or absence of a word in the "Woordenlijst Nederlandse Taal". The analysis is based solely on the 400 individual tokens, without taking their context in the Tweet into account. In total, 82% of the tokens were correctly identified as being either borrowed or code-switched.
260 tokens were correctly identi?ed as code-switching, compared to a total of 289 tokens that should have been classi?ed as code-switching and 71 out of 97 tokens have been correctly identi?ed as borrowing. Without this additional step, based on the initial rank information, all of these tokens would have been classi?ed as code-switched (EN), even though many of these are indeed part of the Dutch lexicon and should, therefore, be tagged accordingly. This demonstrates the importance of distinguishing between borrowing and CS in a lan-guage identi?cation system that classi?es closely related languages. As well as analyzing the impact of the synonym dictionary rule as a whole, the two different conditions in which a token is tagged as borrowing have also been examined (see Sect. 5.1 for a description of the conditions). Each condition considers the rank information of the token and the synonym that has been selected as an equivalent match. The ?rst enables the detection of borrowed tokens that have a higher rank than its equivalent Dutch synonym. A total of 53 borrowed words were correctly identi?ed using this method. Among the correctly identi?ed tokens are ‘we’, ‘must’, ‘budget’, ‘crash’, ‘super’, ‘sale’ and ‘media’, ‘perfect’, ‘modern’, and ‘ranking’. The information that was used to classify the tokens is provided in Fig. 2 below. For each of the tokens, the English version was used more frequently than its Dutch equivalent. In some cases, the distance between the ranks of the two synonyms is much larger than others. Table 5. Confusion matrix of the decision tree classi?er NL EN SMT Total NL 2674 66 10 2750 EN 67 611 3 681 SMT 7 2 314 323 Total 2748 679 327 13889 428 S. Kent and D. Claeser The larger the rank distance between the two synonyms, the larger the difference in frequency of use of the borrowed word compared to the Dutch equivalent synonym. In the second rule, a token is classi?ed as borrowed if the distance between the rank of a token and its selected synonym is less than 30,000. The CSB system correctly identi?ed 30 tokens using this rule. Among them is the selection of tokens provided in Fig. 2. In these instances, the frequency of use is higher in the Dutch synonym equivalent than in the token. For example, ‘ticket’ is used relatively frequently in Dutch, although the Dutch version ‘kaartje’ is still used more frequently. In other words, the original Dutch token is used more frequently than the borrowed equivalent of the word. Interestingly, this rule enables the identi?cation of borrowed nouns as well as highly frequent grammatical tokens. The synonym pair ‘me’ and ‘mij’ demonstrated the CSB system’s ability to recognize that ‘me’ is both a Dutch and English pronoun (Fig. 3). Whilst the ?rst borrowing rule may have identi?ed more borrowed tokens overall, a direct comparison of the number of correct tokens identi?ed by both of the rules shows that they are both equally capable at identifying borrowed tokens. 89.9% of the tokens classi?ed by the ?rst borrowing rule were correct and 90.9% of tokens classi?ed by the second rule were correct. The synonym selection process was crucial to successfully differentiating between borrowed and code-switched tokens. In order to judge whether the synonyms are a correct match or not, two Dutch native speakers separately annotated the synonym match lists. The judgment was based solely on whether the two tokens could be synonyms, without taking any context into account. 
These ?gures do not take into account whether or not the token was classi?ed correctly; it focuses solely on whether the synonym match is correct. Generally, there was agreement between the annotators, and the ?nal judgments for each annotator were merged to create an overall judgment list. In total, out of the 97 borrowed tokens identi?ed by the system, 79.4% of all synonyms have a correct match. Table 6 below shows that of the 77 correct synonym matches, only 5% of tokens were incorrectly classi?ed as borrowing. Contrastingly, 40% of tokens with an incorrect synonym were incorrectly classi?ed as borrowing. Therefore, there seems to be a correlation between whether the synonym that is identi?ed by the system is correct or not and the corresponding classi?cation of bor-rowing or code-switching. If the synonym match is correct, the more likely the system will correctly identify whether a token is borrowed or code-switched. Overall, the system is relatively accurate at identifying whether an English token is in fact just English, or whether it also belongs to the Dutch lexicon. These tokens have been de?ned as borrowed tokens in the context of this study, even though strictly speaking not all tokens are actually borrowed from English and some may share another etymology. Nevertheless, the system is able to identify if a token should also be classi?ed as Dutch; so from the perspective of a method able to differentiate between these two languages, the classi?er will be a valuable tool in this process. Incorporating Code-Switching and Borrowing in Dutch-English ALD on Twitter 429 7 General Discussion Our initial assumption was that the Decision Tree classi?er would be the most suitable classi?er for features extracted from the rule-based LID system. However, even though it was the best performing classi?er, the rule related to the rank distance between a token and its corresponding synonym did not transfer. This is true for both versions of the vector. The rule was not learned whether or not the rank distance was explicitly provided. In each case, a different tree is generated, but both are equivocally complex: the classi?er bypasses the synonym rank rule and the model is based on grouping tokens with similar ranks to create paths. We suspect the reason that the classi?er did not learn the rule is that the algorithm that builds the decision tree has the objective to Fig. 2. A selection of correctly identi?ed borrowed (NL) tokens. The token is marked in bold and supplemented by its rank in the Dutch Wikipedia dictionary as well as the synonym selected by the classi?er and its matching rank. Fig. 3. A selection of correctly identi?ed borrowed (NL) tokens using the maximum rank distance rule. Table 6. Correlation between synonym matches and the number of correctly classi?ed borrowed (NL) tokens No. incorrectly classi?ed tokens No. correctly classi?ed tokens Correct matches 5% (n = 4) 95% (n = 73) Incorrect matches 40% (n = 8) 60% (n = 12) 430 S. Kent and D. Claeser ?nd the most ef?cient local split. It aims to create the purest subset with maximum information gain, and consequently fails to detect the global optimum. Instead, the classi?er generated hundreds of speci?c paths to classify small groups of tokens. The second best performing classi?er, aside from the rule-based LID system, is the Support Vector Machine. In contrast to the Decision Tree, the SVM does, in fact, learn the synonym rank rule. 
We believe that this is because the RBF kernel enables the classifier to generalize and learn the concept of a rank threshold for the synonyms. It does so by transforming the non-linear data from the dictionary rank lists into a hyperspace that allows for the separation of the otherwise intertwined examples of borrowing and CS in the rank lists. This assumption is supported by the observation that providing either the ranking distance as explicit information or just the synonym rank has no visible influence on either the runtime or the performance of the resulting SVM. Neither does changing the default value in the vector for non-existing synonyms from 0 to 10 million, a value larger than the size of the dictionary.

Interestingly, the rule-based LID system performed very similarly to the machine learning classifiers. [16] also reported a similar finding, in that the results for the rule-based system were actually slightly better than for the machine learning systems, suggesting that if the rules are designed carefully, language detection for this particular language pair can be just as accurate in rule-based systems as in machine learning systems.

The performance of the systems depends greatly on the quality of the external materials. While designing the systems, we noticed both advantages and disadvantages for the different types of external resources. Firstly, the synonym dictionaries proved to be quite difficult to obtain. The decision was made to combine multiple synonym dictionaries in order to compensate for incomplete dictionaries. The main reason for doing so is the ability to cross-reference entries for the lemmas, which allows for a verification of whether an entry is actually correct. For example, for some dictionary entries, the English translation of a word is listed as a synonym even though it is not officially part of the Dutch lexicon. These tokens caused issues, as they were not included as English tokens in the annotated gold standard, and were consequently incorrectly classified. The most frequently occurring example is ‘why’, which is listed as a synonym for ‘waarom’ in the Open Taal synonym dictionary. This mistaken entry would be easy to rectify if all synonyms not present in at least one other dictionary were disregarded as synonym matches. However, this is not possible with the current synonym dictionaries, as many of the matches occur in only one dictionary: too many entries would be lost and the performance of the identification of the borrowed or Dutch tokens would decrease. If such a frequently occurring word is listed as a synonym even though it is not one, it is likely that this is also true for other entries, which may cause issues in the classification of other tokens in the future. Secondly, the Wikipedia rank list turned out to be a highly suitable external resource. A comparison of the studies describing just a basic dictionary lookup approach with the results obtained in this system illustrates that the quality of the Wikipedia dictionaries enhanced the performance of the first step in the LID system. The system in [16] obtained an F1 score of 38% for identifying English tokens, [18] obtained similar figures (38% and 35%) for the English-Hindi and English-Bengali language pairs, and [19] obtained the highest F1 scores in comparison, with 71% and 73% for Spanish-English and Nepali-English respectively.
In the LID system we present, the most basic version without any additional rules achieved a micro F1 of 72%. This suggests that the quality of the dictionaries is good, because based on just the lookup alone, the results are better than initially anticipated based on previous research. Having said this, a few issues still remain. Firstly, it must also be considered that while Wikipedia contains a large variety of topics and registers, there may be some topics that are overrepresented on Wikipedia and tokens related to that topic are consequently also more frequent in the dictionaries than they would be in other cir-cumstances. Secondly, the use of quotations or names in the articles may also mis-represent the actual frequency of certain tokens. Names of books or ?lms are not translated into Dutch and they are often used in the original language. Consequently, the article ‘the’ is extremely frequent in the Dutch Wikipedia pages even though it is not a Dutch token. In the English rank dictionary, ‘the’ is the most frequently used token and is ranked at one. In the Dutch dictionary, it is ranked at 63. Even if it is highly ranked, the assumption that words are more frequent in their language of origin still holds. Nevertheless, according to the Dutch Wikipedia rank dictionary, the word ‘the’ is more frequent than most Dutch lexical items and it does not match the fre-quency information that one would expect of words that are not a part of the Dutch lexicon. 8 Conclusion The question posed in this paper was whether or not a dictionary-based LID system is suitable for token-level language detection in a closely related language pair. Previous research [3] indicated that lexical items present in both languages, in this case, Dutch- English, caused misclassi?cations in a dictionary-based lookup system. It was dif?cult to identify whether or not a token was code-switched because many English tokens were classi?ed as Dutch. The solution presented in this paper was to combine a system designed speci?cally to differentiate between borrowing and code-switching. The results show that by incorporating this method into token level language classi?cation yields a micro F1 of 94.66% and 95.69% for the rule-based LID system and the DTC respectively. This is a great improvement compared to the baseline (F1 = 85.29%) for Dutch-English CS detection in [3]. Even if the overall result is highly competitive to other similar systems, future research could bene?t from adding a number of improvements. Firstly, named entities were excluded from classi?cation altogether, because as far as we are aware, there are no suitable external named entity recognition systems for code-switched Dutch-English tweets. The systems could bene?t from the addition of named entity recognition, but more importantly, it should be included for the purpose of completing the classi?cation of a Tweet as a whole. Secondly, the synonym selection method could be improved, if context were to be taken into account. Currently, the context information is only used in the ?nal step of the LID system to correct any misclassi?cations by the frequency dictionary lookup and synonym dictionary lookup. It would be interesting to see 432 S. Kent and D. Claeser whether performance improves if this step is implemented within the synonym selection process, rather than as a ?nal step. One of the challenges for the design of the system was acquiring good external resources. 
The dictionaries based on Dutch and English Wikipedia are a highly suitable source for the creation of the language-speci?c word frequency lists. The inclusion of formal and informal language and a wide range of topics ensure many of the tokens are in fact present in the dictionaries. However, there seems to be a lack of freely available material for Dutch natural language processing. The synonym dictionaries, in partic-ular, are not ideal, as three separate dictionaries are necessary to achieve the results in this paper. The performance of the systems would improve with a better quality syn-onym dictionary. It is possible to improve the current dictionaries and tailor them speci?cally to the task at hand by verifying the synonym sets and adding other forms of the tokens already present. This would not only increase the likelihood of a synonym being present in the dictionary, but also the likelihood that the synonym is a correct match. Finally, both systems were developed using the language pair Dutch-English, and because the design of the classi?ers is quite simplistic and not necessarily tied based on a particular language, it would be interesting to see how they would perform on a different closely related language pair. References 1. European Commission: Europeans and their languages. Special Eurobarometer 386 (2012) 2. Poplack, S.: Sometimes I’ll start a sentence in Spanish Y TERMINO EN ESPANOL: toward a typology of code-switching. Linguistics 18, 581–618 (1980) 3. Claeser, D., Felske, D., Kent, S.: Token-level code-switching detection using Wikipedia as a lexical resource. In: Rehm, G., Declerck, T. (eds.) GSCL 2017. Language Technologies for the Challenges of the Digital Age. Lecture Notes in Arti?cial Intelligence, Lecture Notes in Computer Science, vol. 10713, pp. 192–198. Springer, Heidelberg (2018) 4. Johnson, S.: A dictionary of the english language: a digital edition of the 1755 classic. In: Besalke, B. (ed.) The History of the English Language. https://johnsonsdictionaryonline. com/the-history-of-the-english-language/. Accessed 15 April 2014 5. Muysken, P.: Code-switching and grammatical theory. In: Milroy, L., Muysken, P. (eds.) One Speaker, Two Languages: Cross-Disciplinary Perspectives on Code-Switching, pp. 177–198. Cambridge University Press, Cambridge (1995) 6. Auer, P.: Bilingual Conversation. Amsterdam/Philadelphia, Benjamins (1984) 7. Poplack, S., Sankoff, D.: Borrowing: the synchrony of integration. Linguistics 22, 99–135 (1984) 8. Clyne, M.: Dynamics of Language Contact. Cambridge University Press, Cambridge (2003) 9. Solorio, T., Blair, E., Maharjan, S., Bethard, S., Diab, M., Gohneim, M., Hawwari, A., Al- Ghamdi, F., Hirschberg, J., Chang, A., Fung, P.: Overview for the ?rst shared task on language identi?cation in code-switched data. In: Proceedings of the First Workshop on Computational Approaches to Code Switching, pp. 62–72. Doha, Qatar (2014) 10. Molina, G., AlGhamdi, F., Ghoneim, M., Hawwari, A., Rey-Villamizar, N., Diab, M., Solorio, T.: Overview for the second shared task on language identi?cation in code-switched data. In: Proceedings of the Second Workshop on Computational Approaches to Code Switching, pp. 40–49. Austin, Texas (2016) Incorporating Code-Switching and Borrowing in Dutch-English ALD on Twitter 433 11. Shirvani, R., Piergallini, M., Gautam, G.S., Chouikha, M.: The Howard University system submission for the shared task in language identi?cation in Spanish-English Codeswitching. 
In: Proceedings of the Second Workshop on Computational Approaches to Code Switching, pp. 116–120. Austin, Texas (2016) 12. Samih, Y., Maharjan, S., Attia, M., Solorio. T.: Multilingual code-switching identi?cation via LSTM recurrent neural networks. In: Proceedings of the Second Workshop on Computational Approaches to Code Switching, pp. 50–59. Austin, Texas (2016) 13. Bali, K., Sharma, J., Choudhury, M., Vyas, Y.: I am borrowing ya mixing?: An analysis of English-Hindi code mixing in Facebook. In: Proceedings of the First Workshop on Computational Approaches to Code Switching, Doha, Qatar, pp. 116–126 (2014) 14. Patro, J., Samanta, B., Singh, S., Basu, A., Mukherjee, P., Choudhury, M., Mukherjee, A.: All that is English may be Hindi: enhancing language identi?cation through automatic ranking of the likeliness of word borrowing in social media. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Copenhagen, Denmark, pp. 2264–2274, 7–11 September 2017 15. Nguyen, D., Dogruöz A.: Word level language identi?cation in online multilingual communication. In: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, Seattle, Washington, pp. 857–862 (2013) 16. Dongen, N.: Analysis and prediction of Dutch-English code-switching in social media messages. Unpublished master’s thesis. University of Amsterdam (2017) 17. Postma, M., van Miltenburg, E., Segers, R., Schoen, A., Vossen, P.: Open Dutch WordNet. In: Proceedings of the Eight Global Wordnet Conference, Bucharest, Romania (2016) 18. Das, A., Gambäck, B.: Code-mixing in social media text: the last language identi?cation frontier? Trait. Autom. Lang. 54(3), 41–64 (2013) 19. Maharjan, S., Blair, E., Bethard, S., Solorio, T.: Developing language-tagged corpora for code-switching tweets. In: Proceedings of LAW IX - The 9th Linguistic Annotation Workshop, Denver, Colorado, pp. 72–84 (2015) 434 S. Kent and D. Claeser A Systematic Review of Time Series Based Spam Identi?cation Techniques Iqra Muhammad(?) , Usman Qamar, and Rabia Noureen National University of Sciences and Technology, H-12, Islamabad, Pakistan iqra1804@gmail.com, usmanq@ceme.nust.edu.pk, rabia.noureen15@ce.ceme.edu.pk Abstract. Reviews are an essential resource for marketing the company’s prod- ucts on e-commerce websites. Professional spammers are hired by companies to demote competitive products and increase their own product ratings. Researchers are now adopting unique methodologies to detect spam on e-commerce websites. Time-series based spam detection has gained popularity in the recent years. We need techniques that can help us catch spammers in real time, using fewer resources. Hence, an analysis involving the use of time series is of utmost impor- tance for real-time spam detection. We focus on systematically analyzing and grouping spam detection techniques that either involve the use of temporal features, or have used time series. This study will proceed with analyzing the techniques in terms of accuracy and results. In this research paper, a survey of di?erent time series based spam detection techniques has been presented and limitations of the techniques have been discussed. Keywords: Review spam · Time series · Techniques 1 Introduction In the past decade, the increasing use of e-commerce websites for online shopping has also encouraged users to write reviews on products. This evolution of writing reviews on merchant websites has also led to spammers posting spam reviews. 
Companies sell products on e-commerce websites, hire spammers to post spam reviews for demotion of competitor’s products. Spam has lessened the credibility of online reviews and people become reluctant to buy a product, unsure whether the online reviews about a product are spam or not spam. Online spam reviews a?ect both buyers and sellers. Researchers have adopted a number of approaches for detection review spam. The conventional approaches for detecting review spam involve focusing on one reviewer or a single online review [1]. The authors in previous approaches [1], have detected duplicated reviews in a dataset as spam In addition to this, some previous methods of spam detection have focused on using n-gram features for spam identi?cation [2]. Our study will focus on providing a critical analysis, of the spam detection techniques that make use of time series to identify spam. Some spam detection techniques involve the use of psychological and behavioral features and identifying fake reviews [3, 4]. In addition to this, some state of the art © Springer Nature Switzerland AG 2019 K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 435–443, 2019. https://doi.org/10.1007/978-3-030-02686-8_33 focuses on identifying temporal patterns for detection of spam [5]. Temporal patterns involve exploration of temporal burstiness patterns for detection of opinion spam [5]. Author in [6] introduces a robust spam identi?cation approach, in which content-based factors, rating deviation and activeness of reviewers are employed along the use of time series to identify spam in online reviews. The authors in [6] have listed the disadvantages and advantages of the proposed technique in terms of increasing time e?ciency and reducing high computation requirements. The authors in [7] have linked burstiness with reviewers. Bursts of reviews are de?ned as the abnormal peaks in a time series of reviews. Bursts can occur in a time series due to several reasons. The ?rst reason can be due to the sudden rise of a product’s sale on the merchant website. The second reason for the occurrence of a burst in a time series can be due to spam attacks. Many current state of the art techniques have captured these bursts in time series for identi?cation of spam attacks. A spam review and a spammer can be related in a burst. Spammers like to work in groups while posting reviews hence spam reviews are related in a burst. Non-fake reviews are also related to other non-fake reviews in a burst of time series of reviews. The authors in [8] have used a time series based fake review detection approach in which, they have combined content and usage information. This study [8] has covered product reviews and the behavioral qualities of reviewers. Lastly, the authors in [9] have highlighted the technique of using correlated temporal features for identifying spam attacks. Their methodology [9] of spam attacks revolves around the creation of a multidimensional time series derived from aggregation of statistics. The time series [9] has been constructed to show the e?ectiveness of using correlations. In the current study, a comparative analysis of existing time series based spam detection techniques has been performed. The focal point of our review paper is that after going through mentioned techniques, experts can devise an e?cient time series based spam detection approach that uses novel temporal features. 
Researchers can benefit from this review of time series based spam detection techniques and identify the limitations of the existing techniques in order to propose new temporal-based spam detection methods. The paper is ordered as follows: Sect. 2 defines the terms spam detection and time series, Sect. 3 presents a critical analysis of some of the time series based spam detection techniques, Sect. 4 discusses the techniques, and Sect. 5 consists of the conclusion and future work.

2 Definitions

2.1 Time Series

A time series is defined as a series of data points arranged in time order. Time series are widely used in the banking sector to identify credit card fraud, and they are also applied in anomaly detection [9]. The authors in [9] use multivariate time series as a tool for anomaly detection. Time series have also recently been used in the literature for the detection of opinion spam [8]. A time series can be defined mathematically using the simple regression model:

y(t) = x(t)β + ε(t),    (1)

where y(t) = {y_t; t = 0, 1, 2, …} is a sequence indexed by the time subscript t, x(t) = {x_t} is an observable signal sequence, and ε(t) = {ε_t} is an unobservable white-noise sequence [16].

2.2 Review Spam Detection Techniques

Review spam is defined as the set of fake reviews posted on e-commerce websites. Opinion spam detection techniques [2, 3] have been widely used by researchers to detect fake reviews. Such techniques assist e-commerce websites in automating spam detection.

3 Systematic Review

This section gives an overview of papers in the literature that have used time series or temporal features for the detection of opinion spam.

3.1 On the Temporal Dynamics of Opinion Spamming

In [5], a hybrid technique is used to identify spamming on time series of Yelp reviews. The authors in [5] discovered temporal patterns in the time series and their relationship with the posting rates of spammers. They used vector autoregression methods to predict the fraud rate under multiple spamming policies, and also examined the effect of filtered reviews on future ratings. Three types of spamming policies are covered in [5], and restaurants on Yelp were grouped according to these policies. The authors calculated a set of 10 modalities of normalized time series and, for each behavioral modality, applied time series clustering within a given policy. They also characterized the reasons for spamming by comparing the time series of deceptive ratings with the truthful ratings, using weeks as the time interval of the time series, and identified the major causes of the deceptive ratings using correlation techniques. The authors carried out 5-fold cross-validation with classification on time series features, behavioral features and n-gram features. This technique lacked the use of ten-fold cross-validation when applying classification on the review features, and the authors could also have used an additional set of textual features from the review text to improve the accuracy of the model. The comparison of the different spam detection techniques is shown in Table 1.

Table 1. Precision, recall, F-score and accuracy for all techniques.
Approaches Dataset Precision Recall F-score Accuracy On the temporal dynamics of opinion spamming [5] (late spamming) Yelp hotels and Restaurant Review dataset [14] 86.3 95.3 90.6 90.1 Exploiting Burstiness in Reviews for Review Spammer Detection [7] (Burst review with LBP and local observation) Amazon Review Dataset [13] 83.7% 68.6% 75.4% 77.6% Fake Review Detection via Exploitation of Spam Indicators and Reviewer Behavior Characteristics [8] Amazon Review Dataset [13] 75.2 75 7 74.9 x Detection of Fake Opinions using time series [6] Amazon Review Dataset [13] 82 88 86 x Biomodal Distribution and Co-bursting in Review Spam Detection [10] Dianping’s real-life ?ltered (fake or spam) reviews [15] x x x x Modelling Review Spam Using Temporal Patterns and Co-Bursting Behaviors [12] Dianping’s real-life ?ltered (fake or spam) reviews [15] x x x x Review Spam Detection via Temporal Pattern Discovery [11] Review website (www.resellerratings.com) [11] x x x x 3.2 Exploring Burstiness in Reviews for Review Spammer Detection A sudden rise in the popularity of products or the presence of spam attacks can produce bursts in time series. The authors in [7] have captured these bursts in time series of reviews. Spam reviews are related to other spam reviews in a burst. The reason is that, the spammers work in groups and post spam reviews collectively. Real reviews are related to other real reviews in time series. Author in [7] has proposed a robust spam detection framework that uses a network of reviewers appearing in the peaks of time series. They have also modeled reviewers and their co-occurrence in the peaks as Markov Random Field. In addition to this, they have used Loopy Belief Propagation technique to decide whether a reviewer can be marked as a spammer or not. They also used feature-engineering techniques, in the Loopy Belief Network for network inference. Lastly, they used an evaluation technique of using supervised classi?cation on their reviews. The limitations of this technique [7] include testing the proposed method on other review datasets to increase the validity of their technique. 438 I. Muhammad et al. 3.3 Fake Review Detection via Exploitation of Spam Indicators and Reviewer Behavior Characteristics In [8] the authors have proposed a novel spam detection framework for the identi?cation of spam reviews. This technique combines content and usage information for the iden- ti?cation of spam product reviews. The model also includes reviewer’s behavioral char- acteristics and product reviews. The authors have derived a relationship between both reviews and spammers. Their proposed model [8], identi?ed bursts to examine suspi- cious time intervals of product reviews. The technique has also employed each review- er’s past record of reviewing to derive the authorship attribute. This authorship attribute of a reviewer is a strong indicator of spam in product reviews. The technique [8] has not only considered reviews in burst intervals but also considered reviews outside the burst intervals. The authors employed [8], basic spam indicators like the rating deviation, number of reviews and content similarity. The reviews captured from burst time intervals included spam indicators like content similarity and burst activity. The techniques last step involves a linear weighted scoring function, which integrates the individual scores and calculates a mean output for overall spam score. Lastly, the technique [8] has been validated on a real word review dataset. 
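The linear weighted scoring step mentioned above can be illustrated with a short sketch. The indicator names, weights and threshold below are hypothetical and not those of [8]; the point is only to show how individual spam indicators, each scaled to [0, 1], can be combined into a single spam score.

```python
# Illustrative sketch of a linear weighted scoring scheme of the kind used in [8];
# indicator names, weights and the decision threshold are hypothetical.
def spam_score(review, weights=None):
    """Combine per-review spam indicators (each scaled to [0, 1]) into one score."""
    if weights is None:
        weights = {"rating_deviation": 0.4,
                   "content_similarity": 0.3,
                   "burst_activity": 0.3}
    total_w = sum(weights.values())
    return sum(weights[name] * review[name] for name in weights) / total_w

review = {"rating_deviation": 0.9,    # far from the product's average rating
          "content_similarity": 0.7,  # near-duplicate of other reviews
          "burst_activity": 1.0}      # posted inside a detected burst
print(spam_score(review))             # 0.87 -> flagged if above a chosen threshold
```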
The limi- tations of this technique may include lack of e?ective features. The feature set used for identi?cation of spam reviews can be improved by using additional reviewer based features like reviewers location and taking into account reviewer’s writing style. They can also use a di?erent weighting scoring function for assigning scores, which might improve the accuracy of the model. 3.4 Detection of Fake Opinions Using Time Series Author in [6] focuses on the implementation of a unique time series based spam detection algorithm. The algorithm involves factors like rating deviation, activeness of reviewer and other content based factors or detection of spam reviews. There are certain ?aws associated with conventional spam detection techniques. The proposed technique [6] has tried to overcome ?aws of high time consumption and high computations time. The technique is based on the assumption that the spammers work in groups and spam reviews frequency raises during certain time intervals. Author in [6] has tried to over- come the drawbacks of high time consumption and high computation required for searching for spam in large review datasets. The authors [6] have proposed that the system can be used as a real-time spam ?ltering system. We can easily clean large review datasets from spam reviews. Their proposed model achieved an F-score of 0.86. The limitation of this study is that they have not taken into account, the spam reviews that might exist outside the time series bursts. Secondly, the authors could have increased the accuracy of the model by employing features focused on the characteristics of a spammer like spammer’s IP address etc. Lastly, the validity of their proposed technique [6] can be increased by applying it onto multiple datasets. The technique is domain dependent because it has been created for application on review datasets. A Systematic Review of Time Series 439 3.5 Biomodal Distribution and Co-bursting in Review Spam Detection The author in [10] highlights the issue of spam detection and proposes a hybrid approach of using biomodal distribution and co-bursting factors. According to the authors, online reviews are critical for the comparison of di?erent products on merchant websites [10]. As explained earlier in the article spammers and fraudsters take advantage of online reviews and post fake opinions to attract customers on certain products. The previous approaches have made us of review contents, reviewer’s behavioral traits and rating patterns. This research [10] has focused on exploiting reviewer’s posting rates. The authors [10] discovered that the reviewers posting rates have a biomodal relationship with each other. According to [10], spammers post reviews in a collective manner within short intervals of time. This phenomenon of posting reviews collectively is called co-bursting. The authors in [10] have discovered patterns in a reviewer’s temporal dynamics. Authors in [10] include a labeled hidden Markov model with two modes. This model has been used to detect spamming using a single reviewer’s posting times. The method is then extended to couple hidden Markov model for identifying posting behavior and signals with co-bursting. They have also proposed a co-bursting network based model, which aids in detection of spammers. The proposed approach [10] lacks evaluation of the model through the use of supervised machine learning techniques. 
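Several of the surveyed techniques [6, 7, 8] rely on locating bursts, i.e. abnormal peaks in the number of reviews per time window. As a generic illustration (not the detection rule of any specific paper), a window can be flagged when its review count exceeds the series mean by some multiple of the standard deviation:

```python
# Simple burst flagging over weekly review counts: a window is marked as a burst
# when its count exceeds the series mean by k standard deviations.
# Generic illustration only, not the rule used by any of the surveyed papers.
import numpy as np

def find_bursts(counts, k=2.0):
    counts = np.asarray(counts, dtype=float)
    mu, sigma = counts.mean(), counts.std()
    return np.where(counts > mu + k * sigma)[0]

weekly_reviews = [4, 6, 5, 7, 5, 48, 6, 5, 4, 6]   # week index 5 is a spam burst
print(find_bursts(weekly_reviews))                 # -> [5]
```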
3.6 Review Spam Detection via Temporal Pattern Discovery This proposed approach [11] provides evidence of spam attacks being bursty. The bursts in a time series can be either positive or negative. The authors propose [11] a correlated temporal approach to detect spam. This approach uses singleton reviews spam identi?- cation. In addition to this, it maps SR spam identi?cation to correlated pattern detection. The proposed approach [11] is based on multidimensional time series anomaly detection algorithm. The algorithm involves making a multi-scale time series and use statistics with joint anomalies as an indicator of spam. The detected statistics involve factors like average rating, ratio of singleton reviews and lastly the average rating of reviews. The time-series, is then developed and an SR spam detection model is based on this time series. The algorithm also uses integration of longest common subsequence and curve ?tting. Both of these factors are used to ?nd abnormal sections in each dimension of time series. The authors [11] have introduced a ranking technique to sum up all anomalies in various dimensions for detection of abnormal sections in time series. Fluctuations are common in time series. This algorithm has used a time window size of more than two months, so that noises in the time series can be smoothed. In a certain scenario, if a singleton review spam attack occurs in time series, the time window size is decreased so that any further abnormal patterns can become more obvious. The construction of time series is done, and this time series is multi-dimensional. Multi-dimensional time series is then used to identify abnormally correlated pattern detection problem. The results of this methodology show that it is quite e?ective in identi?cation of singleton review spam attacks. The limitation of this approach can be that this technique is not 440 I. Muhammad et al. applicable on other types of spams like sms and email spam. The model has been tested on a single dataset. 3.7 Modelling Review Spam Using Temporal Patterns and Co-bursting Behaviors This technique [12] is based on a real life dataset from a review hosting site called dianping. The authors [12] discovered that reviewers posting rates were biomodal. In addition to this scenario, the transitions between di?erent states could be used to detect spammers from real reviewers. The technique proposed, involves a two model labeled hidden Markov model for identi?cation of spammers in review websites. The ?ndings of the model prove that the existing approach can outperform, supervised machine learning algorithms. Spammers are keener on writing reviews in a group and hence bursts in time series of reviews are created. The authors in [12] propose a co-bursting based approach for identifying spammers. This framework can enable more precise detection of spammers and outperforms the current state of the art mentioned in [12]. The authors have also mentioned that biomodal distributions are disparate and these distributions were identi?ed in both form as review spammers and non-spammers. The limitation of this approach is that it requires time stamps of reviews in a dataset. Without the presence of time stamps, the approach is not applicable in real life datasets. The advantage of the algorithm is that it can be applied to commercials review spam ?lters. 4 Discussion We have compared all the approaches using the metrics of precision, F-score, recall and accuracy. 
The amount of precision, recall, f-score and accuracy for each technique has been taken from the articles mentioned in Table 1. A comparison has been made among the techniques keeping in view the fact that most of these approaches have been applied to the similar datasets. After the comparison, it can be seen that only some of the algo- rithms mentioned in Table 1, have used precision, recall, f-score and accuracy for comparison. The ?rst article referred in Table 1, uses the dataset of Yelp [14]. Yelp [14] is a website that provides reviews on hotels and restaurants. Spammers work in groups to post fake reviews about certain hotels. Spammers target hotels and Restaurants and fake reviews cause their ratings to decrease. This approach [5] has achieved an accuracy of 90.1 with late spamming. Late spamming achieved the best set of precision and accu- racy among all three types of spamming. The second approach [7] mentioned in Table 1, is based on exploration of burstiness in reviews for spammer identi?cation. This approach [7] produced these set of results in the table with the use of LBP and local observation techniques. This algorithm used Amazon review dataset [13]. Amazon review dataset [13] provides a large-scale dataset on various set of products. Products rating, reviews and other attributes have been included in Amazon review dataset. This approach achieves an accuracy of 70.1 with the LBP and local observations. The third approach [8], included in the table is based on spam detection by using reviewer char- acteristics and various spam indicators. This approach [8] also used Amazon review A Systematic Review of Time Series 441 dataset [13]. The approach [8] didn’t use the metric of accuracy for evaluating its model. It achieved an F-score of 74.9%. The fourth algorithm [6], mentioned in the table makes use of time series and other reviewer traits to detect spam in reviews. It has also Amazon review dataset [13]. The model achieved an F-score of 86%. This model [6] didn’t used any supervised machine learning technique to classify the suspicious set of reviews as spam or non-spam. The ?fth article [10] included in Table 1, is based on a biomodal distribution model used to detect review spam. This model used dianping’s [15] real life dataset. Dianping [15] is a Chinese website that includes reviews about consumer products and retail services. Dianping dataset is the single largest dataset to have spam and non-spam classes. Each review is for a single individual. There have been references in the liter- ature of yelp datasets [14], with class labels but these datasets are much small in size when compared to dianping dataset [15]. The authors in [10] have reasonably argued their choice of dataset because of its large size and presence of labels. The models proposed by this article [10], outperform existing models on this huge dataset [15]. This paper [10] didn’t use any metrics like accuracy, precision, recall and f-score for its spam detection model evaluation. The next technique [13] included in Table 1 has used temporal patterns and co-bursting factors to identify spam in review dataset. This article [12] has also used dianping’s real life dataset [15]. Temporal features were extracted from the dataset time stamps [12]. The authors in this article [12] haven’t used metrics like precision, recall, f-measure and accuracy for evaluation of proposed model. The last technique [11] mentioned in Table 1 has highlighted the importance of temporal features in reviews, for spam detection. 
Temporal patterns have been discovered in the reviews of a reseller website [11]. The dataset [11] contained around 408,469 reviews. Each review in the dataset [11] can be identi?ed by a unique id. The authors in [11] used the dataset for suspicious store detection via identi?cation of singleton spam attacks. Human evaluators in [11] were used to perform validation of the results by reading reviews from all 53 stores and singling out stores that were suspicious. This technique did not employ metrics like precision, recall, f-score and accuracy for evaluation of its model. In conclusion, all approaches mentioned in Table 1, used time series based on the assumption that spammers work in groups when posting spam reviews. Their collec- tive manner of working produces bursts in times series of reviews and we can easily capture these bursts for spam detection. 5 Conclusion and Future Work This research paper highlighted state of the art methods that involved the use of time series for spam detection in online reviews. It made a critical comparative analysis of the techniques present in the literature. It also showed the details of the techniques of each related article in the literature related to time series based spam detection. Secondly, we also provided a summarized overview of all techniques, their used datasets and made a comparison of the metrics used for the evaluation of the proposed models. Our review paper can be used by experts as an asset while searching for state of the art relevant to time series based spam detection. Future work of this study includes proposing a hybrid 442 I. Muhammad et al. approach to time series based spam detection. The model can include more diverse feature engineering techniques and the use of supervised machine learning techniques for suspicious reviews ?ltered by time series. References 1. Jindal, N., Liu, B.: Opinion spam and analysis. In: Proceedings of the International Conference on Web Search and Web Data Mining - WSDM 2008 (2008) 2. Li, J., Ott, M., Cardie, C., Hovy, E.: Towards a general rule for identifying deceptive opinion spam. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (2014) 3. Dewang, R.K., Singh, P., Singh, A.K.: Finding of review spam through “Corleone, review genre, writing style and review text detail features”. In: Proceedings of the Second International Conference on Information and Communication Technology for Competitive Strategies - ICTCS 2016 (2016) 4. Mukherjee, A., Kumar, A., Lin, B., Wang, J., Hsu, M., Castellanos, M.: Spotting opinion spammers using behavioral footprints. In: Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 632–640 (2013) 5. Kc, S., Mukherjee, A.: On the temporal dynamics of opinion spamming. In: Proceedings of the 25th International Conference on World Wide Web - WWW 2016 (2016) 6. Heydari, A., Tavakoli, M., Salim, N.: Detection of fake opinions using time series. Expert Syst. Appl. 58, 83–92 (2016) 7. Fei, G., Mukherjee, A., Liu, B., Hsu, M., Castellanos, M., Ghosh, R.: Exploiting burstiness in reviews for review spammer detection. In: Kiciman, E., et al. (eds.) ICWSM. The AAAI Press (2013) 8. Dematis, I., Karapistoli, E., Vakali, A.: Fake review detection via exploitation of spam indicators and reviewer behavior characteristics. In: SOFSEM 2018: Theory and Practice of Computer Science Lecture Notes in Computer Science, pp. 581–595 (2017) 9. 
Li, J., Pedrycz, W., Jamal, I.: Multivariate time series anomaly detection: a framework of hidden Markov models. Appl. Soft Comput. 60, 229–240 (2017) 10. Li, H., Fei, G., Wang, S., Liu, B., Shao, W., Mukherjee, A., Shao, J.: Bimodal distribution and co-bursting in review spam detection. In: Proceedings of the 26th International Conference on World Wide Web - WWW 2017 (2017) 11. Xie, S., Wang, G., Lin, S., Yu, P.S.: Review spam detection via temporal pattern discovery. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD 2012 (2012) 12. Li, H., Fei, G., Wang, S., Liu, B., Shao, W., Mukherjee, A.: Modeling review spam using temporal patterns and co-bursting behaviors. arXiv preprint arXiv:1611.06625 (2016) 13. Amazon: Amazon (2018). http://snap.stanford.edu/data/amazon/productGraph/. Accessed 4 Feb 2018 14. Yelp: Yelp (2017). http://www.yelp.com. Accessed 6 Dec 2017 15. Dianping Chinese Review dataset. http://liu.cs.uic.edu/download/dianping/. Accessed 6 Apr 2018 16. Hamilton, J.D.: Time Series Analysis, vol. 2. Princeton University Press, Princeton (1994) A Systematic Review of Time Series 443 CNN with Limit Order Book Data for Stock Price Prediction Jaime Nino ˜ 1(B) , German Hernandez1 , Andr´es Ar´evalo1 , Diego Leon2 , and Javier Sandoval2 1 Universidad Nacional de Colombia, Bogot´a, Colombia {jhninop,gjhernandezp,ararevalom}@unal.edu.co 2 Universidad Externado de Colombia, Bogot´a, Colombia {diego.leon,javier.sandoval}@uexternado.edu.co Abstract. This work presents a remarkable and innovative short-term forecasting method for Financial Time Series (FTS). Most of the approaches for FTS modeling work directly with prices, given the fact that transaction data is more reachable and more widely available. For this particular work, we will be using the Limit Order Book (LOB) data, which registers all trade intentions from market participants. As a result, there is more enriched data to make better predictions. We will be using Deep Convolutional Neural Networks (CNN), which are good at pat-tern recognition on images. In order to accomplish the proposed task we will make an image-like representation of LOB and transaction data, which will feed up into the CNN, therefore it can recognize hidden pat-terns to classify FTS in short-term periods. We will present step by step methodology to encode ?nancial time series into an image-like represen-tation. Results present an impressive performance, ranging between 63% and 66% in Directional Accuracy (DA), having advantages in reducing model parameters as well as to make inputs time invariant. Keywords: Short-term forecasting · Deep Learning Convolutional Neural Networks · Limit Order Book Pattern recognition 1 Introduction Finance has become a highly sophisticated scienti?c discipline that depends on innovations from computer science to analyze huge ?ows of data in real time. Finance o?ers nonlinear relationships and large data sets on which Machine Learning (ML) ?ourishes, but they also impose tremendous challenges when applying these computational techniques, due to data noisiness, non linearities among other characteristics of ?nancial systems. Literature is vast when report-ing applications using machine learning methods for FTS modeling [4,6,9,11,15]. Works include Arti?cial Neural Networks, Support Vector Machines, among others. Lately, Deep Learning has emerged as a superior ML technique for a .o c Springer Nature Switzerland AG 2019 K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 
444–457, 2019. https://doi.org/10.1007/978-3-030-02686-8_34 CNN with Limit Order Book Data for Stock Price Prediction 445 wide variety of ?elds, including Image Recognition, Audio Classi?cation, Natu-ral Language Processing, as well as FTS Forecasting and Algorithmic Trading among others. In this work, we use a Convolutional Neural Network to predict movements of FTS. We will be working with both LOB and transaction (tick) data. LOB data contains all traders intentions to negotiate an asset at a par-ticular price and quantity at certain time t. LOB information is richer than transaction data, which only records prices and quantities exchanged at certain time t. In order to use CNN, we represent both LOB and tick data as images. Results are very competitive when compare to other DL approaches reported in [1,3,7,16,20], with the advantage of using the same trained model for di?erent assets. This paper continues as follows: Sect. 2 explains how LOB and tick data is transformed into images, Sect. 3 gives a brief summary of CNN, Sect. 4 explains the methodology to process and classify image data, Sect. 5 shows results and Sect. 6 gives ?nal remarks, conclusions, and further work opportunities. 2 Limit Order Book and Tick Data Transformation 2.1 De?nitions Limit Order Book. Order Book Data records market agents buy/sell inten-tions. It includes a time-stamp, quantity and price to buy/sell. This data is known as Limit Order Book (LOB). Formally, an order x = (p, q, t, s) sent at time tx with price px, quantity qx (number of shares) and side sx (buy / sell), is a commitment to buy/sell up to qx units of an asset at price px. Orders are sorted by arrival time t and quoted price p. Sell orders have larger prices than buy orders. [5,8,18] Some other useful concepts include [8,18]: – Spread size is the di?erence between the best sell and buy price. – Bid price is the highest price among all active buy orders at time t. Conversely, Ask price is the lowest price among all active sell orders at time t. Both are called best quotes. – An LOB L(t) is the set of all active orders at time t. Dynamics of LOB are complex [5,8], since it re?ects interactions among mar-ket agents with a di?erent point of views and di?erent trading strategies. For a particular time t, LOB concepts are illustrated in Fig. 1. When all recorded intentions are joined, they can be seen as an Image Fig. 2. On this image representation, y-axis represent prices, the x-axis is time and each point is a quantity willing to be traded. The darker the color the most quantity q at certain price p. In [19], authors used this graphic representation to cluster LOB-Patterns in order to build a classi?er. Based on this work, LOB data can be seen as a list of tuples (prices-quantities) where agents expect to negotiate. Numerically, this representation can be seen as a multivariate FTS1 . 1 Some considerations should be done, particularly related to the dimensionality of the FTS. 446 J. Nino ˜ et al. Fig. 1. LOB snapshot, taken from [8]. LOB Representation. For a set of successive timestamps, LOB data can be represented as a matrix-like object, where column labels are timestamps, row labels are prices and the content of each cell is the number of shares to bid/ask. Each cell contains a quantity q, with subindex side s, time t and price line p. Order side could be either ask a or bid b. Because there are order imbalances, price lines subindex are k for the ask side and j for the bid side (Table 1). Table 1. LOB matrix representation t0 t1 ... 
tn
AskPrice_k        q_{a,0,k}      ...    q_{a,n,k}
AskPrice_{k-1}    q_{a,0,k-1}    ...    q_{a,n,k-1}
...               ...            ...    ...
AskPrice_0        q_{a,0,0}      ...    q_{a,n,0}
BidPrice_0        q_{b,0,0}      ...    q_{b,n,0}
...               ...            ...    ...
BidPrice_{j-1}    q_{b,0,j-1}    ...    q_{b,n,j-1}
BidPrice_j        q_{b,0,j}      ...    q_{b,n,j}

Normalizing each q_{s,t,i} between 0 and 255 will produce a LOB grayscale image. However, there is a lot more information in the LOB data. Because each order is recorded individually and sorted by arrival time, it is possible to aggregate volumes at the same price. By doing so, one can obtain how many different orders (quotes) are placed at the same price. Formally, for each unique price p, all quantities q_k are added, where q = [q_1, q_2, ..., q_m] and m is the last order entered at price p. This information is very important because it is different to have many distinct agents interested at one particular price than just a few. However, under real market conditions, this fact goes hand in hand with how much volume (quantity) of the asset is available at that particular price p. In other words, it is important to have some sense of the distribution: it is different to have a lot of volume concentrated in just one participant than distributed across many. To introduce this information into our representation, we used max_{p_k}(q) for each unique price p at line k, signaling a sense of the volume distribution. As a result, we represent LOB data in a 4-channel representation, which can be seen as an RGBA image (Fig. 2), where:
– The R channel is only used for ask volumes q_a, 0 otherwise.
– The G channel is only used for bid volumes q_b, 0 otherwise.
– The B channel is only used to represent the total number of placed orders at a unique price p.
– The A channel is only used to represent the volume distribution for a unique price p, taking max_{p_k}(q).

Fig. 2. LOB as image, taken from [19].

Tick Data. Tick data records transactions, that is, prices and quantities exchanged for a particular asset. Formally, a transaction occurs when at time t the bid price equals the ask price. At this point, a transaction T = (p, q, t) occurs, where p_T is the price, q_T is the quantity of shares exchanged and t_T is the transaction time-stamp [18]. Tick data is a univariate time series (bivariate if volumes are included).

Tick Data Graphical Representation. As mentioned before, tick data is the most widely used data when modeling FTS, because it is easier to obtain. LOB data is more difficult to get and usually costs a lot, not just in terms of money but also in terms of storage. Transactions are heavily influenced by the intentions recorded in the LOB, but they do not have the richness of the LOB. Nevertheless, we expect that, in conjunction with the LOB, they will yield better results. In order to homogenize inputs, it is necessary to transform tick data into a matrix-like representation. In [22], the authors show a step-by-step methodology that transforms a univariate time series into an image representation. This transformation is called the Gramian Angular Field (GAF) [22], which consists of the following steps:
– Normalization of the time series to [-1, 1].
– Conversion of the time series from Cartesian to polar coordinates:

φ_i = arccos(x_i), -1 ≤ x_i ≤ 1;  r_i = t_i / N, t_i ∈ ℕ,    (1)

where t_i is the time stamp and N is a constant factor regularizing the span of the polar coordinate system.
– Deduction of the GAF matrix, defined as:

G = [ <x_1, x_1>  ...  <x_1, x_n>
      <x_2, x_1>  ...  <x_2, x_n>
      ...               ...
      <x_n, x_1>  ...  <x_n, x_n> ],

where <x, y> = x·y − √(1 − x²)·√(1 − y²).

The authors in [22] used this transformation for non-financial time series.
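A compact NumPy sketch of the three GAF steps just listed might look as follows. This is a generic implementation of the transformation from [22], not the authors' code, and the final rescaling to pixel intensities is an assumption that mirrors the grayscale normalization used above for the LOB representation.

```python
# Gramian Angular Field (summation form) for a univariate tick-price series,
# following the three steps above; generic sketch, not the paper's implementation.
import numpy as np

def gaf(series):
    x = np.asarray(series, dtype=float)
    # 1. rescale to [-1, 1]
    x = 2.0 * (x - x.min()) / (x.max() - x.min()) - 1.0
    # 2. polar encoding: phi = arccos(x); the radius t_i / N is implicit here
    phi = np.arccos(np.clip(x, -1.0, 1.0))
    # 3. Gramian matrix G[i, j] = cos(phi_i + phi_j)
    #    = x_i * x_j - sqrt(1 - x_i^2) * sqrt(1 - x_j^2)
    return np.cos(phi[:, None] + phi[None, :])

prices = [100.0, 100.5, 100.2, 101.0, 100.8, 101.5]
G = gaf(prices)
# Scale to 0-255 to view the matrix as a grayscale image channel (an assumption).
img = np.uint8(255 * (G + 1.0) / 2.0)
print(G.shape, img.dtype)   # (6, 6) uint8
```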
In this paper, we apply the same general steps in order to obtain a graphical version of the tick data, as illustrated in Table 2. One advantage of this transformation is that marks peaks of the input signal, based on intensity levels Table 2. This is useful for pattern recognition because it helps to di?erentiate price variances within the original signal. On the other hand, the transformed input can be rolled back to the original signal [22]. We expect that on this new space, patterns could be easier to identify since CNN’s learning capabilities have been proven good in frequency spaces. In fact, in a previous work we show how a wavelet transformation improve results over a pure time-space approach [1] 4 . 3 Deep Learning - Convolutional Neural Networks The concept of Deep Learning (DL) was adopted from Neuroscience [13], where the seminal authors [17] proposed a novel way of how our visual cortex processed data coming in through our visual system using a layered representation, starting in the retina all the way up to the visual cortex. Their proposal consisted of making sparse representations of input data, in order to get its appropriated representation. In other words, any instance of data can be reconstructed as a di?erent linear combination of the same components from sparse representations from the original data or to make more complex representations of the data at each layer by combining the representation of the previous layer [13]. 3 For full details please refer to [22]. 4 We used other DL topologies. CNN with Limit Order Book Data for Stock Price Prediction 449 Table 2. Original tick data vs Image representation of tick data Tick-data line chart Image representation This development was computational feasible only until 2006, when semi-nal authors [10], proposed a novel Unsupervised Learning algorithm to train deep architectures consisted of Restricted Boltzmann Machines (RBM). This model was capable of building complex representations of data at deeper layers by capturing sparse representations from the previous ones. At that time, this algorithm won an Image Classi?cation contest and it was established as the DL introduction [13]. Since its emergence, DL has facilitated the application and use of di?erent neural network topologies more successfully in di?erent ?elds, due to the fact that DL tackles the issue of gradient vanishing while training multilayer networks. As a result, di?erent network topologies are being used with DL, including tradi-tional Multilayer Perceptron (MLP), Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM) Networks, Deep Belief Networks (DBN) and Con-volutional Neural Networks (CNN). Each topology has its own particularities. In the case of CNN, they have been used for Image Processing and Classi?cation task. A CNN is a variation of a Multilayer Perceptron, which means that it is a feed-forward network, however, it requires less processing when compared to a MLP, due to the mechanism used to process input data. Moreover, CNN’s main characteristic is to be space invariant, that is due to the convolution operator that transform data inputs. CNN are biological inspired, trying to emulate what happens in mammal’s Visual Cortex, where neural cells are specialized to distinguish particular fea- 450 J. Nino ˜ et al. tures. Building blocks of a CNN architecture are in charge of doing this feature detection by activating or de-activating a set of neurons. 
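Before turning to the individual layers, a minimal numeric illustration of the two building blocks referred to here, a 2-D convolution followed by 2×2 max pooling, may help. It is purely didactic and unrelated to the network used in this work.

```python
# Didactic illustration of a 2-D convolution (edge-detecting kernel) followed by
# 2x2 max pooling, the two CNN building blocks described in the text.
import numpy as np

def conv2d(img, kernel):
    kh, kw = kernel.shape
    out = np.zeros((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def maxpool2x2(x):
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    return x[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

img = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [0, 0, 1, 1],
                [0, 0, 1, 1]], dtype=float)
kernel = np.array([[-1, 1]], dtype=float)  # responds to the dark-to-bright vertical edge
feat = conv2d(img, kernel)                 # activation of 1 exactly along the edge
print(maxpool2x2(feat))                    # subsampled feature map, shape (2, 1)
```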
Since market agents' decisions are mostly made from visual analysis of price changes and events in the LOB, we expect that an algorithm can learn patterns in order to help trigger trading decisions. In fact, [18, 19] showed that a visual dictionary could be constructed from LOB data and that this dictionary had predictive capabilities. The two main building blocks of a CNN are the convolution layer and the pooling layer, which in conjunction with a dense layer complete a CNN.

Convolution Layer. It is in charge of applying the convolution operator to the input matrix; in other words, it applies a kernel to filter the data input. Depending on the parameters used, it can reduce or maintain the input's dimensionality. The reason to convolve is to identify edges, that is, to identify or separate features that can later be used to construct more complex representations in deeper layers.

Pooling Layer. It is a local operator that takes the convolution output and maps subregions into a single number. The pooling operator can extract the maximum value of the mapped subregion (max pooling) or the average value of the mapped subregion (average pooling). In other words, it obtains subsamples of the convolution layer's output. Usually both layers are treated as one layer in the CNN topology; however, it is not necessary to have exactly one convolution and one pooling layer. Additionally, CNN topologies usually include several layers of convolution plus pooling, so that the network extracts simpler features at the first layers and, by combining those, can learn more complex features in deeper layers.

Dense Layer. Finally, the deepest convolutional layer is connected to a dense (fully connected) layer, from which the network obtains its outputs. As mentioned before, the CNN topology may have one or more dense layers.

AlexNet and LeNet: Well-Known CNN Architectures. LeNet-5 is a CNN created by [14] and was aimed at hand-written digit recognition. It consists of 7 layers (Input, Conv + Pool, Conv + Pool, Dense + Output). At that time, computing resources were scarce, creating a constraint for this technique. However, as computing resources improved in performance and cost, training this particular architecture became easy and it has become a baseline in image recognition contests. AlexNet was created in 2012 and became famous because it reduced the classification error in an image recognition contest to 15.3% at that time. Nowadays, the classification error is much lower. Since AlexNet was the pioneer, it has become a baseline architecture, like LeNet. AlexNet took advantage of computing developments, particularly parallel processing through Graphics Processing Units (GPUs). It was created by [12]. It has more filters than LeNet as well as stacked convolution layers; as a result, it is deeper and has more parameters.

We decided to compare different CNN topologies in terms of directional accuracy (DA), in order to analyze the advantages and disadvantages of each one. We will make the comparison with another self-created CNN topology. The next section gives a step-by-step explanation of our experiment.

4 Classifying Financial Time Series with CNN

4.1 Why a CNN for FTS Classification

– Firstly, DL models have demonstrated greater effectiveness in both classification and prediction tasks, in different domains such as video analysis, audio recognition, text analysis and image processing.
Their superiority is due to the fact that they are able to learn useful representations from raw data, avoiding the local minimum issue of ANNs, by learning in a layered way using a combination of supervised and unsupervised learning to adjust the weights W.
– Secondly, DL applications in computational finance are limited [2, 3, 7, 21, 23] and, to the best of our knowledge, there is no publication applying CNNs to FTS, particularly using LOB data for short-term forecasting.
– Thirdly, CNNs are good at pattern recognition, and real traders have told us that they try to identify patterns by following buy/sell intentions in numeric form. In a previous work, [18] identified volume barrier patterns to translate them into trading decisions, and [19] identified visual patterns and clustered them into a bag-of-words model to predict market movements. As a result of these works, we decided to extend them and use a more suitable technique for pattern recognition, such as a CNN, on image-like representations of market data.
– Finally, by changing the input space (from time to frequency), we expect that the CNN will recognize patterns more effectively; indeed, the authors in [1] improved their results by using wavelets to represent high-frequency data of several financial assets. Even though our images are not natural ones, we expect that the CNN's layers are capable of distinguishing simple frequency changes (edges) at lower layers in order to identify more complex patterns at deeper ones.

4.2 Experimental Setup

– Data acquisition: The original dataset is composed of LOB and transaction data for 12 stocks listed on the Colombian Stock Market, from Feb 16, 2016 to Dec 28, 2017. The dataset includes 184,450 LOB files and 612,559 ticks (transactions), totaling 590 MB on disk (data provided by DataDrivenMarket Corporation).
– Data preparation: For each stock, data normalization was conducted, taking into account some considerations which include the handling of missing orders at some price levels in the LOB data, some liquidity constraints and LOB events. Details are given in the next subsection.
– Data transformation: For each stock, both LOB data and tick data are transformed into an image-like representation, following the methodology previously explained.
– CNN modeling: We chose a base CNN architecture and trained and tested it with the transformed data.
– Model comparison across different CNN architectures: We used another two CNNs, which mimic the standard LeNet and AlexNet architectures, to compare against the proposed model.
– CNN comparison against other DL topologies: We compare the results obtained in this work against others that have been used for similar problems (short-term forecasting) but with different Deep Learning topologies (RNN, LSTM, Multilayer Perceptron, DBN).

The following paragraphs provide further details of our experimental setup.

4.3 Data Preparation

Data Normalization. For each stock, prices, volumes (quantities) and the number of orders at the same price were normalized into (0, 1]. Given the fact that LOB data may have price levels with no demand/offer, minimum values were shifted by a small factor so that they take a small value above zero. This is because empty cells in the LOB have a 0 value, so we can differentiate a missing entry in the LOB from an entry with a very low volume or just one order at a certain price p. Data normalization by stock facilitates magnitude equilibrium across all stock data, regardless of their nominal prices or volumes. In other words, we homogenize the image representation in its different dimensions: price, quantities and number of orders.
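To make this rule concrete, the sketch below scales one channel of one stock into (0, 1] while keeping empty price levels at exactly 0. It is only an illustration of the idea: the offset eps stands in for the unspecified "small factor" mentioned above, and the function name is ours, not part of the authors' code.

```python
import numpy as np

def normalize_channel(values, eps=1e-3):
    """Scale one LOB channel (prices, volumes or order counts) of one stock into (0, 1].

    Empty price levels are encoded as 0 and stay at 0, so they remain
    distinguishable from very small but real entries.
    """
    values = np.asarray(values, dtype=float)
    filled = values > 0
    out = np.zeros_like(values)
    if filled.any():
        lo, hi = values[filled].min(), values[filled].max()
        span = (hi - lo) or 1.0                      # guard against a constant channel
        out[filled] = eps + (1.0 - eps) * (values[filled] - lo) / span
    return out

volumes = np.array([0.0, 120.0, 0.0, 4500.0, 800.0])
print(normalize_channel(volumes))                    # empty levels stay 0, smallest entry ~ eps
```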
Handling of Liquidity Constraints. Given the fact that the Colombian market is not highly liquid, we only took, for each stock, LOB data that had enough entries in a single trading day. That is, we took trading days which had more than 100 LOB files per stock, which is equivalent to having at least one LOB event for any given stock every three and a half minutes on average. For classification purposes, it would not matter to have low-liquidity days mixed with high-liquidity days. However, for practical purposes, liquidity constraints are very important in financial markets, because spreads may vary widely as liquidity drops. That is the reason why we chose samples corresponding to highly liquid days.

Handling of LOB Events. We took an event-based approach, that is, we analyze a fixed number of LOB events (10 in this case). This means that the LOB matrix explained in Sect. 2 (Table 1) was partitioned into fixed segments of 10 events, and we took all of the ticks that happened between these 10 LOB records to create the corresponding image for the tick data. Figure 3 illustrates the procedure described above.

Fig. 3. LOB events.

Handling of LOB Depth. LOB data may have many different lines, or prices, on each side (bid/ask). Depending on market conditions, depth varies widely; that is, you will not always have a symmetric number of lines for each book side. We decided to work with LOB data of 10 lines of depth, that is, the first 10 different prices for each side. Prices start from the best quotes and move down or up depending on the side (bid/ask).

Additional Considerations. It is important to note the following:
– Price dynamics may produce a price matrix with more than 20 rows (prices), as in Fig. 1. In other words, we will have unequal heights for each 10-event LOB image. Table 3 shows the results graphically.
– Prices with no volume will have a 0 value. This value will always be different from the lowest volume after normalization, as mentioned before.
– To make the LOB images' size homogeneous for modeling purposes, we resize each image to a width of 10 and a height of 40. Individual price matrices have different heights because of price dynamics; in other words, there is a different set of prices at each time t, depending on traders' intentions.

4.4 CNN Modeling

Data Input. Four-channel images are used, one for the LOB data and another for the tick data. A five-dimensional tensor is used for the data input, with size [n, 2, 10, 40, 4]. The first dimension is the number of samples, the second one is the number of image categories (LOB/tick), and the other three are the image dimensions (width, height, channels).

Table 3. LOB data images (image examples not reproduced here).

Data Labeling. Data will be classified into three different classes:
– Class 0: Upward movement
– Class 1: Downward movement
– Class 2: No trending movement
The class specification was based on how the following set of ticks behaves after a set of 10 LOB events. An analysis of the ticks was done to set the thresholds, which Table 4 illustrates.

Table 4. Three-class rules

Price direction     Rule                                                                Class
Upward movement     Last tick price above +0.03% vs. last tick of the previous window   0
Downward movement   Last tick price below −0.03% vs. last tick of the previous window   1
Flat movement       Otherwise                                                            2

CNN Architecture. We use a standard CNN architecture, which consists of (Input + Conv + Pool + Conv + Pool + Dense + Dropout); the input images' size is 10 × 40.
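For concreteness, a topology of this shape could be sketched with Keras roughly as follows. The paper states that TensorFlow was used but does not report filter counts or kernel sizes, so the numbers below are illustrative assumptions, not the authors' configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers

inputs = tf.keras.Input(shape=(40, 10, 4))                 # height x width x 4 RGBA-like channels
x = layers.Conv2D(16, 3, padding="same", activation="relu")(inputs)
x = layers.MaxPooling2D(2)(x)
x = layers.Conv2D(32, 3, padding="same", activation="relu")(x)
x = layers.MaxPooling2D(2)(x)
x = layers.Flatten()(x)
x = layers.Dense(64, activation="relu")(x)
x = layers.Dropout(0.4)(x)                                 # the 40% dropout reported in the setup
outputs = layers.Dense(3, activation="softmax")(x)         # up / down / flat classes
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```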
We compare it to AlexNet and LeNet. We had to make some modifications to the input images' size (20 × 40) as well as to the filter sizes in some convolution layers, particularly for the AlexNet-type configuration. We set up two different experiments, one using LOB data only and another using both LOB and tick data, meaning that the second one had more input information. We used TensorFlow. Special training considerations included a dropout rate of 40% and a batch size of 100. The dataset was split into 90% for training and 10% for testing. The number of samples was 67,348 images. Moreover, a similar setup was built taking into account only LOB data, which means a whole set of experiments working with 2D convolutions.

5 Results

5.1 Model Comparison Across Different CNN Architectures

The CNNs were used to classify the three target classes (Up, Down, Flat). Table 5 shows the performance of the three different architectures over the testing samples. The combination of LOB and tick data as model features significantly increased the model accuracy; it achieved accuracies greater than 65%. LeNet* and AlexNet* had a better performance than the proposed topology, but they require much more computational power for training, which could become a serious problem in a real high-frequency trading strategy. On the other hand, the proposed CNN topology sacrifices some performance (less than 1%), but it is simpler and easier to train. This property is useful in a real environment, given that it allows the model to be retrained and deployed.

Table 5. Result summary for different architectures

Experiment   Topology             Data input   Performance
2D-LeNet     LeNet*               LOB          59.56%
2D-AlexNet   AlexNet*             LOB          63.15%
2D-Own       Other CNN topology   LOB          58.23%
3D-LeNet     LeNet*               LOB + Tick   66.09%
3D-AlexNet   AlexNet*             LOB + Tick   66.83%
3D-Own       Other CNN topology   LOB + Tick   65.31%

5.2 Model Comparison Against Other DL Topologies

As observed in Table 6, the proposed model is very competitive, with the advantage that one model runs for several assets.

Table 6. Comparison against other DL topologies

DL topology                 Classes   Data used                    Directional accuracy
Multilayer Perceptron [1]   2         1 stock, tick data           66%
Deep Belief Network [16]    2         1 stock, LOB + tick data     57%
Proposed model (CNN)        3         12 stocks, LOB + tick data   65.31%

6 Conclusion and Future Research

Using a CNN for FTS prediction worked well. The directional accuracy shows that the results are very competitive, in fact better than other approaches tested before [1, 16, 19]. As expected, performance improves when both LOB and tick data are used in conjunction, and the main reason is simple: there is more market information. The image-like representation is useful and could even be extended; that is, it is possible to have more channels in the original input image (matrix).

Perceived advantages
– One network for multiple assets. This is not usually the case, given that each asset has its own dynamics. The image-like representation homogenizes the inputs, resulting in an image representing market information and allowing patterns to be found across the whole image set, regardless of the asset.
– Lifetime of the trained model. In financial applications frequent retraining is the norm. This approach extends the lifetime of the trained model due to the time invariance associated with images.

Perceived disadvantages
– It is a data-intensive technique. The more images available for training, the better the results.
– Training times are large, particular for complex architectures such as AlexNet, which uses several channels and several layers. – Preprocessing could be tricky. There are a lot of details to take into account when transforming raw data. In our experience, we suggest a trade-o? analysis between training times and lifetime of the trained model. For real implementations with an expected lifetime ranging from 5 min to a couple of hours, we think is hugely advantageous. This model should be tested with data from more liquid markets, to check preprocess-ing times as well as performance. We think that there are a lot of possibilities for improvement, including the use of combined approaches (LSTM and CNN), and to code more information in more channels, for example, technical information. References 1. Ar´evalo, A., Nino, J., Hern´andez, G., Sandoval, J.: High-Frequency Trading Strat-egy Based on Deep Neural Networks, pp. 424–436 (2016). https://doi.org/10.1007/ 978-3-319-42297-8 40 2. Arnold, L., Rebecchi, S., Chevallier, S., Paugam-Moisy, H.: An introduction to deep learning. In: ESANN (2011). https://www.elen.ucl.ac.be/Proceedings/esann/ esannpdf/es2011-4.pdf 3. Chao, J., Shen, F., Zhao, J.: Forecasting exchange rate with deep belief networks. In: The 2011 International Joint Conference on Neural Networks, pp. 1259–1266. IEEE (2011). http://ieeexplore.ieee.org/articleDetails.jsp?arnumber=6033368, http://ieeexplore.ieee.org/xpls/abs all.jsp?arnumber=6033368 4. Chen, M., Ebert, D., Hagen, H., Laramee, R.S., van Liere, R., Ma, K.L., Ribarsky, W., Scheuermann, G., Silver, D.: Data, information, and knowledge in visualiza-tion. IEEE Comput. Graph. Appl. 29(1), 12–19 (2009) 5. Cont, R., Stoikov, S., Talreja, R.: A stochastic model for order book dynamics. Oper. Res. 58, 549–563 (2010) CNN with Limit Order Book Data for Stock Price Prediction 457 6. De Goijer, J., Hyndman, R.: 25 years of time series forecasting. J. Forecast. 22, 443–473 (2006) 7. Ding, X., Zhang, Y., Liu, T., Duan, J.: Deep learning for event-driven stock pre-diction. In: Proceedings of the Twenty-Fourth International Joint Conference on Arti?cial Intelligence (ICJAI) (2015). http://ijcai.org/papers15/Papers/IJCAI15- 329.pdf 8. Gould, M.E.A.: Limit order books. Quant. Financ. 13, 42 (2010) 9. Hamid, S., Habib, A.: Financial forecasting with neura networks. Acad. Acc. Financ. Stud. J. 18, 37–56 (2014) 10. Hinton, G.E., Osindero, S., Teh, Y.W.: A fast learning algorithm for deep belief nets. Neural Comput. 18(7), 1527–1554 (2006). https://doi.org/10.1162/neco. 2006.18.7.1527, pMID: 16764513 11. Huang, G.E.A.: Trends in extreme learning machines: a review. Neural Netw. 61, 32–48 (2015) 12. Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classi?cation with deep con-volutional neural networks. In: Proceedings of the 25th International Conference on Neural Information Processing Systems, NIPS 2012, vol. 1, pp. 1097–1105. Curran Associates Inc., USA (2012). http://dl.acm.org/citation.cfm?id=2999134.2999257 13. Laserson, J.: From neural networks to deep learning: zeroing in on the human brain. XRDS 18(1), 29–34 (2011). https://doi.org/10.1145/2000775.2000787 14. Lecun, Y., Bottou, L., Bengio, Y., Ha?ner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998) 15. L¨angkvist, M., Karlsson, L., Lout?, A.: A review of unsupervised feature learn-ing and deep learning for time-series modeling. Pattern Recognit. Lett. 42, 11–24 (2014). 
http://www.sciencedirect.com/science/article/pii/S0167865514000221 16. Nino, J., Hernandez, G.: Price direction prediction on high frequency data using deep belief networks. In: Applied Computer Sciences in Engineering, pp. 74–83. Springer (2016) 17. Olshausen, B.A., Field, D.J.: Natural image statistics and e?cient coding. Net-work Comput. Neural Syst. 7(2), 333–339 (1996). https://doi.org/10.1088/0954- 898X 7 2 014, pMID: 16754394 18. Sandoval, J.: Empirical shape function of the limit-order books of the USD/COP spot market. In: ODEON, p. 7 (2013). https://ssrn.com/abstract=2408087 19. Sandoval, J., Nino, J., Hernandez, G., Cruz, A.: Detecting informative pat-terns in ?nancial market trends based on visual analysis. Procedia Com-put. Sci. 80, 752–761 (2016). http://www.sciencedirect.com/science/article/pii/ S1877050916308407. International Conference on Computational Science 2016, ICCS 2016, 6-8 June 2016, San Diego, California, USA 20. Shen, F., Chao, J., Zhao, J.: Forecasting exchange rate using deep belief networks and conjugate gradient method. Neurocomput. 167, 243–253 (2015). https://doi. org/10.1016/j.neucom.2015.04.071 21. Takeuchi, L., Lee, Y.: Applying Deep Learning to Enhance Momentum Trading Strategies in Stocks (2013) 22. Wang, Z., Oates, T.: Encoding Time Series as Images for Visual Inspection and Classi?cation Using Tiled Convolutional Neural Networks (2015). https://pdfs. semanticscholar.org/32e7/b2ddc781b571fa023c205753a803565543e7.pdf 23. Yeh, S., Wang, C., Tsai, M.: Corporate Default Prediction via Deep Learning (2014). http://teacher.utaipei.edu.tw/cjwang/slides/ISF2014.pdf Implementing Clustering and Classi?cation Approaches for Big Data with MATLAB Katrin Pitz(&) and Reiner Anderl Technische Universität Darmstadt, 64283 Darmstadt, Germany pitz@dik.tu-darmstadt.de Abstract. Data sets grow rapidly, driven by increasing storage capacities as well as by the wish to equip more and more devices with sensors and con-nectivity. In mechanical engineering Big Data offers new possibilities to gain knowledge from existing data for product design, manufacturing, maintenance and failure prevention. Typical interests when analyzing Big Data are the identi?cation of clusters, the assignment to classes or the development of regression models for prediction. This paper assesses various Big Data approaches and chooses adequate clustering and classi?cation solutions for a data set of simulated jet engine signals and life spans. These solutions include k-means clustering, linear discriminant analysis and neural networks. MATLAB is chosen as the programming environment for implementation because of its dissemination in engineering disciplines. The suitability of MATLAB as a tool for Big Data analysis is to be evaluated. The results of all applied clustering and classi?cation approaches are discussed and prospects for further adaption and transferability to other scenarios are pointed out. Keywords: Big DataClusteringClassi?cationK-means Discriminant analysisNeural networksMATLAB 1 Introduction When it comes to Big Data, there is no solitary, generally agreed-on de?nition, neither in academia nor in industry [1]. However, most experts agree on Big Data exceeding common storing capacities and computing methods [2]. It has also become popular to outline Big Data via the 3 Vs introduced by [3]: volume, velocity, and variety. Volume means that an increasing amount of data is to be handled, even though the speci?c numbers for when to start labeling data as Big Data vary. 
Velocity stresses the fact that data is generated, processed or modi?ed at high speeds, in some applications close to real time. Variety describes the state the data is in. This can range from structured data to semi-structured or unstructured data. Text written or spoken by humans is often referred to as unstructured data. Though, [2] emphasizes that many sources of Big Data are not as unstructured as they may seem at ?rst glance, but that it rather takes some extra time and effort to ?nd the logical flow they do possess. In addition to the three Vs wider de?nitions have been proposed over the years leading to ?ve or even more Vs depending on the source consulted. For example, [4] presents value and veracity as additional Vs with value considering the potential to contribute to entrepreneurial or © Springer Nature Switzerland AG 2019 K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 458–480, 2019. https://doi.org/10.1007/978-3-030-02686-8_35 scienti?c progress and veracity assessing the consistency and trustworthiness of the data. Some other characteristics of Big Data are its exhaustiveness (capturing entire populations or systems), flexibility (offering the possibility to add new aspects or expand in size) and relational character (allow for linking to other data bases) [1]. The sources and drivers of Big Data are numerous. Web data is referred to as the original Big Data [2] and often involves interests such as understanding customer behavior. It may include social media data, interaction data or voluntarily submitted data. Authors in [5] names mobile sensors, video surveillance, smart grids, geophysical exploration and medical experimentation as further drivers of the data deluge. In the ?eld of mechanical engineering the focus lies on data generated by machinery. A growing number of sensors and actuators are embedded into technical systems so that some even reach the state of operating completely autonomously. Furthermore, the interest in monitoring devices and equipment while it is in use is increasing rapidly. Cameras, GPS units and radio frequency identi?cation (RFID) tags are only some examples of how this development currently manifests itself [1]. Big Data is closely linked to the ?elds of business intelligence (BI) and data mining. It can be considered an extension of BI solutions as they are primarily built to analyze structured data whereas Big Data approaches aim to handle all kinds of data [6]. Still, BI solutions should not be discarded too quickly for the sake of Big Data strategies. It seems more promising to integrate and conjoin Big Data into the data a business already has and the methods that proved successful throughout its history [2, 7]. Data mining, on the other hand, denotes a set of methods to make use of data by discovering similarities, patterns, trends, outliers or clusters [8]. Established data mining techniques are focused on analyzing traditional, structured data [6]. Big Data now aims at larger amounts of data which are more complex in their structure. This does not necessarily mean that existing methods need to be overthrown and replaced, but it at least poses questions of scalability and adaption [2]. Moreover, it is to be discussed whether the tried and trusted data base language SQL (structured query language) will still serve the purposes. NoSQL, columnar databases, massively parallel processing (MPP) databases, cloud computing and frameworks like Hadoop are some of the new technologies on the rise [2, 6]. 
This paper addresses various Big Data approaches, highlights their advantages as well as their shortcomings and describes how they can be implemented with the help of MATLAB 2017a, an established software tool for engineering applications [9]. The data on which the implementation and validation is based stems from the National Aeronautics and Space Administration (NASA) – Prognostics Center of Excellence (PCoE). This institution collects and provides data sets from science and engineering that are free of cost and allow researchers and practitioners to explore and enhance data mining and machine learning algorithms [10]. The focus of this paper lies on clustering and classi?cation. In addition to implementation matters, general conclusions on MATLAB’s suitability for Big Data purposes are drawn and the scalability of existing MATLAB code is discussed. The paper divides into seven sections. The introduction given in this section is followed by a description of the data base in Sect. 2. Section 3 explains the criteria based on which the approaches for clustering and classi?cation are chosen and outlines Implementing Clustering and Classi?cation Approaches 459 their theoretical foundations. The implementation of these approaches in MATLAB is part of Sect. 4. Section 5 presents and discusses the results of both clustering and classi?cation. The paper concludes with an outlook on future work in Sect. 6 and a summary in Sect. 7. 2 Database The data set chosen for this paper is part of the NASA PCoE data repository. This repository currently comprises 16 data sets ranging from biology to electrical or mechanical engineering topics. What they all have in common is a time dependency and an information on failure, i.e. they represent time series from a speci?c starting condition until failure [10]. As this work is located in the ?eld of mechanical engi-neering a data set with an according background is chosen: “6 Turbofan Engine Degradation Simulation Data Set”. This set, introduced by [11], deals with a classical jet engine with the following main components: low pressure compressor (LPC), the high pressure compressor (HPC), the outer shaft (N1), the core shaft (N2), the high pressure turbine (HPT) and the low pressure turbine (LPT). The data are the results of simulations using an engine model. It is not a record of signals transmitted by engines physically existing and operated by airlines. Variations in the production quality of the original engines and degradation effects are included in the simulation. Each time series in the data set starts at an arbitrary point in the engine’s life where it is not as good as new anymore but has not failed yet. The data set separates into training data and test data. The training data serve to train a model whereas the test data are used to validate the accuracy of the created model. The time series from the training data provide the time of failure. They contain all data points from starting condition to failure. The test data time series, on the contrary, cut off at a point prior to engine failure. The created model can then be used to estimate the remaining useful life (RUL) of the engine. Time series enclose 21 different signals an engine would provide, e.g. temperatures, pressures, shaft speeds and amounts of fuel and coolant. Three more signals that are useful to determine an engine’s operation condition are available in each time series: flight altitude, Mach number, and throttle angle. 
However, these signals shall not be discussed in more detail, as one of the paradigm shifts in applying Big Data approaches is to focus more on what the data themselves reveal on a statistical level and less on building physical models that are comprehensible in all their interrelationships [12]. The entire data set is divided into five different subsets varying in complexity. Some subsets show 6 different operating conditions, some only show 1 operating condition. Analogously, some subsets exhibit 2 different failure mechanisms while others only have 1 failure mechanism. This information on subsets, operating conditions and failure mechanisms is available with the data set itself. Table 1 gives an overview of how the data set divides into subsets.

Table 1. Subsets of the engine data set

Subset   Number of operating conditions   Number of failure mechanisms
1        1                                1
2        6                                1
3        1                                2
4        6                                2
5        6                                1

The size of the chosen data set is 12 MB. This is a relatively small size, considering that some authors claim the lower boundary of Big Data to be several terabytes or petabytes [4]. However, a clear definition of how big Big Data has to be does not exist [5]. Even though the data set may not have the highest volume, the remaining V criteria should not be dismissed. For example, it exhibits high variety and value characteristics. Furthermore, it is feasible to test Big Data approaches with this data set while simultaneously allowing for upscaling to larger amounts of data in the implementation.

3 Chosen Approaches

There are different motivations for building models based on the jet engine data described above. Typical engineering questions, which would be of interest for an engine operator as well, are:
• Are operating conditions and failure mechanisms identifiable based on the signals alone?
• How should an alarm system for imminent engine failures be designed?
• How can the remaining useful life of an engine be estimated?
In terms of data analysis, the first question relates to clustering, the second to classification and the third to regression or, more generally, prognostics. This paper focuses on the former two as they lay a base for further prognostic tools. Moreover, assessing clustering and classification techniques allows a comparison of supervised versus unsupervised learning [13].

3.1 Clustering

Clustering aims at identifying different groups of related data within a larger data set. The grouping is carried out based on the mere data. No additional information stating which point or series belongs to which group is available. A verification of whether or not the data have been clustered correctly is not possible. Clustering is therefore considered a method of unsupervised learning [13]. For the chosen data set it is known that 6 different operating conditions and two different failure mechanisms exist. However, it cannot be retrieved which time series is from which group. It can be considered a classical clustering scenario, extended by the fact that the number of clusters is explicitly given. Data within one cluster shall be as homogeneous as possible whereas the clusters themselves shall be as distant from one another as possible. Different distance measures are a main distinguishing feature between different clustering methods [14]. Established methods include hierarchical clustering, k-means clustering and Gaussian mixture models.
Hierarchical clustering methods do not need a priori information on how many clusters are expected, but reveal an initially unknown cluster structure within the data set. The major drawback is that hierarchical methods are accompanied by high computational costs [15]. k-means clustering and Gaussian mixture models both belong to the field of partitioning clustering. They both need the information on the number of clusters to be found. k-means clustering strictly assigns data points to clusters whereas Gaussian mixture models calculate belonging probabilities. For this work, k-means clustering is chosen as it is computationally efficient [15] and well compatible with MATLAB and other Big Data technologies such as Hadoop and MapReduce.

The basic idea of deploying k-means clustering is to divide all n elements into k disjoint clusters so that the Euclidean distance between elements and cluster centers is minimized. The clusters' centers are denoted in the matrix M = [m_1, ..., m_k]. Each vector m_j contains the center of the j-th cluster C_j, which is calculated as follows:

  $m_j = \frac{1}{n_j} \sum_{x_i \in C_j} x_i,$  (1)

with n_j being the number of elements belonging to the j-th cluster and x_i the values of its i-th observation. The algorithm for performing k-means clustering can then be described by the following four steps [14]:
• Initialize clusters by specifying cluster centers, either randomly or deliberately. Calculate the preliminary matrix M based on the specified cluster centers.
• Assign each element in the data set to its nearest cluster C_l, i.e.

  $x_i \in C_l \ \text{ if } \ \lVert x_i - m_l \rVert < \lVert x_i - m_j \rVert \ \text{ for } i = 1, \ldots, n, \; j \neq l, \; j = 1, \ldots, k.$  (2)

• Update the matrix M based on the current assignment of elements to clusters using (1).
• Repeat the second and third step until no further changes occur in the cluster allocation.

k-means clustering is dependent on the initial choice of cluster centers. The algorithm converges to a local minimum of distances between elements and centers. Depending on the initial centers, the final clusters may vary. Choosing them therefore becomes an essential part of performing k-means clustering. However, choosing them by hand is laborious and opposed to the idea of evaluating Big Data as automatically as possible. A purely random selection of initial cluster centers, on the other hand, may lead to long run times of the algorithm and clusters that are not close to the optimal solution [16]. An algorithm that overcomes both shortcomings by choosing starting centers based on weighted probabilities that account for the structure in the data is called k-means++ and was first proposed in [17]. k-means++ chooses the first center c_1 randomly from all elements available in the data set. It then calculates the distances D(x_i) of all elements to the first center. The following center c_2 is chosen based on a weighted probability, ensuring that elements are more likely to be chosen the higher their D^2 value, i.e. their squared distance from the first center, is. After that, D(x_i) is calculated again for each element, now denoting the smallest distance between x_i and any center chosen so far. The next center is chosen based on the updated D^2 probabilities. These last two steps are repeated until all k starting centers have been set.
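To make the seeding and update steps concrete, here is a minimal NumPy sketch of k-means++ initialization followed by the Lloyd iteration of Eqs. (1) and (2). The function names and the convergence check are illustrative; the authors rely on MATLAB's built-in implementation (Sect. 4), not code like this.

```python
import numpy as np

def kmeans_plus_plus_init(X, k, rng):
    """k-means++ seeding: first center uniform at random, later centers D^2-weighted."""
    centers = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        diffs = X[:, None, :] - np.asarray(centers)[None, :, :]
        d2 = np.min((diffs ** 2).sum(axis=-1), axis=1)     # squared distance to nearest chosen center
        centers.append(X[rng.choice(len(X), p=d2 / d2.sum())])
    return np.asarray(centers)

def kmeans(X, k, max_iter=100, seed=0):
    """Lloyd iteration (Eqs. 1 and 2) on top of k-means++ starting centers."""
    rng = np.random.default_rng(seed)
    M = kmeans_plus_plus_init(X, k, rng)
    for _ in range(max_iter):
        d2 = ((X[:, None, :] - M[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(d2, axis=1)                      # assignment step, Eq. (2)
        new_M = np.array([X[labels == j].mean(axis=0) if np.any(labels == j) else M[j]
                          for j in range(k)])               # center update, Eq. (1)
        if np.allclose(new_M, M):
            break
        M = new_M
    return labels, M

# Toy usage: two well-separated blobs are recovered as two clusters.
X = np.vstack([np.random.default_rng(1).normal(0, 0.1, (50, 3)),
               np.random.default_rng(2).normal(5, 0.1, (50, 3))])
labels, centers = kmeans(X, k=2)
```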
Modifications of k-means clustering are k-medians clustering and k-medoids clustering. The use of medians makes the method more robust with respect to outliers. k-medoids clustering extends the original method by requiring that each cluster center coincides with an element of the data set. This makes the method applicable to categorical data as well. However, both extensions are not necessary for the data considered in this paper, so k-means clustering is chosen for the implementation. Prior to running the classical k-means clustering, the above-mentioned k-means++ is applied to determine the cluster centers to start with.

3.2 Classification

Classification follows a similar aim as clustering but is part of supervised learning [13]. It also intends to sort data into groups, in this case called classes, which are as homogeneous as possible. What sets classification apart from clustering is that in classification procedures information on the actual class affiliation is available. The model is trained with a set of training data for which the true class of each element is known. The trained model can then be used to assign new data, for which the class affiliations are unknown, to the appropriate classes. The main interest in the jet engine scenario lies on the remaining useful life of the individual engines. An operator of engines might wish to know which engines are close to failure so that failure may be avoided by means of shop visits and maintenance. Proximity to failure is indicated by low RUL values, given in flight cycles; e.g. RUL = 5 means that the engine will only be able to perform five more flights before it fails. Creating a warning system based on RUL values and their criticality is a legitimate, self-evident use case for classification. Three classes are defined in Table 2.

Table 2. Classes for engine failure warning system

Class no.   Range of values   Significance                    System action
1           0 ≤ RUL ≤ 25      Engine very close to failure    Alarm
2           25 < RUL ≤ 125    Engine heading toward failure   Warning
3           RUL > 125         Normal operation                None

Classification methods include decision trees, k-nearest neighbors, support vector machines, naive Bayes, and discriminant analysis. An extensive introduction can be found in [18]. All methods have advantages as well as shortcomings, so that a general statement on which method is superior to another without considering the specific use case is hardly possible. A problem of classification that might arise regardless of the chosen method is the phenomenon of overfitting. Overfitting denotes the effect that a classification algorithm adapts overly well to the training data, i.e. scores a high accuracy within this subset of data, but has a high error rate when classifying test data [8]. One way to reduce overfitting is the use of cross validation. The data set is then divided into k subsets. The algorithm is trained with k − 1 of these sets, leaving the k-th one for validation. This procedure is repeated until each subset has once been the validation set. It obviously increases the computational cost compared to the more basic holdout validation, which divides the data set into training and validation data only once. It can be considered a trade-off between overfitting reduction and computational efficiency. In this work, the decision is taken in favor of holdout validation.
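The three classes of Table 2 amount to a simple thresholding rule on the RUL value. The hypothetical helper below (not part of the authors' MATLAB code) just makes the class boundaries explicit.

```python
def warning_class(rul):
    """Map a remaining-useful-life value (in flight cycles) to the classes of Table 2."""
    if rul <= 25:
        return 1   # alarm: engine very close to failure
    if rul <= 125:
        return 2   # warning: engine heading toward failure
    return 3       # normal operation

assert [warning_class(r) for r in (5, 25, 80, 200)] == [1, 1, 2, 3]
```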
Linear Discriminant Analysis. For this implementation a linear discriminant analysis is chosen, based on the facts that the linear case is efficient to calculate, allows quick classification and is supported by MATLAB's capabilities. The main reasons to dismiss the other classification possibilities are the following: naive Bayes is a rather simple method that has its strength in serving as a benchmark for other methods. Support vector machines allow quick classification and are highly generalizable but go along with high computational effort, the need for transformations in specific cases [19] and an incompatibility with MATLAB's Big Data functions. k-nearest neighbors is disqualified because it is a method prone to outliers [15] and adverse in terms of memory space, as the whole data set has to be kept available as long as the algorithm is carried out. Decision trees give the opportunity to understand the classification but need downstream pruning steps [18] or parallelization in the form of random forests [20] to handle overfitting.

Discriminant analysis is a method from the field of multivariate statistics. At first, a distribution function is calculated for each class. Commonly, a multivariate normal distribution is chosen, whose density function is [21]

  $f_X(x) = \frac{1}{\sqrt{(2\pi)^p \det(\Sigma)}} \exp\!\left(-\frac{1}{2}(x - \mu)^T \Sigma^{-1} (x - \mu)\right).$  (3)

X is the p-dimensional random variable that, in the engine data example, is composed of the different signals each engine provides, as mentioned in Sect. 2. μ, the vector of means, and Σ, the covariance matrix, are to be determined individually for each class. The borders between two classes are defined as where their density functions have the same value. The functions describing those borders are called discriminant functions. If the assumption of identical covariance matrices among all classes is fair, the method simplifies to linear discriminant analysis. The discriminant functions are then hyperplanes or, in a two-dimensional case, linear functions, as shown in Fig. 1.

Fig. 1. Example of a linear discriminant analysis for two dimensions [22].
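Under the shared-covariance assumption, the decision rule that follows from Eq. (3) can be written down compactly: estimate per-class means, a pooled covariance matrix and class priors, and assign each observation to the class with the largest linear discriminant score. The NumPy sketch below illustrates that rule; the class name and structure are ours, not the authors' MATLAB implementation.

```python
import numpy as np

class LinearDiscriminant:
    """Bare-bones LDA with a pooled covariance matrix (the assumption that makes the borders linear)."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        n, p = X.shape
        self.means_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        self.priors_ = np.array([(y == c).mean() for c in self.classes_])
        resid = np.vstack([X[y == c] - m for c, m in zip(self.classes_, self.means_)])
        self.cov_inv_ = np.linalg.pinv(resid.T @ resid / (n - len(self.classes_)))
        return self

    def predict(self, X):
        # Linear discriminant scores; the largest score decides the class.
        scores = (X @ self.cov_inv_ @ self.means_.T
                  - 0.5 * np.sum(self.means_ @ self.cov_inv_ * self.means_, axis=1)
                  + np.log(self.priors_))
        return self.classes_[np.argmax(scores, axis=1)]
```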
Neural Network. As an alternative to linear discriminant analysis, classification is also carried out with the help of a neural network. The reasoning behind that is to create the option of comparison and to give a prospect for future work that might expand into the field of regression, for which neural networks are also suitable [18]. Neural networks have become popular, sometimes being advertised as a magical solution to all computational problems [18]. They are in fact a very powerful and general method that can in theory approximate any complex interrelations [8]. A neural network is a nonlinear statistical model whose number of layers and whose activation functions influence the complexity the model is able to represent [18]. It is best applied in settings where prediction is more important than interpretation of the results [18]. Neural networks can be considered a simulation of the human brain and its learning process. They involve neurons, weighted connections, and external stimuli. In a living organism, learning signifies the strengthening of synaptic connections between neurons in response to an external stimulation that has been received. In the neural network this can be modeled via weights and activation functions [8]. Figure 2 shows the general structure of a neural network with its input neurons, output neurons, and two exemplary hidden layers. Hidden layers do their name justice as they are not directly observed but only used internally in the calculation process.

Fig. 2. General structure of a neural network [23].

In a classification scenario with k classes, the number of neurons in the output layer is k as well, so that each neuron represents one class. The input neurons stand for the signals the model is fed with. The hidden layers in between represent the model to be trained in order to assign a data element with certain input signal characteristics to its appropriate class. This means that in the jet engine use case 21 signals can be drawn upon for input neurons, and the 3 classes defined in Table 2 serve as output neurons. Each connection is allocated a weight w_ij. The first index i denotes the predecessor this connection comes from; the second index j stands for the layer of the network that is currently in focus. The variables a_i state whether or not a connection is activated. The sum of all incoming a_i, weighted with the associated w_ij, is calculated by

  $z_j = \sum_{i=0}^{n} w_{ij} a_i,$  (4)

with n being the number of preceding neurons. The value of z_j is then fed into the so-called activation function g(z_j). Typically, the sigmoid function

  $g_{\mathrm{sigmoid}}(z_j) = \frac{1}{1 + e^{-z_j}}$  (5)

is chosen for this purpose. An alternative worth considering, especially with regard to performance in MATLAB [24], is the hyperbolic tangent function

  $g_{\tanh}(z_j) = \frac{2}{1 + e^{-2 z_j}} - 1.$  (6)

The result of the function g(z_j) gives the activation a_j the neuron propagates into the next layer of the network. Figure 3 illustrates the activation process. The neuron shown in this figure exhibits a bias fed into it, represented by a_0, which is constantly 1, and weight w_0j, which is a standard modeling technique [18, 25].

Fig. 3. Model of one neuron [25].

If a neural network only sends signals to its subsequent layers, as discussed so far, it is called a feedforward network. This is the approach widely used [18]. Networks which send signals back to their preceding layers exist as well and are sometimes referred to as networks possessing a memory. The more common name is recurrent neural network [26]. One difficulty in using a neural network for classification is to determine an adequate size. There are no established rules on how many layers and neurons to use. It rather is an iterative process of experimentation, facilitated by expertise and experience, to find the right size for the specific scenario [18]. If too many neurons are chosen, overfitting occurs. If there are too few neurons, the network might not be able to sufficiently model complex interrelations in the data. The size of the neural network can either be determined in a destructive approach or in a constructive one [27]. Destructive in this case means that the starting point is a big network from which neurons are then gradually removed until the performance of the network starts to decrease. Opting for the constructive approach means starting with a small network and adding neurons until the performance is not enhanced any further. Once the structure of the neural network is set, it has to be trained. The training data subset is used for this step. The generic approach to minimize errors is to use a gradient descent method, also called backpropagation. Detailed equations can be found in [18]. The fastest algorithm MATLAB offers for training neural networks with up to several hundred neurons is the Levenberg–Marquardt backpropagation algorithm [28]. It was first proposed in [29] and applied to neural networks in [30].
The main underlying idea is to avoid calculating the computationally intensive Hessian matrix and to use an approximation instead. In this work, a standard feedforward neural network with bias and hyperbolic tangent activation functions is chosen. The size of the network is determined via the destructive approach described above. Backpropagation is carried out via the Levenberg–Marquardt algorithm. Both the results of the linear discriminant analysis and those of the neural network are discussed and compared in Sect. 5.

4 Implementation with MATLAB 2017a

The implementation of the chosen approaches to process the engine data set and solve the problems of clustering and classification is carried out using the programming environment MATLAB, release 2017a. Even though MATLAB may not be the most popular programming language when judged in an overall comparison, it is still listed around rank 20 in current rankings [31, 32]. It has its strengths in matrix-based numerical calculations and is widely used in science and engineering. Since release 2016b, MATLAB offers new functionalities for handling Big Data, e.g. tall arrays, a new data type that allows users to carry out calculations with data that would actually be too big to fit into the working memory, by breaking the data down into heaps and evaluating equations repeatedly. This process can also be parallelized. Through free educational licenses for teaching staff and affordable student licenses, MATLAB has gained some popularity in academia. This may explain why graduates are acquainted with it and have established it in industry as well. Assuming that MATLAB is an available tool for practitioners in the field of mechanical engineering, this paper aims at exploring how and to what extent it can be used to dive into Big Data analysis. In order to fully reproduce the results discussed in this paper, the following MATLAB components are required:
• MATLAB R2016b or newer,
• MATLAB Parallel Computing Toolbox,
• MATLAB Statistics and Machine Learning Toolbox,
• MATLAB Neural Network Toolbox.
Operating conditions or failure mechanisms are not yet considered. datasample is the MATLAB function used for this sampling step. The reasoning behind the sampling step is that the randomly chosen points are representative for the entire set and that it is more ef?cient in terms of computational cost to explore the sample rather than the entire set. Scatter plots are chosen as an easy and intuitively accessible means of data exploration. MATLAB’s function to create these is named scatter. Figure 5 exemplarily shows the scatter plots for signals 10 to 21. All x-axes show the negative RUL value. All y-axes are without unit because of standardization. It is evident that some signals show a clear trend over time while others remain unaffected by time or just react with increased noise as time advances. Signal 11 for example has a positive trend, i.e. when an engine is close to failure signal 11 tends to have high values. Signal 21 gives an example of a negative trend, i.e. its values decrease the closer an engine gets to failure. Signals like those two examples should be Fig. 5. Signals of the engine data set, subset 1, training data, sample of 5000 points, z-scored, plotted over negative RUL values. Implementing Clustering and Classi?cation Approaches 469 included into models because their tendencies can help to categorize new data. Signal 10 exhibits no trend over time but stays constant. Therefore, it cannot contribute information to a model that is built on time-dependencies. Signal 17 shows a weak positive effect but not as distinct as others do. It could be argued whether or not to include it. To opt for the safe side, it is dismissed in this work. Signal 14 is exemplary for a signal that has a varying amount of noise. One might tend to interpret the points close to RUL = 0 as an upward trend, but indeed they are just scattered further around a signal value of 0. As the time span just before failure is of special interest for a warning system, a signal with high noise in this area is of little help and should also be excluded from the model. Applying this reasoning to all signals available, the ?rst half not shown in Fig. 5 and the second half documented in Fig. 5, the list of relevant signals to train time-dependent models with results in: 2; 3; 4; 7; 8; 11; 12; 13; 15; 20; 21: This can be considered a dimensional reduction. The original 21 signals were reduced to 11 relevant ones. Reducing dimensions is a standard step in preparing data for statistical learning algorithms. The less information is dragged along unnecessarily the more ef?cient the algorithms work. Choosing the relevant inputs manually works ?ne for a reasonable number of input variables. If the number increases, the process can easily be automated, e.g. with the help of correlation coef?cients. corrcoef is the corresponding MATLAB function. Note that Fig. 5 only shows a subset of the engine data set. The reduced list of signals is to be seen as a ?rst attempt at the least complex case of 1 operating condition and 1 failure mechanism which is represented by subset 1. Processing other subsets may require further selection of signals. 4.3 Parallel and Distributed Computing When processing large amounts of data, as is typical for Big Data applications, there are two steps to be considered in order to optimize computing times: parallelizing and distributing the computation. Parallel computing refers to the internal processes in one device, e.g. a laptop, workstation computer or computing server. 
Computations are divided among multiple processor cores of this device. Distributed computing enhances this concept by involving more than one device. Computing clusters are one way to realize this. Making data ?t for parallel and distributed computing usually requires some steps in front. Working with MATLAB and the described engine data set, those are the following: First of all, CSV ?les are created. Each CSV ?le contains the data of one engine. All ?les are then pooled together with the help of a datastore object. A datastore object in MATLAB does not create one large variable or container with all the separate data in it but solely captures the storing path of the ?les. When data are needed for calculation they are transformed from the datastore object into a tall array. tall arrays do not load all the data into the working memory at once but process data in heaps. When tall arrays appear in a MATLAB script the respective 470 K. Pitz and R. Anderl equations are not evaluated immediately. An explicit gather command is needed to execute calculations. The general aim when writing MATLAB code for Big Data is to reduce gather commands to a minimum, because they are what drives computational cost. It should also be checked whether all functions used are compatible with tall arrays. Some examples used in this work that support the use of tall arrays are: zscore, kmeans, discretize, and double. Self-written functions can handle tall arrays as well. In this paper, tall arrays are evaluated locally, using all processor cores available. This form of parallelization is why the MATLAB Parallel Computing Toolbox is necessary for executing the code. The size of the data set does not make the use of distributed computing necessary. However, if bigger data sets were processed, the same MATLAB code would still be applicable with only slight adjustments via the mapreduce function. This would allow for the use of computer clusters or cloud computing solutions such as Hadoop and Spark. The neural network used for comparative purposes in the classi?cation scenario functions without tall arrays but has its performance optimized by the MATLAB Neural Network Toolbox as well as by parallelization. The third toolbox in use, MATLAB Statistics and Machine Learning Toolbox, does not provide for parallel or distributed computing but for the statistical methods themselves. It offers pre-de?ned functions for support vector machines, decision tress, k-nearest neighbors, k-means, k-medoids, hierarchical clustering and many more, some of which are directly applied to obtain the results discussed in the next section and some of which were adduced as comparisons beforehand in order to ?nd the right approaches for the engine data scenario. 5 Results and Discussion This section presents and discusses the results of both the clustering and the classi?- cation problem. 5.1 Clustering Clustering is carried out in order to determine groups of engines with similar operating conditions and failure mechanisms. Clustering for Operating Conditions. Different operating conditions are only prevalent in subsets 2, 4, and 5. Therefore, only those subsets are subject to this kind of clustering. Input variables are the three condition signals flight altitude, Mach number and throttle angle as mentioned in Sect. 2. Having the information that these three allow to deduce how the engine is operated while all other signals are just simulated sensor signals recording internal processes in the engine, makes them an easy and obvious choice. 
Clustering has been performed on the training data only. Two iterations of k-means clustering were needed to identify all six clusters shown in Fig. 6.

Fig. 6. Identified clusters for operating conditions, subset 2, training data, sample of 5000 points, using negative RUL values.

All clusters turn out very concentrated, making them appear like six single points even though a total of 5000 points is plotted. The cluster centers are given in Table 3. The results for subsets 4 and 5 are similar, showing highly concentrated centers as well.

Table 3. Cluster centers for operating conditions in subset 2

Cluster      Cond. 1 (flight altitude)   Cond. 2 (Mach number)   Cond. 3 (throttle angle)
1 (yellow)   2                           0.00                    100
2 (green)    25003                       0.62                    60
3 (red)      45003                       0.84                    100
4 (purple)   20003                       0.70                    100
5 (blue)     10003                       0.25                    100
6 (orange)   35003                       0.84                    100

Clusters that are as clearly distinguishable as these could have been identified manually just as well. Nevertheless, automated clustering involves much less effort and is more generalizable, as it can also be used for complex, spread-out clusters. The results obtained from the training data can be transferred to the test data. No modifications need to be made. It could be considered to use the cluster centers identified from the training data as starting points for a clustering algorithm applied to the test data. Still, the k-means++ algorithm, which does not need manual input for starting centers, proved to be very effective in this scenario as well, given the fact that only two iterations were necessary.

Clustering for Failure Mechanisms. Subsets 3 and 4 exhibit different failure mechanisms and have therefore been considered in this part of the clustering. It is assumed that failure is a time-dependent phenomenon for the engine scenario. The closer an engine is to failure, the higher or lower certain signals will be, indicating malfunctions in parts of the engine. 21 signals are available in total. Section 4.2 gives a list reduced to 11 signals that show a clear tendency over time. For this clustering it has been compared whether using all 11 signals or further reducing the number of input variables is more efficient. The decision is taken in favor of reduction. The essential signals could be reduced to: 7, 12, 15, 20, 21. Figure 7 shows why they are the most useful signals for identifying clusters of failure mechanisms. All signals chosen as inputs have a clear diverging trend towards RUL = 0. A comparison with Fig. 5, in which a subset with only 1 failure mode is plotted and no such diverging point clouds can be spotted, suggests that this is a valid indicator for the failure modes in this case. The two different failure modes identified via k-means clustering are already highlighted in Fig. 7. For signal 15, for example, it can be concluded that high values towards the end of the engine's life indicate the first failure mode (red) while low values indicate the second one (blue). Plotting the clusters as in Fig. 6 is no longer feasible, as more than three dimensions are used for the failure mechanism clustering. The cluster centers are summarized in Table 4.

Table 4. Cluster centers for failure mechanisms in subset 3

Cluster   Sign. 7   Sign. 12   Sign. 15   Sign. 20   Sign. 21
1         551.62    519.92     8.52       38.47      23.09
2         567.57    534.94     8.24       39.57      23.75

It is striking that the cluster centers are very close to each other with respect to all five signals. Table 3 showed greater distances, at least for input Cond. 1. Still, the k-means algorithm could identify the failure mechanism clusters as efficiently as before. Again, the results are obtained after two iterations.
Signals of the engine data set, subset 3, training data, sample of 5000 points, z-scored, plotted over negative RUL values, different failure modes color-coded in blue and red.

Table 4. Cluster centers for failure mechanisms in subset 3

Cluster   Sign. 7   Sign. 12   Sign. 15   Sign. 20   Sign. 21
1         551.62    519.92     8.52       38.47      23.09
2         567.57    534.94     8.24       39.57      23.75

Only training data are used for clustering. The time-dependency makes data points close to RUL = 0 more valuable than those with high RUL values. Hence, only the last ten points of each time series are considered. In some cases those ten data points from the same engine are not all assigned to the same cluster. However, as an engine is assumed to fail from only one failure mechanism, a clear assignment to one or the other cluster has to be made. Whenever this case occurs, the cluster the engine is assigned to most often out of the ten times is chosen. Time-dependency is what makes it difficult to transfer the failure mode clustering from the training data to the test data. In the training data set all time series are available until the event of failure, whereas in the test data set time series are cut off at a random RUL value, potentially a high one. For test data with a low RUL value it might be possible to apply the clusters identified from the training data, as the diverging trends in the relevant signals already show their effects. For new data with high RUL values this will, if at all, be accompanied by great uncertainty. Furthermore, it should be stated that subset 4 requires nested clustering, as multiple operating conditions and multiple failure mechanisms are present at the same time. This is why the result for subset 4 consists of six pairs of clusters. The clustering for failure mechanisms is carried out after the clustering for operating conditions but otherwise does not differ from the procedure described before.

5.2 Classification

Classification has the aim of assigning elements of the engine data set to the right class of criticality regarding the RUL value. Three classes have been defined in Table 2. The quality of the classification can be evaluated because the actual class affiliations are available. Some erroneous classifications may be rated more undesirable than others. Considering a warning system for engine failure, it is worse to receive a normal-operation prompt when actually a warning should be given than to receive an erroneous warning when the engine is still in normal condition. The clustering results are reused as additional inputs for classification. For example, engines that were identified as belonging to the same failure mechanism may be more likely to fall into the same class of criticality as well.

Classification via Linear Discriminant Analysis. The first method applied for classification is linear discriminant analysis. It needs a training time of 1.3 s on a contemporary, off-the-shelf laptop (Lenovo Thinkpad E550, Intel Core i5-5200 processor). Training and processing of the entire data set takes approximately 10 s. The results are summarized in the form of a confusion matrix in Fig. 8. The diagonal of the confusion matrix documents correct classification, e.g. the upper left corner of the matrix states that 8.6% of all data elements (5249 in absolute numbers) have been classified as alarm and were real alarm cases.
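For orientation, a result like the confusion matrix in Fig. 8 can be obtained with a few toolbox calls. The following is only an illustrative sketch: the paper does not list its exact function calls, and the names fitcdiscr, predictors, and classes are assumptions, with predictors standing for the reduced signals plus the clustering results and classes for the criticality class of each data element.

% Illustrative sketch, not the authors' original code.
mdl  = fitcdiscr(predictors, classes, 'DiscrimType', 'linear');   % linear discriminant analysis
pred = predict(mdl, predictors);

C = confusionmat(classes, pred);        % rows: true class, columns: predicted class
accuracy = sum(diag(C)) / sum(C(:));    % overall fraction of correctly classified elements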
The lower right corner sums up all diagonal entries, showing that in total 74% of all elements have been classified correctly, whereas 26% have been misclassified. Three groups of misclassifications should be looked at more closely: cases in which the target class was alarm but the model chose only warning or normal, and cases in which the engine was operating normally but the model gave an alarm. The first two mislead the operator into overestimating the engine's performance and not considering checkup or maintenance work. The latter may lead to premature shop visits and cause unnecessary costs. For safety reasons, the first two are to be considered even more critical than the latter, which has economic consequences only. The fact that all three of these misclassifications occur at a very low rate, 2.0, 0.0 and 0.1% respectively, indicates good quality of the trained model. Another aspect to be considered when developing an engine failure warning system is that there should be at least one alarm before engine failure. Engines failing without prior notice are highly undesirable in the intended system. Figure 9 shows that no such case occurred for the linear discriminant analysis model. Vertical lines in Fig. 9 represent individual engines. The y-axis shows the simulated life in flight cycles. It can be concluded from the plot that most engines start in normal condition, actually operating normally and correctly classified as such. As the simulations start at an arbitrary point in an engine's life, some engines already show warning condition at the beginning of the recorded time. The red tips of all lines demonstrate that each engine has given multiple alarms before failure. Engine 118, for example, has the highest line in the plot and passes through all three phases, starting in normal condition, transitioning into warning and finally reaching alarm state, initially giving some premature alarms but then being correctly classified for RUL ≤ 25. The fact that all engines give alarms, in case of doubt rather too early than not at all, emphasizes the well-functioning of the warning system based on linear discriminant analysis.

Fig. 8. Confusion matrix for classification with linear discriminant analysis, subset 4.

Classification via Neural Network. The second method applied for classification is a neural network. Following the destructive approach leads to 20 neurons in one hidden layer. 17 inputs, consisting of a reduced number of signals according to Sect. 4.2 and the clustering results, are used. Figure 10 shows the neural network as modeled in MATLAB. Training this network to the point that a valid model is found takes 170 iterations on average. Using the same laptop as before, this is equivalent to approximately 12 s. The classification results obtained via the described neural network are summarized in the confusion matrix in Fig. 11. The sum of all diagonal elements is 74.1%, almost the same as with linear discriminant classification. 25.9% of all data elements are still misclassified. However, the three most severe misclassifications have values of 2.4, 0.0, and 0.1%, again almost identical to the results obtained via linear discriminant analysis, which are acceptably low. The neural network scores a slightly worse rate for classifying alarm conditions as such but is slightly better at correctly classifying warnings.

Fig. 9. Displayed alarms, warnings and normal conditions when the system is trained via linear discriminant analysis, subset 4.
Fig. 10.
Neural network modelled in MATLAB. 476 K. Pitz and R. Anderl Figure 12, when compared to Fig. 9, also highlights the fact that the warning system trained via neural network behaves almost identical to the one based on linear discriminant analysis. All engines display alarms before failure which is the preferred characteristic for the warning system discussed in this work. Fig. 11. Confusion matrix for classi?cation with neural network, subset 4. Fig. 12. Displayed alarms, warnings and normal conditions when system is trained via neural network, subset 4. Implementing Clustering and Classi?cation Approaches 477 6 Outlook The results presented in this paper offer various connecting points for further research. One promising next step may be to broaden the focus from clustering and classi?cation to also include regression. In the considered use case regression models could be used to estimate the remaining useful life of the engines. It should be examined to which extent regression models can pro?t from clustering and classi?cation results already obtained for the data set. Further enhancements could include image or video data to prove that the methods are also applicable for high variety data. In general, bigger data sets should be con-sidered for further validation. Integration of cloud solutions or distributed server structures should be tested. Applying the approaches to data sets from other technical systems could further prove their generalizability. 7 Summary In this paper, a data set for applying Big Data approaches in a mechanical engineering scenario has been chosen. Various Big Data approaches have been assessed and compared. A problem de?nition of clustering and classi?cation has been formulated. For these two problems k-means clustering, linear discriminant analysis and neural networks have been identi?ed as adequate methods. All three methods have been implemented using the programming environment MATLAB 2017a. Above all, datastore objects, tall arrays and gather com-mands are crucial for enabling MATLAB scripts for Big Data. The code produced constitutes a basis for further extension. Bigger data sets could be processed spreading the computation among a greater number of cores with the help of MATLAB’s Parallel Computing Toolbox or involving computing clusters or cloud solutions via mapre-duce settings. Moreover, existing MATLAB scripts for any purposes can be adapted for Big Data use based on the insights gained by these examples. All that has to be considered is whether all functions that are used support tall arrays and whether the program sequence should be adjusted to minimize the number of gather commands. MATLAB proved to be an adequate tool for analyzing large amounts of stored data stemming from engine simulations. If it is still powerful enough when additional challenges like near real-time data or highly unstructured social media data arise remains to be proven. The results of the methods themselves show that k-means clustering with k-means+ + initialization is very fast and effective in identifying operating condition and failure mechanism clusters in the engine data, reaching plausible results within two iterations. Comparing linear discriminant analysis and a feedforward neural network with one hidden layer shows a very similar performance for both when three de?ned classes for RUL values are the underlying scenario. Both reach approximately 74% of correct classi?cations and 2% or less for misclassi?cations considered especially severe. 
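For completeness, the shallow network referred to in this comparison — one hidden layer with 20 neurons, fed with the 17 inputs described in Sect. 5.2 — could be set up along the following lines. This is a hedged sketch only: the paper shows the network as a figure and does not list its code, and the variable names inputs and targets are assumptions.

% Illustrative sketch of the shallow classification network.
net = patternnet(20);                    % pattern-recognition network with one hidden layer of 20 neurons
% inputs:  17 x N matrix of predictors (reduced signals plus clustering results)
% targets: 3 x N one-hot matrix for the classes normal / warning / alarm
[net, tr] = train(net, inputs, targets); % training with the toolbox default settings
outputs = net(inputs);
plotconfusion(targets, outputs);         % confusion matrix comparable to Figs. 8 and 11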
The neural network is easier to implement in MATLAB, more generalizable but less suitable whenever interpretation of results is a focus as well. The linear discriminant analysis proved to be slightly faster than the neural network. 478 K. Pitz and R. Anderl References 1. Kitchin, R.: The Data Revolution. SAGE, Los Angeles (2014) 2. Franks, B.: Taming the Big Data Tidal Wave. Wiley, Hoboken (2012) 3. Laney, D.: 3D Data Management: Controlling Data Volume, Velocity, and Variety, https:// blogs.gartner.com/doug-laney/?les/2012/01/ad949-3D-Data-Management-Controlling-Data- Volume-Velocity-and-Variety.pdf. Accessed 01 June 2018 4. Demchenko, Y., Grosso, P., Laat, C., de Membrey, P.: Addressing Big Data issues in scienti?c data infrastructure. In: IEEE (ed.) 2013 International Conference on Collaboration Technologies and Systems (CTS) (2013) 5. Long, C., Talbot, K., Gill, K. (eds.): Data Science & Big Data Analytics. Wiley, Indianapolis (2015) 6. Simon, P.: Too Big to Ignore. Wiley, Hoboken (2013) 7. Iafrate, F.: From Big Data to Smart Data. Wiley, Hoboken (2015) 8. Aggarwal, C.C.: Data Mining. Springer, Cham (2015) 9. Discroll, T.A.: Learning MATLAB. Society for Industrial and Applied Mathematics, Philadelphia (2009) 10. NASA Prognostics Center of Excellence: PCoE Datasets. https://ti.arc.nasa.gov/tech/dash/ pcoe/prognostic-data-repository/. Accessed 06 Sept 2017 11. Saxena, A., Goebel, K.: Turbofan Engine Degradation Simulation Data Set. https://ti.arc. nasa.gov/tech/dash/groups/pcoe/prognostic-data-repository. Accessed 14 June 2018 12. Kitchin, R.: Big Data, new epistemologies and paradigm shifts. SAGE J. Big Data Soc. (2014) 13. Louridas, P., Ebert, C.: Machine learning. IEEE Softw. 33(5), 110–115 (2016) 14. Xu, R., Wunsch, D.: Survey of clustering algorithms. IEEE Trans. Neural Netw. 16(3), 645– 678 (2005) 15. Ester, M., Sander, J.: Knowledge Discovery in Databases. Springer, Berlin (2000) 16. Shindler, M.: Approximation Algorithms for the Metric k-Median Problem. UCLA, Los Angeles (2008) 17. Arthur, D., Vassilvitskii, S.: k-means++: the advantages of careful seeding. In: SIAM (ed.) SODA 2007: Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1027–1035, Philadelphia (2007) 18. Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning, 2nd edn. Springer, New York (2017) 19. Suthaharan, S.: Machine Learning Models and Algorithms for Big Data Classi?cation. Springer, New York (2016) 20. Genuer, R., Poggi, J.-M., Tuleau-Malot, C., Villa-Vialaneix, N.: Random forests for Big Data. In: Big Data Research, pp. 22–46 (2017) 21. Schlittgen, R.: Multivariate Statistik. Oldenbourg, München (2009) 22. The MathWorks Inc.: Create and Visualize Discriminant Analysis Classi?er. https://de. mathworks.com/help/stats/create-and-visualize-discriminant-analysis-classi?er.html. Acces-sed 2018 Sep 2017 23. Nielsen, M.: Using Neural Nets to Recognize Handwritten Digits. http://neuralnetwork sanddeeplearning.com/chap1.html. Accessed 27 Mar 2018 24. The MathWorks Inc.: Tansig: Hyperbolic Tangent Sigmoid Transfer Function. https://de. mathworks.com/help/nnet/ref/tansig.html. Accessed 28 Mar 2018 25. Russell, S., Norvig, P.: Künstliche Intelligenz, 3., aktualisierte. Pearson, München (2012) 26. Kolen, J.F., Kremer, S.C. (eds.): A Field Guide to Dynamical Recurrent Networks. IEEE, New York (2001) Implementing Clustering and Classi?cation Approaches 479 27. Alpaydin, E.: Introduction to Maschine Learning. MIT Press, Cambridge (2004) 28. 
The MathWorks Inc.: Tainml: Levenberg–Marquardt Backpropagation. https://de. mathworks.com/help/nnet/ref/trainlm.html. Accessed 27 Mar 2018 29. Marquardt, D.W.: An algorithm for least-squares estimation of nonlinear parameters. J. Soc. Ind. Appl. Math. 11(2), 431–441 (1963) 30. Hagan, M.T., Menhaj, M.: Training feed-forward networks with the Marquardt algorithm. IEEE Trans. Neural Netw. 5(6), 989–993 (1994) 31. TIOBE: TIOBE Index for March 2018. https://www.tiobe.com/tiobe-index/. Accessed 21 Mar 2018 32. GitHut: Top Active Languages. http://githut.info/. Accessed 21 Mar 2018 33. Ramasso, E., Saxena, A.: Performance benchmarking and analysis of prognostic methods for CMAPSS datasets. Int. J. Prognstics Health Manag. 5(2), 1–5 (2014) 34. The MathWorks Inc.: Big Data Workflow Using Tall Arrays and Datastores. https://de. mathworks.com/help/distcomp/big-data-work?ow-using-tall-arrays-and-datastores.html. Accessed 27 Mar 2018 480 K. Pitz and R. Anderl Visualization Tool for JADE Platform (JEX) Halim Djerroud(B) and Arab Ali Cherif Universit´e Paris8, Laboratoire d’Informatique Avanc´ee de Saint-Denis (LIASD), 2 Rue de la libert´e, 93526 Saint-Denis, France {hdd,aa}@ai.univ-paris8.fr Abstract. This article presents JEX, a useful visualization extension to the JADE platform. JEX provides the possibility for MAS (Multi-agent systems) community using JADE to visualize and interpret their simula-tions developed under it. Why this contribution? Agent-based modeling is widely used to study complex systems. Therefore, several platforms have been developed to answer this need. However, in many platforms, the graphical representation of the environment and agents are not fully implemented. In the case of JADE, it’s completely inexistent. Implement-ing such a graphical representation within JADE is of interest because it’s a powerful multi-agent platform and FIPA compliant. Adding an extra feature like JEX will greatly help the scienti?c community and the industry to represent and interpret their MAS models. Keywords: Spatial simulation · JADE · Multi-agent systems 1 Introduction Multi-agent systems (MAS) has become an active area of research. According to Weiss [1], a multi-agent systems (MAS) is de?ned as a system involving two or more agents to cooperate with each other while achieving local goals. Multi-agent systems are acknowledged as a suitable paradigm for modeling complex systems. They are applied in various domains such as collaborative decision support sys-tems and robotics. The software development process of MAS requires robust platforms to address the complexity of these tasks by o?ering MAS key features such as agent development, monitoring and analysis. The development e?ciency can be signi?cantly enhanced using a platform able to do speci?c representation. Agent-based models [2] is the discipline aimed at understanding interaction of agents in their environment. The multi-agent system are used in two cases: (a) Simulation of complex phenomena [3] witch implies the simulation of interactions between agents. This simulation is meant to de?ne the system’s evolution in order to predict its future organization, such as the food chain study. (b) Distributed problems solving [1] such as the study of virus propagation in computer networks. The study of complex phenomena often involves entities that evolve in space and time. Implementation of these systems in an MAS requires the representation .h c Springer Nature Switzerland AG 2019 K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 481–489, 2019. 
https://doi.org/10.1007/978-3-030-02686-8_36 482 H. Djerroud and A. A. Cherif of the environment in which they evolve and integrate their positions within it? We refer this kind of simulation as “Agent-based spatial simulation”. Agent-based spatial simulation is a key tool and system complex study [4]. It has grown considerably among the scienti?c community, and within many social science disciplines such as psychology in case of simulating human behavior during emergency evacuations [5]. Since the modeling method used to represent systems varies according to its characteristics, it is essential to represent the environment as well as the agents developed in it. For this purpose, several multi-agent platforms have chosen to integrate a graphical interface that makes possible direct visualization of the agents’ interaction and the development of the environment. JADE1 is one of the most popular multi-user platforms [6]. This platform is widely used in research works because it implements the FIPA standard [7]. Thus, it becomes consequently easily interoperable with other platforms that implement the same standard like ZEUS [17], FIPA-OS [18], LEAP [19] and JACK [20]. Furthermore, we can state that JADE is particularly well docu-mented [8] and has a proven an impressive track record of the system that has been develops with it. Using JADE, we faced one main issue as JADE does not implement one key feature which is a spatial representation module. Thus, dedicated and rigid plateforms like GAMA [9] and Netlogo [12] seems to be more appealing as they o?ers natively this functionality. JEX (JADE Environment Extension) has been developed in order to address the lack of this key module in JADE. JEX is a spacial representation module and technically a Java library that integrates easily with JADE. This article is structured as follows: First it presents a state of the art of sev-eral Multi Agent System architectures that illustrate the interest of our contri-bution. Second describes the JEX extension as well as all provided possibilities. Finally, it compares our contribution with the tools presented in the Related Work section. We conclude the article with a discussion about the perspectives of our contribution. 2 Related Work More and more applications are developed using MAS, but there are few multi-agent oriented implementation tools and powerful agent programming languages. MAS Design relies on existing languages and programming techniques and it’s often hard to develop MAS (implementation, distribution, communications). The trend in this context takes on Multi-Agent Oriented Programming and meaning programming MAS with MAS tools. Many standards have been developed in this regard such as FIPA2 , MASIF3 and DARPA4 . In this section, we introduce 1 JADE: An open source tool available in: http://jade.tilab.com/. 2 IEEE FIPA:Foundation for Intelligent Physical Agent. 3 MASIF-OMG (Object Management Group) : OMG e?ort to standardize mobile agents - middleware services and internal middleware interfaces. 4 Knowledge Sharing E?ort The DARPA Knowledge Sharing E?ort. Visualization Tool for JADE Platform (JEX) 483 and compare some agent platform such as JADE, NetLogo, GAMA and Mason [10,11]. JADE [6] (Java Agent Development Framework) is one of the most pop-ular agent technology platform. JADE has become a major open source soft-ware project with a worldwide scope. It is an agent-oriented middleware that facilitates the development of multi-agent systems. 
It’s FIPA compliant, FIPA being an IEEE Foundation for Intelligent Physical Agents. JADE is developed in JAVA. It includes a runtime environment with JADE agents, on which one or more agents can be run from the host; a class library that programmers must/can use to develop their agents; a suite of graphical tools that allow the administration and monitoring of the activity of agents during implementation. However, JADE has no tools to visualize agents and the environment. NetLogo [12] is a multi-agent environment focused on [13,14], modeling tools. It integrates its own programming language that can be described as a high-level language. The environment is discrete and it is represented in 2D or 3D form depending on the version used. Netlogo represents the agents that are obligatory in the environment and can not communicate with the environment alone. Under Netlogo, it’s possible to depict a third type of facility referred to as links. It connects up two agents and symbolizes the relationship between agents. Gama [9] The GAMA platform (Gis & Agent-based Modelling Architec-ture) is like Netlogo, it o?ers a complete modelling language - GAMA (Gama Modelling Language) - allowing modellers to build models quickly and easily. However, unlike Netlogo which is limited to the construction of simple models, GAMA allows the construction of very complex models, as rich as those built by a computer scientist from tools such as Repast Simphony. In particular, GAMA o?ers very advanced tools for space management. Mason [15] MASON is a fast, discrete, Java-based, multi-agent simulation library designed to serve as a foundation for large customized Java simulations, and to provide su?cient utility for several soft simulation needs. MASON con-tains both a model library and an optional suite of 2D and 3D visualization tools. 3 JEX Architecture JEX is an extension visualization tool for JADE Framework, this section present JEX general architecture. The main goal is to provide JADE with an easy and e?ective viewer module like the NetLogo interface, therefore, JEX is inspired by NetLogo visual architecture and functionalities. To provide a visual representation of MAS, we need to represent agents, patches and links. Agents can act on the environment, to simplify the complex implementation of the environment, they are decomposed into small parts called Patches. The Links are relations between agents. For JEX we propose the following architecture: We consider the tree types of entities mentioned above: Agents, Links and Environment. The tree entities has been implemented as classes named JADE Agents and are named JexAgent, 484 H. Djerroud and A. A. Cherif Fig. 1. JEX architecture, UML class diagram. JexLink and JexEnvironnement, as illustrated in Fig. 1. These three classes are derived from JexGenircAgent which is simply a jade Agent superclass. We have chosen this implementation in order to take full advantage o?ered by the JADEagent Superclass functionalities and be fully compatible with the framework. – JexEnvironnement Class consist of a set of patches. The user can choose the environment dimension and global characteristics (patches size, word-warps5 , colors etc). The dimensions of the patches can also be chosen. Each element (patches) can be manipulated independently. From a technical point of view JexEnvironnement is a static class, with static members. To avoid multi-instances of environments, and ease agent access. 
Other global characteristics have been added, such as the posting delay (step-by-step execution, or the time unit of execution), the origin position (position (0,0)) of the environment, and other parameters that are fully listed in the JEX documentation (http://djerroud.halim.info/index.php/jex).
– JexAgent and JexLink are used by JexObserver, which can be considered an agent acting as a registration point for the agents willing to subscribe to the graphics representation module. JexObserver provides other services, such as creating links (Links) and initializing the environment, and it offers the JexAgentObserver interface for the agents wishing to use the graphics representation functionalities.
We insist that these various actions are completely transparent to the user, and they are performed automatically. In the next section we describe how to integrate JEX into a JADE project.
5 Word-warps: connect the edges of the environment.

4 Integration to JADE

JEX (JADE Environment Extension) comes in the form of a jex.jar Java library. This library provides JADE with a graphical environment that makes it possible to visualize the agents and the environment. The integration of JEX into a JADE project does not require any modifications of the JADE project. It needs only the creation of a JexObserver-type agent. This agent makes it possible to configure the environment, e.g., the length and the width of the environment, the refresh time and so on. If none of these parameters are specified, default values will be used. The following code shows how the JexObserver agent is created. Note that the creation of the JexObserver agent is done in the same way as the creation of a JADE agent. This is possible because JEX agents, as indicated in the previous section, are JADE agents; more precisely, they are derived from the JADE Agent class.

import jade.core.Agent;
import jex.JexEnvironnement2D;
import jex.JexObserver;

public class JexTesterAgent extends Agent {
    protected void setup() {
        JexEnvironnement2D.init2D();
        Object args[] = new Object[1];
        args[0] = "";
        ContainerController cc = getContainerController();
        try {
            AgentController ac = cc.createNewAgent("JexObs", "jex.JexObserver", args);
            ac.start();
            // ...
        } catch (Exception e) {
            // ...
        }
    }
}

In order to maintain the flexibility of JADE, the JEX library does not monitor all the agents systematically. It is up to the user to choose the agents to observe. In order to monitor an agent, the agent only needs to register with the JexObserver agent, as shown in the following code:

import jade.core.Agent;
import jex.JexAgent;
// ...

public class AgentToObserve extends Agent {
    // ...
    protected void setup() {
        jexObserver.subscribe(this.getLocalName());
        // ...
        addBehaviour(new ...Behaviour(...) {
            // ...
        });
    }
}

Once the observer agent JexObserver is created, and the agents wishing to benefit from JEX have registered with the observer, all that remains is to animate these agents. For that, JEX offers a set of functions that allow the manipulation of the various agents in the environment. Among the functions that JEX offers are the initialization functions, which set the initial position of the agent in the environment. This position can be defined by the user, or JEX can propose a random position.
Another set of functions gives a shape to the agents. The shape is defined either by a basic geometrical shape, e.g., a square, rectangle or circle, or by a specific form defined by the user via an image file. Finally, there is the set of functions that handle the movement itself. These functions can directly indicate a position to converge to, or give an orientation and a movement. Other functions specify the color of the agents, the text to display, etc. All of these functions are described in the JEX documentation. The code below gives an example of an implementation of an agent that performs initialization and basic movements.

// ...
JexAgent jexAgent = jexObserver.getJexAgent(this.getLocalName());
// ...
jexAgent.setRadius(10);
jexAgent.setShape(jexAgent.CERCLE);
jexAgent.setColor(new JexColor(200, 0, 0));
jexAgent.setInitPos(50, 50);
// ...
addBehaviour(new ...Behaviour(...) {
    protected void onTick() {
        jexAgent.setHeading(270);
        jexAgent.forward(5);
    }
});
// ...

As indicated in the previous section, JEX allows the addition of links between agents. These links (Links) are represented in the graphical environment by lines that connect the agents to each other. They are particularly useful when representing graphs. The following code shows how to add these links (Links) in JEX.

// ...
jexObserver.addLink(jexAgent.getJexAgentLocalName(), "agent attached", false);
// ...

We end this section with a graphic illustration (Fig. 2). We have chosen an example that illustrates the possibilities of JADE associated with JEX, namely an implementation of a simulation of the propagation of viruses in a computer network. The model, displayed in Fig. 2, shows the spread of a virus through a network. Although the model is somewhat abstract, the interpretation is the following: each node represents a computer, and the modeling represents the progression of a computer virus through this network. Each node has two states: infected or not. In academia, such a model is sometimes called the SIR model. The blue nodes represent the uninfected machines. The links that exist between these machines are shown as lines connecting the nodes. The red nodes represent the infected machines.

Fig. 2. Computer network, spread of viruses.

5 Discussion

The existing multi-agent platforms are more or less specialized. Consider again the example of NetLogo, which makes it possible to accomplish a great deal in terms of visual rendering and spatial representation of the agents. However, this tool is little used in the scientific world because of its lack of robustness and the specificity of its language, which limit the possibilities for working with it. JADE is written in Java and is easy to use. It implements the FIPA protocol, which makes it one of the best multi-agent platforms. However, it does not offer a graphical environment for the spatial representation of agents. Attempts to combine the two platforms have already been tested [16]. The communication between the two systems is possible via the exchange of XML files. Spatial representation is essential for the study of complex phenomena, as we have shown in Sect. 1. Integrating a spatial representation tool into the powerful JADE platform is therefore an important contribution. We have described in this article how to provide JADE with graphic capabilities comparable to those of NetLogo, which inspired us in our work.
For the future of JEX, we have developed tools for 2D representation, and we plan to add a 3D representation of the environment as well as to improve the API that we presented. We share this work using a free license; the whole source code as well as the jar ?le and the documentation can be downloaded from the link8 . 6 Conclusion In this paper, we have proposed JEX a spatial representation of MAS agents as an extension of JADE Framework. We discussed its algorithms and more impor-tantly its e?ectiveness and complementary contribution to JADE. We suppose that this easily integrated enhancement will be very bene?cial to JADE’s devel-oper community. References 1. Weiss, G.: Multiagent Systems: A Modern Approach to Distributed Arti?cial Intel-ligence. MIT Press, Cambridge (1999) 2. Vidal, J.M., Buhler, P., Goradia, H.: The past and future of multiagent systems. In: AAMAS Workshop on Teaching Multi-agent Systems (2004) 3. Amigoni, F., Schia?onati, V.: A multiagent approach to modelling complex phe-nomena. Found. Sci. 13(2), 113–125 (2008) 4. Macal, C.M., North, M.J.: Agent-based modeling and simulation: ABMS examples. In: Simulation Conference, Winter WSC 2008, p. 2008. IEEE (2008) 5. Pan, X., et al.: A multi-agent based framework for the simulation of human and social behaviors during emergency evacuations. Ai Society 22(2), 113–132 (2007) 8 http://djerroud.halim.info/index.php/jex. Visualization Tool for JADE Platform (JEX) 489 6. Bellifemine, F., Agostino, P., Giovanni, R.: JADE-A FIPA-compliant agent frame-work. In: Proceedings of PAAM, vol. 99, pp. 97–108 (1999) 7. O’Brien, P.D., Nicol, R.C.: FIPA-towards a standard for software agents. BT Tech-nol. J. 16(3), 51–59 (1998) 8. Bellifemine, F.L., Giovanni, C., Dominic, G.: Developing Multi-agent Systems with JADE, vol. 7. Wiley (2007) 9. Taillandier, P., et al.: GAMA: a simulation platform that integrates geographical information data, agent-based modeling and multi-scale control. In: International Conference on Principles and Practice of Multi-Agent Systems. Springer, Heidel-berg (2010) 10. Nguyen, G., et al.: Agent platform evaluation and comparison. Rapport technique, Institute of Informatics, Bratislava, Slovakia (2002) 11. Trillo, R., Sergio, I., Eduardo, M.: Comparison and performance evaluation of mobile agent platforms. In: Third International Conference on Autonomic and Autonomous Systems ICAS 2007. IEEE (2007) 12. Tisue, S., Uri, W.: Netlogo: a simple environment for modeling complexity. In: International Conference on Complex systems, vol. 21 (2004) 13. Tisue, S., Uri, W.: NetLogo: design and implementation of a multi-agent modeling environment. Proc. Agent (2004) 14. Kornhauser, D., Rand, W., Wilensky, U.: Visualization tools for agent-based mod-eling in NetLogo. Proc. Agent, 15–17 (2007) 15. Luke, S., et al.: Mason: a multiagent simulation environment. Simulation 81(7), 517–527 (2005) 16. Reis, J.C., Rosaldo, J.F.R., Gil, G.: Towards NetLogo and JADE Integration: an industrial agent-in-the-loop approach 17. Nwana, H.S., Ndumu, D.T., Lee, L.C.: ZEUS: an advanced tool-kit for engineering distributed multi-agent systems. In: Proceedings of PAAM, vol. 98 (1998) 18. Poslad, S., Phil, B., Rob, H.: The FIPA-OS agent platform: open source for open standards. In: Proceedings of the 5th International Conference and Exhibition on the Practical Application of Intelligent Agents and Multi-Agent, vol. 355 (2000) 19. Bergenti, F., Poggi, A.: Leap: a FIPA platform for handheld and mobile devices. In: International Workshop on Agent Theories. 
Architectures and Languages. Springer, Heidelberg (2001) 20. Winiko?, M.: JACK™ intelligent agents: an industrial strength platform. In: Multi- Agent Programming, pp. 175–193. Springer, Boston (2005) Decision Tree-Based Approach for Defect Detection and Classi?cation in Oil and Gas Pipelines Abduljalil Mohamed1(&) , Mohamed Salah Hamdi1 , and So?ene Tahar2 1 Information Systems Department, Ahmed Bin Mohamed Military College, Doha, Qatar {ajmaoham,mshamdi}@abmmc.edu.qa 2 Electrical and Computer Engineering Department, Concordia University, Montreal, Canada tahar@ece.concordia.ca Abstract. Metallic pipelines are used to transfer crude oil and natural gas. These pipelines extend for hundreds of kilometers, and as such, they are very vulnerable to physical defects such as dents, cracks, corrosion, etc. These defects may lead to catastrophic consequences if not managed properly. Thus, monitoring these pipelines is an important step in the maintenance process to keep them up and running. During the monitoring stage, two critical tasks are carried out: defect detection and defect classi?cation. The ?rst task concerns with the determination of the occurrence of a defect in the monitored pipeline. The second task concerns with classifying the detected defect as a serious or tolerable defect. In order to accomplish these tasks, maintenance engineers utilize Magnetic Flux Leakage (MFL) data obtained from a large number of magnetic sensors. However, the complexity and amount of MFL data make the detection and classi?cation of pipelines defects a dif?cult task. In this study, we propose a decision tree–based approach as a viable monitoring tool for the oil and gas pipelines. Keywords: Defect detection and classi?cation .n Decision tree Data mining .n Pipeline monitoring and maintenance 1 Introduction Oil and gas pipeline defect monitoring is an essential component of the pipeline maintenance process. In order to maintain the pipeline in a properly working order, different inspection tools such as magnetic flux leakage (MFL), ultrasonic waves, and closed circuit television (CCTV) are used to detect and classify pipeline defects [1–3]. The complexity and amount of data obtained by such diverse tools require the use of sophisticated defect detection and classi?cation techniques. Most of the approaches reported in the literature [4] have been proposed for the purpose of either prediction of defect dimensions, detection of defects, or classi?cation of defect types. To achieve © Springer Nature Switzerland AG 2019 K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 490–504, 2019. https://doi.org/10.1007/978-3-030-02686-8_37 these objectives, techniques such as machine learning [5–7], wavelets [8–13], and signal processing [14–16] are widely used. The focus of this paper, however, is on developing a pipeline monitoring tool that incorporates the two tasks namely: defect detection and defect classi?cation. The main inference engine for both tasks is a decision tree that takes as an input the crucial MFL depth and length parameters. 2 Pipeline Monitoring In this paper, we propose a new monitoring approach for oil and gas pipelines. The general structure of the proposed approach is shown in Fig. 1. MFL Signals. MFL data are collected from autonomous devices known as intelligent pigs. An increase in flux leakage may indicate metal loss, which in turn, means the possibility of defect occurrence. 
Thus, at the location of the potential defect, the depth and length of the flux leakage are measured or estimated by using arti?cial neural networks. Defect Detection. These two crucial MFL parameters are ?rst entered into the defect detection unit. A decision tree is realized in this unit as defect detection technique. If no defect is detected, the monitoring process terminates. On the other hand, if a pipeline defect is detected, the two parameters will be passed on to the classi?cation unit. Defect Classi?cation. In this unit, based on their severity level, the defect is classi?ed into one of two categorizes: Type I or Type II. In this work, Type I is considered a very serious pipeline defect which requires an immediate action and reparation. Type II is considered less serious and can wait and be scheduled for defect maintenance. Fig. 1. The proposed monitoring approach for the oil and gas pipelines. Decision Tree-Based Approach for Defect Detection 491 3 Decision Tree-Based Approach for Defect Detection and Classi?cation The decision tree utilized in this work is derived from the simple divide-and-conquer algorithm. The decision tree is expressed recursively as described in the following sections. MFL Signal Depth and Length Attributes. In order to detect/classify pipeline defects, the obtained MFL signals are ?rst normalized and mapped into depth and length ranges. According to the industry standard [17], the depth range for the MFL signals is normalized between 0 and 1; and the length range for the MFL signals is normalized between 0 and 6. These two ranges constitute the MFL attributes, and are divided into different values as described below. The MFL depth attribute values are: Very high = [0.80 1.00], High = [0.60 0.79], Medium = [0.40 0.59], Low = [0.20 0.39], Very low = [0.00 0.19], The MFL length attribute values are: Large = [3.81 6.00], Medium = [1.81 3.80], Small = [0.61 1.80], Very small = [0.00 0.60], Defect Detection. Based on the information given in [17], the MFL attributes can now be used to identify the status of the MFL signals as shown in Table 1. The MFL signal can either be identi?ed as abnormal (defect) or normal. Constructing Decision Tree. To construct the decision tree for the defect detection, an attribute is ?rst selected and placed at the root node, and make branch for each possible value. This splits up the MFL signals into subsets, one for every value of the attribute. The process is repeated recursively for each branch, using only those instances that actually reach the branch. If all instances at a particular node are all either abnormal or normal, then we stop developing that part of the tree. There are two possibilities for each split; and they produce two trees as shown in Figs. 2 and 3 for the depth and length attributes, respectively. The number of 2 (abnormal) and 1 (normal) classes is shown at the leaves. Any leaf with only one class (i.e., 2 or 1) reaches the ?nal split; and thus the recursive process terminates. In order to reduce the size of the trees, the information gain for each node is measured. Now the information for the two attributes is calculated and split is made on the one that gains the most information. Tree Structure. The informational value of creating a branch on the MFL-depth attribute and the MFL-length attribute are then calculated. The number of normal and abnormal at the leaf nodes in Fig. 2 are [0 4], [1 3], [2 2], [2 2], and [4 0], respectively. 492 A. Mohamed et al. 
The number of normal and abnormal at the leaf nodes in Fig. 3 are [4 1], [3 2], [1 4], and [1 4], respectively. Calculating the information gain for each attribute yields the tree structure shown in Fig. 4. As described in Fig. 5, the decision tree basically uses three values of the MFL-depth attribute and four values of the MFL-length attribute. The values are Low, Medium, and High for the MFL-depth attribute, and Very Small, Small, Medium, and Large for the MFL-length attribute. Table 1. MFL signal abnormal and normal status based on its depth and length range MFL-depth MFL-length Status Very high High Medium Low Very low Very small Small Medium Large Normal (1) Abnormal (2) YES NO NO NO NO YES NO NO NO NO YES YES NO NO NO NO NO YES NO NO NO YES YES NO NO NO NO NO NO YES NO NO YES YES NO NO NO NO NO NO NO YES NO YES NO YES NO NO NO YES NO NO NO YES NO NO YES NO NO NO NO YES NO NO NO YES NO YES NO NO NO NO NO YES NO NO YES NO YES NO NO NO NO NO NO YES NO YES NO NO YES NO NO YES NO NO NO YES NO NO NO YES NO NO NO YES NO NO YES NO NO NO YES NO NO NO NO YES NO NO YES NO NO YES NO NO NO NO NO YES NO YES NO NO NO YES NO YES NO NO NO YES NO NO NO NO YES NO NO YES NO NO YES NO NO NO NO YES NO NO NO YES NO NO YES NO NO NO YES NO NO NO NO YES NO YES NO NO NO NO YES YES NO NO NO YES NO NO NO NO NO YES NO YES NO NO YES NO NO NO NO NO YES NO NO YES NO YES NO NO NO NO NO YES NO NO NO YES YES NO Fig. 2. The decision tree for the MFL depth attribute. The abnormal status is referred to by 2; while the normal status is referred to by 1. Decision Tree-Based Approach for Defect Detection 493 Defect Classi?cation. The MFL data used for classifying the defect severity level is shown in Table 2. The table shows that the two attribute values can indicate either the defect level is of Type I, or the defect level is of Type II. Fig. 3. The decision tree for the MFL length attribute. The abnormal status is referred to by 2; while the normal status is referred to by 1. Fig. 4. The decision tree structure for the defect detection. 494 A. Mohamed et al. Constructing Decision Tree. The two trees produced by the two attributes are shown in Figs. 6 and 7. As was the case for the defect decision tree, the information gain for each node is measured, and split is made on the one that gains the most information. Tree Structure. The informational value of creating a branch on the MFL-depth attribute and the MFL-length attribute are then calculated. The number of defect Type I and Type II at the leaf nodes in Fig. 6 are [2 1], [1 2], and [0 3], respectively. The number of defect type I and type II at the leaf nodes in Fig. 7 are [0 3], [1 2], and [2 1], respectively. Calculating the information gain for each attribute yields the tree structure shown in Fig. 8. As described in Fig. 9, the decision tree basically uses three values of the MFL-depth attribute and three values of the MFL-length attribute. The values are Low, Medium, and High for the MFL-depth attribute, and Small, Medium, and Large for the MFL-length attribute. Fig. 5. The defect detection based on the two MFL attributes. Table 2. 
MFL signal defect (i.e., Type I, Type II) status based on its depth and length range MFL-depth MFL-length Defect High Medium Low Small Medium Large Type I (1) Type II (2) YES NO NO YES NO NO NO YES YES NO NO NO YES NO YES NO YES NO NO NO NO YES YES NO NO YES NO YES NO NO NO YES NO YES NO NO YES NO NO YES NO YES NO NO NO YES YES NO NO NO YES YES NO NO NO YES NO NO YES NO YES NO NO YES NO NO YES NO NO YES NO YES Decision Tree-Based Approach for Defect Detection 495 Fig. 6. The decision tree for the MFL-depth attribute. The defect status of type I is referred to by 1; while type II is referred to by 2. Fig. 7. The decision tree for the MFL-length attribute. The defect status of type I is referred to by 1; while type II is referred to by 2. Fig. 8. The decision tree structure for the defect classi?cation. 496 A. Mohamed et al. 4 Performance Evaluation The performance of the proposed approach is measured by two important criteria: the receiver operating characteristics (ROC) curves and the confusion matrices. In ROC, the true positive rates (sensitivity) are plotted against the false positive rates (1- speci?city) for different cut-off points. For a speci?c severity class, the closer its ROC curve is to the left upper corner of the graph, the higher its classi?cation accuracy is. In the confusion matrix plot, the rows correspond to the predicted class (output class), and the columns show the true class (target class). In the defect detection and classi?cation, the proposed approach is compared with the four well-known classi?ers, namely the Naive Bayesian (NB) classi?er, k-nearest neighbor (KNN) classi?er, Arti?cial Neural Network (ANN) classi?er, and the Support Vector Machine (SVM) classi?er. Data. The available MFL dataset used in the experimental work is categorized as follows. For the defect detection, there are 907 samples of normal status, and 2721 samples of the abnormal status. For the defect classi?cation, there are 907 samples for each type of defects. The data samples have been further divided as follows: 70% for training, 15% for validation, and 15% for testing. Defect Detection. The confusion matrix and the ROC curves for each detector model are shown in Figs. 10, 11, 12, 13 and 14 for the models NB, KNN, ANN, SVM, and the proposed decision tree (DT). In these ?gures, the normal status of the MFL signal is referred to by Class 1, and abnormal status is referred to by Class 2. Fig. 9. The defect classi?cation based on the two MFL attributes. Decision Tree-Based Approach for Defect Detection 497 Defect Classi?cation. The confusion matrix and the ROC curves for each classi?er model are shown in Figs. 15, 16, 17, 18 and 19 for the models NB, KNN, ANN, SVM, and the proposed decision tree (DT). In these ?gures, the defect type is referred to by Class 1, and defect Type II is referred to by Class 2. Fig. 10. The defect detection confusion matrix (a) and ROC curves (b) for the NB model. Fig. 11. The defect detection confusion matrix (a) and ROC curves (b) for the KNN model. 498 A. Mohamed et al. It should be noted from these ?gures that the proposed DT model outperforms all other models. It yields 99.2% accuracy for the detection and classi?cation. Moreover, the arti?cial neural network model yields the worst performance at 70.2% detection accuracy and 71.4% classi?cation accuracy. The defect detection and classi?cation performance of all models are summarized in Table 3. Fig. 12. The defect detection confusion matrix (a) and ROC curves (b) for the ANN model. Fig. 13. 
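The evaluation protocol described above (a 70/15/15 train/validation/test division, confusion matrices, and ROC curves for each model) can be sketched as follows. This is an illustrative MATLAB sketch only; the paper does not state its implementation environment, and the names X (the normalized MFL depth and length attributes) and labels (status 1 = normal, 2 = abnormal) are assumptions. A simple hold-out split stands in here for the full 70/15/15 division.

% Illustrative sketch, not the authors' code.
cv   = cvpartition(labels, 'HoldOut', 0.30);                 % hold out 30% (validation + test)
mdl  = fitctree(X(training(cv),:), labels(training(cv)));    % decision tree; fitcnb, fitcknn, fitcsvm for other models
[pred, score] = predict(mdl, X(test(cv),:));

C = confusionmat(labels(test(cv)), pred);                    % rows: true class, columns: predicted class
accuracy = sum(diag(C)) / sum(C(:));
[fpr, tpr, ~, auc] = perfcurve(labels(test(cv)), score(:,2), 2);   % ROC for the abnormal class (Class 2)
plot(fpr, tpr), xlabel('False positive rate'), ylabel('True positive rate')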
The defect detection confusion matrix (a) and ROC curves (b) for the SVM model. Decision Tree-Based Approach for Defect Detection 499 Fig. 14. The defect detection confusion matrix (a) and ROC curves (b) for the DT model. Fig. 15. The defect classi?cation confusion matrix (a) and ROC curves (b) for the NB model. 500 A. Mohamed et al. Fig. 16. The confusion matrix (a) and ROC curves (b) for the KNN model. Fig. 17. The defect classi?cation confusion matrix (a) and ROC curves (b) for the ANN model. Decision Tree-Based Approach for Defect Detection 501 Fig. 18. The defect classi?cation confusion matrix (a) and ROC curves (b) for the SVM model. Fig. 19. The defect classi?cation confusion matrix (a) and ROC curves (b) for the DT model. Table 3. Detection and classi?cation accuracy for the NB, KNN, ANN, SVM, and DT models. Classi?er model Defect Detection Classi?cation NB 87% 83.8% KNN 98.8% 96.8% ANN 70.2 71.4% SVM 89.5% 90% DT 99.2% 99.2% 502 A. Mohamed et al. 5 Conclusion The monitoring process for the oil and gas pipelines consists of two main tasks: defect detection and defect classi?cation. The complexity and amount of the MFL monitoring data make both tasks very dif?cult. In this work, we proposed a decision tree-based approach as a viable monitoring tool. The new approach is evaluated using two important criteria: the receiver operating characteristics (ROC) curves and the confu-sion matrices. The performance of the new approach is compared with other well-known monitoring tools. Extensive experimental work has been carried out and the performance of the proposed approach along with four other well-known techniques are reported. The new approach outperforms all of them with accuracy at 99.2% for the detection and classi?cation tasks. Acknowledgment. This work was made possible by NPRP Grant # [5-813-1-134] from Qatar Research Fund (a member of Qatar Foundation). The ?ndings achieved herein are solely the responsibility of the authors. References 1. Park, G.S., Park, E.S.: Improvement of the sensor system in magnetic flux leakage-type nod-destructive testing. IEEE Trans. Magn. 38(2), 1277–1280 (2002) 2. Jiao, J., et al.: Application of ultrasonic guided waves in pipe’s NDT. J. Exp. Mech. 1, 000 (2002) 3. Jiao, J., et al.: Application of ultrasonic guided waves in pipe’s NDT. J. Exp. Mech. 17(1), 1–9 (2002) 4. Layouni, M, Tahar, S., Hamdi, M.S.: A survey on the application of neural networks in the safety assessment oil and gas pipelines. In: 2014 IEEE Symposium on Computational Intelligence for Engineering Solutions. IEEE (2014) 5. Khodayari-Rostamabad, A., et al.: Machine learning techniques for the analysis of magnetic flux leakage images in pipeline inspection. IEEE Trans. Magn. 45(8), 3073–3084 (2009) 6. Lijian, Y., et al.: Oil-gas pipeline magnetic flux leakage testing defect reconstruction based on support vector machine. In: Second International Conference on Intelligent Computation Technology and Automation, ICICTA 2009, vol. 2. IEEE (2009) 7. Vidal-Calleja, T., et al.: Automatic detection and veri?cation of pipeline construction features with multi-modal data. In: 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2014). IEEE (2014) 8. Song, S., Que, P.: Wavelet based noise suppression technique and its application to ultrasonic flaw detection. Ultrasonics 44(2), 188–193 (2006) 9. Hwang, K., et al.: Characterization of gas pipeline inspection signals using wavelet basis function neural networks. NDT E Int. 33(8), 531–545 (2000) 10. 
Mukhopadhyay, S., Srivastava, G.P.: Characterisation of metal loss defects from magnetic flux leakage signals with discrete wavelet transform. NDT E Int. 33(1), 57–65 (2000) 11. Han, W., Que, P.: A modi?ed wavelet transform domain adaptive FIR ?ltering algorithm for removing the SPN in the MFL data. Measurement 39(7), 621–627 (2006) 12. Joshi, A., et al.: Adaptive wavelets for characterizing magnetic flux leakage signals from pipeline inspection. IEEE Trans. Magn. 42(10), 3168–3170 (2006) Decision Tree-Based Approach for Defect Detection 503 13. Qi, S., Liu, J., Jia, G.: Study of submarine pipeline corrosion based on ultrasonic detection and wavelet analysis. In: 2010 International Conference on Computer Application and System Modeling (ICCASM), vol. 12. IEEE (2010) 14. Afzal, M., Udpa, S.: Advanced signal processing of magnetic flux leakage data obtained from seamless gas pipeline. NDT E Int. 35(7), 449–457 (2002) 15. Guoguang, Z., Penghui, L.: Signal processing technology of circumferential magnetic flux leakage inspection in pipeline. In: 2011 Third International Conference on Measuring Technology and Mechatronics Automation (ICMTMA), vol. 3. IEEE (2011) 16. Kandroodi, M.R., et al.: Defect detection and width estimation in natural gas pipelines using MFL signals. In: 2013 9th Asian Control Conference (ASCC). IEEE (2013) 17. Cosham, A., Hopkins, P., Macdonald, K.A.: Best practice for the assessment of defects in pipelines—corrosion. Eng. Fail. Anal. 14(7), 1245–1265 (2007) 504 A. Mohamed et al. Impact of Context on Keyword Identi?cation and Use in Biomedical Literature Mining Venu G. Dasigi1(?) , Orlando Karam2 , and Sailaja Pydimarri3 1 Bowling Green State University, Bowling Green, OH, USA vdasigi@bgsu.edu 2 Kennesaw State University, Marietta, GA, USA orlando.karam@gmail.com 3 Life University, Marietta, GA, USA sailaja.pydimarri@life.edu Abstract. The use of two statistical metrics in automatically identifying important keywords associated with a concept such as a gene by mining scien- tific literature is reviewed. Starting with a subset of MEDLINE® abstracts that contain the name or synonyms of a gene in their titles, the aforementioned metrics contrast the prevalence of specific words in these documents against a broader “background set” of abstracts. If a word occurs substantially more often in the document subset associated with a gene than in the background set that acts as a reference, then the word is viewed as capturing some specific attribute of the gene. The keywords thus automatically identi?ed may be used as gene features in clustering algorithms. Since the background set is the reference against which keyword prevalence is contrasted, the authors hypothesize that di?erent back- ground document sets can lead to somewhat di?erent sets of keywords to be identi?ed as speci?c to a gene. Two di?erent background sets are discussed that are useful for two somewhat di?erent purposes, namely, characterizing the func- tion of a gene, and clustering a set of genes based on their shared functional similarities. Experimental results that reveal the signi?cance of the choice of background set are presented. 
Keywords: Literature mining · Automatic keyword identi?cation · TF-IDF Z-score · Background set · Features · Clustering 1 Objectives and Goals The usefulness of certain text mining approaches for automatic identification of keywords associated with documents and using those keywords for additional anal- ysis, such as classification and clustering of documents, have been studied previ- ously [1, 4, 7]. Keywords are identified by the strength of their association with documents or document classes, such as tweets [4] or research abstracts associated with specific genes [1, 7]. Keywords thus identified are used as features for addi- tional purposes, such as classification of tweets based on sentiment [4] or organ- izing genes into groups or clusters based on functional similarity [1, 7]. © Springer Nature Switzerland AG 2019 K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 505–516, 2019. https://doi.org/10.1007/978-3-030-02686-8_38 The strength of association a keyword has to a document or a collection is generally not determined in isolation or absolute terms, but within the context of its contrast to its strength in a reference or “background” set of documents. In this work, the focus is on the signi?cance of the context provided by the background set. The objective in this work is to understand the impact of context, provided by such a background collection of documents, in text mining to describe the function of a set of genes, and in explicating possible similarities in their function by grouping them into clusters. The task of clustering genes is carried out as a two-step process: First, keywords speci?c to each gene of interest are algorithmically extracted from a subset of MEDLINE® documents, based on two metrics: Z-score [1], a well-known statistical concept, and TF-IDF [8], a classic term weight metric from information retrieval. The formulation of these metrics helps identify how important and distinguishing a keyword is for a particular gene. In the second step of clustering, the classic K-means algorithm [5] is used to group related genes based on the keyword features into clusters. Each of these clusters is interpreted as comprised of functionally related genes, as indicated by the keywords the genes in each share among themselves. To achieve these stated goals, the extracted keywords should represent two aspects of the genes: they should be su?ciently speci?c as to characterize the gene and at the same time, some of them should be shared among multiple genes so the genes may be organized into functionally related clusters. To capture these two aspects for keywords, two di?erent background sets of documents are needed to provide a reference context. Others have evaluated strengths of di?erent features relative to the same background set, as Ikeda and Suzuki did in identifying peculiar composite strings as in DNA sequences [6]. However, few others have attempted to understand the impact of di?erent background sets in identifying keywords that are used for di?erent purposes. As pointed out above, keywords for a concept, such as a gene in this work, are identi?ed based on the strength of their association with the concept. Two alternative metrics, namely, Z-scores and a less explored variant of TF-IDF (de?ned below), are considered in this work to capture the strength associated with keywords. The quality of keywords extracted for some genes from each metric is evaluated by an expert. 
The quality of clusters resulting from K-means is evaluated by calculating the purity of clusters, which measures the overall similarity of the computed clusters of genes against expert-defined clusters [2].

2 Methods

Keywords capture and represent the content of documents, such as biomedical abstracts. Keywords that appear more often in a document are considered more likely to be representative of the content of the document. This ability of a keyword to represent the content of a document is called the representation aspect. Useful keywords also need to be able to distinguish between documents. A word that occurs in most documents obviously cannot distinguish among those documents. This ability of a keyword to discriminate between documents will be called the discrimination aspect. Thus, a word that occurs in only a few documents, i.e., one with a low document frequency, can set the small number of documents in which it does occur apart from the many in which it does not. A word that rates well in both the representation aspect and the discrimination aspect would thus be a good keyword.

When a concept, such as a gene, is captured by a set of documents, it is useful to extend these notions from a single document to a group of documents [3]. Thus, a keyword may be thought of as characterizing a group of documents (related to a specific concept, such as a gene) and as distinguishing the group from other groups (related to other concepts, such as other genes). In this extended view, the keyword may also be viewed as characterizing the concept itself, such as a gene (which underlies the group of documents), and as distinguishing it from other concepts or genes (which underlie the other groups of documents). In order to capture the representation and discrimination aspects of keywords relative to various concepts, the distribution of the keyword across the various (possibly overlapping) groups of documents, which correspond to the concepts in question, is of interest.

TF-IDF has traditionally focused on the representation and discrimination aspects of keywords relative to individual documents in information retrieval [8]. Andrade and Valencia, and others following them, have used the Z-score more naturally to capture the distribution of a keyword within groups of documents [1]. The Z-score is thus directly suitable for capturing the representation and discrimination aspects of keywords relative to groups of documents, and the concepts underlying them, as Andrade and Valencia did with protein families. In order to take advantage of the powerful notion of TF-IDF, while adapting it to the context of concepts represented by groups of documents, the original definition is improvised here. A brief definition of the Z-score is presented first, followed by a discussion of an improvised variant of TF-IDF that extends to groups of documents.

2.1 The Z-Score

Well known in statistics, the Z-score of a word a relative to a gene (or other concept) g is defined as follows, where F stands for a frequency that simply counts the number of documents containing a word:

Z_a^g = (F_a^g − F̄_a) / σ_a,

where F_a^g, F̄_a, and σ_a all relate to the word a, and are, respectively, the frequency (the number of documents that contain the word a, as mentioned above) in the group corresponding to the gene g, the average frequency across the groups corresponding to all genes of interest, and the standard deviation of the frequency across the groups of documents corresponding to all genes of interest.
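The Z-score just defined is straightforward to compute. The sketch below (illustrative Python, not the authors' code; the toy data, function names, and the use of the population standard deviation are assumptions, since the paper does not spell these out) counts document frequencies per gene group and derives the Z-score of a word for each gene against the set of all groups.

```python
# Illustrative sketch: Z-score of a word relative to each gene, where each gene
# is represented by a group of documents (each document a list of tokens).
from statistics import mean, pstdev

def document_frequencies(word, groups):
    # F_a^g: number of documents in each gene's group that contain the word
    return {gene: sum(word in doc for doc in docs) for gene, docs in groups.items()}

def z_scores(word, groups):
    freqs = document_frequencies(word, groups)
    avg = mean(freqs.values())    # average frequency across all groups (background set)
    sd = pstdev(freqs.values())   # standard deviation across all groups (assumption: population s.d.)
    if sd == 0:                   # word equally (in)frequent everywhere
        return {gene: 0.0 for gene in freqs}
    return {gene: (f - avg) / sd for gene, f in freqs.items()}

# Toy example (hypothetical documents):
groups = {
    "ace2": [["cell", "cycle", "ace2"], ["ace2", "transcription"]],
    "cdc21": [["dna", "replication"], ["cdc21", "cell"]],
}
print(z_scores("ace2", groups))
```

As in the definition, the mean and standard deviation are taken over exactly the set of groups that plays the role of the background set.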
While the standard deviation plays a useful role in defining the Z-score, it is not a focus of this paper. Thus, the Z-score is a measure of how many standard deviations away the frequency of the word in a group of documents corresponding to a given gene is from the average frequency of the word across the set of various groups of documents corresponding to all the genes that are of interest; for instance, a Z-score of 3 means that the frequency in question is 3 standard deviations above average. The set of the various groups of documents corresponding to all the different genes of interest, used as a reference against each individual group of documents that corresponds to a specific gene, is referred to as the background set. The need for selecting appropriate background sets for different purposes is discussed in Sect. 2.3.

2.2 TF-IDF and Its Variant TF-IGF

The TF-IDF score is classic and well known in the information retrieval literature, and has been used to capture the strength of individual words to characterize documents and distinguish them from other documents [8]. In contrast, in the present work, keywords are of interest that distinguish a gene from other genes (or a concept from other similar concepts). The authors have previously extended the notion of TF-IDF to this new context [3]. Since the extension is not as well known as the Z-score, it is briefly explained here.

The entire document collection may be thought of as comprising (not necessarily exhaustively) a number of possibly overlapping groups of documents corresponding to the different genes that are of interest. There may also be other documents that are unrelated to any of these genes; thus the overlapping groups of documents do not necessarily exhaust the entire document collection. Here the focus is on characterizing the representational and discriminating aspects of words relative to each gene (which corresponds to a group of documents), and not relative to each document (as was the focus of TF-IDF). The extension involves defining the term frequency TF_a^g of a term a relative to a gene (represented by a group of documents) g, the group frequency of a term a (similar to the document frequency of a term), denoted GF_a, the inverse group frequency of a term a, denoted IGF_a, and finally the combined notion TF-IGF_a^g, the group variant of TF-IDF, that brings all the pieces together. TF_a^g is defined as the sum of the number of times the word a appears in the documents corresponding to the gene g (Footnote 1), that is,

TF_a^g = Σ_{d ∈ g} tf_a^d,

(Footnote 1: This is also sometimes called the collection frequency of the term in the set of documents, and counts the total number of occurrences of the term in all the documents of the collection. It differs from the document frequency of a term in a collection of documents in that the document frequency just counts how many documents contain the term, with no distinction on the number of occurrences.)
where g is used to refer to a gene, as well as to the group of documents associated with it. The summation is over all documents d associated with the gene, or group of documents, g, and tf_a^d is the frequency of the term a in d. GF_a is defined simply as the number of genes, or groups of documents, that include at least one document containing the word a. Here G denotes the entire set of genes or groups of documents:

GF_a = Σ_{g ∈ G} [1 if there exists a document d ∈ g in which a appears; 0 otherwise].

IGF_a is defined much like the classic IDF:

IGF_a = log(|G| / GF_a),

where |G| is the cardinality of the set of gene groups (44 in the present work with yeast genes). Finally, TF_a^g and IGF_a are multiplied to form

TF-IGF_a^g = TF_a^g · IGF_a.

Above, G has denoted the entire set of genes or groups of documents used in computing the inverse group frequency IGF_a of a word a. This component is intended to capture the aspect of keywords that can distinguish a gene associated with a particular group of documents from all genes and the document groups associated with them. As in the case of the Z-score, this entire set of groups G, used as the reference against which individual groups are contrasted, is called the "background set" here. The significance of the background set is discussed in more detail in the next subsection.

2.3 The Background Set

In the definitions of both the Z-score and TF-IGF, the reference set of the various groups of documents corresponding to all the different genes that are of interest has been called the "background set". The background set is roughly the universe of interest. The focus is on how a word can distinguish a "foreground" set of documents, which corresponds to a specific gene, from a background set of documents, which corresponds to all the genes, and possibly all other concepts at large. Since each gene corresponds to a group of documents, the term "gene" and the phrase "group of documents" are sometimes used interchangeably when it suits the context (often with the symbol g left ambiguous between a gene and a group of documents).

The Z-score tries to capture how the frequency of a word (in a specific group of documents corresponding to a "foreground" gene) deviates, in terms of standard deviations, from the average frequency of the word in the groups of documents corresponding to the background set of genes. Thus, the average frequency F̄_a and the standard deviation σ_a are both computed from the background set. If a word is contained in only one group of documents corresponding to a specific gene, then the average frequency of the word in the background set would be very small, so the word would have a high Z-score for that gene, and potentially negative Z-scores for all other genes. This in turn captures the notion that the word is very significant for that particular gene. Thus it helps in distinguishing the gene from others, and possibly in capturing part of its functional description. TF-IGF attempts to capture how high the frequency of a word in a specific group of documents corresponding to a "foreground" gene is, while the word occurs relatively infrequently in the groups of documents corresponding to the background set of genes.
For any given word a, the first aspect is captured by a high TF_a^g for a specific group of documents corresponding to a gene g, and the second aspect is captured by a high IGF_a in the background set. As with the Z-score, if a word a has a high TF-IGF_a^g for a gene g, the word helps distinguish the gene from others, possibly capturing part of its functional description.

Keywords identified for a gene using the Z-score or TF-IGF could conceivably serve at least two distinct purposes. They could be used to characterize or describe the function of the gene as uniquely or distinctly as possible. Here, the focus would be on distinguishing each gene from the others. Alternatively, the keywords might be used to identify possible functional similarities and overlaps between the different genes (indicated by possibly shared functional keywords). In this case, it would be desirable to see the keywords capture as much of the functionality of each gene as possible, rather than emphasize their distinction from other genes.

It appears that the specific choice of background set can impact the appropriateness of the keywords selected for the gene for the two distinct purposes discussed above. In order to obtain keywords that uniquely characterize a gene, the keywords should be associated with the gene in question, but not with any or most of the other genes. A natural background set for this purpose would be one that includes groups of documents that correspond to each of the genes that are of interest, and no others. Every document in the background set would be associated with one or more of the genes being studied that we seek to distinguish from one another. There would be no documents in the background set that are unrelated to one gene or another from the set of genes being studied. In the rest of the paper, this background set of documents is referred to as the restricted background set.

On the other hand, suppose the focus were instead on grouping the various genes from the set being studied into clusters based on similarities of function, indicated by any keywords associated with each gene that are shared with at least one other gene. In this scenario, what would be very useful is to allow keywords identified for different genes to overlap somewhat, indicating potential similarities in function between pairs of genes, based on any keywords the pair shares. For this purpose, a background set such as the one described in the preceding paragraph would be inappropriate, because it tends to focus on distinguishing the various genes, rather than on whether they could be similar. A different background set that includes many general documents (including other biomedical documents, possibly not about any of the genes being studied) might provide a broader and more neutral reference. For instance, the entire MEDLINE® document collection, which includes many documents that are not necessarily about any of the genes in question, could be such a background set. This kind of background set is naturally called unrestricted.

In this work, a restricted background set and an unrestricted background set are created for use in identifying slightly different keyword sets for each gene. The hypothesis, to be verified, is that the former background set is more suitable for selecting keywords that are better for characterizing gene function uniquely, while the latter is more appropriate for selecting keywords used as gene features for functional clustering of genes.
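To make the role of the background set concrete, the following sketch (illustrative Python, not the authors' implementation; the function names and data structures are assumptions) computes TF-IGF for a word and a gene against a caller-supplied background set, so switching between a restricted and an unrestricted background set amounts only to passing a different collection of document groups.

```python
# Illustrative sketch: TF-IGF of a word for a gene, computed against a chosen
# background set of document groups (each document a list of tokens).
import math

def tf(word, docs):
    # TF_a^g: total number of occurrences of the word in the gene's documents
    return sum(doc.count(word) for doc in docs)

def igf(word, background_groups):
    # GF_a: number of groups containing the word at least once; IGF_a = log(|G| / GF_a)
    gf = sum(any(word in doc for doc in docs) for docs in background_groups.values())
    # Simplifying assumption: a word absent from every background group gets IGF 0.
    return math.log(len(background_groups) / gf) if gf else 0.0

def tf_igf(word, gene, foreground_groups, background_groups):
    return tf(word, foreground_groups[gene]) * igf(word, background_groups)
```

Under the restricted setting, background_groups would be the 44 gene groups themselves; under the unrestricted setting, it would be a random partition of the whole corpus into 44 groups, as described in the text that follows.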
The restricted background set is simply formed from the 44 groups of documents that correspond to the 44 yeast genes. It is simply the union of all these documents, 2,233 in total. The unrestricted background set is the entire collection of 6,791,729 MEDLINE® abstracts (which is a superset of the restricted set, since they were all downloaded at the same time). This entire set is divided randomly into 44 groups, so as to keep the methodology consistent and comparable.

3 Results and Analysis

As indicated before, a set of 44 genes that are involved in the cell cycle of budding yeast has been chosen for this study, since others have studied them as well. For example, Cherepinsky et al. includes a study specifically on gene clustering, where they also include an expert-defined clustering based on functions and transcriptional activators [2]. In this work, that same expert-defined clustering (not shown here) is used as the basis for comparison of the quality of clustering.

Using both TF-IGF and Z-score with context provided by the restricted and unrestricted background sets, the N top-ranking keywords were generated by varying N from 10 to 100 for each gene. Thus, four combinations of experiments in all were performed for generating keywords and for clustering genes. The top 30 keywords generated by both TF-IGF and Z-scores for three different genes were evaluated by an expert. Using the top N keywords as features, the K-means algorithm was used to compute gene clusters [5]. The flow of data for the computations of Z-score, TF-IGF, and K-means is illustrated in Fig. 1. For the 44 yeast genes in consideration, the purity of the computed clustering was evaluated relative to the expert-generated clustering from Cherepinsky et al. [2], as mentioned at the beginning of this section. A clustering is a set of sets of genes. Each inner set of genes is sometimes called a cluster; thus a clustering is a set of clusters. Purity is calculated by first computing, for each inner set of genes in the computed clustering, the best degree of match against any inner set of genes in the expert clustering, and then averaging this measure over all inner sets in the clustering.

For clustering purposes, once all the keywords for each of the 44 genes were identified, any keywords that are unique to each gene, that is, those not shared by at least two genes, were eliminated. Note that these eliminated words are very important for an entirely different purpose, namely, to describe the potentially unique functional aspects of the respective genes, although they are not that useful for clustering by K-means. A little fewer than half of the total keywords were unique and eliminated for clustering purposes, leaving a little over half of the total keywords shared by at least two genes.

3.1 Impact of Different Background Sets – Keyword Quality

An expert was asked to evaluate the top 30 ranking keywords for three genes, namely ace2, cdc21, and mnn1, from all four combinations of experiments. Not surprisingly, the name of the gene itself is ranked at the top in most cases. According to the expert, keywords obtained using TF-IGF were better than those based on Z-scores. Contrary to initial expectation, in the first cut, the quality of the keywords did not appear to depend significantly on the background set, although there were differences. However, an interesting observation was made for ace2, which is the name of both a yeast gene and a human gene.
When Z-scores were computed using the restricted background set, more keywords related to the cell cycle function of the human gene (renal activity) were selected than with the unrestricted background set. This surprising result has an interesting explanation: the restricted background set results in keywords that are less likely to be shared between the different genes, and keywords related to human functions of ace2 are less likely to be shared by other yeast genes. This expectation was at the heart of the rationale behind the original hypothesis that keywords selected with the smaller, restricted background set are better for defining the functions of the genes, with a particular focus on their distinctness relative to all the other genes represented in the background set, while those selected with the larger, unrestricted background set are better for clustering! Space considerations prohibit listing the keywords identified for the genes under the four combinations.

Fig. 1. The entire MEDLINE corpus constitutes the unrestricted background set; the restricted background set is the subset of documents that corresponds to the 44 yeast genes. Four sets of keywords are computed, based on Z-scores using the restricted background set (K-Z-R) and the unrestricted one (K-Z-U), as well as based on TF-IGF using the restricted background set (K-T-R) and the unrestricted one (K-T-U). Eventually four clusterings (C-Z-R, C-Z-U, C-T-R, and C-T-U) are computed by K-means using these respective sets of keywords.

3.2 Impact of Different Background Sets – Functional Clustering of Genes

After identifying sets of keywords associated with the genes of interest, the keywords that are shared by more than one gene were used as features that form the basis of K-means clustering. An initial set of clustering experiments was conducted separately with the values of TF-IGF and Z-score that are associated with each keyword as feature weights for the clustering algorithm. Those initial experiments produced clustering results that were not particularly meaningful or interesting, prompting the authors to switch to using simple binary weights for the features (keywords) instead. The binary weights were defined as 1 if the word appears in at least one document associated with the gene set, and 0 otherwise. Simplistic as these weights are, intuitively, binary weights on keywords in the context of the clustering algorithm capture the notion of shared keywords. Experiments were repeated based on the 10, 20, 30, 50, 70, and 100 top-ranking keywords for each gene from each of the lists generated by TF-IGF and Z-scores. Tables 1 and 2 show the purity results for the clusters computed by the K-means algorithm based on keywords generated using TF-IGF and Z-scores, respectively, both within the context of the restricted background set. In the bottom row, the tables also show the total number of distinct keywords used by the clustering algorithm across all 44 genes.

Table 1. Clustering results for 9 clusters using binary keyword weights, based on 1000 runs, with keywords based on TF-IGF computed in the context of the restricted background set.

              Top 10  Top 20  Top 30  Top 50  Top 70  Top 100
Micro purity  0.636   0.659   0.682   0.562   0.500   0.546
Macro purity  0.707   0.723   0.742   0.643   0.559   0.567
Keywords      315     600     830     1383    1833    2530

Table 2. Clustering results for 9 clusters using binary keyword weights, based on 1000 runs, with keywords based on Z-scores computed in the context of the restricted background set.
              Top 10  Top 20  Top 30  Top 50  Top 70  Top 100
Micro purity  0.409   0.477   0.477   0.432   0.432   0.409
Macro purity  0.455   0.523   0.511   0.496   0.489   0.443
Keywords      475     1010    1524    2280    2888    3623

Purity values are averaged across all clusters in two possible ways. As mentioned before, each clustering (which is a set of sets of genes) may be viewed as a set of clusters, where each cluster is a set of genes. Macro-averaging involves simply averaging the purities of individual clusters across all clusters of a clustering. Since the purity of each cluster is a ratio, the alternative technique of micro-averaging (which is not really a kind of averaging in the mathematical sense) involves taking the ratio of the sum of numerators and the sum of denominators, without reducing any of the individual ratios. The micro purity and macro purity rows in the tables refer to the micro-averaged purity and macro-averaged purity across the clusters of the computed clustering.

From Tables 1 and 2, it is interesting to notice that, when the restricted background set is used in computing the metrics, the purity of the clusters based on keywords identified using TF-IGF is substantially better than that relating to Z-scores. The results with TF-IGF are better in terms of both higher purity and fewer keywords than with Z-scores! Fewer features allow for faster clustering.

The experiments were continued with the unrestricted background set to compute the TF-IGF and Z-scores, and select keywords based on those computations. The unrestricted background set has a much larger number of documents. The documents were divided randomly into 44 groups for calculating the IGF and Z-scores. The 10, 20, 30, 50, 70, and 100 top-ranking keywords were once again obtained for each gene from the lists generated based on the TF-IGF and Z-score metrics. Only the keywords shared by at least two genes were considered, and the binary feature weight was used once again. The K-means algorithm was repeated 1000 times again with 9 clusters to get the optimal solution many times. Tables 3 and 4 show the purity results of clusters computed by the K-means algorithm based on keywords generated using TF-IGF and Z-scores, respectively, this time both within the context of the unrestricted background set. As in Tables 1 and 2, in the bottom row, Tables 3 and 4 also show the total number of distinct keywords used by the clustering algorithm across all 44 genes.

Table 3. Clustering results for 9 clusters using binary keyword weights, based on 1000 runs, with keywords based on TF-IGF computed in the context of the unrestricted background set.

              Top 10  Top 20  Top 30  Top 50  Top 70  Top 100
Micro purity  0.682   0.614   0.682   0.659   0.636   0.636
Macro purity  0.749   0.674   0.640   0.696   0.716   0.674
Keywords      247     417     590     885     1168    1563

Table 4. Clustering results for 9 clusters using binary keyword weights, based on 1000 runs, with keywords based on Z-scores computed in the context of the unrestricted background set.

              Top 10  Top 20  Top 30  Top 50  Top 70  Top 100
Micro purity  0.682   0.659   0.614   0.636   0.636   0.568
Macro purity  0.708   0.699   0.708   0.728   0.630   0.663
Keywords      309     547     747     1139    1526    2067

Tables 3 and 4 indicate that, in the context of an unrestricted background set, the purity of clustering with either metric is perhaps comparable to that with the other. The total number of keywords extracted with TF-IGF is always fewer than that with Z-scores, though, indicating faster clustering with TF-IGF.
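For reference, the micro- and macro-averaged purity used in Tables 1–4 can be computed as in the sketch below (illustrative Python, not the authors' code; the toy clusterings are hypothetical). It follows the description given earlier: for each computed cluster, the best overlap with any expert cluster is found, and the resulting ratios are either averaged (macro) or combined as a single ratio of sums (micro).

```python
# Illustrative sketch: micro- and macro-averaged purity of a computed clustering
# against an expert clustering; each clustering is a list of sets of gene names.
def purity(computed, expert):
    best_matches = [max(len(c & e) for e in expert) for c in computed]  # best overlap per cluster
    sizes = [len(c) for c in computed]
    macro = sum(m / s for m, s in zip(best_matches, sizes)) / len(computed)  # average of ratios
    micro = sum(best_matches) / sum(sizes)                                   # ratio of sums
    return micro, macro

# Hypothetical example:
computed = [{"ace2", "swi5"}, {"cdc21", "mnn1", "clb2"}]
expert   = [{"ace2", "swi5", "clb2"}, {"cdc21", "mnn1"}]
print(purity(computed, expert))
```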
The purities of the clusterings based on keywords with the N top-ranking Z-scores computed relative to both the restricted and unrestricted background sets are compared next, from Tables 2 and 4. It can readily be seen that the purity is much better across the board for the clusterings computed in the context of the unrestricted background set than for the clusterings computed in the context of the restricted background set. The positive impact of the unrestricted background set is also evident from a comparison of the numbers of keywords used in computing the clusterings for each threshold of top-ranking keywords. Fewer keywords in the context of the unrestricted background set obviously mean more keywords are shared between different genes. These results substantiate the original hypothesis for the Z-score that an unrestricted background set allows for identification of more shared keywords for genes, and consequently, better clustering by gene function.

Now, consider the purity of the clusterings generated by keywords with the top N TF-IGF scores, computed in the context of the restricted and unrestricted background sets, respectively, from Tables 1 and 3. These results with the TF-IGF metric are less categorical in terms of cluster purity than those with Z-scores, although the purity results with the unrestricted background set are somewhat better or the same in most (actually 75%) of the cases presented, and in all the cases when 50 or more top-ranked keywords are considered. Another interesting point to note is that, as with the Z-score, there is always a smaller number of distinct words among the top N ranking words when the unrestricted background set is used, indicating that more keywords are shared within the context of an unrestricted background set. The chances of more keywords being shared are higher when more keywords are considered, in general, which is borne out by the previous observation that purity is clearly improved with the unrestricted background set when 50 or more keywords are considered. These results once again lend substantial support to the original hypothesis that, even for TF-IGF, use of a broader or unrestricted background set is better for functional clustering of genes than a narrower or more restricted one.

4 Summary and Conclusion

In this paper, two metrics have been reviewed for identifying keywords that have a strong association with a particular concept of interest, such as a gene, based on the prevalence of the keyword in documents that are about the concept, contrasted with the keyword's distribution in a general "background" set of documents. The two metrics used in working with a set of 44 yeast genes are the standard statistical metric of Z-score and an extension of the classic TF-IDF weight metric from information retrieval, which has been named TF-IGF. The initial hypothesis is that different choices of background sets of documents lead to keywords with somewhat different properties suitable for different purposes. In relation to the ability of keywords to uniquely characterize the genes, especially as distinguished from other genes, TF-IGF seemed to yield somewhat better keywords, as judged by an expert. Some weak evidence was also found for the hypothesis that a restricted background set might be more suitable for identifying keywords that are likely to uniquely characterize the genes in the context of the Z-score.
As for clustering of genes, TF-IGF produced keywords that led to clustering with better purity than the Z-score, with either background set. The results were also achieved with fewer keywords with TF-IGF than with the Z-score, which is an additional bonus that leads to faster clustering. In addition, strong evidence was found for the hypothesis that an unrestricted background set is more suitable, with either Z-score or TF-IGF, for identifying keywords that could potentially be shared between different genes and are thus more suitable for use in the K-means clustering algorithm. The evidence was supportive of the original hypothesis in two respects: a higher averaged cluster purity was obtained, with fewer keywords, with the unrestricted background set, irrespective of whether Z-score or TF-IGF was used to identify the keywords.

A final observation about our hypothesis concerning the impact of the choice of background set on keyword quality for characterizing each gene and on clustering of genes needs to be made in relation to the Z-score metric. For characterizing each gene distinctly, as many unique keywords as possible (preferably not shared with many other genes) need to be identified for each gene. For clustering of genes based on shared function, it is important to allow more keywords with strong association to the genes to be shared between multiple genes. The results presented in Sects. 3.1 and 3.2 support this hypothesis much more decisively for the Z-score metric than for TF-IGF. Indeed, this is not so surprising because the hypothesis has a more intuitive basis in the definition of the Z-score metric!

Acknowledgments. The authors acknowledge that the MEDLINE® data used in this research are covered by a license agreement supported by the U.S. National Library of Medicine. Thanks are also due to Professor Rajnish Singh (Kennesaw State University) for her assistance in relation to evaluating the keywords for the various genes, and for her help in other ways related to this work.

References

1. Andrade, M., Valencia, A.: Automatic extraction of keywords from scientific text: application to the knowledge domain of protein families. Bioinformatics 14(7), 600–607 (1998). https://doi.org/10.1093/bioinformatics/14.7.600
2. Cherepinsky, V., Feng, J., Rejali, M., Mishra, B.: Shrinkage based similarity metric for cluster analysis of microarray data. Proc. Natl. Acad. Sci. USA 100(17), 418–427 (2003). https://doi.org/10.1073/pnas.1633770100
3. Dasigi, V., Karam, O., Pydimarri, S.: An evaluation of keyword selection on gene clustering in biomedical literature mining. In: Proceedings of Fifth IASTED International Conference on Computational Intelligence, pp. 119–124 (2010). http://www.actapress.com/Abstract.aspx?paperId=43008
4. Hamdan, H., Bellot, P., Béchet, F.: The impact of Z-score on Twitter sentiment analysis. In: Proceedings of 8th International Workshop on Semantic Evaluation, pp. 596–600 (2014). https://doi.org/10.3115/v1/s14-2113
5. Hartigan, J.A., Wong, M.A.: Algorithm AS 136: a K-means clustering algorithm. J. R. Stat. Soc. Ser. C (Appl. Stat.) 28(1), 100–108 (1979). https://doi.org/10.2307/2346830
6. Ikeda, D., Suzuki, E.: Mining peculiar compositions of frequent substrings from sparse text data using background texts. In: Proceedings of European Conference on Machine Learning and Knowledge Discovery in Databases, Springer Lecture Notes in Artificial Intelligence, vol. 5781, pp. 596–611 (2009).
https://doi.org/10.1007/978-3-642-04180-8_56
7. Liu, Y., Navathe, S., Pivoshenko, A., Dasigi, V., Dingledine, R., Ciliax, B.: Text analysis of MEDLINE for discovering functional relationships among genes: evaluation of keyword extraction weighting schemes. Int. J. Data Min. Bioinform. 1(1), 88–110 (2006). https://doi.org/10.1504/ijdmb.2006.009923
8. Salton, G., Buckley, C.: Term-weighting approaches in automatic text retrieval. Inf. Process. Manag. 24, 513–523 (1988). https://doi.org/10.1016/0306-4573(88)90021-0

A Cloud-Based Decision Support System Framework for Hydropower Biological Evaluation

Hongfei Hou1,2(✉), Zhiqun Daniel Deng1,3, Jayson J. Martinez1, Tao Fu1, Jun Lu1, Li Tan2, John Miller2, and David Bakken4
1 Pacific Northwest National Laboratory, Energy and Environment Directorate, Richland, WA 99352, USA
hongfei.hou@wsu.edu
2 School of Engineering and Applied Sciences, Washington State University Tri-Cities, 2710 Crimson Way, Richland, WA 99354, USA
3 Department of Mechanical Engineering, Virginia Tech, Blacksburg, VA, USA
4 School of Electrical Engineering and Computer Science, Washington State University, 355 NE Spokane St., Pullman, WA 99163, USA

Abstract. Hydropower is one of the most important energy sources: it accounts for more than 80% of the world's renewable electricity and 16% of the world's electricity. Significantly more hydropower capacity is planned to be developed. However, hydro-structures, including hydroelectric dams, may have adverse biological effects on fish, especially on migratory species. For instance, fish can be injured or even killed when they pass through turbines. This is why biological evaluations of hydro-structures are needed to estimate fish injury and mortality rates. The Hydropower Biological Evaluation Toolset (HBET) is an integrated suite of science-based desktop tools designed to evaluate whether the hydraulic conditions of hydropower structures are fish friendly by analyzing collected data and providing estimated injury and mortality rates. The Sensor Fish, a small autonomous sensor package, is used by HBET to record data describing the conditions that live fish passing through a hydropower structure will experience. In this paper, we present a plan to incorporate cloud computing into HBET and migrate it into a cloud-based decision support system framework for hydropower biological evaluation. These enhancements will make the evaluation system more scalable and flexible; however, they will also introduce a significant challenge: how to maintain security while retaining scalability and flexibility. We discuss the technical methodologies and algorithms for the proposed framework, and analyze the relevant security issues and associated security countermeasures.

Keywords: Decision support system · Hydropower · Dam · Fish injury · Fish-friendly turbine

1 Introduction

A decision support system (DSS) is a type of interactive knowledge-based software that uses predefined models to process data inputted from various data sources to help businesses and organizations in decision-making activities [1]. A DSS is composed of three
fundamental components [2]: a data management component imports/stores data and provides data access to other components; a decision-making component, containing predefined decision-making models, compiles useful information from the data provided by the data management component to make decisions; and a presentation component enables users to interact with the system (Fig. 1). A DSS will be incorporated into the Hydropower Biological Evaluation Toolset (HBET), an integrated suite of science-based desktop tools designed to evaluate the degree to which the hydraulic conditions of hydropower structures (e.g., turbine, spillway, overshot weir, undershot weir, and pumped storage) affect entrained fish by analyzing the collected data and providing estimated injury and mortality rates based on experimentally derived, species-specific dose-response relationships [3].

Fig. 1. Architecture of DSS [4].

Hydropower is one of the most important energy sources, accounting for more than 80% of the world's renewable electricity and about 16% of the entire world electricity supply [5]. Significantly more hydropower capacity is planned to meet demand [5]. However, hydro-structures, including hydroelectric dams and hydraulic turbines, may have adverse biological effects on fish, especially on migratory species. For example, fish can be injured or even killed when they pass through turbines [6–10]. This is why biological evaluations of hydro-structures are needed to estimate fish injury and mortality rates. HBET uses the Sensor Fish (SF), a small autonomous sensor package instrument [11], to collect data describing the conditions that would be experienced by live fish passing through a hydro-structure. SF and HBET can support evaluations of turbines and sites, including physical components (barriers, trash racks, spillways, etc.) that fish interact with during downstream passage, to identify the most fish-friendly alternatives.

Currently HBET is platform dependent, and is available only to users who have access to computers where HBET has been installed. To increase its availability and usage, we will incorporate cloud computing into HBET and migrate it into a cloud-based DSS framework for hydropower biological evaluation. This will make HBET available to users no matter where they are, as long as they have an internet connection. Users would always use the latest version without installation and upgrading. This will also make HBET more scalable and flexible in incorporating new dose-response relationships, new fish species, and new study types. However, at the same time, it introduces a significant concern: how to maintain system security without adversely impacting scalability and flexibility, so that no proprietary information is compromised. In this paper, we discuss the technical methodologies and algorithms for the proposed framework, and analyze the relevant security issues and associated security countermeasures.

2 Overview of the Framework

The framework contains three major components, which reside in the cloud (Fig. 2). The first component is data acquisition and integration (DAI), which contains modules that receive data. There are two types of data sources for hydropower evaluation: the internal database and external SF files. The second component is decision-making (DM), which contains modules to estimate injury and mortality rates for different fish species.
For example, one module could estimate the barotrauma mortal injury rate, and another, the major injury rate due to shear. Currently HBET can assess strike, shear, and barotrauma stressors. Fish species it supports include Chinook salmon, Australian bass, Gudgeon, Murray cod, and Silver perch. Any incorporation of new stressors or new fish species can be added through this component. The third component is data validation and self-monitoring (DVSM), which contains modules that validate input data for DAI modules and monitor outputs and behaviors of DM modules. For each module in the first two components, there will be a corresponding data validation and self-monitoring module. Each component can adopt as many modules as needed, and these modules can be used for different purposes (i.e., different fish species, different study types, and so on). For example, the DM component can contain DM modules for Chinook salmon and for Australian bass. Similar to a typical DSS, the proposed framework also includes a knowledge base, which includes information such as rules, logic, and corresponding conditions.

Fig. 2. Architecture of the proposed framework.

In the proposed framework, we will implement four countermeasures to address cloud-related security concerns:
1. Introduce DVSM modules into the framework. Data validation is used to filter out invalid input data. Self-monitoring will use data mining to predict each DM module's output, and compare the predicted result with the actual output from the DM module to determine if the output is expected. Self-monitoring will also monitor modules' behavior, such as resource usage and execution time (i.e., the proposed framework would monitor its own behavior during runtime).
2. Use data encryption so that data cannot be interpreted even if it is exposed to unauthorized users.
3. Use a login token and temporary password so that any request to the cloud interfaces and APIs requires a valid login.
4. Create a module set for each study type so that a module set's failure in one study type does not affect other study types.

3 System Security and Countermeasures

In order to analyze the security level of the proposed framework, we first need to identify its vulnerabilities. There are common vulnerabilities that exist in all types of DSSs, such as security issues in account authentication and lack of security education [12]. In this research, we focus on the vulnerabilities that exist only in cloud-based DSSs, but not in desktop ones:
1. Insecure cloud interfaces and APIs. Cloud-based systems provide cloud interfaces and APIs [13] through which to communicate with other systems and/or devices, and thus their security will depend on the security of the cloud interfaces and APIs. These issues include insecure cloud interfaces, immature cloud APIs, insufficient inputted data validation, and insufficient self-monitoring [14].
2. Resource overbooking. Resources can be overused if the modules in the cloud-based DSS are modeled inaccurately [15]. This can also happen if attackers intentionally design a module to allocate or occupy resources without limits. If resource overbooking occurs, services of a cloud-based DSS will become unavailable (i.e., the DSS will be inaccessible). Typical methods employed by attackers to overbook resources include unlimited memory allocation, unlimited occupation of storage, and unlimited occupation of bandwidth.
3. Data exposure.
Input data should only be accessible and exposed to the desired DAI modules, and output from DM modules should only be exposed to the desired devices or DVSM modules. However, since the data or training set data are saved in the cloud-based database, they can be co-located with data owned by competitors or intruders because of weak separation [16].
4. Vulnerabilities in virtual machines and hypervisors. Cloud-based systems will run in virtual machines or hypervisors. Compromises occurring in virtual machines and hypervisors may introduce data leakage [17] and resource overbooking.

Threats in cloud computing will also exist in a cloud-based DSS. There are 12 top security threats faced by cloud-based services, called the "Treacherous 12" [18], as shown in the first 12 rows of Table 1. "Data breach" is always the major concern for all systems, including both desktop-based systems and cloud-based ones. "Data encryption" can be applied to saved data so that data cannot be interpreted even if it is breached. "Insufficient identity, credential, and access management", "system vulnerabilities", "account or service hijacking", "malicious insiders", "advanced persistent threats", "data loss", and "insufficient due diligence" can bring security risks to input data, stored data, and the systems themselves. In this research, we aim to maintain flexibility and scalability while retaining security when migrating a desktop DSS into a cloud-based one. Thus, we will only focus on the threats that are unique to cloud-based systems and some threats that are major concerns: "data breaches", "insecure cloud interfaces and cloud APIs", "abuse and nefarious use of cloud services", "denial of services", and "shared-technology vulnerabilities". The countermeasures we will implement in the proposed framework are designed to address these threats.

Table 1. Threats in cloud-based DSS

Data breaches: Data will be breached if it is accessed by unauthorized services or function calls, or when authorized services or function calls use the data in an improper way. Data breaches are not unique to cloud-based DSSs, but they are the top concern for cloud-based DSS users [15, 19].
Insufficient identity, credential, and access management: If identity, credential, and access management is not sufficient, sensitive data can be exposed to unauthorized entities, and data and applications can be manipulated unexpectedly [19].
Insecure cloud interfaces and cloud APIs: Cloud interfaces and cloud APIs are fundamental parts of cloud-based DSSs. They are the bridges between system components and databases. If the cloud interfaces and APIs are not secure, attackers can use them to access data and perform actions as often as they wish.
System vulnerabilities: System vulnerabilities include bugs or issues in operating systems or software. Exploiting system vulnerabilities is a common way for attackers to carry out their attacks.
Account or service hijacking: After hijacking an account or service, attackers can bypass the authentication process and then pretend to be legitimate users, operators, or software developers, in order to achieve their goals [19].
Malicious insiders: Malicious insiders can cause much more damage than other threats. For example, a system administrator can access any data and any application, and thus can inflict any kind of damage.
Advanced persistent threats: Advanced persistent threats (APTs) are cyberattacks used to gain control over systems in order to steal data.
Data loss: Input data or stored training set data can be deleted or erased once attackers take control of a system. This can also occur because of human error, but that is not the focus of this research.
Insufficient due diligence: Without due diligence, wrong technologies or wrong system configurations can be applied. This introduces a potentially large risk.
Abuse and nefarious use of cloud services: If cloud services are not secured, they can be abused to achieve certain specific goals; for example, email spam.
Denial of services: If a resource is overused, the system may have no resources left to process any incoming legitimate requests.
Shared-technology vulnerabilities: Sharing technology makes cloud services more scalable. However, it brings vulnerabilities at the same time.
Insecure virtual machines and hypervisors: If the virtual machines or hypervisors of the cloud-based system are not secure, the cloud-based system will be at risk.

The First Countermeasure is to Introduce DVSM Modules into the Framework. Each time a new DM module is added to the system, the system will use the inputted domain (e.g., Turbine) and subdomain information (e.g., Francis) to find a matching DVSM module, and then create a new instance of the module found for the newly added module. If there is no existing module, the system will display user interfaces to request information to generate the corresponding DVSM module. There are three steps to collect information. The first step is to collect information about data validation, such as data type or data sequence format. The second step is to collect information about self-monitoring, such as conditions and corresponding actions when conditions are met. For example, if execution time exceeds 30 s, change the module's status to "Suspicious". The third step is to input a training data set, which will be used for the self-monitoring part of the newly generated module. The training data set will be saved into the database for further reference. The newly generated DVSM module and training data set will be reviewed and verified before being put into use.

In the data validation part of each DVSM module, we use a structured validation consisting of several operations for all newly acquired data [20]. The first operation is to check whether the input datasets are in the correct format. For example, the dataset collected from the Sensor Fish should be " ". The second operation is to check the data type for each field. For example, the data type of "pressure" should be "float". The third operation is a data range check, to make sure the acquired data are within reasonable limits. For example, the pressure should be greater than 0 psi. The last operation, a data frequency check, is to make sure that data are collected at the expected intervals.
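A minimal sketch of these four validation operations is shown below (illustrative Python, not the HBET implementation; the field names, sampling interval, and limits are placeholders, since the exact Sensor Fish record format is not reproduced in this paper).

```python
# Illustrative sketch: structured validation of Sensor Fish records
# (format, type, range, and frequency checks). Records are assumed to be dicts.
def validate_records(records, expected_fields=("time", "pressure"),
                     expected_interval=0.0005, tolerance=0.0001):
    errors = []
    for i, rec in enumerate(records):
        # 1. Format check: every expected field must be present
        if any(f not in rec for f in expected_fields):
            errors.append((i, "missing field"))
            continue
        # 2. Type check: e.g., pressure must be a float
        if not isinstance(rec["pressure"], float):
            errors.append((i, "wrong type for pressure"))
        # 3. Range check: e.g., pressure must be greater than 0 psi
        elif rec["pressure"] <= 0.0:
            errors.append((i, "pressure out of range"))
    # 4. Frequency check: samples must arrive at the expected interval
    times = [rec["time"] for rec in records if "time" in rec]
    for t0, t1 in zip(times, times[1:]):
        if abs((t1 - t0) - expected_interval) > tolerance:
            errors.append(("interval", "unexpected sampling interval"))
            break
    return errors
```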
In the self-monitoring part of each DVSM module, we will implement the k-nearest neighbors algorithm (KNN), a non-parametric algorithm with lazy learning [21], for data mining on modules' outputs and behaviors. We chose KNN for the following reasons: KNN is efficient because the lazy-learning algorithm can use the training data set without any generalization; KNN has been used widely and can be applied to data with arbitrary distributions because it is non-parametric; and KNN is ranked among the top 10 data mining algorithms [22].

In this research, we use the SF as the data source. For each study, we deploy SF at the desired study site to get a sufficient sample size for statistical analysis and the required precision. Each time an SF is released, the corresponding hydro-structure's environmental characteristics are recorded. After all SFs are released and recovered, the data files are downloaded from the SF. DAI modules will upload these downloaded SF files into the system, and then pass the interpreted data into the hydropower evaluation DM modules. DVSM modules will use the attributes shown in Table 2 to monitor the outputs from the DM modules. Table 3 is part of the training data set, which contains combinations of attribute values and expected outputs. Multiple classes describe the modality injury rates associated with the corresponding stressors: BMIR refers to barotrauma mortal injury rate, and SMIR refers to shear major injury rate.

Table 2. Attribute list to monitor the HBET decision-making module

DN: Domain name, such as Hydropower Biological Evaluation (represented as an integer; e.g., "1"). Read from the configuration files.
STN: Study-type name, such as Turbine (represented as an integer; e.g., "1"). Read from the configuration files.
SSTN: Sub-study-type name, such as Francis (represented as an integer; e.g., "0"). Read from the configuration files.
FS: Fish species studied, such as Chinook salmon (represented as an integer; e.g., "11").
AFD: Actual total flow discharge of the study site, in thousands of cubic feet per second.
TFD: Target total flow discharge of the study site, in thousands of cubic feet per second.
APG: Actual power generation of the study site, in megawatts.
TPG: Target power generation of the study site, in megawatts.
BP: Barometric pressure measured when the SF is released, in pounds per square inch.
ERD: Estimated release depth when the SF is released, in feet.
BA: Blade angle of the turbine, in percentage.
WGO: Wicket gate open percentage.
TE: Tailwater elevation of the study site, in feet.
FB: Forebay elevation of the study site, in feet.
HHE: Hydraulic head elevation of the study site, in feet.

Table 3. Training data to monitor the HBET decision-making module

DN    1           1           1           1           1
STN   1           1           1           1           1
SSTN  0           0           0           0           0
FS    11          11          11          11          11
AFD   50.087894   50.017833   50.143431   50.094383   50.043149
TFD   80          80          80          80          80
APG   92.438843   91.543232   93.431293   91.738209   91.637234
TPG   150         150         150         150         150
BP    14.721021   14.697332   14.719908   14.716734   14.700632
ERD   127.989454  124.548293  126.431829  125.438219  126.008943
BA    0.15        0.15        0.15        0.15        0.15
WGO   0.57        0.57        0.57        0.57        0.57
TE    17.895445   17.047384   18.089433   17.894343   18.047854
FB    120.483943  119.483943  120.894320  119.823343  120.439083
HHE   102.588498  102.436559  102.804887  101.929000  102.391229
BMIR  0.045684    0.053612    0.047534    0.048893    0.051234
SMIR  0.021367    0.031267    0.029123    0.035623    0.013434

After processing each SF data file, DVSM modules will retrieve the corresponding information for each attribute shown in Table 2 to generate a vector, which is then used to calculate the Euclidean distance (ED; the square root of the sum of the squared differences between the corresponding values of two vectors, Eq. 1) [23] against each row of the training set:

ED(x, y) = √( Σ_{i=1}^{n} |x_i − y_i|² )    (1)
After calculating the EDs for all rows in the training set, the system adds the results as a new column to the training set and sorts it by ED in ascending order. The predicted class is the majority class among the top K rows, which is then compared with the result produced by the corresponding DM module for the given input data set. Comparison results are accumulated to calculate the error rate. In this research, we chose 11 as the value of K, based on the accuracy chart (Fig. 3).

Fig. 3. Relationship between the value of K and the accuracy of KNN.

The Second Countermeasure is to Use Data Encryption. When registering to use the cloud-based framework, an organization will be provided with a public/private key set. Input data will be interpreted by the DAI module, and then encrypted using the provided public key and saved into the database. The training data set will also be encrypted and saved into the database. For the DM module, any time it retrieves data from the database, the private encryption key will be used to decode the data for further processing. The public key is shared with the public, and the private key file is distributed by the organization only to authorized users and saved on an encrypted USB drive. When using the cloud-based DSS, the USB drive holding the private key file should be connected. Without the private key, the data cannot be interpreted.

The Third Countermeasure is to Use a Login Token and Temporary Password. A login token is generated when a user logs into the system, and it expires whenever the user logs out or the session times out. When the server side processes the login request, it will first validate it by checking the username and password. If the validation passes, it then sends a temporary password to the user's email or cell phone, based on the user's selection. The user must input the correct temporary password for the system to successfully log the user in and generate the login token. The generated token is used for each call of the cloud interfaces and application programming interfaces (APIs) as one of the properties in the parameter JavaScript Object Notation (JSON) object. When the server side of the cloud-based DSS receives a service request with the passed-in JSON object, it first retrieves and validates the login token. If the login token is valid, the system processes the request and moves forward. Otherwise, the request is discarded.

The Fourth Countermeasure is to Create a Module Set for Each Study Type. For example, for a Turbine study, there will be a module set including a DAI module and a corresponding DVSM module, and a DM module and a corresponding DVSM module. Thus, failure of any module in a Turbine study's module set will be isolated from other study types. Besides applying the above-mentioned countermeasures, we will also act quickly on suggestions from system providers, such as upgrading or installing patches.
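Returning to the KNN-based self-monitoring check described above, the sketch below (illustrative Python, not the HBET implementation; the training rows and attribute vectors are placeholders) shows the essential steps: the distance of Eq. (1) to every training row, sorting, a majority vote over the top K = 11 rows, and a comparison with the DM module's actual output.

```python
# Illustrative sketch: KNN check for DVSM self-monitoring, following Eq. (1).
# Each training row is (attribute_vector, expected_class); K = 11 as in the text.
import math
from collections import Counter

def euclidean(x, y):
    return math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(x, y)))   # Eq. (1)

def knn_predict(query, training_rows, k=11):
    ranked = sorted(training_rows, key=lambda row: euclidean(query, row[0]))
    top_classes = [cls for _, cls in ranked[:k]]
    return Counter(top_classes).most_common(1)[0][0]   # majority class among top K

def flag_if_unexpected(query, dm_output_class, training_rows, k=11):
    # A mismatch between the KNN prediction and the DM module's actual output
    # contributes to the module's accumulated error rate.
    return knn_predict(query, training_rows, k) != dm_output_class
```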
Table 4 shows the specific countermeasures proposed for the threats that may be encountered.

Table 4. Threats and countermeasures

Data breaches: Data encryption; login token and temporary password; self-monitoring.
Insufficient identity, credential, and access management: Data encryption; login token and temporary password; self-monitoring.
Account or service hijacking: Login token and temporary password; self-monitoring.
Advanced persistent threats: Login token and temporary password; self-monitoring.
Insecure cloud interfaces and cloud APIs: Login token and temporary password; self-monitoring.
Abuse and nefarious use of cloud services: Data validation and scanning; self-monitoring.
Denial of services: Data validation and scanning; self-monitoring.
Shared-technology vulnerabilities: Independent component set; data encryption; login token and temporary password; data validation and scanning; self-monitoring.

For threats not listed in Table 4, we will take actions suggested by system providers. By applying the latest upgrades and installing the latest patches, we can prevent the security risks due to "system vulnerabilities". By improving employee screening and hiring practices, we can reduce the issues that can be caused by "malicious insiders". Providing sufficient security education will significantly improve the security level
By con?guring the virtual machine as suggested by the vendors, applying all security patches, installing all security upgrades, and pursuing regular monitoring, risks introduced by virtual machine vulnerabilities will be controlled. We conclude that the proposed framework can maintain security when migrating from a desktop DSS. For future work, we will use this paper as the basis to implement the proposed cloud-based DSS framework and deploy it into the cloud. Acknowledgments. The work described in this article was funded by the U.S. Department of Energy Water Power Technologies O?ce. A Cloud-Based Decision Support System Framework 527 References 1. Power, D.J.: Decision Support Systems: Concepts and Resources for Managers. Greenwood Publishing Group, Santa Barbara (2002) 2. Sage, A.P.: Decision Support Systems Engineering, 1st edn. Wiley, Hoboken (1991). ISBN-10: 047153000X, ISBN-13: 978-0471530008 3. Hou, H., Deng, Z.D., Martinez, J., Fu, T., Duncan, J.P., Johnson, G.E., Lu, J., Skalski, J.R., Townsend, R.L., Tan, L.: A hydropower biological evaluation toolset (HBET) for characterizing hydraulic conditions and impacts of hydro-structures on ?sh. Energies 11(4), 990 (2018) 4. Turban, E., Aronson, J.E.: Decision Support Systems and Intelligent Systems, 6th edn. Prentice Hall, Upper Saddle River (2001). ISBN:0130894656, 9780130894656 5. REN21: Renewables 2016 Global Status Report (Paris: REN21 Secretariat) (2016). ISBN: 978-3-9818107-0-7 6. Brown, R.S., Colotelo, A.H., P?ugrath, B.D., Boys, C.A., Baumgartner, L.J., Deng, Z.D., Silva, L.G.: Understanding barotrauma in ?sh passing hydro structures: a global strategy for sustainable development of water resources. Fisheries 39(3), 108–122 (2014) 7. Cada, G.F.: The development of advanced hydroelectric turbines to improve ?sh passage survival. Fisheries 26(9), 14–23 (2001) 8. Cushman, R.M.: Review of ecological e?ects of rapidly varying ?ows downstream from hydroelectric facilities. N. Am. J. Fish. Manag. 5(3A), 330–339 (1985) 9. Pracheil, B.M., DeRolph, C.R., Schramm, M.P., Bevelhimer, M.S.: A ?sh-eye view of riverine hydropower systems: the current understanding of the biological response to turbine passage. Rev. Fish Biol. Fish. 26(2), 153–167 (2016) 10. Trumbo, B.A., Ahmann, M.L., Renholds, J.F., Brown, R.S., Colotelo, A.H., Deng, Z.D.: Improving hydroturbine pressures to enhance salmon passage survival and recovery. Rev. Fish Biol. Fish. 24(3), 955–965 (2014) 11. Deng, Z.D., Lu, J., Myjak, M.J., Martinez, J.J., Tian, C., Morris, S.J., Carlson, T.J., Zhou, D., Hou, H.: Design and implementation of a new autonomous sensor ?sh to support advanced hydropower development. Rev. Sci. Instrum. 85(11), 115001 (2014) 12. Hashizume, K., Rosado, D.G., Fernández-Medina, E., Fernandez, E.B.: An analysis of security issues for cloud computing. J. Internet Serv. Appl. 4, 5 (2013) 13. Dawoud, W., Takouna, I., Meinel, C.: Infrastructure as a service security: challenges and solutions. In: The 7th International Conference on Informatics and Systems (INFOS), pp. 1– 8. IEEE Computer Society (2010) 14. Carlin, S., Curran, K.: Cloud computing security. Int. J. Ambient Comput. Intell. 3(1), 14– 19 (2011) 15. Catteddu, D.: Cloud computing: bene?ts, risks and recommendations for information security. In: Serrão, C., Aguilera Díaz, V., Cerullo, F. (eds.) Web Application Security. Communications in Computer and Information Science, vol. 72. Springer, Berlin (2010) 16. Viega, J.: Cloud computing and the common man. Computer 42(8), 106–108 (2009) 17. 
Rittinghouse, J.W., Ransome, J.F.: Cloud Computing: Implementation, Management, and Security. CRC Press, Boca Raton (2009). ISBN 9781439806807 18. Violino, B.: The Dirty Dozen: 12 Top Cloud Security Threats for 2018. CSO Online (2018) 19. Cloud Security Alliance: Top Threats to Cloud Computing. V1.0 (2010) 20. Zio, M.D., Fursova, N., Gelsema, T., Gießing, S., Guarnera, U., Petrauskien, J., Kalben, L.Q., Scanu, M., Bosch, K.O., Loo, M., Walsdorfer, K.: Methodology for Data Validation 1.0. ESSnet ValiDat Foundation (2016) 528 H. Hou et al. 21. Altman, N.S.: An introduction to kernel and nearest-neighbor nonparametric regression. Am. Stat. 46(3), 175–185 (1992) 22. Zhang, Z.: Introduction to machine learning: k-nearest neighbors. Ann. Transl. Med. 4(11), 218 (2016) 23. Shirkhorshidi, A.S., Aghabozorgi, S., Wah, T.Y.: A comparison study on similarity and dissimilarity measures in clustering continuous data. PLoS ONE 10(12), e0144059 (2015) 24. Popovic, K., Hocenski, Z.: Cloud computing security issues and challenges. In: Proceedings of the 33rd International Convention MIPRO, pp. 344–349. IEEE Computer Society, Washington DC (2010) A Cloud-Based Decision Support System Framework 529 An Attempt to Forecast All Different Rainfall Series by Dynamic Programming Approach Swe Swe Aung1,3(&) , Shin Ohsawa2 , Itaru Nagayama3 , and Shiro Tamaki3 1 Department of Software, University of Computer Studies, Taunggyi, Myanmar 2 Weathernews Inc., Okinawa, Japan 3 Department of Information Engineering, University of the Ryukyus, Okinawa, Japan {sweswe,nagayama,shiro}@ie.u-ryukyu.ac.jp Abstract. Unexpected heavy rainfall has been seriously occurred in most parts of the world, especially during monsoon season. As a serious consequence of heavy rainfall, the people in those areas battered by heavy rainfall faced many hardship lives. Without exception, prevention is the best way of minimizing these negative effects. In spite of all, we developed a rainfall series prediction system for different series patterns by applying the dynamic programming approach aiming to acquire the rainfall level of the whole rainfall cycle. The simple idea behind the proposed dynamic programming approach is to ?nd the similarity of two rainfall sequences upon the maximum match of the rainfall level of those sequences. Based on 2011 and 2013 real data sets collected from WITH radar, which is installed on the rooftop of Information Engineering, University of the Ryukyus, the comparison between the conventional approach (Polynomial Regression) and the proposed approach is investigated. These correlation experiments con?rm that the dynamic programming approach is more ef?cient for predicting different rainfall series. Keywords: Dynamic programmingRainfall seriesPolynomial regression WITH radar 1 Introduction Rainfall forecasting in meticulous practice plays an important role in predicting the severe natural disasters with a view to prevent the potential threats and damages. As reported by online news, heavy rainfall lashed Sierra Leone in Africa on August 14, 2017, and left the region with landslides and mudslides due to heavy flooding. On June 13, 2017, torrential rainfall hit Bangladesh and triggered deadly mudslides in that region. The same deadly damages caused by guerrilla rainfall occurred in Sri Lanka during the ?nal week of May 2017. On July 5, 2017, many people went missing in the massive landslides and floods from heavy rainfall that battered Fukuoka, Japan. 
On July 21, 2017, the heaviest rainfall hit lower Myanmar, and many people were temporarily displaced due to landslides and floods. Figure 1 shows the flood in the city of Nago, Okinawa, Japan, caused by heavy rain on July 9, 2014. © Springer Nature Switzerland AG 2019 K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 530–547, 2019. https://doi.org/10.1007/978-3-030-02686-8_40 The damage caused by guerrilla rainfall points out the importance of localized rainfall prediction with accurate estimations to prevent the after-effects. A quick change in rainfall is one of the most dif?cult factors in making a decision about a long-term prediction. The states between developing and decaying cumulonimbus clouds can alter very rapidly. Fortunately, the small-dish WITH aviation radar includes functions for observing and capturing rapidly developing cumulonimbus clouds in high resolution to deal with those dif?culties. For this purpose, a prediction model is designed by using the concept of dynamic programming algorithm. Dynamic programming approach is a powerful tool for solving the problem of investigating the similarity between two pairs of rainfall series. The similarity between two rainfall sequences is de?ned according to the maximum match number of rainfall levels. Direct comparison of two rainfall sequences is not completely an appropriate matching to compute the similarity and generate the rainfall level relationships between those two rainfalls. Thus, dynamic programming came to our attention as an approach that is a good ?t for predicting different rainfall series pattern. Another study for predicting the whole rainfall series is one of conventional curve ?tting approaches (polynomial regression). The concept of polynomial regression is to generate a prediction model of independent variable x and dependent variable y cor-responding to the nonlinear relationship between them. In this paper, the two approaches primarily aim to investigate the most similar rainfall series for newcomers are presented. The rest of this paper is organized as follows. Section 2 describes related works. Section 3 discusses WITH radar and how to generate rainfall level. Section 4 describes the phenomenon of localized rainfall. Section 5 details the construction of rainfall level data model and Sect. 6 details with dynamic programming model for rainfall series prediction. Section 7 is about polynomial regression and Sect. 8 is analytical result and discussion. Section 9 is the conclusion. Fig. 1. Okinawa in Japan floods caused by heavy rain on July 9, 2014 [1]. An Attempt to Forecast All Different Rainfall Series 531 2 Related Works The level or amount of rainfall prediction system for short-term period has being implemented by many researchers in many parts of the countries by applying various prediction methodologies to different kinds of rainfall data resources. In this case, many powerful machine learning approaches have come to the attention of researchers for the short-term rainfall prediction systems. In [2], the authors proposed a system for prediction of rainfall using radar reflec-tivity data by applying ?ve machine learning approaches (neural network, random forest, classi?cation and regression tree, support vector machine, and k-nearest neighbor) in a watershed basin at Oxford. The purpose of the paper is to select one algorithm, which could predict the rainfall with the highest precision accuracy. 
As reported by the experimental results, arti?cial neural network MLP NN is the best performance in comparison to other algorithms. In [3], the authors designed a system for short-term rain forecasting system in the northeastern part of Thailand by applying machine learning techniques (decision tree (DT), arti?cial neural network (ANN) and support vector machine (SVM)). According to the comparative results, arti?cial neural network and support vector machine are more suitable for the prediction of short-term rainfall amount than decision tree. Aung et al. [4] proposed a short-term prediction of localized rainfall from radar images by applying dual-kNN approach aiming to forecast one-minute, three-minute, and ?ve-minute forecasts. They utilized dual-kNN approach in order to upgrade the ordinary classi?cation routines of classical k-nearest neighbors (k-NN) and to improve the prediction accuracy. They experimentally con?rmed with test cases and simulations that the performance of dual-kNN is more effective than classical k-NN. Inafuku et al. [5] designed a short-term prediction for guerrilla rainstorm by using state-transition method. For the short-term prediction, they introduced the rapid state-transition ones based on short-period sampling data to overcome the weakness of the classical state-transition method. Besides, they introduced the estimation method of the coordinates of center of gravity movement of rain cloud to get more precision forecast. In [6], the authors proposed approach for searching for similarities in the amino acid sequence of two proteins to determine whether signi?cant homology exist between the proteins by applying dynamic programming matching approach. The systems described above only emphasized on how to do the prediction of short-term rainfall prediction using various powerful machine learning approaches. In other words, it means that the system makes a forecast emphasizing on only one part of the rainfall series. Thus, this paper intends to predict the rainfall level of the whole rainfall series or rainfall circle by applying dynamic programming approach. Dynamic programming approach is a powerful approach for solving the problem of sequence decisions [7]. The underling idea is to ?nd the similarity of two sequence problems by applying alignment method. 532 S. S. Aung et al. 3 WITH Radar and Rainfall Level The small-dish aircraft radar dubbed WITH radar, which is owned by Weathernews Inc., is Doppler radar for observing and capturing cumulonimbus clouds that can cause torrential rainstorms. It has the following features [8]. • The diameter of the radar is about 1000 mm. • It can capture the development processes of cumulonimbus clouds that cause guerrilla rainstorms. • It can observe altitudes of 2 km and below. • Observations use the Doppler method. • The frequency is 9340 MHz (X-band). • Electric power is 30 W. • Sampling time is six seconds. • The observable range is a 50 km radius. • Spatial resolution is a 150 m mesh. Figure 2 combines three pictures, where the leftmost is the WITH radar installed on the rooftop of the Information Engineering building. The middle picture is a cross-section scan from the WITH radar that shows a cumulonimbus cloud forming near Okinawa Island. The rightmost photo is the color scheme for rainfall levels 0 to 14. The quantity of rainfall is de?ned by the equation 2.67 h Rain Level, corresponding to a quantity of precipitation from 00 mm/h to 40 mm/h. In Fig. 
2, the middle image represents a sample image from observation of a rain cloud constructed by cross-section scan. In this image, the weather radar locates the area where the suspected rain cloud produces a heavy rainstorm. The intensity of rainfall levels is represented by 15 different colors (black, off-white, sky blue, light blue, blue, dark blue, dark green, green, light green, light yellow, yellow, yellow-orange, light pink, pink, and red) as shown in the rightmost section of Fig. 2. Beyond that, the rainfall level in digital format is from 0 to 14, where 0 is clear (i.e. not raining). Light rain is from rainfall level 1 to 5, moderate rainfall is from level 6 to 11, and heavy rain is from level 12 to 14.

Fig. 2. Left to right: WITH radar; an observed localized cumulonimbus cloud near Okinawa; and the colors denoting the various rain levels.

Table 1 illustrates the intensity of each rainfall level in digital format. In this case, the intensity of each rainfall level is computed by applying the following (1):

Intensity of Rainfall = (Rainfall (mm/h) / Level Number) \times Rainfall Level   (1)

Table 1. Intensity of rainfall levels (intensity of rainfall level = 2.66 * rainfall level)
Level 0: 0; Level 1: 2.66; Level 2: 5.32; Level 3: 7.98; Level 4: 10.64; Level 5: 13.3; Level 6: 15.96; Level 7: 18.62; Level 8: 21.28; Level 9: 23.94; Level 10: 26.6; Level 11: 29.26; Level 12: 31.92; Level 13: 34.58; Level 14: 37.24

4 Phenomenon of Localized Rainfall

Usually, the development and decay of cumulonimbus clouds lasts from 30 min to 1 h. The phenomenon can occur over small islands, such as Okinawa. Figure 3 demonstrates 12 rainfall series. Extensively, Y axis represents the rainfall level in inches, and X axis denotes minutes. Figures 3 and 4 illustrate the phenomenon of torrential rainfall based on 2011 and 2013 weather data. In Fig. 3, it can be clearly seen that the red dots represent growth and blue dots represent decay. It is obvious that higher rainfall levels cover a larger rainfall area. The two go hand-in-hand. Figure 4 illustrates a series of torrential rainfall levels based on a time increment that lasted around 30 min. In this figure, the X axis is time, and the Y axis is rainfall level. Figure 5 demonstrates the development and decay conditions in torrential rainfall. It usually starts at a small size and becomes bigger. Finally, it slowly starts to decay. Actually, a rainfall cycle that usually lasts 30 min includes around 300 images, because the radar takes one picture every six seconds. From one rainfall cycle, we only used some of the more important images to illustrate the characteristics of the rainfall cycle in Fig. 5.

Fig. 3. Twelve Rainfall Series based on 2011 and 2013 Rainfall Data.
Fig. 4. Rainfall levels for torrential rain lasting about 30 min.
Fig. 5. Radar images of rainfall lasting about 30 min.

5 Rainfall Level Data Model Construction

Before going into the detailed discussion of dual-kNN, we want to discuss how to create a rainfall level data model for rainfall prediction. Figure 6 illustrates the rainfall level of each pixel, P(ri)(x, y), extracted from radar images, where {P(ri)(x, y): i = 0, 1, 2, ..., 14} denotes the rainfall level at coordinates (x, y), and i ∈ {0, 1, 2, 3, ..., 14}. Then, pixel values P(r0)(x, y), P(r1)(x, y), ..., P(r14)(x, y) represent each rainfall level in a single image. An image may contain different rainfall levels corresponding to the current captured image and weather conditions.
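As a small worked example of Eq. (1) and Table 1, the sketch below assumes (based on Sect. 3) that "Level Number" is the 15 digital levels and that the maximum rate is the 40 mm/h quoted there; this reproduces the 2.66 x level values of Table 1 up to rounding. The level categories follow the text above.

```python
def rain_intensity(level, max_rate_mm_h=40.0, n_levels=15):
    """Eq. (1): (max rate / number of levels) * level.
    40/15 = 2.667, so level 12 gives 32.0 mm/h; Table 1 rounds the
    factor to 2.66 and lists 31.92."""
    return (max_rate_mm_h / n_levels) * level

def rain_category(level):
    """Digital rainfall levels as described above:
    0 clear, 1-5 light, 6-11 moderate, 12-14 heavy."""
    if level == 0:
        return "clear"
    if level <= 5:
        return "light"
    if level <= 11:
        return "moderate"
    return "heavy"

print(rain_intensity(12), rain_category(12))   # 32.0 heavy
```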
After generating the pixel values (rainfall levels), the intensity of each rainfall level is computed again by applying (1). We create a data model, as shown in Table 2, for the rainfall prediction system, which includes 15 features (R_Level0, R_Level1, ..., R_Level14) belonging to 15 different class types (R_Intensity). In this case, a radar image contains many pixels that denote different rainfall levels. Therefore, in the data model, each instance represents only one image. Thus, one image is a combination of 15 different aspects of the instance, from R_Level0 to R_Level14. In detail, R_Level0 indicates rainfall level 0, and its value is the total number of occurrences of rainfall level 0. R_Level1 likewise counts all occurrences of rainfall level 1 extracted from the same image, and so on. Now, we have created the simplest data model for rainfall prediction, as shown in Table 2.

Table 2. Rainfall level data model (features R_Level0, R_Level1, ..., R_Level14)
Fig. 6. Rainfall level of each pixel extracted from radar images.

6 Dynamic Programming Matching for Rainfall Series Prediction

The dynamic programming method, which originated with Needleman and Wunsch (1970), has become a very useful and powerful approach in a variety of applications in the field of computer science. The simple strategy underlying dynamic programming is to investigate the similarity between two sequences corresponding to the maximum match along a certain path. For the rainfall series prediction system, let us consider two sample rainfall series, Rainfall_S1 and Rainfall_S2, as shown in Figs. 7 and 8. Each of the two sample rainfall series has five images, and each image represents different rainfall levels. In our discussion, we often use the term node, which also represents an image of a rainfall series. For the two rainfall series, the similarity can be mathematically denoted as follows:

Similarity(Rainfall_Si, Rainfall_Sj) = Score(Optimal Alignment of Rainfall_Si and Rainfall_Sj)   (2)

As stated in (2), the similarity of two rainfall sequences is defined as the best or optimal part that has the highest alignment score among all alignments of the two rainfall sequences. The best optimal path represents the predicted rainfall series for a newcomer series.

Fig. 7. Sample rainfall series named Rainfall_S1.
Fig. 8. Sample rainfall series named Rainfall_S2.
Fig. 9. A sample graph for two rainfall sequences.

Figure 9 illustrates the construction of a directed graph G = (V, E), consisting of a set of nodes (V) connected by edges (E), to perform a Needleman-Wunsch alignment for two rainfall series. Each node owns two properties: one is the pointer to the corresponding node that gives the optimal sub-alignment, and the second is the alignment score. As a first step, to find the best alignment for each node, it is necessary to consider the similarity of three corresponding subsequences (Score1, Score2 and Score3). Score1 is the addition of the best score of node Node[i, j − 1] and Score(gap). Score2 is computed by adding the score of node Node[i − 1, j − 1] and Score(gap). Likewise, Score3 is the addition of the value of node Node[i − 1, j] and Score(match). Then, the alignment that has the highest score is selected as the best alignment for the current node. For finding the best alignment score, the following equations are given.
Score1 = Score(sub-alignment1) + Score(gap)   (3)
Score2 = Score(sub-alignment2) + Score(gap)   (4)
Score3 = Score(sub-alignment3) + Score(match)   (5)

where Score(gap) = −2, Score(matched pair) = Similarity(Image[i], Image[j]) and Score(mismatched pair) = −1. In detail, Score(gap) means that there is no value to match for the two nodes; Score(mismatched pair) means that the two nodes each have their own value, but the two values are not the same; and for Score(matched pair), the values of the two nodes are the same. In the rainfall series prediction system, we define the threshold (Similarity(Image[i], Image[j]) > 90%) for finding the similarity of two images. As discussed in the previous section, a rainfall image is the combination of 15 rainfall levels (R_Level0, R_Level1, R_Level2, ..., R_Level14). Thus, the similarity between two rainfall level images is defined in terms of the average distance over the 15 rainfall levels, as denoted in Eq. (6). If the similarity is greater than 90%, we assume that the two images are identical, i.e. a matched pair. The average similarity between two rainfall images can be defined in percentage terms by the following equation:

Similarity(Image_i, Image_j) = \frac{1}{15} \sum_{m=0}^{14} \left( 1 - \frac{|Image_i.Level[m] - Image_j.Level[m]|}{Image_i.Level[m] + Image_j.Level[m]} \right)   (6)

where the number of rainfall levels is 15 and |Image_i.Level[m] − Image_j.Level[m]| is the distance between Level[m] of Image_i and Image_j. The distance divided by the sum of Level[m] of Image_i and Image_j gives the distance between the two images in percentage terms. Consequently, the similarity is evaluated by subtracting this distance from 1. As a final result, the average similarity between Image_i and Image_j is obtained by summing the similarity over the 15 rainfall levels and dividing the result by the number of rainfall levels. If the similarity is greater than 90%, then the DP matching algorithm will take these two images into account in the matching process. Otherwise, it refuses to consider them in creating the optimal sub-alignment. In detail, the following step-by-step procedure illustrates how to evaluate the similarity between two images (Image_i, Image_j). Before going to the section on finding the best path, let us first observe the sub-alignment score of each node. As shown in Fig. 9, consider the process for Node[i = 3, j = 3], shown in red. The best alignment score for the current node is defined by selecting the highest score from the three sub-alignments (sub-alignment1, sub-alignment2 and sub-alignment3) of its immediate predecessors in creating the best path, where sub-alignment1 comes from Node[i, j − 1], sub-alignment2 is from Node[i − 1, j − 1], and sub-alignment3 is from Node[i − 1, j]. Then, Score1, Score2, and Score3 can be denoted by the following equations:

Score1 = Node[i, j − 1] − 2, if 0 ≤ i ≤ Series1.Length and 0 ≤ j ≤ Series2.Length   (7)
Score2 = Node[i − 1, j − 1] + Similarity(Image[i], Image[j]), if 0 ≤ i ≤ Series1.Length and 0 ≤ j ≤ Series2.Length   (8)
Score3 = Node[i − 1, j] − 2, if 0 ≤ i ≤ Series1.Length and 0 ≤ j ≤ Series2.Length   (9)

After that, the best alignment for Node[i, j] can be chosen by the following equation:

Score(Alignment[i, j]) = \max\{ Score1, Score2, Score3 \}   (10)

Now it is ready to find the best path for two rainfall series. The optimal path can be defined by backtracking through the nodes with the optimal sub-alignment, using the scores, as shown in Fig. 11. The final result is the optimal path that is most similar to a newcomer rainfall series. Figure 10 illustrates a sample best path for a rainfall series forecast in a visualization view.

Optimal Path(Rainfall_Si, Rainfall_Sj) = \sum_{i=1}^{j-1} Score(Node_i \rightarrow Node_{i+1})   (11)

Figure 11 demonstrates the best optimal path for the rainfall series Rainfall_S1 and Rainfall_S2, obtained by backtracking through the nodes that own the highest alignment score until the last node. The most probable predecessor is the diagonal match. The DP algorithm performs alignments with a time complexity of O(ij).

Fig. 10. Sample best optimal path for rainfall series forecast.
Fig. 11. Reconstructing the optimal path by backtracking through the nodes with the best alignment score.
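To make Eqs. (6)-(10) concrete, here is a minimal Python sketch of the scoring step (an illustration only, not the authors' Algorithm 1). It treats each image as its 15-element rainfall-level vector, uses the 90% similarity threshold and the mismatch score of −1 from the text together with a gap score of −2, and adds a small epsilon so Eq. (6) is defined when both levels are empty.

```python
import numpy as np

def image_similarity(img_i, img_j, eps=1e-9):
    """Eq. (6): mean over the 15 levels of 1 - |Li - Lj| / (Li + Lj)."""
    li, lj = np.asarray(img_i, float), np.asarray(img_j, float)
    return float(np.mean(1.0 - np.abs(li - lj) / (li + lj + eps)))

def dp_align_score(series1, series2, gap=-2.0, mismatch=-1.0, thresh=0.9):
    """Needleman-Wunsch style scoring of two rainfall series (Eqs. 7-10);
    backtracking from node[n, m] recovers the optimal path (Eq. 11)."""
    n, m = len(series1), len(series2)
    node = np.zeros((n + 1, m + 1))
    node[0, :] = np.arange(m + 1) * gap      # leading gaps
    node[:, 0] = np.arange(n + 1) * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sim = image_similarity(series1[i - 1], series2[j - 1])
            diag = sim if sim > thresh else mismatch   # matched vs. mismatched pair
            node[i, j] = max(node[i, j - 1] + gap,       # Score1, Eq. (7)
                             node[i - 1, j - 1] + diag,  # Score2, Eq. (8)
                             node[i - 1, j] + gap)       # Score3, Eq. (9)
    return node[n, m]
```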
The ?nal result is the optimal path that is the most similar to a new comer rainfall series. Figure 10 illustrates a sample best path for rainfall series forecast in visualization view. Optimal Path ðRainfall Si; Rainfall Sj ¼ Xj1 i¼1 ScoreðNotei ! Notei þ 1Þ ð11Þ Figure 11 demonstrates the best optimal path for rainfall series, Rainfall_S1 and Rainfall_S2 by using backtrack through the nodes which owns the highest alignment score until the last node. The most possible predecessor is the diagonal match. The DP algorithm performs alignments with a time complexity of O (ij). Fig. 10. Sample best optimal path for rainfall series forecast. Fig. 11. Reconstructing the optimal path using backtrack through the nodes with best alignment score. 540 S. S. Aung et al. Algorithm 1 illustrates the detail process of dynamic programming matching for creating the best rainfall series path between two rainfall sequences. An Attempt to Forecast All Different Rainfall Series 541 7 Polynomial Regression for Rainfall Series Prediction Polynomial regression is a model of nonlinear regression approach, which is useful to ?nd the characteristic of nonlinear relationship between the independent variable x and the dependent variable y. The polynomial regression is a popular approach for varieties of application areas, for example business and economic, weather and traf?c prediction systems [9]. For rainfall series prediction system, it has two properties (time and rainfall level) for each series. Here, we have a list of n rainfall series, S ¼ fs1; s2; s3; :::; sig; where i ¼ f1; 2; 3; :::; ng. Each rainfall series has different rainfall series length characterized by the following equation: si ¼ ðht1; r1i; ht2; r2i; ht3; r3i; ...; htk; rkiÞ ð12Þ Where k ¼ f1; 2; 3; :::; ng; tk represents time series and rk is rainfall level. To estimate different rainfall series, we generate different regression models for different rainfall series as a model bank. For each new comer, xi, the prediction process is taken through the R_Model bank. After that, the error estimation of each rainfall model is computed using least square error approach. As a ?nal step, the system made a decision of the best ?t rainfall series according to the information of error estimation model. Figure 12 illustrates the bock diagram for the detail process of rainfall series prediction system. The predicted value for the rainfall series using jth degree polynomial regression model can be written as f ðxÞ ¼ a0þ a1x þ a2x2 þ ::: þ ajxj ð13Þ Where j represents the degree of polynomial regression, aj are the regression coef?cients. The general least square error is given by er ¼ Xn i¼1 yi a0þ a1x þ a2x2 þ ::: þ ajxj : : 2 ð14Þ Fig. 12. Rainfall series prediction model. 542 S. S. Aung et al. Where, yi is the actual value, and Er is the least square error. For rainfall series system, a set of least square error Er c can be written as Er c ¼ ðer1; er2; :::; ernÞ: ð15Þ The best ?t line can be de?ned by choosing the minimized error from the set of least square error Er c : The best line ¼ select minimize error Er c r r ð16Þ 8 Experiment and Analysis Discussion In this section, we will discuss the experimentation of the rainfall series prediction system and the results that prove the ef?ciency of the new approach, dynamic pro-gramming by comparing with polynomial regression approach. For those results, the prediction accuracy is computed using a measurement of how close the actual value of observed rainfall series to the value of forecasted rainfall series. 
As a first step, we evaluate the forecast error by applying the following equations:

Error(Rainfall_Si) = |Actual Value(Rainfall_Si) − Forecast Value(Rainfall_Si)|   (17)

Error(Rainfall_Si)% = |Actual Value(Rainfall_Si) − Forecast Value(Rainfall_Si)| / Actual Value(Rainfall_Si)   (18)

Then, the accuracy of a rainfall series is evaluated by the following equation:

Accuracy(%) = 1 − Error(%)   (19)

In this case, if the error is larger than 100%, then the accuracy is set to 0%. For this experimentation, rainfall-level history data was provided by Weathernews Inc. Table 3 describes the data sizes for the years 2011 and 2013 from two aspects: the original size, and the size after the preprocessing stage, which includes noise filtering and converting images into a numerical format.

Table 3. Weather data size descriptions
Year | Original data size | Data size in the preprocessing stage
2011 | 3 GB | 4 MB
2013 | 8.006 GB | 10.9 MB

Table 4 describes the number of rainfall level images included in each rainfall series and the amount of processing time required for each rainfall series. As reported in Table 4, the more images a rainfall series contains, the more processing time it needs. For all rainfall series, the average processing time is 12338 ms. Table 5 illustrates the prediction accuracy for different rainfall series patterns using full-cross validation. This table has six columns. The first column is the name of the rainfall series. The second column is the actual data; in more detail, the actual data is the sum of the rainfall levels of all images in one rainfall series. The third column describes the forecast rainfall level values; this value is likewise the sum of the rainfall levels of all images of the forecast rainfall series. All rainfall series, except Series 3, Series 7, and Series 10, achieve acceptable accuracy. For Series 3, Series 7 and Series 10, the rainfall series stored in the databank have quite different rainfall level values. Thus, the algorithm is not able to retrieve a series pattern with 90% or greater similarity. If we train the algorithm with a larger case-bank, it will achieve better accuracy. However, the average forecast accuracy, 57%, confirms that the system is suitable for predicting different rainfall series. Table 6 gives the prediction accuracy of the rainfall series without using the full-cross validation approach. To put it another way, each rainfall series does not exclude itself when finding the most similar rainfall series in the case-bank. That is to say, the case-bank includes the most similar rainfall series to each series. Thus, in this experiment, each rainfall series achieves high prediction accuracy, with an average accuracy of 85%. Tables 7 and 8 present the prediction accuracy of the second approach, the polynomial regression algorithm, which finds a nonlinear relationship between time (t_k) and rainfall level (r_k). Table 7 states the prediction accuracy without using full-cross validation. In this study, the algorithm fails to reach acceptable accuracy on Series 2, 5, 7 and 8. Moreover, the estimation accuracy using full-cross validation can
Table 4.
Number of images description included in each series Rainfall series Number of images Amount time for prediction (millisecond) Series 1 632 11976 Series 2 317 6227 Series 3 1013 24649 Series 4 875 20567 Series 5 285 5423 Series 6 571 10648 Series 7 1086 28737 Series 8 292 5700 Series 9 857 18614 Series 10 128 4135 Series 11 288 5920 Series 12 244 5467 Total = 6588 Average = 12338.58333 544 S. S. Aung et al. be seen in Table 8. In this case, Series 2, 5, 7, 8, and 10 could not be performed well. Therefore, the total average accuracy of polynomial regression approach is 67% without full-cross validation and 54% with full-cross validation approach. Table 5. Accuracy of rainfall series using full-cross validation Rainfall series name Actual Forecast Error Error (%) Accuracy (%) Series 1 67800 44023 23777 35.06932 64.93067847 Series 2 14899 13258 1641 11.01416 88.98583798 Series 3 77502 500829 423327 546.2143 0% Series 4 86778 84935 1843 2.12381 97.87618982 Series 5 67726 34550 33176 48.98562 51.01438148 Series 6 109866 158971 49105 44.69536 55.30464384 Series 7 453628 83260 370368 81.64575 18.35424621 Series 8 23548 16228 7320 31.08544 68.9145575 Series 9 81869 95428 13559 16.56182 83.43817562 Series 10 3081 586 2495 80.9802 19.01979877 Series 11 7790 10945 3155 28.82595 71.17405208 Series 12 5787 3931 1856 32.07189 67.92811474 Average accuracy 57.24505637 Table 6. Accuracy of dynamic programming without using full-cross validation Rainfall series name Actual Forecast Abs (error) Error (%) Accuracy (%) Series 1 67800 67326 474 0.00699115 99 Series 2 14899 11926 2973 19.95436 80 Series 3 77502 86357 8855 11.42551 89 Series 4 86778 100456 13678 15.76206 84 Series 5 67726 64188 3538 5.223991 95 Series 6 109866 105300 4566 4.155972 96 Series 7 453628 528509 74881 16.50714 83 Series 8 23846 21845 2001 8.391344 92 Series 9 81869 91250 9381 11.45855 89 Series 10 3081 1689 1392 45.18014 55 Series 11 7790 6635 1155 14.8267 85 Series 12 5787 4161 1626 28.09746 72 Average accuracy 85% An Attempt to Forecast All Different Rainfall Series 545 As reported by comparative study of dynamic programming and polynomial regression for rainfall series forecast, dynamic programming approach is more suitable prediction approach for the whole rainfall series than polynomial regression approach. 9 Conclusion In this study, we proposed a new predictive approach, dynamic programming algorithm aiming to forecast the different rainfall series pattern for the whole rainfall life cycle, not for each stage of rainfall series. As we know, the dynamic programming algorithm Table 7. Accuracy of polynomial regression without using full-cross validation Rainfall series name Actual Forecast Error Abs (error) Error (%) Accuracy (%) Series 1 67800 86778 -18978 18978 27.99115 72% Series 2 14899 5787 9112 9112 61.15847 39% Series 3 77502 86778 9276 9276 11.96872 88% Series 4 86778 86778 0 0 0 100% Series 5 67726 5787 61939 61939 91.45528 9% Series 6 109866 86778 23088 23088 21.01469 79% Series 7 453628 86778 -366850 366850 80.87023 19% Series 8 23846 5787 18059 18059 75.73178 24% Series 9 81869 86778 4909 4909 5.996165 94% Series 10 3081 3081 0 0 0 100% Series 11 7790 5787 2003 2003 25.71245 74% Series 12 5787 5787 0 0 0 100% Average accuracy 67% Table 8. 
Accuracy of polynomial regression using full-cross validation Actual Forecast Error Abs€ Error (%) Accuracy (%) Series 1 67800 86778 -18978 18978 27.99115 72 Series 2 14899 5787 9112 9112 61.15847 39 Series 3 77502 86778 9276 9276 11.96872 88 Series 4 86778 77502 -9276 9276 10.68935 89 Series 5 67726 5787 61939 61939 91.45528 9 Series 6 109866 86778 23088 23088 21.01469 79 Series 7 453628 86778 -366850 366850 80.87023 19 Series 8 23846 5787 18059 18059 75.73178 24 Series 9 81869 86778 4909 4909 5.996165 94 Series 10 3081 5787 -2706 2706 87.82863 12 Series 11 7790 5787 2003 2003 25.71245 74 Series 12 5787 3081 2706 2706 46.75998 53 Average accuracy 54 546 S. S. Aung et al. is a powerful approach for solving the time series problems. The approach is also popular for DNA and the amino acid sequence of two proteins. Actually, DP matching also covers almost research areas. To that end, for the rainfall series problem, the DP matching came to our attention as an approach to predict the different rainfall cycles. Furthermore, we also apply polynomial regression approach to rainfall series estima-tion to demonstrate and prove that dynamic programming is more ef?cient. In agree-ment with the experiment results as stated in Tables 5, 6, 7 and 8, DP matching achieved a higher prediction accuracy than conventional approach, polynomial regression. Supposing this research is in progress contending to forecast all different rainfall series, only a prediction have been executed over 2011 and 2013 datasets that are obtainable at this moment. For our future works, we will collect more rainfall series from different years and then apply DP matching algorithm using massive case-banks for proving that the ef?cient of algorithm with stronger con?rmation for different rainfall level pattern prediction. References 1. Gilbeaux, K.: Global resilience system, Typhoon Neoguri—Flooding in Nago, Okinawa, Wed, 2014-07-09. https://resiliencesystem.org/typhoon-neoguri-?ooding-nago-okinawa 2. Kusiak, A., Wei, X., Verma, A.P., Roz, E.: Modeling and prediction of rainfall using radar reflectivity data: a data-mining approach. IEEE Trans. Geosci. Remote Sens. 51(4), 2337– 2342 (2013) 3. Ingsrisawang, L., Ingsriswang, S., Somchit, S., Aungsuratana, P., Khantiyanan, W.: Machine learning techniques for short-term rain forecasting system in the northeastern part of Thailand. In: World Academy of Science, Engineering and Technology, vol. 2, no. 5 (2008). International Journal of Computer and Information Engineering 4. Aaung, S.S., Senaha, Y., Ohsawa, S., Nagayama, I., Tamaki, S.: Short-term prediction of localized heavy rain from radar imaging and machine learning. IEIE Trans. Smart Process. Comput. 7, 107–115 (2018) 5. Inafuku, S., Tamaki, S., Hirata, T., Ohsawa, S.: Guerrilla rainstorm prediction of using a state transition. In: Proceedings of Japan Wind Energy Symposium, vol. 35, pp. 375–378 (2016) 6. Needleman, S.B., Wunsch, C.D.: A general method application to the search for similarities in the amino acid of two proteins. J. Mol. Biol. 48(3), 443–453 (1970) 7. Brown, K.Q.: Dynamic Programming in Computer Science. Department of Computer Science, Carnegie-Mel Ion University, Pittsburgh (1979) 8. Kusabiraki, C.: Weathernews Inc, June 11, 1986. https://global.weathernews.com/ infrastructure/with-radar/ 9. Ostertagov, E.: Modelling using polynomial regression. Proc. Eng. 48, 500–506 (2012) An Attempt to Forecast All Different Rainfall Series 547 Non-subsampled Complex Wavelet Transform Based Medical Image Fusion Sanjay N. 
Talbar1 , Satishkumar S. Chavan2(?) , and Abhijit Pawar3 1 SGGS Institute of Engineering and Technology, Nanded 431606, MS, India sntalbar@yahoo.com 2 Don Bosco Institute of Technology, Kurla (W), Mumbai 400070, MS, India satyachavan@yahoo.co.in 3 SKN Medical College and General Hospital, Narhe, Pune 411041, MS, India abhijitpawar.rad@gmail.com Abstract. The paper presents a feature based medical image fusion approach for CT and MRI images. The directional features are extracted from co-registered CT and MRI slices using Non-Subsampled Dual Tree Complex Wavelet Trans- form (NS DT-CxWT). These features are combined using average and maxima fusion rules to create composite spectral plane. The new visually enriched image is reconstructed from this composite spectral plane by applying inverse transfor- mation. Such fused images are evaluated for its visual quality using subjective and objective performance metrics. The quality of fused image is rated by three radiologists in subjective evaluation whereas edge and similarity based fusion parameters are computed to estimate the quality of fused image objectively. The proposed algorithm is compared with the state of the art wavelet transforms. It provides visually enriched fused images retaining soft tissue texture of MRI along with bone and lesion outline from CT with better contrast for lesion visualization and treatment planning. It is also found that the average score by radiologists is ‘3.85’ for proposed algorithm which is much higher than that of the average score for other wavelet algorithms. Keywords: Medical image fusion · Non-subsampled complex wavelet transform Dual Tree Complex Wavelet Transform · Discrete Wavelet Transform Radiotherapy · Fusion parameters 1 Introduction Medical imaging is extensively used in disease diagnosis and treatment since last two decades. Major imaging modalities are Ultrasound Guided Imaging (USG), Computed Tomography (CT), and Magnetic Resonance Imaging (MRI) along with functional MRI (fMRI), Positron Emission Tomography (PET), and Single-Photon Emission Computed Tomography (SPECT). Every modality imaging has its own advantages and disadvan- tages like CT captures calci?cations, implants, and bone structures prominently whereas MRI provides better visualization of soft tissues and lesions [1]. No single modality provides all relevant clinical information together. Therefore, there is a need to develop techniques which will bring important clinical information of two or more modalities © Springer Nature Switzerland AG 2019 K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 548–556, 2019. https://doi.org/10.1007/978-3-030-02686-8_41 in a single frame. Such techniques which aid the radiologists in disease diagnosis and treatment planning are called multimodality medical image fusion. The acquisition process of these modalities is also completely di?erent which makes them complemen- tary modalities for the fusion. Medical image fusion has signi?cant role in treatment of cancer using radiation therapy. The treatment uses CT as main modality whereas MRI is preferred as a comple- mentary modality. The delineation of infected cells or tissues is obtained using both CT & MRI and planning of radiation procedure is done using CT. Obviously, it is a great help to medical physicist to have both CT and MRI information together in a single frame for delineation. This will help radiation oncologist to prepare precise treatment plan for treating the cancer patients in a best possible way. 
In fusion system, source modalities can be varied over large number of acquisition processes. The source modality images have complementary structural representations. Many techniques and algorithms were proposed in the literature for the fusion [2]. Two major categories of fusion techniques are spatial domain and frequency domain techni- ques. Fusion process is also broadly divided into point wise fusion, feature based fusion, and parametric mapping of decision fusion. Point wise fusion is simpler and combines information point to point, feature level fusion extracts and merges features, and decision level fusion selects and maps the relevant information for creating new image. As per literature, pyramid and wavelet based Multiresolution Analysis (MRA) approaches are extensively used for medical image fusion [3]. However, wavelet based methods showed superior results as wavelets decompose the source images into frequency sub-bands which give an edge over the pyramid transforms. Discrete Wavelet Transform (DWT) provides spatio-spectral localization, better directional sensitivity with good signal-to-noise ratio. It is preferred transform by many researchers for medical image fusion [4–7]. However, fused images may have distortions and visual inconsis- tencies due to demerits of DWT like limited directional selectivity, oscillations, no phase information, etc. Recently, complex wavelet transform is also preferred over DWT. Dual Tree Complex Wavelet Transform (DT-CxWT), Daubechies Complex Wavelet Trans- form (DCxWT) and M-Band Wavelet Transform (MBWT) have used for fusion process due to their directional sensitivity and phase information [8–10]. Edge based techniques like contourlet transform [11], curvelet transform [12], shearlet transform [13], ripplet transform [14] have also gained much attention in medical image fusion. Redundancy Discrete Wavelet Transform (RDWT) also performs better due to its shift invariance property [15]. Soft computing approaches like arti?cial neural network, fuzzy logic, neuro-fuzzy, etc. are also preferred for medical image fusion [16]. However, retaining visual content in fused images is still a challenge which requires development of new fusion schemes. In this paper, new fusion scheme is proposed which uses Non-Subsampled Dual Tree Complex Wavelet Transform (NS DT-CxWT) to extract directional features from source CT and MRI images. These features in spectral space are combined using fusion rules like averaging of low frequency coe?cients and selection of maximum valued high frequency coe?cients. The proposed fusion scheme is described in Sect. 2 along with conceptual background of NS DT-CxWT and fusion rules. The experimental results and Non-subsampled Complex Wavelet Transform 549 analysis of fused images using subjective and objective evaluation metrics are presented in Sect. 3 which is followed by conclusion and future scope in Sect. 4. 2 Proposed Fusion Scheme The medical image fusion is a process of merging the relevant and complementary clin- ical information into new visually enriched fused image [5]. Figure 1 shows the proposed fusion scheme in which the directional spectral features are extracted using NS DT- CxWT. The source images are co-registered CT and MRI slices of same anatomical structure of the same patient. The selection of appropriate frames from the source modalities are done by radiologists. These selected frames of CT and MRI are registered for pixel alignment using geometric transformations like scaling, translation and rota- tion. 
The e?ectiveness of fusion process depends on the registration process. Fig. 1. Proposed medical image fusion scheme. The directional features of CT and MRI are combined using fusion rules resulting new spectral plane. The inverse NS DT-CxWT is applied to reconstruct the fused image from this new feature plane. The fused images are tested for their visual quality subjec- tively with the help of radiologists. The fusion parameters are also calculated to evaluate the fused images for their visual quality and preservation of anatomical structures from the source images. The novelty of this paper is the feature extraction using NS DT- CxWT and fusion rules which are discussed in the following subsections. 2.1 Discrete Wavelet Transform Discrete Wavelet Transform (DWT) is widely used technique for subband decomposi- tion of images. It converts image into four subbands at ?rst level of decomposition i.e. approximate (A1), horizontal (H1), vertical (V1), and diagonal (D1) subbands as shown in Fig. 2(a). A1 provides textural information and other subbands give three discontinu- ities as (0°, 90°, and ±45°) as shown in Fig. 2(b). However, DWT represents combined features in +45° and -45° orientations. It also su?ers due to less directionality, aliasing, oscillations at discontinuities, and shift variance [17]. 550 S. N. Talbar et al. Fig. 2. Discrete wavelet transform (a) First level decomposition (b) Corresponding fourier representation provides information as A1: textural, H1: 0°, V1: 90°, D1: ±45°. 2.2 Non-subsampled Dual Tree Complex Wavelet Transform Dual Tree Complex Wavelet Transform (DT-CxWT) is designed using real coe?cients in two tree structures resulting a complex nature. Real and imaginary parts of DT-CxWT are used in Tree ‘a’ and Tree ‘b’, respectively. The complex representation of DT-CxWT is given in the form of ‘a + jb’. DT-CxWT is nearly shift invariant, provides phase information, and exhibits high directional selectivity [17]. Figure 3 shows three levels of decomposition of NS DT-CxWT. Here, h0[n] & h1[n] are low pass ?lter coe?cients and g0[n] & g1[n] are high pass ?lter coe?cients in tree ‘a’ and ‘b’, respectively. After ?ltering using low pass and high pass ?lters, conventional down sampling operation is eliminated in every level to make DT-CxWT as Non-Subsampled DT-CxWT. Fig. 3. Three levels of decomposition by NS DTCxWT used in proposed medical image fusion scheme. NS DT-CxWT has six wavelets that are computed using (1) and (2). Here, ??a i (m, n) and ??b i+3 (m, n), i = 1, 2,3 are ?lter coe?cients which provides feature representations oriented in six directions as (±15°, ±45°, ±75°) after decomposition [17]. Thus, NS DT- CxWT has an edge over the other transforms in terms of high directional selectivity. The spectral directional representation for two levels of decomposition with six orien- tations is shown in Fig. 4. The non-subsampling avoids the loss of information. Non-subsampled Complex Wavelet Transform 551 ??a i (m, n) = v) 2 1 ( ??1,i( m, n) -) ??2,i( m, n) ) (1) ??b i+3 (m, n) = v) 2 1 ( ??1,i( m, n) + ??2,i( m, n) ) (2) Fig. 4. Fourier spectrum of NS DT-CxWT representing six distinct orientations. The merits of the proposed fusion scheme using NS DT-CxWT are the directional selectivity, phase information, shift invariance, and redundant content with same compu- tational complexity as DT-CxWT. It also supports in the selection of appropriate features to create composite spectral space. 
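For comparison with the plain DWT sketch earlier, the dual-tree complex wavelet decomposition with six oriented subbands per level can be sketched with the open-source dtcwt package. This is an assumption for illustration only: the package implements the standard decimated DT-CWT, whereas the transform used in this paper additionally omits the downsampling step.

```python
import numpy as np
import dtcwt  # assumption: third-party package implementing the (decimated) DT-CWT

image = np.random.rand(256, 256)                       # stand-in for a registered slice
pyramid = dtcwt.Transform2d().forward(image, nlevels=3)
lowpass = pyramid.lowpass                              # real-valued low-frequency subband
for level, hp in enumerate(pyramid.highpasses, 1):
    # complex coefficients; the last axis holds the six orientations
    # (approximately +/-15, +/-45 and +/-75 degrees)
    print(level, hp.shape)
```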
2.3 Fusion Rules

The source CT and MRI images are decomposed into three levels using the separable NS DT-CxWT. This results in two low-frequency subbands and six high-frequency subbands. Low-frequency subband coefficients are averaged, and the maximum-valued high-frequency coefficient is selected, using (3), to create the composite spectral space. Here, CP is the composite plane, t stands for tree 'a' or 'b', and K represents a particular subband (A, V, H, D). The inverse NS DT-CxWT is applied to this composite plane to reconstruct the fused image.

CP_t^K(u, v) = \begin{cases} \alpha\, CT_t^K(u, v) + (1 - \alpha)\, MRI_t^K(u, v), & \alpha = 0.5 \\ CT_t^K(u, v), & CT_t^K(u, v) > MRI_t^K(u, v) \\ MRI_t^K(u, v), & MRI_t^K(u, v) \ge CT_t^K(u, v) \end{cases}   (3)

3 Experimental Results and Discussion

The proposed fusion scheme is tested for its performance on a database of 29 study sets of CT and MRI of the same patient. Eighteen sets were captured using a Siemens Somatom Spirit CT scanner and a Siemens 1.5 T Magnetom C1 MRI machine, respectively, and 11 study sets were taken from the website 'https://radiopaedia.org/'. The radiologists selected slices based on anatomical markers. This was followed by geometric transformation to register them for pixel/voxel alignment. Sample study sets of CT and MRI are presented in Figs. 5(a–c) and (d–f), respectively. A personal computer having an Intel i5 processor (2.50 GHz) and 4 GB RAM was used for all the computations in MATLAB 2013a.
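A minimal sketch of the fusion rules in Eq. (3), using plain NumPy on already-computed coefficient arrays (illustrative only; for the complex-valued highpass coefficients the comparison would typically be made on magnitudes, a detail Eq. (3) leaves implicit).

```python
import numpy as np

def fuse_subbands(ct_low, mri_low, ct_high, mri_high, alpha=0.5):
    """Eq. (3): average the low-frequency subband coefficients and keep
    the larger high-frequency coefficient at each position. All inputs
    are coefficient arrays of equal shape."""
    fused_low = alpha * ct_low + (1.0 - alpha) * mri_low
    fused_high = np.where(ct_high > mri_high, ct_high, mri_high)
    return fused_low, fused_high
```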
Study set Algorithm En FusFac EQ mSSIM Set 1 DWT [6] 3.0887 3.8972 0.6871 0.6387 SWT [14] 3.1087 3.9213 0.7021 0.6377 NSCT [10] 3.1127 4.1252 0.7256 0.6646 DTCxWT [8] 3.1295 4.3586 0.7241 0.6574 Proposed 3.1985 5.8546 0.7883 0.7147 Set 2 DWT [6] 2.8476 4.3331 0.7164 0.5449 SWT [14] 2.5687 4.9647 0.7365 0.5598 NSCT [10] 2.9561 5.1243 0.7198 0.5836 DTCxWT [8] 2.8814 5.6574 0.7483 0.6054 Proposed 3.2149 6.0148 0.7928 0.6681 Set 3 DWT [6] 3.1125 3.6550 0.8605 0.6905 SWT [14] 3.3285 3.6805 0.8925 0.6207 NSCT [10] 3.5593 3.9871 0.8766 0.6982 DTCxWT [8] 3.7899 4.0153 0.8672 0.6879 Proposed 4.1106 5.1589 0.9056 0.7354 Three radiologists evaluated the quality of fused images subjectively. The fused images are compared with source images in terms of anatomical similarity, contrast, false content, and usefulness of fused images in delineation of infected cells or tumour. All the fused images are rated on the scale of 0 (poor) and 4 (excellent) by radiologists. The average score of subjective analysis of the fused images with various fusion algo- rithms is tabulated in Table 2. The average score for the proposed algorithm is ‘3.85’ which is higher than compared techniques. It proves that the fused images using proposed algorithm are useful in delineation and contouring of tumour for radiation therapy. Figure 5 shows fused images of three sample study sets using various wavelet techniques. Table 2. Subjective evaluation of fused images by Radiologists. S. N. Algorithm Subjective score by radiologists #1 #2 #3 Average 1 DWT [6] 2.50 2.80 2.70 2.67 2 SWT [14] 2.70 3.00 3.20 2.97 3 NSCT [10] 2.90 3.10 3.30 3.10 4 DT-CxWT [8] 3.10 3.30 3.40 3.27 5 Proposed 3.65 3.81 4.10 3.85 4 Conclusion and Future Scope The fusion scheme presented in this paper is a feature based approach in spectral domain using NS DT-CxWT. It provides multiscale and multiresolution representation with six directional selectivity, shift invariance, and phase information with reduced 554 S. N. Talbar et al. computational complexity. The fused images using proposed scheme are useful in better visualization of the abnormality or lesions for treatment planning in radiation therapy. Fusion rules take care of textural preservation and better representation of discontinuities which result in retaining actual anatomical structures in the fused images. The subjective score for the quality of fused images using the proposed scheme indicates the excellent visual quality and proves its usefulness in treatment planning. The objective parameters also exhibit superior fusion metrics for the proposed algorithm when compared with the other wavelet based fusion algorithms. The quality of fused images can be further improved by modifying fusion rules with the help of iterative fusion schemes like neural network, fuzzy logic, neuro-fuzzy, genetic algorithms, etc. References 1. Kessler, M.L.: Image registration and data fusion in radiation therapy. Br. J. Radiol. 79(1), S99–S108 (2006) 2. James, A.P., Dasarathy, B.V.: Medical image fusion: a survey of the state of the art. Inf. Fusion 19, 4–19 (2014) 3. Pajares, G., Cruz, J.M.: A wavelet-based image fusion tutorial. Pattern Recognit. 37(9), 1855– 1872 (2004) 4. Qu, G.H., Zhang, D.L., Yan, P.F.: Medical image fusion by wavelet transform modulus maxima. Opt. Express 9(4), 184–190 (2001) 5. Chavan, S.S., Talbar, S.N.: Multimodality image fusion in the frequency domain for radiation therapy. In: International Conference on Medical Imaging, m-Health and Emerging Communication Systems (MedCom), Noida, pp. 
174–178. IEEE (2014) 6. Yang, Y., Park, D.S., Huang, S., Rao, N.: Medical image fusion via an e?ective wavelet based approach. EURASIP J. Adv. Signal Process. Article ID-579341, 13 (2010) 7. Chavan, S.S., Pawar, A, Talbar, S.N.: Multimodality medical image fusion using rotated wavelet transform. In: 2nd International Conference on Communication and Signal Processing (ICCASP - 2016). Advances in Intelligent Systems Research, vol. 137, pp. 627– 635, Atlantic Press (2016) 8. Singh, R., Srivastava, R., Prakash, O., Khare, A.: Multimodal medical image fusion in dual tree complex wavelet transform domain using maximum and average fusion rules. J. Med. Imaging Health Inform. 2, 168–173 (2012) 9. Singh, R., Khare, A.: Fusion of multimodal medical images using Daubechies complex wavelet transform - a multiresolution approach. Inf. Fusion 19, 49–60 (2014) 10. Chavan, S.S., Talbar, S.N.: Multimodality medical image fusion using M-band wavelet and Daubechies complex wavelet transform for radiation therapy. Int. J. Rough Sets Data Anal. 2(2), 1–23 (2015) 11. Shanmugam, G.P., Bhuvanesh, K.: Multimodal medical image fusion in non-subsampled contourlet transform domain. Circuits Syst. 7, 1598–1610 (2016) 12. Chen, M.S., Lin, S.D.: Image fusion based on curvelet transform and fuzzy logic. In: 5th International Conference on Image and Signal Processing (CISP), pp. 1063–1067. IEEE (2012) 13. Wang, L., Li, B., Tian, L.F.: Multimodal medical image fusion using the interscale and intra-scale dependencies between image shift-invariant shearlet coe?cients. Inf. Fusion 19, 20–28 (2014) 14. Das, S., Chowdhury, M., Kundu, M.K.: Medical image fusion based on ripplet transform type-I. Prog. Electromagn. Res. B 30, 355–370 (2011) Non-subsampled Complex Wavelet Transform 555 15. Singh, R., Vatsa, M., Noore, A.: Multimodal medical image fusion using redundant discrete wavelet transform. In: Advances in Pattern Recognition, pp. 232–235 (2009) 16. Das, S., Kundu, M.K.: A neuro-fuzzy approach for medical image fusion. IEEE Trans. Biomed. Eng. 60, 3347–3353 (2013) 17. Selesnick, I.W., Baraniuk, R.G., Kingsbury, N.G.: The dual-tree complex wavelet transform. IEEE Signal Process. Mag. 22(6), 123–151 (2005) 556 S. N. Talbar et al. Predicting Concussion Symptoms Using Computer Simulations Milan Toma(B) Computational Bio-FSI Laboratory, College of Engineering and Computing Sciences, Department of Mechanical Engineering, New York Institute of Technology, Northern Boulevard, Old Westbury, NY 11568, USA tomamil@tomamil.eu http://www.tomamil.com Abstract. The reported rate of concussion is smaller than the actual rate. Less than half of concussion cases in high school football players is reported. The ultimate concern associated with unreported concus-sion is increased risk of cumulative e?ects from recurrent injury. This can, partially, be attributed to the fact that the signs and symptoms of a concussion can be subtle and may not show up immediately. Com-mon symptoms after a concussive traumatic brain injury are headache, amnesia and confusion. Computer simulations, based on the impact force magnitude, location and direction, are able to predict these symptoms and their severity. When patients are aware of what to expect in the coming days after head trauma, they are more likely to report the signs of concussion, which decreases the potential risks of unreported injury. 
In this work, the ?rst ever ?uid-structure interaction analysis is used to simulate the interaction between cerebrospinal ?uid and comprehensive brain model to assess the concussion symptoms when exposed to head trauma conditions. Keywords: Head injury · Concussion · Fluid-structure interaction Simulations 1 Introduction In 1981, Goldsmith’s letter to the editor states, “The state of knowledge con-cerning trauma of the human head is so scant that the community cannot agree on new and improved criteria even though it is generally admitted that present designations are not satisfactory” [1]. Even decades later, this assessment can still be considered reasonable to a degree. The head model presented here is the only model currently incorporating cerebrospinal ?uid (CSF) ?ow. Other reported head models treat CSF as a solid part incapable of ?owing around the brain when exposed to head trauma condi-tions [2–6]. The CSF ?ows even on its own when the head is at rest, albeit slowly. Obviously, when the head is exposed to a sudden stop, e.g. in a car accident, a c Springer Nature Switzerland AG 2019 K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 557–568, 2019. https://doi.org/10.1007/978-3-030-02686-8_42 558 M. Toma the CSF ?ow around the brain has a signi?cant contribution to the head injury mechanism. Without the ?ow the simulated cushioning e?ect of CSF can not be considered realistic. The most common reasons for concussion not being reported include a player not thinking the injury is serious enough to warrant medical attention (66.4% of unreported injuries), motivation not to be withheld from competition (41.0%), and lack of awareness of probable concussion (36.1%) [7]. Regardless of the rea-son, as McCrea et al. state, “Future prevention initiatives should focus on educa-tion to improve athlete awareness of the signs of concussion and potential risks of unreported injury.”, [7]. Needless to say, there is an unlimited number of trauma situations that can occur, and the concussion symptoms can vary from one case to another. In some cases, the skull is dented inward and it presses against the surface of the brain. These types of fractures occur in 11% of severe head injuries. In impact sports, the skull dentation rarely occurs. Most sport-related brain injuries result from coup-contrecoup type of injury. Coup-contrecoup injury is dual impacting of the brain into the skull; coup injury occurs at the point of impact; coun-trecoup injury occurs on the opposite side of impact, as the brain rebounds, see Fig. 1. Most common causes of coup-contrecoup brain injury include circum-stances when the head jerks violently, e.g. during motor vehicle accidents, when baseball players are colliding during the chase for a ball, football players tackling, boxers punching, and so on. Fig. 1. Coup-contrecoup injuries, brain shifts inside the skull resulting in injuries at point of impact and away from point of impact, e.g. forehead injury can result in additional injury to occipital area. The brain is composed of three main structural divisions, namely the cere-brum, cerebellum, and brainstem. The cerebrum is divided into two cerebral hemispheres connected by the corpus callosum and shared ventricular system. The CSF ?lls a system of cavities at the center of the brain, known as ventricles, Simulating Concussion Symptoms 559 and the subarachnoid space surrounding the brain and spinal cord (Fig. 2). The CSF cushions the brain within the skull and serves as a shock absorber for the central nervous system [8,9]. Fig. 
2. The schematic of the cerebrospinal ?uid in which the brain is submerged. The 3D computational model used is designed based on this schematic. 2 Methods The methods section describes the creation of the head model, loading conditions used for its validation, and numerical and computational methods used. A. Head Model The ?ve anatomical structures used in this study are shown in Fig. 3. They all have unique material properties. This patient-speci?c model is based on the Digital Imaging and Communications in Medicine (DICOM) images acquired from an online database. The skin, spinal cord, meninges, and the arachnoid granulation, are the anatomical features missing in this model. When compared to the very short impact impulse time history used in these simulations, the CSF ?ow in the head can be neglected, too. The CSF ?ow speed, 0.05–0.08 m·s-1 , is relatively slow compared to the speed of an impact leading to traumatic brain injuries, i.e. during the impact impulse time history the CSF ?ows by 0.2–0.3 mm. Based on these assumptions, the presence of the granulations can be neglected, too. B. Loading Conditions Based on whether the head is stationary and struck by a moving object, or is moving and strikes a stationary object, the type of brain injury di?ers, according to [10]. The stationary head is usually hit by objects which are of similar mass to the head. In this study, the scenario in Fig. 1 is used and it is assumed that the impacting object does not penetrate the skull. Thus, local deformation of the 560 M. Toma Fig. 3. The entire head model with skull, cerebrum, cerebellum, pituitary gland and brainstem, respectively. Fluid particles (blue dots surrounding the brain model, in the lower right corner) ?ll the entire subarachnoid space and other cavities. skull in the frontal area is not resulting in direct contact injury to the underlying brain tissue. It has been estimated that for a contact area of approximately 6.5 cm2 the force required to produce a clinically signi?cant skull fracture in the frontal area of the cadaver skull is twice that required in the temporoparietal area [11]. Corresponding loading conditions from cadaveric experiments in [12] are used to perform the computational analysis of a frontal impact. The experiments examined the blow to the head of a seated human cadaver. The impact pulse history applied to the skull of the computational model is shown in Fig. 4. Simulating Concussion Symptoms 561 Fig. 4. Impact impulse time history used to simulate the cadaveric experiments in [12] and applied to the skull in the current model. C. Computer Simulations As stated above, the model is comprised of ?ve parts. Rigid material properties with density 1900 kg·m-3 [13] are assigned to the skull part. A non-linear elas-tic constitutive material model with varying material properties based from the literature [14–18] is used to simulate the cerebrum, cerebellum, pituitary gland, and brainstem. The cerebrum is composed of 96,385 tetrahedral elements. Sim-ilarly, the cerebellum, brainstem, and pituitary gland are composed of 40,808, 18,634 and 310 tetrahedral elements, respectively. The smoothed-particle hydro-dynamics (SPH) method is used to model the CSF. The bulk modulus of 21.9 GPa [3] and density 1000 kg·m-3 [19] are used for the CSF. The subarachnoid space between the skull and brain, and other cavities, are ?lled with 94,690 ?uid particles. The IMPETUS Afea SPH Solver (IMPETUS Afea AS, Norway) was used R to solve the ?uid motion and boundary interaction calculations. 
Simultaneously, the IMPETUS Afea Solver was used to solve the large deformations calcula- R tions in the solid parts. In both the solvers, for parallel processing, a commodity GPU was used. To remove the possibility of hourglass modes and element inver-sion that plagues the classic under-integrated elements, all solid elements were fully integrated. An explicit integration scheme was used for both the ?uid and solid domains and their interaction. A standard “under the table” workstation was used for all simulations. Tesla K40 GPU with 12 GB of Graphic DDR memory and 2880 CUDA Cores were used to achieve the parallel acceleration. H-re?nement of the ?nite element mesh was performed to con?rm that conver-gence was reached. The solutions were found to yield same results with both the mesh size of our choice and mesh size of higher number of elements. Simi-larly, a higher number of ?uid particles is used to obtain results within 5% of the values obtained with the smaller number of particles. This con?rmed that the results are converged. The SPH equations in greater detail can be found in our prior publication [20]. This study used the SPH method rather than the 562 M. Toma traditional FSI techniques because the latter can be computationally expensive and challenging regarding their parallelization [21]. Geometrical simpli?cations would need to be used in order to use traditional FSI methods. Consequently, the anatomical accuracy of the model would have to be sacri?ced. Besides, recently the SPH has been increasingly used in biomedical applications by other research groups as well [22]. 3 Results The results section shows validation of the simulations matching coup and con-trecoup responses in CSF with experimental results. The stress values on the cerebrum resulting from the frontal impact are shown and SPH impulse inten-sity is superimposed with the Boadmann’s map of cytoarchitectonics. A. Validation The loading conditions from cadaveric experiments (Fig. 4) applied to the frontal lobe yield corresponding coup and contrecoup pressure responses in CSF, see Fig. 5 where both experimental [12] and computational results are shown for comparison. B. Second Deviatoric Principal Stress The stress values on the cerebrum resulting from the frontal impact are shown in Fig. 6. The stress maxima can be found also on the occipital lobe which supports the experimental observations that forehead injury can result in additional injury to occipital area. Similar conclusion, i.e. stresses and strains seen in both frontal and occipital lobes, is also found in other more simpli?ed computational studies, e.g. [5]. Similar results, i.e. high stress values, are found also on the parietal lobe (Fig. 7). Moreover, here it is possible to make an additional observation that they only occur on the posterior aspects of the gyri. C. SPH impulse intensity In biomedical ?uid mechanics, the wall shear stress is often used to describe the e?ect the ?uid ?ow has on the surrounding structure. However, that variable is challenging to derive when using SPH methods. Instead, SPH can provide di?erent variable with similar meaning. For example, SPH impulse intensity, i.e. SPH driven mechanical impulse per unit area in pascal-second, has similar properties as wall shear stress. The SPH impulse intensity at peak impact impulse is shown in Fig. 8 [25]. At ?rst, the SPH impulse intensity develops slowly. And, eventually, it reaches its maximum values around the peak. 
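As an illustration only (not part of the original analysis), the impulse-intensity idea can be sketched as the contact force exerted by the fluid particles on each surface element, integrated over the impact and normalised by the element area, giving units of Pa·s. The array layout and the uniform time step below are assumptions for the sketch, not the output format of the IMPETUS Afea solver.

```python
import numpy as np

# Hedged sketch: time-integrated fluid-to-wall contact force per unit
# area ("SPH impulse intensity", Pa*s), used here as an analogue of
# wall shear stress. Shapes and values are synthetic placeholders.

def sph_impulse_intensity(contact_forces, dt, element_areas):
    """contact_forces: (n_steps, n_elements, 3) forces in N;
    dt: time-step size in s; element_areas: (n_elements,) in m^2."""
    force_magnitudes = np.linalg.norm(contact_forces, axis=2)   # (n_steps, n_elements)
    impulse_per_element = (force_magnitudes * dt).sum(axis=0)   # N*s, integrated over the impact
    return impulse_per_element / element_areas                  # Pa*s

# Synthetic example: 4 surface elements over 100 time steps
rng = np.random.default_rng(0)
forces = rng.random((100, 4, 3)) * 1e-3
print(sph_impulse_intensity(forces, dt=1e-5, element_areas=np.full(4, 1e-6)))
```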
The areas most a?ected by the ?uid particles during their migration to the occipital/parietal bones, i.e. the acceleration phase, are the parietal and upper temporal lobes. The higher SPH impulse intensity values become more visible also in the occipital lobe when the ?uid particles change direction and start their migration towards the frontal bone, i.e. at the peak. Simulating Concussion Symptoms 563 Fig. 5. Coup (a) and contrecoup (b) pressure responses in cerebrospinal ?uid compared to the experimental results of Nahum et al. [12]. Fig. 6. High values of the second deviatoric principal stress are observed in both the frontal and occipital lobes of the brain, i.e. forehead injury can result in additional injury to occipital area. High values are prevalent mostly in the inner areas of the two hemispheres close to the edges where longitudinal ?ssure separates the two halves of the brain (dashed rectangle). Cerebral structures have been correlated with speci?c functions [23,24]. While the structure-function relationship is still debated, Brodmann’s map is frequently cited [23]. Figure 8 imposes Brodmann’s map of cytoarchitectonics and depicts the functional areas most a?ected at the peak. Areas ‘40’, ‘4’, ‘3,1,2’ and ‘52’ are those covered with more than 10% of SPH impulse intensity maxima (10.1, 11.7, 15.3 and 21.7%, respectively). 564 M. Toma Fig. 7. High values of the second deviatoric principal stress are observed also in the parietal lobe. However, in the parietal lobe the areas with high values are observed only in the posterior aspects of the gyri (schematic and dashed ellipsoid). Fig. 8. The SPH impulse intensity at the peak superimposed with the Brodmann’s map of cytoarchitectonics [25]. 4 Discussion The di?erent layers of the brain move at di?erent times because each layer has a di?erent density. Simpli?ed computational models are not able to incorporate this important aspect. Moreover, interaction between CSF and brain gyri and sulci can not be analyzed computationally if the methods used do not model the CSF as ?uid. The model used in this study uses a comprehensive head/brain model with detailed representation of all the parts and the computational anal-ysis used is an FSI method with ?uid properties for the CSF. The validation of this model and the computational method is shown comparing the coup and con-trecoup pressure responses in CSF with the experimental results from cadaveric experiments. Simulating Concussion Symptoms 565 A few anatomical features are omitted in the head model; namely the skin, arachnoid granulations, spinal cord, vasculature, and meninges. Obviously, skin is irrelevant in this case. Due to the relatively slow CSF ?ow, the arachnoid gran-ulations are negligible. The spinal cord, vasculature, and meninges are omitted at this stage to make the simulations less computationally expensive, but they may be considered in future studies. In Fig. 5, where coup and contrecoup pressure responses in CSF compared to the experimental results of [12] are shown, it can be observed that the agree-ment with the experimental results is better in the coup response as opposed to that in the contrecoup response. The contrecoup pressure response reaches slightly higher values compared to the experimental data because the contrecoup response is secondary and therefore more dependent on the patient-speci?c geom-etry used. However, both coup and contrecoup computational pressure responses can be considered of good agreement with the experimental measurements. 
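The reported agreement could also be made quantitative. The sketch below shows one possible way, comparing peak pressure and a normalised RMS error between a simulated and a measured pressure history; the traces are synthetic placeholders, not the data of Nahum et al. [12] or of the present simulations.

```python
import numpy as np

# One hedged way to quantify coup/contrecoup agreement: resample the
# simulated trace onto the experimental time base, then compare peak
# pressure and a normalised RMS error. All inputs below are synthetic.

def agreement_metrics(t_exp, p_exp, t_sim, p_sim):
    p_sim_on_exp = np.interp(t_exp, t_sim, p_sim)                       # common time base
    peak_error = abs(p_sim_on_exp.max() - p_exp.max()) / abs(p_exp.max())
    nrmse = np.sqrt(np.mean((p_sim_on_exp - p_exp) ** 2)) / (p_exp.max() - p_exp.min())
    return peak_error, nrmse

t = np.linspace(0.0, 8e-3, 200)                          # 8 ms impact window
p_measured = 150e3 * np.exp(-((t - 3e-3) / 1e-3) ** 2)   # synthetic coup pulse [Pa]
p_computed = 1.05 * p_measured + 2e3 * np.sin(2e3 * t)   # synthetic model output [Pa]
print(agreement_metrics(t, p_measured, t, p_computed))
```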
As discussed, if the interaction of CSF with the brain is to be analyzed the CSF has to be modeled with ?uid elements or particles and not just with ?uid-like solid elements. The results then have potential to show more complex responses to the loading conditions. For example, Fig. 6 shows that the contrecoup stress response is prevalent mostly in the inner areas of the two hemispheres close to the edges where longitudinal ?ssure separates the two halves of the brain. The brain model is comprehensive containing multiple parts each with detailed real-istic patient-speci?c geometry. The complexity of the model enables the analysis of the brain down to the exact gyrus and sulcus. Additional areas of high stress values can be found outside the frontal and occipital lobes. However, interest-ingly, only the posterior aspect of the gyrus seems to be a?ected. This can be explained by following the wave in the CSF that occurs after the impact to the frontal lobe [25]. During the acceleration phase when the brain wants to move backwards relative to the skull the ?uid particles move to concentrate in the space between the skull and occipital lobe to provide the cushioning e?ect and prevent the brain from impacting to the skull. At that point the moving particles a?ect mostly the anterior sides of the gyri. When the brain rebounds and wants to move forward relative to the skull the ?uid particles move to the space between the skull and frontal lobe to provide the cushioning e?ect there. At that point the moving particles a?ect mostly the posterior side of the gyri. Other parts of the brain, such as the brain stem, are equally a?ected by the coup-contrecoup injury. The variables readily available in the SPH methods are somewhat di?erent from those commonly used to post-process the results in the biomedical ?uid mechanics, e.g. wall shear stress extracting of which would be more challenging when using the SPH methods. On the other hand, e.g. SPH impulse intensity can be used in its stead as it o?ers similar meaning. In order to maintain as much anatomical accuracy as possible, SPH is used in this study instead of the traditional FSI techniques which would require more anatomical simpli?cations to keep the convergence criteria satis?ed. 566 M. Toma The cortical areas a?ected by SPH impulse intensity at the peak are pre-sented in Fig. 8 [25,26]. It is o?ered that the patterns of SPH impulse intensity maxima may represent the cortical areas most a?ected by a concussion. Areas ‘40’, ‘4’, ‘3,1,2’, and ‘52’ are the Brodmann’s areas with at least 10% coverage of maximal SPH impulse intensity. The left supramarginal gyrus, i.e. Brodmann area ‘40’, receives input from multiple sensory modalities and supports complex linguistic processes. Lesions in that area may yield Gerstmann syndrome and ?uent aphasia, such as Wernicke’s aphasia. Motor functions are typically asso-ciated with Brodmann area ‘4’, but it also plays a supportive role in sensory perception. Lesions there may result in paralysis and decreased somatic sensa-tion. Brodmann areas ‘3,1,2’ comprise the postcentral gyrus in the parietal lobe and are primarily associated with somatosensory perception. Lesions there may result in cortical sensory impairments, e.g. loss of ?ne touch and proprioception. Brodmann area ‘52’, i.e. the parainsular, is the smallest of the mentioned areas and has the highest percentage of SPH impulse intensity maxima coverage. It joins the insula and the temporal lobe. 
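Coverage percentages such as those quoted above for areas '40', '4', '3,1,2' and '52' could, in principle, be tabulated by labelling each cortical surface element with a Brodmann area and counting the elements whose impulse intensity lies near the global maximum. The sketch below illustrates this bookkeeping only; the labels, data, and the 95% band are assumptions, not the procedure used to produce Fig. 8.

```python
import numpy as np
from collections import Counter

# Hedged sketch: per-area share of the elements whose SPH impulse
# intensity falls within a band of the global maximum.

def coverage_by_area(intensity, brodmann_label, band=0.95):
    near_max = intensity >= band * intensity.max()
    counts = Counter(brodmann_label[near_max])
    total = near_max.sum()
    return {area: 100.0 * n / total for area, n in counts.items()}

intensity = np.random.default_rng(1).random(1000)
labels = np.random.default_rng(2).choice(["40", "4", "3,1,2", "52", "other"], size=1000)
print(coverage_by_area(intensity, labels))
```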
This validated model, where an FSI method is used to analyze the interac-tion between CSF and brain, is a step closer to understanding the mechanisms of brain injuries. Concussions are usually diagnosed symptomatically. Patients may exhibit a range of symptoms, such as headache, tinnitus, photophobia, sleepi-ness, dizziness, behavioral changes and confusion. Di?erent area of brain a?ected would potentially result in di?erent set of symptoms. The model and method presented in this study can predict the areas a?ected based on the loading con-ditions. Therefore, the symptoms can be predicted, too. Since the signs and symptoms of a concussion can be subtle and may not show up immediately, a numerical analysis of this kind could serve as a predictor for the physicians and patients who then could be warned about what symptoms they are to expect and be ready for. Hence, if used in practice, it has the potential to contribute to early diagnosis which is important in treatment of concussion. References 1. Goldsmith, W.: Current controversies in the stipulation of head injury criteria - letter to the editor. J. Biomech. 14(12), 883–884 (1981) 2. Luo, Y., Li, Z., Chen, H.: Finite-element study of cerebrospinal ?uid in mitigating closed head injuries. J. Eng. Med. 226(7), 499–509 (2012) 3. Cha?, M.S., Dirisala, V., Karami, G., Ziejewski, M.: A ?nite element method parametric study of the dynamic response of the human brain with di?erent cerebrospinal ?uid constitutive properties. In: Proceedings of the Institution of Mechanical Engineers, Part H (2009). Journal of Engineering in Medicine 223(8), 1003–1019 4. Liang, Z., Luo, Y.: A QCT-based nonsegmentation ?nite element head model for studying traumatic brain injury. Appl. Bionics Biomech. 2015, 1–8 (2015) 5. Gilchrist, M.D., O’Donoghue, D.: Simulation of the development of the frontal head impact injury. J. Comp. Mech. 26, 229–235 (2000) Simulating Concussion Symptoms 567 6. Ghajari, M., Hellyer, P.J., Sharp, D.J.: Computational modelling of traumatic brain injury predicts the location of chronic traumatic encephalopathy pathology. Brain 140(2), 333–343 (2017) 7. McCrea, M., Hammeke, T., Olsen, G., Leo, P., Guskiewicz, K.: Unreported con-cussion in high school football players: implications for prevention. Clin. J. Sport Med. 14(1), 13–17 (2004) 8. Rengachary, S.S., Ellenbogen, R.G.: Principles of Neurosurgery. Elsevier Mosby, New York (2005) 9. Toma, M., Nguyen, P.: Fluid-structure interaction analysis of cerebral spinal ?uid with a comprehensive head model subject to a car crash-related whiplash. In: 5th International Conference on Computational and Mathematical Biomedical Engi-neering - CMBE2017. University of Pittsburgh, Pittsburgh (2017) 10. Yanagida, Y., Fujiwara, S., Mizoi, Y.: Di?erences in the intracranial pressure caused by a blow and/or a fall - experimental study using physical models of the head and neck. Forensic Sci. Int. 41, 135–145 (1989) 11. Nahum, A.M., Gatts, J.D., Gadd, C.W., Danforth, J.: Impact tolerance of the skull and face. In: 12th Stapp Car Crash Conference, Warrendale, PA, pp. 302– 316. Society of Automotive Engineers (1968) 12. Nahum, A.M., Smith, R.W., Ward, C.C.: Intracranial pressure dynamics during head impact. In: 21st Stapp Car Crash Conference (1977) 13. Fry, F.J., Barger, J.E.: Acoustical properties of the human skull. J. Acoust. Soc. Am. 63(5), 1576–1590 (1978) 14. Barser, T.W., Brockway, J.A., Higgins, L.S.: The density of tissues in and about the head. Acta Neurol. Scandinav. 46, 85–92 (1970) 15. 
Elkin, B.S., Azeloglu, E.U., Costa, K.D., Morrison, B.: Mechanical heterogene-ity of the rat hippicampus measured by atomic force microscope indentation. J. Neurotrauma 24, 812–822 (2007) 16. Gefen, A., Gefen, N., Zhu, Q., Raghupathi, R., Margulies, S.S.: Age-dependent changes in material properties of the brain and braincase of the rat. J. Neurotrauma 20, 1163–1177 (2003) 17. Kruse, S.A., Rose, G.H., Glaser, K.J., Manduca, A., Felmlee, J.P., Jack Jr., C.R., Ehman, R.L.: Magnetic resonance elastography of the brain. Neuroimage 39, 231– 237 (2008) 18. Moore, S.W., Sheetz, M.P.: Biophysics of substrate interaction: in?uence on neutral motility, di?erentiation, and repair. Dev. Neurobiol. 71, 1090–1101 (2011) 19. Lui, A.C., Polis, T.Z., Cicutti, N.J.: Densities of cerebrospinal ?uid and spinal anaesthetic solutions in surgical patients at body temperature. Can. J. Anaesth. 45(4), 297–303 (1998) 20. Toma, M., Einstein, D.R., Bloodworth, C.H., Cochran, R.P., Yoganathan, A.P., Kunzelman, K.S.: Fluid-structure interaction and structural analyses using a com-prehensive mitral valve model with 3D chordal structure. Int. J. Numer. Meth. Biomed. Engng. 33(4), e2815 (2017). https://doi.org/10.1002/cnm.2815 21. Toma, M., Oshima, M., Takagi, S.: Decomposition and parallelization of strongly coupled ?uid-structure interaction linear subsystems based on the Q1/P0 discretization. Comput. Struct. 173, 84–94 (2016). https://doi.org/10.1016/j. compstruc.2016.06.001 22. Toma, M.: The emerging use of SPH in biomedical applications. Signi?cances Bio-eng. Biosci. 1(1), 1–4 (2017). SBB.000502 23. Brodmann, K.: Vergleichende Lokalisationslehre der Grosshirnrinde (in German). Johann Ambrosius Barth, Leipzig (1909) 568 M. Toma 24. Limited TCT Research (ed.) Cortical Functions. Trans Cranial Technologies ltd. (2012) 25. Toma, M., Nguyen, P.: Fluid-structure interaction analysis of cerebrospinal ?uid with a comprehensive head model subject to a rapid acceleration and deceleration. Brain Inj. 1–9 (2018). https://doi.org/10.1080/02699052.2018.1502470 26. Varlotta, C., Toma, M., Neidecker, J.: Ringside physicians’ medical manual for boxing and mixed martial arts: technology & impact sensor testing. Association of Ringside Physicians, Chapter D10 (2018) Integrating Markov Model, Bivariate Gaussian Distribution and GPU Based Parallelization for Accurate Real-Time Diagnosis of Arrhythmia Subclasses Purva R. Gawde1(&) , Arvind K. Bansal1 , and Jeffery A. Nielson2 1 Department of Computer Science, Kent State University, Kent, OH 44240, USA pgawde@kent.edu, arvind@cs.kent.edu 2 Department of Emergency, Northeast Ohio Medical University, Rootstown, OH, USA jeffnielson@gmail.com Abstract. In this paper, we present the integration of SIMT (Single Instruction Multiple Threads), Markov model and bivariate Gaussian distribution as a general-purpose technique for real-time accurate diagnosis of subclasses of arrhythmia. The model improves the accuracy by integrating both morpholog-ical and temporal features of ECG. GPU based implementation exploits con-current execution of multiple threads at the heart-beat level to improve the execution ef?ciency. The approach builds a bivariate Gaussian Markov model (BGMM) for each subclass of arrhythmia where each state includes bivariate distribution of temporal and morphological features of each waveform and ISO-lines using ECG records for each subclass from standard databases, and the edge-weights represent the transition probabilities between states. 
Limited 30- second subsequences of a patient’s beats are used to develop bivariate Gaussian transition graphs (BGTG). BGTGs are matched with each of the BGMMs to derive the exact classi?cation of BGTGs. Our approach exploits data-parallelism at the beat level for ECG preprocessing, building BGTGs and matching multiple BGTG-BGMM pairs. SIMT (Single Instruction Multiple Thread) available on CUDA resources in GPU has been utilized to exploit data-parallelism. Algo-rithms have been presented. The system has been implemented on a machine with NVIDIA CUDA based GPU. Test results on standard MIT- BIH database show that GPU based SIMT improves execution time further by 78% with an overall speedup of 4.5 while retaining the accuracy achieved by the sequential execution of the approach around 98%. Keywords: ArrhythmiaAI techniquesECG analysisGaussian GPUMarkov modelMedical diagnosisMachine learning ParallelismWearable devices © Springer Nature Switzerland AG 2019 K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 569–588, 2019. https://doi.org/10.1007/978-3-030-02686-8_43 1 Introduction An aging population is challenging the current healthcare system by increasing costs, creating a lack of healthcare personnel, and contributing to more complex combinations of chronic diseases [1]. Cardiovascular diseases like arrhythmia, ischemia, myocardial infarction and cardiomyopathy (including hypertrophy) are some of the most common problems in elderly leading to sudden cardiac death (SCD) [1, 2] and congestive heart failure. Often, these symptoms go undetected due to the transient nature of symptoms and the mobile life-style of the modern society. Transitory nature of arrhythmia requires monitoring of ECG to diagnose and reduce the risk of SCD [1] including life-threatening ventricular ?brillation [3]. The demand for an improved healthcare system requires development of infor-mation technology, and one area of opportunity is wearable smart monitoring devices [4, 5]. Advances in microelectronics have provided smaller, faster and more affordable embedded platforms for personal monitoring systems such as the NVIDIA Jetson GPU [4, 5]. Most of these wearable biomedical systems can detect a variety of abnormalities such as stress, oxygen level saturation, ischemia and arrhythmias, but with limited accuracy. ECG signal analysis for real-time detection of abnormalities involves computation-ally expensive modules like signal denoising, morphological and temporal feature extractions; complex functional transforms [6], computational intelligence techniques for classi?cation and machine learning. The AI techniques include the use of Bayesian network [7], neural networks [8] and Markov models [9, 10]. The computational overhead of exploiting these techniques is signi?cant and violates the basic requirement of resource-limited smart wearable devices diagnosing abnormality accurately in real-time. In recent years, several researchers have exploited GPU based SIMT (Single Instruction Multiple Threads) parallelism to improve the computational ef?ciency for automated ECG analysis [11], de-noising [12], and classi?cation of premature beats using neural networks [8, 13]. Different techniques for parallelization include time-domain analysis [7] and probabilistic neural networks [13]. For arrhythmic beat clas-si?cation, Fan, Xiaomao, et al. [14] have proposed GPU based detection of seven types of beats using thresholds and rule-based system. 
These studies indicate that GPU based parallelization signi?cantly improves the computational ef?ciency of ECG analysis. However, these studies separate only premature ventricular complex beats from normal beats [13], and do not address the diagnosis of the subclassi?cation of ventricular and supraventricular arrhythmia in real time. The ?ner classi?cation of arrhythmias is important because different subclasses require different treatment [2]. For instance, ventricular tachycardia is generally treated with antiarrhythmic drugs [2]; while ventricular ?brillation needs immediate treatment by a de?brillator. Subclasses of supraventricular arrhythmia like the atrial flutter can result into blood clots leading to cerebrovascular events [3] if not treated. Finer subclassi?cation requires an integrated model that can capture both mor-phological and temporal characteristics of ECG and consider transition probabilities within waveforms to account for waveform variations. Arrhythmic ECG also presents a 570 P. R. Gawde et al. challenge when some waveform features are embedded in another waveform [3], which can lead to misclassi?cation [1, 3]. Our earlier work focused on detecting ?ner subclasses of supraventricular and ventricular arrhythmia in real time using the integration of Markov models and the identi?cation of embedded P-waves [9, 10]. The run-time detection of the disease subclass requires: (1) statistical derivation of a theoretical Markov model graph for each subclass; (2) dynamically building a real-time graph using a limited number of beats at the run-time; and (3) matching the real-time graph from an individual patient to the derived graphs to best classify the patient condition. Our previous work needs to be further improved for time-ef?ciency because resource-limited wearable devices need to analyze the ECG for other heart abnor-malities such as ischemia (lack of oxygen), electrolyte imbalance such as hyperkalemia (excessive potassium), myocardial infarction (heart failure due to prolonged ischemia) to name a few. Additional improvement in execution-time is required to facilitate real-time detection of other heart abnormalities concurrently and in real-time in resource-limited miniaturized wearable devices [4, 5]. In this research, we propose an integrated general-purpose BGMM (Bivariate Gaussian Markov Model) model that further improves the accuracy by associating bivariate Gaussian distribution of amplitude and duration of the waveforms and ISO-lines with each state of the Markov model. We improve execution ef?ciency by exploiting SIMT parallelism available on GPU as shown in Fig. 1. The major contributions in this paper are: 1. The development of a general-purpose model that integrates bivariate Gaussian distribution of amplitude and duration of waveforms for a state with Markov model to integrate morphological and temporal features. 2. The exploitation of SIMT based concurrency on GPUs that signi?cantly improves the execution ef?ciency of the ?ner subclassi?cation of arrhythmia. 3. The development of multiple algorithms for beat-level exploitation of SIMT for dynamic graph building and graph matching exploiting expectation maximization for the arrhythmia subclassi?cation. The remainder of the paper is organized as follows: Sect. 2 describes the back-ground concepts of Markov model and bivariate Gaussian distribution. Section 3 describes our BGMM based approach for arrhythmia subclassi?cation. Section 4 dis-cusses SIMT parallelization of the approach. 
Section 5 discusses algorithms for the execution of the kernel functions; Sect. 6 discusses implementation and performance results. Section 7 compares our approach and performance with other related works. Section 8 concludes the paper and discusses future directions.

Fig. 1. Personal monitoring system for multiple abnormalities detection (sensor data → CPU preprocessing → multiprocessor embedded GPU with SIMT concurrency → executable functions for abnormality detection 1…n).

2 Background

2.1 Arrhythmia Subclassification

Arrhythmia is defined as irregular heartbeats caused by irregular and refractory pulse patterns due to ectopic nodes arising outside the sinus node. Arrhythmia is broadly classified into supraventricular arrhythmias, arising above the lower chambers of the heart, and ventricular arrhythmias, arising in the lower chambers of the heart. Supraventricular arrhythmias are further subclassified as: (1) Atrial fibrillation (AFib); (2) Atrial flutter (AF); (3) Atrial-ventricular nodal reentry tachycardia (AVNRT); and (4) Ectopic atrial tachycardia (EAT). Ventricular arrhythmia is classified into three major subclasses: (1) Ventricular Tachycardia (VTach); (2) Ventricular Flutter (VFlu); and (3) Ventricular Fibrillation (VFib). Different subclasses pose different levels of threat to health and are treated differently [3].

Atrial fibrillation (AFib) is characterized by the absence of P-waves and a QRS complex duration of less than 120 ms, with an atrial rate of 400–600 beats per minute (bpm). Atrial flutter (AF) is characterized by the presence of P-waves with shorter duration, an elevated PQ baseline, and 250–350 atrial bpm. Atrial-ventricular nodal reentry tachycardia (AVNRT) is characterized by retrograde P-waves after or embedded inside the QRS complex, with an atrial rate of 250–300 bpm. Ectopic atrial tachycardia (EAT) is characterized by negative P-waves, T-wave elevation, and a heart rate of around 150 bpm. Ventricular Tachycardia (VTach) is typically characterized by a wide S-wave (>100 ms), an elevated R-wave, a wide T-wave, and a heart rate greater than 100 bpm. Ventricular Flutter (VFlu) is characterized by the absence of P-waves, T-waves, S-waves and baselines, wide R-waves with elevated amplitude, and an increased QT duration with a heart rate of 180–250 bpm. Ventricular Fibrillation (VFib) is characterized by no identifiable P-wave, T-wave or ISO lines, elevated ST baselines, and a heart rate of 150–500 bpm.

Fig. 2. A subclassification of arrhythmia (supraventricular: AFib, AF, AVNRT, EAT; ventricular: VTach, VFlu, VFib).

2.2 Markov Model

A Markov model [15] is a probabilistic finite-state nondeterministic automaton modeled by a 5-tuple of the form (set of all states, set of initial states, set of final states, transition matrix, initial-state-probability vector). Weighted edges are the transition probabilities between two adjacent states. Statistical analysis based upon transition frequency is used to build Markov models.

2.3 Bivariate Gaussian Distribution

The joint distribution of two variables, denoted A and B, each having a normal Gaussian distribution [16, 17], is calculated using the conditional variance, which is based on the correlation between the variables [17]. Assume \mu_A and \sigma_A denote the mean and standard deviation of the variable A, and \mu_B and \sigma_B denote the mean and standard deviation of B. The conditional mean of B is calculated by (1):

E(B \mid A) = \mu_B + \rho \frac{\sigma_B}{\sigma_A} (A - \mu_A)    (1)

where \rho denotes the correlation coefficient between the variables A and B. The conditional variance of B is calculated by (2):

\sigma_{B|A}^2 = \sigma_B^2 (1 - \rho^2)    (2)

The conditional distribution of the variable B given A = a is calculated by (3), where \mu_{B|A} is the conditional mean (1) evaluated at A = a:

h(b \mid a) = \frac{1}{\sigma_{B|A}\sqrt{2\pi}} \exp\left[-\frac{(b - \mu_{B|A})^2}{2\sigma_{B|A}^2}\right]    (3)

Using the conditional distribution of B, the joint probability distribution is calculated by (4):

f(a, b) = f_A(a) \cdot h(b \mid a)    (4)

2.4 Statistical Modeling of ECG for Subclassification

Bivariate Gaussian Markov Model (BGMM). A bivariate Gaussian Markov model (BGMM) is a special class of Markov models that integrates a joint Gaussian distribution [17] of the feature vectors for the states with probabilistic transitions between the states. It is modeled as a weighted directed graph in which the transition probabilities between two adjacent states are the edge weights and the state value is the joint Gaussian distribution of two variables: amplitude and duration. A BGMM has eight states and their transitions. The eight states are: (1) P-wave features; (2) Q-wave features; (3) R-wave features; (4) S-wave features; (5) T-wave features; (6) PQ iso-segment; (7) ST iso-segment; and (8) TP iso-segment.

Bivariate Gaussian Transition Graph (BGTG). A bivariate Gaussian transition graph (BGTG) is a weighted directed graph that shows the probability of transition between the adjacent states of a finite-state automaton such as a BGMM. However, a BGTG is built from a small sample of the same patient's heartbeats, whereas a BGMM carries the large sample size of multiple patients sharing a common physician-annotated abnormality. Matching a BGTG with the BGMM graphs provides the subclassification of a patient's ECG.

2.5 ECG Signal Preprocessing

Denoising. Raw ECG signals from the MIT-BIH database [18] contain at least three types of noise: electromyography noise from muscle movement, radio-frequency noise, and power-line noise [6]. Discrete Wavelet Transform (DWT), a multi-resolution decomposition scheme, is used to eliminate these noises [6]. The source signal is decomposed into low- and high-frequency sub-bands, and low-pass and high-pass filters are used to remove the low-frequency and high-frequency noise sub-bands, respectively.

Feature Extraction. Amplitude and duration are extracted for the waveforms (P, Q, R, S and T) and for the baselines (TP or ISO1, PQ or ISO2, and ST or ISO3). The Daubechies 6 (D6) wavelet transform is used to detect the amplitude and duration of the waveforms in each beat [6]. The wavelet transforms are scaled up to eight levels to obtain the corresponding approximation coefficients. Four separate algorithms [6] are used to detect the R-wave, the Q- and S-waves, the PQ and ST segments, and the P-waves. The durations of the waveforms and baselines are derived from the zero crossings of the waveforms.

SIMT and Parallel Computations. The SIMT (Single Instruction Multiple Threads) paradigm is based upon executing the same sequence of instructions concurrently by spawning multiple lightweight threads.

2.6 GPU and CUDA Architecture

A CUDA-based GPU has multiprocessor cores and acts as a coprocessor to the main CPU. CUDA (Compute Unified Device Architecture) supports data parallelism using the SIMT paradigm by spawning a high number of concurrent threads on different sets of data elements in compute-intensive applications [19]. Streaming multiprocessors (SM) are assigned to multiple groups of threads, called blocks, using a grid architecture [19] as shown in Fig. 3.
Each SM has multiple CUDA cores that are comprised of ALUs, FPUs (Floating Processing Unit), load/store units and registers. These cores are assigned automatically to balance the load by the SM scheduler. The GPU supports high latency global memory to share information between CPU and GPU, short latency constant memory that cannot be altered during a thread’s execution, limited on-chip shared memory and local memory. Global memory is also used to share information across SMs. Constant 574 P. R. Gawde et al. memory is a cache memory written into before spawning the corresponding thread. It does not allow rewriting during the thread execution. A block is a group of threads that can be executed concurrently. These threads communicate to each-other using low latency shared memory. The threads are auto-matically allocated CUDA cores to exploit concurrency and balance the load. NVIDIA GPU Based Architecture. NVIDIA GPU exploits data parallelism by concurrent spawning of multiple threads. These threads are automatically allocated CUDA cores, over which a programmer has no control. Distribution of data on SMs for exploiting concurrency is also automated, and this cannot be speci?ed by the pro-grammer, either. The spawning of multiple blocks enhances the chance of concurrent utilization of multiple SMs by mapping different blocks on different SMs. 3 BGMM Based Classi?cation of Arrhythmia Each state of the BGMM is associated with joint distribution of two variables: amplitude and duration. Transitions between the states represent transition probabilities between the states. Values of zero vary in meaning for amplitude and duration: The duration of zero for any of the baseline segments: ISO1 (TP-segment), ISO2 (PQ-segment) and ISO3 (ST segment) imply that the corresponding state in the BGMM is bypassed (i.e. the event never occurred). Conversely, an amplitude-value of zero is anticipated, and does not imply the absence of transitions between the ISO-states and the corresponding P-Q-R- S-T states because ISO-states have no peak (i.e. zero amplitude) in regular heart-beats. P-waves embedded in the QRS-complex are considered missing. The overall approach for real time irregular beat subclassi?cation is divided into two phases (as shown in Fig. 4): (1) a training phase that uses the standard MIT-BIH database [18], and (2) a dynamic diagnosis phase based upon real-time collection and analysis of a sequence of multiple beats-windows. Training Phase: A BGMM is constructed for each subclass using the annotated MIT- BIH database [18]. The training phase has four stages: (1) denoising the beats; (2) feature extraction (amplitude and duration of each waveform in a beat); (3) area subtraction to identify embedded waveforms; and (4) construction of Markov model. Dynamic Detection Phase: This phase has six stages: (1) de-noising of acquired beats (2) heartbeat collection for 30 s window; (3) morphological and temporal feature’s … … Grid 1 Block (1,1) Kernel CPU GPU Block(0,0) Block (1,0) Block(0,1) Block (0,0) Thread 1 Thread n Fig. 3. A CUDA architecture. Integrating Markov Model, Bivariate Gaussian Distribution 575 extraction; (4) embedded P-wave and R-wave detection, (5) BGTG construction and 6) BGTG classi?cation. The second stage is executed once for ?rst window of signal; subsequent windows do not require this stage because they incrementally build the statistical information by adding next beat information and removing the least recent beat information. 
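To make the per-state model concrete, the following minimal Python sketch evaluates the bivariate Gaussian density of Eqs. (1)–(4) for one BGMM state, treating amplitude and duration as the two variables. The numeric parameters are placeholders chosen for illustration (loosely inspired by the R-wave averages in Table 1), not values learned from the MIT-BIH records.

```python
import math

# Minimal sketch of the per-state bivariate Gaussian likelihood of
# Sect. 2.3: the marginal f_A(a) times the conditional h(b|a) gives the
# joint density f(a, b) of (amplitude, duration) for one BGMM state.

def gaussian_pdf(x, mu, sigma):
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def bivariate_state_density(a, b, mu_a, sigma_a, mu_b, sigma_b, rho):
    mu_b_given_a = mu_b + rho * (sigma_b / sigma_a) * (a - mu_a)   # Eq. (1)
    sigma_b_given_a = sigma_b * math.sqrt(1.0 - rho ** 2)          # Eq. (2)
    return gaussian_pdf(a, mu_a, sigma_a) * \
           gaussian_pdf(b, mu_b_given_a, sigma_b_given_a)          # Eqs. (3)-(4)

# e.g. an R-wave state with mean amplitude 1.8 mV and mean duration 0.6 s
print(bivariate_state_density(a=1.7, b=0.58, mu_a=1.8, sigma_a=0.2,
                              mu_b=0.6, sigma_b=0.05, rho=0.3))
```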
A win-dow of 30 s is chosen for beat analysis to balance the quick response time needed in emergency conditions and to maintain accuracy. Each GPU analyzes around 20 beats based on optimal error analysis [17] using a con?dence interval of 95%. Statistical analysis showed that error increases by 2% for 10 beats, and decreases only by 0.2% for 40 beats. However, performance degrades for 40 beats window. 3.1 Embedded Waveforms Detection Embedded waveform analysis is required to derive one waveform embedded in another. This can occur in the same beat or a preceding beat. An embedded waveform can often be mistakenly considered missing [3] leading to misclassi?cation of sub-classes [9, 10]. In our previous work [9, 10], we identi?ed P-waves embedded in QRS-complex for the accurate diagnosis of EAT, and R-wave embedded in T-wave of the previous beat in VTach. The embedded waveforms are detected by area-subtraction technique [10, 20]. Area subtraction is based upon ?nding the mean area of each type of waveforms and sub-tracting the observed waveform area in the current beat from the corresponding mean. The calculation uses a threshold for identifying embedded waveforms [3, 10] with a con?dence interval [17] of 95%. After area subtraction of the initial waveform, the embedded P-wave or R-wave is allocated the mean amplitude and duration. 3.2 Bivariate Gaussian Transition Graph (BGTG) Construction A BGTG is constructed by extracting the amplitude and duration of each of the eight states and transitions between them. Zero durations in waveforms or ISO-states reflect missing corresponding states. Embedded wave analysis is utilized to identify the absent edges in the Markov model. Frequency analysis is used to derive transition probability. Denoising Feature extraction Embedded waveform BGMM construction Denoising Embedded waveform detection First window analysis BGTG Training Phase Dynamic Phase Graph Matching Feature extraction Fig. 4. Bivariate Gaussian Markov model approach. 576 P. R. Gawde et al. Figure 5 shows an example of a BGTG constructed for annotated beats of the EAT arrhythmia in MIT-BIH [18] dataset. Table 1 shows average amplitude and durations obtained for the same window. Transition from ISO3 ! T is only 0.02 meaning T-waves are absent during EAT because the next depolarization (i.e. P-wave) begins before the repolarization [3]. In addition, ectopic foci lead to negative amplitude of P-wave. 3.3 Graph Matching After constructing the BGTG, the diagnosis reduces to matching the BGTG with the BBGMMs for appropriate classi?cations [9, 10]. The algorithm has three steps: Step 1: For the constructed BGTG, most probable path (MPP) is identi?ed. An MPP is the path from ISO1 to ISO1 with the highest transition probability. For the BGTG given in Fig. 2, MPP is given by: ISO1!P!ISO2!Q!R!S!ISO3!ISO1. Step2: Transition probabilities below 0.05 are removed from BGTG to eliminate noise. The derivation of the threshold is based upon statistical analysis [17] of noise present in dataset [18]. A subset of the BGMMs is selected that includes all the transitions present in the BGTG. This step gives the list of prospective matching of BGMMs. 
Step 3: For all the BGMMs obtained from Step 2, graph matching is performed by multiplying two values: (1) the probability that the observed bivariate distribution of a state in the BGTG is produced by the corresponding state in the BGMM, computed using maximum likelihood estimation (MLE) [16]; and (2) the probability that the states in the observed beats are generated by the given BGMM based on its transition probabilities, computed using a standard forward-backward algorithm [15]. The BGTG is classified to the BGMM with the maximum likelihood.

Fig. 5. A sample BGTG for a 20-beat window.

Table 1. Average amplitude and duration

Waveform  Amplitude  Duration
P-wave    -0.20 mV   0.08 s
Q-wave    -0.14 mV   0.2 s
R-wave    1.8 mV     0.6 s
S-wave    -0.2 mV    0.1 s
T-wave    0.17 mV    0.10 s
ISO1      0          0.11 s
ISO2      0          0.09 s
ISO3      0          0.07 s

4 Concurrent Model

4.1 Dependency Analysis

Figure 6 shows the various modules and their share of the execution time, and Table 2 shows the average processing time required for the four major modules. The ECG preprocessing module has two submodules: a denoising module and a feature extraction module. The high-level modules cannot be executed concurrently due to the inherent dependency between them: preprocessing → embedded wave detection → BGTG construction → graph matching. However, the denoising, feature extraction, embedded wave analysis and BGTG construction modules require beat-level analysis and shared memory to merge the data from the individual beat analyses. Graph matching matches one BGTG with multiple BGMMs. While the first three modules can exploit data parallelism at the beat level within the same SM (streaming multiprocessor), graph matching requires data parallelism for concurrently matching multiple BGTG-BGMM pairs.

Fig. 6. Timing analysis of the bivariate Markov model approach.

Two major issues in exploiting GPU based parallelism are: (1) the mismatch of the latencies of the different memories; and (2) the mismatch between the CPU-GPU data transfer rate and the data transfer rate between SMs within the GPU. Thus, we have to optimize the task distribution so that the faster memory accesses in the GPU are exploited without excessive data transfer through the slower global memory. In addition, we have to maintain the accuracy of the diagnosis while distributing the beats across the SMs in the GPU based on statistical analysis. In our case, the CPU performs real-time ECG collection and the spawning of the data analysis, while the data-parallel work is done in the GPU.

Feature extraction has two functionalities: (1) identification of the waveforms; and (2) extraction of the amplitude and duration of each waveform and ISO line. The first task begins without prior knowledge about the waveforms. It has eight subtasks: (1) R-wave extraction; (2) Q-wave extraction; (3) S-wave extraction; (4) zero-crossing detection to get the ISO2 baseline; (5) zero-crossing detection to get the ISO3 baseline; (6) P-wave extraction; (7) T-wave extraction; and (8) ISO1 extraction using knowledge of the P- and T-waves. There is a task dependency in analyzing the beats: the R-wave is identified first, followed by two task chains, (Q-wave detection → zero crossing to get ISO2 → P-wave detection) and (S-wave detection → zero crossing to get ISO3 → T-wave detection). After the detection of the P-wave and T-wave, ISO1 is identified.
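The dependency chain just described can be expressed with an ordinary thread pool. The Python sketch below is a CPU-side analogue for illustration only; the detect_* helpers are hypothetical stand-ins for the D6-wavelet feature extractors, and this is not the CUDA kernel layout of Sect. 5.

```python
from concurrent.futures import ThreadPoolExecutor

# Illustrative analogue of the Sect. 4.1 dependency: R-wave first,
# then the (Q -> ISO2 -> P) and (S -> ISO3 -> T) chains in parallel,
# and ISO1 last. Returned values are placeholder (amplitude, duration) pairs.

def detect_r(beat):        return {"R": (1.8, 0.6)}
def left_chain(beat, r):   return {"Q": (-0.14, 0.2), "ISO2": (0.0, 0.09), "P": (-0.20, 0.08)}
def right_chain(beat, r):  return {"S": (-0.2, 0.1), "ISO3": (0.0, 0.07), "T": (0.17, 0.10)}
def detect_iso1(features): return {"ISO1": (0.0, 0.11)}   # needs both P- and T-wave locations

def extract_features(beat):
    r = detect_r(beat)                                 # step 1: R-wave anchors the beat
    with ThreadPoolExecutor(max_workers=2) as pool:    # step 2: two independent chains
        left = pool.submit(left_chain, beat, r)
        right = pool.submit(right_chain, beat, r)
        features = {**r, **left.result(), **right.result()}
    features.update(detect_iso1(features))             # step 3: ISO1 last
    return features

window_features = [extract_features(b) for b in range(20)]   # e.g. one 20-beat window
```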
4.2 Exploiting Concurrency on GPU The overall approach to exploit concurrency consists of three steps: (1) block level parallelism for noise detection and waveform extraction by dividing the data into equal time-slots; (2) exploiting data parallelism at the beat level for the amplitude and duration analysis, embedded wave detection and BGTG construction; and (3) concur-rent matching of BGTG-BGMM pairs by spawning multiple threads within a block, one for each BGTG-BGMM pair. Before starting concurrent processing of time-windows, the initial window for the ?rst 30-s period is analyzed sequentially in the CPU to estimate the statistical infor-mation regarding the waveform features. The analyzed features are: (1) number of beats and individual waveforms in 30 s window; (2) mean, median, minimum and maximum of the amplitude and duration of each type of waveform and ISO-lines. This infor-mation is needed to spawn and terminate multiple threads during concurrent analysis of Table 2. Average processing time for modules Module Processing time (ms) Preprocessing 950 Embedded wave 200 Transition graph 2800 Graph matching 3200 Integrating Markov Model, Bivariate Gaussian Distribution 579 future windows. This information is stored in the global memory and the constant memory for use by SMs for subsequent concurrency exploiting modules. Concurrent Denoising and Feature Extraction. The noise removal submodule processes 30-s window (around 120 beats) of raw ECG signal, and has no knowledge of the waveforms. It performs convolution, low pass and high-pass ?ltering. Hence 30- s windows are divided equally in multiple blocks (>6 per window in our case). After the noise-removal, the signal is input to the feature-extraction module. Since the data is already present in GPU, there is no data transfer overhead. Beat detection and feature vector analysis are performed in one block to exploit shared memory (low latency). Based upon the estimate of the R-waveform counts derived from the initial window analysis, the same number of threads are spawned to concurrently detect individual R-waveforms using barrier-based synchronization (synchronization in Nvidia GPU terminology). After detecting R-waveforms, two sets of concurrent threads are spawned to detect other waveforms and features (Q-wave, ISO2, P-wave) and (S-wave, ISO3, T-wave) respectively. Again, the number of threads spawned in each set is equal to number of detected R-waves. After detecting the waveforms, one thread is spawned to sequentially identify all ISO1 lines in the sample. After feature extraction, feature data is transferred to global memory for BGTG con-struction. Since the data-size is quite small after feature extraction, the overhead of data transfer is also quite small. For each window, there are multiple BGTGs (around six for a 30-s window). Multiple blocks are spawned, one for each BGTG construction, to exploit a maximum number of available SMs in the GPU. For every BGTG, there are three tasks for every state: (1) computing averages of the durations and amplitudes for each type of waveform; (2) computing the joint probability of amplitude and duration; and (3) computing the transition probability. Eight concurrent threads are spawned: one for each state. This exploits data-parallelism. This gives six BGTGs for a 30-s window. Graph matching phase exploits data-parallelism by spawning multiple blocks, one for each BGTG. Each block spawns multiple threads, one for each BGTG-BGMM pair. 
Each thread utilizes the CUDA cores by automatic allocation at the OS level. 5 Algorithms This section describes algorithms for the major concurrent tasks: (1) concurrent denoising and feature-extraction; (2) concurrent embedded-wave detection; (3) con-current BGTG construction; (4) concurrent MPP (most probable path) detection; and (5) concurrent Matching. For describing the concurrent thread spawning, we use the constructs cobegin-coend for modeling concurrent thread-groups that terminate together, barrier to model waiting for a group of threads to terminate together, and forall to spawn multiple threads concurrently. A block of activity in a single thread is enclosed within curly brackets {…}. Blocks are used for processing multiple concurrent activities such as thread-groups working on a ?nite number of beats to exploit maximum utilization of automated thread-groups to SMs mapping in the GPU. 580 P. R. Gawde et al. 5.1 Concurrent Preprocessing and Embedded Wave Detection Algorithm for Concurrent Denoising and Feature Extraction. To execute this kernel function, 30 s of data is divided into number of blocks corresponding to a set of beats based on the average beat area calculated in the initial window analysis. On each block, data is divided between multiple threads. Noise removal and the R-wave detection with amplitude and duration is performed by the threads concurrently. A barrier is used to ?nish the execution of all R-wave detection threads. Next, data is divided into two chunks: left of R-wave (R-wave – D) and right of R-wave (R-wave + D), which are spawned on multiple threads concurrently. Each of the left-side threads detect and extract features of one corresponding Q-wave, ISO2 and P-wave. Similarly, each of the right-side threads detect and extract features of one corre-sponding S-wave, ISO3 and T-wave. Threads are terminated after they cross their respective boundaries. After the termination, their output is used to detect ISO1 and store its duration in the global memory. Algorithm for Concurrent Embedded Wave Detection. A kernel function with a grid of six blocks is launched, where one block is executed on one SM. To execute it, each block gets information for average area calculation from the initial window analysis and features calculated for each beat in the previous module. Each thread in a block works on one beat. For the missing P-waves, the corresponding threshold area is checked to assign average features for the missing waveform. Otherwise, unchanged features are passed back to global memory. Detailed algorithms are given in Fig. 7. 5.2 Concurrent BGTG Construction To exploit a maximum number of available resources and SMs, 120 beats were divided into 20 beat blocks. A BGTG is constructed by each block by using feature-values of 20 beats and estimated values derived by initial window analysis. For each state of BGTG, two calculations are performed by each thread: (1) bivariate probability; and (2) transition probability to other states. Thread calculations are synchronized using barrier to ensure fully constructed BGTG before transferring data to global memory. Detailed algorithm is given in Fig. 8. 5.3 Concurrent Graph Matching The concurrent graph matching algorithm has three kernel functions: (1) Computing the most probable path in each BGTG; (2) pruning BGMMs that do not have an edge present in the BGTG; (3) classifying BGTG using MLE and the forward-backward algorithm. Concurrent Most Probable Path. On the GPU, one grid with six blocks is deployed. 
In each block, one state of BGTG is analyzed by each thread to calculate highest probability for that state. Information of highest probability is stored in form of pair (statei, statej) representing maximum probability from stateito statej. Final thread waits for barrier and creates MPP by joining all state-pairs for one BGTG. Integrating Markov Model, Bivariate Gaussian Distribution 581 Algorithm Concurrent denoising and feature extraction Input: ECG signal, D6 wavelet Output: denoised beats with features extracted { //Execute grid of blocks on GPU for window of 30 sec. forall block1: blockn //dispatch 5 second window to block { forall threads T1:Tm { //denoising and R-wave detection spawn Ti for denoising and R-wave detection; store derived information in memory, and wait; end barrier;} count number of R-waves from memory. Let it be k;} Co-begin forall threads T1 : TK{ spawn Ti to detect and store Q-wave ISO2 P-wave store derived information in memory; terminate if distance > R-wave-location end barrier; } forall threads TK+1 : T2*K{ spawn Ti to detect and store S-wave ISO3 T-wave store derived information in memory; terminate if distance > R-wave-location end barrier; } Co-end calculate ISO1 based on P-wave and Q-wave; store ISO1 information}} Algorithm Concurrent embedded waveform detection Input: Beats-and-features Output: updated-beats-and-features {//Execute grid of blocks on GPU forall block1 : blockm //execute m concurrent blocks with 20 beats/block for multiple SMs forall T1 : Tm // each thread works on one beat if (missing (P-wave)) { compute QRS area if (QRS area > threshold) { mark P-wave present with average amplitude, duration update beats-and-features; }} } Fig. 7. Algorithm for concurrent waveform detection Fig. 8. Algorithm for concurrent BGTG. 582 P. R. Gawde et al. Concurrent BGMM Pruning. To ?nd the subset of potential BGMMs for each BGTG, one grid of six blocks are launched and each block is executed on different SM. Each block takes input of one BGTGs and all BGMMs. Comparison of one BGTG with one BGMM is performed by each thread on one block. BGTGs with probabilities less than the threshold are pruned by the ?rst thread in the block. Next, concurrent threads are launched for each BGTG-BGMM pair. If states in BGTG and BGMM match, BGMM is considered as a potential match for the BGTG and is stored in the common vector SUB accessible to all the threads in the block. Concurrent Maximum Probability. To calculate the probabilities of matching each BGTG with the ?ltered BGMMs, a kernel function with a grid of six blocks is laun-ched. One BGTG is matched with the subset of ?ltered BGMMs in one block. The probability of matching one BGTG-BGMM pair is calculated by multiplying two values: (1) probability of state-value (bivariate Gaussian distribution) of BGTG pro-duced by BGMM using MLE [16], and (2) probability of transition probabilities in BGTG produced by BGMM using a forward-backward algorithm [15]. This probability is stored in a vector accessible to all the threads in the block. The outputs for each block are transferred back to the global memory. Detailed algorithm is given in Fig. 9. 6 Implementation The software was executed on a Dell machine having Intel(R) Xeon(R) dual core CPU E5-2680 @2.70 GHz 64-bit system with 128 GB RAM and CUDA enabled GeForce GTX 1050 ti GPU card. In GTX 1050 ti, there are six SMs. Each SM has four blocks with 32 cores per block, and 48 KB shared memory. There are 24 blocks, each having 1024 threads. 
In total, the GPU provides 768 cores for SIMT processing. We analyzed the MIT-BIH arrhythmia dataset [18] and the Creighton University Ventricular Tachyarrhythmia Database available at PhysioNet [21]. The data were split 60% for training and 40% for testing. The threshold used for area subtraction in the embedded-waveform detection algorithm was chosen experimentally after analyzing 3093 beats in MIT-BIH [18]. To derive execution efficiency, we compared the CPU-only implementation against the CPU + GPU combination with full CUDA resources. For the acquisition of real-time ECG data, signal filtering and processing, and feature extraction and analysis, we used MATLAB together with the WFDB software package provided by PhysioNet and written in C++ [21]. MATLAB was also used for statistical analysis. The GPU algorithms were written in C with the CUDA framework.

6.1 Performance Analysis and Discussion

We measured overall execution efficiency and improvement at the module level using a single CPU core versus 768 CUDA cores, as summarized in Table 3. We also measured the effect of utilizing different types of GPU memory on the overall improvement, as shown in Fig. 10. Based on the limitations and advantages of each memory type, we analyzed two approaches to exploit data parallelism: (1) a combination of constant memory and global memory; and (2) a combination of shared memory and constant memory.

Fig. 9. Algorithm for concurrent graph matching.

Table 3. Concurrent execution speedup of modules
Module             Single CPU   Concurrent (GPU)   Speedup
Preprocessing      950 ms       503 ms             1.8
Embedded wave      200 ms       102 ms             1.9
Transition graph   2800 ms      489 ms             5.7
Graph matching     3200 ms      492 ms             6.5
Total time         7150 ms      1586 ms            4.5

The execution times of the different modules are based on the analysis of 120 beats per execution over 500 iterations. The average time to execute the sequential BGMM approach on the CPU is around 7 s; the average time to execute the modules concurrently on the GPU is around 1.6 s. The overall improvement is 4.5x (77.8%) for arrhythmia subclassification. After the GPU implementation finishes in 1.6 s, the GPU remains idle for the next 28.4 s while the CPU collects the next 30-s window in real time. This idle time can be used to analyze other abnormalities in the same ECG data [19].

In the first memory-utilization approach, constant memory was used for repeatedly accessed read-only data; because constant memory is read-only, global memory was used for information exchange and for storing dynamic data during concurrent preprocessing. In the second approach, the faster shared memory within a single block was used as read/write memory during dynamic execution, and, because of its limited size, repeatedly accessed read-only data was kept in constant memory [19].

This experiment was run with 20 beats per BGTG. Sequential execution time increased linearly as the number of BGTGs increased. The time saved with the combination of shared memory and constant memory was greater than the time saved with the combination of constant memory and global memory, as expected, because shared memory is an on-chip cache with low latency. One further result was observed: the concurrent execution time also increased linearly up to six BGTGs, but beyond six BGTGs it became constant, possibly due to additional automatic allocation of CUDA resources or SMs.
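As an illustration of the second memory strategy (shared memory for block-local read/write data, with repeatedly accessed read-only data kept apart from global read/write traffic), the sketch below has each block copy its 20 beat-feature rows into on-chip shared memory, synchronize, and then operate on the low-latency copy. It is a Numba-based sketch under assumed array sizes, not the authors' CUDA C code, in which the read-only window estimates would typically reside in __constant__ memory.

# Illustrative sketch of memory approach 2: shared memory for the block-local tile,
# a separate read-only array of per-window estimates reused by every thread.
import numpy as np
from numba import cuda, float32

BEATS_PER_BLOCK = 20
N_FEATURES = 8                     # illustrative feature count per beat

@cuda.jit
def bgtg_block_kernel(beat_features, window_estimates, partial):
    tile = cuda.shared.array(shape=(BEATS_PER_BLOCK, N_FEATURES), dtype=float32)
    t = cuda.threadIdx.x
    row = cuda.blockIdx.x * BEATS_PER_BLOCK + t
    if row < beat_features.shape[0]:
        for f in range(N_FEATURES):
            tile[t, f] = beat_features[row, f]     # global -> shared copy
    cuda.syncthreads()                             # barrier before using the tile
    if row < beat_features.shape[0]:
        s = 0.0
        for f in range(N_FEATURES):
            # read-only per-window estimates are reused by every thread in the block
            s += tile[t, f] - window_estimates[f]
        partial[row] = s                           # per-beat deviation written back

feats = cuda.to_device(np.random.randn(120, N_FEATURES).astype(np.float32))
est = cuda.to_device(np.zeros(N_FEATURES, dtype=np.float32))
out = cuda.device_array(120, dtype=np.float32)
bgtg_block_kernel[6, BEATS_PER_BLOCK](feats, est, out)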
The plateau beyond six BGTGs indicates that additional load on the GPU is automatically compensated by the additional allocation of CUDA resources. This could prove useful for analyzing other aspects of ECG abnormalities without increasing execution time.

6.2 Classification Accuracy

We counted true positives, false positives, true negatives and false negatives to compute sensitivity as TP/(TP + FN) * 100 and specificity as TN/(TN + FP) * 100. True positives (TP) are positive detections that correspond to the annotations of a specialist. False positives (FP) are detections that do not correspond to the annotations of a specialist. True negatives (TN) are beats that were not annotated as ventricular arrhythmia by a physician and were not identified by the algorithm. False negatives (FN) are heartbeats that were annotated as arrhythmia by a specialist but were not detected by the algorithm. Table 4 shows the accuracy of our technique when using the combination of shared memory and constant memory. Both sensitivity and specificity are high for all subclasses.

Fig. 10. Effect of memory utilization on speedup.

Table 4. Accuracy of arrhythmia subclassification
                                  Sequential approach              GPU-based concurrent approach
Class              Subclass       Sensitivity (%)  Specificity (%)  Sensitivity (%)  Specificity (%)
Ventricular        AFib           97.3             93.6             97.2             96.4
Ventricular        AFlu           92.3             94.3             92.1             94.3
Ventricular        AVNRT          95.2             96.9             95.6             97.0
Ventricular        EAT            98.6             94.0             98.6             94.0
Supraventricular   VTach          94.0             96.3             94.0             96.4
Supraventricular   VFlu           91.3             98.2             91.6             98.3
Supraventricular   VFib           98.6             99.6             98.5             99.1

7 Related Works

Several researchers have exploited parallel techniques for various subtasks such as signal processing with filters [12], wavelet transforms [13], and classification of beats into supraventricular and ventricular arrhythmia [14]. Lopes et al. [22] proposed ventricular arrhythmia diagnosis using a parallel implementation of neural networks, focusing on parallelizing back-propagation. Their technique is limited to PVC beat detection, does not address real-time classification, and does not detect embedded P-waves, which reduces accuracy; the sensitivity obtained with their approach is 94.5% [17], compared with 98.8% for ours. Another neural-network-based classification approach has been proposed by Li et al. [11]; it is limited to separating supraventricular from ventricular beats using GPUs. Phaudphut et al. obtained a sensitivity of 88.0% [13] in PVC beat detection, compared with 99.3% for our approach; in addition, we diagnose all seven major subclasses in real time. Some researchers have used the GPU for denoising and feature extraction [8, 12]. Domazet et al. [12] proposed an optimization with shared and constant memory for a DSP filter for ECG denoising. Although our goal is much broader, we tested our approach with two memory-optimization techniques; as expected, the combination of shared memory and constant memory showed improvements, due to its lower latency, compared with the combination of global and constant memory.

8 Limitations and Future Directions

The current system uses only lead II for arrhythmia analysis. The model could be extended to analyze three-lead signals on embedded GPUs, such as NVIDIA Jetson [5] based wearable devices, to handle ischemia, heart abnormalities due to electrolyte imbalances, and myocardial infarction in real time.
We are currently extending our GPU based BGMM to a GPU based multivariate model to diagnose ischemia, hyperkalemia and myocardial infarction using three leads in real-time. References 1. Lerma, C., Glass, L.: Predicting the risk of sudden cardiac death. J. Physiol. 594(9), 2445– 2458 (2016) 2. Rautaharju, P.M., Surawicz, B., Gettes, L.S.: AHA/ACCF/HRS recommendations for the standardization and interpretation of the electrocardiogram: part IV. J. Am. Coll. Cardiol. 53 (11), 982–991 (2009) 3. Garcia, T.B., Miller, G.T.: Arrhythmia Recognition: The Art of Interpretation. Jones and Bartlett, Burlington (2004) 4. Abtahi, F., Snäll, J., Aslamy, B., Abtahi, S., Seoane, F., Lindecrantz, K.: Biosignal pi, an affordable open-source ECG and respiration measurement system. Sensors 15(1), 93–109 (2014) 5. Page, A., Attaran, N., Shea, C., Homayoun, H., Mohsenin, T.: Low-power manycore accelerator for personalized biomedical applications. In: ACM Proceedings of the 26th Edition on Great Lakes Symposium on VLSI, Boston, pp. 63–68 (2016) 6. Mahmoodabadi, S.Z., Ahmadian, A., Abolhasani, M.D.: ECG feature extraction using Daubechies wavelets. In: Proceedings of the Fifth IASTED International Conference on Visualization, Imaging and Image Processing, Benidorm, pp. 343–348(2005) 7. Sayadi, O., Mohammad, B., Shamsollahi, M.B., Clifford, G.D.: Robust detection of premature ventricular contractions using a wave-based Bayesian framework. IEEE Trans. Biomed. Eng. 57(2), 353–362 (2010) 8. Jun, T.J., Park, H.J., Yoo, H., Kim, Y.H., Kim, D.: GPU based cloud system for high-performance arrhythmia detection with parallel k-NN algorithm. In: Proceedings of the 38th Annual International Conference of the. IEEE Engineering in Medicine and Biology Society (EMBC), Orlando, pp. 5327–5330 (2016) 9. Gawde, P.R., Bansal, A. K., Nielson, J.A.: ECG analysis for automated diagnosis of subclasses of supraventricular arrhythmia. In: Proceedings of International Conference on Health Informatics and Medical Systems, Las Vegas, pp. 10–16 (2015) 10. Gawde, P.R., Bansal, A.K., Nielson, J.A.: Integrating Markov model and morphology analysis for ?ner classi?cation of ventricular arrhythmia in real-time. In: IEEE International Conference on Biomedical & Health Informatics, Orlando, pp. 409–412 (2017) 11. Li, P., Wang, Y., He, J., Wang, L., Tian, Y., Zhou, T.: High-performance personalized heartbeat classi?cation model for long-term ECG signal. IEEE Trans. Biomed. Eng. 64(1), 78–86 (2017) 12. Domazet, E., Gusev, M., Ristov, S.: Optimizing high performance CUDA DSP ?lter for ECG signals. In: Proceedings of the 27th DAAAM International Symposium in Intelligent Manufacturing and Automation, Vienna, pp. 0623–0632 (2016) Integrating Markov Model, Bivariate Gaussian Distribution 587 13. Phaudphut, C, So-In, C., Phusomsai. W.: A parallel probabilistic neural network ECG recognition architecture over GPU platforms. In: Proceedings of the 13th International Joint Conference on. Computer Science and Software Engineering (JCSSE), Khon Kaen, pp. 1–7 (2016) 14. Fan, X., He, C., Chen, R., Li, Y.: Toward automated analysis of electrocardiogram big data by graphics processing unit for mobile health application. IEEE Access 5, 17136–17148 (2017) 15. Russell, S., Norwig, P.: Arti?cial Intelligence—A Modern Approach, 3rd edn. Prentice Hall, Upper Saddle River (2010) 16. Psutka, J.V., Psutka J.: Sample size for maximum likelihood estimates of Gaussian model. In: International Conference on Computer Analysis of Images and Patterns, pp. 462–469. 
Springer, Cham (2015) 17. Everitt, B., Skrondal, A.: The Cambridge Dictionary of Statistics, vol. 106. Cambridge University Press, Cambridge (2002) 18. MIT-BIH Arrhythmia dataset. https://www.physionet.org/physiobank/database/MIT-BIH/ 19. Nvidia, C.: C Programming Guide PG-02829–001_v9.1, March 2018. http://docs.nvidia. com/cuda/pdf/CUDA_C_PrograBGMMing_Guide.pdf 20. Tallarida, R.J., Murray, R.B.: Area Under a Curve: Trapezoidal and Simpson’s Rules Manual of Pharmacologic Calculations, pp. 77–81. Springer, New York (1987) 21. Creighton University Ventricular Tachyarrhythmia Database. https://physionet.org/ physiobank/database/cudb/ 22. Lopes, N., Ribeiro, B.: Fast pattern classi?cation of ventricular arrhythmias using graphics processing units. In: Iberoamerican Congress on Pattern Recognition. LNCS, vol. 5856, pp. 603–610. Springer, Heidelberg (2009) 588 P. R. Gawde et al. Identification of Glioma from MR Images Using Convolutional Neural Network Nidhi Saxena(B) , Rochan Sharma, Karishma Joshi, and Hukum Singh Rana University of Petroleum and Energy Studies, Dehradun, India nsaxena117@gmail.com Abstract. This paper presents a novel approach of classifying the type of glioma using convolutional neural network (CNN) on 2D MR images. Glioma, most common type of malignant brain tumor, and can be clas-si?ed according to the type of glial cells a?ected. The types of gliomas are, namely, actrocytoma, oligodendroglioma and glioblastoma multi-forme (GBM). Various image processing and pattern recognition tech-niques may be used for cancer identi?cation and classi?cation. Though in recent years deep learning has been proved to be e?cient in computer aided diagnosis of diseases. Convolutional Neural Networks, a type of deep neural network which is generally used for classi?cation of images, contains multiple sets of conv-pool layers for feature extraction, followed by fully-connected (FC) layers that make use of extracted features for classi?cation. Keywords: Glioma · Astrocytoma · Oligodendroglioma Glioblastoma multiforme (GBM) MRI and convolutional neural network (CNN) 1 Introduction Glioma is a major type of brain tumor that can occur in all age groups though mostly seen in adults. It originates in glial cells of brain. Glial cells are of four types namely - astrocytes, oligodendrocytes, microglia and ependymal cells. Accordingly astrocytoma, oligodendroglioma and glioblastoma multiforme are the types of glioma cancers as shown in Fig. 1. These tumors can be cured if detected at early stage but some of the fast growing gliomas can be dangerous. The most common and aggressive type of brain tumor is glioblastoma multiforme or GBM, which is a malignant grade IV glioma. In early-stage glioblastoma, as per MRI ?ndings, are ill-de?ned small lesions with little or no mass e?ect, and having no or subtle contrast enhancement. Within several months, these lesions develop typical MRI ?ndings such as a heterogeneous enhanced bulky mass with central necrosis. The average period from the initial to ?nal scan in diagnosis of glioblastoma has been 4.5 months [1]. Magnetic Resonance Imaging (MRI) is one of the commonly used modalities used for diagnosing brain tumors. As com-pared to other diagnostic methods, like computed tomography scan, ultrasound, .o c Springer Nature Switzerland AG 2019 K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 589–597, 2019. https://doi.org/10.1007/978-3-030-02686-8_44 590 N. Saxena et al. 
etc., MRIs are safe, non-invasive and re?ects the true dimensions of organ/tissue, therefore in imaging of the brain, it is widely considerable [2]. Convolutional neural networks (CNN) [3] consists of conv-pool layers followed by fully-connected (FC) layers. One conv-pool layer consists of a convolutional layer and a pooling layer. The convolutional layer is used to detect hierarchical features from images, whereas pooling layer is used to forward the detected fea-tures further in the network [4]. In the proposed model, convolution operations are performed with same padding (size of feature space remains same) and pool-ing is performed with valid (or no) padding (size of feature space is reduced). Conv-pool layers detects useful features and forward them to FCs where classi?- cation is performed. Unlike neural networks, in CNN each layer grid is connected to only a limited number of layers. In CNN, the entire network can be put into the GPU memory and the hardware cores can be used to boost network speed using deep learning tools. They have a lot of applications in medical diagnosis involving image segmentation independent of morphology. Lesions are detected and classi?ed accordingly using CNNs and type and severity of diseases can be predicted. Fig. 1. MR scans of types of gliomas: a. Astrocytoma, b. Olidendroglioma and c. Glioblastoma multiform or GBM. 2 Literature Review 2.1 Segmentation In medical domain, segmentation is the technique for detection and separation of a part from medical image (can be a lesion or an organ) that can be used for further diagnosis. Segmentation proves to be very helpful for monitoring disease progression, plan treatment strategies and prediction of treatment outcomes. It can be done in many ways like by thresholding or by developing a heuristic algorithm as shown by Rajnikanth et al. [5]. Their work focuses on developing Hamiltonian Mechanics 591 a heuristic algorithm to segment the tumor region from 2D brain MRI images. Initially, preprocessing is done which enhances the tumor region in MR scans followed by multi-level thresholding to segment the lesion. Then accuracy is calculated on di?erent slices of MR images, which is above 95% for all types of MRI slices. Deep learning can also be applied for segmentation of lesion and detection of cancer from modalities like Computed Tomography (CT) scan, ultrasound and MRI. Farnaz et al. [6], trained a deep convolutional neural network (DCNN) for segmentation of lesions in brain from MR images. The proposed model was 6 layers deep (5 convolution layers and 1 FC layer) showed that the DICE similarity coe?cient matric was 0.90 for complete, 0.85 for core and 0.84 for enhancing regions on BRATS 2016 dataset. Segmentation, sometimes may need humans to provide some high level infor-mation needed to extract the segmented region from images. This type of segmentation is called interactive segmentation [7]. Wang et al. [8] performed interactive medical image segmentation by ?ne-tuning a pre-trained CNN for segmenting multiple organs from 2D fetal MR slices (here two types of organs were annotated for training) and also on 3D segmentation of brain tumor core and whole brain tumor (here the brain tumor core was annotated in one MR sequence). The image speci?c ?ne-tuning made CNN model adaptive to a speci?c test image which can be either unsupervised or supervised. Also, a weighted loss function considering network and interaction based uncertainty for ?ne-tuning was proposed. 
Experiments show that image speci?c ?ne tuning improves seg-mentation performance. 2.2 Classification In medical diagnosis, aim is to identify the presence of a disease in a person on the basis of scans of a particular organ along with analyzing patient’s medical history. To detect the disease by analyzing an image, pre-processing may prove to be bene?cial. Sadegi-Naini et al. [9] proposed a method for feature extraction (a pre-processing step) and data analysis to characterize breast lesion by using texture based features in ultrasound scans. Among 78 patients, 46 and 32 patients were con?rmed with benign and malignant lesions respectively based on radiology and pathology reports. Though MR is an e?cient modality, still to apply Computer Aided Diagnosis (CAD) sometimes pre-processing methods such as feature selection, extraction or representation is required. Mingxia et al. [10] proposed an anatomical landmark based feature representation which automatically extracts features in brain MR images for the purpose of disease diagnosis. Experimental results showed that the proposed method improves the performance of disease classi?cation. An approach to ?nd the severity of tumor is to ?rst segment tumor region from the scan then classify it as malignant or benign. Deckota et al. [11] proposed a system which identi?es the cancerous nodule from the lung CT scan images using watershed segmentation for detection and support vector machine (SVM) for classi?cation of nodule as malignant or benign. The proposed model includes 592 N. Saxena et al. 6 stages: image pre-processing, segmentation of the pre-processed image, feature extraction, feature reduction using PCA, classi?cation using SVM and evaluation of the classi?cation. The model detects cancer with 92% accuracy classi?er has accuracy of 86.6%. In a classi?cation problem of medical diagnosis, accuracy is generally mea-sured in terms of speci?city and sensitivity and both are directly proportional to the accuracy of the classi?er. Blumenthal et al. [12] proposed an automatic classi?cation for tumor and nontumor cells using support vector machine (SVM) classi?er which is trained on 4 components enhancing and nonenhancing, tumor and nontumor. Classi?cation results were evaluated using 2 fold cross validation analysis of the training set and MR spectroscopy. High sensitivity and speci?city (100%) were obtained within the enhancing and nonenhancing areas. Zakarakhi et al. [13] also proposed a scheme to classify brain tumor type and grade using MR images. The proposed scheme consists of several steps includ-ing ROI de?nition, feature extraction, feature selection and classi?cation. The extracted features include tumor shape and intensity characteristic as well as rotation invariant texture features. Feature subset selection is performed using SVM with recursive feature elimination. The binary SVM classi?cation accuracy, sensitivity and speci?city were respectively 85%, 87% and 79% for discrimina-tion of metastases from gliomas and 88%, 85% and 96% for discrimination of high-grade from low-grade neoplasms. Deep learning can be used e?ciently for identi?cation of di?erent types of substances in organ scans, as shown by Fang Liu et al. [14] as they designed a deep Magnetic Resonance Attenuation Correction (MRAC) for classi?cation of air, bone and soft tissue in CT scans of various organs. Their method provided an accurate pseudo CT scan with a mean Dice coe?cient of 0.971 ± 0.005 for air, 0.936 ± 0.011 for soft tissue and 0.803 ± 0.021 for bone. 
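Before turning to the proposed method, the conv-pool plus FC structure summarized in Sect. 1 (same-padding 3 x 3 convolutions, valid-padding 2 x 2 max-pooling, and fully-connected layers ending in a classifier) can be illustrated with a short tf.keras sketch. The layer widths mirror those reported for the proposed model in Sect. 3.2 (32 to 512 kernels and FC layers of 2048/512/64/3 with softmax), but this is an illustrative reconstruction rather than the authors' TensorFlow code.

# Minimal tf.keras sketch of the conv-pool + FC structure described in the paper:
# same-padding 3x3 convolutions, valid-padding 2x2 max-pooling, batch normalization,
# and a softmax output over the three glioma classes.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_glioma_cnn(input_shape=(128, 128, 1), num_classes=3):
    model = models.Sequential()
    model.add(layers.Input(shape=input_shape))
    for filters in (32, 64, 128, 256, 512):              # five conv-pool blocks
        model.add(layers.Conv2D(filters, 3, padding="same", activation="relu"))
        model.add(layers.BatchNormalization())
        model.add(layers.MaxPooling2D(pool_size=2, padding="valid"))
    model.add(layers.Flatten())                          # 4 x 4 x 512 = 8192 features, as reported
    for units in (2048, 512, 64):                        # FC layers with batch normalization
        model.add(layers.Dense(units, activation="relu"))
        model.add(layers.BatchNormalization())
    model.add(layers.Dense(num_classes, activation="softmax"))
    return model

model = build_glioma_cnn()
model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])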
The most common application of deep learning is detecting whether or not a person has a particular disease (most often cancer). Roffman et al. [15] proposed a skin cancer prediction model using an artificial neural network (ANN) whose training sensitivity was 88.5% and specificity 62.2% for the prediction of non-melanoma skin cancer (NMSC); on the validation set, sensitivity was 86.2% and specificity 62.7%. Makde et al. [16] used a deep neural network architecture for the detection of tumors in lung CT scans and brain MR images, classifying images as tumorous or non-tumorous. Classification accuracy was above 97% for both CT and MR images, and the AlexNet and ZFNet frameworks were compared for the same purpose.

3 Method

3.1 Implementation Details

The dataset used is REMBRANDT [17,18], which consists of MR scans of 130 patients suffering from gliomas of different types and at different stages. From this dataset, a total of 38,952 images of size 128 x 128 were used. Five-fold cross-validation is applied with a batch size of 512 images. Label 0 was astrocytoma, 1 was GBM, and 2 was oligodendroglioma. The test split was 4096 images out of the 38,952. For training, 45 epochs are used for each validation fold. The proposed CNN model is implemented using the TensorFlow framework on a system with 4 CPUs, 15 GB of memory, and 2 NVIDIA K80 GPUs running Ubuntu 16.05.

Fig. 2. Architecture of the proposed CNN.

3.2 Convolutional Neural Network

In classical neural networks, features had to be provided to the network for classification. CNNs are a special type of neural network in which the earlier layers extract features and the later layers perform classification using those features. In general, the initial layers of a CNN comprise multiple conv-pool layers followed by FCs. The last (output) layer is either a sigmoid layer (for binary classification) or a softmax layer (for multi-class classification). CNNs have proved very effective at extracting features from images and eliminate the need to provide hand-crafted features to the network. Training a CNN is computationally expensive, but using a GPU can speed up the process. The deeper the network, the greater its classification power, owing to the additional non-linearities and the better quality of local optima [19]. However, convolutions with 3D kernels are computationally expensive compared with 2D kernels, which hampers the addition of more layers. Deeper network variants that are implicitly regularized and more efficient can therefore be designed simply by replacing each layer of common architectures with more layers that use smaller kernels [20]. However, deeper networks are more difficult to train: it has been shown that the forward (neuron activations) and backward (gradients) propagated signals
Adam is generally regarded as being fairly robust to the choice of hyper parame-ters, though the learning rate sometimes needs to be changed from the suggested default. Batch Normalization: In deep CNNs, each layer gets di?erent inputs or acti-vations which may result in inputs belonging to di?erent distributions at di?erent layers. This problem of internal covariant shift [23] is solved by applying batch normalization. It means, inputs at each layer are normalized so that they are all on same scale and hence belong to same distribution. Thus, batch normalization increases the adaptiveness of later layers learning. The batch normalization is applied in all the layers in the proposed architecture. Architecture: The CNN model (as shown in Fig. 2) is developed for 128 × 128 grayscale 2D MR images, having 5 conv-pool layers and 4 fully-connected (FC) layers. All the conv-pool layers used 3 × 3 kernel size and stride as 1 and max-pool layers used 2 × 2 kernel size and stride as 2. Activation function used in all the layers is ReLu (Recti?ed Linear Unit). First layer used two dimensional 32 kernels followed by max-pool of stride and same padding. Second layer used 64 kernels, followed by 128 kernels in the third layer, 256 kernels in the forth convolutional layer and ?nally 512 kernels in last convolutional layer (as shown in Fig. 4). Batch normalization is performed in all conv-pool and FC layers. After ?ve layers, there are 8192 features which are then ?attened in FC6 and converged to 2048 features in FC7, followed by 512 in FC8, then 64 in FC9 and ?nally to 3, which is the total number of classes. In last FC layer softmax, which is a probabilistic activation, is applied as classi?cation is done for three classes, which are three di?erent types of gliomas. 3.3 Results The cost minimization in one validation set is shown in Fig. 3. The proposed model is executed with 5-fold cross-validation and the overall cost minimization is shown in Fig. 4. The reason for observed ?uctuations in Fig. 4 is the applied validation. The cost in last epoch of one validation set is much lower than the cost at ?rst epoch while training the next validation set. The model gives a training accuracy of 63.17%, validation accuracy of 56.67% and test accuracy of 65.24%. Hamiltonian Mechanics 595 Fig. 3. Cost plot. Fig. 4. Cost when validation is applied. 4 Conclusion This paper proposes a novel CNN based model for identi?cation of glioma based on their origin in brain. To the best of our knowledge, this is the ?rst time deep learning is applied for identi?cation of glioma. The most common brain tumor is gliblastoma multiforme which can be classi?ed using the proposed model. GBM is grade IV malingnant tumor. One shortcoming of the proposed model is that some astrocytomas and oligodendrocytomas are misidenti?ed as GBM. In future, with further improvement, this model may assist radiologists to predict the type of glioma a person is su?ering from and treatment can be given accordingly. 596 N. Saxena et al. 5 Future Scope Grade determines the severity of the disease. As mentioned, glioma has four grades. Grade IV is the most malignant stage and is also called gliblastoma multiforme or just high grade glioma. This paper detects the type of glioma from MR images using CNN. Further, a di?erent CNN architecture can be used for the detection of grade of glioma. References 1. Ideguchi, M., Kajiwara, K., Goto, H., Sugimoto, K., Nomura, S., Ikeda, E., Suzuki, M.: MRI ?ndings and pathological features in early-stage glioblastoma. J. 
Neu-roOncol. 123, 289–297 (2015) 2. El-Gamal, F., Elmogy, M., Atwan, A.: Current trends in medical image registration and fusion. Egypt. Inform. J. 17, 99–124 (2016). https://doi.org/10.1016/j.eij.2015. 09.002 3. Lecun, Y., Bottou, L., Bengio, Y., Ha?ner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86, 2278–2324 (1998) 4. Zeiler, M., Fergus, R.: Visualizing and understanding convolutional networks. In: European Conference on Computer Vision, pp. 818–833 (2014) 5. Rajnikanth, V., Fernandes, S., Bhushan, B., Sunder, N.: Segmentation and anal-ysis of brain tumor using tsallis entropy and regularised level set. In: 2nd Inter-national Conference on Micro-Electronics, Electromagnetics and Telecommunica-tions. Springer, Singapore (2018) 6. Hoseini, F., Shahbahrami, A., Bayat, P.: An e?cient implementation of deep con-volutional neural networks for MRI segmentation. J. Digit. Imaging 31, 738 (2018) 7. McGuinness, K., O’Connor, N.: A comparative evaluation of interactive segmen-tation algorithms. Pattern Recognit. 43, 434–444 (2010) 8. Wang, G., Li, W., Zuluaga, M., Pratt, R., Patel, P., Aertsen, M., Doel, T., David, A., Deprest, J., Ourselin, S., Vercauteren, T.: Interactive medical image segmenta-tion using deep learning with image-speci?c ?ne-tuning. IEEE Trans. Med. Imag-ing. 37, 1562 (2018) 9. Sadeghi-Naini, A., Suraweera, H., Tran, W., Hadizad, F., Bruni, G., Rastegar, R., Curpen, B., Czarnota, G.: Breast-lesion characterization using textural features of quantitative ultrasound parametric maps. Sci. Rep. 7, 13638 (2017) 10. Liu, M., Zhang, J., Nie, D., Yap, P., Shen, D.: Anatomical landmark based deep feature representation for MR images in brain disease diagnosis. IEEE J. Biomed. Health Inform. 22, 1476 (2018) 11. Devkota, B., Alsadoon, A., Prasad, P., Singh, A., Elchouemi, A.: Image segmen-tation for early stage brain tumor detection using mathematical morphological reconstruction. Procedia Comput. Sci. 125, 115–123 (2018) 12. Blumenthal, D., Artzi, M., Liberman, G., Bokstein, F., Aizenstein, O., Ben Bashat, D.: Classi?cation of high-grade glioma into tumor and nontumor components using support vector machine. Am. J. Neuroradiol. 38, 908–914 (2017) 13. Zacharaki, E., Wang, S., Chawla, S., Soo Yoo, D., Wolf, R., Melhem, E., Davatzikos, C.: Classi?cation of brain tumor type and grade using MRI texture and shape in a machine learning scheme. Magn. Reson. Med. 62, 1609–1618 (2009) Hamiltonian Mechanics 597 14. Liu, F., Jang, H., Kijowski, R., Bradshaw, T., McMillan, A.: Deep learning MR imaging-based attenuation correction for PET/MR imaging. Radiology 286, 676– 684 (2017) 15. Ro?man, D., Hart, G., Girardi, M., Ko, C., Deng, J.: Predicting non-melanoma skin cancer via a multi-parameterized arti?cial neural network. Sci. Rep. 8, 1701 (2018) 16. Makde, V., Bhavsar, J., Jain, S., Sharma, P.: Deep neural network based classi?ca-tion of tumourous and non-tumorous medical images. In: International Conference on Information and Communication Technology for Intelligent Systems, pp. 199– 206 (2017) 17. Scarpace, L., Flanders, A.E., Jain, R., Mikkelsen, T., Andrews, D.W.: Data From REMBRANDT. The Cancer Imaging Archive (2017) 18. Clark, K., Vendt, B., Smith, K., Freymann, J., Kirby, J., Koppel, P., Moore, S., Phillips, S., Ma?tt, D., Pringle, M., Tarbox, L., Prior, F.: The cancer imaging archive (TCIA): maintaining and operating a public information repository. J. Digit. Imaging 26, 1045–1057 (2013) 19. 
Choromanska, A., Hena?, M., Mathieu, M., Arous, G., LeCun, Y.: The loss surfaces of multilayer networks. In: Arti?cial Intelligence and Statistic, pp. 192–204 (2015) 20. Kamnitsas, K., Ledig, C., Newcombe, V., Simpson, J., Kane, A., Menon, D., Rueck-ert, D., Glocker, B.: E?cient multi-scale 3D CNN with fully connected CRF for accurate brain lesion segmentation. Med. Image Anal. 36, 61–78 (2017). https:// doi.org/10.1016/j.media.2016.10.004 21. Glorot, X., Bengio, Y.: Understanding the di?culty of training deep feedforward neural networks. In: Proceedings of the Thirteenth International Conference on Arti?cial Intelligence and Statistics, pp. 249–256 (2010) 22. Kingma, D., Ba, J.: Adam: a method for stochastic optimization. arXiv:1412.6980 (2014) 23. Io?e, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. arXiv:1502.03167 (2015) Array of Things for Smart Health Solutions Injury Prevention, Performance Enhancement and Rehabilitation S. M. N. Arosha Senanayake1,2(?) , Siti Asmah @ Khairiyah Binti Haji Raub2 , Abdul Ghani Naim1,2 , and David Chieng3 1 Institute of Applied Data Analytics, University of Brunei Darussalam, Gadong BE1410, Brunei arosha.senanayake@ubd.edu.bn 2 Faculty of Science, University of Brunei Darussalam, Gadong BE1410, Brunei 3 Wireless Innovation, MIMOS Berhard, Technology Park Malaysia, Kuala Lumpur, Malaysia Abstract. Data visualization on wearable devices using cloud servers can provide solutions for personalized healthcare monitoring of general public leading to smart nation. The objective of this research is to develop personalized healthcare IoT assistive devices/tools for injury prevention, performance enhancement and rehabilitation using an Intelligent User Interfacing System. It consists of Array of Things (AoT) which interconnects hybrid prototypes built using di?erent wearable measurement and instrumentations multimodel sensor system for transient and actual health status and classi?cation. Android platforms have been used to prove the success of AoT using national athletes and soldiers with whom were permitted the implementation of a knowledge base encapsulated reference/benchmarking massive retrieve, retain, reuse and revise health pattern sets accessible via case base reasoning cloud storage. Two case studies were conducted for injury prevention and rehabilitation and performance enhancement of soldiers and athletes using smart health algorithms. Validation and testing were carried out using Samsung Gear S3 smart watches in real time. Keywords: Array of Things (AoT) · Personalize healthcare Multimodel sensor system · Transient health · Smart health 1 Introduction Array of Things concept was ?rstly introduced in Smart Chicago project [1]. Their concept was the designing of range of cyber physical devices as measurement and instrumentation systems at urban scale based on the principle of array of telescopes and IoT. In [2], authors summarize Parkinson Disease (PD) patients monitoring in the home setting using wearable and ambient sensors. The technology includes a wireless unit strapped around the wrist, Band-Aid-like sensors attached to the lower limbs, a wearable camera worn as a pendant, a smart watch, and a mobile phone clipped on the belt used as gateway to relay the data to the cloud to assess speci?c functions (using its embedded sensors) as well as to communicate with the patient (using customized apps). 
The inte- gration of wearable technology with smart devices enables the remote monitoring of © Springer Nature Switzerland AG 2019 K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 598–615, 2019. https://doi.org/10.1007/978-3-030-02686-8_45 patients with PD and real-time feedback to clinicians, family/caregivers, and the patients themselves. Three Machine Learning (ML) algorithms were proposed to generate knee angle patterns in sagittal plane, which is one of the joints used during the walk. The Extreme Learning Machine algorithm outperformed against Arti?cial Neural Network and Multi-output Support Vector algorithms and can generate a speci?c reference of normal knee pattern depending on individual’s characteristics and walking speed. This speci?c refer- ence provides a personalized gait analysis [4]. Having done extensive research work on applying virtual measurement and instru- mentation for human motion analysis during past two decades [5–11], this paper intro- duces generalized frame work for data visualization on wearable devices for personal- ized healthcare using wearable sensors and its data fusion; Array of Things for smart health solutions, as illustrated in Fig. 1. Fig. 1. System overview of Array of Things for smart health solutions. Smart health solution architecture consists of wearable devices for personalized healthcare services and technologies and cloud server technologies in order to visualize smart health data fused and update, repair and remove transient health data based on actual health status using personalized wrist band data center. Thus, this paper is struc- tured from general system architecture introduced leading to speci?c application domains used in order to prove its services. Smart health solution architecture is articu- lated using Hybrid System Architecture Platform (HSAP) that is the novel platform for Array of Things (AoT) devices/tools composed of a set of cloud computing based sensor, processing, control, and data services integrating AoT and cloud computing into a single framework Thus, this article describes the HSAP system architecture in detail using its core components; smart data fusion, smart data analytics and deep learning. HSAP allows to acquire personalize health pattern set using wearable devices which requires multimodel sensory mechanisms to extract feature set, integrate feature set and transform it using Array of Things for Smart Health Solutions Injury Prevention 599 data fusion techniques such a way that knowledge base (KB) of an individual person is formed. Formed KB consists of pre-injury (healthy) pattern set, injury pattern set and post-injury pattern set which will be updated using personalized wrist band data center primarily using worn IoTs. Virtual measurements and instrumentation technologies (LabVIEW) is used as the platform to interface AoTs connected to cloud server by implementing Intelligent Graphical User Interfacing System (IGUIS) in order to acquire current (actual) health data pattern set on site, online and real time to update KB using case base reasoning such a way that cloud computing takes care of providing appropriate services; reactive care, episodic centric and clinic centric for performance enhancement, injury prevention and rehabilitation. 
Thus, KB interfaced with smart health algorithms processed using cloud computing facilitates the classi?cation of current health status considered as actual health status while cloud storage maintains transient health status of each individual using historic pattern set already available in cloud storage. Based on the limited storage available in worn IoTs, Samsung Gear S3 watch provides 2 GB free space, transient health status (classi?cation) is stored in a queue to continuously update the classi?cation of individual using actual health status on site, online and real time. 2 Smart Health Solution Architecture 2.1 Rationale On health or lifestyle monitoring, harvesting of motion data and context reasoning is often a complex task. IntelliHealth Solutions was introduced to assess, monitor and to provide feedback on active lifestyle focusing generalized solution for normal Brunei citizens [4]. While IntelliHealth solutions has already achieved the establishment of reference standards of Brunei Citizens based on soldiers and national athletes (healthy citizens) using intelligent knowledge base formed (resident pattern storage in a cloud server) [5], the aim of this research is to develop a transient wearable healthcare solutions for transient pattern storage in real time with shared resource allocation using cloud technology for resident pattern storage already formed using intelligent knowledge base. This will allow real time monitoring of human test subject while performing real time walking, jogging, running and cycling. So far, resident pattern storage of soldiers and athletes has been established using smart data and decision fusion consisted of smart data analytics, deep learning, case based reasoning and virtual measurement and instru- mentation technologies [6]. Thus, the achievement of the development of wearable motion interfacing and reasoning devices for general public with its own vision ‘towards active healthy lifestyle’ facilitates the monitoring of gait and rehabilitation of initially ASEAN obese community with pilot study on going in Brunei as the center, Malaysia and Vietnam under the ASEAN Institutes of Virtual Organization at National Informa- tion and Communications Technology (NICT), Tokyo, Japan with the title “IoT system for Public Health and Safety Monitoring with Ubiquitous Location Tracking”. Heavy computations required for motion data reasoning and position estimation result in high energy consumption. Together with the needs to maintain a reliable data connection anytime anywhere, a practical battery design is becoming a huge challenge for such wearable devices. Certain computations need to be o?oaded to a cloud server 600 S.M.N. Arosha Senanayake et al. without signi?cantly compromising the response time. In today’s highly digitized society, cloud technologies play a critical role in preserving health and safety of citizen especially women, children and the elderly. Over the last few years, there is a growing needs for monitoring the citizen’s lifestyle including their health status. Smart Health will have a direct impact on society leading to a smart society. The ultimate achievement of AoT for smart health solutions works as a service provider for the wellbeing of public. The AoT for quality life style have not been addressed exten- sively in recent years. 
Recently developed devices were not a great success due to three main critical issues not appropriately integrated into customized devices targeting a particular society needs (ASEAN countries); Intelligent User Interfaces, information fusion and real time biofeedback control. Hence, the goal of Smart Health solutions is to design, implement and build AoT devices/tools which incorporate hybrid tools; intel- ligent user interfacing systems and real time biofeedback control systems embedded with information fusion. Smart Health will have a direct impact on society leading to a smart society. The ultimate achievement of AoT for smart health solutions works as a service provider for the wellbeing of public. The AoT for quality life style have not been addressed exten- sively in recent years. Recently developed devices were not a great success due to three main critical issues not appropriately integrated into customized devices targeting a particular society needs (ASEAN countries); Intelligent User Interfaces, information fusion and real time biofeedback control. Hence, the goal of Smart Health solutions is to design, implement and build AoT devices/tools which incorporate hybrid tools; intel- ligent user interfacing systems and real time biofeedback control systems embedded with information fusion. Thus, AoT for smart health solutions embeds solutions for injury prevention, performance enhancement and rehabilitation using reactive care services, episodic response services and clinic centric services respectively. Intelligent Graphical User Interfacing System (IGUIS) was built to integrate these services and tested using soldiers and national athletes successfully as reported in [6]. IGUIS was built using virtual measurement and instrumentation tools provided by LabVIEW and using Support Vector Machines (SVM) interfaced with case base reasoning. 2.2 System Architecture As shown in Fig. 1, the overall system architecture is mainly divided into two sub-systems; Wearable Device and Server (Cloud) which are interconnected via communi- cation protocols with two critical parameters; one related to IoT(s) active from Array of Things (AoT) and the status. Initially, wearable device considered is Android based platform, but recon?guring to other wearable platforms is allowed using customizing tools integrated. Wearable device contains multimodal healthcare system on device, personalized wrist band data center and AoT platforms. AoT is designed in order to accommodate all embedded platforms arising from multimodal healthcare system from di?erent devices. It is imple- mented using real time embedded system interfaced with IGUISs. Hence, AoT uses daisy chain methods to interface with all IoT devices encapsulated under smart health Array of Things for Smart Health Solutions Injury Prevention 601 solutions. This will allow the connectivity of future IoTs to be developed with no addi- tional hardware. In order to facilitate the connectivity with Cloud servers, personalized communication protocol is built. Communication protocol is the interface to the server usually a cloud server con?g- ured to the IoT in consideration. It carries two important information from Android device currently active; IoT and Status. IoT information contains personalized health protocol headers which allows to recon?gure and to synchronize with corresponding smart health data in the cloud server. The status is the result of actual health status of actual human test subject in consideration in real time or online. 
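The two items carried over the communication protocol, the currently active IoT and the Status, can be pictured as a small JSON payload such as the one below. The field names and values are assumptions made for illustration; the paper does not publish its message schema.

# Hypothetical example of the payload a wearable could push to the cloud server,
# carrying the two items described above: which IoT from the AoT is currently
# active, and the actual health status classified on the device.
import json
import time

message = {
    "subject_id": "S-017",                  # anonymized subject identifier (assumed)
    "iot": {
        "device": "Samsung Gear S3",        # active IoT from the Array of Things
        "platform": "Tizen OS",
        "protocol_header": "smart-health/knee-rehab/v1",   # personalized header (assumed)
    },
    "status": {
        "classification": "Class A",        # actual health status, e.g. recovery class
        "confidence": 0.87,
        "timestamp": time.time(),
    },
}

payload = json.dumps(message)               # what would be sent over the wireless link
print(payload)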
Cloud server contains smart health algorithms built in on server, smart health data analytics and hybrid system platforms. Thus, cloud server is the service provider which provides data visualization using virtual technologies and services requested by the end user. Hybrid system platforms is based on hybrid system architecture platforms (HSAP) interfaced to wearable devices. As far as wearable devices connected are based on HSAP, they can transfer necessary smart health data into HSAP for processing. In this project, HSAP is restricted to wearable devices with Android platforms and its families such as Tizen OS platforms used for smart watches. HSAP is depicted in Fig. 2. Fig. 2. Hybrid System Architecture Platforms (HSAP). Main components of HSAP are smart data fusion, smart data analytics and deep learning. Smart data fusion is carried out using IoT currently active interfaced with actual health status of current human test subject under consideration in real time or/and online. Thus, this will facilitate to apply selected smart health algorithm in order to transform 602 S.M.N. Arosha Senanayake et al. active pattern set for smart data analytics. Smart data analytics is responsible to apply case based reasoning for the intelligent knowledge base already stored in cloud server such a way that transient health pattern set already available in the memory is the basis to retrieve the matching pattern set or/and revise and retain in the knowledge base. Deep learning techniques are implemented to produce the output to be either visualized as personalized health data or/and client services requested by clinicians or/and physio- therapists or/and trainers or/and subject under assessment which are primarily based on the established protocols and norms for injury prevention, performance enhancement and rehabilitation monitoring. In this research, Canadian protocols have been used to implement decision fusion algorithms to make a ?nal judgment as a wireless wearable assistive tool/device independent of location and human anthropometry. 3 Prototypes Built, Emulation and Validation The implementation of the AoT for Smart Health Solutions (SHS) is based on the criteria and norms (Canadian norms) currently practiced by the Performance Optimization Centre of Ministry of Defense and Sports Medicine and Research Centre of Brunei utilizing the standard guidelines established for injury prevention, performance enhancement and rehabilitation of soldiers and national athletes. Thus, AoT is designed by setting up di?erent functional/service units (currently in operation) as follows; reac- tive care, episodic response and clinic centric. Thus, smart health solutions at its current stage support the following functionalities across wearable devices and HSAP. • Personalized Wrist Band Data Centre for Healthy Lifestyle • Pre-clinical monitoring of movement disorders/abnormalities • Secure Personalized Performance Analysis Data Center • Personalized Recovery Progress Analysis and Classi?cation • Secure Sports/Military Personnel Performance Enhancement. A hybrid intelligent framework was developed by combining case-based reasoning (CBR) approach and adaptive intelligent mechanisms in order to build prototypes with di?erent functionalities. The framework utilizes the concept of solving new problems by using/modifying the similar previous experiences (problem-solution pairs). 
CBR problem-solving cycle consists of four steps [7, 12]: • Retrieve: Finding similar case(s) from the knowledge base whose problem descrip- tion best matches with the given problem. • Reuse: Reusing the solution of most similar case to solve the new problem. • Revise: Adapting/Modifying the chosen solution according to the di?erences in new problem. • Retain: Storing the new problem-solution pair as a case once it has been solved. Thus, designing intelligent hybrid knowledge based system is subject to the estab- lishment of knowledge base (KB) of smart health solutions using pattern sets currently Array of Things for Smart Health Solutions Injury Prevention 603 available and at the same time allowing the evolvement of KB with new pattern sets subject to CBR which is stored in a cloud server as depicted in Fig. 1. 3.1 Knowledge Base (KB) The structure of knowledge base for smart health solutions is depicted in Fig. 3. The knowledge base contains di?erent types of information including; raw and processed data, domain knowledge, historical data available for subjects (pre-injury, post-injury and recovery data) and session data during convalescence, case library (problem-solu- tion pair), reasoning and learning models (trained intelligent methods) and other relevant data (e.g. subjects’ pro?les, gender, activity type, etc.). Fig. 3. The structure of knowledge base for smart health solutions. In order to manage the knowledge base repository, a relational database was used to reduce the storage redundancy and provide ?exibility. The knowledge base evolves with the time-period when new problems are presented and new cases are added to the system 604 S.M.N. Arosha Senanayake et al. using CBR. This evolution process makes it more useful for domains where subject’s speci?c monitoring and prognosis mechanisms are required. In general, the information in KB can be represented as in (1): KB = [ pre_inj_Ii S , post_inj_Ij S , post_op_Ik S , T ( pre_inj_Ij S ) , T ( post_inj_Ij S ) , T ( post_op_Ik S ) , Sp, D, C, Mt ] (1) where pre_inj_Ii S : raw input data set of a group of subjects ‘S’ for di?erent activities at pre-injury (i.e. healthy) stage for i sessions (i =o 1) post_inj_Ij S : raw input data set of a group of subjects ‘S’ for di?erent activities during post injury for j sessions (j =o 1) post_op_Ik S : raw input data set of a group of subjects ‘S’ for di?erent activities during post-surgery (i.e. rehabilitation) for k sessions (k =o 1) T(pre_inj_Ii S ): processed input data set of a group of subjects ‘S’ for di?erent activ- ities at pre-injury (i.e. healthy) stage for i sessions (i =o 1) T(post_inj_Ij S ): processed input data set of a group of subjects ‘S’ for di?erent activ- ities during post-injury (i.e. before surgery) for j sessions (j =o 1) T(post_op_Ik S ): processed input data set of a group of subjects ‘S’ for di?erent activ- ities during post-surgery (i.e. rehabilitation) for k sessions (k =o 1) Sp: pro?le (e.g. gender, age, weight, height, type of injuries, activities etc.) of p subjects D: domain knowledge (e.g. type of protocols followed for subjects, local/standard norms for di?erent rehabilitation testing activities etc.) C: case library consisting of problem-solution pairs (processed input, rehabilitation procedure followed, outcomes and possible suggestions) related to individuals or di?erent group of subjects Mt: trained intelligent models for each activity t to be monitored. 
The designed KB is not a static collection of information, but it acts as a dynamic resource which has the capacity to learn and evolve with the passage of time when new problems are presented and new problem-solution pairs are added to the system using CBR. This evolution process makes it more useful for domains where subject’s speci?c monitoring and prognosis mechanisms are required. Thus, as an integral component of injury prevention, performance enhancement and rehabilitation, this KB has been used to optimize collection, organization and retrieval of relevant information for subjects using CBR. 3.2 Smart Health Solutions Service Provider Services de?ned by smart health solutions are tightly coupled with available AoT func- tional/service units and its functionalities across wearable devices and HSAP with the hybrid intelligent knowledge based system formed as explained in the Sect. 3.1. Hence, prototypes built, emulation and validation are carried out using reactive care, episodic Array of Things for Smart Health Solutions Injury Prevention 605 response and clinic centric under the careful supervision of specialists; clinicians, phys- iotherapists, trainers, test subjects, etc. Reactive Care. This service provides performance enhancement and injury prevention tools as proactive and preventive care services for healthy active lifestyle. If a person is concerned about daily active lifestyle, reactive care services produce required output data using daily healthcare records up to date using easy steps as follows: • Secure Personalized data center is responsible to store and to visualize all measure- ments of daily active lifestyle. • If a person is not active during working time, preventive care tool assists to ?nd and to determine causes. • Produce and generate personalized reports using data visualization tools. Episodic Response. These tools guarantee life long active daily life style by providing periodic monitoring and biofeedback control through appropriate intervention during critical stages. Episodic response tools provide services not only for today, it is about wellbeing throughout the life. Periodic monitoring of recovery stages upon the injury treatment will lead the returning to healthy active lifestyle within shortest possible time frame. These features are integrated using the following tools: • Pre-clinical monitoring of movement disorders/abnormalities. • Personalized Recovery Progress Analysis and Classi?cation by storing personalized data into a knowledge base in which pre-injury, post-injury and recovery data are stored and fused in the cloud server. • Real time biofeedback control using personalized wearable devices. Clinic Centric. Clinic centric service guides patients with rehabilitation protocols for recovery of injured joints/muscles or/and tiny muscle repair. The injury recovery is crucial to return to active daily healthy lifestyle. Progressive recovery percentage can be quanti?ed and visualized using following tools: • Secure personalized wrist band data center using wearable wireless sensor suit. • Integrated tiny muscle detector of damaged tiny muscle areas in relevant muscles up to mm2 . • Produce and generate personalized reports using virtual technologies interfaced with data visualization tools. 
In this research, prototypes built, emulation and validation of smart health solution services have been proven and tested using the following key and critical planned activ- ities: • Prototypes built for physical & mobility impairments, obesity, gait disorders, etc. • Incorporated intelligent user interfacing tools and real time biofeedback mechanisms in wearable devices (smart watches) and customized taking into consideration society needs. • Validate and test smart health solution service for di?erent types of human test subjects (ASEAN, Japan and USA) in di?erent clinical environment; Performance Optimization Centre and Sports Medicine and Research Center in Brunei. 606 S.M.N. Arosha Senanayake et al. 4 Case Studies Using AoT Built AoT is built using virtual measurement and instrumentation technologies (LabVIEW), Tizen OS emulator and smart watches for physical and mobility impairments, obesity and gait disorders community and for national athletes as healthy subjects in a society. In order to validate and test AoT so far built, clinical and laboratory environment were set up as illustrated in Fig. 4 at Performance Optimization Centre of Ministry of Defense, Sports Medicine and Research Centre of Ministry of Youth, Culture and Sports and Physiotherapy unit under Ministry of Health. Fig. 4. Clinical and laboratory set up for smart health solutions. 4.1 Case Study 1 – Injury Prevention and Rehabilitation A general framework of intelligent and interactive biofeedback virtual measurement and instrumentation system was built for physical and mobility impairments, obesity and gait disorders as smart health solution for soldiers and professional athletes, especially during rehabilitation monitoring. The application of machine learning techniques along with custom built wireless wearable sensor suit facilitated in building a knowledge base system for periodical rehabilitation monitoring of test subjects and providing a visual/ numeric biofeedback to the clinicians, patients and healthcare professionals. The vali- dated system is currently used as a decision supporting tool by the clinicians, physio- therapists, physiatrists and sports trainers for quantitative rehabilitation analysis of the subjects in conjunction with the existing recovery monitoring systems [5]. In order to perform real time recovery classi?cation of gait pattern for an ambulation activity, multi-class Support Vector Machine (SVM) is implemented using one – vs – all method. SVM has been extensively used as a machine learning technique for many biomedical signal classi?cation applications. The identi?cation of class/status from gait patterns of a new/actual subject can provide useful complementary information in order to make the adjustments in his/her rehabilitation process. Figure 5 illustrates LabVIEW Array of Things for Smart Health Solutions Injury Prevention 607 data ?ow diagram of SVM embedded into the Intelligent Graphical User Interfacing System (IGUIS) built [6]. Fig. 5. Data ?ow diagram of SVM for recovery classi?cation. Thus, interactive biofeedback visualization was designed to monitor rehabilitation and recovery status of subjects with physical and mobility impairments, obesity and gait disorders. There are two conditions accepted by biofeedback visualization. First condi- tion is the availability of gait pattern set of the subject in the KB (o?ine) while the second condition is the subject undergoing actual experiment to analyze current recovery status (real time). 
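The recovery-stage classifier just described, together with the fusion step applied to it later in this section, can be sketched compactly. In the notation of (2) below, the fused support for stage k is e_k = sum over i of [h_k(y_i) - h_k(y_(i-1))] * g(S_i), with the certainties h_k(y_i) taken in ascending order and h_k(y_0) = 0. The sketch uses a one-vs-all multi-class SVM to produce per-activity class supports and fuses them with a discrete Choquet integral; the gait features, activity densities, and the simple capped-additive fuzzy measure are illustrative assumptions, whereas the deployed system embeds its SVM in the LabVIEW-based IGUIS and derives its fuzzy measure from the classifier densities.

# Sketch of the Sect. 4.1 pipeline: one-vs-all SVM per activity, Choquet fusion across activities.
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 12))        # 12 gait features per sample (assumed)
y_train = rng.integers(0, 4, size=200)      # labels 0..3 -> recovery classes A..D

ova_svm = OneVsRestClassifier(SVC(kernel="rbf", probability=True))   # one-vs-all multi-class SVM
ova_svm.fit(X_train, y_train)

x_new = rng.normal(size=(1, 12))
p = ova_svm.predict_proba(x_new)[0]         # support of each recovery class for one activity

def choquet(h, g):
    # discrete Choquet integral of supports h (dict classifier -> value) w.r.t. fuzzy measure g
    names = sorted(h, key=h.get)            # ascending order of support
    total, prev = 0.0, 0.0
    for i, name in enumerate(names):
        total += (h[name] - prev) * g(frozenset(names[i:]))
        prev = h[name]
    return total

densities = {"walk": 0.5, "jog": 0.3, "stairs": 0.4}   # importance of each activity classifier (assumed)

def g(subset):
    # simple monotone fuzzy measure built from the densities; the paper's construction
    # of g from the densities g_i is replaced by a capped additive measure for brevity
    return min(1.0, sum(densities[c] for c in subset))

# certainty h_k(y_i) that the subject is in stage k according to each activity classifier;
# the walking value comes from the SVM above, the other two are assumed for illustration
h_stage_A = {"walk": float(p[0]), "jog": 0.70, "stairs": 0.60}
print("fused support for recovery Class A:", round(choquet(h_stage_A, g), 3))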
In offline mode, the biofeedback visualization displays previously saved and visualized signals using IGUIS. The total time needed from real-time system software start-up until the output is produced is 20 s during real-time analysis, whereas in offline processing it is immediate. The visual output generated using IGUIS facilitates adjusting an individual subject's rehabilitation protocol using the standard procedures governing it.

Different classifiers may assign different classes to the same subject based on his/her performance during each activity, or due to misclassification. In addition to evaluating the output of an individual activity of a subject, an overall assessment can also be helpful to categorize the recovery stage of a subject after a certain rehabilitation period. The classification results of multiple activities for each subject's data have been combined using the Choquet integral method, as illustrated in (2). The Choquet integral is a non-linear functional defined with respect to a fuzzy measure $g_\lambda$, where $g_\lambda$ is completely determined by its densities ($g_i$, the degree of importance of classifier $y_i$ towards the final decision). The fusion of the different classifiers is computed based on (1) and (2) [8, 13].

$e_k = \sum_{i=1}^{t} \left[ h_k(y_i) - h_k(y_{i-1}) \right] g(S_i), \quad h_k(y_0) = 0$   (2)

where
$h_k(y_i)$: the certainty of the identification of subject S being in stage k using classifier $y_i$;
$g(S_i)$: the degree of importance of classifier $y_i$ of subject S towards the final decision;
$e_k$: the overall recovery stage of the fuzzy integration, based on the highest value computed for e in stage k of subject S.

Figure 6 shows the recovery classification classes of a knee-injured test subject extracted from the IGUIS built. Four classes (A, B, C and D) were formed using historical data collected and stored in the KB using fuzzy C-means clustering. Hence, classes A through D represent different stages of the health/recovery condition of subjects based on their gait patterns: Class A represents 2–6 months of recovery; Class B represents 7–12 months of recovery; Class C represents 13–24 months of recovery; Class D represents a healthy subject.

Fig. 6. Knee recovery classification of the subject classified as Class A in real time.
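As a concrete illustration of the fusion rule in (2), the short Python sketch below evaluates a discrete Choquet integral for one recovery stage. It is a sketch only: the certainty values, the densities, and the use of a Sugeno lambda-measure to build $g(S_i)$ from the densities are assumptions for the example and are not taken from the paper's data.

```python
# Illustrative sketch of the Choquet-integral fusion in Eq. (2); all numbers are invented.
import numpy as np
from scipy.optimize import brentq

# Certainty h_k(y_i) that the subject is in stage k, one value per activity classifier,
# and the density g_i (importance) of each classifier -- hypothetical values.
h = np.array([0.55, 0.70, 0.40])   # certainties from three activity classifiers
g = np.array([0.30, 0.50, 0.25])   # densities g_i of the classifiers

# Sugeno lambda-measure: solve prod(1 + lam*g_i) = 1 + lam.
# The densities here sum to 1.05 > 1, so the non-zero root lies in (-1, 0).
lam = brentq(lambda lam: np.prod(1.0 + lam * g) - (1.0 + lam), -0.999999, -1e-9)

def measure(indices):
    """Fuzzy measure of a set of classifiers under the lambda-rule."""
    m = 0.0
    for i in indices:
        m = m + g[i] + lam * m * g[i]
    return m

# Sort certainties ascending; S_i is the set of classifiers from the i-th smallest value upward.
order = np.argsort(h)
h_sorted = np.concatenate(([0.0], h[order]))        # prepend h_k(y_0) = 0
e_k = sum((h_sorted[i] - h_sorted[i - 1]) * measure(order[i - 1:])
          for i in range(1, len(h_sorted)))
print(f"Fused certainty e_k for this stage: {e_k:.3f}")
```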
The hybrid intelligent framework together with CBR, referred to as the smart health algorithms, is stored in the cloud server for the clinic-centric and episodic response care services, and data visualization can be obtained using wearable IoT devices. In this study, the Tizen OS visualization emulator was used as illustrated in Fig. 7, and the output was subsequently visualized on a Samsung Gear S3 smart watch as the IoT device (courtesy of Samsung Asia Pte Ltd, Singapore), as illustrated in Figs. 8 and 9, using JSON tools.

Fig. 7. Tizen OS emulator for real-time classification during rehabilitation.

Fig. 8. IoT devices for real-time classification during knee rehabilitation.

Fig. 9. Samsung Gear S3 smart watch for real-time classification during knee rehabilitation.

In this study, the Samsung Gear S3 smart watch works as an IoT injury prevention and rehabilitation tool, wirelessly connected regardless of the locations of clinicians and patients (soldiers). As the IoT tool (the Samsung Gear S3 smart watch) revises the pattern set using the actual (current) pattern set identified during the rehabilitation process, case-based reasoning is used to update the intelligent KB in the cloud server. Hence, clinicians were able to provide real-time biofeedback to patients, so that the soldiers monitored for rehabilitation and injury prevention followed the protocols given by clinicians in order to improve their recovery classification. With the IoT built so far for the soldiers' critical knee joint, clinicians were able to avoid second Anterior Cruciate Ligament (ACL) surgeries for women soldiers, who previously were commonly at risk of not returning to their soldiering career because no real-time biofeedback monitoring had been performed. Therefore, the Samsung Gear S3 smart watch used in this study as the IoT for real-time knee monitoring was capable of providing the current recovery classification of a knee-injured soldier without physical presence in the clinic; at the same time, based on the current classification, clinicians were able to provide new protocols to improve the knee rehabilitation process.

Currently, this IoT is used for soldiers, as soldiers are considered a reference/benchmarking population in a nation. Since this study has already proven the capability of real-time biofeedback monitoring using IoT via smart health data stored in and accessed through the cloud server set-up, the current study focuses on validating and testing members of the general public in the physiotherapy clinic of the government hospital and at Jerudong Park Medical Center (under the Gleneagles Hospital chain from Singapore), under the close routine monitoring of clinicians in the clinic. While patients take part in this pilot study voluntarily, smart watches sponsored by Samsung Asia Pte Ltd are used to revise the pattern set based on the pattern set collected in the home environment, by automatically updating the smart health data in the cloud server.

4.2 Case Study 2 – Performance Enhancement

A hybrid framework combining Self-Organizing Maps (SOMs) and CBR for clustering, accessing, examining and recommending training procedures for the performance enhancement of national athletes has been implemented. This system is intended to assist sports professionals, coaches or clinicians to maintain records of subject and experiment information, diagnose improper movements based on the KB, provide recommendations for improvement and monitor the progress of performance over a period of time. The IGUIS is built to facilitate monitoring and to provide instantaneous biofeedback during training sessions. The IGUIS supports a range of features necessary in real-time applications, which are clustered into separate frames for simplicity and ease of use.

Figure 10 illustrates the IoT platforms used for real-time data visualization during the performance enhancement of national athletes, based on the hybrid framework combining SOMs and CBR implemented as smart health algorithms in the cloud server, in order to derive personalized performance enhancement of athletes using the reactive care and episodic response services provided by the smart health solutions. In this study, the Tizen OS emulator followed by the Samsung Gear S3 smart watch was used to visualize data, applying database-driven neural computing interfaced with JSON tools, as illustrated in Fig. 11.

Fig. 10. IoT platforms for athletes' performance enhancement using hybrid intelligent computing.

Fig. 11. Samsung Gear S3 smart watch for athletes' performance enhancement in real time.

In this study, the database-driven neural computing system was used to monitor the different activities instructed by coaches during their training regime. Different coaches use different protocols and standards to classify national athletes.
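The SOM half of the hybrid SOM+CBR framework can be pictured with a small sketch. The snippet below is an illustration only: it uses the third-party MiniSom library and invented athlete feature vectors (none of which come from the paper) to cluster training-session profiles onto a 2-D map whose cells could then be linked to stored cases for CBR retrieval.

```python
# Illustrative SOM clustering of athlete training profiles (not the authors' implementation).
import numpy as np
from minisom import MiniSom  # third-party library: pip install minisom

# Hypothetical feature vectors: one row per training session (e.g. jump height, speed, HR metrics).
profiles = np.random.rand(150, 8)

# A 5x5 map: each cell becomes a cluster of similar training profiles.
som = MiniSom(5, 5, input_len=8, sigma=1.0, learning_rate=0.5, random_seed=42)
som.random_weights_init(profiles)
som.train_random(profiles, num_iteration=2000)

# The winning cell for a new session indexes the cluster whose stored cases
# (past sessions and their recommendations) a CBR step would retrieve and reuse.
new_session = np.random.rand(8)
print("Best-matching map cell:", som.winner(new_session))
```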
In general, however, the expectation is that each athlete performs at an excellent or very good level in the different activities assigned during the training regime; otherwise the athlete is automatically considered not to deserve a place in the national pool. Hence, women netball players in a training regime were considered under the close monitoring of coaches and a physical strength and conditioning specialist who use Canadian protocols. There are pre-defined activities set by the coach during the training regime in order for the coach to decide the positioning of players in forthcoming international games/tournaments. By wearing the smart watch during the training regime in the indoor stadium and performing the pre-defined physical exercises given by coaches and clinicians, just before the subsequent training regime the coach and clinicians have access to each athlete's profile pattern set updated in the cloud server. The Samsung Gear S3 smart watch, considered as an IoT worn by each athlete, automatically visualizes the transient health status of the personalized classification from cloud storage before the actual regime starts, which is fundamental for healthcare professionals, in this case coaches, to determine the performance level the athlete should undertake in the actual training regime onsite, online and in real time. Hence, coaches and clinicians are able to make a judgment and/or re-adjust the training regime of each athlete with updated/revised protocols for the forthcoming training regimes and actual games, based on real-time biofeedback monitoring.

5 Comparative Analysis with Existing Systems

The Array of Things (AoT) using virtual measurement and instrumentation technologies for smart health solutions addressed in this research work is novel. While specific application domains exist that use augmented, virtual and mixed realities, none of the existing applications introduces a generalized architecture similar to the Hybrid System Architecture Platform (HSAP), which allows interfacing and mapping to a specific domain of interest using cloud computing. Further, this article addresses the solution space using wearable technologies, from the acquisition of the personalized health pattern set via a multimodal healthcare system using a personalized wrist-band data center, while the IoT itself, in this case the Samsung Gear S3 smart watch, performs real-time biofeedback monitoring based on the transient health status (classification) and the current/actual health status (classification or recovery status) onsite, online and in real time during injury prevention, performance enhancement and rehabilitation using cloud computing. Hence, there is no concrete evidence in the literature against which to perform a comparative analysis, because the solutions provided so far are domain-centric within digital healthcare technologies and services.

6 Conclusions

The Array of Things (AoT) for smart health solutions during injury prevention, performance enhancement and rehabilitation, the futuristic concept introduced in this research work, was proven by interfacing virtual measurement and instrumentation (LabVIEW from NI) and IoT platforms (the Samsung Gear S3 smart watch). An intelligent graphical user interfacing system was built to assist the formation of an intelligent knowledge base, which is an evolving smart health pattern storage using case-based reasoning via retrieve, reuse, revise and retain mechanisms during real-time biofeedback monitoring.
At its current stage, the cloud storage consists of smart health data processed according to the Canadian standard protocols established by coaches, clinicians, physiotherapists and physical strength and conditioning specialists at the Performance Optimization Center of the Ministry of Defense and at the Sports Medicine and Research Center of the Ministry of Youth, Culture and Sports, using the nation's active healthy population: soldiers and professional athletes. Two case studies have been conducted during their training regimes under the close monitoring of different specialists. The AoT for smart health solutions concept was proven using IoT platforms during real-time feedback monitoring, and at the same time references and benchmarks were established based on the nation's active healthy population of soldiers and athletes.

This will allow norms to be established for the general public for their health and safety monitoring during real-time biofeedback monitoring, using these IoT platforms as assistive tools/devices for different health classifications and recovery statuses regardless of the patient's location, whether at home and/or at the clinic, under the close monitoring of different specialists. Thus, the services provided by AoT (reactive care, clinic centric and episodic response) provide the platform to personalize IoT devices for healthcare using database-driven neural computing platforms. Therefore, the futuristic goal of this ongoing research is the utilization of different deep learning algorithms, in particular reinforcement learning mechanisms, for smart data analytics geared towards smart data visualization and services.

Acknowledgments. This publication is part of the output of the ASEAN Institutes of Virtual Organization at the National Information and Communications Technology (NICT), Tokyo, Japan; ASEAN IVO project with the title "IoT system for Public Health and Safety Monitoring with Ubiquitous Location Tracking". This research is also partially funded by the University Research Council (URC) grant scheme of Universiti Brunei Darussalam under grant No. UBD/PNC2/2/RG/1(195).

References

1. Michael, E.P.: Introduction to the array of things. http://niu.edu/azad/_pdf/3-Michael_May18_2016.pdf
2. Alberto, J.E., et al.: Technology in Parkinson's disease: challenges and opportunities. Mov. Disord. 31(9), 1272–1282 (2016). https://doi.org/10.1002/mds.26642. Epub 29 April 2016
3. Vieira, A., Ribeiro, B., Ferreira, J.P.: GAIT analysis: methods & data review. CISUC technical report TR-2017-004, December 2017 (unpublished)
4. Arosha Senanayake, S.M.N., et al.: IntelliHealth solutions: technology licensing. http://intelli-health.org/
5. Yahya, U., Arosha Senanayake, S.M.N., Naim, A.G.: Intelligent integrated wearable sensing mechanism for vertical jump height prediction in female netball players. In: Eleventh International Conference on Sensing Technology (ICST), Sydney, Australia, pp. 94–100 (2017). https://doi.org/10.1109/icsenst.2017.8304484
6. Filzah Pg Damit, D.N., Arosha Senanayake, S.M.N., Malik, O., Jaidi Pg Tuah, P.H.N.: Instrumented measurement analysis system for soldiers' load carriage movement using 3-D kinematics and spatio-temporal features. Measurement 95, 230–238 (2017)
7. Wulandari, P., Arosha Senanayake, S.M.N., Malik, O.A.: A real-time intelligent biofeedback gait patterns analysis system for knee injured subjects. In: Nguyen, N.T., et al. (eds.) Intelligent Information and Database Systems, Part II.
Lecture Notes in Artificial Intelligence (LNAI), vol. 9622, pp. 703–712. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-49390-8_68
8. Arosha Senanayake, S.M.N., Malik, O.A., Iskandar, P.M., Zaheer, D.: A knowledge-based intelligent framework for anterior cruciate ligament rehabilitation monitoring. J. Appl. Soft Comput. 20, 127–141 (2014)
9. Senanayake, C., Arosha Senanayake, S.M.N.: A computational method for reliable gait event detection and abnormality detection for feedback in rehabilitation. Comput. Methods Biomech. Biomed. Eng. 14(10), 863–874 (2011)
10. Alahakone, A.U., Senanayake, A.: A real-time interactive biofeedback system for sports training and rehabilitation. Proc. IMechE J. Sports Eng. Technol. 224(Part P), 181–190 (2010)
11. Gouwanda, D., Arosha Senanayake, S.M.N.: Emerging trends of body-mounted sensors in sports and human gait analysis. In: International Federation for Medical and Biological Engineering Book Series, Chap. 102. Springer, Heidelberg (2008). ISBN 978-3-540-69138-9
12. Aamodt, A., Plaza, E.: Case-based reasoning: foundational issues, methodological variations, and system approaches. AI Commun. 7, 39–59 (1994)
13. Murofushi, T., Sugeno, M.: An interpretation of fuzzy measures and the Choquet integral as an integral with respect to a fuzzy measure. Fuzzy Sets Syst. 29, 201–227 (1989)

Applying Waterjet Technology in Surgical Procedures

George Abdou and Nadi Atalla
New Jersey Institute of Technology, Newark, USA
{abdou,na76}@njit.edu

Abstract. The main objective of this paper is to predict the optimal waterjet pressure required to cut, drill or debride the skin layers without causing any damage to the organs. A relationship between the waterjet pressure and the skin thickness has been established. It also includes the modulus of elasticity of the skin, the diameter of the nozzle orifice, the nozzle standoff distance and the traverse speed of the waterjet, as well as the duration of applying the waterjet pressure. Thus, a practical relationship between the waterjet operating parameters and the physical properties of the skin has been formulated. Data from a real Caesarean section procedure have been applied to the formulation. Given an Ultimate Tensile Strength of the skin at the abdomen of 20 MPa, incision parameters of 18 mm deep, 12 cm long and 0.4 mm wide, a traverse speed of 0.5 mm/s and a stand-off distance of 5 mm, the resulting waterjet pressure is 17.89 MPa using a 0.4 mm orifice diameter.

Keywords: Waterjet · Surgery · Skin · Incision

1 Introduction

Waterjet technology has been used in several applications such as industrial cutting, drilling and cleaning. Furthermore, waterjet technology can also be used in the medical field; applications include dentistry, wound cleaning and other surgical operations. Over the years, waterjet techniques have been developed into a revolutionary cutting tool in a variety of types of surgery [1]. It can be used in the precision cutting of skin for any type of surgery. The tool is simply moved in a line to apply the pressure and make the cut. The main advantage of waterjet incision is its precision; it is as effective as a laser cutter. However, the waterjet incision does not cause any thermal damage to the separated tissue, owing to its cooling ability. Additionally, the waterjet washes away blood, which eliminates the extra tools that would be required to do this in a conventional cut [2].
In vivo and in vitro experiments on patients and animals have been conducted with continuous waterjets at different low pressures. However, few studies have focused on the skin. Further analyses of the relationship among the operating parameters of the waterjet and the structure and mechanical properties of the skin must be conducted.

2 Literature Review

Waterjet technology is currently used for cutting a wide range of materials. The main advantages of this technology include the lack of a thermal effect on the material being cut. While waterjets are applied in all kinds of industries, only the medical field will be highlighted here. Table 1 summarizes some of the applications of waterjet cutting in the medical field. The performance of the waterjet machining process depends on the water pressure of the jet and the elastic properties of the skin. The initial impact is considered to be the highest; it occurs when the waterjet hits the tissue. After that, the water starts flowing radially and the impact of the jet decreases [4].

2.1 Waterjet in Surgical Wound Debridement

Waterjet technology can be used for surgical wound debridement and surgical interventions where selective cutting is necessary. Surgical wound debridement uses devices on the market such as VersaJet and Debritom, while surgical interventions use devices such as Jet Cutter 4, Helix HydroJet and ErbeJet2 [4]. A study in 2006 introduced the Versajet waterjet as an alternative to standard surgical excisional techniques for burn wounds. In the study, the Versajet waterjet was able to sufficiently debride superficial partial-thickness and mid-dermal partial-thickness wounds for the subsequent placement of Biobrane. Additionally, the study demonstrated that the Versajet waterjet has an advantage in the surgical treatment of superficial to mid-partial-thickness burns on the face, hand and foot [5].

Table 1. Overview of using waterjet in medicine [3]

Type of surgery | Operation description | Benefits
Orthopedic | Cutting endoprostheses and bone | Cutting stays below the critical temperature
Dental | Cutting and grinding of dental materials | Reduces the risk of jagged teeth and reduces the need for anesthesia
General | Resection of soft tissues (liver, gall bladder, brain, kidney, prostate), cleaning wounds | Blood vessels and nerve fibers remain intact when the defined pressure is maintained; minimal bleeding, intact edges and precise cuts, lack of necrotic edge, reduced duration of myocardial ischemia
Plastic | Cleaning skin grafts, removal of tattoos, liposuction | Separation of the layers of tissue, higher accuracy of results without edema and contour changes
Dermatology | Removing dead skin | Possibility of dosing medications directly in the water jet

Another study, conducted in 2007, reviewed the versatility of the Versajet waterjet surgical tool in treating deep and indeterminate-depth face and neck burns. With ex-vivo histologic analysis of the depth of debridement on human skin, the study confirmed that a predictable and controlled depth of debridement could be obtained by adjusting the apparatus settings [6].

2.2 The Use of Waterjet Incision in Other Surgical Procedures

Waterjet technology in surgical procedures was first reported in 1982 for liver resection.
Over the years, the waterjet machining process has become a recognized technique in different surgical areas. Clinically, the waterjet technique is used for cutting soft tissues such as liver tissue. Experimentally, it is used for dissecting spleen, kidney and brain tissue. While these tissues can be cut at low water pressures, waterjet techniques can also cut bone and bone cement at much higher water pressures [7].

Studies have been done using waterjet technology to drill or cut bone or bone cement. A study in 2014 showed that such a cut requires a water pressure ranging between 30 MPa and 50 MPa, depending on the diameter of the nozzle. The study also summarized the different materials that were tested in previous analyses, the waterjet pressure required to cut them, and the nozzle diameter (Table 2).

A comparison between the existing systems and the proposed algorithm is illustrated in Table 3. The methods proposed in this study will provide more flexible and robust solutions for setting up the waterjet apparatus when used in surgical procedures.

3 Mathematical Formulation

The operating parameters of the waterjet machining process are determined by several independent variables. Table 4 summarizes these variables based on four system components: process, skin, nozzle and pump characteristics [8]. Figure 1 describes how each parameter controls the incision characteristics and illustrates the incision processes.

Table 2. Overview of required waterjet pressures to cut bone and bone cement [7]

Material tested | D_nozzle (mm) | Required pressure (MPa)
Human calcanei | 0.6 | 30
Human femora | 0.3 | 40
Bone cement | | 40
Human femora | 0.2 | 50
Bone cement | | 30
Human interface tissue | 0.2 | 12
 | 0.6 | 10

Table 3. Features of previous works and proposed methods
(Columns: Authors | Year | Type of study | Method used | Apparatus | Water purity | Pressure | Depth of incision | Width of incision | Cutting velocity | Orifice diameter | Stand-off distance | Angle | Feed rate/traverse speed)

Arif [8] | 1997 | Skin incision | Finite element analysis | Theoretical | 100% water | Fixed | Generated | Generated | N/A | Fixed | N/A | N/A | N/A
Vichyavichien [9] | 1999 | Skin incision | Finite element analysis | Theoretical | 100% water | Fixed | Generated | Generated | N/A | Fixed | Fixed | Fixed | N/A
Wanner et al. [10] | 2002 | Fat tissue incision | Ex vivo | Commercial | 0.9% saline | Fixed | Generated | N/A | Fixed | Fixed | Fixed | Fixed | N/A
Rennekampff et al. [5] | 2006 | Debridement of burn wounds | Ex vivo | Commercial | Sterile saline | Fixed | N/A | N/A | Fixed | Fixed | N/A | Fixed | N/A
Cubison et al. [11] | 2006 | Debridement of burns | Ex vivo | Commercial | N/A | Fixed | N/A | N/A | Fixed | Fixed | N/A | N/A | N/A
Tenenhaus et al. [6] | 2007 | Wound debridement | Ex vivo | Commercial | N/A | Fixed | N/A | N/A | Fixed | Fixed | N/A | N/A | N/A
Keiner et al. [12] | 2010 | Brain tissue dissection | In vivo | Commercial | 0.9% saline | Fixed | N/A | N/A | N/A | Fixed | N/A | N/A | N/A
Kraaij et al. [7] | 2015 | Interface tissue incision | In vitro | Custom | 100% water | Fixed | Generated | N/A | Fixed | Fixed | Fixed | Fixed | Fixed
Bahls et al. [4] | 2017 | Various tissue incision or abrasion and removal | In vivo | Commercial | 10% gelatin | Fixed | N/A | N/A | Fixed | Fixed | Fixed | Fixed | N/A
Proposed | 2018 | Skin incision | Mathematical/simulation | Matlab & Minitab | 100% water | Generated | Variable | Variable | Generated | Generated | Variable | Fixed | Variable

3.1 Surgical Incisions Main Components: Operation Characteristics

The three main components of a surgical incision are the width of the incision, the length of the incision and the depth of the incision.
Before performing the incision, the surgical team must have these three factors defined. The width of the incision as well as its length is determined by the individual surgery and the recommended incision specifications. When performing a skin incision, the depth of incision is determined by the skin thickness. Epidermal thickness differs by age, sex, gender, skin type, pigmentation, blood content, smoking habits, body site, geographical location and many other variables. For these reasons, a system which can adapt to these differences must be created.

Table 4. Waterjet incision parameters

Process characteristics | Skin characteristics | Nozzle characteristics | Pump characteristics
Depth of cut | Thickness | Stand-off distance | Pressure ratio
Width of cut | Hardness | Orifice diameter | Flow rate
Traverse (feed) rate | Consistency | Nozzle structure | Pump efficiency
Waterjet flow rate | | | Power

Fig. 1. Waterjet parameters and its components.

To develop metrics for skin thickness, high-frequency ultrasound technology is necessary. By applying the ultrasound apparatus to the area to be operated on, the skin thickness can be measured instantly and fed into the system, which determines the water pressure required for the skin incision. Other skin characteristics can also be determined from the ultrasound results; these include the elastic modulus of each of the skin layers as well as their tensile strength.

The total energy required for the skin incision, which is converted to pressure energy, is formulated as follows:

$PE = UTS \cdot Q_s$   (1)

where $UTS$ is the Ultimate Tensile Strength of the skin and $Q_s$ is the flow rate at which the waterjet removes the skin, calculated as follows.

For skin cutting and debridement:

$Q_{s,cut} = D_s L_s f$   (2)

For skin drilling:

$Q_{s,drill} = D_s w_s v_s$   (2a)

$D_s$ is the depth of incision, $L_s$ is the length of incision, $f$ is the traverse speed (feed rate), $w_s$ is the width of cut and $v_s$ is the velocity of the waterjet stream at the skin.

3.2 Waterjet Operating Conditions: Catcher Characteristics

To minimize the process noise, a catcher is necessary. The kinetic energy of the catcher is the remaining energy that is not absorbed by the skin incision process; it is formulated as follows:

$KE_c = \tfrac{1}{2} Q_c v_c^2 \rho_w$   (3)

where $\rho_w$ is the density of water and $Q_c$ is the flow rate at which the residual water goes into the catcher; it is the sum of the flow rate of water out of the nozzle, $Q_n$, and the rate at which the waterjet removes the skin, $Q_s$. The velocity at which the excess water goes to the catcher, $v_c$, is

$v_c = \sqrt{2gx}$   (4)

where $g$ is the gravitational acceleration and $x$ is the standoff distance.

3.3 Waterjet Operating Conditions: Nozzle Characteristics

The kinetic energy of the waterjet stream coming out of the nozzle is the sum of the pressure energy required for the skin incision and the kinetic energy of the catcher:

$KE_n = PE + KE_c$   (5)

Looking at the nozzle characteristics of the waterjet incision, this kinetic energy (5) is also equal to

$KE_n = \tfrac{1}{2} Q_n v_n^2 \rho_w k_e$   (6)

where $v_n$ is the velocity of the waterjet stream coming out of the nozzle and $k_e$ is the loss coefficient.

The waterjet nozzle converts high-pressure water into a high-velocity jet. The performance of the waterjet incision is affected by several variables such as the nozzle orifice diameter, water pressure, incision feed rate and standoff distance. In the medical field, waterjet incision devices usually use low to medium pressure as well as a small nozzle design that differs from industrial waterjets.
A relationship between the velocity of the waterjet stream coming out of the nozzle ($v_n$) and the velocity of the waterjet stream at the skin ($v_s$) can be described as follows:

$v_n = v_s e^{ax}$   (7)

where $a$ is the taper index and $x$ is the standoff distance of the nozzle. Assuming a straight-taper waterjet nozzle design, the flow of water from the nozzle to the atmosphere is affected by the area and the shape of the orifice. Table 5 lists the different orifice types and the typical values of the contraction ($C_c$) and loss ($k_e$) coefficients for water orifices.

Table 5. Types of orifices and their coefficient values [13]

Orifice | Description | Cc | Ke
SE | Sharp-edged | 0.63 | 0.08
RE | Round-edged | 1.0 | 0.10
TSE | Tube with square-edged entrance | 1.0 | 0.51
TRE | Short tube with rounded entrance | 0.55 | 0.15

From (1) through (7), $Q_n$ and $v_n$ are calculated as follows.

For cutting and debridement:

$Q_n = \dfrac{2\,PE_{cut} + 2gx\rho_w D_s L_s f}{\rho_w v_n^2 - 2gx\rho_w}$   (8)

$v_n = \sqrt{\dfrac{2\,PE_{cut} + 2gx\rho_w D_s L_s f}{Q_n \rho_w} + 2gx}$   (9)

For drilling:

$Q_n = \dfrac{2\,PE_{drill} + 2gx\rho_w D_s w_s v_s}{\rho_w v_n^2 - 2gx\rho_w}$   (8a)

$v_n = \sqrt{\dfrac{2\,PE_{drill} + 2gx\rho_w D_s w_s v_s}{Q_n \rho_w} + 2gx}$   (9a)

The relationship between $Q_n$ and $v_n$ can also be represented by

$Q_n = C_c A_n v_n$   (10)

where $A_n$ is the area of the orifice of the nozzle,

$A_n = \dfrac{\pi d_n^2}{4}$   (11)

and $d_n$ is the orifice diameter of the nozzle.

3.4 Waterjet Operating Conditions: Pump and Intensifier Characteristics

The relationship between the velocity of the waterjet flow coming out of the pump reservoir and the velocity coming out of the nozzle is calculated as follows:

$v_r = v_n e^{2bL_n}$   (12)

where $L_n$ is the length of the nozzle and $b$ is the exponential constant, which is based on an exponential-taper waterjet nozzle design:

$b = \dfrac{\ln(d_n / d_o)}{L_n}$   (13)

where $d_o$ is the diameter of the top of the nozzle. The pressure ratio ($r_p$) between the water outlet pressure ($P_{w2}$) and the oil inlet pressure ($P_{o1}$), and equally between the oil inlet area ($A_o$) and the water inlet area ($A_w$), is described as follows:

$r_p = \dfrac{P_{w2}}{P_{o1}} = \dfrac{A_o}{A_w}$   (14)

The waterjet flow rate out of the intensifier ($Q_i$) is equal to the waterjet flow rate coming out of the nozzle ($Q_n$). By design, the hydraulic intensifier increases the pressure of the water. Thus, the water pressure coming out of the intensifier ($P_{w2}$) is determined by the power ($W$), the efficiency of the intensifier ($\eta_i$) and the flow rate ($Q_i$) as follows:

$P_{w2} = \dfrac{W \eta_i}{Q_i}$   (15)

4 Application Example and Results

In this example of a Caesarean section procedure, a Pfannenstiel transverse incision is assumed. This curved incision (length of incision $L_s$) is approximately 10–15 cm long and lies 2 cm above the pubic symphysis [9]. Using the waterjet, the skin and rectus sheath are opened transversely. The rectus muscles are not cut and the fascia is dissected along the rectus muscles. The skin thickness at the abdomen for a female is approximately 2.30 mm, while the subcutaneous adipose tissue thickness at the abdomen is approximately 15.7 mm [10]. The UTS of the skin at the abdomen ranges between 1 and 24 MPa [11]. The exact thickness of the skin and its characteristics would be measured using high-frequency ultrasound. The width of cut is 0.4 mm; in a traditional incision, a #10 (0.4 mm) blade is used [12, 13].
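To make the formulation concrete, the short Python sketch below evaluates the skin-side and catcher-side terms of equations (1), (2), (4), (10) and (11) for the Caesarean-section inputs just listed. It is a simplified illustration only: the nozzle velocity is treated as a given input rather than solved for through (8)–(15), and the round-edged orifice coefficient is an assumption, so this is not the authors' Matlab/Minitab simulation.

```python
# Simplified numerical sketch of Eqs. (1), (2), (4), (10), (11); not the paper's full solver.
import math

# Caesarean-section inputs from the application example (SI units).
UTS = 20e6          # ultimate tensile strength of abdominal skin, Pa
D_s = 18e-3         # depth of cut, m
L_s = 0.12          # length of cut, m
f   = 0.5e-3        # traverse (feed) rate, m/s
x   = 5e-3          # stand-off distance, m
d_n = 0.4e-3        # nozzle orifice diameter, m
rho_w = 1000.0      # density of water, kg/m^3
g   = 9.8           # gravitational acceleration, m/s^2
C_c = 1.0           # contraction coefficient, assuming a round-edged orifice (Table 5)

Q_s = D_s * L_s * f                 # Eq. (2): skin removal flow rate for cutting
PE  = UTS * Q_s                     # Eq. (1): energy demand converted to pressure energy
v_c = math.sqrt(2.0 * g * x)        # Eq. (4): velocity of excess water into the catcher
A_n = math.pi * d_n ** 2 / 4.0      # Eq. (11): orifice area

print(f"Q_s = {Q_s:.3e} m^3/s,  PE = {PE:.1f} W,  v_c = {v_c:.2f} m/s")

# With a nozzle velocity taken as given (e.g. the 151.05 m/s reported in the example),
# Eq. (10) gives the corresponding nozzle flow rate.
v_n = 151.05
Q_n = C_c * A_n * v_n               # Eq. (10)
print(f"A_n = {A_n:.3e} m^2,  Q_n = {Q_n:.3e} m^3/s")
```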
Table 6 summarizes the operation characteristics. The waterjet velocity coming out of the nozzle ($v_n$) is 151.05 m/s, while the waterjet velocity that reaches the skin ($v_s$) is 150.86 m/s. The velocity of the excess water going to the catcher is very minimal, at 0.31 m/s. The calculated power required for the intensifier is 423.52 W. Assuming the efficiency of the intensifier ($\eta_i$) is 80%, the calculated pressure required for the Caesarean section operation is 17.89 MPa with a 0.4 mm nozzle orifice diameter.

The results obtained from this study can be summarized as follows:

1. The mathematical formulation for the different incision processes has been developed and simulated for the best results.
2. Using the cutting incision, an application example has been demonstrated.
3. The data applied has been extracted from a real-life application.

Table 6. Caesarean section operation characteristics [14–19]

Parameter | Value
Depth of cut (Ds) | 18.00 mm
Length of cut (Ls) | 12.00 cm
Width of cut (ws) | 0.40 mm
Ultimate Tensile Strength (UTS) | 20.00 MPa
Density of water (ρ) | 1.00 g/cm³
Feed rate (f) | 0.50 mm/s
Gravity (g) | 9.80 m/s²
Stand-off distance (x) | 5.00 mm
Taper (a) | 0.25

5 Conclusion and Recommendations

Given any surgical operation characteristics, this mathematical model is able to calculate the optimal operating conditions for surgical cutting, debridement or drilling. This will help the surgeon pick the right nozzle size as well as the right waterjet instrument parameters such as pressure, power and velocity. The next step is to use the results of the study to create a comprehensive surgical procedure simulation model, such as a Caesarean section procedure or any other surgical procedure that is needed.

References

1. Areeratchakul, N.: Investigation of water jet based skin surgery (2002)
2. Yildirim, G.: Using water jet technology to perform skin surgery (2003)
3. Hreha, P., Hloch, S., Magurová, D., Valícek, J., Kozak, D., Harnicárová, M., Rakin, M.: Water jet technology used in medicine. Tech. Gaz. 17(2), 237–240 (2010)
4. Bahls, T., et al.: Extending the capability of using a waterjet in surgical interventions by the use of robotics. IEEE Trans. Biomed. Eng. 64(2), 284–294 (2017)
5. Rennekampff, H.-O., Schaller, H.-E., Wisser, D., Tenenhaus, M.: Debridement of burn wounds with a water jet surgical tool. Burns 32, 64–69 (2006)
6. Tenenhaus, M., Bhavsar, D., Rennekampff, H.-O.: Treatment of deep partial thickness and indeterminate depth facial burn wounds with water-jet debridement and a biosynthetic dressing. Inj. Int. J. Care Inj. 38, 538–544 (2007)
7. Kraaij, G., et al.: Waterjet cutting of periprosthetic interface tissue in loosened hip prostheses: an in vitro feasibility study. Med. Eng. Phys. 37(2), 245–250 (2015)
8. Arif, S.M.: Finite element analysis of skin injuries by water jet cutting. In: Mechanical and Industrial Engineering. New Jersey Institute of Technology, Newark (1997)
9. Vichyavichien, K.: Interventions of water jet technology on skin surgery (1999)
10. Wanner, M., Jacob, S., Schwarzl, F., Oberholzer, M., Pierer, G.: Optimizing the parameters for hydro-jet dissection in fatty tissue - a morphological ex vivo analysis. Eur. Surg. 34(2), 137–142 (2002)
11. Cubison, T.C.S., Pape, S.A., Jeffery, S.L.A.: Dermal preservation using the Versajet® hydrosurgery system for debridement of paediatric burns. Burns 32, 714–720 (2006)
12.
Keiner, D., et al.: Water jet dissection in neurosurgery: an update after 208 procedures with special reference to surgical technique and complications. Neurosurgery 67(2), 342–354 (2010)
13. Abdou, G.: Analysis of velocity control of waterjets for waterjet machining. In: Waterjet Cutting West. Society of Manufacturing Engineers, Los Angeles (1989)
14. Raghavan, R., Arya, P., Arya, P., China, S.: Abdominal incisions and sutures in obstetrics and gynaecology. Obstet. Gynaecol. 16, 13–18 (2014)
15. Akkus, O., Oguz, A., Uzunlulu, M., Kizilgul, M.: Evaluation of skin and subcutaneous adipose tissue thickness for optimal insulin injection. Diabetes Metab. 3(8) (2012)
16. Jansen, L.H., Rottier, P.B.: Some mechanical properties of human abdominal skin measured on excised strips. Dermatology 117(2), 65–83 (1958)
17. Ritter, J.: The modern-day C-section. Surg. Technol. 159–167
18. FST Homepage. https://www.finescience.com/en-US/Products/Scalpels-Blades/Scalpel-Blades-Handles/Scalpel-Blades-10. Accessed 8 Apr 2018
19. WardJet Homepage. https://wardjet.com/waterjet/university/precision-quality. Accessed 31 Mar 2018

Blockchain Revolution in the Healthcare Industry

Sergey Avdoshin and Elena Pesotskaya
National Research University Higher School of Economics, 20 Myasnitskaya ulitsa, 101000 Moscow, Russian Federation
{savdoshin,epesotskaya}@hse.ru

Abstract. The paper analyses the possibility of using blockchain technologies in the sphere of healthcare. Modern society requires new tools, e.g. distributed ledgers and smart contracts, for sharing data between patients, doctors and healthcare professionals by giving them control over the data and allowing smarter cooperation. In this situation, utilizing blockchain technology can resolve integrity, data privacy, security and fraud issues, increase patient health autonomy and provide access to better services. This paper provides a review of blockchain technology and research into possible applications in healthcare, and gives an overview of positive trends and outputs.

Keywords: Blockchain · Distributed ledger · Smart contracts · Healthcare · Patient · Security

1 Introduction

Blockchain is already disrupting many industries. Initially it was intended as a banking platform for digital currency, but blockchain now has applications that go beyond financial transactions, and its use is becoming popular in many industries. The idea of blockchain is to use a decentralized system that can replace banks and other trusted third parties. A blockchain is a large structured database distributed among independent participants of the system. This database stores an ever-growing ordered list of records (blocks). Each block contains a timestamp and a reference to the previous block. A block cannot be changed arbitrarily: each member of the network can see that a transaction has taken place in the blockchain, and it is possible to perform a transaction only with the appropriate access rights (a private key). Blocks are not stored on a single server; this distributed ledger is replicated on thousands of computers worldwide, so users interacting on the blockchain do not need any intermediaries. Blockchain technology can be shared by individuals, organizations, and even devices. It saves time, increases transparency, and gives the ability to make everything a tradable asset. The World Economic Forum predicts that by 2027, it would be possible to store nearly 10% of the global gross domestic product on blockchains [1].
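The hash-linked structure just described can be shown in a few lines of code. The following Python fragment is a toy illustration of blocks carrying a timestamp and the hash of the previous block; it deliberately omits consensus, signatures and networking, so it is only a sketch of the data structure, not a production ledger.

```python
# Toy illustration of a hash-linked chain of blocks (no consensus, signing or networking).
import hashlib
import json
import time

def make_block(data, previous_hash):
    """Create a block whose identity depends on its payload, timestamp and predecessor."""
    block = {"timestamp": time.time(), "data": data, "previous_hash": previous_hash}
    block["hash"] = hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()
    return block

# Genesis block, then each new block references the hash of the block before it.
chain = [make_block({"event": "genesis"}, previous_hash="0" * 64)]
chain.append(make_block({"event": "record added"}, previous_hash=chain[-1]["hash"]))
chain.append(make_block({"event": "record shared"}, previous_hash=chain[-1]["hash"]))

def is_consistent(chain):
    """Every block must store the hash of its predecessor."""
    return all(chain[i]["previous_hash"] == chain[i - 1]["hash"] for i in range(1, len(chain)))

print("chain consistent:", is_consistent(chain))

# Tampering with an earlier block breaks the link that later blocks store,
# even if the tampered block's own hash is recomputed.
chain[1]["data"]["event"] = "record altered"
chain[1]["hash"] = hashlib.sha256(json.dumps(
    {k: chain[1][k] for k in ("timestamp", "data", "previous_hash")},
    sort_keys=True).encode()).hexdigest()
print("after tampering:", is_consistent(chain))
```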
The potential of blockchain has already been realized by many people - authors who want to protect their research and share the knowledge at the same time, by car owners who want to share their car or use rental cars with no 3rd parties’ commission. Even for people who want to share music or even space on their hard drive, but want to © Springer Nature Switzerland AG 2019 K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 626–639, 2019. https://doi.org/10.1007/978-3-030-02686-8_47 feel secure and protected at the same time with no involvement of counterparties. Many industries are thinking about the great potential and possibilities of blockchain tech-nology, and the strong positive effect it can have on people’s health and the healthcare system. The cost of medicine in the world is constantly growing. According to the Global Health Care report, world health spending in the world’s major regions will increase from 2.4% to 7.5% between 2015 and 2020 and will reach $8.7 trillion by 2020 [2]. This is influenced by many factors, including the increase and aging of the population, economic growth in developing countries, and others. Let’s analyse the basic needs for healthcare services that every patient and doctor face and the associated risks: • Organizing visits to the best healthcare professionals, ?nding trusted and afford-able care providers. What we can see now is the fact that though the prices for medical services are increasing rapidly, it is still dif?cult to ?nd the appropriate specialist and treatment for a symptom or disease or it requires long waiting lists. Availability of medical services for patients, access to the best possible treatments and innovative services are very important in the healthcare industry. Patients need to be able to search for care providers in a snap - even abroad if needed - with information on where a speci?c treatment is done with great care and without delay or sometimes after-hours access to medical care. • Storage, management and control of access to patients’ data. Patients need instant data access (including CT, MRI, x-rays, echocardiograms, ultrasounds, etc.) from any place on their mobile device, iPad or PC. Such access has become possible due to the digital revolution and a development of mobile healthcare, but still there is a question how a person can be assured about personal data being secure. Also patients face potential risks of data mismanagement, access limitations to their patient records, and decentralization of all personal healthcare data. • Communication with your doctor and community on a real time basis, getting access to knowledge, trainings, healthcare plans, and advisory services. Lack of communication between experts in different ?elds and the impossibility of a quick consultation with several specialists from one area of medicine causes lower quality and negative patient experiences. Patients expect to consult with a specialist who has a long history of treating and healing patients with similar symptoms. What we see now is a lack of incentives and personalized information about preventive care: visits from one specialist to another to get a clear view on a disease, manually searching Google and hoping that eventually, someone can help. • Easy and transparent payments for the medical services. Many people will agree that it would be convenient to use a single medical insurance around the world. Today, this is hampered by dif?culties with insurer checking and slow payments through a long chain of intermediaries. 
Additionally, patients want to pay not for the fact of seeing a specialist, but for the result that they receive. Currently, in most cases payment takes place before admission, or money is written off regardless of the outcome. Patients often overpay for repeated tests in multiple medical institu-tions, or alternatively, undergo unnecessary examinations. Telemedicine or mobile medicine can solve some of the raised issues, which has a great potential to reduce the uncertainty of diagnoses, increase accessibility from Blockchain Revolution in the Healthcare Industry 627 remote areas, improve the quality and ef?ciency of treatment as well as the cost-effectiveness. But it still faces many challenges associated with international payments and 3rd party fees, centralization, patient security, integrity, and trust - factors related to different organizational entities. Using blockchain technology, patients and society can also eliminate the potential risks of data mismanagement, access limitations, delays in prognosis and human manipulation. The contribution of this paper is twofold. Firstly this paper explores the potential applications of blockchain in the Health Industry by examining the core requirements of the healthcare interested parties and society. Secondly, the analysis of the existing solutions and applications helps generalising the framework and approach for choosing the appropriate technology. This paper aims to provide a foundation for evaluating the effects of a blockchain technology on healthcare ecosystem. The main research question: What are the possibilities of using Blockchain in the Healthcare Industry? To approach the research question we describe applications of blockchain in the health industry, based on the customer needs (Sect. 3), followed by the research of the blockchain technology and solutions (Sect. 4). In the discussion, we present the examples of several ICO launches and healthcare blockchain startups in practice. 2 Technology Investment Trends Healthcare has the most aggressive deployment plans of any industry: 35% of respondents in that industry say their company plans to deploy blockchain into pro-duction within the next calendar year [3]. Many people will agree that it would be great to use the insurance all over the world, having instant access to best healthcare pro-fessionals. Currently there are many dif?culties connected with insurance: long pro-cedures and slow payments with participation of many involved intermediates, security and trust. The Global Health Journal [4] published a research of projects that implement a blockchain technology in healthcare. Currently there are over a thousand blockchain startups, various open source implementations. There are dozens of blockchain com-panies targeting healthcare applications. According to an IBM survey, which involved 200 healthcare executives across sixteen countries, approximately 16% admitted to taking a proactive approach in adopting a commercial blockchain solution in 2017 [5]. Blockchain startups seek investments through initial coin offering (ICO) with tokens sold to the public - the startup exchanges “utility” tokens for cash. The initiated tokens provide utility within the network, and tokens are traded on secondary exchanges. ICOs and token launches are a growing method of blockchain ?nancing and investors are proactively participating in such ICOs as there is no time to lose. 
Con-tracts can be signed remotely, and the pro?t from ICOs has been growing over recent years, with investors getting their money back even if the ICO does not work. Investors hope to turn a pro?t by buying early access to potentially foundational blockchain 628 S. Avdoshin and E. Pesotskaya protocols and applications, just as early investors into bitcoin and Ethereum did. For reference, a $100 investment into bitcoin on January 1, 2011 would now be worth nearly $1.5 M. Over 250 blockchain teams have completed ICOs since January 2016, with more than 55% of them raised during or after July 2017. Cumulatively (since January 2016), the number of ICOs should surpass the number of equity deals in October 2017, emphasizing the hype around the ?nancing mechanism [6]. Currently Robomed Network (https://robomed.io/) is launching an ICO in order to attract $30 mln for network deployment in Russia and all over the globe. The Robomed Network is aimed at dramatically changing the healthcare environment and ecosystem by applying a smart contract and a value-oriented approach to medical services. The Robomed Network connects healthcare service providers and patients based on a smart contract, the value criteria of which are the performance metrics of a speci?c medical service and patient satisfaction. Another international blockchain healthcare provider UBI (http://www.globalubi. com/index.aspx) can be used for applications that record data about customer health and automatically change the tariffs depending on the client’s behavior based on a smart contract and already announced an ICO date. 3 Potential Applications of Blockchain in the Health Industry 3.1 Blockchain for Electronic Medical Records In today’s digital age, technology is at the core of all business and personal aspects. The rapidly evolving Internet of Medical Things (loMT) has made it dif?cult for the existing health IT infrastructure and architecture to support it effectively. It is estimated that by 2020, the number of connected healthcare loT devices will be 20–30 billion, up from 4.5 billion in 2015 [7]. Many big companies see great potential in building the interface between healthcare and the mobile industry and creating ecosystems and using devices. There has been a notice able increase in the amount of data generated regarding the health and lifestyle of consumers due to the IoT enabling more medical device activity. Currently healthcare organizations store large amounts of sensitive patient information with no single approach to cybersecurity that raise certain concerns about interoperability, data privacy, and fraud. The EHR (Electronic Health Records) system is believed to be of great bene?t to the mobile health sector of the future. However, in practice, their implementation is com-plex and expensive, and adoption on a global scale is low. EHRs were never assumed to support multi-institutional, life time medical records, unlike PHR. The concept behind PHR (Personal Health Record) is that medical records are stored by a third party provider so that they can be accessible in whole or in part by healthcare professionals as and when needed. Mobile PHR systems represent the potential for signi?cant changes in how medical data are stored and used. PHRs also represent a change in the “ownership” of health information - from the medical institution, or health authority, to the indi-vidual, who is thereby empowered. Eventually, the argument goes, the “cure” is replaced by continuous monitoring before any cure is needed [8]. 
Blockchain Revolution in the Healthcare Industry 629 Certain dif?culties arise in the establishment of an up-to-date healthcare system in Russia as a number of barriers need to be broken down in order to ensure proper communication between different stakeholders – connecting providers, physicians, patients, clinics, government, etc. Patients nowadays have personal data distributed among clinics, hospitals, labs and insurance companies. This ecosystem does not work very well because there is no single list of all the places data can be found or the order in which it was entered. Many Russian doctors don’t want patients to access EHRs, being concerned by the fact that the patient can get access to his entire medical history, and can draw wrong conclusions regarding the state of their health. This means that patients take a passive role in managing and tracking their health, having a lack of control and ownership that makes them feel disappointed in their care. Those patients who don’t ?nd proper care are discontented and their faith in medical professionals disappears. This in turn deteriorates trust towards physicians, which is why less than half (*34%) of patients trust medical professionals compared to a 70%+ rate 50 years ago [9]. Concerns about the integrity and cybersecurity of patient data have always plagued the healthcare industry. In 2016 alone, around 450 data breaches were reported according to the Protenus Breach Barometer report. This impacted over 27 million patients. The breaches were mostly caused by insiders; human error or theft of data, amounted to 43% of the breaches, whereas the others were due to hacks, ransomware or malware [10]. A solution would be a record management system that can handle EHRs based on blockchain technology. It helps to guarantee data integrity and protect patient privacy by handling access rights to a particular pool of data and ensuring that personal data does not fall into the wrong hands. In blockchain personal data do not have to be placed somewhere: everything is stored on the client’s device, and only their con?rmation is stored in the blockchain system. Being decentralized, the technology of blockchain can ensure that data is stored securely in chronological order, in millions of servers and devices. This chronological chain of activity is shared—everyone participating on the network can maintain a complete activity history. Cryptography (encoding) is used to ensure that previously veri?ed data modi?cations are safe. The permissions for the data access also stored on the blockchain, and the patients’ data is only accessible by the party to whom access was granted, despite this data being hosted in a decentralized manner. Every modi?- cation of data is agreed to by the participants on a network according to the established rules and the data can be trusted without having to rely on a central authority like ?nancial organization or government. In blockchain technology patients are able to access securely and move their medical records between different healthcare organizations. Whenever required, the data from the various connected devices can be accessed instantly using the unique key assigned to the medical professionals. During the visit of a new patient the doctor can consult the system and other specialists, get all the necessary information on the state of the patient’s health, and plan appropriate treatment. 
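A minimal sketch of the access-rights idea described above might look like the following Python fragment. It is illustrative only: the record hashes, grant entries and participant names are hypothetical, and a real deployment would keep such entries on an actual replicated ledger with digital signatures rather than an in-memory list.

```python
# Illustrative sketch: patient-issued access grants and record fingerprints on an append-only log.
import hashlib
import time

ledger = []  # stand-in for an append-only, replicated ledger

def append_entry(entry):
    """Append an entry together with the hash of the previous entry (append-only log)."""
    previous_hash = ledger[-1]["hash"] if ledger else "0" * 64
    payload = f"{previous_hash}|{sorted(entry.items())}|{time.time()}"
    ledger.append({**entry, "previous_hash": previous_hash,
                   "hash": hashlib.sha256(payload.encode()).hexdigest()})

# The medical record itself stays off-chain; only its fingerprint is registered.
ehr_document = b"...encrypted EHR payload held by the patient/provider..."
append_entry({"type": "record", "patient": "patient-001",
              "record_hash": hashlib.sha256(ehr_document).hexdigest()})

# The patient grants a named clinician read access; a revocation would be another entry.
append_entry({"type": "grant", "patient": "patient-001",
              "grantee": "dr-smith", "scope": "read"})

def has_access(patient, grantee):
    """Check the latest grant/revoke entry for this patient-grantee pair."""
    decision = False
    for entry in ledger:
        if entry.get("patient") == patient and entry.get("grantee") == grantee:
            decision = entry["type"] == "grant"
    return decision

print("dr-smith may read patient-001's records:", has_access("patient-001", "dr-smith"))
```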
Such collaboration of patient and doctor reduces the need to rely on intermediaries, the amount of time wasted while waiting, and inconsistent treatment plans from different healthcare professionals. 630 S. Avdoshin and E. Pesotskaya All this improves patients trust and satisfaction. For this reason, blockchain technology has been referred to as a “trust machine” [11]. We can see a growth of decentralized health platforms with a portable, secure, and self-sovereign personal health record (PHR) built on blockchain technology and designed to drive healthy patient behavior through the security token. Usually a plat-form provides access to patient-controlled health records, including medication, diag-nosis, care plan, complex medical imaging, patient generated behavior data, key vital signs generated outside of the clinic including weight, blood pressure, sleep, stress levels, glucose, and more. The platforms pull information from electronic health record systems, as well as from all personal sources of patient-generated data including the web, mobile applications, and connected devices. Patients grant permissions for data access via smart contracts embedded in the blockchain, and executions performed via the application. The mobile app then allows users to create an individual pro?le through which they can review their health information, connect with care providers or even chat to patients with similar conditions. Platforms are designed to be fully compatible with existing EMR systems, and work like an API. Hospitals and health care providers usually are able to use the same equipment and technology with only a minor change to their backend. Among the most popular platforms we can distinguish MintHealth [https://www.minthealth.io/], HealthHeart [https://www.healthheart.io/], Patientory [http://www.patientory.com/], MedRec by Media Lab [https://medrec. media.mit.edu/] and many others. Doctors, health systems, health coaches, case man-agers, family, and friends can gain access to the data via social modules embedded in the applications that will serve to build awareness around the healthcare chronic conditions via a patient-centered community. Of course, only patients can specify who can access their health records. The advantages of using blockchain technologies apply to many participants within ecosystem: • “Medical history right in the pocket” and direct access to healthcare for Patients. Patients get instant access to health information and the medical community to learn more about treatment and therapy, get 24.47 advisory services, trainings, education and access to care plan information. The patient community and EHR can even be referenced in an emergency or when travelling abroad when quick access to medical records is needed. Also patients will be able to search for care providers in a snap - even abroad if needed - with information on where a speci?c treatment is done with great care and without a long wait. • “A data sharing platform for providing a personalised medicine” - for healthcare professionals. Doctors, health coaches and healthcare advisors get instant access to medical history information including complete notes from other medical organi-zations. They can interact with patients more ef?ciently being able to leverage a proven clinical tool with built-in automation. 
- complete view of their patients’ history, including out-of-network encounters, prescription ?lls, and lifestyle infor-mation, and can eliminate the administrative burden associated with medical record transfers. Doctors can reach relevant patients, build online reputation, and get access to the latest technological possibilities. Blockchain Revolution in the Healthcare Industry 631 • “Cost-saving” for Healthcare Organizations and Insurance companies. They save costs on data gaps by using improved standards of care, involving the patient in their care plan, providing medication reminders, appointment booking and tools to track personal health that have a positive impact and improve clinical outcomes. Having a more complete picture of a patient’s health condition, insurers and healthcare organizations can create individual healthcare plans based on personal-ized information and machine intelligence, saving costs, improving outcomes and increasing productivity of medical services. For example, if the client was at the doctor’s place, the system will only have a document stating that the medical examination took place, and the diagnosis and the medical history will remain with the user. If the customer’s data were veri?ed during the conclusion of the contract, he can send the con?rmed identi?cation data to other companies for the conclusion of new contracts without the need to re-pass the veri?- cation process. In addition to that, transparency and fairness of tariffs and processing of insured events can increase the client’s motivation and interest. 3.2 Blockchain for Tracking and Tracing Medical Fraud The identi?cation of healthcare fraud is another direction of application of the block-chain technology. This affects the concern of the patients that healthcare representatives and organizations used to falsify personal healthcare records and prescriptions. Regardless of whether your employer provides you with health insurance, or if you have taken out a policy for yourself, you can be at risk of fraud. This happens when a person takes advantage of a patient by either inserting into their EHR false diagnoses of medical conditions that are untrue, or by exaggerating the conditions that they do have. The intention is to submit for payment fraudulent insurance claims. Even if a person uses free medical care (which is common in Russia) with the funding coming from the healthcare tax imposed on all registered employers (over 3% of each employee’s income), this means the waste of a healthcare budget that can be allocated for more quality services, higher medical staff compensations, more afford-able care services, etc. Blockchain takes control over the customer healthcare record, tracks all changes, and protects against mistakes and data mismatch. Currently the workload for pharmacies, insurance companies, and doctors in ver-ifying the correctness of prescriptions and reducing fraud and coincidental mistakes is very high. Insurance companies more often than other ?nancial institutions suffer from fraud. Sometimes claims are denied because of incomplete or incorrect information. Blockchain allows one to check the customer and every particular case with minimal costs. Manipulation of claim assessments causes patients to suffer huge time delays and loss of claims due to incomplete or ‘mismanaged’ records. A blockchain that connects hospitals, physicians, lab vendors and insurers could enable a seamless flow of health information for improved underwriting and validating of claims. 
Among the bene?ts we can state the fact that insurance companies will need to spend less time checking data, that they can trust the data presented to them, not only from the access given to them by the patient, but also from the notes provided from the medical professional. The burden of patient losses will be reduced as well as the cost of 632 S. Avdoshin and E. Pesotskaya disputes, an insurance company will have become completely transparent and would be able to suggest a more personalized care plan based on accurate medical records. EHR fraud and operational mistakes are not the only reasons for using blockchain technology. Some participants can see the bene?ts to secure drug provenance, manage inventories and provide an auditable drug trail. Drug production and distribution involves many participants - manufacturers, distributors, wholesalers and pharmacies who want to know the true source of the drug and track distribution from the factory floor to the end user. A blockchain-based solution can help build such trust in healthcare products and their supply chain. Manufacturers can record drug batches as blockchain transactions tagged with a QR code revealing batch details. Records on a blockchain cannot be modi?ed, updates to records are stored on the blockchain by writing the updated version of the full record to the blockchain with all versions of the record available. The drug batch details are immutable once con?rmed on the block-chain. A single tracking identi?er is established via a QR code across the distribution chain. All downstream participants can trust a drug batch based on the scanned QR code and use the same data to track further distribution, they can buy or sell the drug post-veri?cation using the QR code returned by the blockchain. This greatly simpli?es and streamlines the distribution management that can pre-vent the drugs from falling into the wrong hands, authenticating the drug for the end consumer which greatly reduces the counterfeiting possibility, price manipulation and delivery of expired drugs [12]. Another advantage of using blockchain in this scenario includes the safety of the patient as spurious drugs cannot enter the distribution chain. The true source of the drug can be irrefutably proved as manufactured batches are recorded on a blockchain as a single source of truth available to all participants. Each participant in a blockchain can verify the drug before it is purchased and after it is received [13]. Within a few seconds, the blockchain technology will allow patients to check the drugs for authenticity learn the manufacturer and track the history of the movement through the delivery chain. 3.3 Blockchain for Arti?cial Intelligence Arti?cial Intelligence (AI) in the health sector uses algorithms and software to simulate human abilities in the analysis of complex medical data. A huge amount of medical data pushes the development of applications with AI, although it should be noted that AI has not yet reached the full potential for the healthcare industry, as this requires a large and diverse range of data to ensure accuracy and effective results. Blockchain technology allows creating a platform where patients can discuss their medical data with an advanced arti?cial intelligence “doctor”. This functionality might help healthcare providers and medical companies to provide services, which will allow their patients to have personalized (based on health data) AI-powered conversations about their health. 
Also, it will improve patient care and experience through an advanced natural dialogue system which will be able to generate insights from combined medical data [14]. With artificial intelligence, healthcare specialists and primary care physicians are able to diagnose a patient with a given symptom quickly, taking into consideration what treatment has worked in the past for similar diseases (leveraging all of the medical data, e.g. blood tests, MRI results, X-rays, echocardiograms, etc.) and how well it has worked. This principle can be applied to diagnosing illnesses as well. Whatever can be converted into alphanumeric data will be input into the AI neural network. This enables the system to be trained to assist medical professionals, helping them to diagnose conditions quickly and recommend treatment plans based on an individual's personal medical profile and their symptoms. An artificial intelligence platform can be launched on the blockchain that is able to predict and diagnose ailments based on a vast database of previous diagnostic histories and the results of medical examinations. Patients will be able to approve their data to be used for this, while doctors will be able to narrow down options quickly for diagnosis and treatment with the help of an intelligent platform with patient data from all around the globe, as the MediBond platform [15] announces its intention of doing. The more participants there are, the greater the value of the network. 3.4 Blockchain for Secure and Guaranteed Payments Blockchain technology helps to create an ecosystem through smart contracts and digital currency, so that all participants – patients, doctors, healthcare providers, researchers and medical institutions – are financially motivated and secured. In this context "Smart" means "without intermediaries" - e.g. banks, financial organizations, insurance companies or brokers. A smart contract is also "technically executed": without execution there is no payment. Smart contracts are written to execute given conditions, to eliminate the risk of relying on someone else to follow through on their commitments. This is particularly important for value-based healthcare, in which payments are tied to outcomes. For convenience, the agreement and the patient's signature can be digital. The patient pays for the medical services - visits, consultations, tests, etc. - with tokens (cryptocurrency). The distributed nature of blockchain technology makes it possible to accept payments and pay healthcare providers for their contribution globally. This mechanism avoids complicated legal and accounting procedures supported by assigned specialists who charge fees for their services. This method of payment makes it possible for any individual, no matter where they are in the world, to purchase services without the need to pay additional charges related to processing credit card transactions. The protection of patients' rights is assured without the need to involve additional third parties, such as expensive lawyers, or entities to ensure that the correct treatment has been prescribed. Once the conditions of the smart contract have been met, the payment will automatically be taken from the patient's account and be deposited into the service provider's account. Smart contracts offer several advantages: they are a reliable and transparent payout mechanism for the customer that enables automation of claims handling and can be used to enforce contract-specific terms.
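As a concrete illustration of such condition-gated payouts, the sketch below models the contract logic only (a hypothetical Python toy, not a deployable smart contract on any particular platform; the participant names, condition labels and token amounts are invented): tokens move from the patient's balance to the provider's exactly once, and only after every agreed condition has been confirmed.

from dataclasses import dataclass, field

@dataclass
class TreatmentContract:
    """Toy model of a condition-gated payout: tokens are released to the
    provider only once every agreed condition has been confirmed."""
    patient: str
    provider: str
    fee_tokens: int
    conditions: set                      # e.g. {"treatment_completed", "records_delivered"}
    confirmed: set = field(default_factory=set)
    settled: bool = False

    def confirm(self, condition: str) -> None:
        if condition in self.conditions:
            self.confirmed.add(condition)

    def settle(self, balances: dict) -> bool:
        """Execute the payout only if every condition has been confirmed."""
        if self.settled or self.confirmed != self.conditions:
            return False
        if balances[self.patient] < self.fee_tokens:
            return False
        balances[self.patient] -= self.fee_tokens
        balances[self.provider] += self.fee_tokens
        self.settled = True
        return True

balances = {"patient-42": 100, "clinic-7": 0}
contract = TreatmentContract("patient-42", "clinic-7", 30,
                             {"treatment_completed", "records_delivered"})
contract.confirm("treatment_completed")
assert not contract.settle(balances)        # one condition still missing
contract.confirm("records_delivered")
assert contract.settle(balances)            # payout executes exactly once
assert balances == {"patient-42": 70, "clinic-7": 30}

Publishing such a rule on a shared ledger means both sides can read it and neither can change it unilaterally once it has been agreed.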
It means that in the case of illness or an accident, a smart contract can ensure that the claim is only paid out if the patient recovers and received full treatment in the preferred hospital as prede?ned by the insurer. Although such programs could also be implemented without blockchain, but a blockchain-based smart contract platform could provide substantial network effects - an increased degree of transparency and credibility for customers due to decentralization. 634 S. Avdoshin and E. Pesotskaya Smart contracts offer a great bene?t to Insurance companies as their business depends directly on data that is available to the insurance specialist, and this data needs to be reliable and trustworthy. Insurance contracts are usually complicated and hard to understand for the majority of people, as they contain legal terminology. Smart con-tracts help to make the insurance industry more transparent and friendly to both current and potential clients. 3.5 Blockchain for Medical Research Blockchain technology enables research and discovery. With smart contracts, it becomes possible to reward healthcare content creators in proportion to how everyday visitors perceive their content (e.g. “likes” that get recorded). Moreover, rewards are an additional push for medical professionals to sign up to have a free mobile-friendly online pro?le. Healthcare companies can use the blockchain-based platform to reach potential clinical trial participants who ?t a certain medical history or care plan. The traditional amount of time and effort required to source such participants is greatly reduced, as well as the dependency on health systems to act as intermediaries. Additionally, the use of such blockchain-based systems facilitates longitudinal tracking of trial participants. This is of most importance though, this also reduces the risks and increases the ef?- ciency of these trials through means of participation that has been tailored to speci?c health or genomic pro?les. Sometimes medical researchers mine the network as the healthcare community (patients, doctors) release access to aggregate, anonymous medical data as transaction “fees” that become mining rewards. In some blockchain research platforms (e.g. MedRec) researchers can influence the metadata rewards that providers release by selectively choosing which transactions to mine and validate. Providers are then incentivized to match what researchers are willing to accept, within the boundaries of proper privacy preservation. Patients and providers can limit how much of their data is included in the available mining bounties. This approach helps engage participants in health research, facilitates collaboration, and fosters an environment of fast-paced learning, seeking better treatment options and cures for the patients, enables the creation of new communities of individuals who have a desire to connect with others that share a similar condition, learn about treatment options, share their experiences, and participate in research. For example, in the Bur-stIQ platform, individuals can browse the marketplace and make a request to participate in a research initiative or patient community. Additionally, individuals have the option to donate or sell their data to a research initiative or population data repository [16]. Among the advantages we can also mention a deep learning environment that con-tinuously expands the knowledge of an individual to improve relevance and impact. 
Researchers can find and access the people and data they need to support their research, and collaborate with other researchers to explore new ideas. They are able to connect directly with the right participants, reducing the cost and time-scale of both academic and commercial health research. 4 Blockchain Solutions Blockchain is a digital platform that stores and verifies the entire history of transactions between users across the network in a tamper-proof manner. Transactions between users or counterparties are broadcast across the network, verified by cryptographic algorithms, and grouped into blocks. At the moment, there are several competing protocols and a handful of other proprietary middleware and application development suites for each protocol. They differ in permissions, functionality, access rights and decision-making processes inside the network. The terminology around blockchain is still confusing. In different sources we can find different definitions of blockchain and different classifications. In this paper we will distinguish between public and private blockchains, as well as between permissionless blockchains and permissioned (exclusive) blockchains. Each public blockchain can be inspected by anyone, whereas private blockchains can only be inspected by computers that have been granted access rights. Some of the solutions use an approach that involves tracking data modifications on a private blockchain and recording hashes of these changes on a public blockchain. In this approach, the public blockchain effectively serves as a notary for data modifications by verifying that they occurred and at what time [17]. The majority of blockchain solutions were inspired by Bitcoin's (https://bitcoin.org/) original protocol, first described in 2008, which aimed to provide an alternative to the formal financial system, and made possible a blockchain data structure in which every modification of data on a network is recorded as part of a block of other data modifications that share the same timestamp. The Bitcoin blockchain is a public, permissionless network where participants are able to access the database, store a copy, and modify it by making their computing power available. Bitcoin, as a public network, offers an open, permissionless invitation for anyone to join. If the dominant requirement is a trust mechanism between strangers who know nothing about each other, then a public network may be the way to go. For digital or crypto-currencies, such as bitcoin, this acts as a catalyst for driving greater adoption globally, enabling more people to make purchases with these currencies [18]. The most notable non-Bitcoin public blockchain is Ethereum (https://www.ethereum.org/), which was created in 2014. Like Bitcoin, Ethereum is also permissionless, runs on a public peer-to-peer (P2P) network, utilizes a cryptocurrency ("ether"), and stores information in blocks. Compared to Bitcoin, which was solely designed to store information about transactions, Ethereum is a programmable blockchain that also allows users to deploy self-executing computer scripts and has much broader functionality. It provides a built-in programming language and an open-ended platform that allows users to create decentralized applications of unlimited variety.
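The block data structure just described - modifications grouped under a shared timestamp, with each block linked to its predecessor by a hash - can be sketched as follows. This is a deliberately simplified illustration: it omits the consensus, signatures and Merkle trees used by real networks such as Bitcoin and Ethereum, and the example modification strings are invented.

import hashlib
import json
import time

class Block:
    """A block groups the data modifications that share one timestamp and
    points to its predecessor through that block's hash."""
    def __init__(self, modifications, prev_hash):
        self.timestamp = time.time()
        self.modifications = list(modifications)
        self.prev_hash = prev_hash
        self.hash = self.compute_hash()

    def compute_hash(self):
        payload = json.dumps({"ts": self.timestamp,
                              "mods": self.modifications,
                              "prev": self.prev_hash}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

class Chain:
    def __init__(self):
        self.blocks = [Block(["genesis"], "0" * 64)]

    def add_block(self, modifications):
        self.blocks.append(Block(modifications, self.blocks[-1].hash))

    def is_valid(self):
        """Recompute every hash and check each link to the previous block."""
        prev_hash = "0" * 64
        for block in self.blocks:
            if block.prev_hash != prev_hash or block.hash != block.compute_hash():
                return False
            prev_hash = block.hash
        return True

chain = Chain()
chain.add_block(["record A updated", "access granted to provider X"])
chain.add_block(["prescription issued for patient P-001"])
print(chain.is_valid())                       # True
chain.blocks[1].modifications[0] = "forged"   # altering history...
print(chain.is_valid())                       # ...breaks the chain: False

Because every block commits to the hash of the previous one, rewriting any historical entry invalidates all later blocks unless the whole chain is recomputed, which is exactly what the consensus mechanism of a public network is designed to prevent.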
While distributing computing across a P2P network necessarily results in slower and more expensive computation than nor-mal, it also creates a database that is agreed to by consensus, available to all partici-pants simultaneously, and permanent, all of which are useful when trust is a primary concern. Bitcoin and Ethereum are both public, permissionless blockchains, which anyone with the appropriate technology can access and contribute to. Companies use these open-ended platforms to build their customized solutions. For instance, HealthHeart’s 636 S. Avdoshin and E. Pesotskaya platform (https://www.healthheart.io/) uses the Ethereum functionality for assigning unique addresses to patients, medical care providers, organizations, etc. and restricts access to a patient’s addresses and link them to the full history of transactions for a given identity on the blockchain, thus creating an audit trail for all events within a medical record. It supports reviews of past transactions by consumers, providers and third party entities that have been granted access, facilitates the connection between the consumer and the care provider. Public blockchains offer maximum transparency and its main goal is to prevent the concentration of power. However, many private ?rms are uncomfortable relying on public blockchains as a platform for their business operations due to concerns about privacy, governance, and performance. For instance, within the banking industry organizations prefer to transact only with trusted peers. For this reason IBM (https://www.ibm.com) has invested signi?cant resources into helping the Linux Foundation design an open-source modular blockchain platform called Hyperledger Fabric (https://www.hyperledger.org) which provides programmers with a “blockchain builders kit”, and allows them to tailor all elements of a ledger solution, including the choice of the consensus algorithm, whether and how to use smart contracts, and the level of permissions required. It is another permissioned network which provides collectively de?ned membership and access rights within a given business network. Fabric is designed for organizations that need to meet con?- dential obligations to each other without passing everything through a central authority and ensuring con?dentiality, scalability and security. Also a number of startups, including Ripple (https://ripple.com/) and the R3 Consortium (https://www.r3.com/), a group of more than 70 of the world’s largest ?nancial institutions that focuses on developing blockchain permissioned solutions for the industry, have developed platforms that run on private or permissioned networks on which only veri?ed parties can participate [19]. Consortium blockchains are usually open to the public but not all data is available to all participants, while private blockchains provide another type of permission and access rights to users. In private networks a central authority manages the rights to access or modify the database. The system can be easily incorporated within infor-mation systems and offers the added bene?t of an encrypted audit trail. In private blockchains, the network has no need to encourage miners to use their computing power to run the validation algorithms. 5 Conclusion Blockchain technology is gradually becoming very popular. The bene?ts of blockchain are enormous, from decentralization, to security and scalability, to privacy and affordability. 
Both health professionals and organizations will be able to work faster and more ef?ciently, relative to how accessible, safe and trustworthy the information available is. Professionals in the industry that are provided open access to this reliable information would be able to predict future trends, keep track of pharmaceutical inventories, amongst other things. As a result, the general population would have improved health and a higher quality of life. Blockchain Revolution in the Healthcare Industry 637 Still there are a huge barriers to blockchain adoption, such as regulatory issues (45%), followed by concerns over data privacy (26%) [20]. In the case of Russia – it also does not have the required regulatory base and needs to provide targeted government-backed funding with a speci?c focus on remote medical services and their integration into existing healthcare programs. A major issue with data processing lies in the fact that patient information is stored in different places, information is being lost or concealed through the fault of the patient or the doctor, while there are no personalized analytics. The regulatory concerns are linked to a decentralized infrastructure that can’t be controlled by any person or group. Also not everyone is approaching blockchain positively - there is an opinion that blockchain technology is relatively new, and its business advantages are unproven, it requires non-trivial computing infrastructure changes, though this is not completely accurate. There are many startups that have already proved the fact that blockchain technology has a positive effect on the cost of provided services, positively influences the delivery of care and the collaboration between different interested parties. Despite this, in order to maintain regulation compliant with global health standards, it is necessary to establish a consistent approach to compliance framework and implementation through standardized pro-cesses and interoperability. Not only standards need to be in place, but there also should be a level of con?dence and motivation from people before any organization can adopt new blockchain technology. For future work, the authors intend to improve this review paper with innovative research, enrich with more quantitative data. A framework for analysis of existing ICOs and solutions supported by a case study can be initiated. This framework would help to evaluate and predict the effects of different blockchain projects in healthcare. A set of criteria should be developed; the KPI measurement metrics and a validation model should be identi?ed to choose the most trusted provider by looking at the different perspectives in the framework. References 1. Espinel, V., Brynjolfsson, E., Annunziata, M.: Global Agenda Council on the Future of Software & Society. Deep Shift: Technology Tipping Points and Societal Impact. World Economic Forum Homepage. http://www3.weforum.org/docs/WEF_GAC15_Technological_ Tipping_Points_report_2015.pdf. Accessed 20 Jan 2018 2. 2017 global health care sector outlook. Deloitte Homepage. https://www2.deloitte.com/ content/dam/Deloitte/global/Documents/Life-Sciences-Health-Care/gx-lshc-2017-health-care- outlook-infographic.pdf. Accessed 20 Jan 2018 3. Schatsky, D., Piscini, E.: Deloitte survey: blockchain reaches beyond ?nancial services with some industries moving faster. Deloitte Homepage. 
https://www2.deloitte.com/us/en/pages/ about-deloitte/articles/press-releases/deloitte-survey-blockchain-reaches-beyond-?nancial-services- with-some-industries-moving-faster.html. Accessed 20 Jan 2018 4. Till, B., Peters, A., Afshar, S., Meara, J.: From blockchain technology to global health equity: can cryptocurrencies ?nance universal health coverage?. BMJ Global Health Homepage. http://gh.bmj.com/content/2/4/e000570. Accessed 20 Jan 2018 638 S. Avdoshin and E. Pesotskaya 5. Hogan, S., Fraser, H., Korsten, P., Pureswaran, V., Gopinath R.: Healthcare rallies for blockchain: keeping Patients at the center. IBM Corporation Homepage. https://www-01. ibm.com/common/ssi/cgi-bin/ssialias?html?d=GBE03790USEN&. Accessed 20 Jan 2018 6. Blockchain Investment Trends in Review. CBInsights Homepage. https://www.cbinsights. com/research/report/blockchain-trends-opportunities/. Accessed 20 Jan 2018 7. Internet of Medical Things, Forecast to 2021. Reportlinker Homepage. http://www. prnewswire.com/news-releases/internet-of-medical-things-forecast-to-2021-300474906.html . Accessed 20 Jan 2018 8. Avdoshin, S., Pesotskaya, E.: Mobile healthcare: perspectives in Russia. Bus. Inform. 3(37), 7–13 (2016) 9. Embrace Disruptive Medical Technologies. The Medical Futurist Homepage. http:// medicalfuturist.com/grand-challenges/disruptive-medical-technology/. Accessed 20 Jan 2018 10. Protenus Releases 2016 Healthcare Data Breach Report. HIPAA Journal Homepage. https:// www.hipaajournal.com/protenus-releases-2016-healthcare-data-breach-report-8656. Acces-sed 20 Jan 2018 11. Katz, D.: The Trust Machine. The Economist Homepage. https://www.economist.com/news/ leaders/21677198-technology-behind-bitcoin-could-transform-how-economy-works-trust-machine. Accessed 20 Jan 2018 12. Gilbert, D.: Blockchain Technology Could Help Solve $75 billion Counterfeit Drug Problem. International Business Times Homepage. http://www.ibtimes.com/blockchain-technology- could-help-solve-75-billion-counterfeit-drug-problem-2355984. Accessed 20 Jan 2018 13. Chowdhury, C., Krishnamurthy, R., Ranganathan, V.: Blockchain: A Catalyst for the Next Wave of Progress in Life Sciences. Cognizant Homepage. https://www.cognizant.com/ whitepapers/blockchain-a-catalyst-for-the-next-wave-of-progress-in-the-life-sciences-industry- codex2749.pdf. Accessed 20 Jan 2018 14. Vitaris, B.: The Next Doctor You Consult Could Be a Robot: Healthcare Meets AI and the Blockchain. Bitcoin Magazine Homepage. https://bitcoinmagazine.com/articles/next-doctor-you- consult-could-be-robot-healthcare-meets-ai-and-blockchain/. Accessed 20 Jan 2018 15. Steffens, B., Billot, J., Marques, A., Gawas, D., Harmalkar, O.: Facilitate health care on block chain. MediBond Homepage. https://medibond.io/doc/medibond_whitepaper.pdf. Accessed 20 Jan 2018 16. Ricotta, F., Jackson, B., Tyson, H., et al.: Bringing Health to Life. BurstIq Homepage. https://www.burstiq.com/wp-content/uploads/2017/09/BurstIQ-whitepaper_07Sep2017.pdf. Accessed 20 Jan 2018 17. Pisa, M., Juden, M.: Blockchain and Economic Development: Hype vs. Reality. Center for Global Development Homepage. https://www.cgdev.org/sites/default/?les/blockchain-and-economic- development-hype-vs-reality_0.pdf. Accessed 20 Jan 2018 18. Vaidyanathan, N.: Divided we fall, distributed we stand. The Association of Chartered Certi?ed Accountants (ACCA) Homepage. http://www.accaglobal.com/lk/en/technical-activities/ technical-resources-search/2017/april/divided-we-fall-distributed-we-stand.html. Accessed 20 Jan 2018 19. 
Adam-Kalfon, P., El Moutaouakil, S.: Blockchain, a catalyst for new approaches in insurance. PwC Homepage. https://www.pwc.com.au/publications/pwc-blockchain.pdf. Accessed 20 Jan 2018 20. Strachan, J.: Pharma Backs Blockchain. The Medicine Maker Homepage. https:// themedicinemaker.com/issues/0717/pharma-backs-blockchain/. Accessed 20 Jan 2018 Blockchain Revolution in the Healthcare Industry 639 Effective Reversible Data Hiding in Electrocardiogram Based on Fast Discrete Cosine Transform Ching-Yu Yang1,2(&) , Lian-Ta Cheng1,2 , and Wen-Fong Wang1,3 1 Department of Computer Science and Information Engineering, National Penghu University of Science and Technology, Magong, Penghu, Taiwan chingyu@gms.npu.edu.tw 2 National Penghu University of Science and Technology, Magong, Taiwan 3 National Yunlin University of Science and Technology, Douliu, Yunlin, Taiwan Abstract. Based on the fast discrete cosine transform (FDCT), the authors present an effective reversible data hiding method for electrocardiogram (ECG) signal. First, an input ECG data is transformed into a series of non-overlapping bundles by one-dimensional (1-D) FDCT. The FDCT bundles are subsequently attributed into two disjoint subsets according to a simple classi-?cation rule. Then, two pieces of data bits in different length are separately embedded in the selected coef?cients of the classi?ed bundles via the least signi?cant bit (LSB) technique. Simulations con?rmed that the hidden message can be extracted without distortion while the original ECG signal can be fully recovered. In addition, the perceived quality of the proposed method is good while the hiding capacity is superior to existing techniques. Since computational complexity is simple, the proposed method is feasible to be applied in real-time applications, or to be installed in the health care (or wearable) devices. Keywords: Data hidingReversible ECG steganography Fast discrete cosine transform (FDCT) LSB technique 1 Introduction With the maturity of arti?cial intelligence algorithms, the popularization of the Internet of Things, and the flexible use of big data, people and organizations can easily use the diversity services such as the World Wide Web, e-mail, e-commerce, online news, and social networking from the Internet. However, if the handling of important (or con?- dential) data does not properly conduct, it is possible for crucial resources to be compromised. Namely, the content of the message could be intercepted, eavesdropped, or forged by adversaries (or hackers) during transmission. One of an economical manner to protect (or secure) the information assets is the use of data hiding techniques. In general, data hiding can be divided into two categories: steganography and digital watermarking [1, 2]. The applications of both approaches are quite difference. The main aims of the steganographic methods [3, 4] are to conceal secret bits in host media © Springer Nature Switzerland AG 2019 K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 640–648, 2019. https://doi.org/10.1007/978-3-030-02686-8_48 while maintaining an acceptable perceptual quality, whereas the primary goals of digital watermarking [5, 6] try to achieve robustness with a limited hiding payload. To secure patients’ diagnoses such as blood pressure, blood glucose level and body temperature, as well as name, ID number, address and patient history and other sensitive information, some researchers have developed the data hiding methods in biometrics, such as electrocardiogram (ECG) or electromyography (EMG). 
However, most ECG steganography methods [7, 8] were incapable of restoring the original ECG signal after the extraction of the hidden message. As host biometric signals are valuable to hospitals and individuals, it is undesirable for the host data to be damaged after bit extraction. To completely recover the original hosts and successfully extract the hidden message at the receiver site, several authors have designed reversible ECG steganography methods to achieve this goal [9, 10]. Yang and Wang [9] presented two types of data hiding methods for ECG signals, namely lossy and reversible ECG steganography. To preserve the originality of the host ECG data, a reversible version of data hiding for ECG signals was proposed. By employing the mean-predicted technique and coefficient alignment, data bits were embedded in the predefined bundles of the host ECG. Simulations revealed that the hidden bits were extracted successfully while the original ECG signal could be restored completely. The average payload of the method was 44.07 Kb with a signal-to-noise ratio (SNR) of 34.78 dB. Based on the Hamming code and matrix coding techniques, Shiu et al. [10] suggested a reversible data hiding method for ECG and EMG signals. Simulations indicated that the hiding capacity of their method was larger than those of existing techniques, but the average SNR was only 17.99 dB. Since the perceived quality of the marked ECG signal was severely distorted, it is of no use for clinical diagnosis in medicine. In this article, we propose a simple but effective reversible ECG steganography method, which is capable of providing high hiding storage with good perceptual quality. The remainder of this paper is organized as follows. Section 2 specifies the procedure of bit embedding/extraction, plus overhead analysis and discussion. Section 3 presents the demonstrations of the proposed method, and Sect. 4 provides the conclusion.
2 Proposed Method
First, an ECG host is transformed into a series of non-overlapping bundles via the FDCT [11–13]. The FDCT bundles are subsequently attributed into two disjoint subsets according to a simple classification rule. Then, two pieces of data bits of different length are separately embedded in the target coefficients of the classified bundles. The details of bit embedding/extraction of the proposed method are specified in the following sections.
2.1 Bit Embedding
Let A_j be the jth bundle of size 1 × n derived from a host ECG, and let H_j = {s_{ji}}_{i=0}^{n-1} be the corresponding non-overlapping bundle of 1-D FDCT coefficients, obtained by performing the FDCT on A_j with n = 8, as shown in Fig. 1. The FDCT bundles are represented by I = {H_j | j = 1, 2, ..., |I|} with H_j = ⌊10 · A_j X⌋, where X is a predetermined 8 × 8 matrix, as shown in (1). [Note that, to ensure that reversible ECG steganography can be achieved, the values of s_{ji} in H_j are obtained by applying a floor function to the product of 10 and A_j X.]

X = \begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
3/2 & 5/4 & 3/4 & 3/8 & -3/8 & -3/4 & -5/4 & -3/2 \\
1 & 1/2 & -1/2 & -1 & -1 & -1/2 & 1/2 & 1 \\
5/4 & -3/8 & -3/2 & -3/4 & 3/4 & 3/2 & 3/8 & -5/4 \\
1 & -1 & -1 & 1 & 1 & -1 & -1 & 1 \\
3/4 & -3/2 & 3/8 & 5/4 & -5/4 & -3/8 & 3/2 & -3/4 \\
1/2 & -1 & 1 & -1/2 & -1/2 & 1 & -1 & 1/2 \\
3/8 & -3/4 & 5/4 & -3/2 & 3/2 & -5/4 & 3/4 & -3/8
\end{bmatrix}^{-1}.   (1)

The main procedure of bit embedding of the proposed method is specified in the following algorithm.
Algorithm 1. Hiding a secret message in an ECG host.
Input: Host ECG data E, scrambled secret message W, and control parameter µ.
Output: Marked ECG data Ẽ and bitmap Ω.
Method:
Step 0. Perform the forward FDCT on E to obtain the 1-D FDCT bundles I.
Step 1. Input a bundle H_j from I. If the end of input is encountered, then proceed to Step 5.
Step 2. Compute the average T of the absolute values of the coefficients of H_j; if T ≤ µ, then mark this bundle with bit "0", otherwise mark it with bit "1", and save the mark in the bitmap Ω.
Step 3. If the bundle is marked with bit "0", then take three (and two) data bits from W each time and embed them in the coefficients {s_{ji}}_{i=0}^{3} (and {s_{ji}}_{i=4}^{n-2}) by the LSB technique, respectively, and return to Step 1.
Step 4. If the bundle is marked with bit "1", then take two data bits from W each time and embed them in the coefficients {s_{ji}}_{i=0}^{3} by the LSB technique, and return to Step 1.
Step 5. Perform the inverse FDCT on the marked bundles to form the marked ECG data Ẽ.
Step 6. Stop.
Fig. 1. Bundle of size 8.
To alleviate distortion and obtain better hiding capability during the encoding phase, two pieces of data bits of different length are separately employed at Steps 3–4. Namely, each time there are (3 × 4) + (2 × 3) = 18 and 2 × 4 = 8 bits embedded in the two classified bundles, respectively.
2.2 Bit Extraction
The decoding part of the proposed method is summarized here.
Algorithm 2. Extracting the hidden message from the marked ECG data and restoring the original ECG host.
Input: Marked ECG data Ẽ, the control parameter µ, and the bitmap Ω.
Output: A secret message W and host ECG data E.
Method:
Step 0. Perform the forward FDCT on Ẽ to obtain the 1-D FDCT bundles Î, and read in the bitmap Ω.
Step 1. Input a bundle Ĥ_j derived from Î. If the end of input is encountered, then proceed to Step 4.
Step 2. If the bundle is marked with bit "0", then extract eighteen hidden bits from the coefficients {s_{ji}}_{i=0}^{n-2}, restore the host bundle, and go to Step 1.
Step 3. If the bundle is marked with bit "1", then extract eight hidden bits from the coefficients {s_{ji}}_{i=0}^{3}, restore the host bundle, and go to Step 1.
Step 4. Descramble and assemble all extracted bits, and perform the inverse FDCT on the restored bundles to recover the original ECG data E. (Notice that the marked ECG data Ẽ was obtained by conducting Algorithm 1.)
Step 5. Stop.
2.3 Overhead Analysis and Discussion
From Algorithm 1 we can see that it requires one bit to record the attribute of each FDCT bundle in the bitmap Ω. The auxiliary information (O_h) of the proposed method is therefore O_h = |I|. For example, if the size of an input host ECG is 30,000 and the size of a bundle is set to 8, then the overhead of the proposed method is O_h = 30,000/8 = 3,750 bits. Notice as well that the overflow issue can be avoided during the encoding process. In general, the value of the coefficient s_{j(n-1)} is often significantly larger than those of the remaining coefficients s_{j0}, ..., s_{j(n-2)} of H_j after the FDCT operation. The role of the coefficient s_{j(n-1)} is similar to that of the DC coefficient in the conventional DCT domain. In other words, if data bits were embedded in this coefficient, severe distortion would be introduced during the process of encoding. Therefore, the proposed method embeds secret bits only in the remaining coefficients s_{j0}, ..., s_{j(n-2)} of H_j.
3 Experimental Results
The simulations of the proposed method were implemented in the Matlab (R2015b) programming language on a Microsoft Windows 10 laptop with an Intel Core(TM) i5-6300U 2.4 GHz CPU and 8 GB RAM. The host ECG signals were derived from the MIT-BIH arrhythmia database [14]. Several host ECG data sets were utilized in our experiments. The size of each test set was 30,000. The average execution time of the proposed method was 0.125 s.
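Before turning to the results, the classification and LSB bookkeeping of Algorithms 1 and 2 can be illustrated with a short sketch. The Python fragment below is a toy under stated assumptions: it starts from already-transformed integer coefficients (the integer FDCT of Eq. (1) and the exact restoration of the original coefficients, which give the method its reversibility, are not reproduced), it uses µ = 9 as in the experiments, and the coefficient values and message bits are invented.

import numpy as np

MU = 9          # classification threshold (the paper's control parameter mu)
N = 8           # bundle length

def embed_bundle(coeffs, bits, pos):
    """Embed message bits into the LSBs of one bundle's integer coefficients,
    following the two-class rule of Algorithm 1. Returns the marked bundle,
    the class flag for the bitmap, and the new position in the bit string."""
    c = coeffs.copy()
    smooth = bool(np.mean(np.abs(c)) <= MU)     # Step 2: classify the bundle
    # (bits per coefficient, coefficient indices); s_{j,7} is left untouched
    plan = [(3, range(0, 4)), (2, range(4, N - 1))] if smooth else [(2, range(0, 4))]
    for width, idxs in plan:
        for i in idxs:
            chunk = bits[pos:pos + width]
            if len(chunk) < width:
                return c, smooth, pos            # message exhausted
            c[i] = (int(c[i]) & ~((1 << width) - 1)) | int(chunk, 2)  # LSB substitution
            pos += width
    return c, smooth, pos

def extract_bundle(coeffs, smooth):
    """Read the hidden bits back, using the bitmap entry for the bundle."""
    plan = [(3, range(0, 4)), (2, range(4, N - 1))] if smooth else [(2, range(0, 4))]
    out = []
    for width, idxs in plan:
        for i in idxs:
            out.append(format(int(coeffs[i]) & ((1 << width) - 1), f"0{width}b"))
    return "".join(out)

bundle = np.array([4, -3, 7, 2, -1, 5, 0, 21])   # toy integer coefficients
message = "110010101110100101"                   # 18 bits fit a "smooth" bundle
marked, smooth, used = embed_bundle(bundle, message, 0)
assert extract_bundle(marked, smooth) == message[:used]

A "smooth" bundle carries 18 bits and a "steep" one 8 bits, which is why hosts with fewer drastic variations reach a higher net payload in the experiments below.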
The relationship between the average SNR/PRD and the net payload of the proposed method for various values of the mean threshold (µ) is drawn in Fig. 2. It can be seen that the lower the value of µ, the larger the SNR value and the smaller the hiding capacity, and vice versa. In our proposed method, to achieve the desired net payload, SNR value, and perceived quality, the value of µ was set to 9. Table 1 indicates the net payload, SNR, and PRD of the proposed method using µ = 9. The average SNR/PRD of the proposed method is 40.74 dB/0.0093 with an average net payload of 45.80 Kb. In addition, the relationship between the average SNR and the net payload of the proposed method using five different inputs with various µ is depicted in Fig. 3. From the figure we can see that ECG100 has the best performance among all the input data. The hiding performance of ECG102 takes second place, followed by ECG101, ECG103, and ECG104. One of the main reasons ECG104 ranks in last place is that it contains more steep areas (or drastic variations) than smooth ones, meaning that the corresponding coefficients in the FDCT bundles are often larger than µ, so fewer data bits can be embedded in ECG104. The SNR and PRD are defined as follows:

SNR = 10 \log_{10} \frac{\sum_i s_i^2}{\sum_i (s_i - \hat{s}_i)^2}   (2)

and

PRD = \sqrt{\frac{\sum_i (s_i - \hat{s}_i)^2}{\sum_i s_i^2}},   (3)

where s_i and \hat{s}_i are the data in the original ECG and the marked ECG divided by 10, respectively. Generally speaking, the larger the value of SNR, the smaller the PRD, and the better the perceived quality that can be obtained. Close observations of the host and the marked ECGs, namely ECG100, ECG101, ECG111, and ECG220 (the first 5 s), are drawn in Fig. 4. The resultant SNR and net payload are also depicted in the figures. It is clear that the perceived quality is not bad: no apparent distortion exists in the marked ECGs. As described previously, the smaller the proportion of steep areas (or drastic variations), the better the hiding capability of the proposed method. From Fig. 4 we can see that ECG100 (in Fig. 4a) provides the best hiding capability, whereas ECG111 provides the least hiding storage (in Fig. 4c). A performance comparison between our method and existing techniques [9, 10] is listed in Table 2. It is obvious that the average SNR of our method is much larger than that of Yang and Wang's technique [9] when the average net payload is around 44 Kb. Although the hiding storage of Shiu et al.'s approach [10] is the largest among the compared methods, their resultant SNR is not good. Since a low SNR implies a poor perceived quality of the marked ECG signals, it is not feasible for medical staff to use them in the diagnosis of patients.
Fig. 2. The relationship between the average SNR/PRD and net payload of the proposed method with various µ: (a) average SNR vs. net payload and (b) average PRD vs. net payload.
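The two quality measures in (2) and (3) translate directly into code; the following minimal Python helpers compute them for an original and a marked signal (the sample values are made up purely for the check).

import numpy as np

def snr_db(original, marked):
    """SNR of Eq. (2): 10*log10( sum(s_i^2) / sum((s_i - s_hat_i)^2) )."""
    s, s_hat = np.asarray(original, float), np.asarray(marked, float)
    return 10.0 * np.log10(np.sum(s ** 2) / np.sum((s - s_hat) ** 2))

def prd(original, marked):
    """PRD of Eq. (3): sqrt( sum((s_i - s_hat_i)^2) / sum(s_i^2) )."""
    s, s_hat = np.asarray(original, float), np.asarray(marked, float)
    return np.sqrt(np.sum((s - s_hat) ** 2) / np.sum(s ** 2))

# toy check on made-up samples (already divided by 10, as in the paper)
s = np.array([100.3, 101.1, 99.8, 98.2])
s_hat = s + np.array([0.01, -0.02, 0.015, -0.01])
print(round(snr_db(s, s_hat), 2), round(prd(s, s_hat), 4))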
Table 1. Net payload, SNR, and PRD performance of the proposed method using µ = 9
ECG data   Net payload (bits)   SNR (dB)   PRD
ecg100     54,580   41.12   0.0088
ecg101     46,220   41.31   0.0086
ecg102     50,440   41.22   0.0087
ecg103     46,560   39.77   0.0102
ecg104     44,830   39.59   0.0105
ecg111     42,220   43.15   0.0070
ecg112     45,960   42.03   0.0079
ecg113     44,240   38.98   0.0112
ecg114     47,990   42.25   0.0077
ecg115     51,130   38.63   0.0117
ecg121     51,870   42.03   0.0079
ecg220     45,170   37.60   0.0132
ecg221     42,740   41.40   0.0085
ecg222     49,380   42.15   0.0078
ecg223     46,740   40.59   0.0093
ecg230     43,370   40.08   0.0099
ecg231     43,930   40.70   0.0092
Average    46,904   40.74   0.0093
Fig. 3. The relationship between the SNR and net payload of the proposed method with various host ECGs.
Fig. 4. Close observation of the host and the marked ECGs: (a) ECG100, (b) ECG101, (c) ECG111, and (d) ECG220.
Table 2. Net payload/SNR comparison with existing techniques (entries are net payload (bits)/SNR (dB))
ECG data   Yang and Wang (a) [9]   Shiu et al. (b) [10]   Our method
100        45,567/36.89   68,270/19.69   54,580/41.12
121        47,029/37.93   68,270/18.26   51,870/42.03
122        44,683/31.52   68,270/18.61   37,570/40.59
205        44,343/36.09   68,270/17.82   51,140/41.97
207        44,853/37.10   68,270/15.56   44,590/43.38
220        44,921/31.65   N/A            45,170/37.60
230        44,530/32.30   N/A            43,370/40.08
Average    45,132/34.78   68,270/17.99   45,497/39.83
(a) With the reversible version using bundle size = 1. (b) With the (1023, 1013)-Hamming code.
4 Conclusion
In this study, based on a smart processing of the FDCT coefficients, we proposed an effective reversible data hiding method for ECG signals. First, a simple classification rule is applied to the host bundles. Then, two pieces of data bits of different length are separately embedded in the target coefficients of the classified bundles via the LSB technique. Simulations confirmed that the hidden message can be extracted without distortion and that the original ECG signal is completely recovered at the receiver site. In addition, the hiding capacity and SNR/PRD of the proposed method outperform those of existing techniques. Since the processing time of encoding/decoding is short, our method is suitable for implementation in real-time applications, or for deployment in a (mobile) health care device for ECG signal measurement.
References
1. Phadikar, A.: Data Hiding Techniques and Applications Specific Designs. LAP LAMBERT Academic Publishing, Saarbrucken (2012)
2. Zielinska, E., Mazurczyk, W., Szczypiorski, K.: Trends in steganography. Commun. ACM 57, 86–95 (2014)
3. Yang, C.Y., Wang, W.F.: High-capacity steganographic method for color images using progressive pixel-alignment. J. Inf. Hiding Multimed. Signal Process. 6, 815–823 (2015)
4. Li, B., Wang, M., Li, X., Tan, S., Huang, J.: A strategy of clustering modification directions in spatial image steganography. IEEE Trans. Inf. Forensics Secur. 10, 1905–1917 (2015)
5. Hsiao, C.Y., Tsai, M.F., Yang, C.Y.: High-capacity robust watermarking approach for protecting ownership right. In: The 12th International Conference on Intelligent Information Hiding and Multimedia Signal Processing, November 21–23, Kaohsiung, Taiwan (2016)
6. Liu, S., Pan, Z., Song, H.: Digital image watermarking method based on DCT and fractal encoding. IET Image Process. 11, 815–821 (2017)
7. Ibaida, A., Khalil, I.: Wavelet-based ECG steganography for protecting patient confidential information in point-of-care systems. IEEE Trans. Biomed. Eng. 60, 3322–3330 (2013)
8.
Chen, S.T., Guo, Y.J., Huang, H.N., Kung, W.M., Tseng, K.K., Tu, S.Y.: Hiding patients con?dential data in the ECG signal via a transform-domain quantization scheme. J. Med. Syst. 38 (2014). doi: 10.1007/s10916-014-0054-9 9. Yang, C.Y., Wang, W.F.: Effective electrocardiogram steganography based on coef?cient alignment. J. Med. Syst. 40 (2016). doi: 10.1007/s10916-015-0426-9 10. Shiu, H.J., Lin, B.S., Huang, C.H., Chiang, P.Y., Lei, C.L.: Preserving privacy of online digital physiological signals using blind and reversible steganography. Comput. Methods Programs Biomed. 151, 159–170 (2017) 11. Chen, W.H., Smith, C.H., Fralick, S.C.: A fast computational algorithm for the discrete cosine transform. IEEE Trans. Commun. COM-25, 1004–1009 (1977) 12. Feig, E., Winograd, S.: Fast algorithm for the discrete cosine transform. IEEE Trans. Signal Process. 40, 2174–2193 (1992) 13. Liang, J., Tran, T.D.: Fast multiplierless approximations of the DCT with the lifting scheme. IEEE Trans. Signal Process. 49, 3032–3044 (2001) 14. Moody, G.B., Mark, R.G.: The impact of the MIT-BIH arrhythmia database. IEEE Eng. Med. Biol. Mag. 20, 45–50 (2001) 648 C.-Y. Yang et al. Semantic-Based Resume Screening System Yu Hou(?) and Lixin Tao Pace University, New York City, NY 10038, USA {yh50276p,ltao}@pace.edu Abstract. At present, XML becomes one of the best choices for storing semi-structured electronic resumes. Most of the companies let the candidates ?ll out their resumes online on the company’s website and store these electronic resumes uniformly. This paper assumes that all candidates’ electronic resumes will be saved in the form of XML, and proposed a Semantic-based Resume Screening System (RSS). The RSS could improve the accuracy and e?ciency in the hiring process by using the Ontology Knowledge Base and the Pace XML Validator. Keywords: Knowledge representation · Web Ontology Language (OWL) · XML Integrated syntax and semantic validation 1 Introduction 1.1 A Subsection Sample Due to the low coverage, poor e?ciency and high cost, the traditional o?ine recruitment mode has been replaced by the internet recruitment mode since the last few decades. The top companies may receive a large number of electronic resumes daily. Therefore, it is challenging for recruiters to store and screen the resumes which are semi-structured. Nowadays, the most popular model is applicants ?ll out their resumes online on the company’s website, which facilitates the uniform store and management of electronic resumes. Since XML has appeared, it becomes the best choice for storing electronic resumes. At present, most of the companies are challenged by screening those semi-structured resumes. It is a heavy work to screen the ideal candidate accurately and e?ciently from a large number of resumes. Manual screening is not only time-consuming, but also has a strong subjectivity. It is di?cult to be guaranteed that the companies can ?nd the ideal candidates from the large-scale resume objectively and e?ciently. The traditional and the most common solution is keyword search, for example, if the HR want to search candidates who graduate from Pace University, then he or she needs to use ‘Pace University’ as the keyword to search in candidates’ resumes. However, this method cannot meet the most HRs’ requirements very well, because some companies HR usually use keyword tags for expressing their certain demands, such as ‘candidate who has work experience in the Fortune Global 500 companies’, ‘candidate who is graduated from the Lvy League’. 
The traditional keyword search just can screen the resumes which include the speci?c name such as ‘Google’, ‘Facebook’, ‘Pace © Springer Nature Switzerland AG 2019 K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 649–658, 2019. https://doi.org/10.1007/978-3-030-02686-8_49 University’ and so on, but it cannot screen resumes by using ‘Fortune Global 500 companies’. This paper proposed a Semantic-based Resume Screening System (RSS), which is introduced an Ontology Knowledge Base. The RSS could improve the prescre- ening of the hiring process by matching the Ontology Knowledge Base. Since 2013 Pace University has developed Pace XML Validator [1], this validator greatly improves the e?ciency of the XML ?le’s veri?cation with the features of reus- able, integrated syntax and semantic validation. Because of the advantages of XML in the storage and retrieval of electronic resumes, this paper assumes that the electronic resumes need to be ?ltered that is ?lled out by the candidates on the companies’ website and stored as XML documents, and using the Pace XML Validator to set the constraints to represent the screening requirements. Therefore, the validated XML documents meet the screening requirements. Conversely, an XML document that failed the validation means it does not meet the screening requirements. This paper o?ers a proposed approach to help the RSS understand the user’s real intention more accurately. The HR can achieve the ideal candidates from the optimized result. The main contribution of the RSS is to enhance the accuracy and e?ciency of the electronic resumes screening process. In the following of this paper, the related work of RSS system will be introduced in Sect. 2, and the approach on how to create a knowl- edge base will be discussed in Sect. 3. The details of the system framework and imple- mentation will be illustrated in Sect. 4. Finally, we will make a conclusion of this project. 2 Related Work 2.1 Ontology-Based Knowledge Representation Knowledge Representation is the ?eld of study concerned with using formal symbols to represent a collection of propositions believed by some putative agent [2]. In a general sense, knowledge representation is a set of conventions for describing the world and it is the symbolization, formalization, or modeling of knowledge. From the perspective of computer science, knowledge representation is a general method to study the feasibility and validity of computer to represent knowledge. It is a strategy of representing human knowledge as a data structure and a system control structure of machine processing. More speci?cally, the knowledge can be de?ned as understanding, facts, information and description for some real or imaginary entity. In other words, in the ?eld of computer science, Knowledge Representation means that let the machines can understand. At present, the research on knowledge representation and organization method is mainly composed of frame expression, generative expression, object-oriented expression and ontology-based expression. Ontology-based knowledge representation is getting more and more attention. The concept of ontology originated in the ?eld of philosophy, which is de?ned as ‘a systematic description of the objective existence in the world’ is a systematic explanation or explanation of objective existence and concerns the abstract essence of objective reality [3]. With the development of arti?cial intelligence, Knowl- edge Representation was given a new de?nition in the ?elds of AI and computer science. 
Ontology is an integration tool for application and domain knowledge. It is a collection of concepts in a certain domain and relations among concepts, and the relationship 650 Y. Hou and L. Tao re?ects the constraints and connections among concepts. Ontology-based knowledge representation can ensure the consistency and uniqueness of knowledge sharing in the process of sharing, and can fully express the complex semantic relations between knowledge. Therefore, ontology can solve a large number of knowledge exchange and disordered sharing situation to maximize the sharing and reuse of knowledge. The use of ontology formal knowledge representation can easily access knowledge semantic information. Speci?cally, ontologies emphasize the relationships between entities and express. It can re?ect these relationships through a variety of knowledge representation elements. These elements are also called meta toms that includes concept, attributes, relations, functions, axioms and instance. Therefore, ontology has been widely used in many ?elds. 2.2 Pace Schematron XML Validator Extensible Markup Language (XML) is a markup language that de?nes a set of rules for encoding documents in a format that is both human-readable and machine-readable. The XML can be used to mark data, de?ne data types, and it is a source language that allows users to de?ne their own markup language. The main features of the XML are: (1) Convenient extensibility. XML allows organizations or individuals to create a collection of tags that suit their own needs, and these collections of tags can quickly get used to the Internet. (2) Strong structure. The logical structure of XML document data is a tree-like hierarchy. Each element in a document can be mapped to an object, and corresponding attributes and methods are also available. Therefore, it is suitable for the use of object-oriented programming to develop applications that process these XML documents. (3) Good interaction. When users interact with applications, using XML makes it easy to locally sort, ?lter, and perform other data operations without interacting with the server which relieves the burden on the server. (4) Powerful Semantic. In XML documents, people can use certain tags to de?ne the relevant semantics for data, which not only greatly improves the readability of the document for human beings, but is also easy to be read and used by machines. Therefore, the information exchange between di?erent devices and di?erent systems can be easy. Because XML describes the meaning of data content by tagging it and separates the display format of the data, the search for XML document data can be performed simply and e?ciently. In this case, the search engine does not need to go through the entire document, but only to ?nd the contents of the speci?ed tag on it. In this way, it is no longer di?cult to browse the Internet, as each page is displayed exactly what the viewer wants. In the electronic resume, for di?erent candidates, some speci?c markers are ?xed, such as name, age, graduation school, work experience, etc., but only the content is speci?ed by these marks. Therefore, combined with the characteristics of XML, the storage of electronic resumes in the form of XML documents has become the most e?ective method. Since 2013, Pace University developed an integrated syntax/semantic validator which is a Pace XML Validator. 
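Before turning to that validator, the tag-scoped retrieval described above can be illustrated with a short sketch. Python's standard ElementTree is used here purely for illustration (it is not the Pace XML Validator), and the resume structure and element names are hypothetical rather than the schema assumed by the authors.

import xml.etree.ElementTree as ET

# Hypothetical resume structure; element names are illustrative only.
resume_xml = """
<resume>
  <name>Mike</name>
  <education>
    <school>Pace University</school>
    <degree>MS Computer Science</degree>
  </education>
  <work>
    <company>IBM</company>
    <title>Software Engineer</title>
  </work>
</resume>
"""

root = ET.fromstring(resume_xml)

# Tag-scoped search: only the <education> subtree is inspected,
# not the whole document.
schools = [s.text for s in root.findall("./education/school")]
print("Pace University" in schools)        # True

# The same idea scoped to <work> answers a different question.
companies = [c.text for c in root.findall("./work/company")]
print("IBM" in companies)                  # True

Because the query is scoped to a tag, only the relevant subtree is inspected rather than the entire document, which is the property the paper relies on for efficient screening.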
Schematron [4] is a popular rule-based XML dialect that allows us to specify such co-constraints for a class of XML documents and then use a standard Schematron validator to validate the co-constraints without coding. Over the past decade, the standard implementation of the Schematron validator is to use a standard Semantic-Based Resume Screening System 651 XSLT stylesheet [5, 6] to transform a Schematron document into a new validator XSLT stylesheet, and then use the latter to validate the XML instance documents. However, the current industry practice of XSLT-based Schematron validation may produce invalid results and cannot be easily integrated with other system components [1]. Thus, Pace University designed and implemented a validator as a reusable software component based on DOM Level 3 XPath. It supports all key features of Schematron ISO [4] including abstract rules and abstract patterns, network integration through web services, and event-driven loose-coupling. 3 Create Knowledge Base Ontologies are usually organized in taxonomies and typically contain modeling primi- tives such as classes, relations, functions, axioms and instances [7]. Therefore, the ontology design of knowledge base is the design of concept, relationship and instance. This paper will illustrate the design of using the domain knowledge base to analyze the resume information, which could help the users to ?nd the ideal candidates more accu- rately. At present, the design of the ontology for semantic analysis of resume information is mainly composed of classes and instances. The classes in ontology have two functions: (1) Describe the meaning of class and the knowledge contained in the class; (2) De?ne subclasses and instances of the class. The di?erence between an instance and a class is that the class could be a name or some attributes that describe an instance within a collection, but the instance is a member of the collection. For example, the smartphone is a class, and the iPhone 8 is an instance of this class. By matching the domain knowl- edge base, the system can set the constraints in Pace XML Validator more accurately, so that the system can achieve the better result to the users. This paper uses Protégé as an ontology modeling tool to create a knowledge base. Protégé is a free, open source ontology editor and a knowledge management system [8]. Protégé provides a set of behavior-oriented systems based on a knowledge model structure to support the ontology construction of various expressions (such as an OWL, RDF, Dublin Core and so on). In the Protégé editor, the ontology structure is shown in the hierarchical directory structure. It is straightforward for the maintenance operations of the ontology (such as adding classes, subclasses, attributes, instances). Therefore, there is no need to concern the speci?c ontology language; it only needs to design a domain ontology model at the conceptual level. The example used in this paper is that an HR needs to ?nd the candi- dates that ‘graduated from Lvy League’, or ‘has work experience in Fortune Global 500 companies’. A knowledge base will be designed based on this assumption. First, the ‘Lvy League’ and ‘Fortune Global 500 companies’ are derived from the class Thing. The university such as ‘Havard University’, ‘Columbia University’ and so on, they belong to the class of ‘Lvy League’, and the company such as ‘IBM’, ‘Apple’, ‘Micro- soft’ and so on, they are the instances which belong to the class ‘Fortune Global 500 companies’. 
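A compact way to see the role this class/instance design plays in screening is the sketch below. The dictionary merely stands in for the OWL knowledge base (a real implementation would read the ontology with a library such as Jena, as the authors do), and the instance list is abbreviated and illustrative.

# Minimal stand-in for the ontology: class name -> instances.
knowledge_base = {
    "fortune global 500 companies": ["IBM", "Apple", "Microsoft"],
}

def expand_requirement(requirement: str) -> list:
    """Map a screening requirement to the concrete keywords that should be
    searched for in the resumes. Unknown terms are returned unchanged, as
    the RSS does when a term is not a class in the OWL file."""
    key = requirement.strip().lower()        # mirrors the preprocessing step
    return knowledge_base.get(key, [requirement])

print(expand_requirement("Fortune Global 500 Companies"))
# ['IBM', 'Apple', 'Microsoft']  -> later turned into Schematron constraints
print(expand_requirement("Pace University"))
# ['Pace University']            -> no expansion needed

The expanded names are what the later steps of the framework turn into constraints for the Pace XML Validator.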
By establishing the knowledge base in this ?eld, the system will understand how to set the constraints in Pace XML Validator when it meets the requirements such as ‘having a work experience in Fortune Global 500 companies’. Therefore, the system’s ability will be enhanced. 652 Y. Hou and L. Tao 4 Design of Semantic-Based Resume Screening System Framework The Semantic-based Resume Screening System (RSS) is composed of four parts: No. 1: Reading the requirements from the users, and the RSS will conduct a preprocessing for later operation. No. 2: Based on the pre-processed input, the system will match the knowledge base created previously, then generate the contents from the resumes that RSS will screen later. No. 3: Based on the contents, the RSS will generate constraints automatically when the RSS invokes the Pace XML Validator. That is, the RSS will generate a Schematron ?le (.sch ?le). No. 4: The RSS will invoke the Pace XML Vali- dator to validate each resume in the resume folder, then return the veri?ed documents that the users want to achieve. Figure 1 is the design of Semantic-based Resume Screening System Framework. Fig. 1. The design of semantic-based resume screening system framework. 4.1 Preprocessing When a user enters a requirement, we need to preprocess the input ?rst so that the later operation can be more convenient for these requests. The main preprocessing is to ignore the capitalization of the letter input, as well as the space input. As we know, when users enter the requirements, the ?rst letter of a university or organization always needs capi- talization in the expression. However, the expression of some classes and instances in the knowledge base may not be stored in the form of capital letters. In order to avoid errors caused by inconsistencies during the operation process, the RSS will ignore the capitalization of the letter input in the preprocessing part. In the OWL ?le, the spaces are often saved with the ‘symbol’. Figure 2 is an example of an OWL ?le. From this example, we can see that the spaces in the class ‘fortune global 500 companies’ and the class ‘lvy league’ are represented by the ‘symbol’. Therefore, the RSS will preprocess the spaces, in order to avoid the error during matching the knowledge base. Semantic-Based Resume Screening System 653 Fig. 2. An example of OWL ?le. 4.2 Matching the Knowledge Base In this section, the RSS will use Jena to read and identify the established knowledge base, which is to use Jena to read and analyze the saved OWL ?le. Apache Jena (or Jena in short) is a free and open source Java framework for building semantic web and Linked Data applications. The framework is composed of di?erent APIs interacting together to process RDF data [9]. First, the RSS will match the preprocessed input with the OWL ?le read by Jena, thus, the RSS will understand whether the user needs the knowledge base’s assist. For example, if the user wants to ?nd out candidates who have experience working with Fortune Global 500 companies, the RSS can know that ‘Fortune Global 500 companies’ means that candidates should have work experience in companies such as IBM, Apple, Microsoft and so on because the instances ‘IBM, Apple, Microsoft’ belong to the class ‘fortune global 50 companies’ in the knowledge base. If a user’s requirement does not need the knowledge base’s assist, for example, a user wants to ?nd candidates who graduated from Pace University, the RSS may ?nd that ‘Pace University’ is not one of the classes in the OWL ?le. 
In that case, the RSS returns the result 'Pace University' directly, without using the knowledge base.

4.3 Generating the Schematron File

Through the previous step, the RSS understands the details of the user's search demand. Next, the RSS automatically generates a Schematron file based on the keywords returned in the previous section. The Schematron file sets the constraints on the XML files; the Pace XML Validator uses it to verify whether an XML file meets the constraints. For example, consider a candidate named Mike whose resume is saved in XML format; Figure 3 shows Mike's resume as an XML file. If an HR user wants to find candidates who graduated from Pace University, the keyword generated in Sect. 4.2 is 'Pace University', and the RSS generates the corresponding Schematron file based on this keyword to set constraints on the XML files. Figure 4 shows a Schematron file generated from the keyword 'Pace University'. In this file, we restrict the XML file as follows: search for the content 'Pace University' under the 'education' element; if the 'education' element contains 'Pace University', the verification passes, otherwise it fails.

When an HR user wants to find candidates who have work experience in Fortune Global 500 companies, the RSS understands, by matching the knowledge base, that 'work in Fortune Global 500 companies' means 'working in companies such as IBM, Apple, Microsoft and so on'. Thus, the keywords are company names such as 'IBM', 'Apple' and 'Microsoft', and the RSS generates the corresponding Schematron file based on these keywords. Figure 5 shows a Schematron file with the constraints for 'work in Fortune Global 500 companies'. In this file, we restrict the XML file as follows: search under the 'work' element for any of the Fortune Global 500 company names; if the 'work' element contains any of these names, the verification passes, otherwise it fails.

Fig. 3. Mike's resume.

Fig. 4. The Schematron file with the keyword 'Pace University'.

Fig. 5. The Schematron file for 'work in Fortune Global 500 companies'.

4.4 Invoking the Pace XML Validator

In this paper, we assume that all candidates' electronic resumes are saved as XML files in a specific folder. In this step, the RSS invokes the Pace XML Validator and uses the Schematron file generated in the previous step to verify the XML files individually. Once the validation has completed, the RSS returns the verified XML files. Figure 6 shows three resumes saved in one folder. If a user wants to find candidates who graduated from Pace University, the RSS returns the verified XML files 'Alice.xml' and 'Tom.xml' after screening; Figure 7 shows the results. If a user wants to find candidates who have work experience in Fortune Global 500 companies, the RSS returns the verified XML files 'Mike.xml' and 'Tom.xml' after screening; Figure 8 shows the results. The user can now find their ideal candidates from the screening results.

Fig. 6. The example resumes.

Fig. 7. The results of the RSS screening for 'Pace University'.

Fig. 8. The results of the RSS screening for 'Fortune Global 500 Companies'.
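A hedged sketch of steps 3 and 4, generating a Schematron file from the keywords and screening the XML resumes, is shown below. The rule structure is a plausible ISO Schematron pattern rather than the exact files of Figs. 4 and 5, and lxml's Schematron validator stands in for the Pace XML Validator.

```python
from pathlib import Path
from lxml import etree
from lxml.isoschematron import Schematron

def build_schematron(element, keywords):
    # One rule per target element; pass if any keyword appears in its content
    tests = " or ".join(f"contains(., '{kw}')" for kw in keywords)
    sch = f"""<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <pattern id="keyword-check">
    <rule context="//{element}">
      <assert test="{tests}">required keyword not found</assert>
    </rule>
  </pattern>
</schema>"""
    return Schematron(etree.XML(sch))

def screen_resumes(folder, element, keywords):
    schematron = build_schematron(element, keywords)
    for resume in sorted(Path(folder).glob("*.xml")):
        if schematron.validate(etree.parse(str(resume))):
            yield resume.name

# list(screen_resumes("resumes", "education", ["Pace University"]))
# list(screen_resumes("resumes", "work", ["IBM", "Apple", "Microsoft"]))
```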
5 Conclusion

In this paper, we showed that keyword-based search cannot satisfy the current needs of electronic resume screening, and we proposed a Semantic-based Resume Screening System (RSS). Based on the knowledge base, this system greatly enhances the ability to understand screening requirements, and it improves the efficiency of XML document validation through the application of the Pace XML Validator. The approach proposed in this paper can improve the efficiency and accuracy of resume screening; therefore, using it can greatly improve the efficiency of a company's hiring process. In the future, our work will introduce knowledge graphs to improve the capability of the knowledge representation, because the ontology primarily supports only the subclassOf (is-a, or inheritance) relation. Various other relations, such as part-of, are essential for representing information in various fields, including all engineering disciplines [10].

References

1. Tao, L., Golikov, S.: Integrated syntax and semantic validation for services computing. In: 2013 IEEE 10th International Conference on Services Computing (2013)
2. Brachman, R.J., Levesque, H.J.: Knowledge Representation and Reasoning. Morgan Kaufmann, San Francisco (2004)
3. Wu, J.: The construction of ontology-based domain knowledge base. Sci. Technol. Innov. Herald 30, 250–252 (2010)
4. ISO: Information technology – Document Schema Definition Languages (DSDL) – Part 3: Rule-based validation – Schematron, March 2013. http://standards.iso.org/ittf/PubliclyAvailableStandards
5. Dodds, L.: Schematron: validating XML using XSLT, March 2013. http://www.ldodds.com/papers/schematron_xsltuk.html
6. Jelliffe, R.: Schematron Implementations, March 2013. http://www.schematron.com/links.htm
7. Gruber, T.R.: A translation approach to portable ontology specifications. Knowl. Acquis. 5, 199–220 (1993)
8. Musen, M.A.: The Protégé Project: a look back and a look forward. AI Matters 1(4), 4–12 (2015)
9. Apache Jena: Getting started with Apache Jena. https://jena.apache.org/getting_started/index.html
10. Patel, K., Dube, I., Tao, L., Jiang, N.: Extending OWL to support custom relations. In: 2015 IEEE 2nd International Conference on Cyber Security and Cloud Computing, New York, USA, November 2015

The Next Generation of Artificial Intelligence: Synthesizable AI

Supratik Mukhopadhyay1, S. S. Iyengar2, Asad M. Madni3, and Robert DiBiano4

1 Division of Computer Science and Engineering, Louisiana State University, Baton Rouge, LA 70803, USA, supratik@csc.lsu.edu
2 School of Computing and Information Sciences, Florida International University, Miami, FL 33199, USA, iyengar@cs.fiu.edu
3 Department of Electrical and Computer Engineering, University of California, Los Angeles, CA 90095, USA, ammadni@ee.ucla.edu
4 Department of Computer Science, Louisiana State University, Baton Rouge, LA 70803, USA

The authors acknowledge the sponsorship of NASA Ames Research Center, the US Department of Agriculture, the National Science Foundation, the US Department of Defense and the US Army Research Lab in their research.

Abstract. While AI is expanding to many systems and services, from search engines to online retail, a revolution is needed to produce rapid, reliable "AI everywhere" applications through "continuous, cross-domain learning". We introduce Synthesizable Artificial Intelligence (SAI) and discuss its uniqueness through its five advanced "abilities": (1) continuous learning after training by "connecting the dots"; (2) measuring the quality of success; (3) correcting concept drift; (4) "self-correcting" for new paradigms; and (5) retroactively applying new learning for the development of "long-term self-learning". SAI can retroactively apply new concepts to old examples, "self-learning" in a new way by considering recent experiences, similar to the human experience. We demonstrate its current and future applications in transferring seamlessly from one domain to another, and show its use in commercial applications, including engine sound analysis, providing real-time indications of potential engine failure.

Keywords: Artificial intelligence · Synthesizable Artificial Intelligence · IBM Watson · Natural language processing · Self-learning

1 Introduction

IBM's Watson Analytics is no longer just a Jeopardy-playing genius. Watson has embarked on a journey of knowing, going far beyond its initial capacity for Jeopardy question answering. Watson Analytics has made great strides employing the
Natural Language Processing User Interface (NLP-UI) as a novel approach to the analysis of business problems, allowing even unseasoned business users an opportunity to analyze industry and personal datasets. The diversity of challenges in AI and their specific embedded complexities should not obscure the fact that the heart of the subject belongs to real-time reasoning.

For the last decade, researchers in Artificial Intelligence (AI) have made exponential progress in applications across broad industry areas. Autonomous vehicles from Google and others have registered countless miles on American roads. AI systems are interpreting radiology images and diagnosing diseases with the same skill level as experienced radiologists and doctors. AI is influencing every aspect of human life, from hearing aids to stock trades. So, is AI ready for primetime, or are we already there? We think the state of the art in AI today is at the same stage that software engineering was in the early 1960s. During that time, software could only handle small problems in diverse domains (e.g., numerical analysis, personnel management, etc.): there was no way in which complex software systems involving millions or billions of lines of code could be created to tackle real-world problems. In the same way, today's AI systems are limited to solving smaller (but harder) problems like image recognition and automatic question answering. Scaling such systems to address large, complex tasks such as automated drug design, air traffic control, or running an entire enterprise remains a challenge. Software engineers invented abstractions embodied in object-oriented techniques and principles of software reuse to revolutionize productivity; today, large software systems are no longer developed from scratch: they are built by reusing existing code through subclassing and overriding methods. A variety of software abstractions are available today to enable code reuse, from design patterns to frameworks. Thanks to this methodology, software is now all-encompassing, influencing every walk of human life from power systems to retail. Is AI waiting for a similar abstraction revolution?

While AI has been part of many systems and services, from search engines to online retail, to realize the vision of "AI everywhere", a revolution similar to that which occurred in software is needed. Despite all the recent successes of AI, many questions remain unanswered. In many ways, Watson represents a solution to many problems, yet it still has some limitations in moving to a new domain.
Watson cannot hit the ground running in a completely new domain, automatically deploying and reconfiguring itself online when situations change. The machine learning system of Watson is very good, but it cannot auto-tune to a problem domain instantaneously. The concept of domain change in many of these applications is still a problem of interest. Researchers throughout the AI community have been asking, "How do you improve productivity in the creation and deployment of AI systems?" In other words, how can we produce AI systems rapidly and reliably as the applications of AI expand from understanding specific scenes to serving societal and business needs in critical areas?

The authors and their team have introduced an alternative approach through Synthesizable Artificial Intelligence, or SAI technology. Previous work by Mukhopadhyay, Iyengar et al. on the Cognitive Information Processing Shell [1] served as an impetus for this approach. SAI is unique among AI systems by virtue of its five technological advances or "abilities": (1) continuous learning after training by "connecting the dots"; (2) measuring the quality of success; (3) correcting concept drift; (4) "self-correcting" for new paradigms; and (5) retroactively applying new learning for the development of "long-term self-learning." SAI can retroactively apply new concepts to old examples, "self-learning" in a new way by considering recent experiences, similar to the human experience. In this paper, we demonstrate how our work on SAI has overcome limitations of other AI systems, and its current and future applications in transferring seamlessly from one domain to another. We show its use in current commercial applications, including engine sound analysis, where it provides real-time indications of potential engine failure, and its future uses in automatic drug discovery.

2 Hierarchical Fractal Architecture of SAI Agents and Related Work

Currently, different AI systems specialize in single, specific tasks, determined by the type of data on which they were trained in advance. SAI is unique in measuring the applicability of a given agent (neural network), or cluster of neurons within a network, to a specific task in real time. Thus, SAI can detect when the input changes to something the network is not equipped to deal with and can draw from a wide variety of related and unrelated data to activate different neural clusters that can be used to rapidly understand the new input.

Adapting to new types of input during execution time is a difficult problem. Obviously, there cannot be true learning to predict labels without at least a few ground-truth labels to check against. That said, unsupervised methods like self-organizing maps or autoencoders together with clustering can work in some situations. Unfortunately, these unsupervised methods require a lot of data, so they cannot be used to adapt rapidly to a new circumstance in real time. By learning the data distribution without labels, or automatically organizing the data into clusters and assigning arbitrary labels, the data can be correctly understood with only a handful of additional ground-truth examples. By effectively utilizing neural clusters trained on other problems, we solve this problem, enabling unsupervised learning that can also adapt to new circumstances immediately. As SAI learns new concepts from other problems, we can retroactively apply these concepts to old labeled data, allowing continual improvement as we gain a better understanding of old data.
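The paper does not give a formula for this applicability measure, so the following sketch assumes, purely for illustration, that a segment is applicable when its activation statistics on new data resemble those recorded during its training.

```python
import numpy as np

def applicability(segment_fn, new_batch, train_mean, train_std, eps=1e-8):
    """segment_fn: maps a batch of inputs to a (batch, features) activation matrix.
    train_mean / train_std: per-feature statistics recorded when the segment was trained.
    Returns a score in (0, 1]; higher means the learned concepts transfer better."""
    acts = segment_fn(new_batch)                  # (batch, features)
    z = (acts.mean(axis=0) - train_mean) / (train_std + eps)
    distance = np.sqrt(np.mean(z ** 2))           # normalized activation shift
    return float(1.0 / (1.0 + distance))

# A segment whose features still fire the way they did during training scores
# near 1; a segment facing out-of-distribution input scores near 0 and can be
# bypassed or replaced.
```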
Such retroactive application is similar to the way humans perfect their skills in complex tasks.

SAI determines the applicability of a given neural network, network layer, or feature map to the analysis of a given input. When applied properly, neural networks internally organize input data into increasingly high-level abstract information. By analyzing the response of a network segment to known and unknown information, we can develop a relation to determine whether the abstract concepts learned by a segment generalize well to a given piece of data. This allows us to immediately draw from a diverse 'segment library' of learned concepts when analyzing a new problem. Membership in the segment library is determined by maximum applicability to any problem, not applicability to the current problem, so useful concepts are always retained. Humans can separate their processing based upon natural features by filling in the open parts through learned experience, enabling them to transfer the information to new experiences (Fig. 1).

Fig. 1. Feature response separation.

SAI agents are organized into a series of progressively more task-specific network layers, where each layer can be connected to multiple sub-layers (Fig. 2).

Fig. 2. Hierarchical fractal architecture.

We refer to a layer plus all its possible sub-layers as a lobe. Inputs such as raw sensory data flow into the network. Low-level lobes apply to most or all problems and start to process the data; at this point, the data is passed on only to the sub-lobes where it is most applicable. As data flows through the network, only sections that are capable of processing that type of data are activated, while non-relevant sections are bypassed. Eventually the fully processed and understood data is routed to a final high-level lobe and produces a result. Different lobes can be associated with different data types, but different high-level lobes attached to the same mid-level lobe can also be associated with different tasks for the same data. This allows us to efficiently use the same type of data differently depending on the task, while still sharing maximum knowledge between tasks.

When we reach a point where no sub-lobe of a given lobe is applicable to a given task, we 'grow' a new sub-lobe starting from that point, created from the most applicable available network segments from previous tasks. As we train or learn about the new task, these segments may diverge from their original values as the network improves at the task. If this happens, they are also added to our segment library for future use. By having a way to measure how well a network segment applies to a given input, we can instantly transfer knowledge learned from other problems to the current situation. The ability to transfer knowledge effectively between very different tasks allows rapid adaptation to new and unusual conditions. Conversely, when a given segment no longer applies to a given set of inputs, it can be saved to a knowledge library rather than overwritten, allowing it to be recalled if it becomes relevant again in the future. This gives us an effective method for lifelong self-learning, where potentially valuable concepts that are irrelevant to the current problem are unused, but not forgotten. SAI's hierarchical structure allows us to efficiently share knowledge between different tasks, while retaining a segment library that eliminates the problem of catastrophic forgetting.
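A toy sketch of this routing-and-growing behaviour is given below; the Segment and Lobe interfaces, the threshold, and the growth rule are illustrative assumptions, not the authors' implementation.

```python
class Segment:
    """Minimal stand-in for a trained network segment (illustrative only)."""
    def __init__(self, process_fn, score_fn):
        self.process = process_fn          # features -> features
        self.applicability = score_fn      # features -> score in [0, 1]
    def clone(self):
        return Segment(self.process, self.applicability)

class Lobe:
    def __init__(self, segment, sub_lobes=None):
        self.segment = segment
        self.sub_lobes = sub_lobes or []

def route(lobe, x, segment_library, threshold=0.5):
    x = lobe.segment.process(x)
    if not lobe.sub_lobes:
        return x                           # reached a high-level output lobe
    best_score, best = max(
        ((sub.segment.applicability(x), sub) for sub in lobe.sub_lobes),
        key=lambda pair: pair[0],
    )
    if best_score < threshold:
        # No applicable sub-lobe: grow one from the most applicable library segment
        seed = max(segment_library, key=lambda seg: seg.applicability(x))
        best = Lobe(seed.clone())
        lobe.sub_lobes.append(best)
    return route(best, x, segment_library, threshold)
```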
Our method allows valuable learned concepts to be identified; when they become irrelevant due to changes in conditions or concept drift, they can be bypassed, or phased out and saved for use in later tasks. Because our system can measure the applicability of network weights to a specific piece of data, we can swap out, or reroute around, currently useless sections of our network rather than overwriting them. This eliminates the danger of catastrophic forgetting.

We believe that biological learning systems can adapt to new conditions so rapidly because of two mechanisms: associative memory and analogy. Associative memory is a mechanism by which training data can be effectively transferred between tasks that are only loosely related, by associating part of one task with another and using the previously learned knowledge for the new task. Analogy is more complex, and likely only humans are capable of it. Analogy involves forming an association between two relations involving the data instead of two data items. It is very powerful in that it allows us to infer complex relations from sparse data. When conditions change abruptly, a biological system will first try to adapt to the new conditions by looking for similar conditions in the past in a different context (e.g., if we are trying to detect cars and one goes through a shadow, other objects have been in or gone through shadows in past trainings, so we leverage the now very relevant shadow-resistant features from those trainings to correct the problem rapidly without needing to build a dataset of cars going through shadows), or by dropping the part of the classification criteria that has become unreliable in favor of a subset (e.g., blue paint was spilled on a cat, and our method now shows the standard color-based map responses as not applicable; the shape-based responses are still applicable, so we swap out the now-irrelevant color-based cat features, archiving them in the segment library, and quickly learn to classify cats without relying on color). SAI integrates both methods.

SAI can be used to 'grow' new lobes on a nodal network agent when new useful features are discovered and to determine which lobes to branch based on the inclusion of features applicable to the current problem (Fig. 2). The issue of balance applies to determining the optimal feature set to assign to each lobe. If a lobe became so large that it was computationally infeasible to process data through it, it would be split into two smaller, balanced lobes. Because our network can bypass non-applicable layers and their sub-layers, we avoid having to make a hard tradeoff between knowledge acquisition and memory retention.

Our SAI architecture can efficiently treat the same data differently based on context and system goals; the same lower-level lobe will be associated with different higher-level lobes for different ways of handling its output. These can be activated selectively based on system goals, or simultaneously to accomplish two tasks with only incrementally more processing power than is required for one. Similarly, several high-level lobes may be associated with different versions of a drifting concept, or different noise types. If the system goals are not explicitly given, the route the data takes through the network is determined primarily by lobe applicability, and output paths represent system goals.
This means the system has the ability to choose its own goals based on the situation if necessary. Changing architecture based on sensory input is a fundamental property of SAI, in that data is routed only through lobes capable of processing it. As with all cognitive architectures, memory and computation are different aspects of the same connections and weights. Sensory inputs are first processed in general areas of the network and then routed through dedicated areas based on the specific data type and target task. Instincts can be emulated by training a network segment to emulate a hard-coded rule and adding that rule to the segment library. That allows it to be swapped in, either manually or automatically, where applicable, and allows the system to learn to refine or ignore the instinctual rule where necessary. SAI has a library of network segments to draw on, and segments are stored by maximum value in any situation, not current value; therefore, catastrophic forgetting cannot happen. SAI represents a new paradigm in machine learning, able to draw on diverse knowledge to adapt to any new situation rapidly.

3 Self-learning

Typical AI systems start out at some initial conditions, improve at their target task iteratively during training time until they reach some asymptotic maximum quality, and are then frozen in that state and fielded. A human expert, however, can continue to gain expertise at a task long after they are finished being trained by an expert. Even when a human is the best in the world at a given task, and no better expert exists, they can still continue to gain expertise on their own. How is this possible?

In one class of tasks, the human can easily determine a success/failure labeling or quality measure accurately, and therefore generate their own labeled data after deployment. They then use reinforcement learning to continuously improve at the task. Machine learning can already do this quite well, assuming a system can be trained to estimate the quality measure, so we will neglect this case. In another case, the task we are trying to improve is a labeling task, so the system or human can never really be sure it is improving at the task after deployment without the occasional ground truth. Even for human experts, something akin to concept drift is possible. Nevertheless, a human expert will gain a better and better understanding of the task via unlabeled training and will be able to correct any concept drift from a single example. Existing machine learning systems generally have the capability to correct for concept drift via unlabeled plus labeled examples, but only our SAI architecture provides a mechanism to detect the concept drift automatically, so it knows when to ask for more examples. If existing lobes become inapplicable to the current tasks, the system will grow new lobes from that level that apply better to the current problem and use them instead. This is analogous to a human whose old way of doing things isn't working anymore experiencing a paradigm shift. The system may still need some ground truth to get a handle on the new situation, but it would realize on its own that the old learning was failing and that the results were no longer reliable, and it could ask for labeled examples to regain its bearings.

Our system also demonstrates self-learning in another way. The system can retroactively apply new concepts to old examples, learning new ways to understand long-known tasks in light of recent experiences.
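Returning to the automatic drift detection described above, a toy monitor (reusing the illustrative Lobe/Segment interfaces from the earlier sketch) might look like this; the threshold and return structure are assumptions.

```python
def monitor(active_lobes, batch, threshold=0.5):
    """Flag concept drift when the active lobes' applicability drops too low."""
    drifted = [lobe for lobe in active_lobes
               if lobe.segment.applicability(batch) < threshold]
    if drifted:
        # Old learning is failing: archive the affected segments in the library
        # and ask for a few labeled examples so new lobes can be grown.
        return {"status": "concept_drift", "request_labels": True,
                "archive": [lobe.segment for lobe in drifted]}
    return {"status": "ok", "request_labels": False, "archive": []}
```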
Retroactively applying new concepts in this way means our system can continue to get better at a task long after labeled data on the task has stopped coming in, by transferring useful concepts from other tasks. This sort of long-term self-learning is one of the ways human experts gain the highest levels of expertise. Let's look at two potential applications for this advanced technology.

3.1 A Practical Example

To illustrate the usefulness of SAI's advanced architecture, consider a video recognition network for classifying clips from musicals. The network would be trained for several tasks related to classifying the clips, such as filling in the sound effects, recognizing famous actors, determining the genre, and determining whether the clip comes from the beginning, middle, or end of the story. This scenario is significant for two reasons. First, there are thousands of hours of such videos available, either already labeled or easily labeled automatically. Second, there is significant interest in these types of applications; Shazam does something similar with music.

SAI would start out using segments from one or more of the tasks, and produce an input layer, some intermediate sub-layers, and three output lobes, one for each task. If the same types of features were useful for all three problems, the network wouldn't split into these sub-branches until shortly before the final layer. Conversely, if the actor recognition task used very different features (facial features) from the sound prediction task (visual cues, gestures, and body movements), the network would bifurcate somewhere in the middle. Regardless, the early layers would contain features that applied to all three problems, while the later layers would contain problem-specific features. The feature library would contain both.

This illustrates a commercial application of SAI, but its real strength lies in its potential use to track suspicious behavior. Suppose we are training a system to detect pickpockets (or terrorists) by watching a video feed. There are not thousands of hours of data available on this; it is not publicly available and well labeled. We may have a few tens of examples of pickpockets on video if we are lucky. Classically, this would make the problem infeasible; a computer couldn't solve it, even though a human might be able to without ever having seen a single real example. Humans can transfer knowledge from millions of other, more innocent interactions in their experience to understand what is happening. The human already knows that the hands are used to grasp objects and are of interest, that clothing has pockets in it, usually in the same areas, that someone suddenly changing direction might be significant, and so on. Similarly, SAI will look at the handful of 'pickpocketing' interactions and search its feature library for anything applicable. The sound prediction features will be attuned to looking for small hand gestures (to predict fingers snapping) and leg movements (to predict footstep sounds). The segment that predicts where along the timeline a clip comes from does so by learning to estimate fatigue level from pose and timing differences. The actor recognition features would understand the meaning and significance of faces and would share these types of features with the fatigue estimation portion, which could use them to look for sweat on faces.
Some of these features (hand movements, stress level from pose) would have higher than baseline applicability to pickpocketing detection, and we could immediately identify and use them. So, when SAI builds the initial path leading to our new 'pickpocketing' output lobe, it already understands a great deal about the meaning and context of the scenes before even training with the 'pickpocketing' samples (Fig. 3).

Fig. 3. Pickpocket scenario demonstrating the breakout of video features into characteristic lobes and storage in the main Segment Feature Library for rapid future learning (Photo courtesy of Ili Simhi).

The new series of lobes would share low-level features with the existing network, and even the high-level features would be initialized from the most applicable members of the feature library. At that point the network would proceed to learn from the 'pickpocketing' samples, and if any features changed significantly, the network would know it had learned a useful new concept and would add it to the segment library. New concepts learned this way are retroactively applied to old problems; in this case, new concepts learned from pickpocketing detection could be checked for applicability to music classification. This would allow our network to continue learning about a problem long after data on that problem has stopped coming in, and therefore enable a better understanding of "old memories" in light of new experiences.

3.2 Transfer Learning

Due to transfer learning, analogical reasoning, and automated tuning, SAI can easily transfer from one domain to another, unlike other AI systems, which cannot be readily deployed to a new domain or learn from one another. In SAI, for example, an "agent" performing the task of understanding imagery from Synthetic Aperture Radars (SAR) can gain knowledge through transfer from an agent performing semantic segmentation of CAMVID imagery, or from a VGG-16 model pre-trained on ImageNet [2]. A "core sample" from an agent previously trained on one task can be used to train a new agent for a different task [2]. This strategy helps avoid the need to train an entire network on a large dataset and improves overall performance. For example, training a large VGG-16 network on a reasonably large dataset takes a long time; SAI avoids that by using a VGG-16 model pre-trained on the ImageNet dataset and extracting a "core sample" from it to create a new "agent". Not only does this strategy save training time, but it also helps create a trained agent from a relatively smaller training dataset. In Watson, such an agent would need to be built from scratch by training on a large labeled (SAR) dataset.

Another application transfers the knowledge obtained from recognizing objects in the ImageNet dataset, which contains no characters, to segmenting and classifying handwritten foreign characters. Because ImageNet is drawn from a large and diverse dataset, its features can be assumed to be more general purpose, able to represent many types of shapes and textures equally well. While a model trained on it may not have the capacity to directly recognize foreign characters, it should have the ability to recognize many common simple structures in a wide array of image conditions, including noise. This is the knowledge that we want to transfer out of it and combine with our own knowledge of foreign characters.
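A hedged sketch of the "core sample" idea follows: the early layers of an ImageNet-pretrained VGG-16 are reused as a frozen feature extractor and a small new head is attached for the target task. The split point, head size, and class count are assumptions for illustration.

```python
import torch.nn as nn
from torchvision import models

def core_sample_agent(num_classes, cut=24):
    backbone = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
    core = backbone.features[:cut]        # "core sample": the first four conv blocks
    for p in core.parameters():
        p.requires_grad_(False)           # transferred knowledge stays fixed
    head = nn.Sequential(                 # small task-specific head to fine-tune
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
        nn.Linear(512, num_classes),      # 512 channels after the fourth block
    )
    return nn.Sequential(core, head)

# Only the head is trained, so a relatively small labeled dataset (and far less
# compute) suffices compared with training VGG-16 from scratch.
agent = core_sample_agent(num_classes=10)
```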
In general, assume that the SAI framework is asked to configure an AI engine for a task T. We have at our disposal AI engines (neural networks) for solving tasks T1, …, Tn, and some concepts learned in solving one or more of T1, …, Tn will be relevant to solving task T. Assume we are provided a labeled dataset D for training a neural network to solve task T. SAI first starts with a randomly initialized neural network for T. For each network corresponding to Ti, and for each of its layers, SAI determines the applicability of the learned concepts towards the new task T. This is done through the evaluation of a transferability metric that provides a measure between 0 and 1. SAI sorts the corresponding layers of each of the networks for T1, …, Tn in order of decreasing transferability. For each layer of the new network for T, SAI transfers the top k "relevant" weights from T1, …, Tn. Finally, SAI partitions D into two subsets: a small subset Dtrain that is used to fine-tune the network and a testing set Dtest that is used to test it. Note that both Dtrain and Dtest are also used in computing the transferability metric. The data needed for fine-tuning is only a small subset of D, so this scheme works even if D is small.

Today, large AI systems are developed and fine-tuned by companies with armies of highly paid data scientists and engineers. It takes a significant amount of time, money, and effort, together with a deluge of training data, to build and train an AI system that can operate at the level of humans in a new domain. (While deep learning techniques have eliminated the need for manual feature engineering, they have been shown not to work well, for example, for texture datasets where the inherent dimensionality of the data is high [2].) Even with this enormous force, gaps in training remain (Fig. 4).

Fig. 4. Gaps in current state-of-the-art intelligent systems.

The AI community has recognized this limitation as one of the main stumbling blocks hindering progress and preventing AI from positively influencing important areas of human endeavor. The fast-changing nature of today's world, where M&As happen in the blink of an eye, new diseases appear at an alarming rate (e.g., Zika), political landscapes change overnight, and natural disasters come out of the blue, makes this slow mobility of AI across domains a formidable problem. For years, scientists have wrestled with a variety of solutions to this problem, such as "transfer learning." Most AI systems today rely on transfer learning to bring the experience of an AI system in one domain to bear upon problems in another. This technique, however, ignores the tremendous amount of human experience already available in the new domain. Compare this to the way a person explores a new city. The person will combine previously acquired skills, such as map reading, with the knowledge obtained from questioning locals about the best restaurants, museums, and shops, allowing them to navigate and enjoy the city even though the city is new and the tourist may not speak the local language. This dynamic combination also enables a person to deal with unforeseen events such as road closures and detours. Humans have this innate ability to use this combination in their daily life to adapt to new situations and tasks.
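Returning to the layer-transfer procedure for task T described at the start of this section, the following numpy sketch illustrates it; the transferability metric itself is not specified in the paper, so it is passed in as an assumed callable, and layers are simplified to plain weight matrices of matching shapes.

```python
import numpy as np

def transfer_weights(new_net, source_nets, transferability, D, k=3, train_frac=0.2):
    """new_net: list of randomly initialized weight matrices for task T.
    source_nets: trained networks for T1..Tn, each a list of weight matrices.
    transferability: assumed callable (weights, D) -> score in [0, 1]."""
    for i in range(len(new_net)):
        # Rank the corresponding layer of every source network by transferability
        candidates = [net[i] for net in source_nets if i < len(net)]
        ranked = sorted(candidates, key=lambda w: transferability(w, D), reverse=True)
        donors = ranked[:k]
        if donors:
            # Transfer the top-k "relevant" weights (here simply averaged;
            # shapes are assumed to match across networks)
            new_net[i] = np.mean(donors, axis=0)
    # Partition D into a small fine-tuning subset and a test set
    rng = np.random.default_rng(0)
    idx = rng.permutation(len(D))
    cut = max(1, int(train_frac * len(D)))
    D_train = [D[j] for j in idx[:cut]]
    D_test = [D[j] for j in idx[cut:]]
    return new_net, D_train, D_test
```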
This fundamental human recipe for surviving in a rapidly evolving world, combining prior skills with newly gathered local knowledge, is missing in current AI systems. (Some individual pieces of the puzzle have already been developed in subfields of AI such as active learning and transfer learning.) How then is it possible to rapidly synthesize AI systems, leveraging previous experience and existing knowledge in a new domain, to hit the ground running? Solving this problem requires rethinking the fundamentals of existing AI architectures through the development of loosely coupled elastic architectures that can interact with humans and other AI systems, draw upon their knowledge and skills gained from previous experience, and collaboratively solve interdisciplinary problems.

3.3 Expanding the Reach of AI Through Synthesizable AI Using Peer Learning

Figure 5 depicts a loosely coupled Synthesizable AI architecture. The top layer provides the reasoning, learning, and knowledge representation functionalities. It includes models that represent human background knowledge; multiple generative models exist, such as Hidden Markov Models (HMM). In addition, SAI includes a transfer learning and an analogical reasoning framework, a deep neural network model (DNN), a statistical model (like statistical region merging), hypergraph-based models for large-scale inference together with heuristics to prune the search space, frameworks for active semi-supervised and online learning, and an automatically curated belief store (based on autoencoders) that manages the beliefs of humans and AI systems.

Fig. 5. Synthesizable AI architecture.

The middle layer allows deployment, reconfiguration, and collaboration among AI systems solving diverse problems using an elastic peer-to-peer agent architecture that exploits the top layer and provides agility to it through dynamic agent synthesis and deployment based on declaratively specified knowledge in near real time. That is, the agents use the reasoning engines, as well as rules learnt by the learning engines, to process information, learn from other agents or human expertise through transfer learning and analogical reasoning, provide classification, and make decisions. It is this layer that allows meta-learning for handling dynamically available human expert knowledge and for dealing with concept drift. It provides a single programming interface to synthesize agents. Furthermore, the same layer enforces hot deployment of these agents under operating conditions by leveraging the third layer described below. The organization of the agents in this layer can be flat (peer-to-peer) or hierarchical, where agents in upper layers are built by composing those in lower layers and can perform higher-level tasks.

The third layer is a high-performance run-time execution middleware that enables automated agent deployment and redeployment in real time through persistent hot-swapping, provides runtime monitoring for the agents, interfaces with sensors and actuators, and provides a distributed key-value store for publishing and subscribing to information by agents and sensors. Agents in the second layer can tune the runtime execution environment for optimal performance. Figure 6 shows an example flow for the Synthesizable AI architecture. The architecture creates and combines a feedback-based meta-learning paradigm that continuously monitors the performance and relevance of existing and emerging data sets.
In case the data characteristics change drastically (e.g., in streaming video analytics, where the background changes from lighted to dark as day gives way to night), a continually evaluated metric may indicate that the performance of an agent has fallen below a threshold (for example, in the case of video analytics, an unacceptable number of track overlaps, jumps, and drifts). SAI responds to this situation by dynamically replacing this agent with another more appropriate to the altered situation, or by adapting the former by transferring knowledge to it from agent(s) already experienced in such situations. A measure of transfer is used to determine which agent(s) the knowledge is transferred from in the latter case.

Fig. 6. Synthesizable AI flow diagram.

The Synthesizable AI architecture provides a practical approach to combining a multi-agent-based architecture with machine reasoning and learning. It leverages distributed and dynamic multi-agent synthesis to provide the following key features: (a) dynamically incorporating the contextual knowledge from experts into the learning system; (b) selectively using multiple learners to adapt to situation changes; (c) enabling a never-ending learning system to deal with concept drift; and (d) enabling transfer of knowledge between agents solving problems in different domains. The integrated system provides near real-time response to rapidly changing situations without quality degradation or disruption in service commitments. The architecture allows a marketplace of AI systems, which cooperate and learn from each other to solve interdisciplinary problems, to be rapidly created, deployed, and adapted (Fig. 6).

4 Evaluation and Commercialization: A New Revolution for the Next Decade

While SAI is still a work in progress, it has been commercialized by AutoPredictiveCoding LLC (http://autopredictivecoding.com) in the vertical of automated machine diagnostics. The resulting SpotCheck application [4] provided real-time machine diagnostics from emitted sounds, vibrations, and magnetic fields (Fig. 7). As deterioration of the machine lubricants, bearings, brushes, or other components occurs, very subtle changes also occur in the sounds and vibrations of the machine as it continues to operate. These sounds can be analyzed to estimate the oil quality, vacuum level, belt tension, bearing condition, and other elements, and provide real-time indications of potential internal failure. This analysis was used to drive systems longer, pushing them to their limits, while avoiding catastrophic failure and saving millions of dollars each year.

Fig. 7. Using automated diagnostics to prolong the life of industrial machinery.

4.1 Automated Machine Learning System Now Used by NASA

For terrain recognition (Figs. 8 and 9) [5, 6], the Advanced Supercomputing Division at NASA Ames has been working with Louisiana State University to blend deep learning techniques for use on existing neural networks to create a robust satellite dataset analysis system. Using a massive survey database consisting of over 330,000 scenes from across the United States, the system has been able to quickly train and learn relevant patterns. The average image tile is 6000 pixels wide and 7000 pixels deep, comprising approximately a 200 MB file for each image. The entire dataset consists of 65 TB covering a ground sample distance of one meter.
By using the SAI technology and synthesized AI, the networks can then be trained one layer at a time across very large and noisy datasets to provide the necessary fidelity for automatic terrain recognition and terrain authentication.

Fig. 8. Sample images from the SAT-4/SAT-6 dataset [3].

Fig. 9. Automated tree cover estimation.

The technology has most recently been used for automatic yield prediction (Fig. 10) and automatic infrastructure tuning [7, 8]. Through a collaboration with NASA Ames Research Center, SAI has recently been applied to determine tree cover areas and agricultural areas in California (Fig. 9). These activities will assist in monitoring potential plant disease areas in remote, inaccessible areas requiring USDA intervention.

Fig. 10. Automatic yield prediction.

4.2 SAI: A Potential Solution for US Department of Agriculture Use in Yield Prediction

Another application, the prediction of agricultural yields based upon evaluation of complex datasets, provides an excellent foundation for the evaluation of these large data sets and the establishment of automatic yield prediction, as depicted in Fig. 10. In the figure, colors delineate the predicted yield based upon an original LANDSAT tile that has been analyzed for the specific patterns most likely to yield higher growth.

Yet another emerging application is the use of Synthesizable AI to analyze and automatically color images. This will have enormous application in a variety of areas, including undersea exploration and deep space exploration, as well as analysis of remote area activities. Figure 11 depicts the application's use in automatically coloring a black-and-white terrain landscape through analysis of specific features and the system's capacity to "self-learn" based upon slight variations of terrain texture.

Fig. 11. Automatic terrain landscape coloring.

The search-based program/agent generation facility has already been used for intelligent tutoring applications in high school math education [9, 10], automated drug discovery [11], and automated program visualization [12]. These applications will continue to expand in the future.

IBM's Watson has been used commercially in IoT and the automotive industry, in social media campaigning, in medical diagnosis, in image interpretation in radiology, in natural language processing and speech recognition, in education, in financial services, in supply chain management, and in commerce. Recently, there have been applications of Watson to automated material discovery. SAI has been used in a variety of domains including automated diagnostics for industrial machinery [4], satellite image understanding [5, 6], infrastructure tuning [7, 8, 13], education [9, 10, 14], program execution visualization [12], noisy natural language processing [15], and automated drug discovery [11, 16], some of which have not yet been addressed by Watson. One of the more promising future applications of Synthesizable AI will be automatic drug discovery, an area we are only now beginning to envision. SAI technology is currently competing for the AI XPRIZE with AI-based automated drug discovery as its target.

5 The Future: Automatic Drug Discovery

In this age of vaccines and antibiotics, there is still a constant effort to find new drugs to combat illnesses for which there are no known cures.
There is a need to discover replacements for existing drugs targeted at pathogens that have become resistant to current drugs. There is also a need to develop new drug therapies for health issues adversely affecting the lives of hundreds of millions of people every day. Indiscriminate use of antibiotics has resulted in pathogens developing drug resistance, producing "superbugs" (http://www.cdc.gov/drugresistance). Although multidrug resistance in pathogens is growing fast, the number of new drugs being developed to treat bacterial infections has reached its lowest point since the beginning of the antibiotic era. The resistance is particularly problematic in the Gram-positive organisms S. aureus, E. faecalis, and S. pneumoniae, as well as a number of Gram-negative organisms including K. pneumoniae, A. baumannii, and P. aeruginosa. Hence, there is a dire need to develop new platforms and approaches to discover antibacterial agents against novel molecular targets. Not only are new drugs not being created, but the existing process of creating drugs is slow, inefficient and costly. There is a desperate need to identify new antibiotics and antimicrobials rapidly, as opposed to the time normally taken to create a drug. The solution is to develop a technique to construct libraries of molecules with the end goal of finding and developing new antibiotics and antimicrobial agents in a more efficient and cost-effective manner.

Our Synthesizable AI-based approach (in collaboration with Dr. Brylinski from LSU Biochemistry) can automatically synthesize targeted drug molecules (see http://brylinski.cct.lsu.edu/content/molecular-synthesis for the eSynth tool), filter candidates based on chemical criteria (such as being an antibiotic) [11], analyze 3D image models of the pathogen, automate clinical testing for side effects, and predict the candidate or candidates most likely to succeed. Our engine, eSynth, generates target-directed libraries using a limited set of building blocks and coupling rules mimicking active compounds. Given a set of initial molecules, eSynth synthesizes new compounds to populate the pharmacologically relevant space. The building blocks [16] of eSynth are rigids (inflexible fragments, often a single or fused aromatic group) and linkers (flexible fragments connecting rigid blocks).

The eSynth software rapidly generates a series of compounds with diverse chemical scaffolds complying with Lipinski's criteria for drug-likeness. Although these molecules may have different physicochemical properties, the initial fragments are procured from biologically active and synthetically feasible compounds. eSynth can successfully reconstruct chemically feasible molecules from molecular fragments. Figure 12 shows a 19-atom molecule rebuilt using eSynth; the process involves decomposition of the original 19-atom molecule through fragmentation and subsequent rebuilding into potentially more useful structures [16]. Furthermore, in a procedure mimicking the real application, where one expects to discover novel compounds based on a small set of already developed bioactives, eSynth can generate diverse collections of molecules with the desired activity profiles.

Fig. 12. A 19-atom molecule rebuilt using eSynth [9].

Research activity is ongoing in several new, emerging areas, as outlined in the following paragraphs.

5.1 Antibiotic/Drug Filter

The goal is for eSynth to synthesize new compounds to populate the pharmacologically relevant space. We use Lipinski's Rule of Five to ensure that the synthesized compounds have drug-like properties. Because the number of possible combinations grows exponentially with the number of molecular fragments, the Rule of Five is applied to exclude compounds that do not satisfy drug-like criteria.
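A minimal sketch of such a Rule-of-Five drug-likeness filter is shown below; RDKit and the standard thresholds are assumptions for illustration, as the paper does not name the toolkit used.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors, Lipinski

def passes_rule_of_five(smiles):
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return False
    violations = sum([
        Descriptors.MolWt(mol) > 500,        # molecular weight
        Descriptors.MolLogP(mol) > 5,        # lipophilicity (logP)
        Lipinski.NumHDonors(mol) > 5,        # hydrogen-bond donors
        Lipinski.NumHAcceptors(mol) > 10,    # hydrogen-bond acceptors
    ])
    return violations == 0                   # strict form: no violations allowed

def filter_candidates(smiles_list):
    return [s for s in smiles_list if passes_rule_of_five(s)]

# filter_candidates(["CC(=O)OC1=CC=CC=C1C(=O)O"])  # aspirin passes the filter
```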
5.2 Side-Effect/Toxicity Filter

Even after pharmaceutical companies spend years and billions of dollars creating a new drug, it is often the case that the drug has undesirable side effects that render it unusable. To detect side effects, the companies must conduct extensive clinical trials that consume years of effort and billions of dollars. All that money and effort is ultimately wasted if the drug has a negative side effect, in which case it is rejected by the FDA.

5.3 Synthetic Accessibility Analysis

Natural products are a source of ingredients for many drugs. Some of these natural products are hard to acquire, and it is also difficult to analyze the molecular structure of a compound for negative side effects. We use deep neural network models that, from the molecular structure of a natural product, can predict its synthetic accessibility score. For compounds with high scores, it is possible to synthesize them using eSynth and analyze their side effects.

5.4 Automatic Drug Repurposing

Based on features extracted from 3D image models of the pathogens and those of drugs, learning models will be used to repurpose existing drugs for new diseases.

5.5 Other Future Applications

Another application that SAI has been focusing on is automated vulnerability analysis. SAI has been able to automatically localize the "attack surface" of an application. Current research is focusing on automatically patching such vulnerabilities as well as extending the analysis to large cyber infrastructures. SAI is also being targeted at the automatic lighting control domain in smart buildings.

6 Limitations

For continued expansion to synthesize new compounds in pharmacology, SAI and eSynth must be strengthened through the use of more extensive deep neural networks to determine side effects. We are currently evaluating and using deep neural network models to predict possible side effects from the molecular structure and the bondings in the drug molecule.

7 Conclusion

The future for artificial intelligence remains bright. Each day, new technologies such as Synthesizable AI can be called upon to rapidly assume even "deeper roles" in interdisciplinary areas ranging from open street maps, cybersecurity and power systems to kidney stone surgery, through analysis of extreme and complex events, ever larger datasets, and the utilization of newer computing architectures [17, 18].

References

1. Iyengar, S., Mukhopadhyay, S., Steinmuller, C., Li, X.: Preventing future oil spills with software-based event detection. IEEE Comput. 43(8), 95–97 (2010)
2. Karki, M., DiBiano, R., Basu, S., Mukhopadhyay, S.: Core sampling framework for pixel classification. In: Proceedings of the International Conference on Artificial Neural Networks (2017)
3. Basu, S., Karki, M., Mukhopadhyay, S., Ganguly, S., Nemani, R., DiBiano, R., Gayaka, S.: A theoretical analysis of Deep Neural Networks for texture classification. IJCNN 2016, 992–999 (2016)
4. DiBiano, R., Mukhopadhyay, S.: Automated diagnostics for manufacturing machinery based on well regularized deep neural networks. Integr. VLSI J. 58, 303–310 (2017)
5. Basu, S., Ganguly, S., Nemani, R., Mukhopadhyay, S., Zhang, G., Milesi, C., et al.: A semi-automated probabilistic framework for tree cover delineation from 1-m NAIP imagery using a high performance computing architecture. IEEE Trans. Geosci. Remote Sens. 53(10), 5690–5708 (2015)
6. Basu, S., Ganguly, S., Mukhopadhyay, S., DiBiano, R., Karki, M., Nemani, R.: DeepSat: a learning framework for satellite imagery. In: Proceedings of ACM SIGSPATIAL 2015 (2015)
7. Sidhanta, S., Golab, W., Mukhopadhyay, S., Basu, S.: Adaptable SLA-aware consistency tuning for quorum-replicated data stores. IEEE Trans. Big Data 3, 248–261 (2017)
8. Sidhanta, S., Mukhopadhyay, S.: Infra: SLO aware elastic auto scaling in the cloud for cost reduction. In: IEEE BigData Congress, pp. 141–148 (2016)
9. Alvin, C., Gulwani, S., Majumdar, R., Mukhopadhyay, S.: Synthesis of geometry proof problems. In: Proceedings of AAAI, pp. 245–252 (2014)
10. Alvin, C., Gulwani, S., Majumdar, R., Mukhopadhyay, S.: Synthesis of solutions for shaded area geometry problems. In: Proceedings of FLAIRS (2017)
11. Naderi, M., Alvin, C., Ding, Y., Mukhopadhyay, S., Brylinski, M.: A graph-based approach to construct target focused libraries for virtual screening. J. Cheminform. 8, 14 (2016)
12. Alvin, C., Peterson, B., Mukhopadhyay, S.: StaticGen: static generation of UML sequence diagrams. In: Proceedings of the International Conference on the Foundational Aspects of Software Engineering (2017)
13. Mukhopadhyay, S., Iyengar, S.S.: System and architecture for robust management of resources in a wide-area network. US Patent 9,240,955, issued January 2016
14. Alvin, C., Gulwani, S., Majumdar, R., Mukhopadhyay, S.: Synthesis of problems for shaded area geometry reasoning. In: Proceedings of AIED (2017)
15. Basu, S., Karki, M., Ganguly, S., DiBiano, R., Mukhopadhyay, S., Gayaka, S., Kannan, R., Nemani, R.: Learning sparse feature representations using probabilistic quadtrees and deep belief nets. Neural Process. Lett. 1–13 (2016). https://doi.org/10.1007/s11063-016-9556-4
16. Liu, T., Naderi, M., Alvin, C., Mukhopadhyay, S., Brylinski, M.: Break down in order to build up: decomposing small molecules for fragment-based drug design with eMolFrag. J. Chem. Inf. Model. 57, 627–631 (2017)
17. Boyda, E., Basu, S., Ganguly, S., Michaelis, A., Mukhopadhyay, S., Nemani, R.: Deploying a quantum annealing processor to detect tree cover in aerial imagery of California. PLoS ONE (2017)
18. Ganguly, S., Basu, S., Nemani, R., Mukhopadhyay, S., Michaelis, A., Votava, P., Milesi, C., Kumar, U.: Deep learning for very high resolution imagery classification. In: Srivastava, A., Nemani, R., Steinhaeuser, K. (eds.) Large-Scale Machine Learning in the Earth Sciences. CRC Press, Boca Raton (2017)

Cognitive Natural Language Search Using Calibrated Quantum Mesh

Rucha Kulkarni, Harshad Kulkarni, Kalpesh Balar, and Praful Krishna

Arbot Solutions Inc., dba Coseer, San Francisco, CA 94105, USA, praful@coseer.com

Abstract. This paper describes the application of a search system for helping users find the most relevant answers to their questions from a set of documents. The system is developed based on a new algorithm for Natural Language Understanding (NLU) called Calibrated Quantum Mesh (CQM). CQM finds the right answers instead of documents. It also has the potential to resolve confusing and ambiguous cases by mimicking the way a human brain functions.
The method has been evaluated on a set of queries provided by users. The relevant answers given by the Coseer search system have been judged by three human judges as well as compared to the answers given by a reliable answering system called AskCFPB. Coseer performed better in 57.0% of cases and worse in 16.5% of cases, while the results were comparable to AskCFPB in 26.6% of cases. The usefulness of a cognitive computing system over a Microsoft-powered, keyword-based search system is discussed. This is a small step toward enabling artificial intelligence to interact with users in a natural manner, as in an intelligent chatbot.

Keywords: Chatbot · Cognitive computing · Natural Language Processing (NLP) · Cognitive search · Natural Language Understanding (NLU)

1 Introduction

Natural Language Search and one of its prominent applications, chatbots, are popular topics in the field of technology as well as research. Their popularity can be attributed to their tremendous potential and promise in several fields [1–6]. There are several areas of business, for example brand-building, customer acquisition, product discovery, and support, that require human interaction. There is always a high cost related to human labor, as well as inaccuracy related to fatigue and general human biases and errors. An automation system based on Natural Language Search can remove several of these problems by simply replacing the human. A well-designed chatbot, for example, can be used to facilitate the internal processes of a business. A chatbot, if successfully developed as a subject matter expert, can be deployed to any part of the business so that any employee or customer can retrieve important information from it at any time.

However, in their current state, a clear majority of systems based on NLU are not well designed or accurate enough. High accuracy is necessary so that business managers can entrust them with mission-critical roles and tasks. Highly advanced Artificial Intelligence (AI) technologies like deep learning have been tremendously successful in analyzing structured data [7–9]. However, when it comes to unstructured data, especially processing natural language like English, they seem to fail. For a technology like deep learning to be successful, it needs a considerable amount of training data, which might not be available to enterprises. Moreover, such data must be annotated by subject matter experts, which can be prohibitively expensive.

Most intelligent natural language systems like chatbots fail because they are unable to interact with and process content the way human beings do. Frequently, they are based on keyword correlation, which does not enable them to "understand" the relations between words and their context. Humans process information around certain ideas. Ideas are entities that are expressed by words and phrases and the complex relationships between them, something computer systems cannot trivially handle. Thus, they are unable to retrieve meaning from information. Some essential characteristics of human thought are: focusing on ideas rather than words, prioritizing ideas based on significance and credibility, and knowing when there is not enough information available to make a decision. Intelligent machines capable of producing high accuracy can be designed based on the imperatives mentioned above without relying on keywords.
They can extract ideas, order them, store them in a hierarchical data structure and even derive context from live conversations. This type of approach o?ers a signi?cant advantage over traditional chatbots in terms of capability and performance. This unique paradigm of intelligent understanding of information is captured in one branch of AI technology: cognitive computing [10–15]. Cognitive computing can be used to automate tedious, repetitive and language-driven work?ows that do not require human intelligence anymore. This would allow the humans to focus on creativity and judgment while the machines take care of the mundane jobs. In this work, we have developed a Natural Language Search system that can help users with their queries. It analyzes the query placed by the user and suggests relevant answers from a list of Frequently Asked Questions (FAQ). The reported answer may be a direct match with an existing entry in the FAQ or produce a solution that is part of some other entry. To evaluate the performance of the system, we used a human judge as well as compared the results with that of AskCFPB [16]. AskCFPB is a well-established and trustworthy resource to get answers maintained by the Consumer Finance Protection Bureau of United States Government. It covers a variety of topics including bank accounts, credit reports, credit scores, debt collection, student loans and mortgages. There is a search box on the website where the users can enter their queries and look at related questions and answers. This system is powered by popular Microsoft search engine – Bing. The rest of the paper describes the method, the evaluation criteria used, and the results of the evaluation. We close with discussion on future work already underway at Coseer. Cognitive Natural Language Search Using Calibrated Quantum Mesh 679 2 Methods 2.1 Tactical Cognitive Computing All Coseer systems are built using Tactical Cognitive Computing (TCC). TCC is a programming paradigm with a focus on high accuracy, short training times and low cost. Tactical Cognitive Computing has been developed as a solution to traditional cognitive computing systems that are expensive and take years to implement. To be called tactical a cognitive computing system must be highly accurate. While lower level accuracy has been accepted and even lauded in the consumer world, the businesses need highly reliable systems. A TCC system must also be quick to train. The key factor in enabling a quick training time is a system’s ability to train without annotated training data. Annotation of training data typically needs subject matter experts that are very expensive. Annotation is also a time-intensive e?ort – some prominent implementations have taken years to train the data. Finally, a TCC system must be con?gurable, at low cost, to a wide variety of situa- tions in an enterprise. A key component of such con?gurability is the ability of tactical cognitive computing systems to be deployed over commoditized hardware in public cloud, private cloud or on-premise. Coseer’s implementation of TCC for natural language uses our work with Calibrated Quantum Mesh (CQM) and cognitive calibration, apart from various techniques in natural language processing, natural language understanding, and arti?cial intelligence. 2.2 Calibrated Quantum Mesh Calibrated Quantum Mesh (CQM) is a novel AI algorithm that is speci?cally built for understanding natural language as human beings do. 
It does not need annotated training data and reduces the need for unannotated data to a fraction. CQM works on three basic principles, as shown in Fig. 1: Multiple Meanings. CQM recognizes that any symbol, word or text can have more than one meaning or quantum states with di?erent probabilities. It considers all these possible states to ?nd the most probable answer. Interconnectedness. CQM recognizes that everything is correlated to each other and modi?es each other’s behavior. Speci?cally, each item can in?uence the probability distribution across quantum states of all other items it is connected to. CQM considers such mesh of interconnections to reduce error. Calibration. CQM sequentially adds all available information to help converge the mesh into a single meaning. The calibration process is fast, accurate and e?cient in detecting any lacunas. The calibrations are implemented on training data, contextual data, reference data and other known facts about the problem. Sometimes these cali- brating systems called Calibrating Data Layers are handled by an independent CQM module or another AI process. 680 R. Kulkarni et al. Fig. 1. Basic tenets of Calibrated Quantum Mesh (CQM). Cognitive Natural Language Search Using Calibrated Quantum Mesh 681 When the training data is passed through CQM, it de?nes many of the mesh’s inter- relationships. Where applicable, data layer algorithms learn from such data. Often new relations and nodes are added to the mesh, making it smarter. When a work?ow is modeled by CQM, the creation of any black boxes is avoided to the maximum extent. This ensures transparency and interpretability of the models. We note that keywords are not important for CQM in processing natural language. Complex ideas are represented by di?erent parts of the mesh with varying complexity. This enables the algorithm to handle ?uid, multi-state and inter-connected knowledge – some inherent criteria of natural language. The algorithm can also learn from non-direct corpora. For example, while assisting a UK tax advisory, it was executed over HMRC.com, Law.com, Investopedia and a proprietary glossary. The most important advantage of CQM is that it does not need annotated training data. As a result, training a CQM model is very fast and cost-e?ective. It also allows iterations over the training process leading to highly accurate results. This capability quali?es CQM based systems to be part of TCC. 2.3 Cognitive Natural Language Search System A cognitive search system can be applied to understand and interpret textual data in a natural way (Fig. 2). We used an algorithm based on CQM, which is also a TCC system, to develop a Natural Language Search system. We applied the Coseer system to assist users of AskCFPB with their questions. Fig. 2. Overview of the cognitive search system. The search system has two main steps: ingestion and search. In the ingestion step, documents are interpreted by the CQM and processed into relevant data structures. In this case, it was the FAQs that were processed and stored in a database. Then a search module takes the query as input and searches the database for the relevant text or a snippet. The relevant text is then sent to the user as a possible answer to the query. 682 R. Kulkarni et al. 3 Evaluation Criteria The cognitive search system was evaluated in the following ways: Accuracy. This criterion measures how accurately the system answers the queries. It was calculated by dividing the number of queries correctly answered by the total number of queries. 
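Concretely, such a tally can be computed in a few lines. The sketch below is hypothetical and is not Coseer's evaluation code; in particular, aggregating the three judges' marks by majority vote is our assumption, and it relies on the top-three relevance judgment described next.

from typing import List

def query_answered(judge_marks: List[bool]) -> bool:
    """A query counts as answered if a majority of judges found at least one
    of the system's top-three results satisfactory (assumed aggregation)."""
    return sum(judge_marks) > len(judge_marks) / 2

def accuracy(all_marks: List[List[bool]]) -> float:
    """Accuracy = correctly answered queries / total queries."""
    answered = sum(query_answered(marks) for marks in all_marks)
    return answered / len(all_marks)

# Example with hypothetical data: three judges' verdicts for four queries.
marks = [[True, True, False], [True, True, True],
         [False, False, True], [True, False, True]]
print(f"accuracy = {accuracy(marks):.1%}")  # 75.0%
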
The search system was tested with 158 queries. For each query, the top three results returned by the system were evaluated by three human judges. The results were marked as relevant if any of the top three results satisfactorily answered the question. Comparative Performance. This evaluation criterion demonstrates how well the search system performs as compared to AskCFPB search. AskCFPB was selected for comparison because it is the most closely related search system. This system is powered by Bing Search Engine. For this evaluation criteria, the same 158 queries were tested on both the search system. Three human judges evaluated the top snippet in the following categories: COMPARABLE, COSEER_BETTER and ASKCFPB_BETTER, according to which result seemed more relevant to the query. While AskCFPB returns documents, not answers, we considered most relevant snippet identi?ed by Bing Search Engine. We acknowledge that this is a very stringent evaluation criterion towards Coseer systems. Amount of Training Data Necessary (Not Used). A third evaluation criterion that is not used in this evaluation is the amount of training data that is necessary to train a tactical cognitive system. Typically, TCC systems need a fraction of data than other AI systems, and do not need them to be annotated. In this evaluation an untrained model was used. 4 Results and Discussion For the accuracy calculation, 130 out of the 158 queries were correctly answered by the Coseer cognitive search system, as evaluated by the human judges. This computes to 82.3% accuracy. This seems to be reasonable considering that the system was not trained for this subject matter. For the comparative study, 158 new queries were considered. Figure 3 shows the results of the comparative study. Out of the 158 queries, 26.6% showed comparable results. In 16.5% of the cases, AskCFPB performed better than Coseer and in 57.0% of the cases, Coseer performed better than AskCFPB. To get further insight into why one system works better than the other, we reported a couple of representative cases. Cognitive Natural Language Search Using Calibrated Quantum Mesh 683 Fig. 3. Results of comparative performance between AskCFPB and Coseer. Table 1 shows two queries where Coseer performed better than AskCFPB. Table 1. Cases where Coseer performed better than AskCFPB. Query Coseer answer AskCFPB answer How long do mortgages normally last? How can I determine how long it will take me to pay o? my mortgage loan? What exactly happens when a mortgage lender checks my credit? What type of rent information is on my credit report? At least one of the big three consumer reporting agencies, Experian, uses rental payment and collection information in its credit reports What is a credit report? - Consumer Financial Protection… There are several reasons behind the better performance of Coseer over AskCFPB. Unlike AskCFPB, Coseer considers the context and the meaning of the query. It provides emphasis on the functional words like ‘how long’ instead of matching keywords. Similarly, Coseer considers all other possible meanings of the search query to execute its search. Special attention is given to the important phrases, abbreviations, and colloquialisms. Table 2 reports a couple of cases where AskCFPB performed better than Coseer. The second query in Table 2 is of special interest. Although the question here is whether paying rent on time would strengthen credit history, the information about a weakening of the credit history due to late payment is very relevant. 
Even though it appears to be diametrically the opposite answer, AskCFPB has correctly recognized such an answer as relevant. Coseer algorithm can be further improved by teaching it how to handle such cases. 684 R. Kulkarni et al. Table 2. Cases where AskCFPB performed better than Coseer. Query Coseer answer AskCFPB answer What info does a credit report show? If the investigation shows the company provided wrong information about you, or the information cannot be veri?ed, the company must notify all the credit reporting companies to which it provided the wrong information… A credit report is a statement that has information about your credit activity and current credit situation such as loan paying history and… Can I build my credit history by paying my rent on time? You have a steady source of income and a good record of paying your bills on time. Lenders will look at your ability to repay the mortgage… Could late rent payments or problems with a landlord be in my credit report? 5 Limitations, Conclusions and Future Work The most signi?cant limitation of the study is that an untrained AI system was used. In future, it is necessary to train a system to achieve more than 90% accuracy as per the ?rst evaluation criterion. In that study, we will also be able to compare the two systems as per the third evaluation criterion - how much data is necessary to train the system? Although Natural Language Search is an exciting and popular technology with ever increasing areas of applications, its ability to interact with people in a natural manner remains at an early stage. We applied a tactical cognitive computing system in conju- gation with calibrated quantum mesh to develop a chatbot that helps customers with their questions. The search system demonstrated reasonable accuracy in assisting the users to ?nd the answers to their queries. Although there are several opportunities to improve, this comparative study demonstrates the usefulness of such an approach over typical key-word based natural language processing systems. It recommends cognitive computing as a key player in solving di?cult problems that require humanlike thinking, ability to reason and extract meaning from information. We plan to extend CQM for other basic cognitive processes like processing intona- tions in speech, translating ideas back into words and perhaps processing and expressing unarticulated thoughts and emotions in text. Idea-oriented chatbots can be the key to assimilating human and computing worlds. Coseer’s solutions demonstrate that we are already capable of designing and training machines to process information like humans do, talk like humans do and provide busi- ness value as humans do. Since the chatbots can run round the clock, at a fraction of the cost of a human resource and with high accuracy, it is perhaps not an overstatement to say that the future of the chatbot could be the future of all business. Cognitive Natural Language Search Using Calibrated Quantum Mesh 685 Acknowledgment. We thank the larger team of Coseer for developing the system. We also thank Obaidur Rahaman for assistance in preparing the manuscript. References 1. Ghose, S., Barua, J.J.: Toward the implementation of a topic speci?c dialogue based natural language chatbot as an undergraduate advisor. In: 2013 International Conference on Informatics, Electronics and Vision (ICIEV), pp. 1–5 (2013) 2. Heller, B., et al.: Freudbot: an investigation of chatbot technology in distance education. 
In: EdMedia: World Conference on Educational Media and Technology, pp. 3913–3918 (2005) 3. Hill, J., et al.: Real conversations with artificial intelligence: a comparison between human–human online conversations and human–chatbot conversations. Comput. Hum. Behav. 49, 245–250 (2015) 4. Huang, J.Z., et al.: Extracting chatbot knowledge from online discussion forums (2007) 5. Jia, J.: The study of the application of a web-based chatbot system on the teaching of foreign languages. In: Society for Information Technology and Teacher Education International Conference, pp. 1201–1207 (2004) 6. Jia, J.Y.: CSIEC: a computer assisted English learning chatbot based on textual knowledge and reasoning. Knowl.-Based Syst. 22, 249–255 (2009) 7. Goodfellow, I., et al.: Deep Learning, vol. 1. MIT Press, Cambridge (2016) 8. LeCun, Y., et al.: Deep learning. Nature 521, 436 (2015) 9. Schmidhuber, J.: Deep learning in neural networks: an overview. Neural Netw. 61, 85–117 (2015) 10. Ferrucci, D.A.: Introduction to "This is Watson". IBM J. Res. Dev. 56 (2012) 11. Li, Y., et al.: Cognitive computing in action to enhance invoice processing with customized language translation. Presented at the 2017 IEEE 1st International Conference on Cognitive Computing (2017) 12. McCord, M.C., et al.: Deep parsing in Watson. IBM J. Res. Dev. 56 (2012) 13. Amir, A., et al.: Cognitive computing programming paradigm: a corelet language for composing networks of neurosynaptic cores. In: The 2013 International Joint Conference on Neural Networks (IJCNN), pp. 1–10 (2013) 14. Cassidy, A.S., et al.: Cognitive computing building block: a versatile and efficient digital neuron model for neurosynaptic cores. In: The 2013 International Joint Conference on Neural Networks (IJCNN), pp. 1–10 (2013) 15. Esser, S.K., et al.: Cognitive computing systems: algorithms and applications for networks of neurosynaptic cores. In: The 2013 International Joint Conference on Neural Networks (IJCNN), pp. 1–10 (2013) 16. Dhoat, K.K.: Cognitive Search Technique for Textual Data. College of Engineering, Pune (2013)

Taxonomy and Resource Modeling in Combined Fog-to-Cloud Systems

Souvik Sengupta, Jordi Garcia, and Xavi Masip-Bruin
Advanced Network Architectures Lab, CRAAX, Universitat Politècnica de Catalunya, UPC BarcelonaTech, Vilanova i la Geltrú, 08800 Barcelona, Spain
{souvik,jordig,xmasip}@ac.upc.edu

Abstract. As technology rapidly evolves, society as a whole is gradually being surrounded by the Internet. In such a high-connectivity scenario, the recently coined IoT concept becomes a commodity, driving the data generation rate to increase swiftly. To process and manage these data in an efficient way, a new strategy, referred to as Fog-to-Cloud (F2C), has recently been proposed, leveraging two existing technologies, fog computing and cloud computing, in which resources play a pivotal role in managing data efficiently. In these scenarios, vast numbers of interconnected heterogeneous devices coexist, forming a complex set of devices. Managing these devices efficiently requires a proper resources classification and organization. In this paper, we offer a model to classify and taxonomize the whole set of resources, aimed to best suit the Fog-to-Cloud (F2C) paradigm.

Keywords: Fog-to-Cloud (F2C) · Taxonomy · Ontology · Resources classification · Class diagram

1 Introduction

Technologies are rapidly evolving, driving the whole society towards a new era of smart services.
Indeed, day by day, we are moving towards the ‘smart’ to the ‘smarter’ world. As per the United Nation [1], by 2050 about 64% of the developing world and 86% of the developed world will be urbanized. Also as per some statistics [2], by 2050 more than 70% of world population will be living in a smart environment, where most of the things will connect to the network. Gartner Inc. [3] forecasts that by 2020 almost 20.4 billion connected things will be in use worldwide. Also by 2022, M2M tra?c ?ows are expected to constitute up to 45% of the whole Internet tra?c [4]. Beyond these predictions, the McKinsey Global Institute [5] reported in 2015 that the number of connected machines (units) had grown 300% over the last ?ve years. Tra?c monitoring of a cellular network in the US also showed an increase of 250% for M2M tra?c volume in 2011. Also, Cisco [6] predicted that 50 billion objects and devices would be connected to the Internet by 2020. However, although more than 99% 9 c Springer Nature Switzerland AG 2019 K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 687–704, 2019. https://doi.org/10.1007/978-3-030-02686-8_52 688 S. Sengupta et al. of today’s available things in the world remain unconnected, several pieces of evidence de?ne the connectivity trend in di?erent sectors. The following two examples are highlighting the fact. According to a Navigant research report [7], the number of installed smart meters around the world will grow to 1.1 billion by 2022. Another report from Automotive News [8], states that the number of cars connected to the Internet worldwide will increase from 23 million in 2013 to 152 million in 2020. By following up all the trends and above scenarios, it is clear that IoT con-nected devices are going to rule over the smart environment, being a key com-ponent in the whole system. In short, the envisioned ‘smart’ scenario consists in a massive amount of IoT devices, highly distributed over the network, along with a set of highly demanding services, some of them not yet foreseen though. It is also widely accepted that the bene?ts of cloud computing bring to handle high processing and storage services demands. However, it is also recognized that cloud data centres may fail to deal with services demanding strict low latency, mainly due to the distance from the cloud -where the data is to be processed - to the edge -where the data is to be collected, and the user is. As a conse-quence, some critical undesired e?ects, such as network congestion, high latency in service delivery and reduced Quality of Service (QoS) are being experienced [9]. By addressing these problems, the fog computing recently came up, rely-ing on adding processing capabilities between the cloud data centre and the IoT devices/sensors, thus aimed at extending the cloud computing facilities to the edge of the network [9,10]. However, interestingly, it is also recognized the fact that fog computing is not going to compete with cloud computing, instead collaborate both together, intended to provide better facilities to the next gen-eration of computing and networking platforms [10]. Indeed, the whole scenario may be seen as a stack of resources, from the edge up to the cloud, where a smart management system may adequately allocate resources best suiting ser-vices demands, regardless where the resources are, either at the cloud or fog. The recently coined Fog-to-cloud (F2C) architecture [11], has been proposed intended to build such a coordinated management framework. 
Therefore, it is clear that the development and combination of new technologies (i.e., IoT, Cloud, and Fog computing, etc.) o?ers a multi fascinate solution for the future smart scenario. Unfortunately, the enormous diversity of devices makes such a management system not easy to deploy. Indeed, e?cient and proper management of such a set of heterogeneous devices is a crucial challenge for any IoT computing platform to succeed. However, to facilitate the design of the suggested resources coordinated management framework, it is essential to know what the resources characteristics and attributes are, thus building some resources catalogue. This paper aims to identify a resource taxonomy and resource model, suitable for a coordinated F2C system, as a mandatory step towards a real F2C management architecture deployment. The rest of the paper is organized as follows. Section 2 positions the current state of the art. Next, Sect. 3 presents an architectural overview for the coordi-nated Fog-to-Cloud paradigm. In Sect. 4, we show a class diagram to represent Taxonomy and Resource Modeling in Combined Fog-to-Cloud Systems 689 our taxonomic view of the F2C resources, and also we discuss on the various taxonomic parameters considered to make the classi?cation of an F2C resource. Following up the previous section, in Sect. 5 we represent and de?ne the gener-alized resource model for the F2C computing platform. To support our resource model, we present some examples of real devices participating in the F2C sys-tem. Finally, some concluding remarks and future directions of our research work given in Sect. 6. 2 State of the Art: Related Work and Motivation For any management system, proper utilization of resources undoubtedly facili-tates an optimal service execution and hence helps to build an e?ective manage-ment solution. Most importantly, to manage the whole set of resources, it is very much essential to have them categorized and classi?ed into a resources catalogue. Apparently, to build such a description is necessary to identify the character-istics and attributes of the resources to be organized. In this paper, we aim at determining a resource classi?cation and taxonomy, for a scenario combining fog and cloud resources, like the one, we envisioned by the F2C. The underlying objective of such a classi?cation is to describe a catalogue of resources, where resources are formally de?ned, thus easing both an e?cient resources utilization and an optimal services execution. In a previous work [12], we put together a comprehensive literature survey, highlighting the resource characteristics for distinct computing paradigms and also observed several interesting ?ndings. We found that, in most cases, hardware components (i.e., memory, storage, processor, etc.), software (i.e., - APIs, OS, etc.) and network aspects (i.e., - protocol, standards etc.) of the devices [13,14] have been considered to classify the edge resources. Even for grid resources hard-ware components have also been studied (i.e., storage capacity, memory etc.), to classify them [15,16]. We recognized the relevance of e?cient network man-agement to build a dynamic computing platform, many references [13,17,18], put the focus to identify the networking standards, technology and bandwidth capacity. Interestingly, after revisiting the literature, we found that in most of the fog and edge computing related work focuses on the network bandwidth as the essential characteristics for e?cient network management. 
It is worth high-lighting the fact that, the closer to the edge resources are, the more signi?cant the impact on the access network is. Indeed, access networks become a criti-cal part of the whole network infrastructure concerning the quality provision-ing, congestion, real capacity and availability and also the part where devices’ mobility brings signi?cant collateral e?ects on performance. Hence, as a sum-mary, network bandwidth - as well as other network attributes at the edge must be undoubtedly considered as a critical characteristic to characterize a resource. Also, di?erent edge devices may use di?erent networking standards and tech-nologies to communicate [13]. So consideration of the networking standards and techniques are also mandatory when categorizing a resource. 690 S. Sengupta et al. Di?erently, in the cloud arena, no such concerns are found for processing, storage, power, or network (i.e., bandwidth) capacities of the cloud resources. Interestingly, researchers have given their focus on managing the security, pri-vacy, and reliability aspects [18,19] in the cloud paradigm. We also found that cost management (i.e., charges for access and utilization of resources), is one of the crucial aspects to build an e?cient Cloud platform [20]. Indeed, several works propose a cost model for system resources and services [18,20]. After a compre-hensive reading (see [12] for more details) we may conclude that: (i) most of the cloud-resources have some unique features - e.g., they are centralized, fault-tolerant [18,20–22] etc.; (ii) in IoT, edge or fog, resources are geographically distributed [12,21,23], while much agiler than cloud resources and suitable for supporting real-time services. In summary, we may quickly assess that there are a signi?cant variety and diversity of system resources, what undoubtedly makes resource categorization a challenging task. 3 An Overview of the F2C Architecture The F2C has been introduced as a framework intended to both optimize the resources utilization and improving the service execution, the latter through an optimal mapping of services into the resources best suiting the services demands. To that end, resources categorization becomes an essential component for a suc-cessful F2C deployment. Consequently, an accurate description of the di?erent attributes and characteristics to be used to categorize a resource e?ciently. Just as an illustrative example, Fig. 1 depicts a picture showing how an F2C deploy-ment in a smart city may look like, mainly representing the technological inte-gration of the Cloud, Fog/Edge and IoT resources. Fig. 1. Fog and cloud resources deployment in a Smart City. Taxonomy and Resource Modeling in Combined Fog-to-Cloud Systems 691 It is pretty apparent to observe the fact, that in a smart city, as shown in the Fig. 1, several distinct and heterogeneous fog node devices may be found (i.e., smartphone, smartwatch, car, etc.) and also many IoT devices (i.e., surveillance camera, temperature sensor etc.) can be connected or attached with them. We also identify that several devices may become the leader fog node (i.e., road-side unit, etc.) and each of them serve as the fog service provider of a particular fog area to the smart city. Similarly, many di?erent cloud providers may take over the provisioning of cloud facilities to the citizens. 
The F2C solution, designed to be a coordinated management platform, facilitates optimal management of this broad set of heterogeneous resources (i.e., -IoT devices, fog nodes, cloud resources, etc.). Unquestionably, the supervision of heterogeneous resources is a crucial characteristic of the F2C platform. Thus, before devoting e?orts to categorize the resources, it is mandatory to revisit what the main aspects of F2C are. In [11], the F2C is proposed as a combined, hierarchical and layered architecture, where cloud resources reside at the top layer, the IoT layer at the bottom consists in the set of IoT devices, and several intermediate fog layers are considered bringing together the collection of heterogeneous edge devices. In Fig. 2, we represent the hierarchical structure of the F2C architecture. Following the hierarchical structure of the F2C architecture and considering the smart city scenario, we found that the leader fog node of each fog area is responsible for communicating with the upper layer resources in the F2C platform. Also, the leader fog node is responsible for informing the upper layer resources about the total resource information of its fog area. It is worth emphasizing the fact that the concept of fog node has not widely converged towards a unique de?nition yet. Although in a general view, this paper is only using such fog node concept to represent a device belonging to fog (or by extension to F2C), readers interested in this topic may ?nd a more elaborated discussion on its meaning in [24]. Fig. 2. Hierarchical architecture of F2C paradigm. 692 S. Sengupta et al. Authors in [11], highlight the need to have a comprehensive devices control and management strategy to build an e?cient F2C system. As said earlier, it is essential to correctly identify the resources characteristics and behaviour for a successful F2C deployment. Indeed, by adequately identifying resources charac-teristics and their behaviours - helps to build an e?cient taxonomy of resources of the F2C paradigm. This taxonomy would help the services to resources map-ping process and thus optimizing the service execution. In the next Sect. 4, we present the taxonomic view of F2C resources and later, in Sect. 5, we present a generalized resource model for the F2C paradigm. 4 Proposing Taxonomy of the F2C Resources The enormous diversity, heterogeneity and variety envisioned for the whole set of resources from the edge up to the cloud, makes resources management in Fog-to- Cloud a challenging e?ort. From a broad perspective, it is pretty evident that the closer to the top (i.e., cloud) the larger the capacities are. Thus, we may undoubtedly assess that computation, processing and storage capabilities are higher in the cloud than in fog and higher in fog than in the edge. Interestingly, in the F2C envisioned scenario this assessment is even more elaborated, leveraging the di?erent layers foreseen for fog. Indeed, in F2C di?erent layers are identi?ed to meet di?erent characteristics of distinct devices. Thus, considering the current state of the art contributions, the speci?c layers architecture de?ned in F2C and the potential set of attributes to characterize each one of them, we propose a taxonomy for characterizing resources in an F2C system, as described next. In the collaborative model foreseen in an F2C system, devices may partici-pate as either ‘Consumer’, ‘Contributor’, or ‘Both’ of them. 
When a device acts as a ‘Consumer’, the device gets into the F2C system to execute services, thus being a pure resources consumer. When acting as ‘Contributor’, the device offers its resources to both itself and third users (in a future collaborative scenario) to run services. Finally, some resources can act as ‘Both’, hence not only accessing (i.e., consuming) some services but also contributing their resources to support services execution. Thus, according to the participation role, in a first approach resources in an F2C system may be classified into three distinct types. However, although the participation role is a key aspect, many other attributes and characteristics must be considered as well in order to accommodate the large heterogeneity of resources, including Device attributes (hardware, software, network specification, etc.), IoT components & Attached components (sensors, actuators, RFID tags, other attached device components), Security & Privacy aspects (device hardware security, network security and data security), Cost information (chargeable device, non-chargeable device), and History & Behavioural information (participation role, mobility, life span, reliability, information of the device location, etc.).

Fig. 3. The ontology-based F2C resource classification.

4.1 Taxonomy Modeling: Based on Ontology

In this paper, we present an F2C resource taxonomy model leveraging a proposed ontology. To present the ontology-based resource taxonomy model in the F2C paradigm, we adopt the classification method proposed by Perez [25]. According to the ontological model, the modeling elements are divided into five basic modeling primitives: classes, relations, functions, axioms, and instances. The ontology model O is depicted in Fig. 3 and is expressed as:

O = {C, R, F, A, I}   (1)

C represents the classes or concepts, which can be further classified and subdivided into basic classes Ci. R represents the collection of relations, mainly containing four basic types: part-of, kind-of, instance-of and attribute-of. F represents the collection of functions, which can be formalized as:

F: C1 × C2 × C3 × ... × Cn−1 → Cn   (2)

A represents the collection of axioms, and I represents the collection of instances. Based on the ontological model described above, this paper analyzes the basic elements C (class) and R (relation), according to the attributes and expected behaviour of the whole set of resources in an F2C system. This analysis will help to both propose the resource taxonomy for F2C and build the resource description model for F2C.

4.2 F2C Resource Taxonomy: View of the Class Diagram

Adopting the ontological model described above and following the attributes and expected behaviour of the F2C system resources, Fig. 4 depicts, in the form of a class diagram, the taxonomy proposed for F2C resources.

Fig. 4. Class diagram of the F2C resource taxonomy: a completed model in Protégé.

According to the proposed class diagram, all resources in F2C can be initially classified according to five different classes, each one further divided into several sub-classes. Next, we present a brief description of each class and its subclasses. 1. Device attributes - Devices participating in an F2C system can be classified according to their hardware, software, networking specification and also by considering their type.
– Hardware components - In an F2C system, storage, processor, main mem-ory, graphics processing unit, and power source information of a device help to classify them further. – Software components - To participate in any service-oriented comput-ing paradigm, devices must have an entry point, ‘software’ or ‘applica-tion’. We assume that devices can join an F2C system in two ways: (i) devices have the application or software copy installed, or; (ii) devices must connect to another device, running the application or software copy. According to the F2C architecture, two types of entry point are identi?ed for F2C resources: (i) one for cloud resources, and; (ii) another one for the fog resources. This characteristic must also be considered to classify F2C resources. Finally, also the operating system information and other installed apps and APIs information will help classify them. – Device type - Devices participating in an F2C system can be either phys-ical or virtual device. Taxonomy and Resource Modeling in Combined Fog-to-Cloud Systems 695 – Networking information - According to the large diversity of devices envi-sioned in an F2C system; devices are expected to use several di?erent networking standards and technologies (i.e., wi?, Bluetooth, etc.). Hence, information about the networking standards and supported technologies must also be considered to classify F2C resources. Finally, being a key attribute in the networking arena, we identify bandwidth as a key param-eter to characterize F2C resources as well. 2. IoT components & Attached components information - The resources working in an F2C system may have some sensors, actuators, RFID tags and other attached-device components (i.e., webcam, printer, etc.). Therefore, resources can be further classi?ed according to the information of sensors, actuators, RFID tags and other attached device components. – Sensors - F2C resources may have attached various kind of sensors (i.e., temperature sensor, proximity sensor, etc.). Therefore, this information must also be considered. – Actuators - Similar to the previous one, many di?erent actuators may be attached to F2C resources (i.e., Mechanical, Thermal or Magnetic etc.). Hence, similarly, this information must also be considered. – RFID tags - F2C resources may also have the active or passive type of RFID tags attached, so to be considered as well. – Other attached device components - Many di?erent external devices may be connected to an F2C device (i.e., Webcam, external audio system, printer, scanner, Arduino kit etc.). This information is enriching the whole system; thus it must be undoubtedly considered to classify an F2C resource. 3. Security & Privacy aspects - To build an e?cient system, it is essential to iden-tify the set of system resources requiring some protection and those requiring not to be protected. In an F2C system, according to the device hardware security, data privacy and the network security aspects, the resources can be further classi?ed as protected and insecure resources. 4. Cost information - In an F2C system, some resources are expected to o?er free access (i.e., with no cost) while some other may require some fee for granting access. Therefore, according to the accessing cost, F2C resources can be classi?ed into Chargeable and Non-Chargeable resources. 5. 
History & Behavioral information - Beyond considering information about resources attributes and components, resources in an F2C system may also be classi?ed according to the information of their present and past system interaction, including resource reliability, life-span, mobility information, par-ticipation role and information of their location. Based on the above analysis, this paper considers the following ?ve classes, device attributes, information of IoT components and other attached devices, cost infor-mation, security and privacy aspects, history and behavioural information, to categorize resources in the F2C system. 696 S. Sengupta et al. 5 Presenting the Resource Description Model in F2C In an F2C system, several fog areas may co-exist, as shown in Fig. 5 for an illustrative smart city scenario. Each fog area is composed of one leader fog node, various kind of fog node devices, IoT (i.e., sensors, actuators, etc.) and other elements (i.e., printer, etc.), putting together a heterogeneous set of resources as well as di?erent data sources. As earlier stated, such heterogeneity makes some challenges for the global management. Thus, as also mentioned in this paper, correct and appropriate classi?cation of resources becomes a must, to facilitate in such coordinated management. Also, it is necessary to have a clear and combined version of a generalized resource description. In this section, we de?ne a combined version of a generalized resource description for devices in an F2C system. Fig. 5. F2C scenario in a smart city. Based on the previously described ontology and matching the F2C archi-tecture, we conclude that the design of the full classes and sub-classes for each Taxonomy and Resource Modeling in Combined Fog-to-Cloud Systems 697 resource turns into a key challenge to manage the whole system resources prop-erly. Moving back to the smart city scenario depicted in Fig. 5, we may see, just as an example, that the laptop contains the classes of device attributes, IoT com-ponents & Attached components, Security & Privacy aspects, Cost information as well as History & Behaviors. Each class includes di?erent subclasses, such as Hardware components, software components, Network information etc. Also, the laptop contains a device id and a user id. To build an e?cient F2C system and to manage all the system resources properly, it is also essential to know the total capacity and attributes of each fog area. Figure 5 shows that a fog area is composed by a leader fog node, several types of fog node devices (i.e., laptop, car, smartphone) and other attached devices (i.e., printer, light, etc.). Leveraging such attributes description, we ?rst propose a generalized resource description model for an F2C system in Subsect. 5.1, and later, in Subsect. 5.2; we focus on identifying the aggregated resource information model for a particular fog area. Moreover, it is worth highlighting the fact that for an F2C system to properly work, the resource information must be stored e?ciently. To that end, it is essen-tial to have a strong but light-weight database. Also, e?ciently and guaranteed transfer of the resource description information, it is also mandatory to describe the resource information through a standard and formatted language. Consid-ering the characteristics of di?erent databases and languages and according to the proposed model, in this paper, we adopt a relational database management system -SQLite, to store the resource information. 
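As a rough illustration of this storage choice (the table layout and field names below are our own, not the schema used by the authors), a device could keep its local resource record in SQLite and serialize it for the transfer step described next; the sample values are taken from Listing 1.1.

import json
import sqlite3

# Local, light-weight store kept by each F2C-enabled device (illustrative schema).
conn = sqlite3.connect("f2c_resources.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS resource (
        device_id   INTEGER PRIMARY KEY,
        user_name   TEXT,
        description TEXT  -- full class/sub-class record, stored as JSON text
    )
""")

description = {
    "Device_attributes": {
        "Hardware_components": {"Storage_information_MB": {"Total": 122880}},
        "Software_components": {"F2C_app": "fog_resource_app"},
    },
    "History_&_Behaviors": {"Participation_role": "Both"},
}

conn.execute(
    "INSERT OR REPLACE INTO resource (device_id, user_name, description) VALUES (?, ?, ?)",
    (11078934576, "craax_user123", json.dumps(description)),
)
conn.commit()

# Read the record back and produce the formatted document shared with the leader fog node.
row = conn.execute(
    "SELECT description FROM resource WHERE device_id = ?", (11078934576,)
).fetchone()
payload = json.dumps(json.loads(row[0]), indent=2)
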
Finally, to transfer the data from one resource to another resource, in this paper we adopt JSON as the information transferring implementation language. 5.1 Generalized Resource Description Model: A Single Resource To participate in any service-oriented computing platform, devices must have an entry point or ‘software’ or ‘application’ to join in. In the F2C system, devices can join the system by two ways: (i) they have the ‘application’ or ‘software’ installed on their device, or; (ii) they connect to another device that has the ‘application’ or ‘software’. So, considering the ontology-based resources classi?cation model proposed in Sect. 4, and for the sake of illustration aligned to the smart city scenario depicted in Fig. 5, all devices in the smart city are denoted as - R, and all devices endowed with the F2C enabled ‘software’ or ‘application’ copy are denoted as - RF 2C. Hence, according to our proposed resource taxonomy of a F2C system, RF 2C ?C R. The devices that do not have the F2C enabled ‘software’ or ‘application’, can also join the F2C system through a connection with an F2C enabled device. They can be known as - ‘Other attached device components’ of 698 S. Sengupta et al. the F2C enabled-device. We present the generalized resource description model for the F2C enabled-device in a tuple form, as follows: RF 2C = < user name; device id; Device attributes: < Hardware components: < Storage information; Main memroy information; Processor information; Power source information; GPU & FPGA information >; Software components: < Apps & APIs: < F2C app: < cloud resource app; fog resource app >; Other apps & APIs >; Operating system >; Network information: < Bandwidth information; Networking standards information >; Resource type: >; IoT components & Attached components: < Sensors; Actuators; RFID tags; Other attached device components >; Security & Privacy aspects: < Device hardware security; Network security; Data privacy >; Cost information: < Chargeable device; Non-Chargeable device >; History & Behaviors: < Participation role; Mobility; Life span; Reliability; Information of the device Location; resource sharing information > > Before sharing the resource information, all resources (RF 2C) in the F2C sys-tem, keep storing a copy of their resource information according to the general-ized resource description model. Resources(RF 2C) are using their local database (i.e., SQLite) to store their resource and components information. To share the resource information e?ciently with other F2C enabled resources, we adopt the JSON language to make a standard and formatted description ?le. In the Listing 1.1, we represent the resource description ?le of an F2C enabled laptop, based on the JSON language. The description ?le contains the detailed information about the hardware (i.e., total and current available storage, RAM informa-tion), software (i.e.,OS information, F2C app information etc.), IoT and other Taxonomy and Resource Modeling in Combined Fog-to-Cloud Systems 699 attached components (sensors and other connected device information), history & behavioural (i.e., current location information, participation role etc.) etc. of the F2C enabled laptop. Listing 1.1. 
The JSON-formatted resource description ?le for a F2C-enabled laptop: An example 1 2 { 3 " user_name ":" craax_user123 ", 4 " device_id ":11078934576 , 5 " Device_attributes ": { 6 " Hardware_components ": { 7 " Storage_information_ ( _in_MB_ ) ": { 8 " Total ":122880 , 9 " Available ":965890 10 }, 11 " Main_Memory_information_ ( _in_MB_ ) ": { 12 " Total ":32768 , 13 " Available ":13968 14 }, 15 " Processor_information ": { 16 " Processor_maker ":" Intel Core i7 -8550 U CPU @ 1.80 GHz ", 17 " Available_percentage_of_processor ":90.7 18 " Processor_architecture ":" X86_64 " 19 }, 20 . 21 . 22 }, 23 " Software_components ": { 24 " Operating_system ":" Windows -10 -10.0.16299 - SP0 ", 25 " Apps_ & _APIs ": { 26 " F2C_app ":" fog_resource_app ", 27 " Other_apps_ & _APIs ": { 28 " Adobe Acrobat Reader DC ", 29 " AMD Software ", 30 . 31 . 32 } 33 } 34 }, 35 . 36 . 37 }, 38 " IoT_components_ & _attached_device_components ": { 39 . 40 . 41 }, 42 . 43 . 44 } 5.2 Resource Description Presentation: Aggregated Model in Each Hierarchy of F2C As shown in Fig. 5 several fog areas may be included in a smart city, each of them providing F2C services to the citizens. The policies used to de?ne the fog areas are out of the scope of this paper. However, it is pretty apparent that correct management of the whole set of resources in fog areas is essential to make the F2C system to be accurate and e?cient. Unfortunately, since each fog area is built by distinct resources not only in quantity but also in typology, the capacity of processing, storage, power and networking techniques may di?er for each individual fog area, thus endowing each particular fog area with distinct characteristics and features. This scenario makes the management of all fog 700 S. Sengupta et al. areas notably challenging, thus di?culting the objective of building an e?cient F2C system. To mitigate this problem, a clear description of the entire set of capacities and characteristics of each individual fog area is mandatory. Fig. 6. Resource information sharing: from Fog to Cloud. Previously we de?ned that, in the F2C system, devices those are sharing their resources can participate in the system as - ‘Contributor’, or ‘Both’. Let’s consider the Fig. 6 as an illustrative scenario to depict that cooperative sce-nario. We may see that ‘Fog Area1’, contains one leader fog node and two fog node devices (i.e., smartphone, laptop) along with other connected devices (i.e., printer, bulb etc.). Let’s consider that the two fog node devices and the leader fog node are participating in the system as ‘Both’. In this case, the two fog node devices are sharing their resource information with the leader fog node. Thus, once the leader fog node receives the resource information for the two fog node devices, it aggregates all the information along with its own resource components information to form the resource information for the particular fog area. Then, the leader fog node shares this aggregated information to the higher layer in the F2C architecture. To make it work an strategy to aggregate the resources information must be de?ned. To that end, next, we propose a general-ized aggregated resource description model for the F2C system. 
We identify the Taxonomy and Resource Modeling in Combined Fog-to-Cloud Systems 701 aggregated resource description model as aRDF 2C, and its structure is described as following: aRDF 2C = < fog node id; fog area id; total number of the attached F2C enabled resources; main memory capacity info ( in MB ): < total available main memory; F2C resource with highest main memory; F2C resource with lowest main memory >; storage capacity info ( in MB ): < total available storage size; F2C resource with highest storage size; F2C resource with lowest storage size >; processor info: < processing capacity info ( in percentage ): < average of processing capacity; F2C resource with highest processing; F2C resource with lowest processing >; processor core info ( number of cores ): < average of total number of cores; F2C resource with highest processor core; F2C resource with lowest processor core > >; gpu capacity ( in MB ): < total available gpu capacity; F2C enabled resource with highest gpu; F2C enabled resource with lowest gpu >; power info remaining time ( in seconds ): < average time of power remain; F2C resource with highest power remain; F2C resource with lowest power remain >; IoT & other attached devices info: < sensors type info; actuators type info; RFID tag type info; other attached device info; >; Security & Privacy score: < average score for F2C resource; F2C enabled resource with highest score; F2C enabled resource with lowest score > > By following this aggregated resource information, it can be easily drawn that it is quite di?erent from the generalized resource description model of a single F2C resource. After getting all the resource information of a fog area, the leader fog node of the respective area is aggregating all of the information, and it is making an aggregated description ?le according to the upper mentioned model. The aggregated description ?le only contains the information about leader fog node id, fog area id, total number of fog nodes, the total capacity of main memory, storage, GPU etc., information about the highest and lowest main memory, storage, processing, GPU capacity of the F2C enabled fog node of the respective fog area and so on. Then after creating the aggregated resource information model, the leader fog node share this information with the upper layer resources of the F2C paradigm. 702 S. Sengupta et al. 6 Conclusion In this paper, we start highlighting the need to de?ne a resources model to ease the management of the F2C system. To that end, we begin presenting a taxonomy for F2C resources. Leveraging the taxonomy along with the recent literature, we propose an ontology-based resource description model for the F2C system, where resources are described by device attributes, IoT components and attached components, security and privacy aspects, cost information, and histor-ical and behavioural information classes. The proposed model is illustrated in a smart city scenario for the sake of understanding. And ?nally, in this paper, we have also introduced the model for a generalized aggregated resources descrip-tion ?le, aimed at sharing the resource information of a particular fog area. This work is presented as the ?rst step towards a comprehensive resource categoriza-tion system which is considered as mandatory for an e?cient F2C management framework. Still, many challenges remain to be addressed. 
For example, consid-ering active/non-active resources in the aggregated information, or even more interesting, de?ning a strategy to implement the resource sharing as described in the F2C. Even the classi?cation of the F2C resources will help us to ?nd out the proper resources to map with services in the F2C paradigm. Implicitly, this work will help us to de?ne the cost-model for the F2C resources, and that will also help us to ?nd out some optimal solution for choosing the resources to execute some tasks and provide some services. Thus, these challenges, as well as many other open issues, will constitute the core of our future work as a follow up of this paper. Acknowledgment. This work was supported by the Spanish Ministry of Economy and Competitiveness and the European Regional Development Fund, under contract TEC2015-66220-R(MINECO/FEDER), and by the H2020 EU mF2C project reference 730929. References 1. Department of Economic and Social A?airs. World Urbanization Prospects The 2014 Revision - Highlights. United Nations (2014). https://esa.un.org/unpd/wup/ publications/?les/wup2014-highlights.pdf. ISBN 978-92-1-151517-6 2. Ismail, N.: What will the smart city of the future look like? Information Age Magazine, 21 September 2017. http://www.information-age.com/will-smart-city-future- look-like-123468653/ 3. van der Meulen, R.: Gartner Says 8.4 Billion Connected “Things” Will Be in Use in 2017, Up 31 Percent From 2016. Press Release by the Gartner, Inc. (NYSE: IT), 7 February 2017. https://www.gartner.com/newsroom/id/3598917 4. Al-Fuqaha, A., Guizani, M., Mohammadi, M., Aledhari, M., Ayyash, M.: Internet of Things: a survey on enabling technologies, protocols, and applications. IEEE Commun. Surv. Tutor. 17(4), 2347–2376 (2015) 5. Manyika, J., Woetzel, J., Dobbs, R., Chui, M., Bisson, P., Bughin, J., Aharon, D.: Unlocking the potential of the Internet of Things. McKinsey&Company, June 2015. https://www.mckinsey.com/business-functions/digital-mckinsey/our-insights/ the-internet-of-things-the-value-of-digitizing-the-physical-world Taxonomy and Resource Modeling in Combined Fog-to-Cloud Systems 703 6. Cisco Systems Inc.: New Cisco Internet of Things (IoT) System Provides a Foundation for the Transformation of Industries. Cisco News, 29 June 2015. https://investor.cisco.com/investor-relations/news-and-events/news/news-details/ 2015/New-Cisco-Internet-of-Things-IoT-System-Provides-a-Foundation-for- the-Transformation-of-Industries/default.aspx 7. Martin, R.: The Installed Base of Smart Meters Will Surpass 1 Billion by 2022, Posted in the Newsroom of the Navigant Research, 11 November 2013 8. Ahmed, E., Yaqoob, I., Gani, A., Imran, M., Guizani, M.: Internet-of-Things-based smart environments: state of the art, taxonomy, and open research challenges. IEEE Wirel. Commun. 23(5), 10–16 (2016) 9. Mahmud, R., Buyya, R.: Fog computing: a taxonomy, survey and future directions. In: Internet of Everything, pp. 103-130. Springer (2018) 10. Bonomi, F., Milito, R., Natarajan, P., Zhu, J.: Fog computing: a platform for internet of things and analytics. In: Big Data and Internet of Things: A Roadmap for Smart Environments, pp. 169–186. Springer (2014) 11. Masip-Bruin, X., Marin-Tordera, E., Jukan, A., Ren, G.J., Tashakor, G.: Foggy clouds and cloudy fogs: a real need for coordinated management of fog-to-cloud (F2C) computing systems. IEEE Wirel. Commun. Mag. 23(5), 120–128 (2016) 12. 
Sengupta, S., Garcia, J., Masip-Bruin, X.: A literature survey on ontology of di?er-ent computing platforms in smart environments. arXiv preprint arXiv:1803.00087 (2018) 13. Perera, C., Qin, Y., Estrella, J.C., Rei?-Marganiec, S., Vasilakos, A.V.: Fog com-puting for sustainable smart cities: a survey. ACM Comput. Surv. (CSUR) 50(3), 32 (2017) 14. Dorsemaine, B., Gaulier, J.-P., Wary, J.-P., Kheir, N., Urien, P.: Internet of Things: a de?nition & taxonomy. In: 2015 9th International Conference on Next Generation Mobile Applications, Services and Technologies, pp. 72–77 (2015) 15. Vaithiya, S., Bhanu, M.S.: Ontology based resource discovery mechanism for mobile grid environment. In: 2013 2nd International Conference on Advanced Computing, Networking and Security (ADCONS), pp. 154–159 (2013) 16. Karaoglanoglou, K., Karatza, H.: Directing requests in a large-scale grid system based on resource categorization. In: 2011 International Symposium on Perfor-mance Evaluation of Computer & Telecommunication Systems (SPECTS), pp. 9–15 (2011) 17. Gubbi, J., Buyya, R., Marusic, S., Palaniswami, M.: Internet of Things (IoT): a vision, architectural elements, and future directions. Futur. Gener. Comput. Syst. 29(7), 1645–1660 (2013) 18. Arianyan, E., Ahmadi, M.R., Maleki, D.: A novel taxonomy and comparison method for ranking cloud computing software products. Int. J. Grid Distrib. Com-put. 9(3), 173–190 (2016) 19. Parikh, S.M., Patel, N.M., Prajapati, H.B.: Resource management in cloud com-puting: classi?cation and taxonomy. arXiv preprint arXiv:1703.00374 (2017) 20. Zhang, M., Ranjan, R., Haller, A., Georgakopoulos, D., Menzel, M., Nepal, S.: An ontology-based system for cloud infrastructure services’ discovery. In: 2012 8th International Conference on Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom), pp. 524–530 (2012) 21. Baccarelli, E., Naranjo, P.G.V., Scarpiniti, M., Shojafar, M., Abawajy, J.H.: Fog of everything: energy-e?cient networked computing architectures, research chal-lenges, and a case study. IEEE Access 5, 9882–9910 (2017) 704 S. Sengupta et al. 22. Moscato, F., Aversa, R., Di Martino, B., Forti¸s, T.-F., Munteanu, V.: An analysis of mOSAIC ontology for cloud resources annotation. In: 2011 Federated Conference on Computer Science and Information Systems (FedCSIS), pp. 973–980 (2011) 23. Botta, A., de Donato, W., Persico, V., Pescap`e, A.: Integration of cloud computing and internet of things: a survey. Futur. Gener. Comput. Syst. 56, 684–700 (2016) 24. Marin-Tordera, E., Masip-Bruin, X., Garcia, J., Jukan, A., Ren, G.J., Zhu, J.: Do we all really know what a Fog Node is? Current trends towards an open de?nition. Comput. Commun. 109, 117–130 (2017) 25. Gomez-Perez, A., Fernandez-Lopez, M., Corcho, O.: Ontological engineering: with examples from the areas of knowledge management, e-commerce and the semantic web. Data Knowl. Eng. 46(1), 41–64 (2003) Predicting Head-to-Head Games with a Similarity Metric and Genetic Algorithm Arisoa S. Randrianasolo1(B) and Larry D. Pyeatt2 1 Lipscomb University, Nashville, TN, USA arisoa.randrianasolo@lipscomb.edu 2 South Dakota School of Mines and Technology, Rapid City, SD, USA larry.pyeatt@sdsmt.edu Abstract. This paper summarizes our approach to predict head to head games using a similarity metric and genetic algorithm. The prediction is performed by simply calculating the distances of any two teams, that are set to play each other, to an ideal team. The nearest team to the ideal team is predicted to win. 
The approach uses genetic algorithm as an optimization tool to improve the accuracy of the predictions. The optimization is performed by adjusting the ideal team’s statistical data. Soccer, basketball, and tennis are the sport disciplines that are used to test the approach described in this paper. We are comparing our pre-dictions to the predictions made by Microsoft’s bing.com. Our ?ndings show that this approach appears to do well on team sports, accuracies above 65%, but is less successful for predicting individual sports, accu-racies less than 65%. In our future work, we plan to do more testing on team sports as well as studying the e?ects of the di?erent parameters involved in the genetic algorithm’s setup. We also plan to compare our approach to ranking and point based predictions. Keywords: Sports predictions · Similarity calculation Genetic algorithm 1 Introduction International sport competitions, professional sports, college sports, and even regional and city tournaments now keep track of various data about the teams involved in the competitions. Those data can be available right away as the games progress, or may be extracted later by some experts after reviewing the video of the games. The challenge is ?nding ways to make use of the available data. Is there enough information in the data to predict the outcomes of future games? What algorithm and calculations can be utilized to predict the outcomes of future games? Those are some of the questions that teams and coaches may have after receiving their statistical data from a tournament. .a c Springer Nature Switzerland AG 2019 K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 705–720, 2019. https://doi.org/10.1007/978-3-030-02686-8_53 706 A. S. Randrianasolo and L. D. Pyeatt In this paper, we summarize our approach to predicting the outcomes of head to head games in tournaments. Our approach di?ers from others because it is not utilizing all the possible historical data that can be gathered about the teams that are involved. It is also not taking in consideration the past performance of the teams in the same competition from previous years or previous matches. We restrict the data that we are using to perform the prediction to only consist of the most recent teams’ statistics in the tournament of interest. This restriction of the data is based on the assumption that the performance in the current tournament of interest is most indicative of the current strength of the teams. Also, by using this restriction, this approach can be used in tour-nament settings where teams do not necessarily know much about each other before hand. This latter reason is our main motivation for this research. Our approach uses a similarity metric over the most recent statistical data of the teams involved in the tournament to predict the outcomes of head to head games. To improve the predictions, we use a genetic algorithm as an optimization mechanism. This paper will cover some of the previous work done in terms of head to head game predictions. Then, it will explain our early observation in predicting head to head games. The forth section of the paper will cover the approach that we are proposing. This will be followed by the testing and the results of our experiments. The last section of this paper will contain our conclusions and future work. 2 Related Work The idea of predicting the outcome of a pairwise sport matchup is a research topic for many investigators. 
Chen and Joachims explained the use of a general probabilistic framework for predicting the outcome of pairwise matchups using the blade-chest model [1,2]. A player or a team was represented by a blade vector and a chest vector. The winning and losing probabilities were decided based on the distance between one player’s blade to his opponent’s chest and vice versa. The blade and chest vector were extracted from the player’s data and the game features. This approach trained on historical data to tweak the parameters involved in the model by maximizing the log-likelihood of the probability of the known winner. Machine learning is also used widely in sport predictions. In most of these cases, as in the approach described previously, a considerable amount of historical data is needed to train the model. For example, Pretorius and Parry trained a random forest on past rugby games in order to predict the 2015 Rugby World Cup [3]. The accuracy of the predictions made by their system was no di?erent than the prediction made by human agents on the 2015 Rugby World Cup. Brooks, Kerr, and Guttag trained an SVM to predict if possessions will result in shots in soccer [5]. The approach was applied on the Spanish La Liga soccer league using the data from the 2012–2013 season. It had an Area Under the ROC (Receiver Operating Characteristic) curve of 0.79. Microsoft’s Bing Predicts [11] Predicting Head-to-Head Games 707 also claims to use machine learning in its prediction. Bing Predicts claimed a 63.5% accuracy on predicting the 2016 NCAA March Madness and a 75% prediction accuracy on the 2015 Women’s Soccer World Cup. Evolutionary systems are also used in sport and matchup predictions. Soares and Gilbert used a Particle Swarm Optimizer (PSO) to predict Cross-country results [4]. Their approach transformed the team features from historical data into a set of rankings. The rankings were multiplied by weights to produce the ?nal rankings. The ?nal rankings were then evaluated from the results of the cross-country meets as follows: A team received 1 point for each team it beat if it was ranked ahead of that team, and received 1 point for each team it lost to if it was ranked behind that team. A team received 0 points for each team in which the opposite of either case above happened [4]. The goal of this approach was to maximize the points earned through producing the ?nal rankings used in the predictions, and the way to do so is to optimize the weights using a PSO. Another approach that uses ranking as a way to predict performance is to create a complex-network based on di?erent measures, such as clustering coe?- cient and node degree [6,7]. With this approach, a team sports league is viewed as a network of players, coaches, and teams in evolution. The network was used to predict teams’ behavior and to predict rankings. The rankings could be used to predict the league’s winner. This approach was applied to NBA (National Bas-ketball Association) and MLB (Major League Baseball) data and has achieved a 14% rank prediction accuracy improvement over its best competitor [7]. The ?rst di?culty in using many of these approaches resides in ?nding the appropriate functions or transforms that can extract the needed information from the historical data. Our approach uses a simple similarity metric and the well known genetic algorithm to create the predictions. The second di?culty arises from the struggle of ?nding enough data to train the model. 
In well known competitions with well known teams, ?nding historical data is not a problem. However, in less known competitions, such as regional or city or invitational or small tournaments, ?nding historical data is not always possible. This is the reason why we restrict the data that we are using to perform the predictions to only consist of the most recent teams’ statistics in the tournament of interest. We apply this restriction in all of the tournaments that we predicting regardless of whether they are well known or not. 3 Early Observation This research started because of a soccer coach who came to us with all sorts of data about his team, and was struggling to ?nd a way to use it to his team’s advantage. The data that we received had no information about the other teams in the division, so we could not do much in predicting head to head outcomes. To continue this research, we started exploring publicly available data from other sports competitions. Our early observation lead us to notice that teams work to improve some trackable features in the game. For example, in soccer, a team may try to maxi-mize its ball possession time or possession percentage and minimize the amount 708 A. S. Randrianasolo and L. D. Pyeatt of red cards that its players receive. In basketball, for example, a team may try to minimize its turnover rate and maximize its three-points percentage. The teams’ statistics data can be represented in a vector format. This observation lead us to begin considering the idea of an ideal team. This ideal team has the statistics that all teams, in a particular sport of interest, try to reach. The values for the features in the ideal team’s vector can be hard to reach for some teams. These values may even be impossible, but they should represent what a perfect team should look like in the sport of interest. Now that we have teams vectors and an ideal team vector, we can start working on predictions. Fig. 1. Similarity calculation. The prediction is done simply by computing the similarity of each team to the ideal team. A simple illustration of this idea is expressed in Fig. 1. Since the data are vectorized, a distance or similarity calculation is not hard to compute, and there are several distance measures that could be used. Given two teams that are due to play in a head to head game, we predict that the nearest one to the ideal team, represented by the ideal vector, will win the game. 3.1 Early Testing and Results We started testing our approach on three competitions in 2016. The test com-petitions were, the 2016 U.S. Open (tennis), the 2016 FIBA Africa Under 18 (basketball), and the 2016 UEFA European Championship (soccer) also known as “euro 2016”. The 2016 FIBA Africa Under 18 was the ideal setup to test our approach. The teams in that competition did not appear to have much infor-mation about each other, and somehow had to utilize the statistics about the other teams in order to know their winning chances and to create strategies. The drawback of using this particular basketball competition was that it was not a well known competition. We were not be able to compare our predictions Predicting Head-to-Head Games 709 to other live predictions. This was the reason why we tested our approach to the 2016 U.S. Open and the 2016 UEFA European Championship competitions. In the 2016 U.S. Open, we used the data from rounds one through four to predict the quarter?nals. Then, we utilized the data from rounds one through four plus the quarter?nals to predict the semi?nals. 
Finally, we employed the data from rounds one through four plus the quarterfinals and the semifinals to predict the finals. In the 2016 FIBA Africa Under 18, we used the data from the group stage to predict the quarterfinals. Then, we followed the same procedure as in the 2016 U.S. Open. In the 2016 UEFA European Championship, we also utilized the data from the group stage to predict the round of 16, and then we followed the same approach as in the previous two sports mentioned above. The features used during this early testing are shown in Table 1.

Table 1. Features used in early testing.
2016 U.S. Open: sets played, tie breaks played, total games, total aces, total double faults, 1st serves in %, 1st serve points won %, 2nd serve points won %, return games won, winners, unforced errors.
2016 FIBA Africa: points per game, field goal attempts, field goal %, 3-points attempts, 3-points %, free throw attempts, free throw attempts %.
2016 UEFA Euro: total corner for, total corner against, offside, fouls committed, fouls suffered, yellow cards, red cards, pass completed, ball possession %, total attempt, attempt on target, attempt off target, attempt blocked, attempt against woodwork, total goals, total goals against.

There was no specific study done in choosing the predictors during the early observation part of this research. We used our knowledge about these three different sports in choosing those predictors. We also used our knowledge about these sports in selecting the ideal vectors. An in-depth study on how to pick the predictors was left to the next phase of this research, which is summarized in the next section.

The ideal vector for the 2016 U.S. Open Men's competition was: (3, 0, 18, 20, 0, 100, 100, 100, 9, 80, 0). The ideal vector for the 2016 U.S. Open Women's competition was: (2, 0, 12, 20, 0, 100, 100, 100, 6, 80, 0). The ideal vector for the 2016 FIBA Africa Under 18 was: (150, 150, 80, 50, 50, 50, 80). The ideal vector for the 2016 UEFA European Championship was: (100, 0, 0, 0, 100, 0, 0, 100, 100, 200, 200, 0, 0, 0, 60, 0).

In our early exploration, we used three different similarity or distance measures: Cosine distance, Manhattan distance (L1-norm), and Euclidean distance (L2-norm). The prediction accuracy, from 0 to 1 (0% to 100%), of each of these three distance metrics is captured in Fig. 2. We compared our predictions to the predictions from Microsoft's Bing Predicts. The results of this comparison are shown in Fig. 3.

Fig. 2. Comparison of similarity measures.

Fig. 3. Comparison with Bing.com.

4 Prediction Method

4.1 Choosing a Similarity Metric

Our early exploration seems to indicate that switching the similarity metric based on the sport event is possibly the way to proceed. However, we want to create a general approach that will work for any type of sport. We locked our choice to using Cosine distance as our similarity metric for the rest of this research. The reasoning for this choice is that, out of the combined predictions (U.S. Open Men + U.S. Open Women + 2016 UEFA European Championship) recorded in Fig. 3, the accuracy for Cosine distance was 18/30, which was similar to the Manhattan distance's accuracy, while the combined accuracy for the Euclidean distance was 17/30. We did not break the tie between Cosine and Manhattan; we just picked one to go with.

4.2 Effect of the Ideal Vector

Our early observation has also pointed out that a change in the ideal vector will affect the predictions.
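To make the prediction step itself concrete, the short sketch below (Python) computes a team's Cosine, Manhattan, and Euclidean distance to an ideal vector and predicts the team closest to the ideal to win. The three-feature team vectors, the ideal values, and the function names are invented for illustration and are not taken from any of the tournaments above.

```python
import numpy as np

def cosine_distance(a, b):
    # 1 - cosine similarity; smaller means closer to the ideal vector
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def manhattan_distance(a, b):
    return np.sum(np.abs(a - b))

def euclidean_distance(a, b):
    return np.linalg.norm(a - b)

def predict_winner(team_a, team_b, ideal, metric=cosine_distance):
    """Predict the winner of a head-to-head game: the team whose
    statistics vector lies nearest to the ideal vector."""
    name_a, stats_a = team_a
    name_b, stats_b = team_b
    return name_a if metric(stats_a, ideal) < metric(stats_b, ideal) else name_b

# Hypothetical soccer-style features: (ball possession %, shots on target, red cards)
ideal = np.array([100.0, 20.0, 0.0])
team_a = ("Team A", np.array([62.0, 7.0, 0.0]))
team_b = ("Team B", np.array([48.0, 5.0, 1.0]))

for metric in (cosine_distance, manhattan_distance, euclidean_distance):
    print(metric.__name__, "->", predict_winner(team_a, team_b, ideal, metric))
```

With these toy numbers all three metrics happen to agree; on the real tournament data they did not always do so, which is what Fig. 2 compares.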
In the early observation, we used our personal knowledge about the sports that we were dealing with to set up the ideal vectors. We do not claim to be an expert in these sports or the competitions that we dealt with in the early observation, and the ideal vectors that we picked could be erroneous. Also, we want the ideal vector to be in close relationship with the trend in the tournament. In one tournament, for example, a ball possession of 60% could be 712 A. S. Randrianasolo and L. D. Pyeatt enough to win the tournament. While in another tournament, a ball possession of 80% may be needed to win. This prompted us to employ an optimization strategy to improve the ideal vector. 4.3 Approach Our approach is summarized by Fig. 4. It starts with an input ?le containing the statistics from the early rounds of the tournament and a starting ideal vector that we manually selected based on what we think an ideal statistics should look like for an ideal team. The approach, then, makes its ?rst set of predictions based on the next set of games that are to played in the tournament. The predictions are compared to the observed outcomes to obtain the accuracy of the ideal vector. Next, a genetic algorithm is called to optimize the ideal vector [8–10]. The genetic algorithm utilizes the same input ?le containing the team statistics and the observed outcomes to calculate the ?tness of each candidate ideal vector. The best ideal vector is saved for the next set of predictions. Fig. 4. The overall approach. For the second set of predictions, the approach utilizes the best ideal vector produced by the genetic algorithm in the ?rst optimization and the team statis-tics from the beginning of the tournament up to the most recent games. For the third set of predictions, the approach uses the best ideal vector produced by Predicting Head-to-Head Games 713 the genetic algorithm in the second optimization and the team statistics from the beginning of the tournament up to the most recent games. The approach continues in this manner until the approach produces the last set of predictions, after which no further optimization is required. As a tournament moves from one round to the next, there are usually fewer games to predict. This means that the accuracy of any prediction methods can potentially go down from one round to the next as the tournament progress. This is another reason why we use a genetic algorithm optimization between rounds so that the approach can learn the trend or the pattern from the previous rounds to better predict the next round. 4.4 Short Introduction to Genetic Algorithms A genetic algorithm is a search and an optimization process inspired from biology. It is based on the survival of the ?ttest. In a genetic algorithm, a potential solution is called an individual. An individual is, most of time, expressed as a string of characters. The set of individuals is known as a population. Each individual in the population has a ?tness value. This value indicates the individual’s quality of being a solution to the problem. Individuals in the population are allowed to mate to produce new solutions. The mating part of the algorithm is known as a crossover. During a crossover, two individuals exchange characters to form a new string. Individuals that par-ticipate in crossovers are selected by a process that is based on their ?tness. The more ?t individuals have higher chances to participate in crossovers. The eventual exchange of characters is governed by a crossover probability. 
This probability determines whether the exchange is allowed to happen or not. Individuals in the population can also mutate with a defined probability known as the mutation probability. The mutation is usually performed by altering one or more characters of the string that represents an individual. In each iteration, the algorithm attempts to create new individuals. The algorithm halts when an individual with the desired fitness is generated, or when the maximum number of allowed iterations is reached. Other halting conditions can also be adopted.

4.5 Genetic Algorithm Setup

The individuals in the population are candidate ideal vectors. The population size is fixed to 100 for our experiments, and the probability of crossover is set to 60%. A roulette wheel selection approach is used to select the parents for the crossover. Other selection approaches exist, and we plan to study those more in our future work. The crossover is performed at a fixed point, which is always at the middle of the candidate ideal vectors. The probability of mutation is 0.1%. The mutation is performed by either adding 1, with a probability of 50%, or subtracting 1, with a probability of 50%, to each of the values of a candidate ideal vector whose range is greater than or equal to 5. It is performed by adding or subtracting 0.1 with equal probability for values whose range is less than 5. Each candidate ideal vector is used to predict the set of games that just happened, for which the observed outcomes are available. The fitness of each candidate ideal vector is simply its accuracy on the games that just happened. The genetic algorithm is allowed to generate 1200 new individuals before it stops. Survival of the fittest is then used to place a new individual in the population. Our genetic algorithm approach was modeled after the approach described by Goldberg [9].

5 Testing and Results

We revisited the competitions from the early observation with this new proposed approach. The results are captured by Fig. 5. Since there is some randomness in generating the population in the genetic algorithm, we ran the approach 51 times on each set of games that it tried to predict. We then used a majority rule between any two teams going head to head to see which one was mentioned most often as the winner across the 51 prediction attempts. We chose 51, an odd number, because we are interested in a win-or-lose situation and not a draw. There appears to be an improvement in predicting the men's U.S. Open tournament and a slight improvement on the 2016 UEFA European Championship, so we tested the approach on two other tournaments: the 2016–2017 UEFA Champions League and the 2017 Australian Open. Before proceeding to use the approach, we ran a correlation analysis on the predictor variables to help us in choosing the features for the ideal vectors and the vectors for each team. Figure 6 has the correlation plot for the 2016–2017 UEFA Champions League competition, and Fig. 7 has the correlation plot for the 2017 Australian Open competition. Table 2 shows the final features for the team vectors and the ideal vectors that were used in the testing. Tables 3 and 4 show the ranges of the possible values for each feature in the ideal vectors for the two competitions. The starting ideal vector for the 2016–2017 UEFA Champions League was: (60, 0, 200, 0, 0, 0, 100, 100, 100, 100, 0, 100, 0, 0). The starting ideal vector for the 2017 Australian Open Men's competition was: (1, 80, 1, 90, 100, 30, 1, 100, 100). The starting ideal vector for the 2017 Australian Open Women's competition was: (0, 80, 0, 80, 100, 20, 0, 100, 100).
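Before looking at the results, the optimization loop of Sect. 4.5 can be sketched as follows (Python). The fitness of a candidate ideal vector is its prediction accuracy on the games already played, the crossover point is fixed at the middle of the vector, and the mutation steps are plus or minus 1 or 0.1 depending on the width of the feature's range, as described above. The weighted-sampling form of roulette-wheel selection, the replace-the-worst survival rule, and the toy game data are simplifying assumptions for illustration, not the exact experimental setup.

```python
import random
import numpy as np

def cosine_distance(a, b):
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

def accuracy(ideal, games):
    # Fitness of a candidate ideal vector: fraction of completed games whose
    # winner was the team nearer (in Cosine distance) to that candidate.
    correct = 0
    for stats_a, stats_b, a_won in games:
        predicted_a_wins = cosine_distance(stats_a, ideal) < cosine_distance(stats_b, ideal)
        correct += (predicted_a_wins == a_won)
    return correct / len(games)

def optimize_ideal(games, ranges, pop_size=100, children=1200,
                   p_crossover=0.6, p_mutation=0.001):
    # Evolve the ideal vector; `ranges` lists (low, high) per feature.
    population = [np.array([random.uniform(lo, hi) for lo, hi in ranges])
                  for _ in range(pop_size)]
    for _ in range(children):
        fitnesses = [accuracy(ind, games) for ind in population]
        weights = [f + 1e-9 for f in fitnesses]          # roulette-wheel weights
        p1, p2 = random.choices(population, weights=weights, k=2)
        child = p1.copy()
        if random.random() < p_crossover:                # crossover at the middle
            mid = len(child) // 2
            child[mid:] = p2[mid:]
        for i, (lo, hi) in enumerate(ranges):            # per-value mutation
            if random.random() < p_mutation:
                step = 1.0 if (hi - lo) >= 5 else 0.1    # +/-1 or +/-0.1 by range width
                child[i] += step if random.random() < 0.5 else -step
                child[i] = min(max(child[i], lo), hi)
        worst = int(np.argmin(fitnesses))                # survival of the fittest:
        if accuracy(child, games) > fitnesses[worst]:    # the child replaces the worst
            population[worst] = child                    # individual if it is better
    final = [accuracy(ind, games) for ind in population]
    return population[int(np.argmax(final))]

# Toy data: (team A stats, team B stats, did A win), with three features per team
games = [(np.array([60.0, 6.0, 0.0]), np.array([45.0, 3.0, 1.0]), True),
         (np.array([50.0, 4.0, 0.0]), np.array([55.0, 7.0, 0.0]), False)]
ranges = [(0.0, 100.0), (0.0, 20.0), (0.0, 5.0)]
print(optimize_ideal(games, ranges, pop_size=20, children=200))
```

In the experiments above the search starts from a manually selected ideal vector, whereas this sketch initializes the candidates at random within the feature ranges, and the demo call uses smaller population and iteration counts than the 100 and 1200 used in the study.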
The accuracy of the predictions can be seen in Fig. 8. Over the eleven competitions that we have been predicting so far, we also tracked how this approach performed as it moved from the first round of predictions to the next rounds. Some competitions had more rounds than others; however, they all had at least three rounds. The accuracy of the predictions from the first three rounds is summarized in Fig. 9.

Fig. 5. Revisit of the early observations.

Fig. 6. Correlation for the 2016–2017 UEFA Champions League.

Fig. 7. Correlation for the 2017 Australian Open.

Table 2. Features used in the testing.
2016–2017 UEFA: total goals, total goal against, attempt on target, attempt off target, attempt blocked, attempts against woodwork, pass completion percentage, ball possession, total corner for, cross completion, fouls committed, fouls suffered, yellow cards, red cards.
2017 Australian Open: tie break, winners, unforced errors, service points won, percentage of first serve in, aces, double fault, percentage of 1st serve point won, percentage of 2nd serve point won.

Table 3. Range of values in the ideal vector for the 2016–2017 UEFA Champions League.
Total goals: 0–40. Total goal against: 0–40. Attempt on target: 10–90. Attempt off target: 10–90. Attempt blocked: 5–60. Attempts against woodwork: 0–10. Pass completion percentage: 50–100. Ball possession: 30–70. Total corner for: 0–10. Cross completion: 5–90. Fouls committed: 50–200. Fouls suffered: 50–200. Yellow cards: 5–30. Red cards: 0–3.

Table 4. Range of values in the ideal vector for the 2017 Australian Open (Men / Women).
Tie break: 0–2 / 0–1. Winners: 90–100 / 0–50. Unforced errors: 0–2 / 0–50. Service points won: 90–100 / 0–70. Percentage of first serve in: 90–100 / 50–100. Aces: 20–40 / 0–20. Double fault: 0–2 / 0–10. Percentage of 1st serve pt. won: 90–100 / 50–100. Percentage of 2nd serve pt. won: 90–100 / 50–100.

Fig. 8. Performance on the 2016–2017 UEFA Champions League and the 2017 Australian Open.

Fig. 9. Performance from one round to the next.

6 Conclusion and Future Work

In this paper, we have summarized our approach to predicting head to head games using only the statistical data describing what the teams have been doing in the tournament of interest. Our approach is aimed at predicting local or regional competitions, where little or no historical data is available, by using a simple similarity metric and the well known genetic algorithm. Individual sports are more difficult to predict than team sports. Injuries, emotions, fatigue, and other factors have a greater effect on individuals than they do on teams. For individual sports, these factors must be taken into consideration to improve the prediction. Taking social media input (similar to what Microsoft's bing.com [11] claims to be doing) or using additional data about each game, such as time of day, weather, or public support (similar to what was done by Chen and Joachims [1,2]), can be beneficial. Even in the work by Chen and Joachims [1,2], predictions are still only around 60% and 70% in tennis. Team performances in collective sports appear to have more regularity, making predictions a little less difficult than for individual sports.
The performance of our approach on the 2016 FIBA Africa Under 18, the 2016 UEFA European Championship, and the 2016–2017 UEFA Champions League, indicates that it has the potential to do well for predicting the outcomes of team and collective sports head to head games. We plan to test this approach on more team sports in the future. Our future goals also include ?nding a way to automatically infer the initial ideal vectors from the initial data rather than depending on a human agent to generate them. We also plan to engage in a more detailed analysis of the parameters involved in the genetic algorithm. This will involve exploring di?erent selection approaches and experimenting with the crossover and the mutation probability. Our aim in this endeavor is not only to improve the accuracy but also to uncover the reason for the slight drop of performance between the second rounds and the third rounds of predictions as we can see from Fig. 9. We also plan to compare our predictions to ranking based and point based predictions. References 1. Chen, S., Joachims, T.: Predicting matchups and preferences in context. In: Pro-ceedings of the 22nd ACM SIGKDD International Conference on Knowledge Dis-covery and Data Mining, KDD 2016, San Francisco, California, USA, pp. 775–784. ACM, New York (2016) 2. Chen, S., Joachims, T.: Modeling intransitivity in matchup and comparison data. In: Proceedings of the Ninth ACM International Conference on Web Search and Data Mining, WSDM 2016, San Francisco, California, USA, pp. 227–236. ACM, New York (2016) 3. Pretorius, A., Parry, D.A.: Human decision making and arti?cial intelligence: a comparison in the domain of sports prediction. In: Proceedings of the Annual Conference of the South African Institute of Computer Scientists and Information Technologists, SAICSIT 2016, Johannesburg, South Africa, pp. 32:1–32:10. ACM, New York (2016) 720 A. S. Randrianasolo and L. D. Pyeatt 4. Soares, C., Gilbert, J.E.: Predicting cross-country results using feature selec-tion and evolutionary computation. In: The Fifth Richard Tapia Celebration of Diversity in Computing Conference: Intellect, Initiatives, Insight, and Innovations, TAPIA 2009, Portland, Oregon, pp. 41–45. ACM, New York (2009) 5. Brooks, J., Kerr, M., Guttag, J.: Developing a data-driven player ranking in soccer using predictive model weights. In: Proceedings of the 22nd ACM SIGKDD Inter-national Conference on Knowledge Discovery and Data Mining, KDD 2016, San Francisco, California, USA, pp. 49–55. ACM, New York (2016) 6. Vaz de Melo, P.O.S., Almeida, V.A.F., Loureiro, A.A.F.: Can complex network metrics predict the behavior of NBA teams? In: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2008, Las Vegas, Nevada, USA, pp. 695–703. ACM, New York (2008) 7. Vaz de Melo, P.O.S., Almeida, V.A.F., Loureiro, A.A.F., Faloutsos, C.: Forecasting in the NBA and other team sports: network e?ects in action. ACM Trans. Knowl. Discov. Data 6, 13:1–13:27 (2012) 8. Mitchell, M., Forrest, S.: Genetic algorithms and arti?cial life. Artif. Life 1, 267– 289 (1994) 9. Goldberg, D.E.: Genetic Algorithms in Search, Optimization and Machine Learn-ing. Addison-Wesley Longman Publishing Co. Inc., Boston (1989) 10. Holland, J.H.: Adaptation in Natural and Arti?cial Systems: An Introductory Analysis with Applications to Biology, Control and Arti?cial Intelligence. MIT Press, Cambridge (1992) 11. Bing Predicts. http://www.bing.com/explore/predicts. 
Accessed 17 July 2017 Arti?cial Human Swarms Outperform Vegas Betting Markets Louis Rosenberg(?) and Gregg Willcox Unanimous AI, San Luis Obispo, CA, USA Louis@Unanimous.AI Abstract. Swarm Intelligence (SI) is a natural phenomenon in which biological groups amplify their collective intelligence by forming dynamic systems. It has been studied extensively in bird ?ocks, ?sh schools, and bee swarms. In recent years, AI technologies have enabled networked human groups to form systems modeled on natural swarms. Referred to as Arti?cial Swarm Intelligence or ASI, this approach has been shown to signi?cantly amplify the e?ective intelligence of human groups. The present study compares the predictive ability of ASI to Vegas betting markets when forecasting sporting events. Groups of average sports fans were required to forecast the outcome of 200 hockey games in the NHL league (10 games per week for 20 weeks). The expected win rate for Vegas favorites was 62% across the 200 games based on the published odds. The ASI system achieved a win rate of 85%. The probability that the ASI system outper- formed Vegas by chance was very low (p = 0.006), indicating a signi?cant result. Researchers also compared the ROI generated from two betting models: one that wagered weekly on the top Vegas favorite, and one that wagered weekly on the top ASI favorite. At the end of the 20-week period, the Vegas model generated a 41% ?nancial loss, while the ASI model generated a 170% gain. Keywords: Swarm intelligence · Arti?cial intelligence Collective intelligence 1 Background Arti?cial Swarm Intelligence (ASI) is a powerful method for amplifying the predictive accuracy of networked human groups [1, 2]. A variety of prior studies, across a wide range of prediction tasks have demonstrated that real-time “human swarms” can produce more accurate forecasts than traditional “Wisdom of Crowds” methods such as votes, polls, and surveys [3]. For example, a study in 2015 tested the ability of human swarms to predict the outcome of college football games. The ASI system tapped the real-time intelligence of 75 amateur sports fans to predict 10 bowl games. As individuals, the participants averaged 50% accuracy when predicting outcomes against the spread. When forecasting together as a real-time ASI system, those same participants achieved 70% accuracy against the spread [2]. Similar increases have been found in other studies, including a ?ve-week study that tasked human participants, connected as an ASI system, with predicting a set of 50 soccer matches in the English Premier League. Results showed a 31% increase in accuracy when participants were connected in ASI swarms as © Springer Nature Switzerland AG 2019 K. Arai et al. (Eds.): FTC 2018, AISC 880, pp. 721–729, 2019. https://doi.org/10.1007/978-3-030-02686-8_54 compared to forecasting as individuals [4]. The human swarms also outperformed the BBC’s machine-model known as “SAM” over those same 50 games [5]. Although previous research has shown that ASI technology can empower human groups to outperform individual forecasters as well as traditional crowd-based methods, no formal study has been conducted to compare the predictive ability of ASI to major betting markets [6]. To address this need, the current study was conducted to rigorously compare “human swarms” to Vegas betting markets, assessing the accuracy rates and the ?nancial returns across a large set of predictions. 
Speci?cally, this largescale study required groups of sports fans to forecast the outcome of 200 games in the National Hockey League (NHL), structured as 10 games per week for 20 consecutive weeks. 1.1 From Crowds to Swarms When collecting input from human groups, the phase “Wisdom of Crowds” is generally used whenever the input is aggregated to generate output of higher accuracy. [7–9]. The basic premise, also referred to as Collective Intelligence, dates to the early 1900’s and generally involves collecting survey data from groups of individuals and computing a statistical result. When comparing “swarms” and “crowds”, the primary di?erence is that in crowd-based systems, the participants provide isolated input that is aggregated in external statistical models, whereas in swarm-based systems the participants interact in real-time, “thinking together” as a uni?ed system. In other words, crowds are statis- tical constructs while swarms are closed-loop systems in which the participants act, react, and interact in real-time, converging together on optimized solutions. ASI systems are generally modeled on biological systems such as ?sh schools, bird ?ocks, and bee swarms. The present study uses Swarm AI technology from the company Unanimous AI. This technology is modeled primarily on the collective decision-making processes employed by honeybee swarms [4]. This framework was chosen because honeybee populations have been shown to reach optimal decisions by forming real-time closed-loop systems [10]. In fact, at a structural level, the decision-making methods observed in honeybee swarms are very similar to the decision-making processes observed in neurological brains [11, 12]. When reaching decisions, swarm and brains are both employ large populations of simple excitable units (i.e., bees and neurons) that operate in parallel to (a) integrate noisy data about the world, (b) weigh competing alternatives when a decision needs to be made, and (c) converge on preferred decisions as a uni?ed system. In both brains and swarms, outcomes are arrived upon through competition among sub-populations of simple excitable units. When one sub-population exceeds a threshold level of support, the corresponding alternative is chosen by the system. In honeybees, this enables the group to converge on optimal decisions across a wide range of tasks, for example when selecting the best possible hive location from a large set of options. Researchers have shown that honey bees converge on the best possible solution to this life-or-death deci- sion approximately 80% of the time [13, 14]. 722 L. Rosenberg and G. Willcox 1.2 Creating Human Swarms Unlike birds and bees and ?sh, humans have not evolved the natural ability to swarm, as we don’t possess the subtle skills that other organisms use to establish high speed feedback-loops among their members. Fish for example, when moving in schools, detect faint vibrations in the water around them. Birds, when ?ocking, detect subtle motions propagating through the formation. Honeybees, when reaching decisions as a uni?ed swarm, use complex body vibrations called a “waggle dance” to encode their changing views. To enable real-time swarming among groups of networked humans, specialized software is required to close the loop among all members. To solve this problem, a software platform (swarm.ai) was created to allow human groups to form real-time systems from anywhere in the world [1, 6]. 
Modeled after the decision-making process of honeybee swarms, swarm.ai enables groups of networked users to work in parallel to (a) integrate noisy information, (b) weigh competing alternatives when making deci- sions, and (c) converge on decisions, together as a real-time closed-loop system. As shown in Fig. 1 below, arti?cial swarms answer questions by moving a graphical puck to select among a set of answer options. Each participant provides their input by moving a graphical magnet with a mouse, touchpad, or touchscreen. By adjusting their magnet in relation to the moving puck, real-time participants can express their individual intent on the system as a whole. The input from each user is not a vote, but a continuous stream of vectors that varies freely over time. Because all members of the networked population can vary their intent continuously in real-time, as moderated by AI algo- rithms, the arti?cial swarm explores the decision-space, not based on the input of any single individual, but based on the emergent dynamics of the system as a whole. This enables complex deliberations to emerge among all participants at the same time, empowering the group to collectively consider each of the options and converge on the solution that best represents their combined knowledge, wisdom, and insights. Fig. 1. Real-time ASI choosing between options. Arti?cial Human Swarms Outperform Vegas Betting Markets 723 It is critical point out that participants do not only vary the direction of their individual intent, but also modulate the magnitude by manipulating the distance between their magnet and the puck. Because the puck is in ?uid motion throughout the decision-space, users need to continuously update the position and orientation of their magnet so that it stays close to the puck’s outer rim. This is important, for it requires participants to remain engaged throughout the decision-making process, continuously evaluating and re-eval- uating their individual thoughts and feelings with respect to the question at hand. If they stop moving their magnet in relation to the changing position of the puck, the distance grows and their applied sentiment wanes. 2 Forecasting Study To quantify the forecasting ability human swarms as compared to large Vegas betting markets, a 20-week study was conducted using randomly selected human subjects. The participants, who were self-reported sports fans, were split into weekly groups. Each group consisted of 25 to 35 participants, all of whom logged in remotely to the swarm.ai system. Human subjects were paid $3.00 for their participation in each weekly session, which required them to forecast the outcome of all ten hockey games being played that night. All subjects were required to make their forecasts in two ways – (a) as individuals reporting on a standard online survey, and (b) as a contributor to a real-time ASI system. For each hockey game, participants were tasked with forecasting the winner and the margin of victory, expressed as either (a) the team win by 1 goal, or (b) the team win by 2 or more goals. The margins were chosen to match common Vegas gambling spreads. Figure 2 below shows a snapshot of a human swarm comprised of 31 partici- pants in the process of predicting a match between Toronto and Calgary. Fig. 2. ASI in the process of forecasting an NHL game. 724 L. Rosenberg and G. Willcox As shown in Fig. 2, each real-time swarm is tasked with selecting from among four outcome options, indicating which team will win and which margin is most likely. 
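As a toy illustration of this closed-loop mechanic, and emphatically not the proprietary Swarm AI algorithm itself, the sketch below (Python) simulates a puck that integrates continuously applied per-participant pull vectors toward four outcome options. The option layout, the fixed participant preferences, the step size, and the convergence rule are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy layout: four answer options placed around the puck's starting position
options = {"Team A by 1": np.array([ 1.0,  1.0]),
           "Team A by 2+": np.array([ 1.0, -1.0]),
           "Team B by 1": np.array([-1.0,  1.0]),
           "Team B by 2+": np.array([-1.0, -1.0])}
names = list(options)

# 30 simulated participants, each repeatedly pulling toward a preferred option
preferences = rng.choice(names, size=30, p=[0.40, 0.25, 0.20, 0.15])

puck = np.zeros(2)
for _ in range(500):
    # each "magnet" contributes a unit vector from the puck toward its preferred option
    pulls = [(options[p] - puck) / (np.linalg.norm(options[p] - puck) + 1e-9)
             for p in preferences]
    puck += 0.01 * np.mean(pulls, axis=0)   # the puck integrates the combined pull

# the group's answer is taken as the option the puck has moved closest to
winner = min(names, key=lambda n: np.linalg.norm(options[n] - puck))
print("swarm converged toward:", winner)
```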
Again, the participants do not cast discrete votes but express their intent continuously over time, converging together as a system. The image shown in Fig. 2 is a snapshot of the system as it moves across the decision-space and converges upon an answer, a process that generally required between 10 and 60 s to complete. In addition to forecasting each individual game, participants were asked to identify which of the weekly predictions is the most likely to be a correct assessment. In other words, which of the teams forecast to win their games that week should be deemed the “pick of the week” as a consequence of being the most likely team to win its game. Figure 3 shown below is an example of ASI system in the process of identifying the pick of the week. As shown, the system is selecting from among six possible teams to decide which is most likely to win its game that week. Fig. 3. ASI in process of identifying “Pick of the Week”. 2.1 Wagering Protocol By collecting predictions for each of the 10 weekly games as well as a top “pick of the week”, forecasting data was collected across all 20 weeks for accuracy comparison against Vegas betting markets. To enable ROI comparisons against betting markets, two standardized betting models were tracked across the 20-week period. In both models, an initial simulated betting pool of $100 was created as the starting point for ROI computations, the pools tracked over the 20-week period. Arti?cial Human Swarms Outperform Vegas Betting Markets 725 In “Wagering Model A,” a simple heuristic was de?ned which allocated weekly bets equal to 15% of the current betting pool, dividing it equally across all ten weekly fore- casts made by the ASI system. In “Wagering Model B,” a similar heuristic was de?ned which also allocated 15% of the current betting pool for use in weekly bets, but placed the entire 15% upon one game, identi?ed as “pick of the week”. Both pots were tracked over the 20-week period, using actual Vegas payouts to compute returns. Vegas odds used in this study were captured from www.sportsbook.ag, a popular online betting market. 3 Results Across the set of 200 games forecast by the ASI system, an accuracy rate of 61% was achieved. This compares favorably to the expected accuracy of 55% based on Vegas odds (p = 0.0665). Of course, the more important skill in forecasting sporting events is identifying which games can be predicted with high con?dence as compared to those games which are too close to call. This skill is re?ected in the “pick of the week” gener- ated by the ASI system. Across the 20 weeks, the system achieved 85% accuracy in correctly predicting the winner of the “pick of the week” game. This compares very favorably to the expected accuracy of 62% based on Vegas odds. Figure 4 below shows the distribution of Vegas Odds for the twenty selected “pick of the week” games. As described above, the swarm-based system had a win rate of 85% across these same games. This is a signi?cant improvement, equivalent to reducing the error in Vegas Odds by 61%. The probability that the swarm outperformed Vegas Odds by chance was extremely low (p = 0.0057), indicating a highly signi?cant result. Fig. 4. Results across 20 weeks of NHL predictions. 726 L. Rosenberg and G. Willcox In addition, a betting simulation was run for each prediction set in which 15% of the current bankroll was bet on each weekly prediction. The performance of this model, when betting against Vegas is shown below in Fig. 5. 
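A sketch of such a bankroll simulation is given below (Python). The American-moneyline payout convention is assumed for concreteness, and the per-game odds and win/loss outcomes are placeholders rather than the actual odds captured from www.sportsbook.ag; the 15%-of-bankroll allocation rules follow the description in Sect. 2.1.

```python
def payout(stake, moneyline):
    """Profit on a winning bet at American odds (e.g. -150 favorite, +130 underdog)."""
    return stake * (100 / -moneyline) if moneyline < 0 else stake * (moneyline / 100)

def model_a_week(bankroll, games):
    """Model A: wager 15% of the bankroll, split evenly over all ten weekly games."""
    stake = 0.15 * bankroll / len(games)
    profit = sum(payout(stake, ml) if won else -stake for ml, won in games)
    return bankroll + profit

def model_b_week(bankroll, pick):
    """Model B: wager the full 15% on the single 'pick of the week'."""
    stake = 0.15 * bankroll
    ml, won = pick
    return bankroll + (payout(stake, ml) if won else -stake)

# One hypothetical week: (moneyline, did the forecast team win?) for ten games
week = [(-150, True), (-120, False), (+110, True), (-200, True), (+135, False),
        (-105, True), (+120, True), (-140, False), (-160, True), (+100, True)]

bank_a = model_a_week(100.0, week)
bank_b = model_b_week(100.0, week[3])   # suppose game 4 was the pick of the week
print(round(bank_a, 2), round(bank_b, 2))
```

Iterating the weekly update over 20 weeks and comparing the final bankroll with the initial $100 gives the cumulative ROI figures reported below.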
Starting with $100 and investing each week according to this strategy, the Pick of the Week strategy results in a gain of $270.20, equivalent to a 20-week ROI of 170%, and a week-over-week average ROI of 5.09%. For comparison, betting on all of the swarm’s picks evenly (for a total of 15% of the bankroll) results in $121.82, or a 20-week ROI of 21.8%, indicating that the swarm is selecting better than randomly among its picks. Fig. 5. Cumulative betting performance across 20 weeks. While it’s impressive to achieve 170% ROI over 20 weeks, we can gain additional insight into the signi?cance of this outcome by comparing against additional baselines. For example, we can compare these results to (a) randomly placed bets across all games played as a means of assessing if the swarm bets across all games are as signi?cant as they appear, and (b) bets placed on the Vegas favorite each week as a means of assessing if betting on the swarm’s top picks is as impressive as it seems. These baselines are shown in Fig. 6 as the green line and red line, respectively. Looking ?rst at random betting across all games, the net outcome across 20 weeks was $72.39, which equates to 28% loss over the test period. This is signi?cantly worse than the $122 (22% gain) achieved by betting on all swarm-based forecasts. Even more surprising, betting on the Vegas favorites each week resulted in a net outcome of $59, which equates to a 41% loss over the 20-week test period. This is signi?cantly worse than the $270 (170% gain) achieved by betting on the swarm’s top picks. Arti?cial Human Swarms Outperform Vegas Betting Markets 727 Fig. 6. Swarm performance vs Baseline performance across 20 weeks. 4 Conclusions Can real-time human swarms, comprised of average sports fans connected by swarming algorithms, outperform the predictive abilities of largescale betting markets? The results of this study suggest this is very much the case. As demonstrated across a set of 200 games during the 2017–2018 NHL hockey season, an ASI systems comprised of approx- imately 30 typical sports fans, were able to out-forecast Vegas betting markets. This was most signi?cant when the ASI system identi?ed a “pick of the week” as the most likely game to achieve the predicted outcome. Across the 20 weeks, the system achieved 85% accuracy when predicting the “pick of the week” games, which compares favorably to the expected accuracy of 62% based on Vegas odds. The probability that the system outperformed Vegas by chance was extremely low (p = 0.006), indicating a highly signi?cant result. In addition, when using the “pick of the week” within a simple automated wagering heuristic, a simulated betting pool that started at $100, grew to $270 over the 20-week period based on the swarm-based predictions. This was a 170% ROI. Additional work is being conducted to optimize this wagering heuristic, as there appears to be room for improvement when optimizing Vegas wagers based on a swarm-based predictive intel- ligence. Looking towards future research, additional studies are planned to better under- stand which types of problems are best suited for solutions using “human swarms” as well as the impact of swarm size on output accuracy. References 1. Rosenberg, L.: Human swarms, a real-time method for collective intelligence. In: Proceedings of the European Conference on Arti?cial Life 2015, pp. 658–659 2. Rosenberg, L.: Arti?cial swarm intelligence vs human experts. In: 2016 International Joint Conference on Neural Networks (IJCNN). IEEE 728 L. Rosenberg and G. 
Willcox 3. Rosenberg, L., Baltaxe, D., Pescetelli, N.: Crowds vs Swarms, a Comparison of Intelligence. In: IEEE 2016 Swarm/Human Blended Intelligence (SHBI), Cleveland, OH (2016) 4. Baltaxe, D., Rosenberg, L., Pescetelli, N.: Amplifying prediction accuracy using human swarms. In: Collective Intelligence 2017, New York, NY (2017) 5. McHale, I.: Sports Analytics Machine (SAM) as reported by BBC. http://blogs.salford.ac.uk/ business-school/sports-analytics-machine/ 6. Rosenberg, L., Willcox, G.: Arti?cial Swarms ?nd Social Optima. In: 2018 IEEE Conference on Cognitive and Computational Aspects of Situation Management (CogSIMA 2018) – Boston, MA (2018) 7. Bonabeau, E.: Decisions 2.0: The power of collective intelligence. MIT Sloan Manag. Rev. 50(2), 45 (2009) 8. Woolley, A.W., Chabris, C.F., Pentland, A., Hashmi, N., Malone, T.W.: Evidence for a collective intelligence factor in the performance of human groups. Science 330(6004), 686– 688 (2010) 9. Surowiecki, J. The wisdom of crowds. Anchor (2005) 10. Seeley, T.D., Buhrman, S.C.: Nest-site selection in honey bees: how well do swarms implement the ‘best-of-N’ decision rule? Behav. Ecol. Sociobiol. 49, 416–427 (2001) 11. Marshall, J., Bogacz, R., Dornhaus, A., Planqué, R., Kovacs, T., Franks, N.: On optimal decision-making in brains and social insect colonies. Soc. Interface (2009) 12. Seeley, T.D., et al.: Stop signals provide cross inhibition in collective decision-making by honeybee swarms. Science 335(6064), 108–111 (2012) 13. Seeley, T.D.: Honeybee Democracy. Princeton University Press, Princeton (2010) 14. Seeley, T.D., Visscher, P.K.: Choosing a home: how the scouts in a honey bee swarm perceive the completion of their group decision making. Behav. Ecol. Sociobiol. 54(5), 511–520 Arti?cial Human Swarms Outperform Vegas Betting Markets 729 Genetic Algorithm Based on Enhanced Selection and Log-Scaled Mutation Technique Neeraj Gupta1(B) , Nilesh Patel1 , Bhupendra Nath Tiwari2 , and Mahdi Khosravy3 1 Department of Computer Science and Engineering, Oakland University, Rochester, MI, USA {neerajgupta,npatel}@oakland.edu 2 INFN-Laboratori Nazionali di Frascati, Via. E. Fermi, 40 – I – 00044, Frascati, Rome, Italy bhupendray2.tiwari.phd@iitkalumni.org 3 Department of Electrical and Electronics Engineering, Fedral University of Juiz de Fora, Juiz de Fora, Brazil mahdi.khosravy@ufjf.edu.br Abstract. In this paper, we introduce the selection and mutation schemes to enhance the computational power of Genetic Algorithm (GA) for global optimization of multi-modal problems. Proposed operators make the GA an e?cient optimizer in comparison of other variants of GA with improved precision, consistency and diversity. Due to the presented selection and mutation schemes improved GA, as named Enhanced Selec-tion and Log-scaled Mutation GA (ESALOGA), selects the best chro-mosomes from a pool of parents and children after crossover. Indeed, the proposed GA algorithm is adaptive due to the log-scaled mutation scheme, which corresponds to the ?tness of current population at each stage of its execution. Our proposal is further supported via the sim-ulation and comparative analysis with standard GA (SGA) and other variants of GA for a class of multi-variable objective functions. Addi-tionally, comparative results with other optimizers such as Probabilistic Bee Algorithm (PBA), Invasive Weed Optimizer (IWO), and Shu?ed Frog Leap Algorithm (SFLA) are presented on higher number of vari-ables to show the e?ectiveness of ESALOGA. 
Keywords: Selection operator · Mutation operator · Log-scaled mutation · Diversity preservation · Genetic algorithms · Metropolis algorithm

1 Introduction

Rapid industrial growth and the efficient utilization of available resources are of prime importance nowadays, for example, route identification in traffic systems, optimization of process allocation to maximize production, utilization of energy resources in power systems, optimization of VLSI circuit design, CAN optimization in vehicles, etc. [1–12]. Most industrial problems are complex in nature and belong to combinatorial optimization, where the main focus is to optimize discrete variables so as to maximize or minimize the required objectives [1,2]. Two traditional methods are available to solve this type of problem, namely integer programming and dynamic programming; these are known as exact algorithms [3,5]. However, for very large optimization problems where a fast solution is required, such algorithms cannot be relied on because of their computational complexity. Optimization in this respect may be critically important for the sustainable growth of industries competing in highly uncertain economic environments [3–5].

Hence, in the last two decades, a large number of researchers have focused on approximate methods as an alternative approach for solving combinatorial problems, producing solutions close to the optimal state in a reasonably acceptable time. The development of heuristic algorithms in the fields of mathematics, engineering, etc. [6,7] has demonstrated successful implementations for real-life problems. As a result, a considerable number of heuristic evolutionary algorithms have been invented to work efficiently on linear/nonlinear, differentiable/non-differentiable, and concave/convex problems with discrete variables [6–9]. A general description of complex functions can be seen in [10], their applications with discrete variables to power system design in [11,12], and applications involving the capacity of energy generators, the quantity of goods produced, the number of vehicles on a route, etc. in [13–15].

Since GA works on binary variables, hardware-friendly algorithms have been proposed in many variants to solve combinatorial problems. The literature survey shows considerable scope to improve GA further by an appropriate combination of mathematical modeling and heuristic concepts [9]. GA and its associated variants have been shown to give globally optimal solutions, especially for multi-modal, non-differentiable, combinatorial, and industrial problems [16–18]. Moreover, GA is very easy to implement and has the advantage that its operators are developed in a simple way from the inspiration of genetic processes, which have been rigorously investigated at a large scale during the last two decades [1–19]. As developed by John Henry Holland [20], GA is inspired by the "survival of the fittest" principle, which mimics the natural process of evolution in terms of several operators: the selection, crossover, and mutation operators [20].
An adaptation of these operators has been analyzed and modeled by a large community of researchers, several of whom have provided evidence for, and improved, GA by introducing novel selection approaches for the fittest individuals, new crossover variants, and new mutation schemes. These improved GA models keep the search from getting stuck in premature convergence. In the light of GA research, this paper offers a combination of mathematical modeling and a heuristic approach in order to find the globally optimal solutions of multimodal nonlinear functions. It is worth mentioning that over the last few decades GA has been established as a successful heuristic evolutionary technique for addressing various global combinatorial industrial problems, and it has been widely used due to its simple structure; see for instance [9,16,21–23].

Nevertheless, although GA has powerful optimization fundamentals, it also has a few drawbacks, which are discussed in a number of readings [8,9,16,24]. GA converges prematurely due to improper selection, crossover and mutation probabilities and the associated criteria [25–27]. In these papers, a variant of GA is described as a modification of the GA model parameters, i.e., the selection method, crossover operator, mutation operator, and the underlying probabilities. Based on [28], elitism ensures that winner chromosomes go into the next-generation process, which moves the search from a premature to a mature phase. This is exploited in Sect. 3. Hereby, in the light of Adaptive GA [29], our proposal further evolves the mutation probability based on the present state of all candidates by using probabilistic modeling.

This paper is structured in seven sections. Firstly, in Sect. 2, we provide a brief step-by-step description of the GA algorithm, since our proposal arises as an improvement of it. In Sect. 3, the most important part of this paper, a brief description of the proposed enhanced selection scheme and log-scaled mutation operators is provided. Consequently, Sect. 4 presents a binary coded Enhanced Selection and Log-scaled Mutation Genetic Algorithm (ESALOGA) that, as an optimization package, solves combinatorial problems. Section 5 presents simulated results in comparison with other variants of GA and three real-coded optimizers on multi-modal benchmark functions. Finally, Sects. 6 and 7 respectively conclude the paper and give future research directions and improvements.

2 Binary Coded GA

A step-by-step operation of the binary coded GA is presented [9], which first allows one to understand the concept of GA and the symbiotic integration of its different operators, i.e., the selection, crossover and mutation operators.

Step 1: At first, the parameters of GA are initialized: the crossover and mutation probabilities $P_c$ and $P_m$, with $P_m \ll P_c$; the number of chromosomes $s$ in the population; and the number of bits $l$ used to represent one variable, which decides the length of the chromosomes, namely $nl$ for $n$ variables in the chosen problem. The termination criterion, i.e., the maximum number of generations that GA may run, is selected based on the problem size.

Step 2: To start the evolution process, the fitness of each chromosome in the population is calculated. In this process, the part of the binary chromosome representing a variable is decoded into a decimal value $d_n = \sum_{i=0}^{l-1} 2^i\, b_i^n$, where $b_i^n \in \{0,1\}$ belongs to the $n$th variable. The value of the $n$th variable is obtained within the bounds $x_n^{(L)} \le x_n \le x_n^{(U)}$, where $x_n$ is calculated as $x_n = x_n^{(L)} + \frac{x_n^{(U)} - x_n^{(L)}}{2^{l}-1}\, d_n$ from its respective lower and upper bounds $x_n^{(L)}$ and $x_n^{(U)}$. After converting the variables into the required domain, the associated objective function $f(x)$ is calculated for all individuals represented by the chromosome strings in the population. For a minimization problem, the fitness function $F_s$ associated with chromosome $s$ is adopted as $F_s = \frac{1}{1+f_s(x)}$, which is a function of the objective function $f_s(x)$.
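A small sketch of this decoding and fitness evaluation is given below (Python). The least-significant-bit-first ordering, the example objective function, the bounds, and the bit width are placeholder assumptions chosen for illustration.

```python
def decode_variable(bits, lower, upper):
    """Map a list of 0/1 bits to a real value in [lower, upper] using
    d = sum(2^i * b_i) and x = lower + (upper - lower) * d / (2^l - 1)."""
    l = len(bits)
    d = sum(b << i for i, b in enumerate(bits))
    return lower + (upper - lower) * d / (2 ** l - 1)

def decode_chromosome(chromosome, bounds, bits_per_var):
    """Split an nl-bit chromosome into n variables and decode each of them."""
    return [decode_variable(chromosome[i * bits_per_var:(i + 1) * bits_per_var], lo, hi)
            for i, (lo, hi) in enumerate(bounds)]

def fitness(chromosome, objective, bounds, bits_per_var):
    """Fitness for a minimization problem: F = 1 / (1 + f(x))."""
    x = decode_chromosome(chromosome, bounds, bits_per_var)
    return 1.0 / (1.0 + objective(x))

# Illustrative use: sphere function in two variables, 8 bits per variable
sphere = lambda x: sum(v * v for v in x)
chrom = [1, 0, 1, 1, 0, 0, 1, 0,   0, 1, 1, 0, 1, 0, 0, 1]
print(fitness(chrom, sphere, bounds=[(-5.0, 5.0), (-5.0, 5.0)], bits_per_var=8))
```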
Step 3: At this point, a selection operator selects the fittest chromosomes as the candidates that go for mating, based on roulette wheel selection [9,30]. This is the first stage of the GA process for which multiple different operators have been proposed, i.e., the roulette wheel as in the standard GA, and tournament and uniform selection in variants of GA. In our proposal, we introduce an enhanced selection scheme, which is applied after Step 4 instead of in Step 3.

Step 4: Next, the crossover operator produces a number of strings from the mating pool using a fixed crossover probability $P_c$. For a selected pair of candidates, known as parents, a cross-site is generated randomly in the interval $(0, nl-1)$ and the selected regions are swapped between the two parents. At this step, different crossover mechanisms have been proposed, such as single-point, multi-point, and uniform crossover. The use of different crossover techniques turns the standard GA into one of its variants.

Step 5: After the above serial processes, children chromosome strings arise as the result; their population is known as the intermediate population, as in [9]. At this step, we have a pool of parents and their resulting offspring. Our proposal aims to answer which candidates should go to the next evolution phase as better parents.

Step 6: At this juncture, bitwise mutation is carried out: as a result of the mutation operator, a selected bit in the chromosome is flipped to the opposite binary value based on a relatively low fixed mutation probability $p_m$. To make the process adaptive with respect to the current status of the population, we propose a log-scaled mutation technique.

Step 7: If the termination criterion is not reached, return to Step 2.

3 Selection and Mutation Schemes

In this section, we provide the step-by-step working principle of the proposed enhanced selection and log-scaled mutation operators, which turn the improved GA (ESALOGA) into a better optimization technique.

3.1 Proposed Selection Operator

Based on the Metropolis algorithm [31], we focus on possible improvements of the GA for finding the optimal solution in the course of the crossover, while selecting chromosome strings. This keeps intact a high degree of diversity in selecting the children that are most suitable when the chosen parents undergo a crossover. To choose appropriate candidates from the current pool of parents and offspring, a block diagram of the proposed selection strategy is given in Fig. 1. Mathematically, this is realized by introducing a selection probability of Boltzmann form. Precisely, let $T$ be the temperature; then the selection probability $p(T)$ reads as the Maxwellian (Boltzmann) factor

$$p(T) = e^{-\Delta E / kT}, \qquad (1)$$

where $\Delta E$ represents the change in energy between the chosen parents and children. With the above probability $p(T)$, a set of selected strings is passed to the next stage of evolution.
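The acceptance rule at the heart of this operator can be sketched as follows (Python). It is a simplified, pooled version of the pairwise procedure detailed in the steps below; the fitness values, the temperature, and the constant k are placeholders.

```python
import math
import random

def boltzmann_accept(fitness_ref, fitness_candidate, temperature, k=1.0):
    """Metropolis-style test: accept the candidate string with probability
    p = min(1, exp(-(F_ref - F_candidate) / (k * T))); a candidate at least
    as fit as the reference string is always accepted."""
    delta_e = fitness_ref - fitness_candidate
    p = min(1.0, math.exp(-delta_e / (k * temperature)))
    return random.random() < p

def select_from_pool(pool_fitnesses, temperature):
    """From a pool of parents and children (represented here only by their
    fitness values), keep the indices of the strings that pass the
    Boltzmann acceptance test against the fittest member of the pool."""
    fittest = max(pool_fitnesses)
    return [i for i, f in enumerate(pool_fitnesses)
            if boltzmann_accept(fittest, f, temperature)]

# Illustrative pool: two parents and two children with made-up fitness values
pool = [0.62, 0.55, 0.71, 0.40]
print(select_from_pool(pool, temperature=0.1))
```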
It is worth mentioning that the principle of elitism [9] retains the string with the best fitness value from a given pool of parents and children. Following (1), the subsequent strings are selected that were the fittest strings in the previous stage of the evolution.

Fig. 1. Flow diagram for the selection strategy after crossover.

The proposed model is realized as per the following steps:

Step 1: Choose an initial value of the temperature T as

T = α M / I,   (2)

where M is the maximum value of the fitness function {F_s | s = 1, 2, ..., 20}, I is the number of iterations and s labels the strings pertaining to the crossover of a given population. Note that the initial value of the temperature T is taken as large as possible, such that it decreases in the subsequent iterations to its desired value. Here, the proportionality constant α is set as per the chosen algorithm.

Step 2: In order to find the energy difference, one chooses the jth string in the given pool of parents and children and subtracts its fitness value F_j from that of a previously selected string, F_f. In other words, the energy difference that governs the probability distribution is given by

ΔE = F_f − F_j   (3)

with j = 1, 2, 3.

Step 3: Compute p as per Eq. (1) and obtain its minimum value as

p = min(1, e^(−ΔE / kT)).   (4)

Step 4: Acquire a random number r ∈ (0, 1).

Step 5: If r < p, the candidate string is selected, and the corresponding previously selected partner string is the fittest one.

Step 6: Else, go to Step 2 and repeat the search. In the case when none of the strings is selected, one increases the value of the mutation probability p_m. In practical situations, we may consider the value p_m = 0.1.

Step 7: Finally, one selects the partner string chromosome by repeating Steps 2 to 6.

3.2 Proposed Mutation Operator

In this subsection, we offer the log-scaled mutation strategy as given in Fig. 2, with the corresponding operations as below:

Step 1: Obtain the mutation probability for a given fitness value F_s via the transformation y_s = log10 F_s.

Step 2: For the maximum fitness value F_s^max, define y_s^max = log10 F_s^max.

Step 3: Corresponding to the minimum fitness value F_s^min, define y_s^min = log10 F_s^min.

Step 4: y_s^max is mapped to the minimum mutation probability p_m^min, such that the best candidates remain intact.

Step 5: y_s^min is mapped to the maximum mutation probability p_m^max, such that the worst candidates mutate.

Step 6: Define a linear relationship between y_s and p_{m,s} as

p_{m,s} = p_m^max − ((p_m^max − p_m^min) / (y_s^max − y_s^min)) (y_s − y_s^min),   (5)

where the (negative) ratio of p_m^max − p_m^min and y_s^max − y_s^min gives β, the slope of the line plotted between p_{m,s} and y_s. This leads to the following linear equation:

p_{m,s} = β y_s + γ,   (6)

where γ is the intercept of the line in (6),

γ = p_m^max + ((p_m^max − p_m^min) / (y_s^max − y_s^min)) y_s^min.   (7)

With the above slope β and intercept γ, the mutation probability p_{m,s} is obtained by the following logarithmic relation:

p_{m,s} = β log10 F_s + γ,   (8)

where s labels the chromosome at hand. Physically, this shows the inverse relation [9] between the fitness value F_s and the mutation probability p_{m,s}.

Fig. 2. Log-scaled mutation strategy.

Step 7: This assigns a unique mutation probability p_{m,s} to each candidate string in the range (p_m^min, p_m^max), viz. we have

p_m^min ≤ p_{m,s} ≤ p_m^max.   (9)

Step 8: Finally, diversity in the selected population is realized by a bitwise mutation process.
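As a minimal illustration (not the authors' code), the mapping of Eqs. (5)–(9) from fitness values to per-string mutation probabilities can be sketched as follows; the default bounds p_min and p_max are illustrative assumptions, not values fixed by the paper.

```python
import math

def log_scaled_mutation_prob(fitness, p_min=0.001, p_max=0.05):
    """Map each fitness value F_s to a mutation probability in [p_min, p_max]
    via y_s = log10(F_s); the fittest string gets p_min, the least fit gets p_max."""
    y = [math.log10(f) for f in fitness]          # Step 1: y_s = log10(F_s)
    y_max, y_min = max(y), min(y)                 # Steps 2-3
    if y_max == y_min:                            # degenerate population: no spread
        return [p_min] * len(fitness)
    beta = -(p_max - p_min) / (y_max - y_min)     # negative slope, Eq. (6)
    gamma = p_max - beta * y_min                  # intercept, Eq. (7)
    return [beta * ys + gamma for ys in y]        # Eq. (8), bounded as in Eq. (9)
```

For example, log_scaled_mutation_prob([0.9, 0.5, 0.1]) assigns the smallest probability to the fittest string (0.9) and the largest to the weakest (0.1), which is the inverse relation stated above.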
The fitness values of the strings are usually sparse; thus we propose a log-scaled mutation operator. In this approach, we find that all mutation probabilities are kept within a specified range, irrespective of variations in the fitness values. This makes our proposal adaptive and yields an evolution from the premature to the mature phase of a given population. In a nutshell, we have illustrated that there is a non-linear relationship between mutation probability and fitness value as far as evolutionary algorithms are concerned. In addition, it follows that a higher fitness value leads to a lower mutation probability. This provides a comparatively larger search space while finding the global optimal solution.

4 Proposed GA (ESALOGA)

Based on the proposed enhanced selection and log-scaled mutation strategies, we provide below the pseudo-code of the algorithm. For the given input parameters, a binary initial population P is generated randomly; the candidate chromosome strings and the mutation probability p_m are adaptively selected by the enhanced selection operation (EnSelection) and within the given mutation range (p_m^min, p_m^max), respectively. Produce the mating pool for breeding for a given crossover probability p_c. Extract two parents from the mating pool using the standard roulette wheel (RW) selection operator. Indeed, other selection schemes such as tournament and uniform selection could be adopted as well for better performance. Perform the single-point crossover operation to produce two children. In fact, instead of single-point, the use of two-point or uniform crossover may enhance the computational capability. At this junction, form a pool of the two parents and their produced children, and choose two appropriate candidate strings using the enhanced selection (EnSelection) operator with probability p(T) as in (1). As a result, two appropriate candidates are selected to go into the next evolution. When no chromosomes are selected from the pool, mutate all the strings with an increased mutation probability p_m and repeat the EnSelection operation. After this operation we get the intermediate population, which is subjected to the mutation operator with mutation probability p_m. Following the log-scaled strategy (LSMut), produce the population of mutated strings (Pm), as in Algorithm 1. The best chromosomes are then taken from the above two populations, as shown in line 13 of the algorithm. Repeat these steps until the termination criterion is reached.

Algorithm 1. Pseudo-code for the proposed ESALOGA
Require: N: the number of chromosomes, pc: crossover probability, tmax: maximum iterations, pm: mutation probability, pm^min: lower bound on pm, pm^max: upper bound on pm, b: number of bits to represent one variable, v: number of variables.
P ← round(rand(N, b·v))  : initialize binary population randomly
1: GP ← best of [P]  : GP is the best solution in the current P
2: for i ← 1 to tmax do
3:   n ← 1
4:   while n ≤ N do
5:     [Parent1, Parent2] ← Selection(P)  : RW or tournament selection operation
6:     [Children1, Children2] ← Xover(Parent1, Parent2)  : crossover operation
7:     [string1, string2] ← EnSelection(Parent1, Parent2, Children1, Children2)  : enhanced selection operation to select two appropriate strings
8:     P(n) ← string1
9:     P(n+1) ← string2
10:    n ← n + 2
11:  end while
12:  Pm ← LSMut(P)  : log-scaled mutation after crossover
13:  P ← N best chromosomes of [P, Pm]
14:  GP(i) ← best of [P]
15:  if Fitness(GP(i)) < Fitness(GP(i−1)) then
16:    GP(i) ← GP(i−1)
17:  end if
18: end for
19: return GP

5 Results and Discussion

In this section, we demonstrate the effectiveness of the proposed GA on various benchmark functions [9,33]. We compare a few variants of the GA, distinguished by their different selection and crossover strategies; an outline is given in Table 1. All of these variants are discussed in [32,33] and tested on the benchmark functions that are concisely tabulated in Table 2. We firstly present the results on the Goldstein–Price, Levi, Beale, Himmelblau, Ackley and Rastrigin benchmark functions. Note that the Rastrigin and Himmelblau functions are multimodal in their nature, while the Ackley function possesses a large hole at its center together with multimodality. On the other hand, the Beale function is unimodal, with four sharp peaks at the corners. Similarly, the Levi function has a non-linear search space that may lead to premature convergence in due course of the execution of an optimization algorithm. Equally, it is worth noticing that an optimization algorithm may get trapped in some of the local minima of the objective function, which our proposal overcomes by maintaining a larger diversity, as shown in Fig. 3 for different problems. Simulation results for the comparative analysis of ESALOGA with respect to the standard GA, VGA-1, VGA-2, VGA-3 and VGA-4 are given in Table 3 for 100 runs on the aforementioned two-variable problems.

Table 1. Selection and crossover strategies in variants of GA (VGA)

GA variant   Selection    Crossover
SGA          RW           Single-point
VGA-1        Random       Two-point
VGA-2        RW           Uniform
VGA-3        Random       Uniform
VGA-4        Tournament   Uniform

Table 2. Benchmark functions for testing ESALOGA

Himmelblau: f(x_1, x_2) = (x_1^2 + x_2 − 11)^2 + (x_1 + x_2^2 − 7)^2, with variable limits −6 ≤ x_1, x_2 ≤ 6
Rastrigin: f(x) = A n + Σ_{i=1}^{n} (x_i^2 − A cos(2π x_i)), with variable limits −5.12 ≤ x_i ≤ 5.12
Ackley: f(x_1, x_2) = −20 exp(−0.2 √(0.5 (x_1^2 + x_2^2))) − exp(0.5 (cos(2π x_1) + cos(2π x_2))) + e + 20, with a = 20, b = 0.2, c = 2π and variable limits −35 ≤ x_i ≤ 35
Beale: f(x_1, x_2) = (1.5 − x_1 + x_1 x_2)^2 + (2.25 − x_1 + x_1 x_2^2)^2 + (2.625 − x_1 + x_1 x_2^3)^2, with variable limits −4.5 ≤ x_1, x_2 ≤ 4.5
Levi: f(x_1, x_2) = sin^2(3π x_1) + (x_1 − 1)^2 (1 + sin^2(3π x_2)) + (x_2 − 1)^2 (1 + sin^2(2π x_2)), with variable limits −10 ≤ x_1, x_2 ≤ 10
Goldstein–Price: f(x_1, x_2) = (1 + (x_1 + x_2 + 1)^2 (19 − 14x_1 + 3x_1^2 − 14x_2 + 6x_1 x_2 + 3x_2^2)) (30 + (2x_1 − 3x_2)^2 (18 − 32x_1 + 12x_1^2 + 48x_2 − 36x_1 x_2 + 27x_2^2)), with variable limits −2 ≤ x_1, x_2 ≤ 2
Styblinski–Tang: f(x) = (1/2) Σ_{i=1}^{n} (x_i^4 − 16x_i^2 + 5x_i), with variable limits −5 ≤ x_i ≤ 5
Michalewicz: f(x) = −Σ_{i=1}^{n} sin(x_i) sin^{2m}(i x_i^2 / π), with variable limits 0 ≤ x_i ≤ π
Schaffer No. 2: f(x) = 0.5 + Σ_{i=1}^{n−1} (sin^2(x_i^2 − x_{i+1}^2) − 0.5) / (1 + 0.001 (x_i^2 + x_{i+1}^2))^2, with variable limits −100 ≤ x_i ≤ 100
Deceptive: f(x) = −((1/n) Σ_{i=1}^{n} g_i(x_i))^β, with variable limits 0 ≤ x_i ≤ 1 and β = 2
Keane bump: f(x) = −| (Σ_{i=1}^{n} cos^4(x_i) − 2 Π_{i=1}^{n} cos^2(x_i)) / (Σ_{i=1}^{n} i x_i^2)^{0.5} |, subject to g_1(x) = 0.75 − Π_{i=1}^{n} x_i < 0 and g_2(x) = Σ_{i=1}^{n} x_i − 7.5n < 0

Results are compared on six attributes: the best value achieved by each algorithm, the mean of all solutions over the 100 runs, the standard deviation (Std) of the solutions achieved in the 100 runs, the reliability of each algorithm (the fraction of runs whose achieved solution is lower than the mean of the proposed GA), the worst value achieved, and finally the average time taken by each algorithm for 1000 evolution epochs. These averaged measurements give a consistent and accurate determination of the approximate global optimal point, and thereby of the effectiveness of our proposed algorithm. Interestingly, while SGA and the other variants get trapped in one of their local optima, our proposed algorithm successfully terminates by locating the global optimum for the various benchmark functions.

The corresponding comparative results on diversity preservation are depicted in Fig. 3. In this figure, one can observe the spread of the search for the Himmelblau, Beale, Ackley and Levi functions. As we can see, SGA gets trapped at one point, whereas ESALOGA examines different points while searching for the global solution. An approximately similar effect can be seen for the other functions. We address the issue of premature convergence of the algorithm through diversity preservation, where most of the GA variants behave similarly. Thus, we have proposed an enhanced selection scheme to overcome this condition of premature convergence. We can equally maintain diversity preservation adequately, as shown in Fig. 3, which makes our algorithm relatively efficient.

Fig. 3. Comparative result for the diversity preservation for the same number of generations (Left: Standard GA, Right: Proposed GA).
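Before turning to the comparative results in Table 3, two of the Table 2 benchmarks can be written directly in Python as a small reference sketch (standard library only; the Rastrigin constant A = 10 is the usual choice and is an assumption here, since Table 2 leaves A symbolic).

```python
import math

def rastrigin(x, A=10.0):
    """Rastrigin: A*n + sum(x_i^2 - A*cos(2*pi*x_i)); global minimum 0 at x = 0."""
    return A * len(x) + sum(xi * xi - A * math.cos(2 * math.pi * xi) for xi in x)

def ackley(x1, x2):
    """Two-variable Ackley as given in Table 2; global minimum 0 at (0, 0)."""
    return (-20.0 * math.exp(-0.2 * math.sqrt(0.5 * (x1 * x1 + x2 * x2)))
            - math.exp(0.5 * (math.cos(2 * math.pi * x1) + math.cos(2 * math.pi * x2)))
            + math.e + 20.0)
```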
Table 3. Comparative simulation results of the proposed GA and other GA variants in 100 runs

                 PGA         SGA         VGA-1       VGA-2       VGA-3       VGA-4
Goldstein–Price
  Best           3.0010      3.0010      3.0010      3.0010      3.0010      3.0010
  Mean           3.0806      11.4169     11.5692     6.1199      5.7327      12.7232
  Std            0.0734      17.1370     17.9966     14.0022     8.2900      17.3742
  Reliability    60%         56%         56%         60%         60%         50%
  Worst          3.313       88.868      84.080      89.541      32.634      76.699
  Time           1.2953      2.9910      3.2558      2.9816      2.9116      3.6707
Levi
  Best           7.8091e-04  5.5598e-05  5.5598e-05  5.5598e-05  5.5598e-05  5.5598e-05
  Mean           0.0268      0.0555      0.0712      0.1220      0.1019      0.1432
  Std            0.0270      0.1362      0.1733      0.2254      0.1790      0.3829
  Reliability    60%         60%         70%         64%         54%         48%
  Worst          0.110       0.725       0.725       0.725       0.725       2.600
  Time           1.3567      3.1313      3.7612      3.0177      3.3239      3.9555
Beale
  Best           3.1186e-05  8.0472e-05  8.0472e-05  3.1186e-05  3.1186e-05  2.1385e-04
  Mean           0.0024      0.2249      0.2651      0.2645      0.1736      0.2841
  Std            0.0027      0.3054      0.4824      0.3083      0.2815      0.3226
  Reliability    60%         14%         14%         14%         24%         60%
  Worst          0.012       0.926       2.689       0.816       0.926       0.974
  Time           1.3624      2.9880      3.2909      2.9839      2.9306      3.6881
Himmelblau
  Best           3.9863e-05  4.9682e-04  4.9682e-04  4.9682e-04  4.9682e-04  4.9682e-04
  Mean           0.0633      0.2900      0.1968      0.2373      0.1486      0.6404
  Std            0.2850      0.7672      0.5683      0.6091      0.4067      1.2703
  Reliability    96%         76%         84%         76%         86%         64%
  Worst          1.444       4.705       2.755       3.717       1.643       6.625
  Time           50.8745     3.6972      4.1740      3.4376      3.4056      4.5079
Ackley
  Best           0.0182      0.1982      0.1982      0.1982      0.1982      0.1982
  Mean           0.0372      0.6684      0.4341      0.6788      0.3161      0.9028
  Std            0.0097      1.1057      0.8200      1.0503      0.5920      1.2887
  Reliability    42%         0%          0%          0%          0%          0%
  Worst          0.061       3.639       3.639       3.639       3.639       3.639
  Time           1.4144      3.0694      3.3770      3.3646      3.2705      3.6703
Rastrigin
  Best           0.0099      0.0104      0.0104      0.0104      0.0104      0.0104
  Mean           0.2907      1.0351      1.2141      0.9595      1.5399      2.2707
  Std            0.3054      1.0248      1.5045      1.0608      1.3773      2.1881
  Reliability    66%         32%         32%         34%         20%         16%
  Worst          1.0160      4.1020      7.9655      4.9817      5.0958      9.1854
  Time           1.3768      2.8542      3.1760      2.8413      3.0166      3.9116

Further, we see from Fig. 3 that our proposal reveals various local and global optimal points of the aforementioned benchmark functions and offers a great diversity in the searching process, instead of returning to the same point under different evolutions. This yields an appropriate optimization with high diversity preservation in a given mating pool. We find an improved reliability (in percentage), as shown in Table 3, in contrast to the standard GA and its other variants, which get trapped in an intermediate suboptimal state most of the time. Hereby, we find that the average performance of ESALOGA is favourable, as is its standard deviation. The time taken by ESALOGA is also reasonable. Moreover, from the results on the Himmelblau function, one can observe that ESALOGA tries its best to find a better solution, but at the cost of its runtime. This reveals that ESALOGA aims to deliver a better solution every time. As a matter of fact, our algorithm yields an intelligent mechanism to escape from suboptimal traps and local optima for this class of benchmark functions. By tuning the selective pressure to a higher value, we can generate the desired diversity in the population and scan the entire search space while searching for the global optimum. This provides an appropriate trade-off between the selective pressure and the diversity pressure.
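Before comparing against other optimizers, it may help to see Algorithm 1 end to end. The following Python sketch mirrors its main loop under several simplifying assumptions (it is an illustration, not the authors' implementation): all variables share one bound pair, the temperature is fixed rather than following the schedule of Eq. (2), and the population size, bit count and mutation bounds are illustrative defaults.

```python
import math
import random

def esaloga(objective, n_vars, bounds, N=20, bits=16, p_c=0.8,
            p_min=0.001, p_max=0.05, t_max=100, temperature=1.0):
    """Schematic rendering of Algorithm 1 (ESALOGA); operators follow Sects. 2-3,
    but constants and helper code here are illustrative assumptions."""
    lo, hi = bounds  # single bound pair shared by all variables (simplification)

    def decode(chrom):  # Step 2 of Sect. 2: bit string -> real variables
        xs = []
        for v in range(n_vars):
            gene = chrom[v * bits:(v + 1) * bits]
            d = sum(b << i for i, b in enumerate(gene))
            xs.append(lo + (hi - lo) * d / (2 ** bits - 1))
        return xs

    def fitness(chrom):  # F = 1 / (1 + f(x)), assumes f(x) >= 0 on the domain
        return 1.0 / (1.0 + objective(decode(chrom)))

    def roulette(pop, fits):  # standard roulette wheel selection
        r, acc = random.uniform(0, sum(fits)), 0.0
        for chrom, f in zip(pop, fits):
            acc += f
            if acc >= r:
                return chrom
        return pop[-1]

    def xover(p1, p2):  # single-point crossover with probability p_c
        if random.random() > p_c:
            return p1[:], p2[:]
        cut = random.randrange(1, len(p1))
        return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

    def en_select(pool):  # enhanced selection of Sect. 3.1 (Metropolis-style)
        best = max(pool, key=fitness)
        others = [c for c in pool if c is not best]
        keep = [c for c in others if random.random() <
                min(1.0, math.exp(-(fitness(best) - fitness(c)) / temperature))]
        return best, (keep[0] if keep else random.choice(others))

    def ls_mutate(pop, fits):  # log-scaled mutation of Sect. 3.2, Eqs. (5)-(9)
        ys = [math.log10(max(f, 1e-12)) for f in fits]
        y_max, y_min = max(ys), min(ys)
        out = []
        for chrom, y in zip(pop, ys):
            pm = p_min if y_max == y_min else \
                p_max - (p_max - p_min) * (y - y_min) / (y_max - y_min)
            out.append([1 - b if random.random() < pm else b for b in chrom])
        return out

    P = [[random.randint(0, 1) for _ in range(bits * n_vars)] for _ in range(N)]
    best = max(P, key=fitness)
    for _ in range(t_max):                                # lines 2-18 of Algorithm 1
        new_p = []
        while len(new_p) < N:                             # lines 4-11
            fits = [fitness(c) for c in P]
            p1, p2 = roulette(P, fits), roulette(P, fits)
            c1, c2 = xover(p1, p2)
            new_p.extend(en_select([p1, p2, c1, c2]))     # line 7
        P = new_p[:N]
        Pm = ls_mutate(P, [fitness(c) for c in P])        # line 12
        P = sorted(P + Pm, key=fitness, reverse=True)[:N] # line 13
        best = max([best] + P, key=fitness)               # lines 14-17 (elitism)
    return decode(best), objective(decode(best))
```

For example, esaloga(lambda x: (x[0]**2 + x[1] - 11)**2 + (x[0] + x[1]**2 - 7)**2, n_vars=2, bounds=(-6, 6)) runs the sketch on the Himmelblau function from Table 2.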
5.1 Comparison of Proposed GA with Other Optimizers

In this section, we extend our algorithm to higher dimensions and provide a comparison of ESALOGA with other optimizers on certain complex functions, i.e., Rastrigin, Ackley, Schaffer No. 2 [34], Michalewicz [35], Styblinski–Tang [36], Deceptive [37,38] and the constrained Keane bump [39], as shown in Table 2. These functions can be extended to arbitrary dimensions and are difficult to investigate analytically due to their nonlinearity. They are related to real-world problems; for example, the Ackley function is considered as the free-energy hypersurface of proteins. Most of the above test functions add the difficulty of being less symmetric and possessing higher harmonics, which makes them difficult to solve and keeps the environment uncertain. Notably, the Schaffer function has concentric barriers, whereby it is capable of discriminating between different optimizers. Hereby, we have tested our algorithm on highly nonlinear, multimodal functions with a large number of local extrema. As mentioned above, one of them is the Michalewicz function, a peculiar mathematical function having n! local optima in n dimensions. Our optimization algorithm has given an improved solution, as shown below in Table 4. The Styblinski–Tang function [36] is considered further. Another complex function is the deceptive function, which finds its importance in discriminating between different optimizers; its computational difficulties are documented in the existing literature [37,38]. Here, we show the results on the above complex functions, which qualify our algorithm as an apt global optimizer. In the sequel, we focus on a constrained complex function in multiple dimensions, namely the Keane bump function, taken as a test function. It is highly nonlinear and difficult to solve with existing optimizers because its solution lies on a nonlinear boundary. The performance of ESALOGA has been analyzed in comparison to VGA-4, probabilistic bee optimization (PBA) [40], invasive weed optimization (IWO) [41] and the shuffled frog leaping algorithm (SFLA) [42]. The comparative results of the above optimizers on ten-dimensional test functions are shown in Table 4. The parameter settings of all the optimizers are taken as follows:

1. VGA-4 parameters:
   (a) Fifty chromosomes are taken in a population
   (b) The crossover probability is fixed at 0.8 to form the mating pool
   (c) The mutation probability is taken as 0.02
2. PBA parameters:
   (a) The number of scout bees is 50
   (b) The recruited bees scale is defined as round(0.3 × 50)
   (c) The neighborhood radius is set to 0.1 × (maximum variable value − minimum variable value)
   (d) The neighborhood radius damp rate is 0.9
3. IWO parameters:
   (a) The population size is taken as 50
   (b) The minimum and maximum numbers of seeds are 0 and 5, respectively
   (c) The variance reduction exponent is set to 2
   (d) The initial and final values of the standard deviation are 0.5 and 0.01, respectively
4. SFLA parameters:
   (a) The memeplex size is 25
   (b) The number of memeplexes is 2
   (c) The number of parents is defined as the maximum of the rounded value of (0.3 × 25) or 2
   (d) The number of offspring is taken as 3
   (e) The maximum number of iterations is 5
5. Proposed ESALOGA parameters:
   (a) 50 chromosomes are taken in the population
   (b) The crossover probability is 0.8 to form the mating pool
   (c) The mutation probability is adaptively defined between 0 and 0.05 by our proposed mutation scheme
   (d) The mutation probability during the enhanced selection procedure is 0.02

In one run of the optimization, all optimizers deliver their solution within five hundred generations. We run the proposed algorithm on all the above-mentioned benchmark functions fifty times to obtain performance statistics, and we compare the results on all the selected benchmark functions. From Table 4, we can deduce the pre-eminence of ESALOGA over PBA, IWO, SFLA and GA for the above class of test functions. The comparison is made on five indices: Best, Worst and Mean, achieved in 50 runs of each optimizer, Std, the standard deviation of the solutions over the 50 runs, and Consistency, defined as how often (in percentage) the optimizer reaches an expected solution.

Table 4. Comparative results on ten variables for fifty runs

                                 VGA-4      PBA        IWO        SFLA       PGA
Styblinski–Tang function
  Best                           -389.2077  -261.6483  -377.5249  -377.5249  -391.6528
  Worst                          -374.5288  -176.9046  -320.9780  -374.5288  -376.9688
  Mean                           -383.0037  -218.7373  -352.0788  -354.9062  -385.3267
  Std                            4.2495     22.0725    16.9144    15.4860    3.4056
  Consistency (solution < -383)  55%        0%         0%         0%         90%
Michalewicz extension function
  Best                           -9.5033    -3.4877    -9.3631    -9.2164    -9.6575
  Worst                          -8.1878    -2.2156    -7.9995    -8.1878    -8.2459
  Mean                           -8.9632    -2.8983    -8.8179    -8.6147    -9.0075
  Std                            0.3990     0.3521     0.4090     0.4242     0.3343
  Consistency (solution < -9)    55%        0%         40%        30%        65%
Ackley function
  Best                           0.0016     2.3168     0.0020     0          1.335e-04
  Worst                          2.0225     19.7360    18.8521    1.6538     1.6538
  Mean                           0.1086     11.6177    12.7090    0.4520     0.0828
  Std                            0.4506     4.8105     8.5472     0.8391     0.3698
  Consistency (solution < 0.1)   95%        0%         30%        75%        99%
Rastrigin function
  Best                           3.0071     9.9496     0.9955     2.9849     1.0173
  Worst                          21.2696    34.8234    16.9149    21.2696    14.2134
  Mean                           10.8882    24.4759    8.7562     14.8746    6.3935
  Std                            5.0115     5.7601     3.7172     8.1695     3.5663
  Consistency (solution < 10)    55%        50%        75%        35%        90%
Schaffer function No. 2
  Best                           -3.9918    -1.0227    -1.1854    -3.4150    -3.7789
  Worst                          -2.1046    0.0065     -0.1801    -2.1046    -2.6381
  Mean                           -3.1594    -0.0941    -0.5848    -2.6446    -3.3691
  Std                            0.4458     0.2305     0.2734     0.5166     0.2948
  Consistency (solution < -3)    55%        0%         0%         25%        75%
Deceptive function
  Best                           -0.9255    -0.4140    -0.7724    -0.8464    -0.9255
  Worst                          -0.7483    -0.2729    -0.7040    -0.7483    -0.7187
  Mean                           -0.8196    -0.3185    -0.7259    -0.7853    -0.7955
  Std                            0.0394     0.0389     0.0247     0.0326     0.0399
  Consistency (solution < 0.8)   40%        0%         0%         10%        100%
Keane bump function
  Best                           -0.7257    -0.2368    -0.7492    -0.7038    -0.7405
  Worst                          -0.6290    -0.1238    -0.2740    -0.6014    -0.6014
  Mean                           -0.6818    -0.1750    -0.5778    -0.5532    -0.6856
  Std                            0.0292     0.0278     0.1532     0.1073     0.0357
  Consistency (solution < 0.6)   30%        0%         25%        15%        55%

Based on the observations in Table 4, we can extract the following comparative results:

1. ESALOGA is more consistent than the other optimizers. In comparison with the other optimizers, we observe from the presented results that ESALOGA performs well for multimodal functions, which are highly complex in their nature according to the literature [37,38].
2. For the Styblinski–Tang, Ackley, Rastrigin and Deceptive functions, we find that no optimizer other than VGA-4 gives acceptable results, as seen in Table 4. Here, ESALOGA gives the optimal solution with a high consistency and a low standard deviation.
Table 4 shows that the consistency of ESALOGA is 90%, 90%, 99% and 100% for the Styblinski–Tang, Rastrigin, Ackley and Deceptive functions, respectively.
3. For the Michalewicz function, the best optimization is given by ESALOGA, with a consistent mean around −9.0075, which is the best in comparison to all the other optimizers.
4. The Schaffer function No. 2 is another highly complex function, which ESALOGA solves with better results than the other above-mentioned optimizers.
5. On the highly complex constrained test function, the Keane bump function, ESALOGA gives outstanding results compared with the other optimizers. Note that only the GA variant comes close to the results of ESALOGA.
6. Overall, the statistical results of ESALOGA are far better than those of the other optimizers as well.

6 Conclusion

In this paper, we have given an improved search technique based on biological evolution. It is well suited to optimizing multi-variable objective functions with and without discontinuities. As a matter of fact, the proposed operators are flexible in finding the global minimum of a benchmark function. Hereby, our proposition gives an improved technique for solving optimization problems. Further, we have given simulation results of our proposal as a variant of the standard GA. As verification, we have listed the global solutions of various two-variable benchmark functions. From the simulated results, it is found that our method precisely locates the optimal points of multimodal benchmark functions. Hereby, various drawbacks of the binary-coded GA, including imprecision and inconsistency, are taken care of by the Metropolis scheme. This provides an enhanced selection and an adaptive log-scaled mutation scheme. Subsequently, the global optimal solution is obtained with an acceptable value of the selection pressure. In other words, our proposal is a meta-heuristic approach as far as global optimization problems are concerned. Indeed, it gives improved precision and consistency, as revealed by the simulated results.

7 Future Scope

The proposed GA has considerable scope for further improvement, as discussed in this section. The first stage of improvement concerns a parallel population approach, which may give a better solution. To introduce more diversity, random selection or tournament selection can be tested instead of roulette wheel selection before crossover, whereas we propose selection after crossover; this is taken as a complementary selection scheme for introducing more diversity after crossover. In a follow-up paper, we will test the above-specified selection strategies with the proposed GA and compare them with the different available GA variants. The second improvement is at the stage of crossover, where different crossover techniques such as single-point, multi-point, uniform and mid-point techniques can be tested to examine the superiority of the proposed GA over the other variants. The results on the limited set of functions show the superiority of the proposed GA over SGA and the other GA variants. Thus, inserting the enhanced selection scheme, treated as a complementary selection after crossover, and the log-scaled mutation scheme into the structure of other GA variants may give better results as well. Moreover, the performance of the proposed GA can be increased by utilizing a binary tree memory.
Thus, we observe that the proposed GA has a wide scope of improvements and it may further emerge as a dominant optimization algorithm for large scale complex problems from sociology, engineering, Topology, Graphs, Biology, etc. At this juncture, we anticipate that our proposal ?nds various applications in real world industrial problems such as power systems, its transmission expansion planning, data systems and wireless technology. References 1. Bill, N.M., David, M.R.: Total productive maintenance: a timely integration of production and maintenance. Prod. Inven. Manag. J. 33(4), 6–10 (1992) 2. Bevilacqua, M., Braglia, M.: The analytic hierarchy process applied to maintenance strategy selection. Reliab. Eng. Syst. Saf. 70(1), 71–83 (2000) 3. Doganay, K.: Applications of optimization methods in industrial maintenance scheduling and software testing. M¨alardalen University Press Licentiate Theses, School of Innovation, Design and Engineering, 180 (2014) 4. Shen, M., Peng, M., Yuan, H.: Rough set attribute reduction based on genetic algorithm. In: Advances in Information Technology and Industry Applications, The Series Lecture Notes in Electrical Engineering, vol. 136, pp. 127–132 (2012) 5. Sobh, T., Elleithy, K., Mahmood, A., Karim, M.: Innovative algorithms and tech-niques in automation, Industrial Electronics and Telecommunications (2007) 6. Hillier, M.S., Hillier, F.S.: Conventional optimization techniques, evolutionary opti-mization. Int. Ser. Oper. Res. Manag. Sci. 48, 3–25 (2002) 7. Miettinen, K., Neittaanmaki, P., Makela, M.M., Periaux. J.: Evolutionary algo-rithms in engineering and computer science: recent advances in genetic algorithms. In: Evolution Strategies, Evolutionary Programming, Genetic Programming and Industrial Applications, Wiley (1999) 8. Kar: Genetic algorithm application (2016). http://business-fundas.com/2011/ genetic-algorithm-applications/. Accessed 27 June 2016 Advances in Genetic Algorithm 747 9. Deb, K.: Optimization for Engineering Design: Algorithms and Examples. Prentice Hall of India Private limited, New Delhi (2005) 10. Tiwari, B.N.: Geometric perspective of entropy function: embedding, spectrum and convexity, LAP LAMBERT Academic Publishing, ISBN-13: 978-3845431789 (2011) 11. Gupta, N., Tiwari, B.N., Bellucci, S.: Intrinsic geometric analysis of the network reliability and voltage stability. Int. J. Electr. Power Energy Syst. 44(1), 872–879 (2010) 12. Bellucci, S., Tiwari, B.N., Gupta, N.: Geometrical methods for power network analysis. Springer Briefs in Electrical and Computer Engineering (2013). ISBN: 978-3-642-33343-9 13. Nelson, B.L.: Optimization via simulation over discrete decision variables. In: Tuto-rials in Operation Research, INFORMS, pp. 193 – 207 (2010) 14. Gupta, N., Shekhar, R., Kalra, P.K.: Computationally e?cient composite transmis-sion expansion planning: a Pareto optimal approach for techno-economic solution. Electr. Power Energy Syst. 63, 917–926 (2014) 15. Gupta, N., Shekhar, R., Kalra, P.K.: Congestion management based roulette wheel simulation for optimal capacity selection: probabilistic transmission expansion planning. Electr. Power Energy Syst. 43, 1259–1287 (2012) 16. Goldberg, D.E.: Genetic Algorithms in Search Optimization and Machine Learning. Addison-Wesley, Reading (1989b) 17. Chung, H.S.H., Zhong, W., Zhang, J.: A novel set-based particle swarm optimiza-tion method for discrete optimization problem. IEEE Trans. Evol. Comput. 14(2), 278–300 (2010) 18. 
Liang, Y.C., Smith, A.E.: An ant colony optimization algorithm for the redundancy allocation problem (RAP). IEEE Trans. Reliab. 53(3), 417–423 (2004) 19. Sharapov, R.R.: Genetic algorithms: basic ideas, variants and analysis, Source: Vision Systems: Segmentation and Pattern Recognition, ISBN 987-3-902613-05-9, Edited by: Goro Obinata and Ashish Dutta, pp.546, I-Tech, Vienna, Austria, June 2007. Open Access Database www.i-techonline.com 20. Holland, J.H.: Adaptation in natural and arti?cial systems, University of Michigan Press, Ann. Arbor, MI (1975) 21. Goldberg, D.E., Lingle, R.: Alleles, loci, and the TSP. In: Proceedings of the 1st International Conference on Genetic Algorithms, pp. 154 – 159 (1985) 22. Malhotra, R., Singh, N., Singh, Y.: Genetic algorithms: concepts, design for opti-mization of process controllers. Comput. Inf. Sci. 4(2), 39–54 (2011) 23. Spears W.M., De Jong, K.A.: On the virtues of parameterized uniform crossover. In: Proceedings of the 4th International Conference on Genetic Algorithms (1994) 24. Gupta, D., Gha?r, S.: An Overview of methods maintaining diversity in genetic algorithms. Int. J. Emerg. Technol. Adv. Eng. 2(5), 263–268 (2012) 25. Ming, L., Junhua, L.: Genetic algorithm with dual species. In: International Con-ference on Automation and Logistics Qingdao, pp. 2572 – 2575 (2008) 26. Cantu-Paz, E.: A survey of parallel genetic algorithms. Calc. Paralleles Reseaux Syst. Repartis 10(2), 141–171 (1998) 27. Aggarwal, S., Garg, R., Goswani, P.: A review paper on di?erent encoding schemes used in genetic algorithms. Int. J. Adv. Res. Comput. Sci. Softw. Eng. 4(1), 596– 600 (2014) 28. Baluja, S., Caruana, R.: Removing the genetic form the standard genetic algorithm. In: Proceedings of the 12th International Conference on Machine Learning, pp. 38 – 46 (1995) 748 N. Gupta et al. 29. Srinivas, M., Patnaik, M.: Adaptive probabilities of crossover and mutation in genetic algorithms. IEEE Trans. Syst. Man Cybern. 24(4), 656–667 (1994) 30. Goldberg, D.E., Sastry, K., Kendall, G.: Genetic algorithms. In: Burke, E.K., Kendall, G. (eds.), Search Methodologies: Introductory Tutorials in Optimization and Decision Support Techniques. Springer, Science + Business Media, NY (2014) 31. Cipra, B.A.: The Best of the 20th Century: Editors Name Top 10 Algorithms, SIAM News 33(4) (2016). https://www.siam.org/pdf/news/637.pdf. Accessed 27 June 2016 32. Man, K.F., Tang, K.S., Kwong, S.: Genetic algorithm: concepts and applications. IEEE Trans. Ind. Electron. 43(5), 519–534 (1996) 33. Jamil, M., Yang, X.: A Literature survey of benchmark functions for global opti-mization problems. Int. J. Math. Model. Numer. Optim. 4(2), 150–194 (2013) 34. https://www.sfu.ca/~ssurjano/scha?er2.html 35. https://www.sfu.ca/~ssurjano/michal.html 36. https://www.sfu.ca/~ssurjano/stybtang.html 37. Icl?anzan, D.: Global optimization of multimodal deceptive functions. In: Blum, C., Ochoa, G. (eds.) Evolutionary Computation in Combinatorial Optimisation. EvoCOP 2014. Lecture Notes in Computer Science, vol. 8600. Springer, Berlin, Heidelberg (2014) 38. Li, Y.: The deceptive degree of the objective function. In: Wright A.H., Vose M.D., De Jong K.A., Schmitt L.M. (eds.) Foundations of Genetic Algorithms. FOGA 2005. Lecture Notes in Computer Science, vol. 3469. Springer, Heidelberg (2005) 39. Mishra, S.K.: Minimization of Keane’s bump function by the repulsive particle swarm and the di?erential evolution methods, May 2007 (2007). SSRN:http:// ssrn.com/abstract=983836 40. 
Karaboga, D., Akay, B.: A comparative study of artificial bee colony algorithm. Appl. Math. Comput. 214(1), 108–132 (2009)
41. Bozorg-Haddad, O., Solgi, M., Loáiciga, H.A.: Invasive weed optimization. In: Meta-Heuristic and Evolutionary Algorithms for Engineering Optimization, pp. 163–173. Wiley (2017)
42. Eusuff, M., Lansey, K., Pasha, F.: Shuffled frog-leaping algorithm: a memetic meta-heuristic for discrete optimization. Eng. Optim. 38(2), 129–154 (2006). Taylor & Francis

Second-Generation Web Interface to Correcting ASR Output

Oldřich Krůza and Vladislav Kuboň

Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics, Charles University, Malostranské nám. 25, Prague, Czech Republic
{kruza,vk}@ufal.mff.cuni.cz

Abstract. This paper presents a next-generation web application that enables users to contribute corrections to an automatically acquired transcription of long speech recordings. We describe differences from similar settings, compare our solution with others and reflect on the development since the now six-year-old work we build upon, in the light of the progress made, the lessons learned and the new technologies available in the browser.

Keywords: Speech recognition · Transcription · Community-driven · Web standards

1 Introduction

In 2012 [7], we presented a setting where a community of users contributed corrections to automatically transcribed talks of a single speaker. Now that browser technologies have evolved drastically and we have been able to observe the usage patterns and discover shortcomings of the solution at hand, we have created the next generation of the programme. We shall describe the steps taken and discuss their motivation and impact. The application we describe is a part of a larger system that deals with Makoň's recordings. It consists roughly of (1) the corpus itself, (2) an ASR system trained specially for it and (3) a web interface for the users. These three parts form a whole where the ASR gives a baseline transcription, the users correct it and the corrections are fed as further training data to the acoustic and language models. In this paper, we focus on the web interface.

1.1 Motivation

Our project focuses on the collection of recordings of Karel Makoň [5] (*1912, †1993), the author of numerous books, translations and commentaries on works of a spiritual and religious nature, who was influenced by trances during recurring surgery without anaesthesia at the age of 6, ecstasies in his youth and finally facing and surviving certain death in a Nazi concentration camp, after which he experienced enlightenment. He gave talks in a narrow circle of friends, and the recordings in our care were taken between the early 1970s and 1991, spanning about 1000 h in total. All of Makoň's work deals more or less directly with a single topic: entering the eternal life before the physical death. He draws mainly from Christian symbolism and builds on Christian mysticism and the ancient traditions of India and China. Makoň's written works present his teachings in a systematic, comprehensive fashion, while the recordings offer bonuses: talks tailored to the audience, answers to questions, personal experiences, behind-the-scenes notes on the books etc. The archive is freely accessible1 under the CC-BY license.

2 Differences to Other Settings

The spoken corpus is about 1000 h of a single speaker.
Our aim is to have a transcription as good as possible for the purpose of searching and further, higher-level processing of the data. There is a pool of people interested in the talks, who on one hand are the force we can try to employ and on the other hand are the consumers of our e?ort, our target group so to speak. The web application should therefore combine the two purposes: 1. serve its user with making the content available in a manner as good as possible and 2. animate the user to give as much and as high-quality contribution as possible. To our best knowledge, there is no other project with a comparable setting. However, we can compare single aspects found in other applications. 2.1 Transcription Apps The best widespread match to our task is that of creating an application for transcribing speech recordings. Let us compare the two tasks, pointing out the main points of di?erence. For reference, we take (1) Transcriber2 , a classical open-source program written in TCL, (2) oTranscribe3 , a free modern web-based transcription tool and (3) Transcribe4 a commercial web-based transcription tool. The numbers in the bullet list below denote the programs our statement applies to. For example, of the three only Transcriber allows speaker annotation, hence there is only the number (1) standing at the second list item. 1 https://lindat.m?.cuni.cz/repository/xmlui/handle/11372/LRT-1455. 2 trans.sourceforge.net. 3 otranscribe.com. 4 transcribe.wreally.com. Second-Generation Web Interface to Correcting ASR Output 751 • transcription applications: • our application: • are optimized for the case where there is no transcription available and it must be acquired from scratch; (1,2,3) • always assumes a prior transcription is available; • allow annotation of speakers; (1) • assumes all utterances come from the same speaker; • need no quality control: the user is free to enter whatever transcription she pleases and the ultimate measure is her satisfaction; (1,2,3) • needs the transcription to be accurate because it is used as training data for the acoustic model; • use alignment on the level of phrases, if any; (1) 5 • uses alignment on the level of words; • are user-centric: the user transcribes whatever acoustic data they choose; (1,2,3) • is data-centric: the whole application with all its tools and persons revolves around the data set; • assumes the user wants to transcribe; (1,2,3) • assume the user wants to listen and possibly read along and we want to animate her to submit transcriptions; • has no shared data between users; (1,2) 6 • must count with collisions. Despite of these di?erences, we can still learn a lot from transcription soft-ware. The ease of performing common tasks, like pausing, resuming and rewind-ing is crucial for the user experience and in e?ect for the amount of submissions that we receive. Also, the way the text is displayed synchronously to the audio played has a big impact and the approaches have a lot of space for variation. 2.2 Wiki Where our application diverts from transcription software, it mostly resembles a wiki: a community platform that serves its users including the contributors but where the quality of the contributions is essential, while the contributor’s satisfaction alone is of less importance. One major point of di?erence to a wiki is that wiki is creative, whereas our task is mechanical. The user has basically no room for their own invention: providing a di?erent than correct transcription is seen as an error. 
5 Transcriber explicitly aligns the text with speech, while the other two merely support addition of timestamps into the transcription. 6 Transcribe supports team co-operation. 752 O. Kr°uza and V. Kubon? Popular wikis have good measures for edit con?icts, which is where we could learn some lessons. However, so far there was no need to do that because 1. if we always simply take the most recent version of a segment, the result stays consistent even if a piece from user A comes into a larger transcription of user B; 2. our user base is so far limited to a small community who have no problem coordinating with each other. We plan to expand to broader public soon though. With regard to the transription as presented to the user, a submitted seg-ment of transcription always overwrites the present version but we keep all the submissions in a database, so undo operations, clustering submissions by their author etc. are possible but we had little need for this so far. 2.3 Corpora Our project is not the ?rst involving community-driven care of a corpus. We can mention the Manually annotated sub-corpus [6], where annotations of various kinds are gathered from volunteers, or the Wikicorpus [10], a corpus of Wikipedia articles with some linguistic annotation. Our project may reach profound sim-ilarities with these in the future, when we no longer focus on the transcription itself but rather on annotation. There is also CzEng [3], the Czech-English Parallel Corpus, where a large part of the translation is provided by volunteers. The similarity in setting is considerable as both projects involve a machine-produced erroneous derivative of the original material (in our case audio transcriptions, in the case of CzEng Czech translations of English texts), and a community of volunteers correct these. But the speci?cs of the projects bring di?erent challenges and dictate di?erent approaches. Marge (2009) [8] investigates using The Mechanical Turk to obtain audio transcriptions. Mihalcea (2004) [9] o?ers a web interface for word-sense disam-biguation and focuses mostly on annotator con?ict resolution. 3 Description of the Web Application 3.1 Usage We have no special assumption of the user beyond basic computer usage skills and understanding the audio. We assume no prior training. There is a manual for clearing common points of confusion. The main message in it is that any-thing that is to be transcribed, should be transcribed with respect to phonetic precision, even if it results in nonsensical character strings. Anything except words spoken by the one speaker of interest is to be left untranscribed, including noise or speech by other persons7 . Incomprehensible 7 In our data, other speakers represent a negligible fraction but we may later add support for speaker annotation. Second-Generation Web Interface to Correcting ASR Output 753 words are to be left uncorrected (the ASR output kept) if the phones are unclear. If the phones uttered are clear but it is not clear what word was meant, the word may be transcribed phonetically. 3.2 Implementation The application consists of several views: 1. the start page where all recordings are listed and each points to a detail view, 2. the detail view, where a recording can be played back, its transcription is displayed and can be corrected by the user, 3. the search page, where hits to a search query are listed and point to corre-sponding positions in the recordings, 4. static pages with general information, contact etc. 
We shall only discuss the detail view as the others are not relevant to this paper. Figure 1 shows the interface during playback. Figure 2 shows the interface while a segment is being edited. The interface in the ?gures is conveniently shown in English, although in reality it is in Czech. Legend to Fig. 1: 1. Header with – app name linking to start page, – about link, – search ?eld and – username input ?eld; 2. Identi?er of the recording; 3. Automatically transcribed segments in grey; 4. Manually transcribed segments in black; 5. Currently played-back word highlighted by yellow background; 6. Marked word highlighted in regent st. blue; 7. Marked word info: – occurrence: the word with contextual capitalization and punctuation as it appeared in the text (currently being edited as the selected initial letter reveals), – form: normalized word form as it appears in the word list, – pronunciation: Czech phonetic transcription of the word, – position: time of the beginning of the word in seconds from the start of the recording; 8. Tools for storing: – direct links to the audio ?les, – selecting the whole transcription for easy pasting, – storing the decoded recording in the browser’s IndexedDB; 9. Graphical equalizer for compensating narrow-band noise; 754 O. Kr°uza and V. Kubon? Fig. 1. Web interface during playback 10. Audio playback controls: – play/pause button, – current playback position, – playback scrollbar, – total recording length; 11. Current position re?ected in URL fragment. Legend to Fig. 2: 1. Selecting a text range with the mouse de?nes the segment the user is about to transcribe; 2. The edit tool with – text area pre?lled with the current transcription, – playback button that plays the corresponding segment, – save button and – download-segment button, which initiates a ?le-save action for the audio segment corresponding the the selected text. The synthesis of the down-loaded ?le takes place in the browser. The commonest tasks have keyboard shortcuts: ctrl+space for play/pause and ctrl+enter for submitting a correction. Second-Generation Web Interface to Correcting ASR Output 755 Fig. 2. Interface in the state of editing a segment 3.3 Displaying the Transcription Many transcription programs show the transcription as a vertical list of utter-ances, see Fig. 3 for an example of Transcriber. We attribute this to the fact that the atomic elements of the transcription are the user-entered utterances and their boundaries are reliable. In our case, the atomic elements are words. There are sentences, sure, but the segmentation to sentences by the ASR is very unreliable, so we want it to be natural to transcribe a segment overlapping sentence boundaries. This is one of the reasons why we display the transcription basically as a single wrapped line. Performance Challenge. The transcription display was designed to have these features: 1. Currently played-back word should be highlighted; 2. Manually transcribed segments should be clearly distinct from automatically transcribed ones; 756 O. Kr°uza and V. Kubon? 3. Selecting one or more words with the mouse should trigger transcription mode for the selected text; upon a successful save, this should be merged into the display; 4. Clicking a word should bring up its context info (we call this the marked word as the term selected word is already taken); 5. The whole transcription should be shown at once for easy searching; 6. The page should be responsive. Fig. 3. 
A screenshot of transcriber These requirements are harder to combine than it may seem. Notably respon-siveness is hard to combine with all of the other ones. Why is that so? Points 1 through 4 call for every word to be wrapped in its own element. Point 5 and the median count of words in a transcript of about 6000 yield 6000 elements just to show the text. Although this may not seem like a big deal, it does a?ect the responsiveness and memory footprint of the page. Second-Generation Web Interface to Correcting ASR Output 757 In the original version, we solved this by sacri?cing point 5: only 3 lines of text are shown with the current word kept on the middle line as shown on Fig. 4.8 Thanks to the development in the web standards and their support from popular browsers, a solution is possible. Fig. 4. Original web interface from 2012 Solution. We can use the fortunate fact that manually transcribed words and automatically transcribed ones tend to form larger chunks. The average number of words per submitted segment is 7.9. Furthermore, the absolute majority of such segments are adjacent to other manually transcribed chunks. 9 Hence, wrap-ping each chunk of consecutive manually or automatically transcribed words in an HTML element is no problem, which solves point 2. Point 3 can be implemented using document.selection and the Range objects, which let us ?nd out the innermost HTML element and text o?set of the start and end of the textual selection. Since we know the length of each word, this allows us to map the selection to the corresponding words in the transcription. 8 The current word is on the top line on the screenshot because it is at the beginning of the recording. 9 The median number of chunks is 1 (most recordings have no manually corrected segments), maximum is 1109. Median only counting touched recordings is 8. 758 O. Kr°uza and V. Kubon? Points 1 and 4 can be implemented in two ways: We could either wrap the current and marked word in a dedicated element or we could draw a highlighting rectangle beneath the word. Wrapping the word would de?netely be more robust and less error-prone but the constant changes in the DOM during playback with possible frequent re?ows speak against it. Finding the exact position of each word and draw-ing a rectangle precisely beneath it (beneath on the z-axis; over it in the x-y sense), avoiding positioning issues and keeping the rectangle position synced even after scrolling/window resizing is de?nitely a challenge but we chose this way nonetheless. The performance gain for the majority of the usage time out-weighs the possible errors in the corner cases, more so since the eventual errors are not critical and mostly remedied by further playback. The e?ciency of repositioning a rectangle is supported by the fact that we can calculate the coordinates of all rendered words once and only recalculate them in two cases: (1) In the rare event of screen resize and (2) when a corrected segment is merged into the transcription, in which case we only need to recalculate for the words further in the document. 10 Manual/Automatic Distinction. As shown on Fig. 1, we draw automatic transcription in grey and manual one in black. Why did we choose this instead of normal/boldface? Firstly, the normal font is optimal for reading. Boldface is meant to highlight spots in text. It becomes bulky when applied on long continuous passages. The automatic transcription contains many errors, so there is no sense in optimizing it for best reading experience. 
There is also another practical reason. When the two font variants only di?er in color, and a segment of automatic transcription is left intact and submitted as correct transcription, its merge-down into the displayed text causes no re?ow, which saves us computations and raises responsiveness. It may seem like a rare use case but we believe that identifying correctly recognized words is a legitimate way of contribution, so why not optimize for it? Still, the underlying HTML tags are and because that way the distinction persists when copy-pasting the text from the web page to a rich text editor. 3.4 Ergonomy It is clear that the ease of use is crucial in our case where the user is supposed to perform a requiring, tedious task with repeated steps, especially since it is our interest more than hers that she performs them. We compared our setting with that of transcription apps in Sect. 2.1, pointing out lessons to learn. Let us now look at some speci?c points and their actual (lack of) implementation. 10 We could even stop the recalculation as soon as we ?nd that the new horizontal coor-dinate of a word is left untouched, and add the di?erence in the vertical coordinate to all subsequent words, i.e. when a line stays the same, so do all below it. Second-Generation Web Interface to Correcting ASR Output 759 Keyboard Shortcuts. One of the most profound measures in ergonomy are keyboard shortcuts. The most common task is pausing and resuming playback. Both oTranscribe and Transcribe use the esc key for that, and Transcriber uses the tab key. We chose ctrl+space combination. We argue that esc is not the best of options for desktops because the distance the ?ngers have to travel from the alphanumeric keys causes a noticeable delay. This can lead to missing a pause between words. The tab key as chosen by Transcriber is a splendid choice from the ergonomy point of view and there is no reason not to use it in a dedicated user interface. However, in the browser, where the tab key has as native use, re-binding it could lead to confusion and irritation. The space bar is probably the easiest-to-?nd key in all situations and dedicating ctrl to all application-speci?c commands as opposed to single keys lends a sense of consistency, we believe. This is mere personal experience though, as we had no resources so far to perform serious research to support these statements. The only other keyboard shortcut we support is ctrl+enter for submit-ting the correction. We chose this to stay consistent using the ctrl key and because this shortcut is familiar to users of many instant messengers, like the Facebook chat or the once popular o?cial ICQ client. Also, requiring a key combination prevents accidental submission, which is desirable as we only want double-checked, guaranteed-precise ones. In comparison, Transcriber uses the bare enter key to separate utterances. oTranscribe and Transcribe allow free formatting with no explicit alignment, so using the enter key to split utterances by lines is the user’s choice. Missing Features. One of the features that Transcribe, the only commercial tool in our reference list, o?ers is setting up keyboard shortcuts for common words. We have not implemented this because ideally, common words should be covered by speech recognition. However, it could be sensible to implement it anyway. The reason is that a word can be very rare globally and thus poorly recognized by ASR but very common in a speci?c passage. This particularly regards named entities. 
Another point in our ergonomy to-do list is lifting the need to select a segment prior to correcting it. If the transcription was simply editable, it could increase the ease of use rapidly. We would have to automate the selection of segment to send for forced alignment but we could probably do a better job than the user in the end. 3.5 Mechanics of Submitting a Corrected Segment As stated above, when the user selects at least one character with the mouse, the application enters the state of correcting the selected transcription padded to whole words. In this mode, the transcription to correct is shown in a text area and the global playback controls are replaced by those that only allow playback of audio corresponding to the selected transcription. Once the user believes that the content of the text area corresponds precisely to the words uttered, she hits the save button or the ctrl+enter keyboard 760 O. Kr°uza and V. Kubon? shortcut. This starts an asynchronous HTTP request to the back-end, where parametrized (MFCC) versions of the recordings are stored, along with the new transcription and the time positions of the beginning and end of the segment. The server then cuts o? the corresponding segment from the parametrized recording, runs forced alignment on it with the provided transcription with a threshold to reject bad matches. If the forced alignment fails, an error response is sent back and the transcription is not merged into the original. In the case of a success, the correction is merged on one hand on the server side and pushed to a CDN, on the other hand it is merged into the transcription word array in the JavaScript application. This redundancy warrants that we do not have to reload the whole transcription every time a segment is corrected. React ensures the updating of the chunks, and the coordinates of the words further in the document are recalculated for word-highlighting purposes. Apart from this, the version of the transcription to the recording is updated. This is because the transcription ?les have a long cache time because normally, they do not change at all. At the page load, the versions of all transcriptions are loaded and used as cache busters. This enables us to use an external CDN and cache e?ectively. 3.6 Implementation Details Audio Engine. The adoption of Web Audio API [2] allowed for big improve-ments in comparison with the original implementation. There are four major di?erences between using the HTML