CIS Research in Focus
Faculty Research Highlights
Dr. Frank Hsu is the Clavius Distinguished Professor of Science and co-founder of the Journal of Interconnection Networks and ICCS. He has over 200 publications and has edited 27 books. He is the director of the Laboratory for Informatics and Data Mining and has delivered over 450 invited lectures in 15 countries, as well as 60 keynote speeches, plenary talks, or special lectures. He has received substantial external funding.
Improving Data and Prediction Quality of High-Throughput Perovskite Synthesis with Model Fusion: Combinatorial fusion analysis (CFA) is an approach for combining multiple scoring systems using the rank-score characteristic function and cognitive diversity measure. One example is to combine diverse machine learning models to achieve better prediction quality. In this work, we apply CFA to the synthesis of metal halide perovskites containing organic ammonium cations via inverse temperature crystallization. Using a data set generated by high-throughput experimentation, four individual models (support vector machines, random forests, weighted logistic classifier, and gradient boosted trees) were developed. We characterize each of these scoring systems and explore 66 possible combinations of the models. When measured by the precision of predicting crystal formation, the majority of the combination models improve on the individual model results. The best combination models outperform the best individual models by 3.9 percentage points in precision. In addition to improving prediction quality, we demonstrate how the fusion models can be used to identify mislabeled input data and address issues of data quality. In particular, we identify example cases where none of the single models or fusion models gives the correct prediction. Experimental replication of these syntheses reveals that these compositions are sensitive to modest temperature variations across the different locations of the heating element that can hinder or enhance the crystallization process. In summary, we demonstrate that model fusion using CFA can not only identify a previously unconsidered influence on reaction outcome but also be used as a form of quality control for high-throughput experimentation.
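The two CFA ingredients named in the abstract above can be illustrated with a small sketch. It assumes the common formulation in which the rank-score characteristic function maps each rank to the normalized score of the item holding that rank, and cognitive diversity is the root-mean-square distance between two such functions. The function names and the min-max normalization are illustrative choices, not taken from the paper.

```python
from math import sqrt

def normalize(scores):
    """Min-max normalize scores to [0, 1] (an assumed normalization)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]

def rank_score_function(scores):
    """Return f where f[i - 1] is the normalized score of the item ranked i.

    Rank 1 is the highest-scoring item, so f is the normalized scores
    sorted in descending order.
    """
    return sorted(normalize(scores), reverse=True)

def cognitive_diversity(scores_a, scores_b):
    """RMS distance between the rank-score functions of two scoring systems.

    Both systems are assumed to score the same number of items.
    """
    fa = rank_score_function(scores_a)
    fb = rank_score_function(scores_b)
    return sqrt(sum((x - y) ** 2 for x, y in zip(fa, fb)) / len(fa))

def score_combination(*systems):
    """Fuse systems by averaging their normalized scores item by item."""
    normed = [normalize(s) for s in systems]
    return [sum(col) / len(col) for col in zip(*normed)]

# Hypothetical scores from two models over the same five candidates.
model_a = [0.9, 0.1, 0.5, 0.7, 0.3]
model_b = [0.6, 0.2, 0.9, 0.55, 0.5]
print(cognitive_diversity(model_a, model_b))
print(score_combination(model_a, model_b))
```

Because the rank-score function depends only on how scores fall off with rank, two systems that rank different items first can still have zero diversity under this measure; it captures differences in scoring behavior rather than disagreement on individual items.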
Dr. Damian Lyons, the director of the Robotics and Computer Vision lab, conducts work in formal methods in robot programming and in visual navigation and has been awarded grants from a wide array of government and private industry funders.
Wall Detection Via IMU Data Classification In Autonomous Quadcopters: An autonomous drone flying near obstacles needs to be able to detect and avoid the obstacles or it will collide with them. In prior work, drones detected and avoided walls using data from camera, ultrasonic, or laser sensors mounted either on the drone or in the environment. It is not always possible to instrument the environment, and sensors added to the drone consume payload and power, both of which are constrained for drones. This paper studies how data mining classification techniques can be used to predict where an obstacle is in relation to the drone based only on monitoring air disturbance. We modeled the airflow of the rotors physically to deduce higher-level features for classification. Data was collected from the drone's IMU while it was flying with a wall directly to its left, front, and right, as well as with no walls present. In total, 18 higher-level features were produced from the raw data. We used an 80%/20% train-test split with the RandomForest (RF), K-Nearest Neighbor (KNN), and GradientBoosting (GB) classifiers. Our results show that the RF classifier can predict the direction of a wall relative to the drone with 90% accuracy.
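The evaluation scheme described above (an 80%/20% train-test split followed by classification into wall directions) can be sketched in a few lines. This is only an illustration: the data below is synthetic and two-dimensional rather than the 18 IMU-derived features, the cluster centers are made up, and a hand-written k-nearest-neighbor classifier stands in for the RF, KNN, and GB implementations used in the paper.

```python
import random
from collections import Counter

def make_toy_data(n_per_class=50, seed=0):
    """Generate synthetic 2-D feature vectors for four wall classes.

    The class centers are hypothetical and chosen only so the classes
    are separable; they are not derived from real IMU data.
    """
    rng = random.Random(seed)
    centers = {"left": (0.0, 0.0), "front": (5.0, 0.0),
               "right": (0.0, 5.0), "none": (5.0, 5.0)}
    data = []
    for label, (cx, cy) in centers.items():
        for _ in range(n_per_class):
            data.append(((rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)), label))
    rng.shuffle(data)
    return data

def train_test_split(data, train_fraction=0.8):
    """Split already-shuffled data into train and test portions."""
    cut = int(len(data) * train_fraction)
    return data[:cut], data[cut:]

def knn_predict(train, x, k=3):
    """Predict the majority label among the k nearest training points."""
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], x)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def accuracy(train, test, k=3):
    """Fraction of test points whose predicted label matches the true one."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)

data = make_toy_data()
train, test = train_test_split(data)
print(f"train={len(train)} test={len(test)} "
      f"accuracy={accuracy(train, test):.2f}")
```

The accuracy printed here says nothing about the drone results; it reflects only how well separated the synthetic clusters are.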
Dr. Thaier Hayajneh is the founder and director of the Fordham Center for Cybersecurity, an NSA and Homeland Security designated Center of Excellence. He conducts research in cybersecurity and networking and has been awarded over $1 million in cybersecurity-related grants.
Run-time Monitoring and Validation using Reverse Function (RMVRF) for Hardware Trojans Detection: There has recently been significant growth in resource-constrained devices (RCDs) that exchange sensitive and private data. Lightweight ciphers are designed to implement confidentiality in RCDs. Hardware implementations of lightweight ciphers should minimize resources, including area, power, and energy. Hardware Trojans (HTs) are malicious circuits that are inserted into designs, including lightweight ciphers, to modify cipher behavior and leak sensitive data. Although runtime monitoring is a very effective method to implement design-for-trust and detect hardware Trojans, it requires significant resources and degrades performance. The main motivation of this research is to design RCD-friendly runtime monitoring and create a trusted design with minimal resource overhead. This paper develops a low-power, low-energy, and trusted design based on a smart runtime monitoring algorithm targeted at lightweight ciphers in RCDs and implemented on an FPGA platform. The algorithm is adaptive, minimizes resources, and maximizes confidence and HT detection. The novelties of the algorithm are its adaptive activation of checking and its validation. Checking is triggered when a predefined critical node is active, which limits power and energy overhead. Our proposed algorithm has adaptive positive aging because it forgoes validation when a critical node has already been proven safe. To optimize trust, the validation uses a reverse-function design. The implementation results show that the proposed algorithm reduces area, power, and energy when compared with existing algorithms. The proposed algorithm achieves a 37%-55% reduction in energy and power and an average improvement of 53% in the LE×energy metric. Furthermore, our proposed algorithm reduces the area by 25% when both encryption and decryption are implemented.
Dr. Gary Weiss is the director of the WISDM Lab. His National Science Foundation- and Google-supported work on mobile phone data mining has shown that a person’s activities, and even characteristics such as height and sex, can be identified from phone accelerometer data.
Biometric authentication and verification for medical cyber physical systems: A Wireless Body Area Network (WBAN) is a network of wirelessly connected sensing and actuating devices. WBANs used for recording biometric information and administering medication are classified as part of a Cyber Physical System (CPS). Preserving user security and privacy is a fundamental concern of WBANs, which introduces the notion of using biometric readings as a mechanism for authentication. Extensive research has been conducted regarding the various methodologies (e.g., ECG, EEG, gait, head/arm motion, skin impedance). This paper seeks to analyze and evaluate the most prominent biometric authentication techniques based on accuracy, cost, and feasibility of implementation. We suggest several authentication schemes which incorporate multiple biometric properties.
Dr. David Wei has worked on some of the key fundamental areas of parallel and distributed processing, mobile computing and optical networks. He is the director of the Advanced Networking Laboratory.
An improved QKD protocol without public announcement basis using periodically derived basis: The quantum key distribution (QKD) protocol provides an absolutely secure way to distribute secret keys, where security can be guaranteed by quantum mechanics. To raise the key generation rate of the classical BB84 QKD protocol, Hwang et al. (Phys Lett A 244(6):489–494, 1998) proposed a subtle variation (the Hwang protocol), in which a pre-shared secret string is used to generate a consistent basis. Although the security of the Hwang protocol has been verified under ideal conditions, its practicality is still being studied in more depth. In this work, we propose a simple attack strategy to obtain all preparation bases by stealing partial information in each round. To eliminate this security threat, we further propose an improved QKD protocol using the idea of iteratively updating the basis. Furthermore, we apply our improved method to the decoy-state QKD protocol and double its key generation rate.
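The key-rate motivation behind the pre-shared basis idea can be seen in an idealized simulation. The sketch below is a purely classical toy model with no channel noise and no eavesdropper, and it implements neither the attack nor the improved protocol from the paper. It assumes only the textbook behavior of BB84: when Bob measures in a different basis from the one Alice prepared in, his result is random and that position is discarded during sifting. The function name and parameters are hypothetical.

```python
import random

def run_round(n, rng, shared_bases=None):
    """Simulate n transmitted bits and return (alice_key, bob_key) after sifting.

    If shared_bases is None, both parties choose bases independently at
    random (BB84-style). Otherwise both use the same pre-agreed bases,
    modelling a basis sequence derived from a pre-shared secret string.
    """
    alice_bits = [rng.randint(0, 1) for _ in range(n)]
    if shared_bases is None:
        alice_bases = [rng.randint(0, 1) for _ in range(n)]
        bob_bases = [rng.randint(0, 1) for _ in range(n)]
    else:
        alice_bases = bob_bases = shared_bases
    # Matching bases reproduce Alice's bit; mismatched bases give a random bit.
    bob_bits = [
        bit if a == b else rng.randint(0, 1)
        for bit, a, b in zip(alice_bits, alice_bases, bob_bases)
    ]
    # Sifting: keep only positions where the bases agree.
    keep = [k for k in range(n) if alice_bases[k] == bob_bases[k]]
    return [alice_bits[k] for k in keep], [bob_bits[k] for k in keep]

rng = random.Random(1)
n = 1000
bb84_key, _ = run_round(n, rng)
shared = [rng.randint(0, 1) for _ in range(n)]
shared_key, _ = run_round(n, rng, shared_bases=shared)
print(f"random bases kept {len(bb84_key)} of {n} bits")
print(f"shared bases kept {len(shared_key)} of {n} bits")
```

With independent random bases roughly half of the positions survive sifting, while a shared basis sequence keeps all of them, which is why protecting the secrecy of that sequence matters.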
Dr. Xiaolan Zhang’s research area is computer networks and distributed systems, in particular performance evaluation via modeling, simulation, and experiments; mobile ad hoc networks; disruption-tolerant networks; applications of network coding; and multimedia streaming systems. She has served on the NSF NeTS program review panel as well as for many conferences.
Relational Deep Reinforcement Learning for Routing in Wireless Networks: While routing in wireless networks has been studied extensively, existing protocols are typically designed for a specific set of network conditions and so do not easily accommodate changes in those conditions. For instance, protocols that assume network connectivity cannot be easily applied to disconnected networks. In this paper, we develop a distributed routing strategy based on deep reinforcement learning that generalizes to diverse traffic patterns, congestion levels, network connectivity, and link dynamics. We make the following key innovations in our design: (i) the use of relational features as inputs to the deep neural network approximating the decision space, which enables our algorithm to generalize to diverse network conditions, (ii) the use of packet-centric decisions to transform the routing problem into an episodic task by viewing packets, rather than wireless devices, as reinforcement learning agents, which provides a natural way to propagate and model rewards accurately during learning, and (iii) the use of extended-time actions to model the time spent by a packet waiting in a queue, which reduces the amount of training data needed and allows the learning algorithm to converge more quickly. We evaluate our routing algorithm using a packet-level simulator and show that the policy our algorithm learns during training is able to generalize to larger and more congested networks, different topologies, and diverse link dynamics. Our algorithm outperforms shortest path and backpressure routing with respect to packets delivered and delay per packet.
Dr. Yanjun Li’s primary research area is data mining and knowledge discovery. Her recent work has focused on text clustering, text classification, and ontology building. She is a referee for many renowned conferences and journals.
Food Recipe Alternation and Generation with Natural Language Processing Techniques: We prefer to have more options when we cook. Choosing alternative ingredients or recipes, or creating new recipes, can be quite challenging. In this research project, we investigated how to apply state-of-the-art natural language processing techniques, such as word embeddings, to help people choose alternative ingredients and recipes, and how to build language models (N-gram and neural network models) that generate new recipes with the authentic flavor of a given cuisine style.
Dr. Yijun Zhao’s research interests include machine learning, data mining, and statistical pattern recognition. In collaboration with industrial leaders, she has applied machine learning methods to detect brain abnormalities occurring in neurological disorders such as epilepsy and to predict disease course in multiple sclerosis patients.
A calibrated deep learning ensemble for abnormality detection in musculoskeletal radiographs: Musculoskeletal disorders affect the locomotor system and are the leading contributor to disability worldwide. Patients suffer chronic pain and limitations in mobility, dexterity, and functional ability. Musculoskeletal (bone) X-ray is an essential tool in diagnosing the abnormalities. In recent years, deep learning algorithms have increasingly been applied in musculoskeletal radiology and have produced remarkable results. In our study, we introduce a new calibrated ensemble of deep learners for the task of identifying abnormal musculoskeletal radiographs. Our model leverages the strengths of three baseline deep neural networks (ConvNet, ResNet, and DenseNet), which are typically employed either directly or as the backbone architecture in existing deep learning-based approaches in this domain. Experimental results based on the public MURA dataset demonstrate that our proposed model outperforms the three individual models and a traditional ensemble learner, achieving an overall performance of (AUC: 0.93, Accuracy: 0.87, Precision: 0.93, Recall: 0.81, Cohen’s kappa: 0.74). The model also outperforms expert radiologists in three of the seven upper extremity anatomical regions, with a leading performance of (AUC: 0.97, Accuracy: 0.93, Precision: 0.90, Recall: 0.97, Cohen’s kappa: 0.85) in the humerus region. We further apply the class activation map technique to highlight the areas essential to our model’s decision-making process. Given that the best radiologist performance is between 0.73 and 0.78 in Cohen’s kappa statistic, our study provides convincing results supporting the utility of a calibrated ensemble approach for assessing abnormalities in musculoskeletal X-rays.
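For readers unfamiliar with the metrics quoted above, the following sketch shows how accuracy, precision, recall, and Cohen's kappa follow from a binary confusion matrix, together with a plain probability-averaging ensemble. The averaging function is a generic stand-in of the kind the abstract calls a traditional ensemble; the paper's calibration step is not described here and is not reproduced. The counts and probabilities in the example are made up.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and Cohen's kappa.

    Assumes at least one predicted positive and one actual positive,
    and that chance agreement is below 1, so no denominator is zero.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Agreement expected by chance, from the marginal rates of
    # predicted and actual labels.
    p_pos = ((tp + fp) / total) * ((tp + fn) / total)
    p_neg = ((fn + tn) / total) * ((fp + tn) / total)
    expected = p_pos + p_neg
    kappa = (accuracy - expected) / (1 - expected)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "kappa": kappa}

def average_ensemble(prob_lists):
    """Average per-sample abnormality probabilities across models."""
    return [sum(col) / len(col) for col in zip(*prob_lists)]

# Hypothetical confusion-matrix counts, for illustration only.
print(binary_metrics(tp=40, fp=10, fn=10, tn=40))

# Hypothetical outputs of three models on four radiographs.
probs = [[0.9, 0.2, 0.6, 0.4],
         [0.8, 0.1, 0.7, 0.5],
         [0.7, 0.3, 0.5, 0.3]]
print(average_ensemble(probs))
```

Kappa discounts the agreement a classifier would reach by chance, which is why it is lower than raw accuracy and why it is the statistic used above to compare the model against radiologists.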
Dr. Daniel Leeds, who directs the Computational Neuroscience laboratory, studies the computational principles underlying perception and cognition, including cognitive decline from concussion and aging.
Predictive Power of Head Impact Intensity Measures for Recognition Memory Performance: Subconcussive head injuries are connected to both short-term cognitive changes and long-term neurodegeneration. Further study is required to understand what types of subconcussive impacts might prove detrimental to cognition. We studied cadets at the US Air Force Academy engaged in boxing and physical development, measuring head impact motions during exercise with accelerometers. These head impact measures were compared with post-exercise memory performance. Investigators explored multiple techniques for characterizing the magnitude of head impacts. Boxers received more head impacts and achieved lower performance in post-exercise memory than non-boxers. For several measures of impact motion, impact intensity appeared to set an upper bound on post-exercise memory performance – stronger impacts led to lower expected memory performance. This trend was most significant when impact intensity was measured through a novel technique, applying principal component analysis to boxer motion. Principal component analysis measures also captured more distinct impact information than seven traditional impact measures also tested.
Dr. Tadeusz Strzemecki’s research interests include designing new algorithms for the generation of prime implicants, minimization of Boolean functions and generalization of the quasi-implicant reduction method. He has two patents on a) New Frequency Divider of a Digital Signal with 80% Duty Cycle at the Output, and b) New Voltage Discriminator.
Widening ROBDDs with Prime Implicants: Despite the ubiquity of ROBDDs in program analysis, and extensive literature on ROBDD minimisation, there is a dearth of work on approximating ROBDDs. The need for approximation arises because many ROBDD operations result in an ROBDD whose size is quadratic in the size of the inputs. Furthermore, if ROBDDs are used in abstract interpretation, the running time of the analysis is related not only to the complexity of the individual ROBDD operations but also the number of operations applied. The number of operations is, in turn, constrained by the number of times a Boolean function can be weakened before stability is achieved. This paper proposes a widening that can be used to both constrain the size of an ROBDD and also ensure that the number of times that it is weakened is bounded by some given constant. The widening can be used to either systematically approximate from above (i.e. derive a weaker function) or below (i.e. infer a stronger function).
Dr. Alam Bhuiyan, whose research focuses on cybersecurity, cyber-physical systems, and the Internet of Things, won the IEEE TCSC Early Career Researcher award in 2016. He has received 5 best paper awards, and two of his 140+ publications are listed by ESI as highly cited papers (i.e., in the top 1% of cited papers in the last 10 years).
Zero shot augmentation learning in internet of biometric things for health signal processing: In recent years, the number of Internet of Things (IoT) devices has increased rapidly. The Internet of Biometric Things (IoBT) can process biometrics and health signals, and it will greatly extend the range of biometric applications. The analysis of health signals in the IoBT can use computer-aided diagnosis techniques. However, most of the existing computer-aided diagnosis methods are developed for common diseases and are not suitable for rare diseases. Zero shot learning is a potential method for the computer-aided diagnosis of rare diseases because it can identify objects of unknown categories. However, the existing zero shot learning methods are based on attribute learning and rely on an attribute dataset. There is no attribute dataset for health signal processing. Therefore, the existing zero shot learning methods are not suitable for health signal processing. Based on the above background, we propose a zero shot augmentation learning model (ZSAL) in the IoBT for health signal processing. First, an expert doctor identifies the contour of a lesion and selects a background image without a lesion. Second, the computer automatically generates virtual images using zero shot augmentation technology. Finally, the generated virtual dataset is used to train a convolutional classifier, and then we apply the classifier to the computer-aided diagnosis of actual medical images. The experiment shows the efficiency and effectiveness of our method.
Dr. Ying Mao’s research targets cloud and distributed computing from the following aspects: (1) improving the performance of clusters, (2) enhancing system robustness, (3) accelerating distributed learning systems, and (4) balancing machine learning tasks on the cloud with convergence awareness to minimize the overall makespan. He serves on the TPC of several conferences.
Network-Aware Locality Scheduling for Distributed Data Operators in Data Centers: Large data centers are currently the mainstream infrastructure for big data processing. The efficient execution of distributed data operators (e.g., join and aggregation), one of the most fundamental tasks in these environments, is still a challenge for current data systems, and one of the key performance issues is network communication time. State-of-the-art methods that address this problem focus either on application-layer data locality optimization to reduce network traffic or on network-layer data flow optimization to increase bandwidth utilization. However, the techniques in the two layers are totally independent of each other, and performance gains from a joint optimization perspective have not yet been explored. In this article, we propose a novel approach called NEAL (NEtwork-Aware Locality scheduling) to bridge this gap and consequently to further reduce communication time for distributed big data operators. We present the detailed design and implementation of NEAL, and our experimental results demonstrate that NEAL always performs better than current approaches for different workloads and network bandwidth configurations.
Dr. Ruhul Amin develops deep learning and data-driven approaches to solve interesting problems in bioinformatics, computational social science, and information security. He previously received $450,000 in grants to develop a small-scale search engine in Bengali. His current collaborators include the University of British Columbia, Pennsylvania State University, and Stony Brook University.
Integration of Kalman filter in the epidemiological model: A robust approach to predict COVID-19 outbreak in Bangladesh: As one of the most densely populated countries in the world, Bangladesh has been trying to contain the impact of the coronavirus disease 2019 (COVID-19) pandemic since March 2020. Although the government announced an array of restrictive measures to slow the spread at the beginning of the pandemic, the lockdown has been lifted gradually by reopening all industries, markets, and offices, with the notable exception of educational institutes. As the physical geography of Bangladesh is highly variable across the largest delta, the populations of different regions and their lifestyles also differ across the country. Thus, to get a realistic picture of the current pandemic and a possible second wave of COVID-19 transmission across Bangladesh, it is essential to analyze the transmission dynamics of the individual districts. In this paper, we propose to integrate the Unscented Kalman Filter (UKF) with the classic SIRD model to explain the epidemic evolution of individual districts in the country. We show that the UKF-SIRD model results in a robust prediction of the transmission dynamics for 1–4 months. We then apply the robust UKF-SIRD model to different regions in Bangladesh to estimate the course of the epidemic. Our analysis demonstrates that, in addition to the densely populated areas, industrial areas and popular tourist spots will be at risk of higher COVID-19 transmission if a second wave of COVID-19 occurs in the country. In light of these outcomes, we also provide a set of suggestions to contain the future pandemic in Bangladesh.
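The classic SIRD model mentioned above tracks susceptible, infected, recovered, and deceased compartments. A minimal forward-Euler sketch of the standard equations is given below; it covers only the compartmental model, not the Unscented Kalman Filter that the paper couples with it, and the rates and initial conditions are hypothetical values chosen for illustration rather than estimates for any district.

```python
def sird_step(state, beta, gamma, mu, dt=1.0):
    """Advance the SIRD state (S, I, R, D) by one Euler step of length dt.

    beta: transmission rate, gamma: recovery rate, mu: death rate.
    """
    s, i, r, d = state
    n = s + i + r + d
    new_infections = beta * s * i / n * dt
    recoveries = gamma * i * dt
    deaths = mu * i * dt
    return (
        s - new_infections,
        i + new_infections - recoveries - deaths,
        r + recoveries,
        d + deaths,
    )

def simulate(state, beta, gamma, mu, days, dt=1.0):
    """Return the list of states from day 0 through the final step."""
    trajectory = [state]
    for _ in range(int(days / dt)):
        state = sird_step(state, beta, gamma, mu, dt)
        trajectory.append(state)
    return trajectory

# Hypothetical parameters and initial conditions, for illustration only.
trajectory = simulate((990.0, 10.0, 0.0, 0.0),
                      beta=0.3, gamma=0.1, mu=0.01, days=60)
peak_infected = max(i for _, i, _, _ in trajectory)
print(f"peak infected: {peak_infected:.1f}")
print(f"final state: {tuple(round(x, 1) for x in trajectory[-1])}")
```

Every flow leaves one compartment and enters another, so the total population stays constant at each step. A filter such as the UKF would treat a state like this as the quantity to be re-estimated whenever new case counts arrive.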