Research Highlights of Weisong Shi
2011
1. SAIL: Smartphone/Sensor-Assisted Independent Living (with Kewei, Shinan, Arnetz, Guoxing, Lingmei, Quan, 2008- )
Adults age 65 years and older
will account for more than 20% of the
U.S. population in 2050. In 1991, DHHS created the Healthy People 2000
project - the first national effort targeted at reducing disability and
promoting physical health in older adults. Despite the initiative, the
number of elders with one or more physical disabilities is increasing.
The mission of the SAIL project is to develop the enabling technologies
to assist the independent living of aging citizens, including hardware
design (sensors), systems design, intelligent data analysis, and
security and privacy. The SAIL project consists of a multidisciplinary
team from Computer Science, Gerontology, Nursing, and Psychology and
Family Medicine.
Along this direction, we have developed SPA, a smartphone-assisted chronic
illness self-management system with participatory sensing. The medical
system has not been able to effectively adapt to the dramatic
transformation in public health challenges, from acute to chronic and
lifestyle-related illnesses. Although acute illnesses can be treated
successfully in an office or hospital, chronic illnesses comprise the
bulk of health care needs and require a very different approach.
Patient involvement is critical for sustainable and successful chronic
disease management. Regular feedback of relevant health data to the
individual patient facilitates patient involvement. Yet, there is a
lack of effective and easily deployed tools for
self-monitoring and self-care, and people often do these tasks poorly,
especially people at socioeconomic risk for chronic illness, such as
urban minorities. Based on the most current cognitive and behavioral
change research, we propose that the prevention or treatment of chronic
illnesses will be greatly aided by an innovative system that can
monitor a person’s body, behavior, and environment during his or
her daily life, and then alert the person to take corrective action
when health risks are identified. To this end, we propose a
smartphone-assisted chronic illness self-management system, named SPA.
The system continuously monitors the health condition of its user and
gives valuable in-situ, context-aware suggestions and feedback to
improve public health.
Leveraging body area sensor networks (BASNs) for health care is a very promising application domain for wireless sensor networks. In a typical BASN health care application, bio-sensors and environmental sensors first connect to a local Preprocessing Unit (PU), e.g., a smartphone or a laptop, which extracts the meaningful data and performs the necessary processing before transmitting the data to a Central Server (CS). In this procedure, we observed that system designers have to repeat much of the same work across different BASN systems. Even worse, changing one component of the system usually requires designers to rewrite a large portion of the code. To this end, we propose a Smart Phone Assisted RealTime heAlth care Network framework (SPARTAN), which simplifies the development procedure and extends the flexibility of BASN systems.
This project is funded in part by the Swedish Council for Working Life and Social Research and the Department of Homeland Security.
2. Power Profiling and Analysis (with Shinan, Hui, 2009-)
Energy profiling plays a crucial role in applying efficient energy optimizations at the application level, and therefore in improving energy efficiency. Knowing the quantity of energy consumed by a particular application is very important for adapting the application's behavior to meet specific goals, such as extending the battery lifetime of a mobile device or charging for services in a cloud computing system. However,
current solutions to the energy profiling problem exhibit some
drawbacks that make them difficult to use widely. First, they are
mainly hardware-based, i.e., they require additional multimeters
reading the current along the wire between the system and its power
supply. This approach is expensive, inflexible, and difficult to
deploy. Second, the energy profiling information is provided offline,
after collecting, analyzing, and correlating traces of system activities,
which makes it difficult for applications to adapt their behavior
at runtime. Finally, energy profiling is not offered as a service in
the system. Without such a service, system developers face
difficulty when implementing energy-aware adaptation protocols to
optimize energy use.
This work is ongoing. We have developed two tools: pTop, a process-level power profiling tool, and SPAN, a software power analyzer for multicore systems.
3. Data Quality Management in Networked Sensing Systems (with Kewei Sha and Shinan Wang, 2005-)
As new fabrication
and integration technologies reduce the cost and size of micro-sensors
and wireless sensors, we will witness another revolution that
facilitates the observation and control of our physical world just as
networking technologies have changed the way individuals and
organizations exchange information. Several applications, such as
habitat monitoring, environment and structure monitoring, target (e.g.,
firefighter) tracking, and most recently participatory urban sensing, have
been launched, showing the promising future of a wide range of
applications of wireless sensor networks.
Their success nonetheless depends on whether the sensor networks can provide a high-quality stream of data over a long period. Most previous efforts focus on techniques that save sensor node energy and thus extend the lifetime of the whole sensor network. However, as more real sensor systems are deployed whose main function is to collect interesting data and share it with peers, data quality has become a very important issue in the design of sensor systems. In this project, we take a novel approach that detects deceptive data by considering the consistency requirements of the data, and we study the relationship between the quality of data and the multi-hop communication and energy-efficient design of networked sensor systems. In principle, the quality of data should reflect the timeliness and accuracy of the collected data presented to the interested recipients, who make the final decisions based on these data. Complementing the work on sensor design that improves the accuracy of sensing, we study the relationship between data quality and the energy-efficient design of WSNs.
To integrate and manage data quality in WSNs, we propose a consistency-driven data quality management framework called Orchis, which integrates the quality of data into an energy-efficient sensor system design. Orchis consists of four components that together support the goals of high data quality and energy efficiency: data consistency models, adaptive data sampling and process protocols, consistency-driven cross-layer protocols, and flexible APIs for managing data quality. We first formally define a consistency model, which not only includes temporal consistency and numerical consistency, but also considers the application-specific requirements of data and the data dynamics in the sensing field.
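As a rough illustration of the two basic ingredients of such a consistency model, the sketch below checks one reading against a temporal bound and a numerical bound. It is a minimal sketch, not the formal Orchis model: the `Reading` fields, the function names, and the threshold parameters `max_delay` and `max_error` are all assumptions made for this example, and the ground-truth value is assumed to be known only in a simulation setting.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    sensed_at: float    # time the value was sampled at the sensor (seconds)
    received_at: float  # time the value reached the recipient (seconds)
    value: float        # value delivered to the recipient
    true_value: float   # ground truth (assumed available only in simulation)


def is_temporally_consistent(r: Reading, max_delay: float) -> bool:
    """The reading arrived no later than max_delay seconds after sampling."""
    return (r.received_at - r.sensed_at) <= max_delay


def is_numerically_consistent(r: Reading, max_error: float) -> bool:
    """The delivered value is within max_error of the ground truth."""
    return abs(r.value - r.true_value) <= max_error


def is_consistent(r: Reading, max_delay: float, max_error: float) -> bool:
    """A reading is consistent only if both bounds hold."""
    return (is_temporally_consistent(r, max_delay)
            and is_numerically_consistent(r, max_error))
```

An application would pick its own bounds here, which is where the application-specific requirements mentioned above enter: a fire alarm might tolerate a large numerical error but only a short delay, while a long-term habitat study might accept the opposite.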
Next, we propose an adaptive, lazy, energy-efficient data collection protocol, which adapts the data sampling rate to the data dynamics in the sensing field and stays lazy as long as data consistency is maintained. Finally, we conduct a comprehensive evaluation of the proposed protocol based on both a TOSSIM-based simulation and a real prototype implementation using MICA2 motes. The results from both simulation and prototype show that our protocol reduces the number of delivered messages, improves the quality of collected data, and in turn extends the lifetime of the whole network. Our analysis also implies that the tradeoff between data consistency requirements and energy saving should be set carefully, based on the specific requirements of each application.
We have extended this work to the vehicular ad-hoc network (VANET) domain. In a VANET, conventional approaches to maintaining data quality are inappropriate because two requirements must be satisfied simultaneously. First, the mechanism must adapt to a frequently changing network topology. Second, sensor networks on vehicles emphasize the real-time requirement, which makes outlier detection, trust-based detection, and similar off-line mechanisms insufficient. Although several methods that emphasize real-time operation have been proposed, most of them assume a particular data model (e.g., Gaussian or linear). Others do provide real-time abnormal data detection, but cannot be applied in VANETs because they rely mainly on a network hierarchy. We first classify deceptive data into two categories: redundant data and false data. Redundant data is data that shares exactly the same or very similar information with data reported previously or by other sensors. False data, the other type of deceptive data, may result from a malfunctioning sensor board, unreliable wireless communication, or compromised sensors.
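The definition of redundant data above can be made concrete with a small sketch. It covers only the redundant category, since how false data is recognized is not specified here. The function names and the `tolerance` parameter that stands in for "very similar" are assumptions for illustration, not part of the published mechanism.

```python
from typing import List, Sequence, Tuple


def is_redundant(value: float, reported: Sequence[float],
                 tolerance: float) -> bool:
    """True if value equals, or lies within tolerance of, an earlier report."""
    return any(abs(value - earlier) <= tolerance for earlier in reported)


def split_redundant(values: Sequence[float],
                    tolerance: float) -> Tuple[List[float], List[float]]:
    """Partition a stream of reports into (fresh, redundant) lists.

    A report is compared only against fresh reports seen so far, so a run of
    near-identical values yields one fresh report and the rest as redundant.
    """
    fresh: List[float] = []
    redundant: List[float] = []
    for v in values:
        if is_redundant(v, fresh, tolerance):
            redundant.append(v)
        else:
            fresh.append(v)
    return fresh, redundant
```

The sketch labels redundant reports rather than discarding them, so that a later stage can still use them as supporting evidence.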
In the particular setting of VANETs, we are more interested in false data than in redundant data. First, energy is less of a concern for sensors in vehicles, and can even be ignored, because the battery in each vehicle supplies sufficient power. Second, redundant data can help detect false data, since it provides additional information about the whole sensor network. Based on this definition, we propose a role-differentiated cooperative deceptive data detection mechanism, RD4, to detect and filter false data in VANETs. In RD4, when a sensor is deployed, it picks a role from the role set based on several sensing features. We will evaluate the efficiency and efficacy of our mechanism in three specific vehicular scenarios.
This project is supported by the NSF grant "Consistency Model Driven Deceptive Data Detection and Filtering in Wireless Sensor Networks." This work is one of the earliest efforts on data quality and credibility management in wireless sensor networks. The results of this work have been published in IEEE Transactions on Vehicular Technology, the Journal of Parallel and Distributed Computing, the European Conference on Wireless Sensor Networks, and other venues. I also organized a special issue on "Data Quality Management in Wireless Sensor Networks" for the International Journal of Sensor Networks in 2009.
4. HOURS: Reputation Mechanisms for Resource Sharing (with Zhengqiang Liang, 2004-)
Community
computing, the federated sharing of dispersed pools of geographically
distributed computing resources under coordinated control, has been
considered a promising platform for solving large-scale problems in
science and engineering. However, resource management in these
environments is a complex undertaking. These systems need effective
mechanisms for fair sharing of community resources, adaptability to
dynamically changing conditions, prevention of denial-of-service (DoS)
attacks, and coordination of the diverse policies, cost models, and
varying loads of different peers. As one motivating example, a classical
“tragedy of the commons” in peer-to-peer file sharing is that 50 to 70%
of peers are free riders, which results in a great load imbalance in the
systems. Resource trading can enforce a cooperative approach to resource
sharing and is a promising way to address the above problems. The
autonomous, heterogeneous, and decentralized nature of participating
peers across multiple administrative domains introduces two challenging
issues related to resource trading: (1) a decentralized trading scheme,
which means that decisions on resource exchange and negotiation are made
by each peer based on its personalized view of the partner and its own
policy; and (2) self-policing personalized trustworthiness management,
which means that different peers may have different opinions on the
trustworthiness of the same peer, instead of a unique global
trustworthiness value as on eBay.
In this work, we propose an approach that consists of two models: M-CUBE, a Multiple CUrrency Based Economic model, as the decentralized trading scheme, and aPET, an adaptive PErsonalized Trust model, which provides the trustworthiness of a peer to support M-CUBE. The M-CUBE model provides a general and flexible substrate to support most of the high-level resource management services required by P2P computing, such as resource co-allocation, quality of service (QoS) control, advance reservation, and scheduling algorithms. aPET derives trustworthiness from a reputation evaluation and a risk evaluation. The trustworthiness value provided by aPET is treated by M-CUBE as its view of the peer. The unique feature of our approach is the seamless integration of the trustworthiness and dependability of peers into resource trading. We have successfully applied the trust models to two applications: ISP peering and Grid resource management.
In ISP peering, the fragility and poor resilience of the Internet were manifested by the severe impact on network activities, and the slow recovery, after an earthquake damaged undersea cables and disrupted telephone and Internet access in East Asia in December 2006. Besides the inefficiency of routing protocols, another important reason is the lack of efficient network monitoring mechanisms and of economic incentives that encourage service providers to act cooperatively and promptly. In this paper, we build a trust-based economic framework called TRECON to address these open problems in Internet routing. The novelty of TRECON is that it combines an adaptive personalized trust model (aPET) with an economic approach to provide independent trust-based routing among service providers. TRECON provides flexible policy support based on the trust-based economic mechanism, so that autonomous organizations with varied interests and optimization criteria can be smoothly integrated to achieve better adaptiveness and self-management.
By introducing the economic model, TRECON explores a new way to solve the economic problems and incentive issues in the collaboration among service providers. To show the flexibility of its routing policy support, we propose four typical routing policies under the TRECON framework. We evaluate our approach by comparing these four trust-derived routing policies with the classical global shortest path routing (SPA) approach. We find that the policy based on trustworthiness (TRU) performs much better than all other policies under different network topologies in terms of delay, successful delivery rate, and economic effects.
A major obstacle to the wider adoption of the Grid is the difficulty of using, configuring, and maintaining it, which requires excessive IT knowledge, workload, and human intervention. At the same time, inter-operation among Grids is under way. As the core of Grid systems, resource management must be autonomic and interoperable to sustain future Grid computing. For this purpose, we introduce HOURS, a reputation-driven economic framework for Grid resource management. HOURS is designed to tackle the difficulties of automatic rescheduling, self-protection, incentives, heterogeneous resource sharing, reservation, and SLAs in Grid computing. In this paper, we focus on designing a reputation-based resource scheduler, and use emulation to test its performance with real job traces and node failure traces. To describe the HOURS framework completely, a preliminary multiple-currency-based economic model is also introduced in this paper, into which future extensions and improvements can be easily integrated.
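The general idea of reputation-based resource selection can be sketched as follows. This is a minimal illustration under stated assumptions, not the HOURS scheduler itself: the class name, the initial score of 0.5, and the fixed-step update rule are all hypothetical choices made for this example.

```python
from typing import Dict, Iterable


class ReputationScheduler:
    """Pick the node with the highest reputation; update it after each job."""

    def __init__(self, nodes: Iterable[str], initial: float = 0.5,
                 step: float = 0.1) -> None:
        # Every node starts with the same (assumed) neutral reputation.
        self.reputation: Dict[str, float] = {n: initial for n in nodes}
        self.step = step

    def pick(self) -> str:
        # max() returns the first node with the highest score, so ties are
        # broken by the order in which the nodes were registered.
        return max(self.reputation, key=self.reputation.get)

    def report(self, node: str, succeeded: bool) -> None:
        # Reward success and penalize failure, clamping the score to [0, 1].
        delta = self.step if succeeded else -self.step
        updated = self.reputation[node] + delta
        self.reputation[node] = min(1.0, max(0.0, updated))
```

With equal initial scores, the first call to `pick()` behaves like simple sequence selection. The difference appears after failures are reported: a node that has failed is passed over in favor of nodes with a better record, which is the mechanism by which a scheduler of this kind can lower the number of job resubmissions.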
The results demonstrate that our scheduler significantly reduces the job failure rate. The average number of job resubmissions, which is the most important metric in this paper because it affects system performance and resource utilization from the perspective of users, is reduced from 3.82 to 0.70 compared with simple sequence resource selection.
This project is supported by the NSF CAREER grant “Mechanisms for Resource Sharing in Collaborative High-End Computing Platforms.”
Together with Prof. Ling Liu from Georgia Institute of Technology, we
have founded TRAM, an International Workshop on Trust and Reputation
Management on Massively Distributed Computing Systems, and are
organizing two special issues, of IEEE Internet Computing (2010) and
Journal of Computer Science and Technology (2009), respectively. Our
results have been published in IEEE Transactions on Systems, Man, and
Cybernetics, Performance Evaluation Journal, Journal of Parallel and
Distributed Computing, ACM Journal of Mobile Applications and Networks,
and so on. The work in this project has been cited numerous times in the
community. The original PET paper, published at HICSS 2005, had been
cited 93 times by January 2011.
5. Adaptive Workflow Scheduling (with Zhifeng Yu, 2005-2010)
Workflow management systems define, manage
and execute complex workflows on heterogeneous distributed computing
environments. With the popularity of cluster and Grid platforms,
workflow applications tend to be more complex than ever, which in turn
keeps pushing the technological limits of workflow management systems.
At the core of this challenge is workflow scheduling, a classic problem
that is elevated to a much higher level of complexity in the context of
Grids. More specifically, besides the scheduling complexity introduced
by the inter-task dependences of a workflow, an efficient Grid workflow
management system has to deal with the dynamics of Grid resources and
workloads. In a Grid environment, resources come and go and the workload
changes unpredictably. Every workflow management system has two core
components, the workflow Planner and the Executor. Traditional systems
fall into two extremes: static, i.e., the Planner makes the scheduling
decisions, or dynamic, i.e., the Executor schedules tasks. Under the
assumptions that the workflow structure is known a priori, that the
resource pool does not change, and that only one workflow runs at a
time, the static strategy has the Planner make a global scheduling
decision. Dynamic approaches have the Executor schedule in a
just-in-time fashion to address some invalid assumptions made by static
ones; however, they perform worse than some static scheduling
approaches even when task performance estimation is not very
accurate.
This project is supported in part by the NSF CAREER grant “Mechanisms for Resource Sharing in Collaborative High-End Computing Platforms.” The proposed adaptive workflow scheduling algorithms for cluster and Grid environments were published in the Proceedings of the 21st IPDPS in 2007 and have been cited 28 times. The failure-aware workflow scheduling work was published in the journal Cluster Computing in 2010.
6. The Fractal Framework (with Hanping Lufei, 2003-2007)
7. The CONCA Architecture (with Vijay Karamcheti, Daniel Brodie, and Yonggen Mao, 2002-2005)
Future
access to web-based content is likely to be dominated by two trends:
(1) increasing amounts of dynamic, personalized content, and (2) a
significant growth in “on-the-move” access using various mobile
resource-constrained devices. We have proposed a novel architecture for
COnsistent Nomadic Content Access (CONCA), which attempts to support,
from the ground up, caching of dynamic personalized content for mobile
users. CONCA nodes are designed to reuse the shared portions of dynamic
content, exploiting knowledge of user content access preferences to
efficiently support transcoding and nomadic access (e.g., by
prefetching) by assigning a home CONCA node to each user. Based on the
CONCA architecture, we have developed Tuxedo, a peer-to-peer
cooperative Web caching system for transcoded content. We have studied
the benefits of peer-to-peer Web caching systems extensively, focusing
on dynamic Web content caching and delivery. The original CONCA paper
has been cited 32 times. The Tuxedo paper has been cited by papers
published at USENIX NSDI 2006 and 2007. We have conducted the first
analysis of a personalized Web site, NYUHome, and derived several
important implications for caching and delivering personalized Web
content. The paper has been cited more than 41 times.
Requests for dynamic and personalized content
have increasingly become
a significant part of Internet traffic, driven both by a growth in
dynamic web services and a “trickle-down” effect stemming
from the effectiveness of caches and content-distribution networks at
serving static content. To efficiently serve this trend, several
server-side and cache-side techniques have recently been proposed.
Although such techniques, which exploit different forms of reuse at the
sub-document level, appear promising, a significant impediment to their
widespread deployment is (1) the absence of good models describing
characteristics of dynamic web content, and (2) the lack of effective
synthetic content generators, which reduce the effort involved in
verifying the effectiveness of a proposed solution. In our object
modeling project, we addressed both of these shortcomings. Its primary
contribution is a set of models that capture the characteristics of
dynamic content both in terms of independent parameters such as the
distributions of object sizes and their freshness times, as well as
derived parameters such as content reusability across time and linked
documents. These models are derived from an analysis of the content
from six representative news and e-commerce sites, using both
size-based and level-based splitting techniques to infer document
objects. A secondary contribution is a Java-based dynamic content
emulator, which uses these models to generate edge-side include (ESI)
based dynamic content and serve requests for whole documents as well as
separate objects. The paper that presented this work has been cited 62 times.
The current Internet has an inherent mismatch between the low-bandwidth,
limited-resource characteristics of mobile devices and the
high-bandwidth expectations of many content-rich services. Existing
applications and services cope with this problem essentially by
providing differentiated service for different networks and devices. For
example, most popular news, e-mail, and stock trading services today
present a different front-end for mobile users. Although adequate in
some scenarios, this approach suffers from the limitation that mobile
users are classified into a small number of classes and may not receive
performance commensurate with the capabilities of the device or network
they are using. More importantly, such an approach cannot adequately
cope with dynamically changing environments where there is a big
variation in available bandwidth (e.g., a user on a wireless LAN whose
distance from an access point varies through time).
To address this mismatch between clients and servers, we have proposed Composable Adaptive Network Services (CANS), an application-level infrastructure for customizing the data path between client applications and services, which focuses on three challenges: (1) efficient component composition, (2) support for legacy applications and services, and (3) support for distributed adaptation. Our approach relies on three components: (1) type-based specification of components and network resources, (2) an automatic path creation strategy, and (3) system support for low-overhead path reconfiguration. Our first paper about CANS, entitled “CANS: Composable Adaptive Network Services Infrastructure,” was presented at the 3rd USENIX Symposium on Internet Technologies and Systems (USITS ’01), which is ranked by CiteSeer as the publication venue with the second-highest impact among all Computer Science journals and conferences (see http://citeseer.ist.psu.edu/impact.html). This paper has been cited 162 times (as of January 2011).
9. JIAJIA
Software Distributed Shared Memory System (with Weiwu Hu and Zhimin Tang, 1997-2000)
Software Distributed Shared Memory (DSM) is an ideal vehicle for
parallel programming because it combines the programmability of shared
memory systems with the scalability of distributed memory systems.
However, the overhead of maintaining consistency in software and the
high latency of sending messages make achieving good performance from
software DSMs a challenging issue. My Ph.D. research focused on
techniques for improving performance from two perspectives: (1)
reducing the frequency and time of communication entailed by coherence
protocols, and (2) reducing the software overhead of each message
operation. By analyzing the disadvantages of snoopy and directory-based
cache coherence protocols, we proposed a lock-based cache coherence
protocol for scope consistency, and developed a software DSM system
named JIAJIA based on this protocol. Based on a detailed analysis of the
system overheads of software DSM systems, we proposed several techniques
to reduce these overheads in home-based software DSMs. Since data
distribution plays a very important role in home-based software DSM
systems, we investigated a home migration mechanism to reduce remote
data communication, and based on this proposed a task migration scheme.
Furthermore, because of the prevalence of heterogeneous computing
environments, load scheduling and balancing have become critical issues
for achieving high performance in such environments. As part of the
JIAJIA work, we also proposed and evaluated an affinity-based
self-scheduling scheme for load balancing in home-based software DSM
systems.
The first paper about JIAJIA, entitled “JIAJIA: An SVM System Based on A New Cache Coherence Protocol,” presented at HPCN 1999, has been cited 166 times (as of January 2011). All JIAJIA-related papers together have been cited more than 250 times. The system has been downloaded and installed at more than 120 institutions around the world. The JIAJIA system has been listed on many web sites, such as Wikipedia and that of the University of California at Irvine. My Ph.D. dissertation, titled "Performance Optimization of Software DSMs Systems", received the National Outstanding Ph.D. Dissertation Award in 2002, and the book has been published by Higher Education Press.