Research Highlights of Weisong Shi

2011

1. SAIL: Smartphone/Sensor-Assisted Independent Living (with Kewei, Shinan, Arnetz, Guoxing, Lingmei, Quan, 2008- )

Adults aged 65 and older will account for more than 20% of the U.S. population in 2050. In 1991, DHHS created the Healthy People 2000 project - the first national effort targeted at reducing disability and promoting physical health in older adults. Despite the initiative, the number of elders with one or more physical disabilities is increasing. The mission of the SAIL project is to develop the enabling technologies to assist the independent living of aging citizens, including hardware design (sensors), systems design, intelligent data analysis, and security and privacy. The SAIL project consists of a multidisciplinary team from Computer Science, Gerontology, Nursing, and Psychology and Family Medicine.

Along this direction, we have developed SPA, a smartphone-assisted chronic illness self-management system with participatory sensing. The medical system has not been able to adapt effectively to the dramatic transformation in public health challenges, from acute to chronic and lifestyle-related illnesses. Although acute illnesses can be treated successfully in an office or hospital, chronic illnesses comprise the bulk of health care needs and require a very different approach. Patient involvement is critical for sustainable and successful chronic disease management, and regular feedback of relevant health data to the individual patient facilitates that involvement. Yet there is a lack of effective and easily deployed tools for self-monitoring and self-care, and people often do these tasks poorly, especially people at socioeconomic risk for chronic illness, such as urban minorities. Based on the most current cognitive and behavioral change research, we propose that the prevention or treatment of chronic illnesses will be greatly aided by an innovative system that can monitor a person’s body, behavior, and environment during his or her daily life, and then alert the person to take corrective action when health risks are identified. In this paper, we propose a smartphone-assisted chronic illness self-management system, named SPA. Our system can continuously monitor the health condition of the user and give valuable in-situ, context-aware suggestions and feedback to improve public health.

Leveraging body area sensor networks (BASNs) for health care is a very promising application domain for wireless sensor networks. In a typical BASN health care application, bio-sensors and environmental sensors first connect to a local Preprocessing Unit (PU), e.g., a smartphone or a laptop, which extracts the meaningful data and performs the necessary processing before transmitting the data to a Central Server (CS). In building such systems, we realized that designers repeat much of the same work across different BASN systems. Even worse, changing one component of the system usually requires designers to rewrite a large portion of the code. To this end, we propose a Smart Phone Assisted RealTime heAlth care Network framework (SPARTAN) to simplify the development procedure and extend the flexibility of BASN systems.

This project is funded in part by the Swedish Council for Working Life and Social Research and the Department of Homeland Security.


2. Power Profiling and Analysis (with Shinan, Hui, 2009-)

Energy profiling is crucial for applying energy optimizations at the application level and thereby improving energy efficiency. Knowing how much energy a particular application consumes is essential for adapting its behavior to meet specific goals, such as extending the battery lifetime of a mobile device or charging for services in a cloud computing system.

However, current solutions to the energy profiling problem have drawbacks that make them difficult to use widely. First, they are mainly hardware-based, i.e., they require additional multimeters that read the current along the wire between the system and its power supply. This approach is expensive, inflexible, and difficult to deploy. Second, the energy profiling information is provided offline, after collecting, analyzing, and correlating traces of system activities, which makes it difficult for applications to adapt their behavior at runtime. Finally, energy profiling is not offered as a service in the system. Without such a service, system developers face difficulty when implementing energy-aware adaptation protocols for energy optimization.

This work is ongoing. We have developed two tools: pTop, a process-level power profiling tool, and SPAN, a software power analyzer for multicore systems.
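To make the contrast with hardware-based metering concrete, here is a minimal sketch of software-based, process-level energy attribution. It assumes a simple linear model in which a process's energy is its CPU time multiplied by an assumed active power per core. The model, the constant, and all names are illustrative assumptions and do not describe how pTop or SPAN work internally.

```python
from dataclasses import dataclass

# Assumed average active power of one CPU core, in watts (hypothetical value).
CPU_ACTIVE_WATTS = 15.0


@dataclass
class ProcessSample:
    pid: int
    cpu_seconds: float  # CPU time the process consumed during the interval


def estimate_energy_joules(samples, cpu_watts=CPU_ACTIVE_WATTS):
    """Attribute energy to each process as CPU time multiplied by active power."""
    return {s.pid: s.cpu_seconds * cpu_watts for s in samples}


samples = [ProcessSample(pid=101, cpu_seconds=2.0),
           ProcessSample(pid=202, cpu_seconds=0.5)]
energy = estimate_energy_joules(samples)
```

Because the estimate is computed from counters the system already keeps, it could in principle be refreshed at runtime, which is what an application needs in order to adapt its behavior.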


3.  Data Quality Management in Networked Sensing Systems
(with Kewei Sha and Shinan Wang, 2005-)

As new fabrication and integration technologies reduce the cost and size of micro-sensors and wireless sensors, we will witness another revolution that facilitates the observation and control of our physical world, just as networking technologies have changed the way individuals and organizations exchange information. Several applications, such as habitat monitoring, environment and structure monitoring, target (e.g., firefighter) tracking and, most recently, participatory urban sensing have been launched, showing the promising future of a wide range of applications of wireless sensor networks.

Their success is nonetheless determined by whether the sensor networks can provide a high-quality stream of data over a long period. Most previous efforts focus on devising techniques to save sensor node energy and thus extend the lifetime of the whole sensor network. However, with more and more deployments of real sensor systems, whose main function is to collect interesting data and share it with peers, data quality has become a very important issue in the design of sensor systems. In this project, we take a novel approach that detects deceptive data by considering the consistency requirements of the data, and we study the relationship between the quality of data and the multi-hop communication and energy-efficient design of networked sensor systems. In principle, the quality of data should reflect the timeliness and accuracy of the collected data presented to the interested recipients who make the final decisions based on these data. Complementing work on sensor design that improves the accuracy of sensing, we study the relationship between data quality and the energy-efficient design of WSNs. To integrate and manage data quality in WSNs, we propose a consistency-driven data quality management framework called Orchis that integrates the quality of data into an energy-efficient sensor system design. Orchis consists of four components that support the goals of high data quality and energy efficiency: data consistency models, adaptive data sampling and processing protocols, consistency-driven cross-layer protocols, and flexible APIs to manage data quality. We first formally define a consistency model, which not only includes temporal consistency and numerical consistency, but also considers the application-specific requirements of data and the data dynamics in the sensing field.
Next, we propose an adaptive lazy energy-efficient data collection protocol, which adapts the data sampling rate to the data dynamics in the sensing field and stays lazy while data consistency is maintained. Finally, we conduct a comprehensive evaluation of the proposed protocol based on both a TOSSIM-based simulation and a real prototype implementation using MICA2 motes. The results from both simulation and prototype show that our protocol reduces the number of delivered messages, improves the quality of collected data, and in turn extends the lifetime of the whole network. Our analysis also implies that the tradeoff between data consistency requirements and energy saving should be set carefully, based on the specific requirements of each application.
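The adaptive, lazy behavior described above can be sketched as follows. This is a simplified illustration and not the published protocol: the doubling and halving of the sampling interval, the single `tolerance` standing in for the numerical consistency requirement, and all names are assumptions made for the example.

```python
def adaptive_lazy_collect(readings, tolerance, min_interval=1, max_interval=8):
    """Simulate a node that samples `readings` (one value per time step).

    Returns the (time, value) pairs that are actually reported to the sink.
    """
    reported = []
    interval = min_interval
    last_reported = None
    t = 0
    while t < len(readings):
        value = readings[t]
        if last_reported is None or abs(value - last_reported) > tolerance:
            # Consistency would be violated: report and sample more often.
            reported.append((t, value))
            last_reported = value
            interval = max(min_interval, interval // 2)
        else:
            # Data is stable: stay lazy (send nothing) and sample less often.
            interval = min(max_interval, interval * 2)
        t += interval
    return reported


# Twenty time steps with one jump in the sensed value at t = 10.
readings = [20.0] * 10 + [25.0] * 10
reported = adaptive_lazy_collect(readings, tolerance=0.5)
```

In this run only two messages are sent for twenty time steps, but the jump at t = 10 is not reported until t = 15. A larger `max_interval` saves more samples and messages at the cost of later detection, which is the consistency versus energy tradeoff noted above.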

We have extended this work to the vehicular ad-hoc network (VANET) domain. In a VANET, conventional approaches to maintaining data quality are inappropriate because two requirements must be satisfied simultaneously. First, the mechanism must adapt to a frequently changing network topology. Second, sensor networks on vehicles impose real-time requirements, which make outlier detection, trust-based detection, and similar off-line mechanisms insufficient. Although several methods have been proposed that emphasize the real-time functionality of the system, most of them assume a particular data model, e.g., Gaussian or linear. Others do provide a real-time abnormal data detection mechanism, but cannot be applied in VANETs since they mainly rely on a network hierarchy. We first classify deceptive data into two categories, redundant data and false data. Redundant data is defined as data that shares exactly the same or very similar information with data reported previously or by other sensors. The other type of deceptive data is false data, which may result from a malfunctioning sensor board, unreliable wireless communication, or compromised sensors. In the particular setting of VANETs, we are more interested in false data than in redundant data. First, the energy consumption of sensors in vehicles matters less, or can even be ignored, since the battery in each vehicle supplies sufficient power. Second, redundant data can help detect false data since it provides additional information about the whole sensor network. Based on this definition, we propose a role-differentiated cooperative deceptive data detection mechanism, RD4, to detect and filter false data in VANETs. In RD4, when a sensor is deployed, it picks a role from the role set based on several sensing features. We will evaluate the efficiency and efficacy of our mechanism in three specific vehicular scenarios.

This project is supported by the NSF grant "Consistency Model Driven Deceptive Data Detection and Filtering in Wireless Sensor Networks." This work is one of the earliest efforts on data quality and credibility management in wireless sensor networks. The results of this work have been published in IEEE Transactions on Vehicular Technology, Journal of Parallel and Distributed Computing, and the European Conference on Wireless Sensor Networks, among others. I also organized a special issue on "Data Quality Management in Wireless Sensor Networks" for the International Journal on Sensor Networks in 2009.

4. HOURS: Reputation Mechanisms for Resource Sharing
(with Zhengqiang Liang, 2004-)

Community computing (federated sharing of dispersed pools of geographically distributed computing resources under coordinated control) has been considered a promising platform for solving large-scale problems in science and engineering. However, resource management in these environments is a complex undertaking. These systems need effective mechanisms for fair sharing of community resources, adaptability to dynamically changing conditions, prevention of denial-of-service (DoS) attacks, and coordination of the diverse policies, cost models, and varying loads of different peers. As one motivating example, a classical “tragedy of the commons” in peer-to-peer file sharing is that 50 to 70% of peers are free riders, which results in a great load imbalance in the system. Resource trading can enforce a cooperative approach to resource sharing and is a promising way to address the above problems. The autonomous, heterogeneous, and decentralized nature of participating peers across multiple administrative domains introduces two challenging issues related to resource trading. The first is a decentralized trading scheme, which means that each peer decides on resource exchange and negotiation based on its personalized view of the partner and its own policy. The second is self-policing, personalized trustworthiness management, which means that different peers may have different opinions on the trustworthiness of the same peer, instead of a unique global trustworthiness value as on eBay.

In this work, we propose an approach that consists of two models: M-CUBE, a Multiple CUrrency Based Economic model, as the decentralized trading scheme, and aPET, an adaptive PErsonalized Trust model, which provides the trustworthiness of a peer in support of M-CUBE. The M-CUBE model provides a general and flexible substrate to support most of the high-level resource management services required by P2P computing, such as resource co-allocation, quality of service (QoS) control, advance reservation, and scheduling algorithms. aPET derives trustworthiness from reputation evaluation and risk evaluation. The trustworthiness value provided by PET is treated by M-CUBE as its view of the peer. The unique feature of our approach is the seamless integration of the trustworthiness and dependability of peers into resource trading. We have successfully applied the trust models to two applications: ISP peering and Grid resource management.
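As a rough illustration of a personalized trust value derived from reputation and risk, consider the sketch below. The weighted-sum formula, the neutral prior for unknown peers, and all names are assumptions made for this example; they are not the published aPET definitions.

```python
class PersonalizedTrust:
    """Each peer keeps its own view of other peers (no global trust value)."""

    def __init__(self, reputation_weight=0.5):
        self.reputation_weight = reputation_weight  # assumed weighting
        self.good = {}  # partner -> count of satisfactory interactions
        self.bad = {}   # partner -> count of unsatisfactory interactions

    def record(self, partner, satisfactory):
        book = self.good if satisfactory else self.bad
        book[partner] = book.get(partner, 0) + 1

    def trustworthiness(self, partner, risk):
        """Combine local reputation with a risk estimate in [0, 1]."""
        good = self.good.get(partner, 0)
        bad = self.bad.get(partner, 0)
        total = good + bad
        reputation = good / total if total else 0.5  # neutral prior for strangers
        w = self.reputation_weight
        return w * reputation + (1 - w) * (1 - risk)


alice, bob = PersonalizedTrust(), PersonalizedTrust()
for outcome in (True, True, True, False):
    alice.record("carol", outcome)
bob.record("carol", False)

alice_view = alice.trustworthiness("carol", risk=0.2)
bob_view = bob.trustworthiness("carol", risk=0.2)
```

Here `alice_view` is higher than `bob_view` for the same partner, which is the personalized property: each peer's trading decision rests on its own interaction history rather than on one shared score.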

In ISP peering, the fragility and poor resilience of the Internet were made evident by the severe impact on network activities, and the slow recovery, after an earthquake damaged undersea cables and disrupted telephone and Internet access in East Asia in December 2006. Besides the inefficiency of routing protocols, another important reason is the lack of efficient network monitoring mechanisms and of economic incentives that encourage service providers to act cooperatively and promptly. In this paper, we build a trust-based economic framework called TRECON to address these open problems in Internet routing. The novelty of TRECON is that it combines an adaptive personalized trust model (aPET) with an economic approach to provide independent trust-based routing among service providers. TRECON provides flexible policy support based on the trust-based economic mechanism, so that autonomous organizations with varied interests and optimization criteria can be smoothly integrated to achieve better adaptiveness and self-management. By introducing the economic model, TRECON explores a new way to solve the economic and incentive problems in the collaboration among service providers. To show the flexibility of its routing policy support, we propose four typical routing policies under the TRECON framework. We evaluate our approach by comparing these four trust-derived routing policies with the classical global shortest path routing (SPA) approach. We find that the policy based on trustworthiness (TRU) performs much better than all other policies under different network topologies in terms of delay, successful delivery rate, and economic effects.

The main obstacle to the widespread adoption of the Grid is the difficulty of using, configuring, and maintaining it, which requires extensive IT knowledge, workload, and human intervention. At the same time, inter-operation among Grids is under way. As the core of Grid systems, resource management must be autonomic and inter-operable to be sustainable for future Grid computing. For this purpose, we introduce HOURS, a reputation-driven economic framework for Grid resource management. HOURS is designed to tackle the difficulties of automatic rescheduling, self-protection, incentives, heterogeneous resource sharing, reservation, and SLAs in Grid computing. In this paper, we focus on designing a reputation-based resource scheduler, and we use emulation to test its performance with real job traces and node failure traces. To describe the HOURS framework completely, a preliminary multiple-currency-based economic model is also introduced in this paper, into which future extensions and improvements can be easily integrated. The results demonstrate that our scheduler can reduce the job failure rate significantly. The average number of job resubmissions, the most important metric in this paper because it affects system performance and resource utilization from the users' perspective, is reduced from 3.82 to 0.70 compared to simple sequential resource selection.
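The intuition behind reputation-based selection reducing resubmissions can be shown with a toy example. This is a sketch under strong assumptions, not the HOURS scheduler: nodes either always succeed or always fail, reputation is a simple counter, and all names are hypothetical.

```python
def pick_node(reputation):
    """Choose the node with the highest reputation (ties go to the first name in sorted order)."""
    return max(sorted(reputation), key=lambda node: reputation[node])


def run_job(reliable_nodes, reputation, max_attempts=10):
    """Submit a job until it succeeds; return the number of resubmissions."""
    resubmissions = 0
    for _ in range(max_attempts):
        node = pick_node(reputation)
        if node in reliable_nodes:
            reputation[node] += 1   # reward the node that completed the job
            return resubmissions
        reputation[node] -= 1       # penalize the node that failed
        resubmissions += 1
    return resubmissions


reputation = {"n1": 0, "n2": 0, "n3": 0}
first_job = run_job({"n3"}, reputation)
second_job = run_job({"n3"}, reputation)
```

With no history, the first job behaves like sequential selection and is resubmitted twice before reaching the reliable node; the second job goes straight to it. The numbers reported above come from emulation with real traces, not from a model this simple.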

This project is supported by the NSF CAREER grant “Mechanisms for Resource Sharing in Collaborative High-End Computing Platforms.” Together with Prof. Ling Liu from Georgia Institute of Technology, we have founded TRAM, an International Workshop on Trust and Reputation Management on Massively Distributed Computing Systems, and are organizing two special issues, of IEEE Internet Computing (2010) and Journal of Computer Science and Technology (2009), respectively. Our results have been published in IEEE Transactions on Systems, Man, and Cybernetics, Performance Evaluation Journal, Journal of Parallel and Distributed Computing, ACM Journal of Mobile Applications and Networks, and so on. The work in this project has been cited numerous times in the community. The original PET paper, published at HICSS 2005, had been cited 93 times by January 2011.

5. Adaptive Workflow Scheduling (with Zhifeng Yu, 2005-2010)

Workflow management systems define, manage and execute complex workflows on heterogeneous distributed computing environments. With the popularity of cluster and Grid platforms, workflow applications tend to be more complex than ever, which in turn keeps pushing the technology limits of workflow management systems. At the core of this challenge is workflow scheduling, a classic problem that is elevated to a much higher level of complexity in the context of Grids. More specifically, besides the scheduling complexity introduced by the inter-task dependences of a workflow, an efficient Grid workflow management system has to deal with the dynamics of Grid resources and workload. In a Grid environment, resources come and go and the workload changes unpredictably. Every workflow management system has two core components, the workflow Planner and the Executor. Traditional systems fall into two extremes: static, i.e., the Planner makes the scheduling decisions, or dynamic, i.e., the Executor schedules tasks. With a static strategy, the Planner makes a global scheduling decision under the assumptions that the workflow structure is known a priori, the resource pool does not change, and only one workflow runs at a time. In dynamic approaches, the Executor schedules in a just-in-time fashion to address the invalid assumptions made by static ones; however, it performs worse than some static scheduling approaches, even when task performance estimation is not very accurate.

We propose a new design of workflow management system that adapts static strategies to a dynamic Grid environment by implementing a planner-guided dynamic scheduler, and that tackles the dynamics of both resources and workload. Its objective is to improve the efficiency and practicability of workflow scheduling and the efficiency of resource utilization. The adaptive approach exploits the advantages of both static and dynamic strategies to achieve good workflow performance from the perspectives of both the system and the user. In our design, the workflow Planner and the Executor collaborate with each other. The Planner always prioritizes all tasks and may also propose a resource mapping. At run time, the Executor manages dynamic resource and workload changes. It deals with a workload potentially mixed from multiple workflows and other ordinary jobs, and ensures that critical tasks are assigned proper resources by using the task priorities set by the Planner. The experimental results demonstrate the robustness and effectiveness of the proposed algorithm.
While clusters and Grids are as resource failure prone as any other environments, the resource failure is either ignored by some algorithms, particularly static ones, or it is perceived that only the perfect failure prediction can help scheduling. We augmented our adaptive algorithm with failure awareness by integrating it with an ordinary failure predictor. The experiment demonstrated that, with the failure prediction with practically achievable accuracy rate, our proposed failure aware scheduling algorithm outperformed others significantly in a workload intensive and error prone environment. As a result of this study, we proposed two new definitions of failure prediction accuracy: Application Oblivious Accuracy (AOA) from a system's perspective and Application Aware Accuracy (AAA) from a scheduler's perspective, which we believe better reflect how failure prediction accuracy impacts on scheduling effectiveness.
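The division of labor between Planner and Executor can be sketched as below. The sketch makes several assumptions that are not from the source: the Planner ranks a task by its longest path to a workflow exit (a common list-scheduling heuristic, used here as a stand-in), every task takes one time step, at least one resource is free at every step, and failure prediction is left out. All names are hypothetical.

```python
from functools import lru_cache


def plan_priorities(cost, successors):
    """Planner: rank each task by the length of its longest path to a workflow exit."""
    @lru_cache(maxsize=None)
    def rank(task):
        return cost[task] + max((rank(s) for s in successors.get(task, ())), default=0)
    return {task: rank(task) for task in cost}


def execute(cost, successors, free_resources_per_step):
    """Executor: at each step, dispatch ready tasks in Planner priority order
    to however many resources happen to be free at that moment."""
    priority = plan_priorities(cost, successors)
    pending_parents = {task: 0 for task in cost}
    for succs in successors.values():
        for s in succs:
            pending_parents[s] += 1
    ready = [task for task in cost if pending_parents[task] == 0]
    schedule = []
    step = 0
    while ready:
        # Reuse the last entry once the list of resource counts is exhausted.
        free = free_resources_per_step[min(step, len(free_resources_per_step) - 1)]
        ready.sort(key=lambda task: -priority[task])
        batch, ready = ready[:free], ready[free:]
        schedule.append(batch)
        for task in batch:
            for s in successors.get(task, ()):
                pending_parents[s] -= 1
                if pending_parents[s] == 0:
                    ready.append(s)
        step += 1
    return schedule


cost = {"A": 2, "B": 3, "C": 1, "D": 2}
successors = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
two_resources = execute(cost, successors, [2])
one_resource = execute(cost, successors, [1])
```

The priorities are computed once, but the mapping to resources is decided step by step. When only one resource is free, task B, which lies on the critical path, is dispatched before C.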

This project is supported in part by the NSF CAREER grant “Mechanisms for Resource Sharing in Collaborative High-End Computing Platforms.” The proposed adaptive workflow scheduling algorithm for cluster and Grid environments was published in the Proceedings of the 21st IPDPS in 2007 and has been cited 28 times. The failure-aware workflow scheduling work was published in the Journal of Cluster Computing in 2010.

6. The Fractal Framework (with Hanping Lufei, 2003-2007)
With the development of computer and communication technologies, more and more heterogeneous devices, like desktops, laptops, PocketPCs, and cellular phones, are connected to the Internet using diverse networks, like Ethernet, Wi-Fi, Bluetooth, and 3G/4G wireless technology. On one hand, different technologies have different characteristics. On the other hand, a heterogeneous environment makes it possible to dynamically change between different devices and network environments. For instance, a person uses a laptop with a cable modem at home, a cell phone with 3G/4G or Bluetooth on the way to the office, a desktop with Ethernet LAN in the office and a PDA with Wi-Fi in the meeting room. Diverse network connections and heterogeneous devices demand adaptation functionality in a distributed fashion because no one-size-fits-all single function or protocol can perform well over all these networks and devices.

It is difficult, if not impossible, to build a one-size-fits-all application or protocol that runs well in such a dynamic environment. Adaptation has been considered a general approach to address the mismatch problem between clients and servers. From the perspective of adaptation location, some proposals perform in-network adaptation, such as CANS, Active Names, Odyssey, and Rover, which focus on how to do the adaptation step by step across an overlay path. Although the functionalities are well designed, they have not considered the deployment of the chosen components (drivers in CANS) across multiple nodes in the path. This is an obstacle to the wide acceptance of these approaches. Other proposals perform end-to-end adaptation, like static content-based adaptation, which does not take the mobility of users and the dynamically changing environment into consideration. From the point of view of the OSI network model, some proposals work in the network layer and adapt the TCP/IP protocol dynamically according to the changing situations on both ends. Although the results are promising, they cannot handle application-level protocol adaptation, which makes more sense for many overlay distributed applications, e.g., streaming multicast on the Internet. In this paper we propose Fractal, a dynamic application-level protocol adaptation approach, which uses mobile code technology for protocol adaptation and leverages existing content distribution networks (CDNs) to deploy protocol adaptors (PADs), which are mobile code. The idea of protocol adaptation is based on the assumption that an application protocol is composed of a series of components, called PADs in the Fractal framework. When a protocol needs to be adapted, the application simply adds some PADs to it or removes some PADs from it. Before a mobile client starts an application session with the application server, it uses the proposed interactive negotiation protocol to negotiate with the adaptation proxy deployed close to the application server. The negotiation manager inside the adaptation proxy uses the proposed adaptation path search algorithm to find one or more appropriate PADs that should be used in the subsequent communication between the client and the application server. Metadata about these PADs is sent to the client by the adaptation proxy. The client is then able to retrieve the PADs, which are packaged into mobile code modules, from the CDN and start the new protocol. Although a large amount of research on mobile code and CDNs has been done, few studies have combined the advantages of both for protocol adaptation. Based on the proposed framework, we have designed and implemented two case studies: an adaptive message encryption protocol and an adaptive communication optimization protocol. The contributions of Fractal are five-fold: (1) proposing a general framework for dynamic application-level protocol adaptation; (2) dynamically adapting at the application protocol level; (3) leveraging CDN edge servers for protocol adaptor delivery; (4) designing and implementing an adaptive message encryption protocol in the context of the Fractal framework; and (5) proposing and implementing an adaptive communication optimization protocol in the context of the Fractal framework.
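The notion of a protocol as a chain of PADs that can be added or removed can be sketched as follows. This is an illustrative model only: representing a PAD as an encode/decode pair, the two example PADs, and the trivial `negotiate` function standing in for the negotiation protocol and path search are all assumptions, and no mobile code or CDN retrieval is modeled.

```python
import base64
import zlib

# Each hypothetical PAD is modeled as an (encode, decode) pair of functions.
PADS = {
    "compress": (zlib.compress, zlib.decompress),
    "base64": (base64.b64encode, base64.b64decode),
}


def negotiate(client_capabilities, wanted):
    """Stand-in for negotiation: keep the wanted PADs that the client can run."""
    return [name for name in wanted if name in client_capabilities]


def send(payload, pad_names):
    """Apply each PAD's encoder in order before the payload goes on the wire."""
    for name in pad_names:
        payload = PADS[name][0](payload)
    return payload


def receive(payload, pad_names):
    """Undo the PAD chain by applying the decoders in reverse order."""
    for name in reversed(pad_names):
        payload = PADS[name][1](payload)
    return payload


message = b"hello" * 100
chain = negotiate({"compress", "base64"}, ["compress", "base64"])
wire = send(message, chain)
restored = receive(wire, chain)
```

Adapting the protocol then amounts to changing the list of PAD names: a client that cannot run the `compress` PAD would negotiate the chain `["base64"]` and both ends would still interoperate.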
  

Since the inception of the service-oriented computing paradigm, we have witnessed a plethora of services deployed across a broad spectrum of applications, ranging from conventional RPC-based services to SOAP-based Web services. Likewise, the proliferation of mobile devices has enabled the remote “on the move” access of these services from anywhere at any time. Secure access to these services is challenging, especially in a mobile computing environment with heterogeneous modalities. Conventional static access control mechanisms are not able to accommodate complex secure access requirements. In this paper, we propose an adaptive secure access mechanism to address this problem. Our mechanism consists of two components: an adaptive access control module and an adaptive function invocation module. It not only adapts access control policies to diverse requirements but also introduces function invocation adaptation during access, which is the missing part of existing access control models. We have successfully applied the proposed adaptive secure access mechanism to a computer-assisted surgery application called UbiCAS, a mobile Computer-Assisted Surgery system that allows surgeons to retrieve, review and interpret multimodal medical images, and to perform some critical neurosurgical procedures on heterogeneous devices from anywhere at anytime. Performance evaluation shows that with limited overhead, our technique enforces secure access to the services provided by the UbiCAS system in a flexible way.

The Fractal paper received the Best Paper Award at the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS '05), a premier conference in the field of parallel and distributed computing. The Fractal work was supported by a four-year project titled “Integration of Bioengineering and Bio-computing to Advance Michigan Computer-Assisted Surgery,” funded by Michigan Life Science Corridor. The other novel contributions of the project are three-fold: (1) providing a secure Web-based interface for the CAS Engine, which extends the accessibility of the CAS Engine tremendously; (2) proposing an adaptive communication optimization (ACO) technique using the application-level protocol adaptation framework Fractal, which paves the way for pervasive access to the CAS Engine from any device, anywhere; and (3) developing an energy-aware quality-of-service (QoS) model to support efficient collaboration among multiple doctors who may use different devices, e.g., smartphones, PocketPCs, and so on. The results of this project have been published in IEEE Transactions on Services Computing, Journal of Parallel and Distributed Computing, and Computer Networks.

7. The CONCA Architecture (with Vijay Karamcheti, Daniel Brodie, and Yonggen Mao, 2002-2005)

Future access to web-based content is likely to be dominated by two trends: (1) increasing amounts of dynamic, personalized content, and (2) a significant growth in “on-the-move” access using various mobile resource-constrained devices. We have proposed a novel architecture for COnsistent Nomadic Content Access (CONCA), which attempts to support, from the ground up, caching of dynamic personalized content for mobile users. CONCA nodes are designed to reuse the shared portions of dynamic content and, by assigning a home CONCA node to each user, to exploit knowledge of user content access preferences to efficiently support transcoding and nomadic access (e.g., by prefetching).

Based on the CONCA architecture, we have developed Tuxedo, a peer-to-peer cooperative Web caching system for transcoded content. We have studied the benefit of peer-to-peer Web caching systems extensively, focusing on dynamic Web content caching and delivery. The original CONCA paper has been cited 32 times. The Tuxedo paper has been cited by papers published at USENIX NSDI 2006 and 2007. We have conducted the first analysis of a personalized Web site, NYUHome, and derived several important implications for caching and delivering personalized Web content. The paper has been cited more than 41 times.

Requests for dynamic and personalized content have increasingly become a significant part of Internet traffic, driven both by a growth in dynamic web services and a “trickle-down” effect stemming from the effectiveness of caches and content-distribution networks at serving static content. To efficiently serve this trend, several server-side and cache-side techniques have recently been proposed. Although such techniques, which exploit different forms of reuse at the sub-document level, appear promising, two significant impediments to their widespread deployment are (1) the absence of good models describing the characteristics of dynamic web content, and (2) the lack of effective synthetic content generators, which would reduce the effort involved in verifying the effectiveness of a proposed solution. In our object modeling project, we addressed both of these shortcomings. Its primary contribution is a set of models that capture the characteristics of dynamic content both in terms of independent parameters, such as the distributions of object sizes and their freshness times, and in terms of derived parameters, such as content reusability across time and linked documents. These models are derived from an analysis of the content from six representative news and e-commerce sites, using both size-based and level-based splitting techniques to infer document objects. A secondary contribution is a Java-based dynamic content emulator, which uses these models to generate edge-side include (ESI) based dynamic content and serve requests for whole documents as well as separate objects. The paper that presented this work has been cited 62 times.

Furthermore, we have proposed a keyword-based fragment detection approach, which takes original dynamic Web content and converts it to fragment-enabled content, so that the dynamic parts of the document are separated as fragments from the static template of the document. The approach uses predefined keywords to find these fragments and to split them out of the core document. Our second proposal, an augmentation to the ESI standard, uses a mapping table to separate the information about the position of each fragment in the template from the template data itself. With this, a fragment-enabled cache can identify fragments at a finer granularity, independent of their location in the template, which enables it to take into account fragment behaviors such as fragment movement. We used content taken from three real Web sites to carry out a detailed performance evaluation of our proposals. Our results show that our keyword-based approach for fragment detection and extraction provides cacheable fragments that, when combined with our proposed mapping table augmentation, can provide significant advantages for fragment-based Web caching of existing dynamic Web content. This paper won the Best Paper Award of the 2004 International Conference on Web Engineering (ICWE), whose acceptance rate was 12% in 2004. ICWE is the most prestigious conference on Web Engineering.
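The two proposals can be illustrated together with a small sketch. It assumes the page is already divided into top-level blocks, that a block is dynamic when it contains one of the predefined keywords, and that the mapping table is a list of (position, fragment id) pairs. These representations and all names are assumptions for the example, not the published design or actual ESI syntax.

```python
def split_fragments(blocks, keywords):
    """Separate dynamic fragments from the static template.

    Returns (template, fragments, mapping), where `mapping` records the
    position of each fragment in the full page, kept apart from the template.
    """
    template = []    # static blocks only
    fragments = {}   # fragment id -> content
    mapping = []     # (position in the full page, fragment id)
    for position, block in enumerate(blocks):
        keyword = next((k for k in keywords if k in block), None)
        if keyword is None:
            template.append(block)
        else:
            fragment_id = f"{keyword}-{len(fragments)}"
            fragments[fragment_id] = block
            mapping.append((position, fragment_id))
    return template, fragments, mapping


def assemble(template, fragments, mapping):
    """Rebuild the page by inserting fragments at their mapped positions."""
    page = list(template)
    for position, fragment_id in sorted(mapping):
        page.insert(position, fragments[fragment_id])
    return page


blocks = [
    "<div>Site header</div>",
    "<div class='stock'>ACME 12.3</div>",
    "<div>About us</div>",
    "<div class='weather'>Sunny</div>",
]
template, fragments, mapping = split_fragments(blocks, ["stock", "weather"])
page = assemble(template, fragments, mapping)
```

Because positions live only in the mapping table, moving a fragment on the page changes one table entry while the cached template and the cached fragment content stay valid.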


8. The CANS Infrastructure (with Xiaodong Fu and Vijay Karamcheti, 2000-2001)

The current Internet has an inherent mismatch between the low-bandwidth, limited-resource characteristics of mobile devices and the high-bandwidth expectations of many content-rich services. Existing applications and services cope with this problem essentially by providing differentiated service for different networks and devices. For example, most popular news, e-mail, and stock trading services today present a different front-end to mobile users. Although adequate in some scenarios, this approach is limited in that mobile users are classified into a small number of classes and may not receive performance commensurate with the capabilities of the device or network they are using. More importantly, such an approach cannot adequately cope with dynamically changing environments in which available bandwidth varies widely (e.g., a user on a wireless LAN whose distance from an access point varies over time).

To address this mismatch between clients and servers, we have proposed Composable Adaptive Network Services (CANS), an application-level infrastructure for customizing the data path between client applications and services, which focuses on three challenges: (1) efficient component composition, (2) support for legacy applications and services, and (3) support for distributed adaptation. Our approach relies on three components: (1) type-based specification of components and network resources, (2) an automatic path creation strategy, and (3) system support for low-overhead path reconfiguration.
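
To make the idea of type-based specification and automatic path creation concrete, the following Python sketch chains components by matching data types. It is a simplified illustration under stated assumptions, not the CANS algorithm: it considers type compatibility only and ignores network resources and performance, and the component names, type names, and the find_path function are all hypothetical.

```python
from collections import deque

# Hypothetical components: name -> (input data type, output data type).
COMPONENTS = {
    "decompress": ("gzip/html", "text/html"),
    "html_to_text": ("text/html", "text/plain"),
    "image_downscale": ("image/jpeg", "image/jpeg-lowres"),
    "compress_text": ("text/plain", "gzip/text"),
}


def find_path(components, source_type, target_type):
    """Breadth-first search over data types.

    Returns the shortest list of component names that converts source_type
    into target_type, an empty list if no conversion is needed, or None if
    no type-compatible chain exists.
    """
    queue = deque([(source_type, [])])
    seen = {source_type}
    while queue:
        current, path = queue.popleft()
        if current == target_type:
            return path
        for name, (in_type, out_type) in components.items():
            # A component can be appended only if it accepts the current type.
            if in_type == current and out_type not in seen:
                seen.add(out_type)
                queue.append((out_type, path + [name]))
    return None
```

For example, a service that emits gzip/html and a client that accepts only text/plain would be connected in this sketch by the chain decompress followed by html_to_text.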

Our first paper on CANS, entitled “CANS: Composable Adaptive Network Services Infrastructure”, was presented at the 3rd USENIX Symposium on Internet Technologies and Systems (USITS ’01), which CiteSeer ranks as the second-highest-impact publication venue among all computer science journals and conferences (see http://citeseer.ist.psu.edu/impact.html). This paper has been cited 162 times (as of January 2011).

9. JIAJIA Software Distributed Shared Memory System  (with Weiwu Hu and Zhimin Tang, 1997-2000)

Software Distributed Shared Memory (DSM) is an ideal vehicle for parallel programming because it combines the programmability of shared memory systems with the scalability of distributed memory systems. However, the overhead of maintaining consistency in software and the high latency of sending messages make achieving high performance from software DSMs a challenging issue. My Ph.D. research focused on techniques for improving performance from two perspectives: (1) reducing the frequency and time of the communication entailed by coherence protocols, and (2) reducing the software overhead of each message operation.

By analyzing the disadvantages of snoopy and directory-based cache coherence protocols, we proposed a lock-based cache coherence protocol for scope consistency and developed a software DSM system named JIAJIA based on this protocol. Based on a detailed analysis of the system overheads of software DSM systems, we proposed several techniques to reduce these overheads in home-based software DSMs. Since data distribution plays a very important role in home-based software DSM systems, we investigated a home migration mechanism to reduce remote data communication and, building on it, proposed a task migration scheme. Furthermore, because heterogeneous computing environments are now prevalent, load scheduling and balancing have become critical to achieving high performance. As part of the JIAJIA work, we also proposed and evaluated an affinity-based self-scheduling scheme for load balancing in home-based software DSM systems.
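
The intuition behind home migration can be shown with a small Python sketch. The policy used here, moving a page's home to a node when that node was the page's only writer during the last synchronization interval, is an assumption chosen for illustration; the text above does not specify JIAJIA's actual migration rule. The function name migrate_homes and both data structures are hypothetical.

```python
def migrate_homes(homes, writers_by_page):
    """Return updated page homes under an assumed single-writer policy.

    homes:           dict mapping page id -> node id of the page's current home
    writers_by_page: dict mapping page id -> set of node ids that wrote the
                     page during the last synchronization interval
    """
    new_homes = dict(homes)
    for page, writers in writers_by_page.items():
        if len(writers) == 1:
            # Exactly one writer: making it the home means its future writes
            # to this page no longer need to be sent to a remote home.
            (writer,) = writers
            new_homes[page] = writer
    return new_homes
```

In this sketch, pages with several writers keep their current home, since no single node would clearly benefit from the move.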

The first paper on JIAJIA, entitled “JIAJIA: An SVM System Based on A New Cache Coherence Protocol”, presented at HPCN1999, has been cited 166 times (as of January 2011). Together, the JIAJIA-related papers have been cited more than 250 times. The system has been downloaded and installed at more than 120 institutions around the world, and it is listed on many web sites, including those of Wikipedia and the University of California at Irvine. My Ph.D. dissertation, titled "Performance Optimization of Software DSMs Systems", received the National Outstanding Ph.D. Dissertation Award in 2002 and has been published as a book by Higher Education Press.