Research Highlights of Weisong Shi

2009
  1. Data Quality Management in Networked Sensing Systems (with Kewei Sha and Shinan Wang, 2005-)

    As new fabrication and integration technologies reduce the cost and size of micro-sensors and wireless sensors, we will witness another revolution that facilitates the observation and control of our physical world, just as networking technologies have changed the way individuals and organizations exchange information. Several applications, such as habitat monitoring, environment and structure monitoring, target (e.g., firefighter) tracking, and, most recently, participatory urban sensing, have been launched, showing the promising future of a wide range of applications of wireless sensor networks.

    Their success is nonetheless determined by whether the sensor networks can provide a high-quality stream of data over a long period. Most previous efforts focus on devising techniques to save sensor node energy and thus extend the lifetime of the whole sensor network. However, with more and more deployments of real sensor systems, whose main function is to collect interesting data and share it with peers, data quality has become a very important issue in the design of sensor systems. In this project, we take a novel approach that detects deceptive data by considering the consistency requirements of the data, and we study the relationship between the quality of data and the multi-hop communication and energy-efficient design of networked sensor systems. The project consists of four components: (1) formal models for data consistency and data dynamics, (2) APIs to manage data consistency, (3) protocols to detect deceptive data and improve the quality of collected data, and (4) several cross-layer protocols to support data consistency and the filtering of deceptive data. These four components are integrated into a prototype called Orchis. This project is supported by an NSF grant on "Consistency Model Driven Deceptive Data Detection and Filtering in Wireless Sensor Networks." This work is one of the earliest efforts on data quality/credibility management in wireless sensor networks.

  2. HOURS: Reputation Mechanisms for Resource Sharing (with Zhengqiang Liang, 2004-)

    Community computing, the federated sharing of dispersed pools of geographically distributed computing resources under coordinated control, has been considered a promising platform for solving large-scale problems in science and engineering. However, resource management in these environments is a complex undertaking. These systems need effective mechanisms for fair sharing of community resources, adaptability to dynamically changing conditions, prevention of denial-of-service (DoS) attacks, and coordination of the diverse policies, cost models, and varying loads of different peers. As one motivating example, a classical “tragedy of the commons” in peer-to-peer file sharing is that 50 to 70% of peers are free riders, which results in a great load imbalance in these systems. Resource trading can enforce a cooperative approach to resource sharing and is a promising way to address the above problems. The autonomous, heterogeneous, and decentralized nature of participating peers across multiple administrative domains introduces two challenging issues related to resource trading: (1) a decentralized trading scheme, which means that decisions on resource exchange and negotiation are made by each peer based on its personalized view of the partner and its own policy; and (2) self-policing personalized trustworthiness management, which means that different peers may hold different opinions on the trustworthiness of the same peer, instead of sharing a unique global trustworthiness value as on eBay.

    In this work, we propose an approach that consists of two models: M-CUBE, a Multiple CUrrency Based Economic model, as the decentralized trading scheme, and aPET, an adaptive PErsonalized Trust model, which provides the trustworthiness of each peer to support M-CUBE. The M-CUBE model provides a general and flexible substrate to support most of the high-level resource management services required by P2P computing, such as resource co-allocation, quality of service (QoS) control, advance reservation, and scheduling algorithms. aPET derives trustworthiness from reputation evaluation and risk evaluation. The trustworthiness value provided by aPET is treated by M-CUBE as its view of the peer. The unique feature of our approach is the seamless integration of the trustworthiness and dependability of peers into resource trading. This project is supported by the NSF CAREER grant “Mechanisms for Resource Sharing in Collaborative High-End Computing Platforms.” Together with Prof. Ling Liu from Georgia Institute of Technology, we have founded TRAM, an International Workshop on Trust and Reputation Management on Massively Distributed Computing Systems, and are organizing two special issues, of IEEE Internet Computing (2010) and the Journal of Computer Science and Technology (2009), respectively.
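
    The idea of a personalized trust value built from a reputation term and a risk term can be illustrated with a minimal sketch. The text above does not give the actual aPET formulas, so everything below is an assumption made for illustration only: the PeerView class, the success-ratio reputation, the sliding-window risk, and the weight w_rep are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PeerView:
    """One peer's private view of a single partner (hypothetical structure)."""
    good: int = 0   # interactions the local peer judged cooperative
    bad: int = 0    # interactions the local peer judged uncooperative
    recent: list = field(default_factory=list)  # latest outcomes: 1 = good, 0 = bad

    def record(self, cooperative: bool, window: int = 5) -> None:
        """Store the outcome of one interaction; keep only the last `window` outcomes."""
        if cooperative:
            self.good += 1
        else:
            self.bad += 1
        self.recent.append(1 if cooperative else 0)
        self.recent = self.recent[-window:]

    def reputation(self) -> float:
        """Long-term term (assumed): fraction of all interactions that were good."""
        total = self.good + self.bad
        return 0.5 if total == 0 else self.good / total

    def risk(self) -> float:
        """Short-term term (assumed): fraction of recent interactions that were bad."""
        if not self.recent:
            return 0.5
        return 1.0 - sum(self.recent) / len(self.recent)

    def trust(self, w_rep: float = 0.5) -> float:
        """Combine both terms; w_rep is an assumed weighting, not a value from aPET."""
        return w_rep * self.reputation() + (1.0 - w_rep) * (1.0 - self.risk())
```

    Because each peer keeps its own PeerView objects, two peers that have had different experiences with the same partner will compute different trust values for it, which is the "personalized" property described above, in contrast to a single global score.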

  3. The Fractal Framework (with Hanping Lufei, 2003-2007)

    CANS provides a perfect infrastructure for content adaptation; however, it has two drawbacks. One is the driver deployment problem, i.e., how to find and deliver these drivers on the Internet. The other is its content-only adaptation, which is not general enough. To address these two drawbacks, we proposed Fractal, a framework for dynamic application protocol adaptation in pervasive computing. Fractal works entirely at the application level and has no specific requirements about underlying network topologies, connection media types, network protocols, or client hardware configurations. As a general adaptation framework, it focuses on a protocol adaptation method that uses protocol adaptors (PADs) to describe the application protocol structure and distributes the PADs to clients through content distribution networks (CDNs) for the purpose of protocol adaptation. By performing protocol adaptation at the application level, the core Internet protocols can continue to do what they do best (best-effort, unreliable routing of packets), while the system as a whole can adapt.

    The Fractal paper received the Best Paper Award at the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS '05), which is a premier conference in the field of parallel and distributed computing. The Fractal work was supported by a four-year project titled "Integration of Bioengineering and Bio-computing to Advance Michigan Computer-Assisted Surgery," funded by the Michigan Life Science Corridor. The idea has been very successfully applied to UbiCAS, a mobile Computer-Assisted Surgery system that allows surgeons to retrieve, review, and interpret multimodal medical images, and to perform some critical neurosurgical procedures on heterogeneous devices from anywhere at any time. The project's other novelties are threefold: (1) providing a secure Web-based interface for the CAS Engine, which extends the accessibility of the CAS Engine tremendously; (2) proposing an adaptive communication optimization (ACO) technique using the application-level protocol adaptation framework Fractal, where the ACO module paves the way for pervasive access to the CAS Engine from any device, anywhere; and (3) developing an energy-aware quality-of-service (QoS) model to support efficient collaboration among multiple doctors who may use different communication devices, e.g., smartphones and Pocket PCs.

  4. The CONCA Architecture (with Vijay Karamcheti, Daniel Brodie, and Yonggen Mao, 2002-2005)

    Future access to web-based content is likely to be dominated by two trends: (1) increasing amounts of dynamic, personalized content, and (2) a significant growth in “on-the-move” access using various mobile resource-constrained devices. We have proposed a novel architecture for COnsistent Nomadic Content Access (CONCA), which attempts to support, from the ground up, caching of dynamic personalized content for mobile users. CONCA nodes are designed to reuse the shared portions of dynamic content. By assigning a home CONCA node to each user, the architecture exploits knowledge of user content access preferences to efficiently support transcoding and nomadic access (e.g., by prefetching).

    Based on the CONCA architecture, we have developed Tuxedo, a peer-to-peer cooperative Web caching system for transcoded content. We have studied the benefit of peer-to-peer Web caching systems extensively, focusing on dynamic Web content caching and delivery. The original CONCA paper has been cited 30 times. The Tuxedo paper has been cited by papers published at USENIX NSDI 2006 and 2007. We have conducted the first analysis of a personalized Web site, NYUHome, and derived several important implications for caching and delivering personalized Web content. The paper has been cited more than 28 times. We have proposed a set of models that capture the characteristics of dynamic content both in terms of independent parameters, such as the distributions of object sizes and their freshness times, and in terms of derived parameters, such as content reusability across time and linked documents. The paper that presented this work has been cited 45 times. We have also proposed a keyword-based fragment detection approach. This paper won the Best Paper Award of the 2004 International Conference on Web Engineering (ICWE), which had an acceptance rate of 12% in 2004. ICWE is the most prestigious conference on Web Engineering.

  5. The CANS Infrastructure (with Xiaodong Fu and Vijay Karamcheti, 2000-2001)

    The current Internet has an inherent mismatch between the low-bandwidth, limited-resource characteristics of mobile devices and the high-bandwidth expectations of many content-rich services. Existing applications and services cope with this problem essentially by providing differentiated service for different networks and devices. For example, most popular news, e-mail, and stock trading services today present a different front-end for mobile users. Although adequate in some scenarios, this approach suffers from the limitation that mobile users are classified into a small number of classes and may not receive performance commensurate with the capabilities of the device or network they are using. More importantly, such an approach cannot adequately cope with dynamically changing environments where there is a large variation in available bandwidth (e.g., a user on a wireless LAN whose distance from an access point varies over time).

    To address this mismatch between clients and servers, we have proposed Composable Adaptive Network Services (CANS), an application-level infrastructure for customizing the data path between client applications and services, which focuses on three challenges: (1) efficient component composition, (2) support for legacy applications and services, and (3) support for distributed adaptation. Our approach relies on three components: (1) type-based specification of components and network resources, (2) an automatic path creation strategy, and (3) system support for low-overhead path reconfiguration.
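
    The combination of type-based component specification and automatic path creation can be sketched roughly as follows. This is only an illustration under stated assumptions, not the CANS algorithm: the Component tuple, the string type names, and the create_path function are hypothetical, and the sketch uses a plain breadth-first search that ignores network resources and performance, which a real path creation strategy would have to take into account.

```python
from collections import deque
from typing import NamedTuple


class Component(NamedTuple):
    """A data-path component described only by its types (hypothetical)."""
    name: str
    in_type: str   # type of the data stream the component accepts
    out_type: str  # type of the data stream the component produces


def create_path(components, source_type, target_type):
    """Return the shortest chain of component names that converts
    source_type into target_type, or None if no chain exists."""
    if source_type == target_type:
        return []  # the client already accepts what the service produces
    queue = deque([(source_type, [])])
    seen = {source_type}
    while queue:
        current, path = queue.popleft()
        for comp in components:
            # A component can follow only if its input type matches the
            # current stream type; skip stream types already reached.
            if comp.in_type != current or comp.out_type in seen:
                continue
            new_path = path + [comp.name]
            if comp.out_type == target_type:
                return new_path
            seen.add(comp.out_type)
            queue.append((comp.out_type, new_path))
    return None
```

    Given, for example, one hypothetical component that maps "encrypted/jpeg" to "image/jpeg" and another that maps "image/jpeg" to "image/jpeg-low", a request to deliver an "encrypted/jpeg" stream to a client that accepts only "image/jpeg-low" would chain the two components in that order.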

    Our first paper about CANS, entitled “CANS: Composable Adaptive Network Services Infrastructure,” was presented at the 3rd USENIX Symposium on Internet Technologies and Systems (USITS ’01), which CiteSeer ranks as the second-highest-impact publication venue among all computer science journals and conferences (see http://citeseer.ist.psu.edu/impact.html). This paper has been cited 116 times (as of December 2007).

  6. JIAJIA Software Distributed Shared Memory System (with Weiwu Hu and Zhimin Tang, 1997-2000)

    Software Distributed Shared Memory (DSM) is an ideal vehicle for parallel programming because it combines the programmability of shared memory systems with the scalability of distributed memory systems. However, the overhead of maintaining consistency in software and the high latency of sending messages make achieving high performance in software DSMs a challenging issue. My Ph.D. research focused on techniques for improving performance from two perspectives: (1) reducing the frequency and time of communication entailed by coherence protocols, and (2) reducing the software overhead of each message operation.

    By analyzing the disadvantages of snoopy and directory-based cache coherence protocols, we proposed a lock-based cache coherence protocol for scope consistency, and developed a software DSM system named JIAJIA based on this protocol. Based on a detailed analysis of the system overheads of software DSM systems, we proposed several techniques to reduce these overheads in home-based software DSMs. Since data distribution plays a very important role in home-based software DSM systems, we investigated a home migration mechanism to reduce remote data communication and, based on it, proposed a task migration scheme. Furthermore, because of the prevalence of heterogeneous computing environments, load scheduling and balancing have become critical issues for achieving high performance. As part of the JIAJIA work, we also proposed and evaluated an affinity-based self-scheduling scheme for load balancing in home-based software DSM systems.
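
    To make the intuition behind home migration concrete, the toy sketch below moves a page's home to a node that keeps writing to it remotely, so that later writes from that node become local. It is not the JIAJIA mechanism, whose details are not given here: the Page class, the per-node write counter, and the fixed threshold are all assumptions chosen for illustration.

```python
from collections import Counter


class Page:
    """A shared page with a home node (hypothetical bookkeeping)."""

    def __init__(self, home: int, threshold: int = 3):
        self.home = home
        self.threshold = threshold      # assumed number of remote writes that triggers migration
        self.remote_writes = Counter()  # writes per non-home node since the last migration

    def write(self, node: int) -> bool:
        """Record a write by `node`; return True if the home migrated to it."""
        if node == self.home:
            return False  # local write: no remote communication to save
        self.remote_writes[node] += 1
        if self.remote_writes[node] >= self.threshold:
            # The writer becomes the new home, so its future writes are local.
            self.home = node
            self.remote_writes.clear()
            return True
        return False
```

    With the default threshold, a page homed at node 0 that is written three times by node 1 migrates to node 1 on the third write. A real policy would also have to weigh the cost of the migration itself.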

    The first paper about JIAJIA, entitled “JIAJIA: An SVM System Based on A New Cache Coherence Protocol,” presented at HPCN 1999, has been cited 136 times as of December 2007. All JIAJIA-related papers together have been cited more than 250 times. The system has been downloaded and installed at more than 120 institutions around the world. The JIAJIA system has been listed on many websites, such as Wikipedia and that of the University of California at Irvine.