Tutorials


One full-day and two half-day short courses will be held on 2 May 2001 at the Kowloon Shangri-La Hotel. Registered delegates who pay the full-day fee will receive a complete set of course notes for all the tutorials and may attend whichever courses they like. Those who register for a half-day course will receive only the notes for the courses they registered for.

 

Full-day Course
   "Recent Developments on Advanced Audio Processing" (Download)
  Speaker: Dr. C.C. Jay Kuo, Signal and Image Processing Institute, University of Southern California

Half-day Courses
  Morning Session: "Pervasive and Mobile Commerce" (Download)
    Speaker: Dr. Chung-Sheng Li, IBM T.J. Watson Research Center
  Afternoon Session: "Quality of Service (QoS) Support in Internet and Web Servers" (Download)
    Speaker: Dr. Prasant Mohapatra, Dept. of Computer Science & Engg., Michigan State University

Registration Fees:
           Full-day (including lunch at the Kowloon Shangri-La)   Half-day (not including lunch)
  Regular  HK$1,200 / US$160 / RMB 1,280                          HK$600 / US$80 / RMB 640
  Student  HK$750 / US$100 / RMB 800                              HK$375 / US$50 / RMB 400
Please fill in Section D of the registration form (Word format).

Full-day Course
Title: Recent Developments on Advanced Audio Processing
Speaker: Dr. C.C. Jay Kuo, Signal and Image Processing Institute, University of Southern California
Tutorial 1:

High Fidelity Multichannel Audio Coding

As DVD and home theater systems become more popular, high-fidelity multichannel (5.1-channel or 10.2-channel) audio systems are being well received in the market. Compared with traditional mono or stereo audio, multichannel audio requires a much more efficient coding scheme for storage and transmission. This talk will present two new multichannel audio coding techniques: (i) the use of the Karhunen-Loeve Transform (KLT) to decorrelate the signals of multiple channels; and (ii) the use of bit-layer coding to achieve a fully embedded bitstream. We exploit the inter-channel redundancy inherent in most multichannel audio sources and prioritize the transmission of the transformed channels. Experimental results show that, compared with the MPEG AAC (Advanced Audio Coding) algorithm, the proposed MAACKL (Modified Advanced Audio Coding with KLT) algorithm not only reconstructs multichannel audio material with better quality at the typical low bit rate of 64 kbits/sec/ch, but also achieves quality scalability within a single multichannel audio bitstream. The embedded multichannel audio coding system inherits the efficient inter-channel decorrelation block of the MAACKL algorithm and adds a progressive quantization coding block and a context-based QM noiseless coding block. The final bitstream generated by this embedded system is fully progressive, so encoding or decoding can be terminated at any arbitrary point. Experimental results show that, compared with the MPEG AAC algorithm, multichannel audio reconstructed with the proposed algorithm performs better at various bit rates, both in objective MNR (Mask-to-Noise Ratio) measurements and in subjective listening tests.
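As a rough illustration of the inter-channel decorrelation idea only (this is not the MAACKL algorithm), the Python sketch below applies a KLT to the simplest case of two channels, where the transform reduces to a plane rotation. The function name `klt_rotate_stereo` and the toy input signals are assumptions made for this example.

```python
import math

def klt_rotate_stereo(left, right):
    """Decorrelate two equal-length channels with a 2x2 KLT (a plane rotation)."""
    n = len(left)
    mean_l = sum(left) / n
    mean_r = sum(right) / n
    l0 = [x - mean_l for x in left]
    r0 = [x - mean_r for x in right]
    # Entries of the 2x2 inter-channel covariance matrix.
    c_ll = sum(a * a for a in l0) / n
    c_rr = sum(b * b for b in r0) / n
    c_lr = sum(a * b for a, b in zip(l0, r0)) / n
    # Rotation angle that zeroes the off-diagonal covariance term.
    theta = 0.5 * math.atan2(2.0 * c_lr, c_ll - c_rr)
    c, s = math.cos(theta), math.sin(theta)
    principal = [c * a + s * b for a, b in zip(l0, r0)]
    residual = [-s * a + c * b for a, b in zip(l0, r0)]
    return principal, residual, theta

# Toy input (hypothetical): the right channel is mostly a scaled copy of the left.
left = [math.sin(0.1 * i) for i in range(200)]
right = [0.8 * x + 0.1 * math.cos(0.37 * i) for i, x in enumerate(left)]
principal, residual, theta = klt_rotate_stereo(left, right)
```

After the rotation, most of the energy sits in `principal` and the two outputs are uncorrelated, which is what makes it natural to give the first transformed channel priority in transmission.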

Tutorial 2:

Internet Audio Streaming with Adaptive Time-Scale Modification

Recent research efforts on Internet audio streaming have focused on error control and delay concealment in the presence of delay jitter and packet loss. Given a fixed receiver buffer and a tight end-to-end delay bound, some packets sent to the receiver may still be discarded, since the receiver buffer is sized to accommodate the average end-to-end delay. A delay spike happens when several consecutive packets are delayed and arrive at the receiver almost simultaneously, which occurs when audio packets pile up behind a large Internet load. In this work, we extend the silence interval-based adaptive playout algorithm by exploiting time-scale modification, focusing on fast adaptation to delay jitter and minimization of packet drops at the receiver while maintaining a low buffering delay. Time-scale modification (i.e. contraction and/or expansion) modifies the duration of an acoustic signal without changing its acoustic attributes, such as pitch and timbre. By applying a varying degree of stretching to each packet (although it is important to maintain the stretching factor within a talk-spurt), every packet can contribute to adapting to network delay jitter and spikes as well as packet loss. For time-scale modification, the synchronized overlap-and-add (SOLA) scheme is adopted in our approach and modified into a fast, packet-oriented version. We apply time-scale modification based on the estimated packet delay under the framework of adaptive playout. The time-scale modification factor is estimated for each packet depending on the delay constraint, the delay statistics, and the number of late-arriving packets. This estimated stretching factor is then limited by upper and lower bounds that are calculated from the speech content. It is shown that the proposed playout mechanism adapts to the observed delay jitter and/or packet loss up to a certain degree.
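For readers unfamiliar with SOLA, the following Python sketch shows a textbook form of synchronized overlap-and-add, not the fast packet-oriented variant described in the talk. The function name `sola_stretch`, the frame and hop sizes, and the supported factor range are assumptions chosen for this example.

```python
import math

def sola_stretch(signal, factor, frame=160, analysis_hop=80, search=20):
    """Change the duration of `signal` by roughly `factor` using basic SOLA."""
    synthesis_hop = int(round(analysis_hop * factor))
    # This sketch only supports factors for which consecutive frames always overlap.
    if not (2 * search < synthesis_hop and synthesis_hop + 2 * search < frame):
        raise ValueError("stretch factor outside the range this sketch supports")
    out = list(signal[:frame])
    m = 1
    while m * analysis_hop + frame <= len(signal):
        seg = signal[m * analysis_hop : m * analysis_hop + frame]
        nominal = m * synthesis_hop
        best_start, best_score = nominal, float("-inf")
        # Search around the nominal position for the best-aligned overlap.
        for k in range(-search, search + 1):
            start = nominal + k
            overlap = min(len(out) - start, frame)
            if start < 0 or overlap <= 0:
                continue
            num = sum(out[start + i] * seg[i] for i in range(overlap))
            e_out = sum(out[start + i] ** 2 for i in range(overlap))
            e_seg = sum(seg[i] ** 2 for i in range(overlap))
            den = math.sqrt(e_out * e_seg)
            score = num / den if den > 0 else 0.0
            if score > best_score:
                best_start, best_score = start, score
        overlap = min(len(out) - best_start, frame)
        # Linear cross-fade over the overlap, then append the rest of the frame.
        for i in range(overlap):
            w = (i + 1) / (overlap + 1)
            out[best_start + i] = (1 - w) * out[best_start + i] + w * seg[i]
        out.extend(seg[overlap:])
        m += 1
    return out

tone = [math.sin(2 * math.pi * i / 40) for i in range(800)]
expanded = sola_stretch(tone, 1.25)    # longer playout, e.g. to ride out a delay spike
contracted = sola_stretch(tone, 0.75)  # shorter playout, e.g. to drain a full buffer
```

In an adaptive playout setting, `factor` would be the per-packet stretching estimate after it has been clipped to its bounds; the `ValueError` here plays the role of those bounds for the sketch.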

Tutorial 3:

Encryption, Authentication and IP Protection of Audio/Speech Data

The goal of audiovisual data protection should be to protect the content of the data, not the binary bitstream itself. Under this principle, faster algorithms can be developed by selectively encrypting only some parts of the bitstream. We propose a selective encryption scheme for ITU G.723.1 speech coding. For systems that are not suitable for selective encryption, another type of fast encryption algorithm is needed. A novel audiovisual data encryption scheme based on multiple entropy coders will be presented. Modern digital media editing and processing technology allows high-quality forgeries to be created at relatively low cost. Judging the authenticity of audiovisual data by human perception alone is no longer enough. Computer-aided methods are increasingly needed for fail-proof message authentication. A speech data integrity algorithm is proposed to protect the integrity of the speech content instead of the bitstream itself. Speech features relevant to the semantic meaning are extracted, encrypted, and attached as header information. The receiver decrypts the feature values and compares them to features extracted from the received data. A low-complexity digital audio watermarking scheme is proposed as an effective way to deter users from misusing or illegally distributing audio data. We stress the importance of the synchronization attack caused by casual audio editing or malicious random cropping, which is a low-cost yet effective attack on previously developed watermarking algorithms. The proposed scheme is based on audio content analysis with a wavelet filterbank, while the watermark is embedded in the Fourier transform domain. A blind watermark detection technique is developed to identify the embedded watermark under various types of attacks.
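The content-level integrity check can be sketched in Python as follows. This is only an illustration under loud assumptions: quantized per-frame RMS energy stands in for the semantic speech features, and an HMAC tag from the standard library stands in for the encryption of the feature header. All function names and parameters are hypothetical.

```python
import hashlib
import hmac
import json
import math

def coarse_features(samples, frame=160, step=0.05):
    """Quantized per-frame RMS energy (a stand-in for semantic speech features)."""
    feats = []
    for i in range(0, len(samples) - frame + 1, frame):
        chunk = samples[i : i + frame]
        rms = math.sqrt(sum(x * x for x in chunk) / frame)
        feats.append(round(rms / step))
    return feats

def make_header(samples, key):
    """Sender side: extract features and protect them with a keyed tag."""
    feats = coarse_features(samples)
    payload = json.dumps(feats).encode()
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"features": feats, "tag": tag}

def verify(samples, header, key, tolerance=1):
    """Receiver side: check the tag, then compare features of the received audio."""
    payload = json.dumps(header["features"]).encode()
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, header["tag"]):
        return False
    received = coarse_features(samples)
    if len(received) != len(header["features"]):
        return False
    return all(abs(a - b) <= tolerance for a, b in zip(received, header["features"]))

key = b"shared-secret"  # hypothetical shared key
speech = [0.5 * math.sin(0.05 * i) for i in range(1600)]
header = make_header(speech, key)
slightly_changed = [1.01 * x for x in speech]        # harmless gain change
tampered = speech[:480] + [0.0] * 480 + speech[960:]  # a stretch replaced by silence
```

Because the comparison is made on features rather than on bits, `verify` accepts `slightly_changed` but rejects `tampered`, which is the behavior a bitstream hash could not provide.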

Tutorial 4:

Technology for 3D Immersive Audio Synthesis

Recently, there has been a proliferation of 3D (or immersive) audio technologies intended for desktop computers. Many sound cards, multimedia speakers, video games, audio software packages, and CD-ROMs are marketed as having some sort of 3D capability. In addition, a new technology called acoustic environment modeling has emerged. It combines basic 3D technology with reverberation and other effects in order to simulate natural acoustic scenes. A 3D audio system has the ability to position sounds all around a listener. The sounds are actually created by loudspeakers (or headphones), but the listener's perception is that they come from arbitrary points in space. This tutorial will address how 3D audio systems work. In particular, techniques such as HRTF measurements, binaural synthesis, crosstalk cancellation, reverberation algorithms, and head-tracking for 3D audio using loudspeakers will be described in detail.
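At its core, binaural synthesis filters a mono source with a pair of head-related impulse responses (the time-domain form of the HRTFs), one per ear. The Python sketch below shows that step only; the two short impulse responses are made-up toy values, not measured HRTF data, and the function names are assumptions.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, x in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += x * h
    return out

def binaural_synthesis(mono, hrir_left, hrir_right):
    """Return (left_ear, right_ear) signals for one source position."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)

# Toy impulse responses for a source on the listener's left (hypothetical values):
# the right ear hears the sound later and quieter than the left ear.
hrir_left = [1.0, 0.0, 0.0, 0.0]
hrir_right = [0.0, 0.0, 0.0, 0.6]
mono = [1.0, 0.5, -0.25]
left_ear, right_ear = binaural_synthesis(mono, hrir_left, hrir_right)
```

The interaural delay and level difference encoded in the two responses are what the listener interprets as direction; measured HRTFs add the direction-dependent spectral shaping of the head and outer ear.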


Half-day Course
Morning Session: Pervasive and Mobile Commerce
Speaker: Dr. Chung-Sheng Li, IBM T.J. Watson Research Center
Synopsis: Owing to continuous performance improvements and cost reductions in technology and networking infrastructure, there has been explosive growth of interest in developing Internet-enabled pervasive devices, which include PDAs (such as Palm, Windows CE, and Psion devices), WAP phones, e-books, and various Internet appliances. These devices have just begun to trickle into our daily lives. An IDT study predicts that there will be around 80 million PDA devices and 900 million WAP phones in the world in just another two to three years. Consequently, it is very likely that these devices will become one of the dominant Internet access mechanisms and supplement the PC for Internet access within a few years. As opposed to PCs, these pervasive devices vary widely in screen size, computing power, and network capabilities. Devices such as Palm and Windows CE devices are in general much more mobile than a notebook. Unlike the traditional Internet access model, in which a direct connection has to be established, the connectivity of many of these devices to the Internet is occasional at best. As a result, these devices must be able to support serious offline access to Internet content. Many of these devices, including many WAP phones and wireless-enabled Palms and Pocket PCs, will have location awareness using GPS, wireless cell location, or a combination of both. Many may be attached to a home appliance such as a refrigerator, a microwave oven, or even a toilet. Since 1995, we have witnessed the emergence of business-to-consumer (B2C) e-commerce, such as Amazon. The pace of developing business-to-business (B2B) e-commerce applications, especially the establishment of electronic marketplaces, has accelerated since 1998. It has become apparent that pervasive devices will play an important role in e-commerce applications. 
It is also expected that Business-to-Employee (B2E), Peer-to-Peer (P2P, such as Napster), and Government-to-Citizen (G2C) commerce will pick up steam in the near future. We have also begun to see many traditionally Internet-only stores, such as Gateway and E*Trade, open physical stores, while many brick-and-mortar stores have had great success with their web presence. This multi-channel business model is certainly going to become the norm rather than the exception. The convergence of pervasive computing and electronic commerce opens up many new challenges and opportunities for the research community in data management. In particular, we need to address the challenges arising from the evolution from a two-tier client-server model to a multi-tier computational model, an environment in which the application server might roam at the edge of the network in order to provide better service to pervasive devices, the need to anticipate location- and context-dependent queries, and the capability of delivering device- and bandwidth-neutral data and media content. Ultimately, the challenge is to blend the real world and the virtual world seamlessly, so that events in the real world can be translated into queries in the virtual world. The purpose of this tutorial is to investigate the opportunities and challenges ahead for location-aware pervasive commerce. In particular, we will give an overview of current and near-term technology trends in the wireless area (WAP, 3G, etc.), location services (GPS, TDOA, etc.), and Bluetooth. We will also give an overview of current and future e-commerce frameworks and business models, including B2C, B2B, P2P, G2C, and B2E. The infrastructure generalization needed to enable pervasive commerce is then investigated. A number of case studies of pervasive commerce scenarios provide a backdrop for this fast-evolving direction.
Outline: 1 Introduction
2 Preliminaries
3 Emerging E-commerce frameworks
4 Infrastructure for supporting Pervasive Commerce
5 Case studies: Using pervasive & location technologies to enable pervasive commerce
6 Summary and Looking forward
Biography: Chung-Sheng Li received his B.S.E.E. degree from National Taiwan University, Taiwan, R.O.C. in 1984, and the M.S. and Ph.D. degrees in electrical engineering and computer science from the University of California, Berkeley in 1989 and 1991, respectively. He joined the computer science division of the IBM T. J. Watson Research Center as a research staff member in September 1991, managed the Image Information System Department from 1996 to 1999, and has held the senior manager position for the E-commerce and Data Management Department since June 2000. His research interests include (1) broadband applications, which include digital libraries, knowledge discovery, and data mining; (2) broadband networks and switching, which include all-optical networks, storage area networks, and fiber channel; and (3) broadband technologies, which include optical chip interconnects, optoelectronics, and high-speed analog/digital VLSI circuit design. He has co-initiated several research activities at IBM on fast tunable receivers for all-optical networks and on content-based retrieval in the compressed domain for large image/video databases. He is currently the principal investigator of a satellite image database project funded by NASA. Dr. Li received an Outstanding Innovation Award from IBM in 2000 for his leadership and major contribution to the IBM/NASA digital library project, a Research Division award from IBM in 1995 for his major contribution to the tunable receiver design for WDMA, and numerous invention and patent application awards. He is currently an associate editor for the IEEE Transactions on Multimedia and the Journal of Computer Vision and Image Understanding, and the technical editor for the IEEE Communications Magazine. He has authored or coauthored more than 120 journal and conference papers and received one of the best paper awards from the IEEE International Conference on Computer Design in 1992. 
He is a senior member of the IEEE Circuits and Systems Society, the Lasers and Electro-Optics Society, the Communications Society, and the Computer Society.

Half-day Course
Afternoon Session: Quality of Service (QoS) Support in Internet and Web Servers
Speaker: Dr. Prasant Mohapatra, Dept. of Computer Science & Engg., Michigan State University
Synopsis: The increasing volume and evolving types of Internet applications demand enhanced services from the Internet infrastructure, in terms of both performance and quality of service (QoS). The current best-effort service model of the Internet and of web servers is not suitable for fast-growing applications such as continuous media, e-commerce, and several other business services. To provide better service to these important and expanding classes of applications, the Internet infrastructure must provide service differentiation. The Internet infrastructure includes not only the network components but also the web servers (including proxy servers, application servers, etc.). This tutorial targets QoS issues at both the network level and the server level. The differentiated services (DiffServ) model proposed by the Internet Engineering Task Force (IETF) has received wide acceptance in the research community and is being actively considered for possible implementation in the next-generation Internet. Unlike integrated services, DiffServ does not require end-to-end resource reservation or any state maintenance at the core routers of the Internet domains. Rather than a per-flow model, DiffServ routes packets based on the concept of per-hop behavior (PHB), in which packets are marked at the edge routers and are routed by the core routers based on the markings. The markings relate to the QoS requirements. Both the markings and the PHBs are handled on an aggregated basis. In addition to providing service differentiation in the Internet, the DiffServ architecture is scalable, feasible, and economical. We will study in detail the various issues involved in DiffServ, its basic support requirements, its characteristics, and several other research and implementation aspects. Two different approaches for DiffServ, expedited forwarding and assured forwarding, will be analyzed. 
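The division of labor between edge and core routers can be sketched in a few lines of Python. The sketch is deliberately simplified: the edge policy table is hypothetical, and the core router uses strict priority between classes, whereas real PHB implementations also involve rate limits and drop precedence. The DSCP values are the standard codepoints for expedited forwarding (46), AF11 (10), and best effort (0).

```python
import heapq
from itertools import count

DSCP_EF = 46    # expedited forwarding
DSCP_AF11 = 10  # assured forwarding, class 1, low drop precedence
DSCP_BE = 0     # default best effort

# Hypothetical edge policy mapping an application type to a marking.
EDGE_POLICY = {"voice": DSCP_EF, "commerce": DSCP_AF11}

def mark_at_edge(packet):
    """Edge router: classify the packet once and write the marking into it."""
    marked = dict(packet)
    marked["dscp"] = EDGE_POLICY.get(packet["app"], DSCP_BE)
    return marked

class CoreRouter:
    """Core router: keeps no per-flow state and looks only at the marking."""
    PRIORITY = {DSCP_EF: 0, DSCP_AF11: 1, DSCP_BE: 2}

    def __init__(self):
        self._queue = []
        self._seq = count()  # preserves arrival order within a class

    def enqueue(self, packet):
        prio = self.PRIORITY.get(packet["dscp"], 2)
        heapq.heappush(self._queue, (prio, next(self._seq), packet))

    def dequeue(self):
        return heapq.heappop(self._queue)[2] if self._queue else None

router = CoreRouter()
for i, app in enumerate(["web", "voice", "commerce", "voice"]):
    router.enqueue(mark_at_edge({"id": i, "app": app}))
forwarded = [router.dequeue()["id"] for _ in range(4)]
```

Here the two voice packets leave first, then the commerce packet, then the web packet, even though the web packet arrived first; the core router achieves this with one small table rather than per-flow reservations.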
We will also discuss other approaches for providing DiffServ, such as relative differentiation and QoS-guaranteed DiffServ. In addition, we will discuss the role of TCP in supporting differentiated services. The goals of the DiffServ architecture may not be met if it is implemented only at the network level. To provide end-to-end QoS, Internet servers must also be capable of providing differentiated services. Unfortunately, research on server-level service differentiation has not kept pace with network-level service differentiation. Current-generation Internet servers provide service on a first-come, first-served basis, which is inadequate for QoS-aware applications. We will propose and discuss in detail service differentiating Internet servers (SDIS). Resource management is the key issue in providing efficient service differentiation at the server level. Thus, we will analyze scheduling, admission control, and other implementation details of SDIS. The capacity planning of Internet servers is based on average workload characteristics. However, Internet workload is highly nondeterministic; the maximum bandwidth or computation requirements may exceed the corresponding average values by several orders of magnitude. Thus, overload control is a critical issue in managing server loads. We will explore the issues involved in implementing efficient overload control techniques. In this tutorial we will present the state of the art on the proposed topic and introduce novel avenues for research and development. Future work on important issues such as multicasting and security will also be discussed.
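One simple form of server-level admission control with service differentiation is a per-class occupancy limit: lower classes are turned away first as the server fills up. The Python sketch below illustrates that idea only; it is not the SDIS design, and the class names, limits, and `AdmissionController` interface are assumptions made for this example.

```python
class AdmissionController:
    """Admit a request only while occupancy is below its class's limit."""

    def __init__(self, limits):
        # limits: service class -> number of busy slots at which that
        # class stops being admitted (hypothetical values below).
        self.limits = limits
        self.active = 0

    def try_admit(self, service_class):
        if self.active < self.limits[service_class]:
            self.active += 1
            return True
        return False  # overload: reject instead of queueing indefinitely

    def release(self):
        if self.active > 0:
            self.active -= 1

# A server with 10 slots: basic requests are refused once 6 slots are busy,
# leaving headroom for premium requests during load peaks.
controller = AdmissionController({"premium": 10, "basic": 6})
basic_results = [controller.try_admit("basic") for _ in range(7)]
premium_results = [controller.try_admit("premium") for _ in range(5)]
```

With these limits the seventh basic request is rejected while four premium requests are still admitted, so a burst of low-priority traffic cannot consume the whole server.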
Outline: 1 Introduction & Motivation of Pervasive Commerce
2 Emerging pervasive technologies
3 Internet QoS - An Overview
4 Differentiated Services in the Internet
5 Support for Service Differentiation in the Web Servers
6 Ongoing and Future Research
7 Concluding Remarks
Biography: Prasant Mohapatra received his Ph.D. in computer engineering from the Pennsylvania State University in 1993. He was an assistant professor and then an associate professor in the Department of Electrical and Computer Engineering at Iowa State University from 1993 to 1999. Since then he has been an associate professor in the Department of Computer Science and Engineering at Michigan State University. During the summers of 1998 and 1999, he worked at the Panasonic Information Networking and Technologies Laboratory (PINTL) and at the Server Architecture Laboratory of Intel Corporation, respectively. Dr. Mohapatra has published extensively in various international journals and conferences, and has two patents pending in the internetworking area. He has been an invited speaker at several universities and other organizations. He has taught several advanced courses in computer networks, architecture, performance evaluation, and multimedia systems. Dr. Mohapatra has graduated three Ph.D. students and about fifteen Masters students, and is currently guiding about five Ph.D. and four Masters students. His research has been funded by, and carried out in collaboration with, the National Science Foundation, EMC Corporation, Panasonic Technologies, Rockwell International, and Intel Corporation. Dr. Mohapatra is a senior member of the IEEE and a member of the ACM. He is currently on the editorial board of the IEEE Transactions on Computers. He has been on the program committees of several international conferences. He was the Program Chair of the workshop on Performance and Architecture of Web Servers (PAWS) held in conjunction with SIGMETRICS-2000, and is the Vice-Chair for ICPP-2001.
