![]() |
AdaptPD Online Physical Design in Proxy Caches |
| Introduction |
Applications, such as proxy caches and content distribution networks,
serve a continuous stream of queries from tens of thousands of users in
which a representative workload may not be available. As such, current
research emphasizes the need for design tools that are always on and
can continuously adapt the physical design to changes in the workload.
We study an online approach to vertical partitioning that detects incremental changes in the workload and adapts the physical design automatically. Our target application is physical design of the Bypass cache, a proxy database cache for the SkyQuery federation of Astronomy databases. |
||
|
Technical Report Architectural Components Problem Description Source Code Related Projects Contact |
|||
| Technical Report |
Abstract
Performance of proxy caches for database federations that serve a large number of users is crucially dependent on its physical design. Current techniques, automated or otherwise, for physical design depends on the identification of a representative workload and specification of a database. In proxy caches, however, such techniques are inadequate since workload characteristics and database contents change over time. This is remarkably shown at the proxy cache of SDSS, an Astronomy database, which receives a continuously evolving workload and in which cached objects are continuously loaded and evicted. We present novel techniques for automated physical design that adapt with the workload and balance the performance benefits of physical design decisions with the cost of implementing these decisions. These include online algorithms that respond effectively to drastic changes in the workload and efficient cost estimation modules that make these algorithms practical for deployment in real world caches. Our experiments within the proxy cache of the SDSS database demonstrate significant improvement in cache performance over offline techniques. Download Report PDF PostScript Return to top |
||
| Architectural Components |
|
||
| Problem Description |
![]()
Return to top |
||
| Source Code |
Workload Parser
We are currently building a customized SQL parser for AdaptPD. The grammer was constructed using ANTLR 2.7.5 (http://www.antlr.org) and the driver was developed under Microsoft Visual Studio .Net in C# (Note that the parser is specifically tuned for query logs from the Sloan Digital Sky Survey (SDSS) database, so it may have trouble parsing arbitrary workloads). Download Source Source Readme (An older version of the SQL parser, developed using libraries from the Open SkyQuery project, is also available. The major benefit being that it can output in XML. Download Source Readme) Sloan Digital Sky Survey (SDSS) Workload We are currently using workloads extracted from query logs on the SDSS database. A representative query trace from the DR4 version of the database are included. Download Trace DR4 (1.4 million queries) Open SkyQuery portal A modified version of the Open SkyQuery portal with a semi-functional proxy cache that operates on SDSS at table granularity is available. The caching module is included under "CacheTools" and designed as a proof-of-concept. It demonstrates how bypass decsions are made in the SkyQuery framework but does not actually load and evict multi-gigabyte tables. The distribution is developed as a .Net Web Application and require Internet Information Services (IIS), SQL Server, and .Net framework to function. Download Portal Source Readme Return to top |
||
| Related Projects |
Preliminary Works
Self Managing Database workshop (SMDB'08) Download Paper PDF PostScript New England Database Day conference (NEDBDay'08). Download Poster PDF PostScript Database Systems for Advanced Applications conference (DASFAA'07). Download Paper PDF PostScript SkyQuery SkyQuery is a web-based database application for the World Wide Telescope (WWT). In SkyQuery, clients submit queries through an applet-based query interface, which arrives at the Open SkyQuery portal, the mediation middle-ware of WWT. The mediator then divides requests into sub-queries for each site in the database federation. For details on SkyQuery, please refer to the following links: SkyQuery Project: http://www.skyquery.net/ Open SkyQuery Portal: http://www.openskyquery.org/Sky/skysite/ Bypass Caching The bypass-yield cache framework into the WWT for making query bypass decisions. A bypass-yield cache is altruistic by nature and designed to minimize network traffic in a database federation. For details on bypass caching, please refer to the following paper: Download Paper PDF PostScript Return to top |
||
| Contacts |
Return to top |
||