In addition, we are investigating the issues in indexing images based on attributes such as texture and color. We are examining how to build an image database system that supports searches on various attributes. In this context, we designed an optimal dynamic index structure for searching in multiple dimensions. Using the concepts of this structure, we plan to develop an efficient image-retrieval system for the Alexandria Digital Library by August 1997.
ACTUAL ACTIVITY
We have made the promotion-based index a stand-alone index. However,
this index is not integrated with the Alexandria testbed because the
current spatial datablade of Informix is performing quite well.
Instead, we explored how to reduce the performance bottlenecks due
to concurrent operations on a multi-dimensional index. We developed
new concurrency protocols that cater to the idiosyncrasies of
multi-dimensional index structures and achieve better throughput than
existing ones.
We then devised new techniques for indexing image attributes such as color and texture. We proposed efficient dimensionality-reduction techniques to improve the query performance of such image databases in dynamic environments. We also examined how to reduce the query time complexity of our proposed optimal structure by using data replication.
ACTUAL ACTIVITY
We built a prototype of the automated classification component of
Pharos, arguably the most difficult component to implement. We have
written up various parts of this work and published a subset of the
results.
ACTUAL ACTIVITY
Done
PLANNED ACTIVITY
Aid the Testbed Team in implementing the proposed approach,
in which an adaptive mechanism for creating materialized views
dynamically will be developed and integrated into the ADL
testbed system to improve query processing.
ACTUAL ACTIVITY
Implemented the proposed materialized views in the ADL web
prototype system. A translation module has been built to
analyze user queries and translate them into equivalent
queries over the materialized views.
The module was demonstrated during the site visit. Queries that
were previously rejected because of their unreasonable response
times can now be issued, and answers are returned within a few
seconds, or a few minutes at most. This supports the
possibility of making a fully populated ADL operational.
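The translation step described above can be illustrated with a minimal sketch. It assumes a query is just a set of joined base tables plus an opaque predicate, and that each view pre-joins a fixed set of tables while keeping every column the predicate needs. The table names, the view name, and the `Query`, `VIEWS`, and `translate` identifiers are hypothetical and are not taken from the ADL prototype.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    tables: frozenset  # base tables (or views) joined by the query
    predicate: str     # filter condition, kept opaque in this sketch


# Hypothetical catalog: set of base tables -> view that pre-joins them.
VIEWS = {
    frozenset({"holdings", "footprints"}): "mv_holdings_footprints",
}


def translate(query, views=VIEWS):
    """Replace the largest covered group of base tables with its view."""
    best = None
    for covered, name in views.items():
        if covered <= query.tables and (best is None or len(covered) > len(best[0])):
            best = (covered, name)
    if best is None:
        return query  # no applicable view: the query runs unchanged
    covered, name = best
    return Query((query.tables - covered) | {name}, query.predicate)
```

A real translator would also have to check that the view's join conditions and columns match the query; the sketch only shows the substitution itself.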
PLANNED ACTIVITY
On the more theoretical side, research on the two key issues of
the materialized view technique (materialized view design and
query translation) will be continued.
ACTUAL ACTIVITY
A novel technique for materialized view design was proposed
and published at WITS'97. An optimal view selection algorithm
was proposed for the outer join materialized view case, and a
near-optimal view selection algorithm for the natural join
materialized view case. A condition was found which guarantees
that the selection is within 63% of the ``optimal'' for the
natural join case.
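The report does not describe the selection algorithm itself. As an illustration only, the 63% figure coincides with 1 - 1/e, the classic guarantee for greedy selection when the benefit function is monotone and submodular. Assuming a condition of that kind, a greedy selector might look like the sketch below; `greedy_select`, `coverage_benefit`, and the `COVERS` data are hypothetical and are not the published algorithm.

```python
def greedy_select(candidates, benefit, k):
    """Pick up to k views, each time adding the largest marginal benefit."""
    chosen = []
    remaining = set(candidates)
    for _ in range(k):
        base = benefit(chosen)
        best, best_gain = None, 0
        for view in sorted(remaining):  # sorted for deterministic ties
            gain = benefit(chosen + [view]) - base
            if gain > best_gain:
                best, best_gain = view, gain
        if best is None:
            break  # no remaining view improves the benefit
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Hypothetical workload: which queries each candidate view can answer.
COVERS = {
    "v1": {"q1", "q2", "q3"},
    "v2": {"q3", "q4"},
    "v3": {"q4", "q5"},
}


def coverage_benefit(views):
    """Benefit = number of distinct queries answered by the chosen views."""
    answered = set()
    for view in views:
        answered |= COVERS[view]
    return len(answered)
```

With a budget of two views, the sketch first takes `v1` (three queries) and then `v3`, which adds two new queries where `v2` would add only one.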
PLANNED ACTIVITY
In a more general setting, query optimization methods that operate
outside of DBMSs will become increasingly interesting as database
applications grow more varied and DBMSs become more complex. The
study of materialized views as a means of increasing query
performance is expected to open opportunities for query optimization
in this context, for example as efficient ``glue'' for
multi-databases. These issues will be examined.
ACTUAL ACTIVITY
A data integration framework was developed for evaluating queries over
multiple data sources. Query optimization and the application of
materialized view techniques in this context are planned for future
study.
ACTUAL ACTIVITY
Research on tertiary storage scheduling algorithms continued;
however, the direction taken was that of validating the assumptions
made in the earlier work rather than looking for online solutions.
The team investigated the validity of the assumption made earlier
regarding minimizing the number of switches as a technique for
improving the schedule.
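To make the assumption under test concrete, the sketch below counts media switches for a request sequence and compares arrival order with an order that groups requests by medium. It illustrates only the quantity being minimized, not the team's scheduling algorithm; the `(medium, block)` request format and both function names are assumptions.

```python
def count_switches(schedule):
    """Number of times the mounted medium changes while serving the schedule."""
    switches = 0
    mounted = None
    for medium, _block in schedule:
        if medium != mounted:
            if mounted is not None:  # the initial mount is not a switch
                switches += 1
            mounted = medium
    return switches


def group_by_medium(requests):
    """Reorder requests so each medium is mounted once, in first-seen order."""
    blocks_by_medium = {}
    for medium, block in requests:
        blocks_by_medium.setdefault(medium, []).append(block)
    return [
        (medium, block)
        for medium, blocks in blocks_by_medium.items()
        for block in sorted(blocks)
    ]
```

Whether fewer switches actually yields a better schedule depends on seek and transfer costs within each medium, which is the kind of question a validation study would address.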
PLANNED ACTIVITY
Work on prefetching large 2-dimensional data will be
continued. Currently, finding an implementation of asynchronous I/O
that works satisfactorily has been a major problem. Earlier work done
on SUN machines was found to be invalid due to the status of the
implementation of asynchronous I/O on SUN machines. Further
investigation of the problem was done on SGI machines, but there seem
to be problems with the implementation of certain functions under IRIX 5.3.
Several avenues for evaluating the effectiveness of prefetching
are being considered, including working with raw devices, using
non-POSIX4 implementations of asynchronous I/O on SUN machines, and
evaluating the implementation under IRIX 6.2. Prefetching of data
blocks when user access patterns follow some connected path over a
large 2-dimensional image, such as could be expected from edge
detection algorithms, will also be studied.
ACTUAL ACTIVITY
This task was abandoned in favor of the work on declustering of
multi-dimensional data on multiple disks to improve the performance
of range and similarity queries through parallel I/O. This change
was made because the good results obtained for two-dimensional
range queries showed great promise for other data sets and
query patterns as well.
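The idea of declustering for parallel I/O can be sketched with the well-known disk modulo scheme, which assigns grid cell (i, j) to disk (i + j) mod M. The report does not say which declustering scheme was used, so this is an illustration under that assumption; `disk_modulo` and `range_query_cost` are hypothetical names.

```python
from collections import Counter


def disk_modulo(cell, num_disks):
    """Assign a grid cell to a disk by summing its coordinates modulo M."""
    return sum(cell) % num_disks


def range_query_cost(lo, hi, num_disks, assign=disk_modulo):
    """Largest number of cells any one disk reads for the inclusive range."""
    load = Counter()
    for i in range(lo[0], hi[0] + 1):
        for j in range(lo[1], hi[1] + 1):
            load[assign((i, j), num_disks)] += 1
    return max(load.values())
```

If all disks are read in parallel, the busiest disk determines the response time, so a good assignment keeps this maximum close to the number of cells divided by the number of disks.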
ACTUAL ACTIVITY
Work has proceeded as forecast in last year's research summary.
The work has evolved in the following ways: