Some applications have complete control over the queries that they issue;
the queries are built into the applications. Other applications allow users
to construct queries by filling in fields on a form. Any time you let users
construct ad-hoc queries, you risk the possibility that the query a user constructs
will be one like the following:
SELECT * FROM ExtremelyHugeTable
ORDER BY unIndexedColumn
This statement has no WHERE clause. It will require a full table scan.
To make matters worse, Derby will then have to order the data. Most
likely, the user does not want to browse through all 100,000 rows, and does
not care whether the rows are all in order.
Do everything you can to avoid table scans and sorting of large results
(such as table scans).
Some things you can do to disallow such runaway queries:
- Use client-side checking to make sure some minimal fields are always filled
in. Eliminate or disallow queries that cannot use indexes and are not optimizable.
In other words, force an optimizable WHERE clause by making sure that the
columns on which an index is built are included in the WHERE clause of the
query. Reduce or disallow DISTINCT clauses (which often require sorting) on
large tables.
- For queries with large results, do not let the database do the ordering.
Retrieve data in chunks (provide a Next button to allow the user to retrieve
the next chunk, if desired), and order the data in the application.
- Do not use SELECT DISTINCT to populate lists; instead, maintain a separate
table of the unique items.