Kim Berg Hansen: OK Oracle, Tell Me *Approximately* How Many Unique Visitors We Had Last Weekrozwiń opis
Imagine doing a SELECT and the answer coming back is “approximately 42, give or take a little.” Normally you wouldn’t want that to happen, you expect a SELECT to give you an accurate answer every time. But building a sales dashboard for your boss, for instance, could be a choice between showing him “we had approximately 4200 unique visitors from France last week” in a quarter of a second, versus showing him “we had exactly 4189 unique visitors from France last week” in 5 seconds. Most likely your boss would be happier with a fast but “good enough” answer.
Oracle supports several Approximate Query Processing Functions, starting with APPROX_COUNT_DISTINCT in version 12.1, adding APPROX_MEDIAN and APPROX_PERCENTILE as well as the capability of doing approximate aggregations on multiple levels in version 12.2, followed by APPROX_RANK, APPROX_COUNT and APPROX_SUM in version 18.1. These functions allow you to write SQL that gets approximate, but fast, answers.
But what if you already have a dashboard application built using good old trusted SQL giving accurate answers, but it is really slightly unsatisfactory as each click takes several seconds to complete? Then you can do a FAST=TRUE - if you are prepared to accept approximate answers. You can do an ALTER SESSION that enables the optimizer to rewrite for example your COUNT(DISTINCT column) to an APPROX_COUNT_DISTINCT(column) automatically. Then your application will give approximate, but fast, answers without needing change.
This presentation discusses the different approximate functions, the approximate aggregations and the ALTER SESSION method of speeding the queries - showing examples of syntax and use cases.