Tuesday, September 17, 2013
Tableau "Add to Context" Filters
I like to review ancient Tableau features from time to time, just to make sure things haven't changed. A Tableau "Add to context" filter is one of these.
The Feature called "Add to Context" is a great way to look at a subset of data inside of Tableau. It works by attempting to create a TEMP table in the database (which may or may not always succeed, depending on the database and your permissions - we are assuming that TEMP tables succeed for this blog post). As such, this feature can also be considered "dangerous" if it is not used correctly. Two areas where it is used incorrectly are:
1. Returning all the records from the original join or data connection.
Consider an imaginary table of data which is 100 million rows. The proper use of "add to context" is to filter down to 800k rows (for example), perhaps for one segment, customer, line of service, or whatever dimension you want. The improper use of "add to context" is to select everything in the filter. This has the effect of creating a new, second 100 million row table. If the originating data model was a multiple table join, the resulting context table is a single, denormalized, fully indexed Tableau TEMP table. A nice denormalizing effect, but largely useless if you are simply recreating the original 100 million rows. Now imagine you have Tableau Server thrown into the mix: each time an end user interacts with this sheet or dashboard, a new 100 million row TEMP table will be created. Not good... the solution is do not do this, please!Don't select all the field values in a "add to context" filter. It defeats the purpose of using this feature in the first place.
2. Returning a very wide table result when you only need some of the columns.
Let's continue to use the 100 million row example. But this time, we will stipulate that it contains 160 dimensions and 65 measurements. 225 columns in total. The feature "add to context" will dutifully create a TEMP table with all of these columns, and indexes on each column. This could take a long time to generate, before you even get to the analysis itself! If you do not need them, you can hide fields prior to using "add to context". Once hidden, they will not be included in the TEMP table output. This can have a huge performance impact on your use of "add to context".
In an actual test against a 40 million row table, we ran a context filter on a very wide and a more narrow Tableau result set. The wide data connection took 47.57 seconds to A) create the TEMP table, B) index the columns and C) query the "number of records". You can see all of this in the Tableau desktop logs. By the way, it's worth pointing out how amazing this fact is, that Tableau generates all of the correct SQL code for this. However, the result is huge and wide. This data connection had 50 dimensions and 32 measurements.
We ran the same exact "add to context" test against a second data connection which only had 11 dimensions and 4 measurements. This time the query took 13.98 seconds. This was a 340% reduction in query time!
Be careful when using this feature. It can be a huge time saver - as well as a huge pain...