Event driven analysis. In other words: give me ALL the data!

A blog about how event driven data can make us all a little more sane

What is it? Why do we care?

Event driven analysis is a pretty hot topic these days. So what does it mean? In simple terms, it means modelling your data around a few core objects and the events that can happen to those objects.

For example: an online retailer may choose to track user data and user-based events such as “Viewed Page”, “Added to Shopping Cart”, “Purchased Item”, “Created Account”, etc.
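
To make that concrete, here is what one such event might look like as a plain record. This is a minimal sketch in Python; the field names (“event”, “user_id”, “properties”) are illustrative assumptions, not any particular vendor’s format:

```python
# A hypothetical "Added to Shopping Cart" event, modelled as a plain record.
# Field and value names are illustrative, not any specific vendor's schema.
event = {
    "event": "Added to Shopping Cart",
    "timestamp": "2017-03-14T09:21:07Z",
    "user_id": "u-1042",
    "properties": {
        # Properties can span multiple objects: the user, the product,
        # and the session all live side by side on the same event.
        "product_id": "p-88",
        "product_name": "Espresso Machine",
        "price": 249.99,
        "cart_size": 3,
        "referrer": "email-campaign-spring",
    },
}
```

Notice that details about the user, the product, and the session all ride along on one record; no joins were needed to assemble it.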

The benefits? Oh, the benefits! I wrote two full pages of benefits and had to scale it back so that I don’t come across as unhinged. I come from the traditional normalized database and business intelligence / dashboard world. It took me a while to fully comprehend what was happening with event driven systems, but I’m now convinced of their necessary place in an analytic stack. The benefits are numerous, including easy access to historical data, fast application adoption, and more.

For this blog I will focus on my two favorite things:

1. They free us up to collect the data we need before we need it

2. They make detailed analysis truly accessible to the end user


Question 1 - How does event driven data free us up to collect the data before we need it?

Event driven data collection frees us up to collect “all the things”. We can collect data on anything that interests us without first having to construct the full world order of the data we want to collect. In a normalized database schema, everything must have a defined relationship. Don’t even think of repeating data. This approach is exhausting, and the resulting schema is rarely fully accurate. Worse, it imposes rigid structure on a system that is meant to represent real world order, and real world order can be chaotic and fluid.

On the flip side, when defining an event, you can bring in any properties of that event that you desire, even properties spanning multiple objects. Even better, you can do this without having to re-evaluate your entire database schema.

What exactly do I mean by “re-evaluate your entire database schema”? Well, a normalized schema is essentially a perfect model of the objects included in your data driven world. Each new addition of data into this world needs to be evaluated for proper inclusion. So when you get new data, you need to think very hard about how to put it in the right box and set up all the proper relationships it may have. When you are done, you breathe a heavy sigh of relief. Whew! Problem solved. But now someone has even newer data and the process starts again. It is difficult to bring new things into a normalized schema. Conversely, with events you can bring in new data at any time because, inherently, that data is not special. It is a piece of an event, which is a catchall type of object for a large range of actions.
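
To illustrate, here is a minimal Python sketch, assuming events are stored as an append-only JSON-lines log (the file name and helper function are hypothetical). A brand-new event type with brand-new properties is just another line in the log:

```python
import json

# A minimal sketch, assuming events land in an append-only JSON-lines log.
# Every record shares the same generic shape, so a new event type with new
# properties needs no schema migration; it is just another line in the file.
def record_event(log_path, name, **properties):
    event = {"event": name, "properties": properties}
    with open(log_path, "a") as log:
        log.write(json.dumps(event) + "\n")

# An existing event type:
record_event("events.jsonl", "Viewed Page", url="/pricing")

# New data someone thought of today; no schema re-evaluation required:
record_event("events.jsonl", "Applied Coupon", code="SPRING10", discount=0.10)
```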

And what happens when it feels difficult to bring in new data? Simple: you will bring in less new data. All new data requires changes to the applications that consume this special new object. It will also require changes to the database schema so that the object lands in its perfect spot.

Now what happens when you bring in less data? Bottom line: you don’t have the data you need. You have the slightly-above-minimal amount of data that you thought you might need to do your job. But once you start using that data, you realize it’s not enough. This is when someone goes to a business intelligence worker and asks “how hard is it to go get this data?”. To which the answer hinges on your ability to procure a time machine. Generally, a suggestion like that isn’t taken too well.


Question 2 - How does event driven data make detailed analysis truly accessible to the end user?

Building on the above scenario, the next logical question is: “WHY do we need our BI folks to do the analysis for us in this world? We are the ones who need to use this data! We may not know what we are looking for.”

Easy. A normalized data structure demands that the analysis be specialized to our particular schema. So, unless we want to be stuck with a pre-canned application tailored to our specific schema, we need someone who knows BI/reporting/SQL and who can talk to ANY data structure.

How do we make analysis on our data more accessible to applications and humans?  Answer: Make the data less complex and less special.  Aha!  Enter event driven data! 

By having limited object types, data application tools can let the user explore this information in depth without knowing all of the possible data being stored in advance. If it is an event, the system knows how to handle it. Further, when the data is exposed in the end tools, it’s easier for users to wrap their minds around the type of analysis they can do. There are minimal objects to engage with, and therefore the barrier to get started is lower.
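
As a toy illustration, reusing the hypothetical JSON-lines log from the sketch above, an analysis tool can count events without knowing in advance which event types exist:

```python
import json
from collections import Counter

# Because every record has the same generic shape, this code needs no
# knowledge of which event types exist; it discovers them from the data.
def event_counts(log_path):
    counts = Counter()
    with open(log_path) as log:
        for line in log:
            counts[json.loads(line)["event"]] += 1
    return counts

print(event_counts("events.jsonl"))
# e.g. Counter({'Viewed Page': 1, 'Applied Coupon': 1})
```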


Ending Notes and Disclaimer

Just because I love event driven analysis does not mean that I have abandoned my dashboards and business intelligence ways. These efforts still have a very necessary place in the analytic stack.   We still need the system of record that is optimized for performance.  We still need the ability to make key changes to critical business data without having to perform mass updates across historical records.

Dashboards and normalized tables are not getting kicked out of the stack yet!

In closing, I believe that event driven data has found a solid home in our analytic stack. While I don’t imagine it will fully displace dashboards in the foreseeable future, I do believe it can provide us some much needed flexibility in our analytics driven world.

Written by Laura Ellis
