Magenic Technologies Community Blog

Aaron's Technology Musings

Who let this guy on the podium?

On Business Intelligence and F#

It is high time that Business Intelligence get the benefits of the language "Cambrian Explosion" and the agile revolution.  Think about BI for a second.  Most of the talk around BI is oriented around tools - a stack that ties together presentation, storage, and logic, all in the name of avoiding dealing with pesky programmers, to the point that "requires no writing code" becomes a feature point.  How did we get here?  And how do we get out?

In the olden days, if you wanted a report, leaving aside BI for the moment... you had to ask your IT department for the report.  You were on their schedule, and more often than not, the backlog was very long.  Leaving aside why this was the case (e.g. budget shortages, lack of IT/business alignment, etc.) - it was.  This begat two primary developments: the shadow IT department, and the market for tools that empowered the business to at least try to generate their own reports independent of IT.

Now, in the intervening years, these two developments have not really stopped at all.  There is still a ton of shadow IT, and a ton of tools that purport to help you generate business intelligence by using an integrated stack of tools that, in theory, allow BI to happen without programmers.  The question is... is this a good thing?

I would say no.  BI tools, more often than not, tie you to not only a platform, but frequently, a specific product.  You can't take your BI developed in MicroStrategy and run it in Cognos, at least not very easily.  And it makes sense why - each of these tools competes on the basis of capabilities, so there is no motivation to port the capabilities of one BI product over to another.  And because there is no obvious short term economic justification from the tool vendors' point of view, it simply doesn't happen.

Of course, the medium to long term economic justification for tool vendors for this is very good indeed.  By creating an ecosystem of BI that allows for greater innovation and better solutions, BI will receive much greater investment.  The savvy players who take advantage of this will do really, really well, just like Microsoft prospered by having an open PC platform and Google prospered by having an open internet platform.  What has to happen, however, is someone has to move first, and given the nature of the space - big corporate buyers - it has to be one of the big players to do it with any kind of credibility.

That said, it does not help that there has been little standards innovation in the world of SQL.  Not to say that it doesn't happen, but let's put it this way - nobody is proposing SQL as a new .NET language like they do for F#, Ruby, or even Boo.  SQL is just now standardizing how objects work... and worse yet, the language continues to get balkanized - especially in BI land where extensions for doing cubes and other specialized functionality tend to differ from vendor to vendor.

So how do we untie this Gordian knot and get to a place where BI is portable, testable, and exists in a manner that allows diversity in authoring tools, persistence mechanisms, and presentation mechanisms?  I humbly submit that F# should be the language of BI.

Why F#?  Well, functional programming in general is oriented towards the folding, summarization, reduction, and calculation of sets of information - that is - data.  SQL is mostly a functional, declarative language anyway, so moving to F# as the lingua franca of data should be a no-brainer.  Imagine a world where BI is:

* Persistence Ignorant rather than Persistence Obsessed

* Portable from tool to tool - so long as it can parse F#

* Authored in tools that let business users use a GUI to write F# constructs rather than balkanized SQL constructs

* Able to draw on the benefits of a modern functional language (ASTs, automatic generalization, massive parallelization, etc.) as tools that are finally easily available to BI

* Allowed to have the benefits that the agile world has brought us (testability, etc.)
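
To make the "folding, summarization, reduction" point a little more concrete, here is a minimal sketch of a fold over some sales rows.  It is written in Python purely for brevity, and everything in it is an assumption made up for illustration: the sales rows, the add_sale helper, and the choice of summing by region.  The same shape maps onto a fold in F# or any other functional language.

```python
from functools import reduce

# Hypothetical sales facts: (region, quarter, amount). Made-up data for illustration.
sales = [
    ("East", "Q1", 100.0),
    ("East", "Q2", 150.0),
    ("West", "Q1", 80.0),
    ("West", "Q2", 120.0),
]


def add_sale(totals, sale):
    """Fold step: return a new totals dict with one sale added to its region."""
    region, _quarter, amount = sale
    updated = dict(totals)  # copy, so the step has no side effects on its input
    updated[region] = updated.get(region, 0.0) + amount
    return updated


# Fold the whole list into per-region totals, starting from an empty dict.
totals_by_region = reduce(add_sale, sales, {})
print(totals_by_region)
```

Because add_sale is a pure function of its inputs, it can be unit tested on its own, which is the kind of testability the list above is asking for.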

Imagine a world where you write a Domain Specific Language (DSL) in F#, and the BI tools manipulate the DSL.  Imagine being able to swap out different persistence mechanisms based on strict performance characteristics, rather than having to pay the port tax when you move from one persistence mechanism to another.  Few people in the BI world have been exposed to the recent "Cambrian explosion" of new languages that have emerged in the last few years, and that's a shame, because some cross-pollination would be very compelling for new kinds of solutions to emerge.
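
As a rough sketch of what "the tools manipulate the DSL" and "swap out persistence mechanisms" could look like, consider a query described as plain data and handed to two different backends.  This is written in Python for brevity and is not any vendor's API: the SumBy type, the run_in_memory and to_sql functions, and the sample table are all hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SumBy:
    """A one-node 'DSL': sum `measure` grouped by `dimension` over `source`."""
    source: str
    dimension: str
    measure: str


def run_in_memory(query, tables):
    """Backend 1: evaluate the query against plain Python lists of dicts."""
    totals = {}
    for row in tables[query.source]:
        key = row[query.dimension]
        totals[key] = totals.get(key, 0) + row[query.measure]
    return totals


def to_sql(query):
    """Backend 2: render the same query as SQL text for a relational store.

    Identifiers are interpolated directly to keep the sketch short; a real
    implementation would need to validate or quote them.
    """
    return (
        f"SELECT {query.dimension}, SUM({query.measure}) "
        f"FROM {query.source} GROUP BY {query.dimension}"
    )


# The query is written once, with no knowledge of where the data lives.
query = SumBy(source="sales", dimension="region", measure="amount")

# Made-up sample data for the in-memory backend.
tables = {
    "sales": [
        {"region": "East", "amount": 100},
        {"region": "West", "amount": 80},
        {"region": "East", "amount": 150},
    ]
}

in_memory_result = run_in_memory(query, tables)
sql_text = to_sql(query)
print(in_memory_result)
print(sql_text)
```

The point of the design is that the query is a value: an authoring tool can build it, a test can inspect it, and the choice of backend is made somewhere else.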

A recent Gartner CIO poll reported that CIOs must ‘Make the Difference’ by replacing generic IT with distinctive solutions that drive enterprise strategy.  This means that true BI that differentiates will likely be invested in.  It would be a shame if we continued to have all this BI live on vendor specific islands that were unable to leverage some of the state of the art work going on in computer science.  On the other hand, BI that leverages these new capabilities that the computer scientists like Don Syme are giving us will have a great chance to "make the difference". 

I conclude this with a call to action.  If you are doing BI, ask why we are using the same basic language we were using 10 years ago.  If you are a language geek or a software developer, ask why what you are doing, particularly if it generates information that is used in the strategic decision making process, isn't considered "BI".  Whoever is the first tool vendor to get to this vision will probably get to have a great deal of control over how it gets done - and the field is very green at the moment for someone to fill this gap :)

Published Wednesday, April 23, 2008 11:38 AM by Anonymous

Comments

# re: On Business Intelligence and F# @ Thursday, April 24, 2008 1:28 PM

Note, if you want to disagree, you are welcome to address the argument, not engage in ad hominem attacks.

Anonymous

# re: On Business Intelligence and F# @ Thursday, April 24, 2008 2:38 PM

Not an attack, simply an observation based on the misguided focus of the article.  The fact that you took it as an attack validates the observation.  I'm gonna try and make some time to post a reply explaining what I mean.

yet another reader...

# re: On Business Intelligence and F# @ Thursday, April 24, 2008 2:57 PM

Looking forward to a real post that explains your point.  Until then, understand that the policy here is that comments have substance and logic.  Attacking the idea is fine, attacking the person isn't.  I learn from the former if it is a well constructed argument.  The latter simply isn't useful or appropriate for this forum.

Anonymous

# re: On Business Intelligence and F# @ Thursday, April 24, 2008 4:17 PM

LOL, comments must have substance and logic? Sigh... fine...

My original observation was that it was evident from the article that you probably hadn't done much BI.  Why did I say that?  Well, because of your suggestion that "F# should be the language of BI."

It goes without saying that experience with BI projects is probably a prerequisite for being able to make such a determination.  What is the current language of BI?  Is there one? Is a new one needed?  How useful would a new BI language be?  Is focusing on having a BI language relevant given the current state of the field?  One can only answer those questions once familiar with what it takes to deliver BI.  That's a bigger problem, though, ain't it, just what do we mean when we say BI?

That's actually more of a pet peeve of mine, the fact that anyone who delivers a few reports out of a database considers him/herself to be a BI guy.  Consulting companies, especially, like to be able to label their consultants 'BI experts' so they can crank a slightly higher rate.  

Anyways... BI is Business Intelligence (duh!), and reporting is indeed a very big part of it, and, yeah, you need to know SQL and whatever dialect is out there (MDX, etc) to be able to slice and dice your data, but that is not the whole story.  If slicing, dicing, and reporting are the only things your BI solution delivers, then it is lacking.  It's as if your customer asked you for a car and you delivered a Ford Model T.  It is a car, but nowhere near sufficient given today's standards.

So, what is the bigger fish we have to fry?  What is the other half of BI?  It is none other than data mining.  Being able to deliver a data mining solution is what differentiates a report writer (someone who knows how to throw around some SQL, MDX, or possibly even F#) from a BI expert... and quite honestly, what language is used is the least important detail.  

Most so-called BI experts are severely lacking in fundamental areas, from the basic principles of statistics all the way to the principles in each of the individual algorithms used in data mining these days.  Solve that problem, and you have a product!  That, my dear blogger, is a more fitting call to action than "hey guys, let's use F#, 'cause I'm learning it and learning some functional programming concepts, and it appears to have something to do with slicing and dicing and BI and stuff."

I'm probably rambling by now (it's hard to get a coherent article by typing in this tiny little comment box), so let me summarize:

1) BI does not need another language, although better industry standards would help in terms of product integration (and no, MSFT should not drive this)

2) BI guys would serve themselves and their field a little better if they learned to deliver good data mining solutions

3) writing a report does not make one a BI expert (and neither does designing a warehouse or ETL solution, or mastering MDX), you need to know data mining.  You need to be able to tell your customer:  yeah, I can give you a solution where, with a sub-second query, you will be able to know your sales by region by quarter by department, and I can also, with a little more work, forecast what the sales will be next quarter and next year!

4) Your blog is actually pretty decent... my goal was not to attack you, but to encourage you to explore the field of BI.

yet another reader...

# re: On Business Intelligence and F# @ Thursday, April 24, 2008 4:39 PM

Ok, now that's a conversation.

I agree with some of your points.  I actually did a talk recently on BI where I stated that very fact - if you are equating reports to BI, you are not doing BI.  I will forward you the slides if you want.

BI as a solution involves data mining, presentation/visualization, transform, summarization, and most importantly, a human who can use the tooling that is there to generate something that drives business results.  That is, something that allows a business person who may not be an expert to explore and work with data to some degree, as well as tooling oriented towards BI professionals so that the hard parts or the "computer science" parts can be developed.

The problem in my view is the integrated stack - I suspect we agree on this, and the lack of language innovation - and I suspect we would have to agree to disagree on that.  BI, for good reasons, lives close to the database - and therefore, the tools of the database developer, SQL and its various BI dialects, become the language of BI.  Personally, I find that very limiting, as there has not been nearly as much language innovation in SQL as there has been in functional languages lately, F# being only the most recent in a long line of newer computer languages - not to mention rediscovered old ones like ML and Haskell, that do a *great* job of transforming information.

Am I a BI guy?  No, but I do see and deal with BI projects on a regular basis, and see a world of development and a world of BI that could really probably use each other's innovations, but don't, because the two have historically been different silos.

As for F# as the language of BI, I did not mean to imply that is the only piece of it.  F# has nothing to do with presentation, unless you're a masochist.  That is what WPF, Silverlight, Flash, HTML, PDF, and god forbid Excel are for.  But for the part where you mine, transform, aggregate, map, and reduce, where a lot of BI professionals do the hard work, F# - or any functional language - is a great language for that kind of thing.

Ramble appreciated :)

Anonymous

# re: On Business Intelligence and F# @ Thursday, April 24, 2008 11:20 PM

Interesting debate (and a debate is always better than just a hand-me-down-opinion blog post!).

I would say you are both correct.

There really isn't a good statistical language for data mining.  S-Plus, S, or R?  Datalog?

For what it is worth, people have explored using functional languages to do data mining.  A memorable paper for me is "Data Mining the Yeast Genome in a Lazy, Functional Language", simply because the name is hilarious and I am immature and remember these funny paper titles.

Now, as for dealing with differences between SQL dialects for data mining, that's not as complex as you make it sound.  Sure, there is a lot of essential complexity and conceptual integrity to build an agnostic solution, but it's doable.

The biggest dealbreaker for BI isn't some proprietary tool vendor -- although I've had the CIO of one of the largest hospital systems in the US tell me "we're screwed" if their vendor ever went out of business.  The real dealbreaker is how many people are stuck in ancient technology they can't escape from.  The cost to modernize is high.  Forget the demand for people who know statistics *really, truly well* and can do complicated fuzzy data models.  That demand won't even begin to be filled until the market expands to small businesses.  JMHO.

John "Z-Bo" Zabroski

# re: On Business Intelligence and F# @ Friday, April 25, 2008 9:56 AM

Yeah, R and Octave (and their proprietary counterparts) are nice environments, but mostly out of reach for your average developer (not because of lack of capacity, but because of lack of opportunity). I agree with John that dealing with different dialects of SQL for data mining is not that complicated. Most people who have a good handle on the fundamentals (sets, relations) and a reference book handy can shoehorn SQL/DMX/MDX into doing whatever is necessary. Although I won't deny that having a good language can make a huge difference, even if that means simply increasing developer happiness. Personally, I am in love with Python, both because of its syntax and because of the libraries (especially if it helps me stay away from C, e.g. NumPy). Yeah, I've heard of Jython and IronPython.... but am not so excited about them... I'm happy with CPython. If I could make a suggestion to an up-and-coming data miner, it would be to read: http://www.amazon.com/Programming-Collective-Intelligence-Building-Applications/dp/0596529325 You can kill 2 birds with one stone: learn some data mining concepts (including algorithm implementations and some cool projects to try them on) and start a torrid love affair with Python. While you're at it, you may want to look at this one too: http://www.amazon.com/Visualizing-Data-Ben-Fry/dp/0596514557 It uses the Processing environment (based on Java). However, Magenic being 110% MSFT centric, you may have a hard time justifying reading non-MSFT material (let alone getting it expensed), even if it means making you a more well-rounded consultant.
If that is the case, this one is not a bad intro (although after you're thru with it, a lot of it - the algorithms - may still seem like magic): http://www.amazon.com/Data-Mining-SQL-Server-2005/dp/0471462616 I see what you're saying John, about how people are stuck in ancient technology, but I think that is part of the problem; when faced with a problem, most developers almost immediately start to think in terms of what product they can throw at it: SQL Server, SSAS, BizTalk, SharePoint, WPF, etc, etc... and a lot of times, when brought into a consulting gig, the powers that be at the client may have even already made their determination (effectively putting the carriage in front of the horses). This sends developers down a spiral, a vicious cycle, where, in addition to delivering solutions, the focus is to advance the empire (be it IBM, MSFT, etc). Data mining is about being curious about your data. Add to that some coding skills and you have a powerful combination. Don't focus on the toolset, we are developers after all, not MSFT or IBM salespeople (consultants will argue with me on this one). Start off small and grow from there. You'd be surprised at how many data mining projects I've started with just a question, a [SQL] connection, and a [Python] script.

yet another reader

# re: On Business Intelligence and F# @ Friday, April 25, 2008 10:24 AM

Horrible formatting... I could swear I had several CRLF's.

yet another reader

# re: On Business Intelligence and F# @ Friday, April 25, 2008 10:30 AM

Might be the spam filter software - it tends to textize comments that have a lot of links... sorry about that...

Anonymous

# re: On Business Intelligence and F# @ Friday, April 25, 2008 10:54 AM

That would be a nice pet project right there, to learn some data mining: write a spam filter; of the statistical/Bayesian kind, as opposed to rule-based -> if linkCount>2, then textize :)
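
For anyone who wants a starting point for that pet project, here is a minimal sketch of the statistical kind: a naive Bayes classifier with add-one smoothing, in Python. It is a toy under stated assumptions, not how this blog's spam filter works: the NaiveBayesSpamFilter class, the whitespace tokenizer, and the four training comments are all made up for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Deliberately simple tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


class NaiveBayesSpamFilter:
    """Toy two-class naive Bayes. Train on both labels before classifying."""

    LABELS = ("spam", "ham")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = {label: 0 for label in self.LABELS}

    def train(self, text, label):
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def _log_score(self, tokens, label):
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_words = sum(self.word_counts[label].values())
        total_docs = sum(self.doc_counts.values())
        # Log prior: how common this label is among the training comments.
        score = math.log(self.doc_counts[label] / total_docs)
        for token in tokens:
            # Add-one (Laplace) smoothing so unseen words do not zero the score.
            count = self.word_counts[label][token] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        return score

    def classify(self, text):
        tokens = tokenize(text)
        return max(self.LABELS, key=lambda label: self._log_score(tokens, label))


# Made-up training comments, for illustration only.
spam_filter = NaiveBayesSpamFilter()
spam_filter.train("buy cheap pills now", "spam")
spam_filter.train("cheap pills click link", "spam")
spam_filter.train("functional languages for data mining", "ham")
spam_filter.train("data mining with sql and python", "ham")

print(spam_filter.classify("cheap pills"))
print(spam_filter.classify("data mining in python"))
```

Unlike the linkCount rule, nothing here is hard-coded about links: the filter only knows which words showed up in which kind of comment.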

yet another reader

# re: On Business Intelligence and F# @ Saturday, April 26, 2008 12:39 PM

Ancient technology might actually have been the wrong word choice.

Most of the time the biggest issue christening a big IT project is probably going to be dealing with *non-normalized* data storage.  Note that I specifically meant to say *non*-normalized rather than *de*-normalized.  Some companies have applications using schemas that were based on "from the gut" thinking instead of combining intuition and math.  These schemas are not *de*-normalized because in order to be *de*-normalized you must be *normalized* first.

If you think data scrubbing 100 gig databases is hard, try exabytes in the future!

Then there are extremely difficult specifications, such as those seen in disastrous government IT projects, like the failed overhaul of the FAA IT system.  That system required a flip-of-the-switch live update where every system was simultaneously updated to the new system, zero down time.  That system was a multi-billion dollar failure passed onto taxpayers!

John "Z-Bo" Zabroski

# re: On Business Intelligence and F# @ Saturday, April 26, 2008 3:28 PM

I hear you there John.

A good friend and coworker of mine, Aaron Lowe, will frequently get into conversations about the need for data architects as a core member of teams.  I completely agree with him.  Someone to handle how object models will be mapped efficiently against an efficient relational schema is a key player in any project team.  Such a person should understand the value of 3NF - and ideally, understand the set theory behind all of it.

Sadly, it does not happen nearly enough.  More often, you see either cowboy coder teams who completely lack DB skills, or cowboy DBAs who put so much ceremony around any database change you might ever contemplate in DEV (not to mention QA or prod) that you start to invent alternatives - and worse yet, combinations of both.  This results in more project failure, which makes all of our jobs, where we try to get business confidence in our ability to deliver solutions, that much harder.

Anonymous

# F# Business Intelligence Case Study - XBox Live Trueskill @ Tuesday, September 09, 2008 2:30 PM

Some time ago, I speculated that F# should be the lingua-franca of BI . Well, had I done a little research,

Aaron's Technology Musings

# One step closer to F# for Business Intelligence @ Thursday, December 11, 2008 11:21 PM

I have been beating this drum for some time now, but the latest post from Don Syme confirms the trend.

Aaron's Technology Musings
