Disclaimer: The views expressed in this blog are the author's own and do not necessarily reflect the official policy or position of the organization he works for.

Monday, July 27, 2015

Big break and big data!

My last article was written well over a year ago. So much has transpired since then...

I sometimes sit and wonder how many different things I have been juggling in life. I am not even talking about work here.

My current stimulus for blogging comes from the fact that my better half has created a blog of her own, which I honestly believe is awesome. That blog is supposed to be bilingual (in Bengali and English) and will (probably) focus on history, the subject she did her master's in.

Anyway, now that I have started to pen this article, I am wondering what to write about. So many things have happened within the last year that I do not even know where to start and where to end. Micro-blogging and Facebook have, I think, taken up much of the time I used to allocate to my passion for writing.

Let me think for a moment.

I have been thinking about writing a blog based on my learning and work in analytics and the current trends. I believe it will serve as a good exercise for me and a way to gather feedback from my esteemed readers as well.

Machine learning and big data -


Over the past few days, I explored the topic of machine learning. One of the things that I found really exciting was the ability of smart systems to learn over time: they gather experience and use it to adjust the underlying algorithms and make more accurate predictions. This is smart, almost as smart as a human could get. As long as the initial setup is accurate and thoroughly tested, the end results could be amazing.
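
To make the idea of "learning over time" a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the data, the one-weight model and the update rule (a plain gradient step) are not taken from any particular product. It only shows a system adjusting itself after each example and ending up with a smaller prediction error than it started with.

```python
# Hypothetical toy example: a one-weight model that learns y ≈ weight * x
# from a stream of examples, adjusting the weight a little after each one.

def update(weight, x, y, learning_rate=0.05):
    """Return the weight nudged in the direction that reduces the error on (x, y)."""
    error = weight * x - y
    return weight - learning_rate * error * x

# Made-up "experience": inputs paired with outputs that follow y = 2x.
stream = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)] * 20

weight = 0.0
early_error = abs(weight * 3.0 - 6.0)  # prediction error before seeing any data

for x, y in stream:
    weight = update(weight, x, y)

late_error = abs(weight * 3.0 - 6.0)   # prediction error after all 60 examples

print(f"learned weight: {weight:.3f}")
print(f"error before: {early_error:.3f}, error after: {late_error:.3f}")
```

Real systems use far richer models, but the loop is the same in spirit: predict, measure the error, adjust, repeat.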

So, what is machine learning?
Per Wikipedia,
"Machine learning is a subfield of computer science that evolved from the study of pattern recognition and computational learning theory in artificial intelligence. Machine learning explores the construction and study of algorithms that can learn from and make predictions on data. Such algorithms operate by building a model from example inputs in order to make data-driven predictions or decisions, rather than following strictly static program instructions."
I have been exploring some of the popular machine learning technologies in the market today. I checked out BigML the other day. BigML provides a nice user interface where you can create a free basic account, set up a source, create a dataset, create a model and generate predictions. The user interface is really intuitive, and you can play with the data even without deep knowledge of statistics, although a basic understanding is expected.
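
The source, dataset, model and prediction steps can be sketched in a few lines of plain Python. To be clear, this is not BigML's API or its algorithm. It is a hypothetical, self-contained illustration using made-up data and a very simple technique (1-nearest-neighbour, where the prediction is the label of the most similar known example). All function and column names here are my own assumptions.

```python
import csv
import io

# Step 1, "source": raw data as it might arrive (made-up CSV text).
SOURCE = """hours_studied,hours_slept,passed
1,4,no
2,5,no
6,7,yes
8,8,yes
"""

def make_dataset(source_text):
    """Step 2, "dataset": parse the source into (features, label) pairs."""
    rows = csv.DictReader(io.StringIO(source_text))
    return [
        ((float(r["hours_studied"]), float(r["hours_slept"])), r["passed"])
        for r in rows
    ]

def train(dataset):
    """Step 3, "model": a 1-nearest-neighbour model just remembers the examples."""
    return list(dataset)

def predict(model, features):
    """Step 4, "prediction": return the label of the closest stored example."""
    def squared_distance(example):
        stored_features, _ = example
        return sum((a - b) ** 2 for a, b in zip(stored_features, features))

    _, label = min(model, key=squared_distance)
    return label

model = train(make_dataset(SOURCE))
print(predict(model, (7.0, 7.5)))  # a new case resembling the "yes" rows
print(predict(model, (1.5, 4.0)))  # a new case resembling the "no" rows
```

A hosted tool hides these steps behind buttons, but the flow of data is the same: raw source in, structured dataset, trained model, prediction out.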

The other technology that I explored was IBM Watson Analytics, from Big Blue itself. Now this, I felt, is huge. Before I go into what I explored, take a look at the video from Jeopardy!


IBM Watson participated in this competition, 'his' competitors being the very smart Ken Jennings and Brad Rutter. Even so, Watson triumphed.

From the official IBM site, here is what we know about Watson, in brief -
"Since its triumph on the television quiz show Jeopardy! IBM has advanced Watson’s capabilities and made it available via the cloud. Watson now powers new consumer and enterprise services in the health care, financial services, retail and education markets. IBM has also opened the Watson platform to developers and entrepreneurs, enabling them to build and bring to market their own 'powered by Watson' applications for a variety of industries."
So, I created a basic account on the Watson Analytics portal and played a little with a sample data set. Although I am not a Watson Analytics expert, I found it very intuitive and gained new insights into the data. Apart from answering my own questions, Watson Analytics also surfaced interesting relationships between data points. It even let me look closely at the quality of the data in each column, showing as percentages which columns were good and which were not. I was therefore able to refine the data, feed the improved data set back into the system and get more accurate insights. I did all of this with a basic account; I am sure a paid account would have offered more options and more detailed capabilities.
But in summary, did I like it? Yes, I did.
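
As a rough idea of what a per-column quality percentage can mean, here is a small Python sketch. It is an assumption on my part, not a description of how Watson Analytics scores data: it simply measures completeness (the share of non-empty cells per column) on a made-up sample, and then "refines" the data by dropping rows with gaps. The function names are hypothetical.

```python
import csv
import io

# Made-up sample with a couple of missing cells.
SAMPLE = """customer,age,city
Asha,34,Kolkata
Ben,,Austin
Chen,29,
Dee,41,Boston
"""

def load_rows(csv_text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(csv_text)))

def completeness(rows):
    """Return, for each column, the percentage of rows with a non-empty value."""
    report = {}
    for column in rows[0].keys():
        filled = sum(1 for row in rows if (row[column] or "").strip())
        report[column] = 100.0 * filled / len(rows)
    return report

def drop_incomplete_rows(rows):
    """A very blunt refinement: keep only rows where every cell is filled."""
    return [row for row in rows if all((value or "").strip() for value in row.values())]

rows = load_rows(SAMPLE)
report = completeness(rows)
for column, percent in report.items():
    print(f"{column}: {percent:.0f}% complete")

refined = drop_incomplete_rows(rows)
print(f"rows kept after refinement: {len(refined)} of {len(rows)}")
```

Dropping rows is only one option. Filling gaps with sensible defaults is another, and which one is right depends on the data.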

I have an analytics background, having worked with Data Warehousing technologies for well over 9 years, and this, I believe, is the next big thing. An intelligent system that learns from the data and makes itself smarter over time is a whole different story from what we have seen in the past. This is the next age - the age of data.

As IBM CEO Ginni Rometty pointed out -
"Data is the globe's next natural resource."
Data is everywhere. Every keystroke I make. Every phone call. Every mobile text. Every tweet. Every word I say. Everything is data, and we call this humongous volume of data Big Data. How we tap into it to gather insight is the science we have come to know as "Data Science".

The challenge we have here is to tap into this data in order to gain actionable insights. Every company (well, almost every company) today has its own Data Warehouse with capabilities to generate analytics in order to further enhance its footprint in the market. But what next? Predictive analytics is what will give these organizations their cutting edge.

Take a look at this article on how Google monitors us today. In this age of the internet and social networking, one is tempted to go out into the world, make connections and be vocal, and every little thing you do is tracked. The advertisements that you see on the Gmail sidebar will most likely relate to the last item you searched for on Google! And that is just a tiny example of what Google knows about you. In essence, every person browsing the internet and using the hallowed search engine probably has some kind of profile created - habits, likes, dislikes, beliefs - every little detail, including ones that even your partner is probably unaware of. Is that scary? Yes, definitely. Can you do anything about it? Probably not.

Sorry for digressing. That "Google example" I just gave is also part of analytics, and of data in large volumes, or big data.

The fact remains - yes, data is the next best natural resource, and if we know how to mine it well, it can make the world a better place. How?

Could we have predicted that this gunman would open fire on innocents? Yes, I am referring to the Louisiana shooting. From the news article, it looks like it was a planned act. If so, how did he plan it? What kind of Internet footprints did he leave? What were his Facebook updates or his tweets on Twitter? What did he message? The police will of course find out everything, but that is reactive analytics.

Could we have been proactive? Could we have predicted, based on real-time data, that there was an 80% chance (I just put a number as an example) of this person turning violent and committing mass violence? That is exactly what we wish to do with Big Data, with predictive analytics, with machine learning.

All feedback appreciated.

PS. The views expressed in this article are my own and do not necessarily represent the thoughts of the organization I work for.
