Sunday, 22 June 2014

Data Scientist as a role

As promised in my last blog, here is my analysis of what the Data Scientist role means and why it is not just new jargon replacing the Marketing Consultant, Product Manager or IT Manager roles, as some of my friends suggested after the last blog.

Remember our teenage days, probably in 10th Std, when this profound question was asked: are you going to take Science or Arts? 80% of my friends, who are mostly engineers, would have said "Science, of course" without blinking an eye. Little did we know that every innovation has an artist behind it, and every scientist has an artist in him who gives him the imagination to innovate. The point I am trying to make is that they are just two sides of the same coin.

Coming back to the Data Scientist role, it is the best example of how a person can present data in such a creative and intuitive manner that it helps the business. Also note that the KPIs are different for each business, so the rules of analysis differ. The Data Scientist should be able to interpret the data using mathematical and analytical tools and understand the trends. There is no longer any need to be an engineer and learn programming languages such as C++, Java, .NET etc. to get a Data Scientist role. I believe a business person who understands the business and can play with some tools is the person best suited for the role. It is the best example of a role which brings Art and Science together.

As expected, this answer did make the students very happy, and suddenly there was a lot of eagerness to know what this role entails. I believe just a few things help to make a Data Scientist:-
1.   Business knowledge and acumen in the stream where one wants to take up the role
2.   Use of statistical and analytical tools to bring creativity to, and make sense of, the big data
3.   Good communication

Lots of universities have already started offering full-time courses to specialize in this role. Saint Peter’s University in the US offers a full-time degree course in it.


There is no need to be an expert on Hadoop platforms and MapReduce functions, because they are just another enterprise platform where the analytics is done. According to a Heidrick & Struggles report, a single company like Wipro already has around 8000 people in data science and analytics roles, and the number is only going to grow.

Here is what Anjul Bhambhri, VP of Development for Big Data projects at IBM, says in the following blog.

Sunday, 8 June 2014

Big Data -> Big Employment

I recently visited a college for a talk on Big Data. The college had a mix of MCA and MBA students, a nice set of smart, enthusiastic students with a gleam in their eyes. I always get excited to see this enthusiasm, because I feel that the energy and enthusiasm of our youth is what will make a difference to the world in the coming decade.

As we started discussing the usual Big Data topics, the discussion flowed into how this is going to create and shape new requirements and drive technology in the near future. The most important aspect was how this huge nexus of data is going to drive new roles in the market and the skills required for them. Here I was actually able to strike a chord, because suddenly they were all very excited about what this new role entails.

Gartner’s prediction that Big Data will create around 4.4 million new IT jobs globally by 2015 reflects this. The jobs would revolve around the technology to handle the data and the capability to analyze and make sense of it. It is also predicted that by 2015, 80% of the data will be uncertain, so even with all the data available there will be a need to determine how to use the data, what confidence level is needed to act on it, and in what context the data is available.

So the next question from them, of course, was: what are these new jobs?

Now I had their full attention, and I tried explaining how the counting of mouse clicks on an advertisement, traditional analytics, CRM and MDM data collection, mobile phone updates and so much more combine to keep data analysts and data engineers, as well as the CDO, CEO, CMO and everyone else concerned with the accumulation of data, busy, and thus keep creating employment opportunities.

The newest and hottest job in Data Management belongs to the person who can interpret such data in innovative ways for their employer: the Data Scientist.

What exactly is a Data Scientist?
“A data scientist is that unique blend of skills that can both unlock the insights of data and tell a fantastic story via the data.”

Anjul Bhambhri, Vice President of Big Data Products at IBM, describes the Data Scientist role as follows:
“Data scientists are part digital trend spotter and part storyteller stitching various pieces of information together. These are people or teams at organizations that sift through the explosion of data to discover what the data is telling them.” 

That led them to the next question: what skills are required to be a Data Scientist? The answer made them happy, and it will be the subject of my next blog.

What does it take to be a Data Scientist? 


Sunday, 2 March 2014

IBM MDM Hybrid Style implementation

The IBM MDM V11 release marked a significant change in the way MDM was being implemented in the market. It provided a solution that could support various styles of MDM from one single installation: Hybrid MDM. This support in V11 lets a customer easily transition from registry to transactional style, or have their datasources co-exist in each of these styles, while still having a centralized MDM solution across the enterprise.

The concept is very simple. The customer has a set of datasources, and some or most of these sources want to manage their own data. They can keep doing that in Virtual MDM (the old Initiate engine) in V11. There are other datasources which are transactional, and all their data and attributes are stored in the centralized MDM system. That is what Physical MDM (aka the old DWL) is all about. With V11, it is easily possible to move some of the data from Virtual MDM to Physical and persist it as a single unique entity by enabling a single switch during installation of IBM MDM V11.
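
To make the difference between the two styles concrete, here is a minimal Python sketch. It is only an illustration of the idea and not IBM MDM's API: the function `virtual_view`, the class `PhysicalStore` and the sample records are all hypothetical. In the virtual style the composite view is assembled at read time and the sources keep their data; in the physical style the hub persists its own copy of the entity.

```python
def virtual_view(member_records):
    """Virtual (registry) style: assemble a composite view at read time.

    The first source to supply an attribute wins; later sources only fill in
    attributes that earlier ones lack. Nothing is stored in the hub.
    """
    view = {}
    for record in member_records:
        for key, value in record.items():
            view.setdefault(key, value)
    return view


class PhysicalStore:
    """Physical (transactional) style: the hub persists its own copy."""

    def __init__(self):
        self.entities = {}

    def persist(self, entity_id, member_records):
        # Hypothetical "move to physical" step: store the composite
        # as a single entity owned by the central hub.
        self.entities[entity_id] = virtual_view(member_records)
        return self.entities[entity_id]


# Made-up records for the same person held by two source systems.
members = [
    {"name": "Asha Rao", "phone": "555-0100"},
    {"name": "Asha Rao", "address": "12 Lake Road"},
]

composite = virtual_view(members)   # computed on demand, not stored
store = PhysicalStore()
store.persist("E1", members)        # now persisted centrally
```

The point of the sketch is that the same linked records can serve both styles, which is why a move from virtual to physical does not have to mean starting over.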

This is a unique capability, because there is a lot of MDM requirement in the market around this case. Look at some of the State Resident Hub requirements at state level in India. State Governments support around 30-40 different departments, such as Pension, Scholarship and Ration Cards, and each of them wants to identify the resident uniquely across these datasources while keeping ownership of its data because of its proprietary attributes. They could easily start with Virtual MDM and move to Physical. As and when new departments are added, all of them could make the same transition.

Acquisitions are another scenario where this capability can be a huge advantage, by bringing the acquired data into a centralized repository.

There will be enhancements to the hybrid MDM support in releases to come.

Some interesting links for more info



It would be interesting to hear opinions and feedback on the Hybrid MDM capability in V11.

Thursday, 18 July 2013

IBM InfoSphere MDM V11 is out!

Back to my blogs after a long time. The break was for no other reason than that I was very busy with the next release of MDM, and here it is, out after all the hard work. :-)

The IBM InfoSphere Master Data Management Server V11 release brings IBM InfoSphere MDM Advanced Edition (aka DWLCustomer) and IBM InfoSphere MDM Standard Edition (aka Initiate) a lot closer to each other. Both engines sit behind the same Enterprise Container, taking a step closer to a vision where customers can build a plethora of MDM solutions ranging from registry to transactional to hybrid styles.

The V11 release brings a lot of new capabilities to the MDM product, such as:-
  1. Unified Advanced and Standard Edition Installer
  2. Unified Advanced and Standard Edition Workbench
  3. New BPM Based UIs with Embedded Workflow Engines
  4. OSGi-based engine/backend, making customizations on the product much easier
  5. Common security framework for the Combined engine
  6. Hybrid MDM style of implementation
For more information on the latest version of the product, help yourself to the following links for the latest information and insight.
Post the MDM V11 topics on which you would like more information. We can plan some of them as blogs and articles.

Sunday, 15 July 2012

Government Department - Use Case for Registry Style MDM

In a government organization there are a lot of departments, such as Ration Card, Pension Scheme and Student Scholarship. Each of these departments has its own data, collected while providing cards, schemes etc. to the people. Each department owns its data and does not want to share it. Getting to a single view without handing over ownership of the data makes a very good use case for a registry style implementation of MDM.
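
As a rough illustration of the registry idea, the Python sketch below builds an index that links department records belonging to the same resident while storing only pointers (department and local ID), never the departments' own attributes. Everything in it is an assumption made for illustration: the exact-match rule on normalized name plus date of birth, the function names and the sample records are hypothetical, and a real matching engine such as Initiate uses far more sophisticated matching.

```python
from collections import defaultdict


def match_key(record):
    """Hypothetical matching rule: normalized name plus date of birth."""
    name = " ".join(record["name"].lower().split())
    return (name, record["dob"])


def build_registry(sources):
    """Map each match key to the (department, local_id) pairs that share it.

    Only pointers are kept; each department remains the owner of its data.
    """
    registry = defaultdict(list)
    for department, records in sources.items():
        for record in records:
            registry[match_key(record)].append((department, record["id"]))
    return dict(registry)


# Made-up sample data held by two departments.
sources = {
    "ration_card": [
        {"id": "RC-1", "name": "Asha  Rao", "dob": "1980-04-12"},
    ],
    "pension": [
        {"id": "PN-7", "name": "asha rao", "dob": "1980-04-12"},
        {"id": "PN-8", "name": "Ravi Kumar", "dob": "1975-01-30"},
    ],
}

registry = build_registry(sources)
for key, members in registry.items():
    print(key, "->", members)
```

A single view of a resident can then be assembled by following the pointers back to each department, so ownership never leaves the source.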

This is why Initiate, aka IBM InfoSphere MDM Standard Edition, fits very well in such use cases. The UID implementation is yet another department; other departments can use it for reference, but UID does not own the data of the other departments. So UID will also not be able to provide a solution that owns the data of the various departments. The complete journey to a physical and transactional style of MDM may be a long time away as far as government organizations go.

Another good use case for government organizations is that the data collected is in various Indian languages. Initiate has a good capability for matching data held in native languages and can give very good results. However, when different departments hold their data in different languages, transliteration has to be used to convert it into one language for the search to happen.



Fingers crossed that some of the Government sector realizes this and jumps at the option.



Sunday, 12 February 2012

20-20 Cricket match or Test Cricket

Ask any cricket fan on any day what he or she would love to watch, given a choice between 20-20 and a Test match, and 90% (I think) would probably say 20-20. In fact, Test matches have started taking the form of 20-20 and finish in 3 days rather than the whole 5 days. Sehwag is probably to blame for that, because irrespective of the form of cricket he is playing, he makes a century at the same speed. Poor guy, he is just doing what he is naturally good at. :-)

Given the age and time we are in, why do we think that a customer would wait 9 months to see a solution implemented? Talk about an MDM solution and we say that any implementation will take nothing less than 9 months. The customer has every right to say “Thank You” and move on. The ROI of an MDM solution has to be shown in less time, and the complexity has to be hidden. That will be the point where we make exceptional growth in markets like India. Customers in growth markets do not have a huge IT department to implement a complex solution, and neither do they have a window as big as 9 months to see the ROI.

IBM has taken a step in the right direction by acquiring Initiate into the MDM portfolio, which enables customers to implement a simple MDM solution and realize the ROI earlier. The conceptualization of a path from virtual MDM to physical MDM would allow the customer to see the MDM journey and end goal much more clearly.

IBM InfoSphere Master Data Management V10 is the latest release from the IBM MDM portfolio. Find all the details about it at the following URL:
http://www-01.ibm.com/software/data/infosphere/mdm/features.html

I hope it is a step in the right direction to convert some of the MDM Test matches into 20-20s, and that customers come flocking towards it. :-)

Sunday, 22 January 2012

MDM proposition is for Business leaders not IT Department

In a market like India, the biggest challenge in selling a product like MDM is that the software is sold to the IT department of a company rather than to its business leaders. The discussion of whether or not to buy a piece of software has still not reached the Board of Directors or the business leaders' chambers. The MDM story is typically approached by fixing the Data Quality issues and ends up becoming a Data Warehousing project along with Analytics, losing its advocacy somewhere midway. This is because the customer and their IT department can see that value immediately. However, the long-term vision of creating an operational hub where the single version of truth can exist is something that has to be envisaged as a business need rather than a technology requirement.

Companies need business leaders who can envision the roadmap of MDM. Currently, growth market customers are trying to solve their day-to-day problems, and the IT departments of various companies are working towards that. When an MDM value proposition is given to the IT department, it always turns into a make vs buy discussion, which becomes an ugly one. The problem needs to be approached as a roadmap where the long-term vision is clear. Hopefully this changes soon in countries like India, with IT becoming business driven as well as technology driven, since the two need to go hand in hand for success. This would lead to more MDM success stories in growth market countries like India.