There is much discussion about whether Predictive Analytics (PA) can
be automated or not. This is a false dichotomy.
Predictive Analytics is a strange beast: it needs to be ‘learned by learning’ and ‘learned
by doing’ – BOTH! That is due to the interconnected nature of the field.
To be a successful hyper-specialist in “left nostril” diseases, one needs to
have done Anatomy, Physiology and Biochemistry in med school. Similarly, for
PA, learning-by-learning (which takes at least 6 years of grad school) is not a
step you can skip and go directly to learning-by-doing and hope to become a
true curer of business diseases!
In PA, learning-by-doing can be an even steeper curve. As I have noted before in
my blogs, PA skills have to be rounded out with mathematical inventiveness
and ingenuity, applied repeatedly in a specific business vertical. These are the
hallmarks of an uber data scientist. Clearly, an uber data scientist as
described above cannot be bottled and passed around. Don’t even think of “automating”
all the things that an uber data scientist does. So what do we do about “scaling”?
Are there support pieces we can automate to scale the solution?
A comparison to a programming environment such as MATLAB is appropriate. MATLAB supplies you
with all kinds of toolboxes. Similarly, in PA, many basic operations can be
automated: clustering, learning, classification, etc. But, as with MATLAB, you also
need an environment where these toolboxes can be fine-tuned with inventiveness
appropriate to the business vertical, mixed and matched, and augmented with
additional one-off solutions to address the overall business problem at hand. Otherwise, the solution will fall short (or flat!).
So, part of PA can be automated. PA toolboxes can be
fine-tuned by data scientist associates, and the overall solution can be conceived
and put together with these toolboxes (with added “glue”) by the uber data
scientist.
Note that everything I talked about here refers to PA solution development.
Once the overall solution is developed, “production runs” by customer
personnel and executives’ visualizations of the solution can
be mostly automated (with a data scientist looking over their shoulders:
data can change on a dime, and someone has to watch for the sanctity of the
data and for non-stationarity problems!). Production is where the
solution needs to scale, and it can.
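To make the "someone has to watch the data" point concrete, here is a minimal sketch of the kind of check that could run alongside an automated production job. It is only an illustration under my own assumptions: the function names (`mean_shift_z`, `needs_review`), the sample numbers and the threshold of 3.0 are all hypothetical, and a real non-stationarity watch would look at far more than a shift in the mean.

```python
from math import sqrt
from statistics import mean, stdev


def mean_shift_z(reference, batch):
    """Z-score of the batch mean against the reference sample.

    Assumes the reference sample has non-zero spread; a constant
    reference would make the standard error zero.
    """
    standard_error = stdev(reference) / sqrt(len(batch))
    return (mean(batch) - mean(reference)) / standard_error


def needs_review(reference, batch, threshold=3.0):
    """Return True when the batch mean has drifted beyond the threshold.

    The threshold is an illustrative assumption, not a recommendation.
    """
    return abs(mean_shift_z(reference, batch)) > threshold


# Hypothetical data: values seen during solution development ...
reference = [10.0, 11.0, 9.0, 10.5, 9.5, 10.0, 10.2, 9.8]
# ... and two production batches, one stable and one shifted.
stable_batch = [10.1, 9.9, 10.3, 9.7]
shifted_batch = [13.0, 12.5, 13.4, 12.9]

for name, batch in [("stable", stable_batch), ("shifted", shifted_batch)]:
    if needs_review(reference, batch):
        print(f"{name}: drift suspected, call the data scientist")
    else:
        print(f"{name}: looks consistent with development data")
```

The automated part is the check itself; deciding what a flagged batch means for the business is still the data scientist's job.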
Dr. PG Madhavan developed his expertise in analytics as an EECS Professor, Computational Neuroscience researcher, Bell Labs MTS, Microsoft Architect and startup CEO. Overall, he has extensive experience of 20+ years in leadership roles at major corporations such as Microsoft, Lucent, AT&T and Rockwell as well as four startups including Zaplah Corp as Founder and CEO. He is continually engaged hands-on in the development of advanced Analytics algorithms and all aspects of innovation (12 issued US patents with deep interest in adaptive systems and social networks).