What is the difference between Data Science and Data Analytics?
A data scientist is someone who understands both the data and the business logic, and who makes predictions by sampling the current business data (also known as “data insights”, “business insights”, “data discovery” or “business discovery”). By spotting trends, these predictions show the direction in which the business is heading (good or bad), or where it should head, so that the business can make the right decisions about its next steps, such as:
- improving the product/feature based on user interest levels
- driving more users
- driving more clicks / impressions / conversions / revenue / leads
- improving user experience
- improving user retention
Data Scientist Qualifications:
- Familiarity with database systems (SQL interface, ad hoc queries), especially MySQL and Hive at a minimum, to begin with
- Java / Python / simple MapReduce job development, if needed
- Exposure to various analytic functions (OVER, median, rank, etc.) and how to use them on various data sets
- Mathematics, statistics, correlation, data mining and predictive analytics (predicting the future based on probability and correlation)
- “R” and/or “RStudio” (optionally Excel, SAS, IBM SPSS, MATLAB)
- Deep insight into (statistical) data model development (in an agile fashion). In general, a self-learning model is the best fit for today’s dynamics, because it can learn from its own output and tune itself against its performance over time
- Work with (very) large data sets, grouping together various data sets and visualizing them
- Familiar with machine learning and/or data mining algorithms (Mahout, Bayesian, Clustering, etc.)
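To make the "analytic functions" item above concrete, here is a minimal sketch in plain Python of what a ranking function over a partition and a per-group median compute. It only emulates the behavior of SQL's RANK() OVER (PARTITION BY ... ORDER BY ...); the sample rows, column meanings and function names are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict
from statistics import median

# Hypothetical sample data: (region, user, clicks)
rows = [
    ("east", "ann", 120),
    ("east", "bob", 90),
    ("east", "cat", 120),
    ("west", "dan", 40),
    ("west", "eve", 75),
]


def rank_over_partition(rows):
    """Emulate RANK() OVER (PARTITION BY region ORDER BY clicks DESC)."""
    partitions = defaultdict(list)
    for region, user, clicks in rows:
        partitions[region].append((user, clicks))

    ranked = []
    for region, members in partitions.items():
        # Order each partition by clicks, highest first.
        members.sort(key=lambda m: m[1], reverse=True)
        prev_clicks, prev_rank = None, 0
        for position, (user, clicks) in enumerate(members, start=1):
            # Ties share a rank; the next distinct value skips ahead,
            # which is how RANK() (as opposed to DENSE_RANK()) behaves.
            rank = prev_rank if clicks == prev_clicks else position
            ranked.append((region, user, clicks, rank))
            prev_clicks, prev_rank = clicks, rank
    return ranked


def median_by_partition(rows):
    """Compute the median of clicks within each region."""
    groups = defaultdict(list)
    for region, _user, clicks in rows:
        groups[region].append(clicks)
    return {region: median(values) for region, values in groups.items()}


if __name__ == "__main__":
    for row in rank_over_partition(rows):
        print(row)
    print(median_by_partition(rows))
```

In practice the same result would usually come from a single SQL query against the database; the sketch is only meant to show what such a query computes.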
Data Analytics (DA) is, in general, a logical extension of (or just a buzzword for) Data Warehousing (DW) and Business Intelligence (BI); it provides complete insight into business data in its most usable form. The major difference between warehousing and analytics is that analytics can be real-time and dynamic in most cases, whereas a warehouse is ETL-driven and works offline.
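The contrast between an offline, ETL-driven warehouse and real-time analytics can be sketched roughly as follows. This is an illustrative toy, not a description of any particular product: the event log, `batch_etl` and `RealTimeAggregate` are hypothetical names made up for this example.

```python
from collections import Counter

# Hypothetical event log: (user, event_type)
events = [
    ("ann", "click"),
    ("bob", "impression"),
    ("ann", "click"),
    ("cat", "conversion"),
]


def batch_etl(event_log):
    """Warehouse style: extract the whole log and rebuild the aggregate offline."""
    totals = Counter()
    for _user, event_type in event_log:
        totals[event_type] += 1
    return dict(totals)


class RealTimeAggregate:
    """Analytics style: update the aggregate as each event arrives."""

    def __init__(self):
        self.totals = Counter()

    def ingest(self, event):
        _user, event_type = event
        self.totals[event_type] += 1

    def snapshot(self):
        # The current totals can be queried at any moment, mid-stream.
        return dict(self.totals)


if __name__ == "__main__":
    live = RealTimeAggregate()
    for event in events:
        live.ingest(event)
        print("live:", live.snapshot())
    print("batch:", batch_etl(events))
```

Both approaches end up with the same totals once all events are processed; the difference is that the real-time aggregate can answer questions while events are still arriving, whereas the batch job only reflects the data as of its last run.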
Every business that deals with “data” must have “Data Analytics”; without analytics in place, the business is a dead man walking, without a heart, a soul or a mind.
Data Analytics (Engineer) Qualifications:
- Familiar with data warehousing and business intelligence concepts
- Strong in-depth exposure to SQL and analytic solutions
- Exposure to Hadoop-based analytics solutions (HBase, Hive, MapReduce jobs, Impala, Cascading, etc.)
- Exposure to various commercial enterprise analytical data stores (Vertica, Greenplum, Aster Data, Teradata, Netezza, etc.), especially how to store and retrieve data from them in the most efficient manner
- Familiarity with various ETL tools (especially for transforming different sources of data into analytics data stores) and, if needed, the ability to make everything (or some critical business features) real-time
- Schema design for storing and retrieving data efficiently
- Familiar with various tools and components in the data architecture
- Decision-making skills (real-time vs ETL, using component X instead of Y to implement Z, etc.)
Sometimes a Data Analytics Engineer also takes on data mining on demand, since they have a better understanding of the data than anyone else; in general, the two roles have to work closely together to get better results.
Data Analytics can also be divided or shared among four different teams or people (as it is hard to hire one person with the complete skill set, and moreover administration is different from development):
- data architect
- database administrator
- analytics engineer and