Wednesday, June 29, 2016

False Correlations 

Not all correlations are valid. They did a study in the 1980's regarding shark attacks in Florida. There was a strong positive correlation between the number of shark attacks and the number of ice cream sold.

The data, without an understanding, indicated sharks were drawn to ice cream. Clearly this is false, however it does show that blindly following data points, without reviewing other sources of influence, has the opportunity to sway the user away from what is actually driving the behavior.


Hope is not a strategy!


Big Data / Data Analytics
Predictive Analytics
            Disk Drives
Computers abound in use throughout the business enterprise and with consumers. One aspect of the hardware that directly impacts the user, regardless of their stance, is the disk drives. When these fail, there are direct and immediate issues for the user. Bearing this in mind, disk reliability is paramount. Being able to predict when a drive would fail with a statistical significance would definitely be a benefit. This is, however, not a simple task and requires exhaustive and clean data to develop the appropriate model. This has been analyzed as a proof of concept (POC) by EMC with data from an estimated 50,000 drives. From this the researchers were able to produce the model. The algorithm naturally showed as time increased so did the failure rate. The algorithm was modified and improved the accuracy to 83.3%. The time to the median actual failure was reduced to 14 days.
            Marketing
Advertising and marketing spending had been trending downward until the last few years. This increase in spending was due to data analytics and science. There has been a mass increase in data recording in the last few years. This has been used by business in various forms with success. The business leaders have noted the data can be used to better target their market and potential clients. This precise marketing based on the data analysis has proven to be more cost effective. With the lower costs expensed, the ROI as applied to the advertising has improved significantly. One specific example of this involves television advertising.
            Staffing
Predictive analytics also is being applied to staffing. Hired applied this to hiring developers, engineers, and computer scientists. By reviewing certain data re: the applicant, the management was able to view the top 5% of the 250,000 applications received monthly from the US, UK, Canada, and France. This saves direct expenses and labor, and subsequently indirect expenses.

The first step for the applicant is to pass the algorithm’s filter. This removes the weaker candidates without the labor and time for a human to look at it.