Analytics has a foothold beyond the business and technology industries. It has also influenced many movie-makers over the last few decades. The same algorithms used to break down monetary data can also be applied to sports data. Sporting events are major social phenomena and economic drivers, and huge amounts of money are spent on analytics for performance enhancement. In this project, we will find patterns in Ironman performance.

Objectives

- Use advanced data mining techniques to select relevant features
- Demonstrate statistical concepts and software tools used to identify patterns
- Determine correlating factors and differentiate between spatial, temporal and coincidental associations
- Make effective decisions based on observed patterns

Instructions

If you're not up on your triathlons, an Ironman involves a 2.4-mile swim, then a 112-mile bike ride, then a 26.2-mile marathon. A person has 17 hours to complete them all. Between events there is a transition (T1 and T2, respectively), where the athletes change from wetsuits into cycling clothes, and then from cycling into running clothes. There's serious money on the line for the winners of these events. For the rest of the competitors, it's a chance to see if they have what it takes to become an Ironman.

Imagine that you have a long-time friend who wants to compete in the upcoming Ironman competition. S/he spent 2 years preparing for it. S/he is not really what anyone would consider an athlete. S/he eats Pop-Tarts, sleeps late on Sundays and enjoys a refreshing adult beverage from time to time. One of the things s/he did while preparing was some serious data analysis to decide whether or not certain purchases would be worthwhile. For instance:

- A new bike would have cost about $1,200 and saved her/him 20 minutes over the course of the bike ride. S/he decided it was not worth it.
- A new helmet cost $75 and saved her/him 10 minutes. That WAS worth it.
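One way to make the bike-versus-helmet comparison explicit is to express each purchase as dollars spent per minute saved. The sketch below uses only the prices and time savings quoted above; the function name `cost_per_minute` and the idea of ranking purchases by this ratio are illustrative assumptions, not a method prescribed by the assignment.

```python
# Prices and time savings are the figures quoted in the assignment text.
options = {
    "new bike": {"cost_usd": 1200.0, "minutes_saved": 20.0},
    "new helmet": {"cost_usd": 75.0, "minutes_saved": 10.0},
}


def cost_per_minute(cost_usd, minutes_saved):
    """Return dollars spent for each minute of race time saved."""
    if minutes_saved <= 0:
        raise ValueError("minutes_saved must be positive")
    return cost_usd / minutes_saved


# Rank purchases from cheapest to most expensive per minute saved.
ranked = sorted(options.items(), key=lambda kv: cost_per_minute(**kv[1]))

for name, option in ranked:
    print(f"{name}: ${cost_per_minute(**option):.2f} per minute saved")
```

On these figures the helmet costs $7.50 per minute saved and the bike costs $60.00 per minute saved, which is consistent with the friend's decision to buy the helmet and skip the bike.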
Your friend has a budget and a goal, just like any business, and s/he wants to use data analysis to make purchasing decisions and investments. One of the things s/he has to decide is how to spend her/his training time. Knowing your skills in BI, s/he has asked you to help improve her/his performance through scientific analysis. You know s/he is slow in all three areas: swimming, biking and running. But YOU ARE good at data analysis. You searched the Internet for published statistics on Ironman and found several sources of good data (such as runtri.com; you may find others during your research).

Can you help your friend out with the following project? What should s/he concentrate on? Given this information:

- What are the correlations between the different variables and finishing place? Which 5 variables are most strongly correlated with finishing place? Run this analysis for at least the 5 most recent consecutive years.
- What is the likelihood that the person with the fastest swim time will be the winner? (Independent Probability 1)
- What is the likelihood that the person with the fastest bike time will be the winner? (Independent Probability 2)
- What is the likelihood that the person with the fastest run time will be the winner? (Independent Probability 3)
- What is the probability of the 1st person out of the water winning the race? The 2nd person? The 3rd person? Please either show your calculations or explain them. (Combined Probability)
- What is the probability of the 1st person off the bike winning the race? The 2nd person? The 3rd person? Please either show your calculations or explain them. (Combined Probability)

Information needs to be actionable in order to be useful. Based on what you've learned, what do you think is most important to focus on in training for an Ironman: Swim, Bike, Run or Transition Times? Use the data to support your answer. Please use appropriate charts and/or graphs to demonstrate.
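The questions above could be approached along the following lines. This is a minimal sketch under several assumptions: the two "years" of results are hypothetical toy numbers (not real Ironman data), the field names and helper functions are invented for illustration, the probabilities are estimated as simple empirical frequencies across races, and Pearson correlation is used for simplicity (a rank correlation such as Spearman may suit finishing place better). Ties are ignored.

```python
from math import sqrt

FIELDS = ("swim", "t1", "bike", "t2", "run", "place")

# Hypothetical toy results (minutes per leg, overall place); NOT real race data.
RAW = {
    "year_a": [(52, 3, 275, 2, 170, 1), (50, 4, 280, 3, 175, 2),
               (55, 3, 270, 3, 185, 3), (58, 5, 285, 4, 180, 4),
               (54, 4, 290, 3, 190, 5), (60, 6, 295, 5, 200, 6)],
    "year_b": [(48, 3, 268, 2, 172, 1), (51, 4, 272, 3, 168, 2),
               (53, 3, 266, 3, 181, 3), (50, 5, 281, 4, 179, 4),
               (57, 5, 288, 4, 186, 5)],
}
races = {yr: [dict(zip(FIELDS, row)) for row in rows] for yr, rows in RAW.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def correlations_with_place(rows):
    """Correlate each variable with finishing place, strongest first."""
    places = [r["place"] for r in rows]
    corr = {f: pearson([r[f] for r in rows], places)
            for f in FIELDS if f != "place"}
    return sorted(corr.items(), key=lambda kv: abs(kv[1]), reverse=True)


def fastest_leg_win_rate(races, leg):
    """Fraction of races in which the fastest split on `leg` also won overall."""
    wins = sum(min(rows, key=lambda r: r[leg])["place"] == 1
               for rows in races.values())
    return wins / len(races)


def winner_position_after(races, legs):
    """Empirical P(winner was in k-th position) after the cumulative `legs`."""
    counts = {}
    for rows in races.values():
        def elapsed(r):
            return sum(r[leg] for leg in legs)
        order = sorted(rows, key=elapsed)
        winner = next(r for r in rows if r["place"] == 1)
        k = order.index(winner) + 1
        counts[k] = counts.get(k, 0) + 1
    return {k: c / len(races) for k, c in sorted(counts.items())}


for yr, rows in races.items():
    print(yr, [(f, round(c, 2)) for f, c in correlations_with_place(rows)])
for leg in ("swim", "bike", "run"):
    print(f"P(fastest {leg} wins) = {fastest_leg_win_rate(races, leg):.2f}")
print("out of water:", winner_position_after(races, ["swim"]))
print("off the bike:", winner_position_after(races, ["swim", "t1", "bike"]))
```

Note the distinction the sketch draws between the fastest bike split and being first off the bike: the latter depends on cumulative time through swim, T1 and bike. With real data, the same functions would be run over at least five consecutive years, and two races is far too few to trust the resulting frequencies.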
PLEASE NOTE: If there is another sport that you are more passionate about, please feel free to substitute it for Ironman. Just make sure you can translate and answer the above technical questions without compromising BI-PTR standards.

Warning: runtri.com is a free, live site and is constantly changing, so you may have a different experience. In my lecture, I used it as a fixed example just to show what types of information you should aim for. There are many other tools out there. You can use any tools you wish as long as you can answer the above questions.