Undergraduate students from the University of Toronto competed to see who could best analyze Big Data and attract the interest of employers at the 2016 American Statistical Association (ASA) DataFest @ UofT competition, a unique collaboration between academe, students and industry. The event was held April 30 through May 1 at the Department of Computer Science Innovation Lab (DCSIL), a UofT accelerator, said Nathan Taback, Assistant Professor, Teaching Stream, Department of Statistical Sciences and lead organizer of ASA DataFest @ UofT. The Department of Statistical Sciences, DCSIL, and University of Toronto Scarborough, Computer and Mathematical Sciences jointly sponsored the event. This is the first time the event has been held at a Canadian university.
ASA DataFest is an annual competition in which student teams work to reveal insights from a large and rich data set. This unique program takes data-analysis learning beyond the constraints normally encountered in a typical statistical science course by enabling the students to work with Big Data provided by a real client.
During the 48-hour event, which began on Friday afternoon and concluded Sunday afternoon, each team competed head-to-head with all other teams for prizes in categories including creativity, best use of advanced statistical methods, best use of external information, and best data visualization. Each team presented its findings to a panel of judges composed of professors and data scientists from industry.
Perhaps just as important, the student-competitors had the opportunity to catch the attention of representatives from various companies and organizations, who attended the event to advise the competitors and to identify the students with the best quantitative and analytical skills for potential job opportunities. Students who have done well at past DataFests have proven that they can navigate the ‘data deluge’, and this is very attractive to potential employers.
Each year, the data and the challenge are different, but the common theme carries over: making sense of Big Data that is larger and more complex than the data sets undergraduate students usually encounter in a classroom. The data set this year consisted of more than 4 million records of ticket sales from Ticketmaster, but it was not unveiled until the start of the competition, so participating students could not prepare in advance for the event.
DataFest was first held by the Statistics Department at the University of California, Los Angeles (UCLA) in 2011 and expanded to Duke University the following year. This year, the ASA DataFest program is expanding again, with a total of 21 competitions involving nearly 34 schools being staged in the United States, Germany and Canada. The participating institutions can be found at https://www.amstat.org/education/datafest/participants.cfm.