Recognizing that the future technical and information technology workforce will increasingly need to be prepared for a data-driven work environment, Yahoo! is enabling students and faculty at Rutgers to learn how to analyze huge data sets and to advance research on large-scale data platforms.
This type of training will make possible the analysis of troves of data crawled from the Web, data from and about social networks, data about protein structures, and multimedia data including massive amounts of video and audio -- data measured in petabytes. (One petabyte = 1,000,000 gigabytes.)
Yahoo! awarded Mor Naaman, assistant professor of library and information science at the School of Communication and Information, $10,000 as part of its Faculty Research and Engagement Program. Naaman will also receive access to the search engine company’s datasets -- an immense amount of information from search queries, Twitter, and other Internet applications. Yahoo! also contributed $2,500 to develop the data science laboratory at the SC&I LAIR Lab (Laboratory for Advanced Information Research).
“This support will help us train researchers and students in big paradigms emerging from data and machine science,” Naaman said. “Companies are interested in people that know how to work with big data. This type of training can point students in the direction of 21st century jobs.”
To further support data science research initiatives and education, Yahoo! donated 50 computing servers that are housed in the Department of Computer Science at Rutgers. The servers are set up as an Apache Hadoop cluster, running the open-source framework for distributed storage and processing that makes possible the analysis of large data sets. The company uses such clusters (at a much larger scale) in its own tools.
“Yahoo! is all about big – we reach at least 700 million users,” said Ken Schmidt, director of Yahoo! Academic Relations. “In a day we could easily acquire a petabyte of information – searches, chats, conversation. All of that amounts to a lot of information, and the research that Mor is doing can pave the way to a more personalized web experience.”
Another way the company is fostering academic collaboration with Rutgers around the topic of data science is through the Yahoo! Data Sciences Seminar, where speakers from various academic institutions and from industry visit Rutgers to talk about a wide range of data science topics. “It creates a broader community that links scientists and academic researchers,” Schmidt said.