National Science Foundation Awards SC&I Professor Paul Kantor nearly $1.0 Million to Help Create Ways to Analyze ‘Big Data’

Rutgers Teams with Cornell and Princeton to Improve Scientists’ Access to New Ideas


Professor Kantor can be reached at 732-322-8412.

NEW BRUNSWICK, N.J. – The National Science Foundation (NSF) has awarded Rutgers researcher Paul Kantor nearly $1.0 million as part of an initiative to extract useful information from so-called “big data” – massive collections of data from sources such as scientific documents, orbiting instruments, digital images, social media streams, and business transactions.

While it’s easy to dismiss big data as the latest business and scientific buzzword, the problems are important at every scale. Scientific progress depends crucially on being able to find and digest exactly the right parts of the tsunami of scientific literature.

The Rutgers grant is part of a $3 million research effort, in collaboration with Cornell and Princeton, to improve the accuracy and relevance of complex scientific literature searches. The $3 million effort represents 20% of the $15 million in NSF funding announced October 3 as part of the agency’s big data research initiative, launched earlier in the year.

Paul Kantor is a professor in the Department of Library and Information Sciences in the School of Communication and Information, where he heads the LAIR Laboratory.

“Scientists need to do the equivalent of Google searches on their literature,” said Kantor, “but keywords alone are not sufficient to perform searches quickly and efficiently.” The researchers are investigating methods to search on topics and concepts that would include collaborative information from other searchers to better determine the relevance of the results. These methods could further peg the value of a newly retrieved document by relating it to other documents that the searcher identifies as worthwhile. The researchers have been working with the “arXiv” online public archive of scientific papers run by Cornell University, founded by project collaborator Prof. Paul Ginsparg of Cornell (Physics).

The scientific components of the project address deep problems in information retrieval. Prof. Thorsten Joachims of Cornell (Computer Science) is developing algorithms to address the fact that the value of a document may depend on synergy with other documents. Prof. Peter Frazier of Cornell (Operations Research and Information Engineering) is developing algorithms that may show the user some document that is not exactly what she wants, but which helps the system understand her goals. Usually, search engines index documents by counting the words that are in them. Prof. David Blei of Princeton is developing an innovative new approach that adds information on topics that his algorithms find automatically by analyzing the documents. Finally, all of this will be studied in a framework of experiments, designed at Rutgers, which will accelerate the pace of the research.

Jorge Reina Schement, dean of the School of Communication and Information, says, "We've been working on collaborative finding of information since the AntWorld system was developed here in the 1990s. With this award, Paul Kantor has brought our program in collaborative information finding into the world of BIG DATA. We are very pleased that this collaboration with Cornell and with Princeton has received one of the largest awards in the NSF's new research program. The faculty of SC&I's LIS department does research at the interface between information systems and their users, and this large research grant opens new beachheads for us."

The researchers have been working together for a little over a year, and are putting the finishing touches on the experimental system that will be used in the research.

Media Contact: Alyse Mattioli

732-932-1500 ext. 8405


About Rutgers University's School of Communication and Information:

The School of Communication and Information (SC&I) at Rutgers, The State University of New Jersey in New Brunswick, NJ, is known for its creativity and commitment to solving society’s problems. SC&I community members work at the interface of communication and information to understand: global media, community, and democracy; health and wellness; organizations, policy, and leadership; and social media interaction and collaborative design. SC&I offers a variety of undergraduate, graduate, and continuing education opportunities on campus, online, and through hybrid formats. Additional information is available by calling 732-932-7500.