Tuesday, February 09, 2010

Data Repositories

In advance of the DID Challenge, the funders approached many major repositories of digital cultural heritage materials and asked them to provide contact and technical support information for gaining access to their collections.  This list is constantly being updated, so check back often.  If you are a representative of such a collection and wish to be added to this list, please contact the DID Challenge organizers.

Current List of Data Repositories

(Last Updated:  16 June, 2009)


Read the RFP

Interested in applying? Below you will find the Request for Proposals (RFP) for the DID Challenge as well as four agency-specific addenda.  Please read all materials carefully.

Main DID RFP [PDF]


How to Apply

Before applying, please read both the main RFP and the RFP Addenda.

Submit a Letter of Intent
(LOI deadline now passed)

Submit a Final Application
(deadline now passed)

Thank you for your interest!  Awards to be announced in December 2009.


Frequently Asked Questions (FAQs)
(last modified 14 July)
 


Awardees of 2009 Digging into Data Challenge

Congratulations to the 2009 Digging into Data Challenge Awardees and a big "thank you" to all the teams that competed and the many libraries and archives that made their collections available to the researchers.

Important note: Each DiD project is composed of an international team of scholars and scientists. Each team had to select one institution to be the awardee of record for each granting agency. However, many other institutions played critical roles in each project. Below, we list both the awardees of record and other key team members.

Structural Analysis of Large Amounts of Music Information


Awardees: Stephen Downie, University of Illinois at Urbana-Champaign, NSF; David De Roure, University of Southampton, JISC; Ichiro Fujinaga, McGill University, SSHRC.
Additional Key Participants:  The Internet Archive, Indiana University, Anthology of Recorded Music (DRAM), and the British Broadcasting Corporation.
Description:  This project will gather approximately 23,000 hours of digitized music representing a wide range of styles, regions and time periods. The goal is to develop tools to tag and analyze the underlying structures of this music, resulting in a body of world music that will provide music scholars with interactive access to previously unavailable analysis and insights.

Digging into the Enlightenment: Mapping the Republic of Letters


Awardees: Dan Edelstein, Stanford University, NEH; Chris Weaver, University of Oklahoma, NSF; Robert McNamee, University of Oxford, JISC.
Additional Key Participants: The National Publication of the Works of Antonio Vallisneri, Princeton University, Uppsala University, Utrecht University, Bard Graduate Center, International Center for the History of Universities and Science, University of California at Berkeley, Rutgers University, and the French National Center for Scientific Research.
Description: This project will focus on a body of 53,000 18th-century letters, and analyze the degree to which the effects of the Enlightenment can be observed in the letters of people of various occupations.

Using Zotero and TAPoR on the Old Bailey Proceedings: Data Mining with Criminal Intent

Awardees: Dan Cohen, George Mason University, NEH; Tim Hitchcock, University of Hertfordshire, JISC; Geoffrey Rockwell, University of Alberta, SSHRC.
Additional Key Participants:  The National Archives (United Kingdom), McMaster University, the Open University, Amherst College, University of Sheffield, Trent University, and the University of Western Ontario.
Description: This project will create an intellectual exemplar for the role of data mining in an important historical discipline – the history of crime – and illustrate how the tools of digital humanities can be used to wrest new knowledge from one of the largest humanities data sets currently available: the Old Bailey Online.

Towards Dynamic Variorum Editions

Awardees: Gregory Crane, Tufts University, NEH; John Darlington, Imperial College, London, JISC; Bruce Robertson, Mount Allison University, SSHRC.
Additional Key Participants: The University of Massachusetts, Amherst; University of Leipzig; Cairo University; and Humboldt University, Berlin.
Description: This project will create a framework to produce "dynamic variorum" editions of classics texts that enable the reader to link automatically not only to variant editions but also to relevant citations, quotations, people, and places found in a digital library of over one million primary and secondary source texts.

Digging into Image Data to Answer Authorship Related Questions


Awardees: Dean Rehberger, Michigan State University, NEH; Peter Bajcsy, University of Illinois at Urbana-Champaign, NSF; Peter Ainsworth, University of Sheffield, JISC.
Additional Key Participants: The Alliance for American Quilts.
Description: This project will pursue research using advanced computational techniques to explore humanities themes related to the authorship of large collections of cultural heritage materials, namely 15th-century manuscripts, 17th- and 18th-century maps, and 19th- and 20th-century quilts.

Harvesting Speech Datasets for Linguistic Research on the Web

Awardees: Mats Rooth, Cornell University, NSF; Michael Wagner, McGill University, SSHRC.
Description: This project will harvest audio and transcribed data from podcasts, news broadcasts, public and educational lectures and other sources to create a massive corpus of speech. Tools will then be developed to analyze the different uses of prosody (rhythm, stress and intonation) within spoken communication.

Railroads and the Making of Modern America—Tools for Spatio-Temporal Correlation, Analysis, and Visualization

Awardees: William Thomas, University of Nebraska-Lincoln, NEH; Richard Healey, University of Portsmouth, JISC.
Additional Key Participants: The University of Victoria, McGill University, Paris One University, University of Lancaster, Middlebury College, and Stanford University.
Description: This project will integrate a vast collection of textual, geographical and numerical data to allow for the visual presentation of the railroads and their impact on society over time, concentrating initially on the Great Plains and Northeast United States.

Mining a Year of Speech

Awardees: Mark Liberman, University of Pennsylvania, NSF; John Coleman, University of Oxford, JISC.
Additional Key Participants: The British Library.
Description: This project focuses on large-scale data analysis of audio -- specifically the spoken word. It will create tools to enable rapid and flexible access to over 9,000 hours of spoken audio files, containing a wide variety of speech, drawn from some of the leading British and American spoken word corpora, allowing for new kinds of linguistic analysis.


Announcing the Digging into Data Challenge

The Digging into Data Challenge is an international grant competition sponsored by four leading research agencies: the Joint Information Systems Committee (JISC) from the United Kingdom, the National Endowment for the Humanities (NEH) from the United States, the National Science Foundation (NSF) from the United States, and the Social Sciences and Humanities Research Council (SSHRC) from Canada.

What is the "challenge" we speak of?  The idea behind the Digging into Data Challenge is to answer the question "what do you do with a million books?"  Or a million pages of newspaper? Or a million photographs of artwork?  That is, how does the notion of scale affect humanities and social science research? Now that scholars have access to huge repositories of digitized data -- far more than they could read in a lifetime -- what does that mean for research?  

Applicants will form international teams from at least two of the participating countries.  Winning teams will receive grants from two or more of the funding agencies and, one year later, will be invited to show off their work at a special conference. Our hope is that these projects will serve as exemplars to the field.


The advent of what has been called “data-driven inquiry” or “cyberscholarship” has changed the nature of inquiry across many disciplines, including the sciences and humanities, revealing new opportunities for interdisciplinary collaboration on problems of common interest. The creation of vast quantities of Internet-accessible digital data and the development of techniques for large-scale data analysis and visualization have led to remarkable new discoveries in genetics, astronomy, and other fields and, importantly, to connections between academic disciplines. New techniques of large-scale data analysis allow researchers to discover relationships, detect discrepancies, and perform computations on data sets so large that they can be processed only with computing resources and computational methods developed and made economically affordable within the past few years.

With books, newspapers, journals, films, artworks, and sound recordings being digitized on a massive scale, it is possible to apply data analysis techniques to large collections of diverse cultural heritage resources as well as scientific data. How might these techniques help scholars use these materials to ask new questions about and gain new insights into our world? To encourage innovative approaches to this question, four international research organizations are organizing a joint grant competition to focus the attention of the social science and humanities research communities on large-scale data analysis and its potential application to a wide range of scholarly resources.

The goals of the initiative are

  • to promote the development and deployment of innovative research techniques in large-scale data analysis;
  • to foster interdisciplinary collaboration among scholars in the humanities, social sciences, computer sciences, information sciences, and other fields, around questions of text and data analysis;
  • to promote international collaboration; and
  • to work with data repositories that hold large digital collections to ensure efficient access to these materials for research.

 

If you are interested in taking up this challenge, please read the RFP and addenda available on this page. 




Press

Press Releases about the Launch of the Digging into Data Challenge (January 2009)

JISC, NEH, NSF, SSHRC

Press Releases about Awardees (December 2009)

JISC, NEH, NSF, SSHRC

Speech by NEH Chairman Jim Leach at DiD awards ceremony.

Articles about the Winning Projects

PhysOrg.com, December 4, 2009, "'Digging into Data Challenge' grant awarded"

The Tufts Daily, December 4, 2009, "Classics department researchers earn grant"

The Chronicle of Higher Education, December 4, 2009, "A 'New Digital Class' Digs Into Data"

Inside HPC, December 7, 2009, "What would you do with one million books?"

HPCWire, December 11, 2009, "Grant Supports Computational Analysis Of Manuscripts, Maps and Quilts"

The Chronicle of Higher Education, December 13, 2009, "How to Prepare Your College for an Uncertain Digital Future"

The Mason Gazette, December 15, 2009, "Digging through the History of Crime Wins Center a Federal Grant"

McGill Reporter, December 17, 2009, "Two McGill researchers among winners of new international competition"


Related Reading

Below are some related readings. Please feel free to send us further suggestions.

Anderson, Chris. "The End of Theory: The Data Deluge Makes the Scientific Method Obsolete." Wired 16.07 (2008)

Arms, William, and Ronald Larsen (editors). "The Future of Scholarly Communication: Building the Infrastructure for Cyberscholarship." NSF/JISC workshop, Phoenix, Arizona, April 17 to 19, 2007.

Clement, Tanya, Sara Steger, John Unsworth, and Kirsten Uszkalo. "How Not to Read a Million Books?" Presentation at Harvard University (2008)

Cohen, Daniel. "From Babel to Knowledge: Data Mining Large Digital Collections." D-Lib Magazine 12.3 (2006)

Cohen, K. B., and L. Hunter. "Getting Started in Text Mining." PLoS Computational Biology 4(1): e20, doi:10.1371/journal.pcbi.0040020

Crane, Greg. "What Do You Do with a Million Books?" D-Lib Magazine 12.3 (2006).

Friedlander, Amy (editor). "Promoting Digital Scholarship: Formulating Research Challenges in the Humanities, Social Sciences and Computation." Council on Library and Information Resources (2008).

Halevy, Alon, Peter Norvig, and Fernando Pereira. "The Unreasonable Effectiveness of Data." IEEE Intelligent Systems, vol. 24, no. 2, pp. 8-12, Mar./Apr. 2009, doi:10.1109/MIS.2009.36

Venter, Craig. "Bigger Faster Better." Seed, November 20 (2008).


Contact


The Digging into Data Challenge is being administered by the Office of Digital Humanities at the National Endowment for the Humanities. To contact ODH with questions of any kind, please send us an e-mail. If you have funder-specific questions, please see the contact information found in each RFP Addendum.

