Harvard University Data Science Fellow in Cambridge, Massachusetts

Details

Title Data Science Fellow

School Faculty of Arts and Sciences

Department/Area Institute for Quantitative Social Science (IQSS)

Position Description

This posting is for a Data Science Fellow to participate in the design of the Automated History Archive. Many of the biggest challenges that our society faces have their roots in the past, and history can provide fundamental insights into their causes and potential solutions. However, vast amounts of historical quantitative data that could shed light on important issues remain locked in hard copy due to prohibitive curation costs. Historical data are often scattered irregularly amongst text in the original publications. Commercial OCR software performs poorly when tables are irregular, often requiring the user to denote the table structures manually by drawing boxes. Off-the-shelf tools for table assembly using clustering-based machine learning methods do not exist.

The Automated History Archive will automate the conversion of historical quantitative images into classified, machine-readable datasets on a large scale and deposit these in a collaborative, open-source data platform. Building on our initial successes, the fellow will play a core role in developing algorithm prototypes that integrate computer vision tools for recognizing data structures in the raw images with machine learning techniques for classifying digitized table fragments.

The position provides a unique opportunity for a promising young scholar – planning to pursue a PhD in Engineering, Computer Science, or a quantitative Social Science – to be immersed in a top-notch research environment. The initiative is housed in Harvard’s Institute for Quantitative Social Science (IQSS), which is dedicated to understanding and solving society’s greatest problems through bold and collaborative social and data science. The fellow will work closely with the PI, Professor Melissa Dell. The fellow will be an active participant in the Harvard research community and will have opportunities to develop their own research agenda on issues related to the initiative.

There are two open positions, each with a one-year term and a potential opportunity for extension (conditional on funding availability and performance). The start date is flexible.

Basic Qualifications

Applicants should have experience working with machine learning methods for image data. Beyond this, it is imperative that applicants have an interest in advancing methodology for non-standard uses, in particular the automated extraction of structured data from large datasets of historical document scans. The Data Science fellowship requires developing new methods, not simply applying existing tools. The position requires a Bachelor's degree.

Special Instructions

Please do NOT apply via ARIES. Only applicants who follow the application instructions will be considered.

Interested candidates should send a CV, transcript, and one letter of reference to gpetrakis@fas.harvard.edu. The subject line should contain the phrase “IQSS fellow application.” Candidates who advance to the next round will be required to complete a programming exercise, submit a research statement, and participate in a Skype interview.

Contact Information

Please see Special Instructions section.

Contact Email gpetrakis@fas.harvard.edu

Equal Opportunity Employer

We are an equal opportunity employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability status, protected veteran status, or any other characteristic protected by law.

Minimum Number of References Required 1
