Institute of Linguistics Graduate Student Research

Recent Submissions

  • Item
    Dataset for Research on Lexical Borrowings in French
    (dataset self-published online, 2010) Chesley, Paula; Baayen, R. Harald
    Dataset of lexical borrowings obtained from a corpus of newspaper French (Le Monde; Abeillé et al. 2003), as described in "Lexical Borrowings in French: Anglicisms as a separate phenomenon" and "Predicting new words from newer words: Lexical borrowings in French".
  • Item
    Data for Evaluation of Subcategorization Frames
    (dataset self-published online, 2006) Chesley, Paula; Salmon-Alt, Suzanne
    Data for evaluation of subcategorization frames as detailed in the paper "Automatic extraction of subcategorization frames for French", available at http://pages.cs.brandeis.edu/~marc/misc/proceedings/lrec-2006/pdf/101_pdf.pdf
  • Item
    Datasets for Using Verbs and Adjectives to Automatically Classify Blog Sentiment
    (dataset self-published online, 2006) Chesley, Paula; Vincent, Bruce; Xu, Li; Srihari, Rohini
    Training and test datasets for the paper "Using Verbs and Adjectives to Automatically Classify Blog Sentiment" (available online at http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.179.3144&rep=rep1&type=pdf ): zip files of blog texts manually classified as having positive, negative, or neutral sentiment.
  • Item
    Dataset for African-American English Hip-hop research
    (2011-10-07) Chesley, Paula
    This dataset corresponds to the paper "You know what it is: Learning words through listening to hip-hop" by Paula Chesley. The artists that participants listed are given both in alphabetical order and in order of frequency. An additional file lists the genre each artist was classified as (for cases where non-African-American hip-hop artists were not classified as hip-hop artists). An explanation file accompanies the dataset on which the studies were based; the dataset itself is in the file aaeHiphopChesley.csv. Please see the paper for further details on the model and related matters.