Experimentation in Psychology and Linguistics
Plan for this course
- Why run experiments?
- General principles of experimental design
- Methods used in experiments in psychology and linguistics
- Data analysis
Why run experiments?
Where experimentation is useful
- Hypothesis testing
- Evaluating models
Where experimentation is useful
- UX/interface design – not covered in class, but can be a (final) project!
- Evaluating observational studies, e.g. from data mining
- Pattern recognition – OkCupid example later today
- Human behaviour in social/emerging media
- Social engineering – Facebook example later today
- Developing gold standards
- Human capability as a baseline (at least) – Alibaba and Microsoft reading-comprehension models later today
Evaluating observational studies
What should I say in a first message on OkCupid?
Avoid physical compliments!
- Does adding “awesome” to a message increase the response rate?
- Does adding “beautiful” decrease it?
Avoid physical compliments?
- Is the addressee reacting to the words or the sender themselves?
- Who is the sender? Are they always effusive?
- Who is the addressee? What factors would change the addressee’s likelihood of replying to any message?
- Is the addressee actually beautiful?
- If so, this may have two separate effects (e.g., an expectation of being lauded, or an aversion to compliments)
(Drawing liberally on O’Neil & Schutt 2014)
Designing an experiment to test OkCupid’s observations
- What are the independent and dependent factors?
- How could we mitigate the confounds and random factors?
- How do we make sure we have enough statistical power?
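One way to approach the power question is an a priori sample-size calculation. The sketch below is only an illustration, not a method taken from the course materials: it assumes the dependent measure is a binary reply/no-reply outcome compared across two message conditions with a two-sided two-proportion z-test, and it uses the standard normal-approximation formula. The function name `sample_size_two_proportions` and the example response rates are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Approximate number of messages needed PER CONDITION to detect a
    difference between response rates p1 and p2 (two-sided z-test,
    normal approximation). All inputs are assumptions supplied by the
    experimenter, not values taken from OkCupid."""
    if p1 == p2:
        raise ValueError("p1 and p2 must differ to define an effect size")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for desired power
    p_bar = (p1 + p2) / 2                          # pooled rate under H0
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)


# Hypothetical example: baseline reply rate 30%, hoped-for rate 35%.
n_per_condition = sample_size_two_proportions(0.30, 0.35)
print(n_per_condition)
```

With these made-up rates, the calculation calls for well over a thousand messages per condition. Smaller expected differences or higher target power push the number up quickly.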
We can do better than “correlation is not causation”:
- Evaluating dependent and independent factors
- Anticipating confounds and extraneous factors
- Better understanding the characteristics of a complex problem
- Emotional states transferred via emotional contagion
- Misspecification of contextual variables?
- Failure to account for shared experiences?
- Addressing these issues with a controlled experiment
Factors to consider
- Contagion as a result of interaction with, or just exposure to, a (happy/sad) person
- Passed on verbally or non-verbally?
- Correlation between positive and negative moods:
- The happiness of others might make us sad: the “alone together” social comparison effect
Does exposure to positive/negative content lead to posting content consistent with exposure?
- 689,003 (unsuspecting!) participants exposed to emotional expressions on their News Feed (Kramer et al. 2014)
- Emotional content as a between-subjects factor
- Depending on condition, positive/negative emotional content was reduced in the News Feed
- Emotional valence determined by Linguistic Inquiry and Word Count software (LIWC2007)
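The design above can be sketched in a few lines of code. This is a toy illustration under several assumptions, not the procedure used in the study: the word lists are made-up stand-ins (the LIWC2007 dictionaries are proprietary), a post is treated as positive or negative if it contains at least one word from the respective list, and the condition names, `omit_prob`, and all function names are hypothetical.

```python
import random
import re

# Toy word lists: hypothetical stand-ins, NOT the LIWC2007 dictionaries.
POSITIVE_WORDS = {"happy", "great", "love", "awesome"}
NEGATIVE_WORDS = {"sad", "awful", "hate", "terrible"}


def classify_post(text):
    """Flag a post as positive/negative if it contains at least one
    word from the corresponding list (a post can be both)."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return {
        "positive": bool(words & POSITIVE_WORDS),
        "negative": bool(words & NEGATIVE_WORDS),
    }


def assign_conditions(participant_ids, conditions, seed=0):
    """Between-subjects assignment: shuffle participants, then deal them
    round-robin so each person gets exactly one condition and group
    sizes differ by at most one."""
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    return {pid: conditions[i % len(conditions)] for i, pid in enumerate(shuffled)}


def filter_feed(posts, condition, omit_prob, rng):
    """Drop posts of the targeted valence with probability omit_prob;
    any other condition name (e.g. 'control') leaves the feed intact."""
    target = {
        "positivity_reduced": "positive",
        "negativity_reduced": "negative",
    }.get(condition)
    kept = []
    for post in posts:
        if target and classify_post(post)[target] and rng.random() < omit_prob:
            continue
        kept.append(post)
    return kept


feed = ["I love this!", "What an awful day", "Meeting at noon"]
groups = assign_conditions(range(6), ["positivity_reduced", "negativity_reduced", "control"])
print(groups)
print(filter_feed(feed, "positivity_reduced", omit_prob=1.0, rng=random.Random(1)))
```

The dependent measure would then be computed with the same classifier on what participants post afterwards, which is why the choice of word lists matters for both the manipulation and the outcome.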
Developing gold standards
Natural language comprehension
- Developing AI-driven virtual assistants and chatbots
- Emulating humans’ natural language understanding
- Achieving human-like ability is already a challenge, not to mention surpassing it
Reading AI scoring better than humans on SQuAD
Other examples of gold standards
- Text classification
- Perception and categorization (e.g. of objects)
- Later in this course: comparing human classification of tweets with text classifiers
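When human labels serve as the gold standard, one common way to compare them with a classifier's output is a chance-corrected agreement score such as Cohen's kappa. The sketch below is a minimal pure-Python version offered as an illustration only. The course may use a different measure, and the tweet labels shown are invented.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length label sequences:
    (observed agreement - chance agreement) / (1 - chance agreement)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    # Chance agreement: product of each rater's marginal proportion per label.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / n**2
    if expected == 1:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Invented labels for four tweets: human annotator vs. a text classifier.
human = ["pos", "pos", "neg", "neg"]
classifier = ["pos", "neg", "neg", "neg"]
print(cohens_kappa(human, classifier))  # 0.5
```

Here the raw agreement is 3 out of 4, but half of that would be expected by chance given how often each label is used, so kappa comes out at 0.5.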
Course structure and expectations
- Lecture and discussion on Tuesdays
- Lab on Thursdays
- Attendance mandatory
- Hands-on course
- Focus on discussions and collaboration rather than lectures
- Developing a variety of skills
- Experimental design: Follow-up experiment and final paper (weeks 2-3 and 6-9, respectively)
- Abstract writing: homework assignment (weeks 2-3)
- Research proposal writing (weeks 6-7)
- Peer review: review abstract (week 3), review experimental proposal (week 7)
- Data analysis (throughout the course)