New York Times Headline Analysis

By Kristina Becvar in New York Times Text Analysis Afghanistan Research Design Projects

May 1, 2022

The primary goal of this part of the research is to refine the process for examining the full text of the articles whose main and print headlines differ most from each other in the primary project analysis. 1


New York Times Headline Sentiment Analysis

Previous Research

For this project, I am using some data gathered in the DACSS 602 course “Research Design”. In the project for that course, our research group posed the question:

How did the sentiment of news reporting on the U.S. withdrawal from Afghanistan shift between the time it was agreed upon under former President Trump and the time it was executed by President Biden?

Basic Research Design: Manual coding of a stratified, representative sample of 300 articles after reaching an appropriate inter-coder reliability rating.
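The source does not say which inter-coder reliability statistic the group used. As one common choice, the sketch below computes Cohen's kappa for two coders; the function name and the example codes are hypothetical and are not taken from the project data.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who labelled the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(coder_a)
    # Share of items on which the two coders gave the same label.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's label frequencies.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both coders used a single identical label; treat as full agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical binary codes (1 = category present) for ten articles.
example_a = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
example_b = [1, 1, 0, 0, 0, 0, 1, 1, 1, 1]
example_kappa = cohens_kappa(example_a, example_b)
```

Kappa corrects raw percent agreement for the agreement two coders would reach by chance, which matters for categories that are present in most articles.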

Data Sample: We coded articles that mention Afghanistan from the time of the Doha Agreement (February 29, 2020) through September 30, 2021, following the Congressional testimonies conducted September 28-29, 2021 regarding the withdrawal. The articles were collected from the New York Times and the Wall Street Journal World and News sections.

Text Coding Method: We used NVivo 12 to code news articles covering the U.S. withdrawal from Afghanistan during the period leading up to and following the day the last of the U.S. forces left Afghanistan.

Methods/Text Coding Categories:

  • Ground Source: When the reporting is on location
  • Military Source: When there is a quote from a U.S. military spokesperson, leader, or commander
  • Conflict Veteran Source: When there is a quote from a U.S. military veteran who served in Afghanistan
  • Framing: When there is a framing of the Taliban as a threat
  • Sentiment Rating: How the coder felt while reading the article

Outcomes: The article source (NYT v. WSJ) served as a moderator, with the outcomes being the analysis of media frames ‘before and after’ (Trump admin v. Biden admin).


Current Research

I continued down the same path but with new data and a new direction through the DACSS 697D course “Text as Data”.

This project examines the difference in headlines between the paper and online versions of the New York Times articles related to the withdrawal of U.S. troops from Afghanistan. The analysis includes articles that mention “Afghanistan” from the time of the Doha Agreement (February 29, 2020) through September 30, 2021, following the Congressional testimonies conducted September 28-29, 2021.

Analysis of a corpus compiled from data obtained through the New York Times API showed no statistically significant differences between the print and online headlines when scored with three widely used sentiment and emotion lexicons.
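A minimal sketch of this kind of comparison follows: each headline is scored against a word-valence lexicon, and the paired differences are tested with a sign-flip permutation test. The toy lexicon, the function names, and the choice of test are assumptions made for illustration; the source does not state which significance test was used.

```python
import random
import re

# Hypothetical toy lexicon in the AFINN style (word -> integer valence).
TOY_LEXICON = {"chaos": -2, "fear": -2, "deadly": -3, "hope": 2, "peace": 2, "safe": 1}


def score(headline, lexicon=TOY_LEXICON):
    """Sum the valence of every lexicon word found in the headline."""
    tokens = re.findall(r"[a-z']+", headline.lower())
    return sum(lexicon.get(token, 0) for token in tokens)


def paired_sign_flip_test(diffs, n_perm=5000, seed=0):
    """Two-sided permutation p-value for 'the mean paired difference is zero'."""
    rng = random.Random(seed)
    observed = abs(sum(diffs))
    hits = 0
    for _ in range(n_perm):
        # Under the null hypothesis the sign of each difference is arbitrary.
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(flipped) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical (online, print) headline pairs for the same articles.
pairs = [
    ("Chaos and Fear at Kabul Airport", "Evacuation Continues at Kabul Airport"),
    ("Hope for Peace Fades in Doha", "Talks Stall in Doha"),
    ("Deadly Blast Hits Kabul", "Blast Hits Kabul"),
]
diffs = [score(online) - score(printed) for online, printed in pairs]
p_value = paired_sign_flip_test(diffs)
```

Pairing the two headlines of the same article keeps the article's subject constant, so any difference reflects wording choices rather than topic.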

Topic modeling and examining a co-occurrence matrix of each set of headlines showed patterns in which types of words are chosen for the respective audience.
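As a rough illustration of the co-occurrence step, the sketch below counts how often two terms appear in the same headline. The stopword list, function names, and sample headlines are hypothetical, and the project's actual preprocessing may differ.

```python
import re
from collections import Counter
from itertools import combinations

# Small illustrative stopword list; a real analysis would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "in", "to", "and", "as", "for", "on", "at"}


def tokenize(headline):
    """Lowercase the headline and drop stopwords."""
    return [t for t in re.findall(r"[a-z']+", headline.lower()) if t not in STOPWORDS]


def cooccurrence(headlines):
    """Count unordered term pairs that occur together in a headline."""
    counts = Counter()
    for headline in headlines:
        # Sorting gives every pair one canonical (alphabetical) order.
        terms = sorted(set(tokenize(headline)))
        counts.update(combinations(terms, 2))
    return counts


# Hypothetical headlines standing in for one of the two headline sets.
sample_headlines = [
    "Taliban Seize Kabul as Government Collapses",
    "Taliban Enter Kabul",
    "American Troops Leave Kabul",
]
pair_counts = cooccurrence(sample_headlines)
```

Building one such table for the print headlines and another for the online headlines makes it possible to compare which word pairings are favored for each audience.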

Specifically, this preliminary analysis suggested that print headlines may carry fewer emotionally weighted words than online headlines.

Preliminary Research Presentation

Poster Presentation of Preliminary Research on This Project


Citations

A complete list of citations can be found on the GitHub Page for this project.

Lexicons Utilized
  • This research makes use of the NRC Word-Emotion Association Lexicon, created by Saif Mohammad and Peter Turney at the National Research Council Canada.

  • This research makes use of the Bing Lexicon. This dataset was first published in Minqing Hu and Bing Liu, “Mining and summarizing customer reviews,” Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD-2004), 2004.

  • This research makes use of the AFINN Lexicon. This dataset was first published in Nielsen, F. Å. (2011). A new ANEW: Evaluation of a word list for sentiment analysis in microblogs. arXiv preprint arXiv:1103.2903.


This project will continue in the coming months, using new tools developed in subsequent DACSS courses.


  1. The preliminary work on this project was done as part of the UMass Amherst DACSS course “Research Design”, taught by Professor Meredith Rolfe, and continued as part of the UMass Amherst DACSS course “Text as Data”, taught by Professor Eunkyung Song. ↩︎
