Event & Participant Extraction from News Articles (Automatic Timeline Generation of News Events)

The larger aim of the project is to create a chronological ordering of the important sub-events (and their participants) that make up a bigger event. I worked on building a strong baseline model for identifying important events in news articles and their associated participants. We used articles from the ECB+ and TimeBank corpora and performed extensive feature engineering with syntactic and semantic features, which were then fed to a CRF classifier to achieve the following F1 scores:

  • Event Extraction
    • ECB+ Corpus - 73.02%
    • TimeBank Corpus - 80.78%
  • Participant Extraction
    • ECB+ Corpus - 56.51%
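
To make the pipeline described above more concrete, here is a minimal sketch of per-token feature extraction and F1 scoring. It is an illustration under assumptions, not the project's actual code: the function names (`token_features`, `sentence_features`, `f1_score`), the specific features (lowercased word, suffix, capitalization, POS tags of the token and its neighbours), the `EVENT`/`O` label scheme, and token-level scoring are all hypothetical choices. The real feature set also included semantic features, which are not shown here.

```python
from typing import Dict, List


def token_features(tokens: List[str], pos_tags: List[str], i: int) -> Dict[str, object]:
    """Build a feature dict for token i (hypothetical feature set)."""
    word = tokens[i]
    feats: Dict[str, object] = {
        "bias": 1.0,
        "lower": word.lower(),
        "suffix3": word[-3:].lower(),
        "is_title": word.istitle(),
        "pos": pos_tags[i],  # syntactic feature: part-of-speech tag
    }
    # Context window of one token on each side.
    if i > 0:
        feats["prev_lower"] = tokens[i - 1].lower()
        feats["prev_pos"] = pos_tags[i - 1]
    else:
        feats["BOS"] = True  # beginning of sentence
    if i < len(tokens) - 1:
        feats["next_lower"] = tokens[i + 1].lower()
        feats["next_pos"] = pos_tags[i + 1]
    else:
        feats["EOS"] = True  # end of sentence
    return feats


def sentence_features(tokens: List[str], pos_tags: List[str]) -> List[Dict[str, object]]:
    """One feature dict per token, the usual input shape for a linear-chain CRF."""
    return [token_features(tokens, pos_tags, i) for i in range(len(tokens))]


def f1_score(gold: List[str], pred: List[str], positive: str = "EVENT") -> float:
    """Token-level F1 for a single positive label (assumed scoring scheme)."""
    tp = sum(g == positive and p == positive for g, p in zip(gold, pred))
    fp = sum(g != positive and p == positive for g, p in zip(gold, pred))
    fn = sum(g == positive and p != positive for g, p in zip(gold, pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

A sentence would be passed through `sentence_features` to obtain one feature dictionary per token, and the resulting sequences would then be handed to a CRF implementation for training. The predicted label sequence can be compared against the gold labels with `f1_score`.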