EVEX Dataset

The EVEX Dataset is the result of running the Turku Event Extraction System, together with BANNER and the McClosky-Charniak Parser, at PubMed scale. In 2010, the system was applied to all abstracts in the 2009 distribution of PubMed. In 2011, we transformed the data into a MySQL database, adding features such as gene-family-based event generalization.

News and updates

  • June 15, 2011: The data in MySQL format is now available for download (Read more / download).
  • Aug 5, 2010: The syntactic parses have now been released (Read more / download).
  • July 2, 2010: Release of a concise, normalized version of all events in XML (obsoleted in 2011 by the MySQL format).
  • June 30, 2010: Release of all events extracted from PubMed in text format (Read more / download).

Release history

  • June 2011: MySQL database containing the 2010 data together with gene-family and canonical-form based event generalizations
  • June 2010: Shared Task format files with all events extracted from the 2009 release of PubMed, together with an earlier canonical-form-based event generalization in XML format, now superseded by the 2011 MySQL release.