The EVEX Dataset is the result of running the Turku Event Extraction System, together with BANNER and the McClosky-Charniak Parser, at PubMed scale. In 2010, the system was applied to all abstracts in the 2009 distribution of PubMed. In 2011, we transformed the data into a MySQL database, adding features such as gene-family-based event generalization.
News and updates
- June 15, 2011: The data in MySQL format is now available for download (Read more / download).
- Aug 5, 2010: The syntactic parses have now been released (Read more / download).
- July 2, 2010: Release of a concise, normalized version of all events in XML (obsoleted in 2011 by the MySQL format).
- June 30, 2010: Release of all events extracted from PubMed in text format (Read more / download).
Release history
- June 2011: MySQL database containing the 2010 data together with gene-family-based and canonical-form-based event generalizations
- June 2010: Shared Task format files with all events extracted from the 2009 release of PubMed, together with an earlier canonical-form-based event generalization in XML format, now superseded by the 2011 MySQL release.