AQMAR Arabic Wikipedia Dependency Tree Corpus
http://www.ark.cs.cmu.edu/ArabicDeps/

This is a corpus of ten Arabic Wikipedia articles that have been annotated with POS tags and dependency parses. There are a total of 36202 tokens in 1262 sentences.

It is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License (see LICENSE).

Credit: Emad Mohamed, Carnegie Mellon University in Qatar
Contact: Behrang Mohit (behrang@cmu.edu), Carnegie Mellon University in Qatar

* Annotations were performed by one annotator in one round with minimal quality control.

* Annotations were performed using the Brat annotation tool (http://brat.nlplab.org/). Unzipping each article provides a directory of annotated input files (manually POS tagged sentences) along with the manually parsed sentences.

* The single annotator followed the framework and guidelines provided by the CATiB Arabic dependency treebank project (Habash and Roth 2009). First, MADA (Habash et al. 2006) was used to automatically POS tag the corpus. Next, the POS tags were manually examined and corrected by the annotator. Finally, the dependency parse annotations were applied.

* The ten Wikipedia articles are a subset of the Wikipedia corpus that was previously annotated for named entities (Mohit et al. 2012) and semantic supersenses (Schneider et al. 2012). These can be obtained at: http://www.ark.cs.cmu.edu/AQMAR/

* The annotated articles are:
   1. Atom
   2. Football
   3. Linux
   4. Mohammad Razi
   5. Crusades
   6. Internet
   7. Nuclear Technology
   8. Enrico Fermi
   9. Islamic Civilisation
  10. Raul Gonzales
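
Since the annotations were made with Brat, the files can probably be read with a small parser for Brat's standoff (.ann) format. The sketch below is only an illustration, not a tool shipped with the corpus. It assumes that POS tags are stored as text-bound annotations (T lines) and dependency arcs as binary relations (R lines); this README does not confirm that layout. The function name parse_brat_ann, the Token and Relation types, and the sample string are all hypothetical.

```python
from typing import Dict, List, NamedTuple, Tuple


class Token(NamedTuple):
    ann_id: str   # Brat annotation ID, e.g. "T1"
    label: str    # annotation type (assumed here to be the POS tag)
    start: int    # character offset where the span begins
    end: int      # character offset where the span ends
    text: str     # surface text covered by the span


class Relation(NamedTuple):
    ann_id: str   # Brat annotation ID, e.g. "R1"
    label: str    # relation type (assumed here to be the dependency label)
    arg1: str     # ID of the first argument
    arg2: str     # ID of the second argument


def parse_brat_ann(content: str) -> Tuple[Dict[str, Token], List[Relation]]:
    """Parse T (text-bound) and R (relation) lines of a Brat .ann file."""
    tokens: Dict[str, Token] = {}
    relations: List[Relation] = []
    for raw in content.split("\n"):
        line = raw.rstrip("\r")
        if not line.strip():
            continue
        fields = line.split("\t")
        ann_id = fields[0]
        if ann_id.startswith("T") and len(fields) >= 2:
            label, _, span = fields[1].partition(" ")
            # Discontinuous spans look like "0 5;16 23"; keep the first
            # start offset and the last end offset.
            offsets = span.replace(";", " ").split()
            text = fields[2] if len(fields) > 2 else ""
            tokens[ann_id] = Token(
                ann_id, label, int(offsets[0]), int(offsets[-1]), text
            )
        elif ann_id.startswith("R") and len(fields) >= 2:
            label, arg1, arg2 = fields[1].split()[:3]
            # Arguments are written as "Arg1:T3"; keep only the ID part.
            relations.append(
                Relation(ann_id, label, arg1.split(":", 1)[1], arg2.split(":", 1)[1])
            )
        # Other annotation kinds (events, attributes, notes) are ignored.
    return tokens, relations


# Invented sample data, not taken from the corpus.
sample = "T1\tVRB 0 3\tabc\nT2\tNOM 4 7\tdef\nR1\tSBJ Arg1:T1 Arg2:T2\n"
tokens, relations = parse_brat_ann(sample)
for rel in relations:
    print(rel.label, tokens[rel.arg1].text, tokens[rel.arg2].text)
```

Which argument of a relation is the head and which is the dependent is not stated in this README, so check a few sentences by hand before building trees from the output.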