  Session O1 - Corpora for Machine Translation Chairperson: Josef van Genabith
11:35-11:55 Iñaki San Vicente and Iker Manterola PaCo2: A Fully Automated tool for gathering Parallel Corpora from the Web
11:55-12:15 Mark Fishel, Ondřej Bojar and Maja Popović Terra: a Collection of Translation Error-Annotated Corpora
12:15-12:35 Ahmet Aker, Evangelos Kanoulas and Robert Gaizauskas A light way to collect comparable corpora from the Web
12:35-12:55 Volha Petukhova, Rodrigo Agerri, Mark Fishel, Sergio Penkale, Arantza del Pozo, Mirjam Sepesy Maucec, Andy Way, Panayota Georgakopoulou and Martin Volk SUMAT: Data Collection and Parallel Corpus Compilation for Machine Translation of Subtitles
12:55-13:15 Daniele Pighin, Lluís Màrquez and Lluís Formiga The FAUST Corpus of Adequacy Assessments for Real-World Machine Translation Output


  Session O2 - Infrastructures and Strategies for LRs (1) Chairperson: Hans Uszkoreit
11:35-11:55 Stelios Piperidis The META-SHARE Language Resources Sharing Infrastructure: Principles, Challenges, Solutions
11:55-12:15 Riccardo Del Gratta, Francesca Frontini, Francesco Rubino, Irene Russo and Nicoletta Calzolari The Language Library: supporting community effort for collective resource production
12:15-12:35 Khalid Choukri, Victoria Arranz, Olivier Hamon and Jungyeul Park Using the International Standard Language Resource Number: Practical and Technical Aspects
12:35-12:55 Valérie Mapelli, Victoria Arranz, Matthieu Carré, Hélène Mazo, Djamel Mostefa and Khalid Choukri ELRA in the heart of a cooperative HLT world
12:55-13:15 Christopher Cieri, Marian Reed, Denise DiPersio and Mark Liberman Twenty Years of Language Resource Development and Distribution: A Progress Report on LDC Activities


  Session O3 - Semantics Chairperson: Bolette Pedersen
11:35-11:55 Dan Moldovan and Eduardo Blanco Polaris: Lymba's Semantic Parser
11:55-12:15 Sylvia Springorum, Sabine Schulte im Walde and Antje Roßdeutscher Automatic classification of German "an" particle verbs
12:15-12:35 Livio Robaldo and Jakub Szymanik Pragmatic identification of the witness sets
12:35-12:55 Orphée De Clercq, Veronique Hoste and Paola Monachesi Evaluating automatic cross-domain Dutch semantic role annotation
12:55-13:15 Benoît Robichaud Logic Based Methods for Terminological Assessment


  Session O4 - Speech corpora Chairperson: Sophie Rosset
11:35-11:55 Luis Javier Rodriguez-Fuentes, Mikel Penagarikano, Amparo Varona, Mireia Diez and German Bordel KALAKA-2: a TV Broadcast Speech Database for the Recognition of Iberian Languages in Clean and Noisy Environments
11:55-12:15 Tommaso Raso, Heliana Mello and Maryualê Malvessi Mittmann The C-ORAL-BRASIL I: Reference Corpus for Spoken Brazilian Portuguese
12:15-12:35 Guillaume Gravier, Gilles Adda, Niklas Paulsson, Matthieu Carré, Aude Giraudel and Olivier Galibert The ETAPE corpus for the evaluation of speech-based TV content processing in the French language
12:35-12:55 Daniel Stein and Bela Usabaev Automatic Speech Recognition on a Firefighter TETRA Broadcast Channel
12:55-13:15 Anthony Rousseau, Paul Deléglise and Yannick Estève TED-LIUM: an Automatic Speech Recognition dedicated corpus


  Session O5 - Crowdsourcing (Special Session) Chairperson: Karen Fort and Iryna Gurevych
14:45-15:05 Arno Scharl, Marta Sabou, Stefan Gindl, Walter Rafelsberger and Albert Weichselbraun Leveraging the Wisdom of the Crowds for the Acquisition of Multilingual Language Resources
15:05-15:25 Anoop Kunchukuttan, Shourya Roy, Pratik Patel, Kushal Ladha, Somya Gupta, Mitesh M. Khapra and Pushpak Bhattacharyya Experiences in Resource Generation for Machine Translation through Crowdsourcing
15:25-15:45 Elena Filatova Irony and Sarcasm: Corpus Generation and Analysis Using Crowdsourcing
15:45-16:05 Luís Marujo, Anatole Gershman, Jaime Carbonell, Robert Frederking and João P. Neto Supervised Topical Key Phrase Extraction of News Stories using Crowdsourcing, Light Filtering and Co-reference Normalization


  Session O6 - Dialogue and Multimodality Chairperson: Jimmy Kunzmann
14:45-15:05 Kristiina Jokinen and Graham Wilcock Constructive Interaction for Talking about Interesting Topics
15:05-15:25 Florian Nothdurft and Wolfgang Minker Using multimodal resources for explanation approaches in intelligent systems
15:25-15:45 Shota Yamasaki, Hirohisa Furukawa, Masafumi Nishida, Kristiina Jokinen and Seiichi Yamamoto Multimodal Corpus of Multi-party Conversations in Second Language
15:45-16:05 Takenobu Tokunaga, Ryu Iida, Asuka Terai and Naoko Kuriyama The REX corpora: A collection of multimodal corpora of referring expressions in collaborative problem solving dialogues
16:05-16:25 Harry Bunt, Jan Alexandersson, Jae-Woong Choe, Alex Chengyu Fang, Koiti Hasida, Volha Petukhova, Andrei Popescu-Belis and David Traum ISO 24617-2: A semantically-based standard for dialogue annotation


  Session O7 - Machine Translation and Language Resources (1) Chairperson: Gregor Thurmair
14:45-15:05 Inguna Skadiņa, Ahmet Aker, Nikos Mastropavlos, Fangzhong Su, Dan Tufiș, Mateja Verlic, Andrejs Vasiļjevs, Bogdan Babych, Paul Clough, Robert Gaizauskas, Nikos Glaros, Monica Lestari Paramita and Mārcis Pinnis Collecting and Using Comparable Corpora for Statistical Machine Translation
15:05-15:25 Casey Redd Kennington, Martin Kay and Annemarie Friedrich Suffix Trees as Language Models
15:25-15:45 Ralf Steinberger, Andreas Eisele, Szymon Klocek, Spyridon Pilos and Patrick Schlüter DGT-TM: A freely available Translation Memory in 22 languages
15:45-16:05 Reinhard Rapp, Serge Sharoff and Bogdan Babych Identifying Word Translations from Comparable Documents Without a Seed Lexicon
16:05-16:25 Gideon Kotzé, Vincent Vandeghinste, Scott Martens and Jörg Tiedemann Large aligned treebanks for syntax-based machine translation


  Session O8 - Corpus Processing and Infrastructure Chairperson: Bente Maegaard
14:45-15:05 Lars Borin, Markus Forsberg and Johan Roxendal Korp - the corpus infrastructure of Språkbanken
15:05-15:25 Jonathan Wright, Kira Griffitt, Joe Ellis, Stephanie Strassel and Brendan Callahan Annotation Trees: LDC's customizable, extensible, scalable, annotation infrastructure
15:25-15:45 Roland Schäfer and Felix Bildhauer Building Large Corpora from the Web Using a New Efficient Tool Chain
15:45-16:05 Young-Min Kim, Patrice Bellot, Elodie Faath and Marin Dacos Annotated Bibliographical Reference Corpora in Digital Humanities
16:05-16:25 Jan Pomikálek, Miloš Jakubíček and Pavel Rychlý Building a 70 billion word corpus of English from ClueWeb


  Session O9 - Endangered Languages Chairperson: Dafydd Gibbon
16:45-17:05 Melanie Seiss A Rule-based Morphological Analyzer for Murrinh-Patha
17:05-17:25 Dirk Goldhahn, Thomas Eckart and Uwe Quasthoff Building Large Monolingual Dictionaries at the Leipzig Corpora Collection: From 100 to 200 Languages
17:25-17:45 Helen Aristar-Dry, Sebastian Drude, Menzo Windhouwer, Jost Gippert and Irina Nevskaya "Rendering Endangered Lexicons Interoperable through Standards Harmonization": the RELISH project
17:45-18:05 Ryan Georgi, Fei Xia and William Lewis Measuring the Divergence of Dependency Structures Cross-Linguistically to Improve Syntactic Projection Algorithms


  Session O10 - Document Classification, Text Categorisation Chairperson: Luca Dini
16:45-17:05 Julian Brooke and Graeme Hirst Measuring Interlanguage: Native Language Identification with L1-influence Metrics
17:05-17:25 John Noecker Jr and Michael Ryan Distractorless Authorship Verification
17:25-17:45 Monica Lestari Paramita, Paul Clough, Ahmet Aker and Robert Gaizauskas Correlation between Similarity Measures for Inter-Language Linked Wikipedia Articles
17:45-18:05 Ralf Steinberger, Mohamed Ebrahim and Marco Turchi JRC Eurovoc Indexer JEX - A freely available multi-label categorisation tool


  Session O11 - Discourse (1) Chairperson: Haïfa Zargayouna
16:45-17:05 Vinodkumar Prabhakaran, Huzaifa Neralwala, Owen Rambow and Mona Diab Annotations for Power Relations on Email Threads
17:05-17:25 Marilyn Walker, Jean Fox Tree, Pranav Anand, Rob Abbott and Joseph King A Corpus for Research on Deliberation and Debate
17:25-17:45 Jacob Andreas, Sara Rosenthal and Kathleen McKeown Annotating Agreement and Disagreement in Threaded Discussion
17:45-18:05 Sudheer Kolachina, Rashmi Prasad, Dipti Misra Sharma and Aravind Joshi Evaluation of Discourse Relation Annotation in the Hindi Discourse Relation Bank


  Session O12 - Word Sense Disambiguation Chairperson: Kiril Simov
16:45-17:05 Will Roberts and Valia Kordoni Using Verb Subcategorization for Word Sense Disambiguation
17:05-17:25 Marianna Apidianaki and Benoît Sagot Applying cross-lingual WSD to wordnet development
17:25-17:45 Els Lefever, Veronique Hoste and Martine De Cock Discovering Missing Wikipedia Inter-language Links by means of Cross-lingual Word Sense Disambiguation
17:45-18:05 Erwin Fernandez-Ordonez, Rada Mihalcea and Samer Hassan Unsupervised Word Sense Disambiguation with Multilingual Representations


  Session O13 - Multimodal Corpora (1) Chairperson: Christopher Cieri
18:10-18:30 Aude Giraudel, Matthieu Carré, Valérie Mapelli, Juliette Kahn, Olivier Galibert and Ludovic Quintard The REPERE Corpus: a multimodal corpus for person recognition
18:30-18:50 Magdalena Lis Polish Multimodal Corpus ― a collection of referential gestures
18:50-19:10 Stefan Scherer, Georg Layher, John Kane, Heiko Neumann and Nick Campbell An audiovisual political speech analysis incorporating eye-tracking and perception data


  Session O14 - Machine Translation and Evaluation (1) Chairperson: Robert Frederking
18:10-18:30 Sara Stymne, Henrik Danielsson, Sofia Bremin, Hongzhan Hu, Johanna Karlsson, Anna Prytz Lillkull and Martin Wester Eye Tracking as a Tool for Machine Translation Error Analysis
18:30-18:50 Eleftherios Avramidis, Aljoscha Burchardt, Christian Federmann, Maja Popović, Cindy Tscherwinka and David Vilar Involving Language Professionals in the Evaluation of Machine Translation
18:50-19:10 Daniele Pighin, Lluís Màrquez and Jonathan May An Analysis (and an Annotated Corpus) of User Responses to Machine Translation Output


  Session O15 - Information Extraction and Question Answering Chairperson: Allan Hanbury
18:10-18:30 Bonan Min and Ralph Grishman Challenges in the Knowledge Base Population Slot Filling Task
18:30-18:50 Anselmo Peñas, Eduard Hovy, Pamela Forner, Álvaro Rodrigo, Richard Sutcliffe, Corina Forascu and Caroline Sporleder Evaluating Machine Reading Systems through Comprehension Tests
18:50-19:10 Xinkai Wang, Paul Thompson, Jun'ichi Tsujii and Sophia Ananiadou Biomedical Chinese-English CLIR Using an Extended CMeSH Resource to Expand Queries


  Session O16 - Web Services Chairperson: Marc Kemps-Snijders
18:10-18:30 Marc Poch, Antonio Toral, Olivier Hamon, Valeria Quochi and Núria Bel Towards a User-Friendly Platform for Building Language Resources based on Web Services
18:30-18:50 Maciej Ogrodniczuk and Michał Lenart Web Service integration platform for Polish linguistic resources
18:50-19:10 Yoshihiko Hayashi and Chiharu Narawa Classifying Standard Linguistic Processing Functionalities based on Fundamental Data Operation Types


  Session O17 - Infrastructures and Strategies for LRs (2) Chairperson: Andrejs Vasiļjevs
9:45-10:05 Claudia Soria, Núria Bel, Khalid Choukri, Joseph Mariani, Monica Monachini, Jan Odijk, Stelios Piperidis, Valeria Quochi and Nicoletta Calzolari The FLaReNet Strategic Language Resource Agenda
10:05-10:25 Daan Broeder, Dieter van Uytvanck, Maria Gavrilidou, Thorsten Trippel and Menzo Windhouwer Standardizing a Component Metadata Infrastructure
10:25-10:45 Daan Broeder, Dieter van Uytvanck and Gunter Senft Citing on-line Language Resources
10:45-11:05 Khalid Choukri and Victoria Arranz An Analytical Model of Language Resource Sustainability
11:05-11:25 David Lewis, Alexander O'Connor, Andrzej Zydroń, Gerd Sjögren and Rahzeb Choudhury On Using Linked Data for Language Resource Sharing in the Long Tail of the Localisation Market


  Session O18 - Dialogue Chairperson: Linne Ha
9:45-10:05 Alexandros Papangelis, Vangelis Karkaletsis and Fillia Makedon Evaluation of Online Dialogue Policy Learning Techniques
10:05-10:25 Lluis-F. Hurtado, Fernando Garcia, Emilio Sanchis and Encarna Segarra The acquisition and dialog act labeling of the EDECAN-SPORTS corpus
10:25-10:45 Jolanta Bachan Developing and evaluating an emergency scenario dialogue corpus
10:45-11:05 Lina M. Rojas-Barahona, Alejandra Lorenzo and Claire Gardent Building and Exploiting a Corpus of Dialog Interactions between French Speaking Virtual and Human Agents
11:05-11:25 Fabrice Lefèvre, Djamel Mostefa, Laurent Besacier, Yannick Estève, Matthieu Quignard, Nathalie Camelin, Benoit Favre, Bassam Jabaian and Lina M. Rojas-Barahona Leveraging study of robustness and portability of spoken language understanding systems across languages and domains: the PORTMEDIA corpora


  Session O19 - Resource Creation and Acquisition Chairperson: James Pustejovsky
9:45-10:05 Xabier Saralegi, Iker Manterola and Iñaki San Vicente Building a Basque-Chinese Dictionary by Using English as Pivot
10:05-10:25 Núria Bel, Lauren Romeo and Muntsa Padró Automatic lexical semantic classification of nouns
10:25-10:45 Ahmet Aker, Mahmoud El-Haj, M-Dyaa Albakour and Udo Kruschwitz Assessing Crowdsourcing Quality through Objective Tasks
10:45-11:05 Attila Zséder, Gábor Recski, Dániel Varga and András Kornai Rapid creation of large-scale corpora and frequency dictionaries
11:05-11:25 Kata Gábor, Marianna Apidianaki, Benoît Sagot and Éric Villemonte de la Clergerie Boosting the Coverage of a Semantic Lexicon by Automatically Extracted Event Nominalizations


  Session O20 - Corpus and Annotation Chairperson: Nancy Ide
9:45-10:05 Karën Fort, Claire François, Olivier Galibert and Maha Ghribi Analyzing the Impact of Prevalence on the Evaluation of a Manual Annotation Campaign
10:05-10:25 Donia Scott, Rossano Barone and Rob Koeling Corpus Annotation as a Scientific Task
10:25-10:45 Stephen Wattam, Paul Rayson and Damon Berridge Document Attrition in Web Corpora: an Exploration
10:45-11:05 Anil Kumar Singh A Concise Query Language with Search and Transform Operations for Corpora with Multiple Levels of Annotation
11:05-11:25 Paola Velardi, Roberto Navigli, Stefano Faralli and Juana Maria Ruiz-Martinez A New Method for Evaluating Automatically Learned Terminological Taxonomies


  Session O21 - Speech Corpora and Tools Chairperson: Robrecht Comeyne
11:45-12:05 Maria Eskevich, Gareth J.F. Jones, Martha Larson and Roeland Ordelman Creating a Data Collection for Evaluating Rich Speech Retrieval
12:05-12:25 Petya Osenova and Kiril Simov The Political Speech Corpus of Bulgarian
12:25-12:45 Brigitte Bigi SPPAS: a tool for the phonetic segmentation of speech
12:45-13:05 Brigitte Bigi, Pauline Péri and Roxane Bertrand Orthographic Transcription: which enrichment is required for phonetization?


  Session O22 - Machine Translation and Evaluation (2) Chairperson: François Yvon
11:45-12:05 Sandra Weiss and Lars Ahrenberg Error profiling for evaluation of machine-translated text: a Polish-English case study
12:05-12:25 Chunqi Shi, Donghui Lin, Masahiko Shimada and Toru Ishida Two Phase Evaluation for Selecting Machine Translation Services
12:25-12:45 Lorenza Russo, Sharid Loáiciga and Asheesh Gulati Italian and Spanish Null Subjects. A Case Study Evaluation in an MT Perspective.
12:45-13:05 Sara Stymne and Lars Ahrenberg On the practice of error analysis for machine translation evaluation


  Session O23 - Semantic Resources Chairperson: Zygmunt Vetulani
11:45-12:05 Janine Pimentel Identifying equivalents of specialized verbs in a bilingual comparable corpus of judgments: A frame-based methodology
12:05-12:25 Alessandra Zarcone and Stefan Rued Logical metonymies and qualia structures: an annotated database of logical metonymies for German
12:25-12:45 Iris Hendrickx, Amália Mendes and Silvia Mencarelli Modality in Text: a Proposal for Corpus Annotation
12:45-13:05 Pablo Mendes, Max Jakob and Christian Bizer DBpedia: A Multilingual Cross-domain Knowledge Base


  Session O24 - Trends in Corpora Chairperson: TBA
11:45-12:05 Annie Louis and Ani Nenkova A corpus of general and specific sentences from news
12:05-12:25 Gozde Ozbal, Carlo Strapparava and Marco Guerini Brand Pitt: A Corpus to Explore the Art of Naming
12:25-12:45 Jonathon Read, Dan Flickinger, Rebecca Dridan, Stephan Oepen and Lilja Øvrelid The WeSearch Corpus, Treebank, and Treecache -- A Comprehensive Sample of User-Generated Content
12:45-13:05 Masashi Inoue and Toshiki Akagi Collecting humorous expressions from a community-based question-answering-service corpus


  Session O25 - Multimodal Corpora (2) Chairperson: Nick Campbell
14:55-15:15 Ibrahim Saygin Topkaya and Hakan Erdogan SUTAV: A Turkish Audio-Visual Database
15:15-15:35 Costanza Navarretta and Patrizia Paggio Multimodal Behaviour and Feedback in Different Types of Interaction
15:35-15:55 Carlo Strapparava, Rada Mihalcea and Alberto Battocchi A Parallel Corpus of Music and Lyrics Annotated with Emotions
15:55-16:15 Merlin Teodosia Suarez, Jocelynn Cu and Madelene Sta. Maria Building a Multimodal Laughter Database for Emotion Recognition
16:15-16:35 Dimitra Anastasiou A Speech and Gesture Spatial Corpus in Assisted Living
16:15-16:35 Discussion


  Session O26 - Child Language Corpus Chairperson: Massimo Poesio
14:55-15:15 Priti Aggarwal, Ron Artstein, Jillian Gerten, Athanasios Katsamanis, Shrikanth Narayanan, Angela Nazarian and David Traum The Twins Corpus of Museum Visitor Questions
15:15-15:35 Hyejin Hong, Sunhee Kim and Minhwa Chung Korean Children's Spoken English Corpus and an Analysis of its Pronunciation Variability
15:35-15:55 Marie Tahon, Agnes Delaborde and Laurence Devillers Corpus of Children Voices for Mid-level Markers and Affect Bursts Analysis
15:55-16:15 Aline Villavicencio, Beracah Yankama, Marco Idiart and Robert Berwick A large scale annotated child language construction database
16:15-16:35 Brian MacWhinney Morphosyntactic Analysis of the CHILDES and TalkBank Corpora


  Session O27 - MultiWord Expressions Chairperson: Emanuele Pianta
14:55-15:15 Veronika Vincze Light Verb Constructions in the SzegedParalellFX English--Hungarian Parallel Corpus
15:15-15:35 Antton Gurrutxaga and Iñaki Alegria Measuring the compositionality of NV expressions in Basque by means of distributional similarity techniques
15:35-15:55 Marion Weller and Ulrich Heid Analyzing and Aligning German compound nouns
15:55-16:15 Natalia Loukachevitch Automatic Term Recognition Needs Multiple Evidence
16:15-16:35 Roman Kurc, Maciej Piasecki and Bartosz Broda Constraint Based Description of Polish Multiword Expressions


  Session O28 - Sign Language Chairperson: Javier Caminero
14:55-15:15 Dimitris Metaxas, Bo Liu, Fei Yang, Peng Yang, Nicholas Michael and Carol Neidle Recognition of Nonmanual Markers in American Sign Language (ASL) Using Non-Parametric Adaptive 2D-3D Face Tracking
15:15-15:35 Matti Karppa, Tommi Jantunen, Ville Viitaniemi, Jorma Laaksonen, Birgitta Burger and Danny De Weerdt Comparing computer vision analysis of signed language video with motion capture recordings
15:35-15:55 Annelies Braffort and Leïla Boutora DEGELS1: A comparable corpus of French Sign Language and co-speech gestures
15:55-16:15 Matilde Gonzalez, Michael Filhol and Christophe Collet Semi-Automatic Sign Language Corpora Annotation using Lexical Representations of Signs
16:15-16:35 Umar Shoaib, Nadeem Ahmad, Paolo Prinetto and Gabriele Tiotto A platform-independent user-friendly dictionary from Italian to LIS


  Session O29 - Language Generation and Paraphrasing Chairperson: Robert Dale
16:55-17:15 Houda Bouamor, Aurélien Max, Gabriel Illouz and Anne Vilnat A contrastive review of paraphrase acquisition techniques
17:15-17:35 Matteo Negri, Yashar Mehdad, Alessandro Marchetti, Danilo Giampiccolo and Luisa Bentivogli Chinese Whispers: Cooperative Paraphrase Acquisition
17:35-17:55 Hideki Shima and Teruko Mitamura Diversifiable Bootstrapping for Acquiring High-Coverage Paraphrase Resource
17:55-18:15 Sebastian Varges, Heike Bieler, Manfred Stede, Lukas C. Faulstich, Kristin Irsig and Malik Atalla SemScribe: Natural Language Generation for Medical Reports


  Session O30 - Computer Aided Language Learning Chairperson: Justus Roux
16:55-17:15 Hitokazu Matsushita and Deryle Lonsdale Item Development and Scoring for Japanese Oral Proficiency Testing
17:15-17:35 Manny Rayner, Pierrette Bouillon and Johanna Gerlach Evaluating Appropriateness Of System Responses In A Spoken CALL Game
17:35-17:55 Antonio Moreno-Sandoval, Leonardo Campillos Llanos, Yang Dong, Emi Takamori, José M. Guirao, Paula Gozalo, Chieko Kimura, Kengo Matsui and Marta Garrote-Salazar Spontaneous Speech Corpora for language learners of Spanish, Chinese and Japanese
17:55-18:15 Helmer Strik, Jozef Colpaert, Joost Van Doremalen and Catia Cucchiarini The DISCO ASR-based CALL system: practicing L2 oral skills and beyond


  Session O31 - Discourse (2) Chairperson: Aravind Joshi
16:55-17:15 Marta Tatu and Dan Moldovan A Tool for Extracting Conversational Implicatures
17:15-17:35 Andrei Popescu-Belis, Thomas Meyer, Jeevanthi Liyanapathirana, Bruno Cartoni and Sandrine Zufferey Discourse-level Annotation over Europarl for Machine Translation: Connectives and Pronouns
17:35-17:55 Steven Bethard, Oleksandr Kolomiyets and Marie-Francine Moens Annotating Story Timelines as Temporal Dependency Structures
17:55-18:15 Stergos Afantenos, Nicholas Asher, Farah Benamara, Myriam Bras, Cecile Fabre, Mai Ho-Dac, Anne Le Draoulec, Philippe Muller, Marie-Paul Pery-Woodley, Laurent Prevot, Josette Rebeyrolles, Ludovic Tanguy, Marianne Vergez-Couret and Laure Vieu An empirical resource for discovering cognitive principles of discourse organisation: the ANNODIS corpus


  Session O32 - Syntax and Parsing Chairperson: Adam Przepiórkowski
16:55-17:15 Daniel Zeman, David Mareček, Martin Popel, Loganathan Ramasamy, Jan Štěpánek, Zdeněk Žabokrtský and Jan Hajič HamleDT: To Parse or Not to Parse?
17:15-17:35 Elsa Tolone, Benoît Sagot and Éric Villemonte de la Clergerie Evaluating and improving syntactic lexica by plugging them within a parser
17:35-17:55 Thomas Proisl and Peter Uhrig Efficient Dependency Graph Matching with the IMS Open Corpus Workbench
17:55-18:15 Miguel Ballesteros and Joakim Nivre MaltOptimizer: A System for MaltParser Optimization


  Session O33 - Semantics from Corpora Chairperson: Maria Teresa Pazienza
18:20-18:40 Alex Judea, Vivi Nastase and Michael Strube Concept-based Selectional Preferences and Distributional Representations from Wikipedia Articles
18:40-19:00 Elias Iosif, Maria Giannoudaki, Eric Fosler-Lussier and Alexandros Potamianos Associative and Semantic Features Extracted From Web-Harvested Corpora
19:00-19:20 Octavian Popescu Building a Resource of Patterns Using Semantic Types


  Session O34 - Authoring and Related Tools Chairperson: Bernardo Magnini
18:20-18:40 Irina Temnikova, Constantin Orasan and Ruslan Mitkov CLCM - A Linguistic Resource for Effective Simplification of Instructions in the Crisis Management Domain and its Evaluations
18:40-19:00 Robert Dale and George Narroway A Framework for Evaluating Text Correction
19:00-19:20 Paul Rodrigues and C. Anton Rytting Typing Race Games as a Method to Create Spelling Error Corpora


  Session O35 - Word Sense Annotation and Disambiguation Chairperson: Yoshihiko Hayashi
18:20-18:40 Rebecca J. Passonneau, Collin F. Baker, Christiane Fellbaum and Nancy Ide The MASC Word Sense Corpus
18:40-19:00 Darja Fišer, Nikola Ljubešić and Ozren Kubelka Addressing polysemy in bilingual lexicon extraction from comparable corpora
19:00-19:20 Gerard de Melo, Collin F. Baker, Nancy Ide, Rebecca J. Passonneau and Christiane Fellbaum Empirical Comparisons of MASC Word Sense Annotations


  Session O36 - Time and Space Chairperson: Piek Vossen
18:20-18:40 Hector Llorens, Leon Derczynski, Robert Gaizauskas and Estela Saquete TIMEN: An Open Temporal Expression Normalisation Resource
18:40-19:00 Kirk Roberts, Travis Goodwin and Sanda M. Harabagiu Annotating Spatial Containment Relations Between Events
19:00-19:20 James Pustejovsky and Jessica Moszkowicz The Role of Model Testing in Standards Development: The Case of ISO-Space


  Session O37 - Subjectivity and Emotions Chairperson: Julio Gonzalo
9:45-10:05 Jörg Frommer, Bernd Michaelis, Dietmar Rösner, Andreas Wendemuth, Rafael Friesen, Matthias Haase, Manuela Kunze, Rico Andrich, Julia Lange, Axel Panning and Ingo Siegert Towards Emotion and Affect Detection in the Multimodal LAST MINUTE Corpus
10:05-10:25 Isa Maks and Piek Vossen Building a fine-grained subjectivity lexicon from a web corpus
10:25-10:45 Veronica Perez-Rosas, Carmen Banea and Rada Mihalcea Learning Sentiment Lexicons in Spanish
10:45-11:05 Tommaso Caselli, Irene Russo and Francesco Rubino Assigning Connotation Values to Events
11:05-11:25 Balamurali A.R., Aditya Joshi and Pushpak Bhattacharyya Cost and Benefit of Using WordNet Senses for Sentiment Analysis


  Session O38 - Named Entities Chairperson: Satoshi Sato
9:45-10:05 Xuansong Li, Stephanie Strassel, Heng Ji, Kira Griffitt and Joe Ellis Linguistic Resources for Entity Linking Evaluation: from Monolingual to Cross-lingual
10:05-10:25 Dawn Lawrie, James Mayfield, Paul McNamee and Douglas Oard Creating and Curating a Cross-Language Person-Entity Linking Collection
10:25-10:45 Keith J. Miller, Elizabeth Schroeder Richerson, Sarah McLeod, James Finley and Aaron Schein International Multicultural Name Matching Competition: Design, Execution, Results, and Lessons Learned
10:45-11:05 K Saravanan, Monojit Choudhury, Raghavendra Udupa and A Kumaran An Empirical Study of the Occurrence and Co-Occurrence of Named Entities in Natural Language Corpora
11:05-11:25 Olivier Galibert, Sophie Rosset, Cyril Grouin, Pierre Zweigenbaum and Ludovic Quintard Extended Named Entities Annotation on OCRed Documents: From Corpus Constitution to Evaluation Campaign


  Session O39 - Treebanks and Syntax Chairperson: Erhard Hinrichs
9:45-10:05 Wolfgang Seeker and Jonas Kuhn Making Ellipses Explicit in Dependency Conversion for a German Treebank
10:05-10:25 Hongzhi Xu, Helen Kaiyun Chen, Chu-Ren Huang, Qin Lu, Dingxu Shi and Tin-Shing Chiu A Grammar-informed Corpus-based Sentence Database for Linguistic and Computational Studies
10:25-10:45 Tafseer Ahmed, Miriam Butt, Annette Hautli and Sebastian Sulger A Reference Dependency Bank for Analyzing Complex Predicates
10:45-11:05 Jan Hajič, Eva Hajičová, Jarmila Panevová, Petr Sgall, Ondřej Bojar, Silvie Cinková, Eva Fučíková, Marie Mikulová, Petr Pajas, Jan Popelka, Jiří Semecký, Jana Šindlerová, Jan Štěpánek, Josef Toman, Zdeňka Urešová and Zdeněk Žabokrtský Announcing Prague Czech-English Dependency Treebank 2.0
11:05-11:25 Liesbeth Augustinus, Vincent Vandeghinste and Frank Van Eynde Example-Based Treebank Querying


  Session O40 - Semantic Lexicons and Semantic Annotation Chairperson: Dimitrios Kokkinakis
9:45-10:05 Valentin I. Spitkovsky and Angel X. Chang A Cross-Lingual Dictionary for English Wikipedia Concepts
10:05-10:25 Silvie Cinková, Martin Holub, Adam Rambousek and Lenka Smejkalová A database of semantic clusters of verb usages
10:25-10:45 Ernesto William De Luca Is it Useful to Support Users with Lexical Resources? A User Study.
10:45-11:05 Natalia Konstantinova, Sheila C.M. de Sousa, Noa P. Cruz, Manuel J. Maña, Maite Taboada and Ruslan Mitkov A review corpus annotated for negation, speculation and their scope
11:05-11:25 Valerio Basile, Johan Bos, Kilian Evang and Noortje Venhuizen Developing a large semantically annotated corpus


  Session O41 - Machine Translation and Language Resources (2) Chairperson: Atsushi Fujii
11:45-12:05 Víctor M. Sánchez-Cartagena, Miquel Esplà-Gomis and Juan Antonio Pérez-Ortiz Source-Language Dictionaries Help Non-Expert Users to Enlarge Target-Language Dictionaries for Machine Translation
12:05-12:25 Christian Federmann, Eleftherios Avramidis, Marta R. Costa-Jussà, Josef van Genabith, Maite Melero and Pavel Pecina The ML4HMT Workshop on Optimising the Division of Labour in Hybrid Machine Translation
12:25-12:45 Maria Holmqvist, Sara Stymne, Lars Ahrenberg and Magnus Merkel Alignment-based reordering for SMT
12:45-13:05 Monica Gavrila, Walther v. Hahn and Cristina Vertan Same domain different discourse style - A case study on Language Resources for data-driven Machine Translation
13:05-13:25 Hidetsugu Nanba, Toshiyuki Takezawa, Kiyoko Uchiyama and Akiko Aizawa Automatic Translation of Scholarly Terms into Patent Terms Using Synonym Extraction Techniques


  Session O42 - WordNets Chairperson: Martha Palmer
11:45-12:05 Sanni Nimb and Bolette Sandford Pedersen Towards a richer wordnet representation of properties
12:05-12:25 Aitor Gonzalez-Agirre, Mauro Castillo and German Rigau A proposal for improving WordNet Domains
12:25-12:45 Fernando Castilho, Roger Granada, Breno Meneghetti, Leonardo Carvalho and Renata Vieira Corpus+WordNet thesaurus generation for ontology enriching
12:45-13:05 Benoît Sagot and Darja Fišer Cleaning noisy wordnets
13:05-13:25 Valérie Hanoka and Benoît Sagot Wordnet extension made simple: A multilingual lexicon-based approach using wiki resources


  Session O43 - Text Mining Chairperson: Paul Rayson
11:45-12:05 Mathias Bank and Martin Schierle A Survey of Text Mining Architectures and the UIMA Standard
12:05-12:25 Diana Maynard and Mark A. Greenwood Large Scale Semantic Annotation, Indexing and Search at The National Archives
12:25-12:45 Georgeta Bordea, Sabrina Kirrane, Paul Buitelaar and Bianca Pereira Expertise Mining for Enterprise Content Management
12:45-13:05 Elias Iosif and Alexandros Potamianos SemSim: Resources for Normalized Semantic Similarity Computation Using Lexical Networks
13:05-13:25 Raheel Nawaz, Paul Thompson and Sophia Ananiadou Identification of Manner in Bio-Events


  Session O44 - Evaluation of Systems and Application Chairperson: Sabine Schulte im Walde
11:45-12:05 Ioana Vasilescu, Martine Adda-Decker and Lori Lamel Cross-lingual studies of ASR errors: paradigms for perceptual evaluations
12:05-12:25 Kallirroi Georgila, Alan Black, Kenji Sagae and David Traum Practical Evaluation of Human and Synthesized Speech for Virtual Human Dialogue Systems
12:25-12:45 Tomoyosi Akiba, Hiromitsu Nishizaki, Kiyoaki Aikawa, Tatsuya Kawahara and Tomoko Matsui Designing an Evaluation Framework for Spoken Term Detection and Spoken Document Retrieval at the NTCIR-9 SpokenDoc Task
12:45-13:05 Tina Kluewer, Feiyu Xu, Peter Adolphs and Hans Uszkoreit Evaluation of the KomParse Conversational Non-Player Characters in a Commercial Virtual World
13:05-13:25 Marcello Federico, Sebastian Stüker, Luisa Bentivogli, Michael Paul, Mauro Cettolo, Teresa Herrmann, Jan Niehues and Giovanni Moretti The IWSLT 2011 Evaluation Campaign on Automatic Talk Translation


  Session O45 - New Media (Special Session) Chairperson: Thierry Declerck
14:55-15:15 Eleanor Clark and Kenji Araki Two Database Resources for Processing Social Media English Text
15:15-15:35 Maite Melero, Marta R. Costa-Jussà, Judith Domingo, Montse Marquina and Martí Quixal Holaaa!! writin like u talk is kewl but kinda hard 4 NLP
15:35-15:55 William J. Corvey, Sudha Verma, Sarah Vieweg, Martha Palmer and James H. Martin Foundations of a Multilayer Annotation Framework for Twitter Communications During Crisis Events
15:55-16:15 Kirk Roberts, Michael A. Roach, Joseph Johnson, Josh Guthrie and Sanda M. Harabagiu EmpaTweet: Annotating and Detecting Emotions on Twitter


  Session O46 - Semantics, Knowledge and Ontologies Chairperson: Christian Chiarcos
14:55-15:15 Nava Maroto, Marie-Claude L'Homme and Amparo Alcina Semantic Relations Established by Specialized Processes Expressed by Nouns and Verbs: Identification in a Corpus by means of Syntactico-semantic Annotation
15:15-15:35 Jorge Vivaldi, Luis Adrián Cabrera-Diego, Gerardo Sierra and María Pozzi Using Wikipedia to Validate the Terminology found in a Corpus of Basic Textbooks
15:35-15:55 Maria Teresa Pazienza, Armando Stellato and Andrea Turbati PEARL: ProjEction of Annotations Rule Language, a Language for Projecting (UIMA) Annotations over RDF Knowledge Bases
15:55-16:15 Peter Exner and Pierre Nugues Constructing Large Proposition Databases
16:15-16:35 Montse Cuadros, Lluís Padró and German Rigau Highlighting relevant concepts from Topic Signatures


  Session O47 - Segmentation, Tagging, Parsing Chairperson: Valia Kordoni
14:55-15:15 Agnieszka Patejuk and Adam Przepiórkowski Towards an LFG parser for Polish: An exercise in parasitic grammar development
15:15-15:35 Yan Song and Fei Xia Using a Goodness Measurement for Domain Adaptation: A Case Study on Chinese Word Segmentation
15:35-15:55 Daniel Bauer, Hagen Fürstenau and Owen Rambow The Dependency-Parsed FrameNet Corpus
15:55-16:15 Majdi Sawalha, Claire Brierley and Eric Atwell Predicting Phrase Breaks in Classical and Modern Standard Arabic Text
16:15-16:35 Sudheer Kolachina and Prasanth Kolachina Parsing Any Domain English text to CoNLL dependencies


  Session O48 - Named Entities and Subjectivity Chairperson: Dan Tufiș
14:55-15:15 Udo Hahn, Elena Beisswanger, Ekaterina Buyko, Erik Faessler, Jenny Traumüller, Susann Schröder and Kerstin Hornbostel Iterative Refinement and Quality Checking of Annotation Guidelines ― How to Deal Effectively with Semantically Sloppy Named Entity Types, such as Pathological Phenomena
15:15-15:35 Danuta Ploch, Leonhard Hennig, Angelina Duka, Ernesto William De Luca and Sahin Albayrak GerNED: A German Corpus for Named Entity Disambiguation
15:35-15:55 Ian Lewin, Şenay Kafkas and Dietrich Rebholz-Schuhmann Centroids: Gold standards with distributional variation
15:55-16:15 Yulan He, Hassan Saif, Zhongyu Wei and Kam-Fai Wong Quantising Opinions for Political Tweets Analysis
16:15-16:35 Muhammad Abdul-Mageed and Mona Diab AWATIF: A Multi-Genre Corpus for Modern Standard Arabic Subjectivity and Sentiment Analysis


  Session P1 - Anaphora and Coreference Chair : Ineke Schuurman
11:35-13:15 Abdul-Baquee Sharaf and Eric Atwell QurAna: Corpus of the Quran annotated with Pronominal Anaphora
11:35-13:15 Stefanie Dipper, Melanie Seiss and Heike Zinsmeister The Use of Parallel and Comparable Data for Analysis of Abstract Anaphora in German and English
11:35-13:15 Lucie Poláková, Pavlína Jínová and Jiří Mírovský Interplay of Coreference and Discourse Relations: Discourse Connectives with a Referential Component
11:35-13:15 Luz Rello and Iria Gayo A Portuguese-Spanish Corpus Annotated for Subject Realization and Referentiality
11:35-13:15 Marilisa Amoia, Kerstin Kunz and Ekaterina Lapshinova-Koltunski Coreference in Spoken vs. Written Texts: a Corpus-based Analysis
11:35-13:15 Marta Recasens, M. Antònia Martí and Constantin Orasan Annotating Near-Identity from Coreference Disagreements
11:35-13:15 Thomas Kaspersson, Christian Smith, Henrik Danielsson and Arne Jönsson This also affects the context - Errors in extraction based summaries
11:35-13:15 Natsuko Nakagawa and Yasuharu Den Annotation of anaphoric relations and topic continuity in Japanese conversation
11:35-13:15 Olga Uryupina and Massimo Poesio Domain-specific vs. Uniform Modeling for Coreference Resolution
11:35-13:15 Mateusz Kopeć and Maciej Ogrodniczuk Creating a Coreference Resolution System for Polish


  Session P2 - Tools, Systems and Evaluation Chair : Michael Kipp
11:35-13:15 Felix Burkhardt Fast Labeling and Transcription with the Speechalyzer Toolkit
11:35-13:15 Bart Jongejan Automatic annotation of head velocity and acceleration in Anvil
11:35-13:15 Przemyslaw Lenkiewicz, Binyam Gebrekidan Gebre, Oliver Schreer, Stefano Masneri, Daniel Schneider and Sebastian Tschöpel AVATecH ― automated annotation through audio and video analysis
11:35-13:15 Henk van den Heuvel, Eric Sanders, Robin Rutten, Stef Scagliola and Paula Witkamp An Oral History Annotation Tool for INTER-VIEWs
11:35-13:15 Han Sloetjes and Aarthy Somasundaram ELAN development, keeping pace with communities' needs
11:35-13:15 Michał Marcińczuk, Jan Kocoń and Bartosz Broda Inforex -- a web-based tool for text corpus management and semantic annotation
11:35-13:15 Binyam Gebrekidan Gebre, Peter Wittenburg and Przemyslaw Lenkiewicz Towards Automatic Gesture Stroke Detection
11:35-13:15 Thomas Schmidt EXMARaLDA and the FOLK tools ― two toolsets for transcribing and annotating spoken language
11:35-13:15 Leonardo Campillos Llanos Designing a search interface for a Spanish learner spoken corpus: the end-user's evaluation


  Session P3 - Lexical Resources Chair : Adam Kilgarriff
11:35-13:15 Satoshi Sato Dictionary Look-up with Katakana Variant Recognition
11:35-13:15 Karin Friberg Heppin and Maria Toporowska Gronostaj The Rocky Road towards a Swedish FrameNet - Creating SweFN
11:35-13:15 Marie-Claude L'Homme and Janine Pimentel Capturing syntactico-semantic regularities among terms: An application of the FrameNet methodology to terminology
11:35-13:15 David Graff and Mohamed Maamouri Developing LMF-XML Bilingual Dictionaries for Colloquial Arabic Dialects
11:35-13:15 Judith Eckle-Kohler, Iryna Gurevych, Silvana Hartmann, Michael Matuschek and Christian M. Meyer UBY-LMF -- A Uniform Model for Standardizing Heterogeneous Lexical-Semantic Resources in ISO-LMF
11:35-13:15 František Cvrček, Karel Pala and Pavel Rychlý Legal electronic dictionary for Czech
11:35-13:15 Amir Hazem and Emmanuel Morin Adaptive Dictionary for Bilingual Lexicon Extraction from Comparable Corpora
11:35-13:15 Jennifer Williams and Graham Katz A New Twitter Verb Lexicon for Natural Language Processing


  Session P4 - Annotation and Corpora Chair : Andreas Witt
11:35-13:15 Ritesh Kumar Challenges in the development of annotated corpora of computer-mediated communication in Indian Languages: A Case of Hindi
11:35-13:15 Christian Chiarcos Ontologies of Linguistic Annotation: Survey and perspectives
11:35-13:15 Johanka Spoustová and Miroslav Spousta A High-Quality Web Corpus of Czech
11:35-13:15 Xavier Tannier WebAnnotator, an Annotation Tool for Web Pages
11:35-13:15 Chi-Hsin Yu, Yi-jie Tang and Hsin-Hsi Chen Development of a Web-Scale Chinese Word N-gram Corpus with Parts of Speech Information
11:35-13:15 Dominique Fohr and Odile Mella CoALT: A Software for Comparing Automatic Labelling Tools
11:35-13:15 Valentina Bartalesi Lenzi, Giovanni Moretti and Rachele Sprugnoli CAT: the CELCT Annotation Tool
11:35-13:15 Radu Ion, Elena Irimia, Dan Ștefănescu and Dan Tufiș ROMBAC: The Romanian Balanced Annotated Corpus
11:35-13:15 Ismaïl El Maarouf and Jeanne Villaneau A French Fairy Tale Corpus syntactically and semantically annotated
11:35-13:15 Carlos Morell, Jorge Vivaldi and Núria Bel Iula2Standoff: a tool for creating standoff documents for the IULACT
11:35-13:15 Frederic Landragin, Thierry Poibeau and Bernard Victorri ANALEC: a New Tool for the Dynamic Annotation of Textual Data
11:35-13:15 Georgios Petasis The SYNC3 Collaborative Annotation Tool
11:35-13:15 Heba Elfardy and Mona Diab Simplified guidelines for the creation of Large Scale Dialectal Arabic Annotations


  Session P5 - Information Extraction (1) Chair : Günter Neumann
14:45-16:25 Michael Wiegand, Benjamin Roth, Eva Lasarcyk, Stephanie Köser and Dietrich Klakow A Gold Standard for Relation Extraction in the Food Domain
14:45-16:25 Mathias Bank, Robert Remus and Martin Schierle Textual Characteristics for Language Engineering
14:45-16:25 Ziqi Zhang, Philip Webster, Victoria Uren, Andrea Varga and Fabio Ciravegna Automatically Extracting Procedural Knowledge from Instructional Texts using Natural Language Processing
14:45-16:25 Xavier Tannier, Véronique Moriceau, Béatrice Arnulphy and Ruixin He Evolution of Event Designation in Media: Preliminary Study
14:45-16:25 Yunqing Xia, Guoyu Tang, Peng Jin and Xia Yang CLTC: A Chinese-English Cross-lingual Topic Corpus
14:45-16:25 Julia Maria Schulz, Daniela Becks, Christa Womser-Hacker and Thomas Mandl A Resource-light Approach to Phrase Extraction for English and German Documents from the Patent Domain and User Generated Content
14:45-16:25 Md. Faisal Mahbub Chowdhury and Alberto Lavelli An Evaluation of the Effect of Automatic Preprocessing on Syntactic Parsing for Biomedical Relation Extraction
14:45-16:25 Wei Wang, Romaric Besançon, Olivier Ferret and Brigitte Grau Evaluation of Unsupervised Information Extraction
14:45-16:25 Stéphanie Weiser and Patrick Watrin Extraction of unmarked quotations in Newspapers
14:45-16:25 Martin Aleksandrov and Carlo Strapparava NgramQuery - Smart Information Extraction from Google N-gram using External Resources


  Session P6 - Word Sense Disambiguation and Evaluation Chair : Sanni Nimb
14:45-16:25 Héctor Martínez Alonso, Núria Bel and Bolette Sandford Pedersen A voting scheme to detect semantic underspecification
14:45-16:25 Verena Henrich and Erhard Hinrichs A Comparative Evaluation of Word Sense Disambiguation Algorithms for German
14:45-16:25 Piek Vossen, Attila Görög, Rubén Izquierdo and Antal Van den Bosch DutchSemCor: Targeting the ideal sense-tagged corpus
14:45-16:25 Samuel Fernando and Mark Stevenson Mapping WordNet synsets to Wikipedia articles
14:45-16:25 Myriam Rakho, Éric Laporte and Matthieu Constant A new semantically annotated corpus with syntactic-semantic and cross-lingual senses
14:45-16:25 Minoru Sasaki and Hiroyuki Shinnou Detection of Peculiar Word Sense by Distance Metric Learning with Labeled Examples
14:45-16:25 Soojeong Eom, Markus Dickinson and Graham Katz Using semi-experts to derive judgments on word sense alignment: a pilot study
14:45-16:25 John Vogel, Marc Verhagen and James Pustejovsky ATLIS: Identifying Locational Information in Text Automatically


  Session P7 - Multiword Expressions and Term Extraction Chair : Karel Pala
14:45-16:25 Behrang QasemiZadeh, Paul Buitelaar, Tianqi Chen and Georgeta Bordea Semi-Supervised Technical Term Tagging With Minimal User Feedback
14:45-16:25 Miriam Buendía-Castro and Beatriz Sánchez-Cárdenas Linguistic knowledge for specialized text production
14:45-16:25 Rita Marinelli and Laura Cignoni In the same boat and other idiomatic seafaring expressions
14:45-16:25 Sabine Schulte im Walde, Susanne Borgwaldt and Ronny Jauch Association Norms of German Noun Compounds
14:45-16:25 Doaa Samy, Antonio Moreno-Sandoval, Conchi Bueno-Díaz, Marta Garrote-Salazar and José M. Guirao Medical Term Extraction in an Arabic Medical Corpus
14:45-16:25 Matthieu Constant and Isabelle Tellier Evaluating the Impact of External Lexical Resources into a CRF-based Multiword Segmenter and Part-of-Speech Tagger
14:45-16:25 Anita Gojun, Ulrich Heid, Bernd Weißbach, Carola Loth and Insa Mingers Adapting and evaluating a generic term extraction tool
14:45-16:25 Mladen Karan, Jan Šnajder and Bojana Dalbelo Bašić Evaluation of Classification Algorithms and Features for Collocation Extraction in Croatian
14:45-16:25 Thibault Mondary, Adeline Nazarenko, Haïfa Zargayouna and Sabine Barreaux The Quaero Evaluation Initiative on Term Extraction
14:45-16:25 Shiva Taslimipoor, Afsaneh Fazly and Ali Hamzeh Using Noun Similarity to Adapt an Acceptability Measure for Persian Light Verb Constructions
14:45-16:25 Dhouha Bouamor, Nasredine Semmar and Pierre Zweigenbaum Identifying bilingual Multi-Word Expressions for Statistical Machine Translation
14:45-16:25 Takafumi Suzuki, Yusuke Abe, Itsuki Toyota, Takehito Utsuro, Suguru Matsuyoshi and Masatoshi Tsuchiya Detecting Japanese Compound Functional Expressions using Canonical/Derivational Relation
14:45-16:25 Aude Grezka and Céline Poudat Building a database of French frozen adverbial phrases
14:45-16:25 Marc Luder German Verb Patterns and Their Implementation in an Electronic Dictionary


  Session P8 - Authoring Tools, Proofing Chair : Catia Cucchiarini
14:45-16:25 Flore Barcellini, Camille Albert, Corinne Grosse and Patrick Saint-Dizier Risk Analysis and Prevention: LELIE, a Tool dedicated to Procedure and Requirement Authoring
14:45-16:25 Mohammad Hoseyn Sheykholeslam, Behrouz Minaei-Bidgoli and Hossein Juzi A Framework for Spelling Correction in Persian Language Using Noisy Channel Model
14:45-16:25 Nizar Habash, Mona Diab and Owen Rambow Conventional Orthography for Dialectal Arabic
14:45-16:25 Khaled Shaalan, Mohammed Attia, Pavel Pecina, Younes Samih and Josef van Genabith Arabic Word Generation and Modelling for Spell Checking
14:45-16:25 Jan Rygl and Aleš Horák Similarity Ranking as Attribute for Machine Learning Approach to Authorship Identification
14:45-16:25 Shaohua Yang, Hai Zhao, Xiaolin Wang and Bao-liang Lu Spell Checking for Chinese
14:45-16:25 Jordi Atserias, Maria Fuentes, Rogelio Nazar and Irene Renau Spell Checking in Spanish: The Case of Diacritic Accents
14:45-16:25 Michael Rosner, Albert Gatt, Andrew Attard and Jan Joachimsen Incorporating an Error Corpus into a Spellchecker for Maltese


  Session P9 - Morphology Chair : Pushpak Bhattacharyya
16:45-18:05 Marco Passarotti and Francesco Mambrini First Steps towards the Semi-automatic Development of a Wordformation-based Lexicon of Latin
16:45-18:05 Marcin Woliński, Marcin Miłkowski, Maciej Ogrodniczuk and Adam Przepiórkowski PoliMorf: a (not so) new open morphological dictionary for Polish
16:45-18:05 Lionel Nicolas, Jacques Farré and Cécile Darme Unsupervised acquisition of concatenative morphology
16:45-18:05 Emad Mohamed, Behrang Mohit and Kemal Oflazer Annotating and Learning Morphological Segmentation of Egyptian Colloquial Arabic
16:45-18:05 Paul Felt, Eric Ringger, Kevin Seppi, Kristian Heal, Robbie Haertel and Deryle Lonsdale First Results in a Study Evaluating Pre-annotation and Correction Propagation for Machine-Assisted Syriac Morphological Analysis
16:45-18:05 Claudia Marzi, Marcello Ferro, Claudia Caudai and Vito Pirrelli Evaluating Hebbian Self-Organizing Memories for Lexical Representation and Access
16:45-18:05 Cheikh M. Bamba Dione A Morphological Analyzer For Wolof Using Finite-State Techniques
16:45-18:05 Septina Dian Larasati IDENTIC Corpus: Morphologically Enriched Indonesian-English Parallel Corpus
16:45-18:05 Liviu P. Dinu, Vlad Niculae and Octavia-Maria Şulea The Romanian Neuter Examined Through A Two-Gender N-Gram Classification System
16:45-18:05 Toshinobu Ogiso, Mamoru Komachi, Yasuharu Den and Yuji Matsumoto UniDic for Early Middle Japanese: a Dictionary for Morphological Analysis of Classical Japanese
16:45-18:05 Maciej Piasecki, Radoslaw Ramocki and Marek Maziarz Recognition of Polish Derivational Relations Based on Supervised Learning Scheme
16:45-18:05 Dan Cristea, Radu Simionescu and Gabriela Haja Reconstructing the Diachronic Morphology of Romanian from Dictionary Citations
16:45-18:05 Krešimir Šojat, Nives Mikelić Preradović and Marko Tadić Generation of Verbal Stems in Derivationally Rich Language
16:45-18:05 Jonathan Washington, Mirlan Ipasov and Francis Tyers A finite-state morphological transducer for Kyrgyz
16:45-18:05 Fabio Tamburini and Matias Melandri AnIta: a powerful morphological analyser for Italian


  Session P10 - Prosody and Phonetics Chair : Laurence Devillers
16:45-18:05 Gloria Gagliardi, Edoardo Lombardi Vallauri and Fabio Tamburini A topologic view of Topic and Focus marking in Italian
16:45-18:05 Geneviève Caelen-Haumont and Sethserey Sam Comparison between two models of language for the automatic phonetic labeling of an undocumented language of the South-Asia: the case of Mo Piu
16:45-18:05 Benoît Weber, Geneviève Caelen-Haumont, Binh Hai Pham and Do-Dat Tran MISTRAL+: A Melody Intonation Speaker Tonal Range semi-automatic Analysis using variable Levels
16:45-18:05 Nelly Barbot, Olivier Boeffard and Arnaud Delhay Comparing performance of different set-covering strategies for linguistic content optimization in speech corpora
16:45-18:05 Olivier Boeffard, Laure Charonnat, Sébastien Le Maguer and Damien Lolive Towards Fully Automatic Annotation of Audio Books for TTS
16:45-18:05 Iris Merkus and Florian Schiel Statistical Evaluation of Pronunciation Encoding
16:45-18:05 Helen Kaiyun Chen Annotating a corpus of human interaction with prosodic profiles ― focusing on Mandarin repair/disfluency
16:45-18:05 Kikuo Maekawa Prediction of Non-Linguistic Information of Spontaneous Speech from the Prosodic Annotation: Evaluation of the X-JToBI system
16:45-18:05 Antonio Origlia and Iolanda Alfano Prosomarker: a prosodic analysis tool based on optimal pitch stylization and automatic syllabification
16:45-18:05 David Doukhan, Sophie Rosset, Albert Rilliard, Christophe d'Alessandro and Martine Adda-Decker Designing French Tale Corpora for Entertaining Text To Speech Synthesis
16:45-18:05 Claire Brierley, Majdi Sawalha and Eric Atwell Open-Source Boundary-Annotated Corpus for Arabic Speech and Language Processing
16:45-18:05 Luc Boruta and Justyna Jastrzebska A Phonemic Corpus of Polish Child-Directed Speech


  Session P11 - Language Resource Infrastructures (1) Chair : Daan Broeder
16:45-18:05 Peter Spyns and Elisabeth D'Halleweyn Smooth Sailing for STEVIN
16:45-18:05 Dieter van Uytvanck, Herman Stehouwer and Lari Lampen Semantic metadata mapping in practice: the Virtual Language Observatory
16:45-18:05 Aditi Sharma Grover, Annamart Nieman, Gerhard Van Huyssteen and Justus Roux Aspects of a Legal Framework for Language Resource Management
16:45-18:05 Elena Volodina and Sofie Johansson Kokkinakis Introducing the Swedish Kelly-list, a new lexical e-resource for Swedish
16:45-18:05 Philippe Langlais, Patrick Drouin, Amélie Paulus, Eugénie Rompré Brodeur and Florent Cottin Texto4Science: a Quebec French Database of Annotated Short Text Messages
16:45-18:05 Jan Odijk Recent Developments in CLARIN-NL
16:45-18:05 Emanuel Dima, Christina Hoppermann, Erhard Hinrichs, Thorsten Trippel and Claus Zinn A Metadata Editor to Support the Description of Linguistic Resources
16:45-18:05 Hanno Biber and Evelyn Breiteneder Fivehundredmillionandone Tokens. Loading the AAC Container with Text Resources for Text Studies.
16:45-18:05 José Pedro Ferreira, Maarten Janssen, Gladis Barcellos de Oliveira, Margarita Correia and Gilvan Müller de Oliveira The Common Orthographic Vocabulary of the Portuguese Language: a set of open lexical resources for a pluricentric language
16:45-18:05 Andrejs Vasiljevs, Markus Forsberg, Tatiana Gornostay, Dorte Haltrup Hansen, Kristín Jóhannsdóttir, Gunn Lyse, Krister Lindén, Lene Offersgaard, Sussi Olsen, Bolette Pedersen, Eiríkur Rögnvaldsson, Inguna Skadiņa, Koenraad De Smedt, Ville Oksanen and Roberts Rozis Creation of an Open Shared Language Resource Repository in the Nordic and Baltic Countries
16:45-18:05 Nicoletta Calzolari, Riccardo Del Gratta, Gil Francopoulo, Joseph Mariani, Francesco Rubino, Irene Russo and Claudia Soria The LRE Map. Harmonising Community Descriptions of Resources
16:45-18:05 Maria Gavrilidou, Penny Labropoulou, Elina Desipri, Stelios Piperidis, Haris Papageorgiou, Monica Monachini, Francesca Frontini, Thierry Declerck, Gil Francopoulo, Victoria Arranz and Valérie Mapelli The META-SHARE Metadata Schema for the Description of Language Resources
16:45-18:05 Yoshinobu Kano Towards automation in using multi-modal language resources: compatibility and interoperability for multi-modal features in Kachako


  Session P12 - Subjectivity: Sentiments, Emotions, Opinions (1) Chair : Carlo Strapparava
18:10-19:30 Xin Zuo, Tian Li and Pascale Fung A Multilingual Natural Stress Emotion Database
18:10-19:30 Takahiro Miyajima, Hideaki Kikuchi, Katsuhiko Shirai and Shigeki Okawa Method for Collection of Acted Speech Using Various Situation Scripts
18:10-19:30 Hong Li, Xiwen Cheng, Kristina Adson, Tal Kirshboim and Feiyu Xu Annotating Opinions in German Political News
18:10-19:30 Akshat Bakliwal, Piyush Arora and Vasudeva Varma Hindi Subjective Lexicon: A Lexical Resource for Hindi Adjective Polarity Classification
18:10-19:30 Juan María Garrido, Yesika Laplaza, Montse Marquina, Andrea Pearman, José Gregorio Escalada, Miguel Ángel Rodríguez and Ana Armenta The I3MEDIA speech database: a trilingual annotated corpus for the analysis and synthesis of emotional speech
18:10-19:30 Panagiotis Giannoulis and Gerasimos Potamianos A hierarchical approach with feature selection for emotion recognition from speech
18:10-19:30 Alexandra Balahur and Jesús M. Hermida Extending the EmotiNet Knowledge Base to Improve the Automatic Detection of Implicitly Expressed Emotions from Text
18:10-19:30 Saeedeh Momtazi Fine-grained German Sentiment Analysis on Social Media
18:10-19:30 Felix Burkhardt “You Seem Aggressive!” Monitoring Anger in a Practical Application
18:10-19:30 Yi-jie Tang and Hsin-Hsi Chen Mining Sentiment Words from Microblogs for Predicting Writer-Reader Emotion Transition
18:10-19:30 Christian Scheible and Hinrich Schütze Bootstrapping Sentiment Labels For Unannotated Documents With Polarity PageRank


  Session P13 - Named Entity Recognition Chair : Antonio Branco
18:10-19:30 Antje Schlaf and Robert Remus Learning Categories and their Instances by Contextual Features
18:10-19:30 Nuno Cardoso Rembrandt - a named-entity recognition framework
18:10-19:30 Bogdan Sacaleanu and Günter Neumann An Adaptive Framework for Named Entity Combination
18:10-19:30 Maria Skeppstedt, Maria Kvist and Hercules Dalianis Rule-based Entity Recognition and Coverage of SNOMED CT in Swedish Clinical Text
18:10-19:30 Mārcis Pinnis Latvian and Lithuanian Named Entity Recognition with TildeNER
18:10-19:30 Marco Dinarelli and Sophie Rosset Tree-Structured Named Entity Recognition on OCR Data: Analysis, Processing and Results
18:10-19:30 Benoît Sagot and Rosa Stern Aleda, a free large-scale entity database for French
18:10-19:30 Pablo Mendes, Joachim Daiber, Rohana Rajapakse, Felix Sasaki and Christian Bizer Evaluating the Impact of Phrase Recognition on Concept Tagging


  Session P14 - Dialogue Chair : Ron Artstein
18:10-19:30 Tobias Heinroth, Maximilian Grotz, Florian Nothdurft and Wolfgang Minker Adaptive Speech Understanding for Intuitive Model-based Spoken Dialogues
18:10-19:30 Kseniya Zablotskaya, Umair Rahim, Fernando Fernández Martínez and Wolfgang Minker Relating Dominance of Dialogue Participants with their Verbal Intelligence Scores
18:10-19:30 Volha Petukhova and Harry Bunt The coding and annotation of multimodal dialogue acts
18:10-19:30 Harry Bunt, Michael Kipp and Volha Petukhova Using DiAML and ANVIL for multimodal dialogue annotations
18:10-19:30 Matthew Fuchs, Nikos Tsourakis and Manny Rayner A Scalable Architecture For Web Deployment of Spoken Dialogue Systems
18:10-19:30 Nikos Tsourakis and Manny Rayner A Corpus for a Gesture-Controlled Mobile Spoken Dialogue System
18:10-19:30 Emina Kurtic, Bill Wells, Guy J. Brown, Timothy Kempton and Ahmet Aker A Corpus of Spontaneous Multi-party Conversation in Bosnian Serbo-Croatian and British English
18:10-19:30 Jing Guang Han, Emer Gilmartin, Celine DeLooze, Brian Vaughan and Nick Campbell The Herme Database of Spontaneous Multimodal Human-Robot Dialogues
18:10-19:30 Yasuharu Den, Hanae Koiso, Katsuya Takanashi and Nao Yoshida Annotation of response tokens and their triggering expressions in Japanese multi-party conversations
18:10-19:30 Thierry Bazillon, Melanie Deplano, Frederic Bechet, Alexis Nasr and Benoit Favre Syntactic annotation of spontaneous speech: application to call-center conversation data
18:10-19:30 Frederic Bechet, Benjamin Maza, Nicolas Bigouroux, Thierry Bazillon, Marc El-Beze, Renato De Mori and Eric Arbillot DECODA: a call-centre human-human spoken conversation corpus
18:10-19:30 Pepi Stavropoulou, Dimitris Spiliotopoulos and Georgios Kouroupetroglou Resource Evaluation for Usable Speech Interfaces: Utilizing Human-Human Dialogue
18:10-19:30 Jens Edlund, Simon Alexandersson, Jonas Beskow, Lisa Gustavsson, Mattias Heldner, Anna Hjalmarsson, Petter Kallionen and Ellen Marklund 3rd party observer gaze as a continuous measure of dialogue flow
18:10-19:30 Marc Tomlinson, David Bracewell, Mary Draper, Zewar Almissour, Ying Shi and Jeremy Bensley Pursing power in Arabic on-line discussion forums
18:10-19:30 Sunao Hara, Norihide Kitaoka and Kazuya Takeda Causal analysis of task completion errors in spoken music retrieval interactions
18:10-19:30 Marilyn Walker, Grace Lin and Jennifer Sawyer An Annotated Corpus of Film Dialogue for Learning and Characterizing Character Style


  Session P15 - Semantic Annotation Chair : Aline Villavicencio
9:45-11:25 Béatrice Arnulphy, Xavier Tannier and Anne Vilnat Event Nominals: Annotation Guidelines and a Manually Annotated Corpus in French
9:45-11:25 Maria Aloni, Andreas van Cranenburgh, Raquel Fernandez and Marta Sznajder Building a Corpus of Indefinite Uses Annotated with Fine-grained Semantic Functions
9:45-11:25 António Branco, Catarina Carvalheiro, Sílvia Pereira, Sara Silveira, João Silva, Sérgio Castro and João Graça A PropBank for Portuguese: the CINTIL-PropBank
9:45-11:25 Ashwini Vaidya, Jinho D. Choi, Martha Palmer and Bhuvana Narasimhan Empty Argument Insertion in the Hindi PropBank
9:45-11:25 Pierrette Bouillon, Elisabetta Jezek, Chiara Melloni and Aurélie Picton Annotating Qualia Relations in Italian and French Complex Nominals
9:45-11:25 Juliette Thuilier and Laurence Danlos Semantic annotation of French corpora: animacy and verb semantic classes
9:45-11:25 Josef Ruppenhofer and Ines Rehbein Yes we can!? Annotating English modal verbs
9:45-11:25 Mehdi Manshadi, James Allen and Mary Swift An Annotation Scheme for Quantifier Scope Disambiguation
9:45-11:25 Yuichiroh Matsubayashi, Yusuke Miyao and Akiko Aizawa Building Japanese Predicate-argument Structure Corpus using Lexical Conceptual Structure
9:45-11:25 Kyoko Ohara Semantic Annotations in Japanese FrameNet: Comparing Frames in Japanese and English
9:45-11:25 Roser Morante and Walter Daelemans ConanDoyle-neg: Annotation of negation cues and their scope in Conan Doyle stories


  Session P16 - Document Classification, Text Categorisation Chair : Serge Sharoff
9:45-11:25 Mike Kestemont, Claudia Peersman, Benny De Decker, Guy De Pauw, Kim Luyckx, Roser Morante, Frederik Vaassen, Janneke van de Loo and Walter Daelemans The Netlog Corpus. A Resource for the Study of Flemish Dutch Internet Language
9:45-11:25 Kseniya Zablotskaya, Fernando Fernández Martínez and Wolfgang Minker Investigating Verbal Intelligence Using the TF-IDF Approach
9:45-11:25 Sanja Štajner and Ruslan Mitkov Diachronic Changes in Text Complexity in 20th Century English Language: An NLP Approach
9:45-11:25 Tommaso Fornaciari and Massimo Poesio DeCour: a corpus of DEceptive statements in Italian COURts
9:45-11:25 Amalia Todirascu, Sebastian Pado, Jennifer Krisch, Max Kisselew and Ulrich Heid French and German Corpora for Audience-based Text Type Classification
9:45-11:25 Borut Sluban, Senja Pollak, Roel Coesemans and Nada Lavrac Irregularity Detection in Categorized Document Corpora
9:45-11:25 Carmen Dayrell, Arnaldo Candido Jr., Gabriel Lima, Danilo Machado Jr., Ann Copestake, Valéria Feltrim, Stella Tagnin and Sandra Aluisio Rhetorical Move Detection in English Abstracts: Multi-label Sentence Classifiers and their Annotated Corpora
9:45-11:25 Andrea Varga, Daniel Preotiuc-Pietro and Fabio Ciravegna Unsupervised document zone identification using probabilistic graphical models
9:45-11:25 Mohammad Hossein Elahimanesh, Behrouz Minaei and Hossein Malekinezhad Improving K-Nearest Neighbor Efficacy for Farsi Text Classification


  Session P17 - Grammar and Syntax Chair : Eleni Efthimiou
9:45-11:25 Erhard Hinrichs and Thomas Zastrow Automatic Annotation and Manual Evaluation of the Diachronic German Corpus TüBa-D/DC
9:45-11:25 Hongsuck Seo, Kyusong Lee, Gary Geunbae Lee, Soo-Ok Kweon and Hae-Ri Kim Grammatical Error Annotation for Korean Learners of Spoken English
9:45-11:25 Heiki-Jaan Kaalep and Kadri Muischnek Robust clause boundary identification for corpus annotation
9:45-11:25 Patrick Ziering, Sina Zarrieß and Jonas Kuhn A Corpus-based Study of the German Recipient Passive
9:45-11:25 Zygmunt Vetulani Wordnet Based Lexicon Grammar for Polish
9:45-11:25 Montserrat Arza, José M. García-Miguel, Francisco Campillo and Miguel Cuevas-Alonso A Galician Syntactic Corpus with Application to Intonation Modeling
9:45-11:25 Hiroaki Sato A Search Tool for FrameNet Constructicon
9:45-11:25 Markus Dickinson and Scott Ledbetter Annotating Errors in a Hungarian Learner Corpus
9:45-11:25 Stefan Bott, Horacio Saggion and Simon Mille Text Simplification Tools for Spanish
9:45-11:25 Antske Fokkens, Tania Avgustinova and Yi Zhang CLIMB grammars: three projects using metagrammar engineering
9:45-11:25 Peteris Paikens and Normunds Gruzitis An implementation of a Latvian resource grammar in Grammatical Framework
9:45-11:25 Shafqat Mumtaz Virk and Elnaz Abolahrar An Open Source Persian Computational Grammar
9:45-11:25 Paula Buttery and Andrew Caines Reclassifying subcategorization frames for experimental analysis and stimulus generation
9:45-11:25 Andrew Caines and Paula Buttery Annotating progressive aspect constructions in the spoken section of the British National Corpus


  Session P18 - Digital Libraries Chair : Monica Monachini
9:45-11:25 Jordi Adell, Antonio Bonafonte, Antonio Cardenal, Marta R. Costa-Jussà, José A. R. Fonollosa, Asunción Moreno, Eva Navas and Eduardo R. Banga BUCEADOR, a multi-language search engine for digital libraries
9:45-11:25 Ranka Stanković, Cvetana Krstev, Ivan Obradović, Aleksandra Trtovac and Miloš Utvić A tool for enhanced search of multilingual digital libraries of e-journals
9:45-11:25 Benjamin Weitz and Ulrich Schäfer A Graphical Citation Browser for the ACL Anthology
9:45-11:25 Eleftheria Ahtaridis, Christopher Cieri and Denise DiPersio LDC Language Resource Database: Building a Bibliographic Database
9:45-11:25 Eneko Agirre, Ander Barrena, Oier Lopez de Lacalle, Aitor Soroa, Samuel Fernando and Mark Stevenson Matching Cultural Heritage items to Wikipedia


  Session P19 - Treebanks Chair : Menno van Zaanen
11:45-13:05 Seth Kulick, Ann Bies and Justin Mott Further Developments in Treebank Error Detection Using Derivation Trees
11:45-13:05 Xuansong Li, Stephanie Strassel, Stephen Grimes, Safa Ismael, Mohamed Maamouri, Ann Bies and Nianwen Xue Parallel Aligned Treebanks at LDC: New Challenges Interfacing Existing Infrastructures
11:45-13:05 Mohamed Maamouri, Ann Bies and Seth Kulick Expanding Arabic Treebank to Speech: Results from Broadcast News
11:45-13:05 Magali Sanches Duran and Sandra Maria Aluísio Propbank-Br: a Brazilian Treebank annotated with semantic role labels
11:45-13:05 Yi Zhang, Rui Wang and Yu Chen Joint Grammar and Treebank Development for Mandarin Chinese with HPSG
11:45-13:05 Annette Rios and Anne Göhring A tree is a Baum is an árbol is a sach'a: Creating a trilingual treebank
11:45-13:05 Miriam Kaeshammer and Vera Demberg German and English Treebanks and Lexica for Tree-Adjoining Grammars
11:45-13:05 Loganathan Ramasamy and Zdeněk Žabokrtský Prague Dependency Style Treebank for Tamil
11:45-13:05 Patricia Gonçalves, Rita Santos and António Branco Treebanking by Sentence and Tree Transformation: Building a Treebank to support Question Answering in Portuguese
11:45-13:05 Dasa Berovic, Zeljko Agic and Marko Tadić Croatian Dependency Treebank: Recent Development and Initial Experiments
11:45-13:05 Rahul Agarwal, Bharat Ram Ambati and Anil Kumar Singh A GUI to Detect and Correct Errors in Hindi Dependency Treebank
11:45-13:05 Masood Ghayoomi From Grammar Rule Extraction to Treebanking: A Bootstrapping Approach
11:45-13:05 Montserrat Marimon, Beatríz Fisas, Núria Bel, Marta Villegas, Jorge Vivaldi, Sergi Torner, Mercè Lorente and Silvia Vázquez The IULA Treebank
11:45-13:05 Atro Voutilainen, Kristiina Muhonen, Tanja Purtonen and Krister Lindén Specifying Treebanks, Outsourcing Parsebanks: FinnTreeBank 3
11:45-13:05 Cristina Bosco, Manuela Sanguinetti and Leonardo Lesmo The Parallel-TUT: a multilingual and multiformat treebank
11:45-13:05 Teresa Lynn, Ozlem Cetinoglu, Jennifer Foster, Elaine Uí Dhonnchadha, Mark Dras and Josef van Genabith Irish Treebanking and Parsing: A Preliminary Evaluation


  Session P20 - Parsing Chair : Antonio Moreno-Sandoval
11:45-13:05 Mohammed Attia, Khaled Shaalan, Lamia Tounsi and Josef van Genabith Automatic Extraction and Evaluation of Arabic LFG Resources
11:45-13:05 Kristiina Muhonen and Tanja Purtonen Rule-Based Detection of Clausal Coordinate Ellipsis
11:45-13:05 Gülşen Eryiğit The Impact of Automatic Morphological Analysis & Disambiguation on Dependency Parsing of Turkish
11:45-13:05 Stasinos Konstantopoulos, Valia Kordoni, Nicola Cancedda, Vangelis Karkaletsis, Dietrich Klakow and Jean-Michel Renders Task-Driven Linguistic Analysis based on an Underspecified Features Representation
11:45-13:05 Malin Ahlberg and Ramona Enache Combining Language Resources Into A Grammar-Driven Swedish Parser
11:45-13:05 Eiríkur Rögnvaldsson, Anton Karl Ingason, Einar Freyr Sigurðsson and Joel Wallenberg The Icelandic Parsed Historical Corpus (IcePaHC)
11:45-13:05 Anita Alicante, Cristina Bosco, Anna Corazza and Alberto Lavelli A treebank-based study on the influence of Italian word order on parsing performance
11:45-13:05 Dong Wang and Fei Xia Effort of Genre Variation and Prediction of System Performance


  Session P21 - Information Extraction (2) Chair : Paul Buitelaar
11:45-13:05 Michael Tepper, Daniel Capurro, Fei Xia, Lucy Vanderwende and Meliha Yetisgen-Yildiz Statistical Section Segmentation in Free-Text Clinical Records
11:45-13:05 Ramona Bongelli, Carla Canestrari, Ilaria Riccioni, Andrzej Zuczkowski, Cinzia Buldorini, Ricardo Pietrobon, Alberto Lavelli and Bernardo Magnini A Corpus of Scientific Biomedical Texts Spanning over 168 Years Annotated for Uncertainty
11:45-13:05 Cristina Mota, Alberto Simões, Cláudia Freitas, Luís Costa and Diana Santos Págico: Evaluating Wikipedia-based information retrieval in Portuguese
11:45-13:05 Danica Damljanovic, Udo Kruschwitz, M-Dyaa Albakour, Johann Petrak and Mihai Lupu Applying Random Indexing to Structured Data to Find Contextually Similar Words
11:45-13:05 Horacio Saggion and Sandra Szasz The CONCISUS Corpus of Event Summaries
11:45-13:05 Gracinda Carvalho, David Martins de Matos and Vitor Rocio Building and Exploring Semantic Equivalences Resources
11:45-13:05 Marc Verhagen and James Pustejovsky The TARSQI Toolkit
11:45-13:05 Gabriella Pardelli, Manuela Sassi, Sara Goggi and Stefania Biagioni From medical language processing to BioNLP domain
11:45-13:05 Romaric Besançon, Olivier Ferret and Ludovic Jean-Louis Evaluation of a Complex Information Extraction Application in Specific Domain
11:45-13:05 Hannah Kermes A methodology for the extraction of information about the usage of formulaic expressions in scientific texts
11:45-13:05 André Santos, José João Almeida and Nuno Carvalho Structural alignment of plain text books
11:45-13:05 Gerold Schneider, Fabio Rinaldi and Simon Clematide Dependency parsing for interaction detection in pharmacogenomics
11:45-13:05 William Black, Rob Procter, Steven Gray and Sophia Ananiadou A data and analysis resource for an experiment in text mining a collection of micro-blogs on a political topic.


  Session P22 - Part-of-Speech Tagging Chair : Reinhard Rapp
14:55-16:35 Slav Petrov, Dipanjan Das and Ryan McDonald A Universal Part-of-Speech Tagset
14:55-16:35 Atro Voutilainen Improving corpus annotation productivity: a method and experiment with interactive tagging
14:55-16:35 Andrea Gesmundo and Tanja Samardzic Lemmatising Serbian as Category Tagging with Bidirectional Sequence Classification
14:55-16:35 Souhir Gahbiche-Braham, Hélène Bonneau-Maynard, Thomas Lavergne and François Yvon Joint Segmentation and POS Tagging for Arabic Using a CRF-based Classifier
14:55-16:35 Mans Hulden and Jerid Francom Boosting statistical tagger accuracy with simple rule-based grammars
14:55-16:35 Maarten Janssen NeoTag: a POS Tagger for Grammatical Neologism Detection
14:55-16:35 Francesco Rubino, Francesca Frontini and Valeria Quochi Integrating NLP Tools in a Distributed Environment: A Case Study Chaining a Tagger with a Dependency Parser


  Session P23 - Machine Translation (1) Chair : Philippe Langlais
14:55-16:35 Bruno Cartoni and Thomas Meyer Extracting Directional and Comparable Corpora from a Multilingual Corpus for Translation Studies
14:55-16:35 Marianna J. Martindale Can Statistical Post-Editing with a Small Parallel Corpus Save a Weak MT Engine?
14:55-16:35 Sanja Seljan, Marija Brkić and Tomislav Vičić BLEU Evaluation of Machine-Translated English-Croatian Legislation
14:55-16:35 Chenhui Chu, Toshiaki Nakazawa and Sadao Kurohashi Chinese Characters Mapping Table of Japanese, Traditional Chinese and Simplified Chinese
14:55-16:35 Juan Pablo Martínez Cortés, Jim O'Regan and Francis Tyers Free/Open Source Shallow-Transfer Based Machine Translation for Spanish and Aragonese
14:55-16:35 Jan Berka, Ondřej Bojar, Mark Fishel, Maja Popović and Daniel Zeman Automatic MT Error Analysis: Hjerson Helping Addicter
14:55-16:35 Amit Sangodkar and Om Damani Re-ordering Source Sentences for SMT
14:55-16:35 Ângela Costa, Tiago Luís, Joana Ribeiro, Ana Cristina Mendes and Luísa Coheur An English-Portuguese parallel corpus of questions: translation guidelines and application in SMT
14:55-16:35 Mehmet Talha Çakmak, Süleyman Acar and Gülşen Eryiğit Word Alignment for English-Turkish Language Pair
14:55-16:35 Radu Ion PEXACC: A Parallel Sentence Mining Algorithm from Comparable Corpora
14:55-16:35 Eleftherios Avramidis, Marta R. Costa-Jussà, Christian Federmann, Josef van Genabith, Maite Melero and Pavel Pecina A Richly Annotated, Multilingual Parallel Corpus for Hybrid Machine Translation
14:55-16:35 Stephen Grimes, Katherine Peterson and Xuansong Li Automatic word alignment tools to scale production of manually aligned parallel texts
14:55-16:35 Carla Parra Escartín Design and compilation of a specialized Spanish-German parallel corpus
14:55-16:35 Jörg Tiedemann, Dorte Haltrup Hansen, Lene Offersgaard, Sussi Olsen and Matthias Zumpe A Distributed Resource Repository for Cloud-Based Machine Translation


  Session P24 - Corpus Creation, Processing, Usage (1) Chair : Takenobu Tokunaga
14:55-16:35 Jörg Tiedemann Parallel Data, Tools and Interfaces in OPUS
14:55-16:35 Maciej Ogrodniczuk The Polish Sejm Corpus
14:55-16:35 Lieve Macken, Veronique Hoste, Marielle Leijten and Luuk Van Waes From keystrokes to annotated process data: Enriching the output of Inputlog with linguistic information
14:55-16:35 Maristella Agosti, Birgit Alber, Giorgio Maria Di Nunzio, Marco Dussin, Stefan Rabanus and Alessandra Tomaselli A Curated Database for Linguistic Research: The Test Case of Cimbrian Varieties
14:55-16:35 Michel Généreux, Iris Hendrickx and Amália Mendes Introducing the Reference Corpus of Contemporary Portuguese Online
14:55-16:35 Mojgan Seraji, Beáta Megyesi and Joakim Nivre A Basic Language Resource Kit for Persian
14:55-16:35 Eric Sanders Collecting and Analysing Chats and Tweets in SoNaR
14:55-16:35 Tomaž Erjavec The goo300k corpus of historical Slovene
14:55-16:35 Mathieu-Henri Falco, Véronique Moriceau and Anne Vilnat Kitten: a tool for normalizing HTML and extracting its textual content
14:55-16:35 Maaske Treurniet, Orphée De Clercq, Henk van den Heuvel and Nelleke Oostdijk Collection of a corpus of Dutch SMS
14:55-16:35 Alessandro Panunzi, Marco Fabbri, Massimo Moneglia, Lorenzo Gregori and Samuele Paladini RIDIRE-CPI: an Open Source Crawling and Processing Infrastructure for Supervised Web-Corpora Building
14:55-16:35 Brett Drury and José João Almeida The Minho Quotation Resource
14:55-16:35 Elena Frick, Carsten Schnober and Piotr Bański Evaluating Query Languages for a Corpus Processing System


  Session P25 - Evaluation Methodologies Chair : Mathieu Lafourcade
14:55-16:35 Abdul-Baquee Sharaf and Eric Atwell QurSim: A corpus for evaluation of relatedness in short texts
14:55-16:35 Christina Feilmayr, Birgit Pröll and Elisabeth Linsmayr EVALIEX ― A Proposal for an Extended Evaluation Methodology for Information Extraction Systems
14:55-16:35 Patrick Paroubek and Xavier Tannier A Rough Set Formalization of Quantitative Evaluation with Ambiguity
14:55-16:35 Thomas Eckart, Uwe Quasthoff and Dirk Goldhahn The Influence of Corpus Quality on Statistical Measurements on Language Resources
14:55-16:35 Olga Babko-Malaya, Greg Milette, Michael Schneider and Sarah Scogin Identifying Nuggets of Information in GALE Distillation Evaluation
14:55-16:35 Chieh-Jen Wang, Shuk-Man Cheng, Lung-Hao Lee, Hsin-Hsi Chen, Wen-shen Liu, Pei-Wen Huang and Shih-Peng Lin NTUSocialRec: An Evaluation Dataset Constructed from Microblogs for Recommendation Applications in Social Networks


  Session P26 - Multilinguality Chair : Gil Francopoulo
16:55-18:15 Jyrki Niemi and Krister Lindén Representing the Translation Relation in a Bilingual Wordnet
16:55-18:15 Alexandr Rosen and Martin Vavřín Building a multilingual parallel corpus for human users
16:55-18:15 Martina Katalin Szabó, Veronika Vincze and István Nagy T. HunOr: A Hungarian―Russian Parallel Corpus
16:55-18:15 Kanika Gupta, Monojit Choudhury and Kalika Bali Mining Hindi-English Transliteration Pairs from Online Hindi Lyrics
16:55-18:15 Gilles Sérasset Dbnary: Wiktionary as a LMF based Multilingual RDF network
16:55-18:15 Lluís Padró and Evgeny Stanilovsky FreeLing 3.0: Towards Wider Multilinguality
16:55-18:15 Svetla Koeva, Ivelina Stoyanova, Rositsa Dekova, Borislav Rizov and Angel Genov Bulgarian X-language Parallel Corpus
16:55-18:15 Enikő Héja and Dávid Takács Automatically Generated Online Dictionaries
16:55-18:15 Costanza Navarretta, Elisabeth Ahlsén, Jens Allwood, Kristiina Jokinen and Patrizia Paggio Feedback in Nordic First-Encounters: a Comparative Study
16:55-18:15 Yu Chen and Andreas Eisele MultiUN v2: UN Documents with Multilingual Alignments
16:55-18:15 Zahurul Islam and Alexander Mehler Customization of the Europarl Corpus for Translation Studies
16:55-18:15 Thierry Declerck, Karlheinz Mörth and Piroska Lendvai Accessing and standardizing Wiktionary lexical entries for the translation of labels in Cultural Heritage taxonomies
16:55-18:15 Ying Li, Yue Yu and Pascale Fung A Mandarin-English Code-Switching Corpus
16:55-18:15 Paulo Fernandes, Lucelene Lopes, Carlos A. Prolo, Afonso Sales and Renata Vieira A Fast, Memory Efficient, Scalable and Multilingual Dictionary Retriever
16:55-18:15 Aitor Gonzalez-Agirre, Egoitz Laparra and German Rigau Multilingual Central Repository version 3.0


  Session P27 - Question Answering and Summarisation Chair : Horacio Saggion
16:55-18:15 Christian Smith, Henrik Danielsson and Arne Jönsson A good space: Lexical predictors in word space evaluation
16:55-18:15 Ulrich Andersen, Anna Braasch, Lina Henriksen, Csaba Huszka, Anders Johannsen, Lars Kayser, Bente Maegaard, Ole Norgaard, Stefan Schulz and Jürgen Wedekind Creation and use of Language Resources in a Question-Answering eHealth System
16:55-18:15 Atsushi Fujii, Yuya Fujii and Takenobu Tokunaga Effects of Document Clustering in Modeling Wikipedia-style Term Descriptions
16:55-18:15 Silvia Quarteroni, Vincenzo Guerrisi and Pietro La Torre Evaluating Multi-focus Natural Language Queries over Data Services
16:55-18:15 Maria Fuentes, Horacio Rodríguez and Jordi Turmo Summarizing a multimodal set of documents in a Smart Room


  Session P28 - Multimodal Corpus for Interaction Chair : Vangelis Karkaletsis
16:55-18:15 Dietmar Rösner, Jörg Frommer, Rafael Friesen, Matthias Haase, Julia Lange and Mirko Otto LAST MINUTE: a Multimodal Corpus of Speech-based User-Companion Interactions
16:55-18:15 Karën Fort and Vincent Claveau Annotating Football Matches: Influence of the Source Medium on Manual Annotation
16:55-18:15 Stephanie Strassel, Amanda Morris, Jonathan Fiscus, Christopher Caruso, Haejoong Lee, Paul Over, James Fiumara, Barbara Shaw, Brian Antonishek and Martial Michel Creating HAVIC: Heterogeneous Audio Visual Internet Collection
16:55-18:15 Charlotte Alazard, Corine Astésano and Michel Billières MULTIPHONIA: a MULTImodal database of PHONetics teaching methods in classroom InterActions.


  Session P29 - Ontologies Chair : Paola Velardi
16:55-18:15 Egoitz Laparra, German Rigau and Piek Vossen Mapping WordNet to the Kyoto ontology
16:55-18:15 Kugatsu Sadamitsu, Kuniko Saito, Kenji Imamura and Yoshihiro Matsuo Constructing a Class-Based Lexical Dictionary using Interactive Topic Models
16:55-18:15 Verginica Barbu Mititelu Adding Morpho-semantic Relations to the Romanian Wordnet
16:55-18:15 Julien Seinturier, Elisabeth Murisasco, Emmanuel Bruno and Philippe Blache An ontological approach to model and query multimodal concurrent linguistic annotations
16:55-18:15 Massimo Moneglia, Monica Monachini, Omar Calabrese, Alessandro Panunzi, Francesca Frontini, Gloria Gagliardi and Irene Russo The IMAGACT Cross-linguistic Ontology of Action. A new infrastructure for natural language disambiguation
16:55-18:15 Inga Gheorghita and Jean-Marie Pierrel Towards a methodology for automatic identification of hypernyms in the definitions of large-scale dictionary
16:55-18:15 John McCrae, Elena Montiel-Ponsoda and Philipp Cimiano Collaborative semantic editing of linked data lexica
16:55-18:15 Christophe Roche Ontoterminology: How to unify terminology and ontology into a single paradigm
16:55-18:15 Alexandre Denis, Ingrid Falk, Claire Gardent and Laura Perez-Beltrachini Representation of linguistic and domain knowledge for second language learning in virtual worlds
16:55-18:15 Petya Osenova, Kiril Simov, Laska Laskova and Stanislava Kancheva A Treebank-driven Creation of an OntoValence Verb lexicon for Bulgarian
16:55-18:15 Elisa Bianchi, Mirko Tavosanis and Emiliano Giovannetti Creation of a bottom-up corpus-based ontology for Italian Linguistics
16:55-18:15 Matteo Abrate and Clara Bacciu Visualizing word senses in WordNet Atlas


  Session P30 - Discourse Chair : David Traum
18:20-19:40 Kristiina Jokinen and Silvi Tenjes Investigating Engagement - intercultural and technological aspects of the collection, analysis, and use of the Estonian Multiparty Conversational video data
18:20-19:40 Patrick Saint-Dizier DISLOG: A logic-based language for processing discourse structures
18:20-19:40 Sarah Bourse and Patrick Saint-Dizier A Repository of Rules and Lexical Resources for Discourse Structure Analysis: the Case of Explanation Structures
18:20-19:40 Stefania Degaetano-Ortlieb, Ekaterina Lapshinova-Koltunski and Elke Teich Feature Discovery for Diachronic Register Analysis: a Semi-Automatic Approach
18:20-19:40 Sucheta Ghosh, Richard Johansson, Giuseppe Riccardi and Sara Tonelli Improving the Recall of a Discourse Parser by Constraint-based Postprocessing
18:20-19:40 Elizabeth Baran, Yaqin Yang and Nianwen Xue Annotating dropped pronouns in Chinese newswire text
18:20-19:40 Magdalena Rysova Alternative Lexicalizations of Discourse Connectives in Czech
18:20-19:40 Utku Şirin, Ruket Çakıcı and Deniz Zeyrek METU Turkish Discourse Bank Browser
18:20-19:40 David Elson DramaBank: Annotating Agency in Narrative Discourse
18:20-19:40 Gisela Redeker, Ildikó Berzlánovich, Nynke van der Vliet, Gosse Bouma and Markus Egg Multi-Layer Discourse Annotation of a Dutch Text Corpus
18:20-19:40 Iskandar Keskes, Farah Benamara and Lamia Hadrich Belguith Clause-based Discourse Segmentation of Arabic Texts
18:20-19:40 Mariana Gomes, Ana Guilherme, Leonor Tavares and Rita Marquilhas Project FLY: a multidisciplinary project within Linguistics
18:20-19:40 Ching-Sheng Lin, Zumrut Akcam, Samira Shaikh, Sharon Small, Ken Stahl, Tomek Strzalkowski and Nick Webb Revealing Contentious Concepts Across Social Groups


  Session P31 - Lexical Acquisition Chair : Pierre Zweigenbaum
18:20-19:40 Tommaso Caselli, Francesco Rubino, Francesca Frontini, Irene Russo and Valeria Quochi Customizable SCF Acquisition in Italian
18:20-19:40 Gregor Thurmair, Vera Aleksic and Christoph Schwarz Large Scale Lexical Analysis
18:20-19:40 Elsa Tolone, Stavroula Voyatzi, Claude Martineau and Matthieu Constant Extending the adverbial coverage of a French morphological lexicon
18:20-19:40 Somayeh Bagherbeygi and Mehrnoush Shamsfard Corpus based Semi-Automatic Extraction of Persian Compound Verbs and their Relations


  Session P32 - Corpus Creation, Processing, Usage (2) Chair : Shyam Agrawal
18:20-19:40 Ting Liu, Samira Shaikh, Tomek Strzalkowski, Aaron Broadwell, Jennifer Stromer-Galley, Sarah Taylor, Umit Boz, Xiaoai Ren and Jingsi Wu Extending the MPC corpus to Chinese and Urdu - A Multiparty Multi-Lingual Chat Corpus for Modeling Social Phenomena in Language
18:20-19:40 Ivana Tanasijević, Biljana Sikimić and Gordana Pavlović-Lažetić Multimedia database of the cultural heritage of the Balkans
18:20-19:40 Rania Al-Sabbagh and Roxana Girju YADAC: Yet another Dialectal Arabic Corpus
18:20-19:40 Yves Scherrer and Bruno Cartoni The Trilingual ALLEGRA Corpus: Presentation and Possible Use for Lexicon Induction
18:20-19:40 Martin Reynaert, Ineke Schuurman, Veronique Hoste, Nelleke Oostdijk and Maarten van Gompel Beyond SoNaR: towards the facilitation of large corpus building efforts
18:20-19:40 Piotr Bański, Peter M. Fischer, Elena Frick, Erik Ketzan, Marc Kupietz, Carsten Schnober, Oliver Schonefeld and Andreas Witt The New IDS Corpus Analysis Platform: Challenges and Prospects
18:20-19:40 Kurt Eberle, Kerstin Eckart, Ulrich Heid and Boris Haselbach A Tool/Database Interface for Multi-Level Analyses
18:20-19:40 Djamel Mostefa, Khalid Choukri, Sylvie Brunessaux and Karim Boudahmane New language resources for the Pashto language
18:20-19:40 Şenay Kafkas, Ian Lewin, David Milward, Erik van Mulligen, Jan Kors, Udo Hahn and Dietrich Rebholz-Schuhmann CALBC: Releasing the Final Corpora
18:20-19:40 Martin Majliš and Zdeněk Žabokrtský Language Richness of the Web


  Session P33 - Web Services Chair : Nuria Bel
18:20-19:40 Markus Forsberg and Torbjörn Lager Cloud Logic Programming for Integrating Language Technology Resources
18:20-19:40 Marc Kemps-Snijders, Matthijs Brouwer, Jan Pieter Kunst and Tom Visser Dynamic web service deployment in a cloud environment
18:20-19:40 Bharat Ram Ambati, Siva Reddy and Adam Kilgarriff Word Sketches for Turkish
18:20-19:40 Chunqi Shi, Donghui Lin and Toru Ishida Service Composition Scenarios for Task-Oriented Translation
18:20-19:40 Aleksandar Savkov, Laska Laskova, Stanislava Kancheva, Petya Osenova and Kiril Simov Linguistic Analysis Processing Line for Bulgarian
18:20-19:40 Victoria Arranz and Olivier Hamon On the Way to a Legal Sharing of Web Applications in NLP
18:20-19:40 Rafal Rak, Andrew Rowley and Sophia Ananiadou Collaborative Development and Evaluation of Text-processing Workflows in a UIMA-supported Web-based Workbench
18:20-19:40 Javier Caminero, Mari Carmen Rodríguez, Jean Vanderdonckt, Fabio Paternò, Joerg Rett, Dave Raggett, Jean-Loup Comeliau and Ignacio Marín The SERENOA Project: Multidimensional Context-Aware Adaptation of Service Front-Ends


  Session P34 - Corpus Creation, Processing, Usage (3) Chair : German Rigau
9:45-11:25 Ronaldo Martins Le Petit Prince in UNL
9:45-11:25 Christian Chiarcos A generic formalism to represent linguistic corpora in RDF and OWL/DL
9:45-11:25 Silvia Pareti A Database of Attribution Relations
9:45-11:25 Bartosz Broda, Michał Marcińczuk, Marek Maziarz, Adam Radziszewski and Adam Wardyński KPWr: Towards a Free Corpus of Polish
9:45-11:25 Yeşim Aksan, Mustafa Aksan, Ahmet Koltuksuz, Taner Sezer, Ümit Mersinli, Umut Ufuk Demirhan, Hakan Yılmazer, Gülsüm Atasoy, Seda Öz, İpek Yıldız and Özlem Kurtoğlu Construction of the Turkish National Corpus (TNC)
9:45-11:25 Jirka Hana, Alexandr Rosen, Barbora Štindlová and Petr Jäger Building a learner corpus
9:45-11:25 Isabella Poggi, Francesca D'Errico and Giovanna Leone Pedagogical stances and their multimodal signals.
9:45-11:25 Tsuyoshi Okita Annotated Corpora for Word Alignment between Japanese and English and its Evaluation with MAP-based Word Aligner
9:45-11:25 Djamé Seddah, Marie Candito, Benoit Crabbé and Enrique Henestroza Anguiano Ubiquitous Usage of a Broad Coverage French Corpus: Processing the Est Republicain corpus


  Session P35 - Language Resource Infrastructures (2) Chair : Claudia Soria
9:45-11:25 Herman Stehouwer, Matej Durco, Eric Auer and Daan Broeder Federated Search: Towards a Common Search Infrastructure
9:45-11:25 Willem Elbers, Daan Broeder and Dieter van Uytvanck Proper Language Resource Centers
9:45-11:25 Sebastian Drude, Daan Broeder, Paul Trilsbeek and Peter Wittenburg The Language Archive ― a new hub for language resources
9:45-11:25 Kais Dukes and Eric Atwell LAMP: A Multimodal Web Platform for Collaborative Linguistic Analysis
9:45-11:25 James Clarke, Vivek Srikumar, Mark Sammons and Dan Roth An NLP Curator (or: How I Learned to Stop Worrying and Love NLP Pipelines)
9:45-11:25 Marta Villegas, Núria Bel, Carlos Gonzalo, Amparo Moreno and Nuria Simelio Using Language Resources in Humanities research
9:45-11:25 Sebastian Nordhoff and Harald Hammarström Glottolog/Langdoc: Increasing the visibility of grey literature for low-density languages
9:45-11:25 Steve Cassidy, Michael Haugh, Pam Peters and Mark Fallu The Australian National Corpus: National Infrastructure for Language Resources
9:45-11:25 Christian Federmann, Ioanna Giannopoulou, Christian Girardi, Olivier Hamon, Dimitris Mavroeidis, Salvatore Minutoli and Marc Schröder META-SHARE v2: An Open Network of Repositories for Language Resources including Data and Tools
9:45-11:25 Alessio Bosca, Luca Dini, Milen Kouylekov and Marco Trevisan Linguagrid: a network of Linguistic and Semantic Services for the Italian Language.


  Session P36 - Speech Synthesis Chair : Martine Garnier-Rizet
9:45-11:25 Iñaki Sainz, Daniel Erro, Eva Navas, Inma Hernáez, Jon Sánchez, Ibon Saratxaga and Igor Odriozola Versatile Speech Databases for High Quality Synthesis for Basque
9:45-11:25 Dietmar Schabus, Michael Pucher and Gregor Hofer Building a synchronous corpus of acoustic and 3D facial marker data for adaptive audio-visual speech synthesis
9:45-11:25 Alistair Conkie, Thomas Okken, Yeon-Jun Kim and Giuseppe Di Fabbrizio Building Text-To-Speech Voices in the Cloud
9:45-11:25 Emília Garcia Casademont, Antonio Bonafonte and Asunción Moreno Building Synthetic Voices in the META-NET Framework
9:45-11:25 Nur-Hana Samsudin and Mark Lee Building Text-to-Speech Systems for Resource Poor Languages
9:45-11:25 Eva Szekely, Joao Paulo Cabral, Mohamed Abou-Zleikha, Peter Cahill and Julie Carson-Berndsen Evaluating expressive speech synthesis from audiobook corpora for conversational phrases


  Session P37 - Speech Resources Chair : Andrea Paoloni
9:45-11:25 Panikos Heracleous, Carlos Ishi, Takahiro Miyashita and Norihiro Hagita Body-conductive acoustic sensors in human-robot communication
9:45-11:25 Lucie Válková, Martina Waclawičová and Michal Křen Balanced data repository of spontaneous spoken Czech
9:45-11:25 R.P. Clapham, L. van der Molen, R.J.J.H. van Son, M. van den Brekel and F.J.M. Hilgers NKI-CCRT Corpus - Speech Intelligibility Before and After Advanced Head and Neck Cancer Treated with Concomitant Chemoradiotherapy
9:45-11:25 Thomas Ulrich Christiansen and Peter Juel Henrichsen Sense Meets Nonsense - a dual-layer Danish speech corpus for perception studies
9:45-11:25 Peter Juel Henrichsen and Marcus Uneson SMALLWorlds -- Multilingual Content-Controlled Monologues
9:45-11:25 Alexander Schmitt, Stefan Ultes and Wolfgang Minker A Parameterized and Annotated Spoken Dialog Corpus of the CMU Let's Go Bus Information System
9:45-11:25 Sergey Zablotskiy, Alexander Shvets, Maxim Sidorov, Eugene Semenkin and Wolfgang Minker Speech and Language Resources for LVCSR of Russian
9:45-11:25 Dae-Lim Choi, Bong-Wan Kim, Yeon-Whoa Kim, Yong-Ju Lee, Yongnam Um and Minhwa Chung Dysarthric Speech Database for Development of QoLT Software Technology
9:45-11:25 Eckhard Bick, Heliana Mello, Alessandro Panunzi and Tommaso Raso The annotation of the C-ORAL-BRASIL oral through the implementation of the Palavras Parser
9:45-11:25 Janne Bondi Johannessen, Joel Priestley, Kristin Hagen, Anders Nøklestad and André Lynum The Nordic Dialect Corpus
9:45-11:25 Dafydd Gibbon ULex: new data models and a mobile environment for corpus enrichment.
9:45-11:25 Kengo Ohta, Masatoshi Tsuchiya and Seiichi Nakagawa Developing Partially-Transcribed Speech Corpus from Edited Transcriptions
9:45-11:25 Xiaoyi Ma LDC Forced Aligner
9:45-11:25 Sebastian Stüker, Florian Kraft, Christian Mohr, Teresa Herrmann, Eunah Cho and Alex Waibel The KIT Lecture Corpus for Speech Translation
9:45-11:25 Shyam Agrawal, Shweta Sinha, Pooja Singh and Jesper Olson Development of Text and Speech database for Hindi and Indian English specific to Mobile Communication environment


  Session P38 - Subjectivity: Sentiments, Emotions, Opinions (2) Chair : Paolo Rosso
11:45-13:25 Simon Clematide, Stefan Gindl, Manfred Klenner, Stefanos Petrakis, Robert Remus, Josef Ruppenhofer, Ulli Waltinger and Michael Wiegand MLSA ― A Multi-layered Reference Corpus for German Sentiment Analysis
11:45-13:25 Silvia Vázquez and Núria Bel A Classification of Adjectives for Polarity Lexicons Enhancement
11:45-13:25 Jorge Carrillo de Albornoz, Laura Plaza and Pablo Gervás SentiSense: An easily scalable concept-based affective lexicon for sentiment analysis
11:45-13:25 Tom De Smedt and Walter Daelemans “Vreselijk mooi!” (terribly beautiful): A Subjectivity Lexicon for Dutch Adjectives.
11:45-13:25 Rasmus Sundberg, Anders Eriksson, Johan Bini and Pierre Nugues Visualizing Sentiment Analysis on a User Forum
11:45-13:25 Erik Cambria, Yunqing Xia and Amir Hussain Affective Common Sense Knowledge Acquisition for Sentiment Analysis


  Session P39 - Language Resource Infrastructures (3) Chair : Penny Labropoulou
11:45-13:25 Emanuel Dima, Verena Henrich, Erhard Hinrichs, Marie Hinrichs, Christina Hoppermann, Thorsten Trippel, Thomas Zastrow and Claus Zinn A Repository for the Sustainable Management of Research Data
11:45-13:25 Maciej Ogrodniczuk, Piotr Pęzik and Adam Przepiórkowski Towards a comprehensive open repository of Polish language resources
11:45-13:25 Lars Borin, Markus Forsberg, Leif-Jöran Olsson and Jonatan Uppström The open lexical infrastructure of Språkbanken
11:45-13:25 Christian Chiarcos, Sebastian Hellmann, Sebastian Nordhoff, Steven Moran, Richard Littauer, Judith Eckle-Kohler, Iryna Gurevych, Silvana Hartmann, Michael Matuschek and Christian M. Meyer The Open Linguistics Working Group
11:45-13:25 Silke Scheible, Richard J. Whitt, Martin Durrell and Paul Bennett GATEtoGerManC: A GATE-based Annotation Pipeline for Historical German
11:45-13:25 Nicolas Hernandez Tackling interoperability issues within UIMA workflows


  Session P40 - Knowledge and Ontologies Chair : Robert Gaizauskas
11:45-13:25 Anne-Kathrin Schumann Knowledge-Rich Context Extraction and Ranking with KnowPipe
11:45-13:25 Maria Teresa Pazienza, Noemi Scarpato and Armando Stellato Application of a Semantic Search Algorithm to Semi-Automatic GUI Generation
11:45-13:25 Roldano Cattoni, Francesco Corcoglioniti, Christian Girardi, Bernardo Magnini, Luciano Serafini and Roberto Zanoli The KnowledgeStore: an Entity-Based Storage System
11:45-13:25 Bartosz Broda, Marek Maziarz and Maciej Piasecki Tools for plWordNet Development. Presentation and Perspectives
11:45-13:25 Silvia Moraes and Vera Lima Combining Formal Concept Analysis and semantic information for building ontological structures from texts: an exploratory study
11:45-13:25 Menzo Windhouwer RELcat: a Relation Registry for ISOcat data categories
11:45-13:25 Eric Charton and Michel Gagnon A disambiguation resource extracted from Wikipedia for semantic annotation
11:45-13:25 Guido Boella, Luigi di Caro, Llio Humphreys, Livio Robaldo and Leon van der Torre NLP Challenges for Eunomos, a Tool to Build and Manage Legal Knowledge
11:45-13:25 Robert Speer and Catherine Havasi Representing General Relational Knowledge in ConceptNet 5


  Session P41 - Semantics Chair : Marc Verhagen
11:45-13:25 Alain Joubert and Mathieu Lafourcade A new dynamic approach for lexical networks evaluation
11:45-13:25 Roberta Catizone, Louise Guthrie, Arthur Thomas and Yorick Wilks LIE: Leadership, Influence and Expertise
11:45-13:25 Richard Johansson, Karin Friberg Heppin and Dimitrios Kokkinakis Semantic Role Labeling with the Swedish FrameNet
11:45-13:25 Pedro Fialho, Sérgio Curto, Ana Cristina Mendes and Luísa Coheur Extending a wordnet framework for simplicity and scalability
11:45-13:25 Boris Haselbach, Wolfgang Seeker and Kerstin Eckart German "nach"-Particle Verbs in Semantic Theory and Corpus Data
11:45-13:25 Alessandro Lenci, Gabriella Lapesa and Giulia Bonansinga LexIt: A Computational Resource on Italian Argument Structure
11:45-13:25 Alessandro Lenci, Simonetta Montemagni, Giulia Venturi and Maria Grazia Cutrullà Enriching the ISST-TANL Corpus with Semantic Frames


  Session P42 - Temporal Information Chair : Uwe Quasthoff
11:45-13:25 Francisco Costa and António Branco TimeBankPT: A TimeML Annotated Corpus of Portuguese
11:45-13:25 Angel X. Chang and Christopher Manning SUTime: A library for recognizing and normalizing time expressions
11:45-13:25 André Bittar, Caroline Hagège, Véronique Moriceau, Xavier Tannier and Charles Teissèdre Temporal Annotation: A Proposal for Guidelines and an Experiment with Inter-annotator Agreement
11:45-13:25 Jannik Strötgen and Michael Gertz Temporal Tagging on Different Domains: Challenges, Strategies, and Gold Standards
11:45-13:25 Leon Derczynski, Hector Llorens and Estela Saquete Massively Increasing TIMEX3 Resources: A Transduction Approach
11:45-13:25 Corina Forascu and Dan Tufiș Romanian TimeBank: An Annotated Parallel Corpus for Temporal Information


  Session P43 - Sign Language Chair : Thomas Hanke
11:45-13:25 Zoya Gavrilov, Stan Sclaroff, Carol Neidle and Sven Dickinson Detecting Reduplication in Videos of American Sign Language
11:45-13:25 Nedelina Ivanova and Olle Eriksen BiBiKit - A Bilingual Bimodal Reading and Writing Tool for Sign Language Users
11:45-13:25 Fabrizio Borgia, Claudia S. Bianchini, Patrice Dalle and Maria De Marsico Resource production of written forms of Sign Languages by a user-centered editor, SWift (SignWriting improved fast transcriber)
11:45-13:25 Jens Forster, Christoph Schmidt, Thomas Hoyoux, Oscar Koller, Uwe Zelle, Justus Piater and Hermann Ney RWTH-PHOENIX-Weather: A Large Vocabulary Sign Language Recognition and Translation Corpus


  Session P44 - Machine Translation (2) Chair : Jan Hajic
14:55-16:35 Saab Mansour and Hermann Ney Arabic-Segmentation Combination Strategies for Statistical Machine Translation
14:55-16:35 Ondřej Bojar, Zdeněk Žabokrtský, Ondřej Dušek, Petra Galuščáková, Martin Majliš, David Mareček, Jiří Maršík, Michal Novák, Martin Popel and Aleš Tamchyna The Joy of Parallelism with CzEng 1.0
14:55-16:35 Takanori Kusumoto and Tomoyosi Akiba Statistical Machine Translation without Source-side Parallel Corpus Using Word Lattice and Phrase Extension
14:55-16:35 Patrik Lambert, Holger Schwenk and Frédéric Blain Automatic Translation of Scientific Documents in the HAL Archive
14:55-16:35 Georgi Iliev and Angel Genov Expanding Parallel Resources for Medium-Density Languages for Free
14:55-16:35 Elisabet Comelles, Jordi Atserias, Victoria Arranz and Irene Castellón VERTa: Linguistic features in MT evaluation
14:55-16:35 Zhiyi Song, Safa Ismael, Stephen Grimes, David Doermann and Stephanie Strassel Linguistic Resources for Handwriting Recognition and Translation Evaluation
14:55-16:35 Fangzhong Su and Bogdan Babych Development and Application of a Cross-language Document Comparability Metric
14:55-16:35 Claire Jaja, Douglas Briesch, Jamal Laoudi and Clare Voss Assessing Divergence Measures for Automated Document Routing in an Adaptive MT System
14:55-16:35 Ananthakrishnan Ramanathan and Karthik Visweswariah A Study of Word-Classing for MT Reordering
14:55-16:35 João Silva, Luísa Coheur, Ângela Costa and Isabel Trancoso Dealing with unknown words in statistical machine translation
14:55-16:35 Wilker Aziz, Sheila Castilho and Lucia Specia PET: a Tool for Post-editing and Assessing Machine Translation
14:55-16:35 Chris Irwin Davis Tajik-Farsi Persian Transliteration Using Statistical Machine Translation
14:55-16:35 Emma Barker and Robert Gaizauskas Assessing the Comparability of News Texts


  Session P45 - Natural Language Generation Chair : Dan Cristea
14:55-16:35 Hilder Pereira, Eder Novais, Andre Mariotti and Ivandre Paraboni Corpus-based Referring Expressions Generation
14:55-16:35 Eder Novais, Ivandre Paraboni and Douglas Silva Portuguese Text Generation from Large Corpora
14:55-16:35 Sigrid Klerke and Anders Søgaard DSim, a Danish Parallel Corpus for Text Simplification
14:55-16:35 Violeta Seretan Acquisition of Syntactic Simplification Rules for French
14:55-16:35 Anja Belz and Albert Gatt A Repository of Data and Evaluation Resources for Natural Language Generation
14:55-16:35 Eric Kow and Anja Belz LG-Eval: A Toolkit for Creating Online Language Evaluation Experiments


  Session P46 - Crowdsourcing Chair : Collin Baker
14:55-16:35 Chris Biemann Turk Bootstrap Word Sense Inventory 2.0: A Large-Scale Resource for Lexical Substitution
14:55-16:35 Marion Potet, Emmanuelle Esperança-Rodier, Laurent Besacier and Hervé Blanchon Collection of a Large Database of French-English SMT Output Corrections
14:55-16:35 Jirka Hana and Barbora Hladka Getting more data -- Schoolkids as annotators
14:55-16:35 Anna Rumshisky, Nick Botchan, Sophie Kushkuley and James Pustejovsky Word Sense Inventories by Non-Experts


  Session P47 - Text Mining and Text Entailment Chair : Sophia Ananiadou
14:55-16:35 Anton Leuski, Carsten Eickhoff, James Ganis and Victor Lavrenko The BladeMistress Corpus: From Talk to Action in Virtual Worlds
14:55-16:35 Alvin Grissom II and Yusuke Miyao Annotating Factive Verbs
14:55-16:35 Mahdi Khademian, Kaveh Taghipour, Saab Mansour and Shahram Khadivi A Holistic Approach to Bilingual Sentence Fragment Extraction from Comparable Corpora
14:55-16:35 Mohammad Fazleh Elahi and Paola Monachesi An Examination of Cross-Cultural Similarities and Differences from Social Media Data with respect to Language Use
14:55-16:35 Seniz Demir, Ilknur Durgar El-Kahlout, Erdem Unal and Hamza Kaya Turkish Paraphrase Corpus
14:55-16:35 Rui Wang and Shuguang Li Constructing a Question Corpus for Textual Semantic Relations
14:55-16:35 Alexandra Roshchina, John Cardiff and Paolo Rosso Evaluating the Similarity Estimator component of the TWIN Personality-based Recommender System


  Session P48 - Speech/Multimodal Tools, Systems, Applications Chair : Tomaž Erjavec
14:55-16:35 Michael Kipp Annotation Facilities for the Reliable Analysis of Human Motion
14:55-16:35 Michael Carl Translog-II: a Program for Recording User Activity Data for Empirical Reading and Writing Research
14:55-16:35 Giovanni Costantini, Andrea Paoloni and Massimiliano Todisco Intelligibility assessment in forensic applications
14:55-16:35 David Tavarez, Eva Navas, Daniel Erro and Ibon Saratxaga Strategies to Improve a Speaker Diarisation Tool
14:55-16:35 Igor Odriozola, Eva Navas, Inma Hernáez, Iñaki Sainz, Ibon Saratxaga, Jon Sánchez and Daniel Erro Using an ASR database to design a pronunciation evaluation system in Basque
14:55-16:35 Francesco Cutugno, Vincenza Anna Leano and Antonio Origlia W-PhAMT: A web tool for phonetic multilevel timeline visualization
14:55-16:35 Amalia Zahra and Julie Carson-Berndsen English to Indonesian Transliteration to Support English Pronunciation Practice
14:55-16:35 Joao Paulo Cabral, Mark Kane, Zeeshan Ahmed, Mohamed Abou-Zleikha, Eva Szekely, Amalia Zahra, Kalu Ogbureke, Peter Cahill, Julie Carson-Berndsen and Stephan Schlogl Rapidly Testing the Interaction Model of a Pronunciation Training System via Wizard-of-Oz
14:55-16:35 Bernhard Brüning, Christian Schnier, Karola Pitsch and Sven Wasmuth PAMOCAT: Automatic retrieval of specified postures


