The CODIM project focuses on the two main linguistic resources for organizing monologues and conversations in human languages: discourse markers (DMs), such as therefore/donc and well/ben, bon in English/French, and prosody (in particular, intonation). It will evaluate their status with respect to two major views on communication: compositionality (the possibility of combining meaningful expressions into more complex meaningful expressions) and pattern-based or construction-based approaches (the idea that language users exploit partly ‘frozen’ strings of words). We will compare the semantic and prosodic properties of simple and complex French DMs (e.g. ah + bon) found in corpora of written and spoken French, using a variety of technical tools for DM identification (category-driven text mining), clustering (statistics and machine learning) and prosodic analysis (duration and intensity measures, contour representation). The project fosters a number of collaborations between linguists and computer scientists.