Record ID: TD21001400
Title: Statistical and computational approaches to first language acquisition. Mining a set of French longitudinal corpora (CoLaJE)
Type: Doctoral thesis
Year: 2021
Language: English
Rights: info:eu-repo/semantics/openAccess
Related to: info:eu-repo/semantics/altIdentifier/hdl/11570/3193136
Subject: Sector SECS-S/05 - Social Statistics
Author: Briglia, Andrea
Other names: Mucciardi, Massimo; Jérémi Sauvage (thesis under joint supervision with the Université "Paul-Valéry" Montpellier 3, France); Falzone, Alessandra
Agency: IT-FI0098 (IT)

Abstract: This thesis is based on a French dataset composed of seven longitudinal corpora of child spoken language. Each monthly transcript can be converted into a machine-readable spreadsheet, which is the basis of all the computations performed and of the related graphical visualisations. Hypotheses about phoneme acquisition, phonological acquisition and grammar acquisition were tested using tools and concepts from descriptive and inferential statistics, regression (chi-squared) and clustering. A complete part-of-speech tagging of around 15,000 sentences is proposed to study the emergence of syntax (from one-word to multi-word utterances). A convolutional neural network trained on the same dataset is proposed, and the accuracy of its predictions is discussed. Finally, the thesis considers the importance of modelling phonetic variation at the syllable level: its main limitation is that it set aside the coarticulatory differences a given phoneme can show depending on its position in the syllable structure (onset, nucleus, coda).

Handle: http://hdl.handle.net/11570/3193136
Handle (legal deposit copy): http://memoria.depositolegale.it/*/http://hdl.handle.net/11570/3193136
Full text: http://iris.unime.it/bitstream/11570/3193136/1/Thesis_final_Andrea_Briglia.pdf
Full text (legal deposit copy): http://memoria.depositolegale.it/*/http://iris.unime.it/bitstream/11570/3193136/1/Thesis_final_Andrea_Briglia.pdf