Inferring the semantic properties of sentences by mining syntactic parse trees

Manager: Ministerio de Ciencia e Innovación (Spain)
Generalitat de Catalunya. Agència de Gestió d’Ajuts Universitaris i de Recerca
Author: Galitsky, Boris A.
Rosa, Josep Lluís de la
Dobrocsi, Gábor
Date: 2012
Abstract: We extend the mechanism of logical generalization to syntactic parse trees and attempt to detect semantic signals that are unobservable at the level of keywords. The generalization of syntactic parse trees, used as a measure of syntactic similarity, is defined as the set of maximum common sub-trees and is performed at the level of paragraphs, sentences, phrases and individual words. We analyze the semantic features of this similarity measure and compare it with the semantics of traditional anti-unification of terms. Nearest-neighbor machine learning is then applied to relate a sentence to a semantic class. By using a similarity measure based on syntactic parse trees instead of bag-of-words and keyword-frequency approaches, we expect to detect subtle differences between semantic classes that are otherwise unobservable. The proposed approach is evaluated in three distinct domains in which a lack of semantic information makes the classification of sentences rather difficult. We conclude that implicit indications of semantic classes can be extracted from syntactic structures.
We are grateful to our colleagues S. O. Kuznetsov, B. Kovalerchuk and others for valuable discussions, and to our anonymous reviewers for their suggestions. This research is partially funded by the EU Project No. 238887, a unique European Citizens’ attention service (iSAC6+) IST-PSP. This research is also funded by the Spanish MICINN (Ministerio de Ciencia e Innovación) through the project IPT-430000-2010-13, Social powered Agents for Knowledge search Engine (SAKE), and the project TIN2010-17903, Comparative approaches to the implementation of intelligent agents in digital preservation from a perspective of the automation of social networks; by the AGAUR 2011 Fl_B00927 research grant awarded to Gábor Dobrocsi; and by the consolidated research group grant CSI-ref.2009SGR-1202.
Format: application/pdf
Document access: http://hdl.handle.net/10256/11695
Language: eng
Publisher: Elsevier
Collection: info:eu-repo/semantics/altIdentifier/doi/10.1016/j.datak.2012.07.003
info:eu-repo/semantics/altIdentifier/issn/0169-023X
info:eu-repo/semantics/altIdentifier/eissn/1872-6933
info:eu-repo/grantAgreement/MICINN//IPT-430000-2010-013/ES/SAKE - Social powered Agents for Knowledge search Engine/
AGAUR/2009-2014/2009 SGR-1202
Is part of: info:eu-repo/grantAgreement/MICINN//TIN2010-17903/ES/ENFOQUES COMPARADOS DE LA APLICACION DE AGENTES INTELIGENTES EN PRESERVACION DIGITAL, DESDE LA PERSPECTIVA DE LA AUTOMATIZACION DE REDES SOCIALES/
Rights: All rights reserved
Subject: Data mining
Semantics -- Automation
Title: Inferring the semantic properties of sentences by mining syntactic parse trees
Type: info:eu-repo/semantics/article
Repository: DUGiDocs
