{"product_id":"9783319025964","title":"Unsupervised Information Extraction by Text Segmentation","description":"A new unsupervised approach to the problem of Information Extraction by Text Segmentation (IETS) is proposed, implemented and evaluated herein. The authors’ approach relies on information available in pre-existing data to learn how to associate segments in the input string with attributes of a given domain, relying on a very effective set of content-based features. The effectiveness of the content-based features is also exploited to learn structure-based features directly from test data, with no previous human-driven training, a feature unique to the presented approach. Based on the approach, a number of results are produced to address the IETS problem in an unsupervised fashion. In particular, the authors develop, implement and evaluate distinct IETS methods, namely \u003ci\u003eONDUX\u003c\/i\u003e, \u003ci\u003eJUDIE\u003c\/i\u003e and \u003ci\u003eiForm\u003c\/i\u003e. \u003cp\u003e \u003ci\u003eONDUX\u003c\/i\u003e (On Demand Unsupervised Information Extraction) is an unsupervised probabilistic approach for IETS that relies on content-based features to bootstrap the learning of structure-based features. \u003ci\u003eJUDIE\u003c\/i\u003e (Joint Unsupervised Structure Discovery and Information Extraction) aims at automatically extracting several semi-structured data records that appear as continuous text with no explicit delimiters between them. In comparison with other IETS methods, including \u003ci\u003eONDUX\u003c\/i\u003e, \u003ci\u003eJUDIE\u003c\/i\u003e faces a considerably harder task, that is, extracting information while simultaneously uncovering the underlying structure of the implicit records containing it. \u003ci\u003eiForm\u003c\/i\u003e applies the authors’ approach to the task of Web form filling. 
It aims at extracting segments from a data-rich text given as input and associating these segments with fields from a target Web form. \u003c\/p\u003e  \u003cp\u003e All of these methods were evaluated considering different experimental datasets, which are used to perform a large set of experiments in order to validate the presented approach and methods. These experiments indicate that the proposed approach yields high-quality results when compared to state-of-the-art approaches and that it is able to properly support IETS methods in a number of real applications. The findings will prove valuable to practitioners in helping them to understand the current state of the art in unsupervised information extraction techniques, as well as to graduate and undergraduate students of web data management. \u003c\/p\u003e","brand":"Springer International Publishing","offers":[{"title":"Default Title","offer_id":46411200954609,"sku":"9783319025964","price":54.99,"currency_code":"USD","in_stock":true}],"thumbnail_url":"\/\/cdn.shopify.com\/s\/files\/1\/0674\/5433\/7265\/files\/9783319025964_p0.jpg?v=1765165533","url":"https:\/\/shop.barnesandnoble.com\/products\/9783319025964","provider":"Barnes \u0026 Noble","version":"1.0","type":"link"}