A Bayesian Sampling Method for Product Feature Extraction From Large Scale Textual Data

Lim, Sunghoon; Tucker, Conrad S.

Source: Journal of Mechanical Design, 2016, Volume 138, Issue 6, page 61403

DOI: 10.1115/1.4033238

Publisher: The American Society of Mechanical Engineers (ASME)

Abstract: The authors of this work propose an algorithm that determines optimal search keyword combinations for querying online product data sources in order to minimize identification errors during the product feature extraction process. Data-driven product design methodologies based on acquiring and mining online product-feature-related data face two fundamental challenges: (1) determining optimal search keywords that return relevant product-related data and (2) determining how many search keywords are sufficient to minimize identification errors during the product feature extraction process. These challenges exist because online data, which is primarily textual in nature, may violate several statistical assumptions relating to the independence and identical distribution of samples relating to a query. Existing design methodologies use predetermined search terms to acquire textual data online, which makes the acquired data a function of the quality of the search term(s) themselves. Furthermore, the lack of independence and identical distribution of text data from online sources impacts the quality of the acquired data. For example, a designer may search for a product feature using the term “screen,” which may return relevant results such as “the screen size is just perfect,” but may also return irrelevant noise such as “researchers should really screen for this type of error.” A text mining algorithm is introduced to determine, without labeled training data, the optimal terms that would maximize the veracity of the data acquired to make a valid conclusion. A case study involving real-world smartphones is used to validate the proposed methodology.
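
The “screen” example in the abstract can be made concrete with a toy illustration. The sketch below is not the authors' Bayesian sampling algorithm; it only shows, under simple assumptions, how adding a second keyword to a query can lower the share of irrelevant sentences retrieved. The first two sentences are taken from the abstract, the other two are hypothetical, and the relevance flags, the AND-style query, and the function names are all assumptions made for illustration. The flags are used only to measure the error on this toy corpus, whereas the paper states that its algorithm works without labeled training data.

```python
import re

# Toy corpus. The first two sentences are quoted in the abstract; the last
# two are hypothetical. The boolean marks whether the sentence is actually
# about a product feature (used here only to measure the error).
CORPUS = [
    ("the screen size is just perfect", True),
    ("researchers should really screen for this type of error", False),
    ("the screen resolution is sharp", True),   # hypothetical
    ("we screen every applicant", False),       # hypothetical
]


def tokens(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(keywords, corpus=CORPUS):
    """Return the sentences that contain every keyword (a simple AND query)."""
    wanted = set(keywords)
    return [(s, rel) for s, rel in corpus if wanted <= tokens(s)]


def identification_error(keywords, corpus=CORPUS):
    """Fraction of retrieved sentences that are irrelevant.

    Returns None when the query retrieves nothing.
    """
    hits = retrieve(keywords, corpus)
    if not hits:
        return None
    return sum(1 for _, rel in hits if not rel) / len(hits)


print(identification_error(["screen"]))          # 0.5
print(identification_error(["screen", "size"]))  # 0.0
```

On this toy corpus the single keyword retrieves as much noise as signal, while the two-keyword combination retrieves only a relevant sentence, at the cost of missing the other relevant one. That trade-off is one reason the number of keywords matters as well as their choice.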


URI

http://yetl.yabesh.ir/yetl1/handle/yetl/161793

contributor author	Lim, Sunghoon
contributor author	Tucker, Conrad S.
date accessioned	2017-05-09T01:31:00Z
date available	2017-05-09T01:31:00Z
date issued	2016
identifier issn	1050-0472
identifier other	md_138_06_061404.pdf
identifier uri	http://yetl.yabesh.ir/yetl/handle/yetl/161793
publisher	The American Society of Mechanical Engineers (ASME)
title	A Bayesian Sampling Method for Product Feature Extraction From Large Scale Textual Data
type	Journal Paper
journal volume	138
journal issue	6
journal title	Journal of Mechanical Design
identifier doi	10.1115/1.4033238
journal firstpage	61403
journal lastpage	61403
identifier eissn	1528-9001
tree	Journal of Mechanical Design; 2016; Volume 138; Issue 6
contenttype	Fulltext
