
Title
Big data integration /

Creator
Xin Luna Dong, Google Inc., Divesh Srivastava, AT&T Labs-Research

Subject
Big data, Data integration (Computer science)

Classification
QA76.9.D343 D654 2015

Library
Center and Library of Islamic Studies in European Languages

Location
Province: Qom, City: Qom

Library contact: 32910706-025

INTERNATIONAL STANDARD BOOK NUMBER

Number (ISBN)
1627052232
Number (ISBN)
9781627052238

NATIONAL BIBLIOGRAPHY NUMBER

Number
dltt

TITLE AND STATEMENT OF RESPONSIBILITY

Title Proper
Big data integration /
General Material Designation
[Book]
First Statement of Responsibility
Xin Luna Dong, Google Inc., Divesh Srivastava, AT&T Labs-Research

EDITION STATEMENT

Edition Statement
First edition

PHYSICAL DESCRIPTION

Specific Material Designation and Extent of Item
xx, 178 pages :
Other Physical Details
color illustrations ; 24 cm

SERIES

Series Title
Synthesis lectures on data management,
Volume Designation
#40
ISSN of Series
2153-5418 ;

INTERNAL BIBLIOGRAPHIES/INDEXES NOTE

Text of Note
Includes bibliographical references (pages 165-173) and index

CONTENTS NOTE

Text of Note
1. Motivation: challenges and opportunities for BDI -- 1.1 Traditional data integration -- 1.1.1 The flights example: data sources -- 1.1.2 The flights example: data integration -- 1.1.3 Data integration: architecture & three major steps -- 1.2 BDI: challenges -- 1.2.1 The "V" dimensions -- 1.2.2 Case study: quantity of deep web data -- 1.2.3 Case study: extracted domain-specific data -- 1.2.4 Case study: quality of deep web data -- 1.2.5 Case study: surface web structured data -- 1.2.6 Case study: extracted knowledge triples -- 1.3 BDI: opportunities -- 1.3.1 Data redundancy -- 1.3.2 Long data -- 1.3.3 Big data platforms -- 1.4 Outline of book --
Text of Note
2. Schema alignment -- 2.1 Traditional schema alignment: a quick tour -- 2.1.1 Mediated schema -- 2.1.2 Attribute matching -- 2.1.3 Schema mapping -- 2.1.4 Query answering -- 2.2 Addressing the variety and velocity challenges -- 2.2.1 Probabilistic schema alignment -- 2.2.2 Pay-as-you-go user feedback -- 2.3 Addressing the variety and volume challenges -- 2.3.1 Integrating deep web data -- 2.3.2 Integrating web tables --
Text of Note
3. Record linkage -- 3.1 Traditional record linkage: a quick tour -- 3.1.1 Pairwise matching -- 3.1.2 Clustering -- 3.1.3 Blocking -- 3.2 Addressing the volume challenge -- 3.2.1 Using MapReduce to parallelize blocking -- 3.2.2 Meta-blocking: pruning pairwise matchings -- 3.3 Addressing the velocity challenge -- 3.3.1 Incremental record linkage -- 3.4 Addressing the variety challenge -- 3.4.1 Linking text snippets to structured data -- 3.5 Addressing the veracity challenge -- 3.5.1 Temporal record linkage -- 3.5.2 Record linkage with uniqueness constraints --
Text of Note
4. BDI: data fusion -- 4.1 Traditional data fusion: a quick tour -- 4.2 Addressing the veracity challenge -- 4.2.1 Accuracy of a source -- 4.2.2 Probability of a value being true -- 4.2.3 Copying between sources -- 4.2.4 The end-to-end solution -- 4.2.5 Extensions and alternatives -- 4.3 Addressing the volume challenge -- 4.3.1 A MapReduce-based framework for offline fusion -- 4.3.2 Online data fusion -- 4.4 Addressing the velocity challenge -- 4.5 Addressing the variety challenge --
Text of Note
5. BDI: emerging topics -- 5.1 Role of crowdsourcing -- 5.1.1 Leveraging transitive relations -- 5.1.2 Crowdsourcing the end-to-end workflow -- 5.1.3 Future work -- 5.2 Source selection -- 5.2.1 Static sources -- 5.2.2 Dynamic sources -- 5.2.3 Future work -- 5.3 Source profiling -- 5.3.1 The Bellman system -- 5.3.2 Summarizing sources -- 5.3.3 Future work --
Text of Note
6. Conclusions -- Bibliography -- Authors' biographies -- Index

SUMMARY OR ABSTRACT

Text of Note
The big data era is upon us: data are being generated, analyzed, and used at an unprecedented scale, and data-driven decision making is sweeping through all aspects of society. Since the value of data explodes when it can be linked and fused with other data, addressing the big data integration (BDI) challenge is critical to realizing the promise of big data. BDI differs from traditional data integration along the dimensions of volume, velocity, variety, and veracity. First, not only can data sources contain a huge volume of data, but also the number of data sources is now in the millions. Second, because of the rate at which newly collected data are made available, many of the data sources are very dynamic, and the number of data sources is also rapidly exploding. Third, data sources are extremely heterogeneous in their structure and content, exhibiting considerable variety even for substantially similar entities. Fourth, the data sources are of widely differing qualities, with significant differences in the coverage, accuracy and timeliness of data provided. This book explores the progress that has been made by the data integration community on the topics of schema alignment, record linkage and data fusion in addressing these novel challenges faced by big data integration. Each of these topics is covered in a systematic way: first starting with a quick tour of the topic in the context of traditional data integration, followed by a detailed, example-driven exposition of recent innovative techniques that have been proposed to address the BDI challenges of volume, velocity, variety, and veracity. Finally, it presents emerging topics and opportunities that are specific to BDI, identifying promising directions for the data integration community.

TOPICAL NAME USED AS SUBJECT

Big data
Data integration (Computer science)

DEWEY DECIMAL CLASSIFICATION

Number
006.312
Edition
23

LIBRARY OF CONGRESS CLASSIFICATION

Class number
QA76.9.D343
Book number
D654
2015

PERSONAL NAME - PRIMARY RESPONSIBILITY

Dong, Xin Luna, 1975-

PERSONAL NAME - ALTERNATIVE RESPONSIBILITY

Srivastava, Divesh

ORIGINATING SOURCE

Date of Transaction
20160129110409.0
Cataloguing Rules (Descriptive Conventions)
rda

ELECTRONIC LOCATION AND ACCESS

Electronic name
Read the book text

This website is managed by Dar Al-Hadith Scientific-Cultural Institute and Computer Research Center of Islamic Sciences (also known as Noor)
Libraries are responsible for the validity of information, and the spiritual rights of information are reserved for them