• Home
  • Advanced Search
  • Directory of Libraries
  • About lib.ir
  • Contact Us
  • History

عنوان
Text Representation using Convolutional Networks

پدید آورنده
Zhang, Xiang

موضوع
Computer science

رده

کتابخانه
Center and Library of Islamic Studies in European Languages

محل استقرار
استان: Qom ـ شهر: Qom

Center and Library of Islamic Studies in European Languages

تماس با کتابخانه : 32910706-025

NATIONAL BIBLIOGRAPHY NUMBER

Number
TLpq2198862498

LANGUAGE OF THE ITEM

.Language of Text, Soundtrack etc
انگلیسی

TITLE AND STATEMENT OF RESPONSIBILITY

Title Proper
Text Representation using Convolutional Networks
General Material Designation
[Thesis]
First Statement of Responsibility
Zhang, Xiang
Subsequent Statement of Responsibility
LeCun, Yann

.PUBLICATION, DISTRIBUTION, ETC

Name of Publisher, Distributor, etc.
New York University
Date of Publication, Distribution, etc.
2019

PHYSICAL DESCRIPTION

Specific Material Designation and Extent of Item
164

DISSERTATION (THESIS) NOTE

Dissertation or thesis details and type of degree
Ph.D.
Body granting the degree
New York University
Text preceding or following the note
2019

SUMMARY OR ABSTRACT

Text of Note
This dissertation applies convolutional networks for learning representations of text, and it consists of several parts. The first part offers an empirical exploration on the use of character-level convolutional networks (ConvNets) for text classification. We constructed several large-scale datasets to show that character-level convolutional networks could achieve state-of-the-art or competitive results. Comparisons are offered against traditional models such as bag of words, n-grams and their TFIDF variants, and deep learning models such as word-based ConvNets and recurrent neural networks. These results indicate that using low-level inputs - in this case characters - for convolutional networks could be feasible for text representation learning. The second part concerns which text encoding method might work for convolutional networks. We include a comprehensive comparison of different encoding methods for the task of text classification using 14 large-scale datasets in 4 languages including Chinese, English, Japanese and Korean. Different encoding levels are studied, including UTF-8 bytes, characters, words, romanized characters and romanized words. For all encoding levels, whenever applicable, we provide comparisons with linear models, fastText and convolutional networks. For convolutional networks, we compare between encoding mechanisms using character glyph images, one-hot (or one-of-n) encoding, and embedding. From these 473 models, one of the conclusions is that byte-level one-hot encoding works consistently best for convolutional networks. Based on this, in the third part of the dissertation we develop a convolutional network at the level of bytes for learning representations through the task of auto-encoding. The proposed model is a multi-stage deep convolutional encoder-decoder framework using residual connections, containing up to 160 parameterized layers. Each encoder or decoder contains a shared group of modules that consists of either pooling or up-sampling layers, making the network recursive in terms of abstraction levels in representation. The decoding process is non-sequential. Results for 6 large-scale paragraph datasets are reported, in 3 languages including Arabic, Chinese and English. Analyses are conducted to study several properties of the proposed model. Experiments are presented to verify that the auto-encoder can learn useful representations. In the fourth part of the dissertation, we use the improved design from the previous auto-encoding model to text classification, adding comparisons between residual and dense connections. This further validates the choice of the architecture we made for the auto-encoding model, and the effectiveness of the recursive architecture with residual or dense connections.

TOPICAL NAME USED AS SUBJECT

Computer science

PERSONAL NAME - PRIMARY RESPONSIBILITY

LeCun, Yann
Zhang, Xiang

ELECTRONIC LOCATION AND ACCESS

Electronic name
 مطالعه متن کتاب 

p

[Thesis]
276903

a
Y

Proposal/Bug Report

Warning! Enter The Information Carefully
Send Cancel
This website is managed by Dar Al-Hadith Scientific-Cultural Institute and Computer Research Center of Islamic Sciences (also known as Noor)
Libraries are responsible for the validity of information, and the spiritual rights of information are reserved for them
Best Searcher - The 5th Digital Media Festival