Free Ebook Introduction to Clustering Large and High-Dimensional Data, by Jacob Kogan
The sooner you obtain the e-book Introduction to Clustering Large and High-Dimensional Data, by Jacob Kogan, the sooner you can start reading it. You can download the book from the link provided and get your own copy online. Be among the first to get the book and to see how the author presents his message and expertise.
Introduction to Clustering Large and High-Dimensional Data, by Jacob Kogan
Introduction to Clustering Large and High-Dimensional Data, by Jacob Kogan. Join us and become a member. This site makes it easy to browse for books such as Introduction to Clustering Large and High-Dimensional Data, by Jacob Kogan. Unlike other sites, the books here are offered as electronic files. What are the advantages of membership? You get hundreds of collections of book download links, updated daily. One of the books we are presenting now is Introduction to Clustering Large and High-Dimensional Data, by Jacob Kogan.
Reading always gives you something new: something you did not know before becomes familiar through a book such as Introduction to Clustering Large and High-Dimensional Data, by Jacob Kogan. The knowledge and lessons gained from reading e-books are vast. The more e-books you read, the more knowledge you gain, and the more you will enjoy reading. For this reason, it pays to start reading early.
Make a reading habit part of your lifestyle. Facts, expertise, science, health, religion, entertainment, and much more can be found in books. Many authors share their experience, science, and research with you, and Introduction to Clustering Large and High-Dimensional Data, by Jacob Kogan is one such book. Life is more complete when you learn more through reading.
From the explanation above, it is clear that you should read Introduction to Clustering Large and High-Dimensional Data, by Jacob Kogan. We provide the online book right here; just click the download link. By sharing the book online, you can also benefit many other people, and readers can easily get the book they want to read. Find this popular and much-needed book to read, here and now.
There is a growing need for a more automated system of partitioning data sets into groups, or clusters. For example, as digital libraries and the World Wide Web continue to grow exponentially, the ability to find useful information increasingly depends on the indexing infrastructure or search engine. Clustering techniques can be used to discover natural groups in data sets and to identify abstract structures that might reside there, without having any background knowledge of the characteristics of the data. Clustering has been used in a variety of areas, including computer vision, VLSI design, data mining, bioinformatics (gene expression analysis), and information retrieval, to name just a few. This book focuses on a few of the most important clustering algorithms, providing a detailed account of these major models in an information retrieval context. The beginning chapters introduce the classic algorithms in detail, while the later chapters describe clustering through divergences and show recent research for more advanced audiences.
- Sales Rank: #2742848 in Books
- Brand: Cambridge University Press
- Published on: 2006-11-13
- Released on: 2007-02-22
- Original language: English
- Number of items: 1
- Dimensions: 8.98" h x .59" w x 5.98" l, .65 pounds
- Binding: Paperback
- 222 pages
- Used Book in Good Condition
Review
"...this book may serve as a useful reference for scientists and engineers who need to understand the concepts of clustering in general and/or to focus on text mining applications. It is also appropriate for students who are attending a course in pattern recognition, data mining, or classification and are interested in learning more about issues related to the k-means scheme for an undergraduate or master's thesis project. Last, it supplies very interesting material for instructors."
Nicolas Loménie, IAPR Newsletter
About the Author
Jacob Kogan is an Associate Professor in the Department of Mathematics and Statistics at the University of Maryland, Baltimore County. Dr. Kogan received his PhD in Mathematics from the Weizmann Institute of Science and has held teaching and research positions at the University of Toronto and Purdue University. His research interests include Text and Data Mining, Optimization, Calculus of Variations, Optimal Control Theory, and Robust Stability of Control Systems. Dr. Kogan is the author of Bifurcations of Extremals in Optimal Control and Robust Stability and Convexity: An Introduction. Since 2001, he has also been affiliated with the Department of Computer Science and Electrical Engineering at UMBC. Dr. Kogan is a recipient of a 2004-2005 Fulbright Fellowship to Israel. Together with Charles Nicholas of UMBC and Marc Teboulle of Tel-Aviv University, he is co-editor of the volume Grouping Multidimensional Data: Recent Advances in Clustering.
Most helpful customer reviews
4 of 4 people found the following review helpful.
Compilation of lecture notes with little or no directions and guidelines
By Peter Kwok
This book is based on the author's lecture notes for an undergraduate class. Unfortunately, even after some editorial effort, this book remains largely a compilation of theorems and exercises with little coherence or direction. Chapter 1 frames the whole book from the standpoint of information retrieval (IR). But, as you read on, it should be clear that this book has little to do with IR, nor does it use the examples well enough to make a point in the book. The rest of the book can be divided into two parts.
The first part spans Chapters 2 through 5 and goes over three basic clustering algorithms and their variations: the k-means algorithm, BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies), and PDDP (Principal Direction Divisive Partitioning). Here k-means needs no further introduction. BIRCH received the SIGMOD 10-year Test of Time Award in 2006. PDDP is primarily a "bisecting" algorithm. It and its variations are regarded by the author as elegant. So, I think the author has made some good choices here, especially for a short book like this.
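To make the "bisecting" idea concrete, here is a minimal sketch of a single PDDP-style split as the algorithm is commonly described: center the data, find the leading principal direction, and divide the points by the sign of their projection onto it. This is an illustration only, not the book's presentation. The function name `pddp_split` and its parameters are hypothetical, and power iteration is swapped in for the singular value decomposition normally used, so that the sketch needs only the Python standard library.

```python
import math
import random


def pddp_split(points, iters=200, seed=0):
    """Bisect `points` by the sign of their projection onto the leading
    principal direction of the centered data (one PDDP-style split).

    Assumption: power iteration approximates the principal direction;
    a production implementation would typically use an SVD instead.
    """
    n, d = len(points), len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    centered = [[p[j] - mean[j] for j in range(d)] for p in points]

    # Seeded random start, so it is unlikely to be exactly orthogonal
    # to the principal direction and the result is reproducible.
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in range(d)]

    for _ in range(iters):
        # proj = X v, then w = X^T (X v), where X is the centered data.
        proj = [sum(x[j] * v[j] for j in range(d)) for x in centered]
        w = [sum(proj[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break  # degenerate data (for example, all points identical)
        v = [c / norm for c in w]

    left, right = [], []
    for p, x in zip(points, centered):
        score = sum(x[j] * v[j] for j in range(d))
        (left if score <= 0 else right).append(p)
    return left, right


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    a, b = pddp_split(data)
    print(sorted(a), sorted(b))
```

Which half ends up as `left` or `right` depends on the sign of the computed direction, so only the grouping itself is meaningful. A full divisive method would apply this split repeatedly, each time choosing a cluster to bisect next.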
What I have a problem with is that, although the book contains many basic facts and theorems about these three algorithms, and even scattered references back to the document embedding example in Chapter 1, there are never enough explanations to tie the clustering algorithms to their significance in those examples. For instance, in subsection 2.3.1, various collections of documents (even with a URL) were introduced. This book is written for general readers who are not necessarily IR-oriented. Yet, without any mention of how the documents were clustered and what the clusters were used for, the author just presented table after table of "quality" measures and improvement rates, assuming the readers had already understood what was going on and accepted what needed to be done. At that point, PDDP had not yet been introduced. But that didn't stop the author from making comparisons already. When you expect that the next algorithm, BIRCH, will be given more detail simply because it is well known for its efficiency in squashing large data, there are only some minimal descriptions. Worse, the author insisted that his presentation was equivalent to the original BIRCH paper (Zhang, Ramakrishnan, Livny 1997) and thereby skipped a lot of details (e.g., phases 1 through 4) that are crucial to understanding the more commonly known BIRCH rather than just his simplified version. The chapter on PDDP is equally unnoteworthy. Subsection 5.4.3 has a cute little example on a hub-and-authority model. But the technique used is quite model-specific and does not shed any new light on the characteristics of the algorithms already presented. I personally feel that there are just too many of these by-the-way-you-can-also-do-this moments, which disrupt the flow of presentation and obscure the main messages.
Since the examples are not well chosen, the strengths and weaknesses of the algorithms become harder to see. For the few comparisons among the algorithms, there are some algorithmic step count comparisons, and some illustrations of extreme scenarios. As important as those comparisons are, it would have been more helpful if a discussion of computational complexity and constraints had been added. Even though the book's title mentions "large" and "high-dimensional" data, it is not obvious from its contents why the three algorithms are particularly good for large and high-dimensional data as claimed.
The second part of the book spans Chapters 6 through 10 and explores alternative distance functions and clustering performance measures. This is important because the choice of a distance or distance-like function is often arbitrary. Alternatives, such as Kullback-Leibler divergence, that are more directly related to the information-theoretic properties of probability distributions would naturally be more appealing, at least from a conceptual standpoint. However, whether the semester was getting short at the time the author prepared the lecture notes, or he made a conscious editorial decision to shorten the exposition in the book, each topic is only briefly touched on. Again, one alternative after another is presented, but the book gives little or no space to discussing what they are good for and when they should be used. If you are looking for the next mathematical theorem to prove, that may not be a bad thing if your imagination is kept open. But, as a practitioner, I find the lack of discussion a little discomforting.
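As a small illustration of why the choice of a distance-like function matters, the sketch below compares squared Euclidean distance with Kullback-Leibler divergence on two discrete probability distributions. It is not taken from the book; the function names and the example vectors `p` and `q` are made up for illustration. The point it shows is that KL divergence is asymmetric, so it is "distance-like" rather than a true distance.

```python
import math


def squared_euclidean(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) = sum_i p_i * log(p_i / q_i).

    Terms with p_i == 0 contribute 0 by convention. If q_i == 0 while
    p_i > 0, the divergence is infinite.
    """
    total = 0.0
    for a, b in zip(p, q):
        if a > 0:
            if b <= 0:
                return math.inf
            total += a * math.log(a / b)
    return total


if __name__ == "__main__":
    # Hypothetical example distributions over two outcomes.
    p = [0.5, 0.5]
    q = [0.9, 0.1]
    print("squared Euclidean:", squared_euclidean(p, q), squared_euclidean(q, p))
    print("KL(p || q):", kl_divergence(p, q))
    print("KL(q || p):", kl_divergence(q, p))
```

The squared Euclidean value is the same in both directions, while the two KL values differ, so a clustering method built on KL divergence has to fix which argument plays the role of the data point and which the role of the centroid.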
Interested students may welcome Chapter 11, which contains solutions to exercises throughout the book. Serious researchers may also find the bibliography informative. At the end of most chapters throughout the book, there is usually a section on bibliographic notes, which I also found to be very helpful in understanding the motivations behind the development of many of the ideas.
In summary, this book is short and gets to the point quickly, which is good. If you are only interested in knowing what a clustering algorithm is, this can be a decent reference. The downside is that the exposition never gives enough depth, in the sense that it does not successfully show how one algorithm performs differently than another. Moreover, the book provides little or no guidance on how to choose an algorithm or distance function. Its examples are hopelessly disconnected from many main themes. There are many good points in this book. And I think the author did the research community a service by writing on the important topic of large data set clustering. However, due to its many shortcomings, I have given the book only 3 stars.