dc.contributor.author |
Gupta, Rajesh |
|
dc.date.accessioned |
2024-01-26T21:32:21Z |
|
dc.date.available |
2024-01-26T21:32:21Z |
|
dc.date.issued |
2019 |
|
dc.identifier.citation |
Gupta. Hands-On Data Analysis with Scala: Perform Data Collection, Processing, Manipulation, and Visualization with Scala. - Birmingham: Packt Publishing, Limited, 2019 - 1 online resource (288 pages) - URL: https://libweb.kpfu.ru/ebsco/pdf/2117000.pdf |
|
dc.identifier.isbn |
1789344263 |
|
dc.identifier.isbn |
9781789344264 |
|
dc.identifier.uri |
https://dspace.kpfu.ru/xmlui/handle/net/178247 |
|
dc.description |
Natural language processing for data analysis |
|
dc.description.abstract |
This book will help you perform effective data analysis with Scala using practical examples. You will come across different challenges and their effective solutions for a variety of data processing tasks - be it data exploration, data manipulation, or real-time data analysis using Apache Spark. |
|
dc.description.tableofcontents |
Cover; Title Page; Copyright and Credits; Dedication; About Packt; Contributors; Table of Contents; Preface; Section 1: Scala and Data Analysis Life Cycle; Chapter 1: Scala Overview; Getting started with Scala; Running Scala code online; Scastie; ScalaFiddle; Installing Scala on your computer; Installing command-line tools; Installing IDE; Overview of object-oriented and functional programming; Object-oriented programming using Scala; Functional programming using Scala; Scala case classes and the collection API; Scala case classes; Scala collection API; Array; List; Map |
|
dc.description.tableofcontents |
Overview of Scala libraries for data analysis; Apache Spark; Breeze; Breeze-viz; DeepLearning; Epic; Saddle; Scalalab; Smile; Vegas; Summary; Chapter 2: Data Analysis Life Cycle; Data journey; Sourcing data; Data formats; XML; JSON; CSV; Understanding data; Using statistical methods for data exploration; Using Scala; Other Scala tools; Using data visualization for data exploration; Using the vegas-viz library for data visualization; Other libraries for data visualization; Using ML to learn from data; Setting up Smile; Running Smile; Creating a data pipeline; Summary; Chapter 3: Data Ingestion |
|
dc.description.tableofcontents |
Data extraction; Pull-oriented data extraction; Push-oriented data delivery; Data staging; Why is the staging important?; Cleaning and normalizing; Enriching; Organizing and storing; Summary; Chapter 4: Data Exploration and Visualization; Sampling data; Selecting the sample; Selecting samples using Saddle; Performing ad hoc analysis; Finding a relationship between data elements; Visualizing data; Vegas viz for data visualization; Spark Notebook for data visualization; Downloading and installing Spark Notebook; Creating a Spark Notebook with simple visuals; More charts with Spark Notebook |
|
dc.description.tableofcontents |
Box plot; Histogram; Bubble chart; Summary; Chapter 5: Applying Statistics and Hypothesis Testing; Basics of statistics; Summary level statistics; Correlation statistics; Vector level statistics; Random data generation; Pseudorandom numbers; Random numbers with normal distribution; Random numbers with Poisson distribution; Hypothesis testing; Summary; Section 2: Advanced Data Analysis and Machine Learning; Chapter 6: Introduction to Spark for Distributed Data Analysis; Spark setup and overview; Spark core concepts; Spark Datasets and DataFrames; Sourcing data using Spark; Parquet file format |
|
dc.description.tableofcontents |
Avro file format; Spark JDBC integration; Using Spark to explore data; Summary; Chapter 7: Traditional Machine Learning for Data Analysis; ML overview; Characteristics of ML; Categories or types of ML; Decision trees; Implementing decision trees; Decision tree algorithms; Implementing decision tree algorithms in our example; Evaluating the results; Using our model with a decision tree; Random forest; Random forest algorithms; Ridge and lasso regression; Characteristics of ridge regression; Characteristics of lasso regression; k-means cluster analysis |
|
dc.language |
English |
|
dc.language.iso |
en |
|
dc.publisher |
Birmingham: Packt Publishing, Limited |
|
dc.subject.other |
Data mining. |
|
dc.subject.other |
Scala (Computer program language) |
|
dc.subject.other |
SQL. |
|
dc.subject.other |
Electronic books. |
|
dc.title |
Hands-On Data Analysis with Scala: Perform Data Collection, Processing, Manipulation, and Visualization with Scala. |
|
dc.type |
Book |
|
dc.description.pages |
1 online resource (288 pages) |
|
dc.collection |
Electronic library systems |
|
dc.source.id |
EN05CEBSCO05C288 |
|