Electronic Archive

Hands-On Data Analysis with Scala: Perform Data Collection, Processing, Manipulation, and Visualization with Scala.


dc.contributor.author Gupta Rajesh.
dc.date.accessioned 2024-01-26T21:32:21Z
dc.date.available 2024-01-26T21:32:21Z
dc.date.issued 2019
dc.identifier.citation Gupta. Hands-On Data Analysis with Scala: Perform Data Collection, Processing, Manipulation, and Visualization with Scala. - Birmingham: Packt Publishing, Limited, 2019 - 1 online resource (288 pages) - URL: https://libweb.kpfu.ru/ebsco/pdf/2117000.pdf
dc.identifier.isbn 1789344263
dc.identifier.isbn 9781789344264
dc.identifier.uri https://dspace.kpfu.ru/xmlui/handle/net/178247
dc.description Natural language processing for data analysis
dc.description.abstract This book will help you perform effective data analysis with Scala using practical examples. You will come across different challenges and their effective solutions for a variety of data processing tasks - be it data exploration, data manipulation, or real-time data analysis using Apache Spark.
dc.description.tableofcontents Cover; Title Page; Copyright and Credits; Dedication; About Packt; Contributors; Table of Contents; Preface; Section 1: Scala and Data Analysis Life Cycle; Chapter 1: Scala Overview; Getting started with Scala; Running Scala code online; Scastie; ScalaFiddle; Installing Scala on your computer; Installing command-line tools; Installing IDE; Overview of object-oriented and functional programming; Object-oriented programming using Scala; Functional programming using Scala; Scala case classes and the collection API; Scala case classes; Scala collection API; Array; List; Map
dc.description.tableofcontents Overview of Scala libraries for data analysis; Apache Spark; Breeze; Breeze-viz; DeepLearning; Epic; Saddle; Scalalab; Smile; Vegas; Summary; Chapter 2: Data Analysis Life Cycle; Data journey; Sourcing data; Data formats; XML; JSON; CSV; Understanding data; Using statistical methods for data exploration; Using Scala; Other Scala tools; Using data visualization for data exploration; Using the vegas-viz library for data visualization; Other libraries for data visualization; Using ML to learn from data; Setting up Smile; Running Smile; Creating a data pipeline; Summary; Chapter 3: Data Ingestion
dc.description.tableofcontents Data extraction; Pull-oriented data extraction; Push-oriented data delivery; Data staging; Why is the staging important?; Cleaning and normalizing; Enriching; Organizing and storing; Summary; Chapter 4: Data Exploration and Visualization; Sampling data; Selecting the sample; Selecting samples using Saddle; Performing ad hoc analysis; Finding a relationship between data elements; Visualizing data; Vegas viz for data visualization; Spark Notebook for data visualization; Downloading and installing Spark Notebook; Creating a Spark Notebook with simple visuals; More charts with Spark Notebook
dc.description.tableofcontents Box plot; Histogram; Bubble chart; Summary; Chapter 5: Applying Statistics and Hypothesis Testing; Basics of statistics; Summary level statistics; Correlation statistics; Vector level statistics; Random data generation; Pseudorandom numbers; Random numbers with normal distribution; Random numbers with Poisson distribution; Hypothesis testing; Summary; Section 2: Advanced Data Analysis and Machine Learning; Chapter 6: Introduction to Spark for Distributed Data Analysis; Spark setup and overview; Spark core concepts; Spark Datasets and DataFrames; Sourcing data using Spark; Parquet file format
dc.description.tableofcontents Avro file format; Spark JDBC integration; Using Spark to explore data; Summary; Chapter 7: Traditional Machine Learning for Data Analysis; ML overview; Characteristics of ML; Categories or types of ML; Decision trees; Implementing decision trees; Decision tree algorithms; Implementing decision tree algorithms in our example; Evaluating the results; Using our model with a decision tree; Random forest; Random forest algorithms; Ridge and lasso regression; Characteristics of ridge regression; Characteristics of lasso regression; k-means cluster analysis
dc.language English
dc.language.iso en
dc.publisher Birmingham: Packt Publishing, Limited
dc.subject.other Data mining.
dc.subject.other Scala (Computer program language)
dc.subject.other SQL.
dc.subject.other Electronic books.
dc.title Hands-On Data Analysis with Scala: Perform Data Collection, Processing, Manipulation, and Visualization with Scala.
dc.type Book
dc.description.pages 1 online resource (288 pages)
dc.collection Electronic library systems
dc.source.id EN05CEBSCO05C288

