dc.contributor.author | Gupta, Rajesh. | |
dc.date.accessioned | 2024-01-26T21:32:21Z | |
dc.date.available | 2024-01-26T21:32:21Z | |
dc.date.issued | 2019 | |
dc.identifier.citation | Gupta. Hands-On Data Analysis with Scala: Perform Data Collection, Processing, Manipulation, and Visualization with Scala. - Birmingham: Packt Publishing, Limited, 2019 - 1 online resource (288 pages) - URL: https://libweb.kpfu.ru/ebsco/pdf/2117000.pdf | |
dc.identifier.isbn | 1789344263 | |
dc.identifier.isbn | 9781789344264 | |
dc.identifier.uri | https://dspace.kpfu.ru/xmlui/handle/net/178247 | |
dc.description | Natural language processing for data analysis | |
dc.description.abstract | This book will help you perform effective data analysis with Scala using practical examples. You will come across different challenges and their effective solutions for a variety of data processing tasks - be it data exploration, data manipulation, or real-time data analysis using Apache Spark. | |
dc.description.tableofcontents | Cover; Title Page; Copyright and Credits; Dedication; About Packt; Contributors; Table of Contents; Preface; Section 1: Scala and Data Analysis Life Cycle; Chapter 1: Scala Overview; Getting started with Scala; Running Scala code online; Scastie; ScalaFiddle; Installing Scala on your computer; Installing command-line tools; Installing IDE; Overview of object-oriented and functional programming; Object-oriented programming using Scala; Functional programming using Scala; Scala case classes and the collection API; Scala case classes; Scala collection API; Array; List; Map | |
dc.description.tableofcontents | Overview of Scala libraries for data analysis; Apache Spark; Breeze; Breeze-viz; DeepLearning; Epic; Saddle; Scalalab; Smile; Vegas; Summary; Chapter 2: Data Analysis Life Cycle; Data journey; Sourcing data; Data formats; XML; JSON; CSV; Understanding data; Using statistical methods for data exploration; Using Scala; Other Scala tools; Using data visualization for data exploration; Using the vegas-viz library for data visualization; Other libraries for data visualization; Using ML to learn from data; Setting up Smile; Running Smile; Creating a data pipeline; Summary; Chapter 3: Data Ingestion | |
dc.description.tableofcontents | Data extraction; Pull-oriented data extraction; Push-oriented data delivery; Data staging; Why is the staging important?; Cleaning and normalizing; Enriching; Organizing and storing; Summary; Chapter 4: Data Exploration and Visualization; Sampling data; Selecting the sample; Selecting samples using Saddle; Performing ad hoc analysis; Finding a relationship between data elements; Visualizing data; Vegas viz for data visualization; Spark Notebook for data visualization; Downloading and installing Spark Notebook; Creating a Spark Notebook with simple visuals; More charts with Spark Notebook | |
dc.description.tableofcontents | Box plot; Histogram; Bubble chart; Summary; Chapter 5: Applying Statistics and Hypothesis Testing; Basics of statistics; Summary level statistics; Correlation statistics; Vector level statistics; Random data generation; Pseudorandom numbers; Random numbers with normal distribution; Random numbers with Poisson distribution; Hypothesis testing; Summary; Section 2: Advanced Data Analysis and Machine Learning; Chapter 6: Introduction to Spark for Distributed Data Analysis; Spark setup and overview; Spark core concepts; Spark Datasets and DataFrames; Sourcing data using Spark; Parquet file format | |
dc.description.tableofcontents | Avro file format; Spark JDBC integration; Using Spark to explore data; Summary; Chapter 7: Traditional Machine Learning for Data Analysis; ML overview; Characteristics of ML; Categories or types of ML; Decision trees; Implementing decision trees; Decision tree algorithms; Implementing decision tree algorithms in our example; Evaluating the results; Using our model with a decision tree; Random forest; Random forest algorithms; Ridge and lasso regression; Characteristics of ridge regression; Characteristics of lasso regression; k-means cluster analysis | |
dc.language | English | |
dc.language.iso | en | |
dc.publisher | Birmingham: Packt Publishing, Limited | |
dc.subject.other | Data mining. | |
dc.subject.other | Scala (Computer program language) | |
dc.subject.other | SQL. | |
dc.subject.other | Electronic books. | |
dc.title | Hands-On Data Analysis with Scala: Perform Data Collection, Processing, Manipulation, and Visualization with Scala. | |
dc.type | Book | |
dc.description.pages | 1 online resource (288 pages) | |
dc.collection | Electronic library systems | |
dc.source.id | EN05CEBSCO05C288 |