R Programming for Data Science (v1.0)
This course teaches you the fundamentals of programming in R, then shows you how to use R to perform common data science tasks and achieve data-driven results for the business.
Description
Overview
In our data-driven world, organizations need the right tools to extract valuable insights from their data. The R programming language is one of the tools at the forefront of data science. Its robust set of packages and statistical functions makes it a powerful choice for analyzing and manipulating data, performing statistical tests, and creating predictive models. R is also notable for its strong data visualization tools, which enable you to create high-quality, highly customizable graphs and plots.
Course Objectives
In this course, you will use R to perform common data science tasks. You will:
- Set up an R development environment and execute simple code
- Perform operations on atomic data types in R, including characters, numbers, and logicals
- Perform operations on data structures in R, including vectors, lists, and data frames
- Write conditional statements and loops
- Structure code for reuse with functions and packages
- Manage data by loading and saving datasets, manipulating data frames, and more
- Analyze data through exploratory analysis, statistical analysis, and more
- Create and format data visualizations using base R and ggplot2
- Create simple statistical models from data
Who Should Attend
This course is designed for students who want to learn the R programming language, particularly those who want to leverage R for data analysis and data science tasks in their organization. It is also designed for students with an interest in applying statistics to real-world problems. A typical student in this course should have several years of experience with computing technology, along with proficiency in at least one other programming language.
Course Outline
Module 1: Setting Up R and Executing Simple Code
- Set Up the R Development Environment
- Write R Statements
Module 2: Processing Atomic Data Types
- Process Characters
- Process Numbers
- Process Logicals
Module 3: Processing Data Structures
- Process Vectors
- Process Factors
- Process Data Frames
- Subset Data Structures
Module 4: Writing Conditional Statements and Loops
- Write Conditional Statements
- Write Loops
Module 5: Structuring Code for Reuse
- Define and Call Functions
- Apply Loop Functions
- Manage R Packages
Module 6: Managing Data in R
- Load Data
- Save Data
- Manipulate Data Frames Using Base R
- Manipulate Data Frames Using dplyr
- Handle Dates and Times
Module 7: Analyzing Data in R
- Examine Data
- Explore the Underlying Distribution of Data
- Identify Missing Values
Module 8: Visualizing Data in R
- Plot Data Using Base R Functions
- Plot Data Using ggplot2
- Format Plots in ggplot2
- Create Combination Plots
Module 9: Modeling Data in R
- Create Statistical Models in R
- Create Machine Learning Models in R
Prerequisites
To ensure your success in this course, you should be comfortable with basic computer programming concepts, including but not limited to: syntax, data types, conditional statements, loops, and functions. You can obtain this level of skill and knowledge by taking the Introduction to Programming with Python® course. You should also have at least a high-level understanding of fundamental data science concepts, including but not limited to: data engineering, data analysis, data storage, data visualization, and statistics.