4:00 pm to 6:00 pm
LISA Statistics Short Course: Introduction to Web Scraping in R
(Research)
LISA SHORT COURSES IN STATISTICS
LISA (Virginia Tech's Laboratory for Interdisciplinary Statistical Analysis) is providing a series of evening short courses to help graduate students use statistics in their research. The focus of these two-hour courses is on teaching practical statistical techniques for analyzing or collecting data. See www.lisa.stat.vt.edu/?q=short_courses for instructions on how to REGISTER and to learn more.
Spring 2016 Schedule:
Tuesday, March 15, 4:00-6:00 pm: Comparing Means and Other Measures of Location between Two Populations by Significance Tests and Effect Size;
Tuesday, March 22, 4:00-6:00 pm: Data Analytics - Classification;
Tuesday, March 29, 4:00-6:00 pm: Basics of R;
Tuesday, April 5, 4:00-6:00 pm: Statistical Analysis Using R;
Tuesday, April 12, 4:00-6:00 pm: Better Data Visualization in R Using the ggplot2 Package;
Tuesday, April 19, 4:00-6:00 pm: Introduction to Web Scraping in R;
Tuesday, April 26, 4:00-6:00 pm: Introduction to Multivariate Analysis of Variance (MANOVA) in JMP;
Tuesday, April 19, 4:00-6:00 pm;
Location: 1100 Torgersen Hall;
Instructor: Adam Edwards;
Title: Introduction to Web Scraping in R;
R is open-source software with many tools for data manipulation and analysis. The basic R packages include several data sets, as well as functions to read data from files on the local drive. This course covers a more advanced package that lets users create their own data sets by compiling information from web pages.
In this course, participants will learn the functions needed to collect information from web pages and how to manipulate the compiled information into data frames. The course works through one example of pulling a table directly from a web page and a more complex example of compiling a coherent data set from multiple web pages. All code used in this course will be made available.
This is an advanced R course, and prior knowledge of R is required. Participants are encouraged to attend or watch other LISA short courses in R beforehand. For a full list of courses being taught this semester, check www.lisa.stat.vt.edu/?q=short_courses. Past short courses can be viewed at www.lisa.stat.vt.edu/?q=past_courses.
Topics covered:
1) Reading HTML pages into R
2) Selecting specific HTML nodes
3) Isolating specific attributes of a node
4) Manipulating data into data frames
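As a rough preview of the four topics above, here is a minimal sketch. It assumes the rvest package, which this announcement does not name, so the course itself may use a different package. The page content, the `a.course` and `#courses` selectors, and the variable names are hypothetical and are inlined so the example runs without a network connection.

```r
# Assumption: rvest is installed; the announcement does not say which package the course uses.
library(rvest)

# Hypothetical page content, inlined so the sketch runs offline.
page_source <- '<html><body>
  <table id="courses">
    <tr><th>Date</th><th>Title</th></tr>
    <tr><td>March 29</td><td>Basics of R</td></tr>
    <tr><td>April 19</td><td>Introduction to Web Scraping in R</td></tr>
  </table>
  <a class="course" href="/basics">Basics of R</a>
  <a class="course" href="/scraping">Introduction to Web Scraping in R</a>
</body></html>'

# 1) Read the HTML into R (read_html() also accepts a URL or file path).
page <- read_html(page_source)

# 2) Select specific nodes with a CSS selector.
links <- html_nodes(page, "a.course")

# 3) Isolate the text and one attribute of each selected node,
# 4) then combine them into a data frame.
link_df <- data.frame(
  title = html_text(links),
  href = html_attr(links, "href"),
  stringsAsFactors = FALSE
)

# A table on the page can also be pulled directly into a data frame.
course_table <- html_table(html_node(page, "#courses"))

print(link_df)
print(course_table)
```

To compile a data set from several pages, the same steps could be wrapped in a function and applied to a vector of URLs, with the resulting data frames combined using `rbind()`.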
Follow us on Facebook (www.facebook.com/Statistical.collaboration) or Twitter (www.twitter.com/LISA_VT) to be the first to know about LISA events!