Intro to PCDA course

Welcome to my Practical Computing for Data Analytics (PCDA) class. We’ll do several things the first week of class:

  • overview of the field of business analytics / data science

  • course overview and logistics

  • get some hands-on experience with some of the technology we’ll use in the course

  • start to learn how to use the Linux shell for basic file management and putting together Linux commands to accomplish simple analytical tasks

Objectives

By the end of this module you will:

  • have explored the syllabus and course web sites so that you know how this course will operate,

  • have had a preview of some of the types of things you’ll learn and the activities you’ll do in this course,

  • have begun to get hands-on experience with some of the technical computing tools used in this course,

  • be ready to learn all kinds of cool business analytics things.

Readings

We’ll start using the Linux bash shell during the first week of class. So, might as well get going on learning the basics.

For now, read Section 1 of the Software Carpentry tutorial entitled: The Unix Shell. In Week 2 we’ll be learning the things covered in Sections 1-4 so feel free to skim those if you want to get a head start.

See the Explore section below for additional Linux shell related resources.

Downloads

There will always be one or more “Download” files for each class. Each is a compressed archive containing all the files we’ll need for the session. In the Windows world, this would usually be a .zip file. However, in the Linux world, we often use “gzipped tarballs”, which have a .tar.gz extension. We’ll extract these in our Linux virtual machine (as part of our Week 1 intro), though you can certainly extract these files in Windows as well using the free utility 7-Zip.
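For the curious, here is a minimal sketch of what extracting a gzipped tarball looks like from Python, using the standard library tarfile module. In class we’ll most likely do this from the shell with something like tar -xzf followed by the archive name. The function name extract_archive and the file names week1_demo.tar.gz and notes.txt are made up for illustration and are not actual course files; the demo builds its own tiny archive so that it runs anywhere.

```python
import tarfile
import tempfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Extract a .tar.gz archive into dest_dir and return the member names.

    Only extract archives you trust (such as course downloads).
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        names = tar.getnames()   # list the contents before extracting
        tar.extractall(dest)     # unpack everything under dest
    return names


if __name__ == "__main__":
    # Build a tiny throwaway archive (hypothetical names), then extract it.
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        sample = tmp / "notes.txt"
        sample.write_text("hello pcda\n")

        archive = tmp / "week1_demo.tar.gz"
        with tarfile.open(archive, "w:gz") as tar:
            tar.add(sample, arcname="week1_demo/notes.txt")

        for name in extract_archive(archive, tmp / "out"):
            print(name)
```

The "r:gz" mode tells tarfile to expect gzip compression, which is what the .tar.gz extension signals.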

Activities

Note

Our SBA web server has some issues that sometimes lead to problems loading our course webpages or my faculty home page. If this happens, you can usually fix the problem by clearing your browser cache and reloading the page.

Or, you can use one of the alternative links - https://pcda.misken.org or https://mis5470.netlify.app.

Overview of pcda class

I’ll present an overview of this class as well as the general topic of data science / business analytics.

Warning

If you are using the VM, do NOT watch the screencasts from within the pcda VM. Watch them from a browser opened in your host OS (i.e. Windows or Mac).

Class logistics

Between the “Course welcome video” and the “Week 1 Welcome Video” (both available via Moodle), all of the course logistics are covered. So, if you haven’t watched these yet, please do so ASAP. Also, read the syllabus carefully (again, Moodle). Finally, review the first two Announcements I made in Moodle.

The pcda computing appliance

We’ll discuss the things that led to the pcda appliance:

  • why’s and what’s of Linux

  • why’s and what’s of R and Python

  • open source facilitates contributed packages with latest and greatest statistical techniques, bug fixes, domain specific tools, etc.

  • free, like speech and like beer

  • efficiency of command line and scripts vs GUI

  • reproducible analysis/research

You should go through (if you haven’t already) the screencasts and instructions on the pcda VM page that cover installation and an overview of VirtualBox and the Lubuntu desktop. The screencasts below are from Fall 2020 but nothing has changed except the name of the VM.

Preview of data science with R and RStudio

You’ll get your first peek at these tools and get a preview of a typical analysis project involving building and comparing predictive models. This will serve as a preview of much of what this course is about.

Preview of Python and Anaconda

We’ll just do a quick look so that those who are curious can start to tinker around. We’ll be learning Python later in the semester.

Explore (OPTIONAL)

A few more Linux shell tutorials that I’ve found useful are:

Note

In the “Learn Enough Command Line to be Dangerous…” tutorial, there are two nice boxes describing the “magic of computers” and “technical sophistication”. READ THEM.

This section will typically have links related to the topic, … or not. Have fun exploring and learning more.