Schedule

Data Analytics and Data Visualization in the Social and Behavioral Sciences
M W 11:00am – 12:15pm, 3611 James Hall

I recommend that you use a cloud storage service for your work in the course this semester. You can post links to your files on the course site and email them to me, which will make it easier to keep track of everything. There are many options for free cloud accounts, including:
Dropbox https://www.dropbox.com/individual
Box https://www.box.com/pricing/individual
Google Drive https://www.google.com/drive/
Microsoft and Apple also have cloud storage services.

Required readings:
1. Beer, David. 2019. The Data Gaze: Capitalism, Power and Perception. Sage.
2. Jones, Ben. 2014. Communicating Data with Tableau. O’Reilly.
3. Dale, Kyran. Data Visualization with Python and JavaScript. O’Reilly.

Additional readings will be assigned.

Documentation
JupyterLab: https://jupyterlab.readthedocs.io/en/latest/
Jupyter Notebooks: https://nbformat.readthedocs.io/en/latest/
*** Introduction to Jupyter notebooks: https://realpython.com/jupyter-notebook-introduction/#cell-types
Pandas: https://pandas.pydata.org/pandas-docs/stable/
*** Pandas exercises: https://www.machinelearningplus.com/python/101-pandas-exercises-python/
Matplotlib: https://matplotlib.org/3.1.1/contents.html
NumPy: https://docs.scipy.org/doc/numpy/user/index.html
statsmodels: https://www.statsmodels.org/stable/examples/index.html
seaborn: https://seaborn.pydata.org/

Topics and schedule:
Week 1. Introduction to the course
M 1/27) Discussion of the syllabus, classroom processes, and assignments. Discussion of the history of data analytics and of contexts for data analysis in the “data revolution,” such as visualization, machine learning, big data, and social networks.

Part I. Tableau
W 1/29) Introduction to Tableau. Read: Jones 1

Week 2. Critical thinking about data
M 2/3) Discussion of contexts of real world data analytics. Identification of real world data. Construction of research questions and discussion of the logic of analysis. Read: Beer 1.

W 2/5) Mastery of Tableau basics. Discussion of real world data sources for team projects. Read: Jones 3-6

Assignment 1: Logic of analysis

Week 3. Introduction to data visualization
M 2/10) Discussion of principles of visual communication. Read: Dale 1

W 2/19) Mastery of Tableau. Discussion of data issues for team project. Read: Jones 7-10

Assignment 2: Tableau dashboard

** No classes on 2/12 or 2/17 **

Week 4. Project planning
M 2/24) Discussion of project planning, timelines, benchmarks, coordination.

W 2/26) More on dashboards. Read: Jones 12

Assignment 3: Project design

Week 5. Finishing Tableau
M 3/2) Principles of visual communication.

W 3/4) Data narratives. Read: Jones 13-14

Assignment 4: Data story

Part II. Python
Week 6. Introduction to programming
M 3/9) Code: Introduction to Python. Getting set up with JupyterLab. Basic syntax, data structures

W 3/11) Code: Libraries, methods, loops. Read: Dale 2, 3.

Assignment 5: JupyterLab & Python basics

Proposal due.

The university has announced that there will be no classes from 3/12 to 3/18, and beginning on the 19th to the end of the semester, all instruction will be online. So we're moving our work to a fully online format during this time of public health challenges.

Week 7. Ethics of data and algorithms
M 3/16) Discussion of recent literature on the data revolution and social processes and institutions. Identification of real world settings where data practices are contested.

W 3/18) Code: Introduction to DataFrames.

We will be in fully online mode from this point to the end of the semester. We will not meet in person during class time on Mondays or Wednesdays. Instead, I will be available on Slack and in virtual office hours, M 11AM-12:15PM. I'll add optional meetings on W 11AM-12:15PM if you wish to discuss the coding activities and assignments. (I'll send out instructions for the meeting platform on Slack and by email.) Activities (readings and coding) and assignments will continue through the course site as usual. As we adjust to our stressful and rapidly changing routines during this public health emergency, our relationship to our competing commitments will require some flexibility. I will accept work for credit, including website activities, coding assignments, and the visualization project, until the last class meeting. I think it will be helpful to keep to the schedule as best you can, but no work will be considered late if turned in before the end of the semester.

Week 8. Introduction to modeling
Code: Linear models, part I. Linear models, part II.
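
For orientation before the linear models sessions, here is a minimal sketch of what an ordinary least squares (OLS) fit with one predictor computes. It uses only plain Python and made-up numbers. The function name fit_ols and the data are illustrative assumptions, not the course's code; in the coding activities a library such as statsmodels would normally do this work.

```python
def fit_ols(x, y):
    """Return (intercept, slope) of the least-squares line through (x, y).

    Hypothetical helper for illustration: a one-predictor OLS fit
    using the closed-form formulas.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up data that lies exactly on the line y = 1 + 2x
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]

intercept, slope = fit_ols(x, y)
print(f"intercept = {intercept:.2f}, slope = {slope:.2f}")
```

The slope is the covariance of x and y divided by the variance of x, and the intercept makes the line pass through the point of means.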

W 3/25) We'll have a Zoom meeting with the IBM data scientists.

The university announced another change to the schedule. The chancellor announced a "recalibration" period from Friday, March 27, through Wednesday April 1. There will be no "classes" during this period. As a result, the "University’s previously scheduled Spring Recess will now run from Wednesday April 8 through Friday April 10," which means we'll have "classes" the week of April 12. To make things clearer on the remainder of the schedule, I'm going to replace the dates with weeks. We'll have a Zoom meeting on Mondays, 11AM-12:15PM. I'll be available on Wednesdays 11AM-12:15PM and during regular office hours on the Slack channel and the course site. Email communication will continue as before. We'll arrange communications with the IBM team as needed.

Week 9. No classes

Week 10. Beginning the machine learning project
Project: The classification project. Exploring the dataset. Data preparation. Modeling, part I. Modeling, part II.

M 4/6) Zoom meeting to discuss linear and logistic regression.
T 4/7) Zoom meeting, open office hours on models and projects

Assignment 6: Linear models (OLS & Logistic)

** Spring break W 4/8 to F 4/10. **

Week 11. More on visualization
Project: Modeling, part I. Code: Visualizing data with Seaborn.

M 4/13) Zoom meeting to discuss the machine learning project code.
W 4/15) Zoom meeting, open office hours on models and projects

Assignment 7: Exploratory data analysis

Week 12. Continuing machine learning
Project: Modeling, part II. Testing the model.

M 4/20) Zoom meeting to discuss visualization code.

Assignment 8: Classification model

Week 13. Finishing machine learning
Project: Evaluating the model. Performance metrics. Describing the results.
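
As a rough illustration of the performance metrics named above, the sketch below computes accuracy, precision, recall, and F1 for a binary classifier from its predictions. The labels are made up and the function names (confusion_counts, classification_metrics) are hypothetical; this is plain Python for clarity, not the project's code, and a library such as scikit-learn provides equivalent functions.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives, true negatives, false negatives."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1
            else:
                fp += 1
        else:
            if actual == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 as a dictionary."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: of the cases predicted positive, how many were positive?
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of the cases that were positive, how many did the model find?
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels for illustration only
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Accuracy alone can mislead when one class is much more common than the other, which is why precision and recall are usually reported alongside it.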

M 4/27) Zoom meeting to discuss machine learning project code.

Assignment 9: Model evaluation

Week 14. Reviewing the classification model
Project: Time for catching up and reviewing the classification project.

M 5/4) Zoom meeting to discuss machine learning project code.

Assignment 10: Modeling the social world

Part III. Presentations
Week 15. Visualization projects
Because we won't be meeting in person, research reports should be posted to the course site by M 5/11.

M 5/11) Final Zoom meeting to discuss the visualization projects.

Students will have until W 5/20 to comment on the other groups’ projects.