Learning Objectives:

Software tools needed: web browser and Python programming environment with the pandas and matplotlib packages installed.

Download the Skeletal Notes and Focus Questions to guide you while studying this lab.
These are a useful tool for note taking and you can keep these handy to study for and refer to during the final exam.

Quizzes

At the end of this lab, don't forget to take Lab Quiz 7! See the quiz page for details of the work due this week.

Using Python, Gradescope, and Blackboard

See Lab 1 for details on using Python, Gradescope, and Blackboard.

Using NYC OpenData

Much of the data collected by city agencies is publicly available at NYC Open Data. Let's use pandas to plot some data from NYC OpenData. Below is a graph of the total number of individuals in New York City's shelter system from 2010 to 2016:


We'll start by downloading data that has the daily number of families and individuals residing in the Department of Homeless Services (DHS) shelter system:

Click on the "View Data" button. To keep the data set from being very large (and avoid some missing values in 2014), we are going to filter the data to include only counts after January 1, 2017. To do this:

To download the file,

Move your CSV file to the directory where you save your programs. Open it with Calc (the built-in spreadsheet program for Ubuntu Linux running on the lab machines), Excel, or your favorite spreadsheet program to make sure it downloaded correctly. Look at the names of the columns, since those correspond to the series we can plot.

Now, we can write a (short) program to display daily counts:

import pandas as pd
import matplotlib.pyplot as plt

homeless = pd.read_csv("DHS_Daily_Report.csv")
homeless.plot(x = "Date of Census", y = "Total Individuals in Shelter")
plt.show()
The program above assumes that you saved your data as DHS_Daily_Report.csv. If you saved the data under a different name, alter the program to use that file name. Save your program and try it on your dataset.
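If you would rather check the column names and filter by date in pandas instead of a spreadsheet, the sketch below shows one possible way. It is only an illustration: the rows in sample_csv are made up (they are not real DHS counts), the MM/DD/YYYY date format is an assumption about how the file stores dates, and the filter keeps rows from January 1, 2017 onward. To try it on your own download, replace sample_csv with your file name.

```python
import io
import pandas as pd

# Made-up rows standing in for the downloaded CSV file. The numbers and the
# MM/DD/YYYY date format are assumptions for illustration, not real DHS data.
sample_csv = io.StringIO(
    "Date of Census,Total Individuals in Shelter\n"
    "12/30/2016,100\n"
    "12/31/2016,101\n"
    "01/01/2017,102\n"
    "01/02/2017,103\n"
)

# parse_dates asks pandas to treat the column as dates rather than plain text.
homeless = pd.read_csv(sample_csv, parse_dates=["Date of Census"])

# The column names are the series available for plotting.
print(list(homeless.columns))

# Keep only the rows dated January 1, 2017 or later.
cutoff = pd.Timestamp("2017-01-01")
recent = homeless[homeless["Date of Census"] >= cutoff]
print(len(recent))
```

With the four sample rows above, two rows remain after filtering.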

Challenges

Once you have completed the above, see the Programming Problem List.

main()

Python allows you to write programs as scripts: basically, a list of commands that are executed one after the other. You can also organize programs into functions, which group together commands that can be reused. Many programming languages (like C++ or Java) require that your programs be organized in functions.

To define a function in Python, we use the def keyword, which has the basic form:

def myFunction(input1, input2, ...):
    command1
    command2
    ...
Note that everything indented below the def line is considered part of the function. When you type the function name (followed by parentheses), it calls (or "invokes") the function, which means it executes all the commands that are part of the function, one after another.
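As a small illustration of the basic form, here is a sketch of a function with two inputs. The name make_greeting and its inputs are made up for this example, and it uses return to hand a value back to the caller.

```python
def make_greeting(name, punctuation):
    # Both indented lines belong to the function.
    message = "Hello, " + name + punctuation
    return message

# Typing the function name followed by parentheses calls the function.
print(make_greeting("World", "!"))
```

Because the commands are grouped under one name, the same function can be called again with different inputs, such as make_greeting("Hunter", "?").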

Let's rewrite our first program using functions. By tradition (and since it matches the naming convention of C and C++), we will call our function main() (see Section 6.7: Using a Main Function):

#Name:  your name here
#Date:  October 2017
#This program uses functions to say hello to the world!

def main():
    print("Hello, World!")

if __name__ == "__main__":
    main()
In Python, we have the option of running a file as a standalone program or including it as a module in another program. Since it's common to do either, we include the last two lines of the file. They check whether the program is being run directly, which we can test by seeing if the variable __name__ (set by Python) is "__main__". If it is, we call main(). If it's not, the file is being included in another program, and it is left to that program to call main().

Save your program and try running it in IDLE.

Now, at the prompt (the window with the lines beginning with >>>), type main(). This calls the function directly. Note that calling the function either way results in the same actions: the commands inside main() are executed.
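To see why the check on __name__ is useful, consider the sketch below. The helper function greeting() is made up for this example and is not part of the lab program.

```python
def greeting():
    # A reusable piece that other programs could call after importing this file.
    return "Hello, World!"

def main():
    print(greeting())

# Python sets __name__ to "__main__" only when this file is run directly.
if __name__ == "__main__":
    main()
```

If this were saved as hello.py (a file name assumed for illustration), running it directly would print the greeting. Another program that used import hello would see no output from the import itself, but could still call hello.greeting() or hello.main() when it chose to.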

When you have a running version, see the Programming Problem List.

Using Python from the Command Line Interface

In addition to IDLE (and other development environments with graphical interfaces), Python can also be used directly from the command line. In fact, this is what the grading scripts do to evaluate your programs, since Gradescope uses a remote cloud server and does not have a graphics window.

To start, we need a command line interface (aka a terminal window). To launch the terminal, click on the terminal window icon in the left menu, or go to the search option in the upper left corner, type terminal, and then open it.

In Lab 1, we launched IDLE from the terminal by typing:

$ idle3

We can use Python in a similar fashion. In a terminal window, change directories to where you stored your hello program above (see Lab 4 for changing directories at the command line).

Let's run your hello program from the command line. If your program is called hello.py, you would type at the command line:

$ python3 hello.py
Notice that the output goes directly to the terminal window. Try running other programs you have written from the command line.
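Since output from the command line is just text, another program can run a Python program and read what it printed. The sketch below is only an illustration of that idea and is not how the Gradescope grading scripts are actually written. The function name run_program is made up, and for simplicity it passes the program text with Python's -c option instead of using a file such as hello.py.

```python
import subprocess
import sys

def run_program(source_code):
    # Start a separate Python interpreter, much like typing a command in the
    # terminal, and collect everything the program prints.
    result = subprocess.run(
        [sys.executable, "-c", source_code],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout

output = run_program('print("Hello, World!")')
print(output)
```

No graphics window is involved: the printed text comes back as an ordinary string that can be compared against the expected output.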

What's Next?

You can start working on this week's programming assignments. The Programming Problem List has problem descriptions, suggested reading, and due dates next to each problem.
Keep in mind that the due dates are set one week late for flexibility (if there is a setback one week and you can't submit your programs, you will have time to catch up). Still, each week you should work on the programming assignments for that week, even if they are due a week later. If you are on a roll, you are welcome to work ahead!