Table of Contents
- Jupyter Notebook Basics
- Importing Pandas
- Reading a CSV File Using Pandas
- Printing the data
- Video tutorial
1 Jupyter Notebook Basics
Once you launch Jupyter Notebook, you can add cells by clicking the plus sign in the menu bar, as shown in figure 1. Code is written inside the cells.
You can remove cells by clicking the scissors icon in the menu bar, as shown in figure 2. In this session, we are going to learn how to import files into a Jupyter notebook using Pandas.
2 Importing Pandas
To import Pandas, type import pandas in a cell, as shown in figure 2. You can list the installed packages from the Anaconda prompt with the “conda list” command, as we discussed in the previous video.
Then execute the cell by clicking the run command in the menu bar. No output appears, because we have only imported the library. If you make a typing mistake, an error message appears in the output. For example, if you type “panda” instead of “pandas”, an error shows up when you run the cell.
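As a minimal sketch of the import step and of the misspelling error described above: the alias pd is a widely used convention and an assumption here (the text only says import pandas), and the second part assumes no package named panda happens to be installed.

```python
import pandas as pd  # "pd" is a conventional alias, assumed for this sketch

# A successful import prints nothing on its own; checking the version
# is one way to confirm that the library is available.
print(pd.__version__)

# Misspelling the module name normally raises an import error.
# This assumes no package called "panda" is installed.
try:
    import panda  # misspelled on purpose
except ImportError as err:
    print("Import failed:", err)
```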
3 Reading a CSV file
In this section, we are going to read a CSV file called “india_car_sales.csv”, shown in figure 3. It contains car sales details: year, maker, quantity sold, and market share.
To import the file, type the command shown in figure 4. To read a CSV file, we need the read_csv function from the Pandas library. Inside the parentheses, give the path of the CSV file. Here the path is c:\Data_Set\conv_data\india_car_sales.csv. Make sure to include the file extension (here, .csv) in the file name. You can check the extension in the file's properties.
Then execute it using the run command. It displays all the records in the file. In our case there are 16 records, indexed starting from 0, as shown in figure 5.
The imported data may be needed elsewhere or passed as input to another step, so it should be stored by assigning it to a variable. As shown in figure 6, the variable used here is “india_sales”. When you run the cell, no output appears. To display the data, type the DataFrame name on its own and run the cell. The data is displayed in tabular format, as shown in figure 7.
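The path and extension advice above can be sketched as follows. Writing the Windows path as a raw string (the r prefix) is a suggestion of this sketch, not something the figures confirm; it keeps Python from treating the backslashes as escape characters. The variable names csv_path and path are illustrative.

```python
from pathlib import PureWindowsPath

# Raw string: backslashes are kept exactly as typed.
csv_path = r"c:\Data_Set\conv_data\india_car_sales.csv"

# PureWindowsPath only parses the text, so this works on any operating system.
path = PureWindowsPath(csv_path)
print(path.name)    # file name including its extension
print(path.suffix)  # the extension on its own

# On a machine where this file exists, the read itself would be:
# import pandas as pd
# pd.read_csv(csv_path)
```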
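A minimal sketch of reading a CSV file into a variable follows. Because the original india_car_sales.csv is not available here, the sketch writes a tiny stand-in file first; its column names and two rows are made-up placeholders, not the real data.

```python
import os
import tempfile

import pandas as pd

# Placeholder content, NOT the real contents of india_car_sales.csv.
sample_csv = (
    "year,maker,quantity_sold,market_share\n"
    "2020,Maker A,100,50.0\n"
    "2020,Maker B,100,50.0\n"
)

with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "sample_car_sales.csv")
    with open(sample_path, "w", newline="") as f:
        f.write(sample_csv)
    # Store the result in a variable so it can be reused in later cells.
    india_sales = pd.read_csv(sample_path)

print(len(india_sales))         # number of records
print(list(india_sales.index))  # row labels start from 0

# In a notebook, a bare name on the last line of a cell renders the table.
india_sales
```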
4 Printing the data
Printing as plain text
By using the print(india_sales) command, you can print the data as plain text, as shown in figure 8.
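The print call above can be tried with a small stand-in for india_sales. The rows below are made-up placeholders read from an in-memory string, an assumption made so that the sketch runs without the original file.

```python
import io

import pandas as pd

# Placeholder rows standing in for the real file.
sample_csv = "year,maker\n2020,Maker A\n2020,Maker B\n"
india_sales = pd.read_csv(io.StringIO(sample_csv))

# print() writes the text rendering of the DataFrame to the output.
print(india_sales)

# The same rendering is available as an ordinary Python string.
text = str(india_sales)
```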
Printing as a dictionary
Type the code as given in figure 10. Here each column name is listed, followed by the values for that column. The variable name is “car_sales”. Execute the code. The output is shown in dictionary format.
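Since the figure is not reproduced here, the following is only a sketch of what such a dictionary could look like. The keys and values are hypothetical placeholders; only the variable name car_sales comes from the text.

```python
# Each key is a column name; each value is the list of entries in that column.
# All names and numbers below are placeholders.
car_sales = {
    "year": [2020, 2020],
    "maker": ["Maker A", "Maker B"],
    "quantity_sold": [100, 100],
    "market_share": [50.0, 50.0],
}

# Printing a dictionary shows it in key: value form.
print(car_sales)
```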
Printing in tabular form
The output above is in dictionary format. It can be displayed in tabular form by applying the DataFrame function in Pandas to the variable car_sales. This is shown in figure 10.
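The conversion described above can be sketched like this, again with a placeholder dictionary. The name car_sales_table is illustrative and does not appear in the original.

```python
import pandas as pd

# Placeholder dictionary; keys become column names, lists become column values.
car_sales = {
    "year": [2020, 2020],
    "maker": ["Maker A", "Maker B"],
    "quantity_sold": [100, 100],
    "market_share": [50.0, 50.0],
}

# pandas.DataFrame builds a table with one row per list position.
car_sales_table = pd.DataFrame(car_sales)
print(car_sales_table)
```

All lists in the dictionary need the same length, since each list position becomes one row of the table.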