How to Read Data from a GitHub URL Using Python Packages

In this practical, we are going to read a .csv file from a GitHub URL.

Figure 1 shows the data set stored at the GitHub URL. We need to read this data file into a Jupyter notebook using the URL.

Figure 1: The data file

Import the packages pandas, io, and requests. The requests package is needed to fetch data from a URL. The code is shown in figure 2.

Use the get() function from the requests package, passing the URL as its argument. Since we need the body of the response rather than the response object itself, append .content to the call.

Figure 2: Reading the data from a URL
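
Since the figure is an image, the fetching step can also be sketched in text form. This is a minimal sketch, not necessarily the exact code in figure 2: the function name fetch_csv_bytes, the timeout value, and the placeholder RAW_URL are assumptions made for illustration. It also assumes the URL points at the raw file (GitHub serves raw files from raw.githubusercontent.com), because the regular github.com page for a file returns HTML rather than the CSV itself.

```python
import requests

# Hypothetical placeholder: replace it with the raw URL of your own CSV file.
RAW_URL = "https://raw.githubusercontent.com/<user>/<repo>/<branch>/<file>.csv"


def fetch_csv_bytes(url, timeout=10):
    """Download the file at `url` and return its body as bytes."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # raise an exception on 4xx/5xx responses
    return response.content      # raw bytes of the response body
```

In a notebook, calling read_data = fetch_csv_bytes(RAW_URL) with a real URL would store the downloaded bytes in read_data, the variable name used in the later steps.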

Run the cell to display the data that was read from the URL. As figure 3 shows, the data is returned as a single sequence of bytes in which each record is separated by a newline character (\n).

Figure 3: The data read by the URL
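
To make the shape of this raw data concrete, here is a small sketch that uses made-up sample bytes in place of a real download. The column names and values are invented for illustration and are not the data set from figure 1.

```python
# Made-up stand-in for the bytes returned by requests.get(url).content.
sample_bytes = b"id,name,score\n1,Ann,90\n2,Ben,85\n"

# Displaying the bytes shows one long sequence with \n between records.
print(sample_bytes)

# Decoding gives a string, and splitlines() separates the records.
records = sample_bytes.decode("utf-8").splitlines()
for record in records:
    print(record)
```

Each element of records is one line of the CSV file, with the header row first.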

Next, convert the data into tabular format. The read_csv() function in pandas can do this; see the code in figure 4.

The StringIO() function from the io package wraps a string in an in-memory text buffer that can be read like a file. First decode the downloaded data, read_data, with the decode() function, passing utf-8 as the encoding. Then pass the decoded string to StringIO() and hand the resulting buffer to read_csv().
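
The conversion step can be sketched as follows. To keep the example runnable without network access, read_data holds made-up sample bytes here instead of a real download, and the name df for the resulting DataFrame is an assumption.

```python
import io

import pandas as pd

# Made-up stand-in for the bytes downloaded from the URL.
read_data = b"id,name,score\n1,Ann,90\n2,Ben,85\n"

# Decode the bytes to text, wrap the text in an in-memory buffer,
# and let pandas parse the buffer as if it were a CSV file.
df = pd.read_csv(io.StringIO(read_data.decode("utf-8")))

print(df)
```

With real data, the only change is that read_data comes from the requests call instead of a literal.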

The output is shown in figure 5. The data set is now displayed in tabular format.

Figure 5: Data set in tabular format
