Reading and Displaying Files ================================ To read a file, we must first open the file. This is done using Python's built-in open() function. The open() function returns a file object, which has a read() method for reading files and storing their contents as variables . Once a file is read and stored as a variable, the original file can be closed. We can then loop through the file variable and do stuff with its contents. .. code-block:: python file = open("/Users/student/Desktop/Equakes2.csv", "r") text = file.readlines() file.close() for line in text: print (line) **Data** * `Download Earthquake Data `_ * You can also download earthquake data from the `USGS website `_. *Points to Note anout the Above Code* 1. The open method has the following modes: | ‘r’ – Python read file. Read mode is used when the file is only being read | ‘w’ – Python write file. Write mode is used to edit and write new information to the file (any existing files with the same name will be erased when this mode is activated) | ‘a’ – Python append file. Append mode is used to add new data to the end of the file; that is new information is automatically amended to the end | ‘r+’ – Special read and write mode is used to handle both actions when working with a file 2. Always close the file after reading or writing to it. In most cases, upon termination of an application or script, a file will be closed eventually. However, there is no guarantee when exactly that will happen. This can lead to unwanted behavior including resource leaks and unwanted behavior. | Writing File Paths -------------------- In Python, it is better not to write your file paths with backslashes as shown below: .. code-block:: python file = open("C:\Users\student\Desktop\earthquakes.csv", "r") Writing paths with backslashes is not advisable because a Python string can contain special characters that also incorporate backslashes for other reasons, e.g., "\t" (tabs), "\n" (newlines), "\r" ( carriage returns), etc. These special characters make it very hard to create a file path that uses single backslashes because file paths in Windows also use single backslashes. There are several workarounds: **1. Use Forward Slashes** .. code-block:: python file = open("C:/Users/student/Desktop/earthquakes.csv", "r") | **2. Use Raw Strings** These are regular strings, but includes an r before the script begins. The r tells the Python Interpreter that the string does not contain any special characters or escape characters. .. code-block:: python file = (r"C:\Users\student\Desktop\earthquakes.csv") | **3. Use Double Backslahes** In this case, the backslashes are escaped using a second backslash. .. code-block:: python "C:\\Users\\student\\Desktop\\earthquakes.csv" | **4. Use os.path.join** The os module provides access to operating system functions regardless of the platform you are using, i.e, Windows, Mac OS, Linux, etc.  os.path.join() takes any number of path strings and returns a single path using the platform-specific path separator. To create a string specifically for the Windows environnment, we can write the os.path.join() string as shown below. .. code-block:: python import os os.path.join("c:/", "Windows") >>> 'c:/Windows' To create a string for a platform regardless of whether it is Windows, Mac OS, Linux, etc., we can write the os.path.join() string as shown below. In this situation, the operating system will take care of the separator. .. code-block:: python import os os.path.join('c:', os.sep, 'mydata') >>> 'c:\\mydata' Note: os.sep will supply the separator It is also possible to pass in the current working directory to os.path.join(), as shown in the script below. .. code-block:: python import os dem_path = os.path.join(os.getcwd(), 'Shasta-30m-DEM.tif') >>> dem_path >>> dem_path = '/Users/hsemple/Documents/python_gis_tutorials/docs/Shasta-30m-DEM.tif | Opening a file using the "With" Statement ------------------------------------------ A second way second way to open a file is to use the "with" statement. The with statement automatically takes care of closing the file once it leaves the with block, even in cases of error. I highly recommend that you use the with statement as much as possible, as it allows for cleaner code and makes handling any unexpected errors easier for you. .. code-block:: python with open("/Users/student/Desktop/Equakes2.csv", 'r') as file: text = file.readlines() for line in text: print (line) | Opening a csv File using Python's Standard Library ---------------------------------------------------- Let's open a csv data file using some basic Python statements. First, we will let Python read the file and create a file object. Next, we will split the file contents into inidividual columns, and then store the column data in variables. This is a long-winded way of displaying csv files, but the objective with this code sample is to illustrate some details that is involved in opening these files. For regular work, we use libraries such as Pandas which it much easier to display the csv files. Pandas will be discussed shortly. .. code-block:: python infile = open("/Users/student/Desktop/earthquakes.csv", 'r') lines = infile.readlines() infile.close() del lines[0] # Remove the first line #Create empty lists xvar = [] yvar = [] count = 0 for line in lines: elements = line.split(",") # splits the line mag = float(elements[2]) # Get the data in the third column dep = float(elements[3]) # Get the data in the fourth column # Add magnitude and depth data to the empty lists xvar.append(dep) yvar.append(mag) #Format the two lists as columns separated with a space magnitude = "Magnitude" depth = "Depth" print ("%-15s %s" %(magnitude, depth)) print ("") for c1, c2 in zip(xvar, yvar): print ("%-15s %s" % (c1, c2)) .. image:: img/magnitude_depth.png :alt: Formatted Table *Things to Look up in the above code* a. `Formatting Output `_ b. `Python Zip Function `_ | **Performing Calculations and Making a Graph using Python's Standard Library** .. code-block:: python import math import matplotlib.pyplot as plt infile = open("C:/Users/student/Desktop/earthquakes.csv, 'r') lines = infile.readlines() infile.close() del lines[0] # Remove the header line #Create empty lists xvar = [] yvar = [] std_dev = [] count = 0 sum = 0 for line in lines: elements = line.split(",") #splits the lines mag = float(elements[4]) # Get the data in the fifth column count = count + 1 # Add count and magnitude data to the empty x,y lists xvar.append(count) yvar.append(mag) sum = sum + mag #Calculate mean average = sum / count rint ("Average is", round(average,5)) print ("") width = 1 fig, ax = plt.subplots(figsize=(6, 4)) ax.bar(xvar, yvar, width, color='orangered') ax.set_yscale('log') # Set y-axis to logarithmic scale ax.set_xlabel("No. of Earthquakes", fontweight='bold', fontsize=12, color='black') ax.set_ylabel("Magnitude", fontweight='bold', color='black', fontsize=12) ax.set_title("Magnitude of Earthquakes") plt.tight_layout() # Adjust figure spacing plt.show() .. image:: img/earth_magnitude_graph.png :alt: Formatted Table | The Matplotlib Figure Object -------------------------------- When working with Python, it is important to know how to manipulate matplotlib's figure object. Almost all of your graphical output will require the use of this figure object either directly or indirectly. The links below provide some information on ths topic. In terms of a general overview, in matplotlib's languge, the figure object is the top level object. It serves as the container for all other plot-related objects. The figure object can contain one or more axes or plotting objects. Axes are actual X,Y plots or diagrams that appear in the document. Axes contain the X-axis and the Y-axis, which are the XY lines in the plots. Axes also have plot objects, titles, legends, etc. The easiest way to create a new figure is to use the figure method of the matplotlib pyplot interface: >>> f = plt.figure() If you want to control the figure size, you can use the figsize argument of the figure method. Figure size is given in width first then height (in inches) >>> f = plt.figure.figsize=(12,10) The above statements will create an empty figure object with no axes. Often, we encounter code samples, such as the one below, that do not specify any figure object. If no figure object is specified in the code, then pyplot uses a default figure object to draw the figure. .. code-block:: python import matplotlib.pyplot as plt x = [0.3, 3.8, 1.2, 2.5] y = [11, 25, 9, 26] plt.scatter (x,y, color='darkgreen', marker='^') plt.xlim(0.5, 4.5) plt.show() The main benefit of explicity specifying a figure is that you can control its size, background, color, etc. In the code sample below, the size of the figure object is specified. .. code-block:: python import matplotlib.pyplot as plt x = [0.3, 3.8, 1.2, 2.5] y = [11, 25, 9, 26] fig = plt.figure(figsize=(4,6)) plt.scatter (x,y, color='darkgreen', marker='^') plt.xlim(0.5, 4.5) plt.show() | **Subplots** Within a figure, there can be one or more "subplots". Subplots are objects within a figure that divide up the figure. The subplots are positioned in a grid so that 221, means 2 rows, 2 columns, and the first cell. .. image:: img/subplot1.png :alt: Subplots The code below includes a figure and a subplots. Note that when a subplot is created, it is referred to as an axes. That's because it is also the plotting area for the figure. Also, unused axes are not shown. .. code-block:: python import matplotlib.pyplot as plt x = [0.3, 3.8, 1.2, 2.5] y = [11, 25, 9, 26] fig = plt.figure(figsize=(4,6)) ax = fig.add_subplot(221) plt.scatter (x,y, color='darkgreen', marker='^') plt.xlim(0.5, 4.5) plt.show() .. image:: img/subplots_2x2.png :alt: Subplots | **Axes** An axes is the area on which the data is plotted using functions such as plot() and scatter(). Axes can have ticks, labels, etc, associated with it. As shown in the previous section, axes can be created from a figure object ax = fig.add_subplot(111) Axes can also be created directly from pyplot: ax = plt.subplot(221) .. image:: img/matplotlib_figure_object.png :alt: Subplots | Once the axes is created, it can be used to plot various types of charts or to plot images and maps. For example, to plot various types of charts, we write: | ax.bar - bar charts | ax.hist - histograms | ax.scatter - scatter plot | ax.imshow. - images | etc When plotting graphs, charts, and images, if no axes is specified in the code, you can still use the pyplot module to plot the objects. | plt.bar - bar charts | ply.hist - histograms | plt.scatter - scatter plot | plt.imshow. - images | **Plotting Multiple Axes on a Single Figure** A single figure can contain multiple axes, each of which can contain separate plots. .. code-block:: python import matplotlib.pyplot as plt import geopandas as gpd file = "/Users/.../school_districts.shp" df = gpd.read_file(file) ax1 = plt.subplot(221) df.plot(ax=ax1, edgecolor="black") ax2 = plt.subplot(222) df.plot(ax=ax2, edgecolor="black") ax3 = plt.subplot(223) df.plot(ax=ax3) plt.show() .. image:: img/subplots221.png :alt: Subplots | **Setting Legend** To set the legend for an axes, simply type: ax.legend() .. code-block:: python import matplotlib.pyplot as plt fig = plt.figure(figsize=(6,4)) ax = fig.add_subplot(111) ax.plot([300, 400,500,600,700,800,900], [1.00, 1.20, 1.40, 1.60, 1.80, 2.00, 2.20], label = 'Supply') ax.plot([300, 430,560,680,710,800,925], [2.20, 2.00,1.80, 1.60,1.40,1.20,1.00], label = 'Demand') ax. set_ylabel('Price') ax. set_xlabel('Quantity Demanded and Supplied') ax.set_title("Demand and Supply Data") ax.legend() plt.show .. image:: img/matplot_legend.png :alt: Subplots For more information on how to use the matplotlib figure object, please click on the links below: * `Anatomy of a Matplotlib Figure `_ * `Pyplot Tutorial `_ | | Reading CSV Files Using Pandas ----------------------------------- The script above used many lines of code to open the csv file, compute average, and plot a graph. We can achieve the same results with far fewer lines of code by using a specialized library named Pandas. Pandas is a very powerful, popular and easy to use Python library for data analysis. It has many Excel-like functions. Its primary object is the DataFrame, which can be thought of as an abstract database table or spreadsheet. Once you create a dataframe object, you can use it to display tables, plot columns, create and run queries, with just a few lines of code. Let's plot the earthquake dataset using Pandas. .. code-block:: python import pandas as pd df = pd.read_csv("../earthquakes.csv") df | **Displaying Specific Columns of your Dataframe** To select multiple columns, use a list of column names within the selection brackets []. In the example below, we are selecting the earthquake depth and magnitude fields. .. code-block:: python import pandas as pd df = pd.read_csv("../earthquakes.csv") df[['depth', 'magnitude']] | Get the size of the table. .. code-block:: python import pandas as pd df = pd.read_csv("../earthquakes.csv") df[["depth", "mag"]].shape | **Filtering the Data Frame for Certain Rows** .. code-block:: python import pandas as pd df = pd.read_csv("../earthquakes.csv") above_3.5 = df[["mag"] > 3.5] above_35.head() | Plotting Graphs ----------------- Many types of graphs can be plotted by pandas. Below are seven types of graphs that are useful to know how to create. The kind parameter accepts eleven different string values and determines which kind of plot you’ll create: | "area" is for area plots. | "bar" is for vertical bar charts. | "barh" is for horizontal bar charts. | "box" is for box plots. | "hexbin" is for hexbin plots. | "hist" is for histograms. | "kde" is for kernel density estimate charts. | "density" is an alias for "kde". | "line" is for line graphs. | "pie" is for pie charts. | "scatter" is for scatter plots. | **Line Graph** .. code-block:: python import matplotlib.pyplot as plt import pandas as pd df = pd.read_csv("../earthquakes.csv") df.plot(kind='line',y='depth',color='red', figsize=(6, 8)) plt.show() | **Histogram** .. code-block:: python import matplotlib.pyplot as plt import pandas as pd df = pd.read_csv("../earthquakes.csv") #df.plot.line(column = df.columns[3], figsize=(6, 8)) df.plot(kind='hist',y='Depth_mls',color='red',bins = 10, figsize=(6, 8)) plt.show() | **Bar Plot** .. code-block:: python import matplotlib.pyplot as plt import pandas as pd speed = [0.1, 17.5, 40, 48, 52, 69, 88] lifespan = [2, 8, 70, 1.5, 25, 12, 28] index = ['snail', 'pig', 'elephant','rabbit', 'giraffe', 'coyote', 'horse'] df = pd.DataFrame({'speed': speed,'lifespan': lifespan}, index=index) ax = df.plot.bar(rot=10) plt.title("Speed vs Lifespan, Selected Animals") plt.show() | **Scatterplot** To plot the Depth and Magnitude Data as a scatterplot, write: .. code-block:: python import matplotlib.pyplot as plt import pandas as pd df = pd.read_csv("/Users/.../earthquakes.csv") df.plot(kind='scatter', x='depth',y='magnitude', color='red',figsize=(6, 8)) plt.show() or .. code-block:: python df.plot(kind='scatter',x='Depth_mls',y='Magnitude',color='red') plt.show() | | *Notes:* * ax is the axes on which to draw the map * cmap is the name of the `colormap `_ * legend & legend_kwds control the display of the legend. | Writing to a File -------------------- Once we are done with data analysis, we can also write to a file, as shown below. .. code-block:: python with open("C:/Users/.../john.txt", "w") as f: f.write('Hello \n') f.write('Hello \n') f.write('Hello \n') f.write('Hello \n') f.write('Hello \n') f.write('Hello \n') f.write('Hello \n') f.close | Reading a Data file into Python, splitting its contents by columns, and storing the columns in variables .. code-block:: python infile = open("C:/Users/.../earthquakes.csv, 'r') lines = infile.readlines() newfile=open("../newfile.txt",mode="a+",encoding="utf-8") del lines[0] # Remove the first line #Create empty lists xvar = [] yvar = [] count = 0 sum = 0 for line in lines: elements = line.split(",") # splits the line mag = float(elements[4]) # Get the data in the fifth column count = count + 1 # Add count and magnitude data to the empty x,y lists xvar.append(count) yvar.append(mag) sum = sum + mag newfile.write("\n") newfile.write(str(mag)) #Calculate mean average = sum / count newfile2.write(str(average)) infile.close() print ("The average earthquake magnitude is",round(average, 2)) print ("") | Plotting Shapefiles -------------------- There are many libraries available for plotting shapefiles. Such libraries include Geopandas, Shapely, Fiona, OGR, and descartes. Altogether, one of the main contributions of these libraries is that we no longer have to rely on GIS software to view shapefiles and their attribute data. This can be done easily with Python. Among the various libraries, Geopandas is perhaps the most widely used for plotting shapefiles as it only requires a few lines of code to display the shapefile. Also, attribute tables can be easily manipulated as a panda/geopanda dataframes. The other libraries are also important depending on the task that you would like to perform. Since geopandas hides almost all its implementation, you will find that studying the code of other libaries gives you a clearer picture of how to handle shapefiles from a Python perspective. | **Displaying Shapefiles using Geopandas** To use Geopandas, first, download and install the `library <'https://geopandas.org/en/stable/getting_started/install.html'>`_. Afterwards, import the library and use it in your code. As an example, run the code below in your favorite development environment. Note that the code draws the shapefile with a single color because no specific column is being plotted. .. code-block:: python import geopandas as gpd import matplotlib.pyplot as plt geo_df = gpd.read_file ("/Users/../Michigan.shp") geo_df.plot() .. image:: img/michigan_single_color.png :alt: Michigan Map | **Viewing the Shapefile's Attribute Table using Geopandas** To view the attribute table of a shapefile, wit the Geopandas library, use the head() method of the data frame object. .. code-block:: python import geopandas import matplotlib.pyplot as plt gdf = geopandas("/Users/../Michigan.shp") #Show data in the attribute table print(gdf.head()) #Display the shapefile f, ax = plt.subplots(1, figsize=(8, 11)) gdf.plot(ax = ax, edgecolor='black') ax.set_title("Water Wells, Washtenaw County, Michigan", fontdict={'fontsize': '14', 'fontweight' : '3'}) plt.show() .. image:: img/michigan_attribute_table.png :alt: Michigan Map | **Creating a Graduated Color Thematic Map using Geopandas** Often in GIS, we are interested in creating graduated color thematic maps based on specific columns. To do, simply enter parameters in the gpd.plot() method, as shown below. *gdf.plot(ax = ax, column= 'HISPANIC', cmap='OrRd' , scheme='fisher_jenks', legend=True, edgecolor='black')* .. code-block:: python import geopandas import matplotlib.pyplot as plt gdf = geopandas.read_file("../Michigan.shp") #Display the shapefile f, ax = plt.subplots(1, figsize=(10, 13)) gdf.plot(ax = ax, column= 'HISPANIC', cmap='OrRd' , scheme='fisher_jenks', legend=True, edgecolor='black', legend_kwds={'loc': 'lower left'}) ax.set_title("Mobile Homes, Michigan", fontdict={'fontsize': '14', 'fontweight' : '1'}) plt.show() #Save the map f.savefig("/Users/.../map_export.png", dpi=300) .. image:: img/michigan_mobile_homes.png :alt: Thematic Map See the `Geopandas User Guide `_ for more information. | **Plotting World Population Using Data Provided by Geopandas** .. code-block:: python import geopandas as gpd df_world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres')) print (df_world.head()) fig, ax = plt.subplots(1, 1, figsize=(16, 12)) df_world.plot(column='pop_est', cmap='OrRd', ax = ax, ec="black", legend=True, legend_kwds={"label": "Population", "orientation":"horizontal"}) ax.set_title("Countries of the World by Population") .. image:: img/world_population.png :alt: Thematic Map | Displaying Shapefiles Using Fiona ----------------------------------- Fiona is a low-level library that focuses on reading and writing geospatial data formats. It acts as a Pythonic interface to the OGR library, providing direct access to the capabilities of OGR for working with vector data. Fiona is known for its efficiency, simplicity, and interoperability with various geospatial file formats. It is often used when specific data access or format-related functionalities are required. .. code-block:: python import fiona import matplotlib.pyplot as plt from shapely.geometry import shape # Path to the shapefile shapefile_path = '/Users/.../cities.shp' # Open the shapefile using Fiona with fiona.open(shapefile_path, 'r') as shp: # Create a new figure plt.figure(figsize=(10, 10)) # Iterate over each feature in the shapefile if geometry is not None: for feature in shp: geometry = shape(feature['geometry']) properties = feature['properties'] # Check the type of geometry if geometry.geom_type == 'Point': # Handle points plt.plot(geometry.x, geometry.y, 'ro') # Add label plt.text(geometry.x, geometry.y, properties['NAME'], fontsize=8, ha='center', va='bottom') elif geometry.geom_type == 'LineString': # Handle lines plt.plot(*geometry.xy, 'b-') # Add label centroid = geometry.centroid plt.text(centroid.x, centroid.y, properties['NAME'], fontsize=8, ha='center', va='bottom') elif geometry.geom_type == 'Polygon': # Handle polygons plt.plot(*geometry.exterior.xy, 'g-') for interior in geometry.interiors: plt.plot(*interior.xy, 'g--') # Add label centroid = geometry.centroid plt.text(centroid.x, centroid.y, properties['NAME'], fontsize=8, ha='center', va='bottom') # Set the plot limits and labels plt.xlabel('Longitude') plt.ylabel('Latitude') plt.title('Shapefile Visualization') plt.grid(True) # Show the plot plt.show() | *Notes:* * ax is the axes on which to draw the map * cmap is the name of the `colormap `_ * legend & legend_kwds control the display of the legend. | **Displaying a Shapefile Using PyShp** Pyshp is an open source library that can be used to display a shapefile using pure Python. The project is described at this `website <"https://pypi.org/project/pyshp/">`_. PyShp code for displaying a shapefile is lengthier than the one used by geopandas, but it gives us an opportunity to learn about multipart polygons and polylines. Please note that after installing the "Pyshp" library, it is imported into Python using the shapefile keyword, e.g., "import shapefile". Prior to running the code,it is useful to know that shapefiles have a 'parts collection' and a 'points collection'. A parts collection is used to keep track of the number of polygons or line segments associated with each record. This is required because in many situations, more than one polygons or polylines must be linked to a record in the attribute table. As an example, since Hawaii is made up of several islands, multiple polygons are needed to represent each island in the state. However in the attribute table, only a single record is used to represent the state. In a shapefile, a record with one polygon will have one part, but this will show in the parts collection as 0 because the counting system begins with zero. A points collection refers to the list of points associated with a shape or feature. When working with points collection, be aware that the x-coordinates have an index of [0] while the y-coordinates will have an index value of [1]. The index values allow us to retrieve the x and y coordinates as separate lists. The illustration below shows how we visualize the relationships between a parts collection and a points collection. The upper section of the illustration shows the polygons associated with a single record while the lower section shows the part and points collection associated with the polygons. Since there are two polygons associated with the record, the parts collection for that record is 1 because the count starts from zero. The values in the squares in the parts collection represent index numbers that keep track of the starting coordinates for each part. Thus, the first part has coordinates that go from from p1 to p6. The second part starts from index value 6, and has coordinates that go from p7 to p11. .. image:: img/point_parts_collection.png :alt: Parts and Point Collection of a Shapefile Using Pyshp we can experiment with displaying a shapefile using only its points collection and with both its parts and points collection. The code below shows how to display a shapefile using only its points collection. .. code-block:: python import matplotlib.pyplot as plt import shapefile sf = shapefile.Reader("Users/../Michigan.shp") plt.figure(figsize=(7,8))  for each_rec in sf.shapeRecords(): x = [i[0] for i in each_rec.shape.points] y = [i[1] for i in each_rec.shape.points] plt.plot(x,y, color="gray") plt.show() | The code below plots a Michigan shapefile using both the parts and points collection. This is a much better way to plot the shapefile. To display the parts and points of a shapefile, the code must first get the number of parts associated with each polyline or polygon, then loop through each part to get the array of points for the part. .. code-block:: python import shapefile as shp import matplotlib.pyplot as plt sf = shp.Reader("Users/../Michigan.shp") plt.figure(figsize = (7,8)) # loop through each record in the shaperecords collection for each_rec in sf.shapeRecords(): for i in range(len(each_rec.shape.parts)): i_start = each_rec.shape.parts[i] #Get the starting values for the part. if i==len(each_rec.shape.parts)-1: i_end = len(each_rec.shape.points) #Get the length of the points collection. else: i_end = each_rec.shape.parts[i+1] #Get the X,Y coord of the points in each part and make a list x = [i[0] for i in each_rec.shape.points[i_start:i_end]] y = [i[1] for i in each_rec.shape.points[i_start:i_end]] plt.plot(x,y, color = "green") plt.show() | **Displaying Single or Multi-Polygon Shapefiles using Descartes** The Descartes library is another alternative for displaying shapefiles using Python. Descartes uses geometric objects as input for displaying `matplotlib paths and patches as lines and polygons `_. Experiment with the code below to learn about `geo_interfaces `_ from the shapefile library, add_patch() from matplotlib, and PolyPatch () from Descartes. .. code-block:: python import shapefile as shp import matplotlib.pyplot as plt from descartes import PolygonPatch sf = shp.Reader("../School_Districts.shp") fig = plt.figure() ax = fig.gca() for poly in sf.shapes(): poly_geo=poly.__geo_interface__ ax.add_patch(PolygonPatch(poly_geo, fc='#6699cc', ec='#000000', alpha=0.5, zorder=2 )) ax.axis('scaled') plt.show() .. image:: img/school_districts_descartes.png :alt: Shapefile Displayed with Descartes | In this example, the code above has been modified to display the polygons in the shapefile with multiple colors. .. code-block:: python import shapefile as shp import matplotlib.pyplot as plt from descartes import PolygonPatch import random sf = shp.Reader("/Users/.../school_districts.shp") fig = plt.figure() ax = fig.gca() for poly in sf.shapes(): poly_geo=poly.__geo_interface__ # Generate a random number between 0 and 2^24 color = random.randrange(0, 2**24) # Convert the number from base-10 (decimal) to base-16 (hexadecimal) hex_color = hex(color) std_color = "#" + hex_color[2:] ax.add_patch(PolygonPatch(poly_geo, fc= str(std_color), ec='#000000', alpha=0.5, zorder=2 )) ax.axis('scaled') plt.show() .. image:: img/school_districts_multi_color.png :alt: Shapefile Displayed with Descartes | **Displaying a List of XY Coordinates as Points** .. code-block:: python # import libraries import pandas as pd # Pandas Read csv file from shapely.geometry import Point # Shapely for converting latitude/longtitude to geometry import geopandas as gpd # Geopandas for visualing shapefile import matplotlib.pyplot as plt homicides = pd.read_csv('/Users/.../Toronto_Homicides.csv') #print (homicides.head()) #Download data at - https://data.torontopolice.on.ca/datasets/TorontoPS::homicides-open-data-asr-rc-tbl-002/explore # creating a geometry column geometry = [Point(xy) for xy in zip(homicides['X'], homicides['Y'])] # Coordinate reference system : WGS84 crs = {'init': 'epsg:4326'} # Creating a Geographic data frame gdf = gpd.GeoDataFrame(homicides, geometry=geometry) # Plot all points gdf.plot(marker='o', color='r', markersize=1.5) plt.show() .. image:: img/toronto_homicides.png :alt: Toronto Homicides | The above task can be done entirely in geopandas using a script such as the one below: .. code-block:: python import pandas as pd import geopandas as gpd homicide_data = pd.read_csv('/Users/.../Toronto_Homicides2.csv') homicide_gpd = gpd.GeoDataFrame(homicide_data, geometry = gpd.points_from_xy(homicides_data["X"],homicides_data["Y"] )) #homicide_gpd homicide_gpd.plot(markersize = 1.5, color = "red", figsize =(10,10)) Please view the video below for an explanation of the code. .. raw:: html | | Displaying Rasters -------------------- In programming, rasters are considered as *arrays*. An array is a collection of items of the same data type that can be manipulated as a single entity. In Python, a list is a one dimensional array. However, when we are thinking about rasters, we are typically thinking of two dimensional arrays that are defined by rows and columns. In Python, one difference between a list and a two-dimensional array is that whereas a list can store multiple data types, a two-dimensional array can store only one data type. .. image:: img/arrays1.png :alt: One and Two Dimensional Arrays Python has specialized libraries for manipulating arrays. Two popular ones are the "numpy" library and the “array" module. Numpy appears to be more popular. To import Numpy into your script, type: >>> import numpy as np The "np" is a nickname by convention. To learn more about arrays, please click on `this link `_. | **Displaying a DEM using GDAL and Matplotlib** One way to display a raster is to open the raster file using the gdal library, then convert the raster into an array using GDAL. Afterwards, we can use matplotlib's pyplot.imshow() to display the array. If you not familiar with imshow, please look it up. Imshow takes a 'vmin' argument which sets the minimum display values. In this case elevation values of zero and higher will be displayed. Thus, if the raster has a negative nodata value, it will bot be displayed. Cmap is a color map. Here we are using the matplotlib's `color map `_ named "gist_earth". .. code-block:: python import gdal import matplotlib.pyplot as plt #Open raster and read number of rows, columns and bands ds = gdal.Open("/Users/.../topography/dem") band1 = ds.GetRasterBand(1) raster_array = band1.ReadAsArray() plt.imshow(raster_array,vmin=0, cmap="gist_earth") plt.show() .. image:: img/wayne_dem2.png :alt: Wayne DEM | In the illustration above, the raster is displayed without any reference to real world coordinates. The coordinates displayed are just counts of the raster cells. In GIS, it is typical for map layers to be displayed with their real world coordinates. In the sample script below, the geotransform function is introduced. This function is used to recalibrate the cell values in terms of real world coordinates. The script also introduces raster legends. The GeoTransform contains a tuple of values that are used to establish the coordinates of the upper left (UL) corner of a raster. GeoTransform basically takes these values and use them to renumber the entire image in terms of the real world coordinate values. There are six values in the geotransform tuple. They are shown below. | GeoTransform[0] - he upper left xcoordinate | GeoTransform[1] - west-east pixel resolution | GeoTransform[2] - rotation, 0 if image is north up. | GeoTransform[3] - the upper left y coordinate | GeoTransform[4] - rotation, 0 if image is north up. | GeoTransform[5] - north-south pixel resolution | In the script below, ulx and uly are the coordinates in the the upper left corner in the GeoTranformed raster while lrx and lry are the coordinates in the lower right corner. .. code-block:: python import gdal import matplotlib.pyplot as plt import matplotlib.colors as colors #Set directory ds = gdal.Open('/Users/.../topography/dem') if ds is None: print ('Could not open file') sys.exit(1) band1 = ds.GetRasterBand(1) no_data = band1.GetNoDataValue() raster_array = band1.ReadAsArray() ulx,xres,xskew,uly,yskew,yres = ds.GetGeoTransform() lrx = ulx + (ds.RasterXSize * xres) lry = uly + (ds.RasterYSize * yres) print (ulx,lrx,lry,uly) fig, ax = plt.subplots(figsize=(8, 5)) #Array contains many no data values, therefore normalize legend from 0 to 1. norm = colors.Normalize(vmin = 0, vmax = raster_array.max()) cmap = plt.get_cmap("gist_earth") img = plt.imshow(raster_array, cmap, norm, extent=(ulx, lrx,lry,uly)) # Get the plot renderer object. cbar = plt.colorbar(img,shrink=0.75) #Associate the color bar with the plot renderer and axes objects. cbar.set_label('Meters') plt.savefig('dem_plot.png', dpi=300, bbox_inches='tight') plt.show() .. image:: img/wayne_dem_gdal.png :alt: Wayne DEM *References* Raster data processing with Python and GDAL - https://notebook.community/Automating-GIS-processes/Lesson-7-Automating-Raster-Data-Processing/Python-and-Gdal Introduction to NumPy and OpenCV. http://vision.deis.unibo.it/~smatt/DIDATTICA/Sistemi_Digitali_M/PDF/Introduction_to_NumPy_and_OpenCV.pdf | **Displaying a Raster using Rasterio and Matplotlib** Rasterio is a popular open source Python library used for viewing and manipulating rasters. Rasterio utilizes the gdal library to display rasters. With rasterio, viewing a raster can be done with just a few lines of code, like the example below.  Rasterio has a show( ) method for displaying rasters. However, the library also uses pyplot’s imshow method to display the data. .. code-block:: python import rasterio from matplotlib import pyplot src = rasterio.open("/Users/.../topography/dem") # Get the bounding box coordinates of the raster extent=[src.bounds[0], src.bounds[2], src.bounds[1], src.bounds[3]] #Convert the raster into an array src_array = src.read(1) fig, ax = pyplot.subplots(1, figsize=(8, 5)) # Get the plot renderer object. img = ax.imshow(src_array, cmap=plt.get_cmap('jet'), extent=extent, norm = colors.Normalize(vmin = 0, vmax = src_array.max())) ax.set_title("Digital Elevation Model, Wayne County") #Associate the figure object with plot renderer and axes objects. fig.colorbar(img, ax=ax) #Let the axes object set the length of the colobar. ax.set_aspect('auto') pyplot.show() `Source `_ .. image:: img/rasterio_dem1.png :alt: Wayne DEM | **Displaying a three-band Aerial Photo with GDAL** .. code-block:: python import numpy as np from osgeo import gdal import matplotlib.pyplot as plt aerial = gdal.Open("/Users/semple/Desktop/Detroit.png bnd1 = aerial.GetRasterBand(1) bnd2 = aerial.GetRasterBand(2) bnd3 = aerial.GetRasterBand(3) #Now turn each band into a ndarray: img1 = bnd1.ReadAsArray() img2 = bnd2.ReadAsArray() img3 = bnd3.ReadAsArray() #Then stack them to have a 3 band image img = np.dstack((img1,img2,img3)) plt.imshow(img) plt.show() | **Displaying Satellite Imagery with GDAL** .. code-block:: python import numpy as np from osgeo import gdal import matplotlib.pyplot as plt #Image-2019 band5 = "/Users/.../Landsat/LT05_L1TP_019031_20111106_20160830_01_T1_B5.TIF" band4 = "/Users/.../Landsat/LT05_L1TP_019031_20111106_20160830_01_T1_B4.TIF" band3 = "/Users/.../Landsat/LT05_L1TP_019031_20111106_20160830_01_T1_B3.TIF" #Open the Landsat image bands with GDAL B5_data = gdal.Open(band5) B4_data= gdal.Open(band4) B3_data = gdal.Open(band3) B5_Array = B5_data.GetRasterBand(1).ReadAsArray().astype(np.float32) B4_Array = B4_data.GetRasterBand(1).ReadAsArray().astype(np.float32) B3_Array = B5_data.GetRasterBand(1).ReadAsArray().astype(np.float32) #Then stack them to have a 3 band image img = np.dstack((B5_Array,B4_Array,B3_Array)) plt.imshow(img) plt.show() | | **Displaying a Three-band Raster with Rasterio** Rasterio is a popular open source Python library used for viewing and manipulating rasters. Rasterio utilizes the gdal library to display rasters. With rasterio, viewing a raster can be done with just a few lines of code, like the example below.  Rasterio has a show( ) method for displaying rasters. However, the library can also use pyplot’s imshow method to display the data. The example below uses the show method. .. code-block:: python import rasterio from rasterio.plot import show src = rasterio.open("/Users/.../Detroit.png") show(src) | **Displaying a Raster in QGIS Using Python** .. raw:: html | | **Displaying Web Maps and Feature Layers Stored in ArcGIS Online Using ArcGIS API for Python** *Making an Anonymous Connection to ArcGIS Online* The code sample code below lets you connect anonymously to Arcgis Online and search for content that are related to 'Michigan' and that are publicly available. .. code-block:: python from arcgis.gis import GIS gis = GIS() # Connect to ArcGIS Online as an anonymous user search_subset = gis.content.search("Michigan", max_items=100) search_subset The results should be similar to the illustration below. .. image:: img/search_arcgisonline.png :alt: Search Arcgisonline Now, let's search for a particular layer named "BA Web App Suitability Analysis Data" .. code-block:: python from arcgis.gis import GIS gis = GIS() # Connect to ArcGIS Online as an anonymous user search_subset = gis.content.search("BA Web App Suitability Analysis Data") search_subset The result is: [, , , , , ,  Let's get the first item from the list. .. code-block:: python from arcgis.gis import GIS gis = GIS() # Connect to ArcGIS Online as an anonymous user search_subset = gis.content.search("BA Web App Suitability Analysis Data") subset_item = search_subset[0] subset_item .. image:: img/Justice40Tracts.png :alt: Search Arcgisonline *Connect to your ArcGIS Online Account* Let's log into our own ArcGIS Online Account and display some of the content. The general syntax is shown below. Notice that we are passing in the URL of the ArcGIS Online server (or portal), our username, and password to "GIS". .. code-block:: python import arcgis from arcgis.gis import GIS #connect to your GIS gis = GIS("https://www.arcgis.com","username","password") my_content = gis.content.search(query="owner:" + gis.users.me.username, max_items=100) my_content | *Connect to a Feature Layer* .. code-block:: python # Establish a connection to your GIS. from arcgis.gis import GIS from IPython.display import display gis = GIS("https://www.arcgis.com","username","password") # Search for 'USA major cities' feature layer collection search_results = gis.content.search('title: USA Major Cities', 'Feature Layer') # Access the first Item that's returned major_cities_item = search_results[0] major_cities_item | *Connect to a Feature Layer using the Feature's ItemID* .. code-block:: python from arcgis.gis import GIS from IPython.display import display gis = GIS() # anonymous connection to www.arcgis.com freeways = gis.content.get('91c6a5f6410b4991ab0db1d7c26daacb') freeways .. image:: img/USAFreewaySystem.png :alt: Search Arcgisonline For more information, please visit the link below: https://developers.arcgis.com/python/guide/working-with-feature-layers-and-features/ | | Exercises ----------- 1. Visit `this NFL website `_ and copy the quarter back data. Paste the data into Excel and save it in CSV format. Use the standard library in Python or pandas to plot a simple histogram and boxplot of the data in the passing attempts field (ATT). Also, calculate the mean and standard deviation of passing attempts. Explain what your data is portraying in short paragraph. 2. Using Geopandas, create a thematic map for the USA or `Michigan `_ showing the distribution of Covid19 cases across the country or state for the date for which you have data. Write comments to explain what your code is doing. Repeat the process using another Python library of your choice. 3. Extend the sample code in this Module that uses Descartes to display shapefiles in which the polygons are labeled. 4. Complete the `tutorial on this website <"https://blog.matthewgove.com/2021/06/18/the-ultimate-in-python-data-processing-how-to-create-maps-and-graphs-from-a-single-shapefile/">`_ including the challenge task at the end. 5. Review `the tutorial on this page `_ and be prepared to discuss the code, particularly how Matplotlib is how used. 6. Modify the ArcGIS API for Python code above to enable you to log into your own ArcGIS Online account and display a web map. | Resources ----------- * https://automating-gis-processes.github.io/CSC18/index.html * Python Shapefile Library - https://pythonhosted.org/Python%20Shapefile%20Library/ * Land Cover Change Analysis with Python and GDAL - Tutorial - https://hatarilabs.com/ih-en/land-cover-change-analysis-with-python-and-gdal-tutorial * Rasterio - https://geobgu.xyz/py/rasterio.html# * Create Random Hex Color Code Using Python - https://www.geeksforgeeks.org/create-random-hex-color-code-using-python/# * Plotting large shapefiles with matplotlib - https://gis.stackexchange.com/questions/202839/plotting-large-shapefiles-with-matplotlib/266675#266675 * ArcGIS API for Python. Working with web maps and web scenes - https://developers.arcgis.com/python/guide/working-with-web-maps-and-web-scenes/ https://mapscaping.com/how-to-read-a-shapefile-using-python/ |