Chapter 3 Lab 1 - Getting Started with R | Crime Mapping and Analysis (2024)

Welcome to Lab 1! In this lab, we will focus on a few basic skills. Below, you will see a list of the major topics that will be covered. Every lab will have a list of the major skills you will be learning.

  • Topics Covered

    • Introduction to RStudio
    • Installing packages
    • Loading data into R
    • Working with variables
    • Plotting spatial data

3.1 Introduction to RStudio

After installing both the R programming language and RStudio (hint: if you haven’t done this already, please do so first), open RStudio by double-clicking on the RStudio icon. You should be greeted with something that looks like this:

Figure 3.1: The RStudio Interface

The first thing you will want to do is create a new script. A script is just a list of instructions you will be giving to the computer, like a text document. To start, go to the File menu and select ‘New File > R Script.’ A new window should pop up in the top-left corner. You will be doing all your work here, in the script window.

You can write and run commands in the editor. Let’s start with some basic math. Try typing in the following:

2 + 2

After typing it in, hit either ctrl+enter (on PC) or command+return (on Mac) to run the code. You should see the results below in the console window.

Figure 3.2: Console output

The console window prints the results of your code. As you will see a little later, we can use the console window to view the results of our commands, as well as any errors or other messages given to us by R.
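
To get a feel for the editor and the console, here are a few more commands you could run the same way as 2 + 2. This is only an illustrative sketch; none of these lines are part of the lab itself. The last line is deliberately wrong, to show what an error message looks like in the console.

```r
# Run each line with ctrl+enter (PC) or command+return (Mac).
2 + 2      # addition
10 - 3     # subtraction
4 * 5      # multiplication
9 / 3      # division
sqrt(16)   # functions take their input inside parentheses

# log() only works on numbers, so this prints an error message.
# Wrapping the command in try() lets the rest of a script keep running.
try(log("ten"))
```

Each result appears in the console next to a [1], which simply marks the first value of the output.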

3.2 Installing packages

The first thing we need to do is install some packages. In R, a package is a collection of tools that we can use to analyze data. Some of these tools will help us examine spatial crime data; others will help us process data. In R it is quite easy to install these packages, and we only need to do it once. There are a few packages we will need to install first. The most important ones are as follows:

  1. tidyverse: A collection of tools making working with data easier
  2. sf: A package used to read and process spatial data
  3. spatstat: A spatial statistics package
  4. maptools: Miscellaneous tools for editing spatial data
  5. raster: A package for plotting ‘raster’ based data

Let’s start by installing the first one, tidyverse. This is a very popular package in R and is primarily used to handle all kinds of data. To install a package, we use the command install.packages. All we have to do is provide the name of the package we want to install, in quotes. Let’s try it:

install.packages("tidyverse")

This will likely run for a bit on your computer, then finally stop. After it’s finished, you’re done! You will be able to use all the tidyverse tools in R whenever you want. We will use some of these in a minute. Now, you should finish installing the rest of the packages. Copy and paste the following code and run it in your script window. This may take a while (depending on your computer), but you only need to install these one time.

install.packages("sf")
install.packages("spatstat")
install.packages("maptools")
install.packages("raster")

Note: maptools was archived on CRAN in 2023, so on a recent version of R that line may only produce a warning that the package is not available. The rest of this lab uses only tidyverse and sf.

After installing these packages, we just need to load them into R. To do that, we use the function library. Unlike install.packages, we need to use library every single time we close and reopen R. Right now, we will need the tools from the tidyverse package, so let’s load it in:

library(tidyverse)
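
Because installing happens once but loading happens every session, it can be handy to check what is already installed before installing again. The sketch below is one possible way to do that; missing_packages is a hypothetical helper name invented for this example, not something from R or from this lab.

```r
# Hypothetical helper: return the names in `pkgs` that are not installed yet.
missing_packages <- function(pkgs) {
  # requireNamespace() returns TRUE when a package can be found and loaded.
  is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!is_installed]
}

# The two packages this lab actually loads.
to_install <- missing_packages(c("tidyverse", "sf"))
to_install

# Only install what is missing, and only when R is being used interactively.
if (length(to_install) > 0 && interactive()) {
  install.packages(to_install)
}
```

If to_install prints character(0), everything is already installed and you can go straight to library.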

Now we’re ready to try loading some of our data into R!

3.3 Loading data into R

Loading data into R might seem daunting, but it is actually quite easy once you figure out how it works. All you have to do is specify two things:

  1. The name of the file you want to load into R.
  2. The address of the file on your computer.

On a computer, each file has an address. This is just the location where the file is stored. To load a file into R, you need to give the computer the address and name of the file. Below, I have the first file shown on my computer, a .csv file named ‘usa_arrests.’ As an aside, a .csv file (also known as a comma-separated values file) is just a way computers commonly store tabular data.

Figure 3.3: File Address of a .csv File

On my computer, my username is ‘gioc4’ and the file is located on my Desktop. The full address of the file is “C:/Users/gioc4/Desktop/usa_arrests.csv.” Now, let’s try to load this into R. Before we do that, we need to talk about an important tool: the assignment operator.

3.3.1 The assignment operator

In R, we will often be saving the results of our analyses as objects that we can access later. Think of these as individual files that we can work with (like a Word document). We create a name for our object, then save some data into it. That’s where the assignment operator comes in.

In R, we save data using the arrow: <-. It’s just a less-than sign < and a minus sign - next to each other. Whenever we want to save some data as an object, we point the arrow toward the object’s name. For example:

name_of_variable <- some_data

On the left-hand side is the name of the object or variable. We put the arrow pointing toward it to save our data under that name, then put the actual data on the right-hand side. Now we’re ready to start.
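
As a small illustration of the arrow in action, the sketch below saves a number under a name, prints it, and then reuses it. The object name total is made up for this example.

```r
# Save the result of a calculation under the name `total`.
total <- 2 + 2

# Typing the name by itself prints what is stored in it.
total

# A saved object can be reused in later commands...
total * 10

# ...and overwritten by assigning to the same name again.
total <- total + 1
total
```

After the first assignment, total also shows up in the environment tab in the top-right corner of RStudio.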

3.3.2 Using read_csv to load a .csv file

Let’s start by loading a .csv file into R. You should have placed the file usa_arrests.csv on your desktop (or somewhere else where you can point the computer to it). Putting together what we discussed above, we are going to do the following:

  1. Define a name for our object
  2. Use the assignment operator to save data into our new object
  3. Use the R function read_csv to import the data into R

The R function read_csv will import our data into R for us. All we have to do is provide the address of the file on our computer. Once you’re done, your code should look something like this (with the user name changed, of course).

arrests <- read_csv("C:/Users/gioc4/Desktop/usa_arrests.csv")

Breaking it down, each piece of this code does this:

Figure 3.4: Reading a .csv file into R

As you can see, we chose the name arrests for our object, used the R function read_csv to load a .csv file, and then provided (in quotes) the location of the file on our computer.
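
If you want to try the same pattern before locating usa_arrests.csv, the sketch below writes a tiny .csv file to a temporary location and reads it back. It assumes the readr package (installed as part of tidyverse) is available; the file contents and the names csv_path and example_data are invented for illustration.

```r
library(readr)

# A temporary file stands in for usa_arrests.csv; R picks the address for us.
csv_path <- tempfile(fileext = ".csv")

# Write a header row and three rows of made-up values, separated by commas.
writeLines(c("city,arrests", "A,10", "B,25", "C,7"), csv_path)

# Same pattern as above: object name, arrow, read_csv, file address.
example_data <- read_csv(csv_path, show_col_types = FALSE)

# Print the object to check that the data arrived.
example_data
```

The only thing that changes for a real file is the address given to read_csv.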

If you don’t know what your user name is, you can type the following in the console window and hit ‘enter’:

getwd()
## [1] "C:/Users/gioc4/Dropbox/crime_mapping_and_analysis"

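getwd() prints the folder R is currently working in. When that folder sits inside your user folder, as it usually does, the beginning part of the output tells you what your username is. Here you can see my username is ‘gioc4.’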
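
Most problems with loading a file come from a mistyped address, so it can help to check the address before reading the file. The sketch below assumes the Windows-style layout used in this lab (C:/Users/your-name/Desktop); on a Mac or Linux machine the first part of the address would be different.

```r
# Ask the operating system for the current user name.
user_name <- Sys.info()[["user"]]
user_name

# Build the address piece by piece (assumed layout: C:/Users/<name>/Desktop).
csv_path <- file.path("C:/Users", user_name, "Desktop", "usa_arrests.csv")
csv_path

# TRUE means R can see a file at that address. FALSE usually means a typo
# in the address, or that the file is saved somewhere else.
file.exists(csv_path)
```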

If everything went correctly, you should see the following in the top-right corner of your RStudio window, in the environment tab. That’s our data!

Figure 3.5: Data in the environment tab

3.4 Working with variables

Now that we’ve successfully loaded our data into R, we can start using some data analysis functions to look at it. We have access to a lot of tools to analyze our data. Let’s look at a few.

3.4.1 head and glimpse

First, we can examine the variables in our dataset to get an idea of what we’re working with. Two functions, head and glimpse, can help us with that. Let’s first try using the head function on our dataset.

head(arrests)
## # A tibble: 6 x 4
##   Murder Assault UrbanPop  Rape
##    <dbl>   <dbl>    <dbl> <dbl>
## 1   13.2     236       58  21.2
## 2   10       263       48  44.5
## 3    8.1     294       80  31  
## 4    8.8     190       50  19.5
## 5    9       276       91  40.6
## 6    7.9     204       78  38.7

Here, head gives us the first 6 rows in our dataset, along with the names of the variables. The function glimpse works in a similar, but slightly more concise, way.

glimpse(arrests)
## Rows: 50
## Columns: 4
## $ Murder   <dbl> 13.2, 10.0, 8.1, 8.8, 9.0, 7.9, 3.3, 5.9, 15.4, 17.4, 5.3, 2.6, 10.4, 7.2, 2.2, 6.0, 9.7, 15.4, 2.1, 11.3, 4~
## $ Assault  <dbl> 236, 263, 294, 190, 276, 204, 110, 238, 335, 211, 46, 120, 249, 113, 56, 115, 109, 249, 83, 300, 149, 255, 7~
## $ UrbanPop <dbl> 58, 48, 80, 50, 91, 78, 77, 72, 80, 60, 83, 54, 83, 65, 57, 66, 52, 66, 51, 67, 85, 74, 66, 44, 70, 53, 62, ~
## $ Rape     <dbl> 21.2, 44.5, 31.0, 19.5, 40.6, 38.7, 11.1, 15.8, 31.9, 25.8, 20.2, 14.2, 24.0, 21.0, 11.3, 18.0, 16.3, 22.2, ~
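
A few other base R functions answer similar questions about a dataset. The values printed above match R's built-in USArrests dataset, so the sketch below uses USArrests as a stand-in (an assumption made so the example runs without the .csv file). The name arrests_demo is made up for this example.

```r
# USArrests ships with R; it stands in for the `arrests` object here.
arrests_demo <- USArrests

head(arrests_demo, n = 3)   # first 3 rows instead of the default 6
nrow(arrests_demo)          # number of rows
ncol(arrests_demo)          # number of columns
names(arrests_demo)         # names of the variables
```

The same calls work on arrests once it has been loaded with read_csv.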

3.4.2 summary

We can also easily get the summary statistics for one or all of our variables using the summary function. If we want to examine a single variable inside our dataset, we need to use the dollar sign $ to access it. So if I want to get summary statistics on the variable Murder, I should do:

summary(arrests$Murder)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.800   4.075   7.250   7.788  11.250  17.400

This will give us the minimum, maximum, median, mean, and 1st and 3rd quartiles of the data. Here we see the average murder rate in this dataset is about 7.79 per 100,000.
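
Each number reported by summary can also be computed on its own, which is useful when you only need one of them. As a sketch, the code below again uses R's built-in USArrests dataset as a stand-in for arrests (an assumption; its Murder column matches the output shown above).

```r
# Stand-in for the `arrests` object loaded from the .csv file.
arrests_demo <- USArrests

# The dollar sign pulls a single variable out of the dataset.
mean(arrests_demo$Murder)     # average
median(arrests_demo$Murder)   # middle value
min(arrests_demo$Murder)      # smallest value
max(arrests_demo$Murder)      # largest value

# summary() results can be saved and picked apart by name.
murder_summary <- summary(arrests_demo$Murder)
murder_summary[["Mean"]]
```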

We can also get summary statistics on all the variables at the same time by just putting in the name of the data object.

summary(arrests)
##      Murder          Assault         UrbanPop          Rape      
##  Min.   : 0.800   Min.   : 45.0   Min.   :32.00   Min.   : 7.30  
##  1st Qu.: 4.075   1st Qu.:109.0   1st Qu.:54.50   1st Qu.:15.07  
##  Median : 7.250   Median :159.0   Median :66.00   Median :20.10  
##  Mean   : 7.788   Mean   :170.8   Mean   :65.54   Mean   :21.23  
##  3rd Qu.:11.250   3rd Qu.:249.0   3rd Qu.:77.75   3rd Qu.:26.18  
##  Max.   :17.400   Max.   :337.0   Max.   :91.00   Max.   :46.00

3.5 Loading and plotting spatial data

Now that we’re comfortable loading .csv files, let’s move on to files that have spatial data attached. The most common way we encounter spatial data is through something called a shapefile. A shapefile is:

“a geospatial vector data format for geographic information system software”

Essentially, it is a collection of files that store information about spatial data. For example, it might be the boundaries of a city, or the location of crimes in a city. As you will see below, shapefiles are a bit special because they are a collection of files. Each file has its own purpose, and together they make up all the data needed to plot an object. Look below:

Figure 3.6: The files that make up a shapefile

In R, we load shapefiles in much the same way as a .csv file. However, instead of read_csv we will use st_read from the sf package. Just as we did above with read_csv, we only need to point the computer to the location of the shapefile on our computer. Note: While a shapefile is a collection of files, we only need to point the computer to the file that ends in .shp.

library(sf)
new_haven <- st_read("C:/Users/gioc4/Desktop/nh_blocks.shp")
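
To see the "collection of files" idea without the New Haven data, the sketch below uses the small example shapefile that is installed along with the sf package (nc.shp, North Carolina counties). It is a stand-in chosen for this example, not the lab's dataset; the names shp_path and nc are likewise illustrative.

```r
library(sf)

# Address of the example shapefile bundled with the sf package.
shp_path <- system.file("shape/nc.shp", package = "sf")

# List every file in that folder named "nc": together they are the shapefile.
list.files(dirname(shp_path), pattern = "^nc\\.")

# Point st_read at the .shp file only; it finds the companion files itself.
# quiet = TRUE hides the loading message.
nc <- st_read(shp_path, quiet = TRUE)

# The result is a data table with spatial information attached.
class(nc)
nrow(nc)
```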

Now that it’s loaded, we can try plotting it. Try out the code below:

plot(new_haven, max.plot = 1)
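
The max.plot = 1 argument tells plot to draw only the first variable in the shapefile. Two common variations are to pick the variable yourself, or to draw only the outlines. As a sketch, the code below shows both using the example shapefile bundled with sf (a stand-in for the New Haven data); BIR74 is a column in that example file.

```r
library(sf)

# Example shapefile installed with sf, used here in place of new_haven.
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Color the polygons by one variable chosen by name.
plot(nc["BIR74"], main = "Births, 1974")

# Draw only the outlines of the polygons, ignoring all variables.
plot(st_geometry(nc))
```

With the lab data, the same pattern would be plot(new_haven["POP1990"]).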

There! Now you’ve successfully plotted your first shapefile. You can see the colors of the plot correspond to one of the underlying variables in the shapefile. A shapefile is actually very similar to a normal data file (like a .csv), except that it has spatial data attached to it. In fact, many of the same functions we used for the .csv file above will also work here. Let’s see what happens if we use head and glimpse:

head(new_haven)
## Simple feature collection with 6 features and 28 fields
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: 534687.6 ymin: 177306.3 xmax: 569625.3 ymax: 188464.6
## Projected CRS: Lambert_Conformal_Conic
##   NEWH075H_ NEWH075H_I HSE_UNITS OCCUPIED VACANT  P_VACANT P_OWNEROCC P_RENTROCC NEWH075P_ NEWH075P_I POP1990  P_MALES
## 1         2         69       763      725     38  4.980341   0.393185  94.626474         2        380    2396 40.02504
## 2         3         72       510      480     30  5.882353  20.392157  73.725490         3        385    3071 39.07522
## 3         4         64       389      362     27  6.940874  57.840617  35.218509         4        394     996 47.38956
## 4         5         68       429      397     32  7.459207  19.813520  72.727273         5        399    1336 42.66467
## 5         6         67       443      385     58 13.092551  80.361174   6.546275         6        404     915 46.22951
## 6         7        133       588      548     40  6.802721  52.551020  40.646259         7        406    1318 50.91047
##   P_FEMALES   P_WHITE   P_BLACK P_AMERI_ES P_ASIAN_PI  P_OTHER  P_UNDER5    P_5_13  P_14_17   P_18_24   P_25_34   P_35_44
## 1  59.97496  7.095159 87.020033   0.584307   0.041736 5.258765 12.813022 24.707846 7.888147 12.479132 16.026711  8.555927
## 2  60.92478 87.105177 10.452621   0.195376   0.521003 1.725822  1.921198  2.474764 0.814067 71.149463  7.359166  4.037773
## 3  52.61044 32.931727 66.265060   0.100402   0.200803 0.502008 10.441767 13.554217 5.722892  8.835341 17.670683 17.871486
## 4  57.33533 11.452096 85.553892   0.523952   0.523952 1.946108 10.853293 17.739521 7.709581 12.425150 18.113772 10.853293
## 5  53.77049 73.442623 24.371585   0.327869   1.420765 0.437158  6.229508  8.633880 2.950820  7.103825 17.267760 16.830601
## 6  49.08953 87.784522  7.435508   0.758725   0.834598 3.186646  8.725341  8.194234 3.641882 10.091047 29.286798 12.898331
##    P_45_54  P_55_64   P_65_74   P_75_UP                       geometry
## 1 5.759599 4.924875  4.048414  2.796327 POLYGON ((540989.5 186028.3...
## 2 1.595571 1.758385  3.712146  5.177467 POLYGON ((539949.9 187487.6...
## 3 8.734940 5.923695  7.931727  3.313253 POLYGON ((537497.6 184616.7...
## 4 9.056886 6.287425  4.266467  2.694611 POLYGON ((537497.6 184616.7...
## 5 8.415301 7.431694 14.426230 10.710383 POLYGON ((536589.3 184217.5...
## 6 7.814871 7.814871  6.904401  4.628225 POLYGON ((568032.4 183170.2...

glimpse(new_haven)
## Rows: 129
## Columns: 29
## $ NEWH075H_  <dbl> 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30~
## $ NEWH075H_I <dbl> 69, 72, 64, 68, 67, 133, 73, 134, 84, 80, 79, 136, 77, 97, 94, 102, 78, 66, 83, 62, 121, 135, 65, 126, 81,~
## $ HSE_UNITS  <dbl> 763, 510, 389, 429, 443, 588, 410, 615, 316, 365, 276, 393, 355, 595, 518, 277, 232, 264, 502, 772, 467, 2~
## $ OCCUPIED   <dbl> 725, 480, 362, 397, 385, 548, 389, 562, 293, 337, 256, 377, 309, 534, 475, 263, 194, 237, 459, 730, 441, 2~
## $ VACANT     <dbl> 38, 30, 27, 32, 58, 40, 21, 53, 23, 28, 20, 16, 46, 61, 43, 14, 38, 27, 43, 42, 26, 21, 33, 17, 64, 44, 9,~
## $ P_VACANT   <dbl> 4.980341, 5.882353, 6.940874, 7.459207, 13.092551, 6.802721, 5.121951, 8.617886, 7.278481, 7.671233, 7.246~
## $ P_OWNEROCC <dbl> 0.393185, 20.392157, 57.840617, 19.813520, 80.361174, 52.551020, 57.804878, 33.658537, 49.367089, 38.63013~
## $ P_RENTROCC <dbl> 94.626474, 73.725490, 35.218509, 72.727273, 6.546275, 40.646259, 37.073171, 57.723577, 43.354430, 53.69863~
## $ NEWH075P_  <dbl> 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30~
## $ NEWH075P_I <dbl> 380, 385, 394, 399, 404, 406, 407, 408, 409, 412, 413, 414, 415, 416, 417, 418, 420, 422, 423, 425, 426, 4~
## $ POP1990    <dbl> 2396, 3071, 996, 1336, 915, 1318, 1041, 1148, 862, 940, 729, 872, 910, 1398, 1383, 554, 558, 521, 1293, 18~
## $ P_MALES    <dbl> 40.02504, 39.07522, 47.38956, 42.66467, 46.22951, 50.91047, 48.89529, 53.74565, 44.43156, 46.17021, 41.700~
## $ P_FEMALES  <dbl> 59.97496, 60.92478, 52.61044, 57.33533, 53.77049, 49.08953, 51.10471, 46.25435, 55.56844, 53.82979, 58.299~
## $ P_WHITE    <dbl> 7.095159, 87.105177, 32.931727, 11.452096, 73.442623, 87.784522, 66.378482, 70.121951, 9.164733, 5.638298,~
## $ P_BLACK    <dbl> 87.020033, 10.452621, 66.265060, 85.553892, 24.371585, 7.435508, 30.931796, 24.041812, 89.327146, 90.63829~
## $ P_AMERI_ES <dbl> 0.584307, 0.195376, 0.100402, 0.523952, 0.327869, 0.758725, 0.000000, 0.087108, 0.928074, 0.425532, 0.4115~
## $ P_ASIAN_PI <dbl> 0.041736, 0.521003, 0.200803, 0.523952, 1.420765, 0.834598, 1.633045, 4.006969, 0.580046, 2.340426, 0.0000~
## $ P_OTHER    <dbl> 5.258765, 1.725822, 0.502008, 1.946108, 0.437158, 3.186646, 1.056676, 1.742160, 0.000000, 0.957447, 0.4115~
## $ P_UNDER5   <dbl> 12.813022, 1.921198, 10.441767, 10.853293, 6.229508, 8.725341, 6.820365, 4.965157, 7.308585, 9.255319, 12.~
## $ P_5_13     <dbl> 24.707846, 2.474764, 13.554217, 17.739521, 8.633880, 8.194234, 9.894332, 6.358885, 13.225058, 10.531915, 1~
## $ P_14_17    <dbl> 7.888147, 0.814067, 5.722892, 7.709581, 2.950820, 3.641882, 4.226705, 1.916376, 5.800464, 4.893617, 3.9780~
## $ P_18_24    <dbl> 12.479132, 71.149463, 8.835341, 12.425150, 7.103825, 10.091047, 18.731988, 11.585366, 11.136891, 14.680851~
## $ P_25_34    <dbl> 16.026711, 7.359166, 17.670683, 18.113772, 17.267760, 29.286798, 13.928915, 30.836237, 15.777262, 14.36170~
## $ P_35_44    <dbl> 8.555927, 4.037773, 17.871486, 10.853293, 16.830601, 12.898331, 16.234390, 14.634146, 13.689095, 13.723404~
## $ P_45_54    <dbl> 5.759599, 1.595571, 8.734940, 9.056886, 8.415301, 7.814871, 10.470701, 9.843206, 12.993039, 10.638298, 8.9~
## $ P_55_64    <dbl> 4.924875, 1.758385, 5.923695, 6.287425, 7.431694, 7.814871, 6.820365, 8.449477, 11.484919, 9.255319, 10.56~
## $ P_65_74    <dbl> 4.048414, 3.712146, 7.931727, 4.266467, 14.426230, 6.904401, 7.492795, 7.578397, 6.960557, 7.872340, 6.310~
## $ P_75_UP    <dbl> 2.796327, 5.177467, 3.313253, 2.694611, 10.710383, 4.628225, 5.379443, 3.832753, 1.624130, 4.787234, 2.469~
## $ geometry   <POLYGON [US_survey_foot]> POLYGON ((540989.5 186028.3..., POLYGON ((539949.9 187487.6..., POLYGON ((537497.6 18~

There are a lot of variables here, including many from the 1990 census. For example, POP1990 is the 1990 decennial population for each census tract in New Haven. In your lab assignment below, you will load this data into R, plot some variables, and get the descriptive statistics for one of the variables.
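
Getting descriptive statistics from a shapefile and saving a plot as an image both follow patterns we have already seen. The sketch below is one possible approach, shown on the example shapefile bundled with sf rather than on the lab data; the file name births_map.png and the temporary folder are arbitrary choices for this example. (In RStudio you can also save a plot with the Export button in the Plots tab.)

```r
library(sf)

# Example shapefile installed with sf, used here in place of new_haven.
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# The dollar sign works on spatial data just like on a .csv dataset.
summary(nc$BIR74)

# To save a plot: open an image file, draw the plot, then close the file.
png_path <- file.path(tempdir(), "births_map.png")
png(png_path, width = 800, height = 600)
plot(nc["BIR74"], main = "Births, 1974")
dev.off()

# Confirm the image was written.
file.exists(png_path)
```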

3.6 Lab 1 Assignment

This lab assignment is worth 10 points. Follow the instructions below.

  1. Use read_csv to load the file usa_arrests.csv into R
    • Use the function summary on one of the variables
    • Report the mean, median, minimum and maximum values for that variable
  2. Use st_read to load the shapefile new_haven.shp into R
    • Use the plot function to plot the shapefile
    • Save and export your plot as an image
    • Use the function summary on one of the variables
    • Report the mean value for that variable