Appendix E — Assignment E

Instructions

  1. You may talk to a friend, discuss the questions and potential directions for solving them. However, you need to write your own solutions and code separately, and not as a group activity.

  2. Do not write your name on the assignment.

  3. Make R code chunks to insert code and type your answer outside the code chunks. Ensure that the solution is written neatly enough to understand and grade.

  4. Render the file as HTML to submit.

  5. There are 10 points for cleanliness and organization. The breakdown is as follows:

  • Must be an HTML file rendered using Quarto (2 pts).

  • The table of contents appears in the file (2)

  • There aren’t excessively long outputs of extraneous information (e.g. no printouts of unnecessary results without good reason, there aren’t long printouts of which iteration a loop is on, there aren’t long sections of commented-out code, etc.). There is no piece of unnecessary / redundant code, and no unnecessary / redundant text (2 pt)

  • The code follows the tidyverse style guide for naming variables, spaces, curly braces, etc. (2 pt)

  • The code should be commented and clearly written with intuitive variable names. For example, use variable names such as number_input, factor, hours, instead of a,b, xyz, etc. For repetitive code chunks, either copy the comments or just leave a comment mentioning that the comment is the same as in the previous occurrace of the code chunk (2 pts)

  • 2 points will be deducted for every question, where you have not printed out the final answer, and the grader needs to execute your code to check the answer.

  • 2 points will be deducted for every question, where you have hard-coded the final answer. For example, if the final answer is 10, then you can’t say, print(10). Your code must execute correctly to print out the final answer.

  1. The assignment is worth 100 points, and is due on 26th Feb 2024 at 11:59 pm.

E.1 Streaming platforms

The streaming platforms used by the students of two sections of the STAT303-1 Fall 2022 class are give below as two atomic vectors - streaming_platform_section1 and streaming_platform_section2

streaming_platform_section1<-c('Netflix;Hulu;Amazon Prime;HBO Max;Disney+', 'HBO Max;Disney+;Paramount Plus, Peacock', 'Netflix', 'Netflix;Hulu;Disney+', 'Netflix;Amazon Prime;HBO Max', 'Netflix;Hulu;Amazon Prime;HBO Max;Disney+', 'Netflix;Hulu', 'None', 'Amazon Prime', 'Netflix;Hulu;Amazon Prime', 'Amazon Prime', 'Netflix', 'Netflix', 'Netflix', 'none', 'Netflix;Amazon Prime', 'Netflix;Hulu;Amazon Prime;HBO Max;Disney+', 'Hulu;HBO Max', 'Netflix;Disney+', 'Netflix', 'Netflix;HBO Max', 'Netflix;Hulu', 'Netflix;Amazon Prime;HBO Max;Disney+', 'Netflix;Disney+', 'Netflix;Disney+', 'Netflix;HBO Max;Disney+', 'Netflix', 'Netflix;Amazon Prime', 'Netflix;Disney+', 'HBO Max;Disney+', 'Netflix;Amazon Prime', 'Netflix;Hulu;Amazon Prime;HBO Max;Disney+;Shudder, Apple TV+', 'Netflix;Amazon Prime;HBO Max;Disney+', 'Netflix;Hulu;Amazon Prime;HBO Max;Disney+', 'Netflix;Hulu;Amazon Prime', 'Hulu;Amazon Prime', 'Netflix;Hulu', 'Netflix;Hulu;Amazon Prime;HBO Max;Disney+', 'Netflix', 'Netflix;Hulu;Amazon Prime;HBO Max;Disney+', 'Netflix;Hulu', 'Netflix;Hulu;HBO Max;Disney+', 'Netflix;Hulu;Amazon Prime', 'Netflix;Hulu;Amazon Prime', 'Netflix;Amazon Prime', 'Netflix;Hulu;Amazon Prime;HBO Max;Disney+', 'Netflix;Amazon Prime;HBO Max', 'Netflix;Hulu;Amazon Prime;HBO Max;Disney+', 'Netflix;Hulu;Amazon Prime', 'Netflix;Hulu;HBO Max', 'Netflix', 'Netflix', 'Netflix;Hulu;Amazon Prime;HBO Max', 'Netflix;Hulu;Amazon Prime;HBO Max', 'Netflix;Amazon Prime', 'Netflix', 'Disney+', 'Netflix;Amazon Prime;HBO Max', 'Netflix;Hulu;Amazon Prime;HBO Max;Disney+', 'Netflix;Hulu;Amazon Prime;HBO Max;Disney+', 'Netflix;Hulu;Amazon Prime;HBO Max;Disney+;Peacock, Paramount+', 'Netflix;Hulu;Amazon Prime;HBO Max;Disney+', 'Amazon Prime', 'Netflix;HBO Max', 'Netflix;Hulu;Amazon Prime;HBO Max;Disney+', 'Netflix;Hulu;Amazon Prime;HBO Max;Disney+', 'Netflix', 'Netflix;Amazon Prime;HBO Max', 'Netflix', 'Netflix;Hulu;Amazon Prime;HBO Max;Disney+', 'Netflix', 'Netflix;Amazon Prime;HBO Max', 'Netflix;Hulu;Amazon Prime;HBO Max', 'Netflix', 'Netflix;Hulu;Amazon Prime;HBO Max;Disney+', 'Netflix;Hulu;Amazon Prime;HBO Max', 'Netflix;Disney+', 'Amazon Prime;Disney+', 'Netflix;Amazon Prime', 'Netflix;Amazon Prime;HBO Max;Disney+', 'Netflix;Hulu;Amazon Prime;HBO Max;Disney+', 'Netflix;Hulu;Amazon Prime;HBO Max', 'Netflix;Hulu;Amazon Prime;HBO Max', 'ESPN+, Peacock, Paramount+', 'Netflix', 'Netflix;Hulu;Amazon Prime;Disney+', 'None', 'Netflix;Hulu;Amazon Prime;HBO Max;Disney+', 'Netflix;Amazon Prime;Disney+', 'Netflix;Hulu;Amazon Prime;HBO Max;Disney+', 'Youtube', 'Netflix;Amazon Prime;HBO Max', 'Amazon Prime', 'Hulu', 'Netflix;Amazon Prime;Disney+', 'Netflix;Amazon Prime', 'Netflix;HBO Max', 'Netflix;HBO Max;Disney+', 'Netflix;HBO Max', 'Disney+', 'Netflix;Hulu;HBO Max', 'Netflix', 'Netflix', 'Netflix;Hulu;Amazon Prime;HBO Max;Disney+', 'Netflix;Hulu;HBO Max;Disney+', 'Netflix;Hulu;Amazon Prime;HBO Max;Disney+', 'Netflix;Amazon Prime', 'Netflix;Hulu;Amazon Prime;HBO Max', 'Netflix;Amazon Prime', 'Netflix;Hulu;HBO Max;Disney+', 'none', 'Netflix', 'Netflix;Hulu', 'Netflix;Hulu', 'Netflix;Amazon Prime;HBO Max;Disney+', 'Netflix;Hulu;Amazon Prime;HBO Max', 'Netflix;HBO Max', 'Netflix;Hulu;Amazon Prime;HBO Max', 'Netflix', 'Netflix;HBO Max', 'Netflix', 'Netflix;Amazon Prime;HBO Max;Disney+', 'Netflix;Hulu;Amazon Prime;HBO Max;Disney+', 'Netflix;Hulu;Amazon Prime;HBO Max;Disney+', 'Netflix;Amazon Prime;HBO Max', 'Netflix', 'Netflix;Hulu;HBO Max', 'Netflix;HBO Max', 'Amazon Prime', 'Netflix;Hulu;Amazon Prime;HBO Max;Disney+', 'Netflix;Hulu;Amazon Prime', 'Hulu;Peacock', 'Amazon Prime', 'Netflix;Hulu;Amazon Prime;HBO Max;Disney+', 'Netflix')
streaming_platform_section2<-c('Netflix;Hulu;Amazon Prime;Disney+', 'Netflix;Amazon Prime;HBO Max;Disney+', 'Amazon Prime;HBO Max', 'Netflix;Hulu', 'Netflix;Hulu', 'Netflix', 'Free Online Sites', 'Netflix;Hulu', 'Netflix;Amazon Prime;HBO Max;Disney+', 'Netflix;Hulu;Amazon Prime;HBO Max;Disney+', 'Netflix;Disney+', 'Netflix', 'Netflix;Hulu;Amazon Prime;HBO Max;Disney+', 'Netflix;HBO Max', 'Netflix;Amazon Prime', 'Netflix;Hulu', 'Netflix;Hulu;Amazon Prime;HBO Max;Disney+', 'Netflix;Hulu;HBO Max', 'Netflix;Hulu;Amazon Prime;HBO Max;Disney+', 'Netflix;Hulu;Amazon Prime;HBO Max', 'Netflix;Amazon Prime;Disney+', 'Netflix;HBO Max', 'Netflix;Hulu;HBO Max', 'Netflix;Disney+', 'Netflix', 'Netflix;Hulu;Amazon Prime;HBO Max', 'Netflix', 'Netflix;Hulu;Amazon Prime;HBO Max;Disney+', 'Netflix;Disney+', 'Netflix;Hulu;HBO Max', 'Netflix;Amazon Prime', 'Netflix;Disney+', 'Netflix;Hulu;HBO Max', 'Netflix;Hulu;HBO Max;Disney+', 'Netflix;Hulu;Amazon Prime', 'Netflix;Hulu', 'Netflix;Hulu;Amazon Prime;HBO Max;Disney+', 'Netflix;Hulu', 'Netflix;Hulu;Amazon Prime;HBO Max', 'Netflix', 'Animepahe.com ', 'Netflix;Hulu;HBO Max', 'Netflix;Amazon Prime;HBO Max;hotstar')

E.1.1 Streaming platforms of all students

Concatenate the atomic vectors streaming_platform_section1 and streaming_platform_section2 to obtain a vector named as streaming_platforms that consists of the streaming platforms of all students.

(2 points)

E.1.2 Number of unique combinations of streaming platforms

What is the number of unique combinations of streaming platforms used by students?

Hint: Use the functions unique() and length()

(3 points)

E.1.4 Frequency of a streaming platform

Write a function that takes a streaming platform (for example, ‘Netflix’) as an argument, and returns the number of students using that platform.

Call the function to find the number of students using:

E.1.4.1 Netflix

E.1.4.2 Hulu

(6 points)

Note that a student using a particular platform may be using other platforms as well.

E.1.5 At least 3 platforms

How many students use at least 3 streaming platforms?

(6 points)

E.2 Prime Factors with and without for loops

E.2.1 Prime number

Define a function named as prime that checks if an integer is prime. The function must accept the integer as an argument, and return TRUE if the integer is prime, otherwise it must return FALSE.

Call your function with the argument as 197.

(2 points)

E.2.2 Factor

Define a function named as factor that checks if an integer is a factor of another integer. The function must accept both the integers as arguments, and return TRUE if the integer is a factor, otherwise it must return FALSE.

Call your function with the arguments as (19,85)

(2 points)

E.2.3 Prime factors

Define a function named as prime_factors that returns the prime factors of an integer. The function must use the functions prime and factor.

Call the function with the argument as 1234567, and print the returned object, which should be the prime factors of the number.

(4 points)

E.2.4 Prime factors code: Time

Record and print the time taken to execute the function prime_factors for the integer 1234567.

(3 points)

E.2.5 Prime number: Without for loop

Update the function prime so that it works without a for loop. Name the updated function as prime_nofor. Call the function prime_nofor with the argument as 197.

(5 points)

E.2.6 Prime factors: Without for loop

Update the function prime_factors so that it works without a for loop. Name the updated function as prime_factors_nofor. The function must use the functions prime_nofor and factor. Call the function prime_factors_nofor with the argument as 1234567, and print the returned object, which should be the prime factors of the number.

(12 points)

E.2.7 Prime factors: Time taken without for loop

Record and print the time taken to execute the function prime_factors_nofor for the integer 1234567. Is it less than the time taken by prime_factors for the integer 1234567?

(3 points)

E.3 Air quality sensors

Air quality sensors are used to measure the amount of contaminants in air. This question will guide you in finding the location of installing 50 air quality sensors in the State of Colorado, such that they are as far away from each other as possible. The approach below is a greedy algorithm to find an approximate Maximin design.

Use the following code to generate the coordinate-pairs (latitude and longitude) of potential locations for installing an air quality sensor in Colorado:

x1<-seq(37,41,length=100)
x2<-seq(102,109,length=100)
candidate_set<-as.matrix(expand.grid(x1,x2))

E.3.1 Number of coordinate-pairs

How many coordinate-pairs are there in the object candidate_set?

Note that:

  1. A coordinate-pair means a latitude-longitude pair.

  2. ‘Air quality sensor’ will be referred as ‘sensor’ in the questions below for brevity.

(2 points)

E.3.2 First sensor

The first sensor is to be installed closest to Denver (closest in terms of Euclidean distance). Find the coordinate-pair of the location where the first sensor will be installed. The coordinate-pair of Denver is: [39.7392\(^{\circ}\) N, 104.9903\(^{\circ}\) W]

Note that the suffixes \(^{\circ}\) N and \(^{\circ}\) W are omitted in the object candidate_set.

You are not allowed to use a for() loop in this question.

(6 points)

E.3.3 Second sensor

Find the coordinate-pair of the installation of the next sensor, such that it is as far as possible from the first sensor installed near Denver.

You are not allowed to use a for() loop in this question.

(6 points)

E.3.4 Air sensor coordinates

Use the rbind() function to stack the coordinate-pairs of the first and second sensors vertically to obtain a 2 x 2 matrix. Name the matrix as air_sensor_coordinates.

Run the code below to check if your results seem correct. The coordinate-pairs of the two air quality sensors will be marked as black dots over the map of Colorado. Make sure the file colorado.png is in the same directory as the *qmd file.

You are not allowed to use a for() loop in this question.

library(png)
if(!require(png)){
  install.packages("png")
  library(png)
}
library(png)
my_image <- readPNG("colorado.png")
plot(-x2,x1, type='n', main="", xlab="longitude", ylab="latitude")
rasterImage(my_image, xleft=-109, xright=-102, ybottom=37, ytop=41)
points(-air_sensor_coordinates[,2],air_sensor_coordinates[,1],pch=16)

(4 points)

E.3.5 Third sensor

Now you need to find the coordinate-pair for installing the third sensor such that it is far away from the two already-installed sensors. Proceed as follows:

  1. Find the minimum distance of each coordinate-pair in candidate_set from the two already installed sensors. For example, if a coordinate-pair is at a distance of 5 units from the first sensor, and 10 units from the second sensor, then its minimum distance from the sensors will be \(\min(5,10) = 5\) units.

  2. Select the coordinate-pair (from candidate_set) whose minimum distance from the two already installed sensors is the maximum.

  3. Stack the coordinate-pair of the third air quality sensor vertically on the array air_sensor_coordinates.

Execute the code provided in the previous question to check if your results seem correct. The coordinate-pairs of the three air quality sensors will be marked as black dots over the map of Colorado.

You are not allowed to use a for() loop in this question.

Hint:

For step (1) above:

  1. Define a function which computes the distances of a coordinate-pair from all the coordinates of air_sensor_coordinates, and returns the minimum distance.

  2. Apply the function on all the coordinate-pairs in candidate_set using the apply() function.

(15 points)

E.3.6 47 more sensors

You need to find 47 more coordinate-pairs to install air quality sensors well-spread across Colorado. We will generalize the steps in the previous question as follows:

  1. Suppose you have already found the coordinate-pairs for the installation of i sensors.

  2. Find the minimum distance of each coordinate in candidate_set from the i already installed sensors. For example, if a coordinate-pair is at a distance of \(d_1\) from the first sensor, \(d_2\) from the second sensor,…, and \(d_i\) from the \(i^{th}\) sensor, then its minimum distance from the sensors will be \(min(d_1, d_2, ..., d_i\)).

  3. Select the \(i+1^{th}\) coordinate-pair (from candidate_set) as the one whose minimum distance from the \(i\) already installed sensors is the maximum. Execute the code provided in the previous question to check if your results seem correct. You should see 50 black dots well spread across Colorado.

You are allowed to use only a single for() loop in this question.

(6 points)