In this course, you’ll learn to use the free, open-source statistical software R. Using R directly requires a bit of programming, so we’ll take advantage of the free, open-source program R Studio, which provides a convenient interface to R.

Here’s what you need to do by 8 a.m., ~~Friday, January 20th~~ Monday, January 23rd:

- Download and install R.
- Download and install R Studio.
- Read the first five pages of this introduction to R from our textbook authors. Try out the software as suggested in this introduction.
- Answer the reading questions below. (Be sure to log in to the blog before leaving your answers in the comment section below.)
- Bring your laptop to class on Friday, if practical. (If you don’t bring one, you can work with a partner who did.)

Here are your reading questions, which you should be able to answer whether or not you successfully install and run these programs:

- What purpose does the $ in the command “arbuthnot$boys” serve?
- Describe in words the result of the command “arbuthnot$boys/(arbuthnot$boys+arbuthnot$girls)”.
- At this point, what do you find most confusing about using R and R Studio?
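For readers who cannot load the data, the two commands in the reading questions can be tried on a small stand-in data frame. The sketch below is only an illustration: the data frame `toy` and its values are made up (they are not Arbuthnot's counts), and only the `$` operator and the element-wise arithmetic mirror the commands quoted above.

```r
# Hypothetical stand-in for the arbuthnot data frame; the counts are invented.
toy <- data.frame(
  year  = c(1629, 1630, 1631),
  boys  = c(50, 60, 70),
  girls = c(50, 40, 30)
)

# `$` pulls one named column out of the data frame as a vector.
boys_only <- toy$boys

# Arithmetic on vectors is element-wise: one proportion per row (year).
prop_boys <- toy$boys / (toy$boys + toy$girls)

print(boys_only)
print(prop_boys)
```

With the real data frame, the same expressions would use `arbuthnot` in place of `toy`.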

1. The $ in the arbuthnot$boys command allows you to access the boys column of data by itself, separating that data from the rest.

2. That command would calculate the percentage of boys christened each year by dividing the number of boys christened each year by the total number of children christened in that year.

3. Making graphs is the most confusing.

1. The $ sign isolates one of the columns of a data set (so the format is dataset$specific column), listing the values in that column, in this case “boys,” as the elements of a 1-D vector.

2. RStudio interprets “arbuthnot$boys/(arbuthnot$boys + arbuthnot$girls)” as dividing each “boys” value by the sum of itself and the “girls” value from the same year – effectively finding the percentage of male christenings out of total christenings for each year.

3. Plotting data requires more complicated syntax, including specification of “type” by seemingly arbitrary number.

(Note – the reply above mine is visible, so I expect mine is as well).

1. The $ allows you to access the specified column instead of accessing the entire table. The command returns a vector with all of the values in the specified column

2. The command returns a vector of the number of boys born in each year divided by the total number of boys and girls born for each year.

3. It doesn’t seem too confusing yet, but I wonder how to make other types of plots and graphs.

1) The dollar symbol extracts the elements of a specified variable column and stores them in a one-dimensional vector. In this example, arbuthnot$boys, the $ extracts the elements from column “boys” from matrix “arbuthnot” and stores those elements in a vector.

2)This command first adds the corresponding elements from the “boys” and “girls” vectors together and then divides the “boys” elements by the respective results from the addition of “boys” and “girls”.

3) At this point, I would like to see more of how R studio interfaces with R and if all of these commands are possible with R as well. That is probably just the CS major in me 🙂 Rstudio seems pretty straightforward, considering most of us at this point will have already used matlab or mathematica.

1) The $ in the command “arbuthnot$boys” allows us to access the data in each column of a data set separately; this will only show the number of boys christened each year.

2) The result of the command “arbuthnot$boys/(arbuthnot$boys+arbuthnot$girls)” is the proportion of newborns that are boys in 1629.

3) So far I did not find anything confusing about these programs. They just require some experience.

Q: What purpose does the $ in the command “arbuthnot$boys” serve?

A: It tells the program that you only want to see the “boys” data. In this case you see the number of boys each year in array form.

Q: Describe in words the result of the command “arbuthnot$boys/(arbuthnot$boys+arbuthnot$girls)”.

A: You get the proportion of boys christened each year in array form.

Q: At this point, what do you find most confusing about using R and R Studio?

A: Learning all of the syntax for performing the operations you want. However, they can easily be looked up.

1. Selects a column in the dataset

2. Proportion of boys christened for each year in the data set

3. I’m wondering how we can upload files from sources other than the website in the tutorial.

1. The $ sign allows us to access the data in a specific column. The command arbuthnot$boys will give us an 82×1 column vector that contains the number of boys christened per year.

2. The command will give us the ratio of boys christened in a year to the total number of boys and girls christened during the same year in an 82×1 matrix, where each element corresponds to the years starting from 1629 to 1710.

3. It seems the page source(“http://www.openintro.org/stat/data/arbuthnot.R”) no longer exists. I have installed both R and R Studio, but I can only see the Console frame in my interface. Is it possible to access the Workspace and History directly on the interface instead of clicking the File tab above? Thanks.

1. What purpose does the $ in the command “arbuthnot$boys” serve?

The $ command specifies which column of data you are interested in and extracts only that section of data into a new vector.

2. Describe in words the result of the command “arbuthnot$boys/(arbuthnot$boys+arbuthnot$girls)”.

The command gives you a vector that contains the proportion of children that were christened boys in each year of the study.

3. At this point, what do you find most confusing about using R and R Studio?

I have used R before and am ok on everything we have done so far. I will probably find it more difficult once we get into more statistical operations but I have used all of today’s functions before.

1. The dollar sign indicates which column of the data set to be accessed.

2. This command says, “divide the number of boys christened in each year by the number of boys and girls (total) christened in each year.” Eighty-two values should be returned, one for each year.

3. The only real issue I had was errors resulting from not being used to the R language. Otherwise, the program was pretty intuitive. I liked how you could access data from the internet.

The $ in the command “arbuthnot$boys” serves to select the entries in the boys column of the arbuthnot data input and present it in the form of a vector.

The command “arbuthnot$boys/(arbuthnot$boys+arbuthnot$girls)”, first computes the sum of the number of boys and girls christened each year (in London from 1629 to 1710) in vector form (so 82 entries in the vector with each entry corresponding to the sum of number of boys and girls christened that year) and then divides the number of boys christened each year by that sum (in vector form). This results in a vector with 82 entries of the proportion of boys christened each year in London from 1629 to 1710.

At this point, everything seems clear. The GUI of RStudio seems to mimic that of Matlab.

1) The “$” in “arbuthnot$boys” tells RStudio to access a specific vector of data named “boys” from the array “arbuthnot”.

2) “arbuthnot$boys/(arbuthnot$boys+arbuthnot$girls)” is a command that tells RStudio to return a vector of the percentage of total christenings that were male, by year.

3) The software seems pretty intuitive, but as with any new language it will take some work with it to get used to the language’s specific conventions. So I guess the question here is that the language will need some more acclimation and thus more practice.

1. What purpose does the $ in the command “arbuthnot$boys” serve?

The dollar sign selects the boys field/column from the Arbuthnot dataset

2. Describe in words the result of the command “arbuthnot$boys/(arbuthnot$boys+arbuthnot$girls)”.

The command will display a vector of the decimal percent of boys out of the total babies for each year.

3. At this point, what do you find most confusing about using R and R Studio?

I’ve used Matlab and command lines extensively, so R is pretty straightforward. Tab completion, up/down history all work as expected. The main question I have at this point is what some of the other commands are, since the intro only introduces a couple basic ones.

1. The $ in the command ensures that only the data for the number of boys christened is shown.

2. The command “arbuthnot$boys/(arbuthnot$boys+arbuthnot$girls)” displays the proportion of newborns that are boys born in every year. The results are displayed in “packed” form so that every resulting value is displayed from left to right with the index displayed in “[ ]” to indicate which row the starting value is from.

3. One small misprint can lead to a long struggle to figure out where you made a typo.

A)

To access the data of the “boys” column separately from the other columns in the file. In other words, to choose only the data of the “boys” column.

B)

The proportion of newborns that are boys in 1629 (according to the file).

Because:

arbuthnot$boys <==== # of boys

arbuthnot$boys+arbuthnot$girls <==== # of girls and boys

C)

Comparisons are not that clear to me. I am wondering if I can compare row by row?

Moreover, can I compare one whole column with another column?

1. The $ in the command Arbuthnot$boys, extracts only the data for the number of boys christened each year. Instead of separating it into columns when the entire data set is pulled up, it stores the data for boys as a vector with 82 elements.

2. This statement, “Arbuthnot$boys/(Arbuthnot$boys+Arbuthnot$girls)”, is the ratio of the number of newborn boys to the total number of newborns, girls and boys in the data set. This expression uses the complete vectors to divide the number of boys by the number of newborns.

3. It is confusing to know what expressions to use to execute a command. Small details, such as using a period instead of a space, make it difficult to know and understand exactly how R and R studio work without having studied it well.

1) The dollar sign specifies the next statement as a subcategory of the previous statement. So, out of all of the data included in arbuthnot, arbuthnot$boys only shows the information regarding the boys. You could do the same for the categories “girls” and “year”

2) That command shows the proportion of boys to girls each year

3) Does R always use order of operations or is it simply order of commands?

1. The $ symbol tells RStudio to call the column labeled “boys” from the arbuthnot dataframe. The column is placed in the console in vector form.

2. The command displays an output vector of the proportion of all newborns that are boys in the console. Specifically, RStudio separately divides the number of boys by the total number of boys and girls for each individual year. The result is a vector containing the proportion of newborns that are boys in each year.

3. I am most familiar with Matlab. As I was following the RStudio lesson, I became worried that I would have to type out something lengthy like “arbuthnot$boys/(arbuthnot$boys+arbuthnot$girls)” each time I wanted to manipulate or make calculations using the columns in the dataframe. I did a quick test and discovered that I could rename parts of the data so I would only have to type in something short like “b/(b+g)” to perform the identical calculation. Are you aware of any other shortcuts available in RStudio? Is there anything I can place after a command so that I don’t see a giant output in the console (in Matlab, a semicolon does this task)? How can I set up an Excel spreadsheet so that RStudio can read my data? What does the “R” in “RStudio” stand for? With all these questions, I realize I’m not really answering the reading question about what I find “most confusing about using R and RStudio”. I guess I’m not really confused – I just want to know more. This is a pretty sweet program and I’m excited to try it out in my other classes!

1. The $ in the statement, “arbuthnot$boys” calls on the column of data stored in the boys vector contained in the arbuthnot data. The command will create a vector of the 82 numbers contained in “boys”.

2. The command, “arbuthnot$boys/(arbuthnot$boys+arbuthnot$girls)”, creates a vector of 82 numbers containing the proportion of boys to total births for each year.

3. Being that I have taken many programming classes, it seems pretty basic and self explanatory. The only thing that seems remotely strange is that it seems identical to excel… it appears to do everything excel does but just uses a command window instead of a point and click user interface.

1. $ allows us to access data in a column of a dataframe separately.

2. It gives the proportion of boys in each year of the data.

3. Does RStudio automatically save the commands that we used when we open RStudio again in the future (like Matlab)?

1. The $ allows us to specify which column of the dataframe arbuthnot to use. “arbuthnot$boys” should only show the number of boys christened each year.

2. This results in the ratio of the number of boys to the total number of babies christened from 1629 to 1710.

3. Is R Studio simply the User Interface for R?

1. The “$” in the command separates the name of the dataset and the variable the user wants to access. Using this command, the user accesses only the “boys” variable in the “arbuthnot” dataset.

2. The command “arbuthnot$boys/(arbuthnot$boys+arbuthnot$girls)” will give a vector containing, for each year, an element equal to the ratio of boys christened to the total number of children christened that year.

3. I didn’t understand how R Studio was able to load the information from the Internet. Can you enter any URL into the “source” function?

1. The $ in that command makes it print out the results in vector form instead of storing the data in columns.

2. The arbuthnot$boys/(arbuthnot$boys+arbuthnot$girls) command will provide, in a vector layout, the decimal percentage of the number of boys born each year out of the total number of boys and girls that were born.

3. It’s not too bad at all; it’s very similar to matlab, which I’ve had a little experience with. The most difficult part will be learning the commands that we’ll be using as a class.

1) The “$” means that the data that is shown comes from only the “boys” column. The “$” signifies that the data from only the column (coming after the “$”) will be displayed.

2) This gives the % of boys as a proportion of the total boys and girls born in each year (but shown as a decimal).

3) I think that R and R Studio are fairly straightforward; it’s just going to take a little bit of time to get acquainted with the commands. It reminds me of a simpler version of MATLAB, and one that doesn’t take up as much space on my computer.

1) The $ allows me to specify which elements of the data set I’d like shown to me. I.e. arbuthnot shows me all data, while arbuthnot$boys shows me only the boys datapoints.

2) This command will show the proportion, in decimal form, of total births in London during each year who were boys. RStudio will display these points horizontally, with the year of the leftmost data point serving as a guide.

3) Nothing confusing yet, but I’m sure it will come

1) The $ means that it will only show the data of the column that you specify after the dollar sign. So arbuthnot$boys only shows the data in the boys column. It also puts the data in vector form so it doesn’t waste space by having each number be in its own row.

2) This command gives you the percentage of boys in each year. This is also given in vector format so no space is wasted.

3) The most confusing part now is all of the different commands but once I use the program more I will become more comfortable with it.

1. The $ separates a variable in a data set so arbuthnot$boys would give us all the data in the boys column.

2. Every year the ratio of boys to girls was close to, but never below, 0.5, and never above 0.54, showing that there were always more boys than girls, but never by very much.

3. Adding a data set to the workspace.

1) It makes the system unable to read the next word, which is boys, so it just reads arbuthnot. It says coercing left-hand side to a list.

2) It says it cannot find arbuthnot. So I guess it is an equation for the number of boys in the class divided by the total number of students in the class.

3) Does this system work the same as Matlab?

1. The $ in the command “arbuthnot$boys” serves to extract the column “boys” from the array “arbuthnot”.

2. The result of the command “arbuthnot$boys/(arbuthnot$boys+arbuthnot$girls)” is to give the percentage of christenings that were performed on males for each year in the data set.

3. Are R and R Studio primarily used for statistical applications? How extensively are they used in academia and industry?

1. The $ command allows the user to access the data in the column labelled “boys”.

2. It gives you the proportion of newborns that are boys for every year in the data set.

3. The software is pretty straightforward. The most helpful thing would simply be a list of commonly used commands.

1) The “$” serves to access the data in a particular column. arbuthnot$boys will return the “boys” column of the arbuthnot dataset.

2) arbuthnot$boys/(arbuthnot$boys+arbuthnot$girls) will take each value from the “boys” column of the arbuthnot dataset and divide it by the sum of itself plus its corresponding value in the “girls” column.

3) I have not yet found anything particularly confusing about R or R-Studio, it seems to be a fairly straightforward and easy-to-use programming language.

1) What purpose does the $ in the command “arbuthnot$boys” serve?

> It gives the column vector labeled ‘boys’ inside the dataframe ‘arbuthnot’

2) Describe in words the result of the command “arbuthnot$boys/(arbuthnot$boys+arbuthnot$girls)”.

> This command gives the ratio of boys born to the total number of children born

3) At this point, what do you find most confusing about using R and R Studio?

> Nothing yet…

1. The command arbuthnot$boys accesses the data from the boys column only. The $ allows this data to be stored as a vector.

2. This command would give the proportion of newborns that were boys as it is asking for the number of boys/ the number of boys +girls giving the proportion of boys to total number of babies born.

3. I think this introduction was very helpful. I guess right now I’m just a little confused on all the functions of R and how it will be used throughout the semester

What purpose does the $ in the command “arbuthnot$boys” serve?

Describe in words the result of the command “arbuthnot$boys/(arbuthnot$boys+arbuthnot$girls)”.

At this point, what do you find most confusing about using R and R Studio?

1. The $ tells R which column in the data set you want.

2. This will give the percent of newborns that were boys for each year.

3. Nothing yet. It seems a lot like Matlab and I hate Matlab.

1) Specify the column of data (variable) to be accessed from the entire set of data.

2)Calculate the proportion of boys over the overall newborns.

3)So far so good. The commands are pretty straight forward.

The $ in arbuthnot$boys indicates which column (in this case, the column titled “boys”) to retrieve out of a given data set (“arbuthnot”).

The result of arbuthnot$boys/(arbuthnot$boys+arbuthnot$girls) will yield the proportion of kids christened that happened to be boys each year. The denominator will add up to the total number of kids christened, and this will do a year-by-year comparison of the data. Boys in 1629/Kids in 1629, Boys in 1630/Kids in 1630, etc.

The most confusing thing with R and RStudio so far is, what happens if you don’t have a sort of “null” first column? Will it always just assume you’ve identified the first column as something not raw data?

1. What purpose does the $ in the command “arbuthnot$boys” serve?

It lets us access the data in a single column

2.Describe in words the result of the command “arbuthnot$boys/(arbuthnot$boys+arbuthnot$girls)”.

It computes the proportion of newborns that were boys for each year

3. At this point, what do you find most confusing about using R and R Studio?

How do you change the plot scale and axis labels?

1. The $ tells RStudio to display the information in the specified column as a row vector.

2. Displays the proportion of newborns that were boys as a vector.

3. How do you get the workspace to show up?

1. The $ symbol in “arbuthnot$boys” designates the column “boy” within the data set “arbuthnot”. It allows the user to identify and manipulate columns within a data set that may have more than one column of information. The symbol allows easy designation of which column is needed by the user.

2. The command “arbuthnot$boys/(arbuthnot$boys+arbuthnot$girls)” arranges a column’s worth of information into rows. It is arranged into rows so that it is easier for the user to read. Each row has a number in front indicating which row the information would be in if it were viewed in column form. The actual information present takes one row’s (in this case year’s) “boys” column information and divides it by the sum of the “boys” column and the “girls” column within that same row (or year). The addition is done first according to order of operations.

3. If there were to be column names that were similar, how would that data be accessed? How do users create data sets within R-studio? Does it have to be done outside of the program?

1. The $ symbol allows you to access the data in each column of a data set separately.

2. This command takes the information from each of the boy and girl columns and adds them together as vectors. This would give the proportion of newborns which are boys.

3. Just some of the syntax, really. It seems like it wouldn’t be too hard to get used to though.

1. Since “arbuthnot” is a database and “boys” is a column of that database, the “$” in “arbuthnot$boys” allows you to access only that specific column of data in the database.

2. This command allows you to find the percentage of boys born. All you are doing is calculating the ratio of boys to all children (total boys added to total girls).

3. Is there a general place to go to get random statistic databases? I’m guessing wolfram won’t help because they want us to use Mathematica… but same idea?

1. What purpose does the $ in the command “arbuthnot$boys” serve?

arbuthnot tells R that we are accessing the arbuthnot data file in our work space. $ signifies that we are going to access a particular column. boys signifies that column name.

However, I would like to know whether we can use $ to access different rows as well (or does it only work for columns). What would happen if you had a row and a column with the same name (call it that.name) and you used arbuthnot$that.name? Then what? Does the $ operator just search the file for a word and then return the vector containing the data associated with that word?

2. Describe in words the result of the command “arbuthnot$boys/(arbuthnot$boys+arbuthnot$girls)”.

This returns a vector, where each data point in the vector is the individual computation of the statement for corresponding values in the two vectors boys and girl within the arbuthnot data set. Thus call the vector, result, then result[0] = boys[0] / (boys[0] + girls[0]). And result[1] = boys[1] / (boys[1] + girls[1]) and…. so forth until.. result[82] = boys[82] / (boys[82] + girls[82])

3. At this point, what do you find most confusing about using R and R Studio?

How do we just use the statement arbuthnot… Is it s.t. if you have a dataset in your workspace, you can call it by name and call various methods on it because its in your dataset?

1.

The $ in the command makes it so that we are only shown the number of boys christened each year instead of boys and girls.

2.

The result of this command is the proportion of newborns that are boys for each year from 1629 to 1710 (in total 82 ratios are displayed).

3.

It seems very similar to MATLAB so it seems straightforward to me as of now, especially with the help commands. If I had to pick I’d say the most confusing thing is getting used to and knowing the types of arguments that go into commands.

1) It allows you to access a certain column by name

2) It is the equation for the proportion of newborns that are boys in the “arbuthnot” data set

3) I don’t know very many commands or the language itself, and I think I would have a hard time completing tasks without tutorials for help on the language.

1. The ‘$’ specifies a sub set of the data. In this case, we have an 82×3 matrix, and it takes the boys column and turns it into a vector with each entry representing the value of boys christened that particular year.

2. This command takes the vector of boys christened in a year, and divides it by the sum of the vectors boys and girls. It essentially allows us to find the percentage of children christened that were boys for each year.

3. This software seems to work in a similar manner to other programming softwares(matlab and mathematica to a lesser degree), so it isn’t too confusing, only I don’t know very many commands at the present.

1. The symbol ‘$’ serves to access the subset “boys” of the larger data set “arbuthnot”.

2. The subset “boys” and the subset “girls” are added together for each year, resulting in a new data set. The subset “boys” is divided by the new set at each year, delivering the proportion of boys delivered each year.

3. The most confusing part of R Studio is how I typed the line

plot(arbuthnot$year, arbuthnot$boys + arbuthnot$girls, type = “l”)

and I received an error of invalid plot type ‘1’.

1. The $ command tells R studio that you are just looking at one specific column of data.

2. This command returns the ratio of boys to total children for each year in the data.

3. At this point, R Studio is not that confusing.

1. The $ commands the program to only show the number of boys christened each year from the data set.

2. The command “arbuthnot$boys/(arbuthnot$boys+arbuthnot$girls)” will display the fraction of babies born in 1629 that are boys from the data set.

3. Is it possible to save the command arbuthnot$boys as a separate variable to avoid typing all that out (say “Boys”)?

1. The $ allows access to a category of a table.

2. The ratio of boys to total births for each year.

3. I was able to find a maximum for the total births, but wasn’t quite sure how to find the year where the maximum occurs. There was a link to a large reference sheet, but it just seemed confusing to see all that data presented together.

1) The $ in arbuthnot$boys tells R studio to display the number of boys’ christenings from 1629 to 1710

2) arbuthnot$boys/(arbuthnot$boys+arbuthnot$girls) displays the proportion of boys to total christenings in London from 1629 to 1710.

3) I’m not really sure how I would create and name my own data set, right now I just know how to download one and enter it into R studio

1. It gives access to the individual data in a column, in this situation, the boys.

2. It gives the proportion of newborn boys in 1629.

3. Is there a lot of coding/ computer science experience needed to run R?

1. The purpose of using $ is to only display the column of data you want to display

2. the command calculates the percentage of boys in the total number of babies born by taking the number of boys born at each year and dividing it by the number of boys minus the number of girls for that same year. R does that for every line of the data.

3. I don’t understand some of the information displayed in the windows on R?

What purpose does the $ in the command “arbuthnot$boys” serve?: It allows us to access the data in each column individually

Describe in words the result of the command “arbuthnot$boys/(arbuthnot$boys+arbuthnot$girls)”.:

The ratio of boys christened in the sample year divided by the total number of christenings

At this point, what do you find most confusing about using R and R Studio?

How to download and install it

1) This symbol gathers all of the numerical values of a particular vector. The vector is chosen by simply typing its name after one types the target data group followed by the symbol, “$”.

2) All data is taken from the arbuthnot data set named by the statistical collecting of Dr. John Arbuthnot. The proportion of the number of boys to the number of total babies is found by dividing the number of boys christened by the sum total of the number of boys added to the number of girls christened.

3) It seems decently simple. The only trouble I had is when I caused successive errors it stopped taking commands and the “>>” symbol changed to “+”.

1. The $ separates the name of the column from the dataframe, i.e. that we want the data from the “boys” column in the “arbuthnot” dataframe in this case.

2. The command adds each data element in the “boys” column of the “arbuthnot” dataframe to each respective element in the “girls” column of the “arbuthnot” dataframe. It then divides each data element in the “boys” column of the “arbuthnot” dataframe by each respective summed element. Or, in essence, it gives the proportion of newborns that are boys over the years 1629-1710.

3. As a Computer Science major, I have had extensive experience with programming and various programming languages, and so R comes very naturally (especially given the fact that I have had exposure to MATLAB). There is nothing confusing so far about anything that I have learned about R.

1. $ means the dimension “boys” is under the data set “arbuthnot”.

2. That is the ratio of boys to the total population

3. They don’t seem confusing to me because I found they are similar to Matlab.

The $ command is the token for using all / each of the data points for a given calculation or plotting. It would be used if one was interested in adding each element in two different columns together, without having to manually specify each entry.

The command will display the gender ratio of births for each given year from 1940 to 2002

I find the difficulty of manipulating the data on a row by row basis most confusing. It is at this point hard to trace back to which year this data comes from, and when calculations are performed their results are displayed independent of the original data.

And these are the questions from the learning guide, which I happened to do on the side and thought I might as well add in.

1940 – 2002

63 Rows and 3 Columns

Variable names: Year, Boys, Girls

These counts compare closely in relationship to arbuthnot’s data, but are with much higher sample sizes. The greater size is reflected in the decreased variability in the boy/girl proportion.

Arbuthnot’s observation about the boy girl ratio does in fact hold up in the US, according to this data

The plot displays a seeming downward trend in the gender ratio as the years progress, but the total decline (.2% point difference) is so slight it does not seem significant.

1961 had the most number of births.

1. The dollar symbol ($) tells R the column that you wish to see or work with.

2. R sums each of the corresponding terms in boys and girls columns, and then divides each term in the boys column with the corresponding results. Corresponding terms are determined by matching index numbers.

3. With the commands, nothing. So far R seems pretty straight forward. However, I am having trouble finding where the workspaces are stored on my computer. I also cannot see the side bars that are mentioned in the tutorial.

1. It accesses the column “boys” of arbuthnot.

2. The proportion of boys to the whole population of the sample.

1. It tells R to show only the following column of the preceding data set, so for this, it shows the boys column of the arbuthnot data.

2. It is the proportion of newborn boys to total newborns.

3. How would we make our own datasets?

1. To access the data in each column of a dataframe separately.

2. The ratio of newborn boys to all newborns.

3. The rows and columns in R are confusing. I don’t exactly understand how they are related.

1. (For command x$y): the $ symbol serves as a way of telling RStudio that you would like to access the y column of the x data set.

2. This command will display the percentage of boys born in each year compared to the total born in that year (sum of boys and girls born).

3. R and RStudio seem pretty straightforward! Granted, I’m a CS major, so a lot of this seems natural. The only difficulty is that the tutorial seems to favor the Windows UI, and navigating around on OS X was a bit different. Other than that, not bad at all!

1. The $ in the aforementioned command returns the vector stored under the name listed after the $.

2. This command displays all of the individual boys/(boys+girls) calculations, one per year. These are shown in an array, starting at the top left and proceeding towards the right until a row is full. At the beginning of each row is a number in brackets giving the position, in the original arbuthnot data, of the first value on that row.

3. It seems relatively similar to many programming languages that I’ve taken before, especially Matlab, as Matlab is a great program for vector computing.

One issue I ran into was from the tutorial pages. I received this error message: “Error in plot.xy(xy, type, …) : invalid plot type ‘1’” when I typed in this command: “plot(x = arbuthnot$year, y = arbuthnot$girls, type = “1”)” and I’m not sure what I did incorrectly. I changed the spacing, to see if that was a problem, but to no avail.
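Judging from the error message, a likely cause is that the plot type was typed as the digit one (“1”) rather than the lowercase letter L (“l”), which is the value `plot()` accepts for a line plot. A minimal sketch with made-up data (the data frame `toy` and the variable `line_type` are hypothetical) looks like this:

```r
# Hypothetical toy data; the values are made up for illustration
toy <- data.frame(year = 1:5, girls = c(40, 42, 41, 45, 44))

line_type <- "l"  # lowercase letter L for lines, not the digit 1
plot(x = toy$year, y = toy$girls, type = line_type)
```

With the arbuthnot data loaded, the same change would apply to the original command: keep everything as typed and replace only the character inside the quotes.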

1) What purpose does the $ in the command “arbuthnot$boys” serve?

2) Describe in words the result of the command “arbuthnot$boys/(arbuthnot$boys+arbuthnot$girls)”.

3) At this point, what do you find most confusing about using R and R Studio?

1) It is an operator with an RHS and an LHS, with the meaning of extracting from the dataframe (on the LHS) all the data in a column of the name on the RHS.

2) This command would result in a vector of proportions (numbers of the form 0.???, with a min possible of 0.0 and a max of 1.0) which represent the proportion of all births in a given year which were boys.

3) Actually, R and RStudio seem really straightforward. I don’t think I’m really confused by it (at least not yet). I guess if I had to ask a question, it’d be just how extensive or fancy can you be with this/how powerful is this tool?

1. I truly can’t determine exactly what the $ means when that command is called.

2. This equation will calculate the proportion of newborns that are boys in the 1629 data set.

3. So is this program used more like a Linux-type system with command-line calls, or is it just used for expressions of datasets?

1. We can access the data in each column of the dataframe separately.

2. The proportion of newborns is given by the command. The command divides the number of boys by the total number of newborns, which is why we have to use parentheses.
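The role of the parentheses can be seen by comparing the two versions on made-up numbers. The vectors below are hypothetical and are not taken from the arbuthnot data:

```r
# Hypothetical toy vectors; the values are made up for illustration
boys  <- c(60, 55)
girls <- c(40, 45)

with_parens    <- boys / (boys + girls)  # boys divided by the yearly total
without_parens <- boys / boys + girls    # division happens first: 1 + girls

print(with_parens)
print(without_parens)
```

Without the parentheses, R performs the division before the addition, so the result is no longer a proportion.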

3. I am most confused by the semantic aspects of how to type in data. It is somewhat similar to MATLAB, but some of the more involved processes are more difficult to remember.

What purpose does the $ in the command “arbuthnot$boys” serve?

It accesses the variable ‘boys’ in the R object ‘arbuthnot’.

Describe in words the result of the command “arbuthnot$boys/(arbuthnot$boys+arbuthnot$girls)”.

It gives the percentage of boys for each year.

At this point, what do you find most confusing about using R and R Studio?

Where will the R object come from? Will the location of the objects be given to us for each assignment or will we have to create them ourselves?

1. The “$” serves to extract the data of how many boys were christened each year.

2. This command gives us the proportion of newborns that are boys in 1969 which is given in vector form.

3. The most confusing thing about the program for me is that if you just typed in a command that was supposed to give you the proportion of newborns that are boys in 1969, why is the output more than one number?
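A possible explanation is that the command works on whole columns, so it returns one value per row rather than a single number. The sketch below uses a made-up data frame (`toy` and its values are hypothetical) to show the full result and one way to pick out a single year:

```r
# Hypothetical toy data; the years and counts are made up for illustration
toy <- data.frame(
  year  = c(1, 2, 3),
  boys  = c(30, 45, 20),
  girls = c(30, 55, 60)
)

all_years <- toy$boys / (toy$boys + toy$girls)  # one proportion per row
one_year  <- all_years[toy$year == 2]           # keep only the value for year 2

length(all_years)
print(one_year)
```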

1.) The $ sign is used to specify the column which you are trying to output in the console.

2.) This shows the ratio of boys to total christenings for each year.

3.) Nothing. Everything seems okay at this point; the layout is very similar to MATLAB, and I have experience using that.

P.S.

I am really sorry this is coming late, for some reason I thought the assignments were due on the 25th. I hope I get partial credit even though I submitted late. I promise this won’t happen again.