Problem Set 5 Solutions

Posted on December 9, 2013 by Derek Bruff

Apparently, I never got around to posting solutions to your fifth problem set. Better late than never. Here are solutions to Problem Set 5.

Final Exam Information and Resources

Posted on December 5, 2013 by Derek Bruff

You have two options for taking the final exam: Tuesday, December 10th, from 3 to 5 p.m. or Friday, December 13th, from 12 to 2 p.m. Both offerings will be in our regular classroom.

The final exam will be comprehensive and will have a format similar to the midterms, with a mix of conceptually oriented multiple-choice questions and computationally oriented free-response questions. There will be no test corrections on this exam.

You’re welcome (and, indeed, encouraged) to use Wolfram Alpha, your graphing calculator, or a similar tool on the exam. You’re also welcome to bring two 3″ x 5″ index cards with notes (front and back).

Here’s my office hour schedule for the next week:

Friday the 6th, 2-3pm
Monday the 9th, 10:30-11:30am and 2:30-3:30pm
Tuesday the 10th, 10:30-11:30am
Wednesday the 11th, 11:30-12:30pm
Thursday the 12th, 2-3pm

And some resources for you as you study:

And some practice problems:

Untying the Gordian Knot

Posted on November 15, 2013 by alavelle

According to legend, Alexander the Great faced a challenge, where his success prophesied his eventual triumph over lands in Asia. Alexander considered untying the knot, but chose instead to use his sword to cut the knot, revealing its ends. After this, the knot could be easily unwound. The question here is, if the Gordian knot cannot be untied except by revealing/creating ends, can it be a knot?

Knots are created by a linear combination of moves done upon strings that result in a series of bends. Some “knots” are trick knots – a slip “knot,” without much tension on either end, quickly unravels. Here we will define these trick knots as not true knots. A knot must hold weight for extended periods, resisting capsizing or slipping (both of which result in coming untied). An example of knot-steps that would not count would be two strings wound around each other. If either string is pulled upon, unless there is an exorbitant friction coefficient, the two strings will come apart, back to their original condition.

What our project looks at is how to go from a completed knot back to the individual string(s) that it originated from. We hope to develop a method that covers knots in the general case. Our theory frames knots such that they only result from linearly independent ends of the strings (working end and standing end), with linear moves in which order matters (though this last part we will investigate more theoretically than factually), and only two strings. Therefore, the solution to the matrix that results in the knot in the first place will be the inverse matrix formed by each step of tying the knot. (Consider each transformation on the string[s] as a standard matrix.)

The reverse to this theory is our prediction that the only way a knot cannot be untie-able is if the string’s ends are not linearly dependent – visually where a string meets itself again, it is fused at some level. Furthermore, we might even suggest that a knot that does not appear untie-able would be the monkey’s fist, where one could nearly hide the ends within the knot itself. Therefore, the Gordian knot fails to be a knot in its inability to be untied.

The expansion of this theory leads to chains of operations. When you repeat certain processes, you get knots on top of knots. Because knots have a standing end and a working end, when this process is repeated, you get a number of stacked knots. This number can be considered an eigenknot (or scalar of a knot). If you think about a plane in 2D space, you would get the same knot repeated at a larger y. The thickness (3D component) here is not necessary. Continuing the process infinitely to tie a square knot will result in a series of square knots until either the working or standing ends run out.

In conclusion, we hope to show that knots can be represented in a matrix form of unique operations that will row reduce to result back into a solution with a positive number of possible answers. What will be interesting to see is if there are more than unique answers – we certainly expect this from larger knots comprised of prime knots, but we also consider that there is more than one way to untie, and furthermore to tie, a square knot (and that furthermore the ease of untying when pulling down on one loop is what led to the rise of said square knots).

Our main problem will be finding a way to model knot interactions (steps) in a way that gives sensical answers for this class. Certainly, we can think about each of these interactions as symmetric to things we do in linear algebra, so it will be necessary to combine our two sciences to create something that is inherently linear algebraic rather than merely representative of what we have been doing.

South Africa Population Model

Posted on November 13, 2013 by jliu

For our application project for this class, we will be analyzing South Africa’s population growth, or in this case, the country’s population decreases. Based on information from the Central Intelligence Agency’s “The World Factbook,” South Africa’s population growth rate is -0.45% as of 2013. Out of the 233 countries in the world, South Africa’s population growth rate is number 222, with number 1 having the largest population growth rate. The country has the twenty-seventh largest population in the world, so we are questioning, why South Africa has a negative population growth rate? What age groups are causing this decline in population?

To figure out the answers to these questions, we are going to use the data from the Central Intelligence Agency’s “The World Factbook,” which has information on every country in the world. This data resource (https://www.cia.gov/library/publications/the-world-factbook/geos/sf.html) provides us with up to date and reliable information on South Africa’s people. We will be using the actual birth rate, death rate, maternal mortality rate, infant mortality rate, and migration rate to analyze the population decline of South Africa. With this data, we will carry out the analysis using the eigenvalue approach to dynamical systems.

To start, we will simplify our data from the Central Intelligence Agency’s “The World Factbook,” to include only five age groups of population. Those age groups will be 0 to 14 years, 15 to 24 years, 25 to 54 years, 55 to 64 years, and 65 years and older. By doing so, for each age category, the population growth rate will be represented by a five-dimensional vector. When we pull all of this information together, we will create a five-by-five matrix that includes the five rates (birth, death, maternal mortality, infant mortality, migration) for each of the age groups of the population. This five-by-five matrix will be our transition matrix (P). Now, we will solve for the eigenvalues by using the equation, determinant(P-I)=0. Once we know the eigenvalues, we will solve for the eigenvectors for each eigenvalue using the equation, (P-nI)x = 0. Each of these eigenvectors will describe the eigenspace for that specific eigenvalue. Since the eigenvectors form a basis for R5, x0=c1v1+c2v2+c3v3+c4v4+c5v5. Then, xk=Pkx0=Pk(c1v1+c2v2+c3v3+c4v4+c5v5). We know that Pkv1=kv1,so xk=c1Pkv1+c2Pkv2+c3Pkv3+c4Pkv4+c5Pkv5=c1kv1+c2kv2+c3kv3+c4kv4+c5kv5.Now, we can input the eigenvalues that we will calculate. To find whether the population growth in South Africa will increase or decrease, we will see what happens when k goes to infinity. If the eigenvalue is larger than one, the population will grow, but if the eigenvalue is less than one, it will decrease. We will find our own population growth rate for South Africa and see what factors are truly causing the population to decline based on the absolute values of our eigenvalues. In addition, we will ultimately be able to find the ratio of age groups among the population and determine the importance that it holds.

Google It: Markov Chains in Modern Search Engines

Posted on November 13, 2013 by vshah

As the Internet has grown, so has the need for a system that can easily navigate all the information it contains. However, the obvious logistical concerns makes creating a comprehensive system that can efficiently organize this information extremely difficult. The Internet is an unregulated organic entity, and on top of that, there are nearly 20 billion web pages.

Given the monumental task at hand, what Google has accomplished is all the more impressive. With near ubiquitous recognition across the Western world and over 83% of the US market share, Google represents the incredible power of linear algebra applied on an unprecedented scale.

Ultimately, what separates Google from the competition is that it ranks pages in a way that is not only efficient, but also ensures trustworthy and relevant sites are at or near the top of every search query.

Earlier search engine systems used an index that searched web pages and ranked them according to the frequency of the keyword queries. However, this proved to be an inefficient and inconsistent way to gauge the relative importance of resulting pages. While initially helpful, it was clear that this method was tedious and would only become more dysfunctional as the web grew.

It is for this reason that the innovative techniques pioneered by Google founders Sergey Brin and Lawrence Page are so integral. Rather than focusing on frequency, search results were ranked based on relevance; this meant no more sifting through volumes of hyperlinks. Instead, these hyperlinks were utilized as the means to determine relevant results. Behind this revolutionary idea: an algorithm called PageRank.

As we have explored eigenvectors and eigenvalues over the past few weeks, we have seen many long-term population modeling applications; however, we have not delved into many other real-life examples. As such, we will be exploring the model and algorithms that underlie Google’s monumental success. In addition, we will explore how Page and Brin modeled the web using modified stochastic matrices, Markov chains, and their intimate connection to eigenvectors.

To model how PageRank is used, we will scale the web down to a simpler system of 5 pages. Each page has a certain number of links to each of the other four, and each website passes on its “importance” to the pages to which it is linked. For example, if page 1 is linked to pages 2, 3, and 4, then it would pass ⅓ of its “importance” to each of those pages. This is essential in determining which pages are ranked higher, and in turn, show up higher on the results than others. Using this process, we can construct a stochastic matrix that models the links between the pages, called the “link matrix”. From here, it can be seen that if one start out with any vector “x” that represents the initial relative importances of the pages, multiplying it by the link matrix repeatedly will yield a final page matrix. That is, certain pages will become more important than others. In order to obtain the final page matrix, we will find the eigenvalues of the link matrix and determine the eigenvector for which the eigenvalue = 1, or more recognizably, where Pq=q.

In addition, we will examine how Google’s innovations invariably played a role in facilitating this model. Examining damping factors to account for irregularities in the columns of a stochastic matrix and the premise of hanging nodes, or disconnected subsystems, within such a network is essential to understanding the evolution of Google.

Simply put, the Internet is a complex, chaotic, and disconnected entity. Through our simplified model, we hope to give the class an idea of how Google’s application of linear algebra brought a sense of organization to the Internet.

References:

http://www.ams.org/samplings/feature-column/fcarc-pagerank

http://www.pewinternet.org/Press-Releases/2012/Search-Engine-Use-2012.aspx

http://infolab.stanford.edu/pub/papers/google.pdf

http://www.math.cornell.edu/~mec/Winter2009/RalucaRemus/Lecture3/lecture3.html

http://www.google.com/competition/howgooglesearchworks.html

http://www.worldwidewebsize.com/

Practical Fractaling

Posted on November 13, 2013 by jbasri

Group Members: Mark Chambliss, Jason Stewart, and Jason Basri

Fractals are relatively new observances to the math world. It was not until 1985 that Michael Barnsley developed the first practical way for generating fractals. Fractals however have been around since before man. They are seen in biology and nature ranging from microorganisms to mountain ranges.

Fractals in its basic form is an object made up of smaller copies of itself. In a more mathematical sense, a fractal is a subset of a Euclidean space whose Hausdorff dimension and topological dimension are not equal (Klang). We as linear algebra students were shown linear transformations and a few applications of them. We would like to take what we have learned and apply it to our own project of creating fractals. With the help of Matlab, this will be possible.

For our application project we will look into the problem “How can computers be used to create fractal images?”. As stated earlier, fractals are images created by performing repetitive linear transformations on points and images. Computers are very adept at performing repetitive tasks and thus will be well suited for performing the necessary repetitive operations for creating fractal images. To achieve our desired goal, we will study algorithms that have been developed by others in the past to create fractal images and determine the differences and benefits of each approach. Lastly, we will reflect back upon how the fractal image algorithms can be used in modern applications. An example of a simple fractal image that was created with matlab is displayed below (Seiler):

Asking a more basic question is “what processes are used by various math based software to create complex fractal images?” To help understand how fractals are used to create an image like this, we will use the basic fractal of the sierpinksi carpet. This fractal takes a 2×2 matrix and turns it into a carpet like image comprised of a pattern made up of many small squares. This is done through linear transformations called similitudes, which take that original 2×2 matrix, scale it down, and transform it to another area of the graph. A process of n similar similitudes, typically 8 in the case of the sierpinski carpet, continue with this resizing and translation of the matrix until many “squares” appear on the graph. These similitudes, many of them quite complex, form the basis of many fractals, which can be comprised of thousands of similitudes. Because of this, to form many of the more advanced fractals, such as the one seen above, the Monte Carlo approach is used, which involves performing many similitudes upon many randomly selected coordinates in the graph. This is where the fractal mapping software becomes ideal, as it can perform these various complex functions (McKeeman).

Bibliography

Klang, Jesse. An introduction to Fractals. http://home2.fvcc.edu/~dhicketh/LinearAlgebra/studentprojects/fall2004/jesseklang/Fractalsproject.htm.

McKeeman, Bill. Wild Fern. http://www.mathworks.com/matlabcentral/fileexchange/19141-wind-fern

Seiler, Ben. Fractal Generation. http://wso.williams.edu/~skaplan/fractal/chapter3.htm

Data Compression

Posted on November 13, 2013 by imundy

A common problem for computer scientists, which has very direct effects on the consumers of their products, is the way in which they store data that will be used by an end user. Finding an intelligent method to store this data in such a way that it does not take up an unnecessarily large amount of space is a common problem that has been looked at for many years. This becomes especially applicable when we are looking at the transmission of this data, or when we are storing it on some external system.

As an example, we will observe storing images and compressing them. The simplest way to represent an image in a computer is what we call a bitmap. For a grayscale image, you can imagine this as being represented as a large matrix of numbers with gray values, representing intensities, ranging from 0 to 255 (the numbers that can be represented in 8 binary bits, or one byte). For color images, we have almost the exact same thing, but instead we have to have three matrices – one each for the red, green, and blue values of each pixel. Every image can be represented in this way, or some equivalent combination of color values, for instance cyan, yellow, and magenta is also a popular model. Obviously, the amount of data grows very quickly as the image grows, with a 1000×1000 pixel image representing 3×10002=2.86 Megabytes, while still being a relatively small image.

For someone who is working with something such as billboards, it is very important that these images have as much information as possible. That is why high megapixel cameras, which capture details at a smaller and smaller level, are important in high end photography; however, if I am designing a webpage in which images will only be displayed as smaller 500 by 500 squares, this extra information is unnecessary and will only serve to fill up available space faster. The question, which we will address using topics covered in linear algebra, is how can this information be compressed in such a way that the image maintains acceptable quality? How can we decide which values are important to keep in the compressed image?

We believe that these questions can be addressed systematically using concepts from linear algebra this semester. For instance we know that the eigenvalues are unique to any given matrix, so we could use these to help us approximate a solution. We also know that each matrix has a given rank, so when dealing with large numbers, as long as this rank is greater than 0, we have some set of linearly independent columns, so we can analyze this set and the space it occupies to try to reduce our image. For instance, even if the three columns have a rank of three, we know that they may only occupy a subspace of R3 meaning we only have to represent a limited space. Information such as this could help us in deciding how to approximate our image. Our project will attempt to use the characteristics of the matrices of the image as a way to approximate a condensed form of that image.

Modeling the One Child Policy

Posted on November 13, 2013 by chsu

The project will analyze the effect of the One Child Policy on the population of China. The One Child Policy was implemented to prevent social and economic problems of rapid population growth by barring most families from having more than one child. Since it came up into effect three decades ago, the policy has reduced the Chinese population by at least 250 million. However, recent studies have shown that the One Child Policy may have been the cause of a worsening demographic problem with an aging population and a dwindling work force. TIME magazine reported that there were six adults of working age for every retiree. The ratio is expected to continue to decline. The National Committee of Ageing estimated that 52 percent of the work force in 2053 will have to support one-third of the Chinese seniors as well as 16 percent of the children.

In order to investigate the consequences of the policy, we will predict and compare the long term population of China with and without the One Child Policy. We will first construct a transition matrix that models the current population growth of different age groups in China by using current birth rate and mortality rate. Three equations will used to indicate the number of minors(0-18), adults(18-65), and the elderly(65+). Then we will construct a transition matrix of the Chinese population without the effects of the One Child Policy. The transition matrix without the effect of the One Child Policy will be applied to the population of when the policy was first implemented. We will assume that all births prevented will be added to the minors age group and the birth rate and mortality rate are constant. Furthermore, we will assume that every adult will be in the workforce for later analysis. With the given assumptions, the eigenvalues of the matrices will show us the ratio of the age groups in both scenarios.

The long term population growth of both cases can be studied from the data collected. We are mostly interested in the ratio between the working force and elderly. We will compare our derived ratio and the ratio projected by experts. This difference will be taken account and show us the validity of our data when studying the projected ratio of the working force and the elderly in the case where the One Child Policy was not implemented. The result of this study will show us whether removing the One Child Policy would alleviate the burden of future generations.

Citation:

Republic of China. Information Office of the State Council Of the People’s Republic of China. Family Planning in China. Beijing: , 1995. Web. <http://www.fmprc.gov.cn/ce/celt/eng/zt/zfbps/t125241.htm>.

A BRIEF HISTORY OF-China’s One-Child Policy

<http://content.time.com/time/world/article/0,8599,1912861,00.html>

Markov Model of the Drive

Posted on November 12, 2013 by jpevitt

Project Proposal for Jason Pevitt and Nick Kline

“You find out life’s this game of inches, so is football. Because in either game – life or football – the margin for error is so small…The inches we need are everywhere around us. They’re in every break of the game, every minute, every second. On this team we fight for that inch.” – Al Pacino, Any Given Sunday

Al Pacino put it better than anyone has since in this speech; the game of football is truly a game of inches. At its highest level, each decision a coach makes will be scrutinized by fans and media alike to a level unheard of elsewhere. Most coaches would argue that, given a situation on the field, they have a rough idea of the likelihood of all outcomes. However, we (and Keith Goldner before us) felt that this “rough idea” business should be done away with; it seemed reasonable to us that, using Markov chains, we could give an exact probability for many scenarios in a football game. The availability of this data would provide invaluable information to many coaches in certain game situations (i.e. the decision to attempt a fourth down conversion versus punting in a critical, late-game scenario; the debate over when to use a time-out in a two-minute drill; etc.). Basically, our data should allow us to give the probabilities to every outcome of a drive beginning anywhere on the field.

The game of football at the professional level is a conglomeration of the outcomes of many NFL seasons; a season itself can be simplified down to the outcomes of many different games, which can further be broken down to the outcomes of various possessions or drives. Because Markov chains are excellent for modeling probabilities of successive events given a starting state, we believe they provide an excellent basis for modeling football if one considers each play to be a different, independent state. The first, most important step to modeling the game, then, is to model the drive. That is what we will be doing in our project.

To begin this undertaking, the drive has to be divided into all – or at least close to all – of its possible situations. Because of changes to the game in the last decade, we agreed with Goldner’s decision to limit the sample size to games played in the last five years. The non-drive-ending, or transient, states are determined by down, distance to the first down, and yardline on the field. The field was then subdivided into zones of five-yard increments. The same idea was applied to distance-to-go measurements, yielding five-yard zones save the fifth one which held all distances of twenty-plus yards (because of the relatively similar probabilities of converting a 3^rd and 26 and a 3^rd and 43). We felt this idea of grouping different yardages into zones was applicable in order to give each zone a frequency high enough for each to be statistically relevant. These transient states totaled 340. The drive-ending, or absorbing states, totaled nine as follows: field goal, safety, missed field goal, fumble, interception, turnover on downs, punt, end of half or game.

From here, each game from the 2006 to 2011 season was analyzed. The probability of moving from one state to the next was charted in accordance with our understanding of Markov chain analysis. From here, we expect to be able to make data-driven decisions in various game situations.

Credit to Keith Goldner:

– http://www.drivebyfootball.com/2011/05/introduction-to-markov-model-of.html

– http://www.drivebyfootball.com/2011/05/markov-model-of-football.html

– https://docs.google.com/spreadsheet/ccc?key=0Ag6b9q23rVNDdGUyQU0xVTJiUnRqbXNFV3pWVjZmaEE&authkey=CNWl2cUF&hl=en&authkey=CNWl2cUF#gid=0

Power Ranking Football Teams

Posted on November 12, 2013 by rhurt

Rankings are important for many different applications in a wide variety of fields; in essence, any time there are multiple options to choose from, it is generally useful to have a systematic way of deciding which choice is the best. Whether considering colleges, national economies, or web pages, people value well-designed ranking systems that help them to make informed decisions. Ranking systems are especially important within the world of sports, where team rankings can make or break a season or a career.

A natural way to compare two athletic teams exists already; they play against each other, and their relative order is determined by the game’s outcome. However, given the variety of complicated factors that can decide who wins or loses, these one-to-one comparisons can’t reliably be extended to give a complete ordering of a set of many teams — that is, beating one team doesn’t in turn guarantee superiority over every team that they have already beaten. Some method is needed to take the basic information provided by individual games and consolidate it into a measure of overall strength.

One way of doing this would be simply to compare the win-loss records of teams, but this oversimplifies the situation. For example, suppose Team A and Team B are evenly matched in skill, but Team A faces stronger opponents. Team A will win fewer games than Team B, and this model would incorrectly rank Team B ahead of Team A.

Alternatively, we can rank n teams by constructing an n-by-n matrix, initially filled with zeroes. In this matrix, there is a column and a row to represent each team, and the matrix entries represent each possible match-up between two teams. For each completed game, some value is added to the matrix entry in the row of the winning team and the column of the losing team (depending on the type of model, this value could simply be a 1 to represent a win, a point margin of victory, or some other weighted measure of the win). At this point, the sum of each row represents the corresponding team’s “primary wins”, or the games that they themselves won. By itself, this is very similar to a ranking strategy based solely off of win-loss records. However, once the matrix is set up in this way, several possibilities for deeper analysis become available:

Squaring the matrix will form a new matrix in which the row sums now represent “secondary wins”, or the games that a team’s defeated opponents have won. This allows the model to reward teams for victories over very strong opponents. This process can be continued to include third, fourth, and nth level wins if desired — but each level becomes increasingly irrelevant to the problem at hand.
The original matrix can be normalized so that each column total is the same; this is another way to account for strength of schedule, because it devalues victories over very unsuccessful teams.
We can also consider an arbitrary ranking vector, which is refined by multiplying it with the matrix to produce a new ranking vector. Over many cycles of refinement, this will converge to the dominant eigenvector of our matrix – so in order to determine a final ranking we need only to find the dominant eigenvector.

The proposed project will attempt to accurately rank 2013 SEC football teams using some combination of the approaches discussed above, based on the statistics available from the season so far.

Math 194: Linear Algebra

Fall 2013, Vanderbilt University