# What factors are most relevant in helping to predict property prices

## Scenario: Melbourne Property Prices

An important issue for many young Australians is housing affordability: recent booms in property prices in Australia’s largest cities have made it more difficult for first-time homebuyers to enter the market. Sellers and buyers alike are interested in what drives property prices, and buyers are generally interested in knowing where bargains can be found.

We will consider data on more than 23,000 properties listed on domain.com.au in the Melbourne metropolitan area between January 2016 and September 2017. These data include key geographic information (e.g., suburb, address, distance from Melbourne CBD), property information (e.g., property type, number of rooms, land and building area), and selling information (e.g., agent, sale method, price).

For this project, there are three primary questions to be investigated:

(a) Does property price increase the closer you get to the Melbourne CBD, and does the relationship between distance from the Melbourne CBD and property price change depending on the property type?

(b) What factors are most relevant in helping to predict property prices, and which general region (REGION NAME) appears to be the best bargain for houses (i.e., excluding other property types) based on what you would predict house prices to be for that region?

(c) Are there certain (non-geographic) attributes of properties that characterize a general region (REGION NAME)? In other words, is (non-geographic) information on the property sufficient to allow a buyer to understand where a property is likely to be located?
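As an illustration of how question (a) might be framed, one possible linear regression includes an interaction between distance and property type. This is only a sketch under stated assumptions: the symbols below are hypothetical labels rather than names from the data set, property type is assumed to have $K$ levels with level 1 as the reference, and whether price should be transformed (e.g., logged) is left open.

```latex
% Hypothetical notation: Price_i, Distance_i, and Type_i are illustrative labels,
% not variable names taken from the data set.
\[
\text{Price}_i = \beta_0 + \beta_1\,\text{Distance}_i
  + \sum_{k=2}^{K} \gamma_k\,\mathbb{1}[\text{Type}_i = k]
  + \sum_{k=2}^{K} \delta_k\,\text{Distance}_i\,\mathbb{1}[\text{Type}_i = k]
  + \varepsilon_i
\]
```

Under this formulation, the slope of price on distance is $\beta_1$ for the reference property type and $\beta_1 + \delta_k$ for type $k$. A negative slope would correspond to prices increasing closer to the CBD, and nonzero $\delta_k$ would suggest that the relationship differs by property type.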

**Expectation:**

**Methods and Analysis**:

Provides a description of the data and the statistical analyses that will be performed. If a **linear regression**, **principal component analysis**, or **linear discriminant analysis** is to be carried out, this section should provide an explanation of and motivation for the variables that are included in the model. This section should also include descriptive statistics (summary statistics, tables, graphs) that are useful in describing the data and providing a glimpse of what you might expect from your statistical analyses. A good deal of thought should go into your descriptive statistics, as these must clearly show some relevance to your questions of interest, and you must explain what you can conclude from them.

**Results**:

Provides a thorough description of the results of the analyses you described in the previous section. Include tables with relevant output. If analyses are carried out that involve the estimation of parameters, this should include an interpretation of the parameters for the variables of interest. Any issues with significant violations of the requirements/assumptions needed to perform the analyses carried out must be addressed.

**R code** and summary output should not be pasted into the document, but instead relevant results should be presented in nicely formatted tables.