I have provided you with an Excel spreadsheet called Last_FM_data_shuffled.xlsx. It contains the log of all the music I have listened to on my phone since I began using the Last.fm website. As the name implies, however, I have shuffled the entries so that they are no longer in chronological order. There is a header row at the top of the spreadsheet, and there are four columns of data: Band, Album, Song, and Date.
- Assuming you are not using packages that let you read from Excel, what must you do first to prepare this data for import into an R dataframe? What command will you use to import it?
For this problem, submit a .r file where the first line is a comment telling me what you have to do, and the second line is the R command to import the data. Remember that # is the comment character.
- What is a single R command that can be used to count how many different bands are represented in the data file?
- Write an R script that will sort the data back into chronological order and store it in a new dataframe.
- Recall that the table() function can be used to quickly summarize data. As an example, assuming I have attached the dataframe with the song data, I can type

    table(Song)

and get the following output:

    (Song For My) Sugar Spun Sister 1901 45
                                  2    1  2
    50 Ways to Say Goodbye 6th Avenue Heartache 8:02:00 PM
                         1                    2          1

Each song title appears as a column heading, and the number underneath it is the number of times that song appears in the Song column of the dataframe.
Using this, what is the R command to determine the name of the song that has been played the most times? What is the R command to determine how many times that song has been played?
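As a small illustration of how table() behaves, here is a sketch on a made-up vector of titles. The vector `plays` and its contents are hypothetical and are not taken from the spreadsheet; the sketch only shows that the result carries the headings as names and the counts as values.

```r
# Hypothetical toy data, not taken from the Last.fm spreadsheet.
plays <- c("Song A", "Song B", "Song A", "Song C", "Song A", "Song B")

# table() counts how many times each distinct value occurs.
# The result behaves like a named vector of counts.
counts <- table(plays)
print(counts)

# The column headings are available through names().
print(names(counts))

# A single count can be looked up by its heading.
print(counts["Song B"])
```

The same idea carries over to the Song column of the real dataframe, where the headings are song titles rather than these placeholder names.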
- Using R, determine the average number of songs I listened to per day over the time period in the dataset.