Time series analysis is one of the most powerful tools in data science, especially in fields like economics, finance, and forecasting. Why? Because so much of the world around us changes over time—stock prices, oil production, inflation, unemployment, or even daily website traffic. A time series is simply a sequence of observations recorded at regular intervals, and it allows us to capture patterns, seasonality, and trends that static data cannot show.
But here’s the catch: before we can run advanced models like ARIMA or VAR, we first need to properly format our data as a time series object in R. Converting raw data into this format is essential because it allows R to understand not just the values but also their temporal structure—when they occurred and how often.
In this tutorial, we’ll work with a dataset containing oil prices and an activity index starting in 1988. Our goal is simple but foundational:
Import the dataset into R.
Convert it into a time series object.
Create a basic plot to visualize the data.
By the end, you’ll understand not just how to do it, but also why each step matters. Let’s dive in!
In R, the working directory is the folder where R looks for files and saves outputs. Think of it like telling R: “Here’s the folder where my project lives.”
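A minimal sketch of this step follows. The folder path is a placeholder rather than a path from this tutorial, and the `dir.exists()` check is an added safeguard, so a mistyped path produces a readable message instead of an error.

```r
# Hypothetical project folder: point this at the folder that holds your dataset.
project_dir <- "~/projects/oil-analysis"

# Switch only if the folder exists; otherwise report the problem.
if (dir.exists(project_dir)) {
  setwd(project_dir)
} else {
  message("Folder not found: ", project_dir)
}

# Show the folder R is currently using.
getwd()
```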
Replace the path with the folder where your dataset is stored.
If R can’t find your file, it’s often because the working directory is incorrect. Use getwd() to check your current folder.
R can’t read Excel files natively—that’s where packages come in. A package is a collection of functions that extend R’s abilities.
We’ll use readxl, which is designed for importing .xlsx files.
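The two calls involved can be sketched as below. The `requireNamespace()` check is an addition for convenience, assuming you would rather not reinstall the package each time the script runs.

```r
# One-time step: install readxl if it is not already on this machine.
if (!requireNamespace("readxl", quietly = TRUE)) {
  install.packages("readxl")
}

# Every session: load the package so read_excel() is available.
library(readxl)
```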
install.packages("readxl"): downloads and installs the package.
library(readxl): loads the package into your session.
💡 Tip: You only need to install a package once, but you must load it every time you restart R.
Now let’s bring the Excel file into R.
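A sketch of the import, assuming `oildata.xlsx` sits in the working directory and its first column is one we want to drop. The `file.exists()` guard is an addition so the snippet reports a missing file clearly instead of stopping with an error.

```r
library(readxl)

if (file.exists("oildata.xlsx")) {
  # Read the first sheet, then drop the first column.
  dataset <- read_excel("oildata.xlsx", sheet = 1)[, -1]

  # Preview the first 6 rows.
  print(head(dataset))
} else {
  message("oildata.xlsx was not found in ", getwd())
}
```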
read_excel("oildata.xlsx", sheet = 1): reads the first sheet of the Excel file.
[, -1]: removes the first column (often an index or date column). We don't need it here because ts() rebuilds the dates from the start and frequency arguments.
head(dataset): shows the first 6 rows so we can confirm the data looks correct.
Common error: a message saying that oildata.xlsx cannot be found or does not exist → This usually means the working directory is wrong or the file name is misspelled. Check both with getwd() and list.files().
Clear column names are important because R is case-sensitive, and cryptic names can cause errors later.
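As a sketch of the renaming step: the small data frame below is a stand-in with placeholder values (not the tutorial's data) so the snippet runs on its own. If you have already imported `dataset`, skip that part and run only the `colnames()` assignment.

```r
# Stand-in for the imported data: placeholder values with cryptic column names.
dataset <- data.frame(
  V1 = c(15.2, 16.1, 14.8, 15.9),
  V2 = c(100.0, 101.3, 102.1, 101.7)
)

# Assign clear names, in column order.
colnames(dataset) <- c("Oil", "Activity")

# Confirm the new names.
names(dataset)
```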
Here we renamed the columns to Oil and Activity, making them easier to reference in later steps.
Now comes the key step: telling R that this dataset represents quarterly time series data starting in 1988.
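A minimal sketch of the conversion is shown below. The object name `oil_ts` is an illustrative choice, and the eight rows of placeholder values stand in for the real dataset so the snippet can run by itself; with your imported `dataset`, only the `ts()` call is needed.

```r
# Stand-in for the imported data: 8 quarters of placeholder values.
dataset <- data.frame(
  Oil      = c(15.2, 16.1, 14.8, 15.9, 17.3, 18.0, 17.1, 19.4),
  Activity = c(100.0, 101.3, 102.1, 101.7, 103.0, 104.2, 103.8, 105.1)
)

# Convert to a quarterly time series beginning in 1988, Quarter 1.
oil_ts <- ts(dataset, start = c(1988, 1), frequency = 4)

# Inspect the time attributes R has attached.
start(oil_ts)      # 1988 1
frequency(oil_ts)  # 4
```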
ts(): converts data into a time series object.
start = c(1988, 1): tells R the first observation is in 1988, Quarter 1.
frequency = 4: means 4 observations per year (quarterly). For monthly data, you’d use 12; for yearly, use 1.
Why does this matter? Because time series objects allow R to recognize the temporal order, enabling forecasting, decomposition, and seasonal analysis.
Finally, let’s visualize our data.
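The plotting step can be sketched as follows. The placeholder series, the object name `oil_ts`, and the plot title are assumptions made for illustration; substitute the time series object you created from your own data.

```r
# Stand-in quarterly series with placeholder values (not the tutorial's data).
oil_ts <- ts(
  cbind(
    Oil      = c(15.2, 16.1, 14.8, 15.9, 17.3, 18.0, 17.1, 19.4),
    Activity = c(100.0, 101.3, 102.1, 101.7, 103.0, 104.2, 103.8, 105.1)
  ),
  start = c(1988, 1), frequency = 4
)

# Open a PNG device: the plot is written to myplot.png in the working directory.
png("myplot.png", width = 800, height = 600)

# Draw both series; a multi-column ts object gets one panel per series.
plot(oil_ts, main = "Oil Prices and Activity Index", xlab = "Year")

# Close the device so the file is written to disk.
dev.off()
```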
Here’s what happens in this code:
png("myplot.png", width = 800, height = 600): tells R to open a PNG “device” where the plot will be written. The file will be named myplot.png and saved in your working directory.
plot(): creates the time series plot, but draws it into the PNG file instead of on screen.
dev.off(): closes the device so the file is finalized. Without this, the file may be left empty or incomplete.
💡 Tip: You can also use other formats such as jpeg("myplot.jpg") or pdf("myplot.pdf") if you want different outputs.
Note: If you only want to look at the plot, you can skip the png() and dev.off() calls and run plot() on its own. In that case the image is not saved to your folder; it is displayed in the plot window instead (the Plots pane, if you use RStudio).