Three alternative solutions:
1) With data.table:
You can use the same melt
function as in the reshape2
package (which is an extended & improved implementation). melt
from data.table
has also more parameters that the melt
-function from reshape2
. You can for example also specify the name of the variable-column:
library(data.table)
long <- melt(setDT(wide), id.vars = c("Code","Country"), variable.name = "year")
which gives:
> long
Code Country year value
1: AFG Afghanistan 1950 20,249
2: ALB Albania 1950 8,097
3: AFG Afghanistan 1951 21,352
4: ALB Albania 1951 8,986
5: AFG Afghanistan 1952 22,532
6: ALB Albania 1952 10,058
7: AFG Afghanistan 1953 23,557
8: ALB Albania 1953 11,123
9: AFG Afghanistan 1954 24,555
10: ALB Albania 1954 12,246
Some alternative notations:
melt(setDT(wide), id.vars = 1:2, variable.name = "year")
melt(setDT(wide), measure.vars = 3:7, variable.name = "year")
melt(setDT(wide), measure.vars = as.character(1950:1954), variable.name = "year")
2) With tidyr:
library(tidyr)
long <- wide %>% gather(year, value, -c(Code, Country))
Some alternative notations:
wide %>% gather(year, value, -Code, -Country)
wide %>% gather(year, value, -1:-2)
wide %>% gather(year, value, -(1:2))
wide %>% gather(year, value, -1, -2)
wide %>% gather(year, value, 3:7)
wide %>% gather(year, value, `1950`:`1954`)
3) With reshape2:
library(reshape2)
long <- melt(wide, id.vars = c("Code", "Country"))
Some alternative notations that give the same result:
# you can also define the id-variables by column number
melt(wide, id.vars = 1:2)
# as an alternative you can also specify the measure-variables
# all other variables will then be used as id-variables
melt(wide, measure.vars = 3:7)
melt(wide, measure.vars = as.character(1950:1954))
NOTES:
- reshape2 is retired. Only changes necessary to keep it on CRAN will be made. (source)
- If you want to exclude
NA
values, you can add na.rm = TRUE
to the melt
as well as the gather
functions.
Another problem with the data is that the values will be read by R as character-values (as a result of the ,
in the numbers). You can repair that with gsub
and as.numeric
:
long$value <- as.numeric(gsub(",", "", long$value))
Or directly with data.table
or dplyr
:
# data.table
long <- melt(setDT(wide),
id.vars = c("Code","Country"),
variable.name = "year")[, value := as.numeric(gsub(",", "", value))]
# tidyr and dplyr
long <- wide %>% gather(year, value, -c(Code,Country)) %>%
mutate(value = as.numeric(gsub(",", "", value)))
Data:
wide <- read.table(text="Code Country 1950 1951 1952 1953 1954
AFG Afghanistan 20,249 21,352 22,532 23,557 24,555
ALB Albania 8,097 8,986 10,058 11,123 12,246", header=TRUE, check.names=FALSE)
Best Answer
There are many ways to do this. This answer starts with what is quickly becoming the standard method, but also includes older methods and various other methods from answers to similar questions scattered around this site.
Using the tidyverse:
The new cool new way to do this is with
pivot_wider
fromtidyr 1.0.0
. It returns a data frame, which is probably what most readers of this answer will want. For a heatmap, though, you would need to convert this to a true matrix.The old cool new way to do this is with
spread
fromtidyr
. It similarly returns a data frame.Using reshape2:
One of the first steps toward the tidyverse was the reshape2 package.
To get a matrix use
acast
:Or to get a data frame, use
dcast
, as here: Reshape data for values in one column.Using plyr:
In between reshape2 and the tidyverse came
plyr
, with thedaply
function, as shown here: https://stackoverflow.com/a/7020101/210673Using matrix indexing:
This is kinda old school but is a nice demonstration of matrix indexing, which can be really useful in certain situations.
Using
xtabs
:Using a sparse matrix:
There's also
sparseMatrix
within theMatrix
package, as seen here: R - convert BIG table into matrix by column namesUsing
reshape
:You can also use the base R function
reshape
, as suggested here: Convert table into matrix by column names, though you have to do a little manipulation afterwards to remove an extra columns and get the names right (not shown).