Tidy-up data in R

Back to home

Excel Pivot Table
Console summary of aggregated data

Excel Pivot Table

Excel alike pivot table:

suppressPackageStartupMessages(library(dplyr))
library(tidyr)

council_name <- c("Barking and Dagenham","Barking and Dagenham",
                  "Barking and Dagenham","Barking and Dagenham",
                  "Barking and Dagenham","Barking and Dagenham",
                  "Barking and Dagenham","Barking and Dagenham",
                  "Barnet","Barnet")
period <- c("1st 2006", "1st 2006", "1st 2006", "1st 2006", "2nd 2006", 
            "2nd 2006", "2nd 2006", "2nd 2006", "1st 2006", "1st 2006") 
category <- c ("glass", "fridges", "paper", "glass", "glass", "fridges", 
               "paper", "glass", "glass", "fridges")
data <- c(333, 222, 100, 98, 450, 540, 33, 450, 560, 120)
category2 <- c ("collected", "collected", "collected", "no donors", 
                "collected", "collected", "collected", "no donors", 
                "collected", "collected")
df <- data.frame (council_name, period, category, category2, data)
df

##            council_name   period category category2 data
## 1  Barking and Dagenham 1st 2006    glass collected  333
## 2  Barking and Dagenham 1st 2006  fridges collected  222
## 3  Barking and Dagenham 1st 2006    paper collected  100
## 4  Barking and Dagenham 1st 2006    glass no donors   98
## 5  Barking and Dagenham 2nd 2006    glass collected  450
## 6  Barking and Dagenham 2nd 2006  fridges collected  540
## 7  Barking and Dagenham 2nd 2006    paper collected   33
## 8  Barking and Dagenham 2nd 2006    glass no donors  450
## 9                Barnet 1st 2006    glass collected  560
## 10               Barnet 1st 2006  fridges collected  120

df %>%
  pivot_wider(id_cols = c(council_name,period),
              names_from = c(category, category2),
              values_from = data,
              values_fn = sum)

## # A tibble: 3 x 6
##   council_name period glass_collected fridges_collect… paper_collected
##   <fct>        <fct>            <dbl>            <dbl>           <dbl>
## 1 Barking and… 1st 2…             333              222             100
## 2 Barking and… 2nd 2…             450              540              33
## 3 Barnet       1st 2…             560              120              NA
## # … with 1 more variable: `glass_no donors` <dbl>

Console summary of aggregated data

Useful way to get a summary of aggregated data:

stats::xtabs(data ~ ., df)

## , , category = fridges, category2 = collected
## 
##                       period
## council_name           1st 2006 2nd 2006
##   Barking and Dagenham      222      540
##   Barnet                    120        0
## 
## , , category = glass, category2 = collected
## 
##                       period
## council_name           1st 2006 2nd 2006
##   Barking and Dagenham      333      450
##   Barnet                    560        0
## 
## , , category = paper, category2 = collected
## 
##                       period
## council_name           1st 2006 2nd 2006
##   Barking and Dagenham      100       33
##   Barnet                      0        0
## 
## , , category = fridges, category2 = no donors
## 
##                       period
## council_name           1st 2006 2nd 2006
##   Barking and Dagenham        0        0
##   Barnet                      0        0
## 
## , , category = glass, category2 = no donors
## 
##                       period
## council_name           1st 2006 2nd 2006
##   Barking and Dagenham       98      450
##   Barnet                      0        0
## 
## , , category = paper, category2 = no donors
## 
##                       period
## council_name           1st 2006 2nd 2006
##   Barking and Dagenham        0        0
##   Barnet                      0        0

Tidy-up data in R

My personal knowledge base

Back to home

Contents

Excel Pivot Table

Console summary of aggregated data