Using Loops to pull in data for a certain range of dates depending on multiple column values











up vote
0
down vote

favorite












First and foremost, thank you for taking your time to view/answer my question.



I asked this question before, albeit very un-clearly, however, I think I am close to a solution and hoping you can help me out!



I have 2 main df's: Customer (base table), and Top_Customers (a subset of the top n customers per quarter)



Both tables have the same layout, seen below:



Cust_ID   Date    QTR     Sales    Action   Link_Cust_ID
1 1/1/18 2018 Q1 23 NA NA
1 1/2/18 2018 Q1 22.2 NA NA
1 1/3/18 2018 Q1 12.1 2 5
1 1/4/18 2018 Q1 14.1 5 NA
3 1/1/18 2018 Q1 101 NA NA
3 1/2/18 2018 Q1 55 2 18
... ... ... ... ... ...


Sometimes - a customer might have reference to another customer - hence the link_cust_id column might be populated with the ID of another customer.
My Goal is: If the Action column == 2, I want to include that customer (from Link_Cust_ID) in my Top_Customer table from the date the link_cust_id is populated to the END OF THE QUARTER



For example, I would include Cust_ID = 5 from 1/3/18 to 3/31/18 (end of the quarter) from the above chart



I've been trying at this for a while now and have come up with the following code (which currently doesn't work, but I think the idea is there)



Linking_ID <- function(data)
{
if (data$link_type == 2)
{
temp.linkid <- data$link_cust_id[i] #stores the linked customer_id
temp.date <- data$Date[i] #stores the date linking occurs
temp.data <- customer_data %>% group_by(Quarter) %>% filter(customer_id = temp.linkid & Date >= temp.date)
#the above line of code is suppose to subset only link_customer_id data from the link_date to the end of the quarter
data <- rbind(data, temp.data)
}}


I am not great with loops, and try and not use them much in my code, but I might not have a choice in this situation. If you think another way might be better, please do suggest it!



dput for base table (all customers)



    structure(list(Cust_ID = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 5, 5, 5, 5), 
Date = structure(c(1514764800, 1514851200, 1514937600, 1515024000,
1514764800, 1514851200, 1514937600, 1515024000,
1514764800, 1514851200, 1514937600, 1515024000,
1514764800, 1514851200, 1514937600, 1515024000),
class = c("POSIXct", "POSIXt"), tzone = "UTC"),
Quarter = c("2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
"2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
"2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
"2018 Q1"),
Sales = c(23, 22.2, 12.1, 14.1, 18, 18.8, 19.2, 19.8, 101, 55, 56,
55, NA, NA, 10.5, 11.1),
Action = c(NA, NA, 2, NA, NA, NA, NA, NA, NA, 2, NA, NA, NA, NA, NA, NA),
Link_Cust_ID = c(NA, NA, 5, NA, NA, NA, NA, NA, NA, 18, NA, NA, NA, NA, NA, NA)),
row.names = c(NA, -16L), class = c("tbl_df", "tbl", "data.frame"))


dput for top_customer tble:



structure(list(Cust_ID = c(1, 1, 1, 1, 3, 3, 3, 3), Date = 
structure(c(1514764800,
1514851200, 1514937600, 1515024000, 1514764800, 1514851200, 1514937600,
1515024000), class = c("POSIXct", "POSIXt"), tzone = "UTC"),
Quarter = c("2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
"2018 Q1", "2018 Q1", "2018 Q1"), Sales = c(23, 22.2, 12.1,
14.1, 101, 55, 56, 55), Action = c(NA, NA, 2, NA, NA, 2,
NA, NA), Link_Cust_ID = c(NA, NA, 5, NA, NA, 18, NA, NA)), row.names = c(NA,
-8L), class = c("tbl_df", "tbl", "data.frame"))









share|improve this question
























  • can you share both dataframes with dput?
    – Mike
    10 hours ago










  • Hi @Mike, see the dput for the base table.. ill add the dput for the top_customer list soon
    – yungpadewon
    10 hours ago

















up vote
0
down vote

favorite












First and foremost, thank you for taking your time to view/answer my question.



I asked this question before, albeit very un-clearly, however, I think I am close to a solution and hoping you can help me out!



I have 2 main df's: Customer (base table), and Top_Customers (a subset of the top n customers per quarter)



Both tables have the same layout, seen below:



Cust_ID   Date    QTR     Sales    Action   Link_Cust_ID
1 1/1/18 2018 Q1 23 NA NA
1 1/2/18 2018 Q1 22.2 NA NA
1 1/3/18 2018 Q1 12.1 2 5
1 1/4/18 2018 Q1 14.1 5 NA
3 1/1/18 2018 Q1 101 NA NA
3 1/2/18 2018 Q1 55 2 18
... ... ... ... ... ...


Sometimes - a customer might have reference to another customer - hence the link_cust_id column might be populated with the ID of another customer.
My Goal is: If the Action column == 2, I want to include that customer (from Link_Cust_ID) in my Top_Customer table from the date the link_cust_id is populated to the END OF THE QUARTER



For example, I would include Cust_ID = 5 from 1/3/18 to 3/31/18 (end of the quarter) from the above chart



I've been trying at this for a while now and have come up with the following code (which currently doesn't work, but I think the idea is there)



Linking_ID <- function(data)
{
if (data$link_type == 2)
{
temp.linkid <- data$link_cust_id[i] #stores the linked customer_id
temp.date <- data$Date[i] #stores the date linking occurs
temp.data <- customer_data %>% group_by(Quarter) %>% filter(customer_id = temp.linkid & Date >= temp.date)
#the above line of code is suppose to subset only link_customer_id data from the link_date to the end of the quarter
data <- rbind(data, temp.data)
}}


I am not great with loops, and try and not use them much in my code, but I might not have a choice in this situation. If you think another way might be better, please do suggest it!



dput for base table (all customers)



    structure(list(Cust_ID = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 5, 5, 5, 5), 
Date = structure(c(1514764800, 1514851200, 1514937600, 1515024000,
1514764800, 1514851200, 1514937600, 1515024000,
1514764800, 1514851200, 1514937600, 1515024000,
1514764800, 1514851200, 1514937600, 1515024000),
class = c("POSIXct", "POSIXt"), tzone = "UTC"),
Quarter = c("2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
"2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
"2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
"2018 Q1"),
Sales = c(23, 22.2, 12.1, 14.1, 18, 18.8, 19.2, 19.8, 101, 55, 56,
55, NA, NA, 10.5, 11.1),
Action = c(NA, NA, 2, NA, NA, NA, NA, NA, NA, 2, NA, NA, NA, NA, NA, NA),
Link_Cust_ID = c(NA, NA, 5, NA, NA, NA, NA, NA, NA, 18, NA, NA, NA, NA, NA, NA)),
row.names = c(NA, -16L), class = c("tbl_df", "tbl", "data.frame"))


dput for top_customer tble:



structure(list(Cust_ID = c(1, 1, 1, 1, 3, 3, 3, 3), Date = 
structure(c(1514764800,
1514851200, 1514937600, 1515024000, 1514764800, 1514851200, 1514937600,
1515024000), class = c("POSIXct", "POSIXt"), tzone = "UTC"),
Quarter = c("2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
"2018 Q1", "2018 Q1", "2018 Q1"), Sales = c(23, 22.2, 12.1,
14.1, 101, 55, 56, 55), Action = c(NA, NA, 2, NA, NA, 2,
NA, NA), Link_Cust_ID = c(NA, NA, 5, NA, NA, 18, NA, NA)), row.names = c(NA,
-8L), class = c("tbl_df", "tbl", "data.frame"))









share|improve this question
























  • can you share both dataframes with dput?
    – Mike
    10 hours ago










  • Hi @Mike, see the dput for the base table.. ill add the dput for the top_customer list soon
    – yungpadewon
    10 hours ago















up vote
0
down vote

favorite









up vote
0
down vote

favorite











First and foremost, thank you for taking your time to view/answer my question.



I asked this question before, albeit very un-clearly, however, I think I am close to a solution and hoping you can help me out!



I have 2 main df's: Customer (base table), and Top_Customers (a subset of the top n customers per quarter)



Both tables have the same layout, seen below:



Cust_ID   Date    QTR     Sales    Action   Link_Cust_ID
1 1/1/18 2018 Q1 23 NA NA
1 1/2/18 2018 Q1 22.2 NA NA
1 1/3/18 2018 Q1 12.1 2 5
1 1/4/18 2018 Q1 14.1 5 NA
3 1/1/18 2018 Q1 101 NA NA
3 1/2/18 2018 Q1 55 2 18
... ... ... ... ... ...


Sometimes - a customer might have reference to another customer - hence the link_cust_id column might be populated with the ID of another customer.
My Goal is: If the Action column == 2, I want to include that customer (from Link_Cust_ID) in my Top_Customer table from the date the link_cust_id is populated to the END OF THE QUARTER



For example, I would include Cust_ID = 5 from 1/3/18 to 3/31/18 (end of the quarter) from the above chart



I've been trying at this for a while now and have come up with the following code (which currently doesn't work, but I think the idea is there)



Linking_ID <- function(data)
{
if (data$link_type == 2)
{
temp.linkid <- data$link_cust_id[i] #stores the linked customer_id
temp.date <- data$Date[i] #stores the date linking occurs
temp.data <- customer_data %>% group_by(Quarter) %>% filter(customer_id = temp.linkid & Date >= temp.date)
#the above line of code is suppose to subset only link_customer_id data from the link_date to the end of the quarter
data <- rbind(data, temp.data)
}}


I am not great with loops, and try and not use them much in my code, but I might not have a choice in this situation. If you think another way might be better, please do suggest it!



dput for base table (all customers)



    structure(list(Cust_ID = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 5, 5, 5, 5), 
Date = structure(c(1514764800, 1514851200, 1514937600, 1515024000,
1514764800, 1514851200, 1514937600, 1515024000,
1514764800, 1514851200, 1514937600, 1515024000,
1514764800, 1514851200, 1514937600, 1515024000),
class = c("POSIXct", "POSIXt"), tzone = "UTC"),
Quarter = c("2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
"2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
"2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
"2018 Q1"),
Sales = c(23, 22.2, 12.1, 14.1, 18, 18.8, 19.2, 19.8, 101, 55, 56,
55, NA, NA, 10.5, 11.1),
Action = c(NA, NA, 2, NA, NA, NA, NA, NA, NA, 2, NA, NA, NA, NA, NA, NA),
Link_Cust_ID = c(NA, NA, 5, NA, NA, NA, NA, NA, NA, 18, NA, NA, NA, NA, NA, NA)),
row.names = c(NA, -16L), class = c("tbl_df", "tbl", "data.frame"))


dput for top_customer tble:



structure(list(Cust_ID = c(1, 1, 1, 1, 3, 3, 3, 3), Date = 
structure(c(1514764800,
1514851200, 1514937600, 1515024000, 1514764800, 1514851200, 1514937600,
1515024000), class = c("POSIXct", "POSIXt"), tzone = "UTC"),
Quarter = c("2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
"2018 Q1", "2018 Q1", "2018 Q1"), Sales = c(23, 22.2, 12.1,
14.1, 101, 55, 56, 55), Action = c(NA, NA, 2, NA, NA, 2,
NA, NA), Link_Cust_ID = c(NA, NA, 5, NA, NA, 18, NA, NA)), row.names = c(NA,
-8L), class = c("tbl_df", "tbl", "data.frame"))









share|improve this question















First and foremost, thank you for taking your time to view/answer my question.



I asked this question before, albeit very un-clearly, however, I think I am close to a solution and hoping you can help me out!



I have 2 main df's: Customer (base table), and Top_Customers (a subset of the top n customers per quarter)



Both tables have the same layout, seen below:



Cust_ID   Date    QTR     Sales    Action   Link_Cust_ID
1 1/1/18 2018 Q1 23 NA NA
1 1/2/18 2018 Q1 22.2 NA NA
1 1/3/18 2018 Q1 12.1 2 5
1 1/4/18 2018 Q1 14.1 5 NA
3 1/1/18 2018 Q1 101 NA NA
3 1/2/18 2018 Q1 55 2 18
... ... ... ... ... ...


Sometimes - a customer might have reference to another customer - hence the link_cust_id column might be populated with the ID of another customer.
My Goal is: If the Action column == 2, I want to include that customer (from Link_Cust_ID) in my Top_Customer table from the date the link_cust_id is populated to the END OF THE QUARTER



For example, I would include Cust_ID = 5 from 1/3/18 to 3/31/18 (end of the quarter) from the above chart



I've been trying at this for a while now and have come up with the following code (which currently doesn't work, but I think the idea is there)



Linking_ID <- function(data)
{
if (data$link_type == 2)
{
temp.linkid <- data$link_cust_id[i] #stores the linked customer_id
temp.date <- data$Date[i] #stores the date linking occurs
temp.data <- customer_data %>% group_by(Quarter) %>% filter(customer_id = temp.linkid & Date >= temp.date)
#the above line of code is suppose to subset only link_customer_id data from the link_date to the end of the quarter
data <- rbind(data, temp.data)
}}


I am not great with loops, and try and not use them much in my code, but I might not have a choice in this situation. If you think another way might be better, please do suggest it!



dput for base table (all customers)



    structure(list(Cust_ID = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 5, 5, 5, 5), 
Date = structure(c(1514764800, 1514851200, 1514937600, 1515024000,
1514764800, 1514851200, 1514937600, 1515024000,
1514764800, 1514851200, 1514937600, 1515024000,
1514764800, 1514851200, 1514937600, 1515024000),
class = c("POSIXct", "POSIXt"), tzone = "UTC"),
Quarter = c("2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
"2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
"2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
"2018 Q1"),
Sales = c(23, 22.2, 12.1, 14.1, 18, 18.8, 19.2, 19.8, 101, 55, 56,
55, NA, NA, 10.5, 11.1),
Action = c(NA, NA, 2, NA, NA, NA, NA, NA, NA, 2, NA, NA, NA, NA, NA, NA),
Link_Cust_ID = c(NA, NA, 5, NA, NA, NA, NA, NA, NA, 18, NA, NA, NA, NA, NA, NA)),
row.names = c(NA, -16L), class = c("tbl_df", "tbl", "data.frame"))


dput for top_customer tble:



structure(list(Cust_ID = c(1, 1, 1, 1, 3, 3, 3, 3), Date = 
structure(c(1514764800,
1514851200, 1514937600, 1515024000, 1514764800, 1514851200, 1514937600,
1515024000), class = c("POSIXct", "POSIXt"), tzone = "UTC"),
Quarter = c("2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
"2018 Q1", "2018 Q1", "2018 Q1"), Sales = c(23, 22.2, 12.1,
14.1, 101, 55, 56, 55), Action = c(NA, NA, 2, NA, NA, 2,
NA, NA), Link_Cust_ID = c(NA, NA, 5, NA, NA, 18, NA, NA)), row.names = c(NA,
-8L), class = c("tbl_df", "tbl", "data.frame"))






r dplyr






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited 10 hours ago

























asked 10 hours ago









yungpadewon

376




376












  • can you share both dataframes with dput?
    – Mike
    10 hours ago










  • Hi @Mike, see the dput for the base table.. ill add the dput for the top_customer list soon
    – yungpadewon
    10 hours ago




















  • can you share both dataframes with dput?
    – Mike
    10 hours ago










  • Hi @Mike, see the dput for the base table.. ill add the dput for the top_customer list soon
    – yungpadewon
    10 hours ago


















can you share both dataframes with dput?
– Mike
10 hours ago




can you share both dataframes with dput?
– Mike
10 hours ago












Hi @Mike, see the dput for the base table.. ill add the dput for the top_customer list soon
– yungpadewon
10 hours ago






Hi @Mike, see the dput for the base table.. ill add the dput for the top_customer list soon
– yungpadewon
10 hours ago














2 Answers
2






active

oldest

votes

















up vote
1
down vote



accepted










You can do the following:



First, extract those customers that have a linking customer ID with Action 2 and set the Quarter end dates.



library(dplyr) 

link_ids_action2 <- customer %>%
select(Cust_ID, Action, Link_Cust_ID, Date, Quarter) %>%
filter(Action == 2) %>%
mutate(Quarter_end_dates = case_when(grepl("Q1", Quarter) ~ "2018-03-31",
grepl("Q2", Quarter) ~ "2018-06-30",
grepl("Q3", Quarter) ~ "2018-09-30",
grepl("Q4", Quarter) ~ "2018-12-31"),
Quarter_end_dates = as.Date(Quarter_end_dates)) %>%
select(Link_Cust_ID, New_Cust_ID = Cust_ID, Start_date = Date, Quarter_end_dates)

link_ids_action2
# A tibble: 2 x 4
# Link_Cust_ID New_Cust_ID Start_date Quarter_end_dates
# <dbl> <dbl> <dttm> <date>
# 1 5 1 2018-01-03 00:00:00 2018-03-31
# 2 18 3 2018-01-02 00:00:00 2018-03-31


Right join on the original data frame, which gives you only those which are the linked customers and filter by the dates. I had to add as.Dates to filter properly on your datetime. Select only those columns you need for your top_customer data.



new_top_customers <- 
right_join(customer, link_ids_action2, by = c("Cust_ID" = "Link_Cust_ID")) %>%
filter(as.Date(Date) >= as.Date(Start_date), as.Date(Date) <= Quarter_end_dates) %>%
select(Cust_ID, Date, Quarter, Sales, Action, Link_Cust_ID)

new_top_customers
# A tibble: 2 x 6
# Cust_ID Date Quarter Sales Action Link_Cust_ID
# <dbl> <dttm> <chr> <dbl> <dbl> <dbl>
# 1 5 2018-01-03 00:00:00 2018 Q1 10.5 NA NA
# 2 5 2018-01-04 00:00:00 2018 Q1 11.1 NA NA


No with bind_rows you can add the new top customers:



bind_rows(top_customer, new_top_customers)

# A tibble: 10 x 6
# Cust_ID Date Quarter Sales Action Link_Cust_ID
# <dbl> <dttm> <chr> <dbl> <dbl> <dbl>
# 1 1 2018-01-01 00:00:00 2018 Q1 23 NA NA
# 2 1 2018-01-02 00:00:00 2018 Q1 22.2 NA NA
# 3 1 2018-01-03 00:00:00 2018 Q1 12.1 2 5
# 4 1 2018-01-04 00:00:00 2018 Q1 14.1 NA NA
# 5 3 2018-01-01 00:00:00 2018 Q1 101 NA NA
# 6 3 2018-01-02 00:00:00 2018 Q1 55 2 18
# 7 3 2018-01-03 00:00:00 2018 Q1 56 NA NA
# 8 3 2018-01-04 00:00:00 2018 Q1 55 NA NA
# 9 5 2018-01-03 00:00:00 2018 Q1 10.5 NA NA
# 10 5 2018-01-04 00:00:00 2018 Q1 11.1 NA NA


Data



customer <- structure(list(Cust_ID = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 5, 5, 5, 5), 
Date = structure(c(1514764800, 1514851200, 1514937600, 1515024000,
1514764800, 1514851200, 1514937600, 1515024000,
1514764800, 1514851200, 1514937600, 1515024000,
1514764800, 1514851200, 1514937600, 1515024000),
class = c("POSIXct", "POSIXt"), tzone = "UTC"),
Quarter = c("2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
"2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
"2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
"2018 Q1"),
Sales = c(23, 22.2, 12.1, 14.1, 18, 18.8, 19.2, 19.8, 101, 55, 56,
55, NA, NA, 10.5, 11.1),
Action = c(NA, NA, 2, NA, NA, NA, NA, NA, NA, 2, NA, NA, NA, NA, NA, NA),
Link_Cust_ID = c(NA, NA, 5, NA, NA, NA, NA, NA, NA, 18, NA, NA, NA, NA, NA, NA)),
row.names = c(NA, -16L), class = c("tbl_df", "tbl", "data.frame"))


top_customer <- structure(list(Cust_ID = c(1, 1, 1, 1, 3, 3, 3, 3),
Date = structure(c(1514764800, 1514851200, 1514937600, 1515024000,
1514764800, 1514851200, 1514937600, 1515024000),
class = c("POSIXct", "POSIXt"), tzone = "UTC"),
Quarter = c("2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
"2018 Q1", "2018 Q1", "2018 Q1"),
Sales = c(23, 22.2, 12.1, 14.1, 101, 55, 56, 55),
Action = c(NA, NA, 2, NA, NA, 2, NA, NA),
Link_Cust_ID = c(NA, NA, 5, NA, NA, 18, NA, NA)),
row.names = c(NA, -8L), class = c("tbl_df", "tbl", "data.frame"))





share|improve this answer























  • Hi @kath, thank you very much. I am working with your solution, and I think with a bit more effort on my part - I can use your sample solution as a template ... I really appreciate the help!
    – yungpadewon
    7 hours ago


















up vote
0
down vote













I think this might help. You don't need to do loops here.



all_cust <-  structure(list(Cust_ID = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 5, 5, 5, 5), 
Date = structure(c(1514764800, 1514851200, 1514937600, 1515024000,
1514764800, 1514851200, 1514937600, 1515024000,
1514764800, 1514851200, 1514937600, 1515024000,
1514764800, 1514851200, 1514937600, 1515024000),
class = c("POSIXct", "POSIXt"), tzone = "UTC"),
Quarter = c("2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
"2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
"2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
"2018 Q1"),
Sales = c(23, 22.2, 12.1, 14.1, 18, 18.8, 19.2, 19.8, 101, 55, 56,
55, NA, NA, 10.5, 11.1),
Action = c(NA, NA, 2, NA, NA, NA, NA, NA, NA, 2, NA, NA, NA, NA, NA, NA),
Link_Cust_ID = c(NA, NA, 5, NA, NA, NA, NA, NA, NA, 18, NA, NA, NA, NA, NA, NA)),
row.names = c(NA, -16L), class = c("tbl_df", "tbl", "data.frame"))

top_cust <- structure(list(Cust_ID = c(1, 1, 1, 1, 3, 3, 3, 3), Date =
structure(c(1514764800,
1514851200, 1514937600, 1515024000, 1514764800, 1514851200, 1514937600,
1515024000), class = c("POSIXct", "POSIXt"), tzone = "UTC"),
Quarter = c("2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
"2018 Q1", "2018 Q1", "2018 Q1"), Sales = c(23, 22.2, 12.1,
14.1, 101, 55, 56, 55), Action = c(NA, NA, 2, NA, NA, 2,
NA, NA), Link_Cust_ID = c(NA, NA, 5, NA, NA, 18, NA, NA)), row.names = c(NA,
-8L), class = c("tbl_df", "tbl", "data.frame"))


library(dplyr)
#get where action is equal to 2
top_cust2 <- filter(top_cust, Action == 2)
#join on cust_id to link_cust_id
#Then filter where date is greater or equal too
#reference date and in the same quarter
bth <- inner_join(all_cust,top_cust2,
by =c("Cust_ID"="Link_Cust_ID")) %>%
filter(Date.x >= Date.y, Quarter.x == Quarter.y)
#remove .y columns from all_cust
bth <- bth[,!grepl(".y",colnames(bth))]
#drop .x from variable names
colnames(bth) <- gsub(".x","",colnames(bth))

cmb <- bind_rows(top_cust, bth)





share|improve this answer























    Your Answer






    StackExchange.ifUsing("editor", function () {
    StackExchange.using("externalEditor", function () {
    StackExchange.using("snippets", function () {
    StackExchange.snippets.init();
    });
    });
    }, "code-snippets");

    StackExchange.ready(function() {
    var channelOptions = {
    tags: "".split(" "),
    id: "1"
    };
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function() {
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled) {
    StackExchange.using("snippets", function() {
    createEditor();
    });
    }
    else {
    createEditor();
    }
    });

    function createEditor() {
    StackExchange.prepareEditor({
    heartbeatType: 'answer',
    convertImagesToLinks: true,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: 10,
    bindNavPrevention: true,
    postfix: "",
    imageUploader: {
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    },
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    });


    }
    });














     

    draft saved


    draft discarded


















    StackExchange.ready(
    function () {
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53266010%2fusing-loops-to-pull-in-data-for-a-certain-range-of-dates-depending-on-multiple-c%23new-answer', 'question_page');
    }
    );

    Post as a guest
































    2 Answers
    2






    active

    oldest

    votes








    2 Answers
    2






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes








    up vote
    1
    down vote



    accepted










    You can do the following:



    First, extract those customers that have a linking customer ID with Action 2 and set the Quarter end dates.



    library(dplyr) 

    link_ids_action2 <- customer %>%
    select(Cust_ID, Action, Link_Cust_ID, Date, Quarter) %>%
    filter(Action == 2) %>%
    mutate(Quarter_end_dates = case_when(grepl("Q1", Quarter) ~ "2018-03-31",
    grepl("Q2", Quarter) ~ "2018-06-30",
    grepl("Q3", Quarter) ~ "2018-09-30",
    grepl("Q4", Quarter) ~ "2018-12-31"),
    Quarter_end_dates = as.Date(Quarter_end_dates)) %>%
    select(Link_Cust_ID, New_Cust_ID = Cust_ID, Start_date = Date, Quarter_end_dates)

    link_ids_action2
    # A tibble: 2 x 4
    # Link_Cust_ID New_Cust_ID Start_date Quarter_end_dates
    # <dbl> <dbl> <dttm> <date>
    # 1 5 1 2018-01-03 00:00:00 2018-03-31
    # 2 18 3 2018-01-02 00:00:00 2018-03-31


    Right join on the original data frame, which gives you only those which are the linked customers and filter by the dates. I had to add as.Dates to filter properly on your datetime. Select only those columns you need for your top_customer data.



    new_top_customers <- 
    right_join(customer, link_ids_action2, by = c("Cust_ID" = "Link_Cust_ID")) %>%
    filter(as.Date(Date) >= as.Date(Start_date), as.Date(Date) <= Quarter_end_dates) %>%
    select(Cust_ID, Date, Quarter, Sales, Action, Link_Cust_ID)

    new_top_customers
    # A tibble: 2 x 6
    # Cust_ID Date Quarter Sales Action Link_Cust_ID
    # <dbl> <dttm> <chr> <dbl> <dbl> <dbl>
    # 1 5 2018-01-03 00:00:00 2018 Q1 10.5 NA NA
    # 2 5 2018-01-04 00:00:00 2018 Q1 11.1 NA NA


    No with bind_rows you can add the new top customers:



    bind_rows(top_customer, new_top_customers)

    # A tibble: 10 x 6
    # Cust_ID Date Quarter Sales Action Link_Cust_ID
    # <dbl> <dttm> <chr> <dbl> <dbl> <dbl>
    # 1 1 2018-01-01 00:00:00 2018 Q1 23 NA NA
    # 2 1 2018-01-02 00:00:00 2018 Q1 22.2 NA NA
    # 3 1 2018-01-03 00:00:00 2018 Q1 12.1 2 5
    # 4 1 2018-01-04 00:00:00 2018 Q1 14.1 NA NA
    # 5 3 2018-01-01 00:00:00 2018 Q1 101 NA NA
    # 6 3 2018-01-02 00:00:00 2018 Q1 55 2 18
    # 7 3 2018-01-03 00:00:00 2018 Q1 56 NA NA
    # 8 3 2018-01-04 00:00:00 2018 Q1 55 NA NA
    # 9 5 2018-01-03 00:00:00 2018 Q1 10.5 NA NA
    # 10 5 2018-01-04 00:00:00 2018 Q1 11.1 NA NA


    Data



    customer <- structure(list(Cust_ID = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 5, 5, 5, 5), 
    Date = structure(c(1514764800, 1514851200, 1514937600, 1515024000,
    1514764800, 1514851200, 1514937600, 1515024000,
    1514764800, 1514851200, 1514937600, 1515024000,
    1514764800, 1514851200, 1514937600, 1515024000),
    class = c("POSIXct", "POSIXt"), tzone = "UTC"),
    Quarter = c("2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
    "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
    "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
    "2018 Q1"),
    Sales = c(23, 22.2, 12.1, 14.1, 18, 18.8, 19.2, 19.8, 101, 55, 56,
    55, NA, NA, 10.5, 11.1),
    Action = c(NA, NA, 2, NA, NA, NA, NA, NA, NA, 2, NA, NA, NA, NA, NA, NA),
    Link_Cust_ID = c(NA, NA, 5, NA, NA, NA, NA, NA, NA, 18, NA, NA, NA, NA, NA, NA)),
    row.names = c(NA, -16L), class = c("tbl_df", "tbl", "data.frame"))


    top_customer <- structure(list(Cust_ID = c(1, 1, 1, 1, 3, 3, 3, 3),
    Date = structure(c(1514764800, 1514851200, 1514937600, 1515024000,
    1514764800, 1514851200, 1514937600, 1515024000),
    class = c("POSIXct", "POSIXt"), tzone = "UTC"),
    Quarter = c("2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
    "2018 Q1", "2018 Q1", "2018 Q1"),
    Sales = c(23, 22.2, 12.1, 14.1, 101, 55, 56, 55),
    Action = c(NA, NA, 2, NA, NA, 2, NA, NA),
    Link_Cust_ID = c(NA, NA, 5, NA, NA, 18, NA, NA)),
    row.names = c(NA, -8L), class = c("tbl_df", "tbl", "data.frame"))





    share|improve this answer























    • Hi @kath, thank you very much. I am working with your solution, and I think with a bit more effort on my part - I can use your sample solution as a template ... I really appreciate the help!
      – yungpadewon
      7 hours ago















    up vote
    1
    down vote



    accepted










    You can do the following:



    First, extract those customers that have a linking customer ID with Action 2 and set the Quarter end dates.



    library(dplyr) 

    link_ids_action2 <- customer %>%
    select(Cust_ID, Action, Link_Cust_ID, Date, Quarter) %>%
    filter(Action == 2) %>%
    mutate(Quarter_end_dates = case_when(grepl("Q1", Quarter) ~ "2018-03-31",
    grepl("Q2", Quarter) ~ "2018-06-30",
    grepl("Q3", Quarter) ~ "2018-09-30",
    grepl("Q4", Quarter) ~ "2018-12-31"),
    Quarter_end_dates = as.Date(Quarter_end_dates)) %>%
    select(Link_Cust_ID, New_Cust_ID = Cust_ID, Start_date = Date, Quarter_end_dates)

    link_ids_action2
    # A tibble: 2 x 4
    # Link_Cust_ID New_Cust_ID Start_date Quarter_end_dates
    # <dbl> <dbl> <dttm> <date>
    # 1 5 1 2018-01-03 00:00:00 2018-03-31
    # 2 18 3 2018-01-02 00:00:00 2018-03-31


    Right join on the original data frame, which gives you only those which are the linked customers and filter by the dates. I had to add as.Dates to filter properly on your datetime. Select only those columns you need for your top_customer data.



    new_top_customers <- 
    right_join(customer, link_ids_action2, by = c("Cust_ID" = "Link_Cust_ID")) %>%
    filter(as.Date(Date) >= as.Date(Start_date), as.Date(Date) <= Quarter_end_dates) %>%
    select(Cust_ID, Date, Quarter, Sales, Action, Link_Cust_ID)

    new_top_customers
    # A tibble: 2 x 6
    # Cust_ID Date Quarter Sales Action Link_Cust_ID
    # <dbl> <dttm> <chr> <dbl> <dbl> <dbl>
    # 1 5 2018-01-03 00:00:00 2018 Q1 10.5 NA NA
    # 2 5 2018-01-04 00:00:00 2018 Q1 11.1 NA NA


    No with bind_rows you can add the new top customers:



    bind_rows(top_customer, new_top_customers)

    # A tibble: 10 x 6
    # Cust_ID Date Quarter Sales Action Link_Cust_ID
    # <dbl> <dttm> <chr> <dbl> <dbl> <dbl>
    # 1 1 2018-01-01 00:00:00 2018 Q1 23 NA NA
    # 2 1 2018-01-02 00:00:00 2018 Q1 22.2 NA NA
    # 3 1 2018-01-03 00:00:00 2018 Q1 12.1 2 5
    # 4 1 2018-01-04 00:00:00 2018 Q1 14.1 NA NA
    # 5 3 2018-01-01 00:00:00 2018 Q1 101 NA NA
    # 6 3 2018-01-02 00:00:00 2018 Q1 55 2 18
    # 7 3 2018-01-03 00:00:00 2018 Q1 56 NA NA
    # 8 3 2018-01-04 00:00:00 2018 Q1 55 NA NA
    # 9 5 2018-01-03 00:00:00 2018 Q1 10.5 NA NA
    # 10 5 2018-01-04 00:00:00 2018 Q1 11.1 NA NA


    Data



    customer <- structure(list(Cust_ID = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 5, 5, 5, 5), 
    Date = structure(c(1514764800, 1514851200, 1514937600, 1515024000,
    1514764800, 1514851200, 1514937600, 1515024000,
    1514764800, 1514851200, 1514937600, 1515024000,
    1514764800, 1514851200, 1514937600, 1515024000),
    class = c("POSIXct", "POSIXt"), tzone = "UTC"),
    Quarter = c("2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
    "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
    "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
    "2018 Q1"),
    Sales = c(23, 22.2, 12.1, 14.1, 18, 18.8, 19.2, 19.8, 101, 55, 56,
    55, NA, NA, 10.5, 11.1),
    Action = c(NA, NA, 2, NA, NA, NA, NA, NA, NA, 2, NA, NA, NA, NA, NA, NA),
    Link_Cust_ID = c(NA, NA, 5, NA, NA, NA, NA, NA, NA, 18, NA, NA, NA, NA, NA, NA)),
    row.names = c(NA, -16L), class = c("tbl_df", "tbl", "data.frame"))


    top_customer <- structure(list(Cust_ID = c(1, 1, 1, 1, 3, 3, 3, 3),
    Date = structure(c(1514764800, 1514851200, 1514937600, 1515024000,
    1514764800, 1514851200, 1514937600, 1515024000),
    class = c("POSIXct", "POSIXt"), tzone = "UTC"),
    Quarter = c("2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
    "2018 Q1", "2018 Q1", "2018 Q1"),
    Sales = c(23, 22.2, 12.1, 14.1, 101, 55, 56, 55),
    Action = c(NA, NA, 2, NA, NA, 2, NA, NA),
    Link_Cust_ID = c(NA, NA, 5, NA, NA, 18, NA, NA)),
    row.names = c(NA, -8L), class = c("tbl_df", "tbl", "data.frame"))





    share|improve this answer























    • Hi @kath, thank you very much. I am working with your solution, and I think with a bit more effort on my part - I can use your sample solution as a template ... I really appreciate the help!
      – yungpadewon
      7 hours ago













    up vote
    1
    down vote



    accepted







    up vote
    1
    down vote



    accepted






    You can do the following:



    First, extract those customers that have a linking customer ID with Action 2 and set the Quarter end dates.



    library(dplyr) 

    link_ids_action2 <- customer %>%
    select(Cust_ID, Action, Link_Cust_ID, Date, Quarter) %>%
    filter(Action == 2) %>%
    mutate(Quarter_end_dates = case_when(grepl("Q1", Quarter) ~ "2018-03-31",
    grepl("Q2", Quarter) ~ "2018-06-30",
    grepl("Q3", Quarter) ~ "2018-09-30",
    grepl("Q4", Quarter) ~ "2018-12-31"),
    Quarter_end_dates = as.Date(Quarter_end_dates)) %>%
    select(Link_Cust_ID, New_Cust_ID = Cust_ID, Start_date = Date, Quarter_end_dates)

    link_ids_action2
    # A tibble: 2 x 4
    # Link_Cust_ID New_Cust_ID Start_date Quarter_end_dates
    # <dbl> <dbl> <dttm> <date>
    # 1 5 1 2018-01-03 00:00:00 2018-03-31
    # 2 18 3 2018-01-02 00:00:00 2018-03-31


    Right join on the original data frame, which gives you only those which are the linked customers and filter by the dates. I had to add as.Dates to filter properly on your datetime. Select only those columns you need for your top_customer data.



    new_top_customers <- 
    right_join(customer, link_ids_action2, by = c("Cust_ID" = "Link_Cust_ID")) %>%
    filter(as.Date(Date) >= as.Date(Start_date), as.Date(Date) <= Quarter_end_dates) %>%
    select(Cust_ID, Date, Quarter, Sales, Action, Link_Cust_ID)

    new_top_customers
    # A tibble: 2 x 6
    # Cust_ID Date Quarter Sales Action Link_Cust_ID
    # <dbl> <dttm> <chr> <dbl> <dbl> <dbl>
    # 1 5 2018-01-03 00:00:00 2018 Q1 10.5 NA NA
    # 2 5 2018-01-04 00:00:00 2018 Q1 11.1 NA NA


    No with bind_rows you can add the new top customers:



    bind_rows(top_customer, new_top_customers)

    # A tibble: 10 x 6
    # Cust_ID Date Quarter Sales Action Link_Cust_ID
    # <dbl> <dttm> <chr> <dbl> <dbl> <dbl>
    # 1 1 2018-01-01 00:00:00 2018 Q1 23 NA NA
    # 2 1 2018-01-02 00:00:00 2018 Q1 22.2 NA NA
    # 3 1 2018-01-03 00:00:00 2018 Q1 12.1 2 5
    # 4 1 2018-01-04 00:00:00 2018 Q1 14.1 NA NA
    # 5 3 2018-01-01 00:00:00 2018 Q1 101 NA NA
    # 6 3 2018-01-02 00:00:00 2018 Q1 55 2 18
    # 7 3 2018-01-03 00:00:00 2018 Q1 56 NA NA
    # 8 3 2018-01-04 00:00:00 2018 Q1 55 NA NA
    # 9 5 2018-01-03 00:00:00 2018 Q1 10.5 NA NA
    # 10 5 2018-01-04 00:00:00 2018 Q1 11.1 NA NA


    Data



    customer <- structure(list(Cust_ID = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 5, 5, 5, 5), 
    Date = structure(c(1514764800, 1514851200, 1514937600, 1515024000,
    1514764800, 1514851200, 1514937600, 1515024000,
    1514764800, 1514851200, 1514937600, 1515024000,
    1514764800, 1514851200, 1514937600, 1515024000),
    class = c("POSIXct", "POSIXt"), tzone = "UTC"),
    Quarter = c("2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
    "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
    "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
    "2018 Q1"),
    Sales = c(23, 22.2, 12.1, 14.1, 18, 18.8, 19.2, 19.8, 101, 55, 56,
    55, NA, NA, 10.5, 11.1),
    Action = c(NA, NA, 2, NA, NA, NA, NA, NA, NA, 2, NA, NA, NA, NA, NA, NA),
    Link_Cust_ID = c(NA, NA, 5, NA, NA, NA, NA, NA, NA, 18, NA, NA, NA, NA, NA, NA)),
    row.names = c(NA, -16L), class = c("tbl_df", "tbl", "data.frame"))


    top_customer <- structure(list(Cust_ID = c(1, 1, 1, 1, 3, 3, 3, 3),
    Date = structure(c(1514764800, 1514851200, 1514937600, 1515024000,
    1514764800, 1514851200, 1514937600, 1515024000),
    class = c("POSIXct", "POSIXt"), tzone = "UTC"),
    Quarter = c("2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
    "2018 Q1", "2018 Q1", "2018 Q1"),
    Sales = c(23, 22.2, 12.1, 14.1, 101, 55, 56, 55),
    Action = c(NA, NA, 2, NA, NA, 2, NA, NA),
    Link_Cust_ID = c(NA, NA, 5, NA, NA, 18, NA, NA)),
    row.names = c(NA, -8L), class = c("tbl_df", "tbl", "data.frame"))





    share|improve this answer














    You can do the following:



    First, extract those customers that have a linking customer ID with Action 2 and set the Quarter end dates.



    library(dplyr) 

    link_ids_action2 <- customer %>%
    select(Cust_ID, Action, Link_Cust_ID, Date, Quarter) %>%
    filter(Action == 2) %>%
    mutate(Quarter_end_dates = case_when(grepl("Q1", Quarter) ~ "2018-03-31",
    grepl("Q2", Quarter) ~ "2018-06-30",
    grepl("Q3", Quarter) ~ "2018-09-30",
    grepl("Q4", Quarter) ~ "2018-12-31"),
    Quarter_end_dates = as.Date(Quarter_end_dates)) %>%
    select(Link_Cust_ID, New_Cust_ID = Cust_ID, Start_date = Date, Quarter_end_dates)

    link_ids_action2
    # A tibble: 2 x 4
    # Link_Cust_ID New_Cust_ID Start_date Quarter_end_dates
    # <dbl> <dbl> <dttm> <date>
    # 1 5 1 2018-01-03 00:00:00 2018-03-31
    # 2 18 3 2018-01-02 00:00:00 2018-03-31


    Right join on the original data frame, which gives you only those which are the linked customers and filter by the dates. I had to add as.Dates to filter properly on your datetime. Select only those columns you need for your top_customer data.



    new_top_customers <- 
    right_join(customer, link_ids_action2, by = c("Cust_ID" = "Link_Cust_ID")) %>%
    filter(as.Date(Date) >= as.Date(Start_date), as.Date(Date) <= Quarter_end_dates) %>%
    select(Cust_ID, Date, Quarter, Sales, Action, Link_Cust_ID)

    new_top_customers
    # A tibble: 2 x 6
    # Cust_ID Date Quarter Sales Action Link_Cust_ID
    # <dbl> <dttm> <chr> <dbl> <dbl> <dbl>
    # 1 5 2018-01-03 00:00:00 2018 Q1 10.5 NA NA
    # 2 5 2018-01-04 00:00:00 2018 Q1 11.1 NA NA


    No with bind_rows you can add the new top customers:



    bind_rows(top_customer, new_top_customers)

    # A tibble: 10 x 6
    # Cust_ID Date Quarter Sales Action Link_Cust_ID
    # <dbl> <dttm> <chr> <dbl> <dbl> <dbl>
    # 1 1 2018-01-01 00:00:00 2018 Q1 23 NA NA
    # 2 1 2018-01-02 00:00:00 2018 Q1 22.2 NA NA
    # 3 1 2018-01-03 00:00:00 2018 Q1 12.1 2 5
    # 4 1 2018-01-04 00:00:00 2018 Q1 14.1 NA NA
    # 5 3 2018-01-01 00:00:00 2018 Q1 101 NA NA
    # 6 3 2018-01-02 00:00:00 2018 Q1 55 2 18
    # 7 3 2018-01-03 00:00:00 2018 Q1 56 NA NA
    # 8 3 2018-01-04 00:00:00 2018 Q1 55 NA NA
    # 9 5 2018-01-03 00:00:00 2018 Q1 10.5 NA NA
    # 10 5 2018-01-04 00:00:00 2018 Q1 11.1 NA NA


    Data



    customer <- structure(list(Cust_ID = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 5, 5, 5, 5), 
    Date = structure(c(1514764800, 1514851200, 1514937600, 1515024000,
    1514764800, 1514851200, 1514937600, 1515024000,
    1514764800, 1514851200, 1514937600, 1515024000,
    1514764800, 1514851200, 1514937600, 1515024000),
    class = c("POSIXct", "POSIXt"), tzone = "UTC"),
    Quarter = c("2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
    "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
    "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
    "2018 Q1"),
    Sales = c(23, 22.2, 12.1, 14.1, 18, 18.8, 19.2, 19.8, 101, 55, 56,
    55, NA, NA, 10.5, 11.1),
    Action = c(NA, NA, 2, NA, NA, NA, NA, NA, NA, 2, NA, NA, NA, NA, NA, NA),
    Link_Cust_ID = c(NA, NA, 5, NA, NA, NA, NA, NA, NA, 18, NA, NA, NA, NA, NA, NA)),
    row.names = c(NA, -16L), class = c("tbl_df", "tbl", "data.frame"))


    top_customer <- structure(list(Cust_ID = c(1, 1, 1, 1, 3, 3, 3, 3),
    Date = structure(c(1514764800, 1514851200, 1514937600, 1515024000,
    1514764800, 1514851200, 1514937600, 1515024000),
    class = c("POSIXct", "POSIXt"), tzone = "UTC"),
    Quarter = c("2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
    "2018 Q1", "2018 Q1", "2018 Q1"),
    Sales = c(23, 22.2, 12.1, 14.1, 101, 55, 56, 55),
    Action = c(NA, NA, 2, NA, NA, 2, NA, NA),
    Link_Cust_ID = c(NA, NA, 5, NA, NA, 18, NA, NA)),
    row.names = c(NA, -8L), class = c("tbl_df", "tbl", "data.frame"))






    share|improve this answer














    share|improve this answer



    share|improve this answer








    edited 9 hours ago

























    answered 10 hours ago









    kath

    3,700724




    3,700724












    • Hi @kath, thank you very much. I am working with your solution, and I think with a bit more effort on my part - I can use your sample solution as a template ... I really appreciate the help!
      – yungpadewon
      7 hours ago


















    • Hi @kath, thank you very much. I am working with your solution, and I think with a bit more effort on my part - I can use your sample solution as a template ... I really appreciate the help!
      – yungpadewon
      7 hours ago
















    Hi @kath, thank you very much. I am working with your solution, and I think with a bit more effort on my part - I can use your sample solution as a template ... I really appreciate the help!
    – yungpadewon
    7 hours ago




    Hi @kath, thank you very much. I am working with your solution, and I think with a bit more effort on my part - I can use your sample solution as a template ... I really appreciate the help!
    – yungpadewon
    7 hours ago












    up vote
    0
    down vote













    I think this might help. You don't need to do loops here.



    all_cust <-  structure(list(Cust_ID = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 5, 5, 5, 5), 
    Date = structure(c(1514764800, 1514851200, 1514937600, 1515024000,
    1514764800, 1514851200, 1514937600, 1515024000,
    1514764800, 1514851200, 1514937600, 1515024000,
    1514764800, 1514851200, 1514937600, 1515024000),
    class = c("POSIXct", "POSIXt"), tzone = "UTC"),
    Quarter = c("2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
    "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
    "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
    "2018 Q1"),
    Sales = c(23, 22.2, 12.1, 14.1, 18, 18.8, 19.2, 19.8, 101, 55, 56,
    55, NA, NA, 10.5, 11.1),
    Action = c(NA, NA, 2, NA, NA, NA, NA, NA, NA, 2, NA, NA, NA, NA, NA, NA),
    Link_Cust_ID = c(NA, NA, 5, NA, NA, NA, NA, NA, NA, 18, NA, NA, NA, NA, NA, NA)),
    row.names = c(NA, -16L), class = c("tbl_df", "tbl", "data.frame"))

    top_cust <- structure(list(Cust_ID = c(1, 1, 1, 1, 3, 3, 3, 3), Date =
    structure(c(1514764800,
    1514851200, 1514937600, 1515024000, 1514764800, 1514851200, 1514937600,
    1515024000), class = c("POSIXct", "POSIXt"), tzone = "UTC"),
    Quarter = c("2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
    "2018 Q1", "2018 Q1", "2018 Q1"), Sales = c(23, 22.2, 12.1,
    14.1, 101, 55, 56, 55), Action = c(NA, NA, 2, NA, NA, 2,
    NA, NA), Link_Cust_ID = c(NA, NA, 5, NA, NA, 18, NA, NA)), row.names = c(NA,
    -8L), class = c("tbl_df", "tbl", "data.frame"))


    library(dplyr)
    #get where action is equal to 2
    top_cust2 <- filter(top_cust, Action == 2)
    #join on cust_id to link_cust_id
    #Then filter where date is greater or equal too
    #reference date and in the same quarter
    bth <- inner_join(all_cust,top_cust2,
    by =c("Cust_ID"="Link_Cust_ID")) %>%
    filter(Date.x >= Date.y, Quarter.x == Quarter.y)
    #remove .y columns from all_cust
    bth <- bth[,!grepl(".y",colnames(bth))]
    #drop .x from variable names
    colnames(bth) <- gsub(".x","",colnames(bth))

    cmb <- bind_rows(top_cust, bth)





    share|improve this answer



























      up vote
      0
      down vote













      I think this might help. You don't need to do loops here.



      all_cust <-  structure(list(Cust_ID = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 5, 5, 5, 5), 
      Date = structure(c(1514764800, 1514851200, 1514937600, 1515024000,
      1514764800, 1514851200, 1514937600, 1515024000,
      1514764800, 1514851200, 1514937600, 1515024000,
      1514764800, 1514851200, 1514937600, 1515024000),
      class = c("POSIXct", "POSIXt"), tzone = "UTC"),
      Quarter = c("2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
      "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
      "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
      "2018 Q1"),
      Sales = c(23, 22.2, 12.1, 14.1, 18, 18.8, 19.2, 19.8, 101, 55, 56,
      55, NA, NA, 10.5, 11.1),
      Action = c(NA, NA, 2, NA, NA, NA, NA, NA, NA, 2, NA, NA, NA, NA, NA, NA),
      Link_Cust_ID = c(NA, NA, 5, NA, NA, NA, NA, NA, NA, 18, NA, NA, NA, NA, NA, NA)),
      row.names = c(NA, -16L), class = c("tbl_df", "tbl", "data.frame"))

      top_cust <- structure(list(Cust_ID = c(1, 1, 1, 1, 3, 3, 3, 3), Date =
      structure(c(1514764800,
      1514851200, 1514937600, 1515024000, 1514764800, 1514851200, 1514937600,
      1515024000), class = c("POSIXct", "POSIXt"), tzone = "UTC"),
      Quarter = c("2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
      "2018 Q1", "2018 Q1", "2018 Q1"), Sales = c(23, 22.2, 12.1,
      14.1, 101, 55, 56, 55), Action = c(NA, NA, 2, NA, NA, 2,
      NA, NA), Link_Cust_ID = c(NA, NA, 5, NA, NA, 18, NA, NA)), row.names = c(NA,
      -8L), class = c("tbl_df", "tbl", "data.frame"))


      library(dplyr)
      #get where action is equal to 2
      top_cust2 <- filter(top_cust, Action == 2)
      #join on cust_id to link_cust_id
      #Then filter where date is greater or equal too
      #reference date and in the same quarter
      bth <- inner_join(all_cust,top_cust2,
      by =c("Cust_ID"="Link_Cust_ID")) %>%
      filter(Date.x >= Date.y, Quarter.x == Quarter.y)
      #remove .y columns from all_cust
      bth <- bth[,!grepl(".y",colnames(bth))]
      #drop .x from variable names
      colnames(bth) <- gsub(".x","",colnames(bth))

      cmb <- bind_rows(top_cust, bth)





      share|improve this answer

























        up vote
        0
        down vote










        up vote
        0
        down vote









        I think this might help. You don't need to do loops here.



        all_cust <-  structure(list(Cust_ID = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 5, 5, 5, 5), 
        Date = structure(c(1514764800, 1514851200, 1514937600, 1515024000,
        1514764800, 1514851200, 1514937600, 1515024000,
        1514764800, 1514851200, 1514937600, 1515024000,
        1514764800, 1514851200, 1514937600, 1515024000),
        class = c("POSIXct", "POSIXt"), tzone = "UTC"),
        Quarter = c("2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
        "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
        "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
        "2018 Q1"),
        Sales = c(23, 22.2, 12.1, 14.1, 18, 18.8, 19.2, 19.8, 101, 55, 56,
        55, NA, NA, 10.5, 11.1),
        Action = c(NA, NA, 2, NA, NA, NA, NA, NA, NA, 2, NA, NA, NA, NA, NA, NA),
        Link_Cust_ID = c(NA, NA, 5, NA, NA, NA, NA, NA, NA, 18, NA, NA, NA, NA, NA, NA)),
        row.names = c(NA, -16L), class = c("tbl_df", "tbl", "data.frame"))

        top_cust <- structure(list(Cust_ID = c(1, 1, 1, 1, 3, 3, 3, 3), Date =
        structure(c(1514764800,
        1514851200, 1514937600, 1515024000, 1514764800, 1514851200, 1514937600,
        1515024000), class = c("POSIXct", "POSIXt"), tzone = "UTC"),
        Quarter = c("2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
        "2018 Q1", "2018 Q1", "2018 Q1"), Sales = c(23, 22.2, 12.1,
        14.1, 101, 55, 56, 55), Action = c(NA, NA, 2, NA, NA, 2,
        NA, NA), Link_Cust_ID = c(NA, NA, 5, NA, NA, 18, NA, NA)), row.names = c(NA,
        -8L), class = c("tbl_df", "tbl", "data.frame"))


        library(dplyr)
        #get where action is equal to 2
        top_cust2 <- filter(top_cust, Action == 2)
        #join on cust_id to link_cust_id
        #Then filter where date is greater or equal too
        #reference date and in the same quarter
        bth <- inner_join(all_cust,top_cust2,
        by =c("Cust_ID"="Link_Cust_ID")) %>%
        filter(Date.x >= Date.y, Quarter.x == Quarter.y)
        #remove .y columns from all_cust
        bth <- bth[,!grepl(".y",colnames(bth))]
        #drop .x from variable names
        colnames(bth) <- gsub(".x","",colnames(bth))

        cmb <- bind_rows(top_cust, bth)





        share|improve this answer














        I think this might help. You don't need to do loops here.



        all_cust <-  structure(list(Cust_ID = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 5, 5, 5, 5), 
        Date = structure(c(1514764800, 1514851200, 1514937600, 1515024000,
        1514764800, 1514851200, 1514937600, 1515024000,
        1514764800, 1514851200, 1514937600, 1515024000,
        1514764800, 1514851200, 1514937600, 1515024000),
        class = c("POSIXct", "POSIXt"), tzone = "UTC"),
        Quarter = c("2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
        "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
        "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
        "2018 Q1"),
        Sales = c(23, 22.2, 12.1, 14.1, 18, 18.8, 19.2, 19.8, 101, 55, 56,
        55, NA, NA, 10.5, 11.1),
        Action = c(NA, NA, 2, NA, NA, NA, NA, NA, NA, 2, NA, NA, NA, NA, NA, NA),
        Link_Cust_ID = c(NA, NA, 5, NA, NA, NA, NA, NA, NA, 18, NA, NA, NA, NA, NA, NA)),
        row.names = c(NA, -16L), class = c("tbl_df", "tbl", "data.frame"))

        top_cust <- structure(list(Cust_ID = c(1, 1, 1, 1, 3, 3, 3, 3), Date =
        structure(c(1514764800,
        1514851200, 1514937600, 1515024000, 1514764800, 1514851200, 1514937600,
        1515024000), class = c("POSIXct", "POSIXt"), tzone = "UTC"),
        Quarter = c("2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1", "2018 Q1",
        "2018 Q1", "2018 Q1", "2018 Q1"), Sales = c(23, 22.2, 12.1,
        14.1, 101, 55, 56, 55), Action = c(NA, NA, 2, NA, NA, 2,
        NA, NA), Link_Cust_ID = c(NA, NA, 5, NA, NA, 18, NA, NA)), row.names = c(NA,
        -8L), class = c("tbl_df", "tbl", "data.frame"))


        library(dplyr)
        #get where action is equal to 2
        top_cust2 <- filter(top_cust, Action == 2)
        #join on cust_id to link_cust_id
        #Then filter where date is greater or equal too
        #reference date and in the same quarter
        bth <- inner_join(all_cust,top_cust2,
        by =c("Cust_ID"="Link_Cust_ID")) %>%
        filter(Date.x >= Date.y, Quarter.x == Quarter.y)
        #remove .y columns from all_cust
        bth <- bth[,!grepl(".y",colnames(bth))]
        #drop .x from variable names
        colnames(bth) <- gsub(".x","",colnames(bth))

        cmb <- bind_rows(top_cust, bth)






        share|improve this answer














        share|improve this answer



        share|improve this answer








        edited 9 hours ago

























        answered 10 hours ago









        Mike

        697316




        697316






























             

            draft saved


            draft discarded



















































             


            draft saved


            draft discarded














            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53266010%2fusing-loops-to-pull-in-data-for-a-certain-range-of-dates-depending-on-multiple-c%23new-answer', 'question_page');
            }
            );

            Post as a guest




















































































            Popular posts from this blog

            mysqli_query(): Empty query in /home/lucindabrummitt/public_html/blog/wp-includes/wp-db.php on line 1924

            How to change which sound is reproduced for terminal bell?

            Can I use Tabulator js library in my java Spring + Thymeleaf project?