Plotting the duration of exceedances of a pollutant standard in ggplot2
I have a data frame with a date-time column and a column of calculated concentrations of a parameter. I am trying to make a time-series plot that shows all the concentrations as scatter points, plus a horizontal line at the pollutant standard (which is 500). I can do that with no problem. What I can't work out is how to plot a line showing the duration of each exceedance (> 500). I haven't found anything that answers this, and I would appreciate any guidance.
Sample Data:
df<-structure(list(Date_Time = structure(c(1480093200, 1482660000,
1395651343, 1329823800, 1326929400, 1331233200, 1490130000, 1476138600,
1474070400, 1489393800, 1483272000, 1393515068, 1480471200, 1332680400,
1471226400, 1470853800, 1396124591, 1496250000, 1394581991, 1438177553,
1332108000, 1493051400, 1475949600, 1491024600, 1488832200, 1473697800,
1475404200, 1488511800, 1490212800, 1477040400, 1494793740, 1389346885,
1473933600, 1390611191, 1486551600, 1476475200, 1473593400, 1388854543,
1327012200, 1493611140), class = c("POSIXct", "POSIXt"), tzone = "UTC"),
Calculated_TDS = c(271.3692, 634.3604, 634.246, 219.546,
674.286, 169.21, 506.118, 452.6932, 314.8412, 4640.3052,
358.0844, 734.918, 97.71, 460.358, 385.998, 283.9532, 370.554,
309.2356, 296.766, 137.079616, 24.494, 383.996, 321.2476,
784.6248, 642.1396, 1320.7032, 213.254, 462.1884, 547.6452,
376.274, 195.1216, 595.35, 320.1608, 411.166, 882.5512, 288.5292,
533.574, 1000.326, 124.022, 256.6116)), row.names = c(NA,
-40L), class = c("tbl_df", "tbl", "data.frame"), .Names = c("Date_Time",
"Calculated_TDS"))
Code:
library(tidyverse)
test_df <- df %>%
  mutate(greater = Calculated_TDS > 500) %>%
  group_by(Date_Time, Calculated_TDS) %>%
  summarize(n_greater = sum(greater), duration = length(Date_Time))
plot <- ggplot() +
  geom_point(data = test_df, aes(x = Date_Time, y = Calculated_TDS)) +
  geom_line(data = test_df, aes(x = Date_Time, y = duration), stat = "identity") +
  geom_hline(aes(yintercept = 500, color = "red"), size = 1.3)
plot
I know what I have doesn't make sense but I don't understand how to find the duration of exceedances.
r ggplot2 dplyr time-series
FYI, your code didn't run. Can you draw your expected output in Paint or Word?
– Tung
Nov 19 '18 at 17:49
Are you looking for something similar to flow duration curve? stackoverflow.com/questions/52831687/…
– Tung
Nov 19 '18 at 17:51
@Tung fixed code sorry, should work now. That's not quite what I want. I will try to draw expected output
– NBE
Nov 19 '18 at 17:53
@Tung I included expected output. The lines should be the time duration of each event that went over the standard of 500.
– NBE
Nov 19 '18 at 18:04
Can you pls define for us what an "event" is? Let's say a reading is over 500, as is the case about 35% of the time in your sample. When should the duration start and when should it end? As long as the readings continue to be over 500? In your drawing it looks like there is grouping that includes readings before and after the > 500 readings.
– Jon Spring
Nov 19 '18 at 18:31
Question asked Nov 19 '18 at 17:29 by NBE; edited Nov 19 '18 at 18:03
2 Answers
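Andrew Lavers' answer is a good one. An alternative approach is to make groups for your geom_line and subset the data within that geom. This uses test_df as built in the question, which summarize() leaves sorted by Date_Time.
# Start a new group each time n_greater changes, so consecutive exceedances share a group
test_df$group <- paste0("Group_", cumsum(c(1, diff(test_df$n_greater) != 0)))
# Hours until the next reading, for exceedances only; diff() returns one value fewer
# than there are rows, so pad with NA to keep the rows aligned
test_df$duration <- ifelse(test_df$n_greater == 1, c(as.numeric(diff(test_df$Date_Time), units = "hours"), NA), 0)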
# A tibble: 40 x 5
# Groups: Date_Time [?]
Date_Time Calculated_TDS n_greater duration group
<dttm> <dbl> <int> <dbl> <chr>
1 2012-01-18 23:30:00 674. 1 23 Group_1
2 2012-01-19 22:30:00 124. 0 0 Group_2
3 2012-02-21 11:30:00 220. 0 0 Group_2
4 2012-03-08 19:00:00 169. 0 0 Group_2
5 2012-03-18 22:00:00 24.5 0 0 Group_2
6 2012-03-25 13:00:00 460. 0 0 Group_2
7 2014-01-04 16:55:43 1000. 1 137. Group_3
8 2014-01-10 09:41:25 595. 1 351. Group_3
9 2014-01-25 00:53:11 411. 0 0 Group_4
10 2014-02-27 15:31:08 735. 1 296. Group_5
# ... with 30 more rows
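ggplot() +
  geom_point(data = test_df, aes(x = Date_Time, y = Calculated_TDS)) +
  # Only the exceedances are joined, one line per run of consecutive exceedances
  geom_line(data = subset(test_df, Calculated_TDS > 500), aes(x = Date_Time, y = Calculated_TDS, group = group), stat = "identity") +
  # A fixed color belongs outside aes(); inside aes() it is treated as a data value
  geom_hline(yintercept = 500, color = "red", size = 1.3)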
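To make the grouping step less opaque, here is a minimal base R sketch of the same cumsum(diff(...) != 0) idiom on its own. The n_greater vector is copied from the first ten rows of the tibble printed above; the names changed and group are only illustrative.

```r
# 1 = reading above 500, 0 = reading at or below 500 (first ten rows of the printed tibble)
n_greater <- c(1, 0, 0, 0, 0, 0, 1, 1, 0, 1)

# diff() is non-zero wherever the series crosses the standard;
# the leading 1 opens the first group
changed <- c(1, diff(n_greater) != 0)

# The running total of the change flags gives every run its own id
group <- paste0("Group_", cumsum(changed))
print(group)
```

Each run of consecutive readings on the same side of 500 gets its own label, which is why passing group = group to geom_line only joins exceedances that follow one another directly.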
thanks for your answer. Is there a way to make a new column saying how long each exceedance lasted?
– NBE
Nov 19 '18 at 19:06
You can use diff, see my edits.
– Anonymous coward
Nov 19 '18 at 21:37
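One possible way to put the length of each exceedance into a new column, sketched with dplyr. It rests on an assumption the thread never settles: an "event" is taken to be a run of consecutive readings above 500, and its duration is the time from the first to the last reading in that run, so an event with a single reading has duration 0. The data are the first ten rows of the sorted tibble printed above; readings, exceeds, run_id and event_hours are illustrative names, not part of the original code.

```r
library(dplyr)

# First ten readings of the question's data, already in time order
readings <- tibble(
  Date_Time = as.POSIXct(
    c("2012-01-18 23:30:00", "2012-01-19 22:30:00", "2012-02-21 11:30:00",
      "2012-03-08 19:00:00", "2012-03-18 22:00:00", "2012-03-25 13:00:00",
      "2014-01-04 16:55:43", "2014-01-10 09:41:25", "2014-01-25 00:53:11",
      "2014-02-27 15:31:08"),
    tz = "UTC"
  ),
  Calculated_TDS = c(674.286, 124.022, 219.546, 169.21, 24.494,
                     460.358, 1000.326, 595.35, 411.166, 734.918)
)

with_duration <- readings %>%
  arrange(Date_Time) %>%
  mutate(
    exceeds = Calculated_TDS > 500,
    # run_id goes up by one every time the series crosses the standard
    run_id = cumsum(exceeds != lag(exceeds, default = first(exceeds)))
  ) %>%
  group_by(run_id) %>%
  mutate(
    # Assumed definition: hours between the first and last reading of the run;
    # readings at or below the standard get NA
    event_hours = if_else(
      exceeds,
      as.numeric(difftime(max(Date_Time), min(Date_Time), units = "hours")),
      NA_real_
    )
  ) %>%
  ungroup()

print(with_duration)
```

With this definition the two January 2014 readings form one event of roughly 136.8 hours, while the isolated exceedances get 0. If an event should instead last until the next reading below 500, the end point would need to come from the following row, which is closer to what diff gives.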
I am not quite sure what you want, but here is a starting point. The idea is simply to make separate columns for the values that exceed the standard and the values below it, each filled with NA where it does not apply, and then plot them.
The exceedance points and line are colored red and the points below the standard are colored blue. Note that color = "red" should be outside aes() for the horizontal line. Only put color inside aes() when the color should vary with the data.
library(tidyverse)
test_df <- df %>%
  mutate(greater = Calculated_TDS > 500,
         exceed_value = if_else(greater, Calculated_TDS, NA_real_),
         below_value = if_else(greater, NA_real_, Calculated_TDS))
plot <- ggplot(data = test_df, aes(x = Date_Time)) +
  geom_point(aes(y = exceed_value), color = "red") +
  geom_point(aes(y = below_value), color = "blue") +
  geom_line(aes(y = exceed_value), color = "red") +
  geom_hline(aes(yintercept = 500), color = "red", size = 1.3)
print(plot)
#> Warning in as.POSIXlt.POSIXct(x): unknown timezone 'zone/tz/2018g.1.0/
#> zoneinfo/America/New_York'
#> Warning: Removed 26 rows containing missing values (geom_point).
#> Warning: Removed 14 rows containing missing values (geom_point).
#> Warning: Removed 4 rows containing missing values (geom_path).
Thanks for your answer. Is there a way to get how long each event lasted in a new column?
– NBE
Nov 19 '18 at 19:01
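A rough sketch of one way to get a table with one row per event and draw each event's duration as a horizontal segment. It assumes, as a working definition only, that an event is a run of consecutive readings above 500 lasting from its first to its last reading. The data are the first ten time-ordered readings of the question's sample; readings, events, run_id, event_start, event_end, peak and duration_hours are illustrative names.

```r
library(dplyr)
library(ggplot2)

# First ten time-ordered readings of the question's sample data
readings <- tibble(
  Date_Time = as.POSIXct(
    c("2012-01-18 23:30:00", "2012-01-19 22:30:00", "2012-02-21 11:30:00",
      "2012-03-08 19:00:00", "2012-03-18 22:00:00", "2012-03-25 13:00:00",
      "2014-01-04 16:55:43", "2014-01-10 09:41:25", "2014-01-25 00:53:11",
      "2014-02-27 15:31:08"),
    tz = "UTC"
  ),
  Calculated_TDS = c(674.286, 124.022, 219.546, 169.21, 24.494,
                     460.358, 1000.326, 595.35, 411.166, 734.918)
)

# One row per event: start, end, number of readings, peak value, duration
events <- readings %>%
  arrange(Date_Time) %>%
  mutate(
    exceeds = Calculated_TDS > 500,
    run_id = cumsum(exceeds != lag(exceeds, default = first(exceeds)))
  ) %>%
  filter(exceeds) %>%
  group_by(run_id) %>%
  summarise(
    event_start = min(Date_Time),
    event_end = max(Date_Time),
    n_readings = n(),
    peak = max(Calculated_TDS),
    .groups = "drop"
  ) %>%
  mutate(duration_hours = as.numeric(difftime(event_end, event_start, units = "hours")))

# Points for every reading, the standard as a red line,
# and one blue segment per event drawn at that event's peak value
p <- ggplot(readings, aes(x = Date_Time, y = Calculated_TDS)) +
  geom_point() +
  geom_hline(yintercept = 500, color = "red") +
  geom_segment(
    data = events,
    aes(x = event_start, xend = event_end, y = peak, yend = peak),
    color = "blue", inherit.aes = FALSE
  )
print(p)
```

Events with a single reading have zero length under this definition, so their segments are not visible; a different definition of when an event ends would change both the table and the plot.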
The geom_line grouping answer was posted by Anonymous coward: answered Nov 19 '18 at 18:49, edited Nov 19 '18 at 21:39
The NA-column answer was posted by Andrew Lavers: answered Nov 19 '18 at 18:42, edited Nov 19 '18 at 21:09