detect anomaly in time series
























$begingroup$


I have data on 5 agricultural fields over the same period of time. At some point there is a sudden change in behavior in all 5 time series, which corresponds to the harvest of the field. My task is to implement a methodology to determine the date on which the field was harvested and to provide prediction intervals.
Here is a plot of what it looks like. Here is the correlation matrix.



As you can see, the harvest date is not obvious from visual inspection, as there are several peaks and ups and downs.
It's important to say that I don't know what the values in the time series correspond to.



I have thought of some methods, like taking the max over time of the sum over all time series of the derivatives at each point, or using the first and second derivatives, etc. But I want to know whether there is a more systematic method, and how to provide prediction intervals.
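To make the first idea concrete, here is a minimal sketch, assuming the five series are stored in a NumPy array of shape `(n_series, n_dates)` sampled on the same dates. The function name `harvest_index_by_summed_diff`, the per-series standardisation, and the use of absolute differences are illustrative assumptions, and the data below are synthetic, not the real field data.

```python
import numpy as np

def harvest_index_by_summed_diff(series):
    """Return (index, score) for the largest summed absolute first difference.

    series: array of shape (n_series, n_dates), all sampled on the same dates.
    """
    x = np.asarray(series, dtype=float)
    # Standardise each series so that none dominates the sum
    # (an assumption, since the units of the values are unknown).
    std = x.std(axis=1, keepdims=True)
    std[std == 0] = 1.0
    z = (x - x.mean(axis=1, keepdims=True)) / std
    # Discrete derivative along time, then sum the magnitudes over the series.
    score = np.abs(np.diff(z, axis=1)).sum(axis=0)
    # score[i] measures the jump between dates i and i + 1,
    # so the first date after the jump is argmax + 1.
    return int(np.argmax(score)) + 1, score

# Synthetic example: 5 noisy series with a common level shift at index 20.
rng = np.random.default_rng(0)
demo = rng.normal(0.0, 0.1, size=(5, 30))
demo[:, 20:] -= 3.0
idx, score = harvest_index_by_summed_diff(demo)
print(idx)
```

This only gives a point estimate of the date. It says nothing yet about prediction intervals, which is the part I am still unsure about.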



I know that there are a lot of packages in R for working with time series (see here), but I have to use Python for this problem.
Thanks!





















$endgroup$












  • $begingroup$
    It might be useful to know the relations between the various fields. For instance, are they orthogonal components of the same vector field? In that case, looking at the norm $\sqrt{f_1^2+f_2^2+f_3^2}$ may be useful.
    $endgroup$
    – Federico
    Dec 10 '18 at 18:53








  • 1




    $begingroup$
    Also, +1 for using Python and not R
    $endgroup$
    – Federico
    Dec 10 '18 at 18:54










  • $begingroup$
    Thanks @Federico. I've added the correlation matrix. What do you do once you have the relations and the norm you mentioned? How do you go on to find that date?
    $endgroup$
    – Lu Yin
    Dec 10 '18 at 19:12






  • 1




    $begingroup$
    Uhm... I'm starting to wonder whether you mean crop fields while I'm thinking of vector fields...
    $endgroup$
    – Federico
    Dec 10 '18 at 19:15






  • 1




    $begingroup$
    @the_candyman Exactly. That's why I have to provide prediction intervals. So you suggest running something like k-means on that? But defining clusters doesn't help find the harvest date. Hmm, maybe an unsupervised anomaly detection method like autoencoders would work, but I have little data (only 30 samples).
    $endgroup$
    – Lu Yin
    Dec 10 '18 at 19:28
















signal-processing time-series














edited Dec 10 '18 at 19:21







Lu Yin

















asked Dec 10 '18 at 18:48









Lu Yin













