python curve_fitting with bad results











up vote
1
down vote

favorite
1












the link of data from dropboxbadfittingI tried use the curve_fit to fit the data with my pre_defined function in python, but the result was far to perfect. The code is simple and shown as below. I have no idea what's wrong.

Since I am new to python, are there any other optimization or fitting methods which are suitable for my case with predefined function?



Thanks in advance!



import numpy as np
import math
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

def func(x, r1, r2, r3,l,c):
w=2*math.pi*x
m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2)
n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2)
y= (m**2+n**2)**.5
return y

def readdata(filename):
x = filename.readlines()
x = list(map(lambda s: s.strip(), x))
x = list(map(float, x))
return x

# test data
f_x= open(r'C:UsersadmDesktopsimpletryfre.txt')
xdata = readdata(f_x)

f_y= open(r'C:UsersadmDesktopsimpletryimpedance.txt')
ydata = readdata(f_y)

xdata = np.array(xdata)
ydata = np.array(ydata)
plt.semilogx(xdata, ydata, 'b-', label='data')

popt, pcov = curve_fit(func, xdata, ydata, bounds=((0, 0, 0, 0, 0), (np.inf, np.inf, np.inf, np.inf, np.inf)))
plt.semilogx(xdata, func(xdata, *popt), 'r-', label='fitted curve')

print(popt)
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.show()


as you guessed, this is a LCR circuit model. now I am trying to fit two curves with the same parameters like



def func1(x, r1, r2, r3,l,c):
w=2*math.pi*x
m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2)
return m

def func2(x, r1, r2, r3,l,c):
w=2*math.pi*x
n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2)
return n


is it possible to use the curve_fitting to optimize the parameters?










share|improve this question




















  • 1




    Would you please post a link to the data files?
    – James Phillips
    Nov 12 at 19:56










  • dropbox.com/s/1cciw4h7yrmc1qx/data.rar?dl=0
    – hao
    Nov 13 at 9:55










  • Do you have any idea what the approximate right answer is? You can use that to specify a starting point and bounds that might help the curve-fit routine.
    – Andrew Jaffe
    Nov 13 at 10:02










  • It looks like the best fit has negative values for some of the parameters, so you should just remove the bounds=... part of the call. (Also, I suspect that there may be some degeneracy in your fit -- i.e., more than one set of parameters fits the data just as well.)
    – Andrew Jaffe
    Nov 13 at 10:23






  • 1




    yeah, that's the problem. I had tried to remove the bounds and it can fit well. but the fact is that ALL the parameters need to be positive.
    – hao
    Nov 13 at 10:54















up vote
1
down vote

favorite
1












the link of data from dropboxbadfittingI tried use the curve_fit to fit the data with my pre_defined function in python, but the result was far to perfect. The code is simple and shown as below. I have no idea what's wrong.

Since I am new to python, are there any other optimization or fitting methods which are suitable for my case with predefined function?



Thanks in advance!



import numpy as np
import math
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

def func(x, r1, r2, r3,l,c):
w=2*math.pi*x
m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2)
n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2)
y= (m**2+n**2)**.5
return y

def readdata(filename):
x = filename.readlines()
x = list(map(lambda s: s.strip(), x))
x = list(map(float, x))
return x

# test data
f_x= open(r'C:UsersadmDesktopsimpletryfre.txt')
xdata = readdata(f_x)

f_y= open(r'C:UsersadmDesktopsimpletryimpedance.txt')
ydata = readdata(f_y)

xdata = np.array(xdata)
ydata = np.array(ydata)
plt.semilogx(xdata, ydata, 'b-', label='data')

popt, pcov = curve_fit(func, xdata, ydata, bounds=((0, 0, 0, 0, 0), (np.inf, np.inf, np.inf, np.inf, np.inf)))
plt.semilogx(xdata, func(xdata, *popt), 'r-', label='fitted curve')

print(popt)
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.show()


as you guessed, this is a LCR circuit model. now I am trying to fit two curves with the same parameters like



def func1(x, r1, r2, r3,l,c):
w=2*math.pi*x
m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2)
return m

def func2(x, r1, r2, r3,l,c):
w=2*math.pi*x
n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2)
return n


is it possible to use the curve_fitting to optimize the parameters?










share|improve this question




















  • 1




    Would you please post a link to the data files?
    – James Phillips
    Nov 12 at 19:56










  • dropbox.com/s/1cciw4h7yrmc1qx/data.rar?dl=0
    – hao
    Nov 13 at 9:55










  • Do you have any idea what the approximate right answer is? You can use that to specify a starting point and bounds that might help the curve-fit routine.
    – Andrew Jaffe
    Nov 13 at 10:02










  • It looks like the best fit has negative values for some of the parameters, so you should just remove the bounds=... part of the call. (Also, I suspect that there may be some degeneracy in your fit -- i.e., more than one set of parameters fits the data just as well.)
    – Andrew Jaffe
    Nov 13 at 10:23






  • 1




    yeah, that's the problem. I had tried to remove the bounds and it can fit well. but the fact is that ALL the parameters need to be positive.
    – hao
    Nov 13 at 10:54













up vote
1
down vote

favorite
1









up vote
1
down vote

favorite
1






1





the link of data from dropboxbadfittingI tried use the curve_fit to fit the data with my pre_defined function in python, but the result was far to perfect. The code is simple and shown as below. I have no idea what's wrong.

Since I am new to python, are there any other optimization or fitting methods which are suitable for my case with predefined function?



Thanks in advance!



import numpy as np
import math
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

def func(x, r1, r2, r3,l,c):
w=2*math.pi*x
m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2)
n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2)
y= (m**2+n**2)**.5
return y

def readdata(filename):
x = filename.readlines()
x = list(map(lambda s: s.strip(), x))
x = list(map(float, x))
return x

# test data
f_x= open(r'C:UsersadmDesktopsimpletryfre.txt')
xdata = readdata(f_x)

f_y= open(r'C:UsersadmDesktopsimpletryimpedance.txt')
ydata = readdata(f_y)

xdata = np.array(xdata)
ydata = np.array(ydata)
plt.semilogx(xdata, ydata, 'b-', label='data')

popt, pcov = curve_fit(func, xdata, ydata, bounds=((0, 0, 0, 0, 0), (np.inf, np.inf, np.inf, np.inf, np.inf)))
plt.semilogx(xdata, func(xdata, *popt), 'r-', label='fitted curve')

print(popt)
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.show()


as you guessed, this is a LCR circuit model. now I am trying to fit two curves with the same parameters like



def func1(x, r1, r2, r3,l,c):
w=2*math.pi*x
m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2)
return m

def func2(x, r1, r2, r3,l,c):
w=2*math.pi*x
n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2)
return n


is it possible to use the curve_fitting to optimize the parameters?










share|improve this question















the link of data from dropboxbadfittingI tried use the curve_fit to fit the data with my pre_defined function in python, but the result was far to perfect. The code is simple and shown as below. I have no idea what's wrong.

Since I am new to python, are there any other optimization or fitting methods which are suitable for my case with predefined function?



Thanks in advance!



import numpy as np
import math
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

def func(x, r1, r2, r3,l,c):
w=2*math.pi*x
m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2)
n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2)
y= (m**2+n**2)**.5
return y

def readdata(filename):
x = filename.readlines()
x = list(map(lambda s: s.strip(), x))
x = list(map(float, x))
return x

# test data
f_x= open(r'C:UsersadmDesktopsimpletryfre.txt')
xdata = readdata(f_x)

f_y= open(r'C:UsersadmDesktopsimpletryimpedance.txt')
ydata = readdata(f_y)

xdata = np.array(xdata)
ydata = np.array(ydata)
plt.semilogx(xdata, ydata, 'b-', label='data')

popt, pcov = curve_fit(func, xdata, ydata, bounds=((0, 0, 0, 0, 0), (np.inf, np.inf, np.inf, np.inf, np.inf)))
plt.semilogx(xdata, func(xdata, *popt), 'r-', label='fitted curve')

print(popt)
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.show()


as you guessed, this is a LCR circuit model. now I am trying to fit two curves with the same parameters like



def func1(x, r1, r2, r3,l,c):
w=2*math.pi*x
m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2)
return m

def func2(x, r1, r2, r3,l,c):
w=2*math.pi*x
n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2)
return n


is it possible to use the curve_fitting to optimize the parameters?







python optimization scipy curve-fitting






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 14 at 10:57

























asked Nov 12 at 18:53









hao

163




163








  • 1




    Would you please post a link to the data files?
    – James Phillips
    Nov 12 at 19:56










  • dropbox.com/s/1cciw4h7yrmc1qx/data.rar?dl=0
    – hao
    Nov 13 at 9:55










  • Do you have any idea what the approximate right answer is? You can use that to specify a starting point and bounds that might help the curve-fit routine.
    – Andrew Jaffe
    Nov 13 at 10:02










  • It looks like the best fit has negative values for some of the parameters, so you should just remove the bounds=... part of the call. (Also, I suspect that there may be some degeneracy in your fit -- i.e., more than one set of parameters fits the data just as well.)
    – Andrew Jaffe
    Nov 13 at 10:23






  • 1




    yeah, that's the problem. I had tried to remove the bounds and it can fit well. but the fact is that ALL the parameters need to be positive.
    – hao
    Nov 13 at 10:54














  • 1




    Would you please post a link to the data files?
    – James Phillips
    Nov 12 at 19:56










  • dropbox.com/s/1cciw4h7yrmc1qx/data.rar?dl=0
    – hao
    Nov 13 at 9:55










  • Do you have any idea what the approximate right answer is? You can use that to specify a starting point and bounds that might help the curve-fit routine.
    – Andrew Jaffe
    Nov 13 at 10:02










  • It looks like the best fit has negative values for some of the parameters, so you should just remove the bounds=... part of the call. (Also, I suspect that there may be some degeneracy in your fit -- i.e., more than one set of parameters fits the data just as well.)
    – Andrew Jaffe
    Nov 13 at 10:23






  • 1




    yeah, that's the problem. I had tried to remove the bounds and it can fit well. but the fact is that ALL the parameters need to be positive.
    – hao
    Nov 13 at 10:54








1




1




Would you please post a link to the data files?
– James Phillips
Nov 12 at 19:56




Would you please post a link to the data files?
– James Phillips
Nov 12 at 19:56












dropbox.com/s/1cciw4h7yrmc1qx/data.rar?dl=0
– hao
Nov 13 at 9:55




dropbox.com/s/1cciw4h7yrmc1qx/data.rar?dl=0
– hao
Nov 13 at 9:55












Do you have any idea what the approximate right answer is? You can use that to specify a starting point and bounds that might help the curve-fit routine.
– Andrew Jaffe
Nov 13 at 10:02




Do you have any idea what the approximate right answer is? You can use that to specify a starting point and bounds that might help the curve-fit routine.
– Andrew Jaffe
Nov 13 at 10:02












It looks like the best fit has negative values for some of the parameters, so you should just remove the bounds=... part of the call. (Also, I suspect that there may be some degeneracy in your fit -- i.e., more than one set of parameters fits the data just as well.)
– Andrew Jaffe
Nov 13 at 10:23




It looks like the best fit has negative values for some of the parameters, so you should just remove the bounds=... part of the call. (Also, I suspect that there may be some degeneracy in your fit -- i.e., more than one set of parameters fits the data just as well.)
– Andrew Jaffe
Nov 13 at 10:23




1




1




yeah, that's the problem. I had tried to remove the bounds and it can fit well. but the fact is that ALL the parameters need to be positive.
– hao
Nov 13 at 10:54




yeah, that's the problem. I had tried to remove the bounds and it can fit well. but the fact is that ALL the parameters need to be positive.
– hao
Nov 13 at 10:54












2 Answers
2






active

oldest

votes

















up vote
0
down vote













Here are my results using scipy's differential_evolution genetic algorithm module to generate the initial parameter estimates for curve_fit, along with a simple "brick wall" in the function to ensure all parameters are positive. Scipy's implementation of Differential Evolution uses the Latin Hypercube algorithm to ensure a thorough search of parameter space, which requires bounds within which to search - in this example, those bounds are taken from the data maximum and minimum values. My results:



RMSE: 7.415



R-squared: 0.999995



r1 = 1.16614005e+00



r2 = 2.00000664e+05



r3 = 1.54718886e+01



l = 1.94473531e+04



c = 4.32515535e+05



semilog output



import numpy, scipy, matplotlib
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from scipy.optimize import differential_evolution
import warnings

def func(x, r1, r2, r3,l,c):
# "brick wall" ensuring all parameters are positive
if r1 < 0.0 or r2 < 0.0 or r3 < 0.0 or l < 0.0 or c < 0.0:
return 1.0E10 # large value gives large error, curve_fit hits a brick wall

w=2*numpy.pi*x
m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2)
n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2)
y= (m**2+n**2)**.5
return y


def readdata(filename):
x = filename.readlines()
x = list(map(lambda s: s.strip(), x))
x = list(map(float, x))
return x

# test data
f_x= open('/home/zunzun/temp/data/fre.txt')
xData = readdata(f_x)

f_y= open('/home/zunzun/temp/data/impedance.txt')
yData = readdata(f_y)

xData = numpy.array(xData)
yData = numpy.array(yData)


# function for genetic algorithm to minimize (sum of squared error)
def sumOfSquaredError(parameterTuple):
warnings.filterwarnings("ignore") # do not print warnings by genetic algorithm
val = func(xData, *parameterTuple)
return numpy.sum((yData - val) ** 2.0)


def generate_Initial_Parameters():
# min and max used for bounds
maxX = max(xData)
minX = min(xData)
maxY = max(yData)
minY = min(yData)
minBound = min(minX, minY)
maxBound = max(maxX, maxY)
parameterBounds =
parameterBounds.append([minBound, maxBound]) # search bounds for r1
parameterBounds.append([minBound, maxBound]) # search bounds for r2
parameterBounds.append([minBound, maxBound]) # search bounds for r3
parameterBounds.append([minBound, maxBound]) # search bounds for l
parameterBounds.append([minBound, maxBound]) # search bounds for c

# "seed" the numpy random number generator for repeatable results
result = differential_evolution(sumOfSquaredError, parameterBounds, seed=3)
return result.x

# by default, differential_evolution completes by calling curve_fit() using parameter bounds
geneticParameters = generate_Initial_Parameters()

# now call curve_fit without passing bounds from the genetic algorithm,
# just in case the best fit parameters are aoutside those bounds
fittedParameters, pcov = curve_fit(func, xData, yData, geneticParameters)
print('Fitted parameters:', fittedParameters)
print()

modelPredictions = func(xData, *fittedParameters)

absError = modelPredictions - yData

SE = numpy.square(absError) # squared errors
MSE = numpy.mean(SE) # mean squared errors
RMSE = numpy.sqrt(MSE) # Root Mean Squared Error, RMSE
Rsquared = 1.0 - (numpy.var(absError) / numpy.var(yData))

print()
print('RMSE:', RMSE)
print('R-squared:', Rsquared)

print()


##########################################################
# graphics output section
def ModelAndScatterPlot(graphWidth, graphHeight):
f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
axes = f.add_subplot(111)

# first the raw data as a scatter plot
plt.semilogx(xData, yData, 'D')

# create data for the fitted equation plot
yModel = func(xData, *fittedParameters)

# now the model as a line plot
plt.semilogx(xData, yModel)

axes.set_xlabel('X Data') # X axis data label
axes.set_ylabel('Y Data') # Y axis data label

plt.show()
plt.close('all') # clean up after using pyplot

graphWidth = 800
graphHeight = 600
ModelAndScatterPlot(graphWidth, graphHeight)





share|improve this answer























  • thanks for your code. now I am trying to fit two curves with the same parameters like **def func1(x, r1, r2, r3,l,c):** w=2*math.pi*x m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2) return m **def func2(x, r1, r2, r3,l,c):** w=2*math.pi*x n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2) return n is it possible to use the curve_fitting to optimize the parameters?
    – hao
    Nov 14 at 10:59












  • Looks to me as if you can simply replace the fitted function in my example with the code in your comment. Note that I used "numpy.pi" as I did not "import math" in my example code.
    – James Phillips
    Nov 14 at 16:27


















up vote
0
down vote













To have your Least Squares regression make sense, you'll have to at least supply initial parameters which make sense.



As all parameters are by default initiated to the value 1, the biggest influence on the initial regression will be resistor r1 which adds a constant to the mix.



Most probably you'll end up in something like the following configuration:



popt
Out[241]:
array([1.66581563e+03, 2.43663552e+02, 1.13019744e+00, 1.20233767e+00,
5.04984535e-04])


Which will output a neat-looking flat line, due to m = something big + ~0 + ~0 ; n=~0 - ~0, so y = r1.



However, if you initialize your parameters somewhat differently,



popt, pcov = curve_fit(func, xdata.flatten(), ydata.flatten(), p0=[0.1,1e5,1000,1000,0.2],
bounds=((0, 0, 0, 0, 0), (np.inf, np.inf, np.inf, np.inf, np.inf)))


You will get a better looking fit,



popt
Out[244]:
array([1.14947146e+00, 4.12512324e+05, 1.36182466e+02, 8.29771756e+04,
1.77593448e+03])

((fitted-ydata.flatten())**2).mean()
Out[257]: 0.6099524982664816
#RMSE hence 0.78


enter image description here



P.s. My data starts at the second data point, due to a conversion error with pd.read_clipboard where the first row became headers instead of data. Shouldn't change the overall picture though.






share|improve this answer























  • How did you determine these initial parameter estimates?
    – James Phillips
    Nov 13 at 12:32










  • By figuring out it was a RLC circuit-like; roughly figuring C to be smaller than 1, R to be on the order of thousands; from the final part of the curve noting that R1 should be small. Manual labour in this case, I'm afraid @JamesPhillips . And then being wrong about my assumption on C, according to the fit.. :(
    – Uvar
    Nov 13 at 12:34












  • I will spend a week writing code to avoid 10 minutes of manual labor...
    – James Phillips
    Nov 13 at 12:35










  • I am afraid I would not like to spend my time on writing code which effectively brute-forces an optimal parameter initialization, whereas just picking some numbers; preferably non-uniform; will do the trick. Maybe if I need to reproduce similar thousands of times, but not for a single instance. ;)
    – Uvar
    Nov 13 at 12:37










  • The genetic algorithm in scipy allows a more general solution with wide applicability to many problems - hence my answer.
    – James Phillips
    Nov 13 at 12:40











Your Answer






StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














 

draft saved


draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53268378%2fpython-curve-fitting-with-bad-results%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























2 Answers
2






active

oldest

votes








2 Answers
2






active

oldest

votes









active

oldest

votes






active

oldest

votes








up vote
0
down vote













Here are my results using scipy's differential_evolution genetic algorithm module to generate the initial parameter estimates for curve_fit, along with a simple "brick wall" in the function to ensure all parameters are positive. Scipy's implementation of Differential Evolution uses the Latin Hypercube algorithm to ensure a thorough search of parameter space, which requires bounds within which to search - in this example, those bounds are taken from the data maximum and minimum values. My results:



RMSE: 7.415



R-squared: 0.999995



r1 = 1.16614005e+00



r2 = 2.00000664e+05



r3 = 1.54718886e+01



l = 1.94473531e+04



c = 4.32515535e+05



semilog output



import numpy, scipy, matplotlib
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from scipy.optimize import differential_evolution
import warnings

def func(x, r1, r2, r3,l,c):
# "brick wall" ensuring all parameters are positive
if r1 < 0.0 or r2 < 0.0 or r3 < 0.0 or l < 0.0 or c < 0.0:
return 1.0E10 # large value gives large error, curve_fit hits a brick wall

w=2*numpy.pi*x
m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2)
n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2)
y= (m**2+n**2)**.5
return y


def readdata(filename):
x = filename.readlines()
x = list(map(lambda s: s.strip(), x))
x = list(map(float, x))
return x

# test data
f_x= open('/home/zunzun/temp/data/fre.txt')
xData = readdata(f_x)

f_y= open('/home/zunzun/temp/data/impedance.txt')
yData = readdata(f_y)

xData = numpy.array(xData)
yData = numpy.array(yData)


# function for genetic algorithm to minimize (sum of squared error)
def sumOfSquaredError(parameterTuple):
warnings.filterwarnings("ignore") # do not print warnings by genetic algorithm
val = func(xData, *parameterTuple)
return numpy.sum((yData - val) ** 2.0)


def generate_Initial_Parameters():
# min and max used for bounds
maxX = max(xData)
minX = min(xData)
maxY = max(yData)
minY = min(yData)
minBound = min(minX, minY)
maxBound = max(maxX, maxY)
parameterBounds =
parameterBounds.append([minBound, maxBound]) # search bounds for r1
parameterBounds.append([minBound, maxBound]) # search bounds for r2
parameterBounds.append([minBound, maxBound]) # search bounds for r3
parameterBounds.append([minBound, maxBound]) # search bounds for l
parameterBounds.append([minBound, maxBound]) # search bounds for c

# "seed" the numpy random number generator for repeatable results
result = differential_evolution(sumOfSquaredError, parameterBounds, seed=3)
return result.x

# by default, differential_evolution completes by calling curve_fit() using parameter bounds
geneticParameters = generate_Initial_Parameters()

# now call curve_fit without passing bounds from the genetic algorithm,
# just in case the best fit parameters are aoutside those bounds
fittedParameters, pcov = curve_fit(func, xData, yData, geneticParameters)
print('Fitted parameters:', fittedParameters)
print()

modelPredictions = func(xData, *fittedParameters)

absError = modelPredictions - yData

SE = numpy.square(absError) # squared errors
MSE = numpy.mean(SE) # mean squared errors
RMSE = numpy.sqrt(MSE) # Root Mean Squared Error, RMSE
Rsquared = 1.0 - (numpy.var(absError) / numpy.var(yData))

print()
print('RMSE:', RMSE)
print('R-squared:', Rsquared)

print()


##########################################################
# graphics output section
def ModelAndScatterPlot(graphWidth, graphHeight):
f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
axes = f.add_subplot(111)

# first the raw data as a scatter plot
plt.semilogx(xData, yData, 'D')

# create data for the fitted equation plot
yModel = func(xData, *fittedParameters)

# now the model as a line plot
plt.semilogx(xData, yModel)

axes.set_xlabel('X Data') # X axis data label
axes.set_ylabel('Y Data') # Y axis data label

plt.show()
plt.close('all') # clean up after using pyplot

graphWidth = 800
graphHeight = 600
ModelAndScatterPlot(graphWidth, graphHeight)





share|improve this answer























  • thanks for your code. now I am trying to fit two curves with the same parameters like **def func1(x, r1, r2, r3,l,c):** w=2*math.pi*x m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2) return m **def func2(x, r1, r2, r3,l,c):** w=2*math.pi*x n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2) return n is it possible to use the curve_fitting to optimize the parameters?
    – hao
    Nov 14 at 10:59












  • Looks to me as if you can simply replace the fitted function in my example with the code in your comment. Note that I used "numpy.pi" as I did not "import math" in my example code.
    – James Phillips
    Nov 14 at 16:27















up vote
0
down vote













Here are my results using scipy's differential_evolution genetic algorithm module to generate the initial parameter estimates for curve_fit, along with a simple "brick wall" in the function to ensure all parameters are positive. Scipy's implementation of Differential Evolution uses the Latin Hypercube algorithm to ensure a thorough search of parameter space, which requires bounds within which to search - in this example, those bounds are taken from the data maximum and minimum values. My results:



RMSE: 7.415



R-squared: 0.999995



r1 = 1.16614005e+00



r2 = 2.00000664e+05



r3 = 1.54718886e+01



l = 1.94473531e+04



c = 4.32515535e+05



semilog output



import numpy, scipy, matplotlib
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from scipy.optimize import differential_evolution
import warnings

def func(x, r1, r2, r3,l,c):
# "brick wall" ensuring all parameters are positive
if r1 < 0.0 or r2 < 0.0 or r3 < 0.0 or l < 0.0 or c < 0.0:
return 1.0E10 # large value gives large error, curve_fit hits a brick wall

w=2*numpy.pi*x
m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2)
n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2)
y= (m**2+n**2)**.5
return y


def readdata(filename):
x = filename.readlines()
x = list(map(lambda s: s.strip(), x))
x = list(map(float, x))
return x

# test data
f_x= open('/home/zunzun/temp/data/fre.txt')
xData = readdata(f_x)

f_y= open('/home/zunzun/temp/data/impedance.txt')
yData = readdata(f_y)

xData = numpy.array(xData)
yData = numpy.array(yData)


# function for genetic algorithm to minimize (sum of squared error)
def sumOfSquaredError(parameterTuple):
warnings.filterwarnings("ignore") # do not print warnings by genetic algorithm
val = func(xData, *parameterTuple)
return numpy.sum((yData - val) ** 2.0)


def generate_Initial_Parameters():
# min and max used for bounds
maxX = max(xData)
minX = min(xData)
maxY = max(yData)
minY = min(yData)
minBound = min(minX, minY)
maxBound = max(maxX, maxY)
parameterBounds =
parameterBounds.append([minBound, maxBound]) # search bounds for r1
parameterBounds.append([minBound, maxBound]) # search bounds for r2
parameterBounds.append([minBound, maxBound]) # search bounds for r3
parameterBounds.append([minBound, maxBound]) # search bounds for l
parameterBounds.append([minBound, maxBound]) # search bounds for c

# "seed" the numpy random number generator for repeatable results
result = differential_evolution(sumOfSquaredError, parameterBounds, seed=3)
return result.x

# by default, differential_evolution completes by calling curve_fit() using parameter bounds
geneticParameters = generate_Initial_Parameters()

# now call curve_fit without passing bounds from the genetic algorithm,
# just in case the best fit parameters are aoutside those bounds
fittedParameters, pcov = curve_fit(func, xData, yData, geneticParameters)
print('Fitted parameters:', fittedParameters)
print()

modelPredictions = func(xData, *fittedParameters)

absError = modelPredictions - yData

SE = numpy.square(absError) # squared errors
MSE = numpy.mean(SE) # mean squared errors
RMSE = numpy.sqrt(MSE) # Root Mean Squared Error, RMSE
Rsquared = 1.0 - (numpy.var(absError) / numpy.var(yData))

print()
print('RMSE:', RMSE)
print('R-squared:', Rsquared)

print()


##########################################################
# graphics output section
def ModelAndScatterPlot(graphWidth, graphHeight):
f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
axes = f.add_subplot(111)

# first the raw data as a scatter plot
plt.semilogx(xData, yData, 'D')

# create data for the fitted equation plot
yModel = func(xData, *fittedParameters)

# now the model as a line plot
plt.semilogx(xData, yModel)

axes.set_xlabel('X Data') # X axis data label
axes.set_ylabel('Y Data') # Y axis data label

plt.show()
plt.close('all') # clean up after using pyplot

graphWidth = 800
graphHeight = 600
ModelAndScatterPlot(graphWidth, graphHeight)





share|improve this answer























  • thanks for your code. now I am trying to fit two curves with the same parameters like **def func1(x, r1, r2, r3,l,c):** w=2*math.pi*x m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2) return m **def func2(x, r1, r2, r3,l,c):** w=2*math.pi*x n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2) return n is it possible to use the curve_fitting to optimize the parameters?
    – hao
    Nov 14 at 10:59












  • Looks to me as if you can simply replace the fitted function in my example with the code in your comment. Note that I used "numpy.pi" as I did not "import math" in my example code.
    – James Phillips
    Nov 14 at 16:27













up vote
0
down vote










up vote
0
down vote









Here are my results using scipy's differential_evolution genetic algorithm module to generate the initial parameter estimates for curve_fit, along with a simple "brick wall" in the function to ensure all parameters are positive. Scipy's implementation of Differential Evolution uses the Latin Hypercube algorithm to ensure a thorough search of parameter space, which requires bounds within which to search - in this example, those bounds are taken from the data maximum and minimum values. My results:



RMSE: 7.415



R-squared: 0.999995



r1 = 1.16614005e+00



r2 = 2.00000664e+05



r3 = 1.54718886e+01



l = 1.94473531e+04



c = 4.32515535e+05



semilog output



import numpy, scipy, matplotlib
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from scipy.optimize import differential_evolution
import warnings

def func(x, r1, r2, r3,l,c):
# "brick wall" ensuring all parameters are positive
if r1 < 0.0 or r2 < 0.0 or r3 < 0.0 or l < 0.0 or c < 0.0:
return 1.0E10 # large value gives large error, curve_fit hits a brick wall

w=2*numpy.pi*x
m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2)
n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2)
y= (m**2+n**2)**.5
return y


def readdata(filename):
x = filename.readlines()
x = list(map(lambda s: s.strip(), x))
x = list(map(float, x))
return x

# test data
f_x= open('/home/zunzun/temp/data/fre.txt')
xData = readdata(f_x)

f_y= open('/home/zunzun/temp/data/impedance.txt')
yData = readdata(f_y)

xData = numpy.array(xData)
yData = numpy.array(yData)


# function for genetic algorithm to minimize (sum of squared error)
def sumOfSquaredError(parameterTuple):
warnings.filterwarnings("ignore") # do not print warnings by genetic algorithm
val = func(xData, *parameterTuple)
return numpy.sum((yData - val) ** 2.0)


def generate_Initial_Parameters():
# min and max used for bounds
maxX = max(xData)
minX = min(xData)
maxY = max(yData)
minY = min(yData)
minBound = min(minX, minY)
maxBound = max(maxX, maxY)
parameterBounds =
parameterBounds.append([minBound, maxBound]) # search bounds for r1
parameterBounds.append([minBound, maxBound]) # search bounds for r2
parameterBounds.append([minBound, maxBound]) # search bounds for r3
parameterBounds.append([minBound, maxBound]) # search bounds for l
parameterBounds.append([minBound, maxBound]) # search bounds for c

# "seed" the numpy random number generator for repeatable results
result = differential_evolution(sumOfSquaredError, parameterBounds, seed=3)
return result.x

# by default, differential_evolution completes by calling curve_fit() using parameter bounds
geneticParameters = generate_Initial_Parameters()

# now call curve_fit without passing bounds from the genetic algorithm,
# just in case the best fit parameters are aoutside those bounds
fittedParameters, pcov = curve_fit(func, xData, yData, geneticParameters)
print('Fitted parameters:', fittedParameters)
print()

modelPredictions = func(xData, *fittedParameters)

absError = modelPredictions - yData

SE = numpy.square(absError) # squared errors
MSE = numpy.mean(SE) # mean squared errors
RMSE = numpy.sqrt(MSE) # Root Mean Squared Error, RMSE
Rsquared = 1.0 - (numpy.var(absError) / numpy.var(yData))

print()
print('RMSE:', RMSE)
print('R-squared:', Rsquared)

print()


##########################################################
# graphics output section
def ModelAndScatterPlot(graphWidth, graphHeight):
f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
axes = f.add_subplot(111)

# first the raw data as a scatter plot
plt.semilogx(xData, yData, 'D')

# create data for the fitted equation plot
yModel = func(xData, *fittedParameters)

# now the model as a line plot
plt.semilogx(xData, yModel)

axes.set_xlabel('X Data') # X axis data label
axes.set_ylabel('Y Data') # Y axis data label

plt.show()
plt.close('all') # clean up after using pyplot

graphWidth = 800
graphHeight = 600
ModelAndScatterPlot(graphWidth, graphHeight)





share|improve this answer














Here are my results using scipy's differential_evolution genetic algorithm module to generate the initial parameter estimates for curve_fit, along with a simple "brick wall" in the function to ensure all parameters are positive. Scipy's implementation of Differential Evolution uses the Latin Hypercube algorithm to ensure a thorough search of parameter space, which requires bounds within which to search - in this example, those bounds are taken from the data maximum and minimum values. My results:



RMSE: 7.415



R-squared: 0.999995



r1 = 1.16614005e+00



r2 = 2.00000664e+05



r3 = 1.54718886e+01



l = 1.94473531e+04



c = 4.32515535e+05



semilog output



import numpy, scipy, matplotlib
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from scipy.optimize import differential_evolution
import warnings

def func(x, r1, r2, r3,l,c):
# "brick wall" ensuring all parameters are positive
if r1 < 0.0 or r2 < 0.0 or r3 < 0.0 or l < 0.0 or c < 0.0:
return 1.0E10 # large value gives large error, curve_fit hits a brick wall

w=2*numpy.pi*x
m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2)
n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2)
y= (m**2+n**2)**.5
return y


def readdata(filename):
x = filename.readlines()
x = list(map(lambda s: s.strip(), x))
x = list(map(float, x))
return x

# test data
f_x= open('/home/zunzun/temp/data/fre.txt')
xData = readdata(f_x)

f_y= open('/home/zunzun/temp/data/impedance.txt')
yData = readdata(f_y)

xData = numpy.array(xData)
yData = numpy.array(yData)


# function for genetic algorithm to minimize (sum of squared error)
def sumOfSquaredError(parameterTuple):
warnings.filterwarnings("ignore") # do not print warnings by genetic algorithm
val = func(xData, *parameterTuple)
return numpy.sum((yData - val) ** 2.0)


def generate_Initial_Parameters():
# min and max used for bounds
maxX = max(xData)
minX = min(xData)
maxY = max(yData)
minY = min(yData)
minBound = min(minX, minY)
maxBound = max(maxX, maxY)
parameterBounds =
parameterBounds.append([minBound, maxBound]) # search bounds for r1
parameterBounds.append([minBound, maxBound]) # search bounds for r2
parameterBounds.append([minBound, maxBound]) # search bounds for r3
parameterBounds.append([minBound, maxBound]) # search bounds for l
parameterBounds.append([minBound, maxBound]) # search bounds for c

# "seed" the numpy random number generator for repeatable results
result = differential_evolution(sumOfSquaredError, parameterBounds, seed=3)
return result.x

# by default, differential_evolution completes by calling curve_fit() using parameter bounds
geneticParameters = generate_Initial_Parameters()

# now call curve_fit without passing bounds from the genetic algorithm,
# just in case the best fit parameters are aoutside those bounds
fittedParameters, pcov = curve_fit(func, xData, yData, geneticParameters)
print('Fitted parameters:', fittedParameters)
print()

modelPredictions = func(xData, *fittedParameters)

absError = modelPredictions - yData

SE = numpy.square(absError) # squared errors
MSE = numpy.mean(SE) # mean squared errors
RMSE = numpy.sqrt(MSE) # Root Mean Squared Error, RMSE
Rsquared = 1.0 - (numpy.var(absError) / numpy.var(yData))

print()
print('RMSE:', RMSE)
print('R-squared:', Rsquared)

print()


##########################################################
# graphics output section
def ModelAndScatterPlot(graphWidth, graphHeight):
f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
axes = f.add_subplot(111)

# first the raw data as a scatter plot
plt.semilogx(xData, yData, 'D')

# create data for the fitted equation plot
yModel = func(xData, *fittedParameters)

# now the model as a line plot
plt.semilogx(xData, yModel)

axes.set_xlabel('X Data') # X axis data label
axes.set_ylabel('Y Data') # Y axis data label

plt.show()
plt.close('all') # clean up after using pyplot

graphWidth = 800
graphHeight = 600
ModelAndScatterPlot(graphWidth, graphHeight)






share|improve this answer














share|improve this answer



share|improve this answer








edited Nov 13 at 12:33

























answered Nov 13 at 12:26









James Phillips

1,258387




1,258387












  • thanks for your code. now I am trying to fit two curves with the same parameters like **def func1(x, r1, r2, r3,l,c):** w=2*math.pi*x m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2) return m **def func2(x, r1, r2, r3,l,c):** w=2*math.pi*x n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2) return n is it possible to use the curve_fitting to optimize the parameters?
    – hao
    Nov 14 at 10:59












  • Looks to me as if you can simply replace the fitted function in my example with the code in your comment. Note that I used "numpy.pi" as I did not "import math" in my example code.
    – James Phillips
    Nov 14 at 16:27


















  • thanks for your code. now I am trying to fit two curves with the same parameters like **def func1(x, r1, r2, r3,l,c):** w=2*math.pi*x m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2) return m **def func2(x, r1, r2, r3,l,c):** w=2*math.pi*x n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2) return n is it possible to use the curve_fitting to optimize the parameters?
    – hao
    Nov 14 at 10:59












  • Looks to me as if you can simply replace the fitted function in my example with the code in your comment. Note that I used "numpy.pi" as I did not "import math" in my example code.
    – James Phillips
    Nov 14 at 16:27
















thanks for your code. now I am trying to fit two curves with the same parameters like **def func1(x, r1, r2, r3,l,c):** w=2*math.pi*x m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2) return m **def func2(x, r1, r2, r3,l,c):** w=2*math.pi*x n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2) return n is it possible to use the curve_fitting to optimize the parameters?
– hao
Nov 14 at 10:59






thanks for your code. now I am trying to fit two curves with the same parameters like **def func1(x, r1, r2, r3,l,c):** w=2*math.pi*x m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2) return m **def func2(x, r1, r2, r3,l,c):** w=2*math.pi*x n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2) return n is it possible to use the curve_fitting to optimize the parameters?
– hao
Nov 14 at 10:59














Looks to me as if you can simply replace the fitted function in my example with the code in your comment. Note that I used "numpy.pi" as I did not "import math" in my example code.
– James Phillips
Nov 14 at 16:27




Looks to me as if you can simply replace the fitted function in my example with the code in your comment. Note that I used "numpy.pi" as I did not "import math" in my example code.
– James Phillips
Nov 14 at 16:27












up vote
0
down vote













To have your Least Squares regression make sense, you'll have to at least supply initial parameters which make sense.



As all parameters are by default initiated to the value 1, the biggest influence on the initial regression will be resistor r1 which adds a constant to the mix.



Most probably you'll end up in something like the following configuration:



popt
Out[241]:
array([1.66581563e+03, 2.43663552e+02, 1.13019744e+00, 1.20233767e+00,
5.04984535e-04])


Which will output a neat-looking flat line, due to m = something big + ~0 + ~0 ; n=~0 - ~0, so y = r1.



However, if you initialize your parameters somewhat differently,



popt, pcov = curve_fit(func, xdata.flatten(), ydata.flatten(), p0=[0.1,1e5,1000,1000,0.2],
bounds=((0, 0, 0, 0, 0), (np.inf, np.inf, np.inf, np.inf, np.inf)))


You will get a better looking fit,



popt
Out[244]:
array([1.14947146e+00, 4.12512324e+05, 1.36182466e+02, 8.29771756e+04,
1.77593448e+03])

((fitted-ydata.flatten())**2).mean()
Out[257]: 0.6099524982664816
#RMSE hence 0.78


enter image description here



P.s. My data starts at the second data point, due to a conversion error with pd.read_clipboard where the first row became headers instead of data. Shouldn't change the overall picture though.






share|improve this answer























  • How did you determine these initial parameter estimates?
    – James Phillips
    Nov 13 at 12:32










  • By figuring out it was a RLC circuit-like; roughly figuring C to be smaller than 1, R to be on the order of thousands; from the final part of the curve noting that R1 should be small. Manual labour in this case, I'm afraid @JamesPhillips . And then being wrong about my assumption on C, according to the fit.. :(
    – Uvar
    Nov 13 at 12:34












  • I will spend a week writing code to avoid 10 minutes of manual labor...
    – James Phillips
    Nov 13 at 12:35










  • I am afraid I would not like to spend my time on writing code which effectively brute-forces an optimal parameter initialization, whereas just picking some numbers; preferably non-uniform; will do the trick. Maybe if I need to reproduce similar thousands of times, but not for a single instance. ;)
    – Uvar
    Nov 13 at 12:37










  • The genetic algorithm in scipy allows a more general solution with wide applicability to many problems - hence my answer.
    – James Phillips
    Nov 13 at 12:40















up vote
0
down vote













To have your Least Squares regression make sense, you'll have to at least supply initial parameters which make sense.



As all parameters are by default initiated to the value 1, the biggest influence on the initial regression will be resistor r1 which adds a constant to the mix.



Most probably you'll end up in something like the following configuration:



popt
Out[241]:
array([1.66581563e+03, 2.43663552e+02, 1.13019744e+00, 1.20233767e+00,
5.04984535e-04])


Which will output a neat-looking flat line, due to m = something big + ~0 + ~0 ; n=~0 - ~0, so y = r1.



However, if you initialize your parameters somewhat differently,



popt, pcov = curve_fit(func, xdata.flatten(), ydata.flatten(), p0=[0.1,1e5,1000,1000,0.2],
bounds=((0, 0, 0, 0, 0), (np.inf, np.inf, np.inf, np.inf, np.inf)))


You will get a better looking fit,



popt
Out[244]:
array([1.14947146e+00, 4.12512324e+05, 1.36182466e+02, 8.29771756e+04,
1.77593448e+03])

((fitted-ydata.flatten())**2).mean()
Out[257]: 0.6099524982664816
#RMSE hence 0.78


enter image description here



P.s. My data starts at the second data point, due to a conversion error with pd.read_clipboard where the first row became headers instead of data. Shouldn't change the overall picture though.






share|improve this answer























  • How did you determine these initial parameter estimates?
    – James Phillips
    Nov 13 at 12:32










  • By figuring out it was a RLC circuit-like; roughly figuring C to be smaller than 1, R to be on the order of thousands; from the final part of the curve noting that R1 should be small. Manual labour in this case, I'm afraid @JamesPhillips . And then being wrong about my assumption on C, according to the fit.. :(
    – Uvar
    Nov 13 at 12:34












  • I will spend a week writing code to avoid 10 minutes of manual labor...
    – James Phillips
    Nov 13 at 12:35










  • I am afraid I would not like to spend my time on writing code which effectively brute-forces an optimal parameter initialization, whereas just picking some numbers; preferably non-uniform; will do the trick. Maybe if I need to reproduce similar thousands of times, but not for a single instance. ;)
    – Uvar
    Nov 13 at 12:37










  • The genetic algorithm in scipy allows a more general solution with wide applicability to many problems - hence my answer.
    – James Phillips
    Nov 13 at 12:40













up vote
0
down vote










up vote
0
down vote









To have your Least Squares regression make sense, you'll have to at least supply initial parameters which make sense.



As all parameters are by default initiated to the value 1, the biggest influence on the initial regression will be resistor r1 which adds a constant to the mix.



Most probably you'll end up in something like the following configuration:



popt
Out[241]:
array([1.66581563e+03, 2.43663552e+02, 1.13019744e+00, 1.20233767e+00,
5.04984535e-04])


Which will output a neat-looking flat line, due to m = something big + ~0 + ~0 ; n=~0 - ~0, so y = r1.



However, if you initialize your parameters somewhat differently,



popt, pcov = curve_fit(func, xdata.flatten(), ydata.flatten(), p0=[0.1,1e5,1000,1000,0.2],
bounds=((0, 0, 0, 0, 0), (np.inf, np.inf, np.inf, np.inf, np.inf)))


You will get a better looking fit,



popt
Out[244]:
array([1.14947146e+00, 4.12512324e+05, 1.36182466e+02, 8.29771756e+04,
1.77593448e+03])

((fitted-ydata.flatten())**2).mean()
Out[257]: 0.6099524982664816
#RMSE hence 0.78


enter image description here



P.s. My data starts at the second data point, due to a conversion error with pd.read_clipboard where the first row became headers instead of data. Shouldn't change the overall picture though.






share|improve this answer














To have your Least Squares regression make sense, you'll have to at least supply initial parameters which make sense.



As all parameters are by default initiated to the value 1, the biggest influence on the initial regression will be resistor r1 which adds a constant to the mix.



Most probably you'll end up in something like the following configuration:



popt
Out[241]:
array([1.66581563e+03, 2.43663552e+02, 1.13019744e+00, 1.20233767e+00,
5.04984535e-04])


Which will output a neat-looking flat line, due to m = something big + ~0 + ~0 ; n=~0 - ~0, so y = r1.



However, if you initialize your parameters somewhat differently,



popt, pcov = curve_fit(func, xdata.flatten(), ydata.flatten(), p0=[0.1,1e5,1000,1000,0.2],
bounds=((0, 0, 0, 0, 0), (np.inf, np.inf, np.inf, np.inf, np.inf)))


You will get a better looking fit,



popt
Out[244]:
array([1.14947146e+00, 4.12512324e+05, 1.36182466e+02, 8.29771756e+04,
1.77593448e+03])

((fitted-ydata.flatten())**2).mean()
Out[257]: 0.6099524982664816
#RMSE hence 0.78


enter image description here



P.s. My data starts at the second data point, due to a conversion error with pd.read_clipboard where the first row became headers instead of data. Shouldn't change the overall picture though.







share|improve this answer














share|improve this answer



share|improve this answer








edited Nov 13 at 12:36

























answered Nov 13 at 12:29









Uvar

2,391519




2,391519












  • How did you determine these initial parameter estimates?
    – James Phillips
    Nov 13 at 12:32










  • By figuring out it was a RLC circuit-like; roughly figuring C to be smaller than 1, R to be on the order of thousands; from the final part of the curve noting that R1 should be small. Manual labour in this case, I'm afraid @JamesPhillips . And then being wrong about my assumption on C, according to the fit.. :(
    – Uvar
    Nov 13 at 12:34












  • I will spend a week writing code to avoid 10 minutes of manual labor...
    – James Phillips
    Nov 13 at 12:35










  • I am afraid I would not like to spend my time on writing code which effectively brute-forces an optimal parameter initialization, whereas just picking some numbers; preferably non-uniform; will do the trick. Maybe if I need to reproduce similar thousands of times, but not for a single instance. ;)
    – Uvar
    Nov 13 at 12:37










  • The genetic algorithm in scipy allows a more general solution with wide applicability to many problems - hence my answer.
    – James Phillips
    Nov 13 at 12:40


















  • How did you determine these initial parameter estimates?
    – James Phillips
    Nov 13 at 12:32










  • By figuring out it was a RLC circuit-like; roughly figuring C to be smaller than 1, R to be on the order of thousands; from the final part of the curve noting that R1 should be small. Manual labour in this case, I'm afraid @JamesPhillips . And then being wrong about my assumption on C, according to the fit.. :(
    – Uvar
    Nov 13 at 12:34












  • I will spend a week writing code to avoid 10 minutes of manual labor...
    – James Phillips
    Nov 13 at 12:35










  • I am afraid I would not like to spend my time on writing code which effectively brute-forces an optimal parameter initialization, whereas just picking some numbers; preferably non-uniform; will do the trick. Maybe if I need to reproduce similar thousands of times, but not for a single instance. ;)
    – Uvar
    Nov 13 at 12:37










  • The genetic algorithm in scipy allows a more general solution with wide applicability to many problems - hence my answer.
    – James Phillips
    Nov 13 at 12:40
















How did you determine these initial parameter estimates?
– James Phillips
Nov 13 at 12:32




How did you determine these initial parameter estimates?
– James Phillips
Nov 13 at 12:32












By figuring out it was a RLC circuit-like; roughly figuring C to be smaller than 1, R to be on the order of thousands; from the final part of the curve noting that R1 should be small. Manual labour in this case, I'm afraid @JamesPhillips . And then being wrong about my assumption on C, according to the fit.. :(
– Uvar
Nov 13 at 12:34






By figuring out it was a RLC circuit-like; roughly figuring C to be smaller than 1, R to be on the order of thousands; from the final part of the curve noting that R1 should be small. Manual labour in this case, I'm afraid @JamesPhillips . And then being wrong about my assumption on C, according to the fit.. :(
– Uvar
Nov 13 at 12:34














I will spend a week writing code to avoid 10 minutes of manual labor...
– James Phillips
Nov 13 at 12:35




I will spend a week writing code to avoid 10 minutes of manual labor...
– James Phillips
Nov 13 at 12:35












I am afraid I would not like to spend my time on writing code which effectively brute-forces an optimal parameter initialization, whereas just picking some numbers; preferably non-uniform; will do the trick. Maybe if I need to reproduce similar thousands of times, but not for a single instance. ;)
– Uvar
Nov 13 at 12:37




I am afraid I would not like to spend my time on writing code which effectively brute-forces an optimal parameter initialization, whereas just picking some numbers; preferably non-uniform; will do the trick. Maybe if I need to reproduce similar thousands of times, but not for a single instance. ;)
– Uvar
Nov 13 at 12:37












The genetic algorithm in scipy allows a more general solution with wide applicability to many problems - hence my answer.
– James Phillips
Nov 13 at 12:40




The genetic algorithm in scipy allows a more general solution with wide applicability to many problems - hence my answer.
– James Phillips
Nov 13 at 12:40


















 

draft saved


draft discarded



















































 


draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53268378%2fpython-curve-fitting-with-bad-results%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

How to change which sound is reproduced for terminal bell?

Can I use Tabulator js library in my java Spring + Thymeleaf project?

Title Spacing in Bjornstrup Chapter, Removing Chapter Number From Contents