Ian Barber has a new post about an interesting method for determining the “line” that results follow in your statistics – linear regression in PHP (complete with code samples).
There are a lot of problems that fall under predicting these types of continuous values based on limited inputs – for example: given the air pressure, how much rain will there be, given the qualifying times, how quick will the fastest lap be in the race. By taking a bunch of existing data and fitting a line, we will be able to make a prediction easily – and often reasonably correctly.
He defines two pieces of information, the intercept and the gradient, and shows how to choose them so that they minimize the “squared error”, which comes from squaring the difference between each actual and predicted value and adding the results together. Based on a sample data set, he comes up with these results, showing the trend line for the points given. He points out a few issues with the method and corrects them with a few tweaks to his original algorithm.
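To make the intercept and gradient idea concrete, here is a minimal sketch of an ordinary least-squares line fit in PHP. It is not the code from Ian's post: the function names `fitLine` and `predict` and the sample data are made up for illustration, and it uses the standard closed-form solution for a single input variable.

```php
<?php
// Hypothetical helper (not from the original post): fits
// y = intercept + gradient * x by ordinary least squares.
// Both arrays are assumed to be zero-indexed lists of numbers.
function fitLine(array $xs, array $ys): array
{
    $n = count($xs);
    if ($n < 2 || $n !== count($ys)) {
        throw new InvalidArgumentException('Need at least two (x, y) pairs of equal length.');
    }

    $meanX = array_sum($xs) / $n;
    $meanY = array_sum($ys) / $n;

    $covariance = 0.0; // sum of (x - meanX) * (y - meanY)
    $variance = 0.0;   // sum of (x - meanX)^2
    for ($i = 0; $i < $n; $i++) {
        $dx = $xs[$i] - $meanX;
        $covariance += $dx * ($ys[$i] - $meanY);
        $variance += $dx * $dx;
    }
    if ($variance == 0.0) {
        throw new InvalidArgumentException('All x values are identical; the gradient is undefined.');
    }

    // These two values minimize the sum of squared differences
    // between the actual and predicted y values.
    $gradient = $covariance / $variance;
    $intercept = $meanY - $gradient * $meanX;

    return ['intercept' => $intercept, 'gradient' => $gradient];
}

// Predicts y for a new x using a previously fitted line.
function predict(array $line, float $x): float
{
    return $line['intercept'] + $line['gradient'] * $x;
}

// Made-up sample data lying exactly on y = 1 + 2x.
$line = fitLine([1, 2, 3, 4], [3, 5, 7, 9]);
printf(
    "intercept=%.2f gradient=%.2f prediction=%.2f\n",
    $line['intercept'],
    $line['gradient'],
    predict($line, 5)
);
// prints "intercept=1.00 gradient=2.00 prediction=11.00"
```

Real data will not sit exactly on a line, so the fitted values are a best compromise across all points rather than an exact match for any of them.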