Statistical Analysis of Wind Speed Data Based on Weibull and Rayleigh Distribution

Global warming and environmental pollution have become widely discussed issues in recent decades. Current major energy sources have a significant impact on ecosystems and should be replaced by alternative renewable sources of energy. Wind power plants are promising energy sources with minimal environmental impact and huge energy potential. However, attention has to be paid to selecting the optimal locality in order to maximize efficiency and reduce costs. This paper introduces the Weibull and Rayleigh distributions as tools for wind speed analysis and proposes a comprehensive methodology for evaluating wind speed conditions in a specific locality. For this study, wind speed data from the Meteorological Observatory Bratislava-Mlynska dolina were statistically analyzed. The data were collected during 2009 in a quasi-continuous regime by an anemometer connected to an electronic buffer. The main objective of this paper is to propose suitable probability distribution functions for fitting the observed wind speed data and to establish a methodology for analyzing wind conditions. Based on studies [1]-[14], we use the Weibull distribution and its special case, the Rayleigh distribution, to approximate the measured wind speed data. The maximum likelihood method was used to estimate the parameters of the distribution functions. The coefficient of determination ($R^2$) and the root mean square error (RMSE) were used to evaluate the fitting performance of the Weibull and Rayleigh distribution functions.


Introduction
Global warming and environmental pollution have become widely discussed issues in recent decades. Current major energy sources have a significant impact on ecosystems and should be replaced by alternative renewable sources of energy. Wind power plants are promising energy sources with minimal environmental impact and huge energy potential. However, attention has to be paid to selecting the optimal locality in order to maximize efficiency and reduce costs. This paper introduces the Weibull and Rayleigh distributions as tools for wind speed analysis and proposes a comprehensive methodology for evaluating wind speed conditions in a specific locality.
For this study, wind speed data from the Meteorological Observatory Bratislava-Mlynska dolina were statistically analyzed. The data were collected during 2009 in a quasi-continuous regime by an anemometer connected to an electronic buffer. The main objective of this paper is to propose suitable probability distribution functions for fitting the observed wind speed data and to establish a methodology for analyzing wind conditions. Based on studies [1]-[14], we use the Weibull distribution and its special case, the Rayleigh distribution, to approximate the measured wind speed data.
The maximum likelihood method was used to estimate the parameters of the distribution functions. The coefficient of determination ($R^2$) and the root mean square error (RMSE) were used to evaluate the fitting performance of the Weibull and Rayleigh distribution functions.
The Weibull distribution and its special case, the Rayleigh distribution, are commonly used and recommended probability distributions for describing wind speed data. The probability density function of the Weibull distribution with parameters $k > 0$ and $c > 0$ is, for $v > 0$, given by

$$f(v) = \frac{k}{c} \left( \frac{v}{c} \right)^{k-1} \exp\left[ -\left( \frac{v}{c} \right)^{k} \right] \tag{1}$$

where $v$ is the wind speed, $k$ is the dimensionless shape parameter and $c$ is the scale parameter in units of the wind speed. The corresponding cumulative distribution function is given by

$$F(v) = 1 - \exp\left[ -\left( \frac{v}{c} \right)^{k} \right] \tag{2}$$

The Rayleigh distribution is a special case of the Weibull distribution in which the shape parameter is set to $k = 2$. Consequently, the probability density function of the Rayleigh distribution becomes

$$f(v) = \frac{2v}{c^{2}} \exp\left[ -\left( \frac{v}{c} \right)^{2} \right] \tag{3}$$

[...] where $v_i$, $i = 1, 2, \ldots, n$, is the averaged wind speed (month, year, season) and $n$ is the number of records.
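The density and distribution functions described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function names and the numerical values in the usage lines are chosen for illustration and are not taken from the paper.

```python
import math


def weibull_pdf(v, k, c):
    """Weibull probability density at wind speed v (shape k > 0, scale c > 0)."""
    if v <= 0:
        return 0.0  # the density is defined for v > 0 only
    return (k / c) * (v / c) ** (k - 1) * math.exp(-((v / c) ** k))


def weibull_cdf(v, k, c):
    """Weibull cumulative distribution function at wind speed v."""
    if v <= 0:
        return 0.0
    return 1.0 - math.exp(-((v / c) ** k))


def rayleigh_pdf(v, c):
    """Rayleigh density, i.e. the Weibull density with the shape fixed at k = 2."""
    return weibull_pdf(v, 2.0, c)


# Illustrative values only (not results from the paper).
density = weibull_pdf(8.0, 1.9, 11.8)
prob_below_scale = weibull_cdf(11.8, 1.9, 11.8)  # equals 1 - exp(-1) for any k
```

A useful sanity check is that the cumulative distribution function evaluated at v = c equals 1 - exp(-1), roughly 0.632, regardless of the shape parameter.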

Methods for estimating the parameters of the Weibull and Rayleigh distribution
The estimates of the Weibull and Rayleigh distribution parameters were calculated using (4), (5) and (6) for each month, each season and the whole year.
The performance of the Weibull and Rayleigh distributions was evaluated by the coefficient of determination ($R^2$) and the root mean square error (RMSE). These quantities were calculated using equations (9) and (10):

$$R^{2} = 1 - \frac{\sum_{i=1}^{N} (y_i - x_i)^{2}}{\sum_{i=1}^{N} (y_i - y_r)^{2}} \tag{9}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (y_i - x_i)^{2}} \tag{10}$$

where $N$ is the number of wind speed data, $y_i$ is the $i$-th ordered observed wind speed value ($y_1 \le y_2 \le \ldots \le y_N$), $x_i$ is the $i$-th predicted value calculated using the Weibull or Rayleigh distribution, $i = 1, 2, \ldots, N$, and $y_r$ is the average of the values $y_1, y_2, \ldots, y_N$.
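The two goodness-of-fit measures can be sketched as follows. The sketch assumes the common definitions, namely the coefficient of determination as one minus the ratio of the residual sum of squares to the total sum of squares, and RMSE as the square root of the mean squared residual. The function names and the sample values are hypothetical.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination, assuming the form 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((y - x) ** 2 for y, x in zip(observed, predicted))
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((y - x) ** 2 for y, x in zip(observed, predicted)) / n)


# Hypothetical observed and predicted values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
fit_r2 = r_squared(observed, predicted)
fit_rmse = rmse(observed, predicted)
```

A perfect fit gives a coefficient of determination of 1 and an RMSE of 0, which matches the selection rule of preferring the higher coefficient and the lower error.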
The coefficient $R^2$ ranges from 0 to 1; higher values are better, and in the ideal case $R^2$ approaches 1. RMSE ranges from 0 to infinity; lower values are better, and in the ideal case RMSE approaches 0. Therefore, the most suitable wind speed distribution is the one with the higher value of $R^2$ and the lower value of RMSE. $R^2$ and RMSE were calculated for each month, each season and the whole year.

Table 1 shows the monthly and yearly descriptive statistics: average wind speed, standard deviation, maximum, skewness, kurtosis and median. The yearly average wind speed is 10.485 km/h and the yearly standard deviation is 5.841 km/h. The monthly average wind speed varies between 8.272 and 13.617 km/h, with the maximum in March and the minimum in September. The same holds for the monthly standard deviation, which reaches its highest value in March (7.525 km/h) and its lowest in September (4.327 km/h). The monthly average wind speeds are shown in Fig. 1.

Table 2 shows the seasonal descriptive statistics of the wind speed. The highest average wind speed is observed in the winter season (11.657 km/h) and the lowest in the summer season (9.380 km/h). The highest standard deviation was calculated for the spring season (6.260 km/h) and the lowest for the summer season (4.917 km/h).

Table 3 shows the monthly and yearly estimates of the Weibull and Rayleigh distribution parameters and the statistical analysis of the monthly and yearly wind speed distributions. One can see that the yearly shape parameter $k$ of the Weibull distribution is [...]

[...] paper we chose the maximum likelihood method (see [3], [5] and [15]).

Results and discussion
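The maximum likelihood estimates of the Weibull parameters $k$ and $c$ are given by equations (4) and (5):

$$k = \left[ \frac{\sum_{i=1}^{n} v_i^{k} \ln v_i}{\sum_{i=1}^{n} v_i^{k}} - \frac{\sum_{i=1}^{n} \ln v_i}{n} \right]^{-1} \tag{4}$$

$$c = \left( \frac{1}{n} \sum_{i=1}^{n} v_i^{k} \right)^{1/k} \tag{5}$$

where $v_i$, $i = 1, 2, \ldots, n$, is the wind speed and $n$ is the number of nonzero wind speeds. The shape parameter $k$ was estimated by numerically solving the nonlinear equation (4) with Newton's method. The scale parameter $c$ was then obtained by evaluating equation (5). For the Rayleigh distribution, the maximum likelihood estimate of the parameter $c$ can be expressed explicitly by equation (6):

$$c = \sqrt{\frac{1}{n} \sum_{i=1}^{n} v_i^{2}} \tag{6}$$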
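The estimation procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses the standard Weibull likelihood equation for the shape parameter, starts Newton's method from the Rayleigh value k = 2, and halves the iterate if a step would make it non-positive. The function names, starting value, tolerance and safeguard are all assumptions.

```python
import math


def fit_weibull_mle(speeds, k0=2.0, tol=1e-10, max_iter=100):
    """Estimate Weibull shape k and scale c by maximum likelihood.

    Solves g(k) = sum(v^k ln v) / sum(v^k) - 1/k - mean(ln v) = 0
    with Newton's method, then evaluates c in closed form.
    """
    v = [s for s in speeds if s > 0]  # only nonzero wind speeds enter the fit
    n = len(v)
    logs = [math.log(s) for s in v]
    mean_log = sum(logs) / n

    k = k0
    for _ in range(max_iter):
        w = [s ** k for s in v]
        s0 = sum(w)
        s1 = sum(wi * li for wi, li in zip(w, logs))
        s2 = sum(wi * li * li for wi, li in zip(w, logs))
        g = s1 / s0 - 1.0 / k - mean_log
        dg = (s2 * s0 - s1 * s1) / (s0 * s0) + 1.0 / (k * k)  # derivative of g, always > 0
        k_new = k - g / dg
        if k_new <= 0:
            k_new = k / 2.0  # safeguard (assumption): keep the iterate positive
        converged = abs(k_new - k) < tol
        k = k_new
        if converged:
            break

    c = (sum(s ** k for s in v) / n) ** (1.0 / k)
    return k, c


def fit_rayleigh_mle(speeds):
    """Closed-form maximum likelihood estimate of the Rayleigh scale c."""
    v = [s for s in speeds if s > 0]
    return math.sqrt(sum(s * s for s in v) / len(v))
```

Calling `fit_weibull_mle(hourly_speeds)` on a list of hourly wind speeds (a hypothetical variable) returns the pair (k, c), and `fit_rayleigh_mle(hourly_speeds)` returns the Rayleigh scale. The Rayleigh estimate coincides with the Weibull scale formula evaluated at k = 2.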

Descriptions of wind speed data
The wind speed data processed in this paper were measured at the Meteorological Observatory Bratislava-Mlynska dolina, situated on the campus of the Faculty of Mathematics, Physics and Informatics, Comenius University in Bratislava, from January 2009 to December 2009. The wind speed and direction were measured continuously by an anemometer connected to a storage system. In order to remove random fluctuations, the continuous data were averaged hourly and rounded to the nearest integer.
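The hourly averaging and rounding step can be sketched as below. The record format (pairs of timestamp and wind speed) and the function name are hypothetical, since the paper does not describe the storage format. Ties are rounded half up here, which is also an assumption; Python's built-in `round` would instead round halves to the nearest even integer.

```python
import math
from collections import defaultdict
from datetime import datetime


def hourly_averages(records):
    """Average (timestamp, speed) records per clock hour and round to an integer.

    Returns a dict mapping the start of each hour to the rounded mean speed.
    """
    buckets = defaultdict(list)
    for timestamp, speed in records:
        hour_start = timestamp.replace(minute=0, second=0, microsecond=0)
        buckets[hour_start].append(speed)
    return {
        hour: math.floor(sum(values) / len(values) + 0.5)  # round half up (assumption)
        for hour, values in sorted(buckets.items())
    }


# Hypothetical readings in km/h, for illustration only.
readings = [
    (datetime(2009, 1, 1, 10, 5), 9.2),
    (datetime(2009, 1, 1, 10, 35), 10.1),
    (datetime(2009, 1, 1, 11, 10), 12.6),
]
hourly = hourly_averages(readings)
```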

Statistical analysis of wind speed distributions
The wind speed data were divided into subsets by month and by season. Spring was considered to last from March to May, summer from June to August, autumn from September to November and winter from December to February. The monthly, yearly and seasonal average wind [...]

[Table 1: Monthly and yearly wind speed descriptive statistics]

[...] probability density distributions derived from the observed data with the Weibull and Rayleigh probability density distributions are illustrated in Fig. 3.

Table 4 shows the seasonal estimates of the Weibull and Rayleigh distribution parameters and the statistical analysis of the seasonal wind speed. The comparison of the seasonal Weibull and Rayleigh probability density distributions with the observed seasonal probability density distributions of the wind speed is illustrated in Fig. 4. The scale parameter $c$ of the Weibull distribution is highest in the winter season and lowest in the summer season, and essentially the same holds for the parameter $c$ of the Rayleigh distribution. The seasonal value of the Weibull shape parameter $k$ ranges from 1.768 to 2.064, and the Weibull scale parameter $c$ ranges from 10.713 to 13.257 km/h. The seasonal value of the Rayleigh parameter $c$ ranges from 10.641 to 13.203 km/h. The seasonal value of $R^2$ ranges from 0.85497 to 0.98038 for the Weibull distribution and from 0.85634 to 0.98071 for the Rayleigh distribution. The value of RMSE ranges from 0.00355 to 0.00868 for the Weibull distribution and from 0.00410 to 0.00864 for the Rayleigh distribution.

The performance of the Weibull and Rayleigh distributions was evaluated by the coefficient of determination ($R^2$) and the root mean square error (RMSE). For the yearly wind speed data, the value of $R^2$ is 0.99027 for the Weibull distribution and 0.97921 for the Rayleigh distribution. The value of RMSE is 0.00251 for the Weibull distribution and 0.00367 for the Rayleigh distribution for the same data set. The yearly comparison thus shows that the Weibull distribution returns a higher value of $R^2$ and a smaller value of RMSE. This indicates that the Weibull distribution is a slightly better choice for fitting the yearly wind speed data than the Rayleigh distribution.
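The seasonal split used in this section (spring from March to May, summer from June to August, autumn from September to November, winter from December to February) can be sketched as a simple lookup followed by per-season statistics. The input format, the names and the use of the sample standard deviation are assumptions, since the paper does not state which estimator was used.

```python
from collections import defaultdict
from statistics import mean, stdev

# Month number -> season, following the split stated in the text.
SEASON_BY_MONTH = {
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
    12: "winter", 1: "winter", 2: "winter",
}


def seasonal_summary(records):
    """Group (month, speed) records by season and return (mean, stdev) per season.

    Uses the sample standard deviation (assumption), which needs at least
    two records in every season that is present.
    """
    groups = defaultdict(list)
    for month, speed in records:
        groups[SEASON_BY_MONTH[month]].append(speed)
    return {season: (mean(values), stdev(values)) for season, values in groups.items()}


# Hypothetical records in km/h, for illustration only.
summary = seasonal_summary([(1, 10.0), (2, 12.0), (7, 8.0), (8, 10.0)])
```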
For the monthly wind speed data, the value of $R^2$ ranges from 0.70181 to 0.95882 for the Weibull distribution and from 0.69942 to 0.95942 for the Rayleigh distribution. The RMSE ranges from 0.00592 to 0.01392 for the Weibull distribution and from 0.00598 to 0.01393 for the Rayleigh distribution. The month-to-month comparison shows that, in general, the Weibull distribution leads to higher values of $R^2$ and smaller values of RMSE than the Rayleigh distribution; this holds for 8 months of 2009. It indicates that the Weibull distribution is slightly better for fitting the monthly wind speed data than the Rayleigh distribution. The values of $R^2$ and RMSE obtained by fitting the monthly [...]

[Table 4: Seasonal estimates of the Weibull and Rayleigh distribution parameters and statistical analysis for wind speed distributions]

The Weibull distribution has been found to be more suitable than the Rayleigh distribution for fitting the wind speed data in eight months.
The Weibull distribution can be recommended for fitting the wind speed data on a seasonal basis.