DOMINANT AIR POLLUTION SOURCE DETERMINATION IN THE VICINITY OF COKING PLANT BASED ON STATISTICAL DATA ANALYSIS

The station measures concentrations of SO2, NO2, PM2.5, PM10 and VOCs. PM10 is sampled and analysed for the presence of PAHs and heavy metals. The station also measures basic meteorological variables: temperature, air pressure, humidity, and wind direction and velocity at 2 m above ground. The goal of the article is to present a statistical determination of the dominant air pollution sources at one of the Ostrava pollution monitoring stations. The statistical analyses were based on correlation analysis and quantile-based time pattern analysis. These analyses showed that benzene and toluene pollution is predominantly caused by the same pollution sources. The time pattern analysis then showed the dominance of a nearby coking plant. Time pattern analyses also showed traffic to be the dominant source of NO2 pollution, and PM10 pollution to be a mix of traffic, heating and industrial sources.

Each pollution source has its own specific temporal pattern, which may also be observed in the air pollution data.
Car traffic is characterised by a weekly periodicity: traffic is high from Monday to Friday, low on Saturday and slightly higher on Sunday.
Heat sources (domestic heating, central heating, industrial heating, etc.) are characterised by their year-long periodicity with high heat source utilisation during winter, low or no utilisation during summer, and decreasing/increasing utilisation during the transitional months of spring and autumn.
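The weekly and yearly patterns described above can be looked for with a simple grouped quantile summary. The following is a minimal sketch only, not the procedure used in the study: the function name `quantile_pattern`, the choice of quartiles and the synthetic daily values are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import median, quantiles


def quantile_pattern(records, key):
    """Group (day, concentration) records by key(day) and summarise each group.

    Returns {group: (q25, median, q75)}, i.e. quartiles rather than
    mean and variance.
    """
    groups = defaultdict(list)
    for day, value in records:
        groups[key(day)].append(value)
    summary = {}
    for group, values in sorted(groups.items()):
        q25, _, q75 = quantiles(values, n=4, method="inclusive")
        summary[group] = (q25, median(values), q75)
    return summary


# Synthetic daily values (hypothetical, NOT station data): high on working
# days, lowest on Saturday, slightly higher on Sunday.
start = date(2013, 1, 7)  # a Monday
level = {0: 40, 1: 40, 2: 40, 3: 40, 4: 40, 5: 20, 6: 25}
records = []
for i in range(28):
    day = start + timedelta(days=i)
    records.append((day, level[day.weekday()] + i // 7))  # small weekly drift

weekly = quantile_pattern(records, key=lambda d: d.weekday())  # 0 = Monday
monthly = quantile_pattern(records, key=lambda d: d.month)

for weekday, (q25, med, q75) in weekly.items():
    print(weekday, q25, med, q75)
```

A traffic-dominated pollutant would be expected to show lower weekend medians in `weekly`, while a heating-dominated pollutant would show the contrast between winter and summer months in `monthly`.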
The air pollution data can also be clustered by wind direction, which may give an insight into the direction in which the dominant pollution sources lie [8].
Data for the analysis were collected between 2007 and 2013. The air pollution data were taken from the Ostrava-Privoz air pollution measurement station, and the meteorological data were taken from the meteorological station in Mosnov, which represents the overall meteorological conditions in the region.
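Clustering by wind direction can be sketched as follows. This is an illustrative assumption rather than the authors' implementation: the eight 45-degree sectors, the helper names and the sample values are hypothetical. Wind direction is taken in the usual meteorological sense (the direction the wind blows from), so a source lying to the NE raises concentrations when the wind is from the NE.

```python
from collections import defaultdict
from statistics import median

SECTORS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]


def wind_sector(direction_deg):
    """Map a wind direction in degrees (0 = N, 90 = E) to one of 8 sectors."""
    return SECTORS[int(((direction_deg % 360) + 22.5) // 45) % 8]


def median_by_sector(samples):
    """samples: iterable of (wind_direction_deg, concentration) pairs."""
    groups = defaultdict(list)
    for direction, value in samples:
        groups[wind_sector(direction)].append(value)
    return {sector: median(values) for sector, values in groups.items()}


# Hypothetical samples (NOT station data): elevated values with NE wind.
samples = [(40, 9.0), (50, 11.0), (60, 10.0), (95, 3.0),
           (265, 4.0), (275, 2.0), (350, 1.0)]
by_sector = median_by_sector(samples)
dominant = max(by_sector, key=by_sector.get)
print(by_sector, dominant)
```

Using the median per sector keeps a few extreme values from dominating the picture; upper quantiles per sector could be added in the same way to reveal irregular peaks.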

Methods and data - statistical properties of air pollution data
According to the literature [8], air pollution data have an approximately lognormal statistical distribution. The lognormal distribution is not symmetric and is characterised by one-sided, extremely high values.
Because of these statistical properties, commonly used methods based on the arithmetic mean and variance should not be used. That is why an approach based on median-quantile analyses was used in this work.
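The effect of the skewness on the arithmetic mean can be made explicit with the standard lognormal formulas. Here X stands for a concentration, and μ and σ are the mean and standard deviation of ln X; these symbols are introduced only for this illustration.

```latex
\[
\ln X \sim \mathcal{N}(\mu, \sigma^2)
\quad\Longrightarrow\quad
\operatorname{median}(X) = e^{\mu},
\qquad
\operatorname{E}[X] = e^{\mu + \sigma^2/2},
\qquad
\frac{\operatorname{E}[X]}{\operatorname{median}(X)} = e^{\sigma^2/2} > 1
\quad (\sigma > 0).
\]
```

For example, with σ = 1 the mean exceeds the median by a factor of about 1.65, so the mean is pulled upwards by the rare high values while the median is not.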
The statistical distribution can be tested by the Kolmogorov-Smirnov test. This test was performed for each of the studied pollutants to confirm the lognormal distribution (SW Statgraphics [9]). All tests confirmed the lognormal distribution for all observed pollutants (see Fig. 2).
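The study used Statgraphics for this test; the sketch below only illustrates the idea in plain Python and is not the authors' procedure. It fits a lognormal distribution by fitting a normal distribution to the logarithms, computes the Kolmogorov-Smirnov statistic D and compares it with the asymptotic 5 % critical value 1.36/√n. The function name and the synthetic samples are assumptions. Because the parameters are estimated from the same data, this critical value is more lenient than the nominal 5 % level.

```python
import math
import random
from statistics import NormalDist, fmean, stdev


def ks_lognormal(values):
    """One-sample Kolmogorov-Smirnov statistic against a fitted lognormal.

    Returns (D, approximate 5 % critical value 1.36 / sqrt(n)).
    """
    logs = sorted(math.log(v) for v in values)  # values must be positive
    n = len(logs)
    # A lognormal fit is a normal fit on the logarithms.
    fitted = NormalDist(fmean(logs), stdev(logs))
    d = 0.0
    for i, x in enumerate(logs):
        cdf = fitted.cdf(x)
        # Largest gap between the empirical step function and the fitted CDF.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d, 1.36 / math.sqrt(n)


rng = random.Random(0)  # synthetic samples, NOT station data
lognormal_sample = [rng.lognormvariate(3.0, 0.6) for _ in range(500)]
uniform_sample = [rng.uniform(1.0, 100.0) for _ in range(500)]

d_logn, d_crit = ks_lognormal(lognormal_sample)
d_unif, _ = ks_lognormal(uniform_sample)
print(f"lognormal sample: D = {d_logn:.3f}, critical value = {d_crit:.3f}")
print(f"uniform sample:   D = {d_unif:.3f}, critical value = {d_crit:.3f}")
```

A value of D below the critical value means that the lognormal hypothesis is not rejected; a value above it, as expected for the uniform sample, means that it is rejected.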

Fig. 2 Histogram of NO2 concentrations and fitted lognormal distribution
The Ostrava-Privoz station is at one of the most polluted sites in the Czech Republic. Concentrations of PM10 and benzene are above the threshold limits, which were exceeded every year in the 2007-2013 period [1].
There are several major air pollution sources which may contribute to the high air pollutant concentrations: industrial sources (the OKK Koksovna Svoboda coking plant to the NE, the OKK Koksovna Jan Sverma coking plant to the W and the BorsodChem chemical plant to the W), domestic heating sources in the vicinity of the station, Hlucinska street to the E with heavy car traffic, and an old ecological burden, the Ostramo oil lagoons, to the SW. The site lies within the highly populated industrial Upper Silesian region; it is therefore also influenced by a variety of industrial and non-industrial air pollution sources from both the Czech and Polish parts of the region.
The air pollution at the measurement site was analysed in several studies that assessed the air quality in the region. Those studies were based on air pollution dispersion modelling. PM10 pollution was analysed in the Air Silesia project [3] and in a study for the regional council [4]. NO2 pollution was one of the concerns of the National Health Institute study [5]. Results for the Ostrava-Privoz station site are presented in Table 1.
Table 1 Contribution of pollution sources to the pollutant concentrations in μg/m3 [3], [4] and [5]
The validity of such results can be questioned because of input data inaccuracy and model simplifications. That is why verification of the model results is crucial. One of the verification possibilities is the statistical analysis of the pollution monitoring data. More details and examples can be found in the literature [6] and [7].

Methods and data - generally
The goal of this work was to perform basic statistical analyses which would allow the dominant air pollution sources to be determined for each of the studied pollutants: PM10, NO2, benzene and toluene.
The correlation analysis allows grouping pollutants which are produced by the same dominant pollution sources.
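The article does not state which correlation coefficient was used. As a hedged illustration only, the sketch below uses the Spearman rank correlation, which suits skewed, approximately lognormal data, and evaluates it separately for each month so that seasonal weakening of a correlation becomes visible. All function names and the sample values are hypothetical.

```python
from collections import defaultdict


def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def spearman_by_month(records):
    """records: iterable of (month, pollutant_a, pollutant_b) -> {month: rho}."""
    groups = defaultdict(lambda: ([], []))
    for month, a, b in records:
        groups[month][0].append(a)
        groups[month][1].append(b)
    return {m: spearman(xs, ys) for m, (xs, ys) in sorted(groups.items())}


# Hypothetical values (NOT station data): tight relation in January,
# weak relation in July.
records = [(1, 2.0, 3.1), (1, 4.0, 6.5), (1, 8.0, 11.0), (1, 16.0, 30.0),
           (7, 1.0, 2.0), (7, 2.0, 0.5), (7, 3.0, 1.5), (7, 4.0, 1.0)]
by_month = spearman_by_month(records)
print(by_month)
```

Two pollutants with a common dominant source would be expected to give a coefficient close to 1 in most months.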

Results - NO2 analysis
The NO2 analysis shows that concentrations are falling slightly. This corresponds well with the slightly decreasing emissions from car traffic in the Czech Republic (see Fig. 7) [10]. The daily-based analysis shows the key effect of car traffic on the NO2 concentrations (see Fig. 8). The local traffic intensity can be expressed as the number of vehicles that pass through the measured area per day. The latest figures, obtained in 2010, are:

Results - Correlation analysis
Four pollutants were analysed: PM10, NO2, benzene and toluene. The correlation analysis confirmed that benzene and toluene are strongly correlated with each other, which indicates that they share the same dominant pollution sources. This strong correlation is severely weakened from June to August. This suggests the effect of photochemical reactions, which mostly occur during those three months, when solar radiation is strongest; during this period, the two pollutants decompose at different rates (see Fig. 3). Correlations among PM10, NO2 and VOCs are lower (0.4-0.6). This suggests different pollution sources that are weighted differently for each pollutant.

This is probably the effect of reduced industrial pollution (modernisation of coke oven batteries, particulate matter filters in the ArcelorMittal plant, etc.); see Fig. 4. The monthly-based analysis shows higher and more variable concentrations during winter, due to the combination of emissions from heat sources and the occurrence of weather unfavourable for pollution dispersion (see Fig. 5). The analysis of PM10 concentrations clustered by wind direction shows the presence of an important point source to the NE, the OKK Koksovna Svoboda coking plant, and irregular concentration peaks from the east caused by the re-emission of particles by car traffic (see Fig. 6).

The analysis of benzene concentrations clustered by wind direction shows the dominance of a point source to the NE, the OKK Koksovna Svoboda coking plant (see Fig. 10).
This result is confirmed by the analysis based on annual comparison, which shows a steady decline in concentrations and significantly lower values in 2009. This corresponds well with the technology improvements in the coking plant and with the 40% production reduction in 2009, respectively (see Fig. 11).

Conclusion
The correlation and time pattern analyses are simple procedures which may reveal the presence or dominance of certain pollution sources.
They may be used to analyse measurement site data and to give insight into locally important pollution sources, as well as to serve as an independent verification method for more detailed air pollution dispersion modelling.
Analyses of data from the Ostrava-Privoz air pollution monitoring station confirmed the results of the pollution dispersion modelling: PM10 concentrations result from a combination of heat sources, car traffic and industrial sources; NO2 concentrations are predominantly caused by car traffic; and VOC concentrations are predominantly caused by the OKK Koksovna Svoboda coking plant. The analyses performed do not numerically quantify the influence of individual air pollution sources or their groups. It remains to be considered whether and how this approach could be further developed to provide such results.

Acknowledgement
This article was financially supported by the Ministry of Education, Youth and Sports of the Czech Republic from the "National Feasibility Program I", project LO1208 "Theoretical Aspects of Energetic Treatment of Waste and Environment Protection against Negative Impacts".

12613 private cars, 2470 trucks, 536 buses and 129 motorbikes [11]. Traffic counts are usually compiled every five years.

Results - Benzene analysis
The benzene analysis based on monthly comparison shows that in months with the highest solar radiation (May-July), the concentrations drop significantly because of photochemical reactions (see Fig. 9).