Automatic Identification Of Weed Seeds By Color Image Processing

P. M. Granitto, H. D. Navone, P. F. Verdes and H. A. Ceccatto
Instituto de Física Rosario (CONICET – Universidad Nacional de Rosario)
Boulevard 27 de Febrero 210 Bis, 2000 Rosario, Argentina
(granitto,navone,verdes,ceccatto)

ABSTRACT: The analysis and classification of seeds are essential activities contributing to the final added value in crop production. Besides varietal identification and cereal grain grading, the early identification of weeds from the analysis of foreign seeds, with the purpose of chemically controlling their growth, is also of interest to the agricultural industry. The implementation of new methods for reliable and fast identification and classification of seeds is thus of major technical and economic importance. Like manual identification, the automatic classification of seeds should be based on knowledge of seed size, shape, color and texture. In this work we present a study of the discriminating power of morphological, color and textural characteristics of weed seeds, which can be measured from video images. This study was conducted on a broad basis, considering images of weed seeds found in Argentina's commercial seed production industry and listed by the Secretary of Agriculture as prohibited and primary- and secondary-tolerated weeds. We first describe the experimental setting and hardware used to capture the seed images. Then, we define the morphological, color and textural parameters measured from these images, and discuss the selection of the most relevant ones for identification purposes. Finally, we present results for the identification of test images obtained using a Naive Bayes classifier and a committee of Artificial Neural Networks.

KEYWORDS: machine vision, seed identification, pattern recognition, neural networks.

1. INTRODUCTION

The analysis and classification of seeds are essential activities contributing to the final added value in crop production. These studies are performed at different stages of the overall process, including seed production, cereal grading for industrialization or commercialization purposes, scientific research for the improvement of species, etc. For all these purposes, different procedures based on the manual skills and judgment of specialized technicians are employed. In most cases these methods are slow, have low reproducibility, and possess a degree of subjectivity that is hard to quantify, both in their commercial and in their technological implications. It is therefore of major technical and economic importance to implement new methods for reliable and fast identification and classification of seeds. Like manual identification, the automatic classification should be based on knowledge of seed size, shape, color and texture (i.e., greytone variations on the surface). Numerous image analysis algorithms are available for such descriptions, which makes machine vision a suitable candidate for this task.

Most previous attempts to identify seeds by machine vision have concentrated on cultivated varieties. Initially it was assumed that varietal differences could be extracted from the structure of the kernel, so different geometrical measurements were used to describe a variety[1,2]. Other investigations have been conducted to separate different species of cereal grains[3,4], wheat from non-wheat components (weed seeds and stones)[5], different types of wheat[6-9] and special grading classes[9,10], etc. In these studies the image analysis was essentially restricted to basic geometrical measurements to obtain different parameters (shape factor, aspect ratio, length/area, etc.). In addition, color was successfully used to separate red-, amber- and white-colored wheat, but could not separate the samples into grading classes.
More recent studies have used color images to establish seed quality and hardseededness of some annual pasture legumes[11], to characterize fungal damage, viral diseases and immature soybean seeds[12], etc.

Besides varietal identification and cereal grain grading, the early identification of weeds from the analysis of foreign seeds, with the purpose of chemically controlling their growth, is also of major interest to the agricultural industry. Weed seeds are also identified by seed testing stations and seed corporations to measure the purity of the harvest, and by research stations to detect changes in the seed banks in the soil. The automatic identification of seeds of wild species is different from the identification of seeds of varieties of a single species. To be approved as a variety, the cultivated plants have to be homogeneous with respect to certain plant characters. Wild species, on the contrary, tend to have larger intra-species variations. Moreover, the variation among weed species will in general be larger, but seeds of some closely related species can be very similar. In terms of color, most weed seeds are light to dark brownish or black. All these characteristics make the automatic identification of weed seeds an a priori difficult classification problem. Consequently, a successful approach should include parameters associated with all the relevant characteristics of size, shape, color and texture mentioned above.

An early attempt to identify weed seeds[13] showed the importance of using color instead of black and white images to improve classification accuracy. However, this investigation considered only four different weed species, which does not provide a good characterization of seed variation. In this work we present a study of the discriminating power of morphological, color and textural characteristics of weed seeds.
This study was conducted on a much larger basis, including seed images of frequent weeds found in Argentina's commercial seed production industry. In order to avoid bias in the selection of species to be included in this study, we restricted ourselves to the 58 species listed by the Secretary of Agriculture as prohibited and primary- and secondary-tolerated weeds. From this list we finally considered the 57 species for which a good number (~40) of young exemplars were available in the seed bank of the Seed Analysis Laboratory at the Oliveros Experimental Station of the National Institute for Agricultural Technology (INTA).

This work is organized as follows. In Section 2 we describe the hardware and the experimental conditions used to capture the seed images. Then, in Section 3 we define the morphological, color and textural parameters measured from these images, and discuss the selection of the most relevant ones for identification purposes. In Section 4 we present the results obtained using two different classification methods (Naive Bayes and Artificial Neural Networks). Finally, in Section 5 we draw some conclusions.

2. EXPERIMENTAL SETTING FOR SEED IMAGE ACQUISITION

We have built a database containing 3163 images of the 57 species considered (a list of these species is available on request). To acquire the images we used a Sony XC-711P RGB video camera with a 2/3" CCD, connected to an AM-CLR frame grabber from Imaging Technology with 8-bit look-up tables per color channel. Illumination was provided by a 150W Fostec light source through a quadruple fiber optic bundle of 12.7 mm diameter, with the four guides in a symmetric arrangement to produce even illumination with good texture enhancement. Regardless of seed size, all images were taken so that the seed approximately filled the camera field of view, by adjusting a 6000 System 6.5X parfocal zoom from Navitar. Lens attachments of 0.5X and 2X allowed us to cover the seed size range of the species considered.
Light intensity was regulated with an iris diaphragm in order to adjust the illumination to the changing field of view while keeping a constant color temperature (corresponding to a standard Ushio 20V-150W halogen projector lamp). Better control of the illumination conditions would have enhanced the classification capabilities of the color and texture parameters, and could be required for a commercial system. However, the experimental setting just described was considered sufficient for the purposes of the present work.

Images were taken with a 768 × 512 pixel resolution on a blue background, which can be easily subtracted by standard segmentation routines because of the difference in color with the seeds. The segmented images consist of arrays whose entries are 24-bit records, corresponding to the 256 pixel intensity levels (8 bits) for each of the red (R), green (G) and blue (B) channels. In order to reduce effects associated with illumination changes, we also considered the normalized red (r = R/I) and green (g = G/I) pixel values, where I = (R+G+B)/3 is the average intensity.

3. CLASSIFICATION PARAMETERS

We have measured a number of features from the raw seed images to be used later for classification purposes. As stated above, these features correspond to morphological, color and textural characteristics of the seeds. Below we briefly describe the different parameters considered.

MORPHOLOGY

Size and shape characteristics of the seeds can be easily obtained from the binarized images. In particular, we have measured the lengths of the principal axes and several moments of the planar mass distribution with respect to these axes, the size of the minimal rectangular box containing the seed and the ratio of its area to the seed area (compactness), etc. All these quantities were made dimensionless by normalizing them by the required powers of the square root of the seed area (which was taken as the only dimensional quantity).
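The normalization just described can be illustrated with a minimal sketch. It assumes the binarized seed is given as a list of rows of 0/1 values, and it uses the eigenvalues of the second central moments as a stand-in for the principal-axis extents; the function name `shape_features` and the two descriptors returned are illustrative choices, not the 21 features measured in this work.

```python
import math

def shape_features(mask):
    """Dimensionless principal-axis descriptors of a binary mask.

    mask: list of rows, each a list of 0/1 values (hypothetical input format).
    """
    pts = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    area = len(pts)  # seed area in pixels, the only dimensional quantity
    if area == 0:
        raise ValueError("empty mask")

    # Centroid of the planar mass distribution.
    cx = sum(x for x, _ in pts) / area
    cy = sum(y for _, y in pts) / area

    # Second central moments.
    sxx = sum((x - cx) ** 2 for x, _ in pts) / area
    syy = sum((y - cy) ** 2 for _, y in pts) / area
    sxy = sum((x - cx) * (y - cy) for x, y in pts) / area

    # Eigenvalues of [[sxx, sxy], [sxy, syy]]: variances along the principal axes.
    mean = (sxx + syy) / 2.0
    half_gap = math.hypot((sxx - syy) / 2.0, sxy)
    lam_major = mean + half_gap
    lam_minor = max(mean - half_gap, 0.0)

    # Lengths are divided by sqrt(area) to make them dimensionless.
    scale = math.sqrt(area)
    return {
        "major": math.sqrt(lam_major) / scale,
        "minor": math.sqrt(lam_minor) / scale,
    }
```

Under these assumptions, a 4 × 10 pixel rectangle and the same rectangle rotated by 90 degrees give the same two values, and doubling the image scale changes them only through discretization effects.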
Furthermore, since we used the principal axes as the reference frame for all the measurements, the resulting values are independent of the image orientation. In total, we have measured 21 morphological features.

COLOR

We have determined the gray level histograms in the I, r, g channels. From these histograms we considered standard features such as average, variance and skewness. In addition, we considered ratios of average histogram values in the RGB channels such as, for instance, E[R]/E[I] and E[G]/E[I] (here E[.] means the average pixel value in the corresponding channel). We have measured 12 different color characteristics.

TEXTURE

As in [13], two different textural analyses were used to describe the texture of the seed surface:

1. Gray Level Co-Occurrence Matrix: A two-dimensional matrix with entries A_ij, where i, j are gray levels and the entry value gives the number of nearest-neighbor pixel pairs in the image having these gray levels along a given direction (we used both principal axis directions in turn, which makes the textural features rotationally invariant). In practice, we have considered a coarse-grained version of this two-dimensional histogram. First, we performed a dynamical equalization of the gray level histogram on each channel using 16 boxes in order to eliminate illumination intensity variations[14]. Then, the indices i, j were made to correspond to these box levels. From the resulting 16 × 16 matrix 17 textural features were obtained. The precise definition of these parameters and the interpretation of their discriminating properties can be found in [14,15].

2. Gray Level Run Length Matrix: The two dimensions in this matrix are the gray level and the so-called run length, i.e. the base 2 logarithm of the number of adjacent pixels in a given direction with the same gray level.
We have considered both principal axis directions to compute the run lengths. In this case the matrix dimension was reduced by taking the same 16 gray level intervals used before. The resulting matrix allows 4 new textural features to be measured. The precise definition of these parameters and the interpretation of their discriminating properties can be found in [16].

In total we have considered 42 textural characteristics. Thus, from each color image we measured 75 parameters to be used for classification. By simple inspection we determined that several of them behaved erratically and could be discarded. In the end we retained 15 morphological, 8 color and 17 textural properties. Of course, this large set of parameters still contained redundant, too noisy or even irrelevant information for classification purposes. In order to choose the best features in each group (those with the largest discriminating power), we implemented standard sequential forward and backward selection algorithms[17] using the performance of a Naive Bayes classifier as the selection criterion. The Naive Bayes classifier fits the class conditional probabilities with a product of normal distributions of the individual features and, in spite of its simplicity, performs very well on this problem (see next section). The selection algorithms reduced the parameters to nearly optimal sets of 10 morphological, 7 color and 7 textural features. The same procedure applied to the joint 24 remaining parameters selected 12 features (6 morphological, 4 color and 2 textural), which were finally used to build the classifiers. A list of these parameters is given in the Appendix.

4. RESULTS

In order to compare the discriminating power of the different sets of features, in Table I we present the generalization capabilities of Naive Bayes classifiers built solely in terms of the 10 morphological, 7 color or 7 textural features.
For this we have split the 3163 images of the 57 species considered into training and test sets, using, for each species, 80% of the images to build the classifier and including the remaining 20% in the test set. This leaves 2527 images for training and 636 images for testing the system. Table I gives the performances on both the training and test sets, and also indicates how these performances increase when the system is given the chance to assign a given image to any of the n most probable classes, for n = 1, 2 and 3 (this possibility is very useful in practice, since untrained operators can easily select the correct option by simple visual inspection of stored representative seed images of the n classes suggested by the classifier).

                First option       First two options   First three options
Features        Training   Test    Training   Test     Training   Test
Morphology      86.3       85.5    95.9       95.8     98.3       97.5
Color           62.1       49.2    74.4       64.5     82.1       73.0
Texture         55.6       51.3    69.4       65.4     77.4       72.6

Table I: Naive Bayes classifier performances in % of correct seed identifications using only one particular set of features at a time.

A quick look at this table shows, as expected, the large discriminating power of morphological features. As anticipated, color is not particularly good because many species are light to dark brownish or black; its discriminating power is nearly equal to that of textural features. However, if we consider any two sets of features combined (see Table II), morphology plus color features have an edge over the combined use of morphology and texture characteristics. Notice, however, that in this last case it would be enough to consider black and white images, which constitutes an important simplification and a reduction in hardware cost. Finally, the performances of the Naive Bayes classifier built in terms of the optimal set of 12 features listed in the Appendix are given in Table III.
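The "first n options" scoring reported in the tables can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in this work: it fits one normal distribution per feature and class (the Naive Bayes model described in Section 3), ranks the classes by log posterior, and counts an image as correct when its true class is among the n best-ranked ones. The function names, the variance floor and the toy two-feature data are all hypothetical.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y, var_floor=1e-6):
    """Fit per-class priors, feature means and feature variances."""
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)
    model = {}
    for label, rows in groups.items():
        n = len(rows)
        cols = list(zip(*rows))
        means = [sum(col) / n for col in cols]
        # The variance floor (an assumption) avoids division by zero.
        variances = [max(sum((v - m) ** 2 for v in col) / n, var_floor)
                     for col, m in zip(cols, means)]
        model[label] = (math.log(n / len(X)), means, variances)
    return model

def rank_classes(model, x):
    """Return class labels sorted from most to least probable for sample x."""
    scores = {}
    for label, (log_prior, means, variances) in model.items():
        s = log_prior
        for v, m, var in zip(x, means, variances):
            # Log of a normal density; the product over features becomes a sum.
            s += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        scores[label] = s
    return sorted(scores, key=scores.get, reverse=True)

def top_n_accuracy(model, X, y, n):
    """Fraction of samples whose true label is among the n best-ranked classes."""
    hits = sum(label in rank_classes(model, x)[:n] for x, label in zip(X, y))
    return hits / len(X)

# Toy data (made up for illustration): three classes, two features.
train_X = [(0.0, 0.1), (0.2, -0.1), (-0.1, 0.0), (0.1, 0.2),
           (1.0, 1.1), (1.2, 0.9), (0.9, 1.0), (1.1, 1.2),
           (5.0, 5.1), (5.2, 4.9), (4.9, 5.0), (5.1, 5.2)]
train_y = ["a"] * 4 + ["b"] * 4 + ["c"] * 4
test_X = [(0.1, 0.0), (0.9, 1.1), (5.0, 5.1), (0.6, 0.6)]
test_y = ["a", "b", "c", "a"]

model = fit_gaussian_nb(train_X, train_y)
accuracies = [top_n_accuracy(model, test_X, test_y, n) for n in (1, 2, 3)]
```

By construction the accuracy cannot decrease as n grows, which is the trend seen across the columns of the tables.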
                        First option       First two options   First three options
Features                Training   Test    Training   Test     Training   Test
Morphology + Color      96.7       95.4    99.3       98.4     99.5       99.4
Morphology + Texture    91.7       90.4    97.7       96.4     98.6       98.6
Color + Texture         84.0       74.5    91.8       84.7     95.0       90.3

Table II: Naive Bayes classifier performances in % of correct seed identifications using different combinations of two sets of features.

We have also developed a classifier based on Artificial Neural Networks (ANN)[18]. To this end we trained 10 feedforward networks with 12 input, h hidden, and 57 output units. The numbers of input and output units correspond to the number of parameters used and the number of seed species to be identified, respectively. The number of hidden units was varied from h = 20 to h = 80; the results presented below correspond to h = 40 units, which was found to be nearly optimal. We employed output units with softmax (normalized exponential) activation functions to allow the interpretation of outputs as class probabilities. Furthermore, a cross-entropy error measure was used, which is the standard choice for classification problems. We trained the ANN with the usual backpropagation rule until convergence, since only negligible overfitting was observed. This made it unnecessary to set aside part of the training set for validation purposes. The performance of a single (generic) ANN and the results obtained by structuring the 10 networks in a committee are shown in Table III. In the case of the ANN committee we have two options: i) each network votes for the class with the