1. Selecting Adaptive Number of Nearest Neighbors in k-Nearest Neighbor Classifier Apply Diabetes Data

    Natalia Labuda1, Julia Seeliger1, Tomasz Gedrande1, Karol Kozak1,2

    1. Medical Faculty, Dresden University of Technology, Carl Gustav Carus University Hospital Dresden.
    2. Faculty of Management, Finances and Informatics, Wroclaw University of Economy.

    Abstract: The k-nearest neighbours (knn) algorithm is a simple but effective classification method. Its most important parameter in medical data classification is k, the number of neighbours, and its major drawback is its dependency on the selection of a “good value” for k. The value of k is usually determined by cross-validation, but if k is too large, big classes will overwhelm small ones. On the other hand, if k is too small, the advantage of the knn algorithm will not be exhibited. A fixed k value is therefore very likely to result in a bias towards large classes. In this paper we propose a modified k-nearest neighbor method that uses different k values for different regions of a data set, rather than one fixed k value for the complete data set. The number of nearest neighbors is selected locally based on P-value Rate criteria. We apply the modified knn method to a type II diabetes data set that includes 768 samples from diabetic patients taken from the Pima Indians Dataset.

    Pages: 1 – 13
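
The idea in the abstract above, choosing k per region instead of once for the whole data set, can be illustrated with a small sketch. The abstract does not define the P-value Rate criterion, so the code below substitutes a simple stand-in: for each query it scores every candidate k by leave-one-out accuracy on the query's closest training points and keeps the best-scoring k. The function names, the `region_size` parameter, and the toy data are all hypothetical and are not taken from the paper.

```python
from collections import Counter
import math


def neighbor_indices(train, query, k, skip=None):
    """Indices of the k training points closest to query (Euclidean distance)."""
    idx = [i for i in range(len(train)) if i != skip]
    idx.sort(key=lambda i: math.dist(train[i][0], query))
    return idx[:k]


def knn_predict(train, query, k, skip=None):
    """Majority vote among the k nearest neighbours, optionally leaving one point out."""
    votes = Counter(train[i][1] for i in neighbor_indices(train, query, k, skip))
    return votes.most_common(1)[0][0]


def choose_local_k(train, query, candidate_ks, region_size):
    """Stand-in local criterion (NOT the paper's P-value Rate):
    leave-one-out accuracy of each candidate k on the query's region."""
    region = neighbor_indices(train, query, region_size)
    best_k, best_hits = None, -1
    for k in sorted(candidate_ks):
        hits = sum(
            knn_predict(train, train[i][0], k, skip=i) == train[i][1]
            for i in region
        )
        if hits >= best_hits:  # ties go to the larger k
            best_k, best_hits = k, hits
    return best_k


# Toy 1-D data: a large class 0 and a small class 1 (illustrative values only).
train = [((x,), 0) for x in (0.0, 1.0, 2.0, 3.0, 4.0, 5.0)]
train += [((x,), 1) for x in (10.0, 10.5, 11.0)]

query_small = (10.2,)  # lies inside the small class
query_large = (2.5,)   # lies inside the large class

fixed_k_label = knn_predict(train, query_small, k=7)
k_small = choose_local_k(train, query_small, (1, 3, 5, 7), region_size=5)
k_large = choose_local_k(train, query_large, (1, 3, 5, 7), region_size=5)
local_label = knn_predict(train, query_small, k_small)
```

On this toy data a fixed k of 7 assigns the query near the small class to the large class, while the locally chosen k is smaller near the small class than inside the large one. Any other local criterion could be dropped into `choose_local_k` without changing the rest of the sketch.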
  2. Bayesian Updating Based on Hausdorff Outer Measures and the Role of Emotions in Decision Process During the Therapeutic Phase of Alliance

    Serena Doria1, Iolanda Angelucci2

    1. Department of Engineering and Geology, University G.D’Annunzio, 63013 Chieti, Italy.
    2. Graduate School in Clinical Psychology and Psychotherapy (S.S.P.I.G.; I.F.R.E.P.); via Reggio Emilia, 52, 00198, Roma, Italy.


    Abstract: A probabilistic approach to the diagnostic process is proposed in which the subject’s degree of knowledge is represented with coherent upper conditional probabilities defined by Hausdorff outer measures.

    Using this model, the diagnosis is assumed to be positive when it produces a change, that is, when the subject’s level of knowledge is defined by an a posteriori Hausdorff outer measure different from the initial Hausdorff outer measure.

    The psychotherapist-patient system is interpreted as a complex system, whose evolution, representing the phase of alliance, is described by a finite family of similarities that, starting from certain initial conditions, evolve the system into the attractor. This set, characterized by its own complexity, measured in terms of the Hausdorff dimension, represents the unconscious of the therapist-patient system and it is characterized by symmetry and self-similarity.

    Keywords: Coherent upper conditional probability, Hausdorff outer measures, fractal sets, therapeutic alliance, unconscious, emotions.

    Pages: 14 – 28
  3. An Ontology-Based Approach To Administrative Data Sources’ Documentation And Quality Evaluation

    Giovanna D’Angiolini, Pierina De Salvo, Andrea Passacantilli

    Italian National Institute of Statistics, Via Cesare Balbo 16, 00184, Rome, Italy.

    Abstract: The paper discusses the documentation and standardization requirements related to the statistical use of administrative data sources and illustrates the main features of Istat’s strategy for satisfying such requirements, which is centred on an ontology-based approach to information content specification and data quality assessment.

    Keywords: Administrative data source, Administrative data documentation, Administrative data quality, Data source ontology, Statistical data production.

    Pages: 29 – 38