Abstract of my dissertation

Conceptualization of Contextual Geodata for Decision Support Systems

The information revolution, intensified by the megatrends of globalization and digitalization, increasingly affects the world as a whole. Countless information sources continuously describe two things with ever higher precision and coverage: the geographic world with its interdependencies and correlations, and human life within this environment. This information comprises not only detailed maps such as Google Maps or OpenStreetMap but also data on demographic and social attributes of inhabitants, which is offered as open data by, for instance, the Federal Statistical Office of Germany. In addition, many companies focus on creating complex commercial datasets, ranging from regional purchasing power to the consumer behavior of citizens. As a consequence, rich and heterogeneous datasets are available that contain spatial and temporal information highly relevant to decisions made every day [BWS+18].

However, this exponential growth of decision-relevant data makes it increasingly difficult for decision makers and experts to analyze all relevant information. Even if, according to estimates and surveys, less than one percent of the data are stored in analog form, digital does not necessarily mean analyzable. Data are not always available, and available data are not always usable, for example when legal restrictions apply, as with demographic information. Geographical data sources for topographic information use different exchange formats and interfaces than maps or satellite images. Enriching these data sets with semantic relationships as well as qualitative and quantitative information can be a technical challenge. The central research question is how these multi-faceted, multi-dimensional, multi-modal geographical data and information sets can be modeled in order to recognize hidden correlations and derive new knowledge [BWS+18].

State of the Art in Site Selection

Geospatial data is an active area of research; industrial site selection, for example, traces back nearly a century [LL85]. A review of the state of the art, however, shows that existing studies are not consistently oriented towards data sets with high spatial and temporal granularity. Investigations in this area can be analyzed with regard to three criteria: I) the spatial granularity (nations, countries, or municipalities/cities as the area of focus), II) the number of sites (i.e., the geographic entities of focus) to be analyzed, and III) the number of location factors (i.e., the feature space). In the application scenario of site selection, the ultimate goal is to analyze at a high spatial granularity (municipalities/cities). Furthermore, all existing sites have to be taken into consideration, together with their large number of location factors.

However, prior investigations of sites have only been made on the basis of small datasets. These datasets contained both a small number of location factors and a small number of assessed sites. None of the studies used datasets that fulfill all three criteria at once [BWS+18]. The situation is similar in crime analysis, where existing approaches focus on coarser spatial levels (country or state level). Previous work has also failed to address the spatial and temporal relationships among observations, although observations are naturally linked to one another and these links have a high impact on spatio-temporal crime analysis [BSAD18].

Data-Driven Site Selection

However, the data available today allow a much more precise and fine-grained view. Conclusions no longer have to be drawn at a coarse level of aggregation. Data sets are continuously generated at the lowest geographic level and with high temporal periodicity, and they describe the behavior of individuals with manifold, qualitative data. To make use of these extensive information sources, a quantitative geodata model was developed. Its objective is to delineate all spatial entities by their descriptive location factors. On that basis, statistical models are combined with data mining algorithms and deep learning approaches to enable sophisticated predictions and recommendations. By combining this geodata model with multi-model analysis, this thesis evaluated whether existing research results for site selection and crime analysis can be reproduced with the same accuracy at a finer spatial level (i.e., municipalities or cities instead of states or counties).

As a data-driven site selection approach, QuIS was proposed. It impartially finds suitable sites by analyzing decision-relevant data on all location factors in both time and space. A case study on site selection for supermarkets in Germany was performed using real data on more than 200 location factors for 11,162 sites. The evaluation results show a large overlap (86.4%) between the sites of existing supermarkets selected by economists and the sites recommended by the presented method. In addition, the method recommended many sites (328) where opening a new supermarket would be beneficial [BWS+18]. Furthermore, QuViS was developed, an interactive platform for visualization and exploratory data analysis in site selection. QuViS aims to support decision makers and experts during the site selection process. Case study results show that QuViS provides an easy and intuitive way to explore multidimensional geospatial data [BKAD18].
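The abstract does not define how the 86.4% figure is computed. As a minimal sketch, assuming it is the share of existing supermarket sites that also appear among the recommended sites, the comparison could look as follows. The function names and the toy site identifiers are hypothetical and not taken from QuIS.

```python
def coverage(existing_sites, recommended_sites):
    """Share of existing sites that are also recommended (assumed definition)."""
    existing = set(existing_sites)
    recommended = set(recommended_sites)
    if not existing:
        raise ValueError("existing_sites must not be empty")
    return len(existing & recommended) / len(existing)


def new_candidates(existing_sites, recommended_sites):
    """Recommended sites that do not yet host a supermarket."""
    return set(recommended_sites) - set(existing_sites)


# Toy data: hypothetical municipality identifiers.
existing = ["A", "B", "C", "D"]
recommended = ["B", "C", "D", "E"]

print(coverage(existing, recommended))        # share of existing sites recovered
print(new_candidates(existing, recommended))  # candidates for new openings
```

Under this assumed definition, the second function corresponds to the kind of list from which the 328 additional recommendations would come.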

Geospatial Customer, Competitor and Supplier Analysis

Using only statistical data as offered by the Federal Statistical Office of Germany caused problems for metropolises like Berlin. These cities have several million inhabitants but are modeled as a single geographic entity. To provide insights into cities down to street level, the geodata model was enriched with information from OpenStreetMap (OSM). Additionally, the OSM road network was used to dynamically calculate catchment areas and combine them with location factors. This combination enables data-driven customer, competitor, and supplier analysis. The evaluation was performed on the scenario of the online food delivery models of Edeka, REWE, and Amazon Fresh. Amazon Fresh is currently pushing the delivery of perishable groceries, which creates new challenges for traditional supermarkets. The results indicate that Edeka's delivery model, currently being investigated in Berlin, reaches only 0.28% of the inhabitants, whereas Amazon Fresh achieves nearly 100% coverage [BRAD18].
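The abstract does not state how catchment areas are derived from the road network. One common way, shown here only as a sketch under that assumption, is a shortest-path search with a travel-cost cutoff, after which the inhabitants of the reached nodes are summed. The toy graph, the node names, and the cutoff value are hypothetical; real OSM data would need to be loaded separately.

```python
import heapq


def catchment(graph, source, max_cost):
    """Dijkstra search that keeps only nodes reachable within max_cost.

    graph maps a node to a list of (neighbor, cost) pairs.
    Returns a dict {node: cheapest cost from source}.
    """
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, weight in graph.get(node, []):
            new_cost = cost + weight
            if new_cost <= max_cost and new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return best


def population_share(reached, inhabitants):
    """Fraction of all inhabitants living at reached nodes."""
    total = sum(inhabitants.values())
    covered = sum(inhabitants.get(node, 0) for node in reached)
    return covered / total


# Toy road network with travel costs in minutes (hypothetical values).
roads = {
    "store": [("a", 4), ("b", 9)],
    "a": [("store", 4), ("c", 7)],
    "b": [("store", 9)],
    "c": [("a", 7)],
}
inhabitants = {"a": 100, "b": 300, "c": 600}

reached = catchment(roads, "store", max_cost=10)
print(reached)
print(population_share(reached, inhabitants))
```

In this toy network, node "c" lies 11 minutes from the store and therefore falls outside a 10-minute catchment area.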

Spatio-Temporal Crime Analysis

This thesis transfers the stated results to the application scenario of crime analysis. The research targeted crime analysis at a finer spatial and temporal resolution. To this end, more than 400 different socio-economic location factors as well as crime data at the county level in Germany were analyzed. The proposed crime analysis is a two-step approach: spatio-temporal analysis, followed by spatio-temporal prediction that forecasts crime by assessing its relationship with the socio-economic factors for 81 county sites in Germany. The experiments illustrate that i) spatio-temporal correlations exist within crime data and ii) cross-correlations exist between crime and socio-economic factors. Evaluations exhibit a mean absolute percentage error of 6.79% for the crime predictions of the spatio-temporal model. These results outperform traditional regression techniques, which have a prediction error of 37.1% to 37.8%. The experiments showed that spatio-temporal prediction models work better than general regression approaches and, hence, that crime prediction with high spatio-temporal granularity is feasible [BSAD18].
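For reference, the mean absolute percentage error reported above is conventionally defined as the average of |actual - predicted| / |actual|, expressed in percent. The sketch below implements that standard definition; the numbers are made up for illustration and are not data from the thesis.

```python
def mape(actual, predicted):
    """Mean absolute percentage error in percent (standard definition)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Illustrative values only: observed vs. predicted crime counts per county.
observed = [100, 200, 400]
forecast = [110, 180, 400]
print(round(mape(observed, forecast), 2))
```

Because each error is divided by the observed value, the measure is undefined for counties with zero observed cases, which the sketch rejects explicitly.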

Conclusion

In summary, this thesis makes the following five contributions:

  1. An integrated multi-faceted, multi-dimensional, multi-modal model of geographical data, enabling data analytics and recommendations,
  2. A data-driven site selection approach named QuIS,
  3. An interactive platform for visualization and exploratory data analysis named QuViS,
  4. Data-driven customer, competitor, and supplier analysis by means of open source data, and
  5. The evaluation showing that crime analysis at a fine spatial and temporal level yields results as precise as those obtained at coarser levels.

The implication of these results is that geographical data can be analyzed and evaluated at a much higher spatial and temporal granularity than is currently done in practice and research. Specific application scenarios are not the deciding factor, as the approaches and models can be transferred generically. The more profound insight, however, is that the environment in which we live and interact can be described much better and more extensively than previously assumed. Conclusions and findings can be derived from descriptive, fine-grained data sets, which in turn can provide the basis for recommendations for action in complex scenarios.

Bibliography

[BKAD18] Baumbach, Sebastian ; Khan, Jahanzeb ; Ahmed, Sheraz ; Dengel, Andreas: QuViS – The Question of Visual Site Selection. In: To be submitted

[BRAD18] Baumbach, Sebastian ; Rubel, Christoph ; Ahmed, Sheraz ; Dengel, Andreas: Geospatial Customer, Competitor and Supplier Analysis for Site Selection of Supermarkets. In: Proceedings of the 2019 2nd International Conference on Geoinformatics and Data Analysis. ACM, Prague, 2018. – ISBN 978-1-4503-6245-0, pp. 110-114

[BSAD18] Baumbach, Sebastian ; Sharma, Nikita ; Ahmed, Sheraz ; Dengel, Andreas: Analyzing Spatio-Temporal Effects of Social-Economic Factors on Crime. In: Proceedings of the Tenth International Conference on Advanced Geographic Information Systems, Applications, and Services, IARIA, Rome, 2018. – ISBN 978-1-61208-617-0, pp. 11-17

[BWS+18] Baumbach, Sebastian ; Wittich, Frank ; Sachs, Florian ; Ahmed, Sheraz ; Dengel, Andreas: QuIS – The Question of Intelligent Site Selection. In: To be submitted

[LL85] Lehr, J ; Launhardt, W: Mathematische Begründung der Volkswirtschaftslehre. 1885
