Using geotagged photographs and GPS tracks from social networks to analyse visitor behaviour in national parks

This article explores the potential of geotagged data from social networks to analyse visitors’ behaviour in national parks, taking the Teide National Park as a study area. Given its unique landscape and characteristics, plus the fact that it is the most visited national park in Spain, Teide National Park presents itself as a suitable candidate to explore new sources of data for studying visitors’ behaviour in national parks. Through data from a social photo-sharing website (Flickr) and GPS tracks from a web platform (Wikiloc), we outline several visitors’ characteristics such as the spatial distribution of visitors, the points of interest with the most visits, the itinerary network, temporal distribution and visitors’ country of origin. Additionally, we propose a practical use of geotagged data for determining optimal locations for new facilities such as information stands. Results show that data from social networks is suitable to analyse visitor behaviour in protected areas.

Keywords: Social media data, geotagged photographs, GPS tracks, nature-based tourism, national parks, visitors’ behaviour.

Introduction

“Nature-based tourism” refers to all types of activities revolving around observation and appreciation of nature, as well as traditional cultures (Amo, López, & Martín, 2006; Buckley, 2008). People are increasingly moving towards an urban lifestyle and the services and facilities that come with it, leading to a near-complete disconnection from nature. Consequently, activities and places that allow direct contact with “mother nature” have become an attractive option for tourists worldwide (Newsome, Moore, & Dowling, 2012).
Nature-based tourism is also considered to be a cultural ecosystem service since it provides nonmaterial benefits such as aesthetic inspiration, cultural identity, a sense of home, and spiritual experience related to the natural environment (Paracchini et al., 2014). Cultural services are ecosystems’ contribution to well-being through spiritual enrichment, cognitive development, reflection, recreation, and aesthetic experiences (Fish, Church, & Winter, 2016).

Protected areas, and particularly National Parks, have become the leading destination for nature-based tourists (Balmford et al., 2009). In this regard, information that allows nature-based tourism in protected areas to be characterized and quantified is fundamental for managers and planners to carry out many tasks such as budgeting, scheduling, facilities allocation, demand estimation, tourism impact assessment, and the design of conservation and tourism management policies (Cessford & Muhar, 2003; Eagles, Mccool, & Haynes, 2002).

The advent of new information technologies such as the web 2.0, social networks, smartphones, and GPS, among others, and their incorporation into our daily life has revolutionised how we create, store, share and interact with geographic information. Particularly in the field of tourism, this phenomenon has transformed how tourists organise and share their travel experiences and has triggered the production of massive amounts of geotagged data (such as photographs, reviews, hotel reservations, check-in notifications and GPS routes). Owing to its high spatial and temporal resolution, data from social networks is useful for nature-based tourism research. It contributes to new insights about visitors’ behaviour, preferences and movement, and enriches traditional data from surveys and official statistics quickly and inexpensively (Campelo & Nogueira Mendes, 2016; J. Y.
Lee & Tsou, 2018; Orsi & Geneletti, 2013; Sessions, Wood, Rabotyagov, & Fisher, 2016; Sonter, Watson, Wood, & Ricketts, 2016; Tenkanen et al., 2017; Walden-Schreiner, Leung, & Tateosian, 2018; Wood, Guerry, Silver, & Lacayo, 2013).

The enormous potential of data from social networks has led to a growing interest in using these data sources to study nature-based tourism (see Section 2). However, there is a need for further research on the applicability and effectiveness of social media data for visitor behaviour and monitoring (Heikinheimo et al., 2017). Therefore, this study is an early effort to test the feasibility of this data for visitor demand modelling. Taking Teide National Park as a case study, the main aim of this paper is to analyse visitors’ behaviour using geotagged photographs from the photo-sharing site Flickr and GPS tracks from the web platform Wikiloc. This research explores data from geotagged photographs and GPS tracks to infer some visitor characteristics such as their country of origin, their spatial and temporal distribution and the itineraries they follow. When possible, we compare our social network data with actual park statistics in order to evaluate their coherence. The rationale is that if the social network data fit the official figures well for some variables, they could also be reliable for variables for which official data are not available. Additionally, we propose to use geotagged data from Flickr to recommend optimal locations for information stands through the application of a location-allocation model.

Background

The main destinations for nature-based tourism are usually natural protected areas because their protected status guarantees an unaffected and natural site (Balmford et al., 2009). They enjoy exceptional natural qualities such as unique landscapes and wildlife.
In this sense, national parks, which maintain a special status while also being intended for visits, are the main attractions for this kind of tourism (Newsome et al., 2012). National parks are considered a symbol of national pride (Dudley, 2008). They help prevent loss of biodiversity, preserve the naturalness and magnificence of outstanding landscapes and the supply of ecosystem services such as recreation (Schägner, Brander, Maes, Paracchini, & Hartje, 2016). Moreover, visits to national parks are also a source of income for the local economies that surround them (Eagles et al., 2002).

Increasing tourist use of national parks is generally accompanied by growing pressure on resources, with both positive and negative impacts rising. Positive impacts of tourism activities are usually economic and social and include raised awareness of conservation and preservation of natural areas (Ballantyne, Packer, & Falk, 2011), increased employment opportunities and income for local economies (Zambrano, Broadbent, & Durham, 2010) and stronger community values (Scheyvens, 1999). On the other hand, negative impacts tend to be related to ecosystem degradation. For example, higher visitation rates can lead to increases in waste produced within the natural park area and installations (Eagles & Mccool, 2002; Valentine, 1992). Also, more visitors mean higher noise levels, which distress wildlife and disrupt their behaviour and natural cycles (Buultjens, Ratnayake, Gnanapala, & Aslam, 2005). Visitors’ activities can also lead to soil erosion and flora destruction, thus affecting the ecosystem services they provide such as soil regulation (Farrell & Marion, 2001). Therefore, it is of paramount importance for park management to have consistent, reliable and high-quality information about visitors’ use to provide sustainable management of national parks.
In this regard, tourism geography research about visitors’ behaviour can provide insights and understanding of the relationship between visitors and protected sites.

So far, most studies on tourism in national parks have focused on estimating carrying capacity (Manning, 2002), environmental impact assessment (Deng, Qiang, Walker, & Zhang, 2003; Fortin & Gagnon, 1999), visitors’ impact on wildlife (Amo et al., 2006), trail design and management (Tomczyk, 2011) and methods of public participation (Brown & Weber, 2011). Nevertheless, some aspects have been researched in much less depth, such as visitor behaviour (Eagles, 2014). Visitor behaviour research is often limited by data availability and detail. Studies on this topic have depended on survey data and visitor estimates provided by park management agencies or official statistics. In most cases, this data lacks detail and is mostly restricted to visitor counts (Schägner et al., 2016). Data on visitors’ activities, presence and schedule are essential for park managers. Consequently, the need has arisen for alternative data sources that offer a deeper look at visitors’ behaviour.

Most recently, researchers have been using tracking technologies such as GPS to collect spatial data on visitors’ movement inside protected areas. GPS tracks have proven useful to characterize visitors’ intraflows (Meijles, de Bakker, Groote, & Barske, 2014; Orellana, Bregt, Ligtenberg, & Wachowicz, 2012), categorize visitors according to movement patterns (Kidd et al., 2018), measure the spatial distribution of visitors’ use of protected areas (Hallo et al., 2012; Taczanowska et al., 2014), evaluate the impact of informal trails (Wimpey & Marion, 2011), map recreational suitability (Beeco, Hallo, & Brownlee, 2014) and analyse vehicle stopping behaviour (Newton, Newman, Taff, D’Antonio, & Monz, 2017).
Although GPS data offers advantages over traditional methods, such as more accurate and reliable data and greater spatial and temporal resolution, its use involves some problems: difficulties in distributing and collecting the tracking devices during experiments, and higher costs derived from device acquisition and experiment design (D’Antonio et al., 2010).

New big data sources provide natural parks with original data on visitor behaviour. Big data refers to large datasets that are characterised by their volume, variety, and velocity (Gandomi & Haider, 2015). There are multiple sources of big data such as internet clicks, mobile phone calls, user-generated content, as well as purposefully generated content through sensor networks or business transactions such as sales queries and purchase transactions (George, Haas, & Pentland, 2014). Data from social networks is an attractive alternative to traditional methods and GPS devices for visitors’ behaviour research because it is inexpensive and easy to access. Indeed, data from social networks is a cost-free by-product of the digital interactions of their users. Social networks provide a digital footprint of their users that comes in a variety of formats (photographs, texts, audio, and video), and most of it is geotagged, which means that the data includes the location where it was created. In this context, geotagged big data allows the spatial and temporal dimensions of social network users’ behaviour to be analysed.

So far, mostly geotagged photographs from photo-sharing services such as Flickr and Instagram have been used to research the tourist use of natural spaces. Geotagged photograph datasets such as those from Flickr are a suitable approximation of annual and monthly visitation rates (Sessions et al., 2016; Sonter et al., 2016; Wood et al., 2013).
GIS modelling has also been applied to geotagged photographs to map and identify visitor flows (J. Y. Lee & Tsou, 2018; Orsi & Geneletti, 2013) and to model spatial patterns of visitor use and identify factors contributing to distribution patterns (Tenkanen et al., 2017; Walden-Schreiner et al., 2018). Likewise, GPS tracks from route-sharing websites like Wikiloc have been used to spatialize and measure the intensity of use for mountain biking (Campelo & Nogueira Mendes, 2016). Norman and Pickering (2017) carried out a comparative study of the information available from three popular volunteer-based geographic information platforms for walking and running in natural parks.

Study Area

The Teide National Park (Figure 1) is located on the island of Tenerife in the Canary Islands, Spain. The island has 894,000 inhabitants and receives 5.7 million tourists a year (2017). Most of them are international tourists (78.7%), mainly from the United Kingdom (36.2%) and Germany (11%) (Cabildo Insular de Tenerife, 2018). Tourist flow to the island remains very stable throughout the year. Most tourists are attracted by the excellent weather of the Canary Islands (sun and beach tourism), but many of them take advantage of their stay to visit the main attractions of the island, such as its historic cities and Teide National Park. Consequently, pressure from tourism on the island is very high.

Figure 1. Location map of the Teide National Park

The Teide National Park, established in 1954, covers a surface area of 18,990 hectares, which makes it the largest national park in the Canary Islands. It was declared a UNESCO World Heritage Site in 2007 in recognition of the aesthetic and geomorphologic value of “Las Cañadas” escarpment and the Mount Teide – Pico Viejo stratovolcano.
Standing 3718 m above sea level, the Mount Teide volcano is the highest peak on Spanish soil, and it stands 7500 m above the ocean floor, making it the world’s third tallest volcanic structure. The park is crossed by a road that links the four access points to the park, thus allowing most tourists in Tenerife to cross the park territory. There are no tolls to access the park. A well-maintained trail network, 162 kilometres long, and extensive infrastructure, including a cable car that takes visitors to Rambleta station at 3555 m, just 163 m below the Mount Teide summit, two visitors’ centres, 27 viewpoints, a botanical garden, and parking spaces, allow its many visitors to explore the park every year (Hernández Álvarez, 2017). Visitor presence is regular throughout the year (Dóniz Páez, 2010). According to the Spanish Autonomous Organization of National Parks (OAPN), the Teide National Park is the most visited in Spain. In 2016, the park received 4,079,823 visitors (MAPAMA & OAPN, 2017).

Data and Methods

Data acquisition

Official Visitor Data

We gathered the official yearly and monthly visitor estimates from the OAPN website. Official visitor estimates are derived from vehicle counters and classifiers placed at the four access points of the park.

Visitor estimates were available for the period 2010 – 2016 and show visitor numbers by year and month. Data about hourly and weekly visitor distributions was not available. Information on the country of origin was only available for the year 2016. Additionally, we downloaded GIS data of the park trail network, infrastructure and park boundaries from the OAPN website.

Flickr Data

Set up in 2004, Flickr is a social website for sharing photographs and videos online.
By 2016, Flickr had a community of 112 million users that uploaded an average of 1 million photos per day (The Internet Archive, 2016). In 2006, Flickr introduced the option of locating the photos uploaded on its site. Photograph geolocation can be done using the GPS on mobile devices and cameras or by selecting the location on a world map when uploading the photos from a computer (Flickr, 2017).

Data used in this research was downloaded from the Flickr public API (application programming interface) using a Python script. In total, there were 12,949 records of photos from 1567 users, taken and uploaded throughout Teide National Park from 2010 to 2017. The data was stored in a .geojson file which included the coordinates of the photographs, the user ID, the time and date that each photograph was taken and uploaded, and the home location of the user. We used the coordinates to create a point layer using GIS software (ArcGIS 10.3 and QGIS 2.14).

Wikiloc Data

Wikiloc is a web application, also known as a map mashup, that allows GPS tracks and waypoints from around the world to be uploaded and downloaded for free. It was launched in 2006 and it includes a website and mobile apps for Android and iPhone. At the time of writing, Wikiloc had 3.3 million members and more than 7.6 million tracks, which are mainly intended for hiking and cycling (Wikiloc, 2017).

Tracks crossing Teide National Park were downloaded from the Wikiloc website one by one in January 2018. The address search function was used to retrieve all routes that mentioned Teide National Park in their title. As a result, we found and downloaded 5064 tracks corresponding to 1078 users from 2006 to 2018. Tracks are recorded in .gpx files, which consist of GPS points taken by the user along the route. ArcGIS was used to convert these points into point layers and then into lines by using the tracking analyst tools.
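The conversion of a .gpx file into an ordered line can be illustrated with a minimal Python sketch. This is not the ArcGIS workflow used in the study: it assumes GPX 1.1 files, and the function names and the sample coordinates are hypothetical. The track length is only a great-circle approximation that ignores elevation.

```python
import math
import xml.etree.ElementTree as ET

# Namespace of the GPX 1.1 schema (assumption: the files follow GPX 1.1).
GPX_NS = {"gpx": "http://www.topografix.com/GPX/1/1"}


def read_track_points(gpx_text):
    """Return the (lat, lon) track points of a GPX document, in file order."""
    root = ET.fromstring(gpx_text)
    return [
        (float(pt.get("lat")), float(pt.get("lon")))
        for pt in root.iterfind(".//gpx:trkpt", GPX_NS)
    ]


def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000.0 * math.asin(math.sqrt(a))


def track_length_m(points):
    """Length of the polyline obtained by joining consecutive track points."""
    return sum(haversine_m(a, b) for a, b in zip(points, points[1:]))


# Hypothetical three-point track, for illustration only.
SAMPLE_GPX = """<gpx version="1.1" creator="example"
     xmlns="http://www.topografix.com/GPX/1/1">
  <trk><trkseg>
    <trkpt lat="28.2600" lon="-16.6200"/>
    <trkpt lat="28.2610" lon="-16.6200"/>
    <trkpt lat="28.2610" lon="-16.6210"/>
  </trkseg></trk>
</gpx>"""

points = read_track_points(SAMPLE_GPX)
print(len(points), round(track_length_m(points)))
```

Keeping the points in file order is what turns them into a line: each consecutive pair becomes one segment of the track.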
Pre-Processing

Spatial aggregation

Data collected from Flickr accurately identify where a specific user took every photograph within the boundaries of the Teide National Park. As this study focuses on visitors’ analysis, it was important to count Flickr users rather than photographs, to exclude the bias produced by single users taking several pictures at the same location. Therefore, geotagged photographs were aggregated in hexagons, and the number of users and photographs were counted for each hexagon. Based on the method proposed by Y. Lee, Kwon, Yu, and Park (2016), we determined hexagons with 200 m sides as the optimal size to aggregate our data according to the scale of our study area. The method uses Moran’s index to evaluate which hexagon size has the highest autocorrelation according to different zoom levels. Under this approach, the aggregated data maintains the statistical properties of the point data and minimises the modifiable areal unit problem (MAUP) effect. We calculated Moran’s index for hexagons with sides of 100, 200, 300 and 400 m, and the 200 m hexagons yielded the highest value for spatial autocorrelation. Similar studies have also applied this approach to analyse data at the user level (García-Palomares, Gutiérrez, & Mínguez, 2015).

Map matching

GPS tracks downloaded from Wikiloc may not perfectly match the digital road and trail network within the park due to GPS location errors. In order to assess visitors’ use of the trail network (number of visitors walking along each section of the network), the GPS tracks should be adjusted to the digital trail network by a process known as map matching. We applied a map-matching tool based on the algorithm developed by Dalumpines & Scott (2011).
The procedure creates a buffer around the GPS track line that constrains the estimation of the shortest path between the origin and the destination, which is computed using Dijkstra’s algorithm. In this process, the definition of the buffer distance determines the results: a buffer that is too small may prevent the matching of many routes, while a buffer that is too wide may lead to inaccurate or incorrect routes (Figure 2). The GIS-based tool uses route analysis tools from the network analyst extension of ArcGIS to generate the shortest path between the track’s origin and destination along the digital network. The resulting route is the adjusted GPS track (Romanillos & Gutiérrez, 2019). The map-matching tool was also used to adjust the itineraries created from sequences of Flickr geotagged photographs taken by one user on the same day.

Figure 2. Illustration of the map-matching process. Route map-matched within the 50 m distance buffer. A buffer distance of 25 m would prevent the matching of this route.

Analysis

Home location attribute extraction

The home location attribute provided by Flickr users on their account profile was used to infer the country of origin for visitors to the park. Of the users who contributed the 12,949 photographs, 52% had registered their home country. This proportion matches results from other studies that analysed the origin of visitors using Flickr data (Wood et al., 2013). A semi-automated method was applied to classify the origin of every user in our dataset. First, the number of users was summarised according to the home location attribute. Then, irregularities caused by misspelled or abbreviated country names were corrected. Results were compared with official survey data from 2016 by correlation analysis. For this part of the study, we only used the Flickr data from 2016.
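A minimal Python sketch of this kind of semi-automated classification is given below, under stated assumptions. The alias table, the rule of keeping the last comma-separated part of the home location, and the function names are all hypothetical: the corrections actually applied in the study are not listed. Values that cannot be resolved are returned as None so that they can be reviewed by hand. The Pearson coefficient is written out explicitly so that the comparison with survey data is easy to follow.

```python
import math
from collections import Counter

# Hypothetical alias table mapping free-text spellings to country names.
ALIASES = {
    "uk": "United Kingdom",
    "england": "United Kingdom",
    "united kingdom": "United Kingdom",
    "deutschland": "Germany",
    "germany": "Germany",
    "españa": "Spain",
    "espana": "Spain",
    "spain": "Spain",
}


def normalise_country(raw):
    """Map a free-text home location to a country, or None if unresolved."""
    if raw is None or not raw.strip():
        return None
    # Assumption: in "City, Country" strings the country comes last.
    key = raw.strip().lower().split(",")[-1].strip()
    return ALIASES.get(key)


def users_per_country(home_by_user):
    """Count users (not photographs) per normalised country."""
    counts = Counter()
    for raw in home_by_user.values():
        country = normalise_country(raw)
        if country is not None:
            counts[country] += 1
    return counts


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Illustrative input only; these are not records from the study.
homes = {"u1": "UK", "u2": "London, England", "u3": " Deutschland ",
         "u4": "", "u5": "Atlantis", "u6": "españa"}
print(users_per_country(homes))
```

To compare with a visitor survey, the per-country user counts and the survey counts would be arranged in the same country order and passed to pearson_r.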
Temporal aggregation

Geotagged photographs provide a source of time data which helps to identify temporal patterns for visitors. We used the timestamp recording when each photograph was taken to outline visitor flow and the monthly, weekly and daily distribution of visits. The time series of the data collected goes from January 2010 to December 2017. From the data collected, we determined the total number of unique users per day. These values were summed by month for every year and by year. From these values, we calculated average users for each month and compared them with the average monthly visitors. To test the strength of the linear relation between monthly users and official data of monthly visitors, we obtained the Pearson’s correlation coefficient considering the 2010-2017 period for both data sources. Similar approaches have been applied to use temporal data on geotagged photographs from Flickr (Tenkanen et al., 2017). Additionally, we obtained the weekly and hourly distribution of users.

Spatial Autocorrelation

One of the most valuable features of geotagged data is the possibility of visualising the location of visitors during their visit. This “spatial component” allows exploring spatial patterns of visitors’ distribution within the park through Exploratory Spatial Data Analysis (ESDA). ESDA comprises a set of methods to describe and visualise spatial distributions which focus on distinguishing characteristics of geographical data and, specifically, on spatial autocorrelation and spatial heterogeneity (Anselin, 1995). We used Global and Local Moran indexes on aggregated Flickr user data to find where people go within the park, which places attract a significant concentration of people and which places are neglected.
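For reference, the global Moran index can be sketched in a few lines of Python. The sketch assumes binary weights within a fixed distance band, without row standardisation, which is only one of several possible weighting schemes and is not necessarily the configuration used by the GIS software in this study. The function name and the toy data are hypothetical. With values x_i, mean m, weights w_ij and S0 the sum of all weights, the index is I = (n / S0) * sum_ij w_ij (x_i - m)(x_j - m) / sum_i (x_i - m)^2.

```python
import math


def morans_i(coords, values, threshold):
    """Global Moran's I with binary fixed-distance-band weights.

    coords: list of (x, y) centroids in projected units (e.g. metres).
    values: one observation per centroid (e.g. Flickr users per hexagon).
    threshold: two distinct cells are neighbours if within this distance.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    cross = 0.0   # sum of w_ij * dev_i * dev_j
    s0 = 0.0      # sum of all weights
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(coords[i], coords[j]) <= threshold:
                cross += dev[i] * dev[j]
                s0 += 1.0
    return (n / s0) * (cross / sum(d * d for d in dev))


# Toy example: four cells in a row, 1 unit apart, neighbours within 1.5 units.
cells = [(0, 0), (1, 0), (2, 0), (3, 0)]
print(morans_i(cells, [1, 1, 5, 5], 1.5))   # similar values side by side
print(morans_i(cells, [1, 5, 1, 5], 1.5))   # alternating values
```

The first call gives a positive value (clustering of similar values) and the second a negative one (dispersion), which is the contrast the index is designed to capture.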
Anselin's LISA (local indicator of spatial association) Index (Anselin, 1995) was calculated in order to detect the location and extent of spatial clusters of high and low values and outliers. This index identifies four types of clusters: HH (statistically significant cluster of high values), LL (statistically significant cluster of low values), HL (an outlier in which a high value is surrounded by low values), and LH (an outlier in which a low value is surrounded by high values). We used the fixed distance band method with a threshold distance of 1500 m for LISA analysis of Flickr users. We selected this value after applying different thresholds in order to obtain the maximum spatial autocorrelation value. The results obtained using the fixed distance band method and the inverse distance method were quite similar. However, the LISA map obtained using the fixed distance band of 1500 m yielded a better result, including more significant clusters and outliers than the inverse distance method.

Itinerary mapping

Each geotagged photograph from Flickr represents a spatial-temporal event defined by a user’s location at a specific time. When several of these events belong to the same user, it is possible to track their travel path by connecting the time-ordered event locations in space, hence obtaining an itinerary for every visitor of the park that has uploaded several photographs within the same day. In this way, we intended to approximate the itineraries followed by the majority of visitors along the Teide National Park road and trail network. For this, we selected every Flickr user that has uploaded more than 10 photographs in a single day.

Tracking analyst tools produced an itinerary network of users that have uploaded more than 10 photographs in a single day. A total of 290 users met this condition.
The resulting network was then adjusted to the trail network by using the map-matching tool. Subsequently, route density was estimated by counting the frequency of tracks on each trail of the park network.

A succession of photographs taken by the same user on the same day demonstrates the paths that tourists follow when sightseeing. However, some visitors are particularly interested in recreational activities (hiking, mountain biking) and their behavioural patterns differ from those of most tourists. This type of park use can be captured using the GPS tracks voluntarily uploaded to dedicated sports web services. Using the previously adjusted tracks from Wikiloc, an overlay analysis was performed to estimate the tracks’ density on the Teide National Park trail network.

Optimal location of information stands

As an example to show the practical use of new data sources for park managers, we apply a model in order to calculate the optimal location of information stands. Once the need for information stands has been determined by a visitor or employee survey, we propose to use Flickr users’ spatial distribution as a proxy of the spatial distribution of potential demand for information kiosks. The rationale of this application is that the higher the number of visitors near a kiosk, the higher the probability that it will be used by visitors.

We used the ArcGIS network analyst module to model potential demand from geotagged photographs and apply a location-allocation model to find optimal locations for information stands. Out of the six solutions proposed in ArcGIS, we opted for the maximise attendance approach. This solution aims to maximise the attendance of demand that the facility can cover within a specified distance (Holmes, Williams, & Brown, 1972).
In this model, the facilities are chosen to allocate as much demand weight as possible while assuming that the demand weight decreases with the distance between the facility and the demand point (ESRI, 2018).

Location-allocation models require three main inputs: a road network, a set of candidate facilities and the demand points (Mitchell, 2012). We built a network layer from the road and trail layers provided by OAPN. A total of 557 candidate locations were considered for establishing the information stands. These are located along the main road that crosses the park at a fixed interval of 100 meters, because the road connects the four access points to the park and is the starting point for the majority of the trails, so all visitors must travel this road. Visitors’ potential demand was modelled from the Flickr user counts on the hexagon grid and then transformed into a point layer. Each demand point depicted the number of users within a single hexagon. In addition, the two existing visitors’ centres were included in the model as required facilities.

Four scenarios based on the number of additional facilities besides the two existing ones were considered: 1, 3, 5 and 10. We also established four cut-off distances (1000, 1500, 2000 and 2500 m) to perform a sensitivity analysis and find the maximum demand coverage for each chosen facility and the existing visitor centres.

Results

Social media geotagged data visualisation

Figure 3 shows the distribution of the 12,949 geotagged photographs along the Teide National Park from 2010-2016. Overall, users took a median of two photographs. The most active user uploaded a total of 419 photographs. Photograph distribution varied across the years. Year 2016 reported the maximum number of photos (2646) and year 2010 had the lowest number of photographs (618).
Photograph locations are well aligned with the location and shape of the main road and trail network of the park. High density along this road suggests people are taking photographs either from vehicles or at the viewpoints located adjacent to the road. As expected, most Flickr photos concentrate on the main points of interest within the park. For instance, we observe a significant concentration of photos in four places: Mount Teide summit (1), the cableway base station (3), Cañada Blanca visitors’ centre (4) and Roque García viewpoint (5).

Figure 3. Spatial distribution of Flickr photographs 2010-2017

Figure 4 displays the 5064 GPS tracks collected by 1760 users from the Wikiloc site. Users recorded a median of two tracks. The user with the highest number of tracks uploaded 55 routes. The tracks display five activities: hiking, climbing, running, cycling and motorcycling. Hiking is the most popular activity with 77% of the tracks, followed by running (8%), cycling (8%), climbing (7%) and motorcycling (1%). Track lengths range from 0.3 km to 44 km. The majority of tracks fall within the trail and road network of the Teide National Park. However, there are tracks located outside official trails that could indicate the creation of new trails by the visitors.

Figure 4. Location of Wikiloc Tracks 2006 – 2018

Country of origin

The home location attribute from Flickr shows that visitors come from 27 countries. Comparison between 2016 Flickr users and survey data from the same year (OAPN, 2017) resulted in a strong positive correlation (r = 0.88, p = 0.001). The majority of visitors are international (69%) whereas national visitors account for 31%. As expected, most of the international visitors come from European countries, mainly from the United Kingdom (26%), Germany (24%) and France (7%) (Figure 5).
Overall, international visitors posted an average of 9 pictures whereas national visitors posted 7 pictures.

This data on country of origin is useful for park managers in order to provide information to the visitors. Since most foreign visitors come from the United Kingdom and Germany, the information kiosks proposed in our study should offer information in at least Spanish, English and German.

Figure 5. Origin of visitors according to Flickr and Visitor survey; SP: Spain, UK: United Kingdom, DEU: Germany, FR: France, BEL: Belgium, ITA: Italy, CHE: Switzerland, USA: the United States, RUS: Russia, NLD: Netherlands, NOR: Norway, OC: other countries.

Temporal patterns

Figure 6 shows the average monthly distribution of users from 2010 to 2016 along with the average monthly distribution of visitors from survey data. The Teide National Park has a steady flow of visitors during the year, with slight peaks in March and August which coincide with the Easter holidays and the summer season respectively. Flickr users’ monthly data for the same period is well aligned with official visitor data provided by OAPN (Figure 6). We found a high correlation value between both estimates (r = 0.84, significant at the 0.001 level).

Figure 6. Average Monthly distribution of Flickr users and Teide National Park visitors (2010-2016).

Likewise, Flickr data show that the flow of visitors is constant throughout the week, with a slight rise at the weekend, related to a greater presence of inhabitants from the island who visit the park for recreational activities (Figure 7).

Figure 7. Distribution of Flickr users according to days of the week.

More interesting is the hourly distribution of Flickr users (Figure 8). There is a very high concentration of visitors in the central hours of the day, particularly between 11 am and 5 pm.
The high temporal and spatial concentration of visitors (see the following subsection) suggests that overcrowding occurs at certain times and in certain places, and that the carrying capacity of some places within the park could be exceeded. Since the presence of visitors is much lower outside the central hours of the day, park managers could mitigate overcrowding with actions aimed at redistributing the flow of visitors throughout the day.

Figure 8. Hourly distribution of total visitors and in main park attractions

On a more detailed level, we differentiated the hourly distribution of visitors according to the places that gathered the majority of users: Cañada Blanca visitors’ centre (270 users), Roque García (429 users), Mount Teide summit viewpoints (287 users) and the cableway base station (175 users). The resulting distributions (Figure 8) outline the daily visitor pattern of each attraction, allowing us to identify when these places are more or less crowded. For example, the distribution for the cableway base station indicates a higher concentration of visitors from 10:00 to 12:00, which accounts for 42% of its daily visitors. Similarly, at the Mount Teide viewpoints the majority of visitors converge from 11:00 to 14:00, accounting for 55% of visitors. In contrast, Cañada Blanca visitors’ centre and Roque García viewpoint show a more even distribution of visitors throughout the day.

Spatial patterns
The global Moran’s index yielded a value of 0.389 (p < 0.001), which shows a positive spatial autocorrelation and therefore a pattern of spatial clustering. The scatter plot reflects the asymmetry of both distributions (Figure 9).
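For readers unfamiliar with the statistic, the global Moran’s I can be sketched in a few lines of Python. This is a minimal illustration with binary contiguity weights on a toy row of cells, not the implementation or the hexagon grid used in this study; the helper `chain`, the function name `morans_i` and the toy values are assumptions, and the significance test is omitted.

```python
def morans_i(values, neighbours):
    """Global Moran's I with binary contiguity weights.

    values     -- list of counts, one per cell
    neighbours -- dict mapping a cell index to the indices of adjacent cells
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # S0 is the sum of all weights; with binary weights it is the number
    # of directed neighbour pairs.
    s0 = sum(len(adj) for adj in neighbours.values())
    cross = sum(dev[i] * dev[j] for i, adj in neighbours.items() for j in adj)
    denom = sum(d * d for d in dev)
    if s0 == 0 or denom == 0:
        raise ValueError("Moran's I is undefined without neighbours or variance")
    return (n / s0) * (cross / denom)

def chain(n):
    """Neighbour lists for n cells in a row (each cell touches the next)."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}

# Toy examples: similar values side by side versus alternating values.
clustered = morans_i([1, 1, 1, 0, 0, 0], chain(6))
alternating = morans_i([1, 0, 1, 0, 1, 0], chain(6))
print(clustered, alternating)
```

A positive value, as in the clustered toy example, indicates that cells with similar counts tend to be adjacent; a negative value indicates the opposite.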
Figure 10 shows the spatial distribution of the clusters. H-H values match areas of high tourist concentration in places where there are paths on which visitors can walk, so that they form high-value clusters. Five high-high clusters can be identified: Mount Teide summit, Cañada Blanca visitors’ centre, Altavista mountain refuge, Roque García viewpoint and Ucanca Plain viewpoint. H-L values correspond to confined hot spots from which there are no walking opportunities. They are either parking lots or transfer stations, such as the cableway base station, that are not connected to trails and from which visitors take photographs. They are therefore outliers: points with a high concentration of visitors surrounded by “empty” points. Table 1 summarises the main results for every cluster.

Figure 9. The Moran’s I scatterplot of Flickr’s users in the Teide

Figure 10. Anselin Local Moran's Index Map at the user level

Table 1. Statistics of point of interest's clusters

Itinerary mapping
Itineraries identified through the sequence of geotagged photographs taken by the same user in a single day follow a clear pattern along the main road that crosses the Teide National Park and along the cableway line (Figure 11). The results suggest that the majority of photographs are taken from vehicles and from viewpoints located along the road, as well as from the cableway car. Flickr itineraries hardly reflect the trail network, the only exceptions being the trail that climbs up to the Teide summit from the south and some minor itineraries.

Figure 11. Route density of Flickr tracks

Wikiloc offers complementary data that portray a different use of the park. While Flickr tracks show visitor flows along the main road, depicting activities such as sightseeing, Wikiloc data reflect a more active use of the park by its visitors.
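The construction of Flickr itineraries from photo sequences can be sketched as follows. This is a minimal Python illustration under stated assumptions: the tuple layout `(user_id, ISO timestamp, lon, lat)`, the function name `daily_itineraries`, the `min_points` threshold and the sample coordinates are hypothetical, and the subsequent matching of each sequence to the road and trail network is not covered.

```python
from collections import defaultdict
from datetime import datetime

def daily_itineraries(photos, min_points=2):
    """Group photos by (user, calendar day) and order each group by time.

    photos -- iterable of (user_id, ISO timestamp, lon, lat) tuples
              (hypothetical schema)
    Returns {(user_id, date): [(lon, lat), ...]} for groups with at least
    `min_points` photos, since a single photo does not define a route.
    """
    groups = defaultdict(list)
    for user_id, taken_at, lon, lat in photos:
        ts = datetime.fromisoformat(taken_at)
        groups[(user_id, ts.date())].append((ts, lon, lat))
    itineraries = {}
    for key, points in groups.items():
        if len(points) >= min_points:
            points.sort(key=lambda p: p[0])  # chronological order
            itineraries[key] = [(lon, lat) for _, lon, lat in points]
    return itineraries

# Hypothetical records with made-up coordinates; NOT the study's data.
sample = [
    ("u1", "2016-08-01T12:00:00", 2.0, 2.0),
    ("u1", "2016-08-01T10:00:00", 1.0, 1.0),
    ("u1", "2016-08-02T09:00:00", 5.0, 5.0),  # different day: separate group
    ("u2", "2016-08-01T11:00:00", 3.0, 3.0),  # single photo: no itinerary
]
routes = daily_itineraries(sample)
print(routes)
```

Each resulting coordinate sequence could then be treated as a track and overlaid on the network to derive route densities.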
Figure 12 shows the density of tracks in each trail section, as an approximation of the popularity and use of trails for hiking and cycling. Particularly noteworthy is the trail that climbs the Teide summit from the southeast, as well as the trails that give access to the peak from the south (Cañada Blanca) and the northeast (Valle de la Orotava).

Figure 12. Route density of Wikiloc tracks

Optimal location of information stands
Figure 13 shows the required and chosen locations for the four scenarios proposed. The demand assigned to each facility is represented by the number of hexagons covered at each distance cut-off. In all of the scenarios, the locations for additional information stands are well distributed across the park and match the locations of viewpoints. The maximise attendance model prioritises locations that have demand points with higher weights nearby. In the first scenario, when the model has to add just one extra facility, two different optimal locations are chosen depending on the distance cut-off. For a distance cut-off of 1000 m, the optimal location for a new facility is the Minas San José viewpoint, whereas for the three remaining distance cut-offs the chosen location is the cableway base station. In the rest of the scenarios, the same locations are chosen for the different distance thresholds.

The chosen location that matches the cableway base station seems to gather the most potential demand in all scenarios. Existing facilities (visitors’ centres) can cover between 23% (1000 m distance) and 34% (2500 m) of the potential demand (Table 2).

Figure 13. Results from the location-allocation models: (1) 1 additional facility, (2) 3 additional facilities, (3) 5 additional facilities, (4) 10 additional facilities.

Table 2.
Covered demand for 2 existing facilities

As additional facilities are allocated, covered demand increases significantly for all cut-off distances (see Table 3). For example, with the allocation of just one facility, the demand covered increases from 23% to 36% (1000 m cut-off distance). Demand is fully covered only when 10 facilities are added and the distance cut-off is set to 2500 m.

Table 3. Covered demand for chosen facilities in all scenarios

Discussion
Park managers rely on visitation data to inform policy and management decisions. However, visitation data are often costly and burdensome to obtain and provide a limited depth of information (Sessions et al., 2016). Fortunately, the growing availability of social media data opens new avenues for nature-based tourism research and management. For example, geotagged photos can provide use estimates and hot spots for large areas quickly and inexpensively (Walden-Schreiner, Leung, & Tateosian, 2018). In our study, geotagged photographs from Flickr and GPS tracks from Wikiloc provided a close approximation to visitor behaviour in the Teide National Park, allowing us to outline several characteristics of its tourist use, such as points of interest, visitor concentration and movement, country of origin and visitor distribution over time.

This work validates data from social networks against official data where the latter are available. The rationale is that if we obtain a satisfactory fit for some variables, other variables for which official data are not available should also be reliable. We could validate two variables: country of origin and monthly distribution of visitors.
Thus, the relationship between the distribution of visitors by nationality according to Flickr data and the survey data provided by the park managers exhibits a high Pearson’s correlation coefficient (r = 0.88), with Spanish, German and British the most represented nationalities. An exploration of the temporal distribution of Flickr users within the park revealed a continuous flow of visitors through the year, which is consistent with the conventional data estimated by the park with automatic counters (r = 0.84). These findings indicate that Flickr data are a reliable basis for estimating the origin of visitors and their monthly distribution. The results are consistent with those obtained in previous research in Finland and South Africa (Tenkanen et al., 2017), which found that half of the analysed parks (28/56) had a Pearson’s correlation coefficient equal to or higher than 0.7. The high correlation obtained in Teide National Park is probably due to its high number of visitors, which provides a sufficiently large sample of Flickr users.

When validating Flickr data, we have to consider bias not only in Flickr but also in the official data. Each measurement technique for counting visitors involves sampling error and biases (Sessions et al., 2016). The number of monthly visitors reported by the Teide National Park is an estimate, since it counts vehicles, not people, and applies an estimated person-per-vehicle multiplier. Surveys may also be biased towards particular types of visitors and might under-sample international visitors (Sessions et al., 2016).

Flickr and Wikiloc provide additional information, not available in official sources, particularly on spatial patterns of visitors.
The social activity of sharing pictures leaves digital proxies of spatial preferences, with people sharing specific photos considering the depicted place not only “worth visiting” but also “worth sharing visually” (Gliozzo, Pettorelli, & Muki Haklay, 2016). The spatial patterns revealed in this study suggest that visitors to Teide National Park tend to converge at three points of interest: the summit of Mount Teide, Roque García and the Cañada Blanca visitors’ centre. Data on the spatial and temporal concentration of visitors can alert park managers to crowding problems and provide useful information for deciding how to address them. The results obtained from the location-allocation models showed that Flickr might be used to locate new facilities within the park, which can be valuable for decision making and resource optimisation in park management, as visitor infrastructure is a key component in attracting visitors.

Data from social networks also provide information on the itineraries followed by visitors within the park. According to Flickr data, the majority of visitors follow the main road and a limited number of trails. Wikiloc data show that visitors frequently use the majority of trails within the park. Our interpretation is that Flickr and Wikiloc provide complementary information on the use of the network of roads and trails: while Flickr mainly detects flows of visitors when sightseeing, Wikiloc identifies trails used when hiking or climbing.

Data from social networks prove to be of great value for park managers, who need up-to-date information about visitors’ use of the protected site in order to manage it properly. The growing number of tourists in national parks poses new challenges to park managers, who must efficiently manage visitor flows and avoid overcrowding and unwanted impacts.
The visitors’ digital footprint also helps detect the presence of tourists in particularly sensitive areas within national parks.

Despite the high resolution of social media data in terms of time and space, there are some limitations that we need to address. First, we have to consider bias in the data sources. Social media data are biased by each platform’s popularity among users, which may vary by country, year and demographics. Second, in less popular parks the volume of data collected is smaller, which decreases the significance of the analysis. In this respect, social media data should be used with caution. Finally, there is the bias produced by highly engaged users, which can lead to an overrepresentation of this population. Analysis at the user level proved to be more accurate than analysis at the photograph level for detecting country of origin, or where and when visitors go within the national park, because it removes the bias introduced by highly active users and thus provides a more reliable indicator of tourist use.

Conclusions
This study showed how geotagged data from social networks could be used to measure different aspects of visitor behaviour. In parks where the budget for surveys and visitor monitoring studies is reduced or non-existent, geotagged data are an accessible and inexpensive alternative. For parks where visitor monitoring is carried out, geotagged data can be a supplementary source for assessing visitation and supporting decision making.

Data from social networks like Flickr and Wikiloc are dynamic and highly detailed and, in contrast to survey data, provide a continuous record of visitor behaviour. The use of social networks can therefore be a good alternative in parks that do not have the resources to carry out surveys or to install automatic counters. In the same way, parks that carry out visitor monitoring can complement their visitor data.
Most importantly, social networks provide highly accurate spatial information on visitors’ behaviour thanks to the data captured by their mobile GPS devices, identifying hot spots and visitor itineraries in national parks.

Flickr can provide a reliable indicator of visitors’ country of origin, an important characteristic that helps to outline the visitor profile. This information is needed to develop effective communication strategies and programmes, such as information about the park’s characteristics and activities, which can enhance the visitor’s experience and communicate conservation and environmental values efficiently.

We presented an early application of geotagged data to model visitors’ demand and find optimal locations for new information kiosks; it needs further development and adjustment to provide more comprehensive insights. For example, a preliminary survey that addresses the level of use of existing facilities could help to define the number of additional facilities. Moreover, spatial modelling of geotagged data can be used to identify optimal locations for other services, such as camping sites, washrooms and emergency spots, or to design optimal routes for new itineraries and to maximise accessibility to the least used trails.

This study is an initial step towards finding new applications for social media data in nature-based tourism research, particularly in national parks. We presented exploratory data and a visual analysis that outlines the main characteristics of visitor behaviour. Analysing data from social networks offers valuable and innovative information for tourism and geographic research. Insights gained during this study on visitors’ spatiotemporal patterns are relevant for park managers and can be applied in other national parks.

References
Amo, L., López, P., & Martín, J. (2006).
Nature-based tourism as a form of predation risk affects body condition and health state of Podarcis muralis lizards. Biological Conservation, 131(3), 402–409. https://doi.org/10.1016/j.biocon.2006.02.015
Anselin, L. (1995). Local Indicators of Spatial Association—LISA. Geographical Analysis, 27(2), 93–115. https://doi.org/10.1111/j.1538-4632.1995.tb00338.x
Ballantyne, R., Packer, J., & Falk, J. (2011). Visitors’ learning for environmental sustainability: Testing short- and long-term impacts of wildlife tourism experiences using structural equation modelling. Tourism Management, 32(6), 1243–1252. https://doi.org/10.1016/j.tourman.2010.11.003
Balmford, A., Beresford, J., Green, J., Naidoo, R., Walpole, M., & Manica, A. (2009). A global perspective on trends in nature-based tourism. PLoS Biology, 7(6), 1–6. https://doi.org/10.1371/journal.pbio.1000144
Beeco, J. A., Hallo, J. C., & Brownlee, M. T. J. (2014). GPS Visitor Tracking and Recreation Suitability Mapping: Tools for understanding and managing visitor use. Landscape and Urban Planning, 127, 136–145. https://doi.org/10.1016/j.landurbplan.2014.04.002
Brown, G., & Weber, D. (2011). Public Participation GIS: A new method for national park planning. Landscape and Urban Planning, 102(1), 1–15. https://doi.org/10.1016/j.landurbplan.2011.03.003
Buckley, R. (2008). Ecotourism: principles and practices. Wallingford.
Buultjens, J., Ratnayake, I., Gnanapala, A., & Aslam, M. (2005). Tourism and its implications for management in Ruhuna National Park (Yala), Sri Lanka. Tourism Management, 26(5), 733–742. https://doi.org/10.1016/j.tourman.2004.03.014
Cabildo Insular de Tenerife, & Desarrollo Económico. (2018). Turismo alojado en Tenerife por nacionalidad. Retrieved from https://www.webtenerife.com/investigacion/
Campelo, M. B., & Nogueira Mendes, R. M. (2016).
Comparing webshare services to assess mountain bike use in protected areas. Journal of Outdoor Recreation and Tourism, 15, 82–88. https://doi.org/10.1016/j.jort.2016.08.001
Cessford, G., & Muhar, A. (2003). Monitoring options for visitor numbers in national parks and natural areas. Journal of Nature Conservation, 11(4), 240–250. https://doi.org/10.1078/1617-1381-00055
D’Antonio, A., Monz, A. C., Lawson, S., Newman, P., Pettebone, D., & Courtemanch, A. (2010). GPS-Based Measurements of Backcountry Visitors in Parks and Protected Areas: Examples of Methods and Applications from Three Case Studies. Journal of Park and Recreation Administration Fall, 28(3), 42–60.
Dalumpines, R., & Scott, D. M. (2011). GIS-based Map-matching: Development and Demonstration of a Postprocessing Map-matching Algorithm for Transportation Research (pp. 101–120). https://doi.org/10.1007/978-3-642-19789-5_27
Deng, J., Qiang, S., Walker, G. J., & Zhang, Y. (2003). Assessment on and perception of visitors’ environmental impacts of nature tourism: A case study of Zhangjiajie National Forest Park, China. Journal of Sustainable Tourism, 11(6), 529–548. https://doi.org/10.1080/09669580308667219
Dóniz Páez, F. J. (2010). Turismo y espacios naturales protegidos en Canarias: el Parque Nacional de las Cañadas del Teide (Tenerife, España) durante el periodo 2000-2008. Estudios Turísticos, 183, 91–103. Retrieved from http://www.iet.tourspain.es:20000/cgi-iet/tr8spa.exe?W1=5&W2=22586&A6=1001110110001&A7=0
Dudley, N. (2008). Guidelines for applying protected area management categories. (N. Dudley, Ed.). Gland, Switzerland: IUCN. https://doi.org/10.2305/IUCN.CH.2008.PAPS.2.en
Eagles, P. F. J. (2014). Research priorities in park tourism. Journal of Sustainable Tourism, 22(4), 528–549. https://doi.org/10.1080/09669582.2013.785554
Eagles, P. F. J., & Mccool, S. F. (2002).
Tourism in national parks and protected areas: planning and management. (CABI, Ed.), Tourism Management. https://doi.org/10.1016/S0261-5177(03)00091-8
Eagles, P. F. J., Mccool, S. F., & Haynes, C. D. (2002). Sustainable Tourism in Protected Areas Guidelines for Planning and Management: Issue 8 of Best practice protected area guidelines series. World. https://doi.org/10.2305/IUCN.CH.2002.PAG.8.en
ESRI. (2018). Location-allocation analysis. Retrieved from http://desktop.arcgis.com/en/arcmap/latest/extensions/network-analyst/location-allocation.htm
Farrell, T. A., & Marion, J. L. (2001). Identifying and assessing ecotourism visitor impacts at eight protected areas in Costa Rica and Belize. Environmental Conservation, 28(03), 215–225. https://doi.org/10.1017/S0376892901000224
Fish, R., Church, A., & Winter, M. (2016). Conceptualising cultural ecosystem services: A novel framework for research and critical engagement. Ecosystem Services, 21, 208–217. https://doi.org/10.1016/J.ECOSER.2016.09.002
Flickr. (2017). Flickr History. Retrieved October 27, 2017, from https://www.flickr.com/photos/flickr/12433867035
Fortin, M.-J., & Gagnon, C. (1999). An assessment of social impacts of national parks on communities in Quebec, Canada. Environmental Conservation, 3, 200–211.
Gandomi, A., & Haider, M. (2015). Beyond the hype: Big data concepts, methods, and analytics. International Journal of Information Management, 35(2), 137–144. https://doi.org/10.1016/j.ijinfomgt.2014.10.007
García-Palomares, J. C., Gutiérrez-Puebla, J., & Mínguez, C. (2015). Identification of tourist hot spots based on social networks: A comparative analysis of European metropolises using photo-sharing services and GIS. Applied Geography, 63, 408–417. https://doi.org/10.1016/j.apgeog.2015.08.002
George, G., Haas, M. R., & Pentland, A. (2014). Big data and management: From the editors.
Academy of Management Journal, 57(2), 321–326. https://doi.org/10.5465/amj.2014.4002
Gliozzo, G., Pettorelli, N., & Muki Haklay, M. (2016). Using crowdsourced imagery to detect cultural ecosystem services: A case study in South Wales, UK. Ecology and Society, 21(3). https://doi.org/10.5751/ES-08436-210306
Hallo, J. C., Beeco, J. A., Goetcheus, C., McGee, J., McGehee, N. G., & Norman, W. C. (2012). GPS as a Method for Assessing Spatial and Temporal Use Distributions of Nature-Based Tourists. Journal of Travel Research, 51(5), 591–606. https://doi.org/10.1177/0047287511431325
Heikinheimo, V., Minin, E. Di, Tenkanen, H., Hausmann, A., Erkkonen, J., & Toivonen, T. (2017). User-Generated Geographic Information for Visitor Monitoring in a National Park: A Comparison of Social Media Data and Visitor Survey. ISPRS International Journal of Geo-Information, 6(3), 85. https://doi.org/10.3390/ijgi6030085
Hernández Álvarez, J. C. (2017). Parque Nacional Teide: Visita Actual y retos del futuro.
Holmes, J., Williams, F. B., & Brown, L. A. (1972). Facility Location under a Maximum Travel Restriction: An Example Using Day Care Facilities. Geographical Analysis, 4(3), 258–266. https://doi.org/10.1111/j.1538-4632.1972.tb00474.x
Kidd, A. M., D’Antonio, A., Monz, C., Heaslip, K., Taff, D., & Newman, P. (2018). A GPS-Based Classification of Visitors’ Vehicular Behavior in a Protected Area Setting. Journal of Park & Recreation Administration, 36(1), 69–89. https://doi.org/10.18666/JPRA-2018-V36-I1-8287
Lee, J. Y., & Tsou, M. H. (2018). Mapping spatiotemporal tourist behaviors and hotspots through location-based photo-sharing service (Flickr) data. Lecture Notes in Geoinformation and Cartography, (208669), 315–334. https://doi.org/10.1007/978-3-319-71470-7_16
Lee, Y., Kwon, P., Yu, K., & Park, W. (2016).
Method for Determining Appropriate Clustering Criteria of Location-Sensing Data. ISPRS International Journal of Geo-Information, 5(9), 151. https://doi.org/10.3390/ijgi5090151
Manning, R. E. (2002). How much is too much? Carrying capacity of national parks and protected areas. … of Visitor Flows in Recreational and Protected Areas. …, 306–313.
MAPAMA, & OAPN. (2017). Boletín de la red de parques nacionales. Retrieved from http://www.mapama.gob.es/es/red-parques-nacionales/boletin/visitantes-teide.aspx
Meijles, E. W., de Bakker, M., Groote, P. D., & Barske, R. (2014). Analysing hiker movement patterns using GPS data: Implications for park management. Computers, Environment and Urban Systems, 47, 44–57. https://doi.org/10.1016/j.compenvurbsys.2013.07.005
Mitchell, A. (2012). The ESRI Guide to GIS Analysis: Modelling Suitability, Movement, and Interaction (Vol. 3). ESRI press.
Newsome, D., Moore, S., & Dowling, R. (2012). Natural Area Tourism. Bristol.
Newton, J. N., Newman, P., Taff, B. D., D’Antonio, A., & Monz, C. (2017). Spatial temporal dynamics of vehicle stopping behavior along a rustic park road. Applied Geography, 88, 94–103. https://doi.org/10.1016/j.apgeog.2017.08.007
Norman, P., & Pickering, C. M. (2017). Using volunteered geographic information to assess park visitation: Comparing three on-line platforms. Applied Geography, 89(October), 163–172. https://doi.org/10.1016/j.apgeog.2017.11.001
OAPN. (2017). Memoria anual de actividades 2016, Parque Nacional Teide. Retrieved from https://www.miteco.gob.es/es/red-parques-nacionales/nuestros-parques/teide/memoria-teide-2016_tcm30-484562.pdf
Orellana, D., Bregt, A. K., Ligtenberg, A., & Wachowicz, M. (2012). Exploring visitor movement patterns in natural recreational areas. Tourism Management, 33(3), 672–682. https://doi.org/10.1016/j.tourman.2011.07.010
Orsi, F., & Geneletti, D. (2013).
Using geotagged photographs and GIS analysis to estimate visitor flows in natural areas. Journal for Nature Conservation, 21(5), 359–368. https://doi.org/10.1016/j.jnc.2013.03.001
Paracchini, M. L., Zulian, G., Kopperoinen, L., Maes, J., Schägner, J. P., Termansen, M., … Bidoglio, G. (2014). Mapping cultural ecosystem services: A framework to assess the potential for outdoor recreation across the EU. Ecological Indicators, 45, 371–385. https://doi.org/10.1016/j.ecolind.2014.04.018
Romanillos, G., & Gutiérrez, J. (2019). Cyclists do better. Analyzing urban cycling operating speeds and accessibility. International Journal of Sustainable Transportation, 0(0), 1–17. https://doi.org/10.1080/15568318.2019.1575493
Schägner, J. P., Brander, L., Maes, J., Paracchini, M. L., & Hartje, V. (2016). Mapping recreational visits and values of European National Parks by combining statistical modelling and unit value transfer. Journal for Nature Conservation, 31, 71–84. https://doi.org/10.1016/j.jnc.2016.03.001
Scheyvens, R. (1999). Ecotourism and the empowerment of local communities. Tourism Management, 20(2), 245–249. https://doi.org/10.1016/S0261-5177(98)00069-7
Sessions, C., Wood, S. A., Rabotyagov, S., & Fisher, D. M. (2016). Measuring recreational visitation at U.S. National Parks with crowd-sourced photographs. Journal of Environmental Management, 183, 703–711. https://doi.org/10.1016/j.jenvman.2016.09.018
Sonter, L. J., Watson, K. B., Wood, S. A., & Ricketts, T. H. (2016). Spatial and temporal dynamics and value of nature-based recreation, estimated via social media. PLoS ONE, 11(9). https://doi.org/10.1371/journal.pone.0162372
Taczanowska, K., González, L. M., Garcia-Massó, X., Muhar, A., Brandenburg, C., & Toca-Herrera, J. L. (2014). Evaluating the structure and use of hiking trails in recreational areas using a mixed GPS tracking and graph theory approach.
Applied Geography, 55, 184–192. https://doi.org/10.1016/j.apgeog.2014.09.011
Tenkanen, H., Di Minin, E., Heikinheimo, V., Hausmann, A., Herbst, M., Kajala, L., & Toivonen, T. (2017). Instagram, Flickr, or Twitter: Assessing the usability of social media data for visitor monitoring in protected areas. Scientific Reports, 7(1), 1–11. https://doi.org/10.1038/s41598-017-18007-4
The Internet Archive. (2016). Yahoo timeline. Retrieved March 3, 2017, from https://web.archive.org/web/20080713214826/http://yhoo.client.shareholder.com/press/timeline.cfm
Tomczyk, A. M. (2011). A GIS assessment and modelling of environmental sensitivity of recreational trails: The case of Gorce National Park, Poland. Applied Geography, 31(1), 339–351. https://doi.org/10.1016/j.apgeog.2010.07.006
Valentine, P. (1992). Review: Nature-based tourism. Special Interest Tourism, 105–127. Retrieved from http://link.springer.com/chapter/10.1007/978-1-4020-6799-0_8%5Cnhttp://researchonline.jcu.edu.au/1632/
Walden-Schreiner, C., Leung, Y. F., & Tateosian, L. (2018). Digital footprints: Incorporating crowdsourced geographic information for protected area management. Applied Geography, 90(December 2017), 44–54. https://doi.org/10.1016/j.apgeog.2017.11.004
Wikiloc. (2017). Wikiloc’s history. Retrieved from https://es.wikiloc.com/wikiloc/about-us.do#history
Wimpey, J., & Marion, J. L. (2011). A spatial exploration of informal trail networks within Great Falls Park, VA. Journal of Environmental Management, 92(3), 1012–1022. https://doi.org/10.1016/j.jenvman.2010.11.015
Wood, S. A., Guerry, A. D., Silver, J. M., & Lacayo, M. (2013). Using social media to quantify nature-based tourism and recreation. Scientific Reports, 3, 2976. https://doi.org/10.1038/srep02976
Zambrano, A. M. A., Broadbent, E. N., & Durham, W. H. (2010).
Social and environmental effects of ecotourism in the Osa Peninsula of Costa Rica: the Lapa Rios case. Journal of Ecotourism, 9(1), 62–83. https://doi.org/10.1080/14724040902953076

Table 1. Statistics of point of interest's clusters
# | Cluster/Outlier | Name | # hexagons | Users | LM Index | p value
1 | H-H cluster | Cañada Blanca | 16 | 913 | 124.28 | 0.001
2 | H-H cluster | Mount Teide | 10 | 601 | 75.76 | 0.003
3 | H-H cluster | Llano de Ucanca | 4 | 106 | 32.69 | 0.009
4 | H-H cluster | Roques de García | 4 | 60 | 27.83 | 0.002
5 | H-H cluster | Altavista refuge | 1 | 22 | 28.32 | 0.000
6 | H-L outlier | Cableway base station | 1 | 114 | -29.78 | 0.000
7 | H-L outlier | Minas San Jose | 1 | 80 | -16.31 | 0.013
8 | H-L outlier | La Tarta | 1 | 64 | -11.51 | 0.010
9 | H-L outlier | Narices del Teide | 1 | 61 | -20.36 | 0.000
10 | H-L outlier | Boca de Tauce | 1 | 57 | -19.42 | 0.002

Table 2. Covered demand for 2 existing facilities
Cut-off distance (m) | Average distance (m) | Demand | Weighted demand sum | % covered demand by existing facilities
1000 | 616.63 | 372 | 314.83 | 23%
1500 | 937.37 | 454 | 405.17 | 29%
2000 | 1146.70 | 494 | 491.52 | 31%
2500 | 1306.54 | 538 | 559.06 | 34%

Table 3. Covered demand for chosen facilities in all scenarios
Scenario | Cut-off distance (m) | Average distance (m) | Demand | Weighted demand sum | % covered demand by new facilities | % total demand covered (existing + new facilities)
1 facility | 1000 | 560.89 | 200 | 192.44 | 13% | 36%
1 facility | 1500 | 943.55 | 337 | 262.31 | 22% | 51%
1 facility | 2000 | 1152.86 | 404 | 334.64 | 26% | 57%
1 facility | 2500 | 1377.41 | 444 | 391.52 | 28% | 62%
3 facilities | 1000 | 580.20 | 493 | 507.11 | 31% | 54%
3 facilities | 1500 | 866.87 | 580 | 636.16 | 37% | 66%
3 facilities | 2000 | 1070.64 | 639 | 749.63 | 41% | 72%
3 facilities | 2500 | 1280.18 | 699 | 854.41 | 45% | 79%
5 facilities | 1000 | 575.94 | 630 | 731.19 | 40% | 63%
5 facilities | 1500 | 841.86 | 707 | 902.7 | 45% | 74%
5 facilities | 2000 | 1080.68 | 799 | 1049.41 | 51% | 82%
5 facilities | 2500 | 1307.86 | 878 | 1210.26 | 56% | 90%
10 facilities | 1000 | 561.27 | 840 | 1161.05 | 54% | 77%
10 facilities | 1500 | 820.26 | 912 | 1425.04 | 58% | 87%
10 facilities | 2000 | 1006.03 | 953 | 1645.14 | 61% | 92%
10 facilities | 2500 | 1193.04 | 1038 | 1844.61 | 66% | 100%

Figure 1. Location map of the Teide National Park

Figure 2.
Illustration of the Map-matching process. Route map-matched within the 50 m distance buffer. A buffer distance of 25 m would prevent the matching of this route.

Figure 3. Spatial distribution of Flickr photographs 2010-2017

Figure 4. Location of Wikiloc Tracks 2006 – 2018

Figure 5. Origin of visitors according to Flickr and Visitor survey; SP: Spain, UK: United Kingdom, DEU: Germany, FR: France, BEL: Belgium, ITA: Italy, CHE: Switzerland, USA: the United States, RUS: Russia, NLD: Netherlands, NOR: Norway, OC: other countries.

Figure 6. Average Monthly distribution of Flickr users and Teide National Park visitors (2010-2016).

Figure 7. Distribution of Flickr users according to days of the week.

Figure 8. Hourly distribution of total visitors and in main park attractions

Figure 9. The Moran’s I scatterplot of Flickr’s users in the Teide

Figure 10. Anselin Local Moran's Index Map at the user level

Figure 11. Route density of Flickr tracks

Figure 12. Route density of Wikiloc tracks

Figure 13. Results from the location-allocation models: (1) 1 additional facility, (2) 3 additional facilities, (3) 5 additional facilities, (4) 10 additional facilities.