16

I'd like to add icons to my map showing locations which have Wikipedia entries.

The WikiLocation API is a great resource for determining Wiki entries near an XY location, but it has limitations (20km radius, 50 points only) which make it unsuitable for my purposes.

Ideally I need the map to show ALL locations with Wiki entries, so I guess I'll need to download the Wikipedia dump file and parse it for geocoded entries.

Where is this information found in the Wikipedia dump files?

The Wiki database may be downloaded from http://en.wikipedia.org/wiki/Wikipedia:Database_download but I can't see which file(s) I need to search for the coordinates.

For example, the entry for Sydney includes:

Coordinates:    33°51′35.9″S 151°12′40″E

Where is this found in the dump files? Another way to phrase the question is "how can I determine which Wiki entries are geo-coded?"

2 Answers

9

See the database dump section on http://de.wikipedia.org/wiki/Wikipedia:WikiProjekt_Georeferenzierung/Wikipedia-World/en. This gives the option for a PostGIS connection.

There was also a CSV download option at one time but that seems to have been removed sometime between 2011 and 2016.

Also look through http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Geographical_coordinates for other options.

5
  • Thanks, the CSV download looks like what I need, along with the PostGIS file to set field headers. Aug 22, 2011 at 3:12
  • @StephenLead do you/Patrick know if the CSV file has been moved or removed? There was a question looking for it yesterday and to stop that being a duplicate of this I am thinking about noting here that the CSV is no longer available.
    – PolyGeo
    Mar 15, 2016 at 21:21
  • @PolyGeo no idea sorry, I haven't looked at this issue since Aug 19, 2011 ;) Mar 15, 2016 at 22:15
  • Nor I! And I've switched careers!
    – Patrick
    Mar 15, 2016 at 22:15
  • @StephenLead No problem - I'll proceed to edit Patrick's answer as planned and hope he'll correct me if I get it wrong.
    – PolyGeo
    Mar 15, 2016 at 22:16
4

An API solution is probably your best bet. If you are dead set on using the database dumps, though, the WikiLocation API author explains in this Twitter response that he does not scrape the entire Wikipedia database dump, only certain tables.

I would start with the external links dump and look for links to GeoHack (probably only those matching the URL pattern used by links such as the Sydney one).
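As a rough illustration of that idea, here is a minimal Python sketch that pulls a latitude/longitude pair out of a single GeoHack-style link. It assumes the link carries a `params` query argument in the underscore-separated form `D_M_S_N_D_M_S_E` (minutes and seconds optional), which is how such links commonly look; the example URL and the function names are mine, not something taken from the dump documentation, and other coordinate notations would need extra handling.

```python
import re
from urllib.parse import urlparse, parse_qs

# One numeric component, e.g. "33" or "35.9".
_NUM = r"(\d+(?:\.\d+)?)"
# Degrees, then optional minutes and seconds, each separated by "_".
_AXIS = _NUM + r"(?:_" + _NUM + r")?(?:_" + _NUM + r")?_"
# Assumed layout: <lat parts>_<N|S>_<lon parts>_<E|W>[_extra...]
_PARAMS_RE = re.compile("^" + _AXIS + "([NS])_" + _AXIS + "([EW])(?:_|$)")


def to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds strings to signed decimal degrees."""
    value = float(degrees) + float(minutes or 0) / 60 + float(seconds or 0) / 3600
    return -value if hemisphere in "SW" else value


def coords_from_geohack_url(url):
    """Return (lat, lon) from a GeoHack-style URL, or None if not recognised."""
    query = parse_qs(urlparse(url).query)
    params = query.get("params", [""])[0]
    match = _PARAMS_RE.match(params)
    if match is None:
        return None
    lat = to_decimal(*match.group(1, 2, 3, 4))
    lon = to_decimal(*match.group(5, 6, 7, 8))
    return lat, lon


# Illustrative URL only (assumed shape), built from the Sydney coordinates above.
sydney = ("https://geohack.toolforge.org/geohack.php"
          "?pagename=Sydney&params=33_51_35.9_S_151_12_40_E_type:city")
print(coords_from_geohack_url(sydney))  # roughly (-33.86, 151.21)
```

You would still need to read the URLs out of the dump yourself and run each one through something like `coords_from_geohack_url`, skipping the ones that return `None`.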
