@ColinMaudry
Last active November 24, 2023 15:46
cURL examples to query Wikidata

SPARQL Queries (with cURL command) on Wikidata

This gist turned out to be the spark for a proper article, and it won't be maintained here anymore.

The SPARQL endpoint is http://wdqs-beta.wmflabs.org/bigdata/namespace/wdq/sparql and it has a Web form to fire queries.

The repository doesn't have named graphs, or at least the SPARQL endpoint rejects graph queries. The classes of entities (rdf:type) are not described in the repository. However, http://www.wikidata.org/prop/direct/P31 ("instance of") tells you what the entity is.

To find the HTML page of an entity (such as https://www.wikidata.org/entity/Q866405), simply replace /entity/ with /wiki/.
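The substitution can also be scripted. A minimal sketch, assuming a POSIX shell with sed; the function name entity_to_page is made up for this example:

```shell
# Hypothetical helper: turn a Wikidata entity URI into the URL of its HTML page
# by replacing the first /entity/ path segment with /wiki/.
entity_to_page() {
  printf '%s\n' "$1" | sed 's#/entity/#/wiki/#'
}

entity_to_page "https://www.wikidata.org/entity/Q866405"
# prints https://www.wikidata.org/wiki/Q866405
```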

By default the SPARQL endpoint returns the results in SPARQL Results XML format, but:

  • adding -H "Accept: application/json" to the cURL command returns them in the JSON Query Results format
  • adding -H "Accept: text/csv" to the cURL command returns them in CSV format (the most readable).

Don't forget to URL encode the query ;-)
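If the query is not encoded yet, the encoding can be done in the shell itself. A rough sketch, assuming a POSIX shell and an ASCII-only query; urlencode is a helper name invented here, not a curl or shell built-in:

```shell
# Hypothetical helper: percent-encode every character of $1 except the
# RFC 3986 unreserved ones (letters, digits, "-", ".", "_", "~").
# Assumption: the input is plain ASCII.
urlencode() {
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}          # everything after the first character
    c=${s%"$rest"}       # the first character itself
    case $c in
      [a-zA-Z0-9.~_-]) out=$out$c ;;
      *) out=$out$(printf '%%%02X' "'$c") ;;   # "'c" yields the character code
    esac
    s=$rest
  done
  printf '%s\n' "$out"
}

urlencode 'describe <http://www.wikidata.org/entity/Q866405>'
# prints describe%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fentity%2FQ866405%3E
```

The printed string is what goes after ?query= in the endpoint URL.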

The list of entity types

#Selects the first 20 types of entities encountered:

select distinct ?type where {
?thing a ?type
}
limit 20

curl "http://wdqs-beta.wmflabs.org/bigdata/namespace/wdq/sparql?query=select%20distinct%20%3Ftype%20where%20%7B%0A%3Fthing%20a%20%3Ftype%0A%7D%0Alimit%2020" -H "Accept: text/csv"

Description of an entity (East Antarctica)

A describe query returns all the triples in which the selected entity is the subject (position 1/3). Certain SPARQL endpoints (such as this one) also return the triples in which the entity is the object (position 3/3). Because the result is a graph and not a table as with SPARQL Results, the default format is RDF/XML. JSON and Turtle (text/turtle) can also be requested.

describe <http://www.wikidata.org/entity/Q866405>

curl "http://wdqs-beta.wmflabs.org/bigdata/namespace/wdq/sparql?query=describe%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fentity%2FQ866405%3E" -H "Accept: text/turtle"

@ColinMaudry
Author

If you need a specific SPARQL request, you can request (uh uh) it here and I'll see what I can do.

@lucaswerkmeister

curl can also do the URL encoding for you:

curl -G https://wdqs-beta.wmflabs.org/bigdata/namespace/wdq/sparql --data-urlencode query='
select distinct ?type where {
?thing a ?type
}
limit 20
'

(Note the -G option to use a GET request)
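The same idea can be wrapped in a small function so the query and the result format become arguments. This is only a sketch: wdqs_get and the WDQS_ENDPOINT variable are names invented here, and the default endpoint is the one used in later comments of this thread.

```shell
# Endpoint taken from later comments in this thread; override it if yours differs.
WDQS_ENDPOINT="${WDQS_ENDPOINT:-https://query.wikidata.org/sparql}"

# Hypothetical wrapper: wdqs_get QUERY [ACCEPT]
# -G makes curl send a GET request and append the URL-encoded
# --data-urlencode value to the URL as the query string.
wdqs_get() {
  curl -G "$WDQS_ENDPOINT" \
    -H "Accept: ${2:-text/csv}" \
    --data-urlencode "query=$1"
}

# Example call (needs network access):
# wdqs_get 'select distinct ?type where { ?thing a ?type } limit 20'
```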

@ColinMaudry
Author

Nice thanks, I didn't know about it

However, if I paste that in my console, it treats the line breaks as Enter keypresses and runs each line as a separate command. It may work in a shell script, though.

My objective here was to give one-liners to paste into a console.

@lucaswerkmeister

Weird, the shell shouldn’t start executing the command until the quote is closed… which shell are you using?

@ColinMaudry
Author

Apparently it's a Windows limitation 😬

Via SSH on my Ubuntu server, it worked.

I tried with both ConEmu and the standard command prompt from my Windows laptop: both fail. To keep the examples cross-platform, I'll keep the URL encoding steps. Thanks for the input!
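For consoles that run each pasted line separately, one workaround is to collapse the query to a single line before pasting it. A sketch, assuming a POSIX shell with tr and sed; flatten_query is a name invented here. It is only safe for queries without # comments and without string literals in which line breaks or repeated spaces matter.

```shell
# Hypothetical helper: replace newlines and tabs with spaces, squeeze runs of
# spaces, and trim both ends, so the query fits on one line.
flatten_query() {
  printf '%s' "$1" | tr '\n\t' '  ' | sed 's/  */ /g; s/^ //; s/ $//'
}

flatten_query 'select distinct ?type where {
?thing a ?type
}
limit 20'
# prints select distinct ?type where { ?thing a ?type } limit 20
```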

@vorachet

vorachet commented Jun 8, 2017

I tried to use https://query.wikidata.org/sparql with HTTP POST. It does not work. The manual (https://www.mediawiki.org/wiki/Wikidata_query_service/User_Manual) says:

"POST requests also accepts query in the body of the request, instead of URL, allowing to run larger queries without hitting URL length limit."

Does anyone use https://query.wikidata.org/sparql with POST?

@dr0i

dr0i commented Jun 23, 2017

@vorachet https://en.wikibooks.org/wiki/SPARQL/Wikidata_Query_Service says that PUT is forbidden. Normally, you don't need POST to run large queries; use GET. I do it the way @lucaswerkmeister described in his comment of 29 May 2015. Put this into a file.sh:

curl --header "Accept: application/sparql-results+json"  -G 'https://query.wikidata.org/sparql' --data-urlencode query='
SELECT ?s WHERE {
your complex query here
}'

and run it e.g. in bash with "bash file.sh".
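A variation on this, in case the query should live in its own file: with the name@filename form of --data-urlencode, curl loads the file, URL-encodes its content, and sends it under the given name. A sketch only; the function name wdqs_get_file and the file name query.rq are chosen here for illustration.

```shell
# Hypothetical wrapper: wdqs_get_file FILE
# "query@FILE" tells curl to read FILE, URL-encode it, and pass it as query=...
wdqs_get_file() {
  curl -G 'https://query.wikidata.org/sparql' \
    --header "Accept: application/sparql-results+json" \
    --data-urlencode "query@$1"
}

# Example call (needs network access and an existing query.rq):
# wdqs_get_file query.rq
```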

@ppKrauss

ppKrauss commented Aug 2, 2018

One more example: generating a CSV file (countries.csv) with only item labels (the Wikidata URL prefix is cut off) and some data.

curl -o countries.csv -G 'https://query.wikidata.org/sparql' \
     --header "Accept: text/csv"  \
     --data-urlencode query='
 SELECT DISTINCT ?iso2 ?qid ?osm_relid ?itemLabel
 WHERE {
  ?item wdt:P297 _:b0.
  BIND(strafter(STR(?item),"http://www.wikidata.org/entity/") as ?qid).
  OPTIONAL { ?item wdt:P1448 ?name .}
  OPTIONAL { ?item wdt:P297 ?iso2 .}
  OPTIONAL { ?item wdt:P402 ?osm_relid .}
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en,[AUTO_LANGUAGE]" . }
 }
 ORDER BY ?iso2
'

@ImperialRoyalKing999

Thank you all. I will try iso2

@fishfree

fishfree commented Jun 7, 2023

The countries.csv example from @ppKrauss above does not work for me now, even after I changed the query path to https://query.wikidata.org

@dr0i

dr0i commented Jun 9, 2023

@fishfree try again, maybe this was just a hiccup at WD. I tried your snippet and it works here.

@Podbrushkin

Podbrushkin commented Nov 24, 2023

Maybe this will be helpful too. Both commands print to the console and write to the $resp variable. From PowerShell:

curl -X POST -H 'Content-Type: application/sparql-query' -H 'Accept: text/csv' --data ($sparql -join ' ') https://query.wikidata.org/sparql | ConvertFrom-Csv -OutVariable resp
Invoke-RestMethod -Uri https://query.wikidata.org/sparql -Method Post -Body $sparql -Headers @{
    "Content-Type" = "application/sparql-query"
    "Accept" = "application/sparql-results+json"
} -OutVariable resp
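For shells other than PowerShell, roughly the same POST can be written with plain curl. A sketch, assuming a POSIX shell; wdqs_post is a name invented here. Like the commands above, it sends the raw query as the request body with Content-Type: application/sparql-query. --data-binary is used so that curl keeps the line breaks of the query.

```shell
# Hypothetical wrapper: wdqs_post QUERY [ACCEPT]
# --data-binary implies a POST request and sends the body unmodified.
# Assumption: QUERY does not start with "@", which curl would read as a file name.
wdqs_post() {
  curl 'https://query.wikidata.org/sparql' \
    -H 'Content-Type: application/sparql-query' \
    -H "Accept: ${2:-text/csv}" \
    --data-binary "$1"
}

# Example call (needs network access):
# wdqs_post 'select distinct ?type where { ?thing a ?type } limit 20'
```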
