Wikidata:Report a technical problem/WDQS and Search
| On this page, old discussions are archived after 60 days. An overview of all archives can be found at this page's archive index. The current archive is located at 2026/01. |
item returns 2 different schema:dateModified values
Moved here from Wikidata:Report a technical problem Lucas Werkmeister (WMDE) (talk) 12:18, 6 June 2025 (UTC)
For Royal Navy vessels, I observe this problem for 3 out of 7500 items.
SELECT DISTINCT ?item ?itemDescription ?modified WHERE {
VALUES ?item {wd:Q1297941}
SERVICE wikibase:label {bd:serviceParam wikibase:language "en-gb,mul,en"}
?item schema:dateModified ?modified .
FILTER(BOUND(?modified) && DATATYPE(?modified) = xsd:dateTime).
}
GROUP BY ?item ?itemDescription ?modified
returns 21 February 2024 and 7 March 2025. Only the latter is correct. How can it be storing 2 values? Vicarage (talk) 11:04, 6 June 2025 (UTC)
- Now only returning one. Another example of the rogue query server? Vicarage (talk) 06:27, 9 June 2025 (UTC)
- Tried it again, got 2 responses again, is the rogue still there? Vicarage (talk) 16:25, 11 June 2025 (UTC)
- @Vicarage thanks for reporting this. I confirm that there are still two WDQS servers misbehaving here (wdqs1017 & wdqs1021). I believe this might be related to phab:T386098; looking at the progress of the data transfer, I can see that these servers have not yet been reloaded. I'll raise this problem on the ticket.
- As to why it happens: if the WDQS machines de-synchronize from the update stream, this kind of inconsistency can appear in WDQS. DCausse (WMF) (talk) 10:31, 14 August 2025 (UTC)
- Should be all resolved. If you have trouble getting a response from us in the future (sometimes we forget to check this page as frequently as would be ideal), feel free to drop into #wikimedia-search on Libera Chat IRC and raise the inquiry there. RKemper (WMF) (talk) 22:06, 26 August 2025 (UTC)
- This is Brian, I'm part of the SRE team that supports WDQS. Sorry for the delay on this! We are working through the hosts on https://etherpad.wikimedia.org/p/wdqs-reload-T386098 . We should be done within the next couple of days. If you continue to see issues, feel free to ping us directly in the linked Phab task. BKing (WMF) (talk) 21:28, 20 August 2025 (UTC)
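While servers disagree like this, one possible client-side guard is to keep only the newest schema:dateModified per item, since ISO 8601 timestamps compare correctly as plain strings. A minimal Python sketch (the times of day are illustrative; only the dates come from the report above):

```python
# Rows as (item, modified) pairs, as a CSV/JSON result set might yield them.
# A stale server and a current server disagree about Q1297941.
rows = [
    ("Q1297941", "2024-02-21T00:00:00Z"),  # value from a stale server
    ("Q1297941", "2025-03-07T00:00:00Z"),  # current value
]

latest = {}
for item, modified in rows:
    # ISO 8601 UTC timestamps sort lexicographically, so string
    # comparison is enough to find the newest value per item.
    if item not in latest or modified > latest[item]:
        latest[item] = modified

assert latest["Q1297941"] == "2025-03-07T00:00:00Z"
```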
upstream request timeout
I'm getting this response and a 60-second timeout from https://query.wikidata.org/sparql for what seem like quite innocuous queries. The message has only started appearing in the last couple of days, and replaces the full SPARQL traceback I used to get. Re-runs a few minutes later often work, with response times < 10 seconds. Vicarage (talk) 16:24, 11 June 2025 (UTC)
- And what times out from the command line completes in 6 seconds from the website Vicarage (talk) 20:42, 11 June 2025 (UTC)
- @Vicarage do you still see the issue, if yes could you elaborate a bit more on what you are trying to achieve? What is the SPARQL query? What do you mean by command line, are you using curl, if yes could you paste the full command you are running? DCausse (WMF) (talk) 15:13, 13 August 2025 (UTC)
- From the Web page
- SELECT DISTINCT
- ?item
- ?itemDescription
- WHERE {
- SERVICE wikibase:label {bd:serviceParam wikibase:language "en-gb,mul,en"}
- VALUES ?item {wd:Q182027}
- }
- times out today (or takes 28 seconds!!!), after I cut all the real content from a query Vicarage (talk) 09:24, 15 August 2025 (UTC)
- @Vicarage thanks for the information.
- I can't seem to reproduce the issue. I ran some load testing using this query on both datacenters, and it always seems to return results in less than 1 second. By chance, have you heard of any other users facing similar problems with this particular query?
- What could explain the problem is that some WDQS servers might enter a deadlock, causing some queries to fail, but usually the problematic servers are automatically removed from rotation so that they no longer serve queries until they're restarted. DCausse (WMF) (talk) 08:19, 18 August 2025 (UTC)
- Sorry, I don't really talk to others about their MySQL problems. Another problem that may be due to inconsistent servers: this query, which should always return 1 result, randomly returned 1, 2 or 3 results each time I hit submit yesterday and today from the web page, as if I'm getting a different server round-robin.
- {{SPARQL|query=
- SELECT DISTINCT
- ?item
- (COALESCE(?label1,SAMPLE(?label2),'Unknown') AS ?title)
- ?itemDescription
- (GROUP_CONCAT(DISTINCT ?alias; separator="#") AS ?aliases)
- (SAMPLE(?country1) AS ?country)
- (MAX(?modified1) AS ?modified)
- (MIN(?start1) AS ?start)
- (MAX(?end1) AS ?end)
- (GROUP_CONCAT(DISTINCT ?conn; SEPARATOR='#') AS ?connlist)
- (GROUP_CONCAT(DISTINCT ?hconn; SEPARATOR='#') AS ?hconnlist)
- (GROUP_CONCAT(DISTINCT ?tag; SEPARATOR='#') AS ?taglist)
- (GROUP_CONCAT(DISTINCT ?note; SEPARATOR=', ') AS ?notelist)
- (GROUP_CONCAT(DISTINCT ?lnote; SEPARATOR=', ') AS ?lnotelist)
- (SAMPLE (COALESCE(?position1,?position2,?position3)) AS ?position)
- (SAMPLE (STR(?bestimage)) AS ?image)
- WHERE {
- SERVICE wikibase:label {bd:serviceParam wikibase:language "en-gb,mul,en"}
- VALUES ?item {wd:Q150609}
- VALUES ?details {
- wdt:P279 # subclass
- }
- VALUES ?hdetails {wdt:P137 wdt:P176} # operator, manufacturer
- VALUES ?idetails {wdt:P7906}
- VALUES ?tags {wdt:P31} # instance
- VALUES ?starts {wdt:P580 wdt:P571 wdt:P729} # start time, inception, service entry
- VALUES ?ends {wdt:P582 wdt:P730} # end time, service retirement
- VALUES ?countries {wdt:P495} # origin
- VALUES ?images {wdt:P7906}
- VALUES ?registers {wd:Q22964288} # military
- OPTIONAL {
- SERVICE wikibase:label {bd:serviceParam wikibase:language "en-gb,mul,en".
- ?item rdfs:label ?label1}. FILTER (!REGEX(?label1,"^[Q][0-9]"))
- }
- OPTIONAL {
- ?item ?countries ?country.
- ?country wdt:P37/wdt:P424 ?langcode.
- ?item rdfs:label ?label2 FILTER (LANG(?label2) = ?langcode)
- }
- OPTIONAL {?item skos:altLabel ?alias. FILTER(LANG(?alias) = "en")}
- OPTIONAL {{?item ?hdetails ?t2} ?t2 rdfs:label ?n2. FILTER (LANG(?n2) = 'en') BIND(CONCAT(?n2,'£',STR(?t2)) AS ?hconn)}
- }
- GROUP BY ?item ?itemDescription ?label1 ?label2 ?wikipedia ?ia
- }} Vicarage (talk) 09:05, 18 August 2025 (UTC)
- @Vicarage in this last query, no, the problem is not due to inconsistent WDQS servers but due to the SPARQL features you are using. Some SPARQL features, such as GROUP_CONCAT or SAMPLE, are not deterministic; in other words, they might return different results even when given the same arguments.
- Here, what is likely happening is that the de-duplication you are requesting via DISTINCT is not de-duplicating the way you expect. The item Q150609 you are requesting is almost certainly duplicated because of some of your triple patterns, but the GROUP_CONCAT on ?hconn may concatenate its values in a different order for the two rows, which in turn tells DISTINCT to keep both lines (you can see that the two lists have a slightly different ordering when two results are returned).
- One way to solve this would be to understand why you needed to add DISTINCT in the first place and fix your triple patterns so that you no longer need it. DCausse (WMF) (talk) 15:44, 19 August 2025 (UTC)
- I don't think so. I understand that SPARQL does not guarantee an order for GROUP_CONCAT (and my application manually reorders after extraction), but I think I can expect that for each ?item I get a list of other variables which are either sampled or concatenated. I think my problem might be with ?label2 being in the GROUP BY even though it is SAMPLEd. I will investigate, though I'd expect Blazegraph to catch that. Vicarage (talk) 16:03, 19 August 2025 (UTC)
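The mechanism DCausse describes can be reproduced outside SPARQL. A small Python sketch (the group members and separator are hypothetical, modelled on the '£'-joined ?hconn values in the query above): an order-sensitive aggregate like GROUP_CONCAT turns the same logical group into different strings depending on evaluation order, so a DISTINCT over the finished rows keeps both.

```python
def group_concat(values, separator="#"):
    # Mimics SPARQL GROUP_CONCAT: joins values in the order received,
    # which the engine does not guarantee to be stable between runs.
    return separator.join(values)

# Hypothetical ?hconn members for item Q150609.
members = ["Royal Navy£Q172771", "Vickers£Q921992"]

# Two runs (or two servers) feed the same group in different orders.
row_run1 = ("Q150609", group_concat(members))
row_run2 = ("Q150609", group_concat(list(reversed(members))))

# Same logical group, but the concatenated strings differ, so a
# DISTINCT over the finished rows sees two distinct rows.
assert row_run1 != row_run2
assert len({row_run1, row_run2}) == 2
```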
Linking a lexeme to its Wiktionary page
Hello, I notice that it is not possible to link a lexeme (LID) (e.g. akanza) to its Wiktionary page (e.g. akanza), the way this is already possible for an item (QID). Poro26 (talk) 15:36, 22 August 2025 (UTC)
- (apologies for writing in English)
- Hi @Poro26. It’s not possible to directly link Lexemes (LIDs) to Wiktionary pages the same way Items (QIDs) can be linked. See Wikidata:Wiktionary. The main reason is that Wiktionary pages often cover more than one word, so there isn’t always a clear one-to-one correspondence between a Wiktionary page and a single Wikidata Lexeme.
- PS: For future reference, enquiries like this are best placed at Wikidata:Report a technical problem, since this page is specifically for issues with the Query Service and search features. -Mohammed Abdulai (WMDE) (talk) 08:38, 25 August 2025 (UTC)
- Hello @Poro26, I have just moved your question to Wikidata:Bistro#Lier_un_lexème_à_sa_page_sur_le_Wiktionnaire, as I think that page is more appropriate for it. DCausse (WMF) (talk) 08:45, 25 August 2025 (UTC)
web endpoint truncating
Queries that return lots of rows are being truncated mid-line, at 327680 or 1683840 characters, which are suspicious numbers. The same queries produce additional results through the GUI.
SELECT DISTINCT ?item ?modified
WHERE {
{?item wdt:P31/wdt:P279* wd:Q35509} # cave
{
?wikipedia schema:about ?item.
FILTER regex(str(?wikipedia), 'wikipedia.org')
}
?item schema:dateModified ?modified .
}
using
curl -s -H "Accept: text/csv" -H "User-Agent: expounder (https://expounder.info)" \
  'https://query.wikidata.org/sparql' --data-urlencode query="$(cat list.query)" > list.wd
tail -2 /home/john/Expounder/Underfoot/Sites/select/caves/list.wd
http://www.wikidata.org/entity/Q27309689,2024-11-21T17:39:02Z
http://www.w
and it is inconsistent: changing cave to tunnel (wd:Q44377) allows 980000 characters through
Vicarage (talk) 14:46, 27 August 2025 (UTC)
- Hi,
- I was not able to replicate this issue. In all tests I've done, the CLI and the GUI returned consistent result sets.
- If the issue persists, could you run curl with higher verbosity (-vv) and share the output? GModena (WMF) (talk) 10:31, 2 September 2025 (UTC)
- Still happening after over a week. Today it first timed out, then gave truncated results after 50 seconds, with
- % Total % Received % Xferd Average Speed Time Time Time Current
- Dload Upload Total Spent Left Speed
- 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying 2a02:ec80:300:ed1a::1:443...
- TCP_NODELAY set
- Connected to query.wikidata.org (2a02:ec80:300:ed1a::1) port 443 (#0)
- ALPN, offering h2
- ALPN, offering http/1.1
- successfully set certificate verify locations:
- CAfile: /etc/ssl/certs/ca-certificates.crt
- CApath: /etc/ssl/certs
- } [5 bytes data]
- TLSv1.3 (OUT), TLS handshake, Client hello (1):
- } [512 bytes data]
- TLSv1.3 (IN), TLS handshake, Server hello (2):
- { [122 bytes data]
- TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
- { [19 bytes data]
- TLSv1.3 (IN), TLS handshake, Certificate (11):
- { [2768 bytes data]
- TLSv1.3 (IN), TLS handshake, CERT verify (15):
- { [80 bytes data]
- TLSv1.3 (IN), TLS handshake, Finished (20):
- { [36 bytes data]
- TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
- } [1 bytes data]
- TLSv1.3 (OUT), TLS handshake, Finished (20):
- } [36 bytes data]
- SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256
- ALPN, server accepted to use h2
- Server certificate:
- subject: CN=*.wikipedia.org
- start date: Aug 10 23:56:29 2025 GMT
- expire date: Nov 8 23:56:28 2025 GMT
- subjectAltName: host "query.wikidata.org" matched cert's "*.wikidata.org"
- issuer: C=US; O=Let's Encrypt; CN=E6
- SSL certificate verify ok.
- Using HTTP2, server supports multi-use
- Connection state changed (HTTP/2 confirmed)
- Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
- } [5 bytes data]
- Using Stream ID: 1 (easy handle 0x559f9c205dc0)
- } [5 bytes data]
- > POST /sparql HTTP/2
- > Host: query.wikidata.org
- > accept: text/csv
- > user-agent: expounder (https://expounder.info)
- > content-length: 1714
- > content-type: application/x-www-form-urlencoded
- >
- } [5 bytes data]
- We are completely uploaded and fine
- { [5 bytes data]
- TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
- { [249 bytes data]
- TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
- { [249 bytes data]
- old SSL session ID is stale, removing
- { [5 bytes data]
- Connection state changed (MAX_CONCURRENT_STREAMS == 100)!
- } [5 bytes data]
- 100 1714 0 0 100 1714 0 197 0:00:08 0:00:08 --:--:-- 0< HTTP/2 200
- < server: nginx/1.18.0
- < date: Tue, 02 Sep 2025 11:07:26 GMT
- < content-type: text/csv;charset=utf-8
- < content-disposition: attachment; filename=query1455078.csv
- < x-first-solution-millis: 7
- < x-served-by: wdqs1011
- < access-control-allow-origin: *
- < access-control-allow-headers: accept, content-type, content-length, user-agent, api-user-agent
- < cache-control: public, max-age=300
- < age: 4
- < vary: Accept, Accept-Encoding
- < x-cache: cp3069 miss, cp3069 pass
- < x-cache-status: pass
- < server-timing: cache;desc="pass", host;desc="cp3069"
- < strict-transport-security: max-age=106384710; includeSubDomains; preload
- < report-to: { "group": "wm_nel", "max_age": 604800, "endpoints": [{ "url": "https://intake-logging.wikimedia.org/v1/events?stream=w3
- c.reportingapi.network_error&schema_uri=/w3c/reportingapi/network_error/1.0.0" }] }
- < nel: { "report_to": "wm_nel", "max_age": 604800, "failure_fraction": 0.05, "success_fraction": 0.0}
- < x-client-ip: 2a0a:ef40:9ec:9d01:6539:d50c:9686:6497
- < set-cookie: WMF-Uniq=dT5jT3d61u7QwurXwlpVTQJiAAAAAFvdmI6ubr0VowiIZp7-3bCDysD9G5KdipaX;Domain=.wikidata.org;Path=/;HttpOnly;secure;S
- ameSite=None;Expires=Wed, 02 Sep 2026 00:00:00 GMT
- <
- { [13833 bytes data]
- 100 321k 0 320k 100 1714 6516 34 0:00:50 0:00:50 --:--:-- 0
- Connection #0 to host query.wikidata.org left intact
- 50.29
- 5233 results for caves Vicarage (talk) 11:09, 2 September 2025 (UTC)
- A couple of years ago I was looking into various failure modes. One of them is when the query completes but fails during transfer. This can, AFAIK, only be detected when the JSON doesn't validate; if you're requesting data as CSV there may be no way to tell. From the above it looks like the query started returning results 8 seconds in, then aborted 50 seconds later, which fits with the 60-second timeout, so the truncated data is as expected. If it takes a minute to transfer 320k worth of data on your internet connection, it sounds like it might be throttled. Infrastruktur (talk) 18:06, 9 September 2025 (UTC)
- I'm on full fibre, so it's an upstream problem. Bigger, very similar queries work, but I've also seen other queries fail exactly this way. Vicarage (talk) 18:47, 9 September 2025 (UTC)
- Have you tried the query on QLevers Wikidata endpoint? See https://github.com/dpriskorn/WikidataOrcidScraper/blob/master/models/qlever.py for how to make the request. So9q (talk) 04:14, 10 September 2025 (UTC)
- Last time I used QLever via curl, a few months back, I had assorted problems, ending with the service never responding at all. The level of documentation and support seemed poor. Perhaps when I return from holiday. Vicarage (talk) 06:25, 10 September 2025 (UTC)
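Since, as noted above, a CSV download cut off mid-transfer can be hard to detect, here is a rough client-side heuristic in Python. The validation rules are assumptions, not a guarantee from the service: a complete text/csv body should end with a newline, and every row should have the header's column count.

```python
import csv
import io

def looks_complete(csv_text: str) -> bool:
    """Heuristic check that a CSV response was not cut off mid-transfer."""
    # A transfer cut mid-line usually leaves no trailing newline.
    if not csv_text.endswith("\n"):
        return False
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return False
    # Every data row should match the header's column count.
    width = len(rows[0])
    return all(len(row) == width for row in rows)

complete = "item,modified\nhttp://www.wikidata.org/entity/Q27309689,2024-11-21T17:39:02Z\n"
truncated = "item,modified\nhttp://www.wikidata.org/entity/Q27309689,2024-11-21T17:39:02Z\nhttp://www.w"

assert looks_complete(complete)
assert not looks_complete(truncated)
```

This is only a heuristic: a transfer that happens to break exactly at a row boundary (with a final newline) would slip through.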
false positive / db inconsistency
Q137041536#P8322 complains that Eggendorf (Q18755525) violates the uniqueness constraint for property cadastral municipality ID in Austria (P8322). Yesterday I removed the property from the other object [1], and even after purging both objects, the constraint violation persists. What's more, using SPARQL to find objects with both IDs (https://w.wiki/GLRg), Eggendorf (Q18755525) is still listed, although the user interface no longer shows the property cadastral municipality ID in Austria (P8322). best --Herzi Pinki (talk) 19:17, 27 November 2025 (UTC)
- (Moving from the parent page to /WDQS and Search because it sounds like a query service issue – the constraint report also gets the information for this constraint from the query service.) --Lucas Werkmeister (WMDE) (talk) 10:43, 28 November 2025 (UTC)
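One way to confirm which side is stale is to compare the query service result against the canonical entity JSON from Special:EntityData. A hedged Python sketch: the stub below only imitates the shape of the EntityData response; a real check would fetch https://www.wikidata.org/wiki/Special:EntityData/Q18755525.json and apply the same test.

```python
def has_property(entity_data: dict, qid: str, pid: str) -> bool:
    """True if the entity's current claims still contain the property."""
    return pid in entity_data["entities"][qid].get("claims", {})

# Minimal offline stub imitating the Special:EntityData JSON shape;
# the claim contents are omitted since only the property keys matter here.
stub = {"entities": {"Q18755525": {"claims": {"P31": []}}}}

# After the removal described above, P8322 should be absent from the
# canonical data; if WDQS still returns it, the query service is stale.
assert not has_property(stub, "Q18755525", "P8322")
assert has_property(stub, "Q18755525", "P31")
```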
