
Wikidata:Data access


Wikidata for developers: Data access

Wikidata now contains more than 110 million Items and more than one million Lexemes, and those numbers keep growing. There are many ways to access that data. This page helps users find the best way to get what they need.

We've prepared this guide so you can access the data in the fastest and most efficient way possible. It avoids unnecessary topics and gets straight to the point.

First things first

Working with Wikidata's data

Our logo

Wikidata offers a wide range of general data about our world. The data is published under the CC0 "public domain dedication" license. Anyone can edit the data in this wiki, and it is maintained by Wikidata's community of editors.

Any changes to the APIs follow the stable interface policy. The data itself is provided without any guarantee.

Wikimedia projects

This page is about using Wikidata's data outside of Wikimedia projects. If you need to use Wikidata's data within a Wikimedia project, you can do so through parser functions, Lua, or other in-project methods; more comprehensive documentation is available at How to use data on Wikimedia projects.

Best practices

Volunteers like these (and you) build Wikidata.

Wikidata provides you its data under CC-0, without even requiring that you credit it in your sources. Nonetheless, crediting Wikidata as the data source in your project is appreciated: doing so affirms Wikidata's work and helps keep it going. We try to provide up-to-date, high-quality data and to help the projects that use it.

If you use Wikidata as a source, you can use phrases such as "Derived from Wikidata" or "Source: Wikidata". You can also use one of the ready-made files.

You may use the Wikidata logo shown above, but this should in no way be taken to imply endorsement by Wikidata or the Wikimedia Foundation.

Please offer your users a way to report issues in the data, and find a way to feed this back to Wikidata's editor community, for example through the Mismatch Finder. Please share the location where you collect these issues on the Project chat.

Access best practices

Keep the following best practices in mind when accessing Wikidata's data (a combined sketch in Python follows the list):

  • Follow the User-Agent policy – send a good User-Agent header.
  • Follow the robot policy: send an Accept-Encoding: gzip,deflate header and don't send many requests at the same time.
  • If you get a 429 Too Many Requests response, stop sending further requests for a while (see the Retry-After response header)
  • When available (such as with the Wikidata Query Service), set the lowest timeout that makes sense for your data.
  • When using the MediaWiki Action API, make liberal use of the maxlag parameter and consult the rest of the guidelines laid out in API:Etiquette.
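
As a rough sketch of how these practices combine in a client, here is a Python example using the requests library; the project name, contact address, and retry values are placeholders you should adapt to your own project.

  import time
  import requests

  # Identify your project per the User-Agent policy (placeholder values).
  HEADERS = {
      "User-Agent": "MyWikidataTool/0.1 (https://example.org/tool; mail@example.org)",
      "Accept-Encoding": "gzip,deflate",
  }

  def get_with_backoff(url, params=None, attempts=3):
      """GET a URL, honoring 429 Too Many Requests and its Retry-After header."""
      for _ in range(attempts):
          response = requests.get(url, params=params, headers=HEADERS, timeout=30)
          if response.status_code == 429:
              # Stop sending requests for as long as the server asks
              # (assumes a numeric Retry-After value).
              time.sleep(int(response.headers.get("Retry-After", "5")))
              continue
          response.raise_for_status()
          return response
      raise RuntimeError("giving up after repeated 429 responses")

  # Action API requests can also carry maxlag (see API:Etiquette).
  data = get_with_backoff(
      "https://www.wikidata.org/w/api.php",
      params={"action": "wbgetentities", "ids": "Q42",
              "format": "json", "maxlag": "5"},
  ).json()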


Search

What is it?

Wikidata's data can be searched with the help of the Elasticsearch engine. To run a search, go to Special:Search.

When to use it?

Use Wikidata's search when you want to search through string data, or when you can't quite remember the name of the entity you're looking for. It's also useful when you want to look up simple relationships within the data.

Don't use search when the relationships between the data are complex.

Details

You can make your search more powerful with these additional keywords specific to Wikidata: haswbstatement, inlabel, wbstatementquantity, hasdescription, haslabel. This search functionality is documented on the CirrusSearch extension page. It also has its own API action.
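
For instance, here is a sketch of using those keywords through the search API from Python; the query itself is only an illustration.

  import requests

  # Find Items whose label contains "Douglas" and that carry an
  # instance-of (P31) = human (Q5) statement.
  response = requests.get(
      "https://www.wikidata.org/w/api.php",
      params={
          "action": "query",
          "list": "search",
          "srsearch": "inlabel:Douglas haswbstatement:P31=Q5",
          "format": "json",
      },
      headers={"User-Agent": "MyWikidataTool/0.1 (mail@example.org)"},
  )
  for hit in response.json()["query"]["search"]:
      print(hit["title"])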

Linked Data Interface (URI)

What is it?

The Linked Data Interface provides access to individual entities via URI: http://www.wikidata.org/entity/Q???. Such URIs are called concept URIs. Note concept URIs use HTTP, not HTTPS.

When to use it?

Use the Linked Data Interface when you need to obtain individual, complete entities that are already known to you.

Don't use it when you're not clear on which entities you need – first try searching or querying. It's also not suitable for requesting large quantities of data.

Details

Meet Q42

Every Item or Property has a persistent URI that you can form from its ID, such as Q42 or P12.

The namespace for Wikidata's data about entities is https://wikidata.org/wiki/Special:EntityData.

Appending an entity's ID to this prefix (you can use /entity/ for short) creates the abstract (format-neutral) form of the entity's data URL. When accessing a resource in the Special:EntityData namespace, the special page applies content negotiation to determine the output format. If you opened the resource in a browser, you'll see an HTML page containing data about the entity, because web browsers prefer HTML. However, a linked-data client would receive the entity data in a format like JSON or RDF – whatever the client specifies in its HTTP Accept: header.

For example, take this concept URI for Douglas Adams – that's a reference to the real-world person, not to Wikidata's concrete description:
http://www.wikidata.org/entity/Q42
As a human being with eyes and a browser, you will likely want to access data about Douglas Adams by using the concept URI as a URL. Doing so triggers an HTTP redirect and forwards the client to the data URL that contains Wikidata's data about Douglas Adams: https://www.wikidata.org/wiki/Special:EntityData/Q42.

When you need to bypass content negotiation, say, in order to view non-HTML content in a web browser, you can specify the format of the entity data by appending the corresponding extension to the data URL; examples include .json, .rdf, .ttl, .nt or .jsonld. For example, https://www.wikidata.org/wiki/Special:EntityData/Q42.json gives you Item Q42 in JSON format.
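
Here is a short sketch of both routes in Python: content negotiation via the Accept header on the concept URI, and an explicit format via the .json extension.

  import requests

  headers = {"User-Agent": "MyWikidataTool/0.1 (mail@example.org)"}

  # Content negotiation: request Turtle from the concept URI and
  # follow the redirect to the data URL.
  ttl = requests.get(
      "http://www.wikidata.org/entity/Q42",
      headers={**headers, "Accept": "text/turtle"},
  )
  print(ttl.url)  # the data URL the client was redirected to

  # Bypass negotiation by naming the format in the data URL.
  entity = requests.get(
      "https://www.wikidata.org/wiki/Special:EntityData/Q42.json",
      headers=headers,
  ).json()["entities"]["Q42"]
  print(entity["labels"]["en"]["value"])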

Less verbose RDF output

By default, the RDF data that the Linked Data interface returns is meant to be complete in itself, so it includes descriptions of other entities it refers to. If you want to exclude that information, you can append the query parameter ?flavor=dump to the URL(s) you request.

By appending ?flavor= to the URL, you can control exactly what kind of data gets returned.

  • ?flavor=dump: Excludes descriptions of entities referred to in the data.
  • ?flavor=simple: Provides only truthy statements (best-ranked statements without qualifiers or references), along with sitelinks and version information.
  • ?flavor=full (default): An argument of "full" returns all data. (You don't need to specify this because it's the default.)

If you want a deeper insight into exactly what each option entails, you can take a peek into the source code.

Revisions and caching

You can request specific revisions of an entity with the revision query parameter: https://www.wikidata.org/wiki/Special:EntityData/Q42.json?revision=112.

Certain URL formats are used by the user interface and by the query service updater, so if you use one of the same URL formats there's a good chance you'll get faster (cached) responses.

Wikidata Query Service

What is it?

The Wikidata Query Service (WDQS) is Wikidata's own SPARQL endpoint. It returns the results of queries made in the SPARQL query language: https://query.wikidata.org

When to use it?

Use WDQS when you know only the characteristics of your desired data.

Don't use WDQS for performing text or fuzzy search – FILTER(REGEX(...)) is an antipattern. (Use search in such cases.)

WDQS is also not suitable when your desired data is likely to be large, a substantial percentage of all Wikidata's data. (Consider using a dump in such cases.)

Details

You can query Wikidata's data through a SPARQL endpoint known as the query service. This is currently provided by Blazegraph through the Wikidata Query Service. For more information, read the user manual and the Wikidata community pages.

The query service is best used when your intended result set is scoped narrowly, i.e., when you have a query you're pretty sure already specifies your resulting data set accurately. If your idea of the result set is less well defined, then the kind of work you'll be doing against the query service will more resemble a search; frequently you'll first need to do this kind of search-related work to sharpen up your query. See the Search section.

The query service at query.wikidata.org only contains the main graph of Wikidata. The Items related to scholarly articles are in a separate query service at query-scholarly.wikidata.org. For more details see Wikidata:SPARQL query service/WDQS graph split.
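
A minimal sketch of querying the endpoint programmatically follows; the SPARQL itself is only an example.

  import requests

  # Example query: ten Items that are instances of house cat (Q146).
  query = """
  SELECT ?item ?itemLabel WHERE {
    ?item wdt:P31 wd:Q146 .
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
  }
  LIMIT 10
  """

  response = requests.get(
      "https://query.wikidata.org/sparql",
      params={"query": query, "format": "json"},
      headers={"User-Agent": "MyWikidataTool/0.1 (mail@example.org)"},
  )
  for row in response.json()["results"]["bindings"]:
      print(row["item"]["value"], row["itemLabel"]["value"])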

Linked Data Fragments endpoint

What is it?

The Linked Data Fragments (LDF) endpoint is a more experimental method of accessing Wikidata's data by specifying patterns in triples: https://query.wikidata.org/bigdata/ldf. Computation occurs primarily on the client side.

When to use it?

Use the LDF endpoint when you can define the data you're looking for using triple patterns, and when your result set is likely to be fairly large. The endpoint is good to use when you have significant computational power at your disposal.

Since it's experimental, don't use the LDF endpoint if you need an absolutely stable endpoint or a rigorously complete result set. And as mentioned before, only use it if you have sufficient computational power, as the LDF endpoint offloads computation to the client side.

Details

If you have partial information about what you're looking for, such as when you have two out of three components of your triple(s), you may find what you're looking for by using the Linked Data Fragments interface at https://query.wikidata.org/bigdata/ldf. See the user manual and community pages for more information.
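
As a sketch, here is a triple-pattern request in Python, assuming the endpoint accepts subject/predicate/object query parameters with full URIs and can return Turtle.

  import requests

  # All triples matching: ?s wdt:P31 wd:Q146 (instances of house cat).
  response = requests.get(
      "https://query.wikidata.org/bigdata/ldf",
      params={
          "predicate": "http://www.wikidata.org/prop/direct/P31",
          "object": "http://www.wikidata.org/entity/Q146",
      },
      headers={
          "Accept": "text/turtle",
          "User-Agent": "MyWikidataTool/0.1 (mail@example.org)",
      },
  )
  # Results are paged; the response embeds hydra paging metadata
  # (e.g. a link to the next page) alongside the matching triples.
  print(response.text[:500])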

Wikibase REST API

What is it?

The Wikibase REST API is an OpenAPI-based interface that allows users to interact with, retrieve and edit items and statements on Wikibase instances – including of course Wikidata: Wikidata REST API

When to use it?

The Wikibase REST API is still under development, but for Wikidata it's intended to functionally replace the Action API as it's a dedicated interface made just for Wikibase/Wikidata.

The use cases for the Action API apply to the Wikibase REST API as well. Use it when your work involves:

  • Editing Wikidata
  • Getting direct data about entities themselves

Don't use the Wikibase REST API when your result set is likely to be large. (Consider using a dump in such cases.)

It's better not to use the Wikibase REST API when you'll need to further narrow the result of your API request. In such cases it's better to frame your work as a search (for Elasticsearch) or a query (for WDQS).

Details

The Wikibase REST API has OpenAPI documentation using Swagger. You can also review the developer documentation.
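
A sketch of fetching an Item over the REST API is shown below; the /v1/ route prefix matches the documentation at the time of writing, so check the OpenAPI docs if it has changed.

  import requests

  response = requests.get(
      "https://www.wikidata.org/w/rest.php/wikibase/v1/entities/items/Q42",
      headers={"User-Agent": "MyWikidataTool/0.1 (mail@example.org)"},
  )
  item = response.json()
  # In the REST API, labels are plain strings keyed by language code.
  print(item["labels"]["en"])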

MediaWiki Action API

What is it?

The Wikidata API is MediaWiki's own Action API, extended to include some Wikibase-specific actions: https://wikidata.org/w/api.php

When to use it?

Use the API when your work involves:

  • Editing Wikidata
  • Getting data about entities themselves such as their revision history
  • Getting all of the data of an entity in JSON format, in small groups of entities (up to 50 entities per request).

Don't use the API when your result set is likely to be large. (Consider using a dump in such cases.)

The API is also poorly suited to situations in which you want to request the current state of entities in JSON. (For such cases consider using the Linked Data Interface, which is likelier to provide faster responses.)

Finally, it's probably a bad idea to use the API when you'll need to further narrow the result of your API request. In such cases it's better to frame your work as a search (for Elasticsearch) or a query (for WDQS).

Details

The MediaWiki Action API used for Wikidata is meticulously documented on Wikidata's API page. You can explore and experiment with it using the API Sandbox.

There are multiple Wikibase-specific endpoints. Here is an example request:
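
  https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q42&format=json

This wbgetentities call, which fetches Q42 as JSON, is just one representative sketch; the API page lists all the Wikibase actions.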

Bots

Bots are active on Wikidata too

You can access the API using a bot.

Recent Changes stream

What is it?

Using the recent changes API, you can see which entities have changed and then fetch the data for each of them. This way you can update incrementally for up to 30 days.

When to use it?

Use the Recent Changes stream when your project requires you to react to changes in real time or when you need all the latest changes coming from Wikidata – for example, when running your own query service.

Details

The Recent Changes stream contains all updates from all wikis using the server-sent events protocol. You'll need to filter Wikidata's updates out on the client side.

You can find the web interface at stream.wikimedia.org and read all about it on the EventStreams page.
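
Here is a sketch of that client-side filtering in Python; the "wiki" and "title" field names follow the recentchange event schema, so adjust if the schema differs.

  import json
  import requests

  # Open the server-sent events stream and keep only Wikidata changes.
  response = requests.get(
      "https://stream.wikimedia.org/v2/stream/recentchange",
      stream=True,
      headers={"User-Agent": "MyWikidataTool/0.1 (mail@example.org)"},
  )
  for line in response.iter_lines():
      if line.startswith(b"data: "):
          event = json.loads(line[len(b"data: "):])
          if event.get("wiki") == "wikidatawiki":
              print(event["title"], event.get("type"))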

Wikidata Vector Database

What is it?

The Wikidata Vector Database stores high-dimensional vector representations of Wikidata entities. It enables semantic search based on meaning and context rather than keyword matching, and supports natural-language queries against entities.

When to use it?

Use vector search for exploration purposes, for example, when you want to uncover entities without explicitly knowing their labels, or when you need to narrow a search down to a smaller, more relevant subgraph of Wikidata as a starting point for further research before moving on to more structured tools.

The vector database can also be used in AI/ML pipelines, such as enabling semantic search in RAG workflows or applying vector distances to tasks like classification and other types of analysis.

Details

You can find more information on the Wikidata Vector Database page. The Wikidata Vector Database is available at wd-vectordb.wmcloud.org, and the API documentation can be found at wd-vectordb.wmcloud.org/docs.


Wikidata MCP

Main page: Wikidata:MCP

What is it?

The Wikidata MCP (Model Context Protocol) provides a set of standardized tools that allow large language models (LLMs) to explore and query Wikidata programmatically. It is designed for agentic AI or AI workflows that need to search, inspect, and query Wikidata, without relying on hardcoded assumptions about its structure or content.

When to use it?

Use the Wikidata MCP when you want to integrate Wikidata directly into a GenAI model or into AI/ML workflows. The MCP provides a set of tools for exploring and accessing Wikidata, but it is limited to read-only use and does not include editing functionality.

Details

The Wikidata MCP is implemented as an HTTP service available at wd-mcp.wmcloud.org. To use it, add https://wd-mcp.wmcloud.org/mcp/ as a connector in your AI client.

Dumps

What are they?

Wikidata dumps are complete exports of all the Entities in Wikidata: https://dumps.wikimedia.org

When to use them?

Use a dump when your result set is likely to be very large. You'll also find a dump important when setting up your own query service.

Don't use a dump if you need current data: the dumps take a very long time to export and even longer to sync to your own query service. Dumps are also unsuitable when you have significant limits on your available bandwidth, storage space and/or computing power.

Details

If the records you need to traverse are many, or if your result set is likely to be very large, it's time to consider working with a database dump: (link to the latest complete dump).

You'll find detailed documentation about all Wikimedia dumps on the "Data dumps" page on Meta and about Wikidata dumps in particular on the database download page.
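
As a sketch of working with the JSON entity dump, here is a Python example assuming the documented layout of the latest-all.json.gz download: one large JSON array with one entity per line.

  import gzip
  import json

  def iter_entities(path):
      """Stream entities out of the compressed JSON dump one at a time."""
      with gzip.open(path, "rt", encoding="utf-8") as f:
          for line in f:
              line = line.strip().rstrip(",")
              if line in ("[", "]", ""):
                  continue  # skip the array brackets around the entities
              yield json.loads(line)

  for entity in iter_entities("latest-all.json.gz"):
      print(entity["id"], entity["type"])
      break  # just demonstrate the first entity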

Tools

Local query service

It's no small task to procure a Wikidata dump and implement the above tools for working with it, but you can take a further step. If you have the capacity and resources to do so, you can host your own instance of the Wikidata Query Service and query it as much as you like, out of contention with any others.

To set up your own query service, follow these instructions from the query service team, which include procuring your own local copy of the data. You may also find useful information in Adam Shorland's blog post on the topic.