Wikidata:Requests for comment/Notability policy reform

Wikidata’s current Notability Policy was drafted in 2013. Since then, the project has grown and the world has changed. We now have a better understanding of Wikidata, what it can do, and its place in the Linked Open Data web. All of this has led to discussions about what data should be in Wikidata, what should be in the larger Wikibase Ecosystem, and what should maybe go somewhere else entirely. This RfC is meant to gather input for an improved Notability Policy. In a first round it will gather input on a number of questions, which will guide the drafting of the new policy.

Context

What has changed since the initial drafting of the policy

  • Wikidata has grown to 120 million Items and 16,000 active editors
  • Wikidata has reached social and technical scaling issues (see Wikidata:Requests for comment/Mass-editing policy#Context for additional information)
  • The Wikibase Ecosystem has matured, with Wikibase Cloud and Wikibase Suite now being viable alternatives for hosting knowledge graphs that are tightly interconnected with Wikidata

What can we learn from other projects

Looking at the notability policies of other Wikimedia projects, a few things stood out that we might want to take into account:

  • Wikimedia Commons: It clearly calls out the aim of Commons to ground the policy. It also specifically mentions some content not being a good fit for Commons because there are other projects that handle it. It talks very clearly about requiring use of the files or at the very least requires files to be realistically useful for educational purposes.
  • German-language Wikipedia: It has de:Wikipedia:Relevanzcheck where people can ask before creating a new article if it would meet notability criteria.
  • There is a general theme of requiring reliable, independent, secondary sources.
  • There are projects that define notability generally for all topics (like Wikidata does now), while others go deeper into specific notability criteria for specific topics.

The benefits and problems of the current policy

Based on previous discussions there are a number of positive and negative points to the current policy.

Benefits:

  • Small number of criteria: makes it easier to understand and to remember
  • Wiggle-room for marginalized knowledge and more: There is benefit to the current vagueness. It gives admins room to make judgement calls.

Problems:

  • Differing interpretations of criterion 2: Some people see it as a carte blanche (everything can have an Item); others interpret it much more strictly. Admins report not being able to be consistent even within their own decisions. It is also harder to evaluate objectively than criteria 1 and 3.
  • Level of consideration is mismatched: The policy works at the level of individual Items, but issues are also (even more so) caused by collections of Items, even if every individual one of them is notable. We don’t have good ways to draw boundaries around what we want from a class in Wikidata and what we don’t.
  • Enforcement burden: Admins spend a lot of time enforcing the policy and arguing over its interpretation. People not acting in the best interest of the project create Items to promote themselves, their business, etc., causing more clean-up work for admins.
  • Detachment from reuse: Reuse is not really considered in the criteria but reuse is at the core of what Wikidata is about. Additionally, Items get deleted as not notable despite being used in other projects.

Questions to determine changes to current policy

The following questions need your input to help draft the new policy.

We should require reuse or at least the very real potential for reuse

Votes and comments:

  • I create items for structural need in projects outside the Wikimedia ecosystem, and when they are deleted it blows holes in my data, but how would anyone know, unless you modified Wikibase to allow me to record use of items? And sometimes I'd pick up poor items with SPARQL queries, but my using them is not validation of their worth. So recording use beyond the curated Wikipedias and Commons is very hard. Vicarage (talk) 17:02, 30 December 2025 (UTC)
  • No, I think this is the wrong direction. Furthermore, "reuse" without stricter definition is pretty meaningless, and would be exploited for nefarious (in Wikidata's sense) activities easily. --MisterSynergy (talk) 17:36, 30 December 2025 (UTC)
  • I agree with MisterSynergy that "reuse" without stricter definition doesn't make much sense; so, waiting for a clearer definition to give input :) --Epìdosis 18:10, 30 December 2025 (UTC)
     Support So9q (talk) 22:40, 1 January 2026 (UTC)
     Support I fully agree as well. Clear definitions are important. Otherwise, it won't work. --Gymnicus (talk) 23:49, 1 January 2026 (UTC)
  • I like the idea that items be reusable, but I don’t feel this principle is well defined. We need to have a clear definition of what this principle means, or it will become another source of self-promotional disruption. Bovlb (talk) 19:51, 30 December 2025 (UTC)
  • I agree with other comments that we need a more concrete definition of "reuse". In particular, since Wikidata is a source for AI, Google Knowledge Graph, and other services, I'm concerned that one could say that since the data is used by them, anything in Wikidata has potential for use/reuse. That would make this requirement meaningless. Mcampany (talk) 03:16, 31 December 2025 (UTC)
  • Does reuse mean that an item has to be queried somewhere? What seems unused today might be needed tomorrow. And do we have data on how each item has been used? If we put this in the notability policy, then we have to make sure we can at least be held accountable when someone asks "how do you measure reusability on my item?". --Yamato Shiya大和 士也 (TalkContribs) 03:56, 31 December 2025 (UTC)
  • I understand one of the core purposes of Wikidata is to be a “hub of hubs”. I’m not sure how connecting multiple outside sources to each other can be defined as reuse, but it’s clearly valuable. - PKM (talk) 23:21, 31 December 2025 (UTC)
  • Can't have a notability discussion without having a link to User:Multichill/Questionable notability Wikimedians. Something that happens a lot: A Wikimedian shows up at a couple of events. Photos of this Wikimedian end up on Commons. Someone creates a category on Commons for the Wikimedian. Later someone creates an item for the Wikimedian. Do we consider this valid reuse? Multichill (talk) 14:37, 1 January 2026 (UTC)
    Interesting case. My item should probably be in that list. So9q (talk) 22:46, 1 January 2026 (UTC)
  • I believe most users choose Wikidata for (self)promotion simply because they deem everything in Wikidata to be implicitly (by design) reusable. We'd really need to specify what (very real potential for) reuse / reusable means from our POV. --Matěj Suchánek (talk) 11:49, 2 January 2026 (UTC)

We should consider the size of the complete data set when making decisions about individual Items

Votes and comments:

  • I think this is intended for cases like "all known stars" or "all the scientific articles ever published", which presently could not go into Wikidata at least for technical reasons; in this sense, I think it makes sense to consider this criterion in a future change of the notability policy; however, an item could be considered part of many different datasets, each of a different size (like, for a scientific article: all the articles published in the same volume, or in all the volumes of the same journal, or in all the volumes of the journals treating this topic in language X, or in all journals treating this topic in all languages etc.), so we should be very careful in applying this IMHO. --Epìdosis 18:14, 30 December 2025 (UTC)
    I think you have a point here, but I don't think the carefulness is a hindrance. We just need to be explicit about which set we are talking about and make sure that we don't fall into scope creep. Ainali (talk) 10:02, 31 December 2025 (UTC)

We should change the wording of criterion 2 from “can be described” to “is described”

Votes and comments:

Accompanying text: We should require new Items to have at least the statements required to establish notability

Votes and comments:

  • Harsh given we have so many items already that don't have that. And in a world of bots and AI, many items might be created in anticipation of waves of bot updates that could be some time away. Scurrying around at creation to find mentions elsewhere would encourage random facts being added, like the references to obscure newspaper articles some are so fond of. Better to consistently back-fill across a class from notable sources. Vicarage (talk) 16:56, 30 December 2025 (UTC)
  • I agree with Vicarage that this could backfire, but IMHO the pros outweigh the cons: after an item has been created, and some time has passed (say one day), the creator should have added sufficient data to it to demonstrate that it is notable. --Epìdosis 18:22, 30 December 2025 (UTC)
  •  Support But admins should not delete empty items minutes after they are created (time should be given for the creator to fill the item). Ternera (talk) 19:30, 30 December 2025 (UTC)
    Agreed. For what it's worth, English Wikipedia's New Page Patrollers are required to wait until an hour after a page has been created before nominating something for deletion unless there are serious content issues, like BLP privacy issues. Perhaps something similar would make sense here. Mcampany (talk) 03:07, 31 December 2025 (UTC)
    I feel that it is reasonable to expect editors to add a claim within fifteen minutes of item creation, but I normally do not delete empty items until an hour after the last edit. Bovlb (talk) 03:34, 31 December 2025 (UTC)
    Agreed. 1h is reasonable. So9q (talk) 22:32, 1 January 2026 (UTC)
  •  Support with the caveat about reasonable time for item enrichment. I find the argument against weak, as empty items are even more of a burden than someone at least trying to add relevant links. (People acting in bad faith is a totally different thing and is a pest regardless of whether they are creating empty items or adding statements to them.) Ainali (talk) 10:06, 31 December 2025 (UTC)
  • As a counterexample, there were 294 Flower-class corvette (Q404394) produced; collectively they won the Battle of the Atlantic, and I have created entries for 290-odd of them. The class is widely described, but sources generally collate the members, with individuals featured in a random scattering of articles across Wikipedias and other naval sites, and I'd struggle to find a source that mentions them all. There is a structural need to have articles on the class and all its members, but it would be frustrating if, when adding the last few members, they were challenged because of poor documentation. If WD is to be comprehensive, it is inevitable it might contain results that were merely entries in lists elsewhere, and so not amenable to formal property assignment. I suppose a line in a table in a URL could be used as a reference or qualifier to an instance of (P31) statement, but it feels clumsy. Vicarage (talk) 10:27, 31 December 2025 (UTC)
  •  Support And we should have a uniform way of reminding users of this requirement. --Matěj Suchánek (talk) 11:38, 2 January 2026 (UTC)
  •  Strong support This is a very important requirement. I don't see any problems with this as long as we give enough time to add the statements. --Yuriklim (talk) 12:40, 2 January 2026 (UTC)
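
The waiting periods floated in the comments above (roughly an hour after the last edit before an empty item is deleted) could be checked mechanically. The following is only a minimal Python sketch under stated assumptions: the function name, its inputs, and the one-hour default are hypothetical illustrations taken from this thread, not agreed policy and not an existing tool.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical grace period: one hour is the value suggested in the
# comments above, not an agreed policy value.
GRACE_PERIOD = timedelta(hours=1)


def may_nominate_empty_item(statement_count, last_edit, now, grace=GRACE_PERIOD):
    """Return True only if the item has no statements and has been
    left untouched for at least the grace period."""
    if statement_count > 0:
        # The item is not empty, so the empty-item rule does not apply.
        return False
    return now - last_edit >= grace
```

For example, an empty item last edited two hours ago would qualify, while an empty item edited ten minutes ago, or any item that already has statements, would not.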

Accompanying text: We should give the Wikibase Ecosystem a more prominent place in the text to explain alternatives to adding data to Wikidata outside the notability criteria

Votes and comments:

Accompanying text: We should encourage people to flesh out and improve existing Items over adding a lot of new Items

Votes and comments:

Accompanying text: We should make it explicit that people should not create Items about themselves, their business, etc

Votes and comments:

Accompanying text: We should encourage people to ask if they are unsure if the Item they want to create is notable, especially if they want to create a lot of new Items.

Votes and comments:

  •  Support but where? I would guess a subpage of Wikidata:Notability, like Wikidata:Notability/Questions; please not directly in the Project chat to avoid flooding it even more. --Epìdosis 18:30, 30 December 2025 (UTC)
  •  Support but are we prepared to answer them promptly? Unlike Relevanzcheck in dewiki, a Relevance Check in Wikidata could be flooded in minutes. I think we need to limit the check to items that relate to series or datasets, or generational data. I suggest we make something like a basic checklist for the notability criteria, yet how do we limit the interpretation? --Yamato Shiya大和 士也 (TalkContribs) 12:22, 31 December 2025 (UTC)

Process: We should only let registered users create Items

Votes and comments (see also Wikidata talk:Requests for comment/Mass-editing policy#Restrict entity creation to logged-in users):

Process: We should take the intent of the Item creator into account when making decisions about notability

Example: Someone gaming the system by creating several Items and connecting them in order to fulfil notability criterion 3, even if the Items would otherwise not be notable

Votes and comments:

  • Establishing intent could become so bureaucratic. Assume good faith for members in good standing. Do we record statistics of how many items each user has had deleted, to trigger different levels of oversight? Vicarage (talk) 16:47, 30 December 2025 (UTC)
  • The intent of this is good, but it could also be difficult to apply, as Vicarage says; anyway, we should probably try to enlarge "structural need" a bit in order to make it more difficult to game it. --Epìdosis 18:35, 30 December 2025 (UTC)
  •  Oppose If someone creates multiple items or redirects the same item to itself, we can already check that and delete/unlink all of the items. Checking their intent seems like a lot of unnecessary work. Ternera (talk) 19:35, 30 December 2025 (UTC)
  • This seems to conflate two points.
    • While it is often easy to classify a contribution as promotional, we should avoid going down the path of mind reading and doxxing.
    • We routinely bulk delete groups of items that connect to each other, as not a valid case of N3, but it is harder to investigate. Bovlb (talk) 20:08, 30 December 2025 (UTC)
  • I support the idea of this, but am not sure how we implement this in a good way. Perhaps this kind of gaming can be mentioned as an example of bad faith editing? Ainali (talk) 10:34, 31 December 2025 (UTC)

Process: We should let WikiProjects narrow down (but not expand) the general notability guidelines for their area

Votes and comments:

  • Notability so depends on the specifics and data volumes a project is designed for, and seeking agreement could stultify things, so let each project have its own criteria. Vicarage (talk) 16:43, 30 December 2025 (UTC)
  • I would specify some minimal criteria for notability criteria established by specific WikiProjects, specifically: they should be easy to find (i.e. at least all linked from one place), easy to read (some kind of standard structure), and they should be established by at least 10 users (to avoid important decisions being taken by too few users). I agree that these thematic guidelines by WikiProjects should be only restrictive in comparison with the general ones. --Epìdosis 18:39, 30 December 2025 (UTC)
  •  Support I think this is a great idea, but I think we need to figure out a way to be able to point from the general criteria to WikiProjects that have found consensus for narrower ones in their field. (We don't want to spring this as a surprise on someone who went to the general page and read it.) Ainali (talk) 10:45, 31 December 2025 (UTC)
  • We should be wary of letting one wikiproject declare that something is not notable, when another wikiproject considers that it is. Conversely, we should be wary of a wikiproject existing just to declare that everything in a given field (Wikiproject SEO Experts? WikiProject YouTube Influencers?) is notable. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:02, 31 December 2025 (UTC)
    • The point about multiple WikiProject viewpoints is a good one. And I think that it can be solved by not allowing a WikiProject to restrict another one. That is, if one WikiProject finds some type of concept notable, but not another, the second WikiProject can abstain from creating items, but not stop the first one. Ainali (talk) 20:01, 31 December 2025 (UTC)
    • Regarding the point of declaring everything notable, they still shouldn't be able to expand the general notability policy. Ainali (talk) 20:01, 31 December 2025 (UTC)
  • I have always liked the idea of WikiProjects, or in general, associations by topics (music, Sweden, influencers, ...) as a place to gather all sorts of resources (showcase items, list of properties, dos and don'ts, ...). But now I am a little reserved because this could introduce a sort of backdoor to the general policy (i.e., a possibility to overrule what we are trying to reform right now). --Matěj Suchánek (talk) 12:20, 2 January 2026 (UTC)

Other questions and discussion

How can we rely more on tooling and automation for notability checking and processing?

  • Record the number of deleted items per user, to trigger different oversight levels. Vicarage (talk) 16:49, 30 December 2025 (UTC)
  • I’m a big fan of tooling and automation. I have a gadget user:Bovlb/notability.js that tries to assess notability, and I have been working on developing a better version. In particular, wider use would require caching, checking of outside databases (e.g. SDC, OSM), and smarter checking of N3. Policy should be operationalisable, but should not be constrained to be automatable. I have also been working on some ways to detect ill-advised bulk creation sooner. Bovlb (talk) 20:14, 30 December 2025 (UTC)
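
The suggestion above to count deleted items per user and map the count to an oversight level can be illustrated with a short Python sketch. Everything here is an assumption for illustration only: the input format (pairs of creator and item ID), the threshold values, and the level names are hypothetical and do not describe an existing Wikidata tool or an agreed policy.

```python
from collections import Counter

# Hypothetical thresholds: (minimum number of deleted items, level name),
# ordered from strictest to most lenient. The values are illustrative only.
OVERSIGHT_LEVELS = [
    (50, "review-all-creations"),
    (10, "spot-check"),
    (0, "none"),
]


def deleted_counts(deletion_log):
    """Count deleted items per creator from (creator, item_id) pairs."""
    return Counter(creator for creator, _item_id in deletion_log)


def oversight_level(deleted_count):
    """Return the first (strictest) level whose threshold is reached."""
    for minimum, level in OVERSIGHT_LEVELS:
        if deleted_count >= minimum:
            return level
    return "none"
```

A raw count like this ignores how many items a user has created overall, so a real measure would probably need to be a rate rather than an absolute number; that design choice is left open here.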

Do you have additional suggestions for how to tighten/clarify criterion 2?

Is there anything we missed?

  • When we discuss the draft of the new notability policy, in the same discussion we will also necessarily have to discuss the already existing items that will fall outside the revised notability criteria, and specifically: 1) how to find them; 2) what to do with them (keep them for some reason; simply delete them; delete them and move them elsewhere [where specifically?]). --Epìdosis 18:42, 30 December 2025 (UTC)
  • Maybe we should consider somewhere the case of items which met criterion 2 at the time of their creation and do not meet it anymore (i.e. they are based on a reliable online source which has perished in the meanwhile and has not been archived, or only partially); do we want to keep them? Probably not. Cf. Wikidata:External identifiers/Obsolescence (draft) for more context. --Epìdosis 18:54, 30 December 2025 (UTC)
  • "The Wikibase Ecosystem has matured, with Wikibase Cloud and Wikibase Suite now being viable alternatives for hosting knowledge graphs that are tightly interconnected with Wikidata"
    Is this based on the needs of the Wikidata community or is it a pipe dream? From what I have understood talking to the Scholia team and knowledge experts this idea about ontologic federation is currently an unknown thing.
    The Scholia team decided to ABANDON WDQS after the graph split of scholarly items because the federation in SPARQL did not work (timeouts, hard to know where the items reside, etc.). I consider the Scholia team to be SPARQL experts and if they can't get it to work I very much doubt anyone else will. Wikibase Suite in its current state is NOT able to reliably integrate with Wikidata in my opinion. I tried setting up a Wikibase.cloud wiki multiple times and getting it integrated, but the federation UX is nowhere to be found.
    My current conclusion is thus: the "ecosystem of connected Wikibases" is not a feasible idea that has gained any mentionable traction. Even if the federated properties + values were to gain adoption it would result in a split of the community, and Wikibase Suite does not support RDF streaming, so the triples cannot be integrated e.g. by combining multiple streams in a single QLever instance. --So9q (talk) 22:08, 1 January 2026 (UTC)
  • I don't see scaling Wikibase anywhere above. I recently started to work on wikibase-backend (built using Python, FastAPI) which scales to 1bn+ items; as of writing it is nearing MVP state and is already capable of:
    • CRUD entity operations
    • scaling the metadata database to 1bn+ entities (using sharding in Vitess)
    • scaling to 10bn+ revisions (S3 compatible backend based on Ceph)
    • entity locking and archiving
    • entity redirects
    • full RDF output (98% complete as of today 🥳)
    I added cost estimates to the repo and they are looking very promising. The current monolithic Wikidata architecture cannot scale to 10x the current number of revisions. It reached end of life a number of years ago and neither WMDE nor WMF has done anything about fixing the root cause (band-aids, investigations and disaster plans don't count 😉).
    This new backend, if finished and implemented, may very well be a game-changer for the future of Wikidata, and notably not one that breaks current tools like the "ontological federation" effectively would. It would most probably also be easier for the new Wikidata Platform team to operate in a sustainable way compared to the current legacy architecture, where manual pooling and installation of servers is needed. --So9q (talk) 22:08, 1 January 2026 (UTC)
  • Although an overhaul to WD:N is being discussed, I think we should also consider reforming WD:RfD (requests for deletion). Re-posting my earlier ideas:
    IMO the process should be split into nominations for deletion and requests for (speedy) deletion. Nominating would involve sending a message to the creator and letting them know what is wrong and what they should do in order to not have their item deleted (cf. User:Bovlb/How to create an item on Wikidata so that it won't get deleted). The discussion could be held on the user's talk page or the item's talk page (which can be categorized, so that there is a general overview of currently or previously nominated items). Having these discussions on WD:RfD is very unfriendly (it's a long page with many threads, it takes long to publish a comment there, it's sensitive to accidental structure changes because we have bots maintaining it, etc.). The process of nominating could also be automated which I think is desperately needed.
  • See also User:ChristianKl/Draft:ProposeDeletion. --Matěj Suchánek (talk) 11:42, 2 January 2026 (UTC)