Wikidata:Requests for comment/Source items and supporting Wikipedia sources
An editor has asked the community for input on "Source items and supporting Wikipedia sources" via the Requests for comment (RFC) process. This is the discussion page regarding the issue. If you have an opinion regarding this issue, feel free to comment below. Thank you!
THIS RFC IS CLOSED. Please do NOT vote nor add comments.
- The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
- A sources namespace or another type of item like S1234 shouldn't be created, but Wikipedia sources should be supported. I will make some property proposals.--GZWDer (talk) 06:33, 24 November 2013 (UTC)
Two of the topics that generated the biggest concern during the final vote for the Wikidata:Requests for comment/References and sources were:
- the large number of items that will not be linked to any Wikipedia page
- scientific journals as items
To address these specific concerns I'd like to open a new discussion centered on the need (or not) for a new "source entity" (S items) and on whether Wikipedia sources (represented by the Template:Cite doi and others) should be imported into Wikidata.
The source entity type (S items) would be a new kind of entity used to store metadata from sources. The only difference with the item entity type would be that it wouldn't allow linking to Wikipedia articles. It would be used to store all the items needed to support Guidelines for sourcing statements that don't have a Wikipedia article and (most likely) will never have one:
- edition items
- Scientific, newspaper or magazine article
- Report, technical documentation
Support
- I think there are advantages not only for organization purposes but also for clarity. Besides, if the proposal about storing the sources from Wikipedia goes forward, that would mean that we would have *a lot* of items in a grey area; think about all the cite-doi, cite-book, cite pmid, etc. from *all* Wikipedias. The requirement to create 2 items per book could also be relaxed, always creating the "edition item" as a "source item" and the "work item" whenever it is needed to use that book as an object of study in Wikipedia. --Micru (talk) 12:29, 26 June 2013 (UTC)
- This will definitively solve the distinction between work item and edition elements: work item of a book will be in the Q namespace and editions elements will be in the S namespace. Snipre (talk) 12:31, 26 June 2013 (UTC)
- I prefer a different namespace so as not to generate confusion between items and their sources. With a different namespace, it will be impossible to confuse an item when we insert references in statements or when we use queries. For books: imho it is not necessary to duplicate info: if it is possible, we could use "soft-links", as for redirects in Wikipedia. --Paperoastro (talk) 17:24, 26 June 2013 (UTC)
- It seems to me to be a false herring when others suggest that we will need to move or rename a source item when it becomes worthy of a Q-ID. That can be guided by the community's choice on the subject. Additionally, we should have this separation because it will help external reusers verify information being used as sources, as well as reuse for all of the wikis who want to use common citations. I suppose my question is though, do we also put source information about, say, a person in this one (e.g. a person who has no links except as source information?). --Izno (talk) 03:38, 8 August 2013 (UTC)
Support per Izno. I would even advocate for different entity types for "link" items - those which are only notable because they provide linkage between other items. The vast majority of sources will never get articles, so any renaming to do would be minimal (however, we would need a new interface to change entity types, since Special:MovePage won't work with "data" namespace). In fact, I would even argue that source entities should be separate from regular items that could be on the source, since they serve different purposes.--Jasper Deng (talk) 03:03, 26 August 2013 (UTC)
- Sounds like a basically good idea, as just putting everything in the Q-space is rather confusing (even now, with not all that many references added). Brya (talk) 10:48, 2 September 2013 (UTC)
Oppose
- The idea is not bad, but it sounds too complicated in practice. Whenever a source gained a corresponding article on any Wikipedia, it would become an item and therefore have to be renamed. The "will never have one" part is actually impossible to predict: even a scientific article can gain sufficient notoriety to be an object of study per se. Alexander Doria (talk) 23:43, 25 June 2013 (UTC)
- If I understand the proposed idea, I think a source getting an article would be duplicated as an item rather than renamed as an item because we would be separating a book (say) as a source and that same book as an object of study. Pichpich (talk) 03:41, 26 June 2013 (UTC)
- I have removed the line about "renaming" because, as Pichpich says, there won't be any need to rename anything, just to create a Q item as soon as an edition becomes a subject of study and gets an article in Wikipedia (rare, I haven't seen any case yet).--Micru (talk) 12:29, 26 June 2013 (UTC)
- I can't see any advantage to having a separate 'S' namespace which would be nearly identical to the Q namespace. Remember that most of the books which are in the 'Q' namespace will be used as sources for something (even if only for the properties of that book itself) so nearly all of these will have to be duplicated in the S namespace. Pointless duplication. Filceolaire (talk) 16:08, 26 June 2013 (UTC)
- I agree with the arguments above. The only difference between S* and Q* is the lack of links to Wikipedia articles. The rest is the same. The question is whether the lack of any link to a Wikipedia article has any negative impact on the Wikidata engine. I don't know, but it shouldn't. We should think about queries to the Q* namespace to find the cited source using one of various identifiers (ISBN, DOI, VIAF etc.) or any properties with specific values that uniquely identify it. That mechanism could improve manual editing of the articles. However, the main obstacle to sharing sources and their properties between Wikipedias is bug #47930, which prevents access to data that is not connected to the loaded page. Paweł Ziemian (talk) 18:49, 26 June 2013 (UTC)
- There is a deep difference which lies at the conceptual level. It is like "word entities" and "meaning entities" proposed for Wiktionary. They are basically the same as Q entities, but the concept they represent is different, and having a different entity type makes that clearer to users than having a property <instance of> word. The same applies to S entities: there is a clearly defined set of items that represent a different concept than Q entities, namely editions, articles and documentation. What Filceolaire says, "most of the books which are in the 'Q' namespace will be used as sources for something", is not totally accurate. What will be used to source statements is edition items, as outlined in the Guidelines for sourcing statements. These "edition items" don't exist (at least not many of them yet) and they are what will be used for sourcing statements, as opposed to the items that are used in Wikipedia (work items). The reason is that each edition contains different data or is in a different language, and each edition has a group of properties that are different from the work item. Let's see a couple of use cases:
- Example 1: On the Origin of Species has an infobox and that data will be imported into the item On the Origin of Species (Q20124). Additionally there are around 10 different editions that are used as sources, all of them with their edition year, doi identifier, links, bibcode, etc. According to the poll below, this information should be imported into Wikidata to be shared across articles and different language wikipedias. Of course the only way to do that (with current technology), is to import them as 10 separate items linked to the main work item. These items could be S entities to have less trouble organizing and less effort explaining to users which one to use as source (definitely not the work item).
- Example 2: The Washington Visual Double Star Catalog is a book used to source two English Wikipedia articles. In Alpha Centauri the 1996 and the 2008 edition are used and in Multiple star only the 2008 edition is used. In the German Wikipedia Tau Ceti, the 1996 edition is used. They will become 2 different items. The sources, however, fail to indicate that there is an article about the catalogue which already has an item: Washington Double Star Catalog (Q932275), so we will have 3 different items without any connection between them. Most likely when someone finds these 3 unconnected items he or she will be confused about the relationship between them. If we had an S entity it wouldn't be a problem because the sources would be clearly identified as S, the item as Q, and the only thing missing would be a linking property (edition or translation of (P629)).
- Hopefully these examples give a better idea about why to have a different type. About the performance question, no idea, it should be clarified if it is technically feasible.--Micru (talk) 14:20, 28 June 2013 (UTC)
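The second example above can be sketched in code. This is only an illustrative model, not how Wikidata stores entities: the `Entity` class, the `editions_of` helper and the `S1`/`S2` identifiers are hypothetical, while Q932275 and P629 are the item and property named in the discussion.

```python
from dataclasses import dataclass, field

# "edition or translation of", the linking property named in the discussion.
EDITION_OF = "P629"


@dataclass
class Entity:
    """A toy stand-in for a Wikidata entity (hypothetical structure)."""
    id: str      # "Q..." for a work item, "S..." for a proposed source entity
    label: str
    claims: dict = field(default_factory=dict)  # property id -> list of values

    def add_claim(self, prop: str, value: str) -> None:
        self.claims.setdefault(prop, []).append(value)


def editions_of(work_id: str, entities: list) -> list:
    """Return the entities that declare themselves an edition of work_id."""
    return [e for e in entities if work_id in e.claims.get(EDITION_OF, [])]


# The work item that already exists, plus two hypothetical edition entities.
work = Entity("Q932275", "Washington Double Star Catalog")
ed_1996 = Entity("S1", "Washington Visual Double Star Catalog (1996 edition)")
ed_2008 = Entity("S2", "Washington Visual Double Star Catalog (2008 edition)")

# Without these two claims the three entities would be unconnected.
for edition in (ed_1996, ed_2008):
    edition.add_claim(EDITION_OF, work.id)

linked = editions_of(work.id, [work, ed_1996, ed_2008])
print([e.id for e in linked])
```

The sketch works the same whether the editions carry an S or a Q prefix; what removes the confusion between the three items is the linking claim.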
- What shall be the relation between Encyclopædia Britannica 11th edition (Q867541) and Encyclopædia Britannica (Q455), where the first is an S item for the second Q? This is a case where an S item is promoted to a Q item. I do not know how it would be resolved in the new schema. Would the data be copied or shared? The duplication of data may introduce inconsistency between the items. Using edition or translation of (P629) I found the CRC Handbook of Chemistry and Physics (92nd edition) (Q11927173), which would be an S item in the proposed schema; however, it already exists as Q without any Wikipedia article and it looks good to me. I understand that the proposed S item space for sources would exist to implement the restriction that only a "source item" may be used as a source reference for claims. In the current Q-only space, any item may be used as a source, even if it looks strange. But you never know what can be used as a source in the future. Someone could say that en:File:Ybc7289-bw.jpg is a stone, but for others it is a very precious source. Maybe instead of S we should provide an attribute for Q holding a boolean value that says whether the Q is an information storage. If it contains some information it could be used as a source. This could also prevent using Special:Random item as a source. Paweł Ziemian (talk) 20:44, 28 June 2013 (UTC)
- The existence of promoted items is not that dramatic: out of thousands or millions of source items there will be only a handful that exist in both domains. There can be Encyclopædia Britannica 11th edition (Q867541) as an object of study, the same edition as a source item, and both of them linked as "edition of" Encyclopædia Britannica (Q455). If this were the norm, then it would be something to be worried about, but with just a couple of instances I don't think we should make a big fuss out of it, especially considering that properties like stated in (P248) should favor S items, if not be devoted completely to them. That artifact is really interesting! I regard the "source items" as a way of providing a clear answer to bibliographic records. For other needs we will have to keep expanding the Guidelines as needed. They are not set in stone :) --Micru (talk) 00:14, 29 June 2013 (UTC)
- Alternatively, we could take the route that S-Encyc 11th edition is said to be the same as (P460) Q-Encyc 11th Edition, which then inherits the fact that it is an edition of Q-Encyc.
- On a side note, I don't think we need edition of. Subclass of does the same thing, no? --Izno (talk) 03:34, 8 August 2013 (UTC)
- 'Subclass of' links a class to a class so it is not appropriate here. 'part of' could be used but I think 'edition of' provides more information and is worth having instead. Filceolaire (talk) 15:00, 12 August 2013 (UTC)
- An edition of a text is a subclass of that text. That seems rather apparent to me. (It certainly isn't a part of! Volume A-E would be a part of relation.) --Izno (talk) 21:55, 14 August 2013 (UTC)
- I would be more inclined to call an edition of a text to be an instance of (P31) of the text. The text (e.g. Encyclopædia Britannica (Q455)) is an abstract entity that is an instance of (P31) the abstract concept of a book representing all editions, publications etc. The edition (e.g. Encyclopædia Britannica 11th edition (Q867541)) is another abstract entity that is an instance of (P31) the text (Encyclopædia Britannica (Q455)) and also an instance of (P31) the abstract concept of a single edition of a book. There will then be many physical copies of the edition that are instance of (P31) both the edition (Encyclopædia Britannica 11th edition (Q867541)) and the concept of a physical book, but it is highly unlikely that any of them will be notable enough to be included. The main issues are the conflict that all three of the abstract concept of a multi-edition text, the abstract concept of a single edition of a book, and the concept of a physical book are currently the same data item (book (Q571)) and that it appears the class system is not setup around having metaclasses and multiple inheritance. All three "classes" could link to the same wikipedia article as they are all described by it, but they will need to be differentiated if Wikidata is to accurately model the system. -- Nemo157 (talk) 05:06, 18 August 2013 (UTC)
- Eh, I'm not too worried about that, I suppose. I'm simply calling out that "edition of" seems an unnecessary property. :)
- As for multiple inheritance, I believe that is quite possible; see agent (Q392648) as an example. Notionally, there could be an item "human person" which inherits from the two subclasses "person" and "homo sapiens" (and is in fact one of things that I'd like to fix about the GND schema). --Izno (talk) 13:34, 18 August 2013 (UTC)
- Wikidata is a database, not an encyclopedia. When we have an item about a (notable) author we also want to list his or her publications. In Wikipedia we would list them in the article, on Wikidata we have to create separate items and connect them through a property. In this case the publications would clearly belong in the Q namespace, however. Thus, I think the duplication between the namespaces will become much larger than the proponents suspect. I also don't believe this solves the work/edition problem, as this does not exactly correspond to the Q/S distinction; I think this requires more cleverness in the Wikidata software (in particular, getting the software to understand the instance of (P31) relation and adapting the user interface and search engine based on that information). —Ruud 13:20, 8 August 2013 (UTC)
- I don't think you (or I) have any idea of how many millions (tens of millions) of citations are possible, and that's just in the form of a slightly different ISBN. It would be an interesting exercise to get a count from, say, all of EN.wiki's featured articles citations to see how many have items (surely the bar we'll need to meet for diversity of citation). I'm skeptical that it will be more than 10%. If that. --Izno (talk) 23:06, 8 August 2013 (UTC)
- It's not really an impressive number for the database because there are already more than 10 million items. It's not a very difficult task to make statements for them, as importing data from other ISBN databases is a bot task. It's however an opportunity to better classify books by their subject, their authors, find the most cited books in featured articles ... TomT0m (talk) 09:07, 9 August 2013 (UTC)
"impressive number" is an unverified claim. It's odd that we have that problem here. :^). As I suggested, we should get a feeling for the number of sources that we will need; I fear that the number would be exponentially larger than the number of items we have or will have.
We can classify those things just as easily in a different namespace, and in fact helps us find said most cited books if it is a separate namespace. --Izno (talk) 13:57, 9 August 2013 (UTC)
- An exponential number would mean that every time we add a reference we would multiply the number of references in the database by some number, like doubling it. This would indeed be unmanageable, but it's totally unrealistic. Every time we add an article in Wikipedia, we usually add no more than ten references (more for featured articles, but there are not that many of them), minus those that are already cited, like reference works or classic books, so it's far from being exponential, it's sublinear, and these kinds of databases can absolutely manage such growth. TomT0m (talk) 16:10, 9 August 2013 (UTC)
- Google estimated there are around 150 million publications in existence, so 150–300 million wouldn't be an unreasonable figure. —Ruud 08:41, 10 August 2013 (UTC)
- A publication is defined as...? --Izno (talk) 02:04, 12 August 2013 (UTC)
- Anything which is listed in some bibliographic database (WorldCat, OCLC, LoC, etc.) – The preceding unsigned comment was added by Ruud Koot (talk • contribs).
- So that excludes the 10 billion webpages, countless numbers of newspapers, all of which could be used as a source as well. Consider that... --Izno (talk) 19:07, 17 August 2013 (UTC)
Oppose Do you fear a too long Q number? Let it be Q and model anything as item. In the German Wikipedia we have on average not even 2 (!) references per article and I do not think that there is a single chapter which has 3 references on average. Even if all of them were unique which is of course wrong, the number of needed items would be in the same order of what we already have. KISS! — Felix Reimann (talk) 13:53, 14 August 2013 (UTC)
- In German. The sourcing requirements on wikis which actually expect inline sourcing (see EN, as I've already used as an example) are 2 orders of magnitude more onerous with featured articles reaching 300-350 citations, and we should expect that every item in Wikidata should have sourcing as well; again, that will be. --Izno (talk) 21:57, 14 August 2013 (UTC)
- You are right, Izno: There are articles with 300+ citations. But even in enwiki most featured articles have far fewer citations. And of course, only 0.1% of all articles are featured. Thus, they do not influence the overall average a lot. Do you have other overall numbers about the number of citations in enwiki? If not, let's work with the numbers I gave you, as dewiki is probably not the worst chapter. But even if we have 100 citations in every article on average (!), it would only increase the length of the Q-number by 2. You see: no problem. What could be a problem? If you want to have a "clean" Wikidata copy without references? A script which searches for all items with instance of (P31) <book, article,...> and strips them and connected items is no rocket science. Still, this is a lot easier than creating thousands of other scripts which need to consider an additional source namespace and, especially, coping with all kinds of data redundancies which you will get when you have 2 namespaces to store data. The lack of valid sources is the problem of Wikidata, not having too many of them. — Felix Reimann (talk) 15:43, 2 September 2013 (UTC)
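The arithmetic behind "increase the length of the Q-number by 2" can be checked with a rough sketch. The starting figure of 10 million items comes from earlier in this discussion; treating every item as an article with 100 unique citations is a deliberately pessimistic assumption. The `strip_sources` helper and the dictionary layout of the sample items are hypothetical, and unlike the script described above it removes only the source items themselves, not items connected to them.

```python
def q_number_digits(item_count: int) -> int:
    """Number of decimal digits needed for the highest Q-number."""
    return len(str(item_count))


current_items = 10_000_000     # "more than 10 million items", per the discussion
citations_per_article = 100    # the generous average used in the argument

grown = current_items * citations_per_article
extra_digits = q_number_digits(grown) - q_number_digits(current_items)
print(extra_digits)

# Hypothetical class labels standing in for <book, article, ...>.
SOURCE_CLASSES = {"book", "article"}


def strip_sources(items: list) -> list:
    """Keep only items whose instance-of (P31) values are not source classes."""
    return [i for i in items if not SOURCE_CLASSES & set(i.get("P31", []))]


sample = [
    {"id": "Q1", "P31": ["city"]},
    {"id": "Q2", "P31": ["book"]},
    {"id": "Q3", "P31": ["article"]},
]
clean = strip_sources(sample)
print([i["id"] for i in clean])
```

Multiplying the item count by 100 adds two decimal digits whatever the starting count, since 100 is 10 squared.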
Strong oppose for separate source items. Introducing source items would introduce a distinction that doesn't exist in nature. People should really review logic and philosophy on this issue. Most importantly is the fact that sources are also just a part of this universe, just as everything else, and which this database want to describe. --Tobias1984 (talk) 08:38, 18 August 2013 (UTC)
Oppose there is no intrinsic distinction between "source items" and other items; it is only the use as a source that makes a thing a source. One can make statements about source items. To improve the handling of source items, however, an item might better be made visible as being used as a source, and one might just create items as sources without having to dig into the details of the work/edition distinction. -- JakobVoss (talk)
Comments
...
The general consensus in the previous discussion was that "the sources that are needed will be created". Should this apply only to Wikidata sources or can it have a broader scope and support the sources that are used in Wikipedia only? There are multiple templates used in the Wikipedias, one of the most famous for scientific literature is Template:Cite doi. This template is used in many Wikipedias to store Digital object identifiers. The problem with this template is that:
- The information is not shared across Wikipedias
- It is not reusable in Wikidata
- Vandalism cannot be detected in the article where the template is used
After a discussion on the English Wikipedia about this, I would like to ask here as well if it would be in the scope of the project. In my opinion it could make it easier to source statements in Wikidata (if a source is used in Wikipedia the likelihood of reuse in Wikidata is higher) and it would make it easier to share sources across different language Wikipedias, even if they are not used in Wikidata.
Support
- Definitely. It would certainly ease contribution to enter the references once and for all and being able to call them on every Wikipedia. Alexander Doria (talk) 23:38, 25 June 2013 (UTC)
- Another problem with cite doi on Wikipedia is that its formatting is very inflexible, and does not match the citation formatting of many articles. Splitting the data from the appearance would help with this, because it would allow for multiple variations of the templates for producing the appearance of a citation without having to copy the underlying data. —David Eppstein (talk) 00:05, 26 June 2013 (UTC)
- I cite doi's and PMIDs in multiple languages and on Commons. There will be some challenges to having citation templates based in Wikidata, but the problems of duplicating the same template across multiple projects are even greater. Having citation templates stored on Wikidata is the first thing I thought of when I heard of Wikidata and I want this to happen. Blue Rasberry (talk) 10:19, 26 June 2013 (UTC)
Support as above, obvious and useful use case of Wikidata. TomT0m (talk) 11:02, 26 June 2013 (UTC)
Conditional support It is definitely a good thing to have the sources in Wikidata, but I wouldn't like to have them mixed with items. When users are looking for a source, they are not looking for normal items (and the other way around too), that's why I support this option provided that we have this kind of items segregated in a S namespace (see discussion above).--Micru (talk) 15:02, 26 June 2013 (UTC)
Support sources can be organized in databases, so it is natural to manage them in Wikidata (but like Micru I prefer a different namespace for them). Once we put sources here, it will become simpler for Wikipedias to show references. --Paperoastro (talk) 17:29, 26 June 2013 (UTC)
Support as Paperoastro --Sbisolo (talk) 08:49, 28 June 2013 (UTC)
Support —Ruud 21:14, 5 July 2013 (UTC)
Support - Wikidata should also consider incorporating the Cite ISBN template.--Futuretrillionaire (talk) 17:40, 13 July 2013 (UTC)
Support. Would it be possible to rewrite 'cite ISBN' to link to a wikidata item or create a new wikidata item from inside wikipedia? Filceolaire (talk) 15:05, 12 August 2013 (UTC)
- It will be possible, but like the rest, any serious work on that matter right now is limited by the fact that Wikipedia cannot access item data from an item other than the one associated with the current page, so it will wait until at least September (or whenever the functionality is implemented). A JavaScript gadget could be a workaround but it's worth waiting. TomT0m (talk) 15:35, 12 August 2013 (UTC)
Support. Wikidata=the database for Wiki(pedias|voyage)* — Felix Reimann (talk) 13:56, 14 August 2013 (UTC)
Oppose
...
Comments
- I like the idea, but, assuming there are no performance issues, I would still be a bit concerned about code readability for occasional users.
<ref>{{cite|S145742358}}</ref>
? I think that would at least require a new feature in the visual editor. --Zolo (talk) 16:00, 26 June 2013 (UTC)
- Of course interfaces are really important, but it is also possible to use something similar to "Item by title" in Lua in the template code, with ambiguities shown as an error message in the template. Besides, there are also the magic codes of the cite doi and similar templates, which already exist and are in use. They would of course benefit from Wikidata. TomT0m (talk) 16:10, 26 June 2013 (UTC)
- Great idea in practical terms. A fundamental problem: On what grounds should items that do represent real-world entities (editions of books, journal articles) be segregated to a different namespace? Littledogboy (talk) 15:31, 5 July 2013 (UTC)
- I don't think they should. The distinction between "database records that represent sources" and "database records that represent things which are not sources" seems very artificial and is only going to lead to ontological problems. Many of them will be in both categories. --Ruud 21:18, 5 July 2013 (UTC)
- In the French Wikipedia there is a Reference namespace for books and their editions. It is underused and indeed, for references that already have an article, it is redundant. TomT0m (talk) 10:35, 6 July 2013 (UTC)
- Also, Google once estimated there have been around 150 million "publications" (of which around 130 million are books). Even at the extreme end, giving each of those a separate entry would only lead to a 10-fold increase of the current database size, which seems manageable, especially as they can be automatically compared against external bibliographic databases for accuracy. --Ruud 10:47, 6 July 2013 (UTC)
- (edit confl.) I hope that's true. Are there other problems with keeping it the Q namespace? Notability? I think we may even get quality database information out of it – lists of articles by authors, possibly tagged with topics, quotations, annotations (somehow?), lists of book editions... Littledogboy (talk) 10:51, 6 July 2013 (UTC)
- Notability is one of the issues, yes. The other one would be disambiguation problems like the one outlined above (Washington Double Star Catalog (Q932275) and its two unconnected editions): it is hard to know which is which without some differentiation. There are also works with 100 editions or more; if you want to connect just the main item, you would have a gigantic list, so it wouldn't be easy to find the right one. If those problems can be addressed in another way, I am also fine with it.--Micru (talk) 14:43, 6 July 2013 (UTC)
Suppose this awesome future where we have all Wikipedia sources in Wikidata arrives. New questions will then appear, such as: how do we display in Wikipedia the link to the Wikidata item containing the source info? How do we display in Wikidata the Wikipedia articles that use the source?
Links from Wikipedia to Wikidata
Most likely the entry point will be any of the "cite" templates (cite doi, cite book, etc). There can be a direct link, or it can be a pop-up window (like the future image viewer) that displays all the information about the source and then lets the reader go to Wikidata or look for all instances of that source in Wikipedia.
Regarding the visual representation I've been thinking about:
- A superscript link at the end of the line
- A small wikidata icon at the end
Most likely this will have to be decided on the Wikipedias; however, it might make sense to start thinking about it and be able to offer some clever solution.--Micru (talk) 13:20, 14 August 2013 (UTC)
- Why do you want to change the current situation of references in Wikipedia? Just think about printing articles to understand why pop-up windows are not desirable. Snipre (talk) 23:07, 14 August 2013 (UTC)
- The link to the wikidata item is not a change, it is an addition. How else could readers (and editors) navigate from the source to the wikidata item containing the bibliographic info? Besides, the "edit template" interface in the visual editor is a window, nothing new.--Micru (talk) 00:07, 15 August 2013 (UTC)
- So if you don't change the current format and all data is displayed in the reference section of the article at the bottom of the page, just create a link there. Snipre (talk) 08:29, 15 August 2013 (UTC)
- Exactly, but how? There are many possibilities: a superscript link, a text that says "see this source in Wikidata", a small WD icon... It is a kind of link that will appear hundreds of times in the references list, so it should be as simple as possible.--Micru (talk) 13:25, 15 August 2013 (UTC)
Links from Wikidata to Wikipedia
In this case sitelinks are not an option, since the item won't be connected to one Wikipedia page, but to several. Even so, one possible solution could be to add another column to the "sitelink" table with usage information. Now it has "Language / Code / Linked article"; it could have "Language / Code / Linked article / Used in", and the "used in" column could display "1670 pages" (for each language version). Clicking on that link would display a list of all the pages that are somehow using information contained in the item.--Micru (talk) 13:20, 14 August 2013 (UTC)
- Useless. A script like "What links here" is enough. No need to track everything. Snipre (talk) 23:09, 14 August 2013 (UTC)
- It is not user-friendly at all to have to click each time on the link to see if an item is being used in Wikipedia or not. Practically none of the sources will have sitelinks, so I think some similar visual cue is necessary.--Micru (talk) 00:07, 15 August 2013 (UTC)
- We have to keep the Wikidata page as simple as possible and not add data there that you can get from a single query. Wikidata is a database, not a visual interface for everybody with all possible data sets displayed automatically. Just think of the server work needed to display the page each time. Better to keep the spirit of a database and develop personal tools that perform data extraction according to individual desires. Right now we don't have any problem with data traffic and server load because the data are not used in Wikipedia, but this will be an issue later. Snipre (talk) 08:45, 15 August 2013 (UTC)
- I agree with the simplicity principle, and yet, even being a database, it is used by humans who just want to have immediate access to the information with minimal effort. If it requires too much computational effort to calculate the numbers per domain, a link similar to "what links here" for domains that use the source at least once should be enough.--Micru (talk) 13:25, 15 August 2013 (UTC)
- "humans who just want to have immediate access to the information with minimal effort": OK, but what kind of information? You can't define a common set of data that can be considered THE data humans want to have quickly. There are no common queries; each person wants the data they need, not the data others want. So a JavaScript tool that allows saving queries will be more useful than 2-5 predefined data searches. A database means we have to work with database tools. Snipre (talk) 08:48, 17 August 2013 (UTC)
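For illustration only, the "Used in" column discussed in this section could be derived roughly as in the sketch below. It is not an existing Wikidata feature or API: the function name `used_in_rows`, the `(wiki code, page title)` input format, and the sample data are all hypothetical, and the "Language" column is left out for brevity.

```python
from collections import defaultdict


def used_in_rows(usages, sitelinks=None):
    """Build 'Code / Linked article / Used in' rows for one source item.

    usages    -- iterable of (wiki_code, page_title) pairs (hypothetical format)
    sitelinks -- optional dict mapping wiki_code to the linked article, if any
    """
    sitelinks = sitelinks or {}

    # Collect distinct pages per wiki so repeated citations on the same
    # page are counted only once.
    pages_by_wiki = defaultdict(set)
    for wiki_code, page_title in usages:
        pages_by_wiki[wiki_code].add(page_title)

    rows = []
    for wiki_code in sorted(pages_by_wiki):
        count = len(pages_by_wiki[wiki_code])
        rows.append({
            "code": wiki_code,
            # Most source items are expected to have no sitelink at all.
            "linked_article": sitelinks.get(wiki_code),
            "used_in": f"{count} page" if count == 1 else f"{count} pages",
        })
    return rows


# Made-up usage data, purely for demonstration.
example_usages = [
    ("enwiki", "Double star"),
    ("enwiki", "Sirius"),
    ("frwiki", "Étoile double"),
    ("enwiki", "Sirius"),  # same page citing the source twice
]
rows = used_in_rows(example_usages)
for row in rows:
    print(row["code"], row["linked_article"], row["used_in"])
```

Whether such counts would be computed on every page view or fetched on demand (closer to "What links here") is exactly the trade-off debated above; the sketch only shows the shape of the data.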