Module:Citation/CS1/COinS
This Lua module is used on approximately 6,110,000 pages. To avoid major disruption and server load, any changes should be tested in the module's /sandbox or /testcases subpages, or in your own module sandbox. The tested changes can be added to this page in a single edit. Consider discussing changes on the talk page before implementing them.
This module is rated as ready for general use. It has reached a mature form and is thought to be relatively bug-free and ready for use wherever appropriate. It is ready to mention on help pages and other Wikipedia resources as an option for new users to learn. To reduce server load and bad output, it should be improved by sandbox testing rather than repeated trial-and-error editing.
This module is subject to page protection. It is a highly visible module in use by a very large number of pages, or is substituted very frequently. Because vandalism or mistakes would affect many pages, and even trivial editing might cause substantial load on the servers, it is protected from editing.
This module can only be edited by administrators because it is transcluded onto one or more cascade-protected pages.
This page contains the functions that render a cs1|2 template's COinS metadata.
--[[--------------------------< F O R W A R D   D E C L A R A T I O N S >--------------------------------------
]]

local has_accept_as_written, is_set, in_array, remove_wiki_link, strip_apostrophe_markup;	-- functions in Module:Citation/CS1/Utilities

local cfg;	-- table of configuration tables that are defined in Module:Citation/CS1/Configuration


--[[--------------------------< M A K E _ C O I N S _ T I T L E >----------------------------------------------

Makes a title for COinS from Title and / or ScriptTitle (or any other name-script pairs)

Apostrophe markup (bold, italics) is stripped from each value so that the COinS metadata isn't corrupted with strings
of %27%27...

]]

local function make_coins_title (title, script)
	title = has_accept_as_written (title);
	if is_set (title) then
		title = strip_apostrophe_markup (title);	-- strip any apostrophe markup
	else
		title = '';	-- if not set, make sure title is an empty string
	end
	if is_set (script) then
		script = script:gsub ('^%l%l%s*:%s*', '');	-- remove language prefix if present (script value may now be empty string)
		script = strip_apostrophe_markup (script);	-- strip any apostrophe markup
	else
		script = '';	-- if not set, make sure script is an empty string
	end
	if is_set (title) and is_set (script) then
		script = ' ' .. script;	-- add a space before we concatenate
	end
	return title .. script;	-- return the concatenation
end


--[[--------------------------< E S C A P E _ L U A _ M A G I C _ C H A R S >----------------------------------

Returns a string where all of Lua's magic characters have been escaped.  This is important because functions like
string.gsub() treat their pattern and replace strings as patterns, not literal strings.

]]

local function escape_lua_magic_chars (argument)
	argument = argument:gsub ("%%", "%%%%");	-- replace % with %%
	argument = argument:gsub ("([%^%$%(%)%.%[%]%*%+%-%?])", "%%%1");	-- replace all other Lua magic pattern characters
	return argument;
end


--[[--------------------------< G E T _ C O I N S _ P A G E S >------------------------------------------------

Extract page numbers from external wikilinks in any of the |page=, |pages=, or |at= parameters for use in COinS.

]]

local function get_coins_pages (pages)
	local pattern;
	if not is_set (pages) then return pages; end	-- if no page numbers then we're done

	while true do
		pattern = pages:match ("%[(%w*:?//[^ ]+%s+)[%w%d].*%]");	-- pattern is the opening bracket, the URL and following space(s): "[url "
		if nil == pattern then break; end	-- no more URLs
		pattern = escape_lua_magic_chars (pattern);	-- pattern is not a literal string; escape Lua's magic pattern characters
		pages = pages:gsub (pattern, "");	-- remove as many instances of pattern as possible
	end
	pages = pages:gsub ("[%[%]]", "");	-- remove the brackets
	pages = pages:gsub ("–", "-");	-- replace endashes with hyphens
	pages = pages:gsub ("&%w+;", "-");	-- and replace HTML entities (&ndash; etc.) with hyphens; do we need to replace numerical entities like &#32; and the like?
	pages = pages:gsub ('%b<>', '');	-- remove html-like tags; spans are added to <Pages> by utilities.hyphen_to_dash() which should not appear in COinS metadata
	return pages;
end


--[=[-------------------------< C O I N S _ R E P L A C E _ M A T H _ S T R I P M A R K E R >------------------

There are three options for math markup rendering that depend on the editor's math preference settings.  These
settings are at [[Special:Preferences#mw-prefsection-rendering]] and are
	PNG images
	TeX source
	MathML with SVG or PNG fallback

All three are heavy with HTML and CSS which doesn't belong in the metadata.

Without this function, the metadata saved in the raw wikitext contained the rendering determined by the settings
of the last editor to save the page.

This function gets the rendered form of an equation according to the editor's preference before the page is saved.  It
then searches the rendering for the text equivalent of the rendered equation and replaces the rendering with that so
that the page is saved without extraneous HTML/CSS markup and with a reasonably readable text form of the equation.

When a replacement is made, this function returns true and the value with replacement; otherwise false and the initial
value.  To replace multiple equations it is necessary to call this function from within a loop.

]=]

local function coins_replace_math_stripmarker (value)
	local stripmarker = cfg.stripmarkers['math'];
	local rendering = value:match (stripmarker);	-- is there a math stripmarker

	if not rendering then	-- when value doesn't have a math stripmarker, abandon this test
		return false, value;
	end

	rendering = mw.text.unstripNoWiki (rendering);	-- convert stripmarker into rendered value (or nil? ''? when math render error)

	if rendering:match ('alt="[^"]+"') then	-- if PNG math option
		rendering = rendering:match ('alt="([^"]+)"');	-- extract just the math text
	elseif rendering:match ('$%s+.+%s+%$') then	-- if TeX math option; $ is legit character that is escaped as \$
		rendering = rendering:match ('$%s+(.+)%s+%$')	-- extract just the math text
	elseif rendering:match ('<annotation[^>]+>.+</annotation>') then	-- if MathML math option
		rendering = rendering:match ('<annotation[^>]+>(.+)</annotation>')	-- extract just the math text
	else
		return false, value;	-- had math stripmarker but not one of the three defined forms
	end

	return true, value:gsub (stripmarker, rendering, 1);
end


--[[--------------------------< C O I N S _ C L E A N U P >----------------------------------------------------

Cleanup parameter values for the metadata by removing or replacing invisible characters and certain HTML entities.

2015-12-10: there is a bug in mw.text.unstripNoWiki ().  It replaces math stripmarkers with the appropriate content
when it shouldn't.  See https://phabricator.wikimedia.org/T121085 and Wikipedia_talk:Lua#stripmarkers_and_mw.text.unstripNoWiki.28.29

TODO: move the replacement patterns and replacement values into a table in /Configuration similar to the invisible
characters table?

]]

local function coins_cleanup (value)
	local replaced = true;	-- default state to get the do loop running

	while replaced do	-- loop until all math stripmarkers replaced
		replaced, value = coins_replace_math_stripmarker (value);	-- replace math stripmarker with text representation of the equation
	end

	value = value:gsub (cfg.stripmarkers['math'], "MATH RENDER ERROR");	-- one or more couldn't be replaced; insert vague error message
	value = mw.text.unstripNoWiki (value);	-- replace nowiki stripmarkers with their content
	value = value:gsub ('<span class="nowrap" style="padding%-left:0%.1em;">&#39;(s?)</span>', "'%1");	-- replace {{'}} or {{'s}} with simple apostrophe or apostrophe-s
	value = value:gsub ('&nbsp;', ' ');	-- replace &nbsp; entity with plain space
	value = value:gsub ('\226\128\138', ' ');	-- replace hair space with plain space
	if not mw.ustring.find (value, cfg.indic_script) then	-- don't remove zero-width joiner characters from indic script
		value = value:gsub ('&zwj;', '');	-- remove &zwj; entities
		value = mw.ustring.gsub (value, '[\226\128\141\226\128\139\194\173]', '');	-- remove zero-width joiner, zero-width space, soft hyphen
	end
	value = value:gsub ('[\009\010\013 ]+', ' ');	-- replace horizontal tab, line feed, carriage return with plain space
	return value;
end


--[[--------------------------< C O I N S >--------------------------------------------------------------------

COinS metadata (see <http://ocoins.info/>) allows automated tools to parse the citation information.

]]

local function COinS (data, class)
	if 'table' ~= type (data) or nil == next (data) then
		return '';
	end

	for k, v in pairs (data) do	-- spin through all of the metadata parameter values
		if 'ID_list' ~= k and 'Authors' ~= k then	-- except the ID_list and Author tables (author nowiki stripmarker done when Author table processed)
			data[k] = coins_cleanup (v);
		end
	end

	local ctx_ver = "Z39.88-2004";

	-- treat table strictly as an array with only set values.
	local OCinSoutput = setmetatable ({}, {
		__newindex = function (self, key, value)
			if is_set (value) then
				rawset (self, #self + 1, table.concat {key, '=', mw.uri.encode (remove_wiki_link (value))});
			end
		end
	});

	if in_array (class, {'arxiv', 'biorxiv', 'citeseerx', 'medrxiv', 'ssrn', 'journal', 'news', 'magazine'}) or
		(in_array (class, {'conference', 'interview', 'map', 'press release', 'web'}) and is_set (data.Periodical)) or
		('citation' == class and is_set (data.Periodical) and not is_set (data.Encyclopedia)) then
			OCinSoutput.rft_val_fmt = "info:ofi/fmt:kev:mtx:journal";	-- journal metadata identifier
			if in_array (class, {'arxiv', 'biorxiv', 'citeseerx', 'medrxiv', 'ssrn'}) then	-- set genre according to the type of citation template we are rendering
				OCinSoutput["rft.genre"] = "preprint";	-- cite arxiv, cite biorxiv, cite citeseerx, cite medrxiv, cite ssrn
			elseif 'conference' == class then
				OCinSoutput["rft.genre"] = "conference";	-- cite conference (when Periodical set)
			elseif 'web' == class then
				OCinSoutput["rft.genre"] = "unknown";	-- cite web (when Periodical set)
			else
				OCinSoutput["rft.genre"] = "article";	-- journal and other 'periodical' articles
			end
			OCinSoutput["rft.jtitle"] = data.Periodical;	-- journal only
			OCinSoutput["rft.atitle"] = data.Title;	-- 'periodical' article titles

			-- these used only for periodicals
			OCinSoutput["rft.ssn"] = data.Season;	-- keywords: winter, spring, summer, fall
			OCinSoutput["rft.quarter"] = data.Quarter;	-- single digits 1->first quarter, etc.
			OCinSoutput["rft.chron"] = data.Chron;	-- free-form date components
			OCinSoutput["rft.volume"] = data.Volume;	-- does not apply to books
			OCinSoutput["rft.issue"] = data.Issue;
			OCinSoutput['rft.artnum'] = data.ArticleNumber;	-- {{cite journal}} only
			OCinSoutput["rft.pages"] = data.Pages;	-- also used in book metadata

	elseif 'thesis' ~= class then	-- all others except cite thesis are treated as 'book' metadata; genre distinguishes
		OCinSoutput.rft_val_fmt = "info:ofi/fmt:kev:mtx:book";	-- book metadata identifier
		if 'report' == class or 'techreport' == class then	-- cite report and cite techreport
			OCinSoutput["rft.genre"] = "report";
		elseif 'conference' == class then	-- cite conference when Periodical not set
			OCinSoutput["rft.genre"] = "conference";
			OCinSoutput["rft.atitle"] = data.Chapter;	-- conference paper as chapter in proceedings (book)
		elseif in_array (class, {'book', 'citation', 'encyclopaedia', 'interview', 'map'}) then
			if is_set (data.Chapter) then
				OCinSoutput["rft.genre"] = "bookitem";
				OCinSoutput["rft.atitle"] = data.Chapter;	-- book chapter, encyclopedia article, interview in a book, or map title
			else
				if 'map' == class or 'interview' == class then
					OCinSoutput["rft.genre"] = 'unknown';	-- standalone map or interview
				else
					OCinSoutput["rft.genre"] = 'book';	-- book and encyclopedia
				end
			end
		else	-- {'audio-visual', 'AV-media-notes', 'DVD-notes', 'episode', 'interview', 'mailinglist', 'map', 'newsgroup', 'podcast', 'press release', 'serial', 'sign', 'speech', 'web'}
			OCinSoutput["rft.genre"] = "unknown";
		end
		OCinSoutput["rft.btitle"] = data.Title;	-- book only
		OCinSoutput["rft.place"] = data.PublicationPlace;	-- book only
		OCinSoutput["rft.series"] = data.Series;	-- book only
		OCinSoutput["rft.pages"] = data.Pages;	-- book, journal
		OCinSoutput["rft.edition"] = data.Edition;	-- book only
		OCinSoutput["rft.pub"] = data.PublisherName;	-- book and dissertation

	else	-- cite thesis
		OCinSoutput.rft_val_fmt = "info:ofi/fmt:kev:mtx:dissertation";	-- dissertation metadata identifier
		OCinSoutput["rft.title"] = data.Title;	-- dissertation (also patent but that is not yet supported)
		OCinSoutput["rft.degree"] = data.Degree;	-- dissertation only
		OCinSoutput['rft.inst'] = data.PublisherName;	-- book and dissertation
	end
	-- NB. Not currently supported are "info:ofi/fmt:kev:mtx:patent", "info:ofi/fmt:kev:mtx:dc", "info:ofi/fmt:kev:mtx:sch_svc", "info:ofi/fmt:kev:mtx:ctx"

	-- and now common parameters (as much as possible)
	OCinSoutput["rft.date"] = data.Date;	-- book, journal, dissertation

	for k, v in pairs (data.ID_list) do	-- what to do about these? For now assume that they are common to all?
		if k == 'ISBN' then v = v:gsub ("[^-0-9X]", ""); end
		local id = cfg.id_handlers[k].COinS;
		if string.sub (id or "", 1, 4) == 'info' then	-- for ids that are in the info:registry
			OCinSoutput["rft_id"] = table.concat {id, "/", v};
		elseif string.sub (id or "", 1, 3) == 'rft' then	-- for isbn, issn, eissn, etc. that have defined COinS keywords
			OCinSoutput[id] = v;
		elseif 'url' == id then	-- for urls that are assembled in ~/Identifiers; |asin= and |ol=
			OCinSoutput["rft_id"] = table.concat ({data.ID_list[k], "#id-name=", cfg.id_handlers[k].label});
		elseif id then	-- when cfg.id_handlers[k].COinS is not nil so urls created here
			OCinSoutput["rft_id"] = table.concat {cfg.id_handlers[k].prefix, v, cfg.id_handlers[k].suffix or '', "#id-name=", cfg.id_handlers[k].label};	-- others; provide a URL and indicate identifier name as #fragment (human-readable, but transparent to browsers)
		end
	end

	local last, first;
	for k, v in ipairs (data.Authors) do
		last, first = coins_cleanup (v.last), coins_cleanup (v.first or '');	-- replace any nowiki stripmarkers, non-printing or invisible characters
		if k == 1 then	-- for the first author name only
			if is_set (last) and is_set (first) then	-- set these COinS values if |first= and |last= specify the first author name
				OCinSoutput["rft.aulast"] = last;	-- book, journal, dissertation
				OCinSoutput["rft.aufirst"] = first;	-- book, journal, dissertation
			elseif is_set (last) then
				OCinSoutput["rft.au"] = last;	-- book, journal, dissertation -- otherwise use this form for the first name
			end
		else	-- for all other authors
			if is_set (last) and is_set (first) then
				OCinSoutput["rft.au"] = table.concat {last, ", ", first};	-- book, journal, dissertation
			elseif is_set (last) then
				OCinSoutput["rft.au"] = last;	-- book, journal, dissertation
			end
			-- TODO: At present we do not report "et al.". Add anything special if this condition applies?
		end
	end

	OCinSoutput.rft_id = data.URL;
	OCinSoutput.rfr_id = table.concat {"info:sid/", mw.site.server:match ("[^/]*$"), ":", data.RawPage};

	-- TODO: Add optional extra info:
	-- rfr_dat=#REVISION<version> (referrer private data)
	-- ctx_id=<data.RawPage>#<ref> (identifier for the context object)
	-- ctx_tim=<ts> (timestamp in format yyyy-mm-ddThh:mm:ssTZD or yyyy-mm-dd)
	-- ctx_enc=info:ofi/enc:UTF-8 (character encoding)

	OCinSoutput = setmetatable (OCinSoutput, nil);

	-- sort with version string always first, and combine.
	-- table.sort (OCinSoutput);
	table.insert (OCinSoutput, 1, "ctx_ver=" .. ctx_ver);	-- such as "Z39.88-2004"
	return table.concat (OCinSoutput, "&");
end


--[[--------------------------< S E T _ S E L E C T E D _ M O D U L E S >--------------------------------------

Sets local cfg table and imported functions table to same (live or sandbox) as that used by the other modules.

]]

local function set_selected_modules (cfg_table_ptr, utilities_page_ptr)
	cfg = cfg_table_ptr;

	has_accept_as_written = utilities_page_ptr.has_accept_as_written;	-- import functions from selected Module:Citation/CS1/Utilities module
	is_set = utilities_page_ptr.is_set;
	in_array = utilities_page_ptr.in_array;
	remove_wiki_link = utilities_page_ptr.remove_wiki_link;
	strip_apostrophe_markup = utilities_page_ptr.strip_apostrophe_markup;
end


--[[--------------------------< E X P O R T E D   F U N C T I O N S >------------------------------------------
]]

return {
	make_coins_title = make_coins_title,
	get_coins_pages = get_coins_pages,
	COinS = COinS,
	set_selected_modules = set_selected_modules,
}
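
The COinS() function collects its output through a table whose __newindex metamethod turns every assignment into an appended key=value string, which is why repeated keys such as rft.au and rft_id accumulate instead of overwriting each other. The following is a minimal standalone sketch of that idea, runnable outside MediaWiki. It is not the module's code: url_encode and is_set here are hypothetical stand-ins for mw.uri.encode and the Utilities is_set, new_collector is an illustrative name, and the stand-in encoder only approximates MediaWiki's query encoding.

```lua
-- Hypothetical stand-in for mw.uri.encode(): percent-encodes everything except
-- unreserved characters and turns spaces into '+'. Not the MediaWiki implementation.
local function url_encode (s)
	s = s:gsub ('[^%w%-%._~ ]', function (c)
		return string.format ('%%%02X', c:byte ());
	end);
	return (s:gsub (' ', '+'));
end

-- Hypothetical stand-in for is_set() from Module:Citation/CS1/Utilities.
local function is_set (v)
	return nil ~= v and '' ~= v;
end

-- Collector: string keys are never stored; each assignment with a set value
-- appends one 'key=value' string to the array part of the table.
local function new_collector ()
	return setmetatable ({}, {
		__newindex = function (self, key, value)
			if is_set (value) then
				rawset (self, #self + 1, table.concat {key, '=', url_encode (value)});
			end
		end
	});
end

local out = new_collector ();
out['rft.genre'] = 'article';
out['rft.atitle'] = 'A Title';
out['rft.volume'] = nil;			-- ignored: not set
out['rft.issue'] = '';				-- ignored: empty string
out['rft.au'] = 'Doe, Jane';
out['rft.au'] = 'Roe, Richard';		-- same key again: appended, not overwritten

setmetatable (out, nil);			-- back to a plain array before inserting the version
table.insert (out, 1, 'ctx_ver=Z39.88-2004');
local result = table.concat (out, '&');
print (result);
```

Because the string keys never exist in the table, every assignment reaches __newindex, and unset values simply leave no trace in the output. Removing the metatable before table.insert mirrors the order used in COinS().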