Wikipedia Structure

From UMBEL Wiki

Different parts of Wikipedia use different schemes for organizing articles into categories (see the Wikipedia categorization FAQ).

The main types of categories used are:

  • topic categories – categories of articles relating to a particular topic, such as Category:Geography or Category:Paris.
  • list categories – categories of articles on subjects in a particular class, such as Category:Villages in Poland.
  • list-and-topic categories – categories which are combinations of the two above types.
  • intermediate categories – categories used to organize large classes of subcategories, such as Category:Albums by artist.
  • universal categories – categories used to provide a complete list of articles which are otherwise normally divided into subcategories.
  • project categories – categories used mainly by Wikipedia's editors for project management purposes, rather than for browsing. A common type is stub categories, which contain short ("stub") articles in a particular field. You can help build Wikipedia by expanding these stubs into longer articles!
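As a rough illustration of how these category types might be told apart from the name alone, here is a minimal sketch. The enum, the function name `guess_category_type`, and the name-based rules are all hypothetical; Wikipedia does not label categories this way, and the heuristic only covers four of the six types listed above.

```python
from enum import Enum


class CategoryType(Enum):
    TOPIC = "topic"
    LIST = "list"
    INTERMEDIATE = "intermediate"
    PROJECT = "project"


def guess_category_type(name: str) -> CategoryType:
    """Guess a category type from its name (crude, illustrative heuristic)."""
    prefix = "Category:"
    title = name[len(prefix):] if name.startswith(prefix) else name
    words = title.lower().split()
    if not words:
        raise ValueError("empty category name")
    padded = " " + " ".join(words) + " "

    # Assumption: stub categories end in the word "stubs".
    if words[-1] == "stubs":
        return CategoryType.PROJECT
    # Assumption: "X by Y" names organize subcategories (intermediate).
    if " by " in padded:
        return CategoryType.INTERMEDIATE
    # Assumption: a plural head noun plus a preposition names a class of things.
    if words[0].endswith("s") and any(p in padded for p in (" in ", " of ", " from ")):
        return CategoryType.LIST
    # Everything else is treated as a topic category.
    return CategoryType.TOPIC


for example in ("Category:Geography", "Category:Villages in Poland",
                "Category:Albums by artist"):
    print(example, "->", guess_category_type(example).value)
```

Name patterns will misclassify many real categories, so a sketch like this is at best a first pass before manual review.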

Outline of Knowledge

http://en.wikipedia.org/wiki/Portal:Contents

http://en.wikipedia.org/wiki/Portal:Contents/Outline_of_knowledge

Listing of Wikipedia Occupations -- http://en.wikipedia.org/wiki/Category:People_by_occupation_and_nationality
Listing of Wikipedia Nationalities -- http://en.wikipedia.org/wiki/Category:Sportspeople_by_nationality

Templates & Infoboxes

As of Aug 2007:

  Pages               3.84 M
  Templates           0.76 M
  Templates           1.07 M
  Template Names      7,783
  Attribute Names     19,378
  Templ-Attr Pairs    81,671
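
The counts above allow a quick back-of-the-envelope ratio. The sketch below assumes that "Templ-Attr Pairs" counts distinct (template name, attribute name) combinations; that reading is not confirmed by the source, and the variable names are illustrative only.

```python
# Figures copied from the Aug 2007 statistics above.
template_names = 7_783
attribute_names = 19_378
template_attribute_pairs = 81_671

# Assumption: each pair is one distinct (template name, attribute name) combination.
pairs_per_template = template_attribute_pairs / template_names
pairs_per_attribute = template_attribute_pairs / attribute_names

print(f"Average attributes per template name: {pairs_per_template:.1f}")
print(f"Average template names per attribute name: {pairs_per_attribute:.1f}")
```

Under that assumption, a template name carries roughly ten attributes on average, and an attribute name is reused across roughly four template names.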

The following table describes the various aspects of embedded structure within Wikipedia.

Category/Subcategory
Descriptive Notes
Reference Lists
Categories
Administration

admin.csv
Date-related


Faceted


Qualified


Core Categories
  • Category — Category suggestion
  • Contained articles — Semantically related terms (siblings)
  • Hierarchy — Hyponymic and meronymic relations between terms

Category Links
  • In-links: sub-category candidate
  • Out-links: super-category candidate

Portals



Articles
Articles
  • First paragraph — Definitions
  • Full text — Description of meaning; related terms; translations
  • Redirects — Synonymy; spelling variations, misspellings; abbreviations
  • Title — Named entities; domain specific terms or senses
  • Subject — Category suggestion (phrase marked in bold or in first paragraph)
  • Section heading — Category suggestions

Article (other??)


Article Links
General
  • Context — Related terms; co-occurrences
  • Label — Synonyms; spelling variations; related terms
  • Target — Link graph; related terms

In-links
  • Sub-category candidate
  • Name –
  • Item — Category suggestion; entity suggestion

Anchor texts
about 3x for every page link

See also links
weight of 5

Double links
weight of 2

Inverse see also links
weight of 2

Links in same category
weight of 1.5

Out-links
  • Name –
  • Item — Category suggestion; entity suggestion

Shared out-link categories
A good measure of shared relationship (especially if pruned?)

Templates
Infoboxes
  • Name –
  • Item — Category suggestion; entity suggestion

Navboxes


General Templates


Disambiguation Pages
General


Redirects
General


List Pages
General
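
The link weights listed in the table above (see also links 5, double links 2, inverse see also links 2, links in same category 1.5) could be combined into a simple relatedness score. The sketch below is one possible reading: the table does not say how the weights are combined, so summing them, the default weight of 1.0 for a plain link, and all names in the code are assumptions.

```python
# Weights taken from the table above.
LINK_WEIGHTS = {
    "see_also": 5.0,
    "double_link": 2.0,
    "inverse_see_also": 2.0,
    "same_category": 1.5,
}

# Assumption: a link with no special property counts once.
DEFAULT_WEIGHT = 1.0


def link_score(link_types):
    """Sum the weights of all link types observed between two articles."""
    return sum(LINK_WEIGHTS.get(kind, DEFAULT_WEIGHT) for kind in link_types)


# Hypothetical pair of articles connected by a "see also" link,
# where both articles also sit in the same category.
print(link_score(["see_also", "same_category"]))
```

A multiplicative combination would be an equally plausible reading; the additive form is used here only because it is the simplest to inspect.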








Wikipedia API

[Figure: WikipediaAPI.png]
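
As a small, hedged illustration of reaching the category structure described earlier through the MediaWiki web API, the sketch below builds a `list=categorymembers` query URL. It only constructs the URL and makes no network call; the function name `build_category_members_url` and the default limit are illustrative choices, not something taken from this page.

```python
from urllib.parse import urlencode

# Public MediaWiki API endpoint for English Wikipedia.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_category_members_url(category: str, limit: int = 50) -> str:
    """Build a query URL that lists the members of a category."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category,   # full title, including the "Category:" prefix
        "cmlimit": limit,      # number of members to return per request
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


print(build_category_members_url("Category:Villages in Poland", limit=10))
```

Fetching the URL returns JSON; large categories need follow-up requests using the continuation value included in each response.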