Info about this site

Table of Contents

What is the Majestic-12 project?
Why I created this site
FAQ

What is the Majestic-12 project?

Majestic-12 is developing a search engine scalable to billions of web pages that is based on support by the community. Since the task of building a World Wide Web search engine is so huge, we have chosen to make Majestic-12 Distributed Search Engine based on the concept of distributed computing. The idea being that many machines work on one task to get it done quicker than one large machine alone. One of the biggest challenges with the search engines is actually getting billions of pages, and to do this cost effectively we have created a client software called MJ12node that can be run on otherwise idle computers. - http://www.majestic12.co.uk/about.php

Why I created this site

The most important reason is that I like the mj12 project and I like to program. The first step came from wanting to see how much I had crawled the previous day. Then more and more statistics followed.

FAQ

When does the mj12 Day begin? (Which time zone used)
What does MiB/GiB/TiB mean? (What's the difference between MB and MiB)
Honored urls vs Live urls. Huh?
Limits of the Live urls
How long is the delay from Mj12 to SonnigeLichtung.de?
Why is my best Uptime not displayed in Highscore?
My Windows Server 2008 is listed as Windows Vista! (Windows Vista/2008)
When does the mj12 Day begin? (Which time zone used)

The MJ12 server is located in the UK and uses the Europe/London time zone. In winter this is UTC/GMT, and in summer it is UTC+1/GMT+1.
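
As a rough illustration of the time zone rule above, the following sketch converts the start of an mj12 day to UTC. It assumes that the mj12 day begins at local midnight in Europe/London; the function name mj12_day_start_utc is made up for this example and is not part of the project.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library since Python 3.9

# Time zone named in the answer above.
LONDON = ZoneInfo("Europe/London")


def mj12_day_start_utc(year, month, day):
    """Return the UTC instant of local midnight in London for the given date.

    Assumption: the mj12 day starts at local midnight on the server.
    """
    local_midnight = datetime(year, month, day, tzinfo=LONDON)
    return local_midnight.astimezone(timezone.utc)


winter = mj12_day_start_utc(2009, 1, 15)
summer = mj12_day_start_utc(2009, 7, 15)
print(winter.isoformat())  # 2009-01-15T00:00:00+00:00
print(summer.isoformat())  # 2009-07-14T23:00:00+00:00
```

In summer the day therefore starts one hour earlier when seen from UTC. On systems without a time zone database, zoneinfo may need the tzdata package.
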
What does MiB/GiB/TiB mean? (What's the difference between MB and MiB)

MiB stands for MebiByte (likewise GiB for GibiByte and TiB for TebiByte) (Wikipedia) and means 1024*1024 Byte (2^20 Byte). The M for Mega, as known from MegaByte, is an SI prefix for 1000*1000 (10^6).

Some examples of the different usage: a CD of 650 MB really means 650 MiB, a DVD of 4.7 GB really means 4.7 GB, and a 1 TB hard disk really has 1 TB (10^12 Byte), which Windows shows as 931 GB (0.909 TB) while meaning GiB (TiB).

On the MJ12 project page all data is shown in the old style (MB is called MegaByte but means 1024^2 Byte). To avoid inconsistency when proclaiming high scores, such as calling 2,000,000 MiB "2 TeraByte" when it is really 2.1 TeraByte or 1.9 TebiByte, and because the difference between PiB and PB is over 12%, I decided to use the exact abbreviations.
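
The arithmetic behind these prefixes can be checked with a short sketch. The constant and function names are chosen for this example only; the numbers follow directly from the definitions of the binary and SI prefixes.

```python
# Binary (IEC) units in bytes.
MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4

# A "1 TB" hard disk as sold: 10^12 bytes (SI prefix).
disk_bytes = 10 ** 12

print(round(disk_bytes / GIB, 1))  # 931.3
print(round(disk_bytes / TIB, 3))  # 0.909


def excess_percent(power):
    """How much larger the binary unit is than the SI unit, in percent.

    power is 1 for kilo/kibi, 2 for mega/mebi, and so on up to 5 for peta/pebi.
    """
    return (1024 ** power / 1000 ** power - 1) * 100


for name, power in [("MiB vs MB", 2), ("GiB vs GB", 3),
                    ("TiB vs TB", 4), ("PiB vs PB", 5)]:
    print(f"{name}: {excess_percent(power):.2f} %")
```

The gap grows with every prefix step, which is why the distinction matters more for large totals than for single files.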

Honored urls vs Live urls. Huh?

There are two separate statistics systems. The honored urls are the normal urls you know from the daily highscore or overall highscore. These are the urls your node returned with its finished buckets!

The Live urls are those that the node front end shows. A snapshot of them is also submitted to the server (once every 15 minutes or once an hour). The purpose of the Live urls/stats is to show whether nodes have problems.

A simple rule to tell them apart: Live urls always have success and failure columns next to them.

Limits of the Live urls

Because the stats are generally updated once an hour and the Live urls are based on the current run, they can't be as exact as the Honored urls.

For example, if the node restarts, everything from the last update until the restart is lost. The same happens when the user resets the node stats.

How long is the delay from Mj12 to SonnigeLichtung.de?

Node-related statistics are updated every hour, at 15 minutes past the hour (the mj12 stats generation starts on the hour). Member names and member overall stats are updated once a day (shortly after midnight UTC).
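
To estimate when fresh node statistics will appear, the hourly schedule can be sketched as below. This is only an illustration of the "15 minutes past the hour" rule; the function name next_node_stats_update is hypothetical and the real update may take a little longer to finish.

```python
from datetime import datetime, timedelta


def next_node_stats_update(now):
    """Return the next point in time that is 15 minutes past a full hour.

    If 'now' is exactly at :15, the following hour's update is returned.
    """
    candidate = now.replace(minute=15, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(hours=1)
    return candidate


print(next_node_stats_update(datetime(2009, 4, 1, 10, 5)))    # 2009-04-01 10:15:00
print(next_node_stats_update(datetime(2009, 4, 1, 23, 40)))   # 2009-04-02 00:15:00
```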

Why is my best Uptime not displayed in Highscore?

The recording of the node highscore started on 1st April 2009 and is limited to nodes that returned at least one bucket to the server during the run. Furthermore, an increase of the uptime is only honored when the node returned at least one bucket that day. These restrictions are needed because some nodes do not crawl and therefore reach very high uptimes (>100 days).

My Windows Server 2008 is listed as Windows Vista! (Windows Vista/2008)

The problem is that both (Windows Server 2008 and Windows Vista) have the same internal id, NT 6.0.6001, which is why I can't distinguish them and currently list them as Vista/2008. (Wikipedia)

The same applies to Server 2008 R2 and Windows 7 (displayed as Windows 7/2008 R2).
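
The reason the two systems cannot be told apart can be shown with a small lookup sketch. It assumes the node reports a version string such as "NT 6.0.6001"; the exact format the site receives, the table and the function name display_name are assumptions for illustration, not the code used here.

```python
import re

# Client and server editions share the same NT major.minor version,
# so one entry has to cover both.
DISPLAY_NAMES = {
    (6, 0): "Windows Vista/2008",
    (6, 1): "Windows 7/2008 R2",
}


def display_name(version_string):
    """Map a reported NT version string to the name shown in the stats."""
    match = re.search(r"(\d+)\.(\d+)", version_string)
    if match is None:
        return "Unknown"
    key = (int(match.group(1)), int(match.group(2)))
    return DISPLAY_NAMES.get(key, "Unknown")


print(display_name("NT 6.0.6001"))  # Windows Vista/2008
print(display_name("NT 6.1.7600"))  # Windows 7/2008 R2
```

Because the version number is the only input, no lookup table can separate Vista from Server 2008 without extra information from the node.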

