Mail To Web

Description

MailToWeb is a tool for storing and managing high volumes of electronic mail. MailToWeb processes e-mail folders in mbox format, such as those produced by Netscape and Mozilla (*), and produces a set of hyperlinked HTML documents from which the information is easy to access using different search criteria.

MailToWeb is also a research tool for studying efficient mechanisms for message retrieval based on information extracted from the mail folders.

(*) If you use Microsoft Outlook as your mail client, you can import your Outlook mail into Netscape to generate the corresponding mbox-format files.
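
For readers unfamiliar with mbox folders, the following is a minimal sketch of reading one with Python's standard `mailbox` module. It is not MailToWeb's own code: the helper name `summarize_mbox` is an assumption made for illustration, and it only lists the sender, subject and date headers of each message.

```python
import mailbox


def summarize_mbox(path):
    """Return a list of (sender, subject, date) tuples, one per message.

    Hypothetical helper for illustration; not part of MailToWeb.
    """
    # create=False makes a missing file an error instead of creating it.
    box = mailbox.mbox(path, create=False)
    try:
        return [
            (msg.get("From", ""), msg.get("Subject", ""), msg.get("Date", ""))
            for msg in box
        ]
    finally:
        box.close()
```

Here `path` would point at a single mbox folder file inside your mail directory.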

Download and Install

System requirements:
Download file:
Running instructions (Windows):
  1. Extract the files using WinZip or Unzip
  2. Double-click 'MailToWeb.bat'
Running instructions (Linux):
  1. Unpack the distribution file: 'tar -zxvf MailToWeb-Linux-xxx.tar.gz'
  2. Run './MailToWeb.sh'

Instructions and Screenshots

The main window looks like this:

Main window of the application

Click on "File ... Convert ..."; a file selection window appears:

Window to choose input and output directory

You have to choose the input and output directories.

From this point on, the process doesn't require any assistance from the user: it automatically goes through the "Reading e-mail" and "Analyzing e-mail" stages. It may take some time: up to 1-2 hours for a very large (600 MB) mailbox.

Screenshot of the read stage
Screenshot of analyzing e-mail

When the process is over, open the created HTML file in your favorite browser. It should look like this:

Screenshot of message archive

The search engine is only available with the Java plug-in installed. Now you can enter keywords to search, or click on "Person Index", "Person Graph" or "Calendar".

Person Index
Person Graph
Calendar

Contribute and Research

This software is part of a research project aimed at identifying common features of mailboxes. Thus, every time it is executed on a mailbox, it creates a file "data/public.txt" containing numerical data to be used for statistical analysis. The file consists of a series of lines of the form <label>::<number>, divided into chunks, each chunk corresponding to one message read. The following table indicates the meaning of the labels.

NU::? Message number
FO::? Folder number
SE::? Sender number
TO::? Receiver number
CC::? Copy receiver number (may be more than one)
SI::? Message subject number
DA::? Sending date and time expressed as milliseconds elapsed since 1/1/1970
SZ::? Message size
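
As an illustration of how this file could be read for analysis, here is a minimal Python sketch. It rests on assumptions the description above does not confirm: that each chunk opens with an NU line, and that the DA value counts from midnight UTC on 1/1/1970. The function names `parse_public_data` and `sending_time` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_public_data(lines):
    """Group '<label>::<number>' lines into one dict per message.

    Assumption (not confirmed by the source): a line labelled NU
    opens a new chunk. Values are lists because CC may repeat.
    """
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # tolerate blank separator lines
        label, _, value = line.partition("::")
        if label == "NU" or not chunks:
            chunks.append(defaultdict(list))
        chunks[-1][label].append(int(value))
    return chunks


def sending_time(chunk):
    """Convert the DA field (milliseconds since 1/1/1970) to a datetime.

    Assumption: the epoch is midnight UTC.
    """
    return EPOCH + timedelta(milliseconds=chunk["DA"][0])
```

With these helpers, something like `parse_public_data(open("data/public.txt"))` would return one dictionary per message, keyed by the labels in the table.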

*Notice that there is no personal information recorded in the file*

 
You can contribute to this project by sending the file 'data/public.txt' created under the generated website to catedratelefonica@fundacio.upf.es.

Remember that this file has only numerical data about your mailbox. You can check its content using a word processor or a plain text editor.
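
If you prefer an automated check to reading the file by eye, a small script along these lines could list any line that is not of the documented <label>::<number> form. This is only a sketch: the function name `non_numeric_lines` is hypothetical, and the pattern assumes that the labels are exactly the two-letter codes listed above and that the numbers are integers.

```python
import re

# Documented line form: <label>::<number>.
# Assumption: only the eight labels from the table occur, and the
# numbers are integers (an optional minus sign is tolerated).
LINE_PATTERN = re.compile(r"(NU|FO|SE|TO|CC|SI|DA|SZ)::-?\d+")


def non_numeric_lines(lines):
    """Return (line_number, text) pairs for lines that do not match."""
    bad = []
    for number, raw in enumerate(lines, start=1):
        text = raw.strip()
        # Blank lines are ignored; everything else must match in full.
        if text and not LINE_PATTERN.fullmatch(text):
            bad.append((number, text))
    return bad
```

An empty result means every non-blank line matched the expected form.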

Contact us

Credits:

If you need to contact us about this project, use the following e-mail address: catedratelefonica@fundacio.upf.es