Description
MailToWeb is a tool for storing and managing high
volumes of electronic mail. MailToWeb processes e-mail folders in
mbox format, such as those produced by Netscape and Mozilla (*),
and produces a set of HTML documents related by hyperlinks,
from which the information can easily be accessed using different search
criteria.
MailToWeb is also a research tool to study efficient
mechanisms of message retrieval based on
information extracted from the mail folders.
(*) If you use Microsoft Outlook as your mail client, you can import your
Outlook mail from within Netscape to generate the corresponding mbox format files.
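Before running a conversion, it can help to see which files in a mail directory are likely to be picked up. The following is a minimal Python sketch, not part of MailToWeb: it uses the standard `mailbox` module and assumes the common convention that an mbox file begins with a `From ` separator line. The helper names `looks_like_mbox` and `count_messages` are hypothetical, and MailToWeb's own detection logic may differ.

```python
import mailbox
import os


def looks_like_mbox(path):
    """Heuristic (an assumption): an mbox file starts with a 'From ' line."""
    with open(path, "rb") as handle:
        return handle.read(5) == b"From "


def count_messages(root):
    """Walk `root` recursively and return {file path: message count}
    for every file that looks like an mbox file."""
    counts = {}
    for directory, _subdirs, files in os.walk(root):
        for name in files:
            path = os.path.join(directory, name)
            if not looks_like_mbox(path):
                continue  # skip files that do not look like mbox
            box = mailbox.mbox(path, create=False)
            try:
                counts[path] = len(box)
            finally:
                box.close()
    return counts
```

Calling `count_messages` on the mail directory returns a dictionary mapping each mbox-like file to the number of messages it holds, which gives a rough idea of how much mail a conversion would process.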
Download and Install
System requirements:
- Java and Java Plug-in.
We have tested this software with version 1.4.1 and higher.
- Linux or Windows Operating Systems are supported.
- Free disk space equal to or greater than the e-mail folder size.
Download file:
Running instructions (Windows):
- Extract the files using WinZip or Unzip
- Double-click 'MailToWeb.bat'
Running instructions (Linux):
- Unpack the distribution file: 'tar -zxvf MailToWeb-Linux-xxx.tar.gz'
- Run './MailToWeb.sh'
Instructions and Screenshots
The main window looks like this:
Click on "File ... Convert ..."; a file selection window is presented:

You have to choose:
- Source e-mail folders directory.
The directory that contains your mbox format files (the mailbox). It may
contain subdirectories. Files that are not in mbox format are ignored.
- HTML output file. The output
file for the HTML archive of your e-mail; a sub-directory "data" will be
created in the same directory as this file.
- Save XML files to disk. Select this option if you want to save the XML files
generated by the application to your disk (not recommended).
From this point on, the process doesn't require any assistance from the
user: it automatically goes from the "Reading e-mail" stage to the "Analyzing
e-mail" stage. It may take some time: up to 1-2 hours for a very
large (600 MB) mailbox.
When the process is over, open the created HTML
file in your favorite browser. It should look like this:

The search engine is only available with the Java plug-in installed. Now you can
enter keywords to search, or click on "Person Index", "Person Graph" or
"Calendar".
[Screenshots: Person Index, Person Graph, Calendar]
Contribute and Research
This software is part of a research project aimed at identifying common
features of mailboxes. Thus, every time it is executed on a mailbox it
creates a file "data/public.txt" which contains numerical data to be
used for statistical analysis. This file contains a series of lines of
the form <label>::<number> divided into chunks, each chunk
corresponding to one message read. The following table indicates the meaning of
the labels.
| Label | Meaning |
|---|---|
| NU::? | Message number |
| FO::? | Folder number |
| SE::? | Sender number |
| TO::? | Receiver number |
| CC::? | Copy receiver number (may be more than one) |
| SI::? | Message subject number |
| DA::? | Sending date and time, expressed as milliseconds elapsed since 1/1/1970 |
| SZ::? | Message size |
*Notice that there is no personal information recorded in the file*
You can
contribute to this project by sending the file 'data/public.txt'
created under the generated website to catedratelefonica@fundacio.upf.es.
Remember that this file contains only numerical
data about your mailbox. You can check its content using a word processor or a plain text editor.
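The check can also be scripted. The following is a minimal Python sketch, not part of MailToWeb, that groups the <label>::<number> lines into one record per message and converts the DA field to a readable date. It rests on two assumptions not stated above: that every chunk begins with an NU line, and that the DA value counts milliseconds from midnight UTC. The function names `parse_public` and `sending_time` are hypothetical.

```python
from datetime import datetime, timezone


def parse_public(lines):
    """Group '<label>::<number>' lines into one dict per message.

    Assumption: every chunk begins with an NU (message number) line.
    Values are kept in lists because CC may appear more than once.
    int() raises ValueError if a value is not purely numerical.
    """
    messages = []
    for raw in lines:
        line = raw.strip()
        if "::" not in line:
            continue  # skip blank or unrecognized lines
        label, _, value = line.partition("::")
        if label == "NU" or not messages:
            messages.append({})  # start a new chunk
        messages[-1].setdefault(label, []).append(int(value))
    return messages


def sending_time(message):
    """Convert DA (milliseconds since 1/1/1970, assumed UTC) to a datetime."""
    return datetime.fromtimestamp(message["DA"][0] / 1000, tz=timezone.utc)
```

To use it, open 'data/public.txt' and pass the file object to `parse_public`. The length of the returned list is the number of messages recorded, and a `ValueError` indicates a value that is not a plain number.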
Contact us
Credits:
If you need to contact us about this project, use the following e-mail
address: catedratelefonica@fundacio.upf.es