Type: Bachelor's or Master's thesis, also possible in combination with a project.
Background info: Are you happy with the search capabilities of your mail program? In particular: if you use Thunderbird, are you happy with Thunderbird's search capabilities? I'm not, and I think we can do better.
Goal: The subtasks are: (1) write an efficient parser that reads one or more files in MBOX format and produces a CSV file with one line per mail and columns for the various structured and unstructured parts of an email (from, to, subject, date, body, ...); (2) take proper care of encoding issues, which are a major problem when dealing with a large number of emails; (3) set up an instance of CompleteSearch for the data from the CSV file; (4) provide a simple and effective search interface using the instance from (3) as a backend; (5) implement a plugin for Thunderbird.
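To give an idea of what subtasks (1) and (2) could look like, here is a minimal sketch in Python that uses only the standard library (mailbox, email.header, csv). It is a starting point under stated assumptions, not the expected solution: the function names (decode_field, extract_body, mbox_to_csv), the choice of exactly five columns, and the decision to collapse all whitespace in the body so that each mail really occupies one CSV line are all my own assumptions. It also makes no attempt at being efficient, which the actual parser should be.

```python
import csv
import mailbox
from email.errors import HeaderParseError
from email.header import decode_header, make_header

# Assumed column set; the real parser would likely need more fields.
FIELDS = ["from", "to", "subject", "date", "body"]


def decode_field(raw):
    """Decode an RFC 2047 encoded header value; fall back to the raw text."""
    if raw is None:
        return ""
    try:
        return str(make_header(decode_header(str(raw))))
    except (LookupError, ValueError, HeaderParseError):
        # Unknown charset name or malformed encoded word: keep the raw text.
        return str(raw)


def extract_body(msg):
    """Return the first text/plain part as text, replacing undecodable bytes."""
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue
        payload = part.get_payload(decode=True)  # bytes, transfer encoding undone
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        try:
            return payload.decode(charset, errors="replace")
        except LookupError:
            # The mail declares a charset Python does not know.
            return payload.decode("utf-8", errors="replace")
    return ""


def mbox_to_csv(mbox_paths, csv_path):
    """Write one CSV row per mail found in the given MBOX files."""
    with open(csv_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(FIELDS)
        for path in mbox_paths:
            box = mailbox.mbox(path, create=False)
            try:
                for msg in box:
                    # Assumption: collapse whitespace so a mail never spans lines.
                    body = " ".join(extract_body(msg).split())
                    writer.writerow([
                        decode_field(msg["From"]),
                        decode_field(msg["To"]),
                        decode_field(msg["Subject"]),
                        decode_field(msg["Date"]),
                        body,
                    ])
            finally:
                box.close()
```

A call such as mbox_to_csv(["Inbox", "Sent"], "mails.csv") (the file names are hypothetical) would then produce a UTF-8 encoded CSV file. Note that this sketch treats the first text/plain part as the body, so HTML-only mails end up with an empty body column; handling those properly is part of the task.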