DesignHistory

Version 1 (Adrian Georgescu, 08/29/2010 01:23 pm)

== History == #blinkhistory
 Keep a history of all sessions, whatever their type, along with the data associated with each:
  * All: SIP traces, basic info
  * Audio calls: recordings, answering machine messages
  * Chat: transcripts, MSRP trace
  * File transfers: path, MSRP trace
  * SMS: transcripts
 * Determine which backend to use for storage: SQLite, ZODB or others (preferably with support for full-text search).
  * SQLite with fts3:
   * On Linux, fts3 seems to be compiled into the distribution's SQLite library (at least on Debian and Ubuntu)
   * fts3 is not compiled into the sqlite3.dll shipped with the binary distribution of Python for Windows (2.5, 2.6 or 2.7). However, sqlite.org provides a binary distribution for Windows, consisting of a .dll and a .def file, which does include fts3. Since Blink will be shipped using py2exe, it should be easy to replace the default DLL with the one that has fts3 support.
   * [http://www.sqlite.org/fts3.html This] page describes the use of sqlite with fts3.
   * It may not be possible (or desirable due to performance penalties) to use an ORM: SQLObject does not seem to support fts3, while SQLAlchemy seems to partially support it; it may also be the case that an ORM would not be able to construct optimal queries as required by fts3.
  * ZODB with indexing:
   * It seems that indexing support is not distributed in Debian or Ubuntu older than lucid (10.04). The corresponding Ubuntu lucid packages are python-zope.catalog and python-zope.index (with quite a few dependencies)
   * Seems to use more than twice as much storage space as SQLite for the same data
   * Might need the data to be repacked at various intervals, which is a CPU-intensive operation
  * Tests
   * The tests were performed using a 416MB SIP trace file (595895 packets) generated by the command line clients
   * SQLite with fts3 (scripts used are attached):
    * Database size: 1000MB
    * Creation time: 2 minutes 51 seconds
    * Searching for luci@umts.ro (found 3028 packets) and retrieving results: 255 ms
    * Searching for 4a14e48dbc421d5b3521d1247fdb41871248e96f (a nonce, found 2 packets) and retrieving results: 6 ms
   * ZODB with indexing: after 3 hours and 11GB written, the process stopped due to lack of free space on the filesystem. It should be noted that the process was indexing only the packet contents and not the other fields. A ZODB database without any index occupies approximately the same space as the SIP trace file used, so the difference is due exclusively to the index.
 * Allow the user to perform complex searches in the history database
 * Needs to be integrated with logging
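
The SQLite with fts3 option discussed above could look roughly like the following sketch. It is not the attached test script: the table name `sip_trace`, the column `packet` and all helper names are hypothetical and chosen only for illustration. It first probes whether the SQLite library loaded by Python has fts3 support (the concern raised for the Windows sqlite3.dll), then indexes a few packets and searches them with `MATCH`.

```python
import sqlite3


def fts3_available(conn):
    """Return True if this SQLite build can create fts3 virtual tables."""
    try:
        conn.execute("CREATE VIRTUAL TABLE fts3_probe USING fts3(content)")
    except sqlite3.OperationalError:
        # Raised as "no such module: fts3" when fts3 is not compiled in.
        return False
    conn.execute("DROP TABLE fts3_probe")
    return True


def create_trace_index(conn):
    # Hypothetical schema: one full-text indexed column holding the raw packet.
    conn.execute("CREATE VIRTUAL TABLE sip_trace USING fts3(packet)")


def add_packets(conn, packets):
    conn.executemany("INSERT INTO sip_trace(packet) VALUES (?)",
                     ((p,) for p in packets))
    conn.commit()


def search(conn, term):
    # Wrap the term in double quotes so it is treated as a phrase; otherwise
    # characters such as ':' or '-' could be read as fts3 query syntax.
    phrase = '"%s"' % term.replace('"', ' ')
    cursor = conn.execute(
        "SELECT rowid, packet FROM sip_trace WHERE packet MATCH ?", (phrase,))
    return cursor.fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    if not fts3_available(conn):
        print("fts3 is not available in this SQLite build")
    else:
        create_trace_index(conn)
        add_packets(conn, [
            "INVITE sip:luci@umts.ro SIP/2.0",
            "SIP/2.0 200 OK",
        ])
        for rowid, packet in search(conn, "luci@umts.ro"):
            print(rowid, packet)
```

With the default fts3 tokenizer an address such as luci@umts.ro is split into several tokens, so the phrase query matches those tokens in sequence rather than the literal string. Writing the SQL by hand, as here, also sidesteps the ORM limitations noted above.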