combox-paper

notes and other things concerning combox
git clone git://git.ricketyspace.net/combox-paper.git

commit f93d7c9e2e06b55fe66ec12c296b59a811d82f48
parent 715cc2d5719dcc60d59fe57e62fbd40b9e70bdf8
Author: Siddharth Ravikumar <sravik@bgsu.edu>
Date:   Fri, 11 Mar 2016 21:05:30 -0500

edited chapters one, two, three.

Diffstat:
report/chapters/1-intr.tex   | 32 ++++++++++++++++----------------
report/chapters/2-lit-r.tex  | 24 ++++++++++++------------
report/chapters/3-arch-d.tex | 16 ++++++++--------
report/combox-report.pdf     |  0
4 files changed, 36 insertions(+), 36 deletions(-)

diff --git a/report/chapters/1-intr.tex b/report/chapters/1-intr.tex @@ -8,7 +8,7 @@ data/information on their servers and at the same time there is a lot of evidence of governments and other powerful organizations being able to access information/data stored on the Internet companies' computers\cite{website:wikileaks-spyfiles}. Also, most companies add a -standard clause in their privacy policy that allows them to disclose +standard clause in their privacy policy that allow them to disclose information about users or information stored/created by users to ``third parties'': @@ -78,8 +78,8 @@ N node directories; shards \verb+strunk-white.pdf.shard0+ to \end{figure} combox does not sync encrypted shards stored in the node directories -to the respective file storage providers' servers and it depends on the -respective file storage provider's client program to sync the +to the respective file storage providers' data store and it depends on +the respective file storage provider's client program to sync the shards. combox can be used on all of the user's computers. For instance, the @@ -88,8 +88,8 @@ reconstruct the file from the encrypted shards stored in the node directories into the combox directory on their second computer; figure \ref{fig:1-combox-overview-1} illustrates this. Here too, combox depends on the client program of the respective file storage provider -to sync shards to/from the file storage provider's server to/from the -respective node directory on the user's computer. +to sync shards to/from the file storage provider's data store to/from +the respective node directory on the user's computer. \begin{figure}[h] \begin{verbatim} @@ -125,17 +125,17 @@ respective node directory on the user's computer. \label{fig:1-combox-overview-1} \end{figure} -As of combox \verb+v0.2.3+, combox is compatible on GNU/Linux and OS -X, it supports just two file storage providers -- Google Drive and -Dropbox. 
+As of combox version \verb+0.2.3+, combox is compatible on GNU/Linux +and OS X, it supports just two file storage providers -- Google Drive +and Dropbox. \section{How is combox different from Combo-Box?}\label{1-sec-cb-diff} Combo-Box by Wesley Vollmar\cite{vollmar-combo-box} was the first implementation of the idea of storing encrypted shards of a file on storage provided different file storage providers and depending on the -file storage provider's client to sync shards to their respective -servers. Differences between Vollmar's Combo-Box and combox are +file storage provider's client to sync shards to their respective data +store. Differences between Vollmar's Combo-Box and combox are enumerated below: \begin{description} @@ -147,18 +147,18 @@ enumerated below: while combox is not yet cognizant about space left on each node directory and splits the file into N equal shards, where N is equal to the number of node directories. -\item[User Interface] Combo-Box is graphical application while combox - is mostly a commandline program; combox's configuration wizard has a - graphical interface. The configuration wizard has a commandline - interface too for users who like TUI. +\item[User Interface] Combo-Box is a graphical application while + combox is mostly a command-line program; combox's configuration + wizard has a graphical interface. The configuration wizard has a + command-line interface too for users who like TUI. \item[Database] Combo-Box uses a traditional SQL database with two tables to keep track of files' shards, files' hash, files' last ``sync time'' and for ``security and stability'' uses stored procedures that retrieve/store information in the database\cite{vollmar-combo-box}. 
- combox on the other hand uses a no SQL key-value data store to track - the files stored in the combox directory using the pickleDB + combox on the other hand uses a key-value data store to track the + files stored in the combox directory using the pickleDB library\cite{pylib:pickledb}. The key-value data store is a JSON file and all access to this data store is done through an instance of \verb+combox.silo.ComboxSilo+ diff --git a/report/chapters/2-lit-r.tex b/report/chapters/2-lit-r.tex @@ -35,10 +35,10 @@ operations -- Create, Rename, Update, Delete (CRUD) -- are possible. Information about the files stored in the unified location is stored in a SQLite database. Unlike combox, which depends the file storage provider' client to sync file fragments/shards to the file -storage provider's server, the Android application developed by Yeo et -al. takes the responsibility to sync file fragments/shards to each -file storage provider and uses the OAuth 2.0\cite{protocol:oauth2} -protocol for authorization. +storage provider's data store, the Android application developed by +Yeo et al. takes the responsibility to sync file fragments/shards to +each file storage provider and uses the OAuth +2.0\cite{protocol:oauth2} protocol for authorization. For encrypting file fragments, they use AES-256; the key for encrypting file fragments is derived from the user's password by using @@ -46,12 +46,12 @@ Password-Based Key Derivation Function (PBKDF2)\cite{kaliski}. For erasure coding they use the JigDFS library\cite{jigdfs}. The Android application is able do ``progressive streaming'' of media files; this means that large media files can be streamed in real-time from the -from the file storage providers' servers; this is an attractive +from the file storage providers' data store; this is an attractive feature in a ``resource constrained'' device where storage is expensive. Yeo et al. 
propose methods for achieving data de-duplication; file -compression based on the type of the file; intelligent pre-fetching +compression based on file type; intelligent pre-fetching and caching of file fragments and ``automatic restoration in exploiting file-versioning''; these features were not implemented in the prototype Android application and there is possibility of Yeo et @@ -79,7 +79,7 @@ storage provider. In SkyCDS, the content delivery to subscribers of the content is segregated into two distinct layers -- Metadata Flow Layer and the Content Flow Layer. The publisher of the content largely interacts -with the Metadata Flow Layer that controls and keeps track of the what +with the Metadata Flow Layer that controls and keeps track of what content is published and the subscriber also largely interacts with the Metadata Flow layer to subscribe to content published in the content delivery system. The Content Flow Layer is where the content @@ -110,7 +110,7 @@ space and reliability. \verb+git-annex+ allows one to version controlled large files that are not usually feasible to version control under -\verb+git+\cite{program:git}. \verb+git-annex+, checks in the names +\verb+git+\cite{program:git}. \verb+git-annex+, checks in the name and other meta-data about the files in git and stores the actual content under \verb+.git/annex+ directory. When a file is added to \verb+git-annex+, a symlink of the file is created in place of the @@ -163,10 +163,10 @@ nex/objects/3j/vG/SHA256E-s108196923--7de9484ee96908268e21b451eb9805552c32b44da0 Now, the file \verb+deb-nicholson-80s.medium.webm+ is checked into \verb+git-annex+ and we can now do a \verb+git annex sync+ to sync the repository to other \verb+git-annex+ repositories. 
It must be noted -here that that when the repository is synced, the file content itself -is not transferred to the other \verb+git-annex+ repositories; only -the file's name and its meta-data that is stored in a separate git -branch called \verb+git-annex+ are +here that when the repository is synced, the file content itself is +not transferred to the other \verb+git-annex+ repositories; only the +file's name and its meta-data that is stored in a separate git branch +called \verb+git-annex+ are transferred\cite{documentation:git-annex-hworks}. In order to create a copy of a given file in another git annex repository, \verb+git annex get /path/to/filename.ext+ has to done. diff --git a/report/chapters/3-arch-d.tex b/report/chapters/3-arch-d.tex @@ -29,7 +29,7 @@ combox will create two encrypted shards of file \verb+humans.txt+ -- encrypted shard under the Dropbox directory and the other encrypted shard under the Google Drive directory. Now, the Dropbox client and the Google client will sync the respective shards that was place under -their directories to their respective servers. +their directories to their respective data store. \begin{figure}[h] \includegraphics[scale=0.6]{3-combox-structure} @@ -50,12 +50,12 @@ for file modification, deletion and rename/move. \subsection{combox configuration}\label{sec:3-combox-config} -combox configuration wizard triggers automatically when combox finds -that it is not configured. The combox configuration setups up the -combox directory; asks the user to point to the location of the node -directories; reads the key (passphrase) to be used to encrypt file -shards that are spread across the node directories. The combox -configuration is written to +The combox configuration wizard triggers automatically when combox +finds that it is not configured. 
The combox configuration wizard +setups up the combox directory; asks the user to point to the location +of the node directories; reads the key (passphrase) to be used to +encrypt file shards that are spread across the node directories. The +combox configuration is written to \verb+$HOME/.combox/config.yaml+; this YAML configuration file can be manually edited by the user. @@ -182,7 +182,7 @@ combox directory. The only information that is stored in the combox data store, about a file in the combox directory is its SHA-512 hash; The SHA-512 hash of a file is enough information to detect changes in the file. In the -data store, there also four dictionaries -- \verb+file_moved+, +data store, there is also four dictionaries -- \verb+file_moved+, \verb+file_deleted+, \verb+file_created+, \verb+file_modified+ -- which tracks the number of shards of a file that was moved/deleted/created/modified due the respective file being diff --git a/report/combox-report.pdf b/report/combox-report.pdf Binary files differ.